Commit fe09e37

Merge pull request #80 from denizs/denizs-patch-1
Added `vocab` and `vocab_size` to CBOW exercise
2 parents 6c12152 + cdab611 commit fe09e37

1 file changed: +6 -1 lines changed

beginner_source/nlp/word_embeddings_tutorial.py

Lines changed: 6 additions & 1 deletion
@@ -309,7 +309,12 @@ def forward(self, inputs):
 The evolution of a process is directed by a pattern of rules
 called a program. People create programs to direct processes. In effect,
 we conjure the spirits of the computer with our spells.""".split()
-word_to_ix = {word: i for i, word in enumerate(raw_text)}
+
+# By deriving a set from `raw_text`, we deduplicate the array
+vocab = set(raw_text)
+vocab_size = len(vocab)
+
+word_to_ix = {word: i for i, word in enumerate(vocab)}
 data = []
 for i in range(2, len(raw_text) - 2):
     context = [raw_text[i - 2], raw_text[i - 1],
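
The effect of this change can be sketched in isolation. Enumerating `raw_text` directly assigns indices by token position, so the largest index can exceed the number of distinct words, which is a problem for any lookup table sized to `vocab_size`. The snippet below is a minimal illustration, not the tutorial's code: `toy_text` is a hypothetical shortened corpus, and the call to `sorted` is an assumption added here only to make the mapping reproducible, since the iteration order of a set of strings can differ between Python runs.

```python
# Hypothetical toy corpus (not the tutorial's text) containing repeated words.
toy_text = "we conjure the spirits of the computer with the spells".split()

# Before the commit: indices come from token positions, so a repeated word
# keeps the position of its last occurrence and indices can exceed the
# number of distinct words.
old_word_to_ix = {word: i for i, word in enumerate(toy_text)}

# After the commit: deduplicate first, then enumerate the vocabulary.
# sorted() is an addition for this sketch only; the commit enumerates the
# set directly.
vocab = sorted(set(toy_text))
vocab_size = len(vocab)
new_word_to_ix = {word: i for i, word in enumerate(vocab)}

print("tokens:", len(toy_text), "distinct words:", vocab_size)
print("largest old index:", max(old_word_to_ix.values()))
print("largest new index:", max(new_word_to_ix.values()))
```

With the deduplicated vocabulary, the indices form a contiguous range from 0 to `vocab_size - 1`, so every index is valid for a table with `vocab_size` rows.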
