
Create A Corpus Containing The Vocabulary Of Words

I am calculating inverse_document_frequency for all the words in my documents dictionary and I have to show the top 5 documents ranked according to the score on queries. But I am s

Solution 1:

How about making corpus a set instead of a list? A set ignores duplicates, so you won't need the additional if check either.

corpus = set()  # a set that will store the words of the vocabulary
for doc in documents.values():  # iterate through the documents
    corpus.update(doc)  # add every word in doc; words already present are ignored
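To see how the set-based vocabulary feeds into the IDF step mentioned in the question, here is a minimal sketch. It assumes each value in documents is a list of word tokens (if a value were a plain string, update would add individual characters instead of words). The sample data and the helper compute_idf are hypothetical, and log(N / df) is only one common IDF variant, not necessarily the one the question requires.

```python
import math

# Hypothetical sample data: document id -> list of word tokens
documents = {
    "d1": ["the", "cat", "sat"],
    "d2": ["the", "dog", "barked"],
    "d3": ["the", "cat", "purred"],
}

# Build the vocabulary as a set, as in the answer above
corpus = set()
for doc in documents.values():
    corpus.update(doc)


def compute_idf(word, documents):
    """Return log(N / df) for word, where df is the number of documents containing it."""
    n_docs = len(documents)
    doc_freq = sum(1 for doc in documents.values() if word in doc)
    # Guard against words that appear in no document
    return math.log(n_docs / doc_freq) if doc_freq else 0.0


# IDF for every word in the vocabulary
idf = {word: compute_idf(word, documents) for word in corpus}
```

A word that appears in every document (here "the") gets an IDF of 0, while rarer words get larger values.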
