
With NLTK, How Can I Generate Different Forms of a Word When a Certain Word Is Given?

For example, suppose the word 'happy' is given. I want to generate other forms of 'happy', such as 'happiness', 'happily', etc. I have read some other previous questions on Stackoverflo

Solution 1:

This type of information is included in the Lemma class of NLTK's WordNet implementation. Specifically, it's found in Lemma.derivationally_related_forms().

Here's an example script for finding all possible derivational forms of "happy":

from nltk.corpus import wordnet as wn

forms = set()  # We'll store the derivational forms in a set to eliminate duplicates
for happy_lemma in wn.lemmas("happy"):  # for each "happy" lemma in WordNet
    forms.add(happy_lemma.name())  # add the lemma itself
    for related_lemma in happy_lemma.derivationally_related_forms():  # for each related lemma
        forms.add(related_lemma.name())  # add the related lemma

Unfortunately, the information in WordNet is not complete. The above script finds "happy" and "happiness" but it fails to find "happily", even though there are multiple "happily" lemmas.
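
To reuse the same lookup for words other than "happy", the loop can be wrapped in a function. This is only a minimal sketch and not part of the original answer: the function name `derivational_forms` and its `wordnet` parameter are hypothetical. The parameter is assumed to be NLTK's `wn` module (it only needs a `lemmas()` method), and it is passed in explicitly so the function can be tried with a stand-in object when the WordNet corpus is not installed.

```python
def derivational_forms(word, wordnet):
    """Return `word`'s lemma names plus their derivationally related lemma names.

    `wordnet` is assumed to behave like `nltk.corpus.wordnet`: it must expose
    `lemmas(word)`, and each lemma must expose `name()` and
    `derivationally_related_forms()`.
    """
    forms = set()  # a set removes duplicates across the different senses
    for lemma in wordnet.lemmas(word):
        forms.add(lemma.name())  # the lemma itself
        for related in lemma.derivationally_related_forms():
            forms.add(related.name())  # each related lemma
    return forms


# Typical use with NLTK (requires the WordNet corpus to be downloaded):
#   from nltk.corpus import wordnet as wn
#   print(sorted(derivational_forms("happy", wn)))
```

The same limitation applies here: the function can only return forms that WordNet actually links to the given word, so a result that is missing an expected form does not mean the form does not exist.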
