Saturday, May 30, 2015

nlp23. WordNet Lemmatizer in Python NLTK

The nltk.stem.wordnet.WordNetLemmatizer object can be used to lemmatize a word, that is, to reduce it to its base (dictionary) form, the lemma.


Note that if the part of speech is not given, the word is treated as a noun. The code below makes this explicit by turning each bare string into a (word, 'n') tuple. Every item is then a tuple, so we can use * to unpack it into separate arguments; passing the tuple itself as a single argument would raise an error.

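# nlp23.py
from __future__ import print_function
from nltk.stem.wordnet import WordNetLemmatizer
lem = WordNetLemmatizer()
# Items are either a bare word or a (word, part-of-speech) tuple
words = [('order','v'),'order',('orders','v')]
for word in words:
    # Make the default part of speech (noun) explicit for bare words
    if isinstance(word, str):
        word = (word,'n')
    # * unpacks the tuple into the two arguments: word and pos
    lemma = lem.lemmatize(*word)
    print('For:',word, end='\t')
    print('Lemma =',lemma)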

# For: ('order', 'v')     Lemma = order
# For: ('order', 'n')     Lemma = order
# For: ('orders', 'v')    Lemma = order
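
To see in isolation what the * does, here is a minimal sketch that needs no NLTK at all. The function show_args is a hypothetical stand-in, assumed only to share the call shape lemmatize(word, pos='n'); it does no lemmatizing and simply returns what it was given.

```python
# Hypothetical stand-in with the same call shape as lemmatize(word, pos='n').
# It is NOT the NLTK implementation; it only reports what it received.
def show_args(word, pos='n'):
    return (word, pos)

pair = ('orders', 'v')

# With *, the tuple is unpacked into two positional arguments.
unpacked = show_args(*pair)

# Without *, the whole tuple arrives as `word` and pos keeps its default.
not_unpacked = show_args(pair)

print(unpacked)      # ('orders', 'v')
print(not_unpacked)  # (('orders', 'v'), 'n')
```

In the second call the function receives a tuple where it expects a string, which is the situation the real lemmatizer cannot handle.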
