We use PorterStemmer to stem a handful of words.
PorterStemmer is a class (by convention, class names start with a capital letter), so we first have to create an object and then call a method on it. Since we only need the stem method, we do not keep the object in a variable of its own; we store only its bound stem method, which still refers to the object internally.
# nlp18.py
from __future__ import print_function
from nltk.tokenize import word_tokenize
from nltk.stem import PorterStemmer
text = """
cats catlike cat stemmer stemming stemmed stem
fishing fished fisher fish argue argued argues
arguing argument arguments
"""
PS = PorterStemmer().stem
for a in word_tokenize(text):
    print('%10s --> %10s' % (a, PS(a)))
# cats --> cat
# catlike --> catlik
# cat --> cat
# stemmer --> stemmer
# stemming --> stem
# stemmed --> stem
# stem --> stem
# fishing --> fish
# fished --> fish
# fisher --> fisher
# fish --> fish
# argue --> argu
# argued --> argu
# argues --> argu
# arguing --> argu
# argument --> argument
# arguments --> argument
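The trick of keeping only the bound method works for any class, not just PorterStemmer. Here is a minimal sketch using a made-up ToyStemmer class (hypothetical, not part of NLTK, and not the Porter algorithm; it only strips a few suffixes) to show that the stored method still carries its object along:

```python
class ToyStemmer(object):
    """Hypothetical stand-in for PorterStemmer; strips a few suffixes only."""
    SUFFIXES = ('ing', 'ed', 's')

    def stem(self, word):
        # Remove the first matching suffix, but keep at least 3 letters.
        for suffix in self.SUFFIXES:
            if word.endswith(suffix) and len(word) > len(suffix) + 2:
                return word[:-len(suffix)]
        return word

# Keep only the bound method; it still holds a reference to its object.
toy_stem = ToyStemmer().stem

words = ['cats', 'fishing', 'fished', 'fish']
stems = [toy_stem(w) for w in words]
print(stems)
```

The object created by ToyStemmer() stays alive for as long as toy_stem exists and can be reached through toy_stem.__self__, so nothing is lost by not naming it.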