- Stemming replaces each word with its stem by stripping suffixes such as “es”, “ies”, and “s”. For example, “cats” => “cat” and “computers” => “computer”. It is a heuristic approach that does not use any grammar or dictionary, so the result is not always a real word.
- Lemmatisation has the same purpose as stemming, but it replaces each word with its dictionary form (its lemma) instead of relying on heuristics. It therefore needs a database of words, such as a dictionary.
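
To make the contrast concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the suffix rules, the tiny `LEMMA_DICT`, and the function names `toy_stem` and `toy_lemmatize` are hypothetical and are not taken from any real library. Real stemmers (for example the Porter stemmer) use far more elaborate rules, and real lemmatisers use a full lexical database.

```python
# Toy suffix rules, checked in order. This is NOT the Porter algorithm,
# just an illustration of heuristic suffix stripping.
SUFFIX_RULES = [("ies", "y"), ("es", ""), ("s", "")]

# Tiny hypothetical lemma dictionary standing in for a real word database.
LEMMA_DICT = {
    "cats": "cat",
    "computers": "computer",
    "houses": "house",
    "mice": "mouse",
}


def toy_stem(word):
    """Strip the first matching suffix; no dictionary or grammar involved."""
    word = word.lower()
    for suffix, replacement in SUFFIX_RULES:
        # Require at least two characters to remain before the suffix.
        if word.endswith(suffix) and len(word) > len(suffix) + 1:
            return word[: -len(suffix)] + replacement
    return word


def toy_lemmatize(word):
    """Look the word up in the dictionary; fall back to the word itself."""
    word = word.lower()
    return LEMMA_DICT.get(word, word)


if __name__ == "__main__":
    for w in ["cats", "computers", "houses", "mice"]:
        print(w, toy_stem(w), toy_lemmatize(w))
    # cats cat cat
    # computers computer computer
    # houses hous house
    # mice mice mouse
```

With these toy rules, the stemmer handles “cats” and “computers” but turns “houses” into the non-word “hous” and leaves the irregular plural “mice” untouched, while the dictionary lookup returns “house” and “mouse”. The price is that the lemmatiser only knows the words in its dictionary.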
What is the difference between stemming and lemmatisation?