If words are not all converted to a single case, the vocabulary can grow considerably, because pairs such as Up/up, Fast/fast, or This/this are treated as distinct tokens, which is usually not the behaviour you want for an NLP task. Sparsity also increases when building a language model, since "the cat" is counted separately from "The cat". Suppose…
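The effect described above can be seen in a minimal sketch. The toy corpus, the whitespace tokenizer, and the helper names `build_vocab` and `bigram_counts` are all assumptions made for illustration; a real pipeline would likely use a proper tokenizer and a much larger corpus.

```python
from collections import Counter

# Hypothetical toy corpus, invented purely for illustration.
corpus = [
    "The cat sat on the mat",
    "the cat ran Fast",
    "This cat is fast",
    "Up and up this goes",
]


def tokenize(sentence, lowercase):
    """Split on whitespace, optionally lowercasing the text first."""
    text = sentence.lower() if lowercase else sentence
    return text.split()


def build_vocab(sentences, lowercase):
    """Count how often each token occurs across all sentences."""
    counts = Counter()
    for sentence in sentences:
        counts.update(tokenize(sentence, lowercase))
    return counts


def bigram_counts(sentences, lowercase):
    """Count adjacent token pairs, as a simple bigram language model would."""
    counts = Counter()
    for sentence in sentences:
        tokens = tokenize(sentence, lowercase)
        counts.update(zip(tokens, tokens[1:]))
    return counts


cased = build_vocab(corpus, lowercase=False)
uncased = build_vocab(corpus, lowercase=True)
print(len(cased), len(uncased))  # 16 12

cased_bigrams = bigram_counts(corpus, lowercase=False)
uncased_bigrams = bigram_counts(corpus, lowercase=True)
# Without lowercasing, the evidence for "the cat" is split across two entries.
print(cased_bigrams[("The", "cat")], cased_bigrams[("the", "cat")])  # 1 1
print(uncased_bigrams[("the", "cat")])  # 2
```

On this small example the cased vocabulary has 16 entries against 12 after lowercasing, and the bigram "the cat" is seen twice instead of once in each of two separate entries. Lowercasing is not always the right choice, though: it can discard useful signal where case carries meaning, such as in proper nouns or acronyms.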
What will happen if you do not convert all characters to a single case (either lower or upper) during the pre-processing step of an NLP algorithm?