Suppose you build word vectors (embeddings) where each vector has as many dimensions as the vocabulary size (V) and each feature value is the PPMI (positive pointwise mutual information) between the corresponding pair of words. What are the problems with this approach, and how can you resolve them?

Posted on February 17, 2019 (updated May 2, 2019) by MLNerds

Problems

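Because the vocabulary size (V) is large, every word vector has V entries, so the full set of vectors forms a V x V matrix that is expensive to store and to compute with.
The vectors are sparse: most pairs of words never co-occur, so most entries are zero.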
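
To make both problems concrete, here is a minimal sketch that builds a PPMI matrix from a tiny made-up corpus. The function name ppmi_matrix, the window size of 2 and the three toy sentences are assumptions chosen for illustration, not part of the original question.

```python
import numpy as np

def ppmi_matrix(sentences, window=2):
    """Return (vocab, PPMI matrix) for tokenized sentences, using a symmetric window."""
    vocab = sorted({w for s in sentences for w in s})
    index = {w: i for i, w in enumerate(vocab)}
    counts = np.zeros((len(vocab), len(vocab)))
    # Count how often each word appears within `window` positions of another word.
    for s in sentences:
        for i, w in enumerate(s):
            lo, hi = max(0, i - window), min(len(s), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[index[w], index[s[j]]] += 1
    total = counts.sum()
    p_w = counts.sum(axis=1, keepdims=True) / total   # marginal over rows
    p_c = counts.sum(axis=0, keepdims=True) / total   # marginal over columns
    p_wc = counts / total                             # joint probability
    with np.errstate(divide="ignore", invalid="ignore"):
        pmi = np.log2(p_wc / (p_w * p_c))
    pmi[~np.isfinite(pmi)] = 0.0      # pairs that never co-occur get 0
    return vocab, np.maximum(pmi, 0.0)  # PPMI clips negative PMI to 0

# Toy corpus (assumed for illustration only).
corpus = [
    "the cat sat on the mat".split(),
    "the dog sat on the log".split(),
    "cats and dogs are pets".split(),
]
vocab, ppmi = ppmi_matrix(corpus)
sparsity = float(np.mean(ppmi == 0))
print(ppmi.shape, f"{sparsity:.0%} of entries are zero")
```

Even with a vocabulary this small, the matrix is V x V and most of its entries are zero; with a realistic vocabulary both effects become far more pronounced.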

Resolution

Apply dimensionality reduction, using approaches such as:
1. Singular Value Decomposition (SVD) of the V x V word-word PPMI matrix, keeping only the top K singular values to get a K-dimensional representation of each word (with K much smaller than V).
2. Other matrix factorisation techniques.
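
The SVD option can be sketched as follows. This is an illustration under stated assumptions, not a reference implementation: svd_embeddings is a hypothetical helper, the random matrix is only a stand-in for a real PPMI matrix, and scaling the left singular vectors by the singular values is one common choice among several.

```python
import numpy as np

def svd_embeddings(ppmi, k):
    """Return a (V, k) array of dense word vectors from a (V, V) PPMI matrix."""
    # ppmi = U @ diag(s) @ Vt, with singular values in s sorted in descending order.
    U, s, Vt = np.linalg.svd(ppmi, full_matrices=False)
    # Keep the top k directions; each row is one word's k-dimensional vector.
    return U[:, :k] * s[:k]

# Stand-in for a PPMI matrix (assumption): non-negative and mostly zero.
rng = np.random.default_rng(0)
dense = rng.random((50, 50))
toy_ppmi = np.where(dense > 0.9, dense, 0.0)

emb = svd_embeddings(toy_ppmi, k=5)
print(emb.shape)
```

For a realistic vocabulary, a full dense SVD is usually too expensive; a sparse truncated solver such as scipy.sparse.linalg.svds is the kind of tool one would likely reach for instead.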

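Possible follow-up question: What information is lost in approximating a V-dimensional word representation with a K-dimensional one? Answer: Truncated SVD gives the best possible rank-K approximation of the PPMI matrix in the least-squares (Frobenius norm) sense. What is lost is the part of the matrix associated with the discarded (smallest) singular values; the squared reconstruction error equals the sum of their squares.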
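
The size of the loss can be checked numerically. The sketch below uses a random matrix as a stand-in (an assumption; any real matrix would do) and compares the Frobenius-norm reconstruction error of the rank-K approximation with the discarded singular values.

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.random((30, 30))   # stand-in for a PPMI matrix (assumption)
k = 5

U, s, Vt = np.linalg.svd(M, full_matrices=False)
M_k = (U[:, :k] * s[:k]) @ Vt[:k, :]   # rank-k reconstruction

error = np.linalg.norm(M - M_k, "fro")          # what the approximation misses
discarded = np.sqrt(np.sum(s[k:] ** 2))         # energy in the dropped singular values
retained = np.sum(s[:k] ** 2) / np.sum(s ** 2)  # fraction of squared energy kept

print(error, discarded, retained)
```

The two error figures agree up to floating-point precision, and the retained fraction gives a rough way to choose K.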
