A common representation is bag of words, which is very high dimensional given a large vocabulary. Commonly used ways to reduce dimensionality in NLP (a short code sketch for each follows the list):
- TF-IDF: term frequency-inverse document frequency, which reweights raw term counts so that terms frequent in a document but rare across the collection carry more weight (link to relevant article)
- Word2Vec / GloVe: These embeddings have become very popular recently. Both are learned from word co-occurrence: Word2Vec trains a shallow neural network to predict a word from its context (or the reverse), while GloVe factorizes a global co-occurrence matrix (** give references). A document embedding is obtained by averaging the embeddings of all words in the document.
- ELMo embeddings: deep contextual embeddings; unlike the static embeddings above, ELMo gives a different embedding for each context a word occurs in
- LSI: Latent Semantic Indexing, which applies a truncated Singular Value Decomposition (SVD) to the term-document matrix
- Topic modeling: techniques such as Latent Dirichlet Allocation (LDA) that find latent topics in a document collection and represent each document as a reduced-dimensional vector of topic strengths
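First, a minimal sketch of bag of words versus TF-IDF using scikit-learn; the toy corpus is made up for illustration. Both produce one dimension per vocabulary term (which is exactly why the representation is so high dimensional); TF-IDF only changes the weights, not the dimensionality.

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

corpus = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "dogs and cats are pets",
]

# Bag of words: one dimension per vocabulary term, raw counts.
bow = CountVectorizer()
X_bow = bow.fit_transform(corpus)        # sparse matrix, shape (3, vocab_size)

# TF-IDF: same dimensionality, but counts are reweighted so that
# terms common across all documents contribute less.
tfidf = TfidfVectorizer()
X_tfidf = tfidf.fit_transform(corpus)

print(X_bow.shape, X_tfidf.shape)
```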
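A minimal sketch of the Word2Vec averaging idea using gensim (4.x API); the corpus and hyperparameters are illustrative assumptions, not recommendations.

```python
import numpy as np
from gensim.models import Word2Vec

docs = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["the", "dog", "chased", "the", "cat"],
]

# Train a small Word2Vec model on the toy corpus.
model = Word2Vec(docs, vector_size=50, window=3, min_count=1, epochs=50)

def doc_embedding(tokens, wv):
    """Average the vectors of all in-vocabulary tokens."""
    vecs = [wv[t] for t in tokens if t in wv]
    return np.mean(vecs, axis=0) if vecs else np.zeros(wv.vector_size)

print(doc_embedding(docs[0], model.wv).shape)  # (50,)
```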
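A hedged sketch of ELMo's contextual behavior using allennlp's ElmoEmbedder (the API assumed here is from allennlp 0.x/1.x and may differ in newer releases; the first call downloads pretrained weights). The point is that the same word, "bank", gets a different vector in each sentence.

```python
from allennlp.commands.elmo import ElmoEmbedder

elmo = ElmoEmbedder()  # default pretrained ELMo weights

bank_river = elmo.embed_sentence(["the", "bank", "of", "the", "river"])
bank_money = elmo.embed_sentence(["the", "bank", "approved", "the", "loan"])

# Output shape is (3 layers, num_tokens, 1024); compare the top-layer
# vector for "bank" (token index 1) across the two contexts.
print(bank_river[2, 1][:5])
print(bank_money[2, 1][:5])
```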
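A minimal sketch of LSI as truncated SVD on a TF-IDF term-document matrix, via scikit-learn's TruncatedSVD; the corpus and the choice of 2 components are illustrative.

```python
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer

corpus = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "dogs and cats are pets",
]

X = TfidfVectorizer().fit_transform(corpus)   # (3, vocab_size)

# Truncated SVD projects each document onto the top singular directions.
lsi = TruncatedSVD(n_components=2)
X_reduced = lsi.fit_transform(X)              # (3, 2)
print(X_reduced.shape)
```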
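Finally, a minimal sketch of topic modeling with scikit-learn's LatentDirichletAllocation; the corpus and topic count are illustrative. Each document comes out as a short vector of topic strengths, which is the reduced-dimensional representation.

```python
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

corpus = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "stocks fell as markets reacted to the earnings report",
]

# LDA operates on raw counts, so use a bag-of-words matrix.
X = CountVectorizer(stop_words="english").fit_transform(corpus)

lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topics = lda.fit_transform(X)   # each row: topic-strength vector
print(doc_topics.shape)             # (3, 2)
```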