Natural Language Processing – Machine Learning Interviews

Skip or Residual Connections in Deep Networks

Posted on March 1, 2024 by MLNerds

Explain Latent Dirichlet Allocation – where is it typically used?

Posted on February 10, 2019 (updated February 14, 2019) by MLNerds

Latent Dirichlet Allocation (LDA) is a probabilistic model that represents each document as a multinomial mixture of topics and each topic as a multinomial distribution over words. Each of these multinomials has a Dirichlet prior. The goal is to learn these multinomial proportions using probabilistic inference techniques based on the observed data, which is the words/content…
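The generative story behind that description can be sketched in a few lines of standard-library Python. This is only an illustration under stated assumptions: the function names, the symmetric hyperparameters `alpha` and `beta`, and the toy vocabulary are all made up for the example, and the sketch only samples documents from the model. Inference, which the excerpt names as the actual goal, runs in the opposite direction (recovering the proportions from observed words) and is not shown.

```python
import random

def sample_dirichlet(alpha, rng):
    """Draw one Dirichlet sample by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]

def generate_corpus(vocab, n_topics, n_docs, doc_len, alpha=0.5, beta=0.5, seed=0):
    """Sample a toy corpus from the LDA generative process.

    alpha and beta are assumed symmetric Dirichlet hyperparameters
    (illustrative values, not taken from the post).
    """
    rng = random.Random(seed)
    # Each topic is a distribution over words, drawn from a Dirichlet prior.
    topic_word = [sample_dirichlet([beta] * len(vocab), rng) for _ in range(n_topics)]
    docs, doc_topic = [], []
    for _ in range(n_docs):
        # Each document is a mixture of topics, also drawn from a Dirichlet prior.
        theta = sample_dirichlet([alpha] * n_topics, rng)
        words = []
        for _ in range(doc_len):
            z = rng.choices(range(n_topics), weights=theta)[0]   # pick a topic
            w = rng.choices(vocab, weights=topic_word[z])[0]     # pick a word from it
            words.append(w)
        docs.append(words)
        doc_topic.append(theta)
    return docs, doc_topic, topic_word

vocab = ["goal", "match", "team", "vote", "party", "election"]
docs, doc_topic, topic_word = generate_corpus(vocab, n_topics=2, n_docs=3, doc_len=8)
```

Here `doc_topic[i]` holds the topic proportions of document `i` and `topic_word[k]` the word proportions of topic `k`; these are the hidden quantities that an inference procedure would try to estimate from `docs` alone.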

You are trying to cluster documents using a Bag of Words method. Typically, words like "if", "of", "is" and so on are not great features. How do you make sure you are leveraging the more informative words during feature engineering?

Posted on February 9, 2019 by MLNerds

Words like "if", "of", … are called stop words. Typical pre-processing in a standard NLP pipeline involves identifying and removing stop words (except in some cases where context or word-adjacency information is important). Common techniques for handling stop words include: TF-IDF (term frequency, inverse document frequency) weighting, which down-weights words that occur in most documents; leveraging manually curated stop word lists and eliminating…
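To make the TF-IDF idea concrete, here is a minimal sketch using only the Python standard library. It assumes the plain, unsmoothed variant (term frequency times log(N / document frequency)); library implementations typically use smoothed formulas, so their numbers will differ. The function name and the toy documents are invented for the example.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Uses the unsmoothed form tf * log(N / df), chosen here for clarity.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights

docs = [
    "the cat is on the mat".split(),
    "the dog is in the yard".split(),
    "the stock market is volatile".split(),
]
weights = tf_idf(docs)
```

In this toy corpus "the" and "is" occur in every document, so their inverse document frequency is log(1) = 0 and they contribute nothing to the feature vector, while a word like "cat" that occurs in a single document keeps a positive weight. Strictly speaking, TF-IDF down-weights such words rather than removing them.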
