Machine Learning Archives - Page 4 of 7 - Ace the Data Science Interview!

You want to find food-related topics on Twitter – how do you go about it?

Posted on February 21, 2019 (updated May 13, 2019) by MLInterview

One can use any of the topic models above to get topics. To steer the topics toward food-related content, specialized topic modeling algorithms are available. However, one simple way to do this is to filter tweets by a limited set of food-related keywords (food, meal, dinner,…
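The keyword filter described above could be sketched as follows. This is a minimal illustration, not the post's own code: the function names are hypothetical, and the keyword set contains only the three keywords visible in the truncated excerpt.

```python
import re

# Seed keywords visible in the excerpt; the post's full list is truncated.
FOOD_KEYWORDS = {"food", "meal", "dinner"}

def is_food_related(tweet, keywords=FOOD_KEYWORDS):
    """Return True if any keyword appears as a whole word in the tweet."""
    tokens = set(re.findall(r"[a-z']+", tweet.lower()))
    return not tokens.isdisjoint(keywords)

def filter_food_tweets(tweets, keywords=FOOD_KEYWORDS):
    """Keep only tweets that mention at least one food keyword."""
    return [t for t in tweets if is_food_related(t, keywords)]

tweets = [
    "Dinner tonight was amazing",
    "Traffic is terrible this morning",
    "Trying a new meal prep routine",
]
food_tweets = filter_food_tweets(tweets)
```

The filtered tweets could then be passed to whichever topic model is in use. Whole-word matching is a design choice here: it avoids matching "foodie" on "food", at the cost of missing variants that are not in the keyword set.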

What is stratified sampling and why is it important?

Posted on February 21, 2019 (updated July 1, 2019) by MLNerds

Stratified sampling is a sampling method in which the population is divided into homogeneous subgroups called strata, and the right number of instances (typically proportional to each stratum's share of the population) is sampled from each stratum. This sampling is important to ensure that the sampled dataset is representative of the entire population. To see this point, consider an example of predicting…
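A minimal sketch of proportional stratified sampling, assuming each item carries a label that defines its stratum. The function name, the `key` and `fraction` parameters, and the toy population are illustrative choices, not part of the original post.

```python
import random
from collections import defaultdict

def stratified_sample(items, key, fraction, seed=0):
    """Sample the same fraction from every stratum defined by key(item)."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for item in items:
        strata[key(item)].append(item)
    sample = []
    for members in strata.values():
        # Take at least one instance per stratum so small groups are not lost.
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))
    return sample

# Toy population: 90 items labelled "a" and 10 labelled "b".
population = [("a", i) for i in range(90)] + [("b", i) for i in range(10)]
sample = stratified_sample(population, key=lambda item: item[0], fraction=0.2)
```

With a 20% fraction, the sample keeps the 90:10 ratio of the population (18 items from "a" and 2 from "b"), whereas a plain random sample of 20 items could easily under-represent the smaller group.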

How do you measure the quality of machine translation?

Posted on February 20, 2019 (updated June 27, 2019) by MLInterview

BLEU (Bilingual Evaluation Understudy) score is the most common metric for evaluating machine translation. Typically, it is used to measure a candidate translation against a set of reference translations available as ground truth. The BLEU score is based on precision – how many of the words in the candidate sentence appear in the reference sentence…
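The precision idea can be illustrated with clipped (modified) unigram precision, which is one ingredient of BLEU. This is a sketch, not a full BLEU implementation: the complete metric also combines higher-order n-gram precisions and applies a brevity penalty. The function name and example sentences are illustrative.

```python
from collections import Counter

def modified_unigram_precision(candidate, references):
    """Clipped unigram precision of a candidate against reference translations."""
    cand_counts = Counter(candidate.split())
    # For each word, the largest count seen in any single reference.
    max_ref_counts = Counter()
    for ref in references:
        for word, count in Counter(ref.split()).items():
            max_ref_counts[word] = max(max_ref_counts[word], count)
    # A candidate word is credited at most as often as it appears in a reference.
    clipped = sum(min(count, max_ref_counts[word])
                  for word, count in cand_counts.items())
    total = sum(cand_counts.values())
    return clipped / total if total else 0.0

score = modified_unigram_precision(
    "the cat sat on the mat",
    ["the cat is on the mat", "there is a cat on the mat"],
)
degenerate = modified_unigram_precision(
    "the the the the the the the",
    ["the cat is on the mat"],
)
```

Clipping is what stops a degenerate candidate that repeats a common word from scoring perfectly: here five of the six candidate words are credited in the first case, and only two of seven in the second.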

Suppose you build word vectors (embeddings) in which each word vector has as many dimensions as the vocabulary size (V) and the feature values are the PPMI (positive PMI) between the corresponding words. What are the problems with this approach and how can you resolve them?

Posted on February 17, 2019 (updated May 2, 2019) by MLNerds

Problems: As the vocabulary size (V) is large, these vectors will be large in size. They will also be sparse, as a word may not have co-occurred with all possible words.

Resolution: Dimensionality reduction using approaches like Singular Value Decomposition (SVD) of the term-document matrix to get a K-dimensional approximation. Other matrix factorisation techniques…
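The SVD step can be sketched in symbols. The notation is introduced here for illustration and is not taken from the post: M is the matrix being factorised, with one row per word and D columns, and U_K, Σ_K, W_K are the factors of its rank-K truncated SVD.

$$
M \approx U_K \Sigma_K W_K^{\top}, \qquad U_K \in \mathbb{R}^{V \times K},\quad \Sigma_K \in \mathbb{R}^{K \times K},\quad W_K \in \mathbb{R}^{D \times K},\quad K \ll V
$$

Keeping only the K largest singular values, the rows of $U_K \Sigma_K$ can serve as dense K-dimensional word vectors in place of the original sparse V-dimensional ones.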

What is negative sampling when training the skip-gram model?

Posted on February 17, 2019 (updated April 4, 2019) by MLNerds

Recap: The skip-gram model tries to represent each word in a large text as a lower-dimensional vector in a space of K dimensions such that similar words are closer to each other. This is achieved by training a feed-forward network to predict the context words given a specific word, i.e., …
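The excerpt is cut off before it reaches negative sampling itself, so the following is only a sketch of the commonly used formulation, not the post's own derivation. Instead of a softmax over the whole vocabulary, each true (center, context) pair is scored against a handful of randomly sampled "noise" words. The function names and toy vectors are illustrative.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def negative_sampling_loss(center, context, negatives):
    """Loss for one (center, context) pair with sampled negative words.

    center:    input vector of the center word
    context:   output vector of the true context word
    negatives: output vectors of randomly sampled noise words
    """
    # Push the true pair toward a high score ...
    loss = -math.log(sigmoid(dot(center, context)))
    # ... and each sampled negative pair toward a low score.
    for neg in negatives:
        loss -= math.log(sigmoid(-dot(center, neg)))
    return loss

center = [1.0, 0.0]
negatives = [[-1.0, 0.0], [0.0, 1.0]]
loss_aligned = negative_sampling_loss(center, [1.0, 0.0], negatives)
loss_opposed = negative_sampling_loss(center, [-1.0, 0.0], negatives)
```

The cost of one update depends on the number of negatives rather than on the vocabulary size, which is the practical motivation for the technique. The loss is lower when the context vector points in the same direction as the center vector than when it points the opposite way.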

What is PMI?

Posted on February 17, 2019 (updated May 2, 2019) by MLNerds

PMI (Pointwise Mutual Information) is a measure of correlation between two events x and y: $\mathrm{PMI}(x, y) = \log \frac{p(x, y)}{p(x)\,p(y)}$. As you can see from the expression, the ratio inside the logarithm is directly proportional to the number of times both events occur together and inversely proportional to the individual counts, which are in the denominator. This expression ensures high…
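A minimal sketch of estimating PMI from raw counts, using the standard definition PMI(x, y) = log(p(x, y) / (p(x) p(y))). The function names, the use of the natural logarithm (base 2 is also common), and the example counts are choices made here for illustration.

```python
import math

def pmi(count_xy, count_x, count_y, total):
    """PMI(x, y) = log( p(x, y) / (p(x) * p(y)) ), estimated from counts."""
    if count_xy == 0:
        return float("-inf")  # the two events never occur together
    p_xy = count_xy / total
    p_x = count_x / total
    p_y = count_y / total
    return math.log(p_xy / (p_x * p_y))

def ppmi(count_xy, count_x, count_y, total):
    """Positive PMI: negative values are clipped to zero."""
    return max(0.0, pmi(count_xy, count_x, count_y, total))

# x and y each occur 100 times in 1000 observations.
associated = pmi(50, 100, 100, 1000)    # co-occur 5x more often than chance
independent = pmi(10, 100, 100, 1000)   # co-occur exactly as often as chance
never = ppmi(0, 100, 100, 1000)
```

A pair that co-occurs as often as chance predicts has a PMI of about zero, a pair that co-occurs more often has a positive PMI, and the positive variant maps pairs that never co-occur to zero instead of negative infinity.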

What is the complexity of the Viterbi algorithm?

Posted on February 16, 2019 (updated February 19, 2019) by MLNerds

The Viterbi algorithm is a dynamic programming approach to finding the most probable sequence of hidden states given the observed data, as modeled by an HMM. Without dynamic programming, it becomes an exponential problem, as there is an exponential number of possible sequences for a given observation (how is explained in the answer below). Let the transition probabilities (state transition)…
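A compact sketch of the Viterbi recursion, assuming a non-empty observation sequence and probabilities stored in nested dictionaries. The function signature and the toy two-state model are illustrative and not taken from the post. For T observations and N states, the two nested loops over states give a running time of O(T · N²).

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Most probable state sequence for the observations under an HMM."""
    # best[t][s]: probability of the best path that ends in state s at step t
    best = [{s: start_p[s] * emit_p[s][observations[0]] for s in states}]
    back = [{}]
    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:  # N choices for the current state
            # N choices for the previous state
            prev, score = max(
                ((p, best[t - 1][p] * trans_p[p][s]) for p in states),
                key=lambda pair: pair[1],
            )
            best[t][s] = score * emit_p[s][observations[t]]
            back[t][s] = prev
    # Trace back from the best final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path, best[-1][last]

# Toy model (hypothetical numbers).
states = ("Healthy", "Fever")
start_p = {"Healthy": 0.6, "Fever": 0.4}
trans_p = {"Healthy": {"Healthy": 0.7, "Fever": 0.3},
           "Fever": {"Healthy": 0.4, "Fever": 0.6}}
emit_p = {"Healthy": {"normal": 0.5, "cold": 0.4, "dizzy": 0.1},
          "Fever": {"normal": 0.1, "cold": 0.3, "dizzy": 0.6}}
path, prob = viterbi(("normal", "cold", "dizzy"), states, start_p, trans_p, emit_p)
```

For long sequences, implementations usually work with log-probabilities to avoid numerical underflow; plain products are kept here for readability.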

Suppose you are modeling text with an HMM. What is the complexity of finding the most probable sequence of tags or states from a sequence of text using a brute-force algorithm?

Posted on February 16, 2019 (updated March 7, 2019) by MLNerds

Assume there are $N$ states in total and let $T$ be the length of the longest sequence. Think about how we generate text using an HMM. We first have a state sequence, and from each state we emit an output. From each state, any word out of $V$ possible outcomes can be generated. Since there are $N$ states, at each possible…
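The brute-force search can be sketched by enumerating every state sequence. The notation N for the number of states and T for the sequence length is assumed here, and the function name and toy model are illustrative. The point of the sketch is that the number of candidate sequences grows as N to the power T.

```python
from itertools import product

def brute_force_decode(observations, states, start_p, trans_p, emit_p):
    """Score every possible state sequence and keep the best one."""
    best_path, best_prob, checked = None, -1.0, 0
    # product(...) yields all N**T state sequences of length T.
    for path in product(states, repeat=len(observations)):
        prob = start_p[path[0]] * emit_p[path[0]][observations[0]]
        for t in range(1, len(observations)):
            prob *= trans_p[path[t - 1]][path[t]] * emit_p[path[t]][observations[t]]
        checked += 1
        if prob > best_prob:
            best_path, best_prob = list(path), prob
    return best_path, best_prob, checked

# Toy model (hypothetical numbers).
states = ("Healthy", "Fever")
start_p = {"Healthy": 0.6, "Fever": 0.4}
trans_p = {"Healthy": {"Healthy": 0.7, "Fever": 0.3},
           "Fever": {"Healthy": 0.4, "Fever": 0.6}}
emit_p = {"Healthy": {"normal": 0.5, "cold": 0.4, "dizzy": 0.1},
          "Fever": {"normal": 0.1, "cold": 0.3, "dizzy": 0.6}}
path, prob, n_checked = brute_force_decode(
    ("normal", "cold", "dizzy"), states, start_p, trans_p, emit_p
)
```

With 2 states and 3 observations only 8 sequences are scored, but with, say, 10 states and 20 observations the count would already be 10^20, which is why dynamic programming is used in practice.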

How do you train an HMM in practice?

Posted on February 16, 2019 (updated March 7, 2019) by MLNerds

The joint probability distribution for the HMM is given by the following equation, where $x_1, \dots, x_T$ are the observed data points and $z_1, \dots, z_T$ the corresponding latent states: $p(x_1, \dots, x_T, z_1, \dots, z_T) = p(z_1)\,p(x_1 \mid z_1) \prod_{t=2}^{T} p(z_t \mid z_{t-1})\,p(x_t \mid z_t)$. Before proceeding to answer the question on training an HMM, it makes sense to ask the following questions: What is the problem at hand for which we are training…
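The excerpt is truncated before it describes a training procedure, so the following only sketches one common case under a stated assumption: the latent states are observed in the training data (for example, text with tags), so the parameters can be estimated by relative-frequency counting. The function name and toy data are illustrative, and no smoothing is applied.

```python
from collections import Counter, defaultdict

def train_supervised_hmm(tagged_sequences):
    """Estimate HMM parameters by relative-frequency counting.

    tagged_sequences: list of sequences of (observation, state) pairs.
    Returns (start_p, trans_p, emit_p) as dicts of probabilities.
    """
    start, trans, emit = Counter(), defaultdict(Counter), defaultdict(Counter)
    for seq in tagged_sequences:
        prev = None
        for obs, state in seq:
            if prev is None:
                start[state] += 1        # first state of the sequence
            else:
                trans[prev][state] += 1  # state-to-state transition
            emit[state][obs] += 1        # state-to-observation emission
            prev = state

    def normalise(counter):
        total = sum(counter.values())
        return {k: v / total for k, v in counter.items()}

    return (
        normalise(start),
        {s: normalise(c) for s, c in trans.items()},
        {s: normalise(c) for s, c in emit.items()},
    )

data = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("a", "DET"), ("cat", "NOUN")],
    [("dogs", "NOUN"), ("bark", "VERB")],
]
start_p, trans_p, emit_p = train_supervised_hmm(data)
```

When the states are not observed, the usual alternative is an expectation-maximisation procedure (Baum-Welch), which is not shown here.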

What are the different independence assumptions in an HMM and Naive Bayes?

Posted on February 16, 2019 (updated March 7, 2019) by MLNerds

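Both the HMM and Naive Bayes make a conditional independence assumption. The HMM can be expressed by the equations below: $p(z_t \mid z_1, \dots, z_{t-1}) = p(z_t \mid z_{t-1})$ and $p(x_t \mid z_1, \dots, z_t, x_1, \dots, x_{t-1}) = p(x_t \mid z_t)$. The second equation implies a conditional independence assumption: given the state $z_t$, the observed variable $x_t$ is conditionally independent of the previous observed variables, i.e. $x_t \perp x_1, \dots, x_{t-1} \mid z_t$. The Naive Bayes model is expressed as $p(y, x_1, \dots, x_n) = p(y) \prod_{i=1}^{n} p(x_i \mid y)$, where $x_i$ is the feature…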
