Top 50 Machine Learning Interview Questions

Whether you are just starting your interview preparation or wrapping it up and looking for final touches, here are over 50 must-see questions to prepare for a data science interview. We have grouped them into five categories for convenience. (Note: There are several more questions, along with answers, under “Interview Questions and Answers” in the main menu.)

Basic Data Science Questions

How do you evaluate a Machine Learning algorithm ?
Why do you need a training set, a validation set and a test set ?
What is the bias-variance trade-off in Machine Learning ?
What is the difference between supervised and unsupervised learning ?
When are deep learning algorithms more appropriate compared to traditional machine learning algorithms?
What are popular clustering algorithms? How do you fix the number of clusters in a clustering algorithm ?
Why do you need dimensionality reduction ? What are some ways to do this ?
What is regularization ? What types of regularizers do you know ?
What are the various steps in the typical Machine Learning pipeline ?
What data cleanup and normalization is required in a typical Machine Learning pipeline ?
What is overfitting and underfitting ? Give examples. How do you overcome them?

Basic Data Cleanup/Wrangling Questions

How do you deal with missing data ?
How do you detect outliers in data ? How do you deal with them ?
What pre-processing can you do when you have an imbalanced dataset ?
When and how do you do feature scaling ?
When do you need to normalize your data to have zero mean and unit variance ?
What do you do when you have very little training data ?
You realize you have duplicates in your data – what do you do ?
You have 10,000 features. How do you figure out if you need all of them ?
You have a column that contains colors such as “red”, “blue”,… how do you handle this column ?
You have one file with person ID, eye color, ethnicity, height and weight, and another file with person ID, salary and family size. How do you combine them into a single file using pandas ?
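
As a starting point for the last question, here is a minimal sketch of joining the two tables on person ID. The column names and sample values are made up for illustration, and in practice each table would be loaded from its file (for example with `pd.read_csv`) rather than built inline.

```python
import pandas as pd

# Hypothetical stand-ins for the two files; column names are assumptions.
people = pd.DataFrame({
    "person_id": [1, 2, 3],
    "eye_color": ["brown", "blue", "green"],
    "ethnicity": ["A", "B", "C"],
    "height_cm": [170, 182, 165],
    "weight_kg": [68, 80, 59],
})
finance = pd.DataFrame({
    "person_id": [2, 3, 4],
    "salary": [52000, 61000, 47000],
    "family_size": [3, 2, 5],
})

# Inner join: keep only the people who appear in both tables.
combined = people.merge(finance, on="person_id", how="inner")

# Outer join: keep everyone; the _merge column records where each row came from.
combined_all = people.merge(finance, on="person_id", how="outer", indicator=True)

# Write the result out as a single file (file name is hypothetical).
# combined.to_csv("combined.csv", index=False)
```

Which `how` to use depends on what should happen to people who are missing from one of the files, which is worth discussing in the interview.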

Basic Deep Learning Questions

What is machine learning and where does deep learning fit in ?
What are the different loss functions you can use in Deep Learning ? How do you pick one ?
What is dropout ?
What are the different forms of regularizers used in deep learning ?
What are the learning algorithms you are aware of ?
How do you typically initialize weights in a deep neural network ?
How do you fix the number of layers & hidden units in a deep neural network ?
What are the differences between Keras, TensorFlow and PyTorch ?
What is the purpose of an LSTM ? Why do you need a bi-directional LSTM ?
What is the difference between a GRU and a bi-directional LSTM ?
What is an attention mechanism ? What are some examples where it is used ?

Basic NLP Questions

What are stop words ? How do we remove them ?
What are various ways of finding word embeddings ?
Explain the skip-gram model and word2vec embeddings.
How do you determine if two sentences are similar ?
How would you approach sentiment analysis on Twitter data ?
I want to find topics in a set of documents. What models can I use ?
What is perplexity ?
How do you measure the effectiveness of a model for spam filtering ? Note that this is a highly imbalanced problem.
What are popular Python libraries you will use for NLP ?
What is stemming and lemmatization ?
What are the different ways in which you can represent a document ?

Basic Math Questions for Data Science

What is a valid probability distribution ?
Explain Bayes' rule.
What is a maximum likelihood estimate ?
What is joint probability and what is conditional probability ?
What are eigenvalues and eigenvectors ? Why do we care about them ?
What is the central limit theorem ?
You are given a function and a data point. How will you determine whether this point is a maximizer, a minimizer, or neither ?
What is the difference between MLE and MAP estimates ?
What is the difference between global optima and local optima ?
What is a convex function ? Why do we care about convexity ?
What is bias ? How do you know if an estimator is biased ?
What is a Cumulative Distribution Function ?
