What are Isolation Forests? How to use them for Anomaly Detection?

Most of us know random forests, one of the most popular ML models. They are a supervised learning algorithm used in a wide variety of applications for classification and regression. Can we use random forests in an unsupervised setting (where we have no labeled data)? Isolation forests are a variation of random forests that can…

What is One-Class SVM? How to use it for anomaly detection?

One-class SVM is a variation of the SVM that can be used in an unsupervised setting for anomaly detection. Let’s say we are analyzing credit card transactions to identify fraud. We are likely to have many normal transactions and very few fraudulent ones. Also, the next fraudulent transaction might be completely different from all previous…

Can we use the AUC Metric for an SVM Classifier?

What is AUC? AUC is the area under the ROC curve. It is a widely used classification metric. Classifiers such as logistic regression and naive Bayes predict class probabilities as the outcome instead of predicting the labels themselves. A new data point is classified as positive if the predicted probability of positive class…
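AUC depends only on how a classifier ranks examples, so any real-valued score can stand in for a predicted probability. The sketch below is a minimal illustration, assuming an SVM's signed distance to the separating hyperplane is used as that score. The function name `auc_from_scores` and the margin values are made up for the example.

```python
def auc_from_scores(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical labels and SVM margins (signed distances, not probabilities).
labels = [0, 0, 1, 1]
margins = [-1.2, 0.3, -0.1, 2.0]

auc = auc_from_scores(labels, margins)
print(auc)  # 3 of the 4 positive/negative pairs are ordered correctly
```

The pairwise loop is quadratic in the number of examples, which is fine for illustration. A rank-based computation would be preferable on large data sets.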

When are deep learning algorithms more appropriate compared to traditional machine learning algorithms?

Deep learning algorithms are capable of learning arbitrarily complex non-linear functions by using a deep enough and wide enough network with an appropriate non-linear activation function. Traditional ML algorithms often require feature engineering, that is, finding the subset of meaningful features to use. Deep learning algorithms often avoid the need for the feature engineering step…

Why do you typically see overflow and underflow when implementing ML algorithms?

A common pre-processing step is to normalize/rescale inputs so that they are not too high or low. However, even on normalized inputs, overflows and underflows can occur. Underflow: Computing a joint probability often involves multiplying small individual probabilities. Many probabilistic algorithms multiply the probabilities of individual data points, which leads to underflow. Example: Suppose you…
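The underflow described above can be reproduced in a few lines. This is a minimal sketch with made-up numbers (1000 data points, each with probability 1e-5). It also shows the usual workaround of summing log-probabilities instead of multiplying probabilities.

```python
import math

# Hypothetical per-point probabilities: 1000 points, each with probability 1e-5.
probs = [1e-5] * 1000

# Naive joint probability: the true value is 1e-5000, far below the smallest
# positive double, so the running product underflows to exactly 0.0.
naive = 1.0
for p in probs:
    naive *= p

# Working in log space keeps the quantity representable: sum logs instead.
log_joint = sum(math.log(p) for p in probs)

print(naive)      # 0.0
print(log_joint)  # roughly -11512.9
```

Comparing models by `log_joint` gives the same ordering as comparing the joint probabilities themselves, because the logarithm is monotonic.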

Is the run-time of an ML algorithm important? How do I evaluate whether the run-time is OK?

Runtime considerations are important for many applications. Typically you should look at both training time and prediction time for an ML algorithm. Some common questions to ask include: Training: Do you want to train the algorithm in a batch mode? How often do you need to train? If you need to retrain your algorithm every…

How do you handle missing data in an ML algorithm?

Missing data arises either from issues in data collection or because the data model itself allows for missing values (for instance, the field ‘maximum credit limit on any of your cards’ might not make sense for someone who has no credit cards…). With missing data, typically the ML algorithm implementation might fail with…
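One common way to keep an implementation from failing on missing values is to impute them before training. The sketch below is only an illustration of that idea, not necessarily the approach the post recommends. It assumes missing entries are represented as `None`, fills them with the column mean, and keeps an indicator so a model can still tell which values were absent. The helper name `impute_mean` and the sample values are hypothetical.

```python
from statistics import mean


def impute_mean(column):
    """Replace None entries with the mean of the observed values.

    Returns the filled column and a parallel list of 0/1 flags marking
    which entries were originally missing.
    """
    observed = [v for v in column if v is not None]
    if not observed:
        raise ValueError("cannot impute a column with no observed values")
    fill = mean(observed)
    filled = [fill if v is None else v for v in column]
    was_missing = [1 if v is None else 0 for v in column]
    return filled, was_missing


# Hypothetical 'maximum credit limit' column; None marks a customer with no cards.
credit_limits = [5000.0, None, 12000.0, 7000.0]

imputed, missing_flag = impute_mean(credit_limits)
print(imputed)       # [5000.0, 8000.0, 12000.0, 7000.0]
print(missing_flag)  # [0, 1, 0, 0]
```

When a value is missing because the field does not apply, as in the credit card example, the indicator column often carries more information than the imputed number itself.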

With the maximum likelihood estimate, are we guaranteed to find a global optimum?

The maximum likelihood estimate finds the parameter values that maximize the likelihood. If the likelihood is strictly concave (or the negative of the likelihood is strictly convex), we are guaranteed to find a unique optimum. This is usually not the case, and we end up finding a local optimum. Hence, the maximum likelihood estimate usually finds a local…