Fairness in ML: How to deal with bias in ML pipelines?

In this 30-minute video, we talk about bias and fairness in ML workflows: why we need to handle fairness in ML models, how biases typically creep into the ML pipeline, how to measure these biases, and how to rectify the biases in our pipeline. The video closes with a use case on word embeddings.

Naive Bayes Classifier: Advantages and Disadvantages

Recap: Naive Bayes Classifier. The Naive Bayes classifier is a popular classification model based on Bayes' rule. The classifier is called naive because it makes the simplifying assumption that the features are conditionally independent given the class label. In other words, the naive assumption is: P(datapoint | class) = P(feature_1 | class) *…
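The naive assumption above can be illustrated with a minimal sketch. The class names, priors, and per-feature likelihoods below are made-up numbers for illustration only, and the helper names `naive_likelihood` and `posterior` are hypothetical, not taken from the post.

```python
from math import prod

def naive_likelihood(feature_likelihoods):
    """P(datapoint | class) under the naive assumption:
    the product of the per-feature terms P(feature_i | class)."""
    return prod(feature_likelihoods)

def posterior(priors, likelihoods):
    """Apply Bayes' rule: P(class | datapoint) is proportional to
    P(class) * P(datapoint | class), then normalize over classes."""
    joint = {c: priors[c] * naive_likelihood(likelihoods[c]) for c in priors}
    total = sum(joint.values())
    return {c: joint[c] / total for c in joint}

# Illustrative (invented) numbers: two classes, two features.
priors = {"spam": 0.4, "ham": 0.6}
likelihoods = {
    "spam": [0.8, 0.5],  # P(feature_1 | spam), P(feature_2 | spam)
    "ham": [0.1, 0.5],   # P(feature_1 | ham),  P(feature_2 | ham)
}
post = posterior(priors, likelihoods)
predicted = max(post, key=post.get)
```

Because each class needs only one likelihood term per feature, the number of quantities to estimate grows linearly with the number of features, which is the practical payoff of the independence assumption.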

Evaluation Metrics for Recommendation Systems

This video explores how one can evaluate recommender systems. Evaluating a recommender system involves checking (1) whether the right results are being recommended and (2) whether more relevant results are recommended at the top, ahead of less relevant ones. There are two popular types of recommender systems: explicit feedback recommender systems and implicit feedback recommender…
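The two evaluation questions above map onto two families of metrics. As a hedged sketch, precision@k addresses whether the right results are recommended, and NDCG@k addresses whether relevant results appear near the top. The excerpt does not name specific metrics, so the choice of these two, the binary relevance setup, and the toy rankings are assumptions for illustration.

```python
from math import log2

def precision_at_k(ranked_items, relevant, k):
    """Fraction of the top-k recommended items that are relevant."""
    top_k = ranked_items[:k]
    return sum(1 for item in top_k if item in relevant) / k

def ndcg_at_k(ranked_items, relevant, k):
    """Binary-relevance NDCG: a relevant item at 0-based position `pos`
    contributes 1 / log2(pos + 2), so hits near the top count more.
    The score is normalized by the best achievable ordering."""
    dcg = sum(
        1.0 / log2(pos + 2)
        for pos, item in enumerate(ranked_items[:k])
        if item in relevant
    )
    ideal_hits = min(len(relevant), k)
    idcg = sum(1.0 / log2(pos + 2) for pos in range(ideal_hits))
    return dcg / idcg if idcg > 0 else 0.0

# Toy data: the same four items, ranked well and ranked poorly.
relevant = {"a", "c"}
good_ranking = ["a", "c", "b", "d"]
poor_ranking = ["b", "d", "a", "c"]

p_good = precision_at_k(good_ranking, relevant, 2)
p_poor = precision_at_k(poor_ranking, relevant, 2)
ndcg_good = ndcg_at_k(good_ranking, relevant, 4)
ndcg_poor = ndcg_at_k(poor_ranking, relevant, 4)
```

Both rankings contain the same relevant items within the top 4, so precision@4 cannot tell them apart, while NDCG@4 penalizes the ranking that buries the relevant items.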

Target Encoding for Categorical Features

This video describes target encoding for categorical features, which is more efficient and more effective than the popular one-hot encoding in several use cases. Recap: Categorical Features and One-hot Encoding. Categorical features are variables that take one of a discrete set of values. For instance: color, which could take one of {red, blue, green}, or city, which can take…
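A minimal sketch of the idea, assuming the simplest form of target encoding: each category is replaced by the mean of the target over the rows with that category. The function names, the toy `colors` and `clicked` data, and the global-mean fallback for unseen categories are illustrative assumptions, not details from the video.

```python
from collections import defaultdict

def fit_target_encoding(categories, targets):
    """Learn the mean target value for each category, plus the global
    mean to use as a fallback for categories not seen during fitting."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for category, target in zip(categories, targets):
        sums[category] += target
        counts[category] += 1
    mapping = {c: sums[c] / counts[c] for c in sums}
    global_mean = sum(targets) / len(targets)
    return mapping, global_mean

def transform(categories, mapping, global_mean):
    """Replace each category with its learned mean target value."""
    return [mapping.get(c, global_mean) for c in categories]

# Toy data: one categorical feature and a binary target.
colors = ["red", "blue", "red", "green", "blue", "red"]
clicked = [1, 0, 1, 0, 1, 0]

mapping, global_mean = fit_target_encoding(colors, clicked)
encoded = transform(["red", "blue", "purple"], mapping, global_mean)
```

The feature stays a single numeric column regardless of how many distinct categories exist, whereas one-hot encoding adds one column per category. In practice the means are usually computed on training data only (often with smoothing or cross-fitting) so that the target does not leak into the encoded feature.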

Bias in Machine Learning: Types of Data Biases

Bias in machine learning models can often lead to unexpected outcomes. In this brief video, we look at the different ways we might end up building biased ML models, with particular emphasis on societal biases such as gender, race and age. Why do we care about Societal Bias in ML Models? Consider an ML model…

What is AUC: Area Under the Curve?

What is AUC? AUC is the area under the ROC curve. It is a widely used classification metric. Classifiers such as logistic regression and naive Bayes predict class probabilities as the outcome instead of predicting the labels themselves. A new data point is classified as positive if the predicted probability of positive class…
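One way to compute AUC without drawing the ROC curve is to use its ranking interpretation: the area under the ROC curve equals the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as half. The sketch below assumes binary labels in {0, 1}; the function name `auc` and the sample scores are illustrative.

```python
def auc(labels, scores):
    """AUC via pairwise comparison of positive and negative scores.
    Each (positive, negative) pair counts 1 if the positive is scored
    higher, 0.5 on a tie, and 0 otherwise."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))

labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]   # predicted probability of the positive class
value = auc(labels, scores)      # 3 of the 4 positive/negative pairs are ordered correctly
```

This pairwise version is quadratic in the number of examples, so it is meant for building intuition rather than for large datasets. Note that the result depends only on the ordering of the scores, not on any particular classification threshold.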

Learning Feature Importance from Decision Trees and Random Forests

This video shows the process of feature selection with Decision Trees and Random Forests. Why do we need Feature Selection? Often we end up with large datasets with redundant features that need to be cleaned up before making sense of the data. Check out this related article on Recursive Feature Elimination that describes the challenges…

Recursive Feature Elimination for Feature Selection

This video explains the technique of Recursive Feature Elimination for feature selection when we have data with lots of features. Why do we need Feature Elimination? Often we end up with large datasets with redundant features that need to be cleaned up before making sense of the data. Some of the challenges with redundant features…

Berkson’s Paradox

This video explains Berkson’s Paradox. Berkson’s Paradox typically arises from selection bias when we create our dataset, which can lead to unintended inferences from our data. Summary of contents: Berkson’s Paradox illustrated with a burger and fries example; Berkson’s Paradox in the dating scenario; a mathematical explanation of Berkson’s Paradox; a Berkson’s Paradox example in understanding correlation…
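The selection-bias effect can be reproduced with a small simulation. This is a sketch under stated assumptions: two traits are drawn independently and uniformly from [0, 1], and the dataset keeps only the points whose traits sum to more than 1. The traits, the threshold, and the sample size are invented for illustration and are not the examples used in the video.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(0)  # fixed seed so the run is repeatable

# Two independent traits for every member of the population.
population = [(rng.random(), rng.random()) for _ in range(20000)]

# Selection bias: only points with a high combined score enter the dataset.
selected = [(x, y) for x, y in population if x + y > 1.0]

corr_all = pearson(*zip(*population))      # close to zero: traits are independent
corr_selected = pearson(*zip(*selected))   # clearly negative inside the selected subset
```

Within the selected subset, a low value of one trait can only appear alongside a high value of the other, which manufactures a negative correlation that does not exist in the full population.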

Bayesian Neural Networks

Bayesian Neural Networks capture uncertainty in the parameters of a neural network. This video contains: a brief recap of feedforward neural networks; the motivation behind a Bayesian Neural Network; what a Bayesian Neural Network is; inference in a Bayesian Neural Network; pros and cons of using a Bayesian Neural Network; references to code samples to…