MLNerds

What are some ways to make Pandas code faster?

Posted on February 22, 2021March 1, 2021 by MLNerds

Pandas is super popular for data science tasks. But code written in pandas can often be slow. This article talks about how one can make pandas code faster. We will walk through a super simple task of adding 1 to every element in the first column of a dataframe and see how different ways of…

Should I Transition to Data Science?

Posted on February 15, 2021February 22, 2021 by MLNerds

Data Science is popular and lots of folks are considering a career transition to data science. Does it make sense to transition to data science? How can one answer this question? The factors to consider might be different for different people. This video talks about a couple of factors to consider when deciding whether to…

Differences between Pandas and NumPy

Posted on February 8, 2021February 8, 2021 by MLNerds

Pandas and NumPy are two of the most popular python libraries used for data science applications. What is Numpy? Numpy is a popular library used for scientfic computing. It has support for multidimensional arrays and mathematical functions that can operate on these arrays. NumPy arrays are homogeneously typed – which means they hold elements of…

How to learn Math for Machine Learning

Posted on January 25, 2021January 25, 2021 by MLNerds

Becoming a data scientist is intrinsically linked to being upto date on statistics and the underlying math along with other practical skills. But how much math do you need? And how do you actually pick up the math? Here is a brief video on learning the math for ML. What Math is required for ML The…

Monty Hall Problem

Posted on January 18, 2021January 18, 2021 by MLNerds

The Monty Hall problem is a puzzle based on an American reality show ‘Lets Make a Deal’. It is a popular probability riddle that comes up when one is learning probability and statistics, since the first cut solution that comes to mind is often different from what we get by applying basic principles of probability…

Covariance and Correlation

Posted on January 6, 2021January 11, 2021 by MLNerds

Often in data science, we want to understand how one variable is related to another. These variables could be features for an ML model, or sometimes we might want to see how important afeature is in determining the target we are trying to predict. Both covariance and correlation can be used to measure the direction…

Decoding the Data Scientist Hiring Gap

Posted on January 4, 2021January 4, 2021 by MLNerds

The need for AI/ML is growing and more and more jobs are being created as data awareness is increasing and more data is being collected. However, hiring data scientists has not been an easy task – most of these roles are not yet filled. On the other hand, data science is a very popular discipline….

How to find the Optimal Number of Clusters in K-means? Elbow and Silhouette Methods

Posted on December 21, 2020December 23, 2020 by MLNerds

K-means Clustering Recap Clustering is the process of finding cohesive groups of items in the data. K means clusterin is the most popular clustering algorithm. It is simple to implement and easily available in python and R libraries. Here is a quick recap of how K-means clustering works. Choose a value of K Initialize K…

Detecting and Removing Gender Bias in Word Embeddings

Posted on December 14, 2020December 14, 2020 by MLNerds

What are Word Embeddings? Word embeddings are vector representation of words that can be used as input (features) to other downstream tasks and ML models. Here is an article that explains popular word embeddings in more detail. They are used in many NLP applications such as sentiment analysis, document clustering, question answering, paraphrase detection…

Dartboard Paradox: Probability Density Function vs Probability

Posted on December 3, 2020December 7, 2020 by MLNerds

What is the Dartboard Paradox ? Assume your are throwing a dart at dartboard such that it hits somewhere on the dartboard. The dartboard paradox: The probability of hitting any specific point on the dartboard is zero. However the probability of hitting somewhere on the dartboard is 1. How can this be ? ( How…

← Newer posts Older posts →