What are some ways to make Pandas code faster?

Pandas is hugely popular for data science tasks, but code written in pandas can often be slow. This article looks at how to make pandas code faster. We will walk through a simple task, adding 1 to every element in the first column of a DataFrame, and see how different ways of…
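
As a rough illustration of the task described above, here is a minimal sketch of three ways to add 1 to the first column. The excerpt does not say which approaches the article compares, so the choice of a row loop, `apply`, and a vectorized add is an assumption, and the small DataFrame with columns `a` and `b` is made up for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; the article's actual DataFrame is not shown.
df = pd.DataFrame({"a": np.arange(5), "b": np.arange(5) * 10})

# 1) Explicit row-by-row loop: one Python-level operation per row.
looped = df.copy()
for i in range(len(looped)):
    looped.iloc[i, 0] = looped.iloc[i, 0] + 1

# 2) Series.apply: still calls a Python function once per element.
applied = df.copy()
applied["a"] = applied["a"].apply(lambda x: x + 1)

# 3) Vectorized arithmetic: a single operation on the whole column.
vectorized = df.copy()
vectorized["a"] = vectorized["a"] + 1
```

All three produce the same column. On large frames the vectorized form is usually the fastest of the three, because the work happens in compiled code rather than in a Python loop.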

Should I Transition to Data Science?

Data science is popular, and lots of folks are considering a career transition into it. Does the move make sense, and how can one answer that question? The factors to consider might be different for different people. This video talks about a couple of factors to consider when deciding whether to…

Differences between Pandas and NumPy

Pandas and NumPy are two of the most popular Python libraries used for data science applications. What is NumPy? NumPy is a popular library used for scientific computing. It has support for multidimensional arrays and mathematical functions that can operate on these arrays. NumPy arrays are homogeneously typed, which means they hold elements of…
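
To make the point about homogeneous typing concrete, here is a small sketch. The array values and the column names `ints` and `names` are invented for illustration and do not come from the article.

```python
import numpy as np
import pandas as pd

# Mixing an int and a float in one array: NumPy picks a single dtype
# for every element, so the integers are upcast to float64.
arr = np.array([1, 2.5, 3])
arr_dtype = arr.dtype

# A pandas DataFrame, by contrast, can hold a different dtype per column.
df = pd.DataFrame({"ints": [1, 2, 3], "names": ["a", "b", "c"]})
column_dtypes = df.dtypes
```

Inspecting `arr_dtype` shows one type for the whole array, while `column_dtypes` lists a separate type for each column.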

How to learn Math for Machine Learning

Becoming a data scientist is intrinsically linked to being up to date on statistics and the underlying math, along with other practical skills. But how much math do you need? And how do you actually pick up the math? Here is a brief video on learning the math for ML. What Math is required for ML The…

Monty Hall Problem

The Monty Hall problem is a puzzle based on an American game show, ‘Let’s Make a Deal’. It is a popular probability riddle that comes up when one is learning probability and statistics, since the first-cut solution that comes to mind is often different from what we get by applying basic principles of probability…
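
One way to check the intuition against the probability is to simulate the game. The sketch below assumes the standard setup (three doors, one car, and a host who always opens a door that hides no car and was not picked); the function names `play` and `win_rate` are hypothetical helpers written for this example.

```python
import random


def play(switch, rng):
    """Play one round; return True if the contestant ends up with the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    pick = rng.choice(doors)
    # The host opens a door that is neither the contestant's pick nor the car.
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Move to the only door that is neither picked nor opened.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car


def win_rate(switch, trials=100_000, seed=0):
    """Estimate the win probability for a fixed strategy by simulation."""
    rng = random.Random(seed)
    return sum(play(switch, rng) for _ in range(trials)) / trials
```

Under these assumptions, `win_rate(True)` should land close to 2/3 and `win_rate(False)` close to 1/3, which matches the result from conditional probability rather than the common "50/50" first guess.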

Covariance and Correlation

Often in data science, we want to understand how one variable is related to another. These variables could be features for an ML model, or sometimes we might want to see how important a feature is in determining the target we are trying to predict. Both covariance and correlation can be used to measure the direction…
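
As a worked example, here is a minimal sketch that computes the sample covariance and the Pearson correlation by hand. The data (`x`, and `y = 2x + 1`) is made up for illustration, and the helper names are hypothetical.

```python
import numpy as np

# Hypothetical data: y is an exact increasing linear function of x.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = 2 * x + 1


def covariance(a, b):
    """Sample covariance (divides by n - 1)."""
    return ((a - a.mean()) * (b - b.mean())).sum() / (len(a) - 1)


def correlation(a, b):
    """Pearson correlation: covariance scaled by both standard deviations."""
    return covariance(a, b) / (a.std(ddof=1) * b.std(ddof=1))


cov_xy = covariance(x, y)           # depends on the units of x and y
corr_xy = correlation(x, y)         # unit-free, always between -1 and 1
cov_scaled = covariance(x, 10 * y)  # rescaling y rescales the covariance
corr_scaled = correlation(x, 10 * y)  # but leaves the correlation unchanged
```

Both quantities agree on the direction of the relationship. Comparing `cov_scaled` with `corr_scaled` shows the practical difference: covariance changes with the scale of the variables, while correlation does not.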

Decoding the Data Scientist Hiring Gap

The need for AI/ML is growing and more and more jobs are being created as data awareness is increasing and more data is being collected. However, hiring data scientists has not been an easy task – most of these roles are not yet filled. On the other hand, data science is a very popular discipline….

Detecting and Removing Gender Bias in Word Embeddings

What are Word Embeddings? Word embeddings are vector representations of words that can be used as input (features) to other downstream tasks and ML models. Here is an article that explains popular word embeddings in more detail. They are used in many NLP applications such as sentiment analysis, document clustering, question answering, paraphrase detection…