![](https://machinelearninginterview.com/wp-content/uploads/2021/08/naive_bayes_pros_cons_thumbnail-4-1024x572.png)
Recap: Naive Bayes Classifier
The Naive Bayes classifier is a popular classification model based on Bayes' rule.
![computing probability of each class based on Bayes rule](https://machinelearninginterview.com/wp-content/uploads/2021/08/image.png)
Note that the classifier is called "naive" because it makes the simplifying assumption that the features are conditionally independent given the class label. In other words:
Naive Assumption:
P(datapoint | class) = P(feature_1 | class) * … * P(feature_n | class)
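Combining this with Bayes' rule gives the decision rule of the classifier. Since P(datapoint) is the same for every class, it can be dropped, and we simply pick the class that maximizes the prior times the product of per-feature likelihoods:

```latex
\hat{c} = \arg\max_{c} \; P(c) \prod_{i=1}^{n} P(\text{feature}_i \mid c)
```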
This assumption does not hold in many real-world use cases.
The probabilities used in the naive Bayes classifier are typically computed using the maximum likelihood estimate.
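For text classification with word features, these maximum likelihood estimates reduce to normalized counts, where N_c is the number of training documents with label c and count(w, c) is the number of times word w appears in them:

```latex
P(c) = \frac{N_c}{N}, \qquad
P(w \mid c) = \frac{\mathrm{count}(w, c)}{\sum_{w'} \mathrm{count}(w', c)}
```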
Let's take the example of spam detection.
![Naive Bayes classifier formula in the context of spam detection](https://machinelearninginterview.com/wp-content/uploads/2021/08/image-1.png)
![Naive bayes classifier example with sample training data for spam detection](https://machinelearninginterview.com/wp-content/uploads/2021/08/image-2.png)
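To make the computation concrete, here is a minimal from-scratch sketch of the multinomial model using pure maximum likelihood counts. The toy corpus is invented for illustration; it is not the exact data from the figures above.

```python
from collections import Counter

# Hypothetical toy training set (invented for illustration).
train = [
    ("win cash prize now", "spam"),
    ("cash offer win win", "spam"),
    ("meeting schedule for today", "ham"),
    ("lunch with the team today", "ham"),
]

# Maximum likelihood estimates: class priors and per-class word counts.
class_counts = Counter(label for _, label in train)
word_counts = {label: Counter() for label in class_counts}
for text, label in train:
    word_counts[label].update(text.split())

def score(text, label):
    """Unnormalized P(label) * prod_i P(word_i | label), pure MLE (no smoothing)."""
    s = class_counts[label] / len(train)           # prior P(label)
    total = sum(word_counts[label].values())       # total tokens seen for this class
    for word in text.split():
        s *= word_counts[label][word] / total      # 0 for any unseen word!
    return s

for label in class_counts:
    print(label, score("win cash now", label))
```

Note the comment on the likelihood line: a single unseen word zeroes out the whole score, which is exactly the zero probability problem discussed in the disadvantages below.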
Advantages of Using Naive Bayes Classifier
- Simple to implement. The conditional probabilities are easy to evaluate.
- Very fast: the probabilities are computed directly from counts, with no iterative optimization, so the technique is useful where training speed matters (see the sketch after this list).
- If the conditional independence assumption holds, it can give very good results.
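As a quick illustration of how little code this takes in practice, here is a minimal sketch with scikit-learn's MultinomialNB; the toy documents are hypothetical.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

docs = ["win cash prize now", "cash offer win", "meeting today", "lunch with the team"]
labels = ["spam", "spam", "ham", "ham"]

# Training is a single pass over the word counts, with no iterative optimization.
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(docs, labels)
print(model.predict(["win a prize today"]))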
Disadvantages of Using Naive Bayes Classifier
- The conditional independence assumption does not always hold. In most situations, the features show some form of dependency.
- Zero probability problem: when the test data contains a word that never appears with a particular class in the training data, the estimated class probability collapses to zero. See the example below for more details: P(bumper | Ham) is 0 since "bumper" does not occur in any ham (non-spam) documents in the training data.
![Naive bayes classifier example to show zero frequency problem for spam detection](https://machinelearninginterview.com/wp-content/uploads/2021/08/image-5.png)
The zero probability problem can be remedied through smoothing, where we add a small smoothing factor to the numerator and denominator of every probability so that no estimate is exactly zero, even for unseen words. See the example below, and the smoothing formula after it.
![smoothing for naive Bayes classifier for spam detection](https://machinelearninginterview.com/wp-content/uploads/2021/08/image-4-1024x364.png)
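Concretely, with add-α smoothing over a vocabulary V (Laplace smoothing when α = 1), the word likelihoods become strictly positive for every word:

```latex
P(w \mid c) = \frac{\mathrm{count}(w, c) + \alpha}{\sum_{w' \in V} \mathrm{count}(w', c) + \alpha \lvert V \rvert}
```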
- Poor handling of continuous variables: multinomial naive Bayes requires binning continuous features, which loses information. Gaussian Naive Bayes, which models each feature with a per-class normal distribution, is a better fit for continuous data.
- Not great for imbalanced data: the majority class dominates the probability estimates. Complement Naive Bayes is a variant designed to mitigate this (see the sketch after this list).
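As a quick illustration of these two variants, scikit-learn ships both; the data here is random and purely illustrative.

```python
import numpy as np
from sklearn.naive_bayes import ComplementNB, GaussianNB

rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=100)

# GaussianNB: models each continuous feature with a per-class normal distribution.
X_continuous = rng.normal(size=(100, 3))
GaussianNB().fit(X_continuous, y)

# ComplementNB: estimates parameters from the complement of each class, which
# tends to be more robust on imbalanced text data. Requires non-negative counts.
X_counts = rng.integers(0, 5, size=(100, 3))
ComplementNB().fit(X_counts, y)
```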
Take a look at the Wikipedia article on the naive Bayes classifier to learn more.