## Recap: Naive Bayes Classifier

Naive Bayes Classifier is a popular model for classification based on the Bayes Rule.

Note that the classifier is called Naive – since it makes a simplistic assumption that the features are conditionally independant given the class label. In other words:

**Naive Assumption: **

**P(datapoint | class) = P(feature_1 | class) * … * P(feature_n | class)**

This assumption does not hold in a lot of usecases.

The probabilities used in the naive Bayes classifier are typically computed using the maximum likelihood estimate.

Lets take the example of spam detection.

## Advantages of Using Naive Bayes Classifier

- Simple to Implement. The conditional probabilities are easy to evaluate.
- Very fast – no iterations since the probabilities can be directly computed. So this technique is useful where speed of training is important.
- If the conditional Independence assumption holds, it could give great results.

## Disadvantages of Using Naive Bayes Classifier

- Conditional Independence Assumption does not always hold. In most situations, the feature show some form of dependency.
**Zero probability problem :**When we encounter words in the test data for a particular class that are not present in the training data, we might end up with zero class probabilities. See the example below for more details: P(bumper | Ham) is 0 since bumper does not occuer in any ham (non-spam) documents in the training data.

The zero probability problem can be Remedied through **smoothing** where we add a small smoothing factor to the numerator and denominator of every probability to avoid zero even for new words. See the example below to understand smoothing.

- Bad binning of continuous variables with Multinomial naive bayes: Gaussian Naive Bayes
- Not great for imbalanced data: Complement Naive Bayes

Take a look at the wikipedia article on naive Bayes classifier to learn more.