What is a One-Class SVM? How can it be used for anomaly detection?

One-class SVM is a variation of the SVM that can be used in an unsupervised setting for anomaly detection.

Let’s say we are analyzing credit card transactions to identify fraud. We are likely to have many normal transactions and very few fraudulent ones. Moreover, the next fraudulent transaction might be completely different from all previous fraudulent transactions.

As Leo Tolstoy said ‘All happy families are alike; each unhappy family is unhappy in its own way.’ How do we then figure out these unhappy families?

In anomaly detection, there are often many examples of non-anomalous points, but labeled examples of outliers are hard to get. Even if we manage to label some, the anomalous examples are likely to be few and diverse. Hence, an unsupervised setting works well for this task. The task is also sometimes referred to as *novelty detection*: we are not given labels for both classes, and need to build a model that filters out *novel examples* given only a dataset of unlabeled points.

The one-class SVM is explained in some depth in the video above, but here is a brief intuition:

Intuition behind a one-class SVM

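Recall that a regular SVM for classification finds a max-margin hyperplane that separates the positive examples from the negative ones. The one-class SVM instead finds a hyperplane that separates the given dataset from the **origin** (in the kernel's feature space) with maximum margin. Pushing the hyperplane as far from the origin as possible places it as close to the data points as possible; points that fall on the origin's side of the hyperplane are flagged as outliers.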
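For readers who want the intuition in symbols, the optimization problem from the original paper can be written roughly as follows. The notation here is an assumption of this sketch and does not appear in the text above: $\Phi$ is the kernel feature map, $w$ and $\rho$ define the hyperplane, $\xi_i$ are slack variables, $n$ is the number of training points, and $\nu \in (0, 1]$ is a user-chosen parameter.

$$
\min_{w,\,\xi,\,\rho} \;\; \frac{1}{2}\lVert w \rVert^2 + \frac{1}{\nu n}\sum_{i=1}^{n} \xi_i - \rho
\quad \text{subject to} \quad
w \cdot \Phi(x_i) \ge \rho - \xi_i, \;\; \xi_i \ge 0 .
$$

A new point $x$ is then scored with $f(x) = \operatorname{sgn}\big(w \cdot \Phi(x) - \rho\big)$, where $+1$ means "looks like the training data" and $-1$ means "outlier". The distance of the hyperplane from the origin is $\rho / \lVert w \rVert$, and $\nu$ acts as an upper bound on the fraction of training points treated as outliers.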

In practice, the RBF kernel is usually used. It fits a non-linear boundary around the dense region of the dataset and treats the remaining points as outliers.

The original paper is available at https://papers.nips.cc/paper/1723-support-vector-method-for-novelty-detection.pdf.

An alternate version of the one-class SVM, often called support vector data description (SVDD), fits the smallest sphere that encloses the data points; points that fall outside the sphere are treated as outliers. One can refer to the following wiki page that describes this approach.
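As a rough sketch of the sphere-based version, the usual formulation looks like this. The symbols are assumptions introduced here for illustration: $a$ is the sphere's center, $R$ its radius, $\xi_i$ are slack variables that let a few points sit outside, and $C$ is a trade-off constant.

$$
\min_{R,\,a,\,\xi} \;\; R^2 + C\sum_{i=1}^{n} \xi_i
\quad \text{subject to} \quad
\lVert \Phi(x_i) - a \rVert^2 \le R^2 + \xi_i, \;\; \xi_i \ge 0 .
$$

A new point $x$ is flagged as an outlier when $\lVert \Phi(x) - a \rVert^2 > R^2$.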

One-class SVM implementation in sklearn: 

The one-class SVM is readily available in the sklearn library as `sklearn.svm.OneClassSVM`, along with usage examples.
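Here is a minimal sketch of how this might look in practice. The synthetic 2-D data, the variable names, and the choice of `nu=0.05` are assumptions made for illustration only; real data would need its own tuning.

```python
import numpy as np
from sklearn.svm import OneClassSVM

# Synthetic "normal" data (an assumption for this sketch):
# 500 points drawn from a 2-D standard Gaussian.
rng = np.random.default_rng(0)
X_train = rng.normal(loc=0.0, scale=1.0, size=(500, 2))

# Two hand-picked points far away from the training cloud.
X_far = np.array([[8.0, 8.0], [-9.0, 7.0]])

# RBF kernel with sklearn's "scale" heuristic for gamma.
# nu upper-bounds the fraction of training points treated as outliers.
model = OneClassSVM(kernel="rbf", gamma="scale", nu=0.05)
model.fit(X_train)  # unsupervised: no labels are passed

# predict returns +1 for inliers and -1 for outliers.
train_pred = model.predict(X_train)
far_pred = model.predict(X_far)

# Fraction of training points the model itself flags as outliers.
train_outlier_fraction = float(np.mean(train_pred == -1))

# Signed score: negative values fall outside the learned boundary.
scores = model.decision_function(X_far)

print("outlier fraction on training data:", train_outlier_fraction)
print("predictions for far-away points:", far_pred)
print("decision scores for far-away points:", scores)
```

Raising `nu` makes the boundary tighter and flags more points, while `gamma` controls how closely the RBF boundary follows the shape of the data; both are worth sweeping when a small labeled validation set is available.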

