Understanding Support Vector Machines

[Contributed by Raghu from Knoldus, Canada]

Support Vector Machines (SVMs) are among the most important and popular classification techniques in Machine Learning. The technique is also called large margin classification. An SVM produces a hyperplane that separates samples into two distinct classes, and it chooses the hyperplane that does so with the maximum possible separation, hence the name large margin classifier. A 2-dimensional depiction of this is shown in the picture below. This is the case of a linear SVM, where the decision boundary that separates the classes is linear.

[Figure: 2-dimensional depiction of a linear SVM separating two classes]
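
To make the idea of a large margin concrete, here is a minimal sketch using scikit-learn's SVC with a linear kernel. The six toy points, the variable names and the very large value of C (chosen to approximate a hard margin) are assumptions made for this illustration and are not part of the original article. For a linear SVM with weight vector w, the width of the margin is 2 / ||w||.

```python
import numpy as np
from sklearn import svm

# Hypothetical toy data: two well separated groups of points in 2 dimensions.
X = np.array([[0, 0], [0, 1], [1, 0], [3, 3], [3, 4], [4, 3]], dtype=float)
y = np.array([0, 0, 0, 1, 1, 1])

# A linear kernel gives a straight-line decision boundary.
# A very large C (an assumption here) approximates a hard-margin SVM.
linear_svm = svm.SVC(kernel="linear", C=1e6)
linear_svm.fit(X, y)

# For a linear kernel the separating hyperplane is w . x + b = 0.
w = linear_svm.coef_[0]
b = linear_svm.intercept_[0]

# The distance between the two margin boundaries is 2 / ||w||.
margin_width = 2.0 / np.linalg.norm(w)

print("w =", w, "b =", b)
print("margin width =", margin_width)
print("support vectors:")
print(linear_svm.support_vectors_)
```

The support vectors printed at the end are the training points that lie on the margin boundaries; they alone determine where the hyperplane ends up.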

Support Vector Machines also support classification where the decision boundary is non-linear. In this case, the SVM uses a kernel. One of the most popular kernels for non-linear decision problems is the Radial Basis Function kernel (RBF kernel for short), also called the Gaussian kernel. The 2 images below depict an SVM with a Gaussian kernel classifying samples using a non-linear decision boundary.

[Figure: SVM with a Gaussian kernel and a non-linear decision boundary]
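
As a rough illustration of why a kernel helps, the sketch below first computes the RBF kernel value exp(-gamma * ||a - b||^2) by hand, and then compares a linear SVM with an RBF SVM on the classic XOR arrangement of four points, which no straight line can separate. The helper name rbf_kernel_value, the XOR data and the choices gamma=1.0 and C=10.0 are assumptions made for this example, not values from the original article.

```python
import numpy as np
from sklearn import svm

def rbf_kernel_value(a, b, gamma):
    """Gaussian (RBF) kernel: exp(-gamma * squared Euclidean distance)."""
    diff = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return float(np.exp(-gamma * np.dot(diff, diff)))

# The kernel is 1 for identical points and decays as points move apart.
k_same = rbf_kernel_value([0, 0], [0, 0], gamma=1.0)
k_near = rbf_kernel_value([0, 0], [0, 1], gamma=1.0)
k_far = rbf_kernel_value([0, 0], [1, 1], gamma=1.0)
print(k_same, k_near, k_far)

# XOR-style data (hypothetical): opposite corners share a class.
X = [[0, 0], [1, 1], [0, 1], [1, 0]]
y = [0, 0, 1, 1]

linear_clf = svm.SVC(kernel="linear", C=10.0).fit(X, y)
rbf_clf = svm.SVC(kernel="rbf", gamma=1.0, C=10.0).fit(X, y)

# Accuracy on the training points themselves.
linear_accuracy = linear_clf.score(X, y)
rbf_accuracy = rbf_clf.score(X, y)
print("linear kernel accuracy:", linear_accuracy)
print("RBF kernel accuracy:", rbf_accuracy)
```

No linear boundary can classify all four XOR points correctly, whereas the RBF kernel lets the SVM draw a curved boundary around them.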

One of the easiest ways to build an SVM is to use one of the SVM implementations available in many of the popular ML libraries for various languages. LIBSVM, scikit-learn and Spark ML all provide SVM implementations that are available to use. In this article, we will demonstrate a simple way to build an SVM, train it and then use it with scikit-learn in Python.

The following listing shows a Python session:

Python 2.7.11 |Anaconda 4.0.0 (64-bit)| (default, Dec 6 2015, 18:08:32)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
Anaconda is brought to you by Continuum Analytics.
Please check out: http://continuum.io/thanks and https://anaconda.org
>>> from sklearn import svm
>>> theSVM = svm.SVC()
>>> X = [[0,0], [1,1]]
>>> y = [0,1]
>>> theSVM.fit(X,y)
SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
decision_function_shape=None, degree=3, gamma='auto', kernel='rbf',
max_iter=-1, probability=False, random_state=None, shrinking=True,
tol=0.001, verbose=False)
>>> theSVM.predict([[.3,.3]])
array([0])
>>> theSVM.predict([[.6,.6]])
array([1])

In the above Python session, we created a classifier that uses an SVM. As can be seen from the output below, the kernel used is RBF. The RBF kernel takes gamma as a parameter. In this case, gamma is set to 'auto', which makes scikit-learn use 1/n_features. C is another hyperparameter; we did not specify it, so it takes its default value of 1.0.

SVC(C=1.0, cache_size=200, class_weight=None, coef0=0.0,
decision_function_shape=None, degree=3, gamma='auto', kernel='rbf',
max_iter=-1, probability=False, random_state=None, shrinking=True,
tol=0.001, verbose=False)
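
The hyperparameters can also be passed explicitly instead of relying on defaults. The sketch below is an illustration under stated assumptions: it reuses the two training points from the session, and the names auto_model, explicit_model and queries are made up for this example. It checks that gamma='auto' behaves like gamma = 1 / n_features by comparing the decision scores of the two models. Note that newer scikit-learn releases use a different default for gamma, so passing it explicitly keeps the behaviour predictable.

```python
import numpy as np
from sklearn import svm

# Same training data as in the session above.
X = [[0, 0], [1, 1]]
y = [0, 1]
n_features = len(X[0])

# Model 1: let scikit-learn derive gamma from the number of features.
auto_model = svm.SVC(kernel="rbf", gamma="auto", C=1.0).fit(X, y)

# Model 2: pass the equivalent numeric value of gamma explicitly.
explicit_model = svm.SVC(kernel="rbf", gamma=1.0 / n_features, C=1.0).fit(X, y)

# Hypothetical query points used only to compare the two models.
queries = [[0.3, 0.3], [0.6, 0.6]]
auto_scores = auto_model.decision_function(queries)
explicit_scores = explicit_model.decision_function(queries)

# Both models should produce the same decision scores.
same_scores = bool(np.allclose(auto_scores, explicit_scores))
print("gamma used explicitly:", 1.0 / n_features)
print("scores match:", same_scores)
```

Roughly speaking, a larger C penalises misclassified training points more heavily, while a larger gamma makes each training point's influence more local.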

Our inputs have 2 dimensions. In this case, we have 2 training examples, [0,0] and [1,1], and the values of y for these inputs are 0 and 1 respectively. The SVM therefore comes up with a decision boundary that is a line with [0,0] and [1,1] on opposite sides. We can now use this SVM by giving it a new X, which it classifies. It classifies [.3,.3] as 0 and [.6,.6] as 1.
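
The session can also be written as a small script that inspects the trained model. This is a sketch rather than the author's code: gamma='auto' is passed explicitly to match the session output shown earlier, and the variable names are chosen for this example. The sign of decision_function tells us on which side of the boundary a point falls.

```python
from sklearn import svm

# Training data from the article: one example per class.
X = [[0, 0], [1, 1]]
y = [0, 1]

# gamma="auto" is set explicitly to mirror the session output.
clf = svm.SVC(kernel="rbf", gamma="auto", C=1.0)
clf.fit(X, y)

# With only two training points, both end up as support vectors.
print("support vectors:")
print(clf.support_vectors_)

# Classify the two query points used in the article.
queries = [[0.3, 0.3], [0.6, 0.6]]
predictions = clf.predict(queries).tolist()

# Negative scores fall on the class 0 side, positive on the class 1 side.
scores = clf.decision_function(queries)

for point, label, score in zip(queries, predictions, scores):
    print(point, "-> class", label, "(decision score %.3f)" % score)
```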



About Vikas Hazrati

Vikas is the Founding Partner @ Knoldus which is a group of software industry veterans who have joined hands to add value to the art of software development. Knoldus does niche Reactive and Big Data product development on Scala, Spark and Functional Java. Knoldus has a strong focus on software craftsmanship which ensures high-quality software development. It partners with the best in the industry like Lightbend (Scala Ecosystem), Databricks (Spark Ecosystem), Confluent (Kafka) and Datastax (Cassandra). To know more, send a mail to hello@knoldus.com or visit www.knoldus.com