Support Vector Machines (SVM)
Exploring the Support Vector Machines Algorithm
This README provides an overview of Support Vector Machines (SVM) and their implementation. The SVM algorithm is used for classification and regression analysis. Below is a breakdown of the content covered in this README; the implementation can be found here.
Table of Contents
- Part A: Computing Support Vectors for a Dataset
- Part B: Creating an SVM Model with Linear Kernel
- Part C: Plotting SVC Decision Function
- Part D: Fetching LFW People Dataset
- Implementation of SVM with PCA
- Displaying Predicted Names and Labels
- Confusion Matrix and Heatmap
Part A: Computing Support Vectors for a Dataset
In this section, we discuss the computation of support vectors for a given dataset. Support vectors are the data points that lie closest to the decision boundary of the SVM classifier.
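For reference, scikit-learn exposes the support vectors of a fitted classifier through the support_vectors_ attribute (and their indices in the training set through support_). A minimal sketch, assuming the blob dataset generated in the notebook below:
Example:
from sklearn.datasets import make_blobs
from sklearn.svm import SVC
X, y = make_blobs(n_samples=50, centers=2, random_state=0, cluster_std=0.60)
model = SVC(kernel='linear', C=1E10).fit(X, y)
print(model.support_vectors_)  # coordinates of the support vectors
print(model.support_)          # indices of the support vectors in X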
Part B: Creating an SVM Model with Linear Kernel
To create an SVM model with a linear kernel, the SVC class from the sklearn.svm module is used. The kernel parameter is set to 'linear', and the C parameter controls the regularization strength (the strength of regularization is inversely proportional to C).
Example:
from sklearn.svm import SVC
model = SVC(kernel='linear', C=1E10)
model.fit(X, y)
Part C: Plotting SVC Decision Function
This section discusses the plot_svc_decision_function helper, which visualizes the decision function of an SVM model; its full definition appears in the notebook below.
Part D: Fetching LFW People Dataset
The LFW People dataset is fetched using the fetch_lfw_people function from the sklearn.datasets module. This dataset contains face images of different individuals.
Example:
from sklearn.datasets import fetch_lfw_people
faces = fetch_lfw_people(min_faces_per_person=60)
print(faces.target_names)
print(faces.images.shape)
Implementation of SVM with PCA
In this section, an SVM model with a radial basis function (RBF) kernel is combined with Principal Component Analysis (PCA) for dimensionality reduction. The PCA class from sklearn.decomposition (imported here under the legacy name RandomizedPCA) and the make_pipeline function from sklearn.pipeline are used.
Example:
from sklearn.svm import SVC
from sklearn.decomposition import PCA as RandomizedPCA
from sklearn.pipeline import make_pipeline
pca = RandomizedPCA(n_components=150, whiten=True, random_state=42)
svc = SVC(kernel='rbf', class_weight='balanced')
model = make_pipeline(pca, svc)
Displaying Predicted Names and Labels
This section demonstrates how to display predicted names and labels using a subplot layout. The face images are shown, and the predicted names are displayed with incorrect labels highlighted in red.
Example:
fig, ax = plt.subplots(4, 6)
for i, axi in enumerate(ax.flat):
    axi.imshow(Xtest[i].reshape(62, 47), cmap='bone')
    axi.set(xticks=[], yticks=[])
    axi.set_ylabel(faces.target_names[yfit[i]].split()[-1],
                   color='black' if yfit[i] == ytest[i] else 'red')
fig.suptitle('Predicted Names; Incorrect Labels in Red', size=14)
Confusion Matrix and Heatmap
This section demonstrates how to create a confusion matrix using the confusion_matrix function from sklearn.metrics. The confusion matrix is visualized as a heatmap with labeled axes.
Example:
import seaborn as sns
from sklearn.metrics import confusion_matrix
mat = confusion_matrix(ytest, yfit)
sns.heatmap(mat.T, square=True, annot=True, fmt='d', cbar=False, xticklabels=faces.target_names, yticklabels=faces.target_names)
plt.xlabel('True Label')
plt.ylabel('Predicted Label')
Use the provided instructions and code examples to explore and understand the implementation of Support Vector Machines (SVM).
# import libraries
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
Part A: Computing Support Vectors for a Dataset
# Generating the dataset:
from sklearn.datasets import make_blobs
X, y = make_blobs(n_samples=50, centers=2, random_state=0, cluster_std=0.60)
# 50 samples, two clusters, default 2 features per sample, standard deviation = 0.60
# (make_blobs defaults: n_samples=100, n_features=2)
# X
X
array([[ 1.41281595, 1.5303347 ],
[ 1.81336135, 1.6311307 ],
[ 1.43289271, 4.37679234],
[ 1.87271752, 4.18069237],
[ 2.09517785, 1.0791468 ],
[ 2.73890793, 0.15676817],
[ 3.18515794, 0.08900822],
[ 2.06156753, 1.96918596],
[ 2.03835818, 1.15466278],
[-0.04749204, 5.47425256],
[ 1.71444449, 5.02521524],
[ 0.22459286, 4.77028154],
[ 1.06923853, 4.53068484],
[ 1.53278923, 0.55035386],
[ 1.4949318 , 3.85848832],
[ 1.1641107 , 3.79132988],
[ 0.74387399, 4.12240568],
[ 2.29667251, 0.48677761],
[ 0.44359863, 3.11530945],
[ 0.91433877, 4.55014643],
[ 1.67467427, 0.68001896],
[ 2.26908736, 1.32160756],
[ 1.5108885 , 0.9288309 ],
[ 1.65179125, 0.68193176],
[ 2.49272186, 0.97505341],
[ 2.33812285, 3.43116792],
[ 0.67047877, 4.04094275],
[-0.55552381, 4.69595848],
[ 2.16172321, 0.6565951 ],
[ 2.09680487, 3.7174206 ],
[ 2.18023251, 1.48364708],
[ 0.43899014, 4.53592883],
[ 1.24258802, 4.50399192],
[ 0.00793137, 4.17614316],
[ 1.89593761, 5.18540259],
[ 1.868336 , 0.93136287],
[ 2.13141478, 1.13885728],
[ 1.06269622, 5.17635143],
[ 2.33466499, -0.02408255],
[ 0.669787 , 3.59540802],
[ 1.07714851, 1.17533301],
[ 1.54632313, 4.212973 ],
[ 1.56737975, -0.1381059 ],
[ 1.35617762, 1.43815955],
[ 1.00372519, 4.19147702],
[ 1.29297652, 1.47930168],
[ 2.94821884, 2.03519717],
[ 0.3471383 , 3.45177657],
[ 2.76253526, 0.78970876],
[ 0.76752279, 4.39759671]])
# Y
y
array([1, 1, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 1, 1,
1, 1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 1, 1, 0, 1, 0, 1, 0, 1, 1,
0, 1, 1, 0, 1, 0])
plt.scatter(X[:, 0], X[:, 1], c=y, cmap="autumn")
# c=y colours each marker according to its label in y,
# so each cluster gets a different colour.
# Varying the parameters of centers and standard deviation:
A, b = make_blobs(n_samples=50, centers=10, random_state=0, cluster_std=0.1)
plt.scatter(A[:, 0], A[:, 1], c = b, cmap = "autumn")
Inference:
With a small cluster standard deviation, the points within each cluster sit tightly together; with ten centers, some clusters can still overlap, depending on where the randomly chosen centers fall.
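To see the effect of the spread directly, here is a minimal sketch (not part of the original notebook) regenerating the same ten centers with a larger, illustrative standard deviation:
# Same ten centers, larger within-cluster spread (illustrative value 1.0):
# the clusters now blur into one another.
A2, b2 = make_blobs(n_samples=50, centers=10, random_state=0, cluster_std=1.0)
plt.scatter(A2[:, 0], A2[:, 1], c=b2, cmap="autumn")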
xfit = np.linspace(-1, 3.5)
# linspace returns evenly spaced numbers from -1 to 3.5 which is the parameter mentioned.
# Default number of samples is 50.
xfit
array([-1. , -0.90816327, -0.81632653, -0.7244898 , -0.63265306,
-0.54081633, -0.44897959, -0.35714286, -0.26530612, -0.17346939,
-0.08163265, 0.01020408, 0.10204082, 0.19387755, 0.28571429,
0.37755102, 0.46938776, 0.56122449, 0.65306122, 0.74489796,
0.83673469, 0.92857143, 1.02040816, 1.1122449 , 1.20408163,
1.29591837, 1.3877551 , 1.47959184, 1.57142857, 1.66326531,
1.75510204, 1.84693878, 1.93877551, 2.03061224, 2.12244898,
2.21428571, 2.30612245, 2.39795918, 2.48979592, 2.58163265,
2.67346939, 2.76530612, 2.85714286, 2.94897959, 3.04081633,
3.13265306, 3.2244898 , 3.31632653, 3.40816327, 3.5 ])
for m, b in [(1, 0.65), (0.5, 1.6), (-0.2, 2.9)]:
    # m and b are the slope and intercept of each candidate separating line;
    # each (m, b) pair defines one line, evaluated at the 50 points in xfit.
    plt.plot(xfit, m * xfit + b, 'k')
plt.xlim(-1, 3.5)
plt.scatter(X[:, 0], X[:, 1], c = y, s = 50, cmap = "autumn")
# plt.plot([0.6], [2.1], 'x', color = 'red', markeredgewidth = 2, markersize = 10)
for m, b in [(1, 0.65), (0.5, 1.6), (-0.2, 2.9)]:
    plt.plot(xfit, m * xfit + b, '-k')
plt.xlim(-1, 3.5)
xfit = np.linspace(-1, 3.5)
plt.scatter(X[:, 0], X[:, 1], c = y, s = 50, cmap = 'autumn')
for m, b, d in [(1, 0.65, 0.33), (0.5, 1.6, 0.55), (-0.2, 2.9, 0.2)]:
    yfit = m * xfit + b
    plt.plot(xfit, yfit, '-k')
    # fill_between shades the band between (xfit, yfit - d) and (xfit, yfit + d),
    # i.e. a margin of half-width d around each line.
    plt.fill_between(xfit, yfit - d, yfit + d, edgecolor='none', color="#AAAAAA", alpha=0.4)
plt.xlim(-1, 3.5)
Inference:
From the graph above, the middle line looks like a good decision boundary: its margin touches roughly one red point and two yellow points, so it separates the two classes with the widest band.
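As a rough check (a minimal sketch, not part of the original notebook), each candidate line can be scored by its margin, taken here as the smallest perpendicular distance from any data point to the line y = m*x + b:
# Perpendicular distance from point (x0, y0) to the line m*x - y + b = 0
# is |m*x0 - y0 + b| / sqrt(m^2 + 1).
for m, b in [(1, 0.65), (0.5, 1.6), (-0.2, 2.9)]:
    distances = np.abs(m * X[:, 0] - X[:, 1] + b) / np.sqrt(m ** 2 + 1)
    print(f"slope = {m}, intercept = {b}: margin = {distances.min():.3f}")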
# Creating an SVM model with a linear kernel:
from sklearn.svm import SVC  # support vector classifier
model = SVC(kernel='linear', C=1E10)
# C is the regularization parameter.
# The strength of regularization is inversely proportional to C.
# The kernel used here is linear; the default kernel is 'rbf', which is non-linear.
model.fit(X, y)
SVC(C=10000000000.0, kernel='linear')
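Since the kernel is linear, the fitted boundary w.x + b = 0 can also be inspected directly (a quick aside, not in the original notebook; coef_ and intercept_ are standard SVC attributes for linear kernels):
print(model.coef_)       # w, the weight vector of the separating hyperplane
print(model.intercept_)  # b, the bias term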
def plot_svc_decision_function(model, ax=None, plot_support=True):
    """Plot the decision function for a 2D SVC."""
    if ax is None:
        ax = plt.gca()
    xlim = ax.get_xlim()
    ylim = ax.get_ylim()
    # Build a grid over the current axes and evaluate the decision function on it
    x = np.linspace(xlim[0], xlim[1], 30)
    y = np.linspace(ylim[0], ylim[1], 30)
    Y, X = np.meshgrid(y, x)
    xy = np.vstack([X.ravel(), Y.ravel()]).T
    P = model.decision_function(xy).reshape(X.shape)
    # Draw the decision boundary (level 0) and the margins (levels -1 and +1)
    ax.contour(X, Y, P, colors='k', levels=[-1, 0, 1], alpha=0.5, linestyles=['--', '-', '--'])
    if plot_support:
        # Circle the support vectors
        ax.scatter(model.support_vectors_[:, 0], model.support_vectors_[:, 1],
                   s=300, linewidth=1, facecolors='none')
    ax.set_xlim(xlim)
    ax.set_ylim(ylim)
plt.scatter(X[:, 0], X[:, 1], c = y, s = 50, cmap = 'autumn')
plot_svc_decision_function(model)
model.support_vectors_
array([[0.44359863, 3.11530945],
[2.33812285, 3.43116792],
[2.06156753, 1.96918596]])
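With such a large C the margin is effectively hard, so the support vectors sit exactly on the margins. A quick sanity check (not in the original notebook): the decision function should evaluate to roughly -1 or +1 at those points, with the sign depending on the class.
print(model.decision_function(model.support_vectors_))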
from sklearn.datasets import make_circles
X, y = make_circles(100, factor = .1, noise = .1)
# clf = SVC(kernel = 'linear').fit(X, y)
plt.scatter(X[:, 0], X[:, 1], c = y, s = 50, cmap = 'autumn')
# plot_svc_decision_function(clf, plot_support = False)
The smaller the noise parameter, the more tightly the points hug the two underlying circles.
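This data is clearly not linearly separable in 2D (the commented-out linear fit above would fail). A minimal sketch, not from the original notebook, of the idea the RBF kernel exploits implicitly: lifting the points onto a radial basis feature makes the two circles linearly separable.
# A hand-built radial feature: r is large near the origin (inner circle)
# and small far from it (outer circle), so a simple threshold on r
# separates the two classes.
r = np.exp(-(X ** 2).sum(axis=1))
plt.scatter(X[:, 0], r, c=y, s=50, cmap='autumn')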
clf = SVC(kernel = 'rbf', C = 1E6)
clf.fit(X, y)
SVC(C=1000000.0)
plt.scatter(X[:, 0], X[:, 1], c = y, s = 50, cmap = 'autumn')
plot_svc_decision_function(clf)
from sklearn.datasets import fetch_lfw_people
faces = fetch_lfw_people(min_faces_per_person=60)
print(faces.target_names)
print(faces.images.shape)
['Ariel Sharon' 'Colin Powell' 'Donald Rumsfeld' 'George W Bush'
'Gerhard Schroeder' 'Hugo Chavez' 'Junichiro Koizumi' 'Tony Blair']
(1348, 62, 47)
fig, ax = plt.subplots(3, 5)
for i, axi in enumerate(ax.flat):
    axi.imshow(faces.images[i], cmap='bone')
    axi.set(xticks=[], yticks=[], xlabel=faces.target_names[faces.target[i]])
from sklearn.svm import SVC
from sklearn.decomposition import PCA as RandomizedPCA
from sklearn.pipeline import make_pipeline
pca = RandomizedPCA(n_components=150, whiten=True, random_state=42)
svc = SVC(kernel='rbf', class_weight='balanced')
model = make_pipeline(pca, svc)
We use an RBF kernel; make_pipeline chains the estimators in the order given, so PCA reduces the data to 150 components before the SVC sees it.
from sklearn.model_selection import train_test_split
Xtrain, Xtest, ytrain, ytest = train_test_split(faces.data, faces.target, random_state=42)
model.fit(Xtrain,ytrain)
yfit = model.predict(Xtest)
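The pipeline above is fit with the SVC's default C and gamma. As a hedged sketch (not part of this notebook), scikit-learn's GridSearchCV could tune them, using the svc__ prefix to address the SVC step inside the pipeline; the grid values below are illustrative, not tuned results.
from sklearn.model_selection import GridSearchCV
param_grid = {'svc__C': [1, 5, 10, 50],
              'svc__gamma': [0.0001, 0.0005, 0.001, 0.005]}
grid = GridSearchCV(model, param_grid)
grid.fit(Xtrain, ytrain)
print(grid.best_params_)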
fig, ax = plt.subplots(4, 6)
for i, axi in enumerate(ax.flat):
    axi.imshow(Xtest[i].reshape(62, 47), cmap='bone')
    axi.set(xticks=[], yticks=[])
    axi.set_ylabel(faces.target_names[yfit[i]].split()[-1],
                   color='black' if yfit[i] == ytest[i] else 'red')
fig.suptitle('Predicted Names; Incorrect Labels in Red', size=14)
from sklearn.metrics import classification_report
print(classification_report(ytest, yfit, target_names=faces.target_names))
precision recall f1-score support
Ariel Sharon 1.00 0.73 0.85 15
Colin Powell 0.68 0.96 0.80 68
Donald Rumsfeld 0.92 0.77 0.84 31
George W Bush 0.91 0.87 0.89 126
Gerhard Schroeder 0.89 0.74 0.81 23
Hugo Chavez 1.00 0.65 0.79 20
Junichiro Koizumi 1.00 0.92 0.96 12
Tony Blair 0.86 0.86 0.86 42
accuracy 0.85 337
macro avg 0.91 0.81 0.85 337
weighted avg 0.87 0.85 0.85 337
import seaborn as sns
from sklearn.metrics import confusion_matrix
mat = confusion_matrix(ytest, yfit)
sns.heatmap(mat.T, square=True, annot=True, fmt='d', cbar=False,
            xticklabels=faces.target_names, yticklabels=faces.target_names)
plt.xlabel('true label')
plt.ylabel('predicted label')
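As a final hedged check (not in the original notebook), the overall accuracy can be recovered from the confusion matrix and should match the 0.85 reported in the classification report above.
# Correct predictions sit on the diagonal, so accuracy = trace / total.
print(mat.trace() / mat.sum())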