SVM - code

Import libraries:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

Import data:

df = pd.read_csv("Clasificación.csv", sep = ";", decimal=",")
print(df.head())
      X1     X2  y
0  50.24  10.06  1
1  47.71   9.16  0
2  48.10  10.18  1
3  52.77  10.24  1
4  49.48   9.57  0
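
The sep and decimal arguments matter because the file appears to use semicolons between fields and commas as decimal marks. A minimal sketch of the difference, using a small in-memory string as a hypothetical stand-in for the CSV file (the three rows are copied from the printed head above):

```python
import io

import pandas as pd

# Hypothetical in-memory stand-in for the CSV file: semicolon-separated
# fields with decimal commas, as the read_csv arguments above suggest.
raw = "X1;X2;y\n50,24;10,06;1\n47,71;9,16;0\n48,10;10,18;1\n"

# Without decimal=",", X1 and X2 are not recognized as numbers.
as_text = pd.read_csv(io.StringIO(raw), sep=";")

# With decimal=",", X1 and X2 are parsed as floats.
as_float = pd.read_csv(io.StringIO(raw), sep=";", decimal=",")

print(as_text.dtypes)
print(as_float.dtypes)
```

If the numeric columns come back as text, later steps such as scaling and model fitting would fail or misbehave, so checking df.dtypes right after loading is a cheap safeguard.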

Data visualization:

plt.scatter(df["X1"], df["X2"], marker='^', c = df["y"], cmap=plt.cm.RdYlGn)
plt.xlabel("X1")
plt.ylabel("X2");
[Figure: scatter plot of X2 against X1, points colored by class y]
X = df[["X1", "X2"]]
print(X.head())
      X1     X2
0  50.24  10.06
1  47.71   9.16
2  48.10  10.18
3  52.77  10.24
4  49.48   9.57
y = df["y"]
print(y.head())
0    1
1    0
2    1
3    1
4    0
Name: y, dtype: int64

Feature scaling:

from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
X = scaler.fit_transform(X)
print(X[:10,])
[[ 0.29938111  0.48540279]
 [-1.17998259 -2.03617016]
 [-0.95193838  0.82161252]
 [ 1.77874481  0.98971739]
 [-0.14501273 -0.88745359]
 [ 0.54496718  1.69015432]
 [ 0.36954856  2.41860873]
 [ 1.0010556  -0.04692927]
 [ 0.67945479  1.04575234]
 [ 0.32277026 -1.13961089]]
plt.scatter(X[:,0], X[:,1], marker='^', c = y, cmap=plt.cm.RdYlGn)
plt.xlabel("X1")
plt.ylabel("X2");
[Figure: scatter plot of the standardized X2 against X1, points colored by class y]
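
StandardScaler subtracts each column's mean and divides by its standard deviation. The sketch below checks this by hand on the five rows printed by df.head(). Because only five rows are used here rather than the full dataset, the numbers will not match the scaled output above; the names X_demo, scaler_demo, and X_manual are illustrative.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# The five rows printed by df.head(); the full dataset is not reproduced here.
X_demo = np.array([
    [50.24, 10.06],
    [47.71, 9.16],
    [48.10, 10.18],
    [52.77, 10.24],
    [49.48, 9.57],
])

scaler_demo = StandardScaler()
X_scaled = scaler_demo.fit_transform(X_demo)

# Manual standardization: subtract the column mean and divide by the
# population standard deviation (ddof=0, which is NumPy's default).
X_manual = (X_demo - X_demo.mean(axis=0)) / X_demo.std(axis=0)

print(np.allclose(X_scaled, X_manual))
print(X_scaled.mean(axis=0), X_scaled.std(axis=0))
```

After scaling, each column has mean 0 and standard deviation 1 (up to rounding), which keeps either feature from dominating the distances that the SVM kernels rely on.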

Linear classification:

Fitting the classifier:

from sklearn.svm import SVC
clf = SVC(kernel = 'linear', random_state = 0)
clf.fit(X,y)
SVC(kernel='linear', random_state=0)
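
With a linear kernel, the fitted classifier is a hyperplane w·x + b = 0, and its parameters can be read from coef_ and intercept_. The sketch below illustrates this on synthetic data from make_blobs, which is an assumption standing in for the CSV file; X_demo, y_demo, and clf_demo are illustrative names.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Synthetic stand-in data (assumption): 40 points, 2 features, 2 classes.
X_demo, y_demo = make_blobs(n_samples=40, centers=2, n_features=2, random_state=0)

clf_demo = SVC(kernel="linear", random_state=0).fit(X_demo, y_demo)

w = clf_demo.coef_[0]        # weight vector (coef_ exists only for the linear kernel)
b = clf_demo.intercept_[0]   # bias term

# The decision score of each point is w . x + b.
scores_manual = X_demo @ w + b
print(np.allclose(scores_manual, clf_demo.decision_function(X_demo)))

# In the binary case, a positive score maps to classes_[1], a negative one to classes_[0].
pred_manual = clf_demo.classes_[(scores_manual > 0).astype(int)]
print(np.array_equal(pred_manual, clf_demo.predict(X_demo)))

# Only the support vectors determine the position of the hyperplane.
print(clf_demo.support_vectors_.shape)
```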

Prediction:

y_pred = clf.predict(X)
print(y_pred)
[1 0 1 1 0 1 1 0 1 0 0 0 1 1 1 0 1 1 1 0 0 1 0 0 1 0 0 0 1 0 1 0 0 1 1 0 1
 1 0 1]

Performance evaluation:

from sklearn.metrics import accuracy_score
accuracy_score(y, y_pred)
0.875
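
This accuracy is measured on the same data used to fit the model, so it can be optimistic. A minimal sketch of a held-out evaluation follows. It uses synthetic data from make_classification as an assumption in place of the CSV file, and the split size and variable names are illustrative choices.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in data (assumption): 200 points, 2 informative features.
X_demo, y_demo = make_classification(
    n_samples=200, n_features=2, n_informative=2, n_redundant=0,
    n_clusters_per_class=1, random_state=0,
)

# Hold out 25% of the points for testing, keeping the class proportions.
X_train, X_test, y_train, y_test = train_test_split(
    X_demo, y_demo, test_size=0.25, stratify=y_demo, random_state=0
)

# Fit the scaler on the training part only, then apply it to both parts.
scaler_demo = StandardScaler().fit(X_train)
clf_demo = SVC(kernel="linear", random_state=0)
clf_demo.fit(scaler_demo.transform(X_train), y_train)

y_pred_test = clf_demo.predict(scaler_demo.transform(X_test))
acc_test = accuracy_score(y_test, y_pred_test)

# Accuracy is simply the fraction of predictions that match the true labels.
acc_manual = np.mean(y_test == y_pred_test)
print(acc_test, acc_manual)
```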

Visualizing the results:

from matplotlib.colors import ListedColormap
X_Set, y_Set = X, y
X1, X2 = np.meshgrid(np.arange(start = X_Set[:, 0].min() - 1, stop = X_Set[:, 0].max() + 1, step = 0.01),
                     np.arange(start = X_Set[:, 1].min() - 1, stop = X_Set[:, 1].max() + 1, step = 0.01))
plt.contourf(X1, X2, clf.predict(np.array([X1.ravel(), X2.ravel()]).T).reshape(X1.shape),
             alpha = 0.75, cmap = ListedColormap(('#F0566F', '#51F192')))
plt.xlim(X1.min(), X1.max())
plt.ylim(X2.min(), X2.max())
for i, j in enumerate(np.unique(y_Set)):
    plt.scatter(X_Set[y_Set == j, 0], X_Set[y_Set == j, 1],
                color = ListedColormap(('red', 'green'))(i), label = j)
plt.title('Support Vector Machine')
plt.xlabel('X1')
plt.ylabel('X2')
plt.legend()
plt.show()
[Figure: "Support Vector Machine" decision regions for the linear kernel; axes X1 and X2]
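
The same plotting code is needed for every kernel, so it can be wrapped in a function. The sketch below is one possible way to do that; plot_decision_regions is a hypothetical helper, not part of scikit-learn or matplotlib, and it assumes a fitted two-class classifier with exactly two features.

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap


def plot_decision_regions(clf, X, y, title="Support Vector Machine", step=0.01):
    """Hypothetical helper: draw the decision regions of a fitted 2-D classifier."""
    X = np.asarray(X)
    y = np.asarray(y)

    # Grid covering the data range with a margin of 1 on each side.
    x1, x2 = np.meshgrid(
        np.arange(X[:, 0].min() - 1, X[:, 0].max() + 1, step),
        np.arange(X[:, 1].min() - 1, X[:, 1].max() + 1, step),
    )

    # Predict the class of every grid point, then reshape back to the grid.
    z = clf.predict(np.c_[x1.ravel(), x2.ravel()]).reshape(x1.shape)

    fig, ax = plt.subplots()
    ax.contourf(x1, x2, z, alpha=0.75, cmap=ListedColormap(("#F0566F", "#51F192")))

    # One scatter call per class; color= avoids the ambiguity of the c= argument.
    for label, color in zip(np.unique(y), ("red", "green")):
        ax.scatter(X[y == label, 0], X[y == label, 1], color=color, label=label)

    ax.set_xlim(x1.min(), x1.max())
    ax.set_ylim(x2.min(), x2.max())
    ax.set_title(title)
    ax.set_xlabel("X1")
    ax.set_ylabel("X2")
    ax.legend()
    return ax
```

With the variables of this notebook, a call would look like plot_decision_regions(clf, X, y) followed by plt.show().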

Nonlinear classification:

Kernel: polynomial:

The default degree is 3 (degree = 3).
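
For reference, scikit-learn's polynomial kernel is K(x, z) = (gamma * <x, z> + coef0) ** degree, and SVC uses coef0 = 0.0 by default. The sketch below compares a manual computation with sklearn.metrics.pairwise.polynomial_kernel; the two points and the gamma value are arbitrary illustrative choices, not values from the dataset.

```python
import numpy as np
from sklearn.metrics.pairwise import polynomial_kernel

# Two illustrative points (arbitrary values, not taken from the dataset).
x = np.array([[0.5, -1.0]])
z = np.array([[1.5, 2.0]])

# degree=3 and coef0=0.0 mirror the SVC defaults; gamma is an arbitrary choice here.
gamma, coef0, degree = 0.5, 0.0, 3

k_manual = (gamma * (x @ z.T) + coef0) ** degree
k_sklearn = polynomial_kernel(x, z, degree=degree, gamma=gamma, coef0=coef0)
print(k_manual, k_sklearn)

# With coef0 = 0 and an even degree, the kernel cannot tell x from -x.
k_even = polynomial_kernel(x, z, degree=2, gamma=gamma, coef0=0.0)
k_even_flipped = polynomial_kernel(-x, z, degree=2, gamma=gamma, coef0=0.0)
print(np.allclose(k_even, k_even_flipped))
```

Since the kernel is unchanged when x is replaced by -x for even degrees with coef0 = 0, the resulting decision regions are symmetric about the origin. This may limit how well an even-degree fit can separate data that lacks that symmetry, unless coef0 is changed.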

Fitting the model:

clf = SVC(kernel = 'poly', random_state = 0)
clf.fit(X,y)
SVC(kernel='poly', random_state=0)

Prediction:

y_pred = clf.predict(X)
print(y_pred)
[1 0 1 1 0 1 1 1 1 0 0 1 1 1 1 1 1 1 1 0 0 1 1 0 1 0 1 1 1 0 1 1 0 1 1 1 1
 1 0 1]

Performance evaluation:

accuracy_score(y, y_pred)
0.725

Visualizing the results:

X_Set, y_Set = X, y
X1, X2 = np.meshgrid(np.arange(start = X_Set[:, 0].min() - 1, stop = X_Set[:, 0].max() + 1, step = 0.01),
                     np.arange(start = X_Set[:, 1].min() - 1, stop = X_Set[:, 1].max() + 1, step = 0.01))
plt.contourf(X1, X2, clf.predict(np.array([X1.ravel(), X2.ravel()]).T).reshape(X1.shape),
             alpha = 0.75, cmap = ListedColormap(('#F0566F', '#51F192')))
plt.xlim(X1.min(), X1.max())
plt.ylim(X2.min(), X2.max())
for i, j in enumerate(np.unique(y_Set)):
    plt.scatter(X_Set[y_Set == j, 0], X_Set[y_Set == j, 1],
                color = ListedColormap(('red', 'green'))(i), label = j)
plt.title('Support Vector Machine')
plt.xlabel('X1')
plt.ylabel('X2')
plt.legend()
plt.show()
[Figure: "Support Vector Machine" decision regions for the polynomial kernel of degree 3; axes X1 and X2]

Polynomial kernel, degree 2:

degree=2

clf = SVC(kernel = 'poly', degree=2, random_state = 0)
clf.fit(X,y)
y_pred = clf.predict(X)
print(y_pred)
[0 1 0 1 0 1 1 0 0 0 1 0 0 1 0 0 1 1 0 1 1 0 0 0 0 0 0 0 0 0 1 0 0 1 1 0 0
 0 0 1]
accuracy_score(y, y_pred)
0.65
X_Set, y_Set = X, y
X1, X2 = np.meshgrid(np.arange(start = X_Set[:, 0].min() - 1, stop = X_Set[:, 0].max() + 1, step = 0.01),
                     np.arange(start = X_Set[:, 1].min() - 1, stop = X_Set[:, 1].max() + 1, step = 0.01))
plt.contourf(X1, X2, clf.predict(np.array([X1.ravel(), X2.ravel()]).T).reshape(X1.shape),
             alpha = 0.75, cmap = ListedColormap(('#F0566F', '#51F192')))
plt.xlim(X1.min(), X1.max())
plt.ylim(X2.min(), X2.max())
for i, j in enumerate(np.unique(y_Set)):
    plt.scatter(X_Set[y_Set == j, 0], X_Set[y_Set == j, 1],
                color = ListedColormap(('red', 'green'))(i), label = j)
plt.title('Support Vector Machine')
plt.xlabel('X1')
plt.ylabel('X2')
plt.legend()
plt.show()
[Figure: "Support Vector Machine" decision regions for the polynomial kernel of degree 2; axes X1 and X2]

Polynomial kernel, degree 10:

degree=10

clf = SVC(kernel = 'poly', degree=10, random_state = 0)
clf.fit(X,y)
y_pred = clf.predict(X)
print(y_pred)
[0 0 0 1 0 1 1 0 0 0 1 0 0 1 0 0 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0
 0 0 0]
accuracy_score(y, y_pred)
0.7
X_Set, y_Set = X, y
X1, X2 = np.meshgrid(np.arange(start = X_Set[:, 0].min() - 1, stop = X_Set[:, 0].max() + 1, step = 0.01),
                     np.arange(start = X_Set[:, 1].min() - 1, stop = X_Set[:, 1].max() + 1, step = 0.01))
plt.contourf(X1, X2, clf.predict(np.array([X1.ravel(), X2.ravel()]).T).reshape(X1.shape),
             alpha = 0.75, cmap = ListedColormap(('#F0566F', '#51F192')))
plt.xlim(X1.min(), X1.max())
plt.ylim(X2.min(), X2.max())
for i, j in enumerate(np.unique(y_Set)):
    plt.scatter(X_Set[y_Set == j, 0], X_Set[y_Set == j, 1],
                color = ListedColormap(('red', 'green'))(i), label = j)
plt.title('Support Vector Machine')
plt.xlabel('X1')
plt.ylabel('X2')
plt.legend()
plt.show()
[Figure: "Support Vector Machine" decision regions for the polynomial kernel of degree 10; axes X1 and X2]

Kernel: RBF:

Fitting the model:

clf = SVC(kernel = 'rbf', random_state = 0)
clf.fit(X,y)
SVC(random_state=0)
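
The printed representation omits kernel because 'rbf' is the SVC default. The RBF kernel is K(x, z) = exp(-gamma * ||x - z||^2), and with the default gamma='scale' scikit-learn uses gamma = 1 / (n_features * X.var()). The sketch below checks both on synthetic make_moons data, which is an assumption standing in for the CSV file; all variable names are illustrative.

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.metrics.pairwise import rbf_kernel
from sklearn.svm import SVC

# Synthetic stand-in data (assumption): 40 points in two interleaved half-moons.
X_demo, y_demo = make_moons(n_samples=40, noise=0.2, random_state=0)

clf_demo = SVC(random_state=0).fit(X_demo, y_demo)
print(clf_demo.kernel, clf_demo.gamma)  # the defaults in effect

# Value that gamma='scale' resolves to for this data.
gamma_scale = 1.0 / (X_demo.shape[1] * X_demo.var())

# Manual RBF kernel matrix from pairwise squared Euclidean distances.
sq_dists = ((X_demo[:, None, :] - X_demo[None, :, :]) ** 2).sum(axis=2)
K_manual = np.exp(-gamma_scale * sq_dists)
K_sklearn = rbf_kernel(X_demo, X_demo, gamma=gamma_scale)
print(np.allclose(K_manual, K_sklearn))

# The decision score is a weighted sum of kernel values against the support vectors.
K_sv = rbf_kernel(X_demo, clf_demo.support_vectors_, gamma=gamma_scale)
scores_manual = K_sv @ clf_demo.dual_coef_[0] + clf_demo.intercept_[0]
print(np.allclose(scores_manual, clf_demo.decision_function(X_demo)))
```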

Prediction:

y_pred = clf.predict(X)
print(y_pred)
[1 0 1 1 0 1 1 0 1 0 1 0 1 1 1 0 1 1 1 1 0 1 0 0 1 0 0 0 1 0 1 0 0 1 1 0 1
 1 0 1]

Performance evaluation:

accuracy_score(y, y_pred)
0.925

Visualizing the results:

X_Set, y_Set = X, y
X1, X2 = np.meshgrid(np.arange(start = X_Set[:, 0].min() - 1, stop = X_Set[:, 0].max() + 1, step = 0.01),
                     np.arange(start = X_Set[:, 1].min() - 1, stop = X_Set[:, 1].max() + 1, step = 0.01))
plt.contourf(X1, X2, clf.predict(np.array([X1.ravel(), X2.ravel()]).T).reshape(X1.shape),
             alpha = 0.75, cmap = ListedColormap(('#F0566F', '#51F192')))
plt.xlim(X1.min(), X1.max())
plt.ylim(X2.min(), X2.max())
for i, j in enumerate(np.unique(y_Set)):
    plt.scatter(X_Set[y_Set == j, 0], X_Set[y_Set == j, 1],
                color = ListedColormap(('red', 'green'))(i), label = j)
plt.title('Support Vector Machine')
plt.xlabel('X1')
plt.ylabel('X2')
plt.legend()
plt.show()
[Figure: "Support Vector Machine" decision regions for the RBF kernel with default parameters; axes X1 and X2]
clf = SVC(kernel = 'rbf', gamma = 2.5, C= 100, random_state = 0)
clf.fit(X,y)
y_pred = clf.predict(X)
accuracy_score(y, y_pred)
1.0
X_Set, y_Set = X, y
X1, X2 = np.meshgrid(
    np.arange(start=X_Set[:, 0].min() - 1, stop=X_Set[:, 0].max() + 1, step=0.01),
    np.arange(start=X_Set[:, 1].min() - 1, stop=X_Set[:, 1].max() + 1, step=0.01),
)
plt.contourf(
    X1,
    X2,
    clf.predict(np.array([X1.ravel(), X2.ravel()]).T).reshape(X1.shape),
    alpha=0.75,
    cmap=ListedColormap(("#F0566F", "#51F192")),
)
plt.xlim(X1.min(), X1.max())
plt.ylim(X2.min(), X2.max())
for i, j in enumerate(np.unique(y_Set)):
    plt.scatter(
        X_Set[y_Set == j, 0],
        X_Set[y_Set == j, 1],
        color=ListedColormap(("red", "green"))(i),
        label=j,
    )
plt.title("Support Vector Machine")
plt.xlabel("X1")
plt.ylabel("X2")
plt.legend()
plt.show()
[Figure: "Support Vector Machine" decision regions for the RBF kernel with gamma = 2.5 and C = 100; axes X1 and X2]
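
An accuracy of 1.0 on the training data does not by itself show that gamma = 2.5 and C = 100 generalize well, since large values of both let the boundary wrap tightly around individual points. One way to check is to compare training accuracy with cross-validated accuracy. The sketch below does this on synthetic make_moons data, which is an assumption standing in for the CSV file; the fold count and the names are illustrative, and the numbers it prints say nothing about the dataset used above.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in data (assumption): 40 noisy points in two half-moons.
X_demo, y_demo = make_moons(n_samples=40, noise=0.3, random_state=0)

# Five stratified folds; each point is used for validation exactly once.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

results = []
for params in ({}, {"gamma": 2.5, "C": 100}):
    # The pipeline refits the scaler inside every fold, so no information leaks
    # from the validation part into the training part.
    model = make_pipeline(StandardScaler(), SVC(kernel="rbf", random_state=0, **params))
    train_acc = model.fit(X_demo, y_demo).score(X_demo, y_demo)
    cv_acc = cross_val_score(model, X_demo, y_demo, cv=cv).mean()
    results.append((params, train_acc, cv_acc))
    print(params, round(train_acc, 3), round(cv_acc, 3))
```

A large gap between the training and cross-validated accuracy for a given setting would suggest overfitting.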