Machine Learning: SVM (Classifying Nonlinear Data with Polynomial Features and the Kernel-Based SVC)

Reposted article. Originally published 2018-08-12 21:26. Category: machine learning algorithms.

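I. Basic Concepts

Data: linear data and nonlinear data;
Linear data: linearly correlated or nonlinearly correlated (data that is nonlinearly correlated is not necessarily nonlinear data).

  1) How SVM classifies nonlinear data

Method 1:
Polynomial approach: expand the original data by generating new polynomial features (polynomial features are added to every sample).
Steps:

PolynomialFeatures(degree=degree): expands the original data by generating polynomial features;
StandardScaler(): standardizes the expanded data;
LinearSVC(C=C): trains the model with the SVM algorithm.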
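
As a rough illustration of the first step above, the sketch below expands a single toy sample with PolynomialFeatures. The sample values (a = 2, b = 3) and the variable names are made up for this example and do not come from the original article.

```python
import numpy as np
from sklearn.preprocessing import PolynomialFeatures

# One toy sample with two features, a = 2 and b = 3 (illustrative values only).
X_toy = np.array([[2.0, 3.0]])

# degree=2 adds every monomial of a and b up to total degree 2.
poly = PolynomialFeatures(degree=2)
X_expanded = poly.fit_transform(X_toy)

# The columns are ordered as: 1, a, b, a^2, a*b, b^2
print(X_expanded)  # [[1. 2. 3. 4. 6. 9.]]
```

Two original features become six after expansion; this larger feature matrix is what StandardScaler and LinearSVC see in the later steps.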

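Method 2:
Use the kernel function built into scikit-learn: SVC(kernel='poly', degree=degree, C=C)
What it does: with kernel='poly', SVC() applies a polynomial kernel, which achieves the effect of polynomial features without constructing them explicitly.

Note: the data also needs to be standardized before it is passed to SVC().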
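
To make the idea of a polynomial kernel more concrete, here is a minimal sketch that evaluates K(x, z) = (gamma * <x, z> + coef0)^degree by hand and compares it with scikit-learn's polynomial_kernel helper. The two vectors and the values of gamma and coef0 are chosen only for illustration; SVC itself defaults to gamma='scale' and coef0=0.0.

```python
import numpy as np
from sklearn.metrics.pairwise import polynomial_kernel

# Two toy vectors (illustrative values only).
x = np.array([[1.0, 2.0]])
z = np.array([[3.0, 0.5]])

# Kernel parameters picked for this example, not SVC's defaults.
degree, gamma, coef0 = 3, 1.0, 1.0

# Manual evaluation: <x, z> = 1*3 + 2*0.5 = 4, so K = (1*4 + 1)^3 = 125.
k_manual = (gamma * (x @ z.T) + coef0) ** degree

# The same quantity computed by scikit-learn.
k_sklearn = polynomial_kernel(x, z, degree=degree, gamma=gamma, coef0=coef0)

print(k_manual, k_sklearn)
```

The kernel only needs the inner product of the original two-dimensional vectors, which is why SVC(kernel='poly') can skip the explicit PolynomialFeatures step.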

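II. Example

  1) Generating the data

datasets.make_ + suffix: functions that generate a dataset automatically (here, make_moons);

To change the number of samples generated, pass a parameter such as n_samples to make_moons();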

import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets

X, y = datasets.make_moons(noise=0.15, random_state=666)
plt.scatter(X[y==0, 0], X[y==0, 1])
plt.scatter(X[y==1, 0], X[y==1, 1])
plt.show()
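
As a small sketch of the point about changing the amount of generated data: make_moons accepts an n_samples argument (100 by default). The value 500 and the names X_big and y_big below are arbitrary choices for illustration.

```python
from sklearn import datasets

# Generate 500 points instead of the default 100; noise and seed match the call above.
X_big, y_big = datasets.make_moons(n_samples=500, noise=0.15, random_state=666)

# X_big holds the 2-D coordinates, y_big the class label (0 or 1) of each point.
print(X_big.shape, y_big.shape)  # (500, 2) (500,)
```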

  2) Plotting function

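def plot_decision_boundary(model, axis):
    """Shade the regions a fitted model predicts over axis = [x_min, x_max, y_min, y_max]."""
    # Build a dense grid with about 100 points per unit along each axis.
    x0, x1 = np.meshgrid(
        np.linspace(axis[0], axis[1], int((axis[1] - axis[0]) * 100)),
        np.linspace(axis[2], axis[3], int((axis[3] - axis[2]) * 100))
    )
    # Flatten the grid into an (n_points, 2) array that the model can predict on.
    X_new = np.c_[x0.ravel(), x1.ravel()]

    y_predict = model.predict(X_new)
    zz = y_predict.reshape(x0.shape)

    from matplotlib.colors import ListedColormap
    custom_cmap = ListedColormap(['#EF9A9A', '#FFF59D', '#90CAF9'])

    # contourf fills regions and does not use a linewidth argument, so it is omitted.
    plt.contourf(x0, x1, zz, cmap=custom_cmap)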

  3) Method 1: the polynomial approach

from sklearn.preprocessing import PolynomialFeatures, StandardScaler
from sklearn.svm import LinearSVC
from sklearn.pipeline import Pipeline

def PolynomialSVC(degree, C=1.0):
    return Pipeline([
        ('poly', PolynomialFeatures(degree=degree)),
        ('std_scaler', StandardScaler()),
        ('linearSVC', LinearSVC(C=C))
    ])

poly_svc = PolynomialSVC(degree=3)
poly_svc.fit(X, y)

plot_decision_boundary(poly_svc, axis=[-1.5, 2.5, -1.0, 1.5])
plt.scatter(X[y==0, 0], X[y==0, 1])
plt.scatter(X[y==1, 0], X[y==1, 1])
plt.show()

Changing the parameters degree and C changes the model's decision boundary accordingly;
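
One hedged way to see the effect of degree and C without plotting is to refit the pipeline for a few settings and print the training accuracy. The grid of values, the helper name polynomial_svc, and the raised max_iter are assumptions made for this sketch; training accuracy is only a rough indicator and says nothing about generalization.

```python
from sklearn import datasets
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler
from sklearn.svm import LinearSVC

X, y = datasets.make_moons(noise=0.15, random_state=666)

def polynomial_svc(degree, C=1.0):
    # Same three steps as above; max_iter is raised only to reduce convergence warnings.
    return Pipeline([
        ('poly', PolynomialFeatures(degree=degree)),
        ('std_scaler', StandardScaler()),
        ('linearSVC', LinearSVC(C=C, max_iter=10000))
    ])

# Training accuracy for a small, arbitrary grid of settings.
results = {}
for degree in (1, 3, 5):
    for C in (0.1, 1.0, 10.0):
        model = polynomial_svc(degree, C).fit(X, y)
        results[(degree, C)] = model.score(X, y)
        print(f"degree={degree}, C={C}: training accuracy = {results[(degree, C)]:.2f}")
```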

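  4) Method 2: the kernel-based SVC()

With scikit-learn's SVM implementation, there is no need to first use PolynomialFeatures to transform the data into a higher-dimensional set of polynomial features and then pass that data to the algorithm;

SVC(): handles the polynomial features directly through its kernel;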

from sklearn.svm import SVC

# With kernel='poly', SVC() achieves the effect of polynomial features directly.
# The data still needs to be standardized before it is passed to SVC().
def PolynomialKernelSVC(degree, C=1.0):
    return Pipeline([
        ('std_scaler', StandardScaler()),
        ('kernelSVC', SVC(kernel='poly', degree=degree, C=C))
    ])

poly_kernel_svc = PolynomialKernelSVC(degree=3)
poly_kernel_svc.fit(X, y)

plot_decision_boundary(poly_kernel_svc, axis=[-1.5, 2.5, -1.0, 1.5])
plt.scatter(X[y==0, 0], X[y==0, 1])
plt.scatter(X[y==1, 0], X[y==1, 1])
plt.show()

Adjusting the parameters degree and C of PolynomialKernelSVC() changes the decision boundary;
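
As a rough sketch of how these parameters affect the kernel model, the loop below refits the pipeline for a few settings and reports the training accuracy together with the number of support vectors. The grid of values and the helper name polynomial_kernel_svc are arbitrary choices for illustration, not part of the original article.

```python
from sklearn import datasets
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = datasets.make_moons(noise=0.15, random_state=666)

def polynomial_kernel_svc(degree, C=1.0):
    # Standardize first, then fit an SVC with a polynomial kernel.
    return Pipeline([
        ('std_scaler', StandardScaler()),
        ('kernelSVC', SVC(kernel='poly', degree=degree, C=C))
    ])

# Maps (degree, C) to (training accuracy, number of support vectors).
summary = {}
for degree in (2, 3, 5):
    for C in (0.1, 1.0, 10.0):
        model = polynomial_kernel_svc(degree, C).fit(X, y)
        # n_support_ holds the count of support vectors for each class.
        n_sv = int(model.named_steps['kernelSVC'].n_support_.sum())
        summary[(degree, C)] = (model.score(X, y), n_sv)
        print(f"degree={degree}, C={C}: accuracy={summary[(degree, C)][0]:.2f}, support vectors={n_sv}")
```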
