SVM Implementation in Python



Theory behind the Python SVM implementation

  • The SVM primal optimization problem:

\[min_{w,b,\xi}{1\over{2}}{||w||}^2 + C\sum_{i=1}^m\xi^{(i)} \]

\[s.t. \ \ y^{(i)}(w^{T}x^{(i)} + b) \ge 1 - \xi^{(i)}, i=1,2,...,m \\ \xi^{(i)} \ge 0, i=1,2,...,m \]

  • Converting the primal problem to its dual

\[min_{\alpha}{1\over{2}}\sum_{i=1}^m\sum_{j=1}^{m}\alpha^{(i)}\alpha^{(j)}y^{(i)}y^{(j)}K(x^{(i)},x^{(j)})-\sum_{i=1}^m\alpha^{(i)} \]

\[s.t. \ \ \sum_{i=1}^m\alpha^{(i)}y^{(i)}=0 \\ 0 \le \alpha^{(i)} \le C, i=1,2,...,m \]

  • For the dual problem above: let \(\alpha^*=(\alpha^*_{1},\alpha^*_{2}, ..., \alpha^*_{m})\) be its solution. If there exists a component \(\alpha^*_{j}\) with \(0 \lt \alpha^*_{j} \lt C\), then the solution \(w^*,b^*\) of the primal problem is

\[w^* = \sum_{i=1}^m\alpha^*_{i}y^{(i)}x^{(i)} \\ b^* = y^{(j)} - \sum_{i=1}^m\alpha^*_{i}y^{(i)}K(x^{(i)}, x^{(j)}) \]
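
A minimal sketch of this recovery step is shown below. The helper name recover_b is an illustration, not part of the original code; it assumes alphas and y are m x 1 numpy matrices, K is the m x m kernel matrix, and from numpy import * is in scope as in the code further down.

    def recover_b(alphas, y, K, C):
        # pick any index j with 0 < alpha_j < C (a non-bound support vector)
        j = nonzero((alphas.A > 0) * (alphas.A < C))[0][0]
        # b* = y_j - sum_i alpha_i * y_i * K(x_i, x_j)
        return float(y[j] - multiply(alphas, y).T * K[:, j])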

  • Formulas used in the SMO algorithm (the subscripts 1 and 2 in these formulas do not refer to the 1st and 2nd samples of the dataset, but to the first and second variables being optimized; in the code we use i and j instead, where i and j are indices into the input samples X)

    • Compute \(E^{(i)}\) -> in the calc_E function

    \[E^{(i)} = g(x^{(i)})-y^{(i)} \\ where \ \ g(x^{(i)}) = \sum_{k=1}^m\alpha^{(k)}y^{(k)}K(x^{(k)},x^{(i)}) + b \\ so \ \ E^{(i)} = (\sum_{k=1}^m\alpha^{(k)}y^{(k)}K(x^{(k)},x^{(i)}) + b) - y^{(i)} \ \ \ for \ \ i = 1,2 \]
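
    As a side note, with the matrix layout used in the code below (alphas and y stored as m x 1 matrices, mapped_data as the m x m kernel matrix), all E values can be computed at once. This vectorized form is only a sketch and is not part of the original implementation:

      # sketch only; assumes from numpy import *, as in the code below
      # returns a 1 x m matrix whose i-th entry is g(x^(i)) - y^(i)
      def calc_E_all(alphas, y, K, b):
          return (multiply(alphas, y).T * K + b) - y.T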

    • Compute \(\alpha^{new,unc}_2\) -> in the iterate function

    \[\alpha^{new,unc}_2 = \alpha^{old}_2 + {{y^{(j)}}(E^{(i)} - E^{(j)})\over{\eta}} \\ where \ \ \eta = K_{11} + K_{22} - 2K_{12} \]

    Note: \(K_{11}\) denotes \(K(x^{(i)},x^{(i)})\) in the kernel-mapped input (mapped_data[i, i] in the code), \(K_{22}\) denotes \(K(x^{(j)},x^{(j)})\) (mapped_data[j, j]), and \(K_{12}\) denotes \(K(x^{(i)},x^{(j)})\) (mapped_data[i, j]). Also note that \(\eta\) must not be less than or equal to 0; if that happens, the iteration function simply returns 0.
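
    In code this corresponds to the following lines of the iterate function further below:

      eta = model.mapped_data[i, i] + model.mapped_data[j, j] - 2 * model.mapped_data[i, j]
      if eta <= 0:
          return 0  # skip this pair, as noted above
      alpha2new_unclip = alpha2old + yj * (Ei - Ej) / eta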

    • Clip \(\alpha^{new,unc}_2\) -> in the clip_alpha function

      • If \(y^{(1)}\ne y^{(2)}\)

        \[L=max(0,\alpha^{old}_2-\alpha^{old}_1) \\ H=min(C, C + \alpha^{old}_2 - \alpha^{old}_1) \]

      • If \(y^{(1)}=y^{(2)}\)

        \[L=max(0,\alpha^{old}_2+\alpha^{old}_1-C) \\ H=min(C, \alpha^{old}_2+\alpha^{old}_1) \]

      • Note: L and H must not be equal; if L == H, return 0 immediately
      • Define the function clip_alpha(L, H, alpha)
        
        def clip_alpha(L, H, alpha):
            if alpha > H:
                alpha = H
            elif alpha < L:
                alpha = L
            return alpha
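      • For example, clip_alpha(0, 0.5, 0.7) returns 0.5 and clip_alpha(0, 0.5, -0.1) returns 0.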
        
      • Compute \(\alpha^{new}_1\)

      \[\alpha^{new}_1=\alpha^{old}_1+y^{(1)}y^{(2)}(\alpha^{old}_2-\alpha^{new}_2) \]

      • After computing \(\alpha^{new}_1\), compare \(abs(\alpha^{new}_1-\alpha^{old}_1)\) with a chosen precision (the code below uses 1e-5 and checks the equivalent quantity \(abs(\alpha^{new}_2-\alpha^{old}_2)\)); if the change is smaller than the precision, return 0 immediately
    • Select the second variable \(\alpha_2\): the criterion is to maximize \(abs(E^{(i)}-E^{(j)})\); the \(j\) achieving the maximum is the one we want, so \(\alpha_2=\alpha_j\). Define the function select_j(Ei, i, model)

      # model wraps the parameters of the SVM formulation (C, toler, kernel_param) and the data (X, y)
      # model also has an E attribute: an m x 2 matrix, initially all zeros; once the E value for a
      # given alpha has been computed, it is cached with model.E[i, :] = [1, Ei]
      def select_j(Ei, i, model):
          j = 0
          max_delta_E = 0
          Ej = 0
          # find all samples whose E value has already been computed
          nonzero_indices = nonzero(model.E[:, 0].A)[0]
          if len(nonzero_indices) > 1:
              # in this loop, find the Ej and j that maximize abs(Ei - Ej)
              for index in nonzero_indices:
                  # the two alphas must not come from the same sample
                  if index == i:
                      continue
                  E_temp = calc_E(index, model)
                  delta_E = abs(E_temp - Ei)
                  if delta_E > max_delta_E:
                      max_delta_E = delta_E
                      j = index
                      Ej = E_temp
          else:
              # no usable cached E values yet: pick a random j different from i
              j = i
              while j == i:
                  j = int(random.uniform(0, model.m))
              Ej = calc_E(j, model)
          return j, Ej
      
    • Check whether \(\alpha_1\) violates the KKT conditions

    
    if (yi * Ei > toler and alphai > 0) or (yi * Ei < -toler and alphai < C):
        # KKT conditions violated: proceed with the update
        pass
    else:
        # KKT conditions satisfied: return 0 immediately
        pass
    
  • Use the SMO algorithm to solve the dual problem for \(\alpha\), and from that obtain \(w,b\). The rough procedure is as follows

    • Initialize \(\alpha\) as an \(m\times1\) vector of zeros. At the start there is no basis for choosing \(\alpha^{(1)}\), so we simply start from the first \(\alpha\) and go through them in order
    • If the selected \(\alpha\) does not violate the KKT conditions, skip it and move on to the next \(\alpha\)
    • Pass the selected \(\alpha^{(1)}\) to the iterate function, which chooses the second variable \(\alpha^{(2)}\) based on the current \(\alpha^{(1)}\) and then updates \(\alpha^{(2)}\) and \(\alpha^{(1)}\) using the formulas given above; note that the chosen \(\alpha_2\) must not correspond to the same input sample as \(\alpha_1\)
    • After iterating over all \(\alpha\), the next step is to find the \(\alpha\) that satisfy the support-vector condition, i.e. \(0 \lt \alpha \lt C\), and keep iterating over those

Python code for the SVM implementation



#!/usr/bin/env python
from numpy import *


class Model(object):

    def __init__(self, X, y, C, toler, kernel_param):
        self.X = X
        self.y = y
        self.C = C
        self.toler = toler
        self.kernel_param = kernel_param
        self.m = shape(X)[0]
        self.mapped_data = mat(zeros((self.m, self.m)))
        for i in range(self.m):
            self.mapped_data[:, i] = gaussian_kernel(self.X, X[i, :], self.kernel_param)
        self.E = mat(zeros((self.m, 2)))
        self.alphas = mat(zeros((self.m, 1)))
        self.b = 0


def load_data(filename):
    X = []
    y = []
    with open(filename, 'r') as fd:
        for line in fd.readlines():
            nums = line.strip().split(',')
            X_temp = []
            for i in range(len(nums)):
                if i == len(nums) - 1:
                    y.append(float(nums[i]))
                else:
                    X_temp.append(float(nums[i]))
            X.append(X_temp)
    return mat(X), mat(y)

def gaussian_kernel(X, l, kernel_param): 
    sigma = kernel_param 
    m = shape(X)[0]
    mapped_data = mat(zeros((m, 1)))
    for i in range(m):
        # squared Euclidean distance between sample i and the landmark l
        delta = X[i, :] - l
        mapped_data[i] = exp(-float(delta * delta.T) / (2 * sigma ** 2))
    return mapped_data

def clip_alpha(L, H, alpha):
    if alpha > H:
        alpha = H
    elif alpha < L:
        alpha = L
    return alpha

def calc_b(b1, b2):
    return (b1 + b2) / 2

def calc_E(i, model):
    yi = float(model.y[i])
    gxi = float(multiply(model.alphas, model.y).T * model.mapped_data[:, i] + model.b)
    Ei = gxi - yi
    return Ei

def select_j(Ei, i, model):
    nonzero_indices = nonzero(model.E[:, 0].A)[0]
    Ej = 0
    j = 0
    max_delta = 0
    if len(nonzero_indices) > 1:
        for index in nonzero_indices:
            if index == i:
                continue
            E_temp = calc_E(index, model)
            delta = abs(E_temp - Ei)
            if delta > max_delta:
                max_delta = delta
                Ej = E_temp
                j = index
    else:
        j = i
        while j == i:
            j = int(random.uniform(0, model.m))
        Ej = calc_E(j, model)
    return j, Ej

def iterate(i, model):
    yi = model.y[i]
    Ei = calc_E(i, model)
    model.E[i] = [1, Ei]
    # if alphai violates the KKT conditions, proceed: select alphaj, then update alphai, alphaj and b
    if (yi * Ei > model.toler and model.alphas[i] > 0) or (yi * Ei < -model.toler and model.alphas[i] < model.C):
        # alphai violates the KKT conditions
        # select alphaj
        j, Ej = select_j(Ei, i, model)
        yj = model.y[j] 
        alpha1old = model.alphas[i].copy()
        alpha2old = model.alphas[j].copy()
        eta = model.mapped_data[i, i] + model.mapped_data[j, j] - 2 * model.mapped_data[i, j]   
        if eta <= 0:
            return 0
        alpha2new_unclip = alpha2old + yj * (Ei - Ej) / eta
        if yi == yj:
            L = max(0, alpha2old + alpha1old - model.C)
            H = min(model.C, alpha1old + alpha2old)
        else:
            L = max(0, alpha2old - alpha1old)
            H = min(model.C, model.C - alpha1old + alpha2old)
        if L == H:
            return 0
        alpha2new = clip_alpha(L, H, alpha2new_unclip)
        if abs(alpha2new - alpha2old) < 0.00001:
            return 0
        alpha1new = alpha1old + yi * yj * (alpha2old - alpha2new)
        b1new = -Ei - yi * model.mapped_data[i, i] * (alpha1new - alpha1old) \
                - yj * model.mapped_data[j, i] * (alpha2new - alpha2old) + model.b
        b2new = -Ej - yi * model.mapped_data[i, j] * (alpha1new - alpha1old) \
                - yj * model.mapped_data[j, j] * (alpha2new - alpha2old) + model.b
        model.b = calc_b(b1new, b2new)
        model.alphas[i] = alpha1new
        model.alphas[j] = alpha2new
        model.E[i] = [1, calc_E(i, model)]
        model.E[j] = [1, calc_E(j, model)]
        return 1
    return 0

def smo(X, y, C, toler, iter_num, kernel_param):
    model = Model(X, y.T, C, toler, kernel_param)
    changed_alphas = 0
    current_iter = 0
    for i in range(model.m):
        changed_alphas += iterate(i, model)
        print("iter:%d i:%d,pairs changed %d"
              %(current_iter, i, changed_alphas))
    current_iter += 1
    print('start...') 
    while current_iter < iter_num and changed_alphas > 0:
        changed_alphas = 0
        # iterate only over the non-bound alphas (0 < alpha < C), i.e. the support vectors
        alphas_indice = nonzero((model.alphas.A > 0) * (model.alphas.A < C))[0]
        for i in alphas_indice:
            changed_alphas += iterate(i, model)
            print("iter:%d i:%d,pairs changed %d"
                  %(current_iter, i, changed_alphas))
        current_iter += 1
    return model.alphas, model.b
  • Note: when testing the SVM, the test samples must also be mapped through the kernel function into the same feature representation, so both the training set and the test set need to be passed through the kernel function
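
  • A minimal, hypothetical usage sketch is shown below: the predict helper, the file names train.csv/test.csv, and the hyper-parameter values are illustrative assumptions, not part of the original code. It maps each test sample against the training support vectors with gaussian_kernel before computing the decision value, and assumes it lives in the same file as the code above (so from numpy import * is in scope).

    def predict(X_train, y_train, alphas, b, x, kernel_param):
        # keep only the support vectors (alpha > 0); the other samples do not affect g(x)
        sv = nonzero(alphas.A > 0)[0]
        # map the single test sample x against the support vectors with the same kernel
        k = gaussian_kernel(X_train[sv], x, kernel_param)
        g = multiply(alphas[sv], y_train[sv]).T * k + b
        return sign(float(g))

    X_train, y_train = load_data('train.csv')                 # assumed file name
    alphas, b = smo(X_train, y_train, 1.0, 0.001, 40, 1.3)    # assumed hyper-parameters
    X_test, y_test = load_data('test.csv')                    # assumed file name
    y_col = y_train.T                                         # m x 1 column of training labels
    errors = 0
    for i in range(shape(X_test)[0]):
        if predict(X_train, y_col, alphas, b, X_test[i, :], 1.3) != sign(y_test[0, i]):
            errors += 1
    print('test error rate: %f' % (float(errors) / shape(X_test)[0]))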

