Contents:
1. The perceptron
2. The perceptron training rule
3. Gradient descent and the delta rule
4. Python implementation
1. The perceptron [1]
Artificial neural networks are built on the perceptron. A perceptron takes a vector of real-valued inputs, computes a linear combination of those inputs, and then outputs 1 if the result is greater than some threshold and -1 (or 0) otherwise. More precisely, given inputs $x_1$ through $x_n$, the output computed by the perceptron is:

$o(x_1, \ldots, x_n) = \begin{cases} 1 & \text{if } w_0 + w_1 x_1 + \cdots + w_n x_n > 0 \\ -1 & \text{otherwise} \end{cases}$
where each $w_i$ is a real-valued constant, called a weight, which determines the contribution of input $x_i$ to the perceptron's output. Because the output is decided by a single threshold, this kind of perceptron is sometimes called a hard-limit perceptron; when its outputs are 1 and -1, it is also called a sgn (sign) perceptron.
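As a minimal sketch of the sgn perceptron just described (the function name and weights are illustrative; these particular weights happen to implement logical AND):

```python
import numpy as np

def perceptron_output(w, x):
    # w[0] is the threshold weight w0, paired with a fixed input x0 = 1;
    # output 1 when the linear combination is positive, otherwise -1
    s = w[0] + np.dot(w[1:], x)
    return 1 if s > 0 else -1

# Illustrative weights that make the perceptron compute logical AND
w = np.array([-0.8, 0.5, 0.5])
print(perceptron_output(w, [1, 1]))  # 1:  -0.8 + 0.5 + 0.5 = 0.2 > 0
print(perceptron_output(w, [1, 0]))  # -1: -0.8 + 0.5 = -0.3 <= 0
```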
2. The perceptron training rule [1]
The learning task of the perceptron is to determine a weight vector that makes the perceptron output the correct 1 or -1 for each of the given training examples. One way to obtain an acceptable weight vector is to begin with random weights, then repeatedly apply the perceptron to each training example, modifying the weights whenever it misclassifies an example. This process is repeated until the perceptron classifies all training examples correctly. At each step, the weights are modified according to the perceptron training rule: $w_i \leftarrow w_i + \Delta w_i$, where $\Delta w_i = \eta (t - o) x_i$. Here $t$ is the target output, $o$ is the perceptron's actual output, and $\eta$ is the learning rate, which moderates or accelerates the size of each weight adjustment.
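The loop described above can be sketched as follows (a minimal NumPy sketch; the function name and the toy OR data are illustrative):

```python
import numpy as np

def train_perceptron(X, t, eta=0.1, max_epochs=100):
    # X: (m, n) inputs; t: (m,) targets in {-1, 1}
    w = np.zeros(X.shape[1] + 1)               # weights; w[0] is the threshold weight
    for _ in range(max_epochs):
        misclassified = 0
        for x, target in zip(X, t):
            o = 1 if w[0] + np.dot(w[1:], x) > 0 else -1
            if o != target:
                # perceptron training rule: delta_wi = eta * (t - o) * xi
                w[0] += eta * (target - o)     # threshold input x0 = 1
                w[1:] += eta * (target - o) * x
                misclassified += 1
        if misclassified == 0:                 # all examples classified correctly
            break
    return w

# Linearly separable toy data: logical OR with -1/1 labels
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
t = np.array([-1, 1, 1, 1])
w = train_perceptron(X, t)
```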
3. Gradient descent and the delta rule [1]
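The delta rule trains an unthresholded linear unit $o = w_0 + w_1 x_1 + \cdots + w_n x_n$ by gradient descent on the squared error $E(\vec{w}) = \frac{1}{2}\sum_d (t_d - o_d)^2$, which gives the update $\Delta w_i = \eta \sum_d (t_d - o_d) x_{id}$. A minimal batch-gradient-descent sketch (function name, step size, and toy data are illustrative):

```python
import numpy as np

def delta_rule(X, t, eta=0.02, epochs=1000):
    # Batch gradient descent for an unthresholded linear unit o = X_b @ w,
    # minimizing E(w) = 1/2 * sum((t - o)^2); each step applies
    # delta_wi = eta * sum_d (t_d - o_d) * x_id, i.e. w += eta * X_b.T @ (t - o)
    m, n = X.shape
    X_b = np.hstack([np.ones((m, 1)), X])  # prepend x0 = 1 for the weight w0
    w = np.zeros(n + 1)
    for _ in range(epochs):
        o = X_b @ w
        w += eta * X_b.T @ (t - o)
    return w

# Toy data generated from t = 1 + 2x: the fitted weights approach [1, 2]
X = np.array([[0.0], [1.0], [2.0], [3.0]])
t = np.array([1.0, 3.0, 5.0, 7.0])
w = delta_rule(X, t)
```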
4. Python implementation [2]
Training data: 500 training examples in total; download link: https://pan.baidu.com/s/1qWugzIzdN9qZUnEw4kWcww, extraction code: ncuj
Loss function: mean squared error (MSE)
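For reference, the MSE loss can be sketched as follows (the function name is illustrative):

```python
import numpy as np

def mse(target, output):
    # mean of the squared differences between targets and model outputs
    target = np.asarray(target, dtype=float)
    output = np.asarray(output, dtype=float)
    return np.mean((target - output) ** 2)

print(mse([0, 1, 1, 0], [0, 1, 0, 0]))  # 0.25: one of four labels is wrong
```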
The code is as follows:
import numpy as np
import matplotlib.pyplot as plt


class hardlim():
    def __init__(self, path):
        self.path = path

    def file2matrix(self, delimiter):
        with open(self.path, 'r') as fp:
            content = fp.read()         # the whole file as a single string
        rowlist = content.splitlines()  # split it into a list of lines
        # split each non-empty line on the delimiter into a row of floats
        recordlist = [list(map(float, row.split(delimiter)))
                      for row in rowlist if row.strip()]
        return np.mat(recordlist)

    def drawScatterbyLabel(self, dataSet):
        m, n = dataSet.shape
        target = np.array(dataSet[:, -1]).squeeze()  # flatten labels to 1-D
        for i in range(m):
            if target[i] == 0:
                plt.scatter(dataSet[i, 0], dataSet[i, 1], c='blue', marker='o')
            if target[i] == 1:
                plt.scatter(dataSet[i, 0], dataSet[i, 1], c='red', marker='o')

    def buildMat(self, dataSet):
        m, n = dataSet.shape
        dataMat = np.zeros((m, n))
        dataMat[:, 0] = 1               # bias column: x0 = 1
        dataMat[:, 1:] = dataSet[:, :-1]
        return np.mat(dataMat)

    def classifier(self, x):
        # hard limit: map the linear outputs onto the 0/1 labels
        x[x >= 0.5] = 1
        x[x < 0.5] = 0
        return x


if __name__ == '__main__':
    hardlimit = hardlim('testSet.txt')

    print('1. load the data')
    inputData = hardlimit.file2matrix('\t')
    target = inputData[:, -1]
    m, n = inputData.shape
    print('size of input data: {} * {}'.format(m, n))

    print('2. draw a scatter plot grouped by label')
    hardlimit.drawScatterbyLabel(inputData)

    print('3. build the coefficient matrix')
    dataMat = hardlimit.buildMat(inputData)

    alpha = 0.1                 # learning rate
    steps = 600                 # total iterations
    weights = np.ones((n, 1))   # initialize the weights
    weightlist = []

    print('4. train the model')
    for k in range(steps):
        output = hardlimit.classifier(dataMat * np.mat(weights))
        errors = target - output
        print('iteration: {} error_norm: {}'.format(k, np.linalg.norm(errors)))
        weights = weights + alpha * dataMat.T * errors  # gradient-descent-style update
        weightlist.append(weights)

    print('5. plot the training process')
    X = np.linspace(-5, 15, 301)
    weights = np.array(weights)
    length = len(weightlist)
    for idx in range(length):
        if idx % 100 == 0:
            weight = np.array(weightlist[idx])
            # the decision hyperplane: w0 + w1*x + w2*y = 0
            Y = -(weight[0] + X * weight[1]) / weight[2]
            plt.plot(X, Y)
            plt.annotate('hplane:' + str(idx), xy=(X[0], Y[0]))
    plt.show()

    print('6. apply the model to the test data')
    testdata = np.mat([-0.147324, 2.874846])  # one test example
    m, n = testdata.shape
    testmat = np.zeros((m, n + 1))
    testmat[:, 0] = 1
    testmat[:, 1:] = testdata
    result = testmat * np.mat(weights)
    if result < 0.5:
        print(0)
    else:
        print(1)
The training results are as follows (result figure omitted; the plot from step 5 shows the decision hyperplane at every 100th iteration):
[References]
[1] Machine Learning, Tom Mitchell, Chapter 4
[2] 《機器學習算法原理與編程實踐》 (Machine Learning Algorithms: Principles and Programming Practice), Zheng Jie, Chapter 5, Section 5.2.2