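Problem description:
A three-class classification problem on data with 9 features, where each feature takes a value in {-1, 0, 1}. The data format is as follows: each row of the training sheet holds the 9 feature values, followed by the class label (-1, 0, or 1) in the last column.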
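Designing the perceptron:
Material on implementing a multi-class perceptron yourself, without calling a library, is very scarce online. In an algorithms course I took, the instructor covered neural networks for multi-class classification: unlike regression, the loss function for multi-class classification is cross-entropy. Following that idea, let's derive from scratch how to iterate the weights step by step until they fit the data well.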
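Step 1: hidden layer design, h = W*x + b (where W is a 3x9 matrix, x is a 9-dimensional vector, and b is a 3-dimensional vector)
Step 2: activation function design, a = softmax(h) (where h is a 3-dimensional vector)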
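Step 3: loss function design, Loss = -(y1*ln(a1) + y2*ln(a2) + y3*ln(a3)) (where a1, a2, a3, y1, y2, y3 are scalars, and y1, y2, y3 form the one-hot encoding of the true class)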
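The three steps can be sketched for a single sample as follows. This is a minimal illustration, not the original code: the function names `softmax` and `forward` and the example values of W, b, x, and y are assumptions, and the softmax subtracts the maximum score first as a common guard against overflow.

```python
import numpy as np


def softmax(h):
    """Softmax of a 1-D score vector; subtracting the max avoids overflow."""
    e = np.exp(h - np.max(h))
    return e / e.sum()


def forward(W, b, x, y_onehot):
    """Return (a, loss) for one sample.

    W: (3, 9) weights, b: (3,) biases, x: (9,) features,
    y_onehot: (3,) one-hot encoding of the true class.
    """
    h = W @ x + b                          # step 1: linear layer
    a = softmax(h)                         # step 2: class probabilities
    loss = -np.sum(y_onehot * np.log(a))   # step 3: cross-entropy
    return a, loss


# Illustrative values only (not taken from the dataset)
W = np.zeros((3, 9))
b = np.zeros(3)
x = np.array([1, 0, -1, 0, 1, 1, -1, 0, 0], dtype=float)
y = np.array([0.0, 1.0, 0.0])

a, loss = forward(W, b, x, y)
print(a, loss)
```

With all-zero weights and biases, the three probabilities are equal (1/3 each) and the loss is ln 3, about 1.0986, which is a handy sanity check for the loss value at the start of training.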
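Iterating the weights:
How do we iterate the weights to fit our classifier? Here we use gradient descent, i.e. W = W - lr*dLoss/dW and b = b - lr*dLoss/db, where lr (the learning rate) is a hyperparameter. So all we need to compute are the partial derivatives of the loss with respect to W and b.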
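For softmax combined with cross-entropy, the derivative with respect to h simplifies to dLoss/dh = a - y (a standard result), which gives dLoss/dW = (a - y)*x^T and dLoss/db = a - y. The sketch below computes these gradients for one sample, compares them with a finite-difference estimate, and applies one update. The helper names, the random test values, and the step size lr = 0.1 are illustrative assumptions.

```python
import numpy as np


def softmax(h):
    """Numerically stable softmax of a 1-D score vector."""
    e = np.exp(h - np.max(h))
    return e / e.sum()


def loss_fn(W, b, x, y_onehot):
    """Cross-entropy loss for one sample."""
    a = softmax(W @ x + b)
    return -np.sum(y_onehot * np.log(a))


def gradients(W, b, x, y_onehot):
    """Analytic gradients: dLoss/dW = outer(a - y, x), dLoss/db = a - y."""
    delta = softmax(W @ x + b) - y_onehot
    return np.outer(delta, x), delta


rng = np.random.default_rng(0)
W = rng.normal(size=(3, 9))
b = rng.normal(size=3)
x = rng.choice([-1.0, 0.0, 1.0], size=9)   # features drawn from {-1, 0, 1}
y = np.array([0.0, 0.0, 1.0])              # one-hot label, chosen arbitrarily

dW, db = gradients(W, b, x, y)

# Central finite differences over every entry of W
eps = 1e-6
num_dW = np.zeros_like(W)
for r in range(W.shape[0]):
    for c in range(W.shape[1]):
        W_plus, W_minus = W.copy(), W.copy()
        W_plus[r, c] += eps
        W_minus[r, c] -= eps
        num_dW[r, c] = (loss_fn(W_plus, b, x, y) - loss_fn(W_minus, b, x, y)) / (2 * eps)
max_err = np.max(np.abs(dW - num_dW))

# One gradient-descent update
lr = 0.1
loss_before = loss_fn(W, b, x, y)
W_new, b_new = W - lr * dW, b - lr * db
loss_after = loss_fn(W_new, b_new, x, y)
print(max_err, loss_before, loss_after)
```

A finite-difference check like this is a cheap way to catch a wrong derivative before running a long training loop.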
Code implementation:
import numpy as np
import pandas as pd

# Path to the dataset file
file = 'Dataset.xlsx'


# Load the training set (first 80% of the original training data),
# the validation set (remaining 20%), and the test set
def getData(filepath):
    df_train = pd.read_excel(filepath, sheet_name='training')
    df_test = pd.read_excel(filepath, sheet_name='test')
    length = len(df_train.values)
    split = int(0.8 * length)
    x_train = df_train.values[:split, :-1]
    y_train = df_train.values[:split, -1]
    x_val = df_train.values[split:, :-1]
    y_val = df_train.values[split:, -1]
    x_test = df_test.values[:, :-1]
    return x_train, y_train, x_val, y_val, x_test


def main():
    # Learning rate (value kept from the original; the gradients are summed over
    # the whole training set, so it may need tuning for a given dataset)
    lr = 0.000001
    # Map each class label to a one-hot vector
    classMap = {'-1': [1, 0, 0], '0': [0, 1, 0], '1': [0, 0, 1]}
    # Map a predicted index back to its class label
    class_map = [-1, 0, 1]
    x_train, y_train, x_val, y_val, x_test = getData(file)
    # Randomly initialize W and b
    W = np.random.randn(3, 9)
    b = np.random.randn(3)
    # Train for 6000 iterations
    for i in range(6000):
        loss = 0
        # Initialize the accumulated gradients as NumPy arrays
        # (adding an array to a Python list with += would extend the list instead)
        alpha = np.zeros((3, 9))
        beta = np.zeros(3)
        for xi, yi in zip(x_train, y_train):
            ai = np.dot(W, xi) + b
            y_predicti = np.exp(ai) / sum(np.exp(ai))
            y_i = classMap[str(int(yi))]
            lossi = -sum(np.multiply(y_i, np.log(y_predicti)))
            loss += lossi
            # Accumulate each sample's gradient. For softmax with cross-entropy,
            # dLoss/dh = a - y, so dLoss/dW = outer(a - y, x) and dLoss/db = a - y
            delta = y_predicti - y_i
            alpha += np.outer(delta, xi)
            beta += delta
        # Update W and b
        W -= alpha * lr
        b -= beta * lr
        loss = loss / len(x_train)
        if i % 1000 == 0:
            print('Iteration:', i, 'mean training loss:', loss)
    # Validation: count the correct predictions
    correct = 0
    for xi, yi in zip(x_val, y_val):
        ai = np.dot(W, xi) + b
        # softmax is monotonic, so the largest score is also the most probable class
        y_predicti = class_map[int(np.argmax(ai))]
        correct += 1 if int(y_predicti) == yi else 0
    print('Validation samples:', len(x_val), 'correct predictions:', correct)
    # Test: write one predicted label per line
    with open('perception.csv', 'w') as fp:
        for xi in x_test:
            ai = np.dot(W, xi) + b
            y_predicti = class_map[int(np.argmax(ai))]
            fp.write(str(y_predicti) + '\n')


if __name__ == '__main__':
    print('Method 3: perceptron')
    main()
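For comparison, the per-sample loop can also be written with matrix operations over the whole batch. This is only a sketch under stated assumptions: Dataset.xlsx is not available here, so it trains on synthetic data generated from a randomly chosen linear rule, and the function names, learning rate, and iteration count are hypothetical choices rather than the original author's settings. It averages the gradient over the batch instead of summing it, which makes the learning rate less sensitive to the number of samples.

```python
import numpy as np

CLASSES = np.array([-1, 0, 1])


def one_hot(labels):
    """Turn labels in {-1, 0, 1} into rows of a one-hot matrix."""
    return (labels[:, None] == CLASSES[None, :]).astype(float)


def softmax_rows(H):
    """Row-wise softmax; subtracting each row's max avoids overflow."""
    E = np.exp(H - H.max(axis=1, keepdims=True))
    return E / E.sum(axis=1, keepdims=True)


def train(X, Y, lr=0.5, epochs=1000, seed=0):
    """Batch gradient descent on the mean cross-entropy loss."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    k = Y.shape[1]
    W = 0.01 * rng.normal(size=(k, d))
    b = np.zeros(k)
    for _ in range(epochs):
        A = softmax_rows(X @ W.T + b)   # (n, k) predicted probabilities
        delta = (A - Y) / n             # mean of dLoss/dh over the batch
        W -= lr * (delta.T @ X)         # dLoss/dW, shape (k, d)
        b -= lr * delta.sum(axis=0)     # dLoss/db, shape (k,)
    return W, b


def predict(W, b, X):
    """Return the class label with the highest score for each row of X."""
    return CLASSES[np.argmax(X @ W.T + b, axis=1)]


# Synthetic stand-in for the dataset (NOT the real data)
rng = np.random.default_rng(42)
X = rng.choice([-1.0, 0.0, 1.0], size=(600, 9))
true_W = rng.normal(size=(3, 9))
labels = CLASSES[np.argmax(X @ true_W.T, axis=1)]

split = int(0.8 * len(X))               # 80% training, 20% validation
W, b = train(X[:split], one_hot(labels[:split]))
val_acc = np.mean(predict(W, b, X[split:]) == labels[split:])
print('validation accuracy:', val_acc)
```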