This exercise took me about five days, mainly because my Python fundamentals are weak. I worked through someone else's code line by line, chiefly https://blog.csdn.net/Snoopy_Yuan/article/details/63684219, and then wrote my own version.
I. Lessons learned:
1. When using gradient descent to search for a numerical optimum, watch how the parameter curves converge; the result is only trustworthy once the curves approach a stable value.
In the curves shown above, the values have not stabilized, so the step size of the gradient descent should be reduced, or the number of iterations increased.
The result after adjustment is shown in the figure below.
The curves now converge to stable values and the fit is fairly good. (Oddly, at 500 iterations both larger and smaller step sizes than this one were unstable, but increasing the number of iterations solves that problem.)
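To make the step-size effect concrete, here is a minimal, self-contained sketch (my own illustration, not part of the exercise; the quadratic objective and the step sizes are hypothetical choices) showing how the same fixed-step gradient-descent loop either stabilizes or blows up depending on the step size:

def toy_descent(h, steps=50):
    # minimize f(w) = w**2, whose gradient is 2*w, with a fixed step size h
    w = 5.0
    for _ in range(steps):
        w -= h * 2 * w  # one gradient-descent step: w <- w - h * f'(w)
    return w

for h in (0.1, 0.9, 1.05):
    print(h, toy_descent(h))
# h = 0.10 -> the iterate shrinks smoothly toward 0: the curve stabilizes
# h = 0.90 -> the iterate oscillates in sign but still decays toward 0
# h = 1.05 -> the iterate grows without bound: the curve never stabilizes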
II. Source code
The algorithm part of the source code I typed myself after working through someone else's code; the rest is taken directly from https://blog.csdn.net/Snoopy_Yuan/article/details/63684219. I hope the original author doesn't mind......
1. Main program main_self.py
import numpy as np  # for matrix calculation
import matplotlib.pyplot as plt

# load the CSV file as a numpy matrix
dataset = np.loadtxt('../watermelon_3a.csv', delimiter=",")  # comma-separated; reads a 17-row, 4-column array

# separate the data from the target attributes
X = dataset[:, 1:3]  # take columns 1 and 2 (density, ratio_sugar)
y = dataset[:, 3]    # take column 3 (the label)
m, n = np.shape(X)   # m = number of rows, n = number of columns

# draw scatter diagram to show the raw data
f1 = plt.figure(1)
plt.title('watermelon_3a')
plt.xlabel('density')
plt.ylabel('ratio_sugar')
plt.scatter(X[y == 0, 0], X[y == 0, 1], marker='o', color='k', s=100, label='bad')
# X[y == 0, 0] is a 9-element row vector [0.666 0.243 0.245 0.343 0.639 0.657 0.36 0.593 0.719]:
# the entries of column 0 of X taken from the rows where y == 0 holds
# scatter draws a scatter plot: https://blog.csdn.net/fei347795790/article/details/94331112
plt.scatter(X[y == 1, 0], X[y == 1, 1], marker='o', color='g', s=100, label='good')
plt.legend(loc='upper right')
# plt.show()  # displays the figure; if uncommented, execution blocks here until the window is closed

from sklearn import model_selection

# import self_def
import self_def_xdl

# X_train, X_test, y_train, y_test
X_ex = np.c_[X, np.ones(m)]  # extend the variable matrix to [x, 1]; see the paragraph below Eq. 3.25 on p.59 for why it is written as X_ex
X_train, X_test, y_train, y_test = model_selection.train_test_split(X_ex, y, test_size=0.5, random_state=0)  # split into training and test sets

# using gradDescent to get the optimal parameter beta = [w, b] in page-59
beta = self_def_xdl.gradDscent(X_train, y_train)  # beta = -4.7 0.3 2.5

# prediction, beta mapping to the model
y_pred = self_def_xdl.predict(X_test, beta)
m_test = np.shape(X_test)[0]

# calculation of confusion matrix and prediction accuracy
cfmat = np.zeros((2, 2))
for i in range(m_test):
    if y_pred[i] == y_test[i] == 0:
        cfmat[0, 0] += 1
    elif y_pred[i] == y_test[i] == 1:
        cfmat[1, 1] += 1
    elif y_pred[i] == 0:
        cfmat[1, 0] += 1
    elif y_pred[i] == 1:
        cfmat[0, 1] += 1
print(cfmat)
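The script prints only the raw confusion matrix. As a small follow-up (my addition, not in the referenced post), the prediction accuracy can be read off its diagonal, since with the layout above the rows are the true class and the columns the predicted class:

accuracy = np.trace(cfmat) / np.sum(cfmat)  # correct predictions lie on the diagonal
print('accuracy: %.3f' % accuracy)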
2. Custom functions self_def_xdl.py
import numpy as np
import matplotlib.pyplot as plt


def likelihood_xdl(X, y, beta):
    '''
    Implements Eq. 3.29 on p.59.
    :param X: (x;1), an m-by-(n+1) array; the n columns are the sample dimensions and the m rows are the samples
              (for why there are n+1 columns, see the paragraph below Eq. 3.25)
    :param y: an array of m labels
    :param beta: a 1-by-(n+1) array, beta = (w;b)
    :return: l(beta), a scalar
    '''
    m, n = np.shape(X)
    total = 0
    for i in range(m):
        total += -y[i] * np.dot(beta, X[i].T) + np.log(1 + np.exp(np.dot(beta, X[i].T)))
    return total


def gradDscent(X, y):
    '''
    Batch gradient descent applied to Eq. 3.27 on the training data X, y.
    For larger data sets stochastic gradient descent could be used instead,
    see https://blog.csdn.net/lilyth_lilyth/article/details/8973972
    :param X: (x;1), an m-by-(n+1) array (for the n+1 columns, see the paragraph below Eq. 3.25)
    :param y: an array of m labels
    :return: beta obtained by iterating gradient descent step by step
    '''
    m, n = np.shape(X)
    h = 0.11  # step size
    beta = np.zeros(n)  # 1 row, n columns
    delta_beta = h * np.ones(n)
    max_times = 1500  # number of iterations
    beta_history = np.zeros((n, max_times))  # records the history of beta; note the nested parentheses and the order of n and max_times
    l_beta = 0

    for i in range(max_times):
        beta_history[:, i] = beta.T  # record beta so the convergence can be inspected later
        temp_beta = beta.copy()  # save beta; .copy() is required, plain assignment would only alias the array
        for j in range(n):
            beta[j] += delta_beta[j]  # perturb the j-th component
            l_beta_temp = likelihood_xdl(X, y, beta)
            delta_beta[j] = -h * (l_beta_temp - l_beta) / delta_beta[j]  # update delta_beta with Eqs. B.16, B.17
            beta[j] = temp_beta[j]  # undo the perturbation before handling the next component
        beta += delta_beta
        l_beta = likelihood_xdl(X, y, beta)

    t = np.arange(max_times)
    f3 = plt.figure(3)
    p1 = plt.subplot(311)  # 311 = 3 rows, 1 column, first panel
    p1.plot(t, beta_history[0])
    plt.ylabel('w1')
    p2 = plt.subplot(312)
    p2.plot(t, beta_history[1])
    plt.ylabel('w2')
    p3 = plt.subplot(313)
    p3.plot(t, beta_history[2])
    plt.ylabel('b')
    plt.show()  # show all figures

    return beta


def sigmoid(x, beta):
    '''
    Eq. 3.18 and the paragraph below Eq. 3.25.
    @param x: one sample from the test set, a 1-D row vector
    @param beta: a 1-D row vector
    @return: the sigmoid prediction
    '''
    return 1.0 / (1 + np.exp(-np.dot(beta, x.T)))


def predict(X, beta):
    '''
    Uses Eq. 3.16.
    @param X: data sample form like [x, 1]
    @param beta: the parameter of sigmoid form like [w, b]
    @return: the class label array
    '''
    m, n = np.shape(X)
    y = np.zeros(m)
    for i in range(m):
        if sigmoid(X[i], beta) > 0.5:
            y[i] = 1
    return y
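gradDscent above estimates every partial derivative numerically via Eqs. B.16/B.17, which costs one likelihood evaluation per component per iteration. If I read Eq. 3.30 correctly, the gradient of the objective also has the closed form dl(beta)/dbeta = -sum_i x_i * (y_i - p1(x_i; beta)), so a vectorized alternative could look like the sketch below. This is only an illustration (the step size h=0.05 and the default iteration count are untuned guesses of mine), not the code that produced the results above:

import numpy as np

def gradDscent_analytic(X, y, h=0.05, max_times=1500):
    '''Same objective as gradDscent, but with the closed-form gradient of
    Eq. 3.30 instead of the numerical differences of Eqs. B.16/B.17.
    X: m-by-(n+1) array (x;1); y: m labels; returns beta = (w;b).'''
    m, n = np.shape(X)
    beta = np.zeros(n)
    for _ in range(max_times):
        p1 = 1.0 / (1 + np.exp(-X.dot(beta)))  # p(y=1 | x_i; beta) for all m samples at once
        grad = -X.T.dot(y - p1)                # Eq. 3.30: -sum_i x_i * (y_i - p1_i)
        beta -= h * grad                       # step against the gradient
    return beta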
III. Raw dataset watermelon_3a.csv (columns: id, density, ratio_sugar, label)
1,0.697,0.46,1
2,0.774,0.376,1
3,0.634,0.264,1
4,0.608,0.318,1
5,0.556,0.215,1
6,0.403,0.237,1
7,0.481,0.149,1
8,0.437,0.211,1
9,0.666,0.091,0
10,0.243,0.267,0
11,0.245,0.057,0
12,0.343,0.099,0
13,0.639,0.161,0
14,0.657,0.198,0
15,0.36,0.37,0
16,0.593,0.042,0
17,0.719,0.103,0