These notes record my work through the programming assignment for Week 2 of Course 1 (Neural Networks and Deep Learning) of Andrew Ng's deep-learning specialization.
Reference: https://github.com/andersy005/deep-learning-specialization-coursera
Overview
What it does: the code decides whether an image contains a cat. This is binary classification: the label is 1 if there is a cat and 0 if there is not.
Training method: a BP (backpropagation) network in its simplest possible form. There is no hidden layer; the input layer connects directly to the output neuron, so z = w'X + b, a = sigmoid(z), y = a, and the weights w form a single column vector (see line 21 in initialize_with_zeros()). The network is one sigmoid neuron whose inputs are not just x1, x2, x3 but x1 through x12288, since each 64*64*3 image flattens to 12288 values.
The one hard part: line 34, dw = (1./m)*np.dot(X,((A-Y).T)). Here dw means dJ/dw, the derivative of the cost function with respect to the weights w, and the formula follows from the chain rule of calculus. Andrew Ng derives it in lecture 2.9, "Logistic Regression Gradient Descent"; watch that video if it is unclear. Once this line makes sense, the rest of the code holds no difficulty.
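For reference, here is that chain-rule derivation in compact form, using the course's notation (lower-case z, a, y for a single example; upper-case X, A, Y stacking all m examples column-wise). This is a sketch of the lecture's steps, not a substitute for it:

```latex
% Per-example model and loss:
z = w^\top x + b, \qquad a = \sigma(z), \qquad
L(a, y) = -\bigl(y \log a + (1 - y) \log(1 - a)\bigr)

% Chain rule through the sigmoid, using \sigma'(z) = \sigma(z)(1 - \sigma(z)):
\frac{\partial L}{\partial z}
  = \frac{\partial L}{\partial a} \cdot \frac{\partial a}{\partial z}
  = \left(-\frac{y}{a} + \frac{1 - y}{1 - a}\right) a (1 - a)
  = a - y

% Averaged over all m examples (the columns of X):
\frac{\partial J}{\partial w} = \frac{1}{m} X (A - Y)^\top, \qquad
\frac{\partial J}{\partial b} = \frac{1}{m} \sum_{i=1}^{m} \bigl(a^{(i)} - y^{(i)}\bigr)
```

The last two formulas are exactly what lines 34-35 of the code compute.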
The BP algorithm
Basic idea: learning consists of two passes, forward propagation of the signal and backward propagation of the error (implemented in the propagate() function).
Mathematical tool: the chain rule of calculus (line 34 of propagate()); the sketch after this list checks the resulting gradient numerically.
Minimizing the cost function: gradient descent (implemented in the optimize() function).
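If you want to convince yourself that the chain-rule formula matches a numerical derivative, here is a minimal, self-contained sanity check on random data. The helper cost_and_dw and the toy shapes are made up for this sketch; they mirror, but are not, the propagate() function in the listing below:

```python
import numpy as np

def sigmoid(z):
    return 1. / (1 + np.exp(-z))

def cost_and_dw(w, b, X, Y):
    # forward pass and analytic (chain-rule) gradient, mirroring propagate()
    m = X.shape[1]
    A = sigmoid(np.dot(w.T, X) + b)
    cost = (-1. / m) * np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A))
    dw = (1. / m) * np.dot(X, (A - Y).T)
    return cost, dw

rng = np.random.default_rng(0)
X = rng.standard_normal((4, 10))               # 4 features, 10 examples
Y = (rng.random((1, 10)) > 0.5).astype(float)  # random 0/1 labels
w = rng.standard_normal((4, 1))
b = 0.3

_, dw = cost_and_dw(w, b, X, Y)

# central finite differences, one weight at a time
eps = 1e-6
dw_num = np.zeros_like(w)
for i in range(w.shape[0]):
    w_plus, w_minus = w.copy(), w.copy()
    w_plus[i] += eps
    w_minus[i] -= eps
    dw_num[i] = (cost_and_dw(w_plus, b, X, Y)[0] -
                 cost_and_dw(w_minus, b, X, Y)[0]) / (2 * eps)

print(np.allclose(dw, dw_num, atol=1e-6))      # expect True
```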
Notes
1. The difference between the loss function and the cost function:
Loss function: the error between the prediction and the true label for a single training example.
Cost function: the average over the entire training set of the per-example errors, i.e. the mean of all the losses (line 32 of propagate()). Both are written out in symbols below.
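In symbols, with a^(i) the prediction and y^(i) the label for example i (this is exactly what line 32 implements):

```latex
% Loss: the error of one training example i
L\bigl(a^{(i)}, y^{(i)}\bigr)
  = -\Bigl(y^{(i)} \log a^{(i)} + \bigl(1 - y^{(i)}\bigr) \log\bigl(1 - a^{(i)}\bigr)\Bigr)

% Cost: the average of the losses over all m training examples
J(w, b) = \frac{1}{m} \sum_{i=1}^{m} L\bigl(a^{(i)}, y^{(i)}\bigr)
```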
```
  1 #!/usr/bin/env python3
  2 # -*- coding: utf-8 -*-

  4 import numpy as np
  5 import matplotlib.pyplot as plt
  6 import h5py
  7 import scipy
  8 from PIL import Image
  9 from scipy import ndimage
 10 from lr_utils import load_dataset
 11 import pylab

 13 # sigmoid activation function
 14 def sigmoid(z):
 15     s = 1. / (1 + np.exp(-z))
 16     return s

 18 # initialize the weights and bias to zero
 19 def initialize_with_zeros(dim):
 20     # there is only one neuron here, so w is a (dim, 1) column vector
 21     w = np.zeros(shape=(dim, 1), dtype=np.float32)
 22     b = 0
 23     # assert raises an error if the condition is false
 24     assert(w.shape == (dim, 1))
 25     assert(isinstance(b, float) or isinstance(b, int))
 26     return w, b

 28 def propagate(w, b, X, Y):
 29     m = X.shape[1]
 30     # forward propagation
 31     A = sigmoid(np.dot(w.T, X) + b)
 32     cost = (-1. / m) * np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A), axis=1)  # sum across the row of examples
 33     # backward propagation
 34     dw = (1. / m) * np.dot(X, (A - Y).T)  # dw is the derivative of the cost with respect to w
 35     db = (1. / m) * np.sum(A - Y, axis=1)  # axis=0 sums down columns, axis=1 sums across rows
 36     assert(dw.shape == w.shape)
 37     assert(db.dtype == float)
 38     cost = np.squeeze(cost)  # squeeze drops size-1 dimensions, turning the 1-element array into a scalar:
 39     # [ 6.00006477]
 40     # 6.000064773192205
 41     assert(cost.shape == ())
 42     grads = {"dw": dw,
 43              "db": db}
 44     return grads, cost

 46 def optimize(w, b, X, Y, num_iterations, learning_rate, print_cost=False):
 47     costs = []
 48     for i in range(num_iterations):
 49         grads, cost = propagate(w=w, b=b, X=X, Y=Y)
 50         dw = grads["dw"]
 51         db = grads["db"]
 52         w = w - learning_rate * dw
 53         b = b - learning_rate * db
 54         if i % 100 == 0:
 55             costs.append(cost)
 56         if print_cost and i % 100 == 0:  # report progress every 100 iterations when print_cost is True
 57             print("Cost after iteration %i: %f" % (i, cost))
 58     params = {"w": w,
 59               "b": b}
 60     grads = {"dw": dw,
 61              "db": db}
 62     return params, grads, costs

 64 def predict(w, b, X):
 65     m = X.shape[1]
 66     Y_prediction = np.zeros((1, m))
 67     w = w.reshape(X.shape[0], 1)
 68     A = sigmoid(np.dot(w.T, X) + b)
 69     # [print(x) for x in A]  -- leftover debugging line that printed each row of A; harmless while commented out
 70     for i in range(A.shape[1]):
 71         if A[0, i] >= 0.5:
 72             Y_prediction[0, i] = 1
 73         else:
 74             Y_prediction[0, i] = 0
 75     assert(Y_prediction.shape == (1, m))

 77     return Y_prediction

 79 def model(X_train, Y_train, X_test, Y_test, num_iterations=2000, learning_rate=0.5, print_cost=False):
 80     # initialize the weights and bias
 81     w, b = initialize_with_zeros(X_train.shape[0])
 82     # run gradient descent to find the best weights and bias
 83     parameters, grads, costs = optimize(w, b, X_train, Y_train, num_iterations, learning_rate, print_cost)
 84     w = parameters["w"]
 85     b = parameters["b"]
 86     # predict on the test and training sets
 87     Y_prediction_test = predict(w, b, X_test)
 88     Y_prediction_train = predict(w, b, X_train)
 89     # print the accuracy
 90     print("train accuracy: {} %".format(100 - np.mean(np.abs(Y_prediction_train - Y_train)) * 100))
 91     print("test accuracy: {} %".format(100 - np.mean(np.abs(Y_prediction_test - Y_test)) * 100))
 92     # collect all results in a dictionary and return it
 93     d = {"costs": costs,
 94          "Y_prediction_test": Y_prediction_test,
 95          "Y_prediction_train": Y_prediction_train,
 96          "w": w,
 97          "b": b,
 98          "learning_rate": learning_rate,
 99          "num_iterations": num_iterations}

101     return d




106 '''Main program starts here'''
107 # load the training and test data
108 train_set_x_orig, train_set_y, test_set_x_orig, test_set_y, classes = load_dataset()

110 # reshape() changes the array's shape: train_set_x_orig.shape[0] is the number of images, and -1 tells
111 # numpy to compute the other dimension itself -- here 64*64*3 = 12288 values per flattened image
112 train_set_x_flatten = train_set_x_orig.reshape(train_set_x_orig.shape[0], -1).T
113 test_set_x_flatten = test_set_x_orig.reshape(test_set_x_orig.shape[0], -1).T

115 # normalize: pixel values range from 0 to 255
116 train_set_x = train_set_x_flatten / 255.
117 test_set_x = test_set_x_flatten / 255.

119 # train the model
120 d = model(train_set_x, train_set_y, test_set_x, test_set_y, num_iterations=2000, learning_rate=0.005, print_cost=True)
121 print(d)
```
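One natural follow-up: plot the costs that optimize() recorded every 100 iterations to watch gradient descent converge. A minimal sketch, assuming the dict d returned by the model() call above (the axis labels are my own):

```python
import numpy as np               # repeated so this snippet stands alone;
import matplotlib.pyplot as plt  # both are already imported in the listing

# learning curve: one recorded cost per 100 iterations
costs = np.squeeze(d["costs"])
plt.plot(costs)
plt.ylabel("cost")
plt.xlabel("iterations (in hundreds)")
plt.title("learning rate = " + str(d["learning_rate"]))
plt.show()
```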