Andrew Ng Deep Learning, Course 1, Week 4 Programming Assignment: assignment4_2

Reposted article, 2020-07-25 00:00. Tags: DeepLearning.ai, Andrew Ng Deep Learning programming assignments, Andrew Ng, deep learning, programming assignments

Deep Neural Network for Image Classification: Application

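I worked through this assignment step by step in a Jupyter notebook, adding some material I looked up along the way (sources are credited) and translating the instructions. If you spot any mistakes, corrections are welcome!

When you finish this, you will have finished the last programming assignment of Week 4, and also the last programming assignment of this course!

You will use the functions you implemented in the previous assignment to build a deep network and apply it to cat vs non-cat classification. Hopefully, you will see an improvement in accuracy relative to your previous logistic regression implementation.

After this assignment you will be able to:

　　• Build and apply a deep neural network to supervised learning.

Let's get started!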

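1 - Packages

• numpy is the fundamental package for scientific computing with Python.
• matplotlib is a library for plotting graphs in Python.
• h5py is a common package for interacting with datasets stored in H5 files.
• PIL and scipy are used here to test your model with your own picture at the end.
• dnn_app_utils provides the functions implemented in the "Building your Deep Neural Network: Step by Step" assignment, that is, the functions we wrote in the previous part.
• np.random.seed(1) is used to keep all the random function calls consistent. It will help us grade your work.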

import time
import numpy as np
import h5py
import matplotlib.pyplot as plt
import scipy
from PIL import Image
from scipy import ndimage
from dnn_app_utils_v2 import *

%matplotlib inline
plt.rcParams['figure.figsize'] = (5.0, 4.0) # set default size of plots
plt.rcParams['image.interpolation'] = 'nearest'
plt.rcParams['image.cmap'] = 'gray'

%load_ext autoreload
%autoreload 2

np.random.seed(1)

For background on %load_ext autoreload (automatic module reloading), see the post by 熊熊的小心心.

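2 - Dataset

You will use the same "Cat vs non-Cat" dataset as in "Logistic Regression as a Neural Network" (Assignment 2). The model you built there had 70% test accuracy on classifying cat vs non-cat images. Hopefully, your new model will perform better!

Problem statement: You are given a dataset ("data.h5") containing:

- a training set of m_train images labelled as cat (1) or non-cat (0)
- a test set of m_test images labelled as cat and non-cat
- each image is of shape (num_px, num_px, 3), where 3 is for the 3 channels (RGB).

Let's get more familiar with the dataset. Load the data by running the cell below.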

train_x_orig, train_y, test_x_orig, test_y, classes = load_data()

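The following code will show you an image in the dataset. Feel free to change the index and re-run the cell multiple times to see other images. (The training set has 209 images in total.)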

# Example of a picture
index = 7
plt.imshow(train_x_orig[index])
print ("y = " + str(train_y[0,index]) + ". It's a " + classes[train_y[0,index]].decode("utf-8") +  " picture.")

# Explore your dataset 
m_train = train_x_orig.shape[0]
num_px = train_x_orig.shape[1]
m_test = test_x_orig.shape[0]

print ("Number of training examples: " + str(m_train))
print ("Number of testing examples: " + str(m_test))
print ("Each image is of size: (" + str(num_px) + ", " + str(num_px) + ", 3)")
print ("train_x_orig shape: " + str(train_x_orig.shape))
print ("train_y shape: " + str(train_y.shape))
print ("test_x_orig shape: " + str(test_x_orig.shape))
print ("test_y shape: " + str(test_y.shape))

As usual, you reshape and standardize the images before feeding them to the network. The code is given in the cell below.

Figure 1: Image to vector conversion.

# Reshape the training and test examples 
train_x_flatten = train_x_orig.reshape(train_x_orig.shape[0], -1).T   # The "-1" makes reshape flatten the remaining dimensions
test_x_flatten = test_x_orig.reshape(test_x_orig.shape[0], -1).T

# Standardize data to have feature values between 0 and 1 (the maximum RGB value is 255).
train_x = train_x_flatten/255.
test_x = test_x_flatten/255.

print ("train_x's shape: " + str(train_x.shape))
print ("test_x's shape: " + str(test_x.shape))
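To see what the reshape and scaling do, here is a minimal sketch on a tiny synthetic array. The array `fake_images` and its sizes are made up for illustration; the real data comes from load_data() above.

```python
import numpy as np

# Hypothetical stand-in for train_x_orig: 4 images of 2 x 2 pixels with 3 channels.
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(4, 2, 2, 3))

# Keep the first axis (examples), flatten everything else, then transpose
# so that each COLUMN holds one flattened image.
flat = fake_images.reshape(fake_images.shape[0], -1).T

# Scale pixel values from [0, 255] down to [0, 1].
scaled = flat / 255.0

print(flat.shape)  # (12, 4)
```

Each column of `flat` equals the corresponding image read out in (row, column, channel) order, which is why the same reshape must be applied to both the training and the test set.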

Result: (output not shown)

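3 - Architecture of your model

Now that you are familiar with the dataset, it is time to build a deep neural network to distinguish cat images from non-cat images.

You will build two different models:

　　A 2-layer neural network

　　An L-layer deep neural network

You will then compare the performance of these models, and also try out different values for L.

Let's look at the two architectures.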

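3.1 - 2-layer neural network

Figure 2: 2-layer neural network.
The model can be summarized as: INPUT -> LINEAR -> RELU -> LINEAR -> SIGMOID -> OUTPUT.

Detailed architecture of Figure 2:

• The input is a (64,64,3) image which is flattened to a vector of size (12288,1).
• The corresponding vector $[x_0, x_1, \ldots, x_{12287}]^T$ is then multiplied by the weight matrix $W^{[1]}$ of size $(n^{[1]}, 12288)$.
• You then add a bias term and take its relu to get the following vector: $[a_0^{[1]}, a_1^{[1]}, \ldots, a_{n^{[1]}-1}^{[1]}]^T$.
• You then repeat the same process.
• You multiply the resulting vector by $W^{[2]}$ and add your intercept (bias).
• Finally, you take the sigmoid of the result. If it is greater than 0.5, you classify it to be a cat.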
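The steps above can be sketched as a single forward pass in NumPy. This is only an illustration: the input is random noise standing in for a flattened image, the weights are small random values rather than trained ones, and the helper names `relu` and `sigmoid` are defined here rather than taken from dnn_app_utils.

```python
import numpy as np

def relu(z):
    return np.maximum(0, z)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(1)
n_x, n_h, n_y = 12288, 7, 1            # input size, hidden units, output units

x = rng.random((n_x, 1))               # random stand-in for one flattened, normalized image
W1 = rng.standard_normal((n_h, n_x)) * 0.01
b1 = np.zeros((n_h, 1))
W2 = rng.standard_normal((n_y, n_h)) * 0.01
b2 = np.zeros((n_y, 1))

a1 = relu(W1 @ x + b1)                 # LINEAR -> RELU, shape (7, 1)
a2 = sigmoid(W2 @ a1 + b2)             # LINEAR -> SIGMOID, shape (1, 1)
is_cat = bool(a2[0, 0] > 0.5)          # classify as cat when the output exceeds 0.5
```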

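3.2 - L-layer deep neural network

It is hard to represent an L-layer deep neural network with the above representation. However, here is a simplified network representation:

Figure 3: L-layer neural network.
The model can be summarized as: [LINEAR -> RELU] * (L-1) -> LINEAR -> SIGMOID.

Detailed architecture of Figure 3:

• The input is a (64,64,3) image which is flattened to a vector of size (12288,1).
• The corresponding vector $[x_0, x_1, \ldots, x_{12287}]^T$ is then multiplied by the weight matrix $W^{[1]}$ and then you add the intercept $b^{[1]}$. The result is called the linear unit.
• Next, you take the relu of the linear unit. This process can be repeated several times for each $(W^{[l]}, b^{[l]})$, depending on the model architecture.
• Finally, you take the sigmoid of the final linear unit. If it is greater than 0.5, you classify it to be a cat.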
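As a rough sketch of the same idea for an arbitrary number of layers, the loop below applies LINEAR -> RELU for every layer except the last, then LINEAR -> SIGMOID. The function `forward`, the small layer sizes and the random weights are all hypothetical and chosen only to keep the example fast; this is not the L_model_forward helper from the previous assignment.

```python
import numpy as np

def forward(x, weights, biases):
    """[LINEAR -> RELU] * (L-1) -> LINEAR -> SIGMOID over lists of W and b."""
    a = x
    for W, b in zip(weights[:-1], biases[:-1]):
        a = np.maximum(0, W @ a + b)           # linear unit followed by relu
    z_last = weights[-1] @ a + biases[-1]      # final linear unit
    return 1.0 / (1.0 + np.exp(-z_last))       # sigmoid

rng = np.random.default_rng(0)
dims = [12, 5, 3, 1]                           # hypothetical layer sizes (input first)
weights = [rng.standard_normal((dims[l], dims[l - 1])) * 0.1 for l in range(1, len(dims))]
biases = [np.zeros((dims[l], 1)) for l in range(1, len(dims))]

x = rng.random((dims[0], 4))                   # 4 random examples, one per column
probs = forward(x, weights, biases)            # shape (1, 4)
labels = (probs > 0.5).astype(int)             # 1 = cat, 0 = non-cat
```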

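3.3 - General methodology

As usual, you will follow the deep learning methodology to build the model:

1. Initialize parameters / define hyperparameters
2. Loop for num_iterations:
　　a. Forward propagation
　　b. Compute the cost function
　　c. Backward propagation
　　d. Update parameters (using the parameters and the gradients from backprop)
3. Use the trained parameters to predict labels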
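To make the three steps concrete, here is a minimal sketch that follows them on a toy problem. It uses plain logistic regression on made-up two-feature data instead of a deep network, so the data, learning rate and iteration count are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((2, 100))                  # toy data: 2 features, 100 examples
Y = (X[0:1, :] + X[1:2, :] > 0).astype(float)      # toy labels, shape (1, 100)

# 1. Initialize parameters / define hyperparameters
w, b = np.zeros((1, 2)), 0.0
learning_rate, num_iterations = 0.5, 200
costs = []

# 2. Loop for num_iterations
for i in range(num_iterations):
    A = 1.0 / (1.0 + np.exp(-(w @ X + b)))                     # a. forward propagation
    cost = -np.mean(Y * np.log(A) + (1 - Y) * np.log(1 - A))   # b. cost
    dZ = A - Y                                                 # c. backward propagation
    dw, db = dZ @ X.T / X.shape[1], dZ.mean()
    w, b = w - learning_rate * dw, b - learning_rate * db      # d. update parameters
    costs.append(cost)

# 3. Use the trained parameters to predict labels
predictions = (1.0 / (1.0 + np.exp(-(w @ X + b))) > 0.5).astype(float)
accuracy = np.mean(predictions == Y)
```

The two models in this assignment follow the same skeleton; only the forward and backward steps become deeper.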

Now let's implement these two models!

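4 - Two-layer neural network

Question: Use the helper functions you implemented in the previous assignment to build a 2-layer neural network with the following structure: LINEAR -> RELU -> LINEAR -> SIGMOID. The functions you may need and their inputs are: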

def initialize_parameters(n_x, n_h, n_y):
    ...
    return parameters 
def linear_activation_forward(A_prev, W, b, activation):
    ...
    return A, cache
def compute_cost(AL, Y):
    ...
    return cost
def linear_activation_backward(dA, cache, activation):
    ...
    return dA_prev, dW, db
def update_parameters(parameters, grads, learning_rate):
    ...
    return parameters

### CONSTANTS DEFINING THE MODEL ####
n_x = 12288     # num_px * num_px * 3 (64 x 64 x 3)
n_h = 7         # number of units in the hidden layer
n_y = 1         # a single output unit (the predicted label)
layers_dims = (n_x, n_h, n_y)  # layer sizes

# GRADED FUNCTION: two_layer_model

def two_layer_model(X, Y, layers_dims, learning_rate = 0.0075, num_iterations = 3000, print_cost=False):
    """
    Implements a two-layer neural network: LINEAR->RELU->LINEAR->SIGMOID.

    Arguments:
    X -- input data, of shape (n_x, number of examples)
    Y -- true "label" vector (containing 1 if cat, 0 if non-cat), of shape (1, number of examples)
    layers_dims -- dimensions of the layers (n_x, n_h, n_y)
    num_iterations -- number of iterations of the optimization loop
    learning_rate -- learning rate of the gradient descent update rule
    print_cost -- If set to True, this will print the cost every 100 iterations

    Returns:
    parameters -- a dictionary containing W1, W2, b1, and b2
    """

    np.random.seed(1)
    grads = {}
    costs = []                               # to keep track of the cost
    m = X.shape[1]                           # number of examples
    (n_x, n_h, n_y) = layers_dims

    # Initialize parameters dictionary, by calling one of the functions you'd previously implemented
    ### START CODE HERE ### (≈ 1 line of code)
    parameters = initialize_parameters(n_x, n_h, n_y)
    ### END CODE HERE ###

    # Get W1, b1, W2 and b2 from the dictionary parameters.
    W1 = parameters["W1"]
    b1 = parameters["b1"]
    W2 = parameters["W2"]
    b2 = parameters["b2"]

    # Loop (gradient descent)

    for i in range(0, num_iterations):

        # Forward propagation: LINEAR -> RELU -> LINEAR -> SIGMOID. Inputs: "X, W1, b1". Output: "A1, cache1, A2, cache2".
        ### START CODE HERE ### (≈ 2 lines of code)
        A1, cache1 = linear_activation_forward(X, W1, b1, activation = "relu")
        A2, cache2 = linear_activation_forward(A1, W2, b2, activation = "sigmoid")
        ### END CODE HERE ###

        # Compute cost
        ### START CODE HERE ### (≈ 1 line of code)
        cost = compute_cost(A2, Y)
        ### END CODE HERE ###

        # Initializing backward propagation
        dA2 = - (np.divide(Y, A2) - np.divide(1 - Y, 1 - A2))

        # Backward propagation. Inputs: "dA2, cache2, cache1". Outputs: "dA1, dW2, db2; also dA0 (not used), dW1, db1".
        ### START CODE HERE ### (≈ 2 lines of code)
        dA1, dW2, db2 = linear_activation_backward(dA2, cache2, activation = "sigmoid")
        dA0, dW1, db1 = linear_activation_backward(dA1, cache1, activation = "relu")
        ### END CODE HERE ###

        # Set grads['dW1'] to dW1, grads['db1'] to db1, grads['dW2'] to dW2, grads['db2'] to db2
        grads['dW1'] = dW1
        grads['db1'] = db1
        grads['dW2'] = dW2
        grads['db2'] = db2

        # Update parameters.
        ### START CODE HERE ### (approx. 1 line of code)
        parameters = update_parameters(parameters, grads, learning_rate)
        ### END CODE HERE ###

        # Retrieve W1, b1, W2, b2 from parameters
        W1 = parameters["W1"]
        b1 = parameters["b1"]
        W2 = parameters["W2"]
        b2 = parameters["b2"]

        # Print and record the cost every 100 iterations
        if print_cost and i % 100 == 0:
            print("Cost after iteration {}: {}".format(i, np.squeeze(cost)))  # np.squeeze drops the size-1 dimensions
            costs.append(cost)

    # plot the cost

    plt.plot(np.squeeze(costs))
    plt.ylabel('cost')
    plt.xlabel('iterations (per hundreds)')
    plt.title("Learning rate =" + str(learning_rate))
    plt.show()

    return parameters
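The line that initializes backward propagation is the derivative of the loss with respect to the output activation. Assuming compute_cost implements the usual cross-entropy cost (its body is elided in the stub above, so this is an assumption), the relationship can be written as:

```latex
J = -\frac{1}{m}\sum_{i=1}^{m}\Big( y^{(i)} \log a^{[2](i)} + \big(1 - y^{(i)}\big) \log\big(1 - a^{[2](i)}\big) \Big)

\frac{\partial \mathcal{L}}{\partial a^{[2]}} = -\left( \frac{y}{a^{[2]}} - \frac{1 - y}{1 - a^{[2]}} \right)
```

The second expression is what `dA2 = - (np.divide(Y, A2) - np.divide(1 - Y, 1 - A2))` computes element-wise. The 1/m averaging is presumably applied inside linear_activation_backward when it forms dW and db.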


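Run the cell below to train your parameters. See if your model runs. The cost should be decreasing. It may take up to 5 minutes to run 2500 iterations. Check that the "Cost after iteration 0" matches the expected output; if it does not, click the square (⬛) on the upper bar of the notebook to stop the cell and try to find your error.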

parameters = two_layer_model(train_x, train_y, layers_dims = (n_x, n_h, n_y), num_iterations = 2500, print_cost=True)

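Good thing you built a vectorized implementation! Otherwise it might have taken 10 times longer to train this.

Now you can use the trained parameters to classify images from the dataset. To see your predictions on the training and test sets, run the cells below.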

predictions_train = predict(train_x, train_y, parameters)

Result: (output not shown)

predictions_test = predict(test_x, test_y, parameters)
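# Is predict() already written for us? It is not defined in this notebook, so it comes from the dnn_app_utils_v2 import.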

Result: (output not shown)
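Since predict() is imported rather than shown, here is a rough sketch of what such a function typically does with the network's output: threshold the probabilities at 0.5 and report the fraction of correct labels. The helper name `predict_from_probabilities` and the sample numbers are hypothetical; the real predict() also runs the forward pass from X and the parameters.

```python
import numpy as np

def predict_from_probabilities(probabilities, y):
    """Hypothetical helper: threshold at 0.5 and compute accuracy against y."""
    predictions = (probabilities > 0.5).astype(int)
    accuracy = float(np.mean(predictions == y))
    return predictions, accuracy

probs = np.array([[0.9, 0.2, 0.6, 0.4]])   # made-up sigmoid outputs for 4 examples
y = np.array([[1, 0, 0, 0]])               # made-up true labels
predictions, accuracy = predict_from_probabilities(probs, y)
print(predictions)  # [[1 0 1 0]]
print(accuracy)     # 0.75
```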

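Note: You may notice that running the model on fewer iterations (say 1500) gives better accuracy on the test set. This is called "early stopping" and we will talk about it in the next course. Early stopping is a way to prevent overfitting.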
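One simple way to picture early stopping is to record accuracy on held-out data at regular checkpoints and keep the checkpoint that scored best. The sketch below assumes such a list of accuracies already exists; the function name and the numbers are made up for illustration and are not results from this assignment.

```python
import numpy as np

def best_stopping_point(heldout_accuracies, checkpoint_every=100):
    """Return (iteration, accuracy) of the checkpoint with the highest held-out accuracy."""
    best = int(np.argmax(heldout_accuracies))
    return best * checkpoint_every, float(heldout_accuracies[best])

# Made-up accuracies recorded every 100 iterations, for illustration only.
heldout_accuracies = [0.34, 0.58, 0.66, 0.74, 0.72, 0.70]
iteration, accuracy = best_stopping_point(heldout_accuracies)
print(iteration, accuracy)  # 300 0.74
```

In practice the held-out data should be a validation set rather than the test set, so that the reported test accuracy stays unbiased.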

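Congratulations! It seems that your 2-layer neural network has better performance (72%) than the logistic regression implementation (70%, Assignment Week 2).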

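5 - L-layer Neural Network

Question: Use the helper functions you implemented previously to build an L-layer neural network with the following structure: [LINEAR -> RELU] * (L-1) -> LINEAR -> SIGMOID. The functions you may need and their inputs are: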

def initialize_parameters_deep(layer_dims):
    ...
    return parameters 
def L_model_forward(X, parameters):
    ...
    return AL, caches
def compute_cost(AL, Y):
    ...
    return cost
def L_model_backward(AL, Y, caches):
    ...
    return grads
def update_parameters(parameters, grads, learning_rate):
    ...
    return parameters

### CONSTANTS ###
layers_dims = [12288, 20, 7, 5, 1]  # units per layer: input (64*64*3), hidden layers of 20, 7 and 5 units, then the output layer; 4 layers, not counting the input
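A quick way to read layers_dims is to list the parameter shapes it implies. The sketch below assumes the convention used earlier for W^[1], namely that layer l has a weight matrix of shape (layers_dims[l], layers_dims[l-1]) and a bias of shape (layers_dims[l], 1); whether initialize_parameters_deep follows exactly this layout is an assumption, since its body is elided above.

```python
layers_dims = [12288, 20, 7, 5, 1]

# Shapes implied by the layer sizes: W_l is (n_l, n_{l-1}), b_l is (n_l, 1).
shapes = {}
for l in range(1, len(layers_dims)):
    shapes["W" + str(l)] = (layers_dims[l], layers_dims[l - 1])
    shapes["b" + str(l)] = (layers_dims[l], 1)

num_layers = len(layers_dims) - 1        # the input layer is not counted
num_parameters = sum(rows * cols for rows, cols in shapes.values())

print(num_layers)       # 4
print(num_parameters)   # 245973
```

Almost all of the parameters sit in W1, because the input has 12288 features.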

# GRADED FUNCTION: L_layer_model

def L_layer_model(X, Y, layers_dims, learning_rate = 0.0075, num_iterations = 3000, print_cost=False):#lr was 0.009
    """
    Implements a L-layer neural network: [LINEAR->RELU]*(L-1)->LINEAR->SIGMOID. The last layer uses a sigmoid activation.

    Arguments:
    X -- data, numpy array of shape (num_px * num_px * 3, number of examples)
    Y -- true "label" vector (containing 1 if cat, 0 if non-cat), of shape (1, number of examples)
    layers_dims -- list containing the input size and each layer size, of length (number of layers + 1).
    learning_rate -- learning rate of the gradient descent update rule
    num_iterations -- number of iterations of the optimization loop
    print_cost -- if True, it prints the cost every 100 steps

    Returns:
    parameters -- parameters learnt by the model. They can then be used to predict.
    """

    np.random.seed(1)
    costs = []                         # keep track of cost

    # Parameters initialization.
    ### START CODE HERE ###
    parameters = initialize_parameters_deep(layers_dims)
    ### END CODE HERE ###

    # Loop (gradient descent); an explicit for loop over the iterations cannot be avoided here
    for i in range(0, num_iterations):

        # Forward propagation: [LINEAR -> RELU]*(L-1) -> LINEAR -> SIGMOID.
        ### START CODE HERE ### (≈ 1 line of code)
        AL, caches = L_model_forward(X, parameters)
        ### END CODE HERE ###

        # Compute cost.
        ### START CODE HERE ### (≈ 1 line of code)
        cost = compute_cost(AL, Y)
        ### END CODE HERE ###

        # Backward propagation.
        ### START CODE HERE ### (≈ 1 line of code)
        grads = L_model_backward(AL, Y, caches)
        ### END CODE HERE ###

        # Update parameters.
        ### START CODE HERE ### (≈ 1 line of code)
        parameters = update_parameters(parameters, grads, learning_rate)
        ### END CODE HERE ###

        # Print and record the cost every 100 iterations
        if print_cost and i % 100 == 0:
            print ("Cost after iteration %i: %f" %(i, cost))
            costs.append(cost)

    # plot the cost
    plt.plot(np.squeeze(costs))
    plt.ylabel('cost')
    plt.xlabel('iterations (per hundreds)')
    plt.title("Learning rate =" + str(learning_rate))
    plt.show()

    return parameters


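You will now train the model as a 4-layer neural network (three hidden layers plus the output layer; the original notebook calls it "5-layer" because layers_dims has five entries).

Run the cell below to train your model. The cost should decrease on every iteration. It may take up to 5 minutes to run 2500 iterations. Check that the "Cost after iteration 0" matches the expected output; if it does not, click the square (⬛) on the upper bar of the notebook to stop the cell and try to find your error.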

parameters = L_layer_model(train_x, train_y, layers_dims, num_iterations = 2500, print_cost = True)

Result: (output not shown)

pred_train = predict(train_x, train_y, parameters)

Result: (output not shown)

pred_test = predict(test_x, test_y, parameters)

Result: (output not shown)

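Congrats! It seems that your 4-layer neural network has better performance (80%) than your 2-layer neural network (72%) on the same test set.

This is good performance for this task. Nice job!

In the next course, "Improving deep neural networks", you will learn how to obtain even higher accuracy by systematically searching for better hyperparameters (learning_rate, layers_dims, num_iterations, and others you will also learn about in that course).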

6 - Results Analysis

First, let's take a look at some images the L-layer model labeled incorrectly. This will show a few mislabeled images.

print_mislabeled_images(classes, test_x, test_y, pred_test)
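The first step of such an analysis is finding which examples were labeled incorrectly. As a minimal sketch with made-up label arrays (not the output of the model above), the mislabeled columns can be located like this:

```python
import numpy as np

y_true = np.array([[1, 0, 1, 1, 0]])   # made-up true labels, shape (1, m)
y_pred = np.array([[1, 1, 1, 0, 0]])   # made-up predictions, shape (1, m)

# np.where returns (row indices, column indices); the columns are the example indices.
mislabeled = np.where(y_pred != y_true)[1]
print(mislabeled)  # [1 3]
```

Each index in `mislabeled` can then be used to pull the corresponding column out of test_x and display it.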

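A few types of images the model tends to do poorly on include:

• Cat body in an unusual position
• Cat appears against a background of a similar color
• Unusual cat color and species
• Camera angle
• Brightness of the picture
• Scale variation (cat is very large or small in the image)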

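7 - Test with your own image (optional/ungraded exercise)

Congratulations on finishing this assignment. You can use your own image and see the output of your model. To do that:
1. Click on "File" in the upper bar of this notebook, then click "Open" to go to your Coursera Hub.
2. Add your image to this Jupyter notebook's directory, in the "images" folder.
3. Change your image's name in the following code.
4. Run the code and check if the algorithm is right (1 = cat, 0 = non-cat)!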

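This cell emits warnings, because scipy.ndimage.imread and scipy.misc.imresize are deprecated (both were removed in later SciPy releases):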

## START CODE HERE ##
my_image = "tree.jpg" # change this to the name of your image file 
my_label_y = [1] # the true class of your image (1 -> cat, 0 -> non-cat)
## END CODE HERE ##

fname = "images/" + my_image
image = np.array(ndimage.imread(fname, flatten=False))
my_image = scipy.misc.imresize(image, size=(num_px,num_px)).reshape((num_px*num_px*3,1))
my_image = my_image / 255.  # normalize the same way the training data was normalized
my_predicted_image = predict(my_image, my_label_y, parameters)

plt.imshow(image)
print ("y = " + str(np.squeeze(my_predicted_image)) + ", your L-layer model predicts a \"" + classes[int(np.squeeze(my_predicted_image)),].decode("utf-8") +  "\" picture.")
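On SciPy versions where ndimage.imread and misc.imresize are no longer available, the same preprocessing can be sketched with Pillow, which the notebook already imports. This is a substitute for the calls above, not the original notebook's method; the function name `image_to_input_vector` is hypothetical, and a synthetic image stands in for the file in the "images" folder.

```python
import numpy as np
from PIL import Image

def image_to_input_vector(image, num_px):
    """Resize a PIL image to (num_px, num_px) RGB and return a (num_px*num_px*3, 1) column scaled to [0, 1]."""
    resized = image.convert("RGB").resize((num_px, num_px))
    pixels = np.asarray(resized, dtype=np.float64)
    return pixels.reshape((num_px * num_px * 3, 1)) / 255.0

# Synthetic 100 x 80 red image standing in for Image.open("images/" + my_image).
demo = Image.new("RGB", (100, 80), color=(255, 0, 0))
vector = image_to_input_vector(demo, num_px=64)
print(vector.shape)  # (12288, 1)
```

The resulting column has the same shape and value range as one column of train_x, so it can be passed to predict() in place of my_image.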

Result: (output not shown)

