Andrew Ng Deep Learning Notes: Course 4, Week 1 Quiz

1. Question 1

What do you think applying this filter to a grayscale image will do?

$\begin{bmatrix} 0 & 1 & -1 & 0 \\ 1 & 3 & -3 & -1 \\ 1 & 3 & -3 & -1 \\ 0 & 1 & -1 & 0 \end{bmatrix}$

Detect vertical edges √ (the weights on the left are positive, those on the right are negative)

Detect 45 degree edges

Detect image contrast

Detect horizontal edges

2. Question 2

Suppose your input is a 300 by 300 color (RGB) image, and you are not using a convolutional network. If the first hidden layer has 100 neurons, each one fully connected to the input, how many parameters does this hidden layer have (including the bias parameters)?

9,000,001

9,000,100

27,000,001

27,000,100 √

3. Question 3

Suppose your input is a 300 by 300 color (RGB) image, and you use a convolutional layer with 100 filters that are each 5x5. How many parameters does this hidden layer have (including the bias parameters)?

2501

2600

7500

7600 √

4. Question 4

You have an input volume that is 63x63x16, and convolve it with 32 filters that are each 7x7, using a stride of 2 and no padding. What is the output volume?

29x29x16

16x16x16

16x16x32

29x29x32 √

5. Question 5

You have an input volume that is 15x15x8, and pad it using “pad=2.” What is the dimension of the resulting volume (after padding)?

17x17x8

19x19x8 √

19x19x12

17x17x10

6. Question 6

You have an input volume that is 63x63x16, and convolve it with 32 filters that are each 7x7, and stride of 1. You want to use a “same” convolution. What is the padding?

1

2

3 √

7

7. Question 7

You have an input volume that is 32x32x16, and apply max pooling with a stride of 2 and a filter size of 2. What is the output volume?

32x32x8

16x16x8

15x15x16

16x16x16 √

8. Question 8

Because pooling layers do not have parameters, they do not affect the backpropagation (derivatives) calculation.

True

False √

9. Question 9

In lecture we talked about “parameter sharing” as a benefit of using convolutional networks. Which of the following statements about parameter sharing in ConvNets are true? (Check all that apply.)

It allows gradient descent to set many of the parameters to zero, thus making the connections sparse.

It reduces the total number of parameters, thus reducing overfitting. √

It allows parameters learned for one task to be shared even for a different task (transfer learning).

It allows a feature detector to be used in multiple locations throughout the whole input image/input volume. √

10. Question 10

In lecture we talked about “sparsity of connections” as a benefit of using convolutional layers. What does this mean?

Regularization causes gradient descent to set many of the parameters to zero.

Each activation in the next layer depends on only a small number of activations from the previous layer. √

Each filter is connected to every channel in the previous layer.

Each layer in a convolutional network is connected only to two other layers.

---------------------------------------------------------------------Annotated version (translated from Chinese)---------------------------------------------------------------------

Week 1 Quiz - The Basics of Convolutional Neural Networks

1. Question 1

What do you think applying the filter below to a grayscale image will do?
$\begin{bmatrix} 0 & 1 & -1 & 0 \\ 1 & 3 & -3 & -1 \\ 1 & 3 & -3 & -1 \\ 0 & 1 & -1 & 0 \end{bmatrix}$

【】Detect 45 degree edges
【★】Detect vertical edges
【】Detect horizontal edges
【】Detect image contrast

Because the left part is positive, and the right part is negative. (Blogger's note: bright on the left, dark on the right.)
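The claim can be checked numerically. The following is a minimal sketch in plain Python: the two 6x6 test images (a bright-to-dark vertical edge and a bright-to-dark horizontal edge) and the helper `cross_correlate` are illustrative assumptions, not part of the quiz.

```python
# The 4x4 filter from the question.
KERNEL = [
    [0, 1, -1, 0],
    [1, 3, -3, -1],
    [1, 3, -3, -1],
    [0, 1, -1, 0],
]


def cross_correlate(image, kernel):
    """Valid cross-correlation with stride 1 (the "convolution" used in ConvNets)."""
    f = len(kernel)
    out_h = len(image) - f + 1
    out_w = len(image[0]) - f + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(f) for b in range(f))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Hypothetical test images: bright (10) on one side, dark (0) on the other.
vertical_edge = [[10, 10, 10, 0, 0, 0] for _ in range(6)]
horizontal_edge = [[10] * 6 for _ in range(3)] + [[0] * 6 for _ in range(3)]

vertical_response = cross_correlate(vertical_edge, KERNEL)
horizontal_response = cross_correlate(horizontal_edge, KERNEL)

print(vertical_response)    # every row is [20, 100, 20]
print(horizontal_response)  # all zeros
```

The response peaks where the window straddles the vertical edge, while the horizontal edge produces nothing because every row of the filter sums to zero.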

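2. Question 2

Suppose your input is a 300×300 color (RGB) image, and you are not using a convolutional network. If the first hidden layer has 100 neurons, each one fully connected to the input, how many parameters does this hidden layer have (including the bias parameters)?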

【】 9,000,001
【】 9,000,100
【】 27,000,001
【★】 27,000,100

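Blogger's note: first count the weights. $W^{[1]}$ has shape $[l^{[1]}, X] = [100, 300 \times 300 \times 3]$, so it holds $100 \times 300 \times 300 \times 3 = 27{,}000{,}000$ weights. Then add one bias per neuron, 100 in total, which gives 27,000,100 parameters.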
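The same count as a short sketch. The function name `dense_layer_params` is an illustrative assumption; the arithmetic is just weights plus one bias per neuron.

```python
def dense_layer_params(n_inputs, n_neurons):
    """Parameters of a fully connected layer: a weight matrix plus one bias per neuron."""
    weights = n_inputs * n_neurons
    biases = n_neurons
    return weights + biases


n_inputs = 300 * 300 * 3  # every pixel of every RGB channel is an input
dense_total = dense_layer_params(n_inputs, n_neurons=100)
print(dense_total)  # 27000100
```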

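3. Question 3

Suppose your input is a 300×300 color (RGB) image, and you use a convolutional layer with 100 filters that are each 5×5. How many parameters does this hidden layer have (including the bias parameters)?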

【】 2501
【】 2600
【】 7500
【★】 7600

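Blogger's note: see the video [1.7 One Layer of a Convolutional Network] at 05:10. First, the number of parameters does not depend on the spatial size of the input image: no matter how many pixels the image has, the parameter count stays the same. In this question it depends only on the filters. Let us work it out: a single filter slice is $5 \times 5$, and the filter spans the 3 input channels, so each filter has $5 \times 5 \times 3 = 75$ weights plus 1 bias, i.e. 76 parameters. With 100 filters the layer has $76 \times 100 = 7600$ parameters.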
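A small sketch of this count. The helper name `conv_layer_params` is an assumption made for illustration; note that the 300×300 image size never appears in it.

```python
def conv_layer_params(filter_size, in_channels, n_filters):
    """Parameters of a conv layer: each filter has f*f*channels weights plus one bias."""
    weights_per_filter = filter_size * filter_size * in_channels
    return (weights_per_filter + 1) * n_filters


conv_total = conv_layer_params(filter_size=5, in_channels=3, n_filters=100)
print(conv_total)  # 7600
```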

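4. Question 4

You have an input volume that is 63x63x16, and convolve it with 32 filters that are each 7x7, using a stride of 2 and no padding. What is the output volume?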

【★】 29x29x32
【】 16x16x32
【】 29x29x16
【】 16x16x16

n = 63, f = 7, s = 2, p = 0, 32 filters.

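Blogger's note: first look at the output-size formula: $\lfloor \frac{n_{h} + 2p - f}{s} + 1 \rfloor \times \lfloor \frac{n_{w} + 2p - f}{s} + 1 \rfloor$. With $n_{h} = n_{w} = 63$, $f = 7$, $s = 2$ and $p = 0$, each side is $\lfloor \frac{63 + 0 - 7}{2} + 1 \rfloor = 29$. The number of filters sets the output depth, so the output is 29x29x32.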
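The output-size formula can be sketched as a one-line function. The name `conv_output_size` is an illustrative assumption; integer floor division plays the role of the floor brackets.

```python
def conv_output_size(n, f, s=1, p=0):
    """Output side length: floor((n + 2p - f) / s) + 1."""
    return (n + 2 * p - f) // s + 1


side = conv_output_size(n=63, f=7, s=2, p=0)
output_volume = (side, side, 32)  # depth equals the number of filters
print(output_volume)  # (29, 29, 32)
```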

5. Question 5

You have an input volume that is 15x15x8, and pad it using "pad = 2". What is the dimension of the resulting volume (after padding)?

【】 17x17x10
【★】 19x19x8
【】 19x19x12
【】 17x17x8
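Padding adds `pad` rows or columns on each side of the height and width and leaves the channel count alone. A minimal sketch, with the helper name `padded_shape` assumed for illustration:

```python
def padded_shape(height, width, channels, pad):
    """Shape after zero-padding: pad is added on both sides of height and width only."""
    return (height + 2 * pad, width + 2 * pad, channels)


padded = padded_shape(15, 15, 8, pad=2)
print(padded)  # (19, 19, 8)
```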

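6. Question 6

You have an input volume that is 63x63x16, and convolve it with 32 filters that are each 7x7, with a stride of 1. You want to use a "same" convolution. What is the padding?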

【】 1
【】 2
【★】 3
【】 7

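Blogger's note: a "same" convolution keeps the height and width unchanged, so the 63x63x16 input still has spatial size 63x63 after the convolution (the depth becomes 32, one channel per filter). This requires padding the input. The output-size formula (assuming the input has equal width and height) is $\lfloor \frac{n + 2p - f}{s} + 1 \rfloor$. Setting it equal to $n$ with $s = 1$ gives $n + 2p - f + 1 = n$, so $p = \frac{f - 1}{2} = \frac{7 - 1}{2} = 3$.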
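A short sketch of the "same" padding rule for stride 1. The helper name `same_padding` is an assumption, and it only handles odd filter sizes, where the padding is symmetric.

```python
def same_padding(f):
    """Padding that preserves height and width for stride 1: p = (f - 1) / 2."""
    if f % 2 == 0:
        raise ValueError("symmetric 'same' padding needs an odd filter size")
    return (f - 1) // 2


p = same_padding(7)
out_side = (63 + 2 * p - 7) // 1 + 1  # output-size formula with s = 1
print(p, out_side)  # 3 63
```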

7. Question 7

You have an input volume that is 32x32x16, and apply max pooling with a stride of 2 and a filter size of 2. What is the output volume?

【】 15x15x16
【】 16x16x8
【★】 16x16x16
【】 32x32x8
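Pooling acts on each channel independently, so the depth stays at 16 while each side becomes $\lfloor \frac{32 - 2}{2} \rfloor + 1 = 16$. A minimal sketch of max pooling on one channel, where the 32x32 test channel and the function name `max_pool` are illustrative assumptions:

```python
def max_pool(channel, f=2, s=2):
    """Max pooling over one 2D channel with window size f and stride s."""
    out_h = (len(channel) - f) // s + 1
    out_w = (len(channel[0]) - f) // s + 1
    return [
        [
            max(channel[i * s + a][j * s + b]
                for a in range(f) for b in range(f))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Hypothetical 32x32 channel whose entries count up row by row.
channel = [[row * 32 + col for col in range(32)] for row in range(32)]
pooled = max_pool(channel)
print(len(pooled), len(pooled[0]))  # 16 16
```

Applying the same function to each of the 16 channels gives the 16x16x16 output.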

8. Question 8

Because pooling layers do not have parameters, they do not affect the backpropagation calculation.

【】True
【★】False

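Blogger's note: treat convolutional layer -> pooling layer as one layer. In forward propagation, the pooling layer keeps the maximum/average of each region of the convolutional layer's output and passes it to the next layer. In backpropagation, the gradient comes back from the next layer. "Does not affect the backpropagation calculation" would mean that no gradient flows back from the pooling layer to the convolutional layer, i.e. the gradient is 0. If the gradient were 0, then in an update such as $W^{[l]} = W^{[l]} - \alpha \times dW^{[l]}$ the convolutional weights would never change, so the layer could not learn. In fact the pooling layer does pass gradients back: max pooling routes each gradient to the position that held the maximum, and average pooling spreads it evenly over the window.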
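To make the point concrete, here is a minimal sketch of the backward pass of 2x2 max pooling with stride 2 on a single channel. The input values, the upstream gradients and the function name `max_pool_2x2_backward` are all illustrative assumptions.

```python
def max_pool_2x2_backward(x, upstream):
    """Route each upstream gradient to the max position of its 2x2 window."""
    h, w = len(x), len(x[0])
    dx = [[0.0] * w for _ in range(h)]
    for i in range(0, h, 2):
        for j in range(0, w, 2):
            # (value, row, col) for the four entries of this window
            window = [(x[i + a][j + b], i + a, j + b)
                      for a in range(2) for b in range(2)]
            _, max_row, max_col = max(window)
            dx[max_row][max_col] += upstream[i // 2][j // 2]
    return dx


x = [
    [1.0, 3.0, 2.0, 0.0],
    [4.0, 2.0, 1.0, 5.0],
    [0.0, 1.0, 7.0, 2.0],
    [6.0, 2.0, 3.0, 1.0],
]
upstream = [[0.1, 0.2], [0.3, 0.4]]  # gradient w.r.t. the pooled output
dx = max_pool_2x2_backward(x, upstream)
for row in dx:
    print(row)
```

Only the four positions that won the max receive a gradient. The pooling layer has no parameters of its own, yet it decides which activations of the previous layer get updated.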

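9. Question 9

In lecture we talked about "parameter sharing" as a benefit of using convolutional networks. Which of the following statements about parameter sharing are true? (Check all that apply.)

【★】It reduces the total number of parameters, thus reducing overfitting.
【★】It allows a feature detector to be used in multiple locations throughout the whole input.
【】It allows parameters learned for one task to be shared even for a different task (transfer learning).
【】It allows gradient descent to set many of the parameters to zero, thus making the connections sparse.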

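10. Question 10

In lecture we talked about "sparsity of connections" as a benefit of using convolutional layers. What does this mean?

【】Regularization causes gradient descent to set many of the parameters to zero.
【】Each filter is connected to every channel in the previous layer.
【★】Each activation in the next layer depends on only a small number of activations from the previous layer.
【】Each layer in a convolutional network is connected only to two other layers.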
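Sparsity of connections can be illustrated with the numbers from Question 3 (a 300x300x3 input and 5x5 filters, reused here as an assumption). The sketch counts how many input values a single output activation depends on in each kind of layer; both function names are hypothetical.

```python
def inputs_per_conv_activation(filter_size, in_channels):
    """A conv activation sees only the f x f window across all input channels."""
    return filter_size * filter_size * in_channels


def inputs_per_dense_activation(height, width, channels):
    """A fully connected activation sees every input value."""
    return height * width * channels


conv_inputs = inputs_per_conv_activation(5, 3)
dense_inputs = inputs_per_dense_activation(300, 300, 3)
print(conv_inputs, dense_inputs)  # 75 270000
```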