0-Background
As the "Hello World" of deep learning, this is a project everyone should work through at least once.
Code: GitHub. The blog and the code will be updated continuously as the exercises progress.
This is my first blog post, so the writing may be unclear in places; comments and suggestions are very welcome. Thanks!
0.1 Background Knowledge
- Linear regression
- CNN
  - LeNet-5
  - AlexNet
  - ResNet
  - VGG
- Various regularization methods
0.2 Catalog
1-Prepare
- NumPy: an open-source numerical computing library
- matplotlib: a 2D plotting library for Python
- TensorFlow: an open-source machine learning framework
- Keras: a high-level neural network API running on top of the TensorFlow, Theano, or CNTK backends
2-MNIST
MNIST is a subset of NIST, consisting of digits handwritten by 250 different people. It contains 60,000 training samples and 10,000 test samples.
Loading MNIST
import numpy as np
import os
import struct
import matplotlib.pyplot as plt


class load:
    def __init__(self, path='mnist'):
        self.path = path

    def load_mnist(self):
        """Read the train/test images and labels (IDX format) from path."""
        train_image_path = 'train-images.idx3-ubyte'
        train_label_path = 'train-labels.idx1-ubyte'
        test_image_path = 't10k-images.idx3-ubyte'
        test_label_path = 't10k-labels.idx1-ubyte'
        with open(os.path.join(self.path, train_label_path), 'rb') as labelpath:
            # IDX label header: magic number and item count, big-endian uint32.
            magic, n = struct.unpack('>II', labelpath.read(8))
            labels = np.fromfile(labelpath, dtype=np.uint8)
            train_labels = labels.reshape(len(labels), 1)
        with open(os.path.join(self.path, train_image_path), 'rb') as imgpath:
            # IDX image header: magic number, image count, rows, cols.
            magic, num, rows, cols = struct.unpack('>IIII', imgpath.read(16))
            # Flatten each 28x28 image into a 784-dimensional row vector.
            train_images = np.fromfile(imgpath, dtype=np.uint8).reshape(len(train_labels), 784)
        with open(os.path.join(self.path, test_label_path), 'rb') as labelpath:
            magic, n = struct.unpack('>II', labelpath.read(8))
            labels = np.fromfile(labelpath, dtype=np.uint8)
            test_labels = labels.reshape(len(labels), 1)
        with open(os.path.join(self.path, test_image_path), 'rb') as imgpath:
            magic, num, rows, cols = struct.unpack('>IIII', imgpath.read(16))
            test_images = np.fromfile(imgpath, dtype=np.uint8).reshape(len(test_labels), 784)
        return train_images, train_labels, test_images, test_labels
if __name__ == '__main__':
    train_images, train_labels, test_images, test_labels = load().load_mnist()
    print('train_images shape:%s' % str(train_images.shape))
    print('train_labels shape:%s' % str(train_labels.shape))
    print('test_images shape:%s' % str(test_images.shape))
    print('test_labels shape:%s' % str(test_labels.shape))
    # Sample four training images and two test images at random.
    np.random.seed(1024)
    trainImage = np.random.randint(60000, size=4)
    testImage = np.random.randint(10000, size=2)
    samples = [(train_images[i], train_labels[i]) for i in trainImage]
    samples += [(test_images[i], test_labels[i]) for i in testImage]
    # Plot the six digits in a 2x3 grid, titled with their labels.
    plt.figure(num='mnist', figsize=(2, 3))
    for n, (img, label) in enumerate(samples, start=1):
        plt.subplot(2, 3, n)
        plt.title(label)
        plt.imshow(img.reshape(28, 28))
    plt.show()
Running it prints the shapes of the four arrays and displays the six sampled digits with their labels.
3-LinearRegression
We train a recognizer on the MNIST dataset using a linear-regression-style approach.
The network has two layers; the hidden layer has four neurons, and we try Tanh and ReLU as its activation function.
Since MNIST is a multi-class problem, the output layer uses Softmax as its activation, with cross entropy as the loss function.
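For concreteness, here is a minimal NumPy sketch of softmax and cross entropy; the helpers softmax and cross_entropy are my own illustration, not part of the post's code:
import numpy as np

def softmax(Z):
    # Subtract the per-column max for numerical stability.
    e = np.exp(Z - Z.max(axis=0, keepdims=True))
    return e / e.sum(axis=0, keepdims=True)

def cross_entropy(A, Y):
    # A: predicted probabilities (n_y, m); Y: one-hot labels (n_y, m).
    m = Y.shape[1]
    return -np.sum(Y * np.log(A + 1e-12)) / m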
3.1 NumPy Implementation
3.1.1 Getting the Layer Sizes from the Training Data and Labels
Code:
def layer_size(X, Y):
    """
    Get the input and output layer sizes and set the hidden layer size.
    :param X: input dataset of shape (m, 784)
    :param Y: input labels of shape (m, 1)
    :return:
        n_x -- the size of the input layer
        n_h -- the size of the hidden layer
        n_y -- the size of the output layer
    """
    n_x = X.T.shape[0]
    n_h = 4
    n_y = Y.T.shape[0]
    return n_x, n_h, n_y
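A quick sanity check with MNIST-sized dummy arrays (my own example, following the docstring's shapes):
X = np.zeros((60000, 784))  # m flattened 28x28 images
Y = np.zeros((60000, 1))    # one label per image
print(layer_size(X, Y))     # (784, 4, 1)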
3.1.2 Initializing the Parameters
Initialize W1, b1, W2, and b2:
- The weights W are initialized to small non-zero random values, so that the hidden units do not all compute the same function (symmetry breaking).
- The biases b are initialized to zero.
Code:
def initialize_parameters(n_x, n_h, n_y):
    """
    Initialize parameters
    :param n_x: the size of the input layer
    :param n_h: the size of the hidden layer
    :param n_y: the size of the output layer
    :return: dictionary of parameters
    """
    # Small random weights break symmetry; biases start at zero.
    W1 = np.random.randn(n_h, n_x) * 0.01
    b1 = np.zeros((n_h, 1))
    W2 = np.random.randn(n_y, n_h) * 0.01
    b2 = np.zeros((n_y, 1))
    parameters = {"W1": W1,
                  "b1": b1,
                  "W2": W2,
                  "b2": b2}
    return parameters
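A quick shape check (my own illustration):
params = initialize_parameters(784, 4, 1)
print(params["W1"].shape, params["b1"].shape)  # (4, 784) (4, 1)
print(params["W2"].shape, params["b2"].shape)  # (1, 4) (1, 1)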
3.1.3 Forward Propagation
ReLU is implemented as \((|Z| + Z) / 2\), which equals \(\max(0, Z)\) element-wise.
def ReLu(Z):
    return (abs(Z) + Z) / 2
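As a sanity check, this agrees with NumPy's built-in element-wise maximum (my own snippet):
Z = np.array([[-2.0, -0.5, 0.0, 0.5, 2.0]])
assert np.allclose(ReLu(Z), np.maximum(Z, 0))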
def forward_propagation(X, parameters, activation="tanh"):
    """
    Compute the forward propagation
    :param X: input data (m, n_x)
    :param parameters: parameters from initialize_parameters
    :param activation: activation function name, either "tanh" or "relu"
    :return:
        A2: sigmoid output of the second layer
        cache: cached forward results
    """
    X = X.T
    W1 = parameters["W1"]
    b1 = parameters["b1"]
    W2 = parameters["W2"]
    b2 = parameters["b2"]
    # Hidden layer: linear transform followed by the chosen activation.
    Z1 = np.dot(W1, X) + b1
    if activation == "tanh":
        A1 = np.tanh(Z1)
    elif activation == "relu":
        A1 = ReLu(Z1)
    else:
        raise Exception('Activation function is not found!')
    # Output layer: linear transform followed by a sigmoid.
    Z2 = np.dot(W2, A1) + b2
    A2 = 1 / (1 + np.exp(-Z2))
    cache = {"Z1": Z1,
             "A1": A1,
             "Z2": Z2,
             "A2": A2}
    return A2, cache
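Tying the pieces together, a small smoke test on random data (my own sketch; the shapes follow the docstrings above):
np.random.seed(1024)
X = np.random.rand(5, 784)  # five fake flattened images
params = initialize_parameters(*layer_size(X, np.zeros((5, 1))))
A2, cache = forward_propagation(X, params, activation="relu")
print(A2.shape)  # (1, 5): one sigmoid output per example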