This experiment uses the mnist.npz dataset. It can be imported online, but my downloads kept being interrupted by network problems, so I import it offline instead. The offline package has been uploaded to GitHub for easy download:
https://github.com/guangfuhao/Deeplearning/blob/master/mnist.npz (mnist.npz download)
The full code is below:
#1.Import the necessary libraries needed
import numpy as np
import tensorflow as tf
import matplotlib
from matplotlib import pyplot as plt
########################################################################
#2.Set default parameters for plots
matplotlib.rcParams['font.size'] = 20
matplotlib.rcParams['figure.titlesize'] = 20
matplotlib.rcParams['figure.figsize'] = [9, 7]
matplotlib.rcParams['font.family'] = ['STKaiTi']
matplotlib.rcParams['axes.unicode_minus'] = False
########################################################################
#3.Initialize Parameters
#Initialize learning rate
lr = 1e-3
#Initialize loss array
losses = []
#Initialize the weights layers and the bias layers
w1 = tf.Variable(tf.random.truncated_normal([784, 256], stddev=0.1))
b1 = tf.Variable(tf.zeros([256]))
w2 = tf.Variable(tf.random.truncated_normal([256, 128], stddev=0.1))
b2 = tf.Variable(tf.zeros([128]))
w3 = tf.Variable(tf.random.truncated_normal([128, 10], stddev=0.1))
b3 = tf.Variable(tf.zeros([10]))
########################################################################
#4.Import the MNIST dataset by numpy offline
def load_mnist():
    #define the directory where mnist.npz is (please watch the '\'!)
    path = r'F:\learning\machineLearning\forward_progression\mnist.npz'
    f = np.load(path)
    x_train, y_train = f['x_train'], f['y_train']
    x_test, y_test = f['x_test'], f['y_test']
    f.close()
    return (x_train, y_train), (x_test, y_test)

(train_image, train_label), _ = load_mnist()
x = tf.convert_to_tensor(train_image, dtype=tf.float32) / 255.
y = tf.convert_to_tensor(train_label, dtype=tf.int32)
#Reshape x from [60k, 28, 28] to [60k, 28*28]
x = tf.reshape(x, [-1, 28*28])
########################################################################
#5.Combine x and y as a tuple and batch them
train_db = tf.data.Dataset.from_tensor_slices((x, y)).batch(128)
'''
#Encapsulate train_db as an iterator object
train_iter = iter(train_db)
sample = next(train_iter)
'''
########################################################################
#6.Iterate over the dataset 20 times
for epoch in range(20):
    #For every batch: x: [128, 28*28], y: [128]
    for step, (x, y) in enumerate(train_db):
        with tf.GradientTape() as tape:  # tf.Variable
            # x: [b, 28*28]
            # h1 = x@w1 + b1
            # [b, 784]@[784, 256] + [256] => [b, 256] + [256] => [b, 256] + [b, 256]
            h1 = x @ w1 + tf.broadcast_to(b1, [x.shape[0], 256])
            h1 = tf.nn.relu(h1)
            # [b, 256] => [b, 128]
            h2 = h1 @ w2 + b2
            h2 = tf.nn.relu(h2)
            # [b, 128] => [b, 10]
            out = h2 @ w3 + b3

            # y: [b] => [b, 10]
            y_onehot = tf.one_hot(y, depth=10)

            # compute loss
            # mse = mean(sum(y-out)^2)
            # [b, 10]
            loss = tf.square(y_onehot - out)
            # mean: scalar
            loss = tf.reduce_mean(loss)

        # compute gradients
        grads = tape.gradient(loss, [w1, b1, w2, b2, w3, b3])
        #Update the weights and the bias
        w1.assign_sub(lr * grads[0])
        b1.assign_sub(lr * grads[1])
        w2.assign_sub(lr * grads[2])
        b2.assign_sub(lr * grads[3])
        w3.assign_sub(lr * grads[4])
        b3.assign_sub(lr * grads[5])

        if step % 100 == 0:
            print(epoch, step, 'loss:', float(loss))

    losses.append(float(loss))
########################################################################
#7.Show the change of losses via matplotlib
plt.figure()
plt.plot(losses, color='C0', marker='s', label='訓練')
plt.xlabel('Epoch')
plt.legend()
plt.ylabel('MSE')
#Save figure as '.svg' file
#plt.savefig('forward.svg')
plt.show()
There is not much to say about Part 1: it imports the numpy, tensorflow, matplotlib, and pyplot libraries.
import numpy as np
import tensorflow as tf
import matplotlib
from matplotlib import pyplot as plt
Part 2 sets some default parameters for matplotlib plots.
pyplot uses an rc configuration file to customize the default properties of figures, known as the rc configuration or rc parameters. Through rc parameters you can change the defaults, including window size, dots per inch, line width, color, style, axes, tick and grid properties, text, fonts, and so on.
font.size is the font size, figure.titlesize is the title size, figure.figsize is the displayed figure size, font.family is set to STKaiTi so that Chinese characters render correctly, and axes.unicode_minus makes the minus sign display properly.
matplotlib.rcParams['font.size'] = 20
matplotlib.rcParams['figure.titlesize'] = 20
matplotlib.rcParams['figure.figsize'] = [9, 7]
matplotlib.rcParams['font.family'] = ['STKaiTi']
matplotlib.rcParams['axes.unicode_minus'] = False
Part 3 initializes some parameters. lr is the learning rate (when I changed lr to 1e-2 the final losses became smaller, but I do not yet know how this value affects the network's final performance); it controls how far the parameters move in each gradient-descent step. losses stores the loss at the end of each epoch. The three weight layers are initialized with a truncated normal distribution (in tf.random.truncated_normal, if a sampled value falls outside the interval (μ-2σ, μ+2σ) it is re-sampled, which guarantees that all generated values stay near the mean), and the bias layers are initialized with zeros. A small sanity check of this truncation claim follows the code below.
#Initialize learning rate
lr = 1e-3
#Initialize loss array
losses = []
#Initialize the weights layers and the bias layers
w1 = tf.Variable(tf.random.truncated_normal([784, 256], stddev=0.1))
b1 = tf.Variable(tf.zeros([256]))
w2 = tf.Variable(tf.random.truncated_normal([256, 128], stddev=0.1))
b2 = tf.Variable(tf.zeros([128]))
w3 = tf.Variable(tf.random.truncated_normal([128, 10], stddev=0.1))
b3 = tf.Variable(tf.zeros([10]))
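A quick way to convince yourself of the truncation claim above (a small sketch, not part of the original code) is to sample a large tensor with stddev 0.1 and check its extremes; everything stays inside (-0.2, 0.2), i.e. within two standard deviations of the zero mean:

import tensorflow as tf

# sample 1,000,000 values with mean 0 and stddev 0.1
t = tf.random.truncated_normal([1000, 1000], stddev=0.1)
print(float(tf.reduce_min(t)), float(tf.reduce_max(t)))  # both within (-0.2, 0.2)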
Part 4 imports the MNIST dataset and preprocesses the shape of x. Here path is the location of the mnist.npz file you downloaded locally; note that the path uses backslashes!
def load_mnist():
    #define the directory where mnist.npz is (please watch the '\'!)
    path = r'F:\learning\machineLearning\forward_progression\mnist.npz'
    f = np.load(path)
    x_train, y_train = f['x_train'], f['y_train']
    x_test, y_test = f['x_test'], f['y_test']
    f.close()
    return (x_train, y_train), (x_test, y_test)

(train_image, train_label), _ = load_mnist()
x = tf.convert_to_tensor(train_image, dtype=tf.float32) / 255.
y = tf.convert_to_tensor(train_label, dtype=tf.int32)
#Reshape x from [60k, 28, 28] to [60k, 28*28]
x = tf.reshape(x, [-1, 28*28])
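As an aside, the online import mentioned at the beginning would typically go through Keras' built-in loader, which downloads mnist.npz on first use (exactly the step that kept failing for me); a minimal sketch:

import tensorflow as tf

# online alternative to load_mnist(); needs a working network connection the first time
(train_image, train_label), (test_image, test_label) = tf.keras.datasets.mnist.load_data()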
Part 5 splits the dataset into batches of 128 samples each (why the batch size should be 128 is an open question; I tried 200 and 100 and did not notice any difference). For what Batch and Epoch mean, skip down to the end of this post.
train_db = tf.data.Dataset.from_tensor_slices((x,y)).batch(128)
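The commented-out iterator lines in the full listing show how to peek at a single batch; a small sketch of that check, assuming train_db has been built as above:

# take one batch and look at its shapes
train_iter = iter(train_db)
sample_x, sample_y = next(train_iter)
print(sample_x.shape, sample_y.shape)  # (128, 784) (128,)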
Part 6 iterates over the dataset for 20 epochs and computes the loss with MSE. The MSE loss is written out just below, followed by a note on tf.GradientTape:
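Written out, the loss the code computes for a batch of b samples is the mean squared error between the one-hot label and the 10-dimensional network output, averaged over all b × 10 entries:

MSE = (1 / (b * 10)) * Σ_i Σ_j (y_onehot[i, j] - out[i, j])²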
tf.GradientTape (gradient tape)
__init__(persistent=False, watch_accessed_variables=True)
Purpose: creates a new GradientTape.
Parameters:
persistent: a boolean specifying whether the newly created gradient tape is persistent. The default is False, which means gradient() can be called only once on the tape.
watch_accessed_variables: a boolean indicating whether the tape automatically watches (tracks) any trainable variables it accesses. The default is True; if set to False, you have to manually specify the variables you want to watch.
The entire forward computation below must be wrapped in the with tf.GradientTape() as tape context, so that the computation-graph information is recorded during the forward pass and the backward differentiation can be carried out conveniently. assign_sub() subtracts the given value in place (in-place), implementing the parameter's self-update.
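As a standalone illustration of these two APIs (a minimal sketch with made-up values w and x, not part of the experiment's code):

import tensorflow as tf

# a single trainable variable and a constant input, just for illustration
w = tf.Variable(2.0)
x = tf.constant(3.0)

with tf.GradientTape() as tape:
    # the forward computation is recorded on the tape
    loss = tf.square(w * x - 1.0)

# dloss/dw = 2 * (w*x - 1) * x = 2 * 5 * 3 = 30
grad = tape.gradient(loss, w)

# in-place update: w <- w - lr * grad
w.assign_sub(0.1 * grad)
print(float(grad), float(w))  # 30.0, -1.0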
for epoch in range(20):
    #For every batch: x: [128, 28*28], y: [128]
    for step, (x, y) in enumerate(train_db):
        with tf.GradientTape() as tape:  # tf.Variable
            # x: [b, 28*28]
            # h1 = x@w1 + b1
            # [b, 784]@[784, 256] + [256] => [b, 256] + [256] => [b, 256] + [b, 256]
            h1 = x @ w1 + tf.broadcast_to(b1, [x.shape[0], 256])
            h1 = tf.nn.relu(h1)
            # [b, 256] => [b, 128]
            h2 = h1 @ w2 + b2
            h2 = tf.nn.relu(h2)
            # [b, 128] => [b, 10]
            out = h2 @ w3 + b3

            # y: [b] => [b, 10]
            y_onehot = tf.one_hot(y, depth=10)

            # compute loss
            # mse = mean(sum(y-out)^2)
            # [b, 10]
            loss = tf.square(y_onehot - out)
            # mean: scalar
            loss = tf.reduce_mean(loss)

        # compute gradients
        grads = tape.gradient(loss, [w1, b1, w2, b2, w3, b3])
        #Update the weights and the bias
        w1.assign_sub(lr * grads[0])
        b1.assign_sub(lr * grads[1])
        w2.assign_sub(lr * grads[2])
        b2.assign_sub(lr * grads[3])
        w3.assign_sub(lr * grads[4])
        b3.assign_sub(lr * grads[5])

        if step % 100 == 0:
            print(epoch, step, 'loss:', float(loss))

    losses.append(float(loss))
Part 7 plots how the losses change as training progresses.
plt.figure()
plt.plot(losses, color='C0', marker='s', label='訓練')
plt.xlabel('Epoch')
plt.legend()
plt.ylabel('MSE')
#Save figure as '.svg' file
#plt.savefig('forward.svg')
plt.show()
The figure below shows the final loss curve:
A plain-language explanation of Batch and Epoch (adapted from https://blog.csdn.net/weixin_42137700/article/details/84302045):
Suppose you have a dataset with 200 samples (rows of data), and you choose a batch size of 5 and 1,000 epochs.
This means the dataset will be divided into 40 batches of 5 samples each, and the model weights are updated after every batch of 5 samples.
It also means that one epoch involves 40 batches, i.e. 40 model updates.
With 1,000 epochs, the model is exposed to (passes over) the entire dataset 1,000 times, for a total of 40,000 batches over the whole training process.
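Applying the same arithmetic to this experiment (60,000 training images, a batch size of 128, and 20 epochs), a quick sketch of the numbers:

samples, batch_size, epochs = 60000, 128, 20

batches_per_epoch = -(-samples // batch_size)   # ceiling division: 469 batches per epoch
total_updates = batches_per_epoch * epochs      # 9,380 weight updates in total
print(batches_per_epoch, total_updates)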