Notes on "TensorFlow 2 Deep Learning" (Part 2): Building and Testing a Simple Neural Network by Hand (with a download link for mnist.npz)


This experiment uses the mnist.npz dataset. It can be loaded online, but my downloads kept getting interrupted by network problems, so I load it offline instead. The offline file has been uploaded to GitHub for convenient download:

https://github.com/guangfuhao/Deeplearning/blob/master/mnist.npz (mnist.npz download)
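
For reference, the online route mentioned above is a single call to the Keras helper, which downloads the same mnist.npz (assuming network access, which is exactly what kept failing for me):

import tensorflow as tf

#Downloads mnist.npz to ~/.keras/datasets on first use, then loads it from the local cache
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()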

Here is the full code:

#1.Import the necessary libraries
import numpy as np
import tensorflow as tf
import matplotlib
from matplotlib import pyplot as plt

########################################################################

#2.Set default parameters for plots
matplotlib.rcParams['font.size'] = 20
matplotlib.rcParams['figure.titlesize'] = 20
matplotlib.rcParams['figure.figsize'] = [9, 7]
matplotlib.rcParams['font.family'] = ['STKaiTi']
matplotlib.rcParams['axes.unicode_minus']=False

########################################################################
#3.Initialize Parameters

#Initialize learning rate
lr = 1e-3
#Initialize loss array
losses = []
#Initialize the weights layers and the bias layers
w1=tf.Variable(tf.random.truncated_normal([784,256],stddev=0.1))
b1=tf.Variable(tf.zeros([256]))
w2=tf.Variable(tf.random.truncated_normal([256,128],stddev=0.1))
b2=tf.Variable(tf.zeros([128]))
w3=tf.Variable(tf.random.truncated_normal([128,10],stddev=0.1))
b3=tf.Variable(tf.zeros([10]))

########################################################################

#4.Import the mnist dataset offline with numpy
def load_mnist():
    #define the path where mnist.npz is stored (note the backslashes and the raw-string prefix!)
    path = r'F:\learning\machineLearning\forward_progression\mnist.npz'
    f = np.load(path)
    x_train, y_train = f['x_train'],f['y_train']
    x_test, y_test = f['x_test'],f['y_test']
    f.close()
    return (x_train, y_train), (x_test, y_test)
(train_image,train_label),_ = load_mnist()
x = tf.convert_to_tensor(train_image, dtype=tf.float32) / 255.
y = tf.convert_to_tensor(train_label, dtype=tf.int32)
#Reshape x from [60k, 28, 28] to [60k, 28*28]
x=tf.reshape(x,[-1,28*28])

########################################################################

#5.Combine x and y as a tuple and batch them
train_db = tf.data.Dataset.from_tensor_slices((x,y)).batch(128)
'''
#Encapsulate train_db as an iterator object
train_iter = iter(train_db)
sample = next(train_iter)
'''

########################################################################

#6.Iterate database for 20 times
for epoch in range(20):
    #For every batch:x:[128, 28*28],y: [128]
    for step, (x, y) in enumerate(train_db):
        with tf.GradientTape() as tape: # tf.Variable
            # x: [b, 28*28]
            # h1 = x@w1 + b1
            # [b, 784]@[784, 256] + [256] => [b, 256] + [256] => [b, 256] + [b, 256]
            h1 = x@w1 + tf.broadcast_to(b1, [x.shape[0], 256])
            h1 = tf.nn.relu(h1)
            # [b, 256] => [b, 128]
            h2 = h1@w2 + b2
            h2 = tf.nn.relu(h2)
            # [b, 128] => [b, 10]
            out = h2@w3 + b3

            # y: [b] => [b, 10]
            y_onehot = tf.one_hot(y, depth=10)

            # compute loss
            # mse = mean((y_onehot - out)^2)
            # [b, 10]
            loss = tf.square(y_onehot - out)
            # mean: scalar
            loss = tf.reduce_mean(loss)

        # compute gradients
        grads = tape.gradient(loss, [w1, b1, w2, b2, w3, b3])
        #Update the weights and the bias
        w1.assign_sub(lr * grads[0])
        b1.assign_sub(lr * grads[1])
        w2.assign_sub(lr * grads[2])
        b2.assign_sub(lr * grads[3])
        w3.assign_sub(lr * grads[4])
        b3.assign_sub(lr * grads[5])

        if step % 100 == 0:
            print(epoch, step, 'loss:', float(loss))

    losses.append(float(loss))

########################################################################

#7.Show the change of losses via matplotlib
plt.figure()
plt.plot(losses, color='C0', marker='s', label='訓練')
plt.xlabel('Epoch')
plt.legend()
plt.ylabel('MSE')
#Save figure as '.svg' file
#plt.savefig('forward.svg')
plt.show()

There is not much to say about the first part: it imports numpy, tensorflow, matplotlib, and pyplot.

import numpy as np
import tensorflow as tf
import matplotlib
from matplotlib import pyplot as plt

The second part sets some default matplotlib plotting parameters.

pyplot uses rc configuration (the rc parameters) to customize the default properties of figures. Through the rc parameters you can change defaults such as figure size, dots per inch, line width, color, style, axes, tick and grid properties, text, and fonts.

font.size is the font size, figure.titlesize is the title size, figure.figsize is the displayed figure size, font.family is set to STKaiTi so that Chinese characters render correctly, and axes.unicode_minus = False makes the minus sign display properly.

matplotlib.rcParams['font.size'] = 20
matplotlib.rcParams['figure.titlesize'] = 20
matplotlib.rcParams['figure.figsize'] = [9, 7]
matplotlib.rcParams['font.family'] = ['STKaiTi']
matplotlib.rcParams['axes.unicode_minus']=False

The third part initializes some parameters. lr is the learning rate, which controls how far the parameters move in each gradient-descent step (when I changed lr to 1e-2 the final losses became smaller, but I do not yet know how this value affects the network's final performance). losses stores the loss at the end of each epoch. The three weight layers are initialized with a truncated normal distribution (in tf.random.truncated_normal, any value falling outside the interval (μ-2σ, μ+2σ) is redrawn, which keeps all generated values close to the mean), and the bias layers are initialized to zeros.

#Initialize learning rate
lr = 1e-3
#Initialize loss array
losses = []
#Initialize the weights layers and the bias layers
w1=tf.Variable(tf.random.truncated_normal([784,256],stddev=0.1))
b1=tf.Variable(tf.zeros([256]))
w2=tf.Variable(tf.random.truncated_normal([256,128],stddev=0.1))
b2=tf.Variable(tf.zeros([128]))
w3=tf.Variable(tf.random.truncated_normal([128,10],stddev=0.1))
b3=tf.Variable(tf.zeros([10]))
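
As a minimal check of the truncation behaviour described above (my own quick test, not part of the original script): with stddev=0.1, every sampled value should fall inside (-0.2, 0.2).

import tensorflow as tf

#Draw a large sample and confirm nothing lies beyond two standard deviations from the mean
samples = tf.random.truncated_normal([10000], mean=0.0, stddev=0.1)
print(float(tf.reduce_min(samples)), float(tf.reduce_max(samples)))  #both within (-0.2, 0.2)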

The fourth part loads the MNIST dataset and reshapes x. path is the location of your local copy of mnist.npz; note the backslashes and the raw-string prefix r!

def load_mnist():
    #define the path where mnist.npz is stored (note the backslashes and the raw-string prefix!)
    path = r'F:\learning\machineLearning\forward_progression\mnist.npz'
    f = np.load(path)
    x_train, y_train = f['x_train'],f['y_train']
    x_test, y_test = f['x_test'],f['y_test']
    f.close()
    return (x_train, y_train), (x_test, y_test)
(train_image,train_label),_ = load_mnist()
x = tf.convert_to_tensor(train_image, dtype=tf.float32) / 255.
y = tf.convert_to_tensor(train_label, dtype=tf.int32)
#Reshape x from [60k, 28, 28] to [60k, 28*28]
x=tf.reshape(x,[-1,28*28])
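
For orientation, here is a quick look at what load_mnist() returns (a check I added; it assumes the variables defined above and the standard MNIST array shapes):

#train_image: (60000, 28, 28) uint8 images, train_label: (60000,) uint8 labels
print(train_image.shape, train_image.dtype)
print(train_label.shape, train_label.dtype)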

The fifth part splits the dataset into batches of 128 samples each (why the batch size is exactly 128 is an open question for me; I tried 200 and 100 and saw no noticeable difference). For what Batch and Epoch mean, jump to the end of this post.

train_db = tf.data.Dataset.from_tensor_slices((x,y)).batch(128)
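
A quick way to verify the batch shapes (this mirrors the commented-out iterator in the full code above and assumes the train_db defined on the previous line):

#Take one batch and confirm x is [128, 784] and y is [128]
sample_x, sample_y = next(iter(train_db))
print(sample_x.shape, sample_y.shape)  #(128, 784) (128,)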

The sixth part iterates over the dataset for 20 epochs and computes the loss with MSE. First, an explanation of tf.GradientTape:

tf.GradientTape (gradient tape)

__init__(persistent=False, watch_accessed_variables=True)
Purpose: creates a new GradientTape.
Parameters:

persistent: a boolean specifying whether the newly created gradient tape is persistent. The default is False, which means gradient() can be called only once.

watch_accessed_variables: a boolean indicating whether the gradient tape automatically watches any trainable variables it accesses. The default is True. If set to False, you have to manually specify which variables you want to watch.

The forward computation below must be wrapped in the with tf.GradientTape() as tape context so that the graph information is recorded during the forward pass, which makes the backward (gradient) computation possible. assign_sub() subtracts the given value in place (in-place), implementing the parameters' self-update.
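
Below is a small illustration of the persistent flag described above; it is a toy example of my own, not part of the original script:

import tensorflow as tf

a = tf.Variable(3.0)
with tf.GradientTape(persistent=True) as tape:
    y1 = a * a    #y1 = a^2
    y2 = y1 * y1  #y2 = a^4
#With persistent=True the tape can be queried more than once
print(float(tape.gradient(y1, a)))  #dy1/da = 2a   = 6.0
print(float(tape.gradient(y2, a)))  #dy2/da = 4a^3 = 108.0
del tape  #release the tape's resources once a persistent tape is no longer needed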

for epoch in range(20):
    #For every batch:x:[128, 28*28],y: [128]
    for step, (x, y) in enumerate(train_db):
        with tf.GradientTape() as tape: # tf.Variable
            # x: [b, 28*28]
            # h1 = x@w1 + b1
            # [b, 784]@[784, 256] + [256] => [b, 256] + [256] => [b, 256] + [b, 256]
            h1 = x@w1 + tf.broadcast_to(b1, [x.shape[0], 256])
            h1 = tf.nn.relu(h1)
            # [b, 256] => [b, 128]
            h2 = h1@w2 + b2
            h2 = tf.nn.relu(h2)
            # [b, 128] => [b, 10]
            out = h2@w3 + b3

            # y: [b] => [b, 10]
            y_onehot = tf.one_hot(y, depth=10)

            # compute loss
            # mse = mean((y_onehot - out)^2)
            # [b, 10]
            loss = tf.square(y_onehot - out)
            # mean: scalar
            loss = tf.reduce_mean(loss)

        # compute gradients
        grads = tape.gradient(loss, [w1, b1, w2, b2, w3, b3])
        #Update the weights and the bias
        w1.assign_sub(lr * grads[0])
        b1.assign_sub(lr * grads[1])
        w2.assign_sub(lr * grads[2])
        b2.assign_sub(lr * grads[3])
        w3.assign_sub(lr * grads[4])
        b3.assign_sub(lr * grads[5])

        if step % 100 == 0:
            print(epoch, step, 'loss:', float(loss))

    losses.append(float(loss))
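
To make the "y: [b] => [b, 10]" step concrete, this is what tf.one_hot does to a couple of labels (toy values of my own):

import tensorflow as tf

labels = tf.constant([3, 0], dtype=tf.int32)
print(tf.one_hot(labels, depth=10).numpy())
#[[0. 0. 0. 1. 0. 0. 0. 0. 0. 0.]
# [1. 0. 0. 0. 0. 0. 0. 0. 0. 0.]]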

The seventh part plots how the loss changes as training progresses.

plt.figure()
plt.plot(losses, color='C0', marker='s', label='訓練')
plt.xlabel('Epoch')
plt.legend()
plt.ylabel('MSE')
#Save figure as '.svg' file
#plt.savefig('forward.svg')
plt.show()

Below is the final loss curve (MSE versus epoch):

An easy-to-understand explanation of Batch and Epoch (adapted from https://blog.csdn.net/weixin_42137700/article/details/84302045):

Suppose you have a dataset of 200 samples (rows of data) and you choose a batch size of 5 and 1,000 epochs.

This means the dataset will be divided into 40 batches of 5 samples each, and the model weights will be updated after every batch of five samples.

It also means that one epoch involves 40 batches, i.e. 40 model updates.

With 1,000 epochs, the model will pass over the entire dataset 1,000 times, for a total of 40,000 batches over the whole training run.
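
The same arithmetic written out as a tiny script (using the numbers from the example above):

samples, batch_size, epochs = 200, 5, 1000
batches_per_epoch = samples // batch_size   #40 batches, i.e. 40 weight updates, per epoch
total_batches = batches_per_epoch * epochs  #40,000 batches over the whole training run
print(batches_per_epoch, total_batches)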

