Machine Learning and TensorFlow (3): Machine Learning and Optimizing MNIST Dataset Classification

Reposted article, 2018-12-21 17:30. Category: Machine Learning and TensorFlow

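I. Quadratic cost function

1. Form:

Here C is the cost function, x denotes a sample, y the actual (target) value, a the output value, and n the total number of samples.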
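Written out with these symbols, the quadratic cost takes the form below. This is the standard textbook definition, assumed here to be the one intended; y(x) and a(x) denote the target and the output for sample x.

```latex
C = \frac{1}{2n} \sum_{x} \lVert y(x) - a(x) \rVert^{2}
```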

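2. Adjusting the weights with gradient descent (the derivation was given as a figure in the original post):

The result shows that the gradients of the weight w and the bias b are proportional to the gradient of the activation function: the larger the activation function's gradient, the faster w and b are adjusted and the faster training proceeds.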
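A sketch of that derivation for the simplest case. The setup is an assumption made for illustration: a single neuron with one input x, output a = σ(z) with z = wx + b, and a single training sample.

```latex
\begin{aligned}
C &= \frac{(y - a)^{2}}{2}, \qquad a = \sigma(z), \qquad z = wx + b \\
\frac{\partial C}{\partial w} &= (a - y)\,\sigma'(z)\,x \\
\frac{\partial C}{\partial b} &= (a - y)\,\sigma'(z)
\end{aligned}
```

Both gradients carry the factor σ'(z), which is where the proportionality to the activation function's gradient comes from.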

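3. How the quadratic cost function adjusts parameters when the activation function is the sigmoid

Ideal behavior: far from the target the gradient is large and the parameters change quickly; close to the target the gradient is small and the parameters change slowly.
If the target is point M and we move from A to B (points marked in the original post's figure): A is far from the target, its gradient is large and the parameters change quickly; B is close to the target, its gradient is small and the parameters change slowly. This matches the ideal behavior.
If the target is point N and we move from B to A: A is close to the target, yet its gradient is large and the parameters change quickly; B is far from the target, yet its gradient is small and the parameters change slowly. This does not match the ideal behavior.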
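The mismatch can be checked numerically. The sketch below is a minimal pure-Python illustration, assuming a single sigmoid neuron, a single sample and a target of y = 0; the function names and the chosen z values are made up for the example. It prints the error and the quadratic-cost gradient with respect to z at several points.

```python
import math


def sigmoid(z):
    """Logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-z))


def quadratic_grad_z(z, y):
    """dC/dz for C = (y - a)^2 / 2 with a = sigmoid(z)."""
    a = sigmoid(z)
    return (a - y) * a * (1.0 - a)


# Target y = 0. At z = 5 the output is close to 1 (far from the target);
# at z = 0 the output is 0.5 (closer to the target).
for z in (5.0, 2.0, 0.0):
    a = sigmoid(z)
    print(f"z={z:4.1f}  a={a:.4f}  error={a:.4f}  grad={quadratic_grad_z(z, 0.0):.4f}")
```

The point with the largest error (z = 5) has the smallest gradient, because the sigmoid is nearly flat there.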

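II. Cross-entropy cost function

1. Form:

Here C is the cost function, x denotes a sample, y the actual (target) value, a the output value, and n the total number of samples.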
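With these symbols, the cross-entropy cost for a sigmoid output is usually written as below. This is the standard textbook form, assumed here to be the one intended.

```latex
C = -\frac{1}{n} \sum_{x} \bigl[\, y \ln a + (1 - y) \ln (1 - a) \,\bigr]
```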

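2. Adjusting the weights with gradient descent (the derivation was given as a figure in the original post):

The result shows that the gradients of the weight w and the bias b do not depend on the gradient of the activation function. They are instead proportional to the error between the output value and the actual value: the larger the error, the faster w and b are adjusted and the faster training proceeds.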
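A sketch of why the activation gradient drops out, assuming a sigmoid neuron a = σ(z) with z = wx + b and using the identity σ'(z) = a(1 − a); the single-input setup is an assumption made for illustration.

```latex
\begin{aligned}
\frac{\partial C}{\partial a} &= -\frac{1}{n} \sum_{x} \left( \frac{y}{a} - \frac{1 - y}{1 - a} \right)
  = \frac{1}{n} \sum_{x} \frac{a - y}{a (1 - a)} \\
\frac{\partial C}{\partial w} &= \frac{1}{n} \sum_{x} \frac{a - y}{a (1 - a)}\, \sigma'(z)\, x
  = \frac{1}{n} \sum_{x} x \,(a - y) \\
\frac{\partial C}{\partial b} &= \frac{1}{n} \sum_{x} (a - y)
\end{aligned}
```

The factor a(1 − a) in the denominator cancels σ'(z), leaving only the error a − y.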

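3. How the cross-entropy cost function adjusts parameters when the activation function is the sigmoid

Ideal behavior: far from the target the gradient is large and the parameters change quickly; close to the target the gradient is small and the parameters change slowly.
If the target is point M and we move from A to B: A is far from the target, its error is large and the parameters change quickly; B is close to the target, its error is small and the parameters change slowly. This matches the ideal behavior.
If the target is point N and we move from B to A: A is close to the target, its error is small and the parameters change slowly; B is far from the target, its error is large and the parameters change quickly. This matches the ideal behavior.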
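To compare the two cost functions side by side, the following pure-Python sketch evaluates the gradient with respect to z for both, assuming a single sigmoid neuron, one sample and a target of y = 0. The function names and z values are hypothetical, chosen only for illustration.

```python
import math


def sigmoid(z):
    """Logistic sigmoid."""
    return 1.0 / (1.0 + math.exp(-z))


def grad_quadratic(z, y):
    """dC/dz for the quadratic cost C = (y - a)^2 / 2."""
    a = sigmoid(z)
    return (a - y) * a * (1.0 - a)


def grad_cross_entropy(z, y):
    """dC/dz for the cross-entropy cost C = -[y ln a + (1 - y) ln(1 - a)]."""
    return sigmoid(z) - y


y = 0.0
# Larger z means the output is further from the target y = 0.
for z in (6.0, 3.0, 1.0, 0.0):
    print(f"z={z:3.1f}  quadratic={grad_quadratic(z, y):.4f}  "
          f"cross-entropy={grad_cross_entropy(z, y):.4f}")
```

With cross-entropy the gradient grows with the error, while with the quadratic cost it shrinks once the sigmoid saturates.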

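Summary:

If the output neurons are linear, the quadratic cost function is the more suitable choice.
If the output neurons are S-shaped (sigmoid), the cross-entropy cost function is the more suitable choice.
If the output layer is a softmax layer, the log-likelihood cost function is the more suitable choice.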

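II. Using the cost function to optimize the MNIST recognition program

1. Choosing a cost function in TensorFlow:

If the output neurons are linear, the quadratic cost function is the more suitable choice: loss = tf.reduce_mean(tf.square())
If the output neurons are S-shaped (sigmoid), the cross-entropy cost function is the more suitable choice: loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits())
If the output layer is a softmax layer, the log-likelihood cost function is the more suitable choice: loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits())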
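The two `_with_logits` functions apply the sigmoid or softmax internally, so they expect the raw, pre-activation outputs of the last layer. The pure-Python sketch below (no TensorFlow; the logits and labels are made-up values) computes softmax followed by the log-likelihood cost, and shows what happens to the loss if softmax is applied twice.

```python
import math


def softmax(values):
    """Numerically stable softmax of a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_cross_entropy(labels, logits):
    """Cross-entropy between one-hot labels and softmax(logits)."""
    probs = softmax(logits)
    return -sum(y * math.log(p) for y, p in zip(labels, probs))


logits = [4.0, 1.0, 0.0]   # hypothetical raw outputs of the last layer
labels = [1.0, 0.0, 0.0]   # one-hot target

probs = softmax(logits)
loss_from_logits = softmax_cross_entropy(labels, logits)
loss_from_probs = softmax_cross_entropy(labels, probs)   # softmax applied twice

print("softmax once :", [round(p, 4) for p in probs])
print("softmax twice:", [round(p, 4) for p in softmax(probs)])
print("loss from logits:", round(loss_from_logits, 4))
print("loss from probs :", round(loss_from_probs, 4))
```

Applying softmax to values that are already probabilities flattens the distribution, so a confident, correct prediction still produces a noticeable loss.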

2. Optimizing the MNIST classification program through the choice of cost function

# Using the cross-entropy cost function

# Requires TensorFlow 1.x (tensorflow.examples.tutorials is not part of TensorFlow 2)
import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
# Load the dataset
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)
# Batch size (number of images per training step)
batch_size = 50
# Number of batches per epoch
n_batch = mnist.train.num_examples // batch_size
# Define two placeholders
x = tf.placeholder(tf.float32, [None, 784])
y = tf.placeholder(tf.float32, [None, 10])
# A simple network with only an input layer (784 neurons) and an output layer (10 neurons)
Weights = tf.Variable(tf.zeros([784, 10]))
Biases = tf.Variable(tf.zeros([10]))
Wx_plus_B = tf.matmul(x, Weights) + Biases
prediction = tf.nn.softmax(Wx_plus_B)
# Cross-entropy cost function.
# softmax_cross_entropy_with_logits applies softmax itself, so it takes the raw
# logits (Wx_plus_B), not `prediction`. The accuracies listed after this program
# were recorded with `prediction` passed here.
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y, logits=Wx_plus_B))
# Use gradient descent
train_step = tf.train.GradientDescentOptimizer(0.15).minimize(loss)
# Initialize the variables
init = tf.global_variables_initializer()
# Boolean tensor: argmax returns the index of the largest value along axis 1; True where label and prediction agree
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(prediction, 1))
# Accuracy: cast the booleans to floats and take the mean
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

with tf.Session() as sess:
    sess.run(init)
    # Train for 21 epochs (range(21)) over the training set
    for epoch in range(21):
        # One pass over all the training data
        for batch in range(n_batch):
            # Fetch the next batch_size images and labels from the training set
            batch_xs, batch_ys = mnist.train.next_batch(batch_size)
            # Run one training step on that batch
            sess.run(train_step, feed_dict={x: batch_xs, y: batch_ys})
        # Evaluate accuracy on the test set after each epoch
        acc = sess.run(accuracy, feed_dict={x: mnist.test.images, y: mnist.test.labels})
        print('Iter : ' + str(epoch) + ',Testing Accuracy = ' + str(acc))


# Output

Iter : 0,Testing Accuracy = 0.8323
Iter : 1,Testing Accuracy = 0.8947
Iter : 2,Testing Accuracy = 0.9032
Iter : 3,Testing Accuracy = 0.9068
Iter : 4,Testing Accuracy = 0.909
Iter : 5,Testing Accuracy = 0.9105
Iter : 6,Testing Accuracy = 0.9126
Iter : 7,Testing Accuracy = 0.9131
Iter : 8,Testing Accuracy = 0.9151
Iter : 9,Testing Accuracy = 0.9168
Iter : 10,Testing Accuracy = 0.9178
Iter : 11,Testing Accuracy = 0.9173
Iter : 12,Testing Accuracy = 0.9181
Iter : 13,Testing Accuracy = 0.9194
Iter : 14,Testing Accuracy = 0.9201
Iter : 15,Testing Accuracy = 0.9197
Iter : 16,Testing Accuracy = 0.9213
Iter : 17,Testing Accuracy = 0.9212
Iter : 18,Testing Accuracy = 0.9205
Iter : 19,Testing Accuracy = 0.9215


# Using the quadratic cost function

import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
# Load the dataset
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)
# Batch size (number of images per training step)
batch_size = 100
# Number of batches per epoch
n_batch = mnist.train.num_examples // batch_size
# Define two placeholders
x = tf.placeholder(tf.float32, [None, 784])
y = tf.placeholder(tf.float32, [None, 10])
# A simple network with only an input layer (784 neurons) and an output layer (10 neurons)
Weights = tf.Variable(tf.zeros([784, 10]))
Biases = tf.Variable(tf.zeros([10]))
Wx_plus_B = tf.matmul(x, Weights) + Biases
prediction = tf.nn.softmax(Wx_plus_B)
# Quadratic cost function
loss = tf.reduce_mean(tf.square(y - prediction))
# Use gradient descent
train_step = tf.train.GradientDescentOptimizer(0.2).minimize(loss)
# Initialize the variables
init = tf.global_variables_initializer()
# Boolean tensor: argmax returns the index of the largest value along axis 1; True where label and prediction agree
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(prediction, 1))
# Accuracy: cast the booleans to floats and take the mean
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

with tf.Session() as sess:
    sess.run(init)
    # Train for 21 epochs (range(21)) over the training set
    for epoch in range(21):
        # One pass over all the training data
        for batch in range(n_batch):
            # Fetch the next batch_size images and labels from the training set
            batch_xs, batch_ys = mnist.train.next_batch(batch_size)
            # Run one training step on that batch
            sess.run(train_step, feed_dict={x: batch_xs, y: batch_ys})
        # Evaluate accuracy on the test set after each epoch
        acc = sess.run(accuracy, feed_dict={x: mnist.test.images, y: mnist.test.labels})
        print('Iter : ' + str(epoch) + ',Testing Accuracy = ' + str(acc))


# Output

Iter : 0,Testing Accuracy = 0.8325
Iter : 1,Testing Accuracy = 0.8711
Iter : 2,Testing Accuracy = 0.8831
Iter : 3,Testing Accuracy = 0.8876
Iter : 4,Testing Accuracy = 0.8942
Iter : 5,Testing Accuracy = 0.898
Iter : 6,Testing Accuracy = 0.9002
Iter : 7,Testing Accuracy = 0.9014
Iter : 8,Testing Accuracy = 0.9036
Iter : 9,Testing Accuracy = 0.9052
Iter : 10,Testing Accuracy = 0.9065
Iter : 11,Testing Accuracy = 0.9073
Iter : 12,Testing Accuracy = 0.9084
Iter : 13,Testing Accuracy = 0.909
Iter : 14,Testing Accuracy = 0.9095
Iter : 15,Testing Accuracy = 0.9115
Iter : 16,Testing Accuracy = 0.912
Iter : 17,Testing Accuracy = 0.9126
Iter : 18,Testing Accuracy = 0.913
Iter : 19,Testing Accuracy = 0.9136
Iter : 20,Testing Accuracy = 0.914


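Conclusion (the two programs use different cost functions; the listings also differ in batch size, 50 vs 100, and in learning rate, 0.15 vs 0.2):

Epochs needed to reach 90% accuracy: the third epoch (Iter 2) with the cross-entropy cost function and the seventh epoch (Iter 6) with the quadratic cost function. On MNIST classification, the cross-entropy cost function converges faster.
Final accuracy: 92.15% with the cross-entropy cost function (Iter 19, the last line listed) and 91.4% with the quadratic cost function (Iter 20). On MNIST classification, the cross-entropy cost function reaches a higher recognition accuracy.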

III. Fitting problems

Reference article:
https://blog.csdn.net/willduan1/article/details/53070777

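1. Categories by fitting result:

Underfitting: the model fails to capture the features of the data and cannot fit the data well.
Good fit
Overfitting: the model learns the training data too thoroughly and ends up learning the features of the noise as well. As a result it cannot recognize data correctly at test time, that is, it misclassifies, and the model generalizes poorly.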

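2. Dealing with underfitting and overfitting

Common remedies for underfitting:

Add other features. Underfitting is sometimes caused by too few features, and adding more can resolve it.
Add polynomial features. This is very common in machine learning; for example, adding quadratic or cubic terms to a linear model lets it fit the data better.
Reduce the regularization parameter. Regularization exists to prevent overfitting, so when the model underfits, the regularization strength should be reduced.

Common remedies for overfitting:

Enlarge the dataset
Regularization
Dropout (put simply, during training each neuron is switched off with a certain probability)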
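As a rough illustration of the dropout idea, here is a minimal pure-Python sketch of "inverted" dropout on a list of activations. The function name, the sample values and the fixed seed are assumptions made for the example; in TensorFlow 1.x the corresponding operation is `tf.nn.dropout`, which also takes a keep probability and rescales the surviving units.

```python
import random


def dropout(activations, keep_prob, rng):
    """Inverted dropout: zero each unit with probability 1 - keep_prob
    and scale the surviving units by 1 / keep_prob."""
    if not 0.0 < keep_prob <= 1.0:
        raise ValueError("keep_prob must be in (0, 1]")
    return [a / keep_prob if rng.random() < keep_prob else 0.0
            for a in activations]


rng = random.Random(0)   # fixed seed so the sketch is repeatable
layer_output = [0.5, -1.2, 0.8, 2.0, -0.3, 1.1]

print(dropout(layer_output, keep_prob=0.5, rng=rng))   # training: some units dropped
print(dropout(layer_output, keep_prob=1.0, rng=rng))   # evaluation: nothing dropped
```

Scaling by 1 / keep_prob keeps the expected activation unchanged, so no extra rescaling is needed when dropout is turned off at test time.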

IV. Optimizing MNIST classification through initialization

# Changing the initialization method

Weights = tf.Variable(tf.truncated_normal([784, 10]))
Biases = tf.Variable(tf.zeros([10]) + 0.1)
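For intuition about what this initializer produces: `tf.truncated_normal` draws from a normal distribution (mean 0.0 and standard deviation 1.0 by default) and re-draws any value more than two standard deviations from the mean. The pure-Python sketch below mimics that behavior; the helper name and the fixed seed are hypothetical, and it is an illustration rather than TensorFlow's implementation.

```python
import random


def truncated_normal(n, mean=0.0, stddev=1.0, rng=None):
    """Draw n samples from a normal distribution, re-drawing any sample
    that falls more than two standard deviations from the mean."""
    if rng is None:
        rng = random.Random()
    samples = []
    while len(samples) < n:
        value = rng.gauss(mean, stddev)
        if abs(value - mean) <= 2.0 * stddev:
            samples.append(value)
    return samples


# One value per weight of a 784 x 10 layer, as in the snippet above.
weights = truncated_normal(784 * 10, rng=random.Random(0))
print(len(weights), min(weights), max(weights))
```

Unlike an all-zeros start, the weights begin with small, bounded random values.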

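V. Optimizing MNIST classification with optimizers

Most machine learning tasks come down to minimizing a loss; once the loss is defined, the rest of the work is handed to the optimizer.
Because deep learning mostly optimizes through gradients, optimizers are in the end various refinements of the gradient descent algorithm.

1. Types of gradient descent

Standard gradient descent: compute the aggregate error over all samples, then update the weights from that total error.
Stochastic gradient descent: pick one sample at random, compute its error, then update the weights.
Mini-batch gradient descent: a compromise between the two. Take one batch from the full set of samples, compute the total error of that batch, and update the weights from it.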
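The three variants differ only in how many samples feed each weight update. A minimal pure-Python sketch, assuming a one-parameter model y = w * x, a mean-squared-error loss and noise-free toy data generated from y = 3x (all of these are assumptions made for illustration):

```python
import random


def gradient(w, batch):
    """Gradient of the mean of 0.5 * (w * x - y)^2 over a batch."""
    return sum((w * x - y) * x for x, y in batch) / len(batch)


def train(data, batch_size, epochs=100, lr=0.5, seed=0):
    """Fit y = w * x by gradient descent, updating once per batch."""
    rng = random.Random(seed)
    w = 0.0
    for _ in range(epochs):
        samples = data[:]
        rng.shuffle(samples)
        for start in range(0, len(samples), batch_size):
            w -= lr * gradient(w, samples[start:start + batch_size])
    return w


# Toy data from y = 3x (hypothetical, for illustration only).
data = [(x / 10.0, 3.0 * x / 10.0) for x in range(1, 11)]

w_full = train(data, batch_size=len(data))   # standard: one update per epoch
w_sgd = train(data, batch_size=1)            # stochastic: one update per sample
w_mini = train(data, batch_size=5)           # mini-batch: one update per 5 samples
print(w_full, w_sgd, w_mini)
```

The MNIST programs in this post use the mini-batch form: each `sess.run(train_step, ...)` call performs one update on `batch_size` images.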

2. Common optimizers

Reference article:
https://www.leiphone.com/news/201706/e0PuNeEzaXWsMPZX.html
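As one example of such an optimizer, the update rule of Adam can be sketched in a few lines of pure Python. The version below minimizes a one-dimensional toy function; the function name, the objective and the step count are assumptions for illustration, while the hyperparameter defaults (beta1 = 0.9, beta2 = 0.999, epsilon = 1e-8) are the commonly cited ones.

```python
import math


def adam_minimize(grad_fn, w, lr=0.01, beta1=0.9, beta2=0.999,
                  eps=1e-8, steps=1000):
    """Minimize a one-dimensional function with the Adam update rule."""
    m = 0.0   # running average of the gradient (first moment)
    v = 0.0   # running average of the squared gradient (second moment)
    for t in range(1, steps + 1):
        g = grad_fn(w)
        m = beta1 * m + (1.0 - beta1) * g
        v = beta2 * v + (1.0 - beta2) * g * g
        m_hat = m / (1.0 - beta1 ** t)   # bias correction for the zero start
        v_hat = v / (1.0 - beta2 ** t)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w


# Toy objective f(w) = (w - 2)^2, whose gradient is 2 * (w - 2).
w_final = adam_minimize(lambda w: 2.0 * (w - 2.0), w=0.0)
print(w_final)
```

Each step is scaled by the ratio of the two running averages, so the effective step size adapts to the recent gradient history rather than depending on the learning rate alone.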

3. Optimizing MNIST classification with an optimizer

# Using the Adam optimizer

import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
# Load the dataset
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)
# Batch size (number of images per training step)
batch_size = 50
# Number of batches per epoch
n_batch = mnist.train.num_examples // batch_size
# Define two placeholders
x = tf.placeholder(tf.float32, [None, 784])
y = tf.placeholder(tf.float32, [None, 10])
# A simple network with only an input layer (784 neurons) and an output layer (10 neurons)
Weights = tf.Variable(tf.zeros([784, 10]))
Biases = tf.Variable(tf.zeros([10]))
Wx_plus_B = tf.matmul(x, Weights) + Biases
prediction = tf.nn.softmax(Wx_plus_B)
# Cross-entropy cost function.
# softmax_cross_entropy_with_logits applies softmax itself, so it takes the raw
# logits (Wx_plus_B), not `prediction`. The accuracies listed after this program
# were recorded with `prediction` passed here.
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y, logits=Wx_plus_B))
# Use the Adam optimizer
train_step = tf.train.AdamOptimizer(1e-2).minimize(loss)
# Initialize the variables
init = tf.global_variables_initializer()
# Boolean tensor: argmax returns the index of the largest value along axis 1; True where label and prediction agree
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(prediction, 1))
# Accuracy: cast the booleans to floats and take the mean
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

with tf.Session() as sess:
    sess.run(init)
    # Train for 21 epochs (range(21)) over the training set
    for epoch in range(21):
        # One pass over all the training data
        for batch in range(n_batch):
            # Fetch the next batch_size images and labels from the training set
            batch_xs, batch_ys = mnist.train.next_batch(batch_size)
            # Run one training step on that batch
            sess.run(train_step, feed_dict={x: batch_xs, y: batch_ys})
        # Evaluate accuracy on the test set after each epoch
        acc = sess.run(accuracy, feed_dict={x: mnist.test.images, y: mnist.test.labels})
        print('Iter : ' + str(epoch) + ',Testing Accuracy = ' + str(acc))


# Output

Iter : 1,Testing Accuracy = 0.9224
Iter : 2,Testing Accuracy = 0.9293
Iter : 3,Testing Accuracy = 0.9195
Iter : 4,Testing Accuracy = 0.9282
Iter : 5,Testing Accuracy = 0.926
Iter : 6,Testing Accuracy = 0.9291
Iter : 7,Testing Accuracy = 0.9288
Iter : 8,Testing Accuracy = 0.9274
Iter : 9,Testing Accuracy = 0.9277
Iter : 10,Testing Accuracy = 0.9249
Iter : 11,Testing Accuracy = 0.9313
Iter : 12,Testing Accuracy = 0.9301
Iter : 13,Testing Accuracy = 0.9315
Iter : 14,Testing Accuracy = 0.9295
Iter : 15,Testing Accuracy = 0.9299
Iter : 16,Testing Accuracy = 0.9303
Iter : 17,Testing Accuracy = 0.93
Iter : 18,Testing Accuracy = 0.9304
Iter : 19,Testing Accuracy = 0.9269
Iter : 20,Testing Accuracy = 0.9273


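Note: how each optimizer's parameters are set is the key. In machine learning, parameter tuning should rest on technique plus experience rather than blind adjustment. This is something I still need to study and build up over time.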

VI. Using what was covered today, optimize MNIST classification to reach an accuracy above 95%

# Optimized program

import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
# Load the dataset
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)
# Batch size (number of images per training step)
batch_size = 50
# Number of batches per epoch
n_batch = mnist.train.num_examples // batch_size
# Define two placeholders
x = tf.placeholder(tf.float32, [None, 784])
y = tf.placeholder(tf.float32, [None, 10])
# Network: 784 inputs -> 200 tanh neurons -> 50 tanh neurons -> 10 softmax outputs
Weights1 = tf.Variable(tf.truncated_normal([784, 200]))
Biases1 = tf.Variable(tf.zeros([200]) + 0.1)
Wx_plus_B_L1 = tf.matmul(x, Weights1) + Biases1
L1 = tf.nn.tanh(Wx_plus_B_L1)

Weights2 = tf.Variable(tf.truncated_normal([200, 50]))
Biases2 = tf.Variable(tf.zeros([50]) + 0.1)
Wx_plus_B_L2 = tf.matmul(L1, Weights2) + Biases2
L2 = tf.nn.tanh(Wx_plus_B_L2)

Weights3 = tf.Variable(tf.truncated_normal([50, 10]))
Biases3 = tf.Variable(tf.zeros([10]) + 0.1)
Wx_plus_B_L3 = tf.matmul(L2, Weights3) + Biases3
prediction = tf.nn.softmax(Wx_plus_B_L3)

# Cross-entropy cost function.
# softmax_cross_entropy_with_logits applies softmax itself, so it takes the raw
# logits (Wx_plus_B_L3), not `prediction`. The accuracies listed after this program
# were recorded with `prediction` passed here.
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y, logits=Wx_plus_B_L3))
# Use the Adam optimizer
train_step = tf.train.AdamOptimizer(2e-3).minimize(loss)
# Initialize the variables
init = tf.global_variables_initializer()
# Boolean tensor: True where label and prediction agree
correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(prediction, 1))
# Accuracy: cast the booleans to floats and take the mean
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

with tf.Session() as sess:
    sess.run(init)
    # Train for 51 epochs (range(51)) over the training set
    for epoch in range(51):
        # One pass over all the training data
        for batch in range(n_batch):
            # Fetch the next batch_size images and labels from the training set
            batch_xs, batch_ys = mnist.train.next_batch(batch_size)
            # Run one training step on that batch
            sess.run(train_step, feed_dict={x: batch_xs, y: batch_ys})
        # Evaluate accuracy on the test set after each epoch
        test_acc = sess.run(accuracy, feed_dict={x: mnist.test.images, y: mnist.test.labels})
        print('Iter : ' + str(epoch) + ',Testing Accuracy = ' + str(test_acc))


# Output

Iter : 0,Testing Accuracy = 0.6914
Iter : 1,Testing Accuracy = 0.7236
Iter : 2,Testing Accuracy = 0.8269
Iter : 3,Testing Accuracy = 0.8885
Iter : 4,Testing Accuracy = 0.9073
Iter : 5,Testing Accuracy = 0.9147
Iter : 6,Testing Accuracy = 0.9125
Iter : 7,Testing Accuracy = 0.922
Iter : 8,Testing Accuracy = 0.9287
Iter : 9,Testing Accuracy = 0.9248
Iter : 10,Testing Accuracy = 0.9263
Iter : 11,Testing Accuracy = 0.9328
Iter : 12,Testing Accuracy = 0.9316
Iter : 13,Testing Accuracy = 0.9387
Iter : 14,Testing Accuracy = 0.9374
Iter : 15,Testing Accuracy = 0.9433
Iter : 16,Testing Accuracy = 0.9419
Iter : 17,Testing Accuracy = 0.9379
Iter : 18,Testing Accuracy = 0.9379
Iter : 19,Testing Accuracy = 0.9462
Iter : 20,Testing Accuracy = 0.9437
Iter : 21,Testing Accuracy = 0.9466
Iter : 22,Testing Accuracy = 0.9479
Iter : 23,Testing Accuracy = 0.9498
Iter : 24,Testing Accuracy = 0.9481
Iter : 25,Testing Accuracy = 0.9489
Iter : 26,Testing Accuracy = 0.9496
Iter : 27,Testing Accuracy = 0.95
Iter : 28,Testing Accuracy = 0.9508
Iter : 29,Testing Accuracy = 0.9533
Iter : 30,Testing Accuracy = 0.9509
Iter : 31,Testing Accuracy = 0.9516
Iter : 32,Testing Accuracy = 0.9541
Iter : 33,Testing Accuracy = 0.9513
Iter : 34,Testing Accuracy = 0.951
Iter : 35,Testing Accuracy = 0.9556
Iter : 36,Testing Accuracy = 0.9527
Iter : 37,Testing Accuracy = 0.9521
Iter : 38,Testing Accuracy = 0.9546
Iter : 39,Testing Accuracy = 0.9544
Iter : 40,Testing Accuracy = 0.9555
Iter : 41,Testing Accuracy = 0.9546
Iter : 42,Testing Accuracy = 0.9553
Iter : 43,Testing Accuracy = 0.9534
Iter : 44,Testing Accuracy = 0.9576
Iter : 45,Testing Accuracy = 0.9535
Iter : 46,Testing Accuracy = 0.9569
Iter : 47,Testing Accuracy = 0.9556
Iter : 48,Testing Accuracy = 0.9568
Iter : 49,Testing Accuracy = 0.956
Iter : 50,Testing Accuracy = 0.9557


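# Afterword

Ahhh!

I had planned to get reasonably far with Python before starting on machine learning and these frameworks.

Then an assignment from my teacher caught me off guard.

I was handed a paper and asked to build a model from it.

So I just had to grit my teeth and go for it.

I did not feel like using a formula editor.

The derivations are handwritten, so please ignore the ugly handwriting.

Keep it up, young Guo!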
