Video source
https://www.bilibili.com/video/av40787141?from=search&seid=17003307842787199553
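Notes
Dropout is used to reduce overfitting, narrowing the gap between training accuracy and test accuracy.
When the training-set and test-set results differ by a wide margin, the model is overfitting.
With dropout, the accuracy reported in each epoch may not be high, yet the final evaluation comes out noticeably better. This is because only some of the neurons are active during training, whereas all neurons are active in the final evaluation.
Regularization likewise helps reduce overfitting.
Softmax is generally used in the last layer of a neural network.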
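The difference between training and evaluation noted above can be sketched in plain Python. This is only an illustration, assuming the common "inverted dropout" formulation (kept units are scaled by 1 / (1 - rate) during training); the function name `dropout` and the toy `hidden` vector are made up for this example and are not from the video.

```python
import random

def dropout(activations, rate, training, rng):
    """Inverted dropout on a list of activations (illustrative sketch).

    Assumption: kept units are scaled by 1 / (1 - rate) during training,
    so no rescaling is needed at evaluation time.
    """
    if not training or rate == 0.0:
        # Evaluation: every neuron participates, values pass through unchanged.
        return list(activations)
    keep_prob = 1.0 - rate
    out = []
    for a in activations:
        if rng.random() < keep_prob:
            out.append(a / keep_prob)  # kept unit, scaled up
        else:
            out.append(0.0)            # dropped unit contributes nothing
    return out

rng = random.Random(0)   # fixed seed so the run is repeatable
hidden = [1.0] * 10      # hypothetical hidden-layer activations

train_out = dropout(hidden, rate=0.4, training=True, rng=rng)
eval_out = dropout(hidden, rate=0.4, training=False, rng=rng)

print("training:  ", train_out)
print("evaluation:", eval_out)
```

During training some entries are zeroed, so the per-epoch accuracy is measured on a handicapped network; at evaluation time the full network is used.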
import numpy as np
from keras.datasets import mnist  # downloads the MNIST dataset from the network
from keras.utils import np_utils
from keras.models import Sequential  # sequential model
from keras.layers import Dense, Dropout  # Dropout is imported here
from keras.optimizers import SGD
C:\Program Files (x86)\Microsoft Visual Studio\Shared\Anaconda3_64\lib\site-packages\h5py\__init__.py:36: FutureWarning: Conversion of the second argument of issubdtype from `float` to `np.floating` is deprecated. In future, it will be treated as `np.float64 == np.dtype(float).type`.
  from ._conv import register_converters as _register_converters
Using TensorFlow backend.
# Load the data
(x_train, y_train), (x_test, y_test) = mnist.load_data()
# Check the shapes
# (60000, 28, 28)
print('x_shape:', x_train.shape)
# (60000,)
print('y_shape:', y_train.shape)
# (60000, 28, 28) -> (60000, 784)
# 60000 rows; -1 lets the number of columns be inferred
# Dividing by 255 normalizes the pixel values
x_train = x_train.reshape(x_train.shape[0], -1) / 255.0
x_test = x_test.reshape(x_test.shape[0], -1) / 255.0
# Convert the labels to one-hot form (10 classes)
y_train = np_utils.to_categorical(y_train, num_classes=10)
y_test = np_utils.to_categorical(y_test, num_classes=10)

# Build the model: 784 input neurons, 10 output neurons
# Biases are initialized to zeros (which is also the default)
model = Sequential([
    Dense(units=200, input_dim=784, bias_initializer='zeros', activation='tanh'),  # tanh activation
    #Dropout(0.4),  # 40% of the neurons are dropped
    Dense(units=100, bias_initializer='zeros', activation='tanh'),  # tanh activation
    #Dropout(0.4),  # 40% of the neurons are dropped
    Dense(units=10, bias_initializer='zeros', activation='softmax')
])
# Layers can also be added this way:
# model.add(Dense(...))
# model.add(Dense(...))

# Define the optimizer with a learning rate of 0.2
sgd = SGD(lr=0.2)
# Set the optimizer and loss function, and report accuracy during training
model.compile(
    optimizer=sgd,  # SGD optimizer
    loss='categorical_crossentropy',  # cross-entropy loss; trains faster
    metrics=['accuracy'],  # report accuracy
)

# Train (unlike before, this is a new way of training)
# 60,000 images, 32 per batch, 10 epochs (one epoch = one pass over all 60,000 images)
model.fit(x_train, y_train, batch_size=32, epochs=10)

# Evaluate the model on the test set, then on the training set
loss, accuracy = model.evaluate(x_test, y_test)
print('\ntest loss', loss)
print('\ntest accuracy', accuracy)
loss, accuracy = model.evaluate(x_train, y_train)
print('\ntrain loss', loss)
print('\ntrain accuracy', accuracy)
x_shape: (60000, 28, 28)
y_shape: (60000,)
Epoch 1/10
60000/60000 [==============================] - 6s 100us/step - loss: 0.2539 - acc: 0.9235
Epoch 2/10
60000/60000 [==============================] - 6s 95us/step - loss: 0.1175 - acc: 0.9639
Epoch 3/10
60000/60000 [==============================] - 5s 90us/step - loss: 0.0815 - acc: 0.9745
Epoch 4/10
60000/60000 [==============================] - 5s 90us/step - loss: 0.0601 - acc: 0.9809
Epoch 5/10
60000/60000 [==============================] - 6s 92us/step - loss: 0.0451 - acc: 0.9860
Epoch 6/10
60000/60000 [==============================] - 5s 91us/step - loss: 0.0336 - acc: 0.9899
Epoch 7/10
60000/60000 [==============================] - 5s 92us/step - loss: 0.0248 - acc: 0.9926
Epoch 8/10
60000/60000 [==============================] - 6s 93us/step - loss: 0.0185 - acc: 0.9948
Epoch 9/10
60000/60000 [==============================] - 6s 93us/step - loss: 0.0128 - acc: 0.9970
Epoch 10/10
60000/60000 [==============================] - 6s 93us/step - loss: 0.0082 - acc: 0.9988
10000/10000 [==============================] - 0s 39us/step
test loss 0.07058678171953651
test accuracy 0.9786
60000/60000 [==============================] - 2s 34us/step
train loss 0.0052643890143993
train accuracy 0.9995
With dropout
(uncomment the #Dropout(0.4), lines)
model = Sequential([
    Dense(units=200, input_dim=784, bias_initializer='zeros', activation='tanh'),  # tanh activation
    Dropout(0.4),  # 40% of the neurons are dropped
    Dense(units=100, bias_initializer='zeros', activation='tanh'),  # tanh activation
    Dropout(0.4),  # 40% of the neurons are dropped
    Dense(units=10, bias_initializer='zeros', activation='softmax')  # softmax output layer
])
x_shape: (60000, 28, 28)
y_shape: (60000,)
Epoch 1/10
60000/60000 [==============================] - 11s 184us/step - loss: 0.4158 - acc: 0.8753
Epoch 2/10
60000/60000 [==============================] - 10s 166us/step - loss: 0.2799 - acc: 0.9177
Epoch 3/10
60000/60000 [==============================] - 11s 177us/step - loss: 0.2377 - acc: 0.9302
Epoch 4/10
60000/60000 [==============================] - 10s 164us/step - loss: 0.2169 - acc: 0.9356
Epoch 5/10
60000/60000 [==============================] - 10s 170us/step - loss: 0.1979 - acc: 0.9413
Epoch 6/10
60000/60000 [==============================] - 11s 183us/step - loss: 0.1873 - acc: 0.9439
Epoch 7/10
60000/60000 [==============================] - 11s 180us/step - loss: 0.1771 - acc: 0.9472
Epoch 8/10
60000/60000 [==============================] - 12s 204us/step - loss: 0.1676 - acc: 0.9501
Epoch 9/10
60000/60000 [==============================] - 11s 187us/step - loss: 0.1608 - acc: 0.9527
Epoch 10/10
60000/60000 [==============================] - 10s 170us/step - loss: 0.1534 - acc: 0.9542
10000/10000 [==============================] - 1s 68us/step
test loss 0.09667835112037138
test accuracy 0.9692
60000/60000 [==============================] - 4s 70us/step
train loss 0.07203661710163578
train accuracy 0.9774666666666667
PS: this example does not show the benefit of dropout very well, but it does serve as an example of how to use dropout.
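The claim that dropout narrows the train/test gap can be checked against the two runs reported above. The sketch below only does arithmetic on those printed accuracies; the names `runs` and `accuracy_gap` are illustrative and not part of the original notebook.

```python
# Accuracies copied from the two runs reported above.
runs = {
    "without dropout": {"train": 0.9995, "test": 0.9786},
    "with dropout": {"train": 0.9774666666666667, "test": 0.9692},
}

def accuracy_gap(result):
    """Train accuracy minus test accuracy; a larger gap suggests more overfitting."""
    return result["train"] - result["test"]

gaps = {name: accuracy_gap(result) for name, result in runs.items()}
for name, gap in gaps.items():
    print(f"{name}: train/test gap = {gap:.4f}")
```

The gap shrinks from roughly 0.021 to roughly 0.008, but test accuracy itself falls from 0.9786 to 0.9692, which is why this run is not a convincing showcase for dropout.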
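Regularization
kernel_regularizer: weight regularization
bias_regularizer: bias regularization
activity_regularizer: activity (output) regularization
Activity regularization is applied to the layer's activation, i.e. the value obtained from the input signal multiplied by the weights plus the bias.
In practice, the weight regularizer is the one set most often.
If the model is complex relative to the data, dropout and regularization can help counter some of the overfitting.
If the model is simple relative to the data, dropout and regularization may make the training results worse.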
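To make the three regularizer targets concrete, here is a small plain-Python sketch of an L2 penalty (factor times the sum of squares) applied to the weights, the biases, and the layer output. The tiny layer and all of its numbers are hypothetical, and the sketch is a simplification: it looks at a single sample and ignores any per-batch averaging a framework may apply to the activity term.

```python
import math

def l2_penalty(values, factor):
    """L2 penalty: factor * sum of squared values."""
    return factor * sum(v * v for v in values)

# Hypothetical tiny layer: 2 inputs, 2 tanh units (all numbers are made up).
weights = [[0.5, -1.0], [2.0, 0.0]]  # weights[i][j]: input i -> unit j
biases = [0.1, -0.2]
inputs = [1.0, 3.0]

# Layer output: tanh(sum_i inputs[i] * weights[i][j] + biases[j])
outputs = [
    math.tanh(sum(inputs[i] * weights[i][j] for i in range(2)) + biases[j])
    for j in range(2)
]

factor = 0.0003  # same factor as l2(0.0003) in the notes
kernel_term = l2_penalty([w for row in weights for w in row], factor)
bias_term = l2_penalty(biases, factor)
activity_term = l2_penalty(outputs, factor)

data_loss = 0.05  # hypothetical cross-entropy value, for illustration only
total_loss = data_loss + kernel_term  # only the weight penalty is used in the notes

print("kernel penalty:  ", kernel_term)
print("bias penalty:    ", bias_term)
print("activity penalty:", activity_term)
print("total loss:      ", total_loss)
```

Because the penalty is added to the loss being optimized, a run with weight regularization can report a higher loss than an unregularized run even when its accuracy is similar.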
import numpy as np
from keras.datasets import mnist  # downloads the MNIST dataset from the network
from keras.utils import np_utils
from keras.models import Sequential  # sequential model
from keras.layers import Dense
from keras.optimizers import SGD
from keras.regularizers import l2  # import the l2 regularizer (lowercase L)
# Load the data
(x_train, y_train), (x_test, y_test) = mnist.load_data()
# Check the shapes
# (60000, 28, 28)
print('x_shape:', x_train.shape)
# (60000,)
print('y_shape:', y_train.shape)
# (60000, 28, 28) -> (60000, 784)
# 60000 rows; -1 lets the number of columns be inferred
# Dividing by 255 normalizes the pixel values
x_train = x_train.reshape(x_train.shape[0], -1) / 255.0
x_test = x_test.reshape(x_test.shape[0], -1) / 255.0
# Convert the labels to one-hot form (10 classes)
y_train = np_utils.to_categorical(y_train, num_classes=10)
y_test = np_utils.to_categorical(y_test, num_classes=10)

# Build the model: 784 input neurons, 10 output neurons
# Biases are initialized to zeros (which is also the default)
model = Sequential([
    # Add weight regularization to every layer
    Dense(units=200, input_dim=784, bias_initializer='zeros', activation='tanh', kernel_regularizer=l2(0.0003)),  # tanh activation
    Dense(units=100, bias_initializer='zeros', activation='tanh', kernel_regularizer=l2(0.0003)),  # tanh activation
    Dense(units=10, bias_initializer='zeros', activation='softmax', kernel_regularizer=l2(0.0003))
])
# Layers can also be added this way:
# model.add(Dense(...))
# model.add(Dense(...))

# Define the optimizer with a learning rate of 0.2
sgd = SGD(lr=0.2)
# Set the optimizer and loss function, and report accuracy during training
model.compile(
    optimizer=sgd,  # SGD optimizer
    loss='categorical_crossentropy',  # cross-entropy loss; trains faster
    metrics=['accuracy'],  # report accuracy
)

# Train (unlike before, this is a new way of training)
# 60,000 images, 32 per batch, 10 epochs (one epoch = one pass over all 60,000 images)
model.fit(x_train, y_train, batch_size=32, epochs=10)

# Evaluate the model on the test set, then on the training set
loss, accuracy = model.evaluate(x_test, y_test)
print('\ntest loss', loss)
print('\ntest accuracy', accuracy)
loss, accuracy = model.evaluate(x_train, y_train)
print('\ntrain loss', loss)
print('\ntrain accuracy', accuracy)
x_shape: (60000, 28, 28)
y_shape: (60000,)
Epoch 1/10
60000/60000 [==============================] - 8s 127us/step - loss: 0.4064 - acc: 0.9202
Epoch 2/10
60000/60000 [==============================] - 7s 121us/step - loss: 0.2616 - acc: 0.9603
Epoch 3/10
60000/60000 [==============================] - 8s 135us/step - loss: 0.2185 - acc: 0.9683
Epoch 4/10
60000/60000 [==============================] - 8s 132us/step - loss: 0.1950 - acc: 0.9723
Epoch 5/10
60000/60000 [==============================] - 8s 130us/step - loss: 0.1793 - acc: 0.9754
Epoch 6/10
60000/60000 [==============================] - 8s 125us/step - loss: 0.1681 - acc: 0.9775
Epoch 7/10
60000/60000 [==============================] - 8s 130us/step - loss: 0.1625 - acc: 0.9783
Epoch 8/10
60000/60000 [==============================] - 7s 125us/step - loss: 0.1566 - acc: 0.9797
Epoch 9/10
60000/60000 [==============================] - 8s 136us/step - loss: 0.1515 - acc: 0.9811
Epoch 10/10
60000/60000 [==============================] - 8s 140us/step - loss: 0.1515 - acc: 0.9808
10000/10000 [==============================] - 1s 57us/step
test loss 0.17750378291606903
test accuracy 0.9721
60000/60000 [==============================] - 3s 52us/step
train loss 0.1493431808312734
train accuracy 0.9822666666666666