This article walks step by step through tuning a model on the Fashion MNIST dataset, looking at how each change affects accuracy. (The basic model architecture stays roughly the same throughout the tuning process.)
Without further ado, here is the task:
The main structure of the model is as follows (this is the original code as received; the framework used is Keras):
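The original listing did not survive here, but a minimal Keras sketch consistent with the layer summary printed further down can be reconstructed. The kernel sizes, `padding='same'`, activations, optimizer, and loss are my assumptions inferred from the output shapes and parameter counts, and I use the `tensorflow.keras` import path rather than the standalone `keras` package the post used:

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, Dense, Flatten, MaxPooling2D

# Reconstructed sketch: two conv blocks, then a small dense head.
# 'same' padding keeps the spatial size through each Conv2D, matching
# the (None, 28, 28, 32) / (None, 14, 14, 64) shapes in the summary.
model = Sequential([
    Conv2D(32, (3, 3), padding='same', activation='relu',
           input_shape=(28, 28, 1), name='layer_conv_1'),
    MaxPooling2D(pool_size=(2, 2)),
    Conv2D(64, (3, 3), padding='same', activation='relu',
           name='layer_conv_2'),
    MaxPooling2D(pool_size=(2, 2)),
    Flatten(),
    Dense(64, activation='relu'),
    Dense(10, activation='softmax'),
])
model.compile(optimizer='adam', loss='categorical_crossentropy',
              metrics=['accuracy'])
```

Note that even without the Dropout layers added later, this sketch has the same 220,234 parameters as the summary below, since Dropout contributes no weights.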
The accuracy of this bare run was:
--
First, the number of epochs is increased from 1 to 12.
With dropout added, the model structure is as follows:
=================================================================
layer_conv_1 (Conv2D) (None, 28, 28, 32) 320
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 14, 14, 32) 0
_________________________________________________________________
layer_conv_2 (Conv2D) (None, 14, 14, 64) 18496
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 7, 7, 64) 0
_________________________________________________________________
dropout_1 (Dropout) (None, 7, 7, 64) 0
_________________________________________________________________
flatten_1 (Flatten) (None, 3136) 0
_________________________________________________________________
dense_1 (Dense) (None, 64) 200768
_________________________________________________________________
dropout_2 (Dropout) (None, 64) 0
_________________________________________________________________
dense_2 (Dense) (None, 10) 650
=================================================================
Total params: 220,234
Trainable params: 220,234
Non-trainable params: 0
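The parameter counts in the summary can be checked by hand. A quick sketch of the arithmetic in plain Python (the 3x3 kernel size is an assumption, but it is the one the counts imply):

```python
def conv_params(kernel_h, kernel_w, in_channels, filters):
    # one (kh x kw x in_channels) kernel plus one bias per filter
    return (kernel_h * kernel_w * in_channels + 1) * filters

def dense_params(in_units, out_units):
    # one weight per input-output pair plus one bias per output unit
    return (in_units + 1) * out_units

conv1 = conv_params(3, 3, 1, 32)         # 320, matches layer_conv_1
conv2 = conv_params(3, 3, 32, 64)        # 18496, matches layer_conv_2
dense1 = dense_params(7 * 7 * 64, 64)    # 200768, matches dense_1
dense2 = dense_params(64, 10)            # 650, matches dense_2
total = conv1 + conv2 + dense1 + dense2  # pooling/dropout/flatten add nothing
```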
The accuracy at this point:
55000/55000 [==============================] - 3s 50us/step - loss: 0.2574 - acc: 0.9063 - val_loss: 0.2196 - val_acc: 0.9192
Test loss: 0.2387163639187813
Test accuracy: 0.9133
Model saved
Select the first 9 images
Adding dropout gives the model's classification accuracy a decent boost: an improvement of 0.0276.
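As an aside, the mechanics behind that boost are simple. Here is a pure-Python sketch of inverted dropout (the variant Keras implements), independent of the library; the function name and example values are mine:

```python
import random

def dropout(x, rate, training=True):
    # inverted dropout: during training, zero each unit with probability
    # `rate` and scale the survivors by 1/(1 - rate), so the expected
    # activation is unchanged; at inference time it is the identity
    if not training or rate == 0.0:
        return list(x)
    keep = 1.0 - rate
    return [v / keep if random.random() < keep else 0.0 for v in x]

activations = [0.5, 1.2, 0.3, 0.9]
train_out = dropout(activations, rate=0.25)  # some units zeroed, rest scaled by 4/3
infer_out = dropout(activations, rate=0.25, training=False)  # unchanged
```

Randomly discarding units this way prevents co-adaptation and acts as a regularizer, which is why it helps test accuracy here.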
After that, I tried increasing the number of epochs further, but it no longer had any effect on accuracy.
So the next step is data augmentation, to see how it changes classification accuracy. The code is as follows:
from keras.preprocessing.image import ImageDataGenerator

# you can change the data_aug according to your need
datagen = ImageDataGenerator(
    rotation_range=0.2,
    zoom_range=0.2,
    width_shift_range=0.2,
    height_shift_range=0.2,
    zca_epsilon=1e-6,
    horizontal_flip=True,
)
model.fit_generator(datagen.flow(X_train, y_train, batch_size=batch_size),
                    steps_per_epoch=X_train.shape[0] // batch_size,
                    epochs=epochs,
                    validation_data=(X_test, y_test),
                    workers=4,
                    callbacks=[reduce_lr])
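A side note on the `429/429` steps that appear in the log below: with `fit_generator`, an epoch is `steps_per_epoch` batches rather than one pass over every sample, and the count follows from the floor division above. The batch size of 128 is my inference, since it is not shown in the excerpt but is consistent with the step count:

```python
train_samples = 55000  # the 55000/55000 split seen in the earlier logs
batch_size = 128       # inferred: 55000 // 128 == 429, matching the log

# floor division drops the final partial batch of the epoch
steps_per_epoch = train_samples // batch_size
```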
I trained for 24 epochs here, since too few epochs might leave the model under-trained and hold the accuracy down. The results:
429/429 [==============================] - 22s 52ms/step - loss: 0.5140 - acc: 0.8115 - val_loss: 0.3480 - val_acc: 0.8751
Test loss: 0.3479545822381973
Test accuracy: 0.8751
Model saved
Select the first 9 images
As you can see, the training accuracy is 0.8115 while the validation accuracy is 0.8751, which suggests the model still has room to improve and is some distance from overfitting. Increasing the number of epochs or the number of units in the network should be able to push accuracy further.
However, with data augmentation the model converges more slowly, and the accuracy actually dropped a little.
Next, the learning rate is tuned. The initial learning rate is 1e-3, and every 12 epochs it is reduced to 1/10 of its value, for 36 epochs in total. Code and results below:
from keras import backend as K
from keras.callbacks import LearningRateScheduler

def scheduler(epoch):
    # every 12 epochs, reduce the learning rate to 1/10 of its value
    if epoch % 12 == 0 and epoch != 0:
        lr = K.get_value(model.optimizer.lr)
        K.set_value(model.optimizer.lr, lr * 0.1)
        print("lr changed to {}".format(lr * 0.1))
    return K.get_value(model.optimizer.lr)

reduce_lr = LearningRateScheduler(scheduler)
model.fit(X_train, y_train,
          batch_size=batch_size,
          epochs=epochs,
          verbose=1,
          validation_data=(X_val, y_val),
          callbacks=[reduce_lr])
55000/55000 [==============================] - 3s 53us/step - loss: 0.1980 - acc: 0.9288 - val_loss: 0.2026 - val_acc: 0.9266
Test loss: 0.22499265049099923
Test accuracy: 0.9216
Model saved
Select the first 9 images
Accuracy improves nicely once again, reaching 0.9216.
Time is limited, so the complete code is on GitHub: https://github.com/air-y/pythontrain_cnn
(PS: if you are interested, feel free to play with it yourself (keep the basic structure of two convolutional layers plus fully connected layers, and push the accuracy as high as you can), and let's compare notes. It has honestly been a while since I last used Keras; I've been on PyTorch.)