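When training deep learning models with Keras, the usual workflow is to train the model in a training environment and then deploy the trained model (that is, the file that stores the model's information) to production. During training we may run into the following situations:
A long-running program fails unexpectedly partway through, at some intermediate epoch;
We want to stop training manually, perhaps to test the model on test data, and then resume training from the last checkpoint;
We want to keep the best version of the model seen during training, as judged by the loss function and the evaluation metrics.
All of these situations require being able to save and load a model during training. This post summarizes the methods and tips I have recently learned for saving and loading models in Keras.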
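Contents
1 Saving a model
1.1 Saving the entire model
1.2 Saving the model architecture and weights separately
1.3 Saving a model diagram
2 Loading a model
2.1 Loading the entire model
2.2 Loading the model architecture and weights separately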
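MNIST handwritten digit recognition is used as the running example to show how to save and load a model. The dataset link and the baseline training script follow.
MNIST dataset link: https://pan.baidu.com/s/1LZRbpaE7eiHtoVPmByQQyQ
Extraction code: radl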
from __future__ import print_function
import numpy as np
from keras.models import Sequential
from keras.layers.core import Dense, Activation
from keras.optimizers import SGD
from keras.utils import np_utils
# Random seed, for reproducibility
np.random.seed(1671)
# Network and training parameters
NB_EPOCH = 20
BATCH_SIZE = 128
VERBOSE = 1
NB_CLASSES = 10
OPTIMIZER = SGD()
N_HIDDEN = 128
VALIDATION_SPLIT = 0.2
RESHAPED = 784
# Load the data
def load_data(path="mnist.npz"):
    # The with-block closes the .npz archive once the arrays have been read
    with np.load(path) as f:
        x_train, y_train = f['x_train'], f['y_train']
        x_test, y_test = f['x_test'], f['y_test']
    return (x_train, y_train), (x_test, y_test)
# Call the function to load the data
(x_train, y_train), (x_test, y_test) = load_data()
# Preprocessing: reshape, convert the type, and normalize to [0, 1]
x_train = x_train.reshape(60000, 784).astype('float32') / 255
x_test = x_test.reshape(10000, 784).astype('float32') / 255
# Print the sample shapes
print('Training samples:', x_train.shape)
print('Testing samples:', x_test.shape)
# Convert the class labels to one-hot encoding
y_train = np_utils.to_categorical(y_train, NB_CLASSES)
y_test = np_utils.to_categorical(y_test, NB_CLASSES)
# Define the network architecture
model = Sequential()
model.add(Dense(N_HIDDEN, input_shape=(RESHAPED, )))
model.add(Activation('relu'))
model.add(Dense(N_HIDDEN))
model.add(Activation('relu'))
model.add(Dense(NB_CLASSES))
model.add(Activation('softmax'))
# Print the model summary
model.summary()
# Compile the model
model.compile(loss='categorical_crossentropy', optimizer=OPTIMIZER, metrics=['accuracy'])
# Train the model
history = model.fit(x_train, y_train, batch_size=BATCH_SIZE, epochs=NB_EPOCH, verbose=VERBOSE,
                    validation_split=VALIDATION_SPLIT)
# Evaluate the model
score = model.evaluate(x_test, y_test, verbose=VERBOSE)
print('Test score:', score[0])
print('Test accuracy:', score[1])
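1 Saving a model
There are two common ways to save a model: save the entire model, or save the architecture and the weights separately. Both write the model's information to files. In addition, the model's structure can be recorded as a model diagram.
1.1 Saving the entire model
model.save() saves the entire model, storing the Keras model and its weights in a single HDF5 file. The file contains:
The model architecture
The model parameters (weights)
The optimizer parameters, which are used to resume training
The code is as follows: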
from __future__ import print_function
import numpy as np
from keras.models import Sequential
from keras.layers.core import Dense, Activation
from keras.optimizers import SGD
from keras.utils import np_utils
# Random seed, for reproducibility
np.random.seed(1671)
# Network and training parameters
NB_EPOCH = 20
BATCH_SIZE = 128
VERBOSE = 1
NB_CLASSES = 10
OPTIMIZER = SGD()
N_HIDDEN = 128
VALIDATION_SPLIT = 0.2
RESHAPED = 784
# Load the data
def load_data(path="mnist.npz"):
    # The with-block closes the .npz archive once the arrays have been read
    with np.load(path) as f:
        x_train, y_train = f['x_train'], f['y_train']
        x_test, y_test = f['x_test'], f['y_test']
    return (x_train, y_train), (x_test, y_test)
# Call the function to load the data
(x_train, y_train), (x_test, y_test) = load_data()
# Preprocessing: reshape, convert the type, and normalize to [0, 1]
x_train = x_train.reshape(60000, 784).astype('float32') / 255
x_test = x_test.reshape(10000, 784).astype('float32') / 255
# Print the sample shapes
print('Training samples:', x_train.shape)
print('Testing samples:', x_test.shape)
# Convert the class labels to one-hot encoding
y_train = np_utils.to_categorical(y_train, NB_CLASSES)
y_test = np_utils.to_categorical(y_test, NB_CLASSES)
# Define the network architecture
model = Sequential()
model.add(Dense(N_HIDDEN, input_shape=(RESHAPED, )))
model.add(Activation('relu'))
model.add(Dense(N_HIDDEN))
model.add(Activation('relu'))
model.add(Dense(NB_CLASSES))
model.add(Activation('softmax'))
# Print the model summary
model.summary()
# Compile the model
model.compile(loss='categorical_crossentropy', optimizer=OPTIMIZER, metrics=['accuracy'])
# Train the model
history = model.fit(x_train, y_train, batch_size=BATCH_SIZE, epochs=NB_EPOCH, verbose=VERBOSE,
                    validation_split=VALIDATION_SPLIT)
# Evaluate the model
score = model.evaluate(x_test, y_test, verbose=VERBOSE)
print('Test score:', score[0])
print('Test accuracy:', score[1])
# Save the entire model (architecture, weights, and optimizer parameters) to one HDF5 file
model.save('my_model.h5')
[Screenshot: result of running the script]
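One of the situations listed in the introduction is a run that dies unexpectedly. If that happens while the model file is being written, the file may be left incomplete. The following is a minimal sketch of one common precaution, which is not part of the original script: write to a temporary name first and rename only after the write has finished. The helper name atomic_save and the stand-in fake_save are hypothetical and exist only for illustration; in practice you would pass model.save. The sketch assumes Python 3.3 or later for os.replace.

```python
import os
import tempfile


def atomic_save(save_fn, path):
    """Call save_fn(tmp_path), then move the finished file onto path.

    save_fn is any callable that writes a file at the path it receives,
    for example a Keras model's save method (assumption: it accepts a
    single file path argument).
    """
    root, ext = os.path.splitext(path)
    # Keep the original extension at the end of the temporary name,
    # because some Keras versions choose the file format from it.
    tmp_path = root + '.tmp' + ext
    try:
        save_fn(tmp_path)
        # Replace the destination with the completed file in one rename
        os.replace(tmp_path, path)
    finally:
        # Remove a leftover temporary file if saving failed midway
        if os.path.exists(tmp_path):
            os.remove(tmp_path)


def fake_save(p):
    # Stand-in for model.save, used only to demonstrate the helper
    with open(p, 'w') as f:
        f.write('model bytes')


workdir = tempfile.mkdtemp()
target = os.path.join(workdir, 'my_model.h5')
atomic_save(fake_save, target)
saved_files = sorted(os.listdir(workdir))
print(saved_files)
```

With a real model the call would be atomic_save(model.save, 'my_model.h5'). A previous good copy of my_model.h5 is then only overwritten once the new file is complete.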
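1.2 Saving the model architecture and weights separately
The architecture and the weights can also be saved on their own. Several ways of doing so are described here.
1.2.1 Saving only the model architecture
The to_json() or to_yaml() method serializes the model architecture to a string, which is then written to a JSON or YAML file.
The code is as follows: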
from __future__ import print_function
import numpy as np
from keras.models import Sequential
from keras.layers.core import Dense, Activation
from keras.optimizers import SGD
from keras.utils import np_utils
# Random seed, for reproducibility
np.random.seed(1671)
# Network and training parameters
NB_EPOCH = 20
BATCH_SIZE = 128
VERBOSE = 1
NB_CLASSES = 10
OPTIMIZER = SGD()
N_HIDDEN = 128
VALIDATION_SPLIT = 0.2
RESHAPED = 784
# Load the data
def load_data(path="mnist.npz"):
    # The with-block closes the .npz archive once the arrays have been read
    with np.load(path) as f:
        x_train, y_train = f['x_train'], f['y_train']
        x_test, y_test = f['x_test'], f['y_test']
    return (x_train, y_train), (x_test, y_test)
# Call the function to load the data
(x_train, y_train), (x_test, y_test) = load_data()
# Preprocessing: reshape, convert the type, and normalize to [0, 1]
x_train = x_train.reshape(60000, 784).astype('float32') / 255
x_test = x_test.reshape(10000, 784).astype('float32') / 255
# Print the sample shapes
print('Training samples:', x_train.shape)
print('Testing samples:', x_test.shape)
# Convert the class labels to one-hot encoding
y_train = np_utils.to_categorical(y_train, NB_CLASSES)
y_test = np_utils.to_categorical(y_test, NB_CLASSES)
# Define the network architecture
model = Sequential()
model.add(Dense(N_HIDDEN, input_shape=(RESHAPED, )))
model.add(Activation('relu'))
model.add(Dense(N_HIDDEN))
model.add(Activation('relu'))
model.add(Dense(NB_CLASSES))
model.add(Activation('softmax'))
# Print the model summary
model.summary()
# Compile the model
model.compile(loss='categorical_crossentropy', optimizer=OPTIMIZER, metrics=['accuracy'])
# Train the model
history = model.fit(x_train, y_train, batch_size=BATCH_SIZE, epochs=NB_EPOCH, verbose=VERBOSE,
                    validation_split=VALIDATION_SPLIT)
# Evaluate the model
score = model.evaluate(x_test, y_test, verbose=VERBOSE)
print('Test score:', score[0])
print('Test accuracy:', score[1])
# Save the model architecture
json_string = model.to_json()  # Option 1: JSON
with open('model_architecture_1.json', 'w') as f:
    f.write(json_string)
# Option 2: YAML (requires PyYAML; TensorFlow 2.6 and later removed to_yaml())
yaml_string = model.to_yaml()
with open('model_architecture_2.yaml', 'w') as f:
    f.write(yaml_string)
# Print a completion message
print('Finished training and saving the model architecture!')
[Screenshot: result of running the script]
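The architecture file is plain text, so it can be inspected with the standard json module without Keras. The sketch below uses a hand-written, heavily abbreviated stand-in for the output of to_json(); the real file carries many more fields per layer, and the exact layout depends on the Keras version. The helper name list_layers is hypothetical. What the sketch illustrates is that the file describes layers only: it contains no weights and no optimizer settings.

```python
import json

# Hand-written, abbreviated stand-in for the text returned by model.to_json().
# Real output carries many more fields per layer.
sample_json = """
{
  "class_name": "Sequential",
  "config": {
    "name": "sequential_1",
    "layers": [
      {"class_name": "Dense", "config": {"units": 128, "activation": "linear"}},
      {"class_name": "Activation", "config": {"activation": "relu"}},
      {"class_name": "Dense", "config": {"units": 10, "activation": "linear"}},
      {"class_name": "Activation", "config": {"activation": "softmax"}}
    ]
  }
}
"""


def list_layers(architecture_json):
    """Return (class_name, config) pairs for every layer in the JSON text."""
    spec = json.loads(architecture_json)
    config = spec['config']
    # Depending on the Keras version, a Sequential config is either a dict
    # with a 'layers' key or directly a list of layers.
    layers = config['layers'] if isinstance(config, dict) else config
    return [(layer['class_name'], layer['config']) for layer in layers]


# For each layer keep the unit count if present, otherwise the activation name
layer_summary = [
    (name, cfg.get('units', cfg.get('activation')))
    for name, cfg in list_layers(sample_json)
]
print(layer_summary)
```

To inspect a real file, read model_architecture_1.json from disk and pass its contents to list_layers() instead of sample_json.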
1.2.2 Saving only the model weights
The weights alone can be saved with the save_weights() method, or by setting up a checkpoint.
Using the save_weights() method
The code is as follows:
from __future__ import print_function
import numpy as np
from keras.models import Sequential
from keras.layers.core import Dense, Activation
from keras.optimizers import SGD
from keras.utils import np_utils
# Random seed, for reproducibility
np.random.seed(1671)
# Network and training parameters
NB_EPOCH = 20
BATCH_SIZE = 128
VERBOSE = 1
NB_CLASSES = 10
OPTIMIZER = SGD()
N_HIDDEN = 128
VALIDATION_SPLIT = 0.2
RESHAPED = 784
# Load the data
def load_data(path="mnist.npz"):
    # The with-block closes the .npz archive once the arrays have been read
    with np.load(path) as f:
        x_train, y_train = f['x_train'], f['y_train']
        x_test, y_test = f['x_test'], f['y_test']
    return (x_train, y_train), (x_test, y_test)
# Call the function to load the data
(x_train, y_train), (x_test, y_test) = load_data()
# Preprocessing: reshape, convert the type, and normalize to [0, 1]
x_train = x_train.reshape(60000, 784).astype('float32') / 255
x_test = x_test.reshape(10000, 784).astype('float32') / 255
# Print the sample shapes
print('Training samples:', x_train.shape)
print('Testing samples:', x_test.shape)
# Convert the class labels to one-hot encoding
y_train = np_utils.to_categorical(y_train, NB_CLASSES)
y_test = np_utils.to_categorical(y_test, NB_CLASSES)
# Define the network architecture
model = Sequential()
model.add(Dense(N_HIDDEN, input_shape=(RESHAPED, )))
model.add(Activation('relu'))
model.add(Dense(N_HIDDEN))
model.add(Activation('relu'))
model.add(Dense(NB_CLASSES))
model.add(Activation('softmax'))
# Compile the model
model.compile(loss='categorical_crossentropy', optimizer=OPTIMIZER, metrics=['accuracy'])
# Train the model
history = model.fit(x_train, y_train, batch_size=BATCH_SIZE, epochs=NB_EPOCH, verbose=VERBOSE,
                    validation_split=VALIDATION_SPLIT)
# Evaluate the model
score = model.evaluate(x_test, y_test, verbose=VERBOSE)
print('Test score:', score[0])
print('Test accuracy:', score[1])
# Save only the model weights (no architecture, no optimizer parameters)
model.save_weights('my_model_weights.h5')
[Screenshot: result of running the script]
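Using a checkpoint
A checkpoint is set up by creating an instance of the ModelCheckpoint class and passing it to fit() as a callback. The code is as follows: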
from __future__ import print_function
import os
import numpy as np
from keras.models import Sequential
from keras.layers.core import Dense, Activation
from keras.optimizers import SGD
from keras.utils import np_utils
from keras.callbacks import ModelCheckpoint
# Random seed, for reproducibility
np.random.seed(1671)
# Network and training parameters
NB_EPOCH = 40
BATCH_SIZE = 128
VERBOSE = 1
NB_CLASSES = 10
OPTIMIZER = SGD()
N_HIDDEN = 128
VALIDATION_SPLIT = 0.2
RESHAPED = 784
# Load the data
def load_data(path="mnist.npz"):
    # The with-block closes the .npz archive once the arrays have been read
    with np.load(path) as f:
        x_train, y_train = f['x_train'], f['y_train']
        x_test, y_test = f['x_test'], f['y_test']
    return (x_train, y_train), (x_test, y_test)
# Call the function to load the data
(x_train, y_train), (x_test, y_test) = load_data()
# Preprocessing: reshape, convert the type, and normalize to [0, 1]
x_train = x_train.reshape(60000, 784).astype('float32') / 255
x_test = x_test.reshape(10000, 784).astype('float32') / 255
# Print the sample shapes
print('Training samples:', x_train.shape)
print('Testing samples:', x_test.shape)
# Convert the class labels to one-hot encoding
y_train = np_utils.to_categorical(y_train, NB_CLASSES)
y_test = np_utils.to_categorical(y_test, NB_CLASSES)
# Define the network architecture
model = Sequential()
model.add(Dense(N_HIDDEN, input_shape=(RESHAPED, )))
model.add(Activation('relu'))
model.add(Dense(N_HIDDEN))
model.add(Activation('relu'))
model.add(Dense(NB_CLASSES))
model.add(Activation('softmax'))
# Compile the model
model.compile(loss='categorical_crossentropy', optimizer=OPTIMIZER, metrics=['accuracy'])
# Set up the checkpoint
# Create the target directory up front; some Keras versions do not create it automatically
if not os.path.isdir('saved_models'):
    os.makedirs('saved_models')
# {epoch} and {val_acc} are filled in from the epoch number and the logged metrics.
# In Keras 2.3+ and tf.keras the metric is named 'val_accuracy' instead of 'val_acc'.
filepath = 'saved_models/weights-improvement-{epoch:02d}-{val_acc:.5f}.hdf5'
# save_weights_only=True writes only the weights; the default (False) saves the full model
checkpoint = ModelCheckpoint(filepath=filepath, monitor='val_acc', verbose=VERBOSE,
                             save_best_only=True, save_weights_only=True, mode='max')
# Train the model
history = model.fit(x_train, y_train, batch_size=BATCH_SIZE, epochs=NB_EPOCH, verbose=VERBOSE,
                    validation_split=VALIDATION_SPLIT, callbacks=[checkpoint])
# Evaluate the model
score = model.evaluate(x_test, y_test, verbose=VERBOSE)
print('Test score:', score[0])
print('Test accuracy:', score[1])
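[Screenshot: result of running the script]
As the screenshot shows, this approach saves the model weights that correspond to the best evaluation result. Because a new file is written only when val_acc improves, the best weights are in the last file written.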
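The file names produced by the checkpoint follow directly from the filepath template: Keras fills it in with str.format, using the 1-based epoch number and the metrics logged for that epoch. The sketch below shows the formatting step and a small helper for picking the best file from a directory listing. The helper name best_checkpoint, the metric values, and the listing are made up for illustration and are not results from the original run.

```python
import re

# Same template as in the checkpoint script above
TEMPLATE = 'saved_models/weights-improvement-{epoch:02d}-{val_acc:.5f}.hdf5'

# Formatting with made-up values shows what a generated file name looks like
example_name = TEMPLATE.format(epoch=7, val_acc=0.951234)
print(example_name)

# Captures the epoch number and the val_acc value embedded in a file name
PATTERN = re.compile(r'weights-improvement-(\d+)-(\d+\.\d+)\.hdf5$')


def best_checkpoint(paths):
    """Return the path with the highest val_acc; ties go to the later epoch."""
    scored = []
    for path in paths:
        match = PATTERN.search(path)
        if match:
            epoch, val_acc = int(match.group(1)), float(match.group(2))
            scored.append((val_acc, epoch, path))
    if not scored:
        return None
    return max(scored)[2]


# Hypothetical directory listing, for illustration only
listing = [
    'saved_models/weights-improvement-01-0.90100.hdf5',
    'saved_models/weights-improvement-02-0.92350.hdf5',
    'saved_models/weights-improvement-05-0.94000.hdf5',
]
best = best_checkpoint(listing)
print(best)
```

In a real run the listing could come from glob.glob('saved_models/*.hdf5'). With save_best_only=True the result coincides with the most recently written file.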
1.3 Saving a model diagram
model.summary() prints an overview of the model, from which its basic structure can be read off.
The code and its output are as follows:
from __future__ import print_function
import numpy as np
from keras.models import Sequential
from keras.layers.core import Dense, Activation
from keras.optimizers import SGD
from keras.utils import np_utils
# Random seed, for reproducibility
np.random.seed(1671)
# Network and training parameters
NB_EPOCH = 20
BATCH_SIZE = 128
VERBOSE = 1
NB_CLASSES = 10
OPTIMIZER = SGD()
N_HIDDEN = 128
VALIDATION_SPLIT = 0.2
RESHAPED = 784
# Load the data
def load_data(path="mnist.npz"):
    # The with-block closes the .npz archive once the arrays have been read
    with np.load(path) as f:
        x_train, y_train = f['x_train'], f['y_train']
        x_test, y_test = f['x_test'], f['y_test']
    return (x_train, y_train), (x_test, y_test)
# Call the function to load the data
(x_train, y_train), (x_test, y_test) = load_data()
# Preprocessing: reshape, convert the type, and normalize to [0, 1]
x_train = x_train.reshape(60000, 784).astype('float32') / 255
x_test = x_test.reshape(10000, 784).astype('float32') / 255
# Print the sample shapes
print('Training samples:', x_train.shape)
print('Testing samples:', x_test.shape)
# Convert the class labels to one-hot encoding
y_train = np_utils.to_categorical(y_train, NB_CLASSES)
y_test = np_utils.to_categorical(y_test, NB_CLASSES)
# Define the network architecture
model = Sequential()
model.add(Dense(N_HIDDEN, input_shape=(RESHAPED, )))
model.add(Activation('relu'))
model.add(Dense(N_HIDDEN))
model.add(Activation('relu'))
model.add(Dense(NB_CLASSES))
model.add(Activation('softmax'))
# Compile the model
model.compile(loss='categorical_crossentropy', optimizer=OPTIMIZER, metrics=['accuracy'])
# Train the model
history = model.fit(x_train, y_train, batch_size=BATCH_SIZE, epochs=NB_EPOCH, verbose=VERBOSE,
                    validation_split=VALIDATION_SPLIT)
# Evaluate the model
score = model.evaluate(x_test, y_test, verbose=VERBOSE)
print('Test score:', score[0])
print('Test accuracy:', score[1])
# Print the model summary
model.summary()
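[Screenshot: result of running the script]
The summary shows the model's layer-by-layer structure and its parameter counts. In addition, the plot_model() function saves a diagram of the model's basic structure to an image file. It relies on the pydot package and Graphviz being installed.
The code is as follows: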
from __future__ import print_function
import numpy as np
from keras.models import Sequential
from keras.layers.core import Dense, Activation
from keras.optimizers import SGD
from keras.utils import np_utils
from keras.utils import plot_model
# Random seed, for reproducibility
np.random.seed(1671)
# Network and training parameters
NB_EPOCH = 20
BATCH_SIZE = 128
VERBOSE = 1
NB_CLASSES = 10
OPTIMIZER = SGD()
N_HIDDEN = 128
VALIDATION_SPLIT = 0.2
RESHAPED = 784
# Load the data
def load_data(path="mnist.npz"):
    # The with-block closes the .npz archive once the arrays have been read
    with np.load(path) as f:
        x_train, y_train = f['x_train'], f['y_train']
        x_test, y_test = f['x_test'], f['y_test']
    return (x_train, y_train), (x_test, y_test)
# Call the function to load the data
(x_train, y_train), (x_test, y_test) = load_data()
# Preprocessing: reshape, convert the type, and normalize to [0, 1]
x_train = x_train.reshape(60000, 784).astype('float32') / 255
x_test = x_test.reshape(10000, 784).astype('float32') / 255
# Print the sample shapes
print('Training samples:', x_train.shape)
print('Testing samples:', x_test.shape)
# Convert the class labels to one-hot encoding
y_train = np_utils.to_categorical(y_train, NB_CLASSES)
y_test = np_utils.to_categorical(y_test, NB_CLASSES)
# Define the network architecture
model = Sequential()
model.add(Dense(N_HIDDEN, input_shape=(RESHAPED, )))
model.add(Activation('relu'))
model.add(Dense(N_HIDDEN))
model.add(Activation('relu'))
model.add(Dense(NB_CLASSES))
model.add(Activation('softmax'))
# Compile the model
model.compile(loss='categorical_crossentropy', optimizer=OPTIMIZER, metrics=['accuracy'])
# Train the model
history = model.fit(x_train, y_train, batch_size=BATCH_SIZE, epochs=NB_EPOCH, verbose=VERBOSE,
                    validation_split=VALIDATION_SPLIT)
# Evaluate the model
score = model.evaluate(x_test, y_test, verbose=VERBOSE)
print('Test score:', score[0])
print('Test accuracy:', score[1])
# Save a diagram of the model's basic structure as an image
plot_model(model, to_file='model_plot.png')
[Screenshot: result of running the script]
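The parameter counts that model.summary() reports for this network can be checked by hand. A fully connected layer has one weight per input-unit pair plus one bias per unit, and the Activation layers contribute no parameters. The short sketch below redoes that arithmetic for the 784-128-128-10 network used throughout this post; the helper name dense_params is hypothetical.

```python
def dense_params(n_inputs, n_units):
    """Parameters of a fully connected layer: weights plus one bias per unit."""
    return n_inputs * n_units + n_units


# Layer widths taken from the constants used in the scripts above
RESHAPED, N_HIDDEN, NB_CLASSES = 784, 128, 10
layer_sizes = [RESHAPED, N_HIDDEN, N_HIDDEN, NB_CLASSES]

# One count per Dense layer, pairing each layer's input width with its unit count
counts = [dense_params(a, b) for a, b in zip(layer_sizes[:-1], layer_sizes[1:])]
total = sum(counts)
print(counts, total)
```

The three Dense layers come to 100480, 16512 and 1290 parameters, for a total of 118282.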
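2 Loading a model
Loading a model corresponds to the deployment step mentioned above. A model is typically loaded either as a whole or by loading its architecture and weights separately, depending on the task.
2.1 Loading the entire model
Loading the entire model is the counterpart of saving the entire model in section 1.1. The saved file provides the model's architecture, its weights, and the optimizer configuration, so training can continue on data right away. load_model() is used to load the entire model.
The code is as follows: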
from __future__ import print_function
import numpy as np
from keras.utils import np_utils
from keras.models import load_model
# Random seed, for reproducibility
np.random.seed(1671)
# Network and training parameters
NB_EPOCH = 20
BATCH_SIZE = 128
VERBOSE = 1
NB_CLASSES = 10
N_HIDDEN = 128
VALIDATION_SPLIT = 0.2
RESHAPED = 784
# Load the data
def load_data(path="mnist.npz"):
    # The with-block closes the .npz archive once the arrays have been read
    with np.load(path) as f:
        x_train, y_train = f['x_train'], f['y_train']
        x_test, y_test = f['x_test'], f['y_test']
    return (x_train, y_train), (x_test, y_test)
# Call the function to load the data
(x_train, y_train), (x_test, y_test) = load_data()
# Preprocessing: reshape, convert the type, and normalize to [0, 1]
x_train = x_train.reshape(60000, 784).astype('float32') / 255
x_test = x_test.reshape(10000, 784).astype('float32') / 255
# Print the sample shapes
print('Training samples:', x_train.shape)
print('Testing samples:', x_test.shape)
# Convert the class labels to one-hot encoding
y_train = np_utils.to_categorical(y_train, NB_CLASSES)
y_test = np_utils.to_categorical(y_test, NB_CLASSES)
# Load the entire model; it comes back already compiled, so no compile() call is needed
model = load_model('my_model.h5')
# Continue training the model
history = model.fit(x_train, y_train, batch_size=BATCH_SIZE, epochs=NB_EPOCH, verbose=VERBOSE,
                    validation_split=VALIDATION_SPLIT)
# Evaluate the model
score = model.evaluate(x_test, y_test, verbose=VERBOSE)
print('Test score:', score[0])
print('Test accuracy:', score[1])
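[Screenshot: result of running the script]
As the screenshot shows, training for a moderate number of additional epochs on the same training set improved the result.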
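The script above restarts the epoch counter at 1 when it continues training. If the logs (and any checkpoint file names) should instead continue from where the first run stopped, fit() accepts an initial_epoch argument. The sketch below is a small helper that builds the two related arguments; the name resume_fit_kwargs is hypothetical. The point it illustrates is that, once initial_epoch is set, Keras reads epochs as the index of the final epoch rather than as the number of additional epochs.

```python
def resume_fit_kwargs(epochs_done, extra_epochs):
    """Build the epoch-related keyword arguments for continuing a run.

    With initial_epoch set, Keras treats `epochs` as the index of the final
    epoch, not as the number of additional epochs.
    """
    if epochs_done < 0 or extra_epochs <= 0:
        raise ValueError('epochs_done must be >= 0 and extra_epochs > 0')
    return {'initial_epoch': epochs_done, 'epochs': epochs_done + extra_epochs}


# 20 epochs were run before saving; train for 20 more
kwargs = resume_fit_kwargs(epochs_done=20, extra_epochs=20)
print(kwargs)
# Intended use with the loaded model (not executed here):
# model.fit(x_train, y_train, batch_size=BATCH_SIZE, verbose=VERBOSE,
#           validation_split=VALIDATION_SPLIT, **kwargs)
```

With these arguments the progress output would read Epoch 21/40 through Epoch 40/40 instead of starting again at Epoch 1/20.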
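2.2 Loading the model architecture and weights separately
Here the architecture file and the weights file are loaded one after the other: model_from_json() or model_from_yaml() rebuilds the architecture, and load_weights() restores the weights. The code and screenshot below illustrate the effect.
The code is as follows: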
from __future__ import print_function
import os
import numpy as np
from keras.optimizers import SGD
from keras.utils import np_utils
from keras.callbacks import ModelCheckpoint
from keras.models import model_from_json
# model_from_yaml() is the YAML counterpart of model_from_json()
from keras.models import model_from_yaml
# Random seed, for reproducibility
np.random.seed(1671)
# Network and training parameters
NB_EPOCH = 40
BATCH_SIZE = 128
VERBOSE = 1
NB_CLASSES = 10
OPTIMIZER = SGD()
N_HIDDEN = 128
VALIDATION_SPLIT = 0.2
RESHAPED = 784
# Load the data
def load_data(path="mnist.npz"):
    # The with-block closes the .npz archive once the arrays have been read
    with np.load(path) as f:
        x_train, y_train = f['x_train'], f['y_train']
        x_test, y_test = f['x_test'], f['y_test']
    return (x_train, y_train), (x_test, y_test)
# Call the function to load the data
(x_train, y_train), (x_test, y_test) = load_data()
# Preprocessing: reshape, convert the type, and normalize to [0, 1]
x_train = x_train.reshape(60000, 784).astype('float32') / 255
x_test = x_test.reshape(10000, 784).astype('float32') / 255
# Print the sample shapes
print('Training samples:', x_train.shape)
print('Testing samples:', x_test.shape)
# Convert the class labels to one-hot encoding
y_train = np_utils.to_categorical(y_train, NB_CLASSES)
y_test = np_utils.to_categorical(y_test, NB_CLASSES)
# Load the model architecture
with open('model_architecture_1.json', 'r') as f:
    model = model_from_json(f.read())
# Load the model weights; use the checkpoint file produced by your own run and
# adjust the path to the directory it was written to (for example saved_models/)
model.load_weights('weights-improvement-40-0.96208.hdf5')
# Compile the model; a model rebuilt from JSON must be compiled before training or evaluation
model.compile(loss='categorical_crossentropy', optimizer=OPTIMIZER, metrics=['accuracy'])
# Set up the checkpoint
# Create the target directory up front; some Keras versions do not create it automatically
if not os.path.isdir('save_model'):
    os.makedirs('save_model')
# In Keras 2.3+ and tf.keras the metric is named 'val_accuracy' instead of 'val_acc'
filepath = 'save_model/weights-improvement-{epoch:02d}-{val_acc:.5f}.hdf5'
checkpoint = ModelCheckpoint(filepath=filepath, monitor='val_acc', verbose=VERBOSE,
                             save_best_only=True, mode='max')
# Train the model
history = model.fit(x_train, y_train, batch_size=BATCH_SIZE, epochs=NB_EPOCH, verbose=VERBOSE,
                    validation_split=VALIDATION_SPLIT, callbacks=[checkpoint])
# Evaluate the model
score = model.evaluate(x_test, y_test, verbose=VERBOSE)
print('Test score:', score[0])
print('Test accuracy:', score[1])
[Screenshot: result of running the script]
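This post draws on and summarizes many other blog posts, which are not listed individually here. I will keep thinking about where it can be improved. Thank you all.
Feel free to get in touch! QQ: 3408649893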
---------------------
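Author: tsz_upUP
Source: CSDN
Original post: https://blog.csdn.net/tszupup/article/details/85198949
Copyright notice: This is an original article by the blogger. Please include a link to the original post when reposting.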