In Keras, you build models by combining layers. A model is a graph of layers. The most common type of model is a stack of layers: tf.keras.Sequential.
import tensorflow as tf
from tensorflow.keras import layers

model = tf.keras.Sequential()
# Adds a densely-connected layer with 64 units to the model:
model.add(layers.Dense(64, activation='relu'))
# Add another:
model.add(layers.Dense(64, activation='relu'))
# Add a softmax layer with 10 output units:
model.add(layers.Dense(10, activation='softmax'))
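To make "a stack of layers" concrete, here is a minimal pure-Python sketch of what such a stack computes for a single sample: each dense layer applies activation(inputs . weights + biases), and the output of one layer feeds the next. This is not how tf.keras is implemented; the helper names (dense, relu, softmax, sequential) and the toy weights are hypothetical and chosen only for illustration.

```python
import math


def relu(values):
    # Element-wise max(0, v).
    return [max(0.0, v) for v in values]


def softmax(values):
    # Subtract the maximum before exponentiating for numerical stability.
    shift = max(values)
    exps = [math.exp(v - shift) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def dense(inputs, weights, biases, activation):
    # weights holds one list of input weights per output unit.
    pre_activation = [
        sum(x * w for x, w in zip(inputs, unit_weights)) + b
        for unit_weights, b in zip(weights, biases)
    ]
    return activation(pre_activation)


def sequential(stack, inputs):
    # Apply each (weights, biases, activation) layer in order.
    out = inputs
    for weights, biases, activation in stack:
        out = dense(out, weights, biases, activation)
    return out


# Hypothetical toy stack: 2 inputs -> 2 hidden units (relu) -> 3 outputs (softmax).
toy_stack = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, 0.0], relu),
    ([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [0.0, 0.0, 0.0], softmax),
]
probs = sequential(toy_stack, [2.0, 1.0])
print(probs)
```

The softmax output is a probability distribution over the three toy classes, which is why a softmax layer is a common last layer for classification.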
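Many tf.keras.layers constructors share these parameters:
- activation: the activation function of the layer, given either as the name of a built-in function or as a callable object.
- kernel_initializer and bias_initializer: the initialization schemes for the layer's weights (kernel and bias), given as a name or a callable object.
- kernel_regularizer and bias_regularizer: the regularization schemes applied to the layer's weights (kernel and bias).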
# Create a sigmoid layer:
layers.Dense(64, activation='sigmoid')
# Or:
layers.Dense(64, activation=tf.sigmoid)

# A linear layer with L1 regularization of factor 0.01 applied to the kernel matrix:
layers.Dense(64, kernel_regularizer=tf.keras.regularizers.l1(0.01))

# A linear layer with L2 regularization of factor 0.01 applied to the bias vector:
layers.Dense(64, bias_regularizer=tf.keras.regularizers.l2(0.01))

# A linear layer with a kernel initialized to a random orthogonal matrix:
layers.Dense(64, kernel_initializer='orthogonal')

# A linear layer with a bias vector initialized to 2.0s:
layers.Dense(64, bias_initializer=tf.keras.initializers.constant(2.0))
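As a rough illustration of what a regularization factor of 0.01 means, the sketch below computes the penalty that is added to the loss: the L1 penalty is the factor times the sum of absolute weight values, and the L2 penalty is the factor times the sum of squared weight values. The function names and the toy kernel are hypothetical; this is plain Python, not the tf.keras.regularizers implementation.

```python
def l1_penalty(weights, factor):
    # factor * sum(|w|) over every entry of the weight matrix.
    return factor * sum(abs(w) for row in weights for w in row)


def l2_penalty(weights, factor):
    # factor * sum(w^2) over every entry of the weight matrix.
    return factor * sum(w * w for row in weights for w in row)


# Hypothetical 2x2 kernel used only for illustration.
kernel = [[0.5, -1.0], [2.0, 0.0]]

l1 = l1_penalty(kernel, 0.01)  # 0.01 * (0.5 + 1.0 + 2.0 + 0.0)
l2 = l2_penalty(kernel, 0.01)  # 0.01 * (0.25 + 1.0 + 4.0 + 0.0)
print(l1, l2)
```

L1 tends to push individual weights to exactly zero, while L2 penalizes large weights more strongly, which is the usual reason for choosing one over the other.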
Training and evaluation
Setting up training
After the model is constructed, configure its learning process by calling the compile method:
model = tf.keras.Sequential([
    # Adds a densely-connected layer with 64 units to the model:
    layers.Dense(64, activation='relu'),
    # Add another:
    layers.Dense(64, activation='relu'),
    # Add a softmax layer with 10 output units:
    layers.Dense(10, activation='softmax')])

model.compile(optimizer=tf.train.AdamOptimizer(0.001),
              loss='categorical_crossentropy',
              metrics=['accuracy'])
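tf.keras.Model.compile takes three important arguments:
- optimizer: specifies the training procedure. Pass it an optimizer instance from the tf.train module, such as tf.train.AdamOptimizer, tf.train.RMSPropOptimizer, or tf.train.GradientDescentOptimizer.
- loss: the function to minimize during optimization. Common choices include mean squared error (mse), categorical_crossentropy, and binary_crossentropy.
- metrics: the metrics used to monitor training and evaluation, such as accuracy.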
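For intuition about the three loss names listed above, here is a simplified per-sample sketch of each in plain Python. The clipping constant eps is an assumption added to avoid log(0); the real Keras losses operate on batched tensors and differ in detail.

```python
import math


def mse(y_true, y_pred):
    # Mean of squared differences.
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    # y_true is one-hot, y_pred is a probability distribution over classes.
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))


def binary_crossentropy(y_true, y_pred, eps=1e-7):
    # Each entry is an independent 0/1 label with a predicted probability.
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(y_true)


print(mse([1.0, 2.0], [1.0, 4.0]))
print(categorical_crossentropy([0, 1, 0], [0.2, 0.7, 0.1]))
print(binary_crossentropy([1, 0], [0.5, 0.5]))
```

Roughly, mse suits regression targets, categorical_crossentropy suits one-hot class labels with a softmax output, and binary_crossentropy suits independent 0/1 labels with sigmoid outputs.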
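For small datasets, you can train on in-memory NumPy arrays. Use the fit method to fit the model to the training data. tf.keras.Model.fit takes three important arguments:
- epochs: training is structured into epochs. One epoch is one pass over the entire training data.
- batch_size: an integer that specifies the size of each batch. The data is sliced into batches of this size, and the model iterates over them during training.
- validation_data: the validation set, used to monitor the model's performance on held-out data at the end of each epoch.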
import numpy as np

data = np.random.random((1000, 32))
labels = np.random.random((1000, 10))

val_data = np.random.random((100, 32))
val_labels = np.random.random((100, 10))

model.fit(data, labels, epochs=10, batch_size=32,
          validation_data=(val_data, val_labels))

Train on 1000 samples, validate on 100 samples
Epoch 1/10
1000/1000 [==============================] - 0s 124us/step - loss: 11.5267 - categorical_accuracy: 0.1070 - val_loss: 11.0015 - val_categorical_accuracy: 0.0500
Epoch 2/10
1000/1000 [==============================] - 0s 72us/step - loss: 11.5243 - categorical_accuracy: 0.0840 - val_loss: 10.9809 - val_categorical_accuracy: 0.1200
Epoch 3/10
1000/1000 [==============================] - 0s 73us/step - loss: 11.5213 - categorical_accuracy: 0.1000 - val_loss: 10.9945 - val_categorical_accuracy: 0.0800
Epoch 4/10
1000/1000 [==============================] - 0s 73us/step - loss: 11.5213 - categorical_accuracy: 0.1080 - val_loss: 10.9967 - val_categorical_accuracy: 0.0700
Epoch 5/10
1000/1000 [==============================] - 0s 73us/step - loss: 11.5181 - categorical_accuracy: 0.1150 - val_loss: 11.0184 - val_categorical_accuracy: 0.0500
Epoch 6/10
1000/1000 [==============================] - 0s 72us/step - loss: 11.5177 - categorical_accuracy: 0.1150 - val_loss: 10.9892 - val_categorical_accuracy: 0.0200
Epoch 7/10
1000/1000 [==============================] - 0s 72us/step - loss: 11.5130 - categorical_accuracy: 0.1320 - val_loss: 11.0038 - val_categorical_accuracy: 0.0500
Epoch 8/10
1000/1000 [==============================] - 0s 74us/step - loss: 11.5123 - categorical_accuracy: 0.1130 - val_loss: 11.0065 - val_categorical_accuracy: 0.0100
Epoch 9/10
1000/1000 [==============================] - 0s 72us/step - loss: 11.5076 - categorical_accuracy: 0.1150 - val_loss: 11.0062 - val_categorical_accuracy: 0.0800
Epoch 10/10
1000/1000 [==============================] - 0s 67us/step - loss: 11.5035 - categorical_accuracy: 0.1390 - val_loss: 11.0241 - val_categorical_accuracy: 0.1100
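How epochs and batch_size interact can be sketched with plain index arithmetic. With 1000 samples and batch_size=32, each epoch consists of 32 batches, the last of which holds only 8 samples. The helper iterate_minibatches is hypothetical, and the sketch assumes no shuffling, whereas fit shuffles the data by default.

```python
import math


def iterate_minibatches(num_samples, batch_size, epochs):
    # Yield (epoch, start, stop) index ranges in the order they are visited.
    for epoch in range(epochs):
        for start in range(0, num_samples, batch_size):
            yield epoch, start, min(start + batch_size, num_samples)


steps_per_epoch = math.ceil(1000 / 32)
batches = list(iterate_minibatches(1000, 32, 10))
last_epoch, last_start, last_stop = batches[-1]

print(steps_per_epoch)         # batches per epoch
print(len(batches))            # total weight updates over 10 epochs
print(last_stop - last_start)  # size of the final, smaller batch
```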
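Use the Datasets API to scale to large datasets or multi-device training. Pass a tf.data.Dataset instance to the fit method.
The tf.keras.Model.evaluate and tf.keras.Model.predict methods accept both NumPy data and a tf.data.Dataset for evaluation and prediction.
A tf.keras.Sequential model is a simple stack of layers and cannot represent arbitrary models. Use the Keras functional API to build complex models, such as models with multiple inputs or outputs, shared layers, or residual connections.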
inputs = tf.keras.Input(shape=(32,))  # Returns a placeholder tensor

# A layer instance is callable on a tensor, and returns a tensor.
x = layers.Dense(64, activation='relu')(inputs)
x = layers.Dense(64, activation='relu')(x)
predictions = layers.Dense(10, activation='softmax')(x)

# Instantiate the model given inputs and outputs.
model = tf.keras.Model(inputs=inputs, outputs=predictions)

# The compile step specifies the training configuration.
model.compile(optimizer=tf.train.RMSPropOptimizer(0.001),
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Trains for 5 epochs
model.fit(data, labels, batch_size=32, epochs=5)
Model subclassing
Create layers in the __init__ method and set them as attributes of the class instance. Define the forward pass in the call method.
class MyModel(tf.keras.Model):

    def __init__(self, num_classes=10):
        super(MyModel, self).__init__(name='my_model')
        self.num_classes = num_classes
        # Define your layers here.
        self.dense_1 = layers.Dense(32, activation='relu')
        self.dense_2 = layers.Dense(num_classes, activation='sigmoid')

    def call(self, inputs):
        # Define your forward pass here,
        # using layers you previously defined (in `__init__`).
        x = self.dense_1(inputs)
        return self.dense_2(x)

    def compute_output_shape(self, input_shape):
        # You need to override this function if you want to use the subclassed model
        # as part of a functional-style model.
        # Otherwise, this method is optional.
        shape = tf.TensorShape(input_shape).as_list()
        shape[-1] = self.num_classes
        return tf.TensorShape(shape)


model = MyModel(num_classes=10)

# The compile step specifies the training configuration.
model.compile(optimizer=tf.train.RMSPropOptimizer(0.001),
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Trains for 5 epochs.
model.fit(data, labels, batch_size=32, epochs=5)
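Create a custom layer by subclassing tf.keras.layers.Layer and implementing the following methods:
- build: creates the weights of the layer. Add weights with the add_weight method.
- call: defines the forward pass.
- compute_output_shape: specifies how to compute the output shape of the layer given the input shape.
- Optionally, a layer can be made serializable by implementing the get_config method and the from_config class method.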
class MyLayer(layers.Layer):

    def __init__(self, output_dim, **kwargs):
        self.output_dim = output_dim
        super(MyLayer, self).__init__(**kwargs)

    def build(self, input_shape):
        shape = tf.TensorShape((input_shape[1], self.output_dim))
        # Create a trainable weight variable for this layer.
        self.kernel = self.add_weight(name='kernel',
                                      shape=shape,
                                      initializer='uniform',
                                      trainable=True)
        # Be sure to call this at the end
        super(MyLayer, self).build(input_shape)

    def call(self, inputs):
        return tf.matmul(inputs, self.kernel)

    def compute_output_shape(self, input_shape):
        shape = tf.TensorShape(input_shape).as_list()
        shape[-1] = self.output_dim
        return tf.TensorShape(shape)

    def get_config(self):
        base_config = super(MyLayer, self).get_config()
        base_config['output_dim'] = self.output_dim
        return base_config

    @classmethod
    def from_config(cls, config):
        return cls(**config)


model = tf.keras.Sequential([
    MyLayer(10),
    layers.Activation('softmax')])

# The compile step specifies the training configuration
model.compile(optimizer=tf.train.RMSPropOptimizer(0.001),
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Trains for 5 epochs.
model.fit(data, labels, batch_size=32, epochs=5)
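The get_config / from_config pair is essentially a round trip through a plain dictionary of constructor arguments. The sketch below shows that idea with a hypothetical ToyLayer class that does not depend on TensorFlow; a real layer would also merge in the base class configuration, as MyLayer does above.

```python
import json


class ToyLayer:
    """Hypothetical stand-in for a layer, used only to show the round trip."""

    def __init__(self, output_dim, name="toy_layer"):
        self.output_dim = output_dim
        self.name = name

    def get_config(self):
        # Everything needed to call __init__ again, as JSON-friendly values.
        return {"output_dim": self.output_dim, "name": self.name}

    @classmethod
    def from_config(cls, config):
        # Rebuild an equivalent, freshly initialized instance.
        return cls(**config)


original = ToyLayer(10)
serialized = json.dumps(original.get_config())
restored = ToyLayer.from_config(json.loads(serialized))
print(serialized)
```

Note that only the configuration survives the round trip; any trained weights would have to be saved and restored separately.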
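A callback is an object passed to a model to customize and extend its behavior during training. You can write your own custom callback or use the built-in tf.keras.callbacks, which include:
- tf.keras.callbacks.ModelCheckpoint: saves checkpoints of the model at regular intervals.
- tf.keras.callbacks.LearningRateScheduler: dynamically changes the learning rate.
- tf.keras.callbacks.EarlyStopping: interrupts training when validation performance has stopped improving.
- tf.keras.callbacks.TensorBoard: monitors the model's behavior using TensorBoard.
To use a tf.keras.callbacks.Callback, pass it to the model's fit method: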
callbacks = [
    # Interrupt training if `val_loss` stops improving for over 2 epochs
    tf.keras.callbacks.EarlyStopping(patience=2, monitor='val_loss'),
    # Write TensorBoard logs to `./logs` directory
    tf.keras.callbacks.TensorBoard(log_dir='./logs')
]
model.fit(data, labels, batch_size=32, epochs=5, callbacks=callbacks,
          validation_data=(val_data, val_labels))
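The patience logic behind early stopping can be sketched in a few lines of plain Python. This is a simplified model of the idea, not the tf.keras.callbacks.EarlyStopping source: it assumes the monitored quantity is a loss (lower is better) and ignores options such as a minimum improvement threshold. The function name and the validation losses are hypothetical.

```python
def early_stopping_epoch(val_losses, patience):
    """Return the 1-based epoch at which training would stop, or None."""
    best = float("inf")
    wait = 0  # consecutive epochs without improvement
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None


# Illustrative validation losses: the best value is reached at epoch 2.
history = [11.0015, 10.9809, 10.9945, 10.9967, 11.0184]
stop_epoch = early_stopping_epoch(history, patience=2)
print(stop_epoch)
```

With patience=2, the sketch stops after two consecutive epochs that fail to beat the best validation loss seen so far.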
Saving and restoring
(1) Weights only. Save a model's weights with tf.keras.Model.save_weights and load them back with tf.keras.Model.load_weights.
model = tf.keras.Sequential([
    layers.Dense(64, activation='relu'),
    layers.Dense(10, activation='softmax')])

model.compile(optimizer=tf.train.AdamOptimizer(0.001),
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Save weights to a TensorFlow Checkpoint file
model.save_weights('./weights/my_model')

# Restore the model's state,
# this requires a model with the same architecture.
model.load_weights('./weights/my_model')
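By default, this saves the model's weights in the TensorFlow checkpoint file format. Weights can also be saved in the Keras HDF5 format (the default for the multi-backend implementation of Keras).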
# Save weights to a HDF5 file
model.save_weights('my_model.h5', save_format='h5')

# Restore the model's state
model.load_weights('my_model.h5')
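(2) Configuration only. A model's configuration can be saved; this serializes the model architecture without any weights. A saved configuration can recreate and initialize the same model, even without the code that defined the original model. Keras supports the JSON and YAML serialization formats: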
# Serialize a model to JSON format
json_string = model.to_json()
json_string

'{"backend": "tensorflow", "keras_version": "2.1.6-tf", "config": {"name": "sequential_3", "layers": [{"config": {"units": 64, "kernel_regularizer": null, "activation": "relu", "bias_constraint": null, "trainable": true, "use_bias": true, "bias_initializer": {"config": {"dtype": "float32"}, "class_name": "Zeros"}, "activity_regularizer": null, "dtype": null, "kernel_constraint": null, "kernel_initializer": {"config": {"mode": "fan_avg", "seed": null, "distribution": "uniform", "scale": 1.0, "dtype": "float32"}, "class_name": "VarianceScaling"}, "name": "dense_17", "bias_regularizer": null}, "class_name": "Dense"}, {"config": {"units": 10, "kernel_regularizer": null, "activation": "softmax", "bias_constraint": null, "trainable": true, "use_bias": true, "bias_initializer": {"config": {"dtype": "float32"}, "class_name": "Zeros"}, "activity_regularizer": null, "dtype": null, "kernel_constraint": null, "kernel_initializer": {"config": {"mode": "fan_avg", "seed": null, "distribution": "uniform", "scale": 1.0, "dtype": "float32"}, "class_name": "VarianceScaling"}, "name": "dense_18", "bias_regularizer": null}, "class_name": "Dense"}]}, "class_name": "Sequential"}'

# Pretty-print the parsed JSON
import json
import pprint
pprint.pprint(json.loads(json_string))

{'backend': 'tensorflow',
 'class_name': 'Sequential',
 'config': {'layers': [{'class_name': 'Dense',
                        'config': {'activation': 'relu',
                                   'activity_regularizer': None,
                                   'bias_constraint': None,
                                   'bias_initializer': {'class_name': 'Zeros',
                                                        'config': {'dtype': 'float32'}},
                                   'bias_regularizer': None,
                                   'dtype': None,
                                   'kernel_constraint': None,
                                   'kernel_initializer': {'class_name': 'VarianceScaling',
                                                          'config': {'distribution': 'uniform',
                                                                     'dtype': 'float32',
                                                                     'mode': 'fan_avg',
                                                                     'scale': 1.0,
                                                                     'seed': None}},
                                   'kernel_regularizer': None,
                                   'name': 'dense_17',
                                   'trainable': True,
                                   'units': 64,
                                   'use_bias': True}},
                       {'class_name': 'Dense',
                        'config': {'activation': 'softmax',
                                   'activity_regularizer': None,
                                   'bias_constraint': None,
                                   'bias_initializer': {'class_name': 'Zeros',
                                                        'config': {'dtype': 'float32'}},
                                   'bias_regularizer': None,
                                   'dtype': None,
                                   'kernel_constraint': None,
                                   'kernel_initializer': {'class_name': 'VarianceScaling',
                                                          'config': {'distribution': 'uniform',
                                                                     'dtype': 'float32',
                                                                     'mode': 'fan_avg',
                                                                     'scale': 1.0,
                                                                     'seed': None}},
                                   'kernel_regularizer': None,
                                   'name': 'dense_18',
                                   'trainable': True,
                                   'units': 10,
                                   'use_bias': True}}],
            'name': 'sequential_3'},
 'keras_version': '2.1.6-tf'}

# Recreate the model (freshly initialized) from the JSON
fresh_model = tf.keras.models.model_from_json(json_string)

# Serialize the model to YAML format
yaml_string = model.to_yaml()
print(yaml_string)

backend: tensorflow
class_name: Sequential
config:
  layers:
  - class_name: Dense
    config:
      activation: relu
      activity_regularizer: null
      bias_constraint: null
      bias_initializer:
        class_name: Zeros
        config: {dtype: float32}
      bias_regularizer: null
      dtype: null
      kernel_constraint: null
      kernel_initializer:
        class_name: VarianceScaling
        config: {distribution: uniform, dtype: float32, mode: fan_avg, scale: 1.0, seed: null}
      kernel_regularizer: null
      name: dense_17
      trainable: true
      units: 64
      use_bias: true
  - class_name: Dense
    config:
      activation: softmax
      activity_regularizer: null
      bias_constraint: null
      bias_initializer:
        class_name: Zeros
        config: {dtype: float32}
      bias_regularizer: null
      dtype: null
      kernel_constraint: null
      kernel_initializer:
        class_name: VarianceScaling
        config: {distribution: uniform, dtype: float32, mode: fan_avg, scale: 1.0, seed: null}
      kernel_regularizer: null
      name: dense_18
      trainable: true
      units: 10
      use_bias: true
  name: sequential_3
keras_version: 2.1.6-tf

# Recreate the model from the YAML
fresh_model = tf.keras.models.model_from_yaml(yaml_string)
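Note: subclassed models are not serializable, because their architecture is defined by the Python code in the body of the call method.
(3) Entire model. The entire model can be saved to a single file that contains the weight values, the model's configuration, and even the optimizer's configuration. This lets you checkpoint a model and resume training later from exactly the same state, without access to the original code.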
# Create a trivial model
model = tf.keras.Sequential([
    layers.Dense(10, activation='softmax', input_shape=(32,)),
    layers.Dense(10, activation='softmax')
])
model.compile(optimizer='rmsprop',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.fit(data, labels, batch_size=32, epochs=5)

# Save entire model to a HDF5 file
model.save('my_model.h5')

# Recreate the exact same model, including weights and optimizer.
model = tf.keras.models.load_model('my_model.h5')

Epoch 1/5
1000/1000 [==============================] - 0s 297us/step - loss: 11.5009 - acc: 0.0980
Epoch 2/5
1000/1000 [==============================] - 0s 76us/step - loss: 11.4844 - acc: 0.0960
Epoch 3/5
1000/1000 [==============================] - 0s 77us/step - loss: 11.4791 - acc: 0.0850
Epoch 4/5
1000/1000 [==============================] - 0s 78us/step - loss: 11.4771 - acc: 0.1020
Epoch 5/5
1000/1000 [==============================] - 0s 79us/step - loss: 11.4763 - acc: 0.0900