This article is reposted from the WeChat public account Python中文社區 -- A Brief Look at Keras Extensibility: Customizing Keras
1. Customizing Keras
Keras is a deep learning API that lets you turn ideas into experiments quickly. It also ships with many pre-trained models that cover common tasks such as image classification. Since TensorFlow 2.0, TensorFlow itself has become very Keras-like.
On the other hand, Keras is highly modular and encapsulated, so some people feel it is hard to extend, for example to implement a new loss or a new layer type. In fact, you can build on Keras's base classes to extend it quickly and implement newer algorithms.
This article focuses on Keras's extensibility and summarizes how to customize layers, models, and losses.
2. Custom Keras layers
Layers are a core building block of Keras: every component of a network architecture is expressed as a layer. Keras provides many standard layers, such as convolution layers, pooling layers, activation layers, and dense layers, and we can subclass the base Layer class to build custom layers.
2.1 The base Layer
A Layer is a class that implements the operation mapping input tensors to output tensors. The five methods of the base Layer are listed below; to write a custom layer you only need to override them.
- __init__(): define the attributes of your custom layer
- build(self, input_shape): create the weights the layer needs
- call(self, *args, **kwargs): the actual computation of the layer, run automatically when the custom layer is called
- get_config(self): the layer's initialization configuration, returned as a dictionary
- compute_output_shape(self, input_shape): compute the shape of the output tensor
2.2 Examples
# Instance normalization layer
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.layers import Layer

class InstanceNormalize(Layer):
    def __init__(self, **kwargs):
        super(InstanceNormalize, self).__init__(**kwargs)
        self.epsilon = 1e-3

    def call(self, x, mask=None):
        # Normalize each sample over its spatial dimensions (axes 1 and 2)
        mean, var = tf.nn.moments(x, [1, 2], keepdims=True)
        return tf.divide(tf.subtract(x, mean), tf.sqrt(tf.add(var, self.epsilon)))

    def compute_output_shape(self, input_shape):
        return input_shape

# Usage
inputs = keras.Input(shape=(None, None, 3))
x = InstanceNormalize()(inputs)
# Weights can be created with add_weight()
class SimpleDense(Layer):
    def __init__(self, units=32):
        super(SimpleDense, self).__init__()
        self.units = units

    def build(self, input_shape):
        self.w = self.add_weight(shape=(input_shape[-1], self.units),
                                 initializer='random_normal',
                                 trainable=True)
        self.b = self.add_weight(shape=(self.units,),
                                 initializer='random_normal',
                                 trainable=True)

    def call(self, inputs):
        return tf.matmul(inputs, self.w) + self.b

# Usage
inputs = keras.Input(shape=(None, None, 3))
x = SimpleDense(units=64)(inputs)
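Neither example above overrides get_config(). As a minimal sketch (ConfigurableDense is a hypothetical layer introduced only for illustration), get_config() merges the base Layer configuration with the layer's own constructor arguments, so a model containing the layer can be saved and later reloaded via custom_objects:

# A minimal, hypothetical sketch: a dense layer whose get_config() records its
# constructor arguments so the layer can be serialized and reloaded.
import tensorflow as tf
from tensorflow.keras.layers import Layer

class ConfigurableDense(Layer):
    def __init__(self, units=32, **kwargs):
        super(ConfigurableDense, self).__init__(**kwargs)
        self.units = units

    def build(self, input_shape):
        self.w = self.add_weight(shape=(input_shape[-1], self.units),
                                 initializer='random_normal', trainable=True)
        self.b = self.add_weight(shape=(self.units,),
                                 initializer='random_normal', trainable=True)

    def call(self, inputs):
        return tf.matmul(inputs, self.w) + self.b

    def get_config(self):
        # Merge the base Layer config with this layer's own constructor arguments
        config = super(ConfigurableDense, self).get_config()
        config.update({'units': self.units})
        return config

# When reloading a saved model, register the custom layer via custom_objects:
# model = tf.keras.models.load_model('model.h5',
#                                    custom_objects={'ConfigurableDense': ConfigurableDense})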
3. Custom Keras models
After defining the network architecture, we put the whole workflow into keras.Model, call compile(), and then run training through fit(). While fit() executes, the Model's train_step(self, data) method is called for every batch of data.
from keras.models import Sequential
from keras.layers import Dense, Activation

model = Sequential()
model.add(Dense(units=64, input_dim=100))
model.add(Activation("relu"))
model.add(Dense(units=10))
model.add(Activation("softmax"))

model.compile(loss='categorical_crossentropy',
              optimizer='sgd',
              metrics=['accuracy'])

model.fit(x_train, y_train, epochs=5, batch_size=32)
When you need to control the training process yourself, you can override the Model's train_step(self, data) method:
import numpy as np
import tensorflow as tf
from tensorflow import keras

class CustomModel(keras.Model):
    def train_step(self, data):
        # Unpack the data. Its structure depends on your model and
        # on what you pass to `fit()`.
        x, y = data

        with tf.GradientTape() as tape:
            y_pred = self(x, training=True)  # Forward pass
            # Compute the loss value
            # (the loss function is configured in `compile()`)
            loss = self.compiled_loss(y, y_pred, regularization_losses=self.losses)

        # Compute gradients
        trainable_vars = self.trainable_variables
        gradients = tape.gradient(loss, trainable_vars)
        # Update weights
        self.optimizer.apply_gradients(zip(gradients, trainable_vars))
        # Update metrics (includes the metric that tracks the loss)
        self.compiled_metrics.update_state(y, y_pred)
        # Return a dict mapping metric names to current values
        return {m.name: m.result() for m in self.metrics}


# Construct and compile an instance of CustomModel
inputs = keras.Input(shape=(32,))
outputs = keras.layers.Dense(1)(inputs)
model = CustomModel(inputs, outputs)
model.compile(optimizer="adam", loss="mse", metrics=["mae"])

# Just use `fit` as usual
x = np.random.random((1000, 32))
y = np.random.random((1000, 1))
model.fit(x, y, epochs=3)
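The same pattern extends to evaluation: evaluate() calls test_step(self, data) for each batch, so overriding it customizes the evaluation loop as well. Below is a minimal sketch (CustomModelWithEval is a hypothetical name used only for illustration) that reuses the loss and metrics configured in compile():

# A minimal sketch: override test_step so that evaluate() also uses the
# compiled loss and metrics; CustomModelWithEval is a hypothetical example class.
class CustomModelWithEval(CustomModel):
    def test_step(self, data):
        x, y = data
        y_pred = self(x, training=False)  # Forward pass in inference mode
        # Update the loss and metrics configured in `compile()`
        self.compiled_loss(y, y_pred, regularization_losses=self.losses)
        self.compiled_metrics.update_state(y, y_pred)
        return {m.name: m.result() for m in self.metrics}

# eval_model = CustomModelWithEval(inputs, outputs)
# eval_model.compile(optimizer="adam", loss="mse", metrics=["mae"])
# eval_model.evaluate(x, y)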
4. Custom Keras losses
Keras already implements common losses such as cross-entropy, but defining your own loss is a fairly common need when using Keras, for example to implement modified losses like focal loss.
Let's look at how losses are implemented in the Keras source code:
# K is the Keras backend (keras.backend)
def categorical_crossentropy(y_true, y_pred):
    return K.categorical_crossentropy(y_true, y_pred)

def mean_squared_error(y_true, y_pred):
    return K.mean(K.square(y_pred - y_true), axis=-1)
As you can see, the inputs are the ground truth y_true and the prediction y_pred, and the return value is the computed loss. A custom loss can simply follow the same pattern.
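As a minimal sketch of that pattern, a custom loss is just a function of y_true and y_pred that returns the per-sample loss (weighted_mse below and its weighting factor are illustrative assumptions, not a library function). The focal loss that follows goes one step further and wraps the loss in an outer function so that hyper-parameters can be passed in:

import tensorflow as tf

# Illustrative sketch only: an MSE that weights positive targets (y_true > 0.5)
# twice as heavily; the factor 2.0 is an arbitrary example value.
def weighted_mse(y_true, y_pred):
    weights = tf.where(y_true > 0.5, tf.ones_like(y_true) * 2.0, tf.ones_like(y_true))
    return tf.reduce_mean(weights * tf.square(y_pred - y_true), axis=-1)

# model.compile(optimizer='adam', loss=weighted_mse)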
import tensorflow as tf
from tensorflow.python.ops import array_ops

def focal_loss(weights=None, alpha=0.25, gamma=2):
    r"""Compute focal loss for predictions.

    Multi-labels Focal loss formula:
        FL = -alpha * (z-p)^gamma * log(p) - (1-alpha) * p^gamma * log(1-p)
    where alpha = 0.25, gamma = 2, p = sigmoid(x), z = target_tensor.
    # https://github.com/ailias/Focal-Loss-implement-on-Tensorflow/blob/master/focal_loss.py

    Args:
        prediction_tensor: A float tensor of shape [batch_size, num_anchors, num_classes]
            representing the predicted logits for each class
        target_tensor: A float tensor of shape [batch_size, num_anchors, num_classes]
            representing one-hot encoded classification targets
        weights: A float tensor of shape [batch_size, num_anchors]
        alpha: A scalar tensor for focal loss alpha hyper-parameter
        gamma: A scalar tensor for focal loss gamma hyper-parameter

    Returns:
        loss: A (scalar) tensor representing the value of the loss function
    """
    def _custom_loss(y_true, y_pred):
        sigmoid_p = tf.nn.sigmoid(y_pred)
        zeros = array_ops.zeros_like(sigmoid_p, dtype=sigmoid_p.dtype)

        # For positive predictions, only the front part of the loss matters; the back part is 0.
        # target_tensor > zeros <=> z=1, so the positive coefficient = z - p.
        pos_p_sub = array_ops.where(y_true > zeros, y_true - sigmoid_p, zeros)

        # For negative predictions, only the back part of the loss matters; the front part is 0.
        # target_tensor > zeros <=> z=1, so the negative coefficient = 0.
        neg_p_sub = array_ops.where(y_true > zeros, zeros, sigmoid_p)

        per_entry_cross_ent = - alpha * (pos_p_sub ** gamma) * tf.math.log(tf.clip_by_value(sigmoid_p, 1e-8, 1.0)) \
                              - (1 - alpha) * (neg_p_sub ** gamma) * tf.math.log(tf.clip_by_value(1.0 - sigmoid_p, 1e-8, 1.0))
        return tf.reduce_sum(per_entry_cross_ent)

    return _custom_loss
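Since focal_loss() returns the inner _custom_loss function, its result can be passed straight to compile(). A brief usage sketch, where model, x_train, and y_train are placeholders from the earlier examples:

# Usage sketch: `model`, `x_train` and `y_train` are placeholders from earlier examples.
model.compile(optimizer='adam', loss=focal_loss(alpha=0.25, gamma=2))
model.fit(x_train, y_train, epochs=5, batch_size=32)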
5. Summary
This article has covered Keras's extension capabilities; extending Keras is essentially an inheritance-based use of its modular design.
In summary:
- Subclass Layer to implement a custom layer; remember build() and call().
- Subclass Model and implement train_step to define the training process; remember gradient computation with tape.gradient(loss, trainable_vars), weight updates with optimizer.apply_gradients, and metric updates with compiled_metrics.update_state(y, y_pred).
- For modified losses, remember that the inputs are the ground truth y_true and the prediction y_pred, and that the function returns the computed loss.