Summary of differences in TensorFlow 2


The main change is that model building has moved to Keras; see the previous post.

The main new differences are as follows:

1. Previously we could build tensors and layers through the tf.nn or tf.layers modules, and duplicated functionality appeared across them. The new version, while keeping those around, introduces a brand-new module, tensorflow.keras.layers.

tf.keras.layers.Dense(units, activation=None, use_bias=True)

For example, the fully connected layer is now built this way.
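For instance, a minimal sketch (the unit count and activation are chosen arbitrarily for illustration):

```python
import tensorflow as tf

# a fully connected layer: 10 units, relu activation, bias enabled
dense = tf.keras.layers.Dense(10, activation='relu', use_bias=True)

# the layer's weights are created lazily on the first call
out = dense(tf.ones([2, 5]))
print(out.shape)  # (2, 10)
```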

We can of course also subclass tf.keras.layers.Layer and override the build and call methods. build receives input_shape as its argument; call runs the layer when the instance is invoked. e.g.:

class MyDense(tf.keras.layers.Layer):
    def __init__(self, n_outputs):
        super(MyDense, self).__init__()
        self.n_outputs = n_outputs

    def build(self, input_shape):
        # add_variable is deprecated; add_weight is the current API
        self.kernel = self.add_weight('kernel',
                                      shape=[int(input_shape[-1]),
                                             self.n_outputs])

    def call(self, input):
        return tf.matmul(input, self.kernel)

layer = MyDense(10)
print(layer(tf.ones([6, 5])))
print(layer.trainable_variables)

 

# residual block
class ResnetBlock(tf.keras.Model):
    def __init__(self, kernel_size, filters):
        super(ResnetBlock, self).__init__(name='resnet_block')

        # number of filters for each sub-layer
        filter1, filter2, filter3 = filters

        # three sub-layers, each a convolution plus batch normalization
        # first sub-layer: 1x1 convolution
        self.conv1 = tf.keras.layers.Conv2D(filter1, (1, 1))
        self.bn1 = tf.keras.layers.BatchNormalization()
        # second sub-layer: uses the specified kernel_size
        self.conv2 = tf.keras.layers.Conv2D(filter2, kernel_size, padding='same')
        self.bn2 = tf.keras.layers.BatchNormalization()
        # third sub-layer: 1x1 convolution
        self.conv3 = tf.keras.layers.Conv2D(filter3, (1, 1))
        self.bn3 = tf.keras.layers.BatchNormalization()

    def call(self, inputs, training=False):
        # stack the sub-layers
        x = self.conv1(inputs)
        x = self.bn1(x, training=training)

        x = self.conv2(x)
        x = self.bn2(x, training=training)

        x = self.conv3(x)
        x = self.bn3(x, training=training)

        # residual connection (filter3 must match the input channel count)
        x += inputs
        outputs = tf.nn.relu(x)

        return outputs

resnetBlock = ResnetBlock(2, [6, 4, 9])
# test with dummy data
print(resnetBlock(tf.ones([1, 3, 9, 9])))
# inspect the variable names in the network
print([x.name for x in resnetBlock.trainable_variables])

2. There are now two ways to create a model: the functional style and the object-oriented (subclassing) style.


With the second (subclassing) approach, we can debug the forward pass much like in PyTorch, instead of the way we had to in TensorFlow 1.
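A minimal sketch of the subclassing style mentioned above (the layer sizes are arbitrary):

```python
import tensorflow as tf

class MyModel(tf.keras.Model):
    def __init__(self):
        super(MyModel, self).__init__()
        self.dense1 = tf.keras.layers.Dense(32, activation='relu')
        self.dense2 = tf.keras.layers.Dense(10, activation='softmax')

    def call(self, inputs):
        # intermediate tensors can be inspected here, PyTorch-style
        x = self.dense1(inputs)
        return self.dense2(x)

model = MyModel()
out = model(tf.ones([2, 16]))
print(out.shape)  # (2, 10)
```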

  

Both derive from the tf.keras.Model object, whose methods handle training, evaluation, and so on. For details see: https://www.cnblogs.com/king-lps/p/12743485.html


This is configured through model.compile, which sets the model's optimization targets. The metrics argument takes our evaluation metrics, including but not limited to recall and precision.
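For instance, precision and recall can be passed as built-in metric objects (a minimal sketch; the layer sizes and loss are arbitrary, assuming binary targets):

```python
import tensorflow as tf
from tensorflow import keras

model = keras.Sequential([
    keras.layers.Dense(16, activation='relu'),
    keras.layers.Dense(1, activation='sigmoid'),
])

# Precision and Recall are built-in keras.metrics classes
model.compile(optimizer=keras.optimizers.Adam(),
              loss='binary_crossentropy',
              metrics=[keras.metrics.Precision(), keras.metrics.Recall()])
```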

Functional style:

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

inputs = tf.keras.Input(shape=(784,), name='img')
# each layer's output is the next layer's input
h1 = layers.Dense(32, activation='relu')(inputs)
h2 = layers.Dense(32, activation='relu')(h1)
outputs = layers.Dense(10, activation='softmax')(h2)
model = tf.keras.Model(inputs=inputs, outputs=outputs, name='mnist_model')  # the name string must not contain spaces

model.summary()
keras.utils.plot_model(model, 'mnist_model.png')
keras.utils.plot_model(model, 'model_info.png', show_shapes=True)

(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
# scale pixel values into [0, 1]
x_train = x_train.reshape(60000, 784).astype('float32') / 255
x_test = x_test.reshape(10000, 784).astype('float32') / 255

model.compile(optimizer=keras.optimizers.RMSprop(),
              loss='sparse_categorical_crossentropy',  # passing the wrong loss API object here will error later
              metrics=['accuracy'])
history = model.fit(x_train, y_train, batch_size=64, epochs=5, validation_split=0.2)
test_scores = model.evaluate(x_test, y_test, verbose=0)
print('test loss:', test_scores[0])
print('test acc:', test_scores[1])

 

A model can also be created with tf.keras.Sequential(), as follows:

model = tf.keras.Sequential()
model.add(layers.Dense(32, activation='relu'))
model.add(layers.Dense(32, activation='relu'))
model.add(layers.Dense(10, activation='softmax'))

# ways to configure the optimization
model.compile(optimizer=tf.keras.optimizers.Adam(0.001),
              loss=tf.keras.losses.categorical_crossentropy,
              metrics=[tf.keras.metrics.categorical_accuracy])
# or
model.compile(optimizer=keras.optimizers.RMSprop(),
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
# or weight multiple optimization targets
model.compile(
    optimizer=keras.optimizers.RMSprop(1e-3),
    loss={'score_output': keras.losses.MeanSquaredError(),
          'class_output': keras.losses.CategoricalCrossentropy()},
    metrics={'score_output': [keras.metrics.MeanAbsolutePercentageError(),
                              keras.metrics.MeanAbsoluteError()],
             'class_output': [keras.metrics.CategoricalAccuracy()]},
    loss_weights={'score_output': 2., 'class_output': 1.})  # the argument is loss_weights, not loss_weight

# training
model.fit(train_x, train_y, batch_size=32, epochs=5)

# the callback below stops training once the monitored validation metric has
# improved by less than the threshold for the given number of epochs
callbacks = [
    keras.callbacks.EarlyStopping(
        # metric to watch for improvement
        monitor='val_loss',
        # minimum change that counts as an improvement
        min_delta=1e-2,
        # epochs with no improvement before stopping
        patience=2,
        verbose=1)
]
model.fit(x_train, y_train,
          epochs=20,
          batch_size=64,
          callbacks=callbacks,
          validation_split=0.2)

# the callback below saves the model under filepath whenever a new epoch
# improves the monitored metric
check_callback = keras.callbacks.ModelCheckpoint(
    # model path
    filepath='mymodel_{epoch}.h5',
    # keep only the best model
    save_best_only=True,
    # metric to monitor
    monitor='val_loss',
    # verbosity
    verbose=1
)

model.fit(x_train, y_train,
          epochs=3,
          batch_size=64,
          callbacks=[check_callback],
          validation_split=0.2)

# dynamic learning-rate schedule
initial_learning_rate = 0.1
lr_schedule = keras.optimizers.schedules.ExponentialDecay(
    # initial learning rate
    initial_learning_rate,
    # steps between decays
    decay_steps=10000,
    # decay factor
    decay_rate=0.96,
    staircase=True
)
optimizer = keras.optimizers.RMSprop(learning_rate=lr_schedule)
model.compile(
    optimizer=optimizer,
    loss=keras.losses.SparseCategoricalCrossentropy(),
    metrics=[keras.metrics.SparseCategoricalAccuracy()])

# save / load weights
model.save_weights('./model.h5', save_format='h5')
model.load_weights('./model.h5')

# test_scores contains the loss plus the metrics given to compile
test_scores = model.evaluate(x_test, y_test, verbose=0)
print('test loss:', test_scores[0])
print('test acc:', test_scores[1])

 

 

 

You can also rebuild a model's architecture from its config; the weights, of course, are not included.
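The usual APIs for this are get_config/from_config and the JSON round trip via to_json/model_from_json; a minimal sketch (the toy model is arbitrary):

```python
import tensorflow as tf
from tensorflow import keras

model = keras.Sequential([keras.layers.Dense(4, input_shape=(8,))])

# rebuild from an in-memory config dict (architecture only, no weights)
config = model.get_config()
rebuilt = keras.Sequential.from_config(config)

# or round-trip through a JSON string
json_string = model.to_json()
rebuilt2 = keras.models.model_from_json(json_string)
```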


TensorFlow 2 streamlined a great deal, but it also keeps the TensorFlow 1 style of training.

train_dataset = tf.data.Dataset.from_tensor_slices(x_train)
train_dataset = train_dataset.shuffle(buffer_size=1024).batch(64)

original_dim = 784
vae = VAE(original_dim, 64, 32)  # VAE model class defined elsewhere

# optimizer
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-3)
# loss function
mse_loss_fn = tf.keras.losses.MeanSquaredError()
# metric
loss_metric = tf.keras.metrics.Mean()

# training loop
for epoch in range(3):
    print('Start of epoch %d' % (epoch,))

    # per-batch training
    for step, x_batch_train in enumerate(train_dataset):
        with tf.GradientTape() as tape:
            # forward pass
            reconstructed = vae(x_batch_train)
            # compute the loss
            loss = mse_loss_fn(x_batch_train, reconstructed)
            loss += sum(vae.losses)  # add KLD regularization loss
        # compute gradients
        grads = tape.gradient(loss, vae.trainable_variables)
        # apply the update
        optimizer.apply_gradients(zip(grads, vae.trainable_variables))
        # track the metric
        loss_metric(loss)
        # report progress
        if step % 100 == 0:
            print('step %s: mean loss = %s' % (step, loss_metric.result()))

 

loss_history = []
for (batch, (images, labels)) in enumerate(dataset.take(400)):
    if batch % 10 == 0:
        print('.', end='')
    with tf.GradientTape() as tape:
        # get the predictions
        logits = mnist_model(images, training=True)
        # get the loss
        loss_value = loss_object(labels, logits)

    loss_history.append(loss_value.numpy().mean())
    # gradients for this batch
    grads = tape.gradient(loss_value, mnist_model.trainable_variables)
    # apply the update
    optimizer.apply_gradients(zip(grads, mnist_model.trainable_variables))

# plot the loss curve
import matplotlib.pyplot as plt
plt.plot(loss_history)
plt.xlabel('Batch #')
plt.ylabel('Loss [entropy]')

A callback is passed as an argument to fit and can be fully customized; a custom callback must subclass keras.callbacks.Callback.

LR_SCHEDULE = [
    # (epoch to start, learning rate) tuples
    (3, 0.05), (6, 0.01), (9, 0.005), (12, 0.001)
]

def lr_schedule(epoch, lr):
    if epoch < LR_SCHEDULE[0][0] or epoch > LR_SCHEDULE[-1][0]:
        return lr
    for i in range(len(LR_SCHEDULE)):
        if epoch == LR_SCHEDULE[i][0]:
            return LR_SCHEDULE[i][1]
    return lr

model = get_model()
_ = model.fit(x_train, y_train,
              batch_size=64,
              steps_per_epoch=5,
              epochs=15,
              verbose=0,
              callbacks=[LossAndErrorPrintingCallback(),
                         keras.callbacks.LearningRateScheduler(lr_schedule)])
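LossAndErrorPrintingCallback is used above without being defined; a minimal sketch of such a custom callback (what it prints is an assumption) could be:

```python
from tensorflow import keras

class LossAndErrorPrintingCallback(keras.callbacks.Callback):
    # print the loss at the end of every batch and every epoch
    def on_train_batch_end(self, batch, logs=None):
        print('batch %d, loss %.4f' % (batch, logs['loss']))

    def on_epoch_end(self, epoch, logs=None):
        print('epoch %d, mean loss %.4f' % (epoch, logs['loss']))
```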




NLP:
Common sentence padding:
raw_inputs = [
    [83, 91, 1, 645, 1253, 927],
    [73, 8, 3215, 55, 927],
    [711, 632, 71]
]
# pads on the left ('pre') by default
padded_inputs = tf.keras.preprocessing.sequence.pad_sequences(raw_inputs)
print(padded_inputs)
# right padding requires padding='post'
padded_inputs = tf.keras.preprocessing.sequence.pad_sequences(raw_inputs,
                                                              padding='post')
print(padded_inputs)

 

More generally:

train_x = keras.preprocessing.sequence.pad_sequences(
    train_x, value=word2id['<PAD>'],
    padding='post', maxlen=256
)
test_x = keras.preprocessing.sequence.pad_sequences(
    test_x, value=word2id['<PAD>'],
    padding='post', maxlen=256
)
print(train_x[0])
print('len: ', len(train_x[0]), len(train_x[1]))

 

Embedding layer:

inputs = tf.keras.Input(shape=(None,), dtype='int32')
x = layers.Embedding(input_dim=5000, output_dim=16, mask_zero=True)(inputs)
outputs = layers.LSTM(32)(x)

# or
class MyLayer(layers.Layer):

    def __init__(self, **kwargs):
        super(MyLayer, self).__init__(**kwargs)
        self.embedding = layers.Embedding(input_dim=5000, output_dim=16, mask_zero=True)
        self.lstm = layers.LSTM(32)

    def call(self, inputs):
        x = self.embedding(inputs)
        # the mask can also be built by hand
        mask = self.embedding.compute_mask(inputs)
        output = self.lstm(x, mask=mask)  # the layer will ignore the masked values
        return output


# scaled dot-product attention with masking, the core building block of the
# transformers widely used in NLP
def scaled_dot_product_attention(q, k, v, mask):
    # multiply query and key to get the match scores
    matmul_qk = tf.matmul(q, k, transpose_b=True)

    # scale by sqrt(dk)
    dk = tf.cast(tf.shape(k)[-1], tf.float32)
    scaled_attention_logits = matmul_qk / tf.math.sqrt(dk)

    # apply the mask
    if mask is not None:
        scaled_attention_logits += (mask * -1e9)

    # softmax over the last axis gives the attention weights
    attention_weights = tf.nn.softmax(scaled_attention_logits, axis=-1)

    # weight the values by the attention
    output = tf.matmul(attention_weights, v)  # (..., seq_len_q, depth_v)

    return output, attention_weights


An aside: a matplotlib plotting helper:
import matplotlib.pyplot as plt

history_dict = history.history
history_dict.keys()
acc = history_dict['accuracy']
val_acc = history_dict['val_accuracy']
loss = history_dict['loss']
val_loss = history_dict['val_loss']
epochs = range(1, len(acc) + 1)

plt.plot(epochs, loss, 'bo', label='train loss')
plt.plot(epochs, val_loss, 'b', label='val loss')
plt.title('Train and val loss')
plt.xlabel('Epochs')
plt.ylabel('Loss')  # was a duplicate xlabel call by mistake
plt.legend()
plt.show()

