- keras
- Building a neural network
- binary classification
- Multi-class classification
- Encoding formats
- Multi-label classification
- Keras callbacks
- history callback
- Early stopping your model
- EarlyStopping and ModelCheckpoint callbacks
- Saving models
- Learning curves
- Activation functions
- Batch size and batch normalization
- Hyperparameter tuning
- keras backend
- autoencoders
- CNN
- LSTM
(Figure: an intuitive illustration of a neural network.)

keras
Building a neural network
Add the layers one by one:
compiling: compile
training: fit
predicting: predict
evaluating the loss: evaluate
The Sequential API is the more convenient choice here.
Example: 2 input features and a single-neuron output layer.

# Import the Sequential model and Dense layer
from keras.models import Sequential
from keras.layers import Dense
# Create a Sequential model
model = Sequential()
# Add an input layer and a hidden layer with 10 neurons
model.add(Dense(10, input_shape=(2,), activation="relu"))
# Add a 1-neuron output layer
model.add(Dense(1))
# Summarise your model
model.summary()
A more complete example:
# Instantiate a Sequential model (numpy as np, plus time_steps, y_positions
# and plot_orbit, are assumed to be provided by the exercise environment)
import numpy as np
model = Sequential()
# Add a Dense layer with 50 neurons and an input of 1 neuron
model.add(Dense(50, input_shape=(1,), activation='relu'))
# Add two Dense layers with 50 neurons and relu activation
model.add(Dense(50, activation='relu'))
model.add(Dense(50, activation='relu'))
# End your model with a Dense layer and no activation
model.add(Dense(1))
# Compile your model
model.compile(optimizer = 'adam', loss = 'mse')
print("Training started..., this can take a while:")
# Fit your model on your data for 30 epochs
model.fit(time_steps, y_positions, epochs = 30)
# Evaluate your model
print("Final lost value:",model.evaluate(time_steps, y_positions))
# Predict the twenty minutes orbit
twenty_min_orbit = model.predict(np.arange(-10, 11))
# Plot the twenty minute orbit
plot_orbit(twenty_min_orbit)
binary classification
Use a sigmoid activation on the output Dense layer and you're done.
# Import the sequential model and dense layer
from keras.models import Sequential
from keras.layers import Dense
# Create a sequential model
model = Sequential()
# Add a dense layer
model.add(Dense(1, input_shape=(4,), activation='sigmoid'))
# Compile your model
model.compile(loss='binary_crossentropy', optimizer='sgd', metrics=['accuracy'])
# Display a summary of your model
model.summary()
# Train your model for 20 epochs (assume the dataset has already been split)
model.fit(X_train, y_train, epochs=20)
# Evaluate your model accuracy on the test set
accuracy = model.evaluate(X_test, y_test)[1]
# Print accuracy
print('Accuracy:',accuracy)
Multi-class classification
For multi-class classification the output activation is softmax rather than sigmoid.
So the multi-class workflow is:
define the input layer and a hidden layer
define more hidden layers
define the output layer, whose number of nodes is greater than one
demo
Define a network with three hidden layers and a 4-class output:
# Instantiate a sequential model
model = Sequential()
# Add 3 dense layers of 128, 64 and 32 neurons each
model.add(Dense(128, input_shape=(2,), activation='relu'))
model.add(Dense(64, activation='relu'))
model.add(Dense(32, activation='relu'))
# Add a dense layer with as many neurons as competitors
model.add(Dense(4, activation='softmax'))
# Compile your model using categorical_crossentropy loss
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
# Train your model on the training data for 200 epochs
model.fit(coord_train, competitors_train, epochs=200)
# Evaluate your model accuracy on the test data
accuracy = model.evaluate(coord_test, competitors_test)[1]
# Print accuracy
print('Accuracy:', accuracy)
Encoding formats
Label encoding
# Transform into a categorical variable
darts.competitor = pd.Categorical(darts.competitor)
# Assign a number to each category (label encoding)
darts.competitor = darts.competitor.cat.codes
# Print the label encoded competitors
print('Label encoded competitors: \n',darts.competitor.head())
<script.py> output:
Label encoded competitors:
0 2
1 3
2 1
3 0
4 2
Name: competitor, dtype: int8
One-hot encoding
# Transform into a categorical variable
darts.competitor = pd.Categorical(darts.competitor)
# Assign a number to each category (label encoding)
darts.competitor = darts.competitor.cat.codes
# Import to_categorical from keras utils module
from keras.utils import to_categorical
# Use to_categorical on your labels
coordinates = darts.drop(['competitor'], axis=1)
competitors = to_categorical(darts.competitor)
# Now print the to_categorical() result
print('One-hot encoded competitors: \n',competitors)
Multi-label classification
The output Dense layer now has more than one node, but the activation is still sigmoid.
# Instantiate a Sequential model
model = Sequential()
# Add a hidden layer of 64 neurons and a 20 neuron's input
model.add(Dense(64, input_shape=(20,), activation='relu'))
# Add an output layer of 3 neurons with sigmoid activation
model.add(Dense(3, activation='sigmoid'))
# Compile your model with adam and binary crossentropy loss
model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])
model.summary()
# Train for 100 epochs using a validation split of 0.2
model.fit(sensors_train, parcels_train, epochs=100, validation_split=0.2)
# Predict on sensors_test and round up the predictions
preds = model.predict(sensors_test)
preds_rounded = np.round(preds)
# Print rounded preds
print('Rounded Predictions: \n', preds_rounded)
# Evaluate your model's accuracy on the test data
accuracy = model.evaluate(sensors_test, parcels_test)[1]
# Print accuracy
print('Accuracy:', accuracy)
Keras callbacks
Using callbacks
A callback is a set of functions applied at given stages of the training procedure. You can use callbacks to get a view of the internal state and statistics of the model during training. You can pass a list of callbacks (via the callbacks keyword argument) to the .fit() method of the Sequential or Model classes, and the relevant callback methods will then be invoked at each stage of training. (keras.cn)
At the end of each training run / epoch / batch, whenever we want to perform some task such as checkpointing the model, writing logs, or computing the current AUC, Keras callbacks come in handy.

Callbacks can be used for things like (a minimal custom-callback sketch follows this list):
checkpointing and resuming: save all the weights of the current model
early stopping: stop training once the loss no longer decreases, while keeping the best model
dynamically adjusting training parameters, such as the optimizer's learning rate
and so on
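If none of the built-ins fit, you can subclass keras.callbacks.Callback yourself. A minimal sketch, assuming we just want to print the validation loss each epoch (the class name EpochLogger is made up):
from keras.callbacks import Callback

class EpochLogger(Callback):
    # Keras calls this at the end of every epoch; `logs` holds the
    # current metric values (loss, acc, val_loss, ...)
    def on_epoch_end(self, epoch, logs=None):
        logs = logs or {}
        print('Epoch', epoch, 'ended, val_loss =', logs.get('val_loss'))

# Used like any built-in callback:
# model.fit(x, y, validation_data=(x_val, y_val), callbacks=[EpochLogger()])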
EarlyStopping and ModelCheckpoint (transcribed from the book):
import keras
# Callbacks are passed to the model via the `callbacks` argument in `fit`,
# which takes a list of callbacks. You can pass any number of callbacks.
callbacks_list = [
    # This callback will interrupt training when we have stopped improving
    keras.callbacks.EarlyStopping(
        # This callback will monitor the training accuracy of the model
        monitor='acc',
        # Training will be interrupted when the accuracy
        # has stopped improving for *more* than 1 epoch (i.e. 2 epochs)
        patience=1,
    ),
    # This callback will save the current weights after every epoch
    keras.callbacks.ModelCheckpoint(
        filepath='my_model.h5',  # Path to the destination model file
        # The two arguments below mean that we will not overwrite the
        # model file unless `val_loss` has improved, which
        # allows us to keep the best model ever seen during training.
        monitor='val_loss',
        save_best_only=True,
    )
]
# Since we monitor `acc`, it should be part of the metrics of the model.
model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['acc'])
# Note that since ModelCheckpoint monitors `val_loss`,
# we need to pass some `validation_data` to our call to `fit`.
model.fit(x, y,
          epochs=10,
          batch_size=32,
          callbacks=callbacks_list,
          validation_data=(x_val, y_val))
monitor is the metric to track; here we track 'acc', the accuracy. patience is how many epochs the metric may fail to improve before training is stopped, here 1. ModelCheckpoint does the job of storing the best model: filepath is the save location and model name, with an .h5 suffix; monitor is the tracked metric, here the validation loss; and save_best_only=True means only the best result is kept.
validation_data is the validation set passed to fit().
Learning-rate reduction callback (transcribed from the book):
callbacks_list = [
    keras.callbacks.ReduceLROnPlateau(
        # This callback will monitor the validation loss of the model
        monitor='val_loss',
        # It will divide the learning rate by 10 when it gets triggered
        factor=0.1,
        # It will get triggered after the validation loss has stopped improving
        # for at least 10 epochs
        patience=10,
    )
]
# Note that since the callback monitors the validation loss,
# we need to pass some `validation_data` to our call to `fit`.
model.fit(x, y,
          epochs=10,
          batch_size=32,
          callbacks=callbacks_list,
          validation_data=(x_val, y_val))
In plain terms: if val_loss stops improving for 10 consecutive epochs, the learning rate is multiplied by 0.1.
history callback
# Train your model and save its history
history = model.fit(X_train, y_train, epochs=50,
                    validation_data=(X_test, y_test))
# Plot train vs test loss during training
# (plot_loss and plot_accuracy are plotting helpers provided by the exercise)
plot_loss(history.history['loss'], history.history['val_loss'])
# Plot train vs test accuracy during training
plot_accuracy(history.history['acc'], history.history['val_acc'])


Early stopping your model
The early stopping callback is useful since it allows you to stop training when the model no longer improves after a given number of epochs. To use it, pass the callback inside a list to the callbacks parameter of the .fit() method.
# Import the early stopping callback
from keras.callbacks import EarlyStopping
# Define a callback to monitor val_acc
monitor_val_acc = EarlyStopping(monitor='val_acc',
                                patience=5)
# Train your model using the early stopping callback
model.fit(X_train, y_train,
          epochs=1000, validation_data=(X_test, y_test),
          callbacks=[monitor_val_acc])
EarlyStopping and ModelCheckpoint callbacks
Stop training when the validation metric stops improving, and save the best model in HDF5 format.
# Import the EarlyStopping and ModelCheckpoint callbacks
from keras.callbacks import EarlyStopping, ModelCheckpoint
# Early stop on validation accuracy
monitor_val_acc = EarlyStopping(monitor = 'val_acc', patience = 3)
# Save the best model as best_banknote_model.hdf5
modelCheckpoint = ModelCheckpoint('best_banknote_model.hdf5', save_best_only = True)
# Fit your model for a stupid amount of epochs
history = model.fit(X_train, y_train,
                    epochs=10000000,
                    callbacks=[monitor_val_acc, modelCheckpoint],
                    validation_data=(X_test, y_test))
Saving models
HDF5 (.h5) format
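A minimal sketch of saving and restoring a trained model (the file name my_model.h5 is just an example):
from keras.models import load_model

# Saves architecture, weights, and optimizer state in one HDF5 file
model.save('my_model.h5')
# Later: restore the exact same model, ready to predict or resume training
restored_model = load_model('my_model.h5')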
Learning curves
Inspect the loss curve and the accuracy curve.
# Instantiate a Sequential model
model = Sequential()
# Input and hidden layer with input_shape, 16 neurons, and relu
model.add(Dense(16, input_shape = (64,), activation = 'relu'))
# Output layer with 10 neurons (one per digit) and softmax
model.add(Dense(10, activation = 'softmax'))
# Compile your model
model.compile(optimizer = 'adam', loss = 'categorical_crossentropy', metrics = ['accuracy'])
# Test if your model works and can process input data
print(model.predict(X_train))
# The train/test split here is the standard approach (this one is hold-out validation)
# Train your model for 60 epochs, using X_test and y_test as validation data
history = model.fit(X_train, y_train, epochs=60, validation_data=(X_test, y_test), verbose=0)
# Extract from the history object loss and val_loss to plot the learning curve
plot_loss(history.history['loss'], history.history['val_loss'])
for size in training_sizes:
    # Get a fraction of training data (we only care about the training data)
    X_train_frac, y_train_frac = X_train[:size], y_train[:size]
    # Reset the model to the initial weights and train it on the new data fraction
    model.set_weights(initial_weights)
    model.fit(X_train_frac, y_train_frac, epochs=50, callbacks=[early_stop])
    # Evaluate and store the train fraction and the complete test set results
    train_accs.append(model.evaluate(X_train_frac, y_train_frac)[1])
    test_accs.append(model.evaluate(X_test, y_test)[1])
# Plot train vs test accuracies
plot_results(train_accs, test_accs)

Activation functions
Internally a neural network is just numbers being multiplied and summed, then passed through an activation, as the figure illustrated.
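To make that concrete, a tiny NumPy sketch of a single neuron (the input, weight, and bias values are made up):
import numpy as np

def relu(z):
    # ReLU activation: max(0, z)
    return np.maximum(0, z)

x = np.array([0.5, -1.2])      # inputs
w = np.array([0.8, 0.3])       # weights
b = 0.1                        # bias
# A neuron multiplies, sums, then applies the activation
print(relu(np.dot(w, x) + b))  # ~0.14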

Batch size and batch normalization
Batch size, and normalizing each batch.
Batch normalization layer (Ioffe and Szegedy, 2014):
it normalizes the activations of the previous layer at each batch, i.e. applies a transformation that keeps the mean activation close to 0 and the activation standard deviation close to 1.
# Import batch normalization from keras layers
from keras.layers import BatchNormalization
# Build your deep network
batchnorm_model = Sequential()
batchnorm_model.add(Dense(50, input_shape=(64,), activation='relu', kernel_initializer='normal'))
batchnorm_model.add(BatchNormalization())
batchnorm_model.add(Dense(50, activation='relu', kernel_initializer='normal'))
batchnorm_model.add(BatchNormalization())
batchnorm_model.add(Dense(50, activation='relu', kernel_initializer='normal'))
batchnorm_model.add(BatchNormalization())
batchnorm_model.add(Dense(10, activation='softmax', kernel_initializer='normal'))
# Compile your model with sgd
batchnorm_model.compile(optimizer='sgd', loss='categorical_crossentropy', metrics=['accuracy'])
# Train your standard model, storing its history
# (standard_model is the same architecture without BatchNormalization,
# assumed to have been built earlier)
history1 = standard_model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=10, verbose=0)
# Train the batch normalized model you recently built, store its history
history2 = batchnorm_model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=10, verbose=0)
# Call compare_acc_histories passing in both model histories
compare_histories_acc(history1, history2)

Hyperparameter tuning
# Import KerasClassifier from keras wrappers
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import cross_val_score
# Create a KerasClassifier (create_model is a function that returns a compiled Keras model)
model = KerasClassifier(build_fn=create_model, epochs=50,
                        batch_size=128, verbose=0)
# Calculate the accuracy score for each fold
kfolds = cross_val_score(model, X, y, cv = 3)
# Print the mean accuracy
print('The mean accuracy was:', kfolds.mean())
# Print the accuracy standard deviation
print('With a standard deviation of:', kfolds.std())
So far this behaves just like any other scikit-learn estimator.
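Since the wrapped model behaves like a scikit-learn estimator, it can also be handed to RandomizedSearchCV for hyperparameter search. A hedged sketch: the search space below is hypothetical, and tuning 'activation' only works if your create_model function accepts an activation argument.
from sklearn.model_selection import RandomizedSearchCV

# Hypothetical search space
params = {'activation': ['relu', 'tanh'],
          'batch_size': [32, 128, 256],
          'epochs': [50, 100]}
random_search = RandomizedSearchCV(model, param_distributions=params, cv=3)
results = random_search.fit(X, y)
print('Best: %f using %s' % (results.best_score_, results.best_params_))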
keras backend
Revisit this section after getting familiar with the keras package.
Keras is a model-level library that provides high-level building blocks for constructing deep learning networks. Keras does not itself handle low-level operations such as tensor products or convolutions; these rely on a specialized, well-optimized tensor manipulation library, which serves as the "backend engine" of Keras. Keras offers three backend engines, Theano/TensorFlow/CNTK, and wraps their functions behind a uniform interface, so users can call functions of different backends through the same API.
Theano is an open-source symbolic tensor manipulation framework developed by the LISA/MILA Lab of Université de Montréal.
TensorFlow is a symbolic tensor manipulation framework developed by Google.
CNTK is a commercial-grade toolkit developed by Microsoft. (Keras Chinese docs)
from keras import backend as K
After this import, everything the K module exposes is the abstract Keras backend API.
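A small sketch of the backend-agnostic API in action (just standard backend functions; nothing model-specific assumed):
from keras import backend as K

# Build two small tensors through the backend-agnostic API
a = K.constant([[1.0, 2.0], [3.0, 4.0]])
b = K.constant([[1.0, 0.0], [0.0, 1.0]])
# dot, square, mean, ... are dispatched to whichever backend is active
c = K.mean(K.square(K.dot(a, b)))
print(K.eval(c))  # evaluates the symbolic expression; prints 7.5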
autoencoders
Autoencoders have several interesting applications, like anomaly detection or image denoising. They aim to produce an output identical to their input. The input is compressed into a lower-dimensional space (encoded); the model then learns to decode it back to its original form.
The autoencoder's input and output are the same: reduce the dimensionality, then expand back through a Dense layer.
# Start with a sequential model
autoencoder = Sequential()
# Add a dense layer with the original image as input
autoencoder.add(Dense(32, input_shape=(784, ), activation="relu"))
# Add an output layer with as many nodes as the image
autoencoder.add(Dense(784, activation="sigmoid"))
# Compile your model
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')
# Take a look at your model structure
autoencoder.summary()
<script.py> output:
Model: "sequential_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_1 (Dense) (None, 32) 25120
_________________________________________________________________
dense_2 (Dense) (None, 4) 132
=================================================================
Total params: 25,252
Trainable params: 25,252
Non-trainable params: 0
_________________________________________________________________
<script.py> output:
Model: "sequential_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
dense_1 (Dense) (None, 32) 25120
_________________________________________________________________
dense_2 (Dense) (None, 784) 25872
=================================================================
Total params: 50,992
Trainable params: 50,992
Non-trainable params: 0
_________________________________________________________________
Build the encoder from the autoencoder's first layer:
# Build your encoder
encoder = Sequential()
encoder.add(autoencoder.layers[0])
# Encode the noisy test images and show the encodings
# (X_test_noise and show_encodings are provided by the exercise environment)
preds = encoder.predict(X_test_noise)
show_encodings(preds)
# Predict on the noisy images with your autoencoder
decoded_imgs = autoencoder.predict(X_test_noise)
# Plot noisy vs decoded images
compare_plot(X_test_noise, decoded_imgs)


CNN

Convolutional neural networks. An image is a 3-D tensor (height × width × channels).
# Import the Conv2D and Flatten layers and instantiate the model
from keras.layers import Conv2D, Flatten
from keras.models import Model  # needed below to build the activation model
import matplotlib.pyplot as plt
model = Sequential()
# Add a convolutional layer of 32 filters of size 3x3
model.add(Conv2D(32, input_shape=(28, 28, 1), kernel_size=3, activation='relu'))
# Add a convolutional layer of 16 filters of size 3x3
model.add(Conv2D(16, kernel_size=3, activation='relu'))
# Flatten the previous layer output
model.add(Flatten())
# Add as many outputs as classes with softmax activation
model.add(Dense(10, activation='softmax'))
# Obtain a reference to the outputs of the first layer
layer_output = model.layers[0].output
# Build a model using the model input and the first layer output
first_layer_model = Model(inputs = model.input, outputs = layer_output)
# Use this model to predict on X_test
activations = first_layer_model.predict(X_test)
# Plot the first digit of X_test for the 15th filter
fig, axs = plt.subplots(1, 2)
axs[0].matshow(activations[0, :, :, 14], cmap='viridis')
# Do the same but for the 18th filter now
axs[1].matshow(activations[0,:,:,17], cmap = 'viridis')
plt.show()

ResNet50
A pretrained model.
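A minimal sketch of classifying an image with the pretrained ResNet50 ('img.jpg' is a hypothetical local file):
import numpy as np
from keras.applications.resnet50 import ResNet50, preprocess_input, decode_predictions
from keras.preprocessing import image

# Download ImageNet-pretrained weights and build the model
model = ResNet50(weights='imagenet')
# ResNet50 expects 224x224 RGB inputs
img = image.load_img('img.jpg', target_size=(224, 224))
x = preprocess_input(np.expand_dims(image.img_to_array(img), axis=0))
# Show the top-3 ImageNet classes with probabilities
print(decode_predictions(model.predict(x), top=3)[0])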
LSTM
# Split text into an array of words
words = text.split()
# Make lines of 4 words each, moving one word at a time
lines = []
for i in range(4, len(words)):
    lines.append(' '.join(words[i - 4:i]))
# Instantiate a Tokenizer, then fit it on the lines
from keras.preprocessing.text import Tokenizer
tokenizer = Tokenizer()
tokenizer.fit_on_texts(lines)
# Turn lines into a sequence of numbers
sequences = tokenizer.texts_to_sequences(lines)
print("Lines: \n {} \n Sequences: \n {}".format(lines[:5],sequences[:5]))
Build an LSTM:
# Import the Embedding, LSTM and Dense layer
from keras.layers import Embedding, LSTM, Dense
model = Sequential()
# Add an Embedding layer with the right parameters
model.add(Embedding(input_dim=vocab_size, output_dim=8, input_length=3))
# Add a 32 unit LSTM layer
model.add(LSTM(32))
# Add a hidden Dense layer of 32 units and an output layer of vocab_size with softmax
model.add(Dense(32, activation='relu'))
model.add(Dense(vocab_size, activation='softmax'))
model.summary()
<script.py> output:
Model: "sequential_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
embedding_1 (Embedding) (None, 3, 8) 352
_________________________________________________________________
lstm_1 (LSTM) (None, 32) 5248
_________________________________________________________________
dense_1 (Dense) (None, 32) 1056
_________________________________________________________________
dense_2 (Dense) (None, 44) 1452
=================================================================
Total params: 8,108
Trainable params: 8,108
Non-trainable params: 0
_________________________________________________________________
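The notes stop at the summary; a hedged sketch of how training and next-word generation could continue (assumes sequences, tokenizer, vocab_size, and the model above; predict_next_word is a made-up helper):
import numpy as np
from keras.utils import to_categorical

# First 3 words of each line are the input, the 4th word is the target
sequences = np.array(sequences)
X, y = sequences[:, :3], to_categorical(sequences[:, 3], num_classes=vocab_size)
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(X, y, epochs=100, verbose=0)

def predict_next_word(seed_text):
    # Encode the seed, keep its last 3 words, predict, decode the argmax
    seq = tokenizer.texts_to_sequences([seed_text])[0]
    pred = model.predict(np.array([seq[-3:]]))
    index_word = {i: w for w, i in tokenizer.word_index.items()}
    return index_word[np.argmax(pred)]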

