Keras series: Sequential and Model, basic Keras structure and features (Part 1)

Reposted article, 2019-07-07 16:45. Tags: DL, keras

Source: http://blog.csdn.net/sinat_26917383/article/details/72857454

Chinese documentation: http://keras-cn.readthedocs.io/en/latest/
Official documentation: https://keras.io/
The documentation referenced here mainly targets Keras 2.0.

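Keras series:

1. Keras series: Sequential and Model, basic Keras structure and features (Part 1)
2. Keras series: the five pretrained models in Application, and a walkthrough of VGG16 (Sequential style and Model style) (Part 2)
3. Keras series: multi-class image classification and fine-tuning with bottleneck features (Part 3)
4. Keras series: facial expression classification and recognition: OpenCV face detection + Keras emotion classification (Part 4)
5. Keras series: transfer learning: fine-tuning and prediction with InceptionV3, a complete example (Part 5)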

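Section 0: Introduction to Keras and basic model saving

The overview is drawn as mind maps to make it easier to read and understand.

1. Keras network structure

[Figure: mind map of the Keras network structure]

2. Keras network configuration

[Figure: mind map of the Keras network configuration]
Callbacks are arguably the essence of Keras.

3. Keras preprocessing features

[Figure: mind map of the Keras preprocessing features]

4. Extracting a model's node information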

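# Extract the node (configuration) information
config = model.get_config()  # pull out the model's configuration (roughly what Caffe keeps in solver.prototxt and train.prototxt)
model = Model.from_config(config)  # rebuild the model from it
# or, for Sequential:
model = Sequential.from_config(config)  # builds a new model; handy for reuse in other training runs and for fine-tuning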

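5. Inspecting a model (including its weights)

# 1. Print a summary of the model
model.summary()
# 2. Return a JSON string describing the model: architecture only, no weights.
#    The original model can be rebuilt from the JSON string:
from keras.models import model_from_json
json_string = model.to_json()
model = model_from_json(json_string)
# 3. model.to_yaml works like model.to_json; the model can likewise be rebuilt from the YAML string
from keras.models import model_from_yaml
yaml_string = model.to_yaml()
model = model_from_yaml(yaml_string)
# 4. Accessing layers and weights
model.get_layer(name=None, index=None)  # get a layer object by name or by index
model.get_weights()  # returns the model's weights as a list of numpy arrays
model.set_weights(weights)  # loads weights from a list of numpy arrays with the same shapes as model.get_weights()
# Inspect the layers of the model
model.layers  # list of the model's layer objects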

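6. Saving and loading a model

model.save_weights(filepath)
# Saves the model weights to the given path as an HDF5 file (extension .h5)

model.load_weights(filepath, by_name=False)
# Loads weights from an HDF5 file into the current model. By default the architecture is expected to be unchanged.
# To load weights into a different model that shares some layers, set by_name=True; only layers with matching names are loaded.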

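7. Limiting how much GPU memory Keras uses

This section comes from a collection of notes on sharing multiple GPUs between users with Theano/TensorFlow (see: Limit the resource usage for tensorflow backend · Issue #1538 · fchollet/keras · GitHub).
Keras often ends up occupying all of the GPU memory. This can be adjusted by resetting the backend session with a GPU memory limit.

import tensorflow as tf
from keras.backend.tensorflow_backend import set_session

config = tf.ConfigProto()
config.gpu_options.per_process_gpu_memory_fraction = 0.3  # fraction of GPU memory this process may use
set_session(tf.Session(config=config))

Note that although a memory fraction threshold is set in code or configuration, the program will still go past it at run time if it reaches the threshold and needs more. In other words, a run on a large dataset will still use more GPU memory. The limit above only avoids wasting GPU memory when running small datasets. (Added 20 February 2017.)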

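8. A more principled way to train and save models

from keras.callbacks import ModelCheckpoint

filepath = 'model-ep{epoch:03d}-loss{loss:.3f}-val_loss{val_loss:.3f}.h5'
checkpoint = ModelCheckpoint(filepath, monitor='val_loss', verbose=1, save_best_only=True, mode='min')
# fit model
model.fit(x, y, epochs=20, verbose=2, callbacks=[checkpoint], validation_data=(x, y))

With save_best_only turned on, the log looks like this:

 ETA: 3s - loss: 0.5820Epoch 00017: val_loss did not improve

If val_loss improves (that is, decreases, since mode='min'), the model is saved; if it does not improve, nothing is saved.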
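The behaviour described above can be sketched in plain Python. This is only an illustration of the idea, not Keras's implementation: it assumes the checkpoint file name is produced by formatting the template with the 1-based epoch number and the logged metrics, and the history values are made up for the example.

```python
# Sketch of save_best_only with mode='min'; the log values below are hypothetical.
FILEPATH = 'model-ep{epoch:03d}-loss{loss:.3f}-val_loss{val_loss:.3f}.h5'


def checkpoints_written(history, filepath=FILEPATH):
    """Return the file names a save-best-only checkpoint would write.

    history is a list of per-epoch dicts holding 'loss' and 'val_loss'.
    """
    best = float('inf')
    written = []
    for epoch, logs in enumerate(history, start=1):
        if logs['val_loss'] < best:  # "improved" means a lower val_loss
            best = logs['val_loss']
            written.append(filepath.format(epoch=epoch, **logs))
    return written


# val_loss improves in epochs 1 and 2 but not in epoch 3.
history = [
    {'loss': 0.700, 'val_loss': 0.650},
    {'loss': 0.600, 'val_loss': 0.610},
    {'loss': 0.582, 'val_loss': 0.620},
]
saved = checkpoints_written(history)
print(saved)
```

Only the first two epochs produce a file here; the third is skipped because its val_loss is higher than the best seen so far.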

9. Using TensorBoard in Keras

from keras.callbacks import TensorBoard, ModelCheckpoint, EarlyStopping

# locals() returns a dict of all local variables at the current position,
# so RUN is incremented every time this code is re-run in the same session.
RUN = RUN + 1 if 'RUN' in locals() else 1
LOG_DIR = model_save_path + '/training_logs/run{}'.format(RUN)
# Where the log files and the saved .hdf5 model files are stored
LOG_FILE_PATH = LOG_DIR + '/checkpoint-{epoch:02d}-{val_loss:.4f}.hdf5'

tensorboard = TensorBoard(log_dir=LOG_DIR, write_images=True)
checkpoint = ModelCheckpoint(filepath=LOG_FILE_PATH, monitor='val_loss', verbose=1, save_best_only=True)
early_stopping = EarlyStopping(monitor='val_loss', patience=5, verbose=1)

history = model.fit_generator(generator=gen.generate(True), steps_per_epoch=int(gen.train_batches / 4),
                              validation_data=gen.generate(False), validation_steps=int(gen.val_batches / 4),
                              epochs=EPOCHS, verbose=1, callbacks=[tensorboard, checkpoint, early_stopping])
  
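All of these take effect through callbacks:

EarlyStopping
(1) patience: once early stopping is triggered (for example, the monitored loss has stopped decreasing), training stops after patience further epochs without improvement.
(2) mode: one of 'auto', 'min', 'max'. In min mode, training stops when the monitored value stops decreasing. In max mode, training stops when the monitored value stops increasing.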
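The patience rule can be sketched in a few lines of plain Python. This is a simplified illustration in min mode, not the Keras source: the function name and the loss values are made up, and options such as min_delta are left out.

```python
def stopping_epoch(val_losses, patience):
    """Return the 1-based epoch at which training would stop, or None.

    Training stops once the monitored value has failed to improve on the
    best value seen so far for `patience` consecutive epochs.
    """
    best = float('inf')
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:      # improvement: remember it and reset the counter
            best = loss
            wait = 0
        else:                # no improvement this epoch
            wait += 1
            if wait >= patience:
                return epoch
    return None


# Hypothetical validation losses, one per epoch.
losses = [0.9, 0.7, 0.6, 0.61, 0.62, 0.60, 0.59, 0.63]
print(stopping_epoch(losses, patience=3))  # stops at epoch 6
print(stopping_epoch(losses, patience=5))  # never stops within these epochs
```

With patience=3 the run ends before the better value at epoch 7 is ever seen, which is why a larger patience is often paired with save_best_only checkpoints.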
  
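ModelCheckpoint
(1) save_best_only: when set to True, only the model that performs best on the validation set is saved.
(2) mode: one of 'auto', 'min', 'max'. With save_best_only=True it decides what counts as the best model. For example, when monitoring val_acc the mode should be max, and when monitoring val_loss it should be min. In auto mode the criterion is inferred from the name of the monitored value.
(3) save_weights_only: if True, only the model weights are saved; otherwise the whole model is saved (including architecture and configuration).
(4) period: number of epochs between checkpoints.

TensorBoard visualization
write_images: whether to visualize the model weights as images.

For everything else, see the Keras Chinese documentation.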
                  
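Section 1: The Sequential model
The Sequential model is a simplified version of the functional model. It is the simplest structure: a linear stack that runs from start to finish without branching.
Basic components of a Sequential model
The usual steps are:

1. model.add: add layers;
2. model.compile: configure how the model is trained with backpropagation;
3. model.fit: set the training parameters and train;
4. model evaluation;
5. model prediction.

1. add: adding layers (comparable to Caffe's train_val.prototxt)

add takes only a layer. In a Sequential model you can also call model.add(other_model) to append another model. The functional model handles this differently; see the functional model section.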
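2. compile: training configuration (comparable to Caffe's solver.prototxt)
compile(self, optimizer, loss, metrics=None, sample_weight_mode=None, **kwargs)

Where:
optimizer: a string (name of a predefined optimizer) or an optimizer object; see the optimizers documentation.
loss: a string (name of a predefined loss function) or an objective function; see the losses documentation.
metrics: a list of metrics used to evaluate the network during training and testing; typical usage is metrics=['accuracy'].
sample_weight_mode: set this to "temporal" if you need to weight samples per timestep (2D weight matrix). The default, None, means per-sample weighting (1D weights). The description of fit below has more on this.
kwargs: ignore this when using the TensorFlow backend; with the Theano backend, the values in kwargs are passed to K.function.
Note:
The model must be compiled before use, otherwise calling fit or evaluate raises an exception.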
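3. fit: training parameters + training (comparable to train.sh plus part of solver.prototxt)
fit(self, x, y, batch_size=32, epochs=10, verbose=1, callbacks=None, validation_split=0.0, validation_data=None, shuffle=True, class_weight=None, sample_weight=None, initial_epoch=0)

This function trains the model for epochs epochs. Its parameters are:

x: input data. If the model has a single input, x is a numpy array; if the model has multiple inputs, x should be a list of numpy arrays, one per input.
y: labels, a numpy array.
batch_size: integer, the number of samples in each batch for gradient descent. Each batch produces one gradient update, moving the objective one optimization step.
epochs: integer, the number of training epochs; each epoch is one full pass over the training set.
verbose: logging mode. 0 prints nothing to standard output, 1 prints a progress bar, 2 prints one line per epoch.
callbacks: a list of keras.callbacks.Callback objects. The callbacks in this list are invoked at the appropriate points during training; see the callbacks documentation.
validation_split: float between 0 and 1, the fraction of the training data to hold out as a validation set. The validation set is not used for training; the model's metrics, such as loss and accuracy, are evaluated on it at the end of each epoch. Note that the split is made before shuffling, so if your data is ordered you need to shuffle it yourself before specifying validation_split, otherwise the validation set may be unrepresentative.
validation_data: a tuple of the form (X, y) giving an explicit validation set. This parameter overrides validation_split.
shuffle: boolean or string, usually a boolean, indicating whether to shuffle the input samples during training. The string "batch" is a special case for HDF5 data; it shuffles the data within batch-sized chunks.
class_weight: a dict mapping classes to weights, used to scale the loss function during training (training only).
sample_weight: a numpy array of weights used to scale the loss function during training (training only). You can pass a 1D vector with the same length as the samples to weight each sample one-to-one, or, for temporal data, a matrix of shape (samples, sequence_length) to give a different weight to every timestep of every sample. In that case make sure to pass sample_weight_mode='temporal' when compiling the model.
initial_epoch: the epoch at which to start training; useful for resuming an earlier training run.

fit returns a History object. Its History.history attribute records how the loss and the other metrics change over the epochs, and, if there is a validation set, the validation metrics as well.
Note:
Do not confuse this with fit_generator, described later; the two take different forms of x/y input.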
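The warning about validation_split and ordered data is easier to see with a small sketch. The helper below is hypothetical and only mimics the idea of holding out the last fraction of the data before any shuffling; the toy data is invented for the example.

```python
def split_validation(x, y, validation_split):
    """Hold out the LAST fraction of the samples, without shuffling first."""
    split_at = int(len(x) * (1.0 - validation_split))
    train = (x[:split_at], y[:split_at])
    val = (x[split_at:], y[split_at:])
    return train, val


# Ten samples sorted by label: five of class 0 followed by five of class 1.
x = list(range(10))
y = [0] * 5 + [1] * 5

(train_x, train_y), (val_x, val_y) = split_validation(x, y, validation_split=0.2)
print(val_x, val_y)  # the validation set contains class 1 only
```

Because the data is sorted by label, the held-out tail contains a single class. Shuffling x and y together before the split avoids this.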
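4. evaluate: model evaluation
evaluate(self, x, y, batch_size=32, verbose=1, sample_weight=None)

This function computes the model's loss on the given input data, batch by batch. Its parameters are:

x: input data; as with fit, a numpy array or a list of numpy arrays.
y: labels, a numpy array.
batch_size: integer, same meaning as in fit.
verbose: same meaning as in fit, but only 0 or 1 is allowed.
sample_weight: numpy array, same meaning as in fit.

This function returns a scalar test loss (if the model has no other metrics) or a list of scalars (if the model has other metrics). model.metrics_names gives the meaning of each value in the list.
Unless stated otherwise, the parameters of the functions below have the same meaning as the parameters of the same name in fit.
Unless stated otherwise, the verbose parameter of the functions below (if present) only accepts 0 or 1.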
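5. predict: model prediction
predict(self, x, batch_size=32, verbose=0)
predict_classes(self, x, batch_size=32, verbose=1)
predict_proba(self, x, batch_size=32, verbose=1)
predict computes the output for the input data batch by batch.
It returns the predictions as a numpy array.
predict_classes: produces class predictions for the input data, batch by batch.
predict_proba: produces the probability of each class for the input data, batch by batch.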
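6. on_batch: working with and checking a single batch
train_on_batch(self, x, y, class_weight=None, sample_weight=None)
test_on_batch(self, x, y, sample_weight=None)
predict_on_batch(self, x)

train_on_batch: performs a single parameter update on one batch of data. It returns a scalar training loss or a list of scalars, as evaluate does.
test_on_batch: evaluates the model on one batch of samples. The return value is the same as for evaluate.
predict_on_batch: runs the model on one batch of samples and returns the predictions for that batch.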
                 
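7. fit_generator
# Trains the model on batches of data produced one at a time by a Python generator.
# The generator runs in parallel with the model for efficiency.
# For example, this allows real-time data augmentation on the CPU while the model trains on the GPU.
# Reference: http://keras-cn.readthedocs.io/en/latest/models/sequential/

With this function, image classification training becomes very simple.
fit_generator(self, generator, steps_per_epoch, epochs=1, verbose=1, callbacks=None, validation_data=None, validation_steps=None, class_weight=None, max_q_size=10, workers=1, pickle_safe=False, initial_epoch=0)
# Example:
def generate_arrays_from_file(path):
    while 1:
        f = open(path)
        for line in f:
            # create Numpy arrays of input data
            # and labels, from each line in the file
            x, y = process_line(line)  # process_line is a user-supplied parser
            yield (x, y)
        f.close()

model.fit_generator(generate_arrays_from_file('/my_file.txt'),
                    steps_per_epoch=10000, epochs=10)  # Keras 2.0 name; Keras 1.x used samples_per_epoch

The two companion functions:
evaluate_generator(self, generator, steps, max_q_size=10, workers=1, pickle_safe=False)
predict_generator(self, generator, steps, max_q_size=10, workers=1, pickle_safe=False, verbose=0)

evaluate_generator: evaluates the model using a generator as the data source. The generator should yield data of the same kind that test_on_batch accepts. Parameters with the same name as in fit_generator have the same meaning; steps is the number of batches to draw from the generator.
predict_generator: generates predictions using a generator as the data source. The generator should yield data of the same kind that test_on_batch accepts. Parameters with the same name as in fit_generator have the same meaning; steps is the number of batches to draw from the generator.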
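The contract between an endless generator and steps_per_epoch can be sketched without Keras. The generator and sample data below are made up for illustration; the point is that the generator never terminates, so the caller has to say how many batches make up one epoch.

```python
import itertools
import math


def batch_generator(samples, batch_size):
    """Yield batches forever, restarting from the beginning after each pass."""
    while True:
        for start in range(0, len(samples), batch_size):
            yield samples[start:start + batch_size]


samples = list(range(10))   # stand-in for training samples
batch_size = 4
steps_per_epoch = math.ceil(len(samples) / batch_size)  # 3 batches cover all 10 samples

gen = batch_generator(samples, batch_size)
epoch_1 = list(itertools.islice(gen, steps_per_epoch))
epoch_2 = list(itertools.islice(gen, steps_per_epoch))
print(epoch_1)
print(epoch_2)
```

Drawing steps_per_epoch batches twice yields the same three batches each time, which is what one pass over the data per epoch means here.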
Case 1: simple binary classification
For a single-input model with 2 classes (binary classification):
from keras.models import Sequential
from keras.layers import Dense, Activation

# Build the model
model = Sequential()
model.add(Dense(32, activation='relu', input_dim=100))
# Dense(32) is a fully-connected layer with 32 hidden units.
model.add(Dense(1, activation='sigmoid'))
model.compile(optimizer='rmsprop',
              loss='binary_crossentropy',
              metrics=['accuracy'])
Where:
Sequential() instantiates the model class;
Dense is a fully connected layer, here with 32 units followed by a relu activation, taking 100-dimensional input;
model.add appends a new fully connected layer;
compile sets the training parameters, much like Caffe's solver.prototxt.
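# Generate dummy data
import numpy as np
data = np.random.random((1000, 100))
labels = np.random.randint(2, size=(1000, 1))
# Train the model, iterating on the data in batches of 32 samples
model.fit(data, labels, nb_epoch=10, batch_size=32)  # Keras 1.2 argument name; Keras 2.0 uses epochs=10

I ran into the error below before, and it was caused by a version mismatch: Keras 1.2 uses nb_epoch, whereas Keras 2.0 uses epochs=10.
  error:
    TypeError: Received unknown keyword arguments: {'epochs': 10}

Where:
One epoch is one full pass over the training set, so samples per epoch = batch_size × iterations per epoch. Ten epochs means training on the whole training set ten times.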
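The relationship between epochs, batch size and iterations can be checked with the numbers from this example (1000 samples, batch size 32, 10 epochs). This is plain arithmetic; the variable names are chosen just for this sketch.

```python
import math

num_samples = 1000
batch_size = 32
epochs = 10

# Iterations (weight updates) needed for one pass over the data.
iterations_per_epoch = math.ceil(num_samples / batch_size)

# The last batch of each epoch is smaller because 1000 is not a multiple of 32.
last_batch_size = num_samples - (iterations_per_epoch - 1) * batch_size

total_updates = iterations_per_epoch * epochs

print(iterations_per_epoch, last_batch_size, total_updates)  # 32 8 320
```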
The final code is based on keras==1.2 (hence nb_epoch) and is the same as the listing above.
                   
                   
                    
                   
                 
  
  
 Case 2: multi-class classification with a VGG-like convolutional network
 import numpy as np
 import keras
 from keras.models import Sequential
 from keras.layers import Dense, Dropout, Flatten
 from keras.layers import Conv2D, MaxPooling2D
 from keras.optimizers import SGD
 from keras.utils import np_utils
 # Generate dummy data
 x_train = np.random.random((100, 100, 100, 3))
 # 100 images, each 100 x 100 x 3
 y_train = keras.utils.to_categorical(np.random.randint(10, size=(100, 1)), num_classes=10)
 # shape 100 x 10
 x_test = np.random.random((20, 100, 100, 3))
 y_test = keras.utils.to_categorical(np.random.randint(10, size=(20, 1)), num_classes=10)
 # shape 20 x 10
 model = Sequential()
 # input: 100x100 images with 3 channels -> (100, 100, 3) tensors.
 # this applies 32 convolution filters of size 3x3 each.
 model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(100, 100, 3)))
 model.add(Conv2D(32, (3, 3), activation='relu'))
 model.add(MaxPooling2D(pool_size=(2, 2)))
 model.add(Dropout(0.25))
 model.add(Conv2D(64, (3, 3), activation='relu'))
 model.add(Conv2D(64, (3, 3), activation='relu'))
 model.add(MaxPooling2D(pool_size=(2, 2)))
 model.add(Dropout(0.25))
 model.add(Flatten())
 model.add(Dense(256, activation='relu'))
 model.add(Dropout(0.5))
 model.add(Dense(10, activation='softmax'))
 sgd = SGD(lr=0.01, decay=1e-6, momentum=0.9, nesterov=True)
 model.compile(loss='categorical_crossentropy', optimizer=sgd)
 model.fit(x_train, y_train, batch_size=32, epochs=10)
 score = model.evaluate(x_test, y_test, batch_size=32)
  
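 This is a standard Sequential network trained on one-hot labels.
Note:
One point matters a great deal for a beginner like me: what is this step for?
 keras.utils.to_categorical

 Especially in multi-class tasks: I used to think the labels could be passed in as a single column of shape (100,), but Keras does not accept that with the categorical_crossentropy loss. This extra step is needed to convert the labels into the one-hot format that Keras expects.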
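What this conversion does can be shown with a tiny pure-Python sketch. The one_hot helper is written for this illustration only and is not the Keras function: it turns a column of integer class labels into one row per sample with a single 1 in the position of the class.

```python
def one_hot(labels, num_classes):
    """Convert integer class labels into one-hot rows."""
    encoded = []
    for label in labels:
        row = [0.0] * num_classes
        row[label] = 1.0   # mark the position of the true class
        encoded.append(row)
    return encoded


labels = [0, 2, 1]          # three samples, three possible classes
encoded = one_hot(labels, num_classes=3)
for row in encoded:
    print(row)
```

Each row has the same length as the softmax output of the network (10 in the example above), which is what lets categorical_crossentropy compare the two element by element.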
  
 The final code is based on Keras==2.0 and is the same as the listing above.
                   
                   
                    
                   
                 
  
 Case 3: sequence classification with an LSTM
 A separate, dedicated post covers this in detail.
                 
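 Section 2: The Model (functional) API
 From the Keras Chinese documentation: http://keras-cn.readthedocs.io/en/latest/
It is more complex than the Sequential model but works very well: inputs can be fed in together or at different stages, and the outputs you want can be taken at different stages.
In short, whenever your model is not a single straight path from input to output like VGG, or it needs more than one output, you should choose the functional model.
 Difference:
The way the code is written is completely different.
 Basic attributes and training workflow of the functional model
 The usual steps are:
1. model.layers: layer information;
2. model.compile: configure how the model is trained with backpropagation;
3. model.fit: set the training parameters and train;
4. evaluate: model evaluation;
5. predict: model prediction.
 1. Commonly used Model attributes
 model.layers: the layers that make up the model graph
model.inputs: list of the model's input tensors
model.outputs: list of the model's output tensors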
  
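 2. compile: training configuration (comparable to Caffe's solver.prototxt)
 compile(self, optimizer, loss, metrics=None, loss_weights=None, sample_weight_mode=None)

 This function compiles the model for training. Its parameters are:
 optimizer: the optimizer, either the name of a predefined optimizer or an optimizer object; see the optimizers documentation.
loss: the loss function, either the name of a predefined loss or an objective function; see the losses documentation.
metrics: a list of metrics used to evaluate the model during training and testing; typical usage is metrics=['accuracy']. To use different metrics for different outputs of a multi-output model, pass a dict instead, for example metrics={'output_a': 'accuracy'}.
sample_weight_mode: set this to "temporal" if you need to weight samples per timestep (2D weight matrix). The default, None, means per-sample weighting (1D weights).
If the model has multiple outputs, you can pass a dict or list of sample_weight_mode values. The description of fit below has more on this.
 [Tips] If you only load a model and call predict on it, you do not need to compile it. In Keras, compile mainly configures the loss function and the optimizer, which serve training. predict compiles its symbolic function internally (by calling _make_predict_function).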
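 3. fit: training parameters + training
 fit(self, x=None, y=None, batch_size=32, epochs=1, verbose=1, callbacks=None, validation_split=0.0, validation_data=None, shuffle=True, class_weight=None, sample_weight=None, initial_epoch=0)

 This function trains the model. Its parameters are:

x: input data. If the model has a single input, x is a numpy array; if the model has multiple inputs, x should be a list of numpy arrays, one per input. If every input of the model is named, you can also pass a dict mapping input names to their data.
y: labels, a numpy array. If the model has multiple outputs, you can pass a list of numpy arrays. If the outputs are named, you can pass a dict mapping output names to their labels.
batch_size: integer, the number of samples in each batch for gradient descent. Each batch produces one gradient update, moving the objective one optimization step.
epochs: integer, the number of training epochs; the training data is traversed epochs times. (Keras 1.x called this nb_epoch; variables starting with nb in Keras mean "number of".)
verbose: logging mode. 0 prints nothing to standard output, 1 prints a progress bar, 2 prints one line per epoch.
callbacks: a list of keras.callbacks.Callback objects. The callbacks in this list are invoked at the appropriate points during training; see the callbacks documentation.
validation_split: float between 0 and 1, the fraction of the training data to hold out as a validation set. The validation set is not used for training; the model's metrics, such as loss and accuracy, are evaluated on it at the end of each epoch. Note that the split is made before shuffling, so if your data is ordered you need to shuffle it yourself before specifying validation_split, otherwise the validation set may be unrepresentative.
validation_data: a tuple of the form (X, y) or (X, y, sample_weights) giving an explicit validation set. This parameter overrides validation_split.
shuffle: boolean, whether to shuffle the input samples before each epoch during training.
class_weight: a dict mapping classes to weights, used to scale the loss function during training (training only). With imbalanced training data (some classes have very few samples), it makes the loss pay more attention to the under-represented classes.
sample_weight: a numpy array of weights used to scale the loss function during training (training only). You can pass a 1D vector with the same length as the samples to weight each sample one-to-one, or, for temporal data, a matrix of shape (samples, sequence_length) to give a different weight to every timestep of every sample. In that case make sure to pass sample_weight_mode='temporal' when compiling the model.
initial_epoch: the epoch at which to start training; useful for resuming an earlier training run.

 An error is raised if the input data does not match what the model expects.
 fit returns a History object. Its History.history attribute records how the loss and the other metrics change over the epochs, and, if there is a validation set, the validation metrics as well.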
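 4. evaluate: model evaluation
 evaluate(self, x, y, batch_size=32, verbose=1, sample_weight=None)

 This function computes the model's loss on the given input data, batch by batch. Its parameters are:

x: input data; as with fit, a numpy array or a list of numpy arrays.
y: labels, a numpy array.
batch_size: integer, same meaning as in fit.
verbose: same meaning as in fit, but only 0 or 1 is allowed.
sample_weight: numpy array, same meaning as in fit.

 This function returns a scalar test loss (if the model has no other metrics) or a list of scalars (if the model has other metrics). model.metrics_names gives the meaning of each value in the list.
 Unless stated otherwise, the parameters of the functions below have the same meaning as the parameters of the same name in fit.
Unless stated otherwise, the verbose parameter of the functions below (if present) only accepts 0 or 1.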
 5. predict: model prediction
 predict(self, x, batch_size=32, verbose=0)

 This function computes the output for the input data batch by batch.
 It returns the predictions as a numpy array.
 Checking the model: on_batch
 train_on_batch(self, x, y, class_weight=None, sample_weight=None)
 test_on_batch(self, x, y, sample_weight=None)
 predict_on_batch(self, x)
 train_on_batch: performs a single parameter update on one batch of data. It returns a scalar training loss or a list of scalars, as evaluate does.
 test_on_batch: evaluates the model on one batch of samples. The return value is the same as for evaluate.
predict_on_batch: runs the model on one batch of samples and returns the predictions for that batch.
 _generator
 fit_generator(self, generator, steps_per_epoch, epochs=1, verbose=1, callbacks=None, validation_data=None, validation_steps=None, class_weight=None, max_q_size=10, workers=1, pickle_safe=False, initial_epoch=0)
 evaluate_generator(self, generator, steps, max_q_size=10, workers=1, pickle_safe=False)
  
 Case 1: a simple fully connected network
 from keras.layers import Input, Dense
 from keras.models import Model

 # This returns a tensor
 inputs = Input(shape=(784,))
 # a layer instance is callable on a tensor, and returns a tensor
 x = Dense(64, activation='relu')(inputs)
 # takes inputs, returns x; the trailing (inputs) is the layer's input
 x = Dense(64, activation='relu')(x)
 # takes x, returns x
 predictions = Dense(10, activation='softmax')(x)
 # takes x, returns the class probabilities
 # This creates a model that includes
 # the Input layer and three Dense layers
 model = Model(inputs=inputs, outputs=predictions)
 model.compile(optimizer='rmsprop',
               loss='categorical_crossentropy',
               metrics=['accuracy'])
 model.fit(data, labels)  # starts training

 Where:
The structure is completely different from the Sequential model. In x = Dense(64, activation='relu')(inputs), (inputs) is the input to the layer and x is its output.
model = Model(inputs=inputs, outputs=predictions) is the signature line of the functional model. Model can also take several inputs at once and produce several outputs.
 Case 2: video processing
 x = Input(shape=(784,))
 # This works, and returns the 10-way softmax we defined above.
 y = model(x)
 # model holds the weights; feed in x and it returns the result, which is useful for fine-tuning
 # Classification -> video, real-time processing
 from keras.layers import TimeDistributed
 # Input tensor for sequences of 20 timesteps,
 # each containing a 784-dimensional vector
 input_sequences = Input(shape=(20, 784))
 # 20 timesteps, each with 784-dimensional input
 # This applies our previous model to every timestep in the input sequences.
 # the output of the previous model was a 10-way softmax,
 # so the output of the layer below will be a sequence of 20 vectors of size 10.
 processed_sequences = TimeDistributed(model)(input_sequences)
 # model is already trained

 Where:
model has already been trained and is now reused for transfer learning;
TimeDistributed applies the model to every timestep, which can be used for real-time prediction on a stream;
in TimeDistributed(model)(input_sequences), input_sequences is the sequence input and model is the trained model.
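 Case 3: two inputs, two outputs: LSTM time-series prediction
 This is a good example. It shows that the essence of Model lies in its flexibility, which makes things much easier for the person writing the model.
 Inputs:
news text; the time associated with each news item
Outputs:
a prediction from the news text alone; a prediction from the news text plus the associated time
 Model 1: an LSTM on the news text only
 from keras.layers import Input, Embedding, LSTM, Dense
 from keras.models import Model

 # Headline input: meant to receive sequences of 100 integers, between 1 and 10000.
 # Note that we can name any layer by passing it a "name" argument.
 main_input = Input(shape=(100,), dtype='int32', name='main_input')
 # a sequence of 100 word indices
 # This embedding layer will encode the input sequence
 # into a sequence of dense 512-dimensional vectors.
 x = Embedding(output_dim=512, input_dim=10000, input_length=100)(main_input)
 # Embedding layer: maps each of the 100 word indices to a 512-dimensional vector; 10000 is the vocabulary size
 # A LSTM will transform the vector sequence into a single vector,
 # containing information about the entire sequence
 lstm_out = LSTM(32)(x)
 # 32 is the number of LSTM units, i.e. the dimensionality of the output vector
 # Next, we insert an auxiliary loss so that the LSTM and Embedding layers can train smoothly even when the main loss is high.
 auxiliary_output = Dense(1, activation='sigmoid', name='aux_output')(lstm_out)
 # After that, the LSTM output is concatenated with the additional input data and fed into the rest of the model.
 # Model 1: the prediction made from the sequence above only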
  
 Combined model: news text + time information
 # Model 2: the combined model
 import keras

 auxiliary_input = Input(shape=(5,), name='aux_input')
 # a newly added 5-dimensional Input
 x = keras.layers.concatenate([lstm_out, auxiliary_input])
 # join the two tensors
 # We stack a deep densely-connected network on top
 x = Dense(64, activation='relu')(x)
 x = Dense(64, activation='relu')(x)
 x = Dense(64, activation='relu')(x)
 # And finally we add the main logistic regression layer
 main_output = Dense(1, activation='sigmoid', name='main_output')(x)
 # Finally, we define the whole model with 2 inputs and 2 outputs:
 model = Model(inputs=[main_input, auxiliary_input], outputs=[main_output, auxiliary_output])
 # The model is now defined; the next step is to compile it.
 # We give the auxiliary loss a weight of 0.2. The loss and loss_weights keyword arguments set different loss functions or weights for different outputs.
 # Both accept a Python list or dict. Here we pass a single loss function to loss, and it is applied to all outputs.
 
  
 Where: Model(inputs=[main_input, auxiliary_input], outputs=[main_output, auxiliary_output]) is the core:
two inputs and two outputs.
 # Training option 1: two outputs sharing one loss
 model.compile(optimizer='rmsprop', loss='binary_crossentropy', loss_weights=[1., 0.2])
 # After compiling, we train the model by passing the training data and targets:
 model.fit([headline_data, additional_data], [labels, labels],
           epochs=50, batch_size=32)
 # Training option 2: two outputs, each with its own loss
 # Because the inputs and outputs are named (a "name" argument was passed when they were defined), the model can also be compiled and trained like this:
 model.compile(optimizer='rmsprop',
               loss={'main_output': 'binary_crossentropy', 'aux_output': 'binary_crossentropy'},
               loss_weights={'main_output': 1., 'aux_output': 0.2})
 # And trained it via:
 model.fit({'main_input': headline_data, 'aux_input': additional_data},
           {'main_output': labels, 'aux_output': labels},
           epochs=50, batch_size=32)

 Because there are two inputs and two outputs, the loss and the loss weight can be set separately for each output.
 Case 4: shared layers: correspondence and similarity
 One shared node feeds two branches.
 import keras
 from keras.layers import Input, LSTM, Dense
 from keras.models import Model

 # 140 words per tweet, each word a 256-dimensional word vector
 tweet_a = Input(shape=(140, 256))
 tweet_b = Input(shape=(140, 256))
 # To share a layer across different inputs, instantiate the layer once and then call it several times
 # This layer can take as input a matrix
 # and will return a vector of size 64
 shared_lstm = LSTM(64)
 # When we reuse the same layer instance
 # multiple times, the weights of the layer
 # are also being reused
 # (it is effectively the same layer)
 encoded_a = shared_lstm(tweet_a)
 encoded_b = shared_lstm(tweet_b)
 # We can then concatenate the two vectors:
 # axis=-1 joins them along the last (feature) axis, so two 64-dimensional vectors become one 128-dimensional vector
 merged_vector = keras.layers.concatenate([encoded_a, encoded_b], axis=-1)
 # And add a logistic regression on top
 predictions = Dense(1, activation='sigmoid')(merged_vector)
 # the 1 is the number of output units: a single sigmoid value between 0 and 1
 # We define a trainable model linking the
 # tweet inputs to the predictions
 model = Model(inputs=[tweet_a, tweet_b], outputs=predictions)
 model.compile(optimizer='rmsprop',
               loss='binary_crossentropy',
               metrics=['accuracy'])
 model.fit([data_a, data_b], labels, epochs=10)
 # train the model, then predict
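The effect of axis=-1 in the concatenate call can be illustrated with nested Python lists standing in for tensors. This is a sketch with made-up data and a helper written for this example only: a batch of two samples, each encoded as a 64-dimensional vector by the shared LSTM.

```python
def concat_last_axis(a, b):
    """Join two batches of vectors along the last axis, sample by sample."""
    return [row_a + row_b for row_a, row_b in zip(a, b)]


batch_size = 2
encoded_a = [[0.1] * 64 for _ in range(batch_size)]  # shape (2, 64)
encoded_b = [[0.2] * 64 for _ in range(batch_size)]  # shape (2, 64)

merged = concat_last_axis(encoded_a, encoded_b)
print(len(merged), len(merged[0]))  # 2 128
```

The batch axis is untouched and only the feature axis grows, so the Dense layer that follows sees one 128-dimensional vector per pair of tweets.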
  
 Case 5: reading the contents of a layer's nodes
 from keras.layers import Input, LSTM, Conv2D

 # 1. Single node
 a = Input(shape=(140, 256))
 lstm = LSTM(32)
 encoded_a = lstm(a)
 assert lstm.output == encoded_a
 # retrieves the output tensor encoded_a
 # 2. Multiple nodes
 a = Input(shape=(140, 256))
 b = Input(shape=(140, 256))
 lstm = LSTM(32)
 encoded_a = lstm(a)
 encoded_b = lstm(b)
 assert lstm.get_output_at(0) == encoded_a
 assert lstm.get_output_at(1) == encoded_b
 # 3. Nodes of an image layer
 # The same applies to input_shape and output_shape: if a layer has only one node,
 # or all of its nodes have the same input or output shape,
 # then input_shape and output_shape are unambiguous and return a single value.
 # But if, for example, you apply the same Conv2D to data of size (3, 32, 32)
 # and then to data of size (3, 64, 64), the layer has several input and output shapes,
 # and you need to give the node index explicitly to say which one you want.
 a = Input(shape=(3, 32, 32))
 b = Input(shape=(3, 64, 64))
 conv = Conv2D(16, (3, 3), padding='same')
 conved_a = conv(a)
 # Only one input so far, the following will work:
 assert conv.input_shape == (None, 3, 32, 32)
 conved_b = conv(b)
 # now the .input_shape property wouldn't work, but this does:
 assert conv.get_input_shape_at(0) == (None, 3, 32, 32)
 assert conv.get_input_shape_at(1) == (None, 3, 64, 64)
 Case 6: a visual question answering model
 # This model maps a natural-language question and an image to feature vectors,
 # merges the two, and trains a logistic regression layer on top to pick one answer from a set of possible answers.
 import keras
 from keras.layers import Conv2D, MaxPooling2D, Flatten
 from keras.layers import Input, LSTM, Embedding, Dense
 from keras.models import Model, Sequential

 # First, let's define a vision model using a Sequential model.
 # This model will encode an image into a vector.
 vision_model = Sequential()
 vision_model.add(Conv2D(64, (3, 3), activation='relu', padding='same', input_shape=(3, 224, 224)))
 vision_model.add(Conv2D(64, (3, 3), activation='relu'))
 vision_model.add(MaxPooling2D((2, 2)))
 vision_model.add(Conv2D(128, (3, 3), activation='relu', padding='same'))
 vision_model.add(Conv2D(128, (3, 3), activation='relu'))
 vision_model.add(MaxPooling2D((2, 2)))
 vision_model.add(Conv2D(256, (3, 3), activation='relu', padding='same'))
 vision_model.add(Conv2D(256, (3, 3), activation='relu'))
 vision_model.add(Conv2D(256, (3, 3), activation='relu'))
 vision_model.add(MaxPooling2D((2, 2)))
 vision_model.add(Flatten())
 # Now let's get a tensor with the output of our vision model:
 image_input = Input(shape=(3, 224, 224))
 encoded_image = vision_model(image_input)
 # Next, let's define a language model to encode the question into a vector.
 # Each question will be at most 100 word long,
 # and we will index words as integers from 1 to 9999.
 question_input = Input(shape=(100,), dtype='int32')
 embedded_question = Embedding(input_dim=10000, output_dim=256, input_length=100)(question_input)
 encoded_question = LSTM(256)(embedded_question)
 # Let's concatenate the question vector and the image vector:
 merged = keras.layers.concatenate([encoded_question, encoded_image])
 # And let's train a logistic regression over 1000 words on top:
 output = Dense(1000, activation='softmax')(merged)
 # This is our final model:
 vqa_model = Model(inputs=[image_input, question_input], outputs=output)
 # The next stage would be training this model on actual data.
 Extension 1: loading no-top weights for fine-tuning
 If you need to load weights into a different architecture (with some layers in common), for example for fine-tuning or transfer learning, you can load them by layer name:
model.load_weights('my_model_weights.h5', by_name=True)
For example:
 Suppose the original model is:
 model = Sequential()
 model.add(Dense(2, input_dim=3, name="dense_1"))
 model.add(Dense(3, name="dense_2"))
 ...
 model.save_weights(fname)

 # new model
 model = Sequential()
 model.add(Dense(2, input_dim=3, name="dense_1"))  # will be loaded
 model.add(Dense(10, name="new_dense"))  # will not be loaded
 # load weights from first model; will only affect the first layer, dense_1.
 model.load_weights(fname, by_name=True)

 # add(self, layer), for example:
 model.add(Dense(32, activation='relu', input_dim=100))
 model.add(Dropout(0.25))
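The name matching behind load_weights(..., by_name=True) in the example of this section can be pictured with dictionaries. This is a deliberately simplified sketch, not how Keras stores or loads weights: the helper is hypothetical and the strings 'W1' and 'W2' stand in for real weight arrays.

```python
def load_by_name(saved_weights, new_layer_names):
    """Return the weights that would be loaded: only layers whose names match."""
    return {
        name: saved_weights[name]
        for name in new_layer_names
        if name in saved_weights
    }


# Weights saved from the original model, keyed by layer name.
saved_weights = {'dense_1': 'W1', 'dense_2': 'W2'}

# Layer names of the new model.
new_layer_names = ['dense_1', 'new_dense']

loaded = load_by_name(saved_weights, new_layer_names)
print(loaded)  # {'dense_1': 'W1'}
```

Only dense_1 receives saved weights; new_dense keeps its fresh initialization, which is the usual starting point for fine-tuning a new top on a pretrained base.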
