Source: http://blog.csdn.net/sinat_26917383/article/details/72859145
Chinese documentation: http://keras-cn.readthedocs.io/en/latest/
Official documentation: https://keras.io/
This write-up is based on Keras 2.0.
The Keras series:
1. Keras series | Sequential and Model models, basic Keras structure and functionality (1)
2. Keras series | The five pre-trained models in Application, a walkthrough of the VGG16 architecture (Sequential style and Model style) (2)
3. Keras series | Multi-class image training and fine-tuning with bottleneck features (3)
4. Keras series | Facial expression classification and recognition: OpenCV face detection + Keras emotion classification (4)
5. Keras series | Transfer learning: fine-tuning and prediction with InceptionV3, a complete example (5)
I. The five pre-trained models in Application + a brief note on h5py
Keras's application module, Application, provides Keras models with pre-trained weights. These models can be used for prediction, feature extraction and fine-tuning.
The parameters of the following models are described further below:
- Xception
- VGG16
- VGG19
- ResNet50
- InceptionV3
All of these models (except Xception) are compatible with both Theano and TensorFlow, and are built automatically according to the image dimension ordering set in ~/.keras/keras.json. For example, if you set image_data_format="channels_last", the loaded models are constructed following TensorFlow's dimension ordering, i.e. "Height-Width-Depth".
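A quick way to check which convention your installation will use (a minimal sketch; K.image_data_format() is the Keras 2 accessor for this setting):

from keras import backend as K

# Print the image data format configured in ~/.keras/keras.json;
# this is the setting the pre-trained models are built against.
print(K.image_data_format())  # e.g. 'channels_last'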
Official download location for the models: https://github.com/fchollet/deep-learning-models/releases
A few notes on these files:
1. The difference between th and tf
==================
Keras offers two backends, Theano and TensorFlow.
Most of the differences between th and tf are wrapped up by the backend module, but quite a few conflicts remain, so sometimes you need to pay attention to which backend Keras is running on. The main conflicts are:
dim_ordering, i.e. the dimension ordering. Take a 224*224 colour image: Theano's ordering is (3,224,224), with the channel dimension first, while tf's ordering is (224,224,3), with the channel dimension last.
The shape of convolutional layer weights: training a network from scratch causes no problems, but if you try to load convolutional weights trained under th into tf-style convolutional layers... it ends in tears. I have always considered this a bug: fine, the dim_ordering of the data differs, but why do the shapes of the convolutional weights have to change as well? Sooner or later I will submit a PR to fix it!
Then there is the question of whether convolution kernels are flipped, which has been discussed many times already, so I will not repeat it here.
As for the data format names, "channels_last" corresponds to the old "tf" and "channels_first" corresponds to the old "th".
Taking a 128x128 RGB image as an example, "channels_first" organises the data as (3,128,128), while "channels_last" organises it as (128,128,3).
For example:
vgg16_weights_th_dim_ordering_th_kernels_notop.h5
vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5
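For intuition, here is a minimal NumPy sketch of how the same RGB image is laid out under the two orderings (the zero array is just a placeholder):

import numpy as np

# A 128x128 RGB image in the tf / "channels_last" layout...
img_channels_last = np.zeros((128, 128, 3))

# ...and the same data rearranged into the th / "channels_first" layout.
img_channels_first = np.transpose(img_channels_last, (2, 0, 1))
print(img_channels_first.shape)  # (3, 128, 128)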
2. What does a "notop" model mean?
==============
"notop" refers to whether the 3 fully-connected layers at the top of the network are included (whether to include the 3 fully-connected layers at the top of the network). These weights were open-sourced specifically for fine-tuning; a minimal usage sketch follows.
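A hedged sketch of how the notop weights are typically used for fine-tuning (the input_shape chosen here is just an example):

from keras.applications.vgg16 import VGG16

# Load only the convolutional base with the notop ImageNet weights;
# a new classifier head would then be stacked on top for fine-tuning.
base_model = VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
print(base_model.output_shape)  # (None, 7, 7, 512) with channels_last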
3. A brief note on h5py
========
Keras's pre-trained models are stored in HDF5 format (read with h5py), not in Caffe's .caffemodel format.
An h5py.File object behaves much like a Python dictionary, so we can inspect all of its keys.
Reading the file:

import h5py

f = h5py.File('.../notop.h5', 'r')

f.attrs['nb_layers'] accesses the attributes of f; one of the attributes is 'nb_layers'.
>>> f.keys()
[u'block1_conv1', u'block1_conv2', u'block1_pool', u'block2_conv1', u'block2_conv2', u'block2_pool', u'block3_conv1', u'block3_conv2', u'block3_conv3', u'block3_pool', u'block4_conv1', u'block4_conv2', u'block4_conv3', u'block4_pool', u'block5_conv1', u'block5_conv2', u'block5_conv3', u'block5_pool']
This shows which layers are stored in f.

for name in f:
    print(name)  # similar to f.keys()
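Continuing with the same (abbreviated) file handle, each key is an HDF5 group whose members hold that layer's weights; a small sketch of drilling one level deeper:

# List what is stored under one layer group (the exact member names
# depend on the Keras version that saved the file).
for weight_name in f['block1_conv1']:
    print(weight_name)

f.close()  # close the file when finished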
4. Official example: ImageNet classification with ResNet50
================================
from keras.applications.resnet50 import ResNet50
from keras.preprocessing import image
from keras.applications.resnet50 import preprocess_input, decode_predictions
import numpy as np

model = ResNet50(weights='imagenet')

img_path = 'elephant.jpg'
img = image.load_img(img_path, target_size=(224, 224))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)

preds = model.predict(x)
print('Predicted:', decode_predictions(preds, top=3)[0])
# Predicted: [(u'n02504013', u'Indian_elephant', 0.82658225), (u'n01871265', u'tusker', 0.1122357), (u'n02504458', u'African_elephant', 0.061040461)]
More examples can be found in the official Keras documentation: extracting features with VGG16, extracting features from an arbitrary intermediate layer of VGG19, and building InceptionV3 on a custom input tensor; the first of these is sketched below.
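The VGG16 feature-extraction example amounts to roughly the following (reusing the same elephant.jpg as above):

from keras.applications.vgg16 import VGG16, preprocess_input
from keras.preprocessing import image
import numpy as np

# Drop the classifier head and use the convolutional base as a feature extractor.
model = VGG16(weights='imagenet', include_top=False)

img = image.load_img('elephant.jpg', target_size=(224, 224))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)

features = model.predict(x)  # shape (1, 7, 7, 512) with channels_last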
5. Explanation of the constructor parameters
========
For all of the classes below, instantiating them downloads the weights from the web, so you can tweak the source code slightly to make them read a local H5 file instead.
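A hedged alternative that avoids touching the source at all: download the H5 file by hand, build the architecture with weights=None, and load the local file yourself (the path below is hypothetical):

from keras.applications.vgg16 import VGG16

# Build VGG16 without downloading anything, then load a locally stored weight file.
model = VGG16(weights=None, include_top=False)
model.load_weights('/path/to/vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5')  # hypothetical local path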
Xception model
On ImageNet this model achieves a top-1 validation accuracy of 0.790 and a top-5 accuracy of 0.945.
This model can currently only be used with the TensorFlow backend, since it relies on the SeparableConvolution layer; it also only supports the channels_last dimension ordering (height, width, channels).
The default input size is 299x299.
keras.applications.xception.Xception(include_top=True, weights='imagenet', input_tensor=None, input_shape=None, pooling=None, classes=1000)
VGG16 model
VGG16 model, with weights trained on ImageNet.
This model works with both the Theano and TensorFlow backends, and accepts both channels_first and channels_last input dimension orderings.
The default input size is 224x224.
keras.applications.vgg16.VGG16(include_top=True, weights='imagenet', input_tensor=None, input_shape=None, pooling=None, classes=1000)
VGG19 model
VGG19 model, with weights trained on ImageNet.
This model works with both the Theano and TensorFlow backends, and accepts both channels_first and channels_last input dimension orderings.
The default input size is 224x224.
keras.applications.vgg19.VGG19(include_top=True, weights='imagenet', input_tensor=None, input_shape=None, pooling=None, classes=1000)
ResNet50 model
A 50-layer residual network, with weights trained on ImageNet.
This model works with both the Theano and TensorFlow backends, and accepts both channels_first and channels_last input dimension orderings.
The default input size is 224x224.
keras.applications.resnet50.ResNet50(include_top=True, weights='imagenet', input_tensor=None, input_shape=None, pooling=None, classes=1000)
InceptionV3 model
The InceptionV3 network, with weights trained on ImageNet.
This model works with both the Theano and TensorFlow backends, and accepts both channels_first and channels_last input dimension orderings.
The default input size is 299x299.
keras.applications.inception_v3.InceptionV3(include_top=True, weights='imagenet', input_tensor=None, input_shape=None, pooling=None, classes=1000)
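A short sketch of how include_top and pooling interact, using InceptionV3 as an example: dropping the top and setting pooling='avg' turns the network into a fixed-length feature extractor.

from keras.applications.inception_v3 import InceptionV3

# Without the classifier head, global average pooling collapses the
# convolutional feature maps into one 2048-dimensional vector per image.
feat_model = InceptionV3(weights='imagenet', include_top=False, pooling='avg')
print(feat_model.output_shape)  # (None, 2048)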
II. A walkthrough of keras-applications VGG16 (functional style)
The .py file comes from: https://github.com/fchollet/deep-learning-models/blob/master/vgg16.py
VGG16's default input data format is channels_last.
# -*- coding: utf-8 -*-
'''VGG16 model for Keras.

# Reference:

- [Very Deep Convolutional Networks for Large-Scale Image Recognition](https://arxiv.org/abs/1409.1556)

'''
from __future__ import print_function

import numpy as np
import warnings

from keras.models import Model
from keras.layers import Flatten
from keras.layers import Dense
from keras.layers import Input
from keras.layers import Conv2D
from keras.layers import MaxPooling2D
from keras.layers import GlobalMaxPooling2D
from keras.layers import GlobalAveragePooling2D
from keras.preprocessing import image
from keras.utils import layer_utils
from keras.utils.data_utils import get_file
from keras import backend as K
from keras.applications.imagenet_utils import decode_predictions
# decode_predictions outputs the 5 highest-probability classes: (class name, semantic concept, predicted probability); usage: decode_predictions(y_pred)
from keras.applications.imagenet_utils import preprocess_input
# preprocessing: makes the image encoding follow the expected convention (e.g. RGB vs BGR); usage: preprocess_input(x)
from keras.applications.imagenet_utils import _obtain_input_shape
# determines an appropriate input shape; roughly comparable to OpenCV's imread turning an image into an array
from keras.engine.topology import get_source_inputs

WEIGHTS_PATH = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_tf_dim_ordering_tf_kernels.h5'
WEIGHTS_PATH_NO_TOP = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5'


def VGG16(include_top=True, weights='imagenet',
          input_tensor=None, input_shape=None,
          pooling=None,
          classes=1000):
    # check that the weights and classes settings are consistent
    if weights not in {'imagenet', None}:
        raise ValueError('The `weights` argument should be either '
                         '`None` (random initialization) or `imagenet` '
                         '(pre-training on ImageNet).')

    if weights == 'imagenet' and include_top and classes != 1000:
        raise ValueError('If using `weights` as imagenet with `include_top`'
                         ' as true, `classes` should be 1000')

    # set the image size, similar to a transform in Caffe
    # Determine proper input shape
    input_shape = _obtain_input_shape(input_shape,
                                      default_size=224,
                                      min_size=48,                         # the minimum width/height the model accepts
                                      data_format=K.image_data_format(),   # the image data format in use
                                      include_top=include_top)             # whether a Flatten layer feeds into the classifier

    # simple input handling
    if input_tensor is None:
        img_input = Input(shape=input_shape)  # Input here is Keras's own tensor format and can be used for conversion
    else:
        if not K.is_keras_tensor(input_tensor):
            img_input = Input(tensor=input_tensor, shape=input_shape)
        else:
            img_input = input_tensor
    # if a tensor is passed in, two steps are needed:
    # first check whether it is a Keras tensor with is_keras_tensor,
    # then call get_source_inputs(input_tensor)

    # define the network architecture (the "prototxt" part, in Caffe terms)
    # Block 1
    x = Conv2D(64, (3, 3), activation='relu', padding='same', name='block1_conv1')(img_input)
    x = Conv2D(64, (3, 3), activation='relu', padding='same', name='block1_conv2')(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block1_pool')(x)

    # Block 2
    x = Conv2D(128, (3, 3), activation='relu', padding='same', name='block2_conv1')(x)
    x = Conv2D(128, (3, 3), activation='relu', padding='same', name='block2_conv2')(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block2_pool')(x)

    # Block 3
    x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv1')(x)
    x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv2')(x)
    x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv3')(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block3_pool')(x)

    # Block 4
    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv1')(x)
    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv2')(x)
    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv3')(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block4_pool')(x)

    # Block 5
    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv1')(x)
    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv2')(x)
    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv3')(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block5_pool')(x)

    if include_top:
        # Classification block
        x = Flatten(name='flatten')(x)
        x = Dense(4096, activation='relu', name='fc1')(x)
        x = Dense(4096, activation='relu', name='fc2')(x)
        x = Dense(classes, activation='softmax', name='predictions')(x)
    else:
        if pooling == 'avg':
            x = GlobalAveragePooling2D()(x)
        elif pooling == 'max':
            x = GlobalMaxPooling2D()(x)

    # sort out the model inputs
    # Ensure that the model takes into account
    # any potential predecessors of `input_tensor`.
    if input_tensor is not None:
        inputs = get_source_inputs(input_tensor)
        # get_source_inputs returns the list of input tensors the computation needs (List of input tensors.)
    else:
        inputs = img_input

    # Create model.
    model = Model(inputs, x, name='vgg16')

    # load weights
    if weights == 'imagenet':
        if include_top:
            weights_path = get_file('vgg16_weights_tf_dim_ordering_tf_kernels.h5',
                                    WEIGHTS_PATH,
                                    cache_subdir='models')
        else:
            weights_path = get_file('vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5',
                                    WEIGHTS_PATH_NO_TOP,
                                    cache_subdir='models')
        model.load_weights(weights_path)
        if K.backend() == 'theano':
            layer_utils.convert_all_kernels_in_model(model)

        if K.image_data_format() == 'channels_first':
            if include_top:
                maxpool = model.get_layer(name='block5_pool')
                shape = maxpool.output_shape[1:]
                dense = model.get_layer(name='fc1')
                layer_utils.convert_dense_weights_data_format(dense, shape, 'channels_first')

    return model
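A minimal usage sketch of the function above, assuming the script is saved locally as vgg16.py (the module name is just an assumption):

from vgg16 import VGG16  # hypothetical local module containing the code above

model = VGG16(include_top=True, weights='imagenet')
model.summary()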