Keras series | The five pretrained models in Application, and a walkthrough of the VGG16 architecture (Sequential and Model styles) (Part 2)


Source: http://blog.csdn.net/sinat_26917383/article/details/72859145

Chinese documentation: http://keras-cn.readthedocs.io/en/latest/
Official documentation: https://keras.io/
The documentation referenced here is mainly for Keras 2.0.

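Keras series:

1. Keras series | Sequential and Model models, basic Keras structure and functionality (Part 1)
2. Keras series | The five pretrained models in Application, and a walkthrough of the VGG16 architecture (Sequential and Model styles) (Part 2)
3. Keras series | Multi-class image classification training and fine-tuning with bottleneck features (Part 3)
4. Keras series | Facial expression classification and recognition: OpenCV face detection + Keras emotion classification (Part 4)
5. Keras series | Transfer learning: fine-tuning and prediction with InceptionV3, a complete example (Part 5)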


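I. The five pretrained models in Application + a brief look at h5py

Keras's Application module provides Keras models with pretrained weights. These models can be used for prediction, feature extraction, and fine-tuning.
The constructor parameters of the following models are listed later in this section: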

  • Xception
  • VGG16
  • VGG19
  • ResNet50
  • InceptionV3

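All of these models (except Xception) are compatible with both Theano and TensorFlow, and they are built automatically according to the image data format configured in ~/.keras/keras.json. For example, if you have set image_data_format="channels_last", a loaded model is constructed in the TensorFlow dimension order, "Width-Height-Depth", that is, with the channel axis last.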
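To make the role of the configuration file concrete, here is a minimal sketch that parses a keras.json-style string and derives the per-image input shape from it. The JSON values are illustrative, and `input_shape_for` is a hypothetical helper written for this note, not part of Keras.

```python
import json

# Example contents of ~/.keras/keras.json (illustrative values).
example_config = '''
{
    "image_data_format": "channels_last",
    "epsilon": 1e-07,
    "floatx": "float32",
    "backend": "tensorflow"
}
'''


def input_shape_for(config_text, height, width, channels=3):
    """Hypothetical helper: per-image shape implied by the configured format."""
    fmt = json.loads(config_text)["image_data_format"]
    if fmt == "channels_first":
        return (channels, height, width)
    if fmt == "channels_last":
        return (height, width, channels)
    raise ValueError("unknown image_data_format: %r" % fmt)


print(input_shape_for(example_config, 224, 224))  # (224, 224, 3)
```

Switching the value to "channels_first" in the string would give (3, 224, 224) instead.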

Official download location for the models: https://github.com/fchollet/deep-learning-models/releases

Notes:

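1. Differences between th and tf

==================

Keras provides two backends, Theano and TensorFlow.
Most of the functionality of th and tf is wrapped uniformly by the backend, but the two still conflict in significant ways, and sometimes you need to pay close attention to which backend Keras is running on. The main conflicts are:

  • dim_ordering, that is, the dimension order. For a 224*224 color image, Theano's dimension order is (3,224,224), with the channel axis first, while tf's is (224,224,3), with the channel axis last.
  • The shape of convolution layer weights. Training a network from scratch causes no problems, but loading convolution weights trained under th into a tf-style convolution layer is painful. I have always considered this a bug: it is one thing for the data's dim_ordering to differ, but why should the shape of the convolution weights need converting as well? Sooner or later I will submit a PR to fix it!
  • Whether convolution kernels are flipped or not. This has been discussed many times already, so it is not repeated here.

On data formats: "channels_last" corresponds to the former "tf", and "channels_first" corresponds to the former "th".
Taking a 128x128 RGB image as an example, "channels_first" organizes the data as (3,128,128), while "channels_last" organizes it as (128,128,3).
The same distinction shows up in the weight file names, for example:
vgg16_weights_th_dim_ordering_th_kernels_notop.h5
vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5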
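As a small illustration of the two data layouts described above, the following sketch converts a dummy 128x128 RGB array between "channels_last" and "channels_first" with NumPy. The array contents are made up; only the axis permutation matters.

```python
import numpy as np

# A dummy 128x128 RGB image in channels_last layout: (128, 128, 3).
img_last = np.arange(128 * 128 * 3, dtype=np.float32).reshape(128, 128, 3)

# channels_last -> channels_first: move the channel axis to the front.
img_first = np.transpose(img_last, (2, 0, 1))

# channels_first -> channels_last: move the channel axis back to the end.
round_trip = np.transpose(img_first, (1, 2, 0))

print(img_last.shape)   # (128, 128, 3)
print(img_first.shape)  # (3, 128, 128)
```

The values are unchanged; only their indexing order differs, so img_first[c, h, w] equals img_last[h, w, c].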

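2. What is a notop model?

==============

A notop model leaves out the last 3 fully-connected layers (the include_top option: "whether to include the 3 fully-connected layers at the top of the network"). These variants are released specifically for fine-tuning.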
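A minimal sketch of how the include_top choice maps onto the two VGG16 weight files and onto the layers that get dropped. The file names and layer names are the ones that appear in the VGG16 source quoted in this post; `vgg16_weights_filename` and `layers_for` are hypothetical helpers for illustration only.

```python
# File name stem and top-layer names as they appear in the VGG16 source.
BASE_NAME = 'vgg16_weights_tf_dim_ordering_tf_kernels'
TOP_LAYERS = ['flatten', 'fc1', 'fc2', 'predictions']


def vgg16_weights_filename(include_top):
    """Hypothetical helper: pick the weight file matching include_top."""
    return BASE_NAME + ('.h5' if include_top else '_notop.h5')


def layers_for(all_layers, include_top):
    """Hypothetical helper: drop the classifier layers for a notop model."""
    if include_top:
        return list(all_layers)
    return [name for name in all_layers if name not in TOP_LAYERS]


tail = ['block5_conv3', 'block5_pool'] + TOP_LAYERS
print(vgg16_weights_filename(False))  # vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5
print(layers_for(tail, False))        # ['block5_conv3', 'block5_pool']
```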

3. A brief look at h5py

========

Keras's pretrained models are stored as HDF5 files (read with h5py), not as Caffe's .caffemodel.
An h5py.File behaves much like a Python dictionary, so we can list all of its keys.
Open the file for reading:

    import h5py

    f = h5py.File('.../notop.h5', 'r')

f.attrs holds the attributes of f; f.attrs['nb_layers'], for example, reads an attribute named 'nb_layers'.

    >>> f.keys()
    [u'block1_conv1', u'block1_conv2', u'block1_pool', u'block2_conv1', u'block2_conv2', u'block2_pool', u'block3_conv1', u'block3_conv2', u'block3_conv3', u'block3_pool', u'block4_conv1', u'block4_conv2', u'block4_conv3', u'block4_pool', u'block5_conv1', u'block5_conv2', u'block5_conv3', u'block5_pool']

This shows which layers f contains.

    for name in f:
        print(name)  # similar to f.keys()

4. Official example: ImageNet classification with ResNet50

================================

    from keras.applications.resnet50 import ResNet50
    from keras.preprocessing import image
    from keras.applications.resnet50 import preprocess_input, decode_predictions
    import numpy as np

    model = ResNet50(weights='imagenet')

    img_path = 'elephant.jpg'
    img = image.load_img(img_path, target_size=(224, 224))
    x = image.img_to_array(img)
    x = np.expand_dims(x, axis=0)
    x = preprocess_input(x)

    preds = model.predict(x)
    print('Predicted:', decode_predictions(preds, top=3)[0])
    # Predicted: [(u'n02504013', u'Indian_elephant', 0.82658225), (u'n01871265', u'tusker', 0.1122357), (u'n02504458', u'African_elephant', 0.061040461)]

More examples can be found in the official Keras documentation:

extracting features with VGG16, extracting features from an arbitrary intermediate layer of VGG19, and building InceptionV3 over a custom input tensor.
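To show what the decode_predictions(preds, top=3) call in the ResNet50 example is doing conceptually, here is a pure-Python sketch of top-k decoding. It is not the Keras implementation: `decode_top` is a hypothetical function, the class indices are made up, and the fourth class is a placeholder; only the three elephant entries come from the example output above.

```python
def decode_top(probabilities, class_index, top=3):
    """Hypothetical sketch: return the `top` most probable classes as
    (class ID, class description, probability) tuples."""
    ranked = sorted(range(len(probabilities)),
                    key=lambda i: probabilities[i], reverse=True)
    return [(class_index[i][0], class_index[i][1], probabilities[i])
            for i in ranked[:top]]


# Toy class table; the indices and the last entry are assumptions.
class_index = {
    0: ('n02504458', 'African_elephant'),
    1: ('n00000000', 'placeholder_class'),
    2: ('n02504013', 'Indian_elephant'),
    3: ('n01871265', 'tusker'),
}
probs = [0.061040461, 0.0001, 0.82658225, 0.1122357]

for entry in decode_top(probs, class_index, top=3):
    print(entry)
```

The most probable class comes first, which is why the example above can index the result with [0] to get the list for the first image.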


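5. Constructor parameters

========

Calling any of the constructors below appears to download the weights from the web, so you can modify the source code to make it read a local H5 file instead.

Xception model

On ImageNet, this model achieves a validation top-1 accuracy of 0.790 and a top-5 accuracy of 0.945.
It can currently be used only with the TensorFlow backend because it depends on the "SeparableConvolution" layer, and it currently supports only the channels_last dimension order (width, height, channels).

The default input image size is 299x299.

    keras.applications.xception.Xception(include_top=True, weights='imagenet', input_tensor=None, input_shape=None, pooling=None, classes=1000)

VGG16 model

The VGG16 model, with weights trained on ImageNet.

The model works with both the Theano and TensorFlow backends and accepts both the channels_first and channels_last input dimension orders.

The default input size is 224x224.

    keras.applications.vgg16.VGG16(include_top=True, weights='imagenet', input_tensor=None, input_shape=None, pooling=None, classes=1000)

VGG19 model

The VGG19 model, with weights trained on ImageNet.

The model works with both the Theano and TensorFlow backends and accepts both the channels_first and channels_last input dimension orders.

The default input size is 224x224.

    keras.applications.vgg19.VGG19(include_top=True, weights='imagenet', input_tensor=None, input_shape=None, pooling=None, classes=1000)

ResNet50 model

A 50-layer residual network, with weights trained on ImageNet.

The model works with both the Theano and TensorFlow backends and accepts both the channels_first and channels_last input dimension orders.

The default input size is 224x224.

    keras.applications.resnet50.ResNet50(include_top=True, weights='imagenet', input_tensor=None, input_shape=None, pooling=None, classes=1000)

InceptionV3 model

The InceptionV3 network, with weights trained on ImageNet.

The model works with both the Theano and TensorFlow backends and accepts both the channels_first and channels_last input dimension orders.

The default input size is 299x299.

    keras.applications.inception_v3.InceptionV3(include_top=True, weights='imagenet', input_tensor=None, input_shape=None, pooling=None, classes=1000)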


II. Walkthrough of keras-applications VGG16: the functional (Model) style

The .py file comes from: https://github.com/fchollet/deep-learning-models/blob/master/vgg16.py
VGG16's default input data format should be channels_last.

    # -*- coding: utf-8 -*-
    '''VGG16 model for Keras.

    # Reference:

    - [Very Deep Convolutional Networks for Large-Scale Image Recognition](https://arxiv.org/abs/1409.1556)

    '''
    from __future__ import print_function

    import numpy as np
    import warnings

    from keras.models import Model
    from keras.layers import Flatten
    from keras.layers import Dense
    from keras.layers import Input
    from keras.layers import Conv2D
    from keras.layers import MaxPooling2D
    from keras.layers import GlobalMaxPooling2D
    from keras.layers import GlobalAveragePooling2D
    from keras.preprocessing import image
    from keras.utils import layer_utils
    from keras.utils.data_utils import get_file
    from keras import backend as K
    from keras.applications.imagenet_utils import decode_predictions
    # decode_predictions returns the 5 most probable classes as
    # (class ID, class description, probability): decode_predictions(y_pred)
    from keras.applications.imagenet_utils import preprocess_input
    # preprocess_input makes the image encoding follow the expected convention
    # (for example the channel order, RGB vs. BGR): preprocess_input(x)
    from keras.applications.imagenet_utils import _obtain_input_shape
    # _obtain_input_shape determines the proper input shape; it works on the
    # shape tuple only and does not read or convert image data
    from keras.engine.topology import get_source_inputs


    WEIGHTS_PATH = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_tf_dim_ordering_tf_kernels.h5'
    WEIGHTS_PATH_NO_TOP = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5'


    def VGG16(include_top=True, weights='imagenet',
              input_tensor=None, input_shape=None,
              pooling=None,
              classes=1000):
        # Check that the weights and classes settings are valid
        if weights not in {'imagenet', None}:
            raise ValueError('The `weights` argument should be either '
                             '`None` (random initialization) or `imagenet` '
                             '(pre-training on ImageNet).')

        if weights == 'imagenet' and include_top and classes != 1000:
            raise ValueError('If using `weights` as imagenet with `include_top`'
                             ' as true, `classes` should be 1000')

        # Set the image size, similar to transform in Caffe
        # Determine proper input shape
        input_shape = _obtain_input_shape(input_shape,
                                          default_size=224,
                                          min_size=48,  # smallest width/height the model accepts
                                          data_format=K.image_data_format(),  # data format in use
                                          include_top=include_top)  # whether a Flatten layer then feeds the classifier

        # Set up the input tensor
        if input_tensor is None:
            img_input = Input(shape=input_shape)  # Input here creates a Keras tensor
        else:
            if not K.is_keras_tensor(input_tensor):
                img_input = Input(tensor=input_tensor, shape=input_shape)
            else:
                img_input = input_tensor
        # When a tensor is passed in, two steps are needed:
        #   first check whether it is a Keras tensor (is_keras_tensor),
        #   then call get_source_inputs(input_tensor) (see below)

        # Define the network structure (the counterpart of a Caffe prototxt)
        # Block 1
        x = Conv2D(64, (3, 3), activation='relu', padding='same', name='block1_conv1')(img_input)
        x = Conv2D(64, (3, 3), activation='relu', padding='same', name='block1_conv2')(x)
        x = MaxPooling2D((2, 2), strides=(2, 2), name='block1_pool')(x)

        # Block 2
        x = Conv2D(128, (3, 3), activation='relu', padding='same', name='block2_conv1')(x)
        x = Conv2D(128, (3, 3), activation='relu', padding='same', name='block2_conv2')(x)
        x = MaxPooling2D((2, 2), strides=(2, 2), name='block2_pool')(x)

        # Block 3
        x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv1')(x)
        x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv2')(x)
        x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv3')(x)
        x = MaxPooling2D((2, 2), strides=(2, 2), name='block3_pool')(x)

        # Block 4
        x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv1')(x)
        x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv2')(x)
        x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv3')(x)
        x = MaxPooling2D((2, 2), strides=(2, 2), name='block4_pool')(x)

        # Block 5
        x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv1')(x)
        x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv2')(x)
        x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv3')(x)
        x = MaxPooling2D((2, 2), strides=(2, 2), name='block5_pool')(x)

        if include_top:
            # Classification block
            x = Flatten(name='flatten')(x)
            x = Dense(4096, activation='relu', name='fc1')(x)
            x = Dense(4096, activation='relu', name='fc2')(x)
            x = Dense(classes, activation='softmax', name='predictions')(x)
        else:
            if pooling == 'avg':
                x = GlobalAveragePooling2D()(x)
            elif pooling == 'max':
                x = GlobalMaxPooling2D()(x)

        # Adjust the inputs
        # Ensure that the model takes into account
        # any potential predecessors of `input_tensor`.
        if input_tensor is not None:
            inputs = get_source_inputs(input_tensor)
            # get_source_inputs returns the list of input tensors
            # needed to compute the given tensor
        else:
            inputs = img_input

        # Create model.
        model = Model(inputs, x, name='vgg16')

        # load weights
        if weights == 'imagenet':
            if include_top:
                weights_path = get_file('vgg16_weights_tf_dim_ordering_tf_kernels.h5',
                                        WEIGHTS_PATH,
                                        cache_subdir='models')
            else:
                weights_path = get_file('vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5',
                                        WEIGHTS_PATH_NO_TOP,
                                        cache_subdir='models')
            model.load_weights(weights_path)
            if K.backend() == 'theano':
                layer_utils.convert_all_kernels_in_model(model)

            if K.image_data_format() == 'channels_first':
                if include_top:
                    maxpool = model.get_layer(name='block5_pool')
                    shape = maxpool.output_shape[1:]
                    dense = model.get_layer(name='fc1')
                    layer_utils.convert_dense_weights_data_format(dense, shape, 'channels_first')

                if K.backend() == 'tensorflow':
                    warnings.warn('You are using the TensorFlow backend, yet you '
                                  'are using the Theano '
                                  'image data format convention '
                                  '(`image_data_format="channels_first"`). '
                                  'For best performance, set '
                                  '`image_data_format="channels_last"` in '
                                  'your Keras config '
                                  'at ~/.keras/keras.json.')
        return model


    if __name__ == '__main__':
        model = VGG16(include_top=True, weights='imagenet')

        img_path = 'elephant.jpg'
        img = image.load_img(img_path, target_size=(224, 224))
        x = image.img_to_array(img)
        x = np.expand_dims(x, axis=0)
        x = preprocess_input(x)
        print('Input image shape:', x.shape)

        preds = model.predict(x)
        print('Predicted:', decode_predictions(preds))
        # decode_predictions returns the 5 most probable classes as
        # (class ID, class description, probability)

 

Notes:

1. If the model has already been downloaded locally

==============

Once the weights have been downloaded, you can avoid fetching them from the web each time by modifying the following parts of the source:

    WEIGHTS_PATH = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_tf_dim_ordering_tf_kernels.h5'
    WEIGHTS_PATH_NO_TOP = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5'

    weights_path = get_file('vgg16_weights_tf_dim_ordering_tf_kernels.h5', WEIGHTS_PATH, cache_subdir='models')
    weights_path = get_file('vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5', WEIGHTS_PATH_NO_TOP, cache_subdir='models')
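One possible shape for that modification, as a minimal sketch: look for the weight file in a local directory first and only fall back to downloading when it is missing. `resolve_weights_path` is a hypothetical helper, and the directory used in the demo is a temporary one created just for illustration.

```python
import os
import tempfile


def resolve_weights_path(local_dir, filename):
    """Hypothetical helper: return the local path of the weight file if it
    exists, otherwise None so the caller can fall back to downloading."""
    candidate = os.path.join(local_dir, filename)
    return candidate if os.path.isfile(candidate) else None


# Demo with a temporary directory standing in for the local model folder.
with tempfile.TemporaryDirectory() as demo_dir:
    name = 'vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5'
    print(resolve_weights_path(demo_dir, name))              # None: nothing there yet
    open(os.path.join(demo_dir, name), 'wb').close()         # create an empty stand-in file
    print(resolve_weights_path(demo_dir, name) is not None)  # True
```

In a local copy of vgg16.py, the returned path could be passed to model.load_weights(weights_path), with the original get_file(...) call kept as the fallback when the helper returns None.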


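2. A few new utilities used in the source

==============

    from keras.applications.imagenet_utils import decode_predictions
    # decode_predictions returns the 5 most probable classes as
    # (class ID, class description, probability): decode_predictions(y_pred)
    from keras.applications.imagenet_utils import preprocess_input
    # preprocess_input makes the image encoding follow the expected convention
    # (for example the channel order, RGB vs. BGR): preprocess_input(x)
    from keras.applications.imagenet_utils import _obtain_input_shape
    # _obtain_input_shape determines the proper input shape

(1) decode_predictions is applied to the final output and is quite handy: print('Predicted:', decode_predictions(preds));
(2) preprocess_input changes the encoding of the input array: preprocess_input(x);
(3) _obtain_input_shape
is loosely comparable to the input-size settings of transform in Caffe: it determines the input shape the model is built with, and at prediction time the image has to be preprocessed to fit that shape.

    input_shape = _obtain_input_shape(input_shape,
                                      default_size=224,
                                      min_size=48,  # smallest width/height the model accepts
                                      data_format=K.image_data_format(),  # data format in use
                                      include_top=include_top)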
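To make item (2) more concrete, here is a NumPy sketch of the kind of transformation preprocess_input applies for VGG-style models in channels_last layout, as I understand it: reverse the channel axis from RGB to BGR, then subtract a per-channel ImageNet mean. The function `vgg_style_preprocess` is hypothetical, and the mean values are assumed to match the Keras defaults; check them against your installed version.

```python
import numpy as np

# Per-channel ImageNet means in BGR order (assumed to match the Keras default).
BGR_MEANS = np.array([103.939, 116.779, 123.68], dtype=np.float32)


def vgg_style_preprocess(batch_rgb):
    """Hypothetical sketch. batch_rgb: array of shape (N, H, W, 3),
    RGB channel order, values in 0..255."""
    batch = np.asarray(batch_rgb, dtype=np.float32)
    batch_bgr = batch[..., ::-1]    # reverse the channel axis: RGB -> BGR
    return batch_bgr - BGR_MEANS    # broadcasts over the last axis


x = np.zeros((1, 2, 2, 3), dtype=np.float32)
x[..., 0] = 255.0                   # a pure red image
y = vgg_style_preprocess(x)
print(y[0, 0, 0])                   # red now sits in the last (R) slot of BGR
```

Unlike an in-place implementation, this sketch returns a new array and leaves x untouched.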

 


3. When include_top=True

====================

    fc_model = VGG16(include_top=True)
    notop_model = VGG16(include_top=False)

As mentioned earlier, when VGG16 is used for fine-tuning, notop_model is the model without the fully-connected layers; you then add your own layers on top of it.
For the complete network, fc_model adds the following layers to finish the structure:

    x = Flatten(name='flatten')(x)
    x = Dense(4096, activation='relu', name='fc1')(x)
    x = Dense(4096, activation='relu', name='fc2')(x)
    x = Dense(classes, activation='softmax', name='predictions')(x)

The pool layer is followed by a Flatten layer, which reshapes the data, then by two Dense layers, and finally by a Dense layer with softmax.
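A quick worked calculation of what the Flatten layer hands to fc1, as a sketch under the assumptions visible in the source above: the 3x3 convolutions use padding='same' and so keep the spatial size, and each of the five 2x2 stride-2 pooling layers halves it (rounding down). `vgg16_feature_map_size` is a hypothetical helper.

```python
def vgg16_feature_map_size(input_size, num_pools=5):
    """Hypothetical helper: spatial size after the five pooling layers."""
    size = input_size
    for _ in range(num_pools):
        size //= 2   # 2x2 max pooling with stride 2 halves the size
    return size


side = vgg16_feature_map_size(224)   # 224 -> 112 -> 56 -> 28 -> 14 -> 7
flat = side * side * 512             # block5 has 512 channels
fc1_params = flat * 4096 + 4096      # Dense kernel plus bias

print(side, flat, fc1_params)        # 7 25088 102764544
print(vgg16_feature_map_size(48))    # 1
```

The second line of output shows why min_size=48 is a sensible lower bound: a 48x48 input still leaves a 1x1 feature map after block5_pool.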

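4. What if the input data format is channels_first?

===========================

If the input format is 'channels_first', fc_model needs one more adjustment: the VGG16 source is defined in terms of 'channels_last', so the weights that follow the Flatten layer have to be converted.

    maxpool = model.get_layer(name='block5_pool')
    # model.get_layer() returns a layer object by name or by index
    shape = maxpool.output_shape[1:]
    # the output shape of the block5_pool layer
    dense = model.get_layer(name='fc1')
    layer_utils.convert_dense_weights_data_format(dense, shape, 'channels_first')

layer_utils.convert_dense_weights_data_format plays a rather special role that the official documentation does not explain. In essence it adapts the weights to the data format: the Flatten layer changes the order of the flattened values, so the weights of the Dense layer after it have to be reordered to match.
From the source:

When porting the weights of a convnet from one data format to the other, if the convnet includes a Flatten layer (applied to the last convolutional feature map) followed by a Dense layer, the weights of that Dense layer should be updated to reflect the new dimension ordering.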
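The reordering described in that passage can be illustrated with plain NumPy. This is a sketch of the idea, not the Keras implementation: a small random feature map is flattened in both layouts, and the Dense kernel rows are permuted so that both layouts give the same output. All sizes and names here are made up for the demonstration.

```python
import numpy as np

H, W, C, UNITS = 2, 3, 4, 5
rng = np.random.RandomState(0)

x_last = rng.rand(H, W, C)                  # feature map, channels_last
x_first = np.transpose(x_last, (2, 0, 1))   # same values, channels_first

# Dense kernel whose rows follow the channels_last flatten order.
kernel_last = rng.rand(H * W * C, UNITS)

# Permute the rows so they follow the channels_first flatten order.
kernel_first = (kernel_last.reshape(H, W, C, UNITS)
                .transpose(2, 0, 1, 3)
                .reshape(C * H * W, UNITS))

out_last = x_last.reshape(-1).dot(kernel_last)
out_first = x_first.reshape(-1).dot(kernel_first)
out_naive = x_first.reshape(-1).dot(kernel_last)   # kernel left unconverted

print(np.allclose(out_last, out_first))   # True: converted kernel matches
print(np.allclose(out_last, out_naive))   # False: unconverted kernel does not
```

Only the first Dense layer after Flatten needs this treatment, which is why the source touches fc1 and nothing else.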



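III. Walkthrough of the Keras Sequential VGG16 source: the Sequential style

This section is excerpted from the Keras Chinese documentation article 《CNN眼中的世界:利用Keras解釋CNN的濾波器》 (The world as CNNs see it: interpreting CNN filters with Keras).

Pretrained weights for the VGG16 and VGG19 models:
Outside China: https://gist.github.com/baraldilorenzo/07d7802847aaad0a35d3
Within China: http://files.heuritech.com/weights/vgg16_weights.h5

The previous section showed the VGG16 architecture as a functional model. The example in the official documentation also includes a Sequential version of the VGG16 architecture, so it is worth comparing the two.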

1. VGG16 as a Sequential model: network structure


First, we define the structure of the VGG network in Keras:

    from keras.models import Sequential
    from keras.layers import Convolution2D, ZeroPadding2D, MaxPooling2D

    img_width, img_height = 128, 128

    # build the VGG16 network
    model = Sequential()
    model.add(ZeroPadding2D((1, 1), batch_input_shape=(1, 3, img_width, img_height)))
    first_layer = model.layers[-1]
    # this is a placeholder tensor that will contain our generated images
    input_img = first_layer.input

    # build the rest of the network
    model.add(Convolution2D(64, 3, 3, activation='relu', name='conv1_1'))
    model.add(ZeroPadding2D((1, 1)))
    model.add(Convolution2D(64, 3, 3, activation='relu', name='conv1_2'))
    model.add(MaxPooling2D((2, 2), strides=(2, 2)))

    model.add(ZeroPadding2D((1, 1)))
    model.add(Convolution2D(128, 3, 3, activation='relu', name='conv2_1'))
    model.add(ZeroPadding2D((1, 1)))
    model.add(Convolution2D(128, 3, 3, activation='relu', name='conv2_2'))
    model.add(MaxPooling2D((2, 2), strides=(2, 2)))

    model.add(ZeroPadding2D((1, 1)))
    model.add(Convolution2D(256, 3, 3, activation='relu', name='conv3_1'))
    model.add(ZeroPadding2D((1, 1)))
    model.add(Convolution2D(256, 3, 3, activation='relu', name='conv3_2'))
    model.add(ZeroPadding2D((1, 1)))
    model.add(Convolution2D(256, 3, 3, activation='relu', name='conv3_3'))
    model.add(MaxPooling2D((2, 2), strides=(2, 2)))

    model.add(ZeroPadding2D((1, 1)))
    model.add(Convolution2D(512, 3, 3, activation='relu', name='conv4_1'))
    model.add(ZeroPadding2D((1, 1)))
    model.add(Convolution2D(512, 3, 3, activation='relu', name='conv4_2'))
    model.add(ZeroPadding2D((1, 1)))
    model.add(Convolution2D(512, 3, 3, activation='relu', name='conv4_3'))
    model.add(MaxPooling2D((2, 2), strides=(2, 2)))

    model.add(ZeroPadding2D((1, 1)))
    model.add(Convolution2D(512, 3, 3, activation='relu', name='conv5_1'))
    model.add(ZeroPadding2D((1, 1)))
    model.add(Convolution2D(512, 3, 3, activation='relu', name='conv5_2'))
    model.add(ZeroPadding2D((1, 1)))
    model.add(Convolution2D(512, 3, 3, activation='relu', name='conv5_3'))
    model.add(MaxPooling2D((2, 2), strides=(2, 2)))

    # get the symbolic outputs of each "key" layer (we gave them unique names).
    layer_dict = dict([(layer.name, layer) for layer in model.layers])

Judging from the use of Convolution2D, this was written for an earlier version of Keras.
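One visible difference from the functional version is that every 3x3 convolution here is preceded by ZeroPadding2D((1, 1)) instead of using padding='same'. The standard output-size formula shows the two give the same spatial size; the sketch below checks that with plain arithmetic. `conv_output_size` is a hypothetical helper, not a Keras function.

```python
def conv_output_size(n, kernel, pad=0, stride=1):
    """Output length of a convolution or pooling window along one axis."""
    return (n + 2 * pad - kernel) // stride + 1


# 3x3 convolution on a 128-pixel axis.
print(conv_output_size(128, 3, pad=1))   # 128: one pixel of zero padding keeps the size
print(conv_output_size(128, 3, pad=0))   # 126: without padding the map shrinks

# Spatial size after each of the five blocks for the 128x128 input above.
size = 128
sizes = []
for _ in range(5):
    size = conv_output_size(size, 3, pad=1)      # padded 3x3 convs keep the size
    size = conv_output_size(size, 2, stride=2)   # 2x2 pooling, stride 2
    sizes.append(size)
print(sizes)                             # [64, 32, 16, 8, 4]
```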

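2. Loading weights into only some layers of a Sequential model


Next we load the pretrained weights into the model. Normally this can be done with model.load_weights(), but that loads all of the weights, which does not fit this case.
The no_top model seen earlier exists for exactly this situation.
Here we load only part of the parameters, using the set_weights() function, so the loading has to be done by hand:

    import h5py

    weights_path = '.../vgg16_weights.h5'

    f = h5py.File(weights_path, 'r')  # open read-only
    for k in range(f.attrs['nb_layers']):
        if k >= len(model.layers):
            # the file holds more layers than the model defines; stop here
            break
        g = f['layer_{}'.format(k)]
        weights = [g['param_{}'.format(p)] for p in range(g.attrs['nb_params'])]
        model.layers[k].set_weights(weights)
    f.close()
    print('Model loaded.')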

 

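In practice I could not get this to work: for reasons I do not know, the .h5 file I loaded had neither the attribute f.attrs['nb_layers'] nor the attribute g.attrs['nb_params'].
While looking for an answer, I found that others had run into the same problem; see ([keras]貓狗大戰的總結, a summary of the Dogs vs. Cats competition):

    • Q1. What does f.attrs['nb_layers'] mean? I don't see an 'nb_layers' attribute in h5py.
      attrs refers to the attributes of f; the attribute can be seen by right-clicking in an HDF5 viewer.
    • Q2. What does g = f['layer_{}'.format(k)] mean, and what does .format do?
      format performs string formatting: k is filled into the {}, and g is the entry of f with the resulting name.
    • Q3. What does weights = [g['param_{}'.format(p)] for p in range(g.attrs['nb_params'])] mean?
      It collects param_0, param_1, and so on from under the layer.
      set_weights(weights) is used here; the shapes of the weights must match those of the layer, otherwise an error is raised.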
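The three answers above can be traced with a toy example in which a nested dict stands in for the HDF5 file. Everything here is made up for illustration: the dict layout mimics the 'layer_k' / 'param_p' naming the loading code expects, and `collect_weights` is a hypothetical function, not part of h5py or Keras. As far as I know, weight files written by Keras's own save_weights record attributes named layer_names and weight_names rather than nb_layers and nb_params, which may be why those attributes were missing; treat that as an assumption to check against your own file.

```python
# A nested dict standing in for the HDF5 file (toy data, not real weights).
fake_file = {
    'attrs': {'nb_layers': 3},
    'layer_0': {'attrs': {'nb_params': 0}},   # e.g. a padding layer, no weights
    'layer_1': {'attrs': {'nb_params': 2},
                'param_0': [[1, 2], [3, 4]], 'param_1': [0, 0]},
    'layer_2': {'attrs': {'nb_params': 2},
                'param_0': [[5]], 'param_1': [1]},
}


def collect_weights(f, max_layers):
    """Hypothetical sketch of the loading loop: gather the params of the
    first `max_layers` layers, keyed by layer index."""
    loaded = {}
    for k in range(f['attrs']['nb_layers']):
        if k >= max_layers:
            break                             # model has fewer layers than the file
        g = f['layer_{}'.format(k)]           # 'layer_0', 'layer_1', ...
        loaded[k] = [g['param_{}'.format(p)]  # 'param_0', 'param_1', ...
                     for p in range(g['attrs']['nb_params'])]
    return loaded


weights = collect_weights(fake_file, max_layers=2)
print('layer_{}'.format(1))   # layer_1
print(weights)                # {0: [], 1: [[[1, 2], [3, 4]], [0, 0]]}
```

With a real file, each list in the result is what would be handed to the matching layer's set_weights.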

 

