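Original article: http://www.one2know.cn/keras3/
Five pretrained models in Application + a brief look at h5py
- Keras's Applications module provides Keras models with pretrained weights. These models can be used for prediction, feature extraction, and fine-tuning.
The parameters of the following models are described later in this post: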
Xception
VGG16
VGG19
ResNet50
InceptionV3
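All of these models (except Xception) are compatible with both Theano and TensorFlow, and are built automatically according to the image data format set in ~/.keras/keras.json. For example, if you set image_data_format="channels_last", the loaded model is constructed with TensorFlow's dimension ordering, that is, "Height-Width-Depth".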
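To make the effect of this setting concrete, here is a minimal sketch that parses a keras.json-style configuration and derives the matching per-image input shape. The JSON text is inlined so the sketch does not depend on a local Keras installation, and `input_shape_for` is a hypothetical helper, not a Keras function.

```python
import json

# Contents of a typical ~/.keras/keras.json, inlined here as an assumption
# so that the sketch runs without Keras being installed.
KERAS_JSON = """
{
    "image_data_format": "channels_last",
    "epsilon": 1e-07,
    "floatx": "float32",
    "backend": "tensorflow"
}
"""

def input_shape_for(height, width, channels, data_format):
    """Return the per-image input shape for a data format (hypothetical helper)."""
    if data_format == "channels_last":
        return (height, width, channels)
    if data_format == "channels_first":
        return (channels, height, width)
    raise ValueError("unknown data format: %r" % (data_format,))

config = json.loads(KERAS_JSON)
shape = input_shape_for(224, 224, 3, config["image_data_format"])
print(config["backend"], shape)
```

With "channels_first" in the configuration, the same helper would return (3, 224, 224) instead.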
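Official download location for the models:
https://github.com/fchollet/deep-learning-models/releases
- Differences between th and tf
Keras provides two backends, Theano and TensorFlow.
Most of the functionality of th and tf is wrapped uniformly by the backend module, but the two still conflict in significant ways, so you sometimes need to pay attention to which backend Keras is running on. The main differences are:
dim_ordering, that is, the dimension ordering. For a 224×224 color image, Theano's ordering is (3, 224, 224), with the channel dimension first, while TensorFlow's ordering is (224, 224, 3), with the channel dimension last.
Data format naming: "channels_last" corresponds to the former "tf", and "channels_first" corresponds to the former "th". For a 128x128 RGB image, "channels_first" organizes the data as (3, 128, 128), while "channels_last" organizes it as (128, 128, 3).
- notop models
These differ in whether the final 3 fully connected layers are included. Models without them are released specifically for fine-tuning.
- A brief look at h5py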
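Keras's pretrained models are stored in HDF5 format (read with h5py) and use the .h5 file extension.
An h5py.File behaves like a Python dictionary, so we can list all of its keys.
Input:
import h5py
file = h5py.File('.../notop.h5', 'r')
List the keys:
print(list(file.keys()))
print(list(file.attrs.keys()))  # file-level attributes, e.g. 'nb_layers' in older weight files
See what the file contains for each layer:
for name in file:
    print(name)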
- Official example: ImageNet classification with ResNet50
Identifying the elephant species:
from keras.applications.resnet50 import ResNet50
from keras.preprocessing import image
from keras.applications.resnet50 import preprocess_input,decode_predictions
import numpy as np
model = ResNet50(weights=r'..\Model\resnet50_weights_tf_dim_ordering_tf_kernels.h5')
img_path = 'elephant.jpg'
img = image.load_img(img_path,target_size=(224,224))
# The pretrained model expects input of shape (224, 224, 3)
x = image.img_to_array(img)
x = np.expand_dims(x,axis=0)
x = preprocess_input(x)
preds = model.predict(x)
print('Predicted:',decode_predictions(preds,top=3)[0])
Output:
Predicted: [('n02504458', 'African_elephant', 0.603124), ('n02504013', 'Indian_elephant', 0.334439), ('n01871265', 'tusker', 0.062180385)]
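The `top=3` argument above keeps only the three most probable classes. The following is a minimal sketch of that selection step, not the Keras implementation: `top_k` is a hypothetical helper, and the four labels and probabilities are toy stand-ins for the 1000-way softmax output and the ImageNet label table.

```python
def top_k(probabilities, labels, k=5):
    """Return the k (label, probability) pairs with the highest probability."""
    ranked = sorted(zip(labels, probabilities),
                    key=lambda pair: pair[1], reverse=True)
    return ranked[:k]

# Toy stand-ins for the model output and the class label table.
labels = ["tusker", "African_elephant", "hay", "Indian_elephant"]
probs = [0.06, 0.60, 0.01, 0.33]

print(top_k(probs, labels, k=3))
```

The real decode_predictions additionally returns the WordNet class ID (such as 'n02504458') as the first element of each tuple.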
- The five models
1. Xception: can only be used with the TensorFlow backend, and currently supports only the channels_last dimension ordering (height, width, channels)
Default input image size: 299x299
keras.applications.xception.Xception(include_top=True,weights='imagenet',input_tensor=None, input_shape=None,pooling=None, classes=1000)
2. VGG16: works with both the Theano and TensorFlow backends, and accepts both the channels_first and channels_last input dimension orderings
Default input image size: 224x224
keras.applications.vgg16.VGG16(include_top=True, weights='imagenet',input_tensor=None, input_shape=None,pooling=None,classes=1000)
3. VGG19
Works with both the Theano and TensorFlow backends, and accepts both the channels_first and channels_last input dimension orderings
Default input image size: 224x224
keras.applications.vgg19.VGG19(include_top=True, weights='imagenet', input_tensor=None, input_shape=None,pooling=None,classes=1000)
4. ResNet50
Works with both the Theano and TensorFlow backends, and accepts both the channels_first and channels_last input dimension orderings
Default input image size: 224x224
keras.applications.resnet50.ResNet50(include_top=True,weights='imagenet',input_tensor=None, input_shape=None,pooling=None,classes=1000)
5. InceptionV3
Works with both the Theano and TensorFlow backends, and accepts both the channels_first and channels_last input dimension orderings
Default input image size: 299x299
keras.applications.inception_v3.InceptionV3(include_top=True,weights='imagenet',input_tensor=None,input_shape=None,pooling=None,classes=1000)
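The default sizes and data format restrictions listed above can be collected into a small lookup, which is handy when choosing `target_size` for image loading. This is only an illustrative sketch: `DEFAULT_INPUT_SIZE` and `default_input_shape` are hypothetical names, not part of Keras.

```python
# Default input sizes taken from the list above; the names are hypothetical.
DEFAULT_INPUT_SIZE = {
    "Xception": 299,
    "VGG16": 224,
    "VGG19": 224,
    "ResNet50": 224,
    "InceptionV3": 299,
}

def default_input_shape(model_name, data_format="channels_last"):
    """Return the default per-image input shape for one of the five models."""
    size = DEFAULT_INPUT_SIZE[model_name]
    if data_format == "channels_last":
        return (size, size, 3)
    if data_format != "channels_first":
        raise ValueError("unknown data format: %r" % (data_format,))
    if model_name == "Xception":
        # Per the list above, Xception supports only channels_last.
        raise ValueError("Xception supports only channels_last")
    return (3, size, size)

print(default_input_shape("VGG16"))
print(default_input_shape("InceptionV3", "channels_first"))
```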
Reading the keras-applications VGG16 source: the functional API version
- VGG16's default input data format should be channels_last
from __future__ import print_function
import numpy as np
import warnings
from keras.models import Model
from keras.layers import Flatten,Dense,Input,Conv2D
from keras.layers import MaxPooling2D,GlobalMaxPooling2D,GlobalAveragePooling2D
from keras.preprocessing import image
from keras.utils import layer_utils
from keras.utils.data_utils import get_file
from keras import backend as K
from keras.applications.imagenet_utils import decode_predictions
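# decode_predictions returns the 5 most probable classes as (class ID, class name, probability) tuples: decode_predictions(y_pred)
from keras.applications.imagenet_utils import preprocess_input
# preprocess_input converts the image to the encoding the model expects, for example RGB versus BGR channel order: preprocess_input(x)
from keras_applications.imagenet_utils import _obtain_input_shape
# _obtain_input_shape computes and validates the model's input shape; it does not itself read an image into an array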
from keras.engine.topology import get_source_inputs
WEIGHTS_PATH = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_tf_dim_ordering_tf_kernels.h5'
WEIGHTS_PATH_NO_TOP = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5'
def VGG16(include_top=True, weights='imagenet',
          input_tensor=None, input_shape=None,
          pooling=None,
          classes=1000):
    # Check that the weights and classes settings are valid
    if weights not in {'imagenet', None}:
        raise ValueError('The `weights` argument should be either '
                         '`None` (random initialization) or `imagenet` '
                         '(pre-training on ImageNet).')
    if weights == 'imagenet' and include_top and classes != 1000:
        raise ValueError('If using `weights` as imagenet with `include_top`'
                         ' as true, `classes` should be 1000')
    # Determine proper input shape (the image size settings, loosely comparable to transform in Caffe)
    input_shape = _obtain_input_shape(input_shape,
                                      default_size=224,
                                      min_size=48,
                                      # smallest height and width the model accepts
                                      data_format=K.image_data_format(),
                                      # data format in use
                                      require_flatten=include_top)
                                      # whether a Flatten layer connects to the classifier
    # Build the input tensor
    if input_tensor is None:
        img_input = Input(shape=input_shape)
        # Input creates a Keras tensor that serves as the model's entry point
    else:
        if not K.is_keras_tensor(input_tensor):
            img_input = Input(tensor=input_tensor, shape=input_shape)
        else:
            img_input = input_tensor
        # When a tensor is passed in, two steps are needed:
        # first check whether it is a Keras tensor with is_keras_tensor,
        # then call get_source_inputs(input_tensor) (done further below)
    # Define the network architecture (the counterpart of a Caffe prototxt)
    # Block 1
    x = Conv2D(64, (3, 3), activation='relu', padding='same', name='block1_conv1')(img_input)
    x = Conv2D(64, (3, 3), activation='relu', padding='same', name='block1_conv2')(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block1_pool')(x)
    # Block 2
    x = Conv2D(128, (3, 3), activation='relu', padding='same', name='block2_conv1')(x)
    x = Conv2D(128, (3, 3), activation='relu', padding='same', name='block2_conv2')(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block2_pool')(x)
    # Block 3
    x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv1')(x)
    x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv2')(x)
    x = Conv2D(256, (3, 3), activation='relu', padding='same', name='block3_conv3')(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block3_pool')(x)
    # Block 4
    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv1')(x)
    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv2')(x)
    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block4_conv3')(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block4_pool')(x)
    # Block 5
    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv1')(x)
    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv2')(x)
    x = Conv2D(512, (3, 3), activation='relu', padding='same', name='block5_conv3')(x)
    x = MaxPooling2D((2, 2), strides=(2, 2), name='block5_pool')(x)
    if include_top:
        # Classification block
        x = Flatten(name='flatten')(x)
        x = Dense(4096, activation='relu', name='fc1')(x)
        x = Dense(4096, activation='relu', name='fc2')(x)
        x = Dense(classes, activation='softmax', name='predictions')(x)
    else:
        if pooling == 'avg':
            x = GlobalAveragePooling2D()(x)
        elif pooling == 'max':
            x = GlobalMaxPooling2D()(x)
    # Resolve the model inputs.
    # Ensure that the model takes into account
    # any potential predecessors of `input_tensor`.
    if input_tensor is not None:
        inputs = get_source_inputs(input_tensor)
        # get_source_inputs returns the list of input tensors needed to compute input_tensor
    else:
        inputs = img_input
    # Create model.
    model = Model(inputs, x, name='vgg16')
    # Load weights.
    if weights == 'imagenet':
        if include_top:
            weights_path = get_file('vgg16_weights_tf_dim_ordering_tf_kernels.h5',
                                    WEIGHTS_PATH,
                                    cache_subdir='models')
        else:
            weights_path = get_file('vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5',
                                    WEIGHTS_PATH_NO_TOP,
                                    cache_subdir='models')
        model.load_weights(weights_path)
        if K.backend() == 'theano':
            layer_utils.convert_all_kernels_in_model(model)
        if K.image_data_format() == 'channels_first':
            if include_top:
                maxpool = model.get_layer(name='block5_pool')
                shape = maxpool.output_shape[1:]
                dense = model.get_layer(name='fc1')
                layer_utils.convert_dense_weights_data_format(dense, shape, 'channels_first')
            if K.backend() == 'tensorflow':
                warnings.warn('You are using the TensorFlow backend, yet you '
                              'are using the Theano '
                              'image data format convention '
                              '(`image_data_format="channels_first"`). '
                              'For best performance, set '
                              '`image_data_format="channels_last"` in '
                              'your Keras config '
                              'at ~/.keras/keras.json.')
    return model
if __name__ == '__main__':
    model = VGG16(include_top=True, weights='imagenet')
    img_path = 'elephant.jpg'
    img = image.load_img(img_path, target_size=(224, 224))
    x = image.img_to_array(img)
    x = np.expand_dims(x, axis=0)
    x = preprocess_input(x)
    print('Input image shape:', x.shape)
    preds = model.predict(x)
    print('Predicted:', decode_predictions(preds))
    # decode_predictions returns the 5 most probable classes as (class ID, class name, probability)
Output:
Input image shape: (1, 224, 224, 3)
Predicted: [[('n02504458', 'African_elephant', 0.62728244), ('n02504013', 'Indian_elephant', 0.19092941), ('n01871265', 'tusker', 0.18166111), ('n02437312', 'Arabian_camel', 4.5080957e-05), ('n07802026', 'hay', 1.7709652e-05)]]
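The reported input shape (1, 224, 224, 3) comes from adding a batch axis and then preprocessing the image. The sketch below approximates those two steps with NumPy alone, assuming the "caffe"-style preprocessing used for VGG16 (RGB to BGR, then subtracting the per-channel ImageNet mean). `preprocess_sketch` is a hypothetical stand-in for preprocess_input, and a random array replaces the real elephant image.

```python
import numpy as np

# Per-channel ImageNet means in BGR order, as used by "caffe"-style preprocessing.
IMAGENET_MEAN_BGR = np.array([103.939, 116.779, 123.68], dtype=np.float32)

def preprocess_sketch(batch_rgb):
    """Approximate preprocess_input for a channels_last batch of shape (N, H, W, 3)."""
    batch_bgr = batch_rgb[..., ::-1].astype(np.float32)  # RGB -> BGR, as a new array
    return batch_bgr - IMAGENET_MEAN_BGR                 # zero-center each channel

# Stand-in for image.img_to_array(img): a random 224x224 RGB image.
img = np.random.RandomState(0).randint(0, 256, size=(224, 224, 3)).astype(np.float32)

x = np.expand_dims(img, axis=0)  # add the batch axis: (224, 224, 3) -> (1, 224, 224, 3)
x = preprocess_sketch(x)
print('Input image shape:', x.shape)
```

Note that the first output channel holds the original blue channel minus its mean, which is why feeding unprocessed RGB images to these weights degrades predictions.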
- Download the model to local disk and modify the download code
Comment out the following two lines:
WEIGHTS_PATH = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_tf_dim_ordering_tf_kernels.h5'
WEIGHTS_PATH_NO_TOP = 'https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5'
Modify the following two lines so that weights_path points to the local file:
weights_path = get_file('vgg16_weights_tf_dim_ordering_tf_kernels.h5',WEIGHTS_PATH,cache_subdir='models')
weights_path = get_file('vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5',WEIGHTS_PATH_NO_TOP,cache_subdir='models')
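The post does not show the modified lines. One plausible reading is that `weights_path` is set to the location of the manually downloaded file instead of calling get_file. The sketch below illustrates that idea under this assumption; `resolve_weights_path` is a hypothetical helper, and a temporary directory with an empty placeholder file stands in for the real local model folder.

```python
import os
import tempfile

def resolve_weights_path(local_dir, filename):
    """Return the path of a locally stored weights file, failing early if it is missing."""
    path = os.path.join(local_dir, filename)
    if not os.path.isfile(path):
        raise FileNotFoundError("weights file not found: " + path)
    return path

# Demonstration: a temporary directory stands in for the local model folder.
with tempfile.TemporaryDirectory() as model_dir:
    name = 'vgg16_weights_tf_dim_ordering_tf_kernels.h5'
    open(os.path.join(model_dir, name), 'wb').close()  # empty placeholder file
    weights_path = resolve_weights_path(model_dir, name)
    found = os.path.basename(weights_path)

print(found)
```

Inside VGG16(), the resulting path would then be passed to model.load_weights(weights_path) exactly as before.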
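- A few utility functions worth noting
from keras.applications.imagenet_utils import decode_predictions
decode_predictions returns the 5 most probable classes as (class ID, class name, probability) tuples: decode_predictions(y_pred)
from keras.applications.imagenet_utils import preprocess_input
preprocess_input converts the image to the encoding the model expects, for example RGB versus BGR channel order: preprocess_input(x)
from keras.applications.imagenet_utils import _obtain_input_shape
_obtain_input_shape computes and validates the model's input shape; it does not itself read an image into an array
(1) decode_predictions is applied to the final output and is quite convenient: print('Predicted:', decode_predictions(preds));
(2) preprocess_input changes the image encoding: preprocess_input(x);
(3) _obtain_input_shape
Loosely comparable to the transform settings in Caffe: it fixes the input shape that an image must be preprocessed to before prediction.
input_shape = _obtain_input_shape(input_shape,default_size=224,min_size=48,data_format=K.image_data_format(),require_flatten=include_top)
min_size=48: the smallest height and width the model accepts
data_format=K.image_data_format(): the data format in use
- When include_top=True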
fc_model = VGG16(include_top=True)
notop_model = VGG16(include_top=False)
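When fine-tuning with VGG16, notop_model is the model without the fully connected layers, to which you then add your own layers.
For the complete network, fc_model adds the following layers on top of the convolutional base: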
x = Flatten(name='flatten')(x)
x = Dense(4096, activation='relu', name='fc1')(x)
x = Dense(4096, activation='relu', name='fc2')(x)
x = Dense(classes, activation='softmax', name='predictions')(x)
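The pooling layer is followed by a Flatten layer, which reshapes the data into a vector, then by two Dense layers, and finally by a Dense layer with softmax activation.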
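To see why the first Dense layer dominates the size of the full model, the arithmetic for the classification block can be sketched as follows. It assumes the default 224x224 input and the layer sizes shown in the listing; `vgg16_top_summary` is a hypothetical helper, not part of Keras.

```python
# Number of filters in each of the five VGG16 convolutional blocks.
BLOCK_FILTERS = [64, 128, 256, 512, 512]

def vgg16_top_summary(input_size=224, classes=1000):
    """Return (spatial size after block5_pool, Flatten length, Dense parameter counts)."""
    size = input_size
    for _ in BLOCK_FILTERS:
        size //= 2  # 'same' convolutions keep the size; each 2x2 stride-2 pool halves it
    flat = size * size * BLOCK_FILTERS[-1]  # vector length produced by Flatten
    params, fan_in = [], flat
    for units in [4096, 4096, classes]:
        params.append(fan_in * units + units)  # kernel weights plus biases
        fan_in = units
    return size, flat, params

size, flat, params = vgg16_top_summary()
print(size, flat, params)
# prints: 7 25088 [102764544, 16781312, 4097000]
```

The roughly 124 million parameters in these three layers are what the notop weights file leaves out.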
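- Converting the weights from channels_last to channels_first format
maxpool = model.get_layer(name='block5_pool')
# model.get_layer() returns a layer object by name or index
shape = maxpool.output_shape[1:]
# get the output shape of the block5_pool layer (without the batch dimension)
dense = model.get_layer(name='fc1')
layer_utils.convert_dense_weights_data_format(dense, shape, 'channels_first')
convert_dense_weights_data_format
When porting a convnet's weights from one data format to another, if the convnet contains a Flatten layer (applied to the last convolutional feature map) followed by a Dense layer, the weights of that Dense layer should be updated to reflect the new dimension ordering.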
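The reason the Dense weights must be reordered is that Flatten walks the feature map in a different order under each data format, so each kernel row would otherwise be paired with the wrong activation. The NumPy sketch below illustrates the idea on a tiny feature map. It is not the Keras implementation: `dense_kernel_to_channels_first` is a hypothetical helper, and the shapes are deliberately small.

```python
import numpy as np

def dense_kernel_to_channels_first(kernel, fm_shape_last):
    """Reorder Dense kernel rows from (h, w, c) flatten order to (c, h, w) flatten order."""
    h, w, c = fm_shape_last
    units = kernel.shape[1]
    k = kernel.reshape(h, w, c, units)
    k = np.transpose(k, (2, 0, 1, 3))  # move the channel axis to the front
    return k.reshape(h * w * c, units)

rng = np.random.RandomState(0)
h, w, c, units = 2, 3, 4, 5
fmap_last = rng.rand(h, w, c)             # feature map in channels_last layout
kernel_last = rng.rand(h * w * c, units)  # Dense kernel trained on that layout

y_last = fmap_last.reshape(-1) @ kernel_last

fmap_first = np.transpose(fmap_last, (2, 0, 1))  # same data in channels_first layout
kernel_first = dense_kernel_to_channels_first(kernel_last, (h, w, c))
y_first = fmap_first.reshape(-1) @ kernel_first

print(np.allclose(y_last, y_first))
```

Using the unconverted kernel with the channels_first feature map would generally produce different outputs, which is the situation the conversion in VGG16() guards against.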