The difference between tf.nn.conv2d and tf.contrib.slim.conv2d


Source: http://blog.sina.com.cn/s/blog_6ca0f5eb0102wsuu.html

While reading code, I noticed that some code builds its convolution layers with tf.nn.conv2d, while other code uses tf.contrib.slim.conv2d. Do these two functions compute the same convolution? After checking the API documentation and the source of slim.conv2d, I summarize as follows:

First, the commonly used tf.nn.conv2d, whose definition is:

conv2d(
    input,
    filter,
    strides,
    padding,
    use_cudnn_on_gpu=None,
    data_format=None,
    name=None
)

input is the input image to convolve. It must be a 4-D Tensor of shape [batch_size, in_height, in_width, in_channels], i.e. [number of images per training batch, image height, image width, number of channels], with a floating-point dtype such as float32 or float64.

filter specifies the convolution kernel of the CNN. It must be a Tensor of shape [filter_height, filter_width, in_channels, out_channels], i.e. [kernel height, kernel width, number of input channels, number of kernels], with the same dtype as input. Note that its third dimension, in_channels, corresponds to the fourth dimension of input: the dimensions match, not just the numeric values.

strides is the stride of the convolution along each dimension of the input: a 1-D vector of length 4, one entry per dimension of input.

padding is a string that must be either "SAME" or "VALID" and selects the padding scheme: with SAME the input is padded so the kernel may overhang the image border, with VALID it may not. A more detailed description is at http://blog.csdn.net/mao_xiao_feng/article/details/53444333
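The two padding modes determine the spatial size of the output. As a small illustration in plain Python (my own sketch, not TensorFlow code), the output-size formulas TensorFlow uses are ceil(n/s) for SAME and ceil((n-k+1)/s) for VALID:

```python
import math

def conv_output_size(n, k, s, padding):
    """Spatial output size of a convolution, following TensorFlow's rules.

    n: input size, k: kernel size, s: stride, padding: 'SAME' or 'VALID'.
    """
    if padding == 'SAME':
        # SAME pads the input, so the output size depends only on the stride.
        return math.ceil(n / s)
    elif padding == 'VALID':
        # VALID places the kernel only where it fits entirely inside the input.
        return math.ceil((n - k + 1) / s)
    raise ValueError(padding)

print(conv_output_size(64, 5, 1, 'SAME'))   # 64
print(conv_output_size(64, 5, 1, 'VALID'))  # 60
```

With stride 1, SAME preserves the input size while VALID shrinks it by k-1.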

use_cudnn_on_gpu specifies whether to use cuDNN acceleration; it defaults to true.

data_format specifies the layout of the input; it defaults to NHWC.

The function returns a Tensor; this output is what we usually call the feature map.
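To make "feature map" concrete, here is a minimal NumPy sketch (my own illustration, independent of TensorFlow) of a single-channel VALID convolution producing one feature map:

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Naive single-channel 2-D cross-correlation with VALID padding."""
    ih, iw = image.shape
    kh, kw = kernel.shape
    out = np.zeros((ih - kh + 1, iw - kw + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            # Each output pixel is the sum of an element-wise product
            # between the kernel and one window of the image.
            out[y, x] = np.sum(image[y:y + kh, x:x + kw] * kernel)
    return out

img = np.ones((6, 6))
k = np.ones((3, 3))
fmap = conv2d_valid(img, k)
print(fmap.shape)   # (4, 4)
print(fmap[0, 0])   # 9.0, i.e. the sum of a 3x3 window of ones
```

tf.nn.conv2d does the same thing, vectorized over the batch, the input channels, and out_channels such kernels at once.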

 

As for tf.contrib.slim.conv2d, the underlying function is defined as follows:

convolution(inputs,
            num_outputs,
            kernel_size,
            stride=1,
            padding='SAME',
            data_format=None,
            rate=1,
            activation_fn=nn.relu,
            normalizer_fn=None,
            normalizer_params=None,
            weights_initializer=initializers.xavier_initializer(),
            weights_regularizer=None,
            biases_initializer=init_ops.zeros_initializer(),
            biases_regularizer=None,
            reuse=None,
            variables_collections=None,
            outputs_collections=None,
            trainable=True,
            scope=None):

inputs is, likewise, the input image to convolve.

num_outputs specifies the number of convolution kernels (i.e. the number of filters, which is also the number of output channels).

kernel_size specifies the spatial dimensions of the kernel: [kernel height, kernel width].

stride is the stride of the convolution along each spatial dimension of the image.

padding selects the padding scheme, VALID or SAME.

data_format specifies the layout of the input.

rate has no counterpart in tf.nn.conv2d: it is the dilation rate for atrous (dilated) convolution, which spreads the kernel taps apart to enlarge the receptive field.
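The effect of rate on the receptive field can be sketched in plain Python (my own illustration, not slim code): a kernel of size k dilated by rate covers k + (k-1)*(rate-1) input pixels without adding any weights.

```python
def effective_kernel_size(k, rate):
    """Receptive-field size of a k-tap kernel dilated by `rate`.

    rate=1 is ordinary convolution; rate>1 inserts rate-1 gaps
    between consecutive kernel taps.
    """
    return k + (k - 1) * (rate - 1)

print(effective_kernel_size(3, 1))  # 3  (ordinary 3x3 kernel)
print(effective_kernel_size(3, 2))  # 5  (3 taps spread over 5 pixels)
print(effective_kernel_size(3, 4))  # 9
```

This is why the docstring quoted at the end of this post says rate != 1 is incompatible with stride != 1: the two mechanisms both change how the kernel samples the input.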

activation_fn specifies the activation function; it defaults to ReLU.

normalizer_fn specifies a normalization function (such as batch_norm) to use instead of biases.

normalizer_params specifies the parameters of the normalization function.

weights_initializer specifies the initializer for the weights.

weights_regularizer is an optional regularizer for the weights.

biases_initializer specifies the initializer for the biases.

biases_regularizer is an optional regularizer for the biases.

reuse specifies whether the layer and its variables should be reused.

variables_collections specifies a list of collections for all the variables, or a dictionary with a different list per variable.

outputs_collections specifies the collections to which the outputs are added.

trainable: whether the layer's parameters are trainable.

scope: the variable_scope under which variables are shared.

 

Comparing the two APIs, once the initialization-related arguments are set aside the two functions do the same thing; tf.contrib.slim.conv2d simply exposes more configurable parts (initializers, regularizers, normalization, activation), while tf.nn.conv2d requires you to construct the filter tensor yourself, which is more cumbersome. Dropping the rarely used initialization arguments, the two APIs simplify to:

tf.contrib.slim.conv2d(inputs,
                num_outputs,    # number of kernels
                kernel_size,    # [kernel height, kernel width]
                stride=1,
                padding='SAME',
)

tf.nn.conv2d(
    input,      # same as inputs above
    filter,     # [kernel height, kernel width, input channels, number of kernels]
    strides,
    padding,
)

The two are thus nearly identical, and running the code below confirms that they produce the same result. (Note that slim.conv2d applies ReLU by default and adds zero-initialized biases; here every output value is positive and the biases are zero, so neither changes the result.)

import tensorflow as tf
import tensorflow.contrib.slim as slim

x1 = tf.ones(shape=[1, 64, 64, 3])
# The kernel must have the same floating-point dtype as the input;
# tf.fill([5, 5, 3, 64], 1) would produce an int32 tensor and make
# tf.nn.conv2d fail, so build the all-ones kernel with tf.ones instead.
w = tf.ones([5, 5, 3, 64])

y1 = tf.nn.conv2d(x1, w, strides=[1, 1, 1, 1], padding='SAME')
y2 = slim.conv2d(x1, 64, [5, 5], weights_initializer=tf.ones_initializer, padding='SAME')

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    y1_value, y2_value, x1_value = sess.run([y1, y2, x1])
    print("shapes are", y1_value.shape, y2_value.shape)
    print(y1_value == y2_value)
    print(y1_value)
    print(y2_value)
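Why the two outputs match here can also be checked by hand: with an all-ones 64x64x3 input and an all-ones 5x5 kernel, every interior output pixel sums a full 5x5x3 window of ones, giving 75. A quick NumPy cross-check of that interior value (my own sketch, independent of TensorFlow):

```python
import numpy as np

kh = kw = 5          # kernel height/width
in_ch = 3            # input channels
image = np.ones((64, 64, in_ch))
kernel = np.ones((kh, kw, in_ch))

# One interior output pixel: the sum over a full 5x5x3 window of ones.
center = np.sum(image[30:30 + kh, 30:30 + kw, :] * kernel)
print(center)  # 75.0
```

Border pixels are smaller under SAME padding, because part of the window falls on zero padding.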

 

Finally, here is the English API docstring of tf.contrib.slim.conv2d:

def convolution(inputs,
                num_outputs,
                kernel_size,
                stride=1,
                padding='SAME',
                data_format=None,
                rate=1,
                activation_fn=nn.relu,
                normalizer_fn=None,
                normalizer_params=None,
                weights_initializer=initializers.xavier_initializer(),
                weights_regularizer=None,
                biases_initializer=init_ops.zeros_initializer(),
                biases_regularizer=None,
                reuse=None,
                variables_collections=None,
                outputs_collections=None,
                trainable=True,
                scope=None):
  """Adds an N-D convolution followed by an optional batch_norm layer.

  It is required that 1 <= N <= 3.

  `convolution` creates a variable called `weights`, representing the
  convolutional kernel, that is convolved (actually cross-correlated) with the
  `inputs` to produce a `Tensor` of activations. If a `normalizer_fn` is
  provided (such as `batch_norm`), it is then applied. Otherwise, if
  `normalizer_fn` is None and a `biases_initializer` is provided then a `biases`
  variable would be created and added the activations. Finally, if
  `activation_fn` is not `None`, it is applied to the activations as well.

  Performs atrous convolution with input stride/dilation rate equal to `rate`
  if a value > 1 for any dimension of `rate` is specified.  In this case
  `stride` values != 1 are not supported.

  Args:
    inputs: A Tensor of rank N+2 of shape
      `[batch_size] + input_spatial_shape + [in_channels]` if data_format does
      not start with "NC" (default), or
      `[batch_size, in_channels] + input_spatial_shape` if data_format starts
      with "NC".
    num_outputs: Integer, the number of output filters.
    kernel_size: A sequence of N positive integers specifying the spatial
      dimensions of the filters.  Can be a single integer to specify the same
      value for all spatial dimensions.
    stride: A sequence of N positive integers specifying the stride at which to
      compute output.  Can be a single integer to specify the same value for all
      spatial dimensions.  Specifying any `stride` value != 1 is incompatible
      with specifying any `rate` value != 1.
    padding: One of `"VALID"` or `"SAME"`.
    data_format: A string or None.  Specifies whether the channel dimension of
      the `input` and output is the last dimension (default, or if `data_format`
      does not start with "NC"), or the second dimension (if `data_format`
      starts with "NC").  For N=1, the valid values are "NWC" (default) and
      "NCW".  For N=2, the valid values are "NHWC" (default) and "NCHW".
      For N=3, the valid values are "NDHWC" (default) and "NCDHW".
    rate: A sequence of N positive integers specifying the dilation rate to use
      for atrous convolution.  Can be a single integer to specify the same
      value for all spatial dimensions.  Specifying any `rate` value != 1 is
      incompatible with specifying any `stride` value != 1.
    activation_fn: Activation function. The default value is a ReLU function.
      Explicitly set it to None to skip it and maintain a linear activation.
    normalizer_fn: Normalization function to use instead of `biases`. If
      `normalizer_fn` is provided then `biases_initializer` and
      `biases_regularizer` are ignored and `biases` are not created nor added.
      default set to None for no normalizer function
    normalizer_params: Normalization function parameters.
    weights_initializer: An initializer for the weights.
    weights_regularizer: Optional regularizer for the weights.
    biases_initializer: An initializer for the biases. If None skip biases.
    biases_regularizer: Optional regularizer for the biases.
    reuse: Whether or not the layer and its variables should be reused. To be
      able to reuse the layer scope must be given.
    variables_collections: Optional list of collections for all the variables or
      a dictionary containing a different list of collection per variable.
    outputs_collections: Collection to add the outputs.
    trainable: If `True` also add variables to the graph collection
      `GraphKeys.TRAINABLE_VARIABLES` (see tf.Variable).
    scope: Optional scope for `variable_scope`.

  Returns:
    A tensor representing the output of the operation.

  Raises:
    ValueError: If `data_format` is invalid.
    ValueError: Both 'rate' and `stride` are not uniformly 1.
  """

