At first this part confused me, so I wrote some code to compare the two modes and distilled the intuition below.
If you want a precise explanation, see http://blog.csdn.net/fireflychh/article/details/73743849, but I find the intuitive rules more useful:

Intuitive rules:
- One thing is certain: whether padding is 'SAME' or 'VALID', it behaves identically in conv2d and max_pool;
- With padding = 'SAME', the output is not necessarily the same size as the input, but it is guaranteed to cover every input pixel; no border elements are discarded;
- With padding = 'VALID', the output is always smaller than the input, and it may not cover every input element (i.e., some border elements may be discarded); the size formulas are sketched right after this list.
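As a quick cross-check of these rules (my own minimal sketch, not taken from the linked post), TensorFlow documents the per-dimension output size as ceil(n / s) for 'SAME' and ceil((n - f + 1) / s) for 'VALID', where n is the input size, f the window size, and s the stride; the helper name `out_size` is mine:

```python
def out_size(n, f, s, padding):
    """Output size along one spatial dimension, per TensorFlow's documented rules."""
    if padding == 'SAME':
        # ceil(n / s): the input is zero-padded so every input element is covered
        return (n + s - 1) // s
    # 'VALID': ceil((n - f + 1) / s): the window must fit entirely inside the input
    return (n - f + s) // s

# Reproduces the shapes annotated in the script below (n = 13, f = 6, s = 5):
print(out_size(13, 6, 5, 'VALID'))  # 2 -> output (1, 2, 2, 7)
print(out_size(13, 6, 5, 'SAME'))   # 3 -> output (1, 3, 3, 7)
```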
```python
# -*- coding: utf-8 -*-
import tensorflow as tf


def pooling_show():
    a = tf.Variable(tf.random_normal(X))
    pooling = tf.nn.max_pool(a, pooling_filter, pooling_strides, padding=pad)
    # With the commented-out config below (X = [1, 13, 13, 7], f = 6, s = 5):
    #   VALID -> (1, 2, 2, 7)
    #   SAME  -> (1, 3, 3, 7)
    init = tf.global_variables_initializer()
    with tf.Session() as sess:
        sess.run(init)
        print('image: ')
        image = sess.run(a)
        print(image.shape)
        print('pooling result: ')
        res = sess.run(pooling)
        print(res.shape)


def conv2d_padding_show():
    # input ---> [m, height, width, channel]
    input = tf.Variable(tf.random_normal(X))
    # filter ---> [height, width, prev_channel, output_channel]
    filter = tf.Variable(tf.random_normal(conv2d_filter))
    op = tf.nn.conv2d(input, filter, strides=conv2d_strides, padding=pad)
    # With the commented-out config below (X = [1, 13, 13, 7], 6x6 filter, s = 5):
    #   VALID -> (1, 2, 2, 7)
    #   SAME  -> (1, 3, 3, 7)
    init = tf.global_variables_initializer()
    with tf.Session() as sess:
        sess.run(init)
        print('image: ')
        image = sess.run(input)
        print(image.shape)
        print('result: ')
        res = sess.run(op)
        print(res.shape)


pad = 'VALID'

# X ---> [m, height, width, channel]
# X = [1, 13, 13, 7]
X = [1, 8, 8, 3]
# pooling window ---> [1, f, f, 1]
# pooling_filter = [1, 6, 6, 1]
pooling_filter = [1, 2, 2, 1]
# pooling strides ---> [1, s, s, 1]
# pooling_strides = [1, 5, 5, 1]
pooling_strides = [1, 2, 2, 1]
# conv filter ---> [height, width, prev_channel, output_channel]
# conv2d_filter = [6, 6, 7, 7]
conv2d_filter = [2, 2, 3, 3]
# conv strides ---> [1, s, s, 1]
# conv2d_strides = [1, 5, 5, 1]
conv2d_strides = [1, 2, 2, 1]

# Play with the values of X, the filters, and the strides; combined with the
# intuitive rules above, this gives a much better feel for the two modes.
conv2d_padding_show()
pooling_show()
```
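Note that with the active values (n = 8, f = 2, s = 2), 'VALID' and 'SAME' happen to produce the same output shape, (1, 4, 4, 3), because the 2x2 window with stride 2 tiles the 8x8 input exactly; switch to the commented-out values (n = 13, f = 6, s = 5) to see them diverge into (1, 2, 2, 7) vs. (1, 3, 3, 7).

The script targets the TensorFlow 1.x API (tf.Session, tf.global_variables_initializer). If you are on TensorFlow 2.x, it should still run through the v1 compatibility shim by replacing the import at the top as below; this is my own compatibility note, not part of the original experiment:

```python
import tensorflow.compat.v1 as tf  # TF 2.x: route through the v1 compatibility layer
tf.disable_v2_behavior()           # restore graph mode so tf.Session() works
```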