Convolutional Neural Network (CNN): Principles and Implementation

Reposted article. 2015-10-11 09:24. Category: deep learning.

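This post works through some basic uses of the convolutional neural network (CNN), one application of deep learning. It follows LeCun's Document 0.1, extends parts of it, and shows the results (in Python).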

It has four parts:

1. Convolution

2. Pooling (downsampling)

3. CNN structure

4. Running the experiment

Each is covered below.

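PS: This post is reference material for the ese machine learning short course (the 20140516 session). It only sketches the most naive, simplest version of the ideas and focuses on practice; the theory is covered in detail in class.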

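1. Convolution

Much like Gaussian filtering, every image in the image batch is convolved. For a single image, one filter combines all of that image's input feature maps into a single output feature map. The code below processes an image batch of two images: each image starts with 3 feature maps (R, G, B) and is convolved with two 9*9 filters, so each image ends up with two feature maps.

The convolution itself is done by theano's conv.conv2d, and here the parameters W and b are random. The result looks a bit like an edge detector, doesn't it?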
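
To make the shape bookkeeping concrete, here is a minimal pure-Python sketch of a "valid" multi-channel convolution. It is not how theano implements conv.conv2d. The function name conv2d_valid and the all-ones toy inputs are made up for illustration, and the kernel flip assumes that conv2d computes a true convolution rather than a cross-correlation.

```python
def conv2d_valid(image, filters):
    """Hypothetical helper, for illustration only.

    image:   [channels][height][width]
    filters: [n_filters][channels][kh][kw]
    returns: [n_filters][height - kh + 1][width - kw + 1]
    """
    channels = len(image)
    height, width = len(image[0]), len(image[0][0])
    kh, kw = len(filters[0][0]), len(filters[0][0][0])
    out_h, out_w = height - kh + 1, width - kw + 1
    output = []
    for filt in filters:
        fmap = [[0.0] * out_w for _ in range(out_h)]
        for i in range(out_h):
            for j in range(out_w):
                total = 0.0
                # one filter sums over ALL input channels -> one output map
                for c in range(channels):
                    for u in range(kh):
                        for v in range(kw):
                            # flipped kernel indices: true convolution (assumption)
                            total += image[c][i + u][j + v] * filt[c][kh - 1 - u][kw - 1 - v]
                fmap[i][j] = total
        output.append(fmap)
    return output


# toy input: one RGB image of 12 rows x 10 columns, two 9*9 filters, all ones
image = [[[1.0] * 10 for _ in range(12)] for _ in range(3)]
filters = [[[[1.0] * 9 for _ in range(9)] for _ in range(3)] for _ in range(2)]
feature_maps = conv2d_valid(image, filters)
print(len(feature_maps), len(feature_maps[0]), len(feature_maps[0][0]))
```

Three input feature maps go in and two come out, one per filter, and each side shrinks by 9 - 1 = 8 pixels.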

Code (see the comments for details):


# -*- coding: utf-8 -*-
"""
Created on Sat May 10 18:55:26 2014
@author: rachel
Function: convolution operation on two pictures of the same size (width, height)
input: 3 feature maps (3 channels <RGB> of a picture)
convolution: two 9*9 convolutional filters
"""
from theano.tensor.nnet import conv
import theano.tensor as T
import numpy, theano
rng = numpy.random.RandomState(23455)
# symbol variable
input = T.tensor4(name = 'input')
# initial weights
w_shape = (2,3,9,9) #2 convolutional filters, 3 channels, filter shape: 9*9
w_bound = numpy.sqrt(3*9*9)
W = theano.shared(numpy.asarray(rng.uniform(low = -1.0/w_bound, high = 1.0/w_bound,size = w_shape),
dtype = input.dtype),name = 'W')
b_shape = (2,)
b = theano.shared(numpy.asarray(rng.uniform(low = -.5, high = .5, size = b_shape),
dtype = input.dtype),name = 'b')
conv_out = conv.conv2d(input,W)
#T.TensorVariable.dimshuffle() can reshape or broadcast (add dimension)
#dimshuffle(self,*pattern)
# >>>b1 = b.dimshuffle('x',0,'x','x')
# >>>b1.shape.eval()
# array([1,2,1,1])
output = T.nnet.sigmoid(conv_out + b.dimshuffle('x',0,'x','x'))
f = theano.function([input],output)
# demo
import pylab
from PIL import Image
#minibatch_img = T.tensor4(name = 'minibatch_img')
#-------------img1---------------
img1 = Image.open('//home//rachel//Documents//ZJU_Projects//DL//Dataset//rachel.jpg') # pass the path so PIL opens the file in binary mode
width1,height1 = img1.size
img1 = numpy.asarray(img1, dtype = 'float32')/256. # (height, width, 3)
# put image in 4D tensor of shape (1,3,height,width)
img1_rgb = img1.swapaxes(0,2).swapaxes(1,2).reshape(1,3,height1,width1) #(height,width,3) -> (3,height,width) -> (1,3,height,width)
#-------------img2---------------
img2 = Image.open('//home//rachel//Documents//ZJU_Projects//DL//Dataset//rachel1.jpg')
width2,height2 = img2.size
img2 = numpy.asarray(img2,dtype = 'float32')/256.
img2_rgb = img2.swapaxes(0,2).swapaxes(1,2).reshape(1,3,height2,width2) #(1,3,height,width)
#minibatch_img = T.join(0,img1_rgb,img2_rgb)
minibatch_img = numpy.concatenate((img1_rgb,img2_rgb),axis = 0)
filtered_img = f(minibatch_img)
# plot the original images and the two convolved results for each
pylab.subplot(2,3,1);pylab.axis('off');
pylab.imshow(img1)
pylab.subplot(2,3,4);pylab.axis('off');
pylab.imshow(img2)
pylab.gray()
pylab.subplot(2,3,2); pylab.axis("off")
pylab.imshow(filtered_img[0,0,:,:]) #0:minibatch_index; 0:1st filter
pylab.subplot(2,3,3); pylab.axis("off")
pylab.imshow(filtered_img[0,1,:,:]) #0:minibatch_index; 1:2nd filter
pylab.subplot(2,3,5); pylab.axis("off")
pylab.imshow(filtered_img[1,0,:,:]) #1:minibatch_index; 0:1st filter
pylab.subplot(2,3,6); pylab.axis("off")
pylab.imshow(filtered_img[1,1,:,:]) #1:minibatch_index; 1:2nd filter
pylab.show()
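
The b.dimshuffle('x',0,'x','x') step in the code above gives the bias shape (1,2,1,1), so that each filter's single bias value is broadcast over the whole batch and over every pixel of that filter's feature map. A rough pure-Python sketch of the same arithmetic follows; add_bias_and_sigmoid is a made-up name and the nested-list layout only mimics the 4D tensor.

```python
import math


def add_bias_and_sigmoid(conv_out, bias):
    """Hypothetical helper: conv_out is [batch][n_filters][rows][cols],
    bias holds one value per filter."""
    return [[[[1.0 / (1.0 + math.exp(-(value + bias[k]))) for value in row]
              for row in fmap]
             for k, fmap in enumerate(image_maps)]   # k indexes the filter
            for image_maps in conv_out]              # loop over the batch


# batch of 1 image, 2 filters, 1x1 feature maps, all pre-activations zero
conv_out = [[[[0.0]], [[0.0]]]]
activated = add_bias_and_sigmoid(conv_out, [0.0, 1.0])
print(activated[0][0][0][0], activated[0][1][0][0])
```

Only the filter index picks the bias; the batch, row and column positions all share it.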

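2. Pooling (downsampling)

Max pooling is the most common choice. It addresses two problems:

1. It reduces the amount of computation.

2. It gives rotation invariance (work out the reason yourself). Strictly speaking, what max pooling mainly buys is tolerance to small shifts and distortions, since the maximum of a window does not change when the feature moves within that window.

PS: On rotation invariance, recall SIFT and LBP, which use a dominant orientation, and HOG, which uses templates at different orientations.

With a 2*2 window, max pooling halves both the height and the width of each feature map. (The result figure below does not show this, because matplotlib stretches every subplot to the same size, but the number of pixels along each side really is halved.)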
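
A small pure-Python sketch of non-overlapping max pooling shows the halving directly. The function below only borrows the name and the ignore_border flag from the theano call used in this post; treating ignore_border = False as "keep partial windows at the border" is my reading of that flag, and the toy arrays are made up.

```python
def max_pool_2d(fmap, pool=(2, 2), ignore_border=False):
    """Illustrative stand-in for max pooling on one 2D feature map."""
    ph, pw = pool
    height, width = len(fmap), len(fmap[0])
    if ignore_border:
        out_h, out_w = height // ph, width // pw            # drop partial windows
    else:
        out_h, out_w = -(-height // ph), -(-width // pw)    # ceiling division
    pooled = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            window = [fmap[r][c]
                      for r in range(i * ph, min((i + 1) * ph, height))
                      for c in range(j * pw, min((j + 1) * pw, width))]
            row.append(max(window))
        pooled.append(row)
    return pooled


grid = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
pooled_grid = max_pool_2d(grid)            # 4x4 -> 2x2

# a single peak moved by one pixel inside the same 2*2 window
peak_a = [[1, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
peak_b = [[0, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
same_after_pooling = max_pool_2d(peak_a) == max_pool_2d(peak_b)
print(pooled_grid, same_after_pooling)
```

The second check is the small-shift tolerance in miniature: the peak moves, the pooled map does not.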

Code (see the comments for details):


# -*- coding: utf-8 -*-
"""
Created on Sat May 10 18:55:26 2014
@author: rachel
Function: convolution followed by 2*2 max pooling
input: 3 feature maps (3 channels <RGB> of a picture)
convolution: two 9*9 convolutional filters
"""
from theano.tensor.nnet import conv
import theano.tensor as T
import numpy, theano
rng = numpy.random.RandomState(23455)
# symbol variable
input = T.tensor4(name = 'input')
# initial weights
w_shape = (2,3,9,9) #2 convolutional filters, 3 channels, filter shape: 9*9
w_bound = numpy.sqrt(3*9*9)
W = theano.shared(numpy.asarray(rng.uniform(low = -1.0/w_bound, high = 1.0/w_bound,size = w_shape),
dtype = input.dtype),name = 'W')
b_shape = (2,)
b = theano.shared(numpy.asarray(rng.uniform(low = -.5, high = .5, size = b_shape),
dtype = input.dtype),name = 'b')
conv_out = conv.conv2d(input,W)
#T.TensorVariable.dimshuffle() can reshape or broadcast (add dimension)
#dimshuffle(self,*pattern)
# >>>b1 = b.dimshuffle('x',0,'x','x')
# >>>b1.shape.eval()
# array([1,2,1,1])
output = T.nnet.sigmoid(conv_out + b.dimshuffle('x',0,'x','x'))
f = theano.function([input],output)
# demo
import pylab
from PIL import Image
from matplotlib.pyplot import *
#open random image
img = Image.open('//home//rachel//Documents//ZJU_Projects//DL//Dataset//rachel.jpg') # pass the path so PIL opens the file in binary mode
width,height = img.size
img = numpy.asarray(img, dtype = 'float32')/256. # (height, width, 3)
# put image in 4D tensor of shape (1,3,height,width)
img_rgb = img.swapaxes(0,2).swapaxes(1,2) #(3,height,width)
minibatch_img = img_rgb.reshape(1,3,height,width)
filtered_img = f(minibatch_img)
# plot original image and two convoluted results
pylab.figure(1)
pylab.subplot(1,3,1);pylab.axis('off');
pylab.imshow(img)
title('origin image')
pylab.gray()
pylab.subplot(2,3,2); pylab.axis("off")
pylab.imshow(filtered_img[0,0,:,:]) #0:minibatch_index; 0:1-st filter
title('convolution 1')
pylab.subplot(2,3,3); pylab.axis("off")
pylab.imshow(filtered_img[0,1,:,:]) #0:minibatch_index; 1:2nd filter
title('convolution 2')
#pylab.show()
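# maxpooling
# (downsample.max_pool_2d is the API of the Theano version used here;
#  later releases provide theano.tensor.signal.pool.pool_2d instead)
from theano.tensor.signal import downsample
input = T.tensor4('input')
maxpool_shape = (2,2)
pooled_img = downsample.max_pool_2d(input,maxpool_shape,ignore_border = False)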
maxpool = theano.function(inputs = [input],
outputs = [pooled_img])
pooled_res = numpy.squeeze(maxpool(filtered_img))
#pylab.figure(2)
pylab.subplot(235);pylab.axis('off');
pylab.imshow(pooled_res[0,:,:])
title('down sampled 1')
pylab.subplot(236);pylab.axis('off');
pylab.imshow(pooled_res[1,:,:])
title('down sampled 2')
pylab.show()

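3. CNN structure

CNN diagrams are everywhere if you google them. Here is a figure I drew back when I was learning CNNs; paired with an explanation, I think it is fairly easy to follow.

Without further ado, here is the LeNet structure diagram (read it bottom-up along the arrows; the bottom is the original input):

4. CNN code

The full code (in Python) can be downloaded from my resources page; I have uploaded it there.

Only a small part is pasted here, just the part that builds the network: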


rng = numpy.random.RandomState(23455)
# transform x from (batch_size, 28*28) to (batch_size, feature, 28, 28)
# I_shape = (28,28), F_shape = (5,5)
N_filters_0 = 20
D_features_0= 1
layer0_input = x.reshape((batch_size,D_features_0,28,28))
layer0 = LeNetConvPoolLayer(rng, input = layer0_input, filter_shape = (N_filters_0,D_features_0,5,5),
image_shape = (batch_size,1,28,28))
#layer0.output: (batch_size, N_filters_0, (28-5+1)/2, (28-5+1)/2) -> (20,20,12,12)
N_filters_1 = 50
D_features_1 = N_filters_0
layer1 = LeNetConvPoolLayer(rng,input = layer0.output, filter_shape = (N_filters_1,D_features_1,5,5),
image_shape = (batch_size,N_filters_0,12,12))
# layer1.output: (20,50,4,4)
layer2_input = layer1.output.flatten(2) # (20,50,4,4)->(20,(50*4*4))
layer2 = HiddenLayer(rng,layer2_input,n_in = 50*4*4,n_out = 500, activation = T.tanh)
layer3 = LogisticRegression(input = layer2.output, n_in = 500, n_out = 10)

layer0, layer1: each is a convolution followed by downsampling (max pooling).

layer2 + layer3: together they form an MLP (ANN).
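
The sizes in the code comments above can be checked with a few lines of arithmetic. This is only a sketch: conv_pool_output_size is a made-up helper, and batch_size = 20 is inferred from the (20,50,4,4) comment rather than stated anywhere in the pasted code.

```python
def conv_pool_output_size(image_size, filter_size, pool_size=2):
    """Side length after a 'valid' convolution and non-overlapping max pooling."""
    conv_size = image_size - filter_size + 1   # e.g. 28 - 5 + 1 = 24
    return conv_size // pool_size              # e.g. 24 / 2 = 12


batch_size = 20                                # assumption, see above
n_filters_0, n_filters_1 = 20, 50

size0 = conv_pool_output_size(28, 5)           # side length of layer0.output
size1 = conv_pool_output_size(size0, 5)        # side length of layer1.output
layer0_shape = (batch_size, n_filters_0, size0, size0)
layer1_shape = (batch_size, n_filters_1, size1, size1)
n_in_hidden = n_filters_1 * size1 * size1      # what flatten(2) feeds into layer2
print(layer0_shape, layer1_shape, n_in_hidden)
```

The last number is the n_in = 50*4*4 passed to HiddenLayer.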

Training the model:


cost = layer3.negative_log_likelihood(y)
params = layer3.params + layer2.params + layer1.params + layer0.params
gparams = T.grad(cost,params)
updates = []
for par,gpar in zip(params,gparams):
    updates.append((par, par - learning_rate * gpar))
train_model = theano.function(inputs = [minibatch_index],
outputs = [cost],
updates = updates,
givens = {x: train_set_x[minibatch_index * batch_size : (minibatch_index+1) * batch_size],
y: train_set_y[minibatch_index * batch_size : (minibatch_index+1) * batch_size]})

All layers' parameters are trained on the cost, which is the negative log-likelihood (NLL) output by the topmost MLP layer.
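
For readers who want the two ingredients spelled out, here is a minimal sketch of the cost and of one update step in plain Python. Both function names are made up, the toy numbers are arbitrary, and averaging the NLL over the minibatch is an assumption about what layer3.negative_log_likelihood does; the real code lets theano compute the gradients symbolically with T.grad.

```python
import math


def negative_log_likelihood(probabilities, labels):
    """Mean of -log p(correct class) over a minibatch (averaging is an assumption)."""
    return -sum(math.log(p[y]) for p, y in zip(probabilities, labels)) / len(labels)


def sgd_step(params, grads, learning_rate):
    """Same rule as the updates list: par <- par - learning_rate * gpar."""
    return [p - learning_rate * g for p, g in zip(params, grads)]


# toy minibatch: predicted class probabilities for 2 examples, and their labels
probabilities = [[0.5, 0.5], [0.25, 0.75]]
labels = [0, 1]
cost = negative_log_likelihood(probabilities, labels)

# toy parameters and gradients, one scalar each
new_params = sgd_step([1.0, 2.0], [0.5, -1.0], learning_rate=0.1)
print(cost, new_params)
```

A parameter with a positive gradient is pushed down and one with a negative gradient is pushed up, each by learning_rate times the gradient.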

For the rest, see the code and its comments.

PS: The data is the full MNIST dataset.

Final result:
Optimization complete. Best validation score of 0.990000 % obtained at iteration 122500, with test performance 0.950000 %
