ImageNet (http://www.image-net.org) is an image database from Fei-Fei Li's group, designed to be used together with WordNet (she graduated from Caltech; advisor: Pietro Perona; homepage: http://vision.stanford.edu/~feifeili/). It covers roughly 100,000 synsets in total; according to the 2010 statistics, 21,841 synsets are non-empty (i.e., have images), with about 1,000 images per class and 14,197,122 images overall.
Caffe's ImageNet training follows Alex Krizhevsky's method: AlexNet, a model with 60 million parameters, which won first place at ILSVRC 2012 with a top-5 error rate of 15.3%.
In 2013, Clarifai won ILSVRC by using CNN visualization techniques to tune the network architecture (http://www.clarifai.com/; for some background, see Filestorm's informal discussion on zhihu.com: http://zhuanlan.zhihu.com/cvprnet/19821292).
In 2014, Google joined in as well, taking first place with a top-5 error rate of 6.66% by increasing the number of layers (22 in total) and using multi-scale data training. Google's model greatly increased the depth of the network while removing the topmost fully connected layers: fully connected layers account for roughly 90% of a CNN's parameters, yet they can also cause overfitting. The resulting model is much smaller than AlexNet and suffers less from the side effects of overfitting: AlexNet has 60M parameters, while GoogLeNet has only about 7M.
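The claim that fully connected layers dominate the parameter count is easy to verify on AlexNet itself. Below is a rough weights-only count (ignoring biases and the original two-GPU grouping), using the fc dimensions described later in this post:

fc6 = 256 * 6 * 6 * 4096  # 37,748,736 weights (pool5 output 256x6x6 into 4096 units)
fc7 = 4096 * 4096         # 16,777,216 weights
fc8 = 4096 * 1000         #  4,096,000 weights
print(fc6 + fc7 + fc8)    # ~58.6M of AlexNet's ~60M parameters sit in the FC layers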
In 2015, Baidu's Deep Image reached a top-5 error rate of 5.98%. It is GPU-based: using 36 server nodes, Baidu built a supercomputer dedicated to deep learning (named Minwa, 敏媧). This supercomputer has terabytes of host memory and very strong data-exchange capability, and was used to train a huge deep neural network reported to have 100 billion parameters. (The system consists of 36 server nodes, each equipped with 2 six-core Intel Xeon E5-2620 processors. Each server contains 4 NVIDIA Tesla K40m GPUs and one FDR InfiniBand adapter (56 Gb/s), which provides a high-performance, low-latency interconnect with RDMA support. Each GPU has a peak of 4.29 TFLOPS and 12 GB of memory. Overall, Minwa has 6.9 TB of host memory and 1.7 TB of device memory, with a theoretical peak performance of about 0.6 PFLOPS. With such a powerful system, the researchers could use training data that differs from, or improves on, other deep learning projects: instead of the common 256x256-pixel images, Baidu used 512x512-pixel images and applied various augmentations to them, such as color adjustment, vignetting, and lens distortion. The goal was to let the system learn more small objects and recognize objects under varied conditions.)
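The quoted aggregate figures follow directly from the per-GPU numbers; as a quick back-of-the-envelope check (36 nodes x 4 GPUs, using the specs above):

nodes, gpus_per_node = 36, 4
peak_tflops = nodes * gpus_per_node * 4.29           # 617.76 TFLOPS ~= 0.6 PFLOPS
device_mem_tb = nodes * gpus_per_node * 12 / 1000.0  # ~1.7 TB of total GPU memory
print(peak_tflops)
print(device_mem_tb)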
The AlexNet model is somewhat more complex than the Cifar10 and LeNet models we saw earlier. Training took 5-6 days on two GTX 580 3GB GPUs, and the model incorporates Hinton's improvements (ReLU activations, plus dropout in the fully connected layers). For this model, we first take a complete convolutional stage to possibly include a convolution layer, a Rectified Linear Units layer, a max-pooling layer, and a normalization layer. The whole network then consists of five convolutional stages and three fully connected layers; the very front of the network takes the raw pixels of the input image, and the very back produces the image's classification result.
The concrete parameter configuration of each layer is shown in the figure below. The InputLayer is the input image layer: each input image is rescaled to 227*227 and fed in as three RGB channels. Layer1-Layer5 are convolutional layers. Taking Layer1 as an example, its convolution filters are 11*11 with a stride of 4, and the layer has 96 filters in total, so its output is 96 feature maps of size 55*55. In Layer1, the convolution is followed by a ReLU operation and a max-pooling operation. Layer6-Layer8 are fully connected layers, which amounts to stacking a three-layer fully connected neural-network classifier on top of the five convolutional layers. Layer6, for example, has 4096 neurons; Layer8 has 1000 neurons, corresponding to the 1000 target image classes.
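These spatial sizes follow directly from the filter, stride, and padding settings. As a quick sanity check, here is the standard output-size arithmetic as a small sketch (note that Caffe rounds down for convolution but up for pooling):

import math

def conv_out(size, kernel, stride=1, pad=0):
    # Caffe convolution output size: floor((size + 2*pad - kernel) / stride) + 1
    return (size + 2 * pad - kernel) // stride + 1

def pool_out(size, kernel, stride):
    # Caffe pooling rounds up (ceil), so border regions are still pooled
    return int(math.ceil((size - kernel) / float(stride))) + 1

print(conv_out(227, kernel=11, stride=4))  # conv1: (227 - 11) / 4 + 1 = 55
print(pool_out(55, kernel=3, stride=2))    # pool1: 27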
The figures below come from sunbaigui's blog post: http://blog.csdn.net/sunbaigui/article/details/39938097
The overall pipeline is the same, though the way some individual values are computed differs slightly. (The output shape at each stage can also be reproduced with the short script after this list.)
1. conv1 stage DFD (data flow diagram):
2. conv2 stage DFD (data flow diagram):
3. conv3 stage DFD (data flow diagram):
4. conv4 stage DFD (data flow diagram):
5. conv5 stage DFD (data flow diagram):
6. fc6 stage DFD (data flow diagram):
7. fc7 stage DFD (data flow diagram):
8. fc8 stage DFD (data flow diagram):
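Using the two helper functions defined above, the per-stage shapes in these diagrams can be reproduced end to end. This is a sketch under the standard CaffeNet settings (conv2 uses pad=2, conv3-conv5 use pad=1):

size = 227
size = conv_out(size, kernel=11, stride=4)        # conv1 -> 96  x 55 x 55
size = pool_out(size, kernel=3, stride=2)         # pool1 -> 96  x 27 x 27
size = conv_out(size, kernel=5, stride=1, pad=2)  # conv2 -> 256 x 27 x 27
size = pool_out(size, kernel=3, stride=2)         # pool2 -> 256 x 13 x 13
size = conv_out(size, kernel=3, stride=1, pad=1)  # conv3 -> 384 x 13 x 13
size = conv_out(size, kernel=3, stride=1, pad=1)  # conv4 -> 384 x 13 x 13
size = conv_out(size, kernel=3, stride=1, pad=1)  # conv5 -> 256 x 13 x 13
size = pool_out(size, kernel=3, stride=2)         # pool5 -> 256 x 6 x 6
print(256 * size * size)  # 9216 inputs into fc6 (4096), then fc7 (4096), fc8 (1000)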
Next, I load the pretrained model:
MODEL_FILE = '../models/bvlc_reference_caffenet/deploy.prototxt'
PRETRAINED = '../models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel'
then get the prediction and look up the corresponding class in the label file:
imagenet_label = caffe_root + 'data/ilsvrc12/synset_words.txt'
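Putting these pieces together, here is a minimal end-to-end sketch. It assumes caffe_root points at the Caffe checkout, and uses the mean file, example cat image, and preprocessing settings from Caffe's stock classification example:

import numpy as np
import caffe

caffe_root = '../'  # assumed location of the Caffe checkout
MODEL_FILE = caffe_root + 'models/bvlc_reference_caffenet/deploy.prototxt'
PRETRAINED = caffe_root + 'models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel'

caffe.set_mode_cpu()
net = caffe.Classifier(MODEL_FILE, PRETRAINED,
        mean=np.load(caffe_root + 'python/caffe/imagenet/ilsvrc_2012_mean.npy').mean(1).mean(1),
        channel_swap=(2, 1, 0),  # the model expects BGR channel order
        raw_scale=255,           # images load as [0, 1]; the model expects [0, 255]
        image_dims=(256, 256))   # resize, then oversample 227x227 crops

image = caffe.io.load_image(caffe_root + 'examples/images/cat.jpg')
prediction = net.predict([image])          # forward pass; returns class probabilities
top5 = prediction[0].argsort()[::-1][:5]   # indices of the five highest scores

imagenet_label = caffe_root + 'data/ilsvrc12/synset_words.txt'
labels = np.loadtxt(imagenet_label, str, delimiter='\t')
print(labels[top5])
print(top5)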
Feeding in a picture of a cat and printing the top five predicted classes (with their label indices) gives:
['n02123045 tabby, tabby cat' 'n02127052 lynx, catamount'
'n02124075 Egyptian cat' 'n07930864 cup' 'n02123159 tiger cat']
[281 287 285 968 282]
input: "data"
input_dim: 10
input_dim: 3
input_dim: 227
input_dim: 227

I0318 02:36:31.173322 15756 net.cpp:358] Input 0 -> data
I0318 02:36:31.173369 15756 net.cpp:56] Memory required for data: 0
I0318 02:36:31.173436 15756 net.cpp:67] Creating Layer conv1
I0318 02:36:31.173446 15756 net.cpp:394] conv1 <- data
I0318 02:36:31.173463 15756 net.cpp:356] conv1 -> conv1
I0318 02:36:31.173481 15756 net.cpp:96] Setting up conv1
I0318 02:36:31.173616 15756 net.cpp:103] Top shape: 10 96 55 55 (2904000)
I0318 02:36:31.173624 15756 net.cpp:113] Memory required for data: 11616000
I0318 02:36:31.173655 15756 net.cpp:67] Creating Layer relu1
I0318 02:36:31.173665 15756 net.cpp:394] relu1 <- conv1
I0318 02:36:31.173679 15756 net.cpp:345] relu1 -> conv1 (in-place)
I0318 02:36:31.173691 15756 net.cpp:96] Setting up relu1
I0318 02:36:31.173697 15756 net.cpp:103] Top shape: 10 96 55 55 (2904000)
I0318 02:36:31.173702 15756 net.cpp:113] Memory required for data: 23232000
I0318 02:36:31.173715 15756 net.cpp:67] Creating Layer pool1
I0318 02:36:31.173722 15756 net.cpp:394] pool1 <- conv1
I0318 02:36:31.173737 15756 net.cpp:356] pool1 -> pool1
I0318 02:36:31.173750 15756 net.cpp:96] Setting up pool1
I0318 02:36:31.173763 15756 net.cpp:103] Top shape: 10 96 27 27 (699840)
I0318 02:36:31.173768 15756 net.cpp:113] Memory required for data: 26031360
I0318 02:36:31.173780 15756 net.cpp:67] Creating Layer norm1
I0318 02:36:31.173787 15756 net.cpp:394] norm1 <- pool1
I0318 02:36:31.173800 15756 net.cpp:356] norm1 -> norm1
I0318 02:36:31.173811 15756 net.cpp:96] Setting up norm1
I0318 02:36:31.173820 15756 net.cpp:103] Top shape: 10 96 27 27 (699840)
I0318 02:36:31.173825 15756 net.cpp:113] Memory required for data: 28830720
I0318 02:36:31.173837 15756 net.cpp:67] Creating Layer conv2
I0318 02:36:31.173845 15756 net.cpp:394] conv2 <- norm1
I0318 02:36:31.173858 15756 net.cpp:356] conv2 -> conv2
I0318 02:36:31.173872 15756 net.cpp:96] Setting up conv2
I0318 02:36:31.174792 15756 net.cpp:103] Top shape: 10 256 27 27 (1866240)
I0318 02:36:31.174799 15756 net.cpp:113] Memory required for data: 36295680
I0318 02:36:31.174818 15756 net.cpp:67] Creating Layer relu2
I0318 02:36:31.174826 15756 net.cpp:394] relu2 <- conv2
I0318 02:36:31.174839 15756 net.cpp:345] relu2 -> conv2 (in-place)
I0318 02:36:31.174849 15756 net.cpp:96] Setting up relu2
I0318 02:36:31.174856 15756 net.cpp:103] Top shape: 10 256 27 27 (1866240)
I0318 02:36:31.174860 15756 net.cpp:113] Memory required for data: 43760640
I0318 02:36:31.174870 15756 net.cpp:67] Creating Layer pool2
I0318 02:36:31.174876 15756 net.cpp:394] pool2 <- conv2
I0318 02:36:31.174890 15756 net.cpp:356] pool2 -> pool2
I0318 02:36:31.174902 15756 net.cpp:96] Setting up pool2
I0318 02:36:31.174912 15756 net.cpp:103] Top shape: 10 256 13 13 (432640)
I0318 02:36:31.174917 15756 net.cpp:113] Memory required for data: 45491200
I0318 02:36:31.174927 15756 net.cpp:67] Creating Layer norm2
I0318 02:36:31.174933 15756 net.cpp:394] norm2 <- pool2
I0318 02:36:31.174947 15756 net.cpp:356] norm2 -> norm2
I0318 02:36:31.174957 15756 net.cpp:96] Setting up norm2
I0318 02:36:31.174967 15756 net.cpp:103] Top shape: 10 256 13 13 (432640)
I0318 02:36:31.174970 15756 net.cpp:113] Memory required for data: 47221760
I0318 02:36:31.174983 15756 net.cpp:67] Creating Layer conv3
I0318 02:36:31.174990 15756 net.cpp:394] conv3 <- norm2
I0318 02:36:31.175004 15756 net.cpp:356] conv3 -> conv3
I0318 02:36:31.175015 15756 net.cpp:96] Setting up conv3
I0318 02:36:31.177199 15756 net.cpp:103] Top shape: 10 384 13 13 (648960)
I0318 02:36:31.177208 15756 net.cpp:113] Memory required for data: 49817600
I0318 02:36:31.177230 15756 net.cpp:67] Creating Layer relu3
I0318 02:36:31.177238 15756 net.cpp:394] relu3 <- conv3
I0318 02:36:31.177252 15756 net.cpp:345] relu3 -> conv3 (in-place)
I0318 02:36:31.177263 15756 net.cpp:96] Setting up relu3
I0318 02:36:31.177268 15756 net.cpp:103] Top shape: 10 384 13 13 (648960)
I0318 02:36:31.177273 15756 net.cpp:113] Memory required for data: 52413440
I0318 02:36:31.177284 15756 net.cpp:67] Creating Layer conv4
I0318 02:36:31.177289 15756 net.cpp:394] conv4 <- conv3
I0318 02:36:31.177302 15756 net.cpp:356] conv4 -> conv4
I0318 02:36:31.177316 15756 net.cpp:96] Setting up conv4
I0318 02:36:31.179213 15756 net.cpp:103] Top shape: 10 384 13 13 (648960)
I0318 02:36:31.179220 15756 net.cpp:113] Memory required for data: 55009280
I0318 02:36:31.179237 15756 net.cpp:67] Creating Layer relu4
I0318 02:36:31.179244 15756 net.cpp:394] relu4 <- conv4
I0318 02:36:31.179258 15756 net.cpp:345] relu4 -> conv4 (in-place)
I0318 02:36:31.179268 15756 net.cpp:96] Setting up relu4
I0318 02:36:31.179275 15756 net.cpp:103] Top shape: 10 384 13 13 (648960)
I0318 02:36:31.179280 15756 net.cpp:113] Memory required for data: 57605120
I0318 02:36:31.179291 15756 net.cpp:67] Creating Layer conv5
I0318 02:36:31.179296 15756 net.cpp:394] conv5 <- conv4
I0318 02:36:31.179309 15756 net.cpp:356] conv5 -> conv5
I0318 02:36:31.179322 15756 net.cpp:96] Setting up conv5
I0318 02:36:31.180598 15756 net.cpp:103] Top shape: 10 256 13 13 (432640)
I0318 02:36:31.180606 15756 net.cpp:113] Memory required for data: 59335680
I0318 02:36:31.180629 15756 net.cpp:67] Creating Layer relu5
I0318 02:36:31.180636 15756 net.cpp:394] relu5 <- conv5
I0318 02:36:31.180649 15756 net.cpp:345] relu5 -> conv5 (in-place)
I0318 02:36:31.180660 15756 net.cpp:96] Setting up relu5
I0318 02:36:31.180665 15756 net.cpp:103] Top shape: 10 256 13 13 (432640)
I0318 02:36:31.180670 15756 net.cpp:113] Memory required for data: 61066240
I0318 02:36:31.180682 15756 net.cpp:67] Creating Layer pool5
I0318 02:36:31.180690 15756 net.cpp:394] pool5 <- conv5
I0318 02:36:31.180703 15756 net.cpp:356] pool5 -> pool5
I0318 02:36:31.180716 15756 net.cpp:96] Setting up pool5
I0318 02:36:31.180726 15756 net.cpp:103] Top shape: 10 256 6 6 (92160)
I0318 02:36:31.180730 15756 net.cpp:113] Memory required for data: 61434880
I0318 02:36:31.180742 15756 net.cpp:67] Creating Layer fc6
I0318 02:36:31.180748 15756 net.cpp:394] fc6 <- pool5
I0318 02:36:31.180760 15756 net.cpp:356] fc6 -> fc6
I0318 02:36:31.180773 15756 net.cpp:96] Setting up fc6
I0318 02:36:31.262177 15756 net.cpp:103] Top shape: 10 4096 1 1 (40960)
I0318 02:36:31.262205 15756 net.cpp:113] Memory required for data: 61598720
I0318 02:36:31.262254 15756 net.cpp:67] Creating Layer relu6
I0318 02:36:31.262269 15756 net.cpp:394] relu6 <- fc6
I0318 02:36:31.262296 15756 net.cpp:345] relu6 -> fc6 (in-place)
I0318 02:36:31.262312 15756 net.cpp:96] Setting up relu6
I0318 02:36:31.262320 15756 net.cpp:103] Top shape: 10 4096 1 1 (40960)
I0318 02:36:31.262325 15756 net.cpp:113] Memory required for data: 61762560
I0318 02:36:31.262341 15756 net.cpp:67] Creating Layer drop6
I0318 02:36:31.262346 15756 net.cpp:394] drop6 <- fc6
I0318 02:36:31.262358 15756 net.cpp:345] drop6 -> fc6 (in-place)
I0318 02:36:31.262369 15756 net.cpp:96] Setting up drop6
I0318 02:36:31.262378 15756 net.cpp:103] Top shape: 10 4096 1 1 (40960)
I0318 02:36:31.262382 15756 net.cpp:113] Memory required for data: 61926400
I0318 02:36:31.262408 15756 net.cpp:67] Creating Layer fc7
I0318 02:36:31.262416 15756 net.cpp:394] fc7 <- fc6
I0318 02:36:31.262430 15756 net.cpp:356] fc7 -> fc7
I0318 02:36:31.262445 15756 net.cpp:96] Setting up fc7
I0318 02:36:31.298831 15756 net.cpp:103] Top shape: 10 4096 1 1 (40960)
I0318 02:36:31.298859 15756 net.cpp:113] Memory required for data: 62090240
I0318 02:36:31.298895 15756 net.cpp:67] Creating Layer relu7
I0318 02:36:31.298908 15756 net.cpp:394] relu7 <- fc7
I0318 02:36:31.298933 15756 net.cpp:345] relu7 -> fc7 (in-place)
I0318 02:36:31.298949 15756 net.cpp:96] Setting up relu7
I0318 02:36:31.298955 15756 net.cpp:103] Top shape: 10 4096 1 1 (40960)
I0318 02:36:31.298960 15756 net.cpp:113] Memory required for data: 62254080
I0318 02:36:31.298974 15756 net.cpp:67] Creating Layer drop7
I0318 02:36:31.298981 15756 net.cpp:394] drop7 <- fc7
I0318 02:36:31.298993 15756 net.cpp:345] drop7 -> fc7 (in-place)
I0318 02:36:31.299003 15756 net.cpp:96] Setting up drop7
I0318 02:36:31.299011 15756 net.cpp:103] Top shape: 10 4096 1 1 (40960)
I0318 02:36:31.299016 15756 net.cpp:113] Memory required for data: 62417920
I0318 02:36:31.299028 15756 net.cpp:67] Creating Layer fc8
I0318 02:36:31.299034 15756 net.cpp:394] fc8 <- fc7
I0318 02:36:31.299046 15756 net.cpp:356] fc8 -> fc8
I0318 02:36:31.299060 15756 net.cpp:96] Setting up fc8
I0318 02:36:31.308233 15756 net.cpp:103] Top shape: 10 1000 1 1 (10000)
I0318 02:36:31.308260 15756 net.cpp:113] Memory required for data: 62457920
I0318 02:36:31.308295 15756 net.cpp:67] Creating Layer prob
I0318 02:36:31.308310 15756 net.cpp:394] prob <- fc8
I0318 02:36:31.308336 15756 net.cpp:356] prob -> prob
I0318 02:36:31.308353 15756 net.cpp:96] Setting up prob
I0318 02:36:31.308372 15756 net.cpp:103] Top shape: 10 1000 1 1 (10000)
I0318 02:36:31.308377 15756 net.cpp:113] Memory required for data: 62497920
I0318 02:36:31.308384 15756 net.cpp:172] prob does not need backward computation.
I0318 02:36:31.308404 15756 net.cpp:172] fc8 does not need backward computation.
I0318 02:36:31.308410 15756 net.cpp:172] drop7 does not need backward computation.
I0318 02:36:31.308415 15756 net.cpp:172] relu7 does not need backward computation.
I0318 02:36:31.308420 15756 net.cpp:172] fc7 does not need backward computation.
I0318 02:36:31.308425 15756 net.cpp:172] drop6 does not need backward computation.
I0318 02:36:31.308430 15756 net.cpp:172] relu6 does not need backward computation.
I0318 02:36:31.308435 15756 net.cpp:172] fc6 does not need backward computation.
I0318 02:36:31.308440 15756 net.cpp:172] pool5 does not need backward computation.
I0318 02:36:31.308445 15756 net.cpp:172] relu5 does not need backward computation.
I0318 02:36:31.308450 15756 net.cpp:172] conv5 does not need backward computation.
I0318 02:36:31.308454 15756 net.cpp:172] relu4 does not need backward computation.
I0318 02:36:31.308459 15756 net.cpp:172] conv4 does not need backward computation.
I0318 02:36:31.308465 15756 net.cpp:172] relu3 does not need backward computation.
I0318 02:36:31.308470 15756 net.cpp:172] conv3 does not need backward computation.
I0318 02:36:31.308473 15756 net.cpp:172] norm2 does not need backward computation.
I0318 02:36:31.308478 15756 net.cpp:172] pool2 does not need backward computation.
I0318 02:36:31.308483 15756 net.cpp:172] relu2 does not need backward computation.
I0318 02:36:31.308488 15756 net.cpp:172] conv2 does not need backward computation.
I0318 02:36:31.308493 15756 net.cpp:172] norm1 does not need backward computation.
I0318 02:36:31.308498 15756 net.cpp:172] pool1 does not need backward computation.
I0318 02:36:31.308502 15756 net.cpp:172] relu1 does not need backward computation.
I0318 02:36:31.308508 15756 net.cpp:172] conv1 does not need backward computation.
I0318 02:36:31.308512 15756 net.cpp:208] This network produces output prob
I0318 02:36:31.308542 15756 net.cpp:467] Collecting Learning Rate and Weight Decay.
I0318 02:36:31.308554 15756 net.cpp:219] Network initialization done.
I0318 02:36:31.308558 15756 net.cpp:220] Memory required for data: 62497920
E0318 02:36:31.683995 15756 upgrade_proto.cpp:611] Attempting to upgrade input file specified using deprecated transformation parameters: ../models/bvlc_reference_caffenet/bvlc_reference_caffenet.caffemodel
I0318 02:36:31.684034 15756 upgrade_proto.cpp:614] Successfully upgraded file specified using deprecated data transformation parameters.
E0318 02:36:31.684039 15756 upgrade_proto.cpp:616] Note that future Caffe releases will only support transform_param messages for transformation fields.
I0318 02:36:31.684048 15756 net.cpp:702] Ignoring source layer data
I0318 02:36:31.684052 15756 net.cpp:705] Copying source layer conv1
I0318 02:36:31.684367 15756 net.cpp:705] Copying source layer relu1
I0318 02:36:31.684375 15756 net.cpp:705] Copying source layer pool1
I0318 02:36:31.684380 15756 net.cpp:705] Copying source layer norm1
I0318 02:36:31.684383 15756 net.cpp:705] Copying source layer conv2
I0318 02:36:31.687021 15756 net.cpp:705] Copying source layer relu2
I0318 02:36:31.687028 15756 net.cpp:705] Copying source layer pool2
I0318 02:36:31.687033 15756 net.cpp:705] Copying source layer norm2
I0318 02:36:31.687038 15756 net.cpp:705] Copying source layer conv3
I0318 02:36:31.694574 15756 net.cpp:705] Copying source layer relu3
I0318 02:36:31.694586 15756 net.cpp:705] Copying source layer conv4
I0318 02:36:31.700258 15756 net.cpp:705] Copying source layer relu4
I0318 02:36:31.700266 15756 net.cpp:705] Copying source layer conv5
I0318 02:36:31.704053 15756 net.cpp:705] Copying source layer relu5
I0318 02:36:31.704062 15756 net.cpp:705] Copying source layer pool5
I0318 02:36:31.704067 15756 net.cpp:705] Copying source layer fc6
I0318 02:36:32.025676 15756 net.cpp:705] Copying source layer relu6
I0318 02:36:32.025704 15756 net.cpp:705] Copying source layer drop6
I0318 02:36:32.025709 15756 net.cpp:705] Copying source layer fc7
I0318 02:36:32.168704 15756 net.cpp:705] Copying source layer relu7
I0318 02:36:32.168730 15756 net.cpp:705] Copying source layer drop7
I0318 02:36:32.168735 15756 net.cpp:705] Copying source layer fc8
I0318 02:36:32.203642 15756 net.cpp:702] Ignoring source layer loss
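The "Top shape" lines in the log can also be read back through pycaffe. Continuing from the classification sketch above (Classifier subclasses caffe.Net, so its net.blobs dictionary is available):

# Mirror the "Top shape: 10 96 55 55" style lines from the log above
for name, blob in net.blobs.items():
    print('{0}: {1}'.format(name, blob.data.shape))  # e.g. conv1: (10, 96, 55, 55)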
Other references:
http://blog.csdn.net/sunbaigui/article/details/39938097
http://dataunion.org/10781.html
http://djt.qq.com/article/view/1245