Classic Paper Study: Bag of Features (Part 2)

Reposted article, dated 2014-12-17 16:38. Category: Classic Paper Study / bag of feature

Bag-of-words

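The bag-of-words (BOW) model is a document representation commonly used in information retrieval. It treats a document purely as a collection of words, ignoring word order, grammar, and syntax; each word occurrence is assumed to be independent of whether any other word occurs. For example, consider the following two documents:

1: Bob likes to play basketball, Jim likes too.
2: Bob also likes to play football games.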

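From these two text documents, extract the individual words and build a dictionary:

Dictionary = {1: "Bob", 2: "like", 3: "to", 4: "play", 5: "basketball", 6: "also", 7: "football", 8: "games", 9: "Jim", 10: "too"}.

The dictionary contains 10 distinct words. Counting how often each dictionary word occurs in each of the two documents above, every document can be represented as a 10-dimensional vector:

1: [1, 2, 1, 1, 1, 0, 0, 0, 1, 1]
2: [1, 1, 1, 1, 0, 1, 1, 1, 0, 0]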
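The counting step above can be sketched in a few lines of Python. This is a minimal illustration, not a general tokenizer: the `NORMALIZE` mapping is an assumption added here because the dictionary lists "like" while the documents contain "likes", and the function name `bow_vector` is hypothetical.

```python
import re
from collections import Counter

# Dictionary from the text, in the same order (entries 1..10).
VOCAB = ["Bob", "like", "to", "play", "basketball",
         "also", "football", "games", "Jim", "too"]

# Assumption: the dictionary stores "like" for the surface form "likes".
NORMALIZE = {"likes": "like"}

def bow_vector(text, vocab=VOCAB):
    """Count how often each dictionary word occurs in the text."""
    tokens = re.findall(r"[A-Za-z]+", text)
    counts = Counter(NORMALIZE.get(t, t) for t in tokens)
    return [counts[w] for w in vocab]

doc1 = "Bob likes to play basketball, Jim likes too."
doc2 = "Bob also likes to play football games."
print(bow_vector(doc1))  # [1, 2, 1, 1, 1, 0, 0, 0, 1, 1]
print(bow_vector(doc2))  # [1, 1, 1, 1, 0, 1, 1, 1, 0, 0]
```

Word order is discarded entirely: any permutation of the words in a document yields the same vector.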

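If the word histograms of each type of document show a characteristic pattern, that pattern can be used to classify large collections of documents.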

Bag of Features

1.1 [CVPR06] Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories

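Abstract: The paper carries the BOW idea over to images, with a particular local feature descriptor playing the role of a word. This, however, completely discards the spatial layout of the image, so the representation is "incapable of capturing shape or of segmenting an object from its background". The paper therefore combines it with spatial pyramid matching: "Our method involves repeatedly subdividing the image and computing histograms of local features at increasingly fine resolutions."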
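To make the "descriptor as word" idea concrete, here is a minimal sketch of how an orderless bag-of-features histogram can be built. It assumes a codebook of visual words is already available (for example from k-means, the kind of clustering the notes mention later) and assigns each descriptor to its nearest codeword; the 2-D toy descriptors and the function names are hypothetical.

```python
from collections import Counter

def nearest_word(descriptor, codebook):
    """Index of the codeword closest to the descriptor (squared Euclidean distance)."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(descriptor, codebook[i])),
    )

def bag_of_features(descriptors, codebook):
    """Histogram of visual-word occurrences; descriptor positions are ignored."""
    counts = Counter(nearest_word(d, codebook) for d in descriptors)
    return [counts[i] for i in range(len(codebook))]

# Hypothetical 2-D codebook and descriptors (real descriptors such as SIFT have 128 dimensions).
codebook = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
descriptors = [(0.1, 0.0), (0.9, 1.1), (0.2, 0.1), (0.1, 0.8)]
print(bag_of_features(descriptors, codebook))  # [2, 1, 1]
```

Because only counts are kept, two images with the same local patches in different arrangements get identical histograms, which is the limitation the spatial pyramid is meant to address.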

Comparison: the paper relates its method to earlier approaches in three respects:

1. Locally orderless images: SPM as an alternative formulation of a locally orderless image; instead of a Gaussian scale space of apertures, define a fixed hierarchy of rectangular windows.
2. Multiresolution histograms: fixing the resolution at which the features are computed, but varying the spatial resolution at which they are aggregated.
3. Subdivide and disorder: the best subdivision scheme may be achieved when multiple resolutions are combined in a principled way; the reason for the empirical success of "subdivide and disorder" techniques is the fact that they actually perform approximate geometric matching.

Pyramid Match Kernels:

Let X and Y be two sets of vectors in a d-dimensional feature space. Pyramid matching computes an approximate correspondence between X and Y by placing a sequence of increasingly coarser grids over the feature space and taking a weighted sum of the number of matches that occur at each level of resolution.

Two points match at a given resolution if they fall into the same grid cell. Resolution levels run from 0 to L.

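At level ℓ the grid has 2^ℓ cells along each dimension, so the d-dimensional space is divided into D = 2^(dℓ) cells. (Is a cell here the same thing as the cluster center that appears later?) The number of matches at level ℓ is given by the histogram intersection function (1).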

The weight of level ℓ is set to 1/2^(L-ℓ). Note that the matches found at a coarser (lower) level include all matches found at the finer levels, so the number of new matches at level ℓ is I^ℓ - I^(ℓ+1) for ℓ = 0, ..., L-1. The pyramid match kernel is then given by (2).

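$$I(H_X^\ell, H_Y^\ell) = \sum_{i=1}^{D} \min\big(H_X^\ell(i),\, H_Y^\ell(i)\big) \tag{1}$$

$$\kappa^L(X, Y) = I^L + \sum_{\ell=0}^{L-1} \frac{1}{2^{L-\ell}}\big(I^\ell - I^{\ell+1}\big) = \frac{1}{2^L} I^0 + \sum_{\ell=1}^{L} \frac{1}{2^{L-\ell+1}} I^\ell \tag{2}$$

Here $H_X^\ell(i)$ is the number of points of X that fall into cell i at level ℓ, and $I^\ell$ abbreviates $I(H_X^\ell, H_Y^\ell)$.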
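The matching and weighting scheme described above can be sketched as follows. This is a minimal pure-Python illustration under stated assumptions: points are scaled to lie in [0, 1)^d, and the names `level_histogram`, `intersection`, and `pyramid_match` are hypothetical, not taken from the paper or any library.

```python
from collections import Counter

def level_histogram(points, level):
    """Count points of [0, 1)^d per grid cell, with 2**level cells along each dimension."""
    side = 2 ** level
    return Counter(tuple(min(int(c * side), side - 1) for c in p) for p in points)

def intersection(h1, h2):
    """Number of matches: sum over cells of the smaller of the two counts."""
    return sum(min(n, h2[cell]) for cell, n in h1.items())

def pyramid_match(X, Y, L):
    """Return the kernel value and the per-level intersections I[0..L]."""
    I = [intersection(level_histogram(X, l), level_histogram(Y, l))
         for l in range(L + 1)]
    # Finest-level matches count fully; new matches at level l get weight 1 / 2**(L - l).
    kernel = I[L] + sum((I[l] - I[l + 1]) / 2 ** (L - l) for l in range(L))
    return kernel, I

# Two small 1-D point sets (d = 1) as a toy example.
X = [(0.1,), (0.3,), (0.8,)]
Y = [(0.15,), (0.6,), (0.9,)]
kernel, I = pyramid_match(X, Y, L=2)
print(I, kernel)  # [3, 2, 2] 2.25
```

In this example all three points match at level 0 (a single cell), but only two pairs still share a cell at levels 1 and 2, so the one match that exists only at the coarsest level contributes 1/4 instead of 1.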

Spatial Matching Scheme

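The paper proposes to "perform pyramid matching in the two-dimensional image space, and use traditional clustering techniques in feature space." (For the features of an image, the image coordinates already carry the geometric information, so the vectors only need to be arranged in coordinate order.) (In feature space, clustering quantizes the features into M classes, called channels, which is roughly the "fall into the same cell" idea described above. H is the histogram count, and a smaller I means the two images are less similar.)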

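$$K^L(X, Y) = \sum_{m=1}^{M} \kappa^L(X_m, Y_m) \tag{3}$$

where $X_m$ and $Y_m$ are the image coordinates of the features of channel m found in the two images.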

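Dimension: the kernel in the formula above does not have to be evaluated as an explicit sum. The weighted histograms of all channels at all levels can be concatenated into one long vector, and a single histogram intersection is applied to it. The vector has $M \sum_{\ell=0}^{L} 4^\ell = M (4^{L+1} - 1)/3$ dimensions; for M = 400 and L = 3 this is 34000, so the vector is long and sparse.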

Normalize all histograms by the total weight of all features in the image.
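The long-vector view and the normalization above can be sketched as follows. This is a rough illustration under several assumptions: each feature is given as `(x, y, channel)` with coordinates scaled to [0, 1), every feature has weight 1 (so the total weight is simply the feature count), and the names `spm_vector` and `spm_kernel` are hypothetical.

```python
def spm_vector(features, M, L, normalize=True):
    """Concatenate the weighted spatial histograms of all M channels at levels 0..L.

    features: iterable of (x, y, channel), with x, y in [0, 1) and channel in range(M).
    """
    features = list(features)
    total = len(features) if (normalize and features) else 1
    vec = []
    for level in range(L + 1):
        side = 2 ** level  # the image is split into side x side cells
        weight = 1 / 2 ** L if level == 0 else 1 / 2 ** (L - level + 1)
        hist = [0.0] * (M * side * side)
        for x, y, m in features:
            cx = min(int(x * side), side - 1)
            cy = min(int(y * side), side - 1)
            hist[(m * side + cy) * side + cx] += weight / total
        vec.extend(hist)
    return vec

def spm_kernel(feats_a, feats_b, M, L):
    """Single histogram intersection over the two long vectors."""
    va, vb = spm_vector(feats_a, M, L), spm_vector(feats_b, M, L)
    return sum(min(a, b) for a, b in zip(va, vb))

print(len(spm_vector([], M=400, L=3)))  # 34000

a = [(0.1, 0.1, 0), (0.9, 0.9, 1)]
b = [(0.2, 0.2, 0), (0.1, 0.9, 1)]
print(spm_kernel(a, b, M=2, L=1))  # 0.75
```

In the toy example both images contain one feature per channel, so they agree fully at level 0; at level 1 only the channel-0 features share a cell, which is why the value drops below 1. With this normalization an image compared with itself scores 1, since the level weights sum to 1.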

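[Images not preserved: (1) the histogram intersection formula; (2) a three-panel figure, (a) to (c), illustrating the histogram intersection kernel over a pyramid.]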

histogram intersection function

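This function measures the similarity between histograms built from features. The formula (i.e. Eq. (1)) is:

$$I(H_X^\ell, H_Y^\ell) = \sum_{i=1}^{D} \min\big(H_X^\ell(i),\, H_Y^\ell(i)\big)$$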

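Figure (2) above illustrates the histogram intersection kernel. In (a), y and z are two data distributions (point sets), and the three plots correspond to three pyramid levels. The evenly spaced dashed lines in each level mark the histogram bin boundaries; the larger the level L, the narrower the bins and the more of them there are. The positions of the red and blue points are fixed, but depending on the bin width they fall into different bins, as shown in (b). Panel (c) shows the result computed at each level, obtained by intersecting the two histograms in (b); the intersection count is given below each plot, e.g. x₀=2, x₁=4, x₂=3 (the original figure says 5; is that a mistake?).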

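Q:

Comparing against the example and the code: the role of (1) here appears to be as follows. H and H' (the histograms of feature counts; note that I there does not mean the same thing as in the paper) are first obtained through the SPM-style kernel functions (2)/(3). A cell is a cluster center, and a match means falling on the same cluster center. H is the histogram accumulated per cluster, and (1) is then used to compute I, i.e. the number of matches, via the histogram intersection function.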

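Going by the paper, (1) should be computed first and (2)/(3) afterwards. For more detail see "The Pyramid Match Kernel: Discriminative Classification with Sets of Image Features". My own understanding is that the weight constants and the channels in (2)/(3) can be multiplied into the histograms, or taken into account from the start, so the example and the code appear to handle (2)/(3) first and compute (1) last to determine the number of matches, i.e. the similarity of the two histograms (the intersection kernel). The paper itself gives an expression for this [formula not preserved].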
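The point about folding the weights in first can be checked numerically. Because every level weight w is positive, min(w·a, w·b) = w·min(a, b), so intersecting first and weighting afterwards gives the same value as weighting the histograms first and intersecting once. The sketch below assumes the per-level histograms are already available as plain lists ordered from level 0 (coarsest) to level L (finest); all names and the toy histograms are hypothetical.

```python
def intersection(h1, h2):
    """Histogram intersection: sum of bin-wise minima."""
    return sum(min(a, b) for a, b in zip(h1, h2))

def level_weights(L):
    """Weights of I^0..I^L in the second form of the kernel: 1/2^L, then 1/2^(L-l+1)."""
    return [1 / 2 ** L] + [1 / 2 ** (L - l + 1) for l in range(1, L + 1)]

def intersect_then_weight(pyr_a, pyr_b):
    """Compute (1) per level first, then apply the weights of (2)."""
    w = level_weights(len(pyr_a) - 1)
    return sum(wl * intersection(a, b) for wl, a, b in zip(w, pyr_a, pyr_b))

def weight_then_intersect(pyr_a, pyr_b):
    """Fold the weights into the histograms, concatenate, then apply (1) once."""
    w = level_weights(len(pyr_a) - 1)
    long_a = [wl * v for wl, h in zip(w, pyr_a) for v in h]
    long_b = [wl * v for wl, h in zip(w, pyr_b) for v in h]
    return intersection(long_a, long_b)

# Toy 1-D pyramids with 1, 2 and 4 bins (levels 0, 1, 2).
pyr_a = [[3], [2, 1], [1, 1, 0, 1]]
pyr_b = [[3], [1, 2], [1, 0, 1, 1]]
print(intersect_then_weight(pyr_a, pyr_b))  # 2.25
print(weight_then_intersect(pyr_a, pyr_b))  # 2.25
```

Both orders give the same kernel value, which is consistent with the reading that the code simply applies the weights before the final intersection.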

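Local versus global feature representations: the paper describes SPM as an "approximate global geometric correspondence", so how should "an alternative formulation of a locally orderless image" be understood? How are local and global features traditionally defined, and what are some examples of each?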

PS:

Partly sourced from: http://blog.csdn.net/v_JULY_v
