After reading the first paper, this one feels heavily padded... Its main content concentrates on the similarity-measure pipeline and on the extra memory-management work done to parallelize and speed up the computation; the depth-related content is the same as in the first paper, so it is not excerpted here.
This is the third paper in the LINEMOD line of work, focused mainly on elaborating the similarity measure they propose.
- Similarity Measure:
The unoptimized similarity measure is \(\varepsilon_{Steger}(I,T,c)=\sum_{r\in P}\left| \cos(ori(O,r)- ori(I,c+r))\right|\), where \(ori(O,r)\) is the gradient orientation in radians at location \(r\) in a reference image \(O\) of the object to detect, and \(ori(I, c+r)\) is the gradient orientation at \(c\) shifted by \(r\) in the input image \(I\). A list \(P\) defines the locations \(r\) to be considered in \(O\), so the template is \(T = (O, P)\). This measure is robust to background clutter, but not to small shifts and deformations. We therefore introduce a similarity measure that, for each gradient orientation on the object, searches in a neighborhood of the associated gradient location for the most similar orientation in the input image. It is formalized as \(\varepsilon(I,T,c)=\sum_{r\in P}\left(\max_{t\in R(c+r)}\left| \cos(ori(O,r)- ori(I,t))\right|\right)\), where \(R(c+r) = \left[c+r-\frac{T}{2}, c+r+\frac{T}{2}\right]\times\left[c+r-\frac{T}{2}, c+r+\frac{T}{2}\right]\) is the neighborhood centered on \(c+r\) (with a slight abuse of notation, \(T\) here also denotes the neighborhood size). To increase robustness, for each image location we use the gradient orientation of the color channel whose gradient magnitude is largest: given an RGB color image \(I\), the gradient orientation map \(I_G(x)\) at location \(x\) is computed as \(I_G(x) = ori(\hat{C}(x))\) with \(\hat{C}(x) = \arg\max_{C\in \{R,G,B\}}\left\| \frac{\partial C}{\partial x} \right\|\).
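A minimal NumPy sketch of this measure, not the paper's implementation: the orientation map is taken from the color channel with the strongest gradient, and \(\varepsilon(I,T,c)\) is evaluated by a brute-force search over each neighborhood \(R(c+r)\). The template representation as a list of `(offset, orientation)` pairs and all function names are assumptions made for illustration.

```python
import numpy as np

def gradient_orientation_map(img_rgb):
    # Per-pixel gradient orientation, taken from the color channel whose
    # gradient magnitude is largest (the \hat{C}(x) selection above).
    img = img_rgb.astype(np.float64)
    gy = np.gradient(img, axis=0)
    gx = np.gradient(img, axis=1)
    mag = gx ** 2 + gy ** 2                      # (H, W, 3) squared magnitudes
    best = np.argmax(mag, axis=2)                # strongest channel per pixel
    rows, cols = np.indices(best.shape)
    return np.arctan2(gy[rows, cols, best], gx[rows, cols, best])

def similarity(ori_I, template, c, T=8):
    # epsilon(I, T, c): for every template point r, search the neighborhood
    # R(c+r) for the orientation most similar to ori(O, r).
    H, W = ori_I.shape
    half = T // 2
    score = 0.0
    for (dy, dx), ori_O_r in template:           # template: [((dy, dx), angle), ...]
        y, x = c[0] + dy, c[1] + dx
        patch = ori_I[max(0, y - half):min(H, y + half + 1),
                      max(0, x - half):min(W, x + half + 1)]
        if patch.size:
            score += np.max(np.abs(np.cos(ori_O_r - patch)))
    return score
```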

- Spreading the Orientations:
We compute a binary representation \(J\) of the gradient orientations around each image location. First, the orientations are quantized into a small number \(n_o\) of values. The quantized gradient orientations of the input image are then spread around their locations to obtain a new representation of the original image: each location \(m\) stores a binary string in which each individual bit corresponds to one quantized orientation, and is set to 1 if that orientation is present in the \([-\frac{T}{2}, \frac{T}{2}] \times [-\frac{T}{2}, \frac{T}{2}]\) neighborhood of \(m\). These strings will later be used as indices to access lookup tables.
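A sketch of the quantization and spreading step, under assumed parameters (\(n_o = 8\) orientation bins, spreading size `T = 8`); it only illustrates the OR-spreading idea, not the paper's optimized bit-image implementation.

```python
import numpy as np

N_ORI = 8  # n_o quantized orientation bins (assumed value)

def quantize_orientations(ori_map):
    # Map each orientation (radians) to one of N_ORI bins; orientations are
    # taken modulo pi since only the gradient direction matters, and each
    # pixel becomes a one-hot byte with the bit of its bin set.
    bins = np.floor((ori_map % np.pi) / np.pi * N_ORI).astype(np.uint8) % N_ORI
    return (1 << bins).astype(np.uint8)

def spread_orientations(quantized, T=8):
    # Binary image J: bit k of J[m] is set if orientation k occurs anywhere in
    # the [-T/2, T/2] x [-T/2, T/2] neighborhood of m (OR of shifted copies).
    H, W = quantized.shape
    J = np.zeros_like(quantized)
    half = T // 2
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            shifted = np.zeros_like(quantized)
            src_y = slice(max(0, dy), H + min(0, dy))
            src_x = slice(max(0, dx), W + min(0, dx))
            dst_y = slice(max(0, -dy), H + min(0, -dy))
            dst_x = slice(max(0, -dx), W + min(0, -dx))
            shifted[dst_y, dst_x] = quantized[src_y, src_x]
            J |= shifted
    return J
```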

- Precomputing Response Maps:
\(J\) is used together with lookup tables to precompute the value of the max operation for each location and each possible orientation \(ori(O, r)\) in the template. The results are stored in 2D response maps \(S_i\). We use one lookup table \(\tau_i\) for each of the \(n_o\) quantized orientations, computed as \(\tau_i[L]=\max_{l\in L}\left| \cos(i-l)\right|\), where \(i\) is the index of a quantized orientation and \(L\) is the list of orientations appearing in a local neighborhood, i.e. the binary string read from \(J\). For each orientation \(i\), the value of the response map \(S_i\) at each location \(c\) is then \(S_i(c)=\tau_i[J(c)]\), and the similarity measure becomes \(\varepsilon(I, T, c)=\sum_{r\in P}S_{ori(O,r)}(c+r)\).
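A sketch of the lookup tables \(\tau_i\) and the response maps \(S_i\); the representative angles `ANGLES` and the template layout (a quantized orientation index per point) are assumptions for illustration. `tau[i][J]` performs the per-pixel table lookup \(S_i(c)=\tau_i[J(c)]\), and `score` is the final sum \(\sum_{r\in P}S_{ori(O,r)}(c+r)\).

```python
import numpy as np

N_ORI = 8
ANGLES = np.arange(N_ORI) * np.pi / N_ORI        # representative angle per bin

def build_lookup_tables():
    # tau_i[L]: for every possible bit combination L of spread orientations,
    # the best |cos| similarity against quantized orientation i.
    tau = np.zeros((N_ORI, 1 << N_ORI), dtype=np.float32)
    for i in range(N_ORI):
        for L in range(1, 1 << N_ORI):
            present = [l for l in range(N_ORI) if L & (1 << l)]
            tau[i, L] = max(abs(np.cos(ANGLES[i] - ANGLES[l])) for l in present)
    return tau

def response_maps(J, tau):
    # S_i(c) = tau_i[J(c)]: one precomputed map per quantized orientation,
    # obtained by a plain table lookup on the binary image J.
    return [tau[i][J] for i in range(N_ORI)]

def score(template, S, c):
    # epsilon(I, T, c) as a sum of precomputed responses.
    return sum(S[ori_idx][c[0] + dy, c[1] + dx] for (dy, dx), ori_idx in template)
```

Since there are only \(2^{n_o}\) possible binary strings, each \(\tau_i\) is a tiny table (256 entries for \(n_o = 8\)), so the max operation is paid once per table entry instead of once per image location and template point.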

- Linearizing the Memory for Parallelization:
We restructure each response map so that the values of one row that are \(T\) pixels apart on the \(x\) axis are stored next to each other in memory. Once a row is finished, we continue with the row that is \(T\) pixels further along the \(y\) axis.
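A sketch of this linearization, assuming each response map is a NumPy array; every one of the \(T \times T\) linear memories holds the values that share the same offset inside a \(T \times T\) cell, so accumulating a shifted response map becomes a pass over contiguous memory, which is what the parallelization relies on.

```python
import numpy as np

def linearize(S_i, T=8):
    # Restructure one response map so that values T pixels apart on the x axis
    # (and, row by row, T pixels apart on the y axis) become contiguous.
    # Returns T*T linear memories, one per (y offset, x offset) within a cell.
    memories = []
    for oy in range(T):
        for ox in range(T):
            memories.append(S_i[oy::T, ox::T].ravel().copy())
    return memories
```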
