Note: original writing takes effort. If you repost, please credit the original author and source. Thanks for your support!
Background
Person re-identification (Person ReID) is the task of, given an image or video of a pedestrian (the probe), identifying that same pedestrian in a library of images/videos (the gallery) captured by a surveillance camera network. It can be viewed as a sub-problem of content-based image retrieval (CBIR).
Paper: Person Transfer GAN to Bridge Domain Gap for Person Re-identification
Venue: CVPR 2018
Abstract: Although the performance of person Re-Identification (ReID) has been significantly boosted, many challenging issues in real scenarios have not been fully investigated, e.g., the complex scenes and lighting variations, viewpoint and pose changes, and the large number of identities in a camera network. To facilitate the research towards conquering those issues, this paper contributes a new dataset called MSMT17 with many important features, e.g., 1) the raw videos are taken by a 15-camera network deployed in both indoor and outdoor scenes, 2) the videos cover a long period of time and present complex lighting variations, and 3) it contains currently the largest number of annotated identities, i.e., 4,101 identities and 126,441 bounding boxes. We also observe that domain gap commonly exists between datasets, which essentially causes severe performance drop when training and testing on different datasets. This results in that available training data cannot be effectively leveraged for new testing domains. To relieve the expensive costs of annotating new training samples, we propose a Person Transfer Generative Adversarial Network (PTGAN) to bridge the domain gap. Comprehensive experiments show that the domain gap could be substantially narrowed down by the PTGAN.
Main Content
MSMT17
Dataset website: http://www.pkuvmc.com
Existing Person ReID datasets suffer from several shortcomings:
- small data scale
- a single, homogeneous scene
- short data-collection time spans, hence little lighting variation
- suboptimal annotation procedures
To address these, this paper releases a new Person ReID dataset, MSMT17, the largest Person ReID dataset to date: 126,441 bounding boxes and 4,101 identities, captured by 15 cameras covering both indoor and outdoor scenes, with bounding boxes produced by the more advanced Faster R-CNN detector.
Person Transfer GAN (PTGAN)
The Domain Gap Phenomenon
For example, a model trained on the CUHK03 dataset achieves a rank-1 accuracy of only 2.0% when tested on the PRID dataset. Training and testing on different Person ReID datasets causes a drastic drop in ReID performance, and this drop is pervasive. It means that a model trained on existing labeled data cannot be applied directly to a new dataset, so studying how to reduce the impact of the domain gap, and thereby reuse existing annotations, is well worthwhile. To this end, the paper proposes the PTGAN model.
The causes of the domain gap are complex: lighting, image resolution, human ethnicity, season, background, and other factors may all contribute.
For instance, when performing Person ReID on dataset B, to better exploit the training data of an existing dataset A, we can try to transfer the pedestrian images from dataset A into the target dataset B. Because of the domain gap, the transfer algorithm must satisfy two requirements:
- The transferred pedestrian images should have a style consistent with the target dataset, to minimize the performance drop caused by the style-mismatch component of the domain gap.
- The appearance features and identity cues that distinguish different pedestrians must remain unchanged after the transfer, because a pedestrian carries the same label before and after the transfer: it should still be the same person.
Since person transfer is similar to the unpaired image-to-image translation task, the paper builds its Person Transfer GAN on top of Cycle-GAN, which performs well on that task. The PTGAN loss \(L_{PTGAN}\) is designed as (reconstructed here from the definitions below):

\[L_{PTGAN} = L_{Style} + \lambda_1 L_{ID}\]

where:
\(L_{Style}\): the style loss
\(L_{ID}\): the identity loss
\(\lambda_1\): the parameter for the trade-off between the two losses above
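As a minimal sketch of how the two terms combine (the default `lambda1 = 10.0` is an illustrative placeholder, not a value taken from the paper):

```python
def ptgan_loss(style_loss, id_loss, lambda1=10.0):
    """Overall PTGAN objective: the style loss plus the identity loss,
    traded off by lambda1.
    Note: lambda1 = 10.0 is an illustrative placeholder, not a paper value."""
    return style_loss + lambda1 * id_loss
```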
Defining the following notation, \(L_{Style}\) can be expressed as shown below:
\(G\): the style mapping function from dataset A to dataset B
\(\overline{G}\): the style mapping function from dataset B to dataset A
\(D_A\): the style discriminator for dataset A
\(D_B\): the style discriminator for dataset B

\[L_{Style} = L_{GAN}(G, D_B, a, b) + L_{GAN}(\overline{G}, D_A, b, a) + \lambda_2 L_{cyc}(G, \overline{G})\]

where:
\(L_{GAN}\): the standard adversarial loss
\(L_{cyc}\): the cycle consistency loss
\(\lambda_2\): the weight of the cycle consistency loss
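The style loss can be sketched numerically as follows. This is an assumption-laden illustration: it uses a least-squares adversarial loss (one common instantiation of the standard adversarial loss; Cycle-GAN uses this form) and an L1 cycle-consistency penalty, operating on precomputed discriminator outputs and reconstructions rather than on real networks.

```python
import numpy as np

def adversarial_loss(d_real, d_fake):
    """Least-squares GAN loss on discriminator outputs: real samples are
    pushed toward 1, generated (fake) samples toward 0."""
    return np.mean((d_real - 1.0) ** 2) + np.mean(d_fake ** 2)

def cycle_consistency_loss(a, a_reconstructed, b, b_reconstructed):
    """L1 penalty enforcing G_bar(G(a)) ~ a and G(G_bar(b)) ~ b."""
    return np.mean(np.abs(a - a_reconstructed)) + np.mean(np.abs(b - b_reconstructed))

def style_loss(d_b_real, d_b_fake, d_a_real, d_a_fake,
               a, a_rec, b, b_rec, lambda2=10.0):
    """L_Style = L_GAN(A->B) + L_GAN(B->A) + lambda2 * L_cyc.
    lambda2 = 10.0 is an illustrative placeholder."""
    return (adversarial_loss(d_b_real, d_b_fake)
            + adversarial_loss(d_a_real, d_a_fake)
            + lambda2 * cycle_consistency_loss(a, a_rec, b, b_rec))
```

With perfect discriminators (real scored 1, fake scored 0) and perfect reconstruction, the sketch evaluates to zero, matching the intuition that both sub-losses vanish at their optimum.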
Defining the following notation, \(L_{ID}\) can be expressed as shown below:
\(a\) and \(b\): the original images from dataset A and dataset B
\(G(a)\) and \(\overline{G}(b)\): the images transferred from images a and b
\(M(a)\) and \(M(b)\): the foreground masks of images a and b

\[L_{ID} = \lVert (G(a) - a) \odot M(a) \rVert_2 + \lVert (\overline{G}(b) - b) \odot M(b) \rVert_2\]
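A rough sketch of the identity loss: the difference between each image and its transferred version is masked to the foreground (the pedestrian region), so the loss ignores background changes while penalizing any change to the person's appearance. The use of a mean squared norm here is my assumption about the exact normalization.

```python
import numpy as np

def identity_loss(a, g_a, mask_a, b, gbar_b, mask_b):
    """L_ID sketch: squared distance between each original image and its
    transferred version, restricted to the foreground by elementwise
    masking, so only changes to the pedestrian region are penalized."""
    return (np.mean(((g_a - a) * mask_a) ** 2)
            + np.mean(((gbar_b - b) * mask_b) ** 2))
```

Note that a transfer which alters only background pixels (where the mask is zero) incurs zero identity loss, which is exactly the constraint the second requirement above asks for.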
Example Transfer Results
Summary
- The paper releases MSMT17, a new dataset that is closer to real application scenarios; its complex, realistic setting makes it more challenging and more valuable for research.
- The paper proposes PTGAN, a model that reduces the impact of the domain gap, and demonstrates its effectiveness through experiments.