I. VLAD
1. NetVLAD
A classic work in visual place recognition.
Paper: https://arxiv.org/pdf/1511.07247.pdf
Code: https://github.com/Nanne/pytorch-NetVlad (testing is already complete)
Quick aside on GhostVLAD: it lowers the weight given to unclear (low-quality) images.
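To make the aggregation step concrete, below is a minimal PyTorch sketch of a NetVLAD-style pooling layer with optional ghost clusters in the spirit of GhostVLAD; the class and parameter names are my own and are not taken from the linked repo.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NetVLADPool(nn.Module):
    """Minimal NetVLAD-style pooling with optional GhostVLAD-style ghost clusters."""
    def __init__(self, dim=512, num_clusters=64, num_ghost=0):
        super().__init__()
        self.num_clusters = num_clusters
        total = num_clusters + num_ghost
        # 1x1 conv gives per-location soft-assignment logits over real + ghost clusters
        self.assign = nn.Conv2d(dim, total, kernel_size=1)
        # learnable centroids for the real clusters only
        self.centroids = nn.Parameter(torch.randn(num_clusters, dim) * 0.1)

    def forward(self, x):                                    # x: (B, D, H, W) local CNN features
        B, D, H, W = x.shape
        soft = F.softmax(self.assign(x), dim=1)              # (B, K+G, H, W)
        soft = soft[:, :self.num_clusters]                   # ghost assignments are discarded, so
                                                             # low-quality content gets less weight
        x_flat = x.view(B, D, -1)                            # (B, D, N)
        soft = soft.reshape(B, self.num_clusters, -1)        # (B, K, N)
        vlad = torch.einsum('bkn,bdn->bkd', soft, x_flat)    # sum_n a_k(x_n) * x_n
        vlad = vlad - soft.sum(-1).unsqueeze(-1) * self.centroids  # minus sum_n a_k(x_n) * c_k
        vlad = F.normalize(vlad, dim=2)                      # intra-normalisation per cluster
        return F.normalize(vlad.flatten(1), dim=1)           # flat, L2-normalised global descriptor
```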
2. Patch-NetVLAD
A 2021 follow-up to NetVLAD.
Paper: https://arxiv.org/pdf/2103.01486v1.pdf
Code: https://github.com/QVPR/Patch-NetVLAD (likewise waiting on the dataset)
The dataset file './Pittsburgh250k/001/001048_pitch1_yaw7.jpg' is corrupted; a neighbouring image is used as a substitute for now, re-download when needed.
The batch size has to be lowered to 2 for it to run properly.
At the moment re-downloading the dataset works fine except for part 09; other corrupted files are temporarily replaced with neighbouring images.
Results with the 'speed' configuration:
====> Recall NetVLAD@1: 0.0135
====> Recall NetVLAD@5: 0.0389
====> Recall NetVLAD@10: 0.0497
====> Recall NetVLAD@20: 0.0580
====> Recall NetVLAD@50: 0.0634
====> Recall NetVLAD@100: 0.0660
====> Recall PatchNetVLAD@1: 0.0343
====> Recall PatchNetVLAD@5: 0.0483
====> Recall PatchNetVLAD@10: 0.0541
====> Recall PatchNetVLAD@20: 0.0593
====> Recall PatchNetVLAD@50: 0.0647
====> Recall PatchNetVLAD@100: 0.0660
Writing recalls to results/recalls.txt
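For reference, Recall@N above is the usual retrieval metric: the fraction of queries for which at least one of the top-N retrieved database images is a correct match. A small numpy sketch of how it is typically computed (function and variable names are mine, not taken from the Patch-NetVLAD code):

```python
import numpy as np

def recall_at_n(predictions, ground_truth, n_values=(1, 5, 10, 20, 50, 100)):
    """predictions: (num_queries, max_candidates) database indices ranked by similarity.
    ground_truth: per-query arrays of correct database indices (e.g. images within 25 m)."""
    correct_at_n = np.zeros(len(n_values))
    for q, pred in enumerate(predictions):
        for i, n in enumerate(n_values):
            # a query counts as recalled at N once any of its top-N retrievals is correct
            if np.any(np.isin(pred[:n], ground_truth[q])):
                correct_at_n[i:] += 1   # recall is monotone in N, so credit every larger N too
                break
    return {n: c / len(predictions) for n, c in zip(n_values, correct_at_n)}

# toy usage: 3 queries, top-5 retrievals each
preds = np.array([[4, 9, 1, 7, 2], [3, 3, 8, 0, 5], [6, 2, 2, 9, 1]])
gt = [np.array([9]), np.array([7]), np.array([6, 1])]
print(recall_at_n(preds, gt, n_values=(1, 5)))   # {1: 0.333..., 5: 0.666...}
```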
3. Patch-NetVLAD+
An improved version of Patch-NetVLAD, though its reported numbers differ from those measured in the original Patch-NetVLAD paper; I have not read it closely.
Paper: https://arxiv.org/pdf/2202.05738.pdf
4. MultiRes-NetVLAD
A 2022 follow-up to NetVLAD.
Paper: https://arxiv.org/pdf/2202.09146.pdf
Code: https://github.com/Ahmedest61/MultiRes-NetVLAD
5.
Paper: https://arxiv.org/pdf/2010.09228.pdf
6. VLAD-SLAM
A 2016 work that integrates VLAD into loop-closure detection and compares it against SDC.
Paper: https://sci-hub.mksa.top/10.1109/icinfa.2016.7831876
7. Spatial Pyramid-Enhanced NetVLAD (2019)
A spatially enhanced VLAD, somewhat like an image pyramid; it also adjusts sample weights (tied to each epoch, with larger weights for samples that converge poorly).
Paper: https://sci-hub.st/10.1109/TNNLS.2019.2908982
8. DELG
Paper: https://arxiv.org/pdf/2001.05027.pdf
Code: https://github.com/tensorflow/models/tree/master/research/delf
9. CRN
A VLAD variant with learned local reweighting.
Paper: https://openaccess.thecvf.com/content_cvpr_2017/papers/Kim_Learned_Contextual_Feature_CVPR_2017_paper.pdf
A similar paper:
https://www.researchgate.net/publication/329857970_Learning_to_Fuse_Multiscale_Features_for_Visual_Place_Recognition
10. APA
A pyramid, attention-based VLAD.
Paper: https://arxiv.org/pdf/1808.00288v1.pdf
Here the experiments are run on NetVLAD descriptors after PCA; I am not sure whether I can do the same (see the sketch below).
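If I do try the same PCA step on NetVLAD descriptors, a minimal scikit-learn sketch would look like the following; the dimensions and variable names are placeholders, not values taken from the APA paper.

```python
import numpy as np
from sklearn.decomposition import PCA

# placeholder descriptors; real NetVLAD vectors are 64 clusters x 512 dims = 32768-D,
# commonly projected down to 4096-D (smaller numbers here just to keep the sketch light)
db_descs = np.random.randn(2000, 4096).astype(np.float32)
query_descs = np.random.randn(10, 4096).astype(np.float32)

pca = PCA(n_components=256, whiten=True)
db_reduced = pca.fit_transform(db_descs)        # fit the projection on database descriptors only
query_reduced = pca.transform(query_descs)      # apply the same projection to the queries

# re-normalise after the projection before nearest-neighbour search
db_reduced /= np.linalg.norm(db_reduced, axis=1, keepdims=True)
query_reduced /= np.linalg.norm(query_reduced, axis=1, keepdims=True)
```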
To read:
https://arxiv.org/pdf/2107.02440.pdf
https://blog.csdn.net/qq_24954345/article/details/86176862 (contains a spatio-temporal VLAD)
11. VSA
Encodes semantic information into the vector; heavy on the mathematics.
Paper: http://www.roboticsproceedings.org/rss17/p083.pdf
12. DELF
Attention-based local features used for retrieval (see the sketch below).
Paper: https://arxiv.org/pdf/1612.06321.pdf
Code: https://github.com/nashory/DeLF-pytorch
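The core idea behind DELF, scoring dense CNN features with a small attention head and keeping only the strongest locations as local descriptors, can be sketched roughly as below; this is my own simplified PyTorch sketch, not the code from the linked repos.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionSelect(nn.Module):
    """Score dense CNN features with a tiny attention head and keep the top-k
    locations as local features for retrieval (simplified DELF-style idea)."""
    def __init__(self, dim=1024, hidden=512, top_k=200):
        super().__init__()
        self.score_head = nn.Sequential(
            nn.Conv2d(dim, hidden, 1), nn.ReLU(inplace=True),
            nn.Conv2d(hidden, 1, 1), nn.Softplus())   # non-negative attention scores
        self.top_k = top_k

    def forward(self, feats):                                # feats: (B, D, H, W)
        B, D, H, W = feats.shape
        scores = self.score_head(feats).view(B, -1)          # (B, H*W) one score per location
        feats = feats.view(B, D, -1).transpose(1, 2)         # (B, H*W, D)
        k = min(self.top_k, H * W)
        top_scores, idx = scores.topk(k, dim=1)              # strongest locations only
        idx = idx.unsqueeze(-1).expand(-1, -1, D)
        selected = feats.gather(1, idx)                      # (B, k, D) local descriptors
        return selected, top_scores
```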
II. Transformer
1. Fundamentals
Attention mechanism:
https://zhuanlan.zhihu.com/p/52119092
Transformer:
https://zhuanlan.zhihu.com/p/82312421
https://blog.csdn.net/longxinchen_ml/article/details/86533005
CVPR2021:https://blog.csdn.net/amusi1994/article/details/117433649?spm=1001.2101.3001.6650.4&utm_medium=distribute.pc_relevant.none-task-blog-2%7Edefault%7ECTRLIST%7ERate-4.pc_relevant_default&depth_1-utm_source=distribute.pc_relevant.none-task-blog-2%7Edefault%7ECTRLIST%7ERate-4.pc_relevant_default&utm_relevant_index=9
ViT must-read: https://blog.csdn.net/u014546828/article/details/117657912?spm=1001.2101.3001.6650.1&utm_medium=distribute.pc_relevant.none-task-blog-2%7Edefault%7ECTRLIST%7ERate-1.pc_relevant_default&depth_1-utm_source=distribute.pc_relevant.none-task-blog-2%7Edefault%7ECTRLIST%7ERate-1.pc_relevant_default&utm_relevant_index=2
Transformer explainers:
https://luweikxy.gitbook.io/machine-learning-notes/self-attention-and-transformer#skip%20connection%E5%92%8CLayer%20Normalization
https://zhuanlan.zhihu.com/p/48508221
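As a quick reference for the attention mechanism covered in the links above, a minimal sketch of scaled dot-product (self-)attention in PyTorch:

```python
import math
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v, mask=None):
    """q, k, v: (batch, heads, seq_len, head_dim). Returns the attended values."""
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)   # similarity of every query with every key
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float('-inf'))
    weights = F.softmax(scores, dim=-1)                  # attention weights sum to 1 over the keys
    return weights @ v                                   # weighted sum of the values

# toy usage: one batch, two heads, sequence of 5 tokens, 16-dim heads
q = k = v = torch.randn(1, 2, 5, 16)
out = scaled_dot_product_attention(q, k, v)
print(out.shape)   # torch.Size([1, 2, 5, 16])
```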
2. PPT-Net
Paper: https://blog.csdn.net/weixin_43882112/article/details/121440070
Code: https://github.com/fpthink/PPT-Net
LiDAR-based; can be skipped.
3.
Paper: https://arxiv.org/pdf/2203.03397.pdf
4. TransVPR (CVPR 2022)
A Transformer applied to VPR; its results are better than the original Patch-NetVLAD.
Paper: https://arxiv.org/pdf/2201.02001.pdf
Explanation: https://zhuanlan.zhihu.com/p/461437620
Code: none
5. ViT (2020)
Paper: https://arxiv.org/pdf/2010.11929v1.pdf
Code: https://github.com/google-research/vision_transformer , https://github.com/lucidrains/vit-pytorch , https://github.com/likelyzhao/vit-pytorch
Explanation: https://blog.csdn.net/qq_44055705/article/details/113825863
6. Self-supervising Fine-grained Region Similarities for Large-scale Image Localization
Self-supervised image similarity learning with iterative training, feeding the previous epoch's results into the next epoch.
Paper: https://arxiv.org/pdf/2006.03926.pdf
Code: https://github.com/yxgeee/OpenIBL
Explanation: https://zhuanlan.zhihu.com/p/169596514
7. https://arxiv.org/abs/2201.005201
8. Swin Transformer
A ViT variant that computes attention inside special (shifted) local windows; a sketch follows below.
Paper: https://arxiv.org/pdf/2103.14030.pdf
Code: https://github.com/microsoft/Swin-Transformer
Explanation: https://zhuanlan.zhihu.com/p/367111046
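To illustrate the 'special window' idea: Swin computes self-attention inside fixed-size local windows and shifts the window grid between consecutive blocks so that neighbouring windows can exchange information. A rough sketch of the window partition and shift steps (simplified; names are mine, not from the official repo):

```python
import torch

def window_partition(x, window_size):
    """Split a feature map (B, H, W, C) into non-overlapping windows of shape
    (num_windows*B, window_size*window_size, C); self-attention is then
    computed independently inside each window."""
    B, H, W, C = x.shape
    x = x.view(B, H // window_size, window_size, W // window_size, window_size, C)
    return x.permute(0, 1, 3, 2, 4, 5).reshape(-1, window_size * window_size, C)

x = torch.randn(2, 56, 56, 96)               # e.g. a stage-1 Swin-T feature map
windows = window_partition(x, window_size=7)
print(windows.shape)                          # (2*8*8, 49, 96)

# between consecutive blocks the grid is shifted so neighbouring windows overlap
shifted = torch.roll(x, shifts=(-3, -3), dims=(1, 2))   # shift by window_size // 2
```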
9. Conformer
Bridges a CNN and a Transformer.
Paper: https://arxiv.org/pdf/2105.03889.pdf
Code: https://github.com/pengzhiliang/Conformer
Explanation: https://blog.csdn.net/qq_15698613/article/details/119723545?utm_medium=distribute.pc_relevant.none-task-blog-2~default~baidujs_title~default-1.pc_relevant_antiscanv2&spm=1001.2101.3001.4242.2&utm_relevant_index=4
Already runs successfully; this one is worth referencing.
The similar Mobile-Former still needs a look.
10. EViT
Code: https://github.com/youweiliang/evit
Explanation: https://zhuanlan.zhihu.com/p/440294002
11. Transformer + CNN
Paper: https://arxiv.org/pdf/2106.03180.pdf
Code: https://github.com/yun-liu/TransCNN
12. PVT
A pyramid Transformer.
Paper: https://arxiv.org/pdf/2102.12122.pdf
Code: https://github.com/whai362/PVT
13. MViT
A multi-scale (hierarchical) Transformer.
Paper: https://arxiv.org/pdf/2104.11227.pdf
Code: https://github.com/facebookresearch/SlowFast
14. DeiT
Knowledge distillation (a loss sketch follows after the results below).
Paper: https://arxiv.org/pdf/2012.12877.pdf
Code: https://github.com/facebookresearch/deit/issues?q=is%3Aclosed
DeiT III:
Paper: https://arxiv.org/pdf/2204.07118.pdf
Results when plugged into VLAD (because of problems with timm):
====> Recall@1: 0.0503
====> Recall@5: 0.1284
====> Recall@10: 0.1976
====> Recall@20: 0.2946
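On the knowledge-distillation idea behind DeiT: the student ViT carries an extra distillation token whose head is trained on the hard labels predicted by a convnet teacher, while the class-token head is trained on the ground truth. A rough sketch of that loss, assuming the two student heads and a frozen teacher are already available (names are mine):

```python
import torch
import torch.nn.functional as F

def deit_hard_distillation_loss(cls_logits, dist_logits, teacher_logits, targets):
    """cls_logits:  student output from the class-token head
    dist_logits: student output from the distillation-token head
    teacher_logits: frozen convnet teacher output on the same batch
    targets: ground-truth labels."""
    teacher_labels = teacher_logits.argmax(dim=1)              # hard pseudo-labels from the teacher
    loss_cls = F.cross_entropy(cls_logits, targets)            # class token learns the true labels
    loss_dist = F.cross_entropy(dist_logits, teacher_labels)   # distillation token mimics the teacher
    return 0.5 * loss_cls + 0.5 * loss_dist

# toy usage with random logits for a 1000-class problem
B, C = 4, 1000
loss = deit_hard_distillation_loss(torch.randn(B, C), torch.randn(B, C),
                                   torch.randn(B, C), torch.randint(0, C, (B,)))
print(loss.item())
```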
15. PatchConvNet
Modifies the ViT architecture; from the same team as DeiT.
Paper: https://arxiv.org/pdf/2112.13692.pdf
Code: same as DeiT
16. DOLG-EfficientNet
S-Transformer VPR
https://arxiv.org/pdf/2110.03786.pdf
Winner of the 2021 retrieval challenge; see https://jishuin.proginn.com/p/763bfbd6b138 for reference.
III. Other forms of VPR
1. NYU-VPR
IROS 2021
Paper: https://arxiv.org/pdf/2110.09004.pdf
This is a dataset.
2. HSD
ITSC 2021
Paper: https://arxiv.org/pdf/2109.14916.pdf
I could not quite understand this one.
3. Bringing in LiDAR point clouds
Demonstrates the feasibility of working from point-cloud intensity images.
Paper title: Visual Place Recognition using LiDAR Intensity Information
Reference: https://xuwuzhou.top/%E8%AE%BA%E6%96%87%E9%98%85%E8%AF%BB63/
IV. Papers from other fields worth referencing
1. See More for Scene: Pairwise Consistency Learning for Scene Classification
Paper: https://proceedings.neurips.cc/paper/2021/file/27d52bcb3580724eb4cbe9f2718a9365-Paper.pdf
Uses focus regions for scene classification.
2. Selective Convolutional Descriptor Aggregation for Fine-Grained Image Retrieval
Image retrieval based on key regions.
https://openaccess.thecvf.com/content_CVPR_2019/papers/Chen_Destruction_and_Construction_Learning_for_Fine-Grained_Image_Recognition_CVPR_2019_paper.pdf