Multimodal Chinese Datasets

Reposted article, 2022-03-09 09:20

(1) Huawei Wukong
Wukong, Huawei, https://wukong-dataset.github.io/wukong-dataset/
The dataset contains 100 million <image, text> pairs.

(2) Chinese versions of Flickr
flickr30k-cn, flickr8k-cn
https://github.com/weiyuk/fluent-cap

(3) Chinese version of COCO
https://github.com/li-xirong/coco-cn

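(4) MUGE
https://github.com/MUGE-2021, including the e-commerce dataset ECommerce-IC
MUGE (牧歌, Multimodal Understanding and Generation Evaluation) is described as the industry's first large-scale Chinese multimodal evaluation benchmark. It was released jointly by DAMO Academy, Zhejiang University, and the Alibaba Cloud Tianchi platform, with assistance from the Computer Vision Technical Committee of the China Computer Federation (CCF-CV). The benchmark currently covers multimodal understanding and generation tasks: image captioning, image-text retrieval, and text-to-image generation.
Models: M6, OFA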

M6-Corpus: J. Lin, R. Men, A. Yang, C. Zhou, M. Ding, Y. Zhang, P. Wang, A. Wang, L. Jiang, X. Jia, et al. M6: A Chinese multimodal pretrainer. arXiv preprint arXiv:2103.00823, 2021.

(5) WuDaoCorpora
Models: CogView, WuDao 2.0, WenLan 2.0

WuDaoMM: a large-scale multimodal dataset for pre-training models
https://github.com/BAAI-WuDao/WuDaoMM/

(6) Product1M
1 million image-text pairs
X. Zhan, Y. Wu, X. Dong, Y. Wei, M. Lu, Y. Zhang, H. Xu, and X. Liang. Product1m: Towards weakly supervised instance-level product retrieval via cross-modal pretraining. In International Conference on Computer Vision, 2021.
