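Resource Roundup & Diary: Entry Page (updated daily)

Resources for the week of 2021-05-06:
1. Resource roundup:
★★★★ Notes on three 2020 semantic matching papers: "A summary of representation-based methods for text matching, information retrieval, and vector recall (for recall or coarse ranking)" - Zhihu (zhihu.com). Covers ColBERT, Poly-Encoder, and Pre-training Tasks for Embedding-based Large-scale Retrieval, three well-known recent models.
More notes on the same three papers: "Latest progress in relevance optimization for search and recommendation recall && coarse ranking, 2020" - Zhihu (zhihu.com)
SIGIR 2020 paper roundup: "SIGIR papers on text representation, retrieval re-ranking, and reading comprehension" - Zhihu (zhihu.com)
Paper notes: "DC-BERT at SIGIR 2020: decoupling question and document encoding to speed up the QA re-rank module" - Zhihu (zhihu.com) [√]
"Deep Learning Matching in Search and Recommendation", search part, by @后青春期的工程師, SIGIR tutorial [√]
"Deep Matching Models in Search (Part 1)", by @辛俊波, SIGIR tutorial [√]
"Deep Matching Models in Search (Part 2)", by @辛俊波, SIGIR tutorial [√]
A Deep Look into Neural Ranking Models for Information Retrieval, 2019-03, survey (Cheng Xueqi)
"Applications of BERT to Semantic Similarity Computation (Part 1)": DC-BERT, Poly-Encoders, ColBERT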

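# 3. Notes List

I. Uncategorized

Notes index:
DC-BERT, Poly-Encoder, and ColBERT all combine representation-based and interaction-based matching, and the three are similar in design.
Ways to improve IR tasks:
1. Pre-training for IR
2. Combining interaction and representation
3.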
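Poly-Encoder (ICLR 2020, Facebook): Poly-encoders: Architectures and Pre-training Strategies for Fast and Accurate Multi-sentence Scoring
Summary: For matching tasks, a compromise that combines the strengths of representation learning (fast online inference) and interaction learning (better accuracy).
Contribution 1: Proposes the Poly-encoder and evaluates it with detailed experiments.
Contribution 2: Examines how different pre-training strategies affect downstream tasks.
Notes: "Poly-Encoder (ICLR2020)" - Zhihu (zhihu.com); "Latest progress in relevance optimization for search and recommendation recall && coarse ranking, 2020"
Paper link: 1905.01969.pdf (arxiv.org)
Code links:
Unofficial PyTorch implementation: chijames/Poly-Encoder (github.com); large gap in results
Unofficial PyTorch implementation: sfzhou/PolyEncoder; large gap in results
Unofficial PyTorch implementation: llStringll/Poly-encoders
Other: read the paper and the notes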
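To make the "compromise" concrete, here is a minimal sketch of the Poly-encoder scoring step in plain Python. It assumes the context token vectors, the m learned codes, and the candidate embedding have already been produced by encoders; the function names (`poly_encoder_score` and the helpers) are illustrative and not taken from any of the implementations listed above.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def weighted_sum(weights, vectors):
    # Convex combination of equally sized vectors.
    dim = len(vectors[0])
    return [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]

def poly_encoder_score(ctx_tokens, codes, cand_emb):
    """Score one candidate against one context.

    ctx_tokens: list of context token vectors (assumed encoder outputs)
    codes:      list of m code vectors (learned parameters in the real model)
    cand_emb:   a single candidate embedding
    """
    # Step 1: each code attends over the context tokens, giving m context vectors.
    ctx_vecs = [
        weighted_sum(softmax([dot(code, tok) for tok in ctx_tokens]), ctx_tokens)
        for code in codes
    ]
    # Step 2: the candidate embedding attends over the m context vectors.
    weights = softmax([dot(vec, cand_emb) for vec in ctx_vecs])
    ctx_final = weighted_sum(weights, ctx_vecs)
    # Step 3: the final score is a dot product, as in a bi-encoder.
    return dot(ctx_final, cand_emb)
```

Because the candidate embedding only enters in steps 2 and 3, candidate vectors can be precomputed offline, which is where the inference speed comes from.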

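(SIGIR 2020, Stanford) ColBERT: Efficient and Effective Passage Search via Contextualized Late Interaction over BERT
Summary: For text matching, adds a MaxSim operation after the representation layer to perform query-document interaction.
Contribution 1: Proposes ColBERT, with thorough experiments on both re-ranking and retrieval tasks.
Contribution 2: Compares several query encoding settings and several ways of computing similarity.
Contribution 3: Compares the strategies in detail on time complexity, indexing throughput, and memory footprint.
Notes: "ColBERT (SIGIR2020)" - Zhihu (zhihu.com); "Latest progress in relevance optimization for search and recommendation recall && coarse ranking, 2020"
Paper link: https://arxiv.org/pdf/2004.12832.pdf
Code link: stanford-futuredata/ColBERT: ColBERT: Contextualized Late Interaction over BERT (SIGIR'20 paper) (github.com)
Other: read the paper and the notes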
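The MaxSim late-interaction step can be sketched in a few lines. This is a minimal illustration, assuming query and document token embeddings are already computed and normalized; `maxsim_score` is a name chosen here, not the API of the ColBERT repository.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def maxsim_score(query_vecs, doc_vecs):
    """Late interaction: for each query token, keep its best-matching
    document token similarity, then sum over query tokens."""
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)

# Toy example with 2-dimensional token embeddings (made-up values).
query = [[1.0, 0.0], [0.0, 1.0]]
doc = [[1.0, 0.0], [0.5, 0.5]]
score = maxsim_score(query, doc)
```

Since document token embeddings do not depend on the query, they can be indexed offline and only the cheap max-and-sum is done at query time.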
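(ICLR 2020, Google) Pre-training Tasks for Embedding-based Large-scale Retrieval
Summary: Designs three pre-training tasks for query-document matching in information retrieval.
Task 1: ICT (Inverse Cloze Task): captures the association between a sentence and the other sentences in its local window within the same passage. (I did not fully understand this one.)
Task 2: BFS (Body First Selection): captures globally consistent semantics within a document; Q is sampled at random from the first paragraph, and D is a randomly selected passage from the same page.
Task 3: WLP (Wiki Link Prediction): captures the semantic relation between two documents; Q is sampled at random from the first paragraph of a wiki page, and D is a passage from another page that contains a hyperlink to Q's page.
Notes: https://zhuanlan.zhihu.com/p/140323216 "Latest progress in relevance optimization for search and recommendation recall && coarse ranking, 2020"
Paper link: https://arxiv.org/pdf/2002.03932.pdf
Code link: none yet...
Other: read the notes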
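As a rough illustration of how ICT and BFS training pairs could be built, here is a small sketch. It assumes a passage is a list of sentences and a page is a list of passages whose first element is the opening paragraph; `ict_pair` and `bfs_pair` are hypothetical helper names, and the sampling details of the paper may differ.

```python
import random

def ict_pair(passage_sentences, rng):
    """ICT: one sentence becomes the query, the remaining sentences
    of the same passage become the document."""
    idx = rng.randrange(len(passage_sentences))
    query = passage_sentences[idx]
    doc = passage_sentences[:idx] + passage_sentences[idx + 1:]
    return query, doc

def bfs_pair(page_passages, rng):
    """BFS: the query is a random sentence from the first paragraph,
    the document is a random passage from the same page."""
    query = rng.choice(page_passages[0])
    doc = rng.choice(page_passages)
    return query, doc

rng = random.Random(0)  # fixed seed so the sketch is reproducible
q, d = ict_pair(["s1", "s2", "s3", "s4"], rng)
```

WLP would follow the same pattern, except that the document is drawn from a different page that links to the query's page, which requires link metadata not modeled here.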

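MarkedBERT: Integrating Traditional IR Cues in Pre-trained Language Models for Passage Retrieval (SIGIR 2020 short paper)
Summary: Fine-tunes a pre-trained language model with traditional retrieval cues for passage retrieval.
Contribution 1: Proposes MarkedBERT, which uses marker tokens to mark exact term matches between query and document.
Contribution 2: On the re-ranking task of the MS MARCO dataset, MRR@10 is better than BERT (I do not yet understand the intuition behind this).
Notes: "MarkedBERT at SIGIR 2020: a re-ranking model that adds traditional retrieval cues" - Zhihu (zhihu.com); based on an expert's notes, with thanks. "SIGIR2020-MarkedBERT (short paper)" - Zhihu (zhihu.com)
Paper link: MarkedBERT: Integrating Traditional IR Cues in Pre-trained Language Models for Passage Retrieval (archives-ouvertes.fr)
Code: BOUALILILila/markers_bert (github.com)
Other: read the paper and the notes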
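The marking idea can be sketched as a preprocessing step applied before the pair is fed to BERT. The marker format used below (`[e1]` ... `[/e1]`) is an assumption made for illustration and may not match the tokens used in the paper or in the markers_bert repository; `mark_exact_matches` is likewise a hypothetical name.

```python
def mark_exact_matches(query_tokens, doc_tokens):
    """Wrap every query term that also occurs in the document with
    numbered marker tokens, in both the query and the document."""
    doc_vocab = set(doc_tokens)
    marker_ids = {}
    for tok in query_tokens:
        if tok in doc_vocab and tok not in marker_ids:
            marker_ids[tok] = len(marker_ids) + 1

    def wrap(tokens):
        out = []
        for tok in tokens:
            if tok in marker_ids:
                i = marker_ids[tok]
                out.extend([f"[e{i}]", tok, f"[/e{i}]"])
            else:
                out.append(tok)
        return out

    return wrap(query_tokens), wrap(doc_tokens)

marked_q, marked_d = mark_exact_matches(
    ["cheap", "flights", "paris"], ["book", "cheap", "flights", "now"]
)
```

The same marker id on both sides gives the model an explicit signal that the two occurrences are the same term.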
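DC-BERT: Decoupling Question and Document for Efficient Contextual Encoding (SIGIR 2020)
Summary: For open-domain QA, addresses BERT's slow online inference by proposing DC-BERT, a two-tower plus interaction architecture.
Contribution 1: DC-BERT architecture: two BERT towers (Dual-BERT), Transformer interaction layers, and a scoring layer.
Contribution 2: A typical combination of interaction-based and representation-based retrieval. Compares different numbers of Transformer layers.
Contribution 3: Accuracy and speed advantages on SQuAD Open and Natural Questions Open.
Notes: "The DC-BERT model: decoupling question and document encoding to speed up the QA re-rank module" (again an expert's notes, much respect)
Paper link: https://arxiv.org/pdf/2002.12591.pdf
Code link: see the comment section of the notes above...
Other: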
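BERT-QE (EMNLP 2020): Contextualized Query Expansion for Document Re-ranking
Summary: Document re-ranking based on contextualized query expansion.
Contribution 1: In plain query-document matching, differences in wording are usually handled by expanding the query with PRF (pseudo-relevance feedback) and then re-ranking the results, but query expansion also introduces irrelevant information (I do not fully understand this point). The paper proposes BERT-QE, which re-does retrieve & re-rank in three phases.
The three phases of document re-ranking:
Phase 1: Starting from the BM25 coarse ranking, re-rank the documents with a fine-tuned BERT and take the top-ranked documents as PRF documents.
Phase 2: Split the PRF documents from phase 1 into fixed-length text chunks with a sliding window, and score each chunk's relevance to the query.
Phase 3: Match the chunks selected in phase 2, together with the original query, against each document for relevance.
The three phases aim to reduce the noise introduced by query expansion. In practice, could one compute an embedding for each sentence of a document, match at the sentence level first, and then expand?
Notes: again an expert's notes, very well written: "BERT-QE: document re-ranking based on contextualized query expansion" - Zhihu (zhihu.com); I will add my own notes later
Paper link: 2009.07258.pdf (arxiv.org)
Code link: https://github.com/zh-zheng/BERT-QE
Other: read the notes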
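Phase 2 (sliding-window chunking and chunk selection) can be sketched as follows. In BERT-QE the chunk-query relevance comes from a fine-tuned BERT; here a simple term-overlap count stands in for it, purely as an assumption to keep the example self-contained. The names `sliding_chunks`, `overlap_score`, and `select_top_chunks` are illustrative.

```python
def sliding_chunks(tokens, size, stride):
    """Split a token list into fixed-length chunks with a sliding window."""
    last_start = max(len(tokens) - size, 0)
    return [tokens[i:i + size] for i in range(0, last_start + 1, stride)]

def overlap_score(query_tokens, chunk):
    # Stand-in relevance score: number of chunk tokens found in the query.
    query_vocab = set(query_tokens)
    return sum(1 for tok in chunk if tok in query_vocab)

def select_top_chunks(query_tokens, prf_docs, size, stride, k, score_fn=overlap_score):
    """Chunk every PRF document and keep the k chunks most relevant to the query."""
    chunks = [c for doc in prf_docs for c in sliding_chunks(doc, size, stride)]
    ranked = sorted(chunks, key=lambda c: score_fn(query_tokens, c), reverse=True)
    return ranked[:k]

top = select_top_chunks(
    ["neural", "ranking"],
    [["neural", "ranking", "models", "rock"], ["cats", "sleep", "all", "day"]],
    size=2, stride=1, k=1,
)
```

Swapping `score_fn` for a BERT-based scorer would bring the sketch closer to the method described in the notes.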
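Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks (EMNLP 2019); over 400 citations, which is remarkable
Summary: Replaces the encoder of a Siamese network with BERT and compares three pooling strategies: the CLS vector, mean pooling, and max pooling.
Contribution 1: Sentence-BERT is an exploration of using BERT in industrial settings. Evaluated on STS datasets.
The arXiv version already has 400 citations; it gets a lot of attention...
The paper tries three training objectives/tasks: softmax classification, regression, and a triplet objective function.
The experiments are indeed thorough...
Notes: "Sentence-BERT paper notes" - Zhihu (zhihu.com); the author is impressive and the discussion section is lively, so I will not write my own
Paper link: Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks
Code link: https://github.com/UKPLab/
Other: read several notes; have not read the whole paper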
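The three pooling strategies compared in the paper can be sketched over a toy matrix of token vectors. This is a minimal illustration in plain Python, assuming the token vectors and attention mask come from a BERT encoder; `pool` is a name chosen here and is not the sentence-transformers API.

```python
def pool(token_vecs, mask, strategy):
    """Turn per-token vectors into one sentence vector.

    token_vecs: list of token vectors, with the CLS vector first
    mask:       1 for real tokens, 0 for padding
    strategy:   "cls", "mean", or "max"
    """
    kept = [vec for vec, m in zip(token_vecs, mask) if m]
    if strategy == "cls":
        return list(token_vecs[0])
    if strategy == "mean":
        return [sum(col) / len(kept) for col in zip(*kept)]
    if strategy == "max":
        return [max(col) for col in zip(*kept)]
    raise ValueError(f"unknown pooling strategy: {strategy}")

# Third token is padding and must not influence mean or max pooling.
vecs = [[1.0, 2.0], [3.0, 0.0], [9.0, 9.0]]
mask = [1, 1, 0]
mean_vec = pool(vecs, mask, "mean")
```

Masking matters mostly for mean and max pooling, where padded positions would otherwise distort the sentence vector.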
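DeepCT: Context-Aware Sentence/Passage Term Importance Estimation For First Stage Retrieval
DeepCT: Deep Contextualized Term Weighting framework; a term-weighting task
Summary: Learns context-dependent weights, so that the same term can receive different weights in different queries and documents.
Notes: "Dynamic term weighting: DeepCT paper notes, from a retrieval article series" (an expert's notes)
Paper link: https://arxiv.org/abs/1910.10687
Code link: https://github.com/AdeDZY/DeepCT
Other: read the notes
HDCT: Context-Aware Document Term Weighting for Ad-Hoc Search (WWW 2020)
Summary: An advanced version of DeepCT that addresses long documents and proposes several ways to construct labels.
Notes: "Dynamic term weighting: HDCT paper notes, from a retrieval article series" (an expert's notes)
Paper link: Context-Aware Document Term Weighting for Ad-Hoc Search (acm.org)
Code link: https://github.com/AdeDZY/DeepCT/tree/master/HDCT
Other: did not read closely; read the notes
ABNIRML: Analyzing the Behavior of Neural IR Models (AI2, 2020-11)
Summary: Analyzes current neural retrieval models
Why are they effective?
Why do the improvements work?
Weaknesses of current models: biases
Notes:
Paper link: https://arxiv.org/pdf/2011.00696.pdf
Code link:
Other: read later. Focus on the problems of existing models: how can a model's weaknesses be uncovered? My own future models could be evaluated with this framework to support performance claims.
Pretrained Transformers for Text Ranking: BERT and Beyond (2020-10)
Summary: A survey of text ranking models based on pre-training, focusing mainly on long documents and the effectiveness vs. speed trade-off
Notes:
Paper link: https://arxiv.org/pdf/2010.06467.pdf
Code link:
Other: not read yet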
ERNIE-DOC: The Retrospective Long-Document Modeling Transformer (Baidu, 2020-12)
Summary: ERNIE-DOC, a document-level language pre-training model based on a recurrent Transformer
Key mechanisms: retrospective feed mechanism and the enhanced recurrence mechanism
Notes:
Paper link: https://arxiv.org/pdf/2012.15688.pdf
Code link:
Other: not read yet

Composite Re-Ranking for Efficient Document Search with BERT (2021-03-11)
Summary:
Notes:
Paper link: https://arxiv.org/abs/2103.06499
Code link:
Other: entry not yet updated
Evaluation of BERT and ALBERT Sentence Embedding Performance on Downstream NLP Tasks (2021-01)
Summary: Performance of CNN-augmented ALBERT and similar models on STS tasks.
Notes:
Paper link: https://arxiv.org/abs/2101.10642
Code link:
Other:
Less is More: Pre-training a Strong Siamese Encoder Using a Weak Decoder (Microsoft, 2021-02)
Summary: Uses a weak decoder to strengthen the pre-trained encoder.
On web search (MS MARCO): re-ranking MRR@10 0.334; retrieval MRR@10 0.339
Notes:
Paper link: https://arxiv.org/pdf/2102.09206.pdf
Code link:
Other: read when time permits
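Since MRR@10 is the metric quoted for several entries, a minimal sketch of how it is usually computed may help. The function name `mrr_at_k` and the toy document ids are illustrative only.

```python
def mrr_at_k(ranked_lists, relevant_sets, k=10):
    """Mean reciprocal rank, cut off at rank k.

    ranked_lists:  one ranked list of document ids per query
    relevant_sets: one set of relevant document ids per query
    """
    total = 0.0
    for ranked, relevant in zip(ranked_lists, relevant_sets):
        for rank, doc_id in enumerate(ranked[:k], start=1):
            if doc_id in relevant:
                total += 1.0 / rank  # only the first relevant hit counts
                break
    return total / len(ranked_lists)

# Query 1 finds its relevant document at rank 2, query 2 finds none.
value = mrr_at_k([["a", "b"], ["c", "d"]], [{"b"}, {"x"}])
```

A query with no relevant document in the top k contributes zero, which is why the cutoff matters when comparing systems.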

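A Primer in BERTology: What we know about how BERT works (2020-11, TACL)
Summary: Surveys existing research on how BERT works...
Notes:
Paper link:
Code link:
Other: read when time permits
PROP: Pre-training with Representative Words Prediction for Ad-hoc Retrieval (WSDM 2021, Cheng Xueqi)
Summary: Pre-trains with ROP (Representative Words Prediction) plus an MLM task
Notes:
Paper link: https://arxiv.org/pdf/2010.10137.pdf
Code link: https://github.com/Albert-Ma/PROP
Other: skimmed the ROP task and the final results. I feel the paper does not clearly explain why ROP helps IR; also, what other tasks are suited to IR?
★A Linguistic Study on Relevance Modeling in Information Retrieval (WWW 2021, cxq)
Summary: This paper is extremely well written... in its writing, contributions, approach, and guidance value
Notes:
Paper link:
Code link:
Other: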
Multi-Turn Response Selection for Chatbots with Deep Attention Matching Network
Info: ACL 2018, Baidu; retrieval-based multi-turn dialogue
Summary: representation-matching-aggregation framework
Novelty: effectiveness of the attention mechanism
Notes: "Deep Attention Matching Network: a paper walkthrough"
Paper link:
Code link:
Other:
Bridging the Gap Between Relevance Matching and Semantic Matching for Short Text Similarity Modeling
Info: EMNLP 2019, Facebook
Summary: Proposes HCAN (Hybrid Co-Attention Network), which combines relevance matching and semantic matching
Notes: "Fusing relevance matching and semantic matching with HCAN"
Paper link: https://cs.uwaterloo.ca/~jimmylin/publications/Rao_etal_EMNLP2019.pdf
Code link:
Other:
Keyword-Attentive Deep Semantic Matching (Tencent, 2020-02)
Summary: A keyword-based deep semantic matching model
Notes: "[Reading notes] A keyword-attentive deep semantic matching model"
Paper link: https://arxiv.org/pdf/2003.11516.pdf
Code link:
Other:
IMN: Interactive Matching Network for Multi-Turn Response Selection in Retrieval-Based Chatbots
Summary: CIKM 2019; retrieval-based response selection
Notes: "IMN: Interactive Matching Network"
Paper link: https://arxiv.org/abs/1901.01824?context=cs.CL
Code link:
Other:
DRCN: Semantic Sentence Matching with Densely-connected Recurrent and Co-attentive Information
Info: AAAI 2019; sentence matching task
Summary:
Notes: "Paper notes: Semantic Sentence Matching with DRCN"
Paper link: https://arxiv.org/abs/1805.11360
Code link:
Other:
Match-Ignition: Plugging PageRank into Transformer for Long-form Text Matching
Info: WWW 2021, Cheng Xueqi; long-form text matching
Summary:
Notes: "My graduation project needs matching between long texts and I have no NLP background; any recommended papers and code?" - answer by 樂清 - Zhihu https://www.zhihu.com/question/453641003/answer/1825470481
Paper link: https://arxiv.org/abs/2101.06423
Code link:
Other:
Matching Algorithms: Fundamentals, Applications and Challenges
Info: 2021-03-16
Summary:
Notes:
Paper link: https://arxiv.org/abs/2103.03770
Code link:
Other:
Learning To Retrieve: How to Train a Dense Retrieval Model Effectively and Efficiently
Info:
Summary:
Notes:
Paper link:
Code link:
Other:
Local Self-Attention over Long Text for Efficient Document Retrieval
Info: SIGIR 2020 short paper
Summary:
Notes:
Paper link: https://arxiv.org/abs/2005.04908
Code link:
Other:
Long Document Ranking with Query-Directed Sparse Transformer
Info: EMNLP 2020
Summary:
Notes:
Paper link:
Code link:
Other:
An In-depth Analysis of Passage-Level Label Transfer for Contextual Document Ranking
Info: 2021-03-30
Summary:
Notes:
Paper link: https://arxiv.org/abs/2103.16669
Code link:
Other:
End-to-End Contextualized Document Indexing and Retrieval with Neural Networks
Info: SIGIR 2020
Summary:
Notes:
Paper link:
Code link:
Other:
A Graph-based Relevance Matching Model for Ad-hoc Retrieval
To retrieve more relevant, appropriate and useful documents given a query, finding clues about that query through the text is crucial. Recent deep learning models regard the task as a term-level matching problem, which seeks exact or similar query patterns in the document. However, we argue that they are inherently based on local interactions and do not generalise to ubiquitous, non-consecutive contextual relationships. In this work, we propose a novel relevance matching model based on graph neural networks to leverage the document-level word relationships for ad-hoc retrieval. In addition to the local interactions, we explicitly incorporate all contexts of a term through the graph-of-word text format. Matching patterns can be revealed accordingly to provide a more accurate relevance score. Our approach significantly outperforms strong baselines on two ad-hoc benchmarks. We also experimentally compare our model with BERT and show our advantages on long documents.
Info: AAAI 2021
Summary:
Notes:
Paper link: https://arxiv.org/abs/2101.11873
Code link:
Other:
Graph-based Hierarchical Relevance Matching Signals for Ad-hoc Retrieval
The ad-hoc retrieval task is to rank related documents given a query and a document collection. A series of deep learning based approaches have been proposed to solve such problem and gained lots of attention. However, we argue that they are inherently based on local word sequences, ignoring the subtle long-distance document-level word relationships. To solve the problem, we explicitly model the document-level word relationship through the graph structure, capturing the subtle information via graph neural networks. In addition, due to the complexity and scale of the document collections, it is considerable to explore the different grain-sized hierarchical matching signals at a more general level. Therefore, we propose a Graph-based Hierarchical Relevance Matching model (GHRM) for ad-hoc retrieval, by which we can capture the subtle and general hierarchical matching signals simultaneously. We validate the effects of GHRM over two representative ad-hoc retrieval benchmarks, the comprehensive experiments and results demonstrate its superiority over state-of-the-art methods.
Info: WWW 2021
Summary:
Notes:
Paper link: https://arxiv.org/abs/2102.11127
Code link:
Other:
Conformer-Kernel with Query Term Independence for Document Retrieval
The Transformer-Kernel (TK) model has demonstrated strong reranking performance on the TREC Deep Learning benchmark---and can be considered to be an efficient (but slightly less effective) alternative to BERT-based ranking models. In this work, we extend the TK architecture to the full retrieval setting by incorporating the query term independence assumption. Furthermore, to reduce the memory complexity of the Transformer layers with respect to the input sequence length, we propose a new Conformer layer. We show that the Conformer's GPU memory requirement scales linearly with input sequence length, making it a more viable option when ranking long documents. Finally, we demonstrate that incorporating explicit term matching signal into the model can be particularly useful in the full retrieval setting. We present preliminary results from our work in this paper.
Info: 2020-06-20
Summary:
Notes:
Paper link: https://arxiv.org/abs/2007.10434
Code link:
Other:
A Systematic Evaluation of Transfer Learning and Pseudo-labeling with BERT-based Ranking Models
Info: 2021-03
Summary:
Notes:
Paper link: https://arxiv.org/abs/2103.03335
Code link:
Other:
Distilling Dense Representations for Ranking using Tightly-Coupled Teachers
We present an approach to ranking with dense representations that applies knowledge distillation to improve the recently proposed late-interaction ColBERT model. Specifically, we distill the knowledge from ColBERT's expressive MaxSim operator for computing relevance scores into a simple dot product, thus enabling single-step ANN search. Our key insight is that during distillation, tight coupling between the teacher model and the student model enables more flexible distillation strategies and yields better learned representations. We empirically show that our approach improves query latency and greatly reduces the onerous storage requirements of ColBERT, while only making modest sacrifices in terms of effectiveness. By combining our dense representations with sparse representations derived from document expansion, we are able to approach the effectiveness of a standard cross-encoder reranker using BERT that is orders of magnitude slower.
Info: 2020-10-22
Summary:
Notes:
Paper link: https://arxiv.org/abs/2010.11386
Code link:
Other:
The Expando-Mono-Duo Design Pattern for Text Ranking with Pretrained Sequence-to-Sequence Models
Info: 2021-01-14
Summary:
Notes:
Paper link: https://arxiv.org/abs/2101.05667
Code link:
Other:
Improving Efficient Neural Ranking Models with Cross-Architecture Knowledge Distillation
Info: 2021-01-22
Summary:
Notes:
Paper link: https://arxiv.org/abs/2010.02666
Code link:
Other:
Longformer for MS MARCO Document Re-ranking Task
Info: 2020-09-20
Summary:
Notes:
Paper link: https://arxiv.org/abs/2009.09392
Code link:
Other:
ORCAS: 18 Million Clicked Query-Document Pairs for Analyzing Search
Info: 2020-08
Summary:
Notes:
Paper link: https://arxiv.org/abs/2006.05324
Code link:
Other:
QueryBlazer: Efficient Query Autocompletion Framework (WSDM 2021)
Info:
Summary:
Notes:
Paper link:
Code link:
Other:
MIMICS: A Large-Scale Data Collection for Search Clarification
Info: 2020-06
Summary:
Notes:
Paper link: https://arxiv.org/abs/2006.10174
Code link:
Other:

Neural Methods for Effective, Efficient, and Exposure-Aware Information Retrieval
Info: 2021-03
Summary:
Notes:
Paper link: https://arxiv.org/abs/2012.11685
Code link:
Other:

Listwise Learning to Rank by Exploring Unique Ratings
Info: 2020-01
Summary:
Notes:
Paper link: https://arxiv.org/abs/2001.01828
Code link:
Other:
Selective Weak Supervision for Neural Information Retrieval
Info: WWW 2020
Summary:
Notes:
Paper link: https://arxiv.org/abs/2001.10382
Code link:
Other:
OpenMatch: An Open-Source Package for Information Retrieval
Info: 2021-02, Tsinghua
Summary:
Notes:
Paper link: https://arxiv.org/abs/2102.00166
Code link:
Other:
Relevance-guided Supervision for OpenQA with ColBERT
Info: 2020-07
Summary:
Notes:
Paper link: https://arxiv.org/abs/2007.00814
Code link:
Other:

Training Curricula for Open Domain Answer Re-Ranking
Info: SIGIR 2020
Summary:
Notes:
Paper link: https://arxiv.org/abs/2004.14269
Code link:
Other:

Beyond Relevance: Trustworthy Answer Selection via Consensus Verification
Info: WSDM 2021, Cheng Xueqi
Summary:
Notes:
Paper link:
Code link:
Other:
Distant Supervision in BERT-based Adhoc Document Retrieval
Info: CIKM 2020
Summary:
Notes:
Paper link:
Code link:
Other:
Investigating Reading Behavior in Fine-grained Relevance Judgment
Info: SIGIR 2020
Summary:
Notes:
Paper link:
Code link:
Other:
Investigating the case of weak baselines in Ad-hoc Retrieval and Question Answering
Info: 2020
Summary:
Notes:
Paper link:
Code link:
Other:
Leveraging Passage-level Cumulative Gain for Document Ranking
Info: WWW 2021
Summary:
Notes:
Paper link:
Code link:
Other:
Information retrieval: a view from the Chinese IR community
Info: 2021, Cheng Xueqi; survey
Summary:
Notes:
Paper link:
Code link:
Other:
Topic-enhanced knowledge-aware retrieval model for diverse relevance estimation
Info: 2021
Summary:
Notes:
Paper link:
Code link:
Other:
A Pseudo-relevance feedback framework combining relevance matching and semantic matching for information retrieval
Info: 2020
Summary:
Notes:
Paper link:
Code link:
Other:
Ad-hoc Document Retrieval using Weak-Supervision with BERT and GPT2
Info: EMNLP 2020
Summary:
Notes:
Paper link: Ad hoc Document Retrieval using Weak Supervision with BERT and GPT2 - AMiner
Code link:
Other:
An Analysis of BERT in Document Ranking
Info: 2020, Ma Shaoping
Summary:
Notes:
Paper link:
Code link:
Other:
Beyond 512 Tokens: Siamese Multi-depth Transformer-based Hierarchical Encoder for Long-Form Document Matching
Info: CIKM 2020
Summary:
Notes:
Paper link:
Code link:
Other:

QueryBlazer: Efficient Query Autocompletion Framework
Info: WWW 2021
Summary:
Notes:
Paper link:
Code link:
Other:

RocketQA: An Optimized Training Approach to Dense Passage Retrieval for Open-Domain Question Answering
Info: 2020-10
Summary:
Notes:
Paper link: https://arxiv.org/abs/2010.08191
Code link:
Other:
CoRT: Complementary Rankings from Transformers
Info: faster recall-stage retrieval; 2020-10
Summary:
Notes:
Paper link: https://arxiv.org/abs/2010.10252
Code link:
Other:
DiPair: Fast and Accurate Distillation for Trillion-Scale Text Matching and Pair Modeling
Info: EMNLP 2020; distillation for retrieval
Summary:
Notes:
Paper link:
Code link:
Other:
Beyond Probability Ranking Principle: Modeling the Dependencies among Documents
Info: WSDM 2021
Summary:
Notes:
Paper link:
Code link:
Other:
An Attention-based Deep Relevance Model for Few-shot Document Filtering
Info: 2020
Summary:
Notes:
Paper link:
Code link:
Other:
An end-to-end pseudo relevance feedback framework for neural document retrieval
Info: 2019
Summary:
Notes:
Paper link:
Code link:
Other:
IART: Intent-aware Response Ranking with Transformers in Information-seeking Conversation Systems
Info: WWW 2020
Summary:
Notes:
Paper link: https://arxiv.org/abs/2002.00571
Code link:
Other:
Learning Better Representations for Neural Information Retrieval with Graph Information
Info: CIKM 2020
Summary:
Notes:
Paper link:
Code link:
Other:
Fine-Grained Relevance Annotations for Multi-Task Document Ranking and Question Answering
Info: CIKM 2020
Summary:
Notes:
Paper link: https://arxiv.org/abs/2008.05363
Code link:
Other:


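Paper list:

OpenQA survey: Retrieving and Reading: A Comprehensive Survey on Open-domain Question Answering (AI2, 2021-01; focus on the retrieval part) [not read yet]

Text matching:

Retrieval:

Other:
13. ANN Negative Contrastive Learning for Dense Text Retrieval (Microsoft, 2020-10)
Summary:
Notes: "Contrastive Learning Paper Notes 2"
Paper link: https://arxiv.org/pdf/2007.00808.pdf
Code link:
Other:
14. ANN-related resources:
An introduction to Faiss, Facebook's open-source vector retrieval framework
Graph Search Engine: A Deeper Dive (progress in real-valued vector search engines)
Search Engine For AI: industrial-grade solutions for high-dimensional data retrieval
An overview of KNN (ANN) vector retrieval
An evaluation of approximate nearest neighbor search algorithms in high-dimensional spaces
Several classic families of methods for semantic indexing (vector retrieval), by @后青春期的工程師

Datasets & Leaderboards & Evaluation Metrics

ECIR, TOIS, AIRS, SIGKDD, TKDE
WSDM, WWW, CIKM, SIGIR, KDD

Conference paper roundup

Resources organized by track/topic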
