awesome-generative-retrieval-models
A curated list of awesome papers related to generative retrieval models. If I missed any papers, feel free to open a PR to include them! Feedback and contributions are welcome!
Generative retrieval is meant to replace the long-lived "retrieve-then-rank" paradigm by collapsing the indexing, retrieval, and ranking components of traditional Information Retrieval (IR) systems into a single unified model. With generative retrieval, indexing is replaced with model training, while retrieval and ranking are replaced with model inference.
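As a rough illustration of the inference side described above, here is a toy sketch. It is not taken from any of the listed papers. It assumes documents are identified by short docid strings, and it replaces the trained sequence-to-sequence model with a hypothetical term-overlap scorer (`toy_prefix_score`) that reads the corpus directly, so the example runs without any ML library. In a real system that knowledge would live in the model parameters after training. What the sketch does show is decoding a docid character by character under a prefix-trie constraint, so that only identifiers that exist in the corpus can be generated, with the beam order doubling as the ranking.

```python
# Hypothetical toy corpus: docid string -> document text.
CORPUS = {
    "a1": "transformer memory as a search index",
    "a2": "neural corpus indexer for document retrieval",
    "b1": "generative models for image retrieval",
}

def build_trie(docids):
    """Build a prefix trie over docid characters; "$" marks a complete docid."""
    root = {}
    for docid in docids:
        node = root
        for ch in docid:
            node = node.setdefault(ch, {})
        node["$"] = {}
    return root

def toy_prefix_score(query, prefix):
    """Stand-in for a trained model's score of a docid prefix (an assumption).

    Returns the best query/document term overlap among documents whose
    docid starts with `prefix`.
    """
    q_terms = set(query.lower().split())
    return max(
        len(q_terms & set(text.split()))
        for docid, text in CORPUS.items()
        if docid.startswith(prefix)
    )

def generate_docids(query, trie, beam_size=2):
    """Constrained beam search: extend prefixes only along trie edges."""
    beams = [("", trie)]  # (prefix generated so far, current trie node)
    finished = []         # (score, complete docid)
    while beams:
        candidates = []
        for prefix, node in beams:
            for ch, child in node.items():
                if ch == "$":
                    finished.append((toy_prefix_score(query, prefix), prefix))
                else:
                    new_prefix = prefix + ch
                    score = toy_prefix_score(query, new_prefix)
                    candidates.append((score, new_prefix, child))
        # Keep the best-scoring prefixes; ties are broken alphabetically.
        candidates.sort(key=lambda c: (-c[0], c[1]))
        beams = [(p, n) for _, p, n in candidates[:beam_size]]
    finished.sort(key=lambda f: (-f[0], f[1]))
    return [docid for _, docid in finished]

if __name__ == "__main__":
    trie = build_trie(CORPUS)
    print(generate_docids("document retrieval", trie))  # ['a2', 'b1']
```

With `beam_size=2` the query "document retrieval" yields `a2` first and `b1` second; `a1` is pruned because its prefix score is lowest. A trained model would supply the per-step scores instead of `toy_prefix_score`.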
Opinion Papers
- Rethinking Search: Making Domain Experts out of Dilettantes. Metzler et al., SIGIR Forum 2021.
Analysis
- Understanding Differential Search Index for Text Retrieval. Chen et al., ACL 2023 Findings.
- How Does Generative Retrieval Scale to Millions of Passages? Pradeep et al., arXiv 2023.
Indexing Strategy
- Transformer Memory as a Differentiable Search Index. Tay et al., NeurIPS 2022. [Video] (DSI)
- CorpusBrain: Pre-train a Generative Retrieval Model for Knowledge-Intensive Language Tasks. Chen et al., CIKM 2022. [Code] (CorpusBrain)
- A Neural Corpus Indexer for Document Retrieval. Wang et al., NeurIPS 2022. (NCI)
- Bridging the Gap Between Indexing and Retrieval for Differentiable Search Index with Query Generation. Zhuang et al., arXiv 2022. [Code] (DSI-QG)
Identifier Design
- Autoregressive Entity Retrieval. Cao et al., ICLR 2021. [Code] (GENRE)
- Autoregressive Search Engines: Generating Substrings as Document Identifiers. Bevilacqua et al., NeurIPS 2022. [Code] (SEAL)
- DynamicRetriever: A Pre-training Model-based IR System with Neither Sparse nor Dense Index. Zhou et al., arXiv 2022. (DynamicRetriever)
- Ultron: An Ultimate Retriever on Corpus with a Model-based Indexer. Zhou et al., arXiv 2022. (Ultron)
- Learning to Tokenize for Generative Retrieval. Sun et al., arXiv 2023. (GenRet)
- TOME: A Two-stage Approach for Model-based Retrieval. Ren et al., ACL 2023. (TOME)
- Term-Sets Can Be Strong Document Identifiers For Auto-Regressive Search Engines. Zhang et al., arXiv 2023. (AutoTSG)
- Multiview Identifiers Enhanced Generative Retrieval. Li et al., ACL 2023.
Dynamic Corpora
- DSI++: Updating Transformer Memory with New Documents. Mehta et al., arXiv 2022. (DSI++)
- Continually Updating Generative Retrieval on Dynamic Corpora. Yoon et al., arXiv 2022. (StreamingIR)
Applications
- GERE: Generative Evidence Retrieval for Fact Verification. Chen et al., SIGIR 2022. [Code] (GERE)
- Generative Multi-hop Retrieval. Lee et al., arXiv 2022. (GMR)
- Data-Efficient Autoregressive Document Retrieval for Fact Verification. Thorne, arXiv 2023. (DearDR)
- Contextualized Generative Retrieval. Lee et al., arXiv 2022. (CGR)
- A Unified Generative Retriever for Knowledge-Intensive Language Tasks via Prompt Learning. Chen et al., SIGIR 2023. (UGR)
- Recommender Systems with Generative Retrieval. Rajput et al., arXiv 2023. (RQ-VAE)
- IRGen: Generative Modeling for Image Retrieval. Zhang et al., arXiv 2023. (IRGen)