Dense Passage Retrieval for Open-Domain Question Answering
Vladimir Karpukhin, Barlas Oguz, Sewon Min, Patrick S. H. Lewis, Ledell Wu, Sergey Edunov, Danqi Chen, Wen-tau Yih
- 10 Apr 2020
- pp 6769-6781
TL;DR: This paper shows that a simple dual-encoder framework can learn dense representations from a small number of questions and passages, and that the resulting retriever outperforms a strong Lucene-BM25 system by 9%-19% absolute in top-20 passage retrieval accuracy.
Abstract: Open-domain question answering relies on efficient passage retrieval to select candidate contexts, where traditional sparse vector space models, such as TF-IDF or BM25, are the de facto method. In this work, we show that retrieval can be practically implemented using dense representations alone, where embeddings are learned from a small number of questions and passages by a simple dual-encoder framework. When evaluated on a wide range of open-domain QA datasets, our dense retriever outperforms a strong Lucene-BM25 system greatly by 9%-19% absolute in terms of top-20 passage retrieval accuracy, and helps our end-to-end QA system establish new state-of-the-art on multiple open-domain QA benchmarks.
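The retrieval step and the top-k accuracy metric described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy vectors stand in for the outputs of the question and passage encoders, the dot-product similarity is an assumption made here for illustration, and a "hit" is approximated by hypothetical gold passage ids rather than by checking whether a retrieved passage contains the answer.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def top_k(question_vec, passage_vecs, k):
    """Return the indices of the k passages with the highest dot-product score."""
    scores = [dot(question_vec, p) for p in passage_vecs]
    ranked = sorted(range(len(passage_vecs)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]


def top_k_accuracy(question_vecs, passage_vecs, gold, k):
    """Fraction of questions with at least one gold passage among the top k."""
    hits = 0
    for q, gold_ids in zip(question_vecs, gold):
        if any(i in gold_ids for i in top_k(q, passage_vecs, k)):
            hits += 1
    return hits / len(question_vecs)


# Toy embeddings (assumed values, standing in for encoder outputs).
passages = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
questions = [[0.9, 0.1], [0.1, 0.9]]
gold = [{0}, {2}]  # hypothetical gold passage ids per question

print(top_k(questions[0], passages, 2))
print(top_k_accuracy(questions, passages, gold, 1))
print(top_k_accuracy(questions, passages, gold, 2))
```

On this toy data the second question's gold passage only enters the result list at k = 2, which is why retrieval accuracy is reported at a fixed cutoff such as top-20.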
Citations
•Posted Content
SimCSE: Simple Contrastive Learning of Sentence Embeddings
TL;DR: SimCSE is a contrastive learning framework for sentence embeddings that takes an input sentence and predicts itself under a contrastive objective, with only standard dropout used as noise.
1.7K
•Posted Content
On the Opportunities and Risks of Foundation Models.
Rishi Bommasani, Drew A. Hudson, Ehsan Adeli, Russ B. Altman, Simran Arora, Sydney von Arx, Michael S. Bernstein, Jeannette Bohg, Antoine Bosselut, Emma Brunskill, Erik Brynjolfsson, Shyamal Buch, Dallas Card, Rodrigo Castellon, Niladri S. Chatterji, Annie Chen, Kathleen Creel, Jared Davis, Dora Demszky, Chris Donahue, Moussa Doumbouya, Esin Durmus, Stefano Ermon, John Etchemendy, Kawin Ethayarajh, Li Fei-Fei, Chelsea Finn, Trevor Gale, Lauren Gillespie, Karan Goel, Noah D. Goodman, Shelby Grossman, Neel Guha, Tatsunori Hashimoto, Peter Henderson, John Hewitt, Daniel E. Ho, Jenny Hong, Kyle Hsu, Jing Huang, Thomas Icard, Saahil Jain, Dan Jurafsky, Pratyusha Kalluri, Siddharth Karamcheti, Geoff Keeling, Fereshte Khani, Omar Khattab, Pang Wei Koh, Mark Krass, Ranjay Krishna, Rohith Kuditipudi, Ananya Kumar, Faisal Ladhak, Mina Lee, Tony Lee, Jure Leskovec, Isabelle Levent, Xiang Lisa Li, Xuechen Li, Tengyu Ma, Ali Ahmad Malik, Christopher D. Manning, Suvir Mirchandani, Eric Mitchell, Zanele Munyikwa, Suraj Nair, Avanika Narayan, Deepak Narayanan, Ben Newman, Allen Nie, Juan Carlos Niebles, Hamed Nilforoshan, Julian Nyarko, Giray Ogut, Laurel Orr, Isabel Papadimitriou, Joon Sung Park, Chris Piech, Eva Portelance, Christopher Potts, Aditi Raghunathan, Rob Reich, Hongyu Ren, Frieda Rong, Yusuf H. Roohani, Camilo Ruiz, Jack Ryan, Christopher Ré, Dorsa Sadigh, Shiori Sagawa, Keshav Santhanam, Andy Shih, Krishnan Srinivasan, Alex Tamkin, Rohan Taori, Armin W. Thomas, Florian Tramèr, Rose E. Wang, William Yang Wang, Bohan Wu, Jiajun Wu, Yuhuai Wu, Sang Michael Xie, Michihiro Yasunaga, Jiaxuan You, Matei Zaharia, Michael Zhang, Tianyi Zhang, Xikun Zhang, Yuhui Zhang, Lucia Zheng, Kaitlyn Zhou, Percy Liang
TL;DR: The authors provide a thorough account of the opportunities and risks of foundation models, ranging from their capabilities (e.g., language, vision, robotics, reasoning, human interaction) and technical principles (e.g., model architectures, training procedures, data, systems, security, evaluation, theory) to their applications.
1.3K
Self-supervised Learning: Generative or Contrastive.
TL;DR: This survey examines new self-supervised learning methods for representation learning in computer vision, natural language processing, and graph learning, and comprehensively reviews the existing empirical methods, grouping them into three main categories according to their objectives.
1.1K
•Posted Content
Approximate Nearest Neighbor Negative Contrastive Learning for Dense Text Retrieval
Lee Xiong, Chenyan Xiong, Ye Li, Kwok-Fung Tang, Jialin Liu, Paul N. Bennett, Junaid Ahmed, Arnold Overwijk
TL;DR: Approximate nearest neighbor Negative Contrastive Estimation (ANCE) is a training mechanism that constructs negatives from an Approximate Nearest Neighbor (ANN) index of the corpus; the index is updated in parallel with the learning process so that more realistic negative training instances are selected.
917
UNIFIEDQA: Crossing Format Boundaries with a Single QA System
Daniel Khashabi, Sewon Min, Tushar Khot, Ashish Sabharwal, Oyvind Tafjord, Peter Clark, Hannaneh Hajishirzi
- 02 May 2020
TL;DR: This work uses the latest advances in language modeling to build a single pre-trained QA model, UNIFIEDQA, that performs well across 19 QA datasets spanning 4 diverse formats, and results in a new state of the art on 10 factoid and commonsense question answering datasets.
References
•Proceedings Article
Learning Discriminative Projections for Text Similarity Measures
Wen-tau Yih, Kristina Toutanova, John Platt, Christopher Meek
- 23 Jun 2011
TL;DR: A novel discriminative training method that projects the raw term vectors into a common, low-dimensional vector space, which not only outperforms existing state-of-the-art approaches, but also achieves high accuracy at low dimensions and is thus more efficient.
•Proceedings Article
Poly-encoders: Architectures and Pre-training Strategies for Fast and Accurate Multi-sentence Scoring
Samuel Humeau, Kurt Shuster, Marie-Anne Lachaux, Jason Weston
- 30 Apr 2020
TL;DR: This work develops a new transformer architecture, the Poly-encoder, that learns global rather than token-level self-attention features, and shows that the resulting models achieve state-of-the-art results on four tasks.
End-to-End Open-Domain Question Answering with BERTserini
TL;DR: This paper presents an end-to-end question answering system that integrates BERT with the open-source Anserini information retrieval toolkit, combining best practices from IR with a BERT-based reader to identify answers from a large corpus of Wikipedia articles.
303
Information science as "Little Science": The implications of a bibliometric analysis of the Journal of the American Society for Information Science
TL;DR: Based on an analysis of articles published in AD and JASIS from 1950 to 1999, the study finds a slow but perhaps inevitable shift from the single nonfunded researcher and author to much wider research and publishing participation among authors, regions, corporate authors, and countries.
247
Multi-passage BERT: A Globally Normalized BERT Model for Open-domain Question Answering
Zhiguo Wang, Patrick Ng, Xiaofei Ma, Ramesh Nallapati, Bing Xiang
- 22 Aug 2019
TL;DR: The authors propose a multi-passage BERT model that globally normalizes answer scores across all passages of the same question, a change that enables the QA model to find better answers by utilizing more passages.
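The idea of normalizing answer scores across all passages of one question, rather than within each passage, can be sketched as follows. This is only an illustration under stated assumptions, not the referenced model: the raw scores are made-up numbers, and the function names `normalize_per_passage` and `normalize_globally` are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a flat list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def normalize_per_passage(scores):
    """Each passage's candidate scores are normalized on their own."""
    return [softmax(passage) for passage in scores]


def normalize_globally(scores):
    """All candidates from all passages share a single softmax."""
    flat = [s for passage in scores for s in passage]
    probs = softmax(flat)
    out, start = [], 0
    for passage in scores:
        out.append(probs[start:start + len(passage)])
        start += len(passage)
    return out


# Assumed raw answer scores: two passages, two candidate answers each.
scores = [[5.0, 1.0], [0.5, 0.0]]
local_probs = normalize_per_passage(scores)
global_probs = normalize_globally(scores)
```

With per-passage normalization, the weak best candidate of the second passage still receives a high probability because it only competes within its own passage; under the shared softmax it is ranked well below the strong candidate from the first passage, so scores become comparable across passages.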