Information Retrieval

Books

Semantic Search on Text and Knowledge Bases, by Hannah Bast, Björn Buchhold, and Elmar Haussmann, 2016. A comprehensive overview of the broad area of semantic search on text and knowledge bases. It was written in 2016, which means the analysis stops before Transformers were widely adopted in IR. This is the book to read in order to understand the IR field prior to the introduction of Transformers.
Pretrained Transformers for Text Ranking: BERT and Beyond, by Jimmy Lin, Rodrigo Nogueira, and Andrew Yates, 2021. Where the previous book covers the IR field up to 2016, this one covers the years since, in particular the use of Transformers for text ranking. This is the book to read in order to understand the present state of IR (at least at the time I'm writing this).

Papers

Multi-Stage Document Ranking with BERT, by Rodrigo Nogueira, Wei Yang, Kyunghyun Cho, Jimmy Lin, 2019. Rodrigo Nogueira is a co-author of the second book listed at the top of this page. This is a must-read article, as Nogueira has made important contributions to this field.
Passage Re-ranking with BERT, by Rodrigo Nogueira, Kyunghyun Cho, 2019.
Rethink Training of BERT Rerankers in Multi-Stage Retrieval Pipeline, by Luyu Gao, Zhuyun Dai, Jamie Callan. They propose Localized Contrastive Estimation (LCE) for training rerankers and show that it significantly improves deep two-stage models.
What happens to BERT embeddings during fine-tuning, by Amil Merchant, Elahe Rahimtoroghi, Ellie Pavlick, Ian Tenney. They investigate how fine-tuning affects the representations of the BERT model. Findings:
- Fine-tuning does not lead to catastrophic forgetting of linguistic phenomena.
- Fine-tuning primarily affects the top layers of BERT, but with noteworthy variation across tasks.
- Fine-tuning has a weaker effect on representations of out-of-domain sentences.
Understanding the Behaviors of BERT in Ranking, by Yifan Qiao, Chenyan Xiong, Zhenghao Liu, Zhiyuan Liu
Language Models as Knowledge Bases?, by Fabio Petroni, Tim Rocktäschel, Patrick Lewis, Anton Bakhtin, Yuxiang Wu, Alexander H. Miller, Sebastian Riedel
Understanding BERT Rankers Under Distillation, by Luyu Gao, Zhuyun Dai, Jamie Callan
Deep Learning for Matching in Search and Recommendation, by Jun Xu, Hang Li, Xiangnan He, 2019
Project PIAF: Building a Native French Question-Answering Dataset, by Rachel Keraron, Guillaume Lancrenon, Mathilde Bras, Frédéric Allary, Gilles Moyse, Thomas Scialom, Edmundo-Pavel Soriano-Morales, Jacopo Staiano, 2020
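
For readers who want a concrete picture of the Localized Contrastive Estimation (LCE) loss mentioned in the Gao, Dai, and Callan entry above, here is a minimal sketch. It assumes the usual formulation: for each query, the reranker scores a group made of one relevant passage plus several negatives sampled from the first-stage retriever's top results, and the loss is the softmax cross-entropy of the relevant passage within that group. The function names `lce_loss` and `batch_lce_loss`, and the convention that the positive sits at index 0, are my own choices for illustration and are not taken from the paper's code.

```python
import math
from typing import Sequence


def lce_loss(group_scores: Sequence[float], positive_index: int = 0) -> float:
    """Softmax cross-entropy of the positive passage within one query's group.

    group_scores holds the reranker scores for one relevant passage and its
    negatives (assumed to be sampled from the retriever's top results).
    """
    # Log-sum-exp with the max subtracted for numerical stability.
    max_score = max(group_scores)
    log_normalizer = max_score + math.log(
        sum(math.exp(score - max_score) for score in group_scores)
    )
    # -log( exp(s_pos) / sum_j exp(s_j) )
    return log_normalizer - group_scores[positive_index]


def batch_lce_loss(batch: Sequence[Sequence[float]]) -> float:
    """Average the per-query loss over a batch of groups (positive at index 0)."""
    return sum(lce_loss(group) for group in batch) / len(batch)


if __name__ == "__main__":
    # Hypothetical scores: the first entry of each group is the relevant passage.
    print(lce_loss([0.0, 0.0]))  # equals log(2): positive and negative are tied
    print(batch_lce_loss([[2.0, 0.5, -1.0], [0.1, 0.3, 0.2]]))
```

In a real training loop the scores would come from a BERT cross-encoder and the same quantity would be computed with a framework's cross-entropy function so that gradients flow back into the model; the pure-Python version above is only meant to show the arithmetic.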

Other

Zero-shot, One Kill: BERT for Neural Information Retrieval, by Stergios Efes. This is an interesting piece of work done as a master's thesis.

Resource Name

Type