Proceedings Article, DOI: 10.1145/2396761.2398662
A constraint to automatically regulate document-length normalisation
Ronan Cummins, Colm O'Riordan
- 29 Oct 2012
- pp 2443-2446
16
TL;DR: This paper formally describes the interaction between query-terms and document length normalisation using a constraint, and develops a general pre-retrieval approach to adapt a number of state-of-the-art ranking functions so that they adhere to the constraint.
Abstract: Retrieval functions in information retrieval (IR) are fundamental to the effectiveness of search systems. However, considerable parameter tuning is often needed to increase the effectiveness of the retrieval. Document length normalisation is one such aspect that requires tuning on a per-query and per-collection basis for many retrieval functions. In this paper, we develop an approach that regularises the level of normalisation to apply on a per-query basis. We formally describe the interaction between query-terms and document length normalisation using a constraint. We then develop a general pre-retrieval approach to adapt a number of state-of-the-art ranking functions so that they adhere to the constraint. Finally, we empirically demonstrate that the adapted retrieval functions outperform default versions of the original retrieval functions, and perform at least comparably to tuned versions of the original functions, on a number of datasets. Essentially this regulates the normalisation parameter in a number of retrieval functions on a per-query basis in a principled manner.
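To illustrate the kind of normalisation parameter the abstract refers to, the sketch below uses the widely known BM25 term-frequency component, in which a parameter b controls how strongly long documents are penalised. It is only an illustration of document length normalisation in general: the paper's constraint and its adapted ranking functions are not reproduced on this page and are not shown here. The function name, argument names, and default values are assumptions chosen for the example.

```python
def bm25_tf(tf, doc_len, avg_doc_len, k1=1.2, b=0.75):
    """BM25 term-frequency component with document-length normalisation.

    b = 0 switches length normalisation off; b = 1 applies it fully.
    The defaults for k1 and b are common choices, not values from the paper.
    """
    # Length factor: equals 1 for a document of average length.
    norm = 1.0 - b + b * (doc_len / avg_doc_len)
    return tf * (k1 + 1.0) / (tf + k1 * norm)


if __name__ == "__main__":
    # Same term frequency, documents of different lengths, several b values.
    for b in (0.0, 0.25, 0.75, 1.0):
        short = bm25_tf(tf=3, doc_len=100, avg_doc_len=200, b=b)
        long_ = bm25_tf(tf=3, doc_len=400, avg_doc_len=200, b=b)
        print(f"b={b:.2f}  short={short:.3f}  long={long_:.3f}")
```

With b = 0 the two documents receive the same score; as b grows, the longer document is penalised more. Choosing b per query, rather than fixing it for a whole collection, is the kind of tuning the abstract describes.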
Citations
Towards Axiomatic Explanations for Neural Ranking Models
Michael Völske, Alexander Bondarenko, Maik Fröbe, Benno Stein, Jaspreet Singh, Matthias Hagen, Avishek Anand
- 11 Jul 2021
TL;DR: In this article, the authors investigate whether neural ranking models can be explained in terms of well-studied principles of document ranking by using established theories from axiomatic IR, and propose a set of axioms to reproduce ranking decisions based on combinations of elementary constraints.
27
Improving Retrieval Performance for Verbose Queries via Axiomatic Analysis of Term Discrimination Heuristic
Mozhdeh Ariannezhad, Ali Montazeralghaem, Hamed Zamani, Azadeh Shakery
- 07 Aug 2017
TL;DR: This paper proposes a constraint to model the interaction between query length and IDF, and suggests a modification to adapt BM25 so that it adheres to the new constraint.
14
A Study of Retrieval Models for Long Documents and Queries in Information Retrieval
Ronan Cummins
- 11 Apr 2016
TL;DR: This paper formally analyses two important but distinct reasons for normalising documents with respect to length, namely verbosity and scope, and develops a new discriminative query language modelling approach that demonstrates improved performance on long verbose queries by appropriately weighting salient aspects of the query.
A Study of Query Length Heuristics in Information Retrieval
Yuanhua Lv
- 17 Oct 2015
TL;DR: It is revealed that query length interacts with term frequency (TF) normalization, a key component of all effective retrieval models, and that, to solve this problem, the TF normalization component in a retrieval function should be adapted to query length.
8
Verbosity normalized pseudo-relevance feedback in information retrieval
Seung-Hoon Na, Kangil Kim
TL;DR: The results of the experiments show that the proposed verbosity normalized pseudo-relevance feedback consistently provides statistically significant improvements over conventional methods, under the settings of the relevance model and latent concept expansion.
7
References
A probabilistic model of information retrieval: development and comparative experiments
TL;DR: The paper combines a comprehensive account of the probabilistic model of retrieval with new systematic experiments on TREC Programme material, and presents the model from its foundations through its logical development to cover more aspects of retrieval data and a wider range of system functions.
1.2K
Probabilistic models of information retrieval based on measuring the divergence from randomness
TL;DR: A framework for deriving probabilistic models of Information Retrieval using term-weighting models obtained in the language model approach by measuring the divergence of the actual term distribution from that obtained under a random process is introduced.
Pivoted document length normalization
Amit Singhal, Chris Buckley, Mandar Mitra
- 18 Aug 1996
TL;DR: Pivoted normalization is presented, a technique that can be used to modify any normalization function, thereby reducing the gap between the relevance and the retrieval probabilities, and two new normalization functions, pivoted unique normalization and pivoted byte size normalization, are presented.
989
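As a rough illustration of the pivoted normalization entry above, the sketch below applies the commonly cited pivoted form, (1 - slope) * pivot + slope * old_norm, to an existing normalization factor. The function name and the example values are assumptions made for illustration; the pivot is typically set to the collection's average old normalization value, and the slope is a tuned parameter.

```python
def pivoted_norm(old_norm, pivot, slope):
    """Pivoted version of an existing normalization factor.

    old_norm: the original normalization value for a document
              (for example its length in bytes or its number of unique terms).
    pivot:    the point at which old and pivoted normalization agree.
    slope:    tilt around the pivot; slope = 1 leaves old_norm unchanged.
    """
    return (1.0 - slope) * pivot + slope * old_norm


if __name__ == "__main__":
    pivot = 100.0  # hypothetical average normalization value
    for old in (50.0, 100.0, 200.0):
        print(old, pivoted_norm(old, pivot, slope=0.3))
```

With a slope below 1, documents whose old normalization exceeds the pivot are divided by a smaller factor than before, and documents below the pivot by a larger one, which is how the technique shifts retrieval probability between long and short documents.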
A formal study of information retrieval heuristics
Hui Fang, Tao Tao, ChengXiang Zhai
- 25 Jul 2004
TL;DR: A formal study of retrieval heuristics is presented and it is found that the empirical performance of a retrieval formula is tightly related to how well it satisfies basic desirable constraints.
An exploration of axiomatic approaches to information retrieval
Hui Fang, ChengXiang Zhai
- 15 Aug 2005
TL;DR: This paper proposes a new axiomatic approach to developing retrieval models based on direct modeling of relevance with formalized retrieval constraints defined at the level of terms, and derives several new retrieval functions using this framework.