1. What was the common approach for scoring the entity type?
A filtering approach, based on the Kullback-Leibler divergence between the probability distributions of the entity and query types, was used for scoring the entity type.
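As a rough illustration of the divergence mentioned above, the following minimal sketch computes the Kullback-Leibler divergence between two type distributions stored as dictionaries. The function name `kl_divergence`, the `epsilon` smoothing, and the direction of the divergence (entity types relative to query types) are assumptions made for the example; the answer does not specify them.

```python
import math


def kl_divergence(p, q, epsilon=1e-12):
    """KL(p || q) for two discrete distributions given as {type: probability}.

    Hypothetical helper: `epsilon` replaces zero probabilities in q so the
    logarithm stays finite. This smoothing is an assumption of the sketch.
    """
    total = 0.0
    for type_name in set(p) | set(q):
        p_t = p.get(type_name, 0.0)
        q_t = q.get(type_name, 0.0)
        # Terms with p_t == 0 contribute nothing to the sum.
        if p_t > 0.0:
            total += p_t * math.log(p_t / max(q_t, epsilon))
    return total


# Illustrative (made-up) type distributions for an entity and a query.
entity_types = {"Person": 0.5, "Scientist": 0.5}
query_types = {"Person": 0.25, "Scientist": 0.75}
divergence = kl_divergence(entity_types, query_types)
```

A lower divergence means the entity's type distribution is closer to the query's; a filtering approach could, for instance, discard entities whose divergence exceeds a chosen threshold.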
2. What future work is proposed in this paper?
In this section, the authors present several ideas for future work, at both a high level of abstraction and a high level of detail. More importantly, future work should focus on alternative approaches to reduce the complexity of the representation model, such as the keyword-based document profiles that the authors proposed. Apart from reducing the model by pruning redundancies, the authors would also like to extend it with synonyms for verbs, adjectives and adverbs, measuring the impact on effectiveness and determining whether the use of synsets for nouns had been sufficient.
3. What are the contributions in this paper?
With the goal of harnessing all available information to optimize retrieval, the authors explore joint representation models of documents and entities, while taking a step towards the definition of a more general retrieval approach. Specifically, the authors propose that graphs should be used to incorporate explicit and implicit information derived from the relations between text found in corpora and entities found in knowledge bases. The authors also take advantage of this framework to elaborate a general model for entity-oriented search, proposing a universal ranking function for the tasks of ad hoc document retrieval (leveraging entities), ad hoc entity retrieval, and entity list completion. For the graph-of-entity, the authors introduce the entity weight as the corresponding ranking function, relying on the idea of seed nodes for representing the query, either directly through term nodes, or based on the expansion to adjacent entity nodes. For the hypergraph-of-entity, the authors introduce the random walk score as the corresponding ranking function, relying on the same idea of seed nodes as the entity weight in the graph-of-entity. Scoring based on this function is highly reliant on the structure of the hypergraph, which the authors call representation-driven retrieval. The authors also propose TF-bins as a discretization for representing term frequency in the hypergraph-of-entity. For the random walk score, the authors propose and explore several parameters, including length and repeats, with or without seed node expansion, direction, or weights, and with or without a certain degree of node and/or hyperedge fatigue, a concept that they also propose. For evaluation, the authors took advantage of the TREC 2017 OpenSearch track, which relied on an online evaluation process based on the Living Labs API, and they also participated in the TREC 2018 Common Core track, which was based on the newly introduced TREC Washington Post Corpus.
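To make the idea of scoring by random walks from seed nodes more concrete, here is a minimal sketch. It is not the authors' implementation: it uses an ordinary adjacency-list graph in place of a hypergraph, ignores direction, weights, seed expansion and fatigue, and simply counts how often each node is visited. The function name `random_walk_score` and the toy graph are hypothetical; only the `length` and `repeats` parameters are named in the answer above.

```python
import random
from collections import Counter


def random_walk_score(graph, seeds, length=3, repeats=100, rng=None):
    """Count node visits over `repeats` walks of `length` steps per seed node.

    Simplified illustration on a plain graph ({node: [neighbors]}); the
    hypergraph structure, weights and fatigue are deliberately left out.
    """
    rng = rng or random.Random(0)  # fixed seed so runs are reproducible
    visits = Counter()
    for seed in seeds:
        for _ in range(repeats):
            node = seed
            for _ in range(length):
                neighbors = graph.get(node, [])
                if not neighbors:
                    break  # dead end: stop this walk early
                node = rng.choice(neighbors)
                visits[node] += 1
    return visits


# Toy graph: a query term node linked to an entity, which links to a document.
toy_graph = {
    "term:retrieval": ["entity:IR"],
    "entity:IR": ["term:retrieval", "doc:1"],
    "doc:1": ["entity:IR"],
}
scores = random_walk_score(toy_graph, ["term:retrieval"], length=2, repeats=10)
```

In this sketch, nodes that are reached more often from the seed nodes receive higher scores, so the ranking depends entirely on how the graph is wired, which is the intuition behind calling the approach representation-driven.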
4. What types of knowledge bases are commonly used to augment a corpus?
Knowledge bases like Wikipedia (semi-structured), DBpedia (structured), or Wikidata (structured) are frequently used to augment a corpus.





