Entity linking

Papers published on a yearly basis

Papers

Proceedings Article•10.3115/1119176.1119195•

Introduction to the CoNLL-2003 shared task: language-independent named entity recognition

Erik Tjong Kim Sang¹, Fien De Meulder¹•Institutions (1)

University of Antwerp¹

31 May 2003

TL;DR: This paper describes the CoNLL-2003 shared task on language-independent named entity recognition (NER), giving background on the English and German data sets and the evaluation method, together with a general overview of the participating systems and their performance.

Abstract: We describe the CoNLL-2003 shared task: language-independent named entity recognition. We give background information on the data sets (English and German) and the evaluation method, present a general overview of the systems that have taken part in the task and discuss their performance.

4,522 citations

Journal Article•10.1075/LI.30.1.03NAD•

A survey of named entity recognition and classification

David Nadeau¹, Satoshi Sekine²•Institutions (2)

National Research Council¹, New York University²

01 Jan 2007-Lingvisticae Investigationes

TL;DR: Observations about languages, named entity types, domains and textual genres studied in the literature, along with other critical aspects of NERC such as features and evaluation methods, are reported.

Abstract: This survey covers fifteen years of research in the Named Entity Recognition and Classification (NERC) field, from 1991 to 2006. We report observations about languages, named entity types, domains and textual genres studied in the literature. From the start, NERC systems have been developed using hand-made rules, but now machine learning techniques are widely used. These techniques are surveyed along with other critical aspects of NERC such as features and evaluation methods. Features are word-level, dictionary-level and corpus-level representations of words in a document. Evaluation techniques, ranging from intuitive exact match to very complex matching techniques with adjustable cost of errors, are an indisputable key to progress.

3,083 citations
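
The "intuitive exact match" evaluation mentioned in the survey abstract above can be sketched as follows. This is a minimal illustration, not code from either paper: the function name `exact_match_prf` and the `(start, end, type)` span representation are assumptions made for the example. A predicted entity counts as correct only if its boundaries and its type both match a gold entity exactly.

```python
def exact_match_prf(gold, predicted):
    """Entity-level precision, recall and F1 under exact matching.

    gold, predicted: iterables of (start, end, type) tuples
    (a hypothetical span format chosen for this sketch).
    """
    gold_set, pred_set = set(gold), set(predicted)
    # An entity is correct only if span boundaries AND type match exactly.
    correct = len(gold_set & pred_set)
    precision = correct / len(pred_set) if pred_set else 0.0
    recall = correct / len(gold_set) if gold_set else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Toy data: the second prediction has the right span but the wrong type,
# and the fourth prediction has no gold counterpart.
gold = [(0, 2, "PER"), (5, 6, "LOC"), (9, 11, "ORG")]
predicted = [(0, 2, "PER"), (5, 6, "ORG"), (9, 11, "ORG"), (13, 14, "LOC")]
p, r, f = exact_match_prf(gold, predicted)
print(f"precision={p:.3f} recall={r:.3f} f1={f:.3f}")
```

Under exact matching a type error is penalized twice, once as a false positive and once as a false negative, which is one reason the survey also discusses more lenient matching schemes.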

Journal Article•10.1162/TACL_A_00104•

Named Entity Recognition with Bidirectional LSTM-CNNs

Jason P.C. Chiu¹, Eric Nichols²•Institutions (2)

University of British Columbia¹, Honda²

21 Jul 2016-Transactions of the Association for Computational Linguistics

TL;DR: This article proposes a hybrid bidirectional LSTM and CNN architecture that automatically detects word- and character-level features, eliminating the need for most feature engineering; adding lexicons built from publicly available sources yields the best reported results.

Abstract: Named entity recognition is a challenging task that has traditionally required large amounts of knowledge in the form of feature engineering and lexicons to achieve high performance. In this paper, we present a novel neural network architecture that automatically detects word- and character-level features using a hybrid bidirectional LSTM and CNN architecture, eliminating the need for most feature engineering. We also propose a novel method of encoding partial lexicon matches in neural networks and compare it to existing approaches. Extensive evaluation shows that, given only tokenized text and publicly available word embeddings, our system is competitive on the CoNLL-2003 dataset and surpasses the previously reported state of the art performance on the OntoNotes 5.0 dataset by 2.13 F1 points. By using two lexicons constructed from publicly-available sources, we establish new state of the art performance with an F1 score of 91.62 on CoNLL-2003 and 86.28 on OntoNotes, surpassing systems that employ heavy feature engineering, proprietary lexicons, and rich entity linking information.

1,842 citations

Proceedings Article•10.1145/1458082.1458150•

Learning to link with wikipedia

David Milne, Ian H. Witten

26 Oct 2008

TL;DR: This paper explains how machine learning can be used to identify significant terms within unstructured text and enrich it with links to the appropriate Wikipedia articles; the resulting link detector and disambiguator achieves recall and precision of almost 75%.

Abstract: This paper describes how to automatically cross-reference documents with Wikipedia: the largest knowledge base ever known. It explains how machine learning can be used to identify significant terms within unstructured text, and enrich it with links to the appropriate Wikipedia articles. The resulting link detector and disambiguator performs very well, with recall and precision of almost 75%. This performance is constant whether the system is evaluated on Wikipedia articles or "real world" documents. This work has implications far beyond enriching documents with explanatory links. It can provide structured knowledge about any unstructured fragment of text. Any task that is currently addressed with bags of words - indexing, clustering, retrieval, and summarization to name a few - could use the techniques described here to draw on a vast network of concepts and semantics.

1,466 citations

Proceedings Article•

Large-Scale Named Entity Disambiguation Based on Wikipedia Data

Silviu Cucerzan¹•Institutions (1)

Microsoft¹

1 Jun 2007

TL;DR: Through a process of maximizing the agreement between the contextual information extracted from Wikipedia and the context of a document, as well as the agreement among the category tags associated with the candidate entities, the implemented system shows high disambiguation accuracy on both news stories and Wikipedia articles.

Abstract: This paper presents a large-scale system for the recognition and semantic disambiguation of named entities based on information extracted from a large encyclopedic collection and Web search results. It describes in detail the disambiguation paradigm employed and the information extraction process from Wikipedia. Through a process of maximizing the agreement between the contextual information extracted from Wikipedia and the context of a document, as well as the agreement among the category tags associated with the candidate entities, the implemented system shows high disambiguation accuracy on both news stories and Wikipedia articles.

1,354 citations
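
The idea of choosing the candidate entity whose context agrees most with the document, as described in the abstract above, can be illustrated with a toy sketch. This is a deliberate simplification and not the paper's system: it scores candidates by bag-of-words overlap only, omits the category-tag agreement term entirely, and the function names, candidate labels and context strings are all hypothetical.

```python
from collections import Counter

_PUNCT = ".,;:!?\"'()"


def tokenize(text):
    """Lowercase whitespace tokenizer that strips surrounding punctuation."""
    tokens = (t.strip(_PUNCT).lower() for t in text.split())
    return [t for t in tokens if t]


def overlap(doc_counts, entity_counts):
    """Dot product of two bag-of-words count vectors."""
    return sum(n * entity_counts[w] for w, n in doc_counts.items() if w in entity_counts)


def disambiguate(document, candidates):
    """Return the candidate whose context overlaps most with the document.

    candidates: dict mapping an entity label to a context string
    (in a real system, text extracted from a knowledge base).
    """
    doc_counts = Counter(tokenize(document))
    scores = {
        name: overlap(doc_counts, Counter(tokenize(context)))
        for name, context in candidates.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical candidate contexts for the ambiguous mention "Columbia".
candidates = {
    "Columbia (space shuttle)": "NASA orbiter space shuttle mission launch crew",
    "Columbia University": "university New York campus students research",
}
document = "The shuttle Columbia completed its mission after launch from Florida."
best, scores = disambiguate(document, candidates)
print(best, scores)
```

A realistic system would weight terms, handle ties, and resolve all mentions in a document jointly rather than one at a time.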

Year	Papers
2026	1
2025	23
2024	80
2023	166
2022	291
2021	191

Related Topics (5)

Performance Metrics