Phrase chunking

Papers published on a yearly basis

Papers

Book Chapter•10.1007/978-94-017-2390-9_10•

Text Chunking Using Transformation-Based Learning

[...]

1 Jan 1999

TL;DR: This work has shown that the transformation-based learning approach can be applied at a higher level of textual interpretation for locating chunks in the tagged text, including non-recursive “baseNP” chunks.

Abstract: Transformation-based learning, a technique introduced by Eric Brill (1993b), has been shown to do part-of-speech tagging with fairly high accuracy. This same method can be applied at a higher level of textual interpretation for locating chunks in the tagged text, including non-recursive “baseNP” chunks. For this purpose, it is convenient to view chunking as a tagging problem by encoding the chunk structure in new tags attached to each word. In automatic tests using Treebank-derived data, this technique achieved recall and precision rates of roughly 93% for baseNP chunks (trained on 950K words) and 88% for somewhat more complex chunks that partition the sentence (trained on 200K words). Working in this new application and with larger template and training sets has also required some interesting adaptations to the transformation-based learning approach.
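The idea of treating chunking as tagging can be illustrated with a minimal sketch. It assumes the common B-/I-/O convention (a `B-` tag on the first word of every chunk), which may differ in detail from the tag set used in the paper; the function names and the `(start, end, label)` span representation are hypothetical choices made for this illustration.

```python
from typing import List, Tuple

# A chunk is a half-open token span plus a label: (start, end_exclusive, label).
Chunk = Tuple[int, int, str]


def chunks_to_tags(n_tokens: int, chunks: List[Chunk]) -> List[str]:
    """Encode non-overlapping chunks as one tag per token (B-X, I-X, or O)."""
    tags = ["O"] * n_tokens
    for start, end, label in chunks:
        tags[start] = f"B-{label}"          # first token of the chunk
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"          # remaining tokens of the chunk
    return tags


def tags_to_chunks(tags: List[str]) -> List[Chunk]:
    """Decode per-token tags back into chunk spans."""
    chunks: List[Chunk] = []
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing chunk
        # Close the open chunk unless this tag continues it.
        if start is not None and tag != f"I-{label}":
            chunks.append((start, i, label))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
        elif tag.startswith("I-") and start is None:
            # Assumption: tolerate an I- tag with no preceding B- as a chunk start.
            start, label = i, tag[2:]
    return chunks


tokens = ["He", "reckons", "the", "current", "account", "deficit"]
base_nps = [(0, 1, "NP"), (2, 6, "NP")]     # baseNP chunks only
tags = chunks_to_tags(len(tokens), base_nps)
```

Once the chunk structure is expressed this way, any per-word tagger can in principle be trained to predict the tags, and `tags_to_chunks` recovers the spans from its output.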

1,697 citations

Proceedings Article•10.3115/1117601.1117631•

Introduction to the CoNLL-2000 shared task: chunking

[...]

Erik Tjong Kim Sang¹, Sabine Buchholz²•Institutions (2)

University of Antwerp¹, Tilburg University²

13 Sep 2000

TL;DR: This paper describes the CoNLL-2000 shared task, text chunking: dividing text into syntactically related, non-overlapping groups of words.

Abstract: We describe the CoNLL-2000 shared task: dividing text into syntactically related non-overlapping groups of words, so-called text chunking. We give background information on the data sets, present a general overview of the systems that have taken part in the shared task and briefly discuss their performance.
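Chunker performance is usually reported as precision, recall and F-measure over whole chunks. The sketch below is one way to compute these scores, assuming a chunk counts as correct only when both its boundaries and its label match the gold standard exactly; the function name and the `(start, end, label)` span format are hypothetical, not taken from the shared task's own evaluation script.

```python
from typing import Iterable, Tuple

# A chunk is a half-open token span plus a label: (start, end_exclusive, label).
Chunk = Tuple[int, int, str]


def chunk_prf(gold: Iterable[Chunk], predicted: Iterable[Chunk]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) using exact-match chunk comparison."""
    gold_set, pred_set = set(gold), set(predicted)
    correct = len(gold_set & pred_set)      # same span and same label
    precision = correct / len(pred_set) if pred_set else 0.0
    recall = correct / len(gold_set) if gold_set else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Gold: [NP He] [VP reckons] [NP the current account deficit]
gold = [(0, 1, "NP"), (1, 2, "VP"), (2, 6, "NP")]
# Hypothetical system output that wrongly splits the last noun phrase in two.
predicted = [(0, 1, "NP"), (1, 2, "VP"), (2, 5, "NP"), (5, 6, "NP")]
precision, recall, f1 = chunk_prf(gold, predicted)
```

Under exact matching, a single boundary error costs twice: the split noun phrase adds two wrong predictions and leaves one gold chunk unfound.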

933 citations

Proceedings Article•10.3115/1219840.1219893•

Exploring Various Knowledge in Relation Extraction

[...]

Guodong Zhou¹, Jian Su², Jie Zhang², Min Zhang¹•Institutions (2)

Institute for Infocomm Research Singapore¹, Agency for Science, Technology and Research²

25 Jun 2005

TL;DR: This paper investigates the incorporation of diverse lexical, syntactic and semantic knowledge in feature-based relation extraction using SVM. It shows that base phrase chunking information is very effective for relation extraction and accounts for most of the performance improvement from the syntactic aspect, while additional information from full parsing gives limited further enhancement.

Abstract: Extracting semantic relationships between entities is challenging. This paper investigates the incorporation of diverse lexical, syntactic and semantic knowledge in feature-based relation extraction using SVM. Our study illustrates that the base phrase chunking information is very effective for relation extraction and contributes to most of the performance improvement from the syntactic aspect, while additional information from full parsing gives limited further enhancement. This suggests that most of the useful information in full parse trees for relation extraction is shallow and can be captured by chunking. We also demonstrate how semantic information, such as WordNet and Name List, can be used in feature-based relation extraction to further improve the performance. Evaluation on the ACE corpus shows that effective incorporation of diverse features enables our system to outperform previously best-reported systems on the 24 ACE relation subtypes and to significantly outperform tree kernel-based systems by over 20 in F-measure on the 5 ACE relation types.

911 citations

Journal Article•10.1016/J.JBI.2013.08.004•

Unsupervised biomedical named entity recognition

[...]

Shaodian Zhang¹, Noémie Elhadad¹•Institutions (1)

Columbia University¹

1 Dec 2013, Journal of Biomedical Informatics

TL;DR: A stepwise solution to tackle the challenges of entity boundary detection and entity type classification without relying on any handcrafted rules, heuristics, or annotated data is described.

268 citations

Proceedings Article•

YADAC: Yet another Dialectal Arabic Corpus

[...]

Rania Al-Sabbagh¹, Roxana Girju¹•Institutions (1)

University of Illinois at Urbana–Champaign¹

1 Jan 2012

TL;DR: This paper presents the first phase of building YADAC, a multi-genre Dialectal Arabic corpus that is compiled using Web data from microblogs and question-answer pairs extracted from online knowledge market services in which both questions and answers are user-generated.

Abstract: This paper presents the first phase of building YADAC ― a multi-genre Dialectal Arabic (DA) corpus ― that is compiled using Web data from microblogs (i.e. Twitter), blogs/forums and online knowledge market services in which both questions and answers are user-generated. In addition to introducing two new genres to the current efforts of building DA corpora (i.e. microblogs and question-answer pairs extracted from online knowledge market services), the paper highlights and tackles several new issues related to building DA corpora that have not been handled in previous studies: function-based Web harvesting and dialect identification, vowel-based spelling variation, linguistic hypercorrection and its effect on spelling variation, unsupervised Part-of-Speech (POS) tagging and base phrase chunking for DA. Although the algorithms for both POS tagging and base-phrase chunking are still under development, the results are promising.

81 citations

Year	Papers
2021	1
2020	3
2019	2
2018	2
2017	1
2016	2
