End-to-end Sequence Labeling via Bi-directional LSTM-CNNs-CRF
Xuezhe Ma, Eduard Hovy
- 04 Mar 2016
- Vol. 1, pp 1064-1074
TL;DR: This paper combines bidirectional LSTM, CNN, and CRF for sequence labeling and reports state-of-the-art performance on both evaluation datasets: the Penn Treebank WSJ corpus for POS tagging and the CoNLL 2003 corpus for NER.
Abstract: State-of-the-art sequence labeling systems traditionally require large amounts of task-specific knowledge in the form of handcrafted features and data preprocessing. In this paper, we introduce a novel neural network architecture that benefits from both word- and character-level representations automatically, by using a combination of bidirectional LSTM, CNN and CRF. Our system is truly end-to-end, requiring no feature engineering or data preprocessing, thus making it applicable to a wide range of sequence labeling tasks. We evaluate our system on two data sets for two sequence labeling tasks: the Penn Treebank WSJ corpus for part-of-speech (POS) tagging and the CoNLL 2003 corpus for named entity recognition (NER). We obtain state-of-the-art performance on both datasets: 97.55% accuracy for POS tagging and 91.21% F1 for NER.
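The CRF layer named in the abstract scores whole tag sequences rather than individual tags, so prediction needs a search over sequences. The following is a minimal sketch of Viterbi decoding for a linear-chain CRF, written in plain Python for illustration. It is not the authors' code: the function name `viterbi_decode` and the `emissions` and `transitions` score tables are assumptions made here, and a real system would learn both sets of scores during training.

```python
from typing import List, Sequence


def viterbi_decode(emissions: Sequence[Sequence[float]],
                   transitions: Sequence[Sequence[float]]) -> List[int]:
    """Return the highest-scoring tag sequence for a linear-chain CRF.

    emissions[t][j]   : score of tag j at position t (hypothetical input,
                        e.g. produced by a sentence encoder)
    transitions[i][j] : score of moving from tag i to tag j
    """
    if not emissions:
        return []
    num_tags = len(emissions[0])
    # best[j] holds the best score of any path ending in tag j so far.
    best = list(emissions[0])
    backpointers: List[List[int]] = []
    for t in range(1, len(emissions)):
        new_best = []
        pointers = []
        for j in range(num_tags):
            # Choose the previous tag that maximizes the score into tag j.
            prev = max(range(num_tags),
                       key=lambda i: best[i] + transitions[i][j])
            new_best.append(best[prev] + transitions[prev][j] + emissions[t][j])
            pointers.append(prev)
        best = new_best
        backpointers.append(pointers)
    # Start from the best final tag and follow the back-pointers.
    last = max(range(num_tags), key=lambda j: best[j])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


if __name__ == "__main__":
    emissions = [[3.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
    free = [[0.0, 0.0], [0.0, 0.0]]        # no transition preferences
    penalized = [[0.0, -5.0], [0.0, 0.0]]  # discourage moving from tag 0 to tag 1
    print(viterbi_decode(emissions, free))
    print(viterbi_decode(emissions, penalized))
```

With all-zero transition scores the decoder simply picks the best tag at each position. Once a transition is penalized, the best sequence can change as a whole, which is the reason for placing a CRF on top of per-token scores.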
Citations
Deep contextualized word representations
Matthew E. Peters, Mark Neumann, Mohit Iyyer, Matt Gardner, Christopher Clark, Kenton Lee, Luke Zettlemoyer
- 15 Feb 2018
TL;DR: This paper introduced a new type of deep contextualized word representation that models both complex characteristics of word use (e.g., syntax and semantics), and how these uses vary across linguistic contexts (i.e., to model polysemy).
Neural Architectures for Named Entity Recognition
Guillaume Lample, Miguel Ballesteros, Sandeep Subramanian, Kazuya Kawakami, Chris Dyer
- 04 Mar 2016
TL;DR: Paper presented at the 2016 Conference of the North American Chapter of the Association for Computational Linguistics, held in San Diego (CA, USA), June 12 to 17, 2016.
Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing
16 Jan 2023
TL;DR: The authors survey and organize research on a new paradigm in natural language processing, which they dub "prompt-based learning", and describe a unified set of mathematical notations that can cover a wide variety of existing work.
•Proceedings Article
Contextual String Embeddings for Sequence Labeling
Alan Akbik, Duncan A. J. Blythe, Roland Vollgraf
- 01 Aug 2018
TL;DR: This paper proposes leveraging the internal states of a trained character language model to produce a novel type of word embedding, referred to as contextual string embeddings, which fundamentally model words as sequences of characters and are contextualized by their surrounding text.
A Survey on Deep Learning for Named Entity Recognition
TL;DR: A comprehensive review on existing deep learning techniques for NER is provided in this paper, where the authors systematically categorize existing works based on a taxonomy along three axes: distributed representations for input, context encoder, and tag decoder.
References
•Proceedings Article
Distributed Representations of Words and Phrases and their Compositionality
Tomas Mikolov, Ilya Sutskever, Kai Chen, Greg S. Corrado, Jeffrey Dean
- 05 Dec 2013
TL;DR: This paper presents a simple method for finding phrases in text, shows that learning good vector representations for millions of phrases is possible, and describes a simple alternative to the hierarchical softmax called negative sampling.
Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification
Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun
- 07 Dec 2015
TL;DR: This paper proposes a Parametric Rectified Linear Unit (PReLU) that improves model fitting with nearly zero extra computational cost and little overfitting risk, and reports a 4.94% top-5 test error on the ImageNet 2012 classification dataset.
•Proceedings Article
Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data
John Lafferty, Andrew McCallum, Fernando Pereira
- 28 Jun 2001
TL;DR: This work presents iterative parameter estimation algorithms for conditional random fields and compares the performance of the resulting models to HMMs and MEMMs on synthetic and natural-language data.
•Posted Content
Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification
TL;DR: This work proposes a Parametric Rectified Linear Unit (PReLU) that generalizes the traditional rectified unit and derives a robust initialization method that particularly considers the rectifier nonlinearities.
Backpropagation applied to handwritten zip code recognition
Yann LeCun, Bernhard E. Boser, John S. Denker, D. Henderson, Richard Howard, W. Hubbard, Lawrence D. Jackel
TL;DR: This paper demonstrates how constraints from the task domain can be integrated into a backpropagation network through the architecture of the network. The approach is successfully applied to the recognition of handwritten zip code digits provided by the U.S. Postal Service.
Related Papers (5)
Jeffrey Pennington, Richard Socher, Christopher D. Manning
- 01 Oct 2014