A primer on neural network models for natural language processing

doi:10.1613/JAIR.4992

Open Access•Journal Article

Yoav Goldberg

- 01 Sep 2016

- Journal of Artificial Intelligence Resea...

- Vol. 57, Iss: 1, pp 345-420

1.2K

TL;DR: This tutorial surveys neural network models from the perspective of natural language processing research, in an attempt to bring natural-language researchers up to speed with the neural techniques.

Citations

•Proceedings Article•10.17979/SPUDC.9788497497565.0644

Supervisión remota en el entrenamiento de un clasificador de sentimientos en comentarios turísticos

C. A. Martín, +3 more

- 06 Mar 2020

TL;DR: In this article, an algorithm for automatically identifying the sentiments expressed by tourists on eWOM (Electronic Word of Mouth) platforms is described. But the authors do not present a use case for this method involving a group of hotels located on the island of Tenerife (Canary Islands).

5

•Dissertation

Aprendizagem profunda para reconhecimento de entidades nomeadas em domínio jurídico

Pedro Vitor Quinta de Castro

- 05 Dec 2019

TL;DR: Named entity recognition (Reconhecimento de Entidades Nomeadas, REN) is a challenging task in natural language processing for a language as rich as Portuguese.

5

•Proceedings Article•10.1109/BESC51023.2020.9348310

An Efficient Intrusion Detection Model Combined Bidirectional Gated Recurrent Units With Attention Mechanism

Jingyi Wang, +4 more

- 05 Nov 2020

TL;DR: Wang et al. propose a two-layer bidirectional gated recurrent unit (BiGRU) network with an attention mechanism to classify traffic data. The model detects network intrusions effectively and outperforms related methods, with a lower false alarm rate, high accuracy, and reduced training and testing time.

5

•Journal Article•10.3390/DATA6030031

KazNewsDataset: Single Country Overall Digital Mass Media Publication Corpus

Kirill Yakunin, +13 more

- 14 Mar 2021

TL;DR: The authors present a corpus of Kazakhstan media containing over 4 million publications from 36 primary sources (each with at least 500 publications). The corpus also includes more than 2 million texts from Russian media, for comparative analysis of publication activity in the two countries, and about 4000 sections of state policy documents.

5

•Posted Content

Deep Dialog Act Recognition using Multiple Token, Segment, and Context Information Representations

Eugénio Ribeiro, +2 more

- 23 Jul 2018

- arXiv: Computation and Language

TL;DR: The authors explored means to generate more informative segment representations, not only by exploring different network architectures, but also by considering different token representations at both the word level and the character and functional level.

5

References

Journal Article•10.1162/NECO.1997.9.8.1735

Long short-term memory

Sepp Hochreiter, +1 more

- 01 Nov 1997

- Neural Computation

TL;DR: A novel, efficient, gradient-based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.

99K

•Journal Article•10.1145/3065386

ImageNet classification with deep convolutional neural networks

Alex Krizhevsky, +2 more

- 24 May 2017

- Communications of The ACM

TL;DR: A large, deep convolutional neural network was trained to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes and employed a recently developed regularization method called "dropout" that proved to be very effective.

98.2K

•Proceedings Article

ImageNet Classification with Deep Convolutional Neural Networks

Alex Krizhevsky, +2 more

- 03 Dec 2012

TL;DR: The authors describe a deep convolutional neural network that achieved state-of-the-art performance. It consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax.

88.4K

•Posted Content

Adam: A Method for Stochastic Optimization

Diederik P. Kingma, +1 more

- 22 Dec 2014

- arXiv: Learning

TL;DR: The authors introduce Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments.

82.5K

•Journal Article•10.1109/5.726791

Gradient-based learning applied to document recognition

Yann LeCun, +6 more

- 01 Jan 1998

TL;DR: Gradient-based learning algorithms can be used to synthesize a complex decision surface that classifies high-dimensional patterns, such as handwritten characters. The paper also proposes graph transformer networks (GTN), a paradigm for training multi-module recognition systems globally with gradient-based methods.

53.5K

Related Papers (5)

Glove: Global Vectors for Word Representation

Long short-term memory

Deep learning

Neural Machine Translation by Jointly Learning to Align and Translate

Dropout: a simple way to prevent neural networks from overfitting