An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling

Open Access • Posted Content

- 04 Mar 2018

5.4K

TL;DR: A systematic evaluation of generic convolutional and recurrent architectures for sequence modeling concludes that the common association between sequence modeling and recurrent networks should be reconsidered, and that convolutional networks should be regarded as a natural starting point for sequence modeling tasks.

Citations

•Proceedings Article•10.18653/V1/P19-1285

Transformer-XL: Attentive Language Models beyond a Fixed-Length Context.

Zihang Dai, +5 more

- 09 Jan 2019

TL;DR: This work proposes a novel neural architecture, Transformer-XL, that enables learning dependencies beyond a fixed length without disrupting temporal coherence. It consists of a segment-level recurrence mechanism and a novel positional encoding scheme.

4.2K

•Proceedings Article•10.1109/ICDM.2018.00035

Self-Attentive Sequential Recommendation

Wang-Cheng Kang, +1 more

- 01 Nov 2018

TL;DR: In this article, a self-attention based sequential model (SASRec) is proposed, which uses an attention mechanism to identify which items are 'relevant' in a user's action history and uses them to predict the next item.

2.7K

•Journal Article•10.1029/98JC02160

Ocean Color Chlorophyll Algorithms for SeaWiFS

John E. O'Reilly, +7 more

- 15 Oct 1998

- Journal of Geophysical Research

TL;DR: In this article, a large data set containing coincident in situ chlorophyll and remote sensing reflectance measurements was used to evaluate the accuracy, precision, and suitability of a wide variety of ocean color algorithms for use by SeaWiFS (Sea-viewing Wide Field-of-view Sensor).

2.6K

•Journal Article•10.1109/TASLP.2019.2915167

Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation

Yi Luo, +1 more

- 20 Sep 2018

- arXiv: Sound

TL;DR: This work proposes a fully convolutional time-domain audio separation network (Conv-TasNet), a deep learning framework for end-to-end time-domain speech separation, which significantly outperforms previous time-frequency masking methods in separating two- and three-speaker mixtures.

2K

•Proceedings Article•10.1109/CVPR.2019.00319

4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks

Christopher Choy, +2 more

- 15 Jun 2019

TL;DR: In this paper, a generalized sparse convolutional neural network (GS-CNN) is proposed for spatio-temporal perception; it can directly process 3D videos using high-dimensional convolutions.

1.4K

References

•Posted Content

The LAMBADA dataset: Word prediction requiring a broad discourse context

Denis Paperno, +8 more

- 20 Jun 2016

- arXiv: Computation and Language

TL;DR: It is shown that LAMBADA exemplifies a wide range of linguistic phenomena, and that none of several state-of-the-art language models reaches accuracy above 1% on this novel benchmark.

433

•Posted Content

Hierarchical Multiscale Recurrent Neural Networks

Junyoung Chung, +2 more

- 06 Sep 2016

- arXiv: Learning

TL;DR: In this paper, a hierarchical multiscale recurrent neural network (HM-RNN) is proposed to capture the latent hierarchical structure in the sequence by encoding the temporal dependencies with different timescales using a novel update mechanism.

420

•Proceedings Article

A Clockwork RNN

Jan Koutník, +3 more

- 21 Jun 2014

TL;DR: This paper introduces a simple, yet powerful modification to the simple RNN architecture, the Clockwork RNN (CW-RNN), in which the hidden layer is partitioned into separate modules, each processing inputs at its own temporal granularity, making computations only at its prescribed clock rate.

336

•Proceedings Article

Recurrent Batch Normalization

Tim Cooijmans, +4 more

- 30 Mar 2016

TL;DR: In this article, a reparameterization of LSTM that brings the benefits of batch normalization to recurrent neural networks is proposed. Whereas previous work applied batch normalization only to the input-to-hidden transformation of RNNs, the authors demonstrate that it is both possible and beneficial to batch-normalize the hidden-to-hidden transition.

335

•Posted Content

On the State of the Art of Evaluation in Neural Language Models

Gábor Melis, +2 more

- 18 Jul 2017

- arXiv: Computation and Language

TL;DR: This work reevaluates several popular architectures and regularisation methods with large-scale automatic black-box hyperparameter tuning and arrives at the somewhat surprising conclusion that standard LSTM architectures, when properly regularised, outperform more recent models.

294

Related Papers (5)

Long short-term memory

Deep Residual Learning for Image Recognition

Adam: A Method for Stochastic Optimization

Attention is All you Need

Dropout: a simple way to prevent neural networks from overfitting