An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling

Open Access • Posted Content

- 04 Mar 2018

5.4K

TL;DR: A systematic evaluation of generic convolutional and recurrent architectures for sequence modeling concludes that the common association between sequence modeling and recurrent networks should be reconsidered, and that convolutional networks should be regarded as a natural starting point for sequence modeling tasks.

Citations

•Proceedings Article•10.18653/V1/P19-1285

Transformer-XL: Attentive Language Models beyond a Fixed-Length Context.

Zihang Dai, +5 more

- 09 Jan 2019

TL;DR: This work proposes a novel neural architecture Transformer-XL that enables learning dependency beyond a fixed length without disrupting temporal coherence, which consists of a segment-level recurrence mechanism and a novel positional encoding scheme.

4.2K

•Proceedings Article•10.1109/ICDM.2018.00035

Self-Attentive Sequential Recommendation

Wang-Cheng Kang, +1 more

- 01 Nov 2018

TL;DR: In this article, a self-attention based sequential model (SASRec) is proposed, which uses an attention mechanism to identify which items are 'relevant' from a user's action history and uses them to predict the next item.

2.7K

•Journal Article•10.1029/98JC02160

Ocean Color Chlorophyll Algorithms for SEAWIFS

John E. O'Reilly, +7 more

- 15 Oct 1998

- Journal of Geophysical Research

TL;DR: In this article, a large data set containing coincident in situ chlorophyll and remote sensing reflectance measurements was used to evaluate the accuracy, precision, and suitability of a wide variety of ocean color algorithms for use by SeaWiFS (Sea-viewing Wide Field-of-view Sensor).

2.6K

•Journal Article•10.1109/TASLP.2019.2915167

Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation

Yi Luo, +1 more

- 20 Sep 2018

- arXiv: Sound

TL;DR: This work proposes a fully convolutional time-domain audio separation network (Conv-TasNet), a deep learning framework for end-to-end time-domain speech separation that significantly outperforms previous time–frequency masking methods in separating two- and three-speaker mixtures.

2K

•Proceedings Article•10.1109/CVPR.2019.00319

4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks

Christopher Choy, +2 more

- 15 Jun 2019

TL;DR: In this paper, a generalized sparse convolutional neural network (GS-CNN) was proposed for spatio-temporal perception of 3D-videos, which can directly process 3D videos using high-dimensional convolutions.

1.4K

References

•Posted Content

Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks

Tim Salimans, +1 more

- 25 Feb 2016

- arXiv: Learning

TL;DR: Weight normalization reparameterizes the weight vectors in a neural network so that the length of each weight vector is decoupled from its direction, improving the conditioning of the optimization problem and speeding up convergence of stochastic gradient descent.

1.3K
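
The reparameterization summarized in the entry above can be illustrated with a minimal sketch. It assumes the standard form w = g * v / ||v||, where a scalar `g` carries the length and a vector `v` carries only the direction. The function name `weight_norm` and the plain-list representation are illustrative choices, not taken from the paper or any library.

```python
import math


def weight_norm(v, g):
    """Return w = g * v / ||v|| for a direction vector v and scalar length g.

    Hypothetical helper for illustration only.
    """
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("direction vector v must be non-zero")
    return [g * x / norm for x in v]


# The length of w is set by g alone; rescaling v leaves w unchanged.
w = weight_norm([3.0, 4.0], g=2.0)
w_rescaled = weight_norm([30.0, 40.0], g=2.0)
```

Because `v` is divided by its own norm, an optimizer can adjust the scale `g` and the direction `v` independently, which is the decoupling the summary refers to.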

•Proceedings Article•10.18653/V1/E17-1104

Very deep convolutional networks for text classification

Alexis Conneau, +3 more

- 03 Apr 2017

TL;DR: Very deep convolutional networks (VDCNN) are applied to text classification and achieve state-of-the-art performance on several public text classification tasks.

1.2K

•Posted Content

Comparative Study of CNN and RNN for Natural Language Processing

Wenpeng Yin, +3 more

- 07 Feb 2017

- arXiv: Computation and Language

TL;DR: This work is the first systematic comparison of CNN and RNN on a wide range of representative NLP tasks, aiming to give basic guidance for DNN selection.

1.1K

•Posted Content

Temporal Convolutional Networks for Action Segmentation and Detection

Colin Lea, +4 more

- 16 Nov 2016

- arXiv: Computer Vision and Pattern Recog...

TL;DR: Temporal Convolutional Networks (TCNs) use a hierarchy of temporal convolutions to perform fine-grained action segmentation or detection, and can capture action compositions, segment durations, and long-range dependencies.

1K

•Posted Content

Regularizing and Optimizing LSTM Language Models

Stephen Merity, +2 more

- 07 Aug 2017

- arXiv: Computation and Language

TL;DR: This paper proposes the weight-dropped LSTM which uses DropConnect on hidden-to-hidden weights as a form of recurrent regularization and introduces NT-ASGD, a variant of the averaged stochastic gradient method, wherein the averaging trigger is determined using a non-monotonic condition as opposed to being tuned by the user.

1K
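
The non-monotonic averaging trigger mentioned in the entry above can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: it assumes validation loss is recorded at regular checks, and that averaging starts once the current loss is worse than the best loss recorded more than `n` checks earlier. The function name `nt_trigger_step` and the parameter `n` are hypothetical.

```python
def nt_trigger_step(val_losses, n=5):
    """Return the index of the first validation check at which averaging
    would be triggered, or None if the condition is never met.

    Assumed condition: more than n checks have been logged, and the current
    loss exceeds the minimum of all losses logged at least n checks ago.
    """
    logs = []
    for t, v in enumerate(val_losses):
        # logs[: t - n] holds the losses from checks 0 .. t - n - 1.
        if t > n and v > min(logs[: t - n]):
            return t
        logs.append(v)
    return None


# Loss improves, then stalls: the trigger fires once the stall has lasted
# longer than the non-monotone interval n.
stalled = nt_trigger_step([5.0, 4.0, 3.0, 3.1, 3.2, 3.3, 3.4], n=2)
# Loss keeps improving: no trigger.
improving = nt_trigger_step([5.0, 4.0, 3.0, 2.0, 1.0], n=2)
```

The point of the design is that the switch to averaging is decided from the validation history rather than from a hand-tuned step count.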

Related Papers (5)

Long short-term memory

Deep Residual Learning for Image Recognition

Adam: A Method for Stochastic Optimization

Attention is All you Need

Dropout: a simple way to prevent neural networks from overfitting