An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling

Open Access • Posted Content

- 04 Mar 2018

5.4K

TL;DR: A systematic evaluation of generic convolutional and recurrent architectures for sequence modeling concludes that the common association between sequence modeling and recurrent networks should be reconsidered, and that convolutional networks should be regarded as a natural starting point for sequence modeling tasks.

Citations

•Proceedings Article•10.18653/V1/P19-1285

Transformer-XL: Attentive Language Models beyond a Fixed-Length Context.

Zihang Dai, +5 more

- 09 Jan 2019

TL;DR: This work proposes a novel neural architecture Transformer-XL that enables learning dependency beyond a fixed length without disrupting temporal coherence, which consists of a segment-level recurrence mechanism and a novel positional encoding scheme.

4.2K

•Proceedings Article•10.1109/ICDM.2018.00035

Self-Attentive Sequential Recommendation

Wang-Cheng Kang, +1 more

- 01 Nov 2018

TL;DR: In this article, a self-attention based sequential model (SASRec) is proposed, which uses an attention mechanism to identify which items are 'relevant' from a user's action history, and uses them to predict the next item.

2.7K

•Journal Article•10.1029/98JC02160

Ocean Color Chlorophyll Algorithms for SEAWIFS

John E. O'Reilly, +7 more

- 15 Oct 1998

- Journal of Geophysical Research

TL;DR: In this article, a large data set containing coincident in situ chlorophyll and remote sensing reflectance measurements was used to evaluate the accuracy, precision, and suitability of a wide variety of ocean color algorithms for use by SeaWiFS (Sea-viewing Wide Field-of-view Sensor).

2.6K

•Journal Article•10.1109/TASLP.2019.2915167

Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation

Yi Luo, +1 more

- 20 Sep 2018

- arXiv: Sound

TL;DR: A fully convolutional time-domain audio separation network (Conv-TasNet), a deep learning framework for end-to-end time-domain speech separation, significantly outperforms previous time–frequency masking methods in separating two- and three-speaker mixtures.

2K

•Proceedings Article•10.1109/CVPR.2019.00319

4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks

Christopher Choy, +2 more

- 15 Jun 2019

TL;DR: In this paper, a generalized sparse convolutional neural network (GS-CNN) was proposed for spatio-temporal perception of 3D videos, which can directly process 3D videos using high-dimensional convolutions.

1.4K

References

•Proceedings Article•10.3115/V1/N15-1011

Effective Use of Word Order for Text Categorization with Convolutional Neural Networks

Rie Johnson, +1 more

- 01 Jan 2015

TL;DR: A straightforward adaptation of CNN from image to text, a simple but new variation which employs bag-of-word conversion in the convolution layer, is proposed, and an extension to combine multiple convolution layers is explored for higher accuracy.

1K

•Proceedings Article

How to Construct Deep Recurrent Neural Networks

Razvan Pascanu, +3 more

- 01 Jan 2014

TL;DR: In this article, the authors explore different ways to extend a recurrent neural network (RNN) to a deep RNN by carefully analyzing and understanding the architecture of an RNN.

944

•Proceedings Article•10.18653/V1/E17-2025

Using the Output Embedding to Improve Language Models

Ofir Press, +1 more

- 01 Apr 2017

TL;DR: This article showed that weight tying can reduce the size of neural translation models to less than half of their original size without harming their performance and proposed a new method of regularizing the output embedding.

928

•Proceedings Article•10.18653/V1/P17-1052

Deep pyramid convolutional neural networks for text categorization

Rie Johnson, +1 more

- 01 Jul 2017

TL;DR: A low-complexity word-level deep convolutional neural network architecture for text categorization that can efficiently represent long-range associations in text and outperforms the previous best models on six benchmark datasets for sentiment classification and topic categorization.

883

•Posted Content

A Simple Way to Initialize Recurrent Networks of Rectified Linear Units

Quoc V. Le, +2 more

- 03 Apr 2015

- arXiv: Neural and Evolutionary Computing

TL;DR: This paper proposes a simpler solution that uses recurrent neural networks composed of rectified linear units and is comparable to LSTM on four benchmarks: two toy problems involving long-range temporal structures, a large language modeling problem, and a benchmark speech recognition problem.

822

Related Papers (5)

Long short-term memory

Deep Residual Learning for Image Recognition

Adam: A Method for Stochastic Optimization

Attention is All you Need

Dropout: a simple way to prevent neural networks from overfitting