Open Access • Posted Content
An Empirical Evaluation of Generic Convolutional and Recurrent Networks for Sequence Modeling
TL;DR: A systematic evaluation of generic convolutional and recurrent architectures for sequence modeling concludes that the common association between sequence modeling and recurrent networks should be reconsidered, and that convolutional networks should be regarded as a natural starting point for sequence modeling tasks.
Abstract: For most deep learning practitioners, sequence modeling is synonymous with recurrent networks. Yet recent results indicate that convolutional architectures can outperform recurrent networks on tasks such as audio synthesis and machine translation. Given a new sequence modeling task or dataset, which architecture should one use? We conduct a systematic evaluation of generic convolutional and recurrent architectures for sequence modeling. The models are evaluated across a broad range of standard tasks that are commonly used to benchmark recurrent networks. Our results indicate that a simple convolutional architecture outperforms canonical recurrent networks such as LSTMs across a diverse range of tasks and datasets, while demonstrating longer effective memory. We conclude that the common association between sequence modeling and recurrent networks should be reconsidered, and convolutional networks should be regarded as a natural starting point for sequence modeling tasks. To assist related work, we have made code available at this http URL.
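The abstract does not describe the evaluated architectures in detail, so the following is only a minimal sketch of the two generic building blocks it contrasts, under stated assumptions: a causal 1D convolution (a common ingredient of convolutional sequence models, assumed here rather than taken from the paper) and a scalar tanh recurrence (a textbook simplification, not an LSTM). The function names `causal_conv1d` and `simple_rnn` and all weights are hypothetical. Both map an input sequence to an output sequence of the same length, where the output at step t depends only on inputs up to step t.

```python
import math
from typing import List, Sequence


def causal_conv1d(x: Sequence[float], kernel: Sequence[float]) -> List[float]:
    """Causal 1D convolution (illustrative, not the paper's model).

    kernel[0] weights the current step, kernel[1] the previous step, etc.
    Positions before the start of the sequence are treated as zeros, so the
    output has the same length as the input and never looks at the future.
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for j, w in enumerate(kernel):
            if t - j >= 0:  # implicit zero padding on the left
                acc += w * x[t - j]
        out.append(acc)
    return out


def simple_rnn(x: Sequence[float], w_in: float, w_rec: float) -> List[float]:
    """Scalar recurrence h[t] = tanh(w_in * x[t] + w_rec * h[t-1]), h[-1] = 0.

    A deliberately simplified stand-in for a recurrent layer; the weights
    are arbitrary illustration values supplied by the caller.
    """
    h = 0.0
    out = []
    for value in x:
        h = math.tanh(w_in * value + w_rec * h)
        out.append(h)
    return out


if __name__ == "__main__":
    seq = [1.0, 2.0, 3.0]
    print(causal_conv1d(seq, [0.5, 0.5]))
    print(simple_rnn(seq, w_in=0.1, w_rec=0.5))
```

In this sketch the convolution reads a fixed window of past inputs at every step, while the recurrence carries all history through a single state value; how far back each can usefully look in practice is the "effective memory" question the abstract raises.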
Citations
Forecasting water quality variable using deep learning and weighted averaging ensemble models.
Mohammad Ghadir Zamani, Mohammad Reza Nikoo, Sina Jahanshahi, Rahim Barzegar, Amirreza Meydani
TL;DR: The study's findings demonstrated that the EM-NSGA-II stands out with exceptional effectiveness compared to DL and EM-GA models, showcasing improvements of 14% (RNN), 8% (LSTM), 6% (GRU), 8% (TCN), and 3% (EM-GA) during the testing phase.
23
Multi-task Temporal Convolutional Network for Predicting Water Quality Sensor Data
Yi-Fan Zhang, Peter J. Thorburn, Peter Fitch
- 12 Dec 2019
TL;DR: The proposed multi-task temporal convolution network (MTCN) is an encouraging approach for water quality management by processing a large amount of sensor data and achieves the best RMSE scores in predicting both temperature and DO in the following 48 time steps.
23
Centimeter-Scale Lithology and Facies Prediction in Cored Wells Using Machine Learning
TL;DR: In this paper, an open-source, Python-based machine learning workflow was developed to analyze core image data in a scalable, reproducible way, which can unlock warehouses full of high-resolution data for a multitude of geological settings.
TNT: An Interpretable Tree-Network-Tree Learning Framework using Knowledge Distillation.
TL;DR: This work proposes a Tree-Network-Tree (TNT) learning framework for explainable decision-making, in which knowledge is alternately transferred between the tree model and DNNs; extensive experiments demonstrated the effectiveness of the proposed method.
23
Large-Area Land-Cover Changes Monitoring With Time-Series Remote Sensing Images Using Transferable Deep Models
TL;DR: Wang et al. proposed the similarity-measurement-based deep transfer learning for time-series adaptive change detection (SDTL-TSACD) model, which uses a standard dynamic time warping (SDTW) distance to cluster large-scale time series into multiple subcategories with high time-series similarity.
23
References
Deep Residual Learning for Image Recognition
Kaiming He, Xiangyu Zhang, Shaoqing Ren, Jian Sun
- 27 Jun 2016
TL;DR: In this article, the authors proposed a residual learning framework to ease the training of networks that are substantially deeper than those used previously, which won 1st place on the ILSVRC 2015 classification task.
•Proceedings Article
Adam: A Method for Stochastic Optimization
Diederik P. Kingma, Jimmy Ba
- 01 Jan 2015
TL;DR: This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
138.5K
•Posted Content
Deep Residual Learning for Image Recognition
TL;DR: This work presents a residual learning framework to ease the training of networks that are substantially deeper than those used previously, and provides comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth.
117.9K
Long short-term memory
TL;DR: A novel, efficient, gradient-based method called long short-term memory (LSTM) is introduced, which can learn to bridge minimal time lags in excess of 1000 discrete-time steps by enforcing constant error flow through constant error carousels within special units.
99K
Gradient-based learning applied to document recognition
Yann LeCun, Léon Bottou, Yoshua Bengio, Patrick Haffner
- 01 Jan 1998
TL;DR: In this article, a graph transformer network (GTN) is proposed for handwritten character recognition, which can be used to synthesize a complex decision surface that can classify high-dimensional patterns, such as handwritten characters.
53.5K