Wide Residual Networks

doi:10.5244/C.30.87

•Open Access•Proceedings Article•10.5244/C.30.87

- 01 Jan 2016

4.9K

TL;DR: This paper conducts a detailed experimental study of the architecture of ResNet blocks and proposes a novel architecture in which the depth of residual networks is decreased and their width is increased. The resulting network structures, called wide residual networks (WRNs), are shown to be far superior to their commonly used thin and very deep counterparts.

Abstract: Deep residual networks were shown to be able to scale up to thousands of layers and still have improving performance. However, each fraction of a percent of improved accuracy costs nearly doubling the number of layers, and so training very deep residual networks has a problem of diminishing feature reuse, which makes these networks very slow to train. To tackle these problems, in this paper we conduct a detailed experimental study on the architecture of ResNet blocks, based on which we propose a novel architecture where we decrease depth and increase width of residual networks. We call the resulting network structures wide residual networks (WRNs) and show that these are far superior over their commonly used thin and very deep counterparts. For example, we demonstrate that even a simple 16-layer-deep wide residual network outperforms in accuracy and efficiency all previous deep residual networks, including thousand-layer-deep networks, achieving new state-of-the-art results on CIFAR, SVHN, COCO, and significant improvements on ImageNet. Our code and models are available at this https URL
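
The trade-off described in the abstract, fewer layers with more channels per layer, can be made concrete with a small sketch. It assumes the WRN-d-k naming convention from the paper itself (not stated in this abstract): a total depth d = 6n + 4 for n basic blocks in each of three groups, and a widening factor k that multiplies base group widths of 16, 32, and 64 channels. The function name `wrn_config` and the returned keys are hypothetical, chosen for illustration only.

```python
def wrn_config(depth: int, widen_factor: int) -> dict:
    """Derive a layer layout from a WRN-depth-k style name.

    Assumption (labeled, not taken from the abstract): depth = 6n + 4,
    where n is the number of basic residual blocks in each of 3 groups,
    and each group's base width (16, 32, 64) is multiplied by widen_factor.
    """
    if depth < 10 or (depth - 4) % 6 != 0:
        raise ValueError("depth must satisfy depth = 6n + 4 with n >= 1")
    if widen_factor < 1:
        raise ValueError("widen_factor must be a positive integer")

    blocks_per_group = (depth - 4) // 6
    base_widths = (16, 32, 64)  # widths of a thin network (k = 1)
    return {
        "stem_channels": 16,  # initial convolution is not widened
        "blocks_per_group": blocks_per_group,
        "group_channels": [w * widen_factor for w in base_widths],
    }


if __name__ == "__main__":
    # A shallow, wide configuration versus a deeper, thin one.
    print(wrn_config(16, 8))
    print(wrn_config(40, 1))
```

Under these assumptions, the 16-layer network mentioned in the abstract has two blocks per group, and a widening factor of 8 would give group widths of 128, 256, and 512 channels.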

Citations

•Posted Content

On Lazy Training in Differentiable Programming.

Lénaïc Chizat, +2 more

- 19 Dec 2018

- arXiv: Optimization and Control

TL;DR: In this article, the authors show that the lazy training phenomenon is not specific to over-parameterized neural networks, and is due to a choice of scaling, often implicit, that makes the model behave as its linearization around the initialization, thus yielding a model equivalent to learning with positive-definite kernels.

219

•Proceedings Article•10.1109/WACV.2018.00068

Wide-Slice Residual Networks for Food Recognition

Niki Martinel, +2 more

- 12 Mar 2018

TL;DR: In this paper, a slice convolution block is introduced to capture vertical food traits that are common to a large number of categories (i.e., 15% of the whole data in current datasets).

218

•Proceedings Article•10.1109/CVPR.2019.01167

Regularizing Activation Distribution for Training Binarized Deep Networks

Ruizhou Ding, +3 more

- 01 Jun 2019

TL;DR: The experiments show that the distribution loss consistently improves the accuracy of BNNs without sacrificing their energy benefits. With the proposed regularization, BNN training is also shown to be robust to the choice of hyper-parameters, including the optimizer and learning rate.

217

•Journal Article•10.1109/ACCESS.2021.3084358

A Survey on Semi-, Self- and Unsupervised Learning for Image Classification

Lars Schmarje, +3 more

- 27 May 2021

- IEEE Access

TL;DR: In this article, the authors provide an overview of often used ideas and methods in image classification with fewer labels and compare 34 methods in detail based on their performance and their commonly used ideas rather than a fine-grained taxonomy.

216

•Journal Article•10.3390/APP10134523

Towards a Better Understanding of Transfer Learning for Medical Imaging: A Case Study

Laith Alzubaidi, +6 more

- 29 Jun 2020

- Applied Sciences

TL;DR: This paper designs a deep convolutional neural network (DCNN) model that integrates three ideas: traditional and parallel convolutional layers, residual connections, and global average pooling. It reports that performance can improve significantly when a reduced number of images from the same domain as the target dataset is used.

214

Related Papers (5)

Deep Residual Learning for Image Recognition

ImageNet Classification with Deep Convolutional Neural Networks

Densely Connected Convolutional Networks

ImageNet: A large-scale hierarchical image database

Going deeper with convolutions