Normalization (statistics)

Papers published on a yearly basis

Papers

Proceedings Article•

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

Sergey Ioffe¹, Christian Szegedy¹•Institutions (1)

Google¹

6 Jul 2015

TL;DR: Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin.

Abstract: Training Deep Neural Networks is complicated by the fact that the distribution of each layer's inputs changes during training, as the parameters of the previous layers change. This slows down the training by requiring lower learning rates and careful parameter initialization, and makes it notoriously hard to train models with saturating nonlinearities. We refer to this phenomenon as internal covariate shift, and address the problem by normalizing layer inputs. Our method draws its strength from making normalization a part of the model architecture and performing the normalization for each training mini-batch. Batch Normalization allows us to use much higher learning rates and be less careful about initialization, and in some cases eliminates the need for Dropout. Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin. Using an ensemble of batch-normalized networks, we improve upon the best published result on ImageNet classification: reaching 4.82% top-5 test error, exceeding the accuracy of human raters.
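
As a rough illustration of the per-mini-batch normalization this abstract describes, the NumPy sketch below standardizes each feature over the batch and then applies a scale and shift. The function name `batch_norm_forward` and the parameters `gamma`, `beta`, and `eps` are assumptions made for illustration and do not come from this listing; the running statistics used at inference time are omitted.

```python
import numpy as np

def batch_norm_forward(x, gamma, beta, eps=1e-5):
    """Normalize a mini-batch x of shape (batch, features), feature by feature.

    gamma and beta are the (assumed) learnable scale and shift; eps guards
    against division by zero for near-constant features.
    """
    mean = x.mean(axis=0)                     # per-feature mean over the batch
    var = x.var(axis=0)                       # per-feature (biased) variance
    x_hat = (x - mean) / np.sqrt(var + eps)   # zero mean, roughly unit variance
    return gamma * x_hat + beta

# Toy mini-batch with a shifted, scaled input distribution (synthetic data).
rng = np.random.default_rng(0)
x = rng.normal(loc=3.0, scale=2.0, size=(64, 4))
y = batch_norm_forward(x, gamma=np.ones(4), beta=np.zeros(4))
```

With `gamma` set to ones and `beta` to zeros, each column of `y` has approximately zero mean and unit standard deviation regardless of the input's original location and scale.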

43,706 citations

Proceedings Article•10.1109/CVPR.2005.177•

Histograms of oriented gradients for human detection

Navneet Dalal¹, Bill Triggs¹•Institutions (1)

French Institute for Research in Computer Science and Automation¹

20 Jun 2005

TL;DR: It is shown experimentally that grids of histograms of oriented gradient (HOG) descriptors significantly outperform existing feature sets for human detection, and the influence of each stage of the computation on performance is studied.

Abstract: We study the question of feature sets for robust visual object recognition; adopting linear SVM based human detection as a test case. After reviewing existing edge and gradient based descriptors, we show experimentally that grids of histograms of oriented gradient (HOG) descriptors significantly outperform existing feature sets for human detection. We study the influence of each stage of the computation on performance, concluding that fine-scale gradients, fine orientation binning, relatively coarse spatial binning, and high-quality local contrast normalization in overlapping descriptor blocks are all important for good results. The new approach gives near-perfect separation on the original MIT pedestrian database, so we introduce a more challenging dataset containing over 1800 annotated human images with a large range of pose variations and backgrounds.

36,789 citations

Journal Article•10.1186/GB-2002-3-7-RESEARCH0034•

Accurate normalization of real-time quantitative RT-PCR data by geometric averaging of multiple internal control genes

Jo Vandesompele¹, Katleen De Preter¹, Filip Pattyn¹, Bruce Poppe¹, Nadine Van Roy¹, Anne De Paepe¹, Franki Speleman¹•Institutions (1)

Ghent University Hospital¹

18 Jun 2002-Genome Biology

TL;DR: The normalization strategy presented here is a prerequisite for accurate RT-PCR expression profiling, which opens up the possibility of studying the biological relevance of small expression differences.

Abstract: Gene-expression analysis is increasingly important in biological research, with real-time reverse transcription PCR (RT-PCR) becoming the method of choice for high-throughput and accurate expression profiling of selected genes. Given the increased sensitivity, reproducibility and large dynamic range of this methodology, the requirements for a proper internal control gene for normalization have become increasingly stringent. Although housekeeping gene expression has been reported to vary considerably, no systematic survey has properly determined the errors related to the common practice of using only one control gene, nor presented an adequate way of working around this problem. We outline a robust and innovative strategy to identify the most stably expressed control genes in a given set of tissues, and to determine the minimum number of genes required to calculate a reliable normalization factor. We have evaluated ten housekeeping genes from different abundance and functional classes in various human tissues, and demonstrated that the conventional use of a single gene for normalization leads to relatively large errors in a significant proportion of samples tested. The geometric mean of multiple carefully selected housekeeping genes was validated as an accurate normalization factor by analyzing publicly available microarray data. The normalization strategy presented here is a prerequisite for accurate RT-PCR expression profiling, which, among other things, opens up the possibility of studying the biological relevance of small expression differences.
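
The normalization factor described in this abstract, a geometric mean over several control genes, can be sketched as follows. The function name `normalization_factors` and the expression values are hypothetical toy numbers chosen for illustration, not data from the paper; the step of selecting the most stably expressed genes is not shown.

```python
import numpy as np

def normalization_factors(control_expr):
    """Geometric mean of control-gene quantities for each sample.

    control_expr: array of shape (samples, control_genes) holding positive
    relative expression quantities. Computed in log space for stability.
    """
    return np.exp(np.log(control_expr).mean(axis=1))

# Hypothetical quantities: 2 samples x 3 control genes. Sample 2 has
# twice the input amount of sample 1 for every gene.
controls = np.array([[1.0, 4.0, 2.0],
                     [2.0, 8.0, 4.0]])
target = np.array([10.0, 20.0])   # raw quantities of a gene of interest

nf = normalization_factors(controls)
normalized = target / nf          # target expression relative to the controls
```

In this toy case the raw target values differ by a factor of two, but after division by the per-sample factor they coincide, which is the behavior expected when the difference is due only to the amount of input material.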

20,425 citations

Reference Entry•10.1002/0470013192.BSA501•

Principal Component Analysis

Ian T. Jolliffe¹•Institutions (1)

University of Aberdeen¹

15 Oct 2005

TL;DR: Principal component analysis (PCA) replaces the p original variables by a smaller number, q, of derived variables, the principal components, which are linear combinations of the original variables.

Abstract: When large multivariate datasets are analyzed, it is often desirable to reduce their dimensionality. Principal component analysis is one technique for doing this. It replaces the p original variables by a smaller number, q, of derived variables, the principal components, which are linear combinations of the original variables. Often, it is possible to retain most of the variability in the original variables with q very much smaller than p. Despite its apparent simplicity, principal component analysis has a number of subtleties, and it has many uses and extensions. A number of choices associated with the technique are briefly discussed, namely, covariance or correlation, how many components, and different normalization constraints, as well as confusion with factor analysis. Various uses and extensions are outlined. Keywords: dimension reduction; factor analysis; multivariate analysis; variance maximization
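
One of the choices this abstract mentions, covariance versus correlation, amounts to deciding whether to standardize each variable before extracting components. The sketch below is a minimal eigendecomposition-based version under that reading; the function name `pca`, the flag `use_correlation`, and the synthetic data are assumptions for illustration only.

```python
import numpy as np

def pca(X, q, use_correlation=False):
    """Reduce X of shape (n, p) to q principal components.

    With use_correlation=True each variable is scaled to unit variance
    first, which is equivalent to decomposing the correlation matrix.
    """
    Xc = X - X.mean(axis=0)
    if use_correlation:
        Xc = Xc / Xc.std(axis=0, ddof=1)
    cov = np.cov(Xc, rowvar=False)            # (p, p) covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)    # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:q]     # indices of the q largest
    components = eigvecs[:, order]            # (p, q) loading vectors
    scores = Xc @ components                  # (n, q) derived variables
    explained = eigvals[order].sum() / eigvals.sum()
    return scores, components, explained

# Synthetic data: the first variable has a much larger scale than the rest.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5)) * np.array([10.0, 1.0, 1.0, 1.0, 1.0])
scores, components, explained = pca(X, q=2)
scores_r, components_r, explained_r = pca(X, q=2, use_correlation=True)
```

On data like this, the covariance-based components are dominated by the large-scale variable, while the correlation-based version treats all variables on an equal footing, so the retained share of variance differs between the two calls.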

15,111 citations

Journal Article•10.1093/BIOINFORMATICS/19.2.185•

A comparison of normalization methods for high density oligonucleotide array data based on variance and bias

Benjamin M. Bolstad¹, Rafael A. Irizarry², Magnus Åstrand³, Terence P. Speed¹,⁴•Institutions (4)

University of California, Berkeley¹, Johns Hopkins University², AstraZeneca³, Walter and Eliza Hall Institute of Medical Research⁴

22 Jan 2003-Bioinformatics

TL;DR: Three complete data methods of performing normalization at the probe intensity level are presented and compared, on the variability and bias of an expression measure, with two methods that use a baseline array; the simplest and quickest complete data method is found to perform favorably.

Abstract: Motivation: When running experiments that involve multiple high density oligonucleotide arrays, it is important to remove sources of variation between arrays of non-biological origin. Normalization is a process for reducing this variation. It is common to see non-linear relations between arrays and the standard normalization provided by Affymetrix does not perform well in these situations. Results: We present three methods of performing normalization at the probe intensity level. These methods are called complete data methods because they make use of data from all arrays in an experiment to form the normalizing relation. These algorithms are compared to two methods that make use of a baseline array: a one number scaling based algorithm and a method that uses a non-linear normalizing relation by comparing the variability and bias of an expression measure. Two publicly available datasets are used to carry out the comparisons. The simplest and quickest complete data method is found to perform favorably. Availability: Software implementing all three of the complete data normalization methods is available as part of the R package Affy, which is a part of the Bioconductor project http://www.bioconductor.org. Contact: bolstad@stat.berkeley.edu Supplementary information: Additional figures may be found at http://www.stat.berkeley.edu/~bolstad/normalize/index.html
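
The abstract does not name its complete data methods; in the paper itself, the simplest of the three is quantile normalization, which uses every array to build a shared reference distribution. The sketch below is a minimal version of that idea, assuming no tied values within an array; the function name `quantile_normalize` and the intensity values are hypothetical, and this is not the Affy package's implementation.

```python
import numpy as np

def quantile_normalize(X):
    """Give every array (column) of X the same empirical distribution.

    X: array of shape (probes, arrays). Each column is sorted, the sorted
    columns are averaged row by row to form a reference distribution, and
    each original value is replaced by the reference value of its rank.
    """
    order = np.argsort(X, axis=0)                  # rank order within each array
    mean_quantiles = np.sort(X, axis=0).mean(axis=1)
    out = np.empty_like(X, dtype=float)
    for j in range(X.shape[1]):
        out[order[:, j], j] = mean_quantiles       # k-th smallest gets k-th mean
    return out

# Hypothetical probe intensities: 4 probes x 3 arrays.
X = np.array([[5.0, 4.0, 3.0],
              [2.0, 1.0, 4.0],
              [3.0, 6.0, 8.0],
              [4.0, 2.0, 6.0]])
Xn = quantile_normalize(X)
```

After the call, all columns of `Xn` contain the same set of values while each probe keeps its rank within its own array, so between-array differences in overall intensity distribution are removed without a baseline array.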

9,065 citations

Year	Papers
2022	9
2021	767
2020	846
2019	850
2018	727
2017	584

Related Topics (5)

Performance Metrics