Scispace (Formerly Typeset)
Showing papers on "Normalization (statistics)" published in 2016
Posted Content•
Layer Normalization

Jimmy Ba, Jamie Ryan Kiros, Geoffrey E. Hinton
21 Jul 2016-arXiv: Machine Learning
TL;DR: In this paper, layer normalization is applied to recurrent neural networks by computing the mean and variance used for normalization from all of the summed inputs to the neurons in a layer on a single training case.
Abstract: Training state-of-the-art, deep neural networks is computationally expensive. One way to reduce the training time is to normalize the activities of the neurons. A recently introduced technique called batch normalization uses the distribution of the summed input to a neuron over a mini-batch of training cases to compute a mean and variance which are then used to normalize the summed input to that neuron on each training case. This significantly reduces the training time in feed-forward neural networks. However, the effect of batch normalization is dependent on the mini-batch size and it is not obvious how to apply it to recurrent neural networks. In this paper, we transpose batch normalization into layer normalization by computing the mean and variance used for normalization from all of the summed inputs to the neurons in a layer on a single training case. Like batch normalization, we also give each neuron its own adaptive bias and gain which are applied after the normalization but before the non-linearity. Unlike batch normalization, layer normalization performs exactly the same computation at training and test times. It is also straightforward to apply to recurrent neural networks by computing the normalization statistics separately at each time step. Layer normalization is very effective at stabilizing the hidden state dynamics in recurrent networks. Empirically, we show that layer normalization can substantially reduce the training time compared with previously published techniques.

7,174 citations
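The computation described in the Layer Normalization abstract can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: the function name `layer_norm`, the `eps` stabilizer, and the array shapes are assumptions made for the example.

```python
import numpy as np

def layer_norm(a, gain, bias, eps=1e-5):
    """Normalize summed inputs `a` of shape (batch, hidden).

    Mean and variance are computed over the hidden units of each
    training case separately, so no batch statistics are involved.
    `eps` is an assumed small constant for numerical stability.
    """
    mu = a.mean(axis=-1, keepdims=True)
    var = a.var(axis=-1, keepdims=True)
    # Per-neuron gain and bias are applied after normalization,
    # before any non-linearity.
    return gain * (a - mu) / np.sqrt(var + eps) + bias

rng = np.random.default_rng(0)
a = rng.normal(loc=3.0, scale=2.0, size=(4, 8))
out = layer_norm(a, gain=np.ones(8), bias=np.zeros(8))
```

Because the statistics depend only on a single case, normalizing one row on its own gives the same result as normalizing it inside a batch, which is why the same computation can be used at training and test time.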

Posted Content•
Instance Normalization: The Missing Ingredient for Fast Stylization.

Dmitry Ulyanov, Andrea Vedaldi, Victor Lempitsky
27 Jul 2016-arXiv: Computer Vision and Pattern Recognition
TL;DR: A small change in the stylization architecture results in a significant qualitative improvement in the generated images, and can be used to train high-performance architectures for real-time image generation.
Abstract: In this paper we revisit the fast stylization method introduced in Ulyanov et al. (2016). We show how a small change in the stylization architecture results in a significant qualitative improvement in the generated images. The change is limited to swapping batch normalization with instance normalization, and applying the latter at both training and testing times. The resulting method can be used to train high-performance architectures for real-time image generation. The code is made available on GitHub at this https URL. Full paper can be found at arXiv:1701.02096.

4,533 citations
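The swap described in the Instance Normalization abstract comes down to which axes the statistics are taken over. The sketch below assumes image tensors laid out as (batch, channels, height, width); the function names and the `eps` constant are illustrative assumptions, and learnable scale and shift parameters are omitted.

```python
import numpy as np

def batch_norm_train(x, eps=1e-5):
    # Statistics shared across the batch: one mean/variance per channel.
    mu = x.mean(axis=(0, 2, 3), keepdims=True)
    var = x.var(axis=(0, 2, 3), keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

def instance_norm(x, eps=1e-5):
    # Statistics per image and per channel: spatial axes only.
    mu = x.mean(axis=(2, 3), keepdims=True)
    var = x.var(axis=(2, 3), keepdims=True)
    return (x - mu) / np.sqrt(var + eps)

rng = np.random.default_rng(0)
x = rng.normal(size=(3, 2, 5, 5)) * np.array([1.0, 4.0, 9.0]).reshape(3, 1, 1, 1)
y_in = instance_norm(x)
y_bn = batch_norm_train(x)
```

With instance normalization, the output for one image does not depend on the other images in the batch, so the same operation can be applied at training and testing times.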

Journal Article•10.1109/TIP.2017.2662206•
Beyond a Gaussian Denoiser: Residual Learning of Deep CNN for Image Denoising

Kai Zhang1, Wangmeng Zuo1, Yunjin Chen, Deyu Meng2, Lei Zhang3 •
Harbin Institute of Technology1, Xi'an Jiaotong University2, Hong Kong Polytechnic University3
13 Aug 2016-arXiv: Computer Vision and Pattern Recognition
TL;DR: The authors propose a denoising convolutional neural network (DnCNN) that handles Gaussian denoising with unknown noise level; with residual learning, it implicitly removes the latent clean image in the hidden layers.
Abstract: Discriminative model learning for image denoising has been recently attracting considerable attention due to its favorable denoising performance. In this paper, we take one step forward by investigating the construction of feed-forward denoising convolutional neural networks (DnCNNs) to embrace the progress in very deep architecture, learning algorithm, and regularization method into image denoising. Specifically, residual learning and batch normalization are utilized to speed up the training process as well as boost the denoising performance. Different from the existing discriminative denoising models which usually train a specific model for additive white Gaussian noise (AWGN) at a certain noise level, our DnCNN model is able to handle Gaussian denoising with unknown noise level (i.e., blind Gaussian denoising). With the residual learning strategy, DnCNN implicitly removes the latent clean image in the hidden layers. This property motivates us to train a single DnCNN model to tackle several general image denoising tasks such as Gaussian denoising, single image super-resolution and JPEG image deblocking. Our extensive experiments demonstrate that our DnCNN model can not only exhibit high effectiveness in several general image denoising tasks, but also be efficiently implemented by benefiting from GPU computing.

3,257 citations

Posted Content•
Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks

Tim Salimans1, Diederik P. Kingma1•
OpenAI1
25 Feb 2016-arXiv: Learning
TL;DR: Weight normalization reparameterizes the weight vectors in a neural network to decouple the length of those weight vectors from their direction, improving the conditioning of the optimization problem and speeding up convergence of stochastic gradient descent.
Abstract: We present weight normalization: a reparameterization of the weight vectors in a neural network that decouples the length of those weight vectors from their direction. By reparameterizing the weights in this way we improve the conditioning of the optimization problem and we speed up convergence of stochastic gradient descent. Our reparameterization is inspired by batch normalization but does not introduce any dependencies between the examples in a minibatch. This means that our method can also be applied successfully to recurrent models such as LSTMs and to noise-sensitive applications such as deep reinforcement learning or generative models, for which batch normalization is less well suited. Although our method is much simpler, it still provides much of the speed-up of full batch normalization. In addition, the computational overhead of our method is lower, permitting more optimization steps to be taken in the same amount of time. We demonstrate the usefulness of our method on applications in supervised image recognition, generative modelling, and deep reinforcement learning.

1,345 citations
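The reparameterization described in the Weight Normalization abstract can be illustrated as follows. The names `v` (direction parameter) and `g` (length parameter) and the helper `weight_norm` are assumptions for this sketch; the abstract itself only states that length and direction are decoupled.

```python
import numpy as np

def weight_norm(v, g):
    """Build a weight vector whose length is g and whose direction is v / ||v||."""
    return g * v / np.linalg.norm(v)

v = np.array([3.0, 4.0])   # direction parameter, ||v|| = 5
g = 2.0                    # length parameter
w = weight_norm(v, g)      # -> [1.2, 1.6]
```

Rescaling `v` leaves `w` unchanged, so the norm of the effective weight vector is controlled by `g` alone. No statistic of the minibatch enters the computation.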

Journal Article•10.1186/S13059-016-0947-7•
Pooling across cells to normalize single-cell RNA sequencing data with many zero counts

Aaron T. L. Lun1, Karsten Bach2, John C. Marioni1,2,3 •
University of Cambridge1, European Bioinformatics Institute2, Wellcome Trust Sanger Institute3
27 Apr 2016-Genome Biology
TL;DR: This work presents a novel approach where expression values are summed across pools of cells, and the summed values are used for normalization, which outperforms existing methods for accurate normalization of cell-specific biases in simulated data.
Abstract: Normalization of single-cell RNA sequencing data is necessary to eliminate cell-specific biases prior to downstream analyses. However, this is not straightforward for noisy single-cell data where many counts are zero. We present a novel approach where expression values are summed across pools of cells, and the summed values are used for normalization. Pool-based size factors are then deconvolved to yield cell-based factors. Our deconvolution approach outperforms existing methods for accurate normalization of cell-specific biases in simulated data. Similar behavior is observed in real data, where deconvolution improves the relevance of results of downstream analyses.

1,269 citations
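The deconvolution step in the pooling abstract can be pictured as solving a linear system in which each pool-based factor is the sum of the factors of the cells in that pool. The sketch below is an idealized illustration under that assumption: pool factors are synthesized directly from known cell factors rather than estimated from counts, and the sliding-window pool layout and the names `pool_matrix`, `true_factors`, and `cell_factors` are hypothetical.

```python
import numpy as np

def pool_matrix(n_cells, sizes):
    """Indicator matrix: one row per pool, 1 where a cell belongs to the pool.

    Pools are sliding windows over cells arranged on a ring, one window
    per start position and per pool size.
    """
    rows = []
    for size in sizes:
        for start in range(n_cells):
            row = np.zeros(n_cells)
            row[[(start + k) % n_cells for k in range(size)]] = 1.0
            rows.append(row)
    return np.array(rows)

true_factors = np.array([0.5, 0.8, 1.0, 1.2, 1.5, 2.0])
A = pool_matrix(len(true_factors), sizes=(2, 3))

# Idealized pool-based factors: the exact sum of the member cells' factors.
pool_factors = A @ true_factors

# Deconvolve pool-based factors into cell-based factors by least squares.
cell_factors, *_ = np.linalg.lstsq(A, pool_factors, rcond=None)
```

Using more than one pool size gives an overdetermined system, so each cell factor is constrained by several pools. With real data the pool factors are noisy estimates and the recovered cell factors are approximate.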

Proceedings Article•
Weight normalization: a simple reparameterization to accelerate training of deep neural networks

Tim Salimans1, Diederik P. Kingma1•
OpenAI1
5 Dec 2016
TL;DR: A reparameterization of the weight vectors in a neural network that decouples the length of those weight vectors from their direction is presented, improving the conditioning of the optimization problem and speeding up convergence of stochastic gradient descent.
Abstract: We present weight normalization: a reparameterization of the weight vectors in a neural network that decouples the length of those weight vectors from their direction. By reparameterizing the weights in this way we improve the conditioning of the optimization problem and we speed up convergence of stochastic gradient descent. Our reparameterization is inspired by batch normalization but does not introduce any dependencies between the examples in a minibatch. This means that our method can also be applied successfully to recurrent models such as LSTMs and to noise-sensitive applications such as deep reinforcement learning or generative models, for which batch normalization is less well suited. Although our method is much simpler, it still provides much of the speed-up of full batch normalization. In addition, the computational overhead of our method is lower, permitting more optimization steps to be taken in the same amount of time. We demonstrate the usefulness of our method on applications in supervised image recognition, generative modelling, and deep reinforcement learning.

795 citations

Posted Content•
Revisiting Batch Normalization For Practical Domain Adaptation

Yanghao Li1, Naiyan Wang2, Jianping Shi3, Jiaying Liu1, Xiaodi Hou4 •
Peking University1, Hong Kong University of Science and Technology2, The Chinese University of Hong Kong3, California Institute of Technology4
15 Mar 2016-arXiv: Computer Vision and Pattern Recognition
TL;DR: This paper proposes a simple yet powerful remedy, called Adaptive Batch Normalization (AdaBN) to increase the generalization ability of a DNN, and demonstrates that the method is complementary with other existing methods and may further improve model performance.
Abstract: Deep neural networks (DNN) have shown unprecedented success in various computer vision applications such as image classification and object detection. However, it is still a common annoyance during the training phase that one has to prepare at least thousands of labeled images to fine-tune a network to a specific domain. A recent study (Tommasi et al. 2015) shows that a DNN has a strong dependency on the training dataset, and the learned features cannot be easily transferred to a different but relevant task without fine-tuning. In this paper, we propose a simple yet powerful remedy, called Adaptive Batch Normalization (AdaBN), to increase the generalization ability of a DNN. By modulating the statistics in all Batch Normalization layers across the network, our approach achieves a deep adaptation effect for domain adaptation tasks. In contrast to other deep learning domain adaptation methods, our method does not require additional components, and is parameter-free. It achieves state-of-the-art performance despite its surprising simplicity. Furthermore, we demonstrate that our method is complementary with other existing methods. Combining AdaBN with existing domain adaptation treatments may further improve model performance.

622 citations
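The idea of modulating Batch Normalization statistics for a new domain can be sketched for a single layer. This is a simplified illustration that assumes the adaptation consists of replacing source-domain statistics with statistics computed on target-domain features; the function names, the synthetic data, and the single-layer setup are assumptions, not the authors' implementation.

```python
import numpy as np

def domain_stats(features):
    """Per-feature mean and variance over all samples of one domain."""
    return features.mean(axis=0), features.var(axis=0)

def batch_norm_infer(features, mean, var, gamma, beta, eps=1e-5):
    """Inference-time batch normalization with fixed statistics."""
    return gamma * (features - mean) / np.sqrt(var + eps) + beta

rng = np.random.default_rng(0)
source = rng.normal(0.0, 1.0, size=(1000, 4))   # features seen in training
target = rng.normal(5.0, 3.0, size=(1000, 4))   # shifted target domain
gamma, beta = np.ones(4), np.zeros(4)           # learned parameters stay fixed

src_mean, src_var = domain_stats(source)
tgt_mean, tgt_var = domain_stats(target)

unadapted = batch_norm_infer(target, src_mean, src_var, gamma, beta)
adapted = batch_norm_infer(target, tgt_mean, tgt_var, gamma, beta)
```

Only the stored statistics change; `gamma` and `beta` are reused as trained, which matches the abstract's description of the method as parameter-free.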

Journal Article•10.1016/J.NEUROIMAGE.2015.12.012•
Reliability of dissimilarity measures for multi-voxel pattern analysis.

Alexander Walther1,2, Hamed Nili3, Naveed Ejaz1, Arjen Alink2, Nikolaus Kriegeskorte2, Jörn Diedrichsen1 •
University College London1, Cognition and Brain Sciences Unit2, University of Oxford3
15 Aug 2016-NeuroImage
TL;DR: In this paper, the authors compare the reliability of three classes of dissimilarity measures: classification accuracy, Euclidean/Mahalanobis distance, and Pearson correlation distance, using simulations and four real functional magnetic resonance imaging (fMRI) datasets.

558 citations

Proceedings Article•
Recurrent Batch Normalization

Tim Cooijmans1, Nicolas Ballas1, César Laurent1, Caglar Gulcehre1, Aaron Courville1 •
Université de Montréal1
30 Mar 2016
TL;DR: A reparameterization of LSTM that brings the benefits of batch normalization to recurrent neural networks is proposed. Whereas previous work applies batch normalization only to the input-to-hidden transformation of RNNs, the authors demonstrate that it is both possible and beneficial to batch-normalize the hidden-to-hidden transition.
Abstract: We propose a reparameterization of LSTM that brings the benefits of batch normalization to recurrent neural networks. Whereas previous works only apply batch normalization to the input-to-hidden transformation of RNNs, we demonstrate that it is both possible and beneficial to batch-normalize the hidden-to-hidden transition, thereby reducing internal covariate shift between time steps. We evaluate our proposal on various sequential problems such as sequence classification, language modeling and question answering. Our empirical results show that our batch-normalized LSTM consistently leads to faster convergence and improved generalization.

335 citations

Journal Article•10.1093/BIOINFORMATICS/BTW343•
TaggerOne: joint named entity recognition and normalization with semi-Markov Models

Robert Leaman, Zhiyong Lu
15 Sep 2016-Bioinformatics
TL;DR: This work proposes the first machine learning model for joint NER and normalization during both training and prediction, which is trainable for arbitrary entity types and consists of a semi-Markov structured linear classifier, with a rich feature approach for NER and supervised semantic indexing for normalization.
Abstract: Motivation: Text mining is increasingly used to manage the accelerating pace of the biomedical literature. Many text mining applications depend on accurate named entity recognition (NER) and normalization (grounding). While high performing machine learning methods trainable for many entity types exist for NER, normalization methods are usually specialized to a single entity type. NER and normalization systems are also typically used in a serial pipeline, causing cascading errors and limiting the ability of the NER system to directly exploit the lexical information provided by the normalization. Methods: We propose the first machine learning model for joint NER and normalization during both training and prediction. The model is trainable for arbitrary entity types and consists of a semi-Markov structured linear classifier, with a rich feature approach for NER and supervised semantic indexing for normalization. We also introduce TaggerOne, a Java implementation of our model as a general toolkit for joint NER and normalization. TaggerOne is not specific to any entity type, requiring only annotated training data and a corresponding lexicon, and has been optimized for high throughput. Results: We validated TaggerOne with multiple gold-standard corpora containing both mention- and concept-level annotations. Benchmarking results show that TaggerOne achieves high performance on diseases (NCBI Disease corpus, NER f-score: 0.829, normalization f-score: 0.807) and chemicals (BioCreative 5 CDR corpus, NER f-score: 0.914, normalization f-score 0.895). These results compare favorably to the previous state of the art, notwithstanding the greater flexibility of the model. We conclude that jointly modeling NER and normalization greatly improves performance. 
Availability and Implementation: The TaggerOne source code and an online demonstration are available at: http://www.ncbi.nlm.nih.gov/bionlp/taggerone Contact: zhiyong.lu@nih.gov Supplementary information: Supplementary data are available at Bioinformatics online.

322 citations

Journal Article•10.1093/BIB/BBW095•
A systematic evaluation of normalization methods in quantitative label-free proteomics

Tommi Välikangas, Tomi Suomi, Laura L. Elo1•
University of Turku1
02 Oct 2016-Briefings in Bioinformatics
TL;DR: It is found that variance stabilization normalization (Vsn) reduced variation the most between technical replicates in all examined data sets and performed consistently well in the differential expression analysis.
Abstract: To date, mass spectrometry (MS) data remain inherently biased as a result of reasons ranging from sample handling to differences caused by the instrumentation. Normalization is the process that aims to account for the bias and make samples more comparable. The selection of a proper normalization method is a pivotal task for the reliability of the downstream analysis and results. Many normalization methods commonly used in proteomics have been adapted from the DNA microarray techniques. Previous studies comparing normalization methods in proteomics have focused mainly on intragroup variation. In this study, several popular and widely used normalization methods representing different strategies in normalization are evaluated using three spike-in and one experimental mouse label-free proteomic data sets. The normalization methods are evaluated in terms of their ability to reduce variation between technical replicates, their effect on differential expression analysis and their effect on the estimation of logarithmic fold changes. Additionally, we examined whether normalizing the whole data globally or in segments for the differential expression analysis has an effect on the performance of the normalization methods. We found that variance stabilization normalization (Vsn) reduced variation the most between technical replicates in all examined data sets. Vsn also performed consistently well in the differential expression analysis. Linear regression normalization and local regression normalization performed also systematically well. Finally, we discuss the choice of a normalization method and some qualities of a suitable normalization method in the light of the results of our evaluation.
Proceedings Article•10.1109/ICASSP.2016.7472159•
Batch normalized recurrent neural networks

César Laurent1, Gabriel Pereyra2, Philemon Brakel1, Ying Zhang1, Yoshua Bengio1 •
Université de Montréal1, University of Southern California2
20 Mar 2016
TL;DR: This paper investigates how batch normalization can be applied to RNNs and shows that the way it is applied leads to a faster convergence of the training criterion but doesn't seem to improve the generalization performance.
Abstract: Recurrent Neural Networks (RNNs) are powerful models for sequential data that have the potential to learn long-term dependencies. However, they are computationally expensive to train and difficult to parallelize. Recent work has shown that normalizing intermediate representations of neural networks can significantly improve convergence rates in feed-forward neural networks [1]. In particular, batch normalization, which uses mini-batch statistics to standardize features, was shown to significantly reduce training time. In this paper, we investigate how batch normalization can be applied to RNNs. We show for both a speech recognition task and language modeling that the way we apply batch normalization leads to a faster convergence of the training criterion but doesn't seem to improve the generalization performance.
Proceedings Article•
Density Modeling of Images using a Generalized Normalization Transformation

Johannes Ballé1, Valero Laparra1, Eero P. Simoncelli1•
New York University1
1 Jan 2016
TL;DR: In this paper, a parametric nonlinear transformation is proposed for Gaussianizing data from natural images, where each component is normalized by a pooled activity measure, computed by exponentiating a weighted sum of rectified and exponentiated components and an additive constant.
Abstract: We introduce a parametric nonlinear transformation that is well-suited for Gaussianizing data from natural images. After a linear transformation of the data, each component is normalized by a pooled activity measure, computed by exponentiating a weighted sum of rectified and exponentiated components and an additive constant. We optimize the parameters of this transformation (linear transform, exponents, weights, constant) over a database of natural images, directly minimizing the negentropy of the responses. We find that the optimized transformation successfully Gaussianizes the data, achieving a significantly smaller mutual information between transformed components than previous methods including ICA and radial Gaussianization. The transformation is differentiable and can be efficiently inverted, and thus induces a density model on images. We show that samples of this model are visually similar to samples of natural image patches. We also demonstrate the use of the model as a prior density in removing additive noise. Finally, we show that the transformation can be cascaded, with each layer optimized (unsupervised) using the same Gaussianization objective, to capture additional probabilistic structure.
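The verbal description of the normalization transformation above can be written as y_i = z_i / (beta_i + sum_j gamma_ij * |z_j|^alpha_ij)^eps_i. The sketch below is one reading of that sentence, applied after the linear transform; the parameter names, shapes, and example values are assumptions for illustration and are not taken from the authors' code.

```python
import numpy as np

def generalized_normalization(z, beta, gamma, alpha, eps):
    """Divide each component by a pooled activity measure.

    z:     (d,)   linearly transformed data
    beta:  (d,)   additive constants
    gamma: (d, d) pooling weights
    alpha: (d, d) exponents applied to the rectified components
    eps:   (d,)   exponents applied to the pooled sum
    """
    # powered[i, j] = |z_j| ** alpha[i, j]
    powered = np.abs(z)[None, :] ** alpha
    pooled = beta + (gamma * powered).sum(axis=1)
    return z / pooled ** eps

z = np.array([3.0, 4.0])
y = generalized_normalization(
    z,
    beta=np.ones(2),
    gamma=np.ones((2, 2)),
    alpha=np.full((2, 2), 2.0),
    eps=np.full(2, 0.5),
)
# Pooled measure for both components: 1 + 3**2 + 4**2 = 26, so y = z / sqrt(26).
```

In the paper the parameters are optimized over a database of natural images; here they are fixed by hand so the arithmetic can be checked.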
Journal Article•10.1016/J.CHROMA.2015.12.007•
Sample normalization methods in quantitative metabolomics.

Yiman Wu1, Liang Li1•
University of Alberta1
22 Jan 2016-Journal of Chromatography A
TL;DR: This review describes the importance of sample normalization in the analytical workflow with a focus on mass spectrometry (MS)-based platforms, discusses a number of methods recently reported in the literature, and comments on their applicability in real-world metabolomics applications.
Proceedings Article•
Revisiting Batch Normalization For Practical Domain Adaptation

Yanghao Li1, Naiyan Wang2, Jianping Shi3, Jiaying Liu1, Xiaodi Hou4 •
Peking University1, Hong Kong University of Science and Technology2, The Chinese University of Hong Kong3, California Institute of Technology4
15 Mar 2016
TL;DR: Adaptive batch normalization (AdaBN) modulates the statistics in all Batch Normalization layers across the network to increase the generalization ability of a DNN and achieves a deep adaptation effect for domain adaptation tasks.
Abstract: Deep neural networks (DNN) have shown unprecedented success in various computer vision applications such as image classification and object detection. However, it is still a common annoyance during the training phase that one has to prepare at least thousands of labeled images to fine-tune a network to a specific domain. A recent study (Tommasi et al. 2015) shows that a DNN has a strong dependency on the training dataset, and the learned features cannot be easily transferred to a different but relevant task without fine-tuning. In this paper, we propose a simple yet powerful remedy, called Adaptive Batch Normalization (AdaBN), to increase the generalization ability of a DNN. By modulating the statistics in all Batch Normalization layers across the network, our approach achieves a deep adaptation effect for domain adaptation tasks. In contrast to other deep learning domain adaptation methods, our method does not require additional components, and is parameter-free. It achieves state-of-the-art performance despite its surprising simplicity. Furthermore, we demonstrate that our method is complementary with other existing methods. Combining AdaBN with existing domain adaptation treatments may further improve model performance.
Book Chapter•10.1007/978-3-319-31165-4_26•
Normalization Techniques for Multi-Criteria Decision Making: Analytical Hierarchy Process Case Study

Nazanin Vafaei1, Rita A. Ribeiro1, Luis M. Camarinha-Matos1•
University of Lisbon1
11 Apr 2016
TL;DR: In this article, the authors discuss metrics for assessing which are the most appropriate normalization techniques in decision problems, specifically for the Analytical Hierarchy Process (AHP) multi-criteria method.
Abstract: Multi-Criteria Decision Making (MCDM) methods use normalization techniques to allow aggregation of criteria with numerical and comparable data. With the advent of Cyber Physical Systems, where big data is collected from heterogeneous sensors and other data sources, finding a suitable normalization technique is also a challenge to enable data fusion (integration). Therefore, data fusion and aggregation of criteria are similar processes of combining values either from criteria or from sensors to obtain a common score. In this study, our aim is to discuss metrics for assessing which are the most appropriate normalization techniques in decision problems, specifically for the Analytical Hierarchy Process (AHP) multi-criteria method. AHP uses a pairwise approach to evaluate the alternatives regarding a set of criteria and then fuses (aggregates) the evaluations to determine the final ratings (scores).
Proceedings Article•
A kernelized stein discrepancy for goodness-of-fit tests

Qiang Liu1, Jason D. Lee2, Michael I. Jordan2•
Dartmouth College1, University of California, Berkeley2
19 Jun 2016
TL;DR: A new discrepancy statistic for measuring differences between two probability distributions is derived by combining Stein's identity with reproducing kernel Hilbert space theory, and from it a new class of powerful goodness-of-fit tests is derived that is widely applicable to complex and high dimensional distributions.
Abstract: We derive a new discrepancy statistic for measuring differences between two probability distributions based on combining Stein's identity with the reproducing kernel Hilbert space theory. We apply our result to test how well a probabilistic model fits a set of observations, and derive a new class of powerful goodness-of-fit tests that are widely applicable for complex and high dimensional distributions, even for those with computationally intractable normalization constants. Both theoretical and empirical properties of our methods are studied thoroughly.
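To make the statistic concrete, the following is a minimal one-dimensional sketch of a kernelized Stein discrepancy estimate. It assumes an RBF kernel with a fixed bandwidth `h`, a standard normal model whose score function is -x, and a U-statistic estimator; these choices and all names are assumptions for illustration. Note that only the score (the derivative of the log density) is needed, so the normalization constant of the model never appears.

```python
import numpy as np

def ksd_u_statistic(x, score, h=1.0):
    """U-statistic estimate of the squared KSD for 1-D samples `x`.

    `score` returns d/dx log p(x) for the model being tested.
    """
    d = x[:, None] - x[None, :]                 # d[i, j] = x_i - x_j
    k = np.exp(-d**2 / (2 * h**2))              # RBF kernel k(x_i, x_j)
    s = score(x)
    dk_dxi = -d / h**2 * k                      # derivative w.r.t. first argument
    dk_dxj = d / h**2 * k                       # derivative w.r.t. second argument
    d2k = (1 / h**2 - d**2 / h**4) * k          # mixed second derivative
    u = (s[:, None] * s[None, :] * k
         + s[:, None] * dk_dxj
         + s[None, :] * dk_dxi
         + d2k)
    n = len(x)
    return (u.sum() - np.trace(u)) / (n * (n - 1))  # exclude i == j terms

def standard_normal_score(x):
    return -x

rng = np.random.default_rng(0)
ksd_match = ksd_u_statistic(rng.normal(0.0, 1.0, 500), standard_normal_score)
ksd_shift = ksd_u_statistic(rng.normal(2.0, 1.0, 500), standard_normal_score)
```

Samples drawn from the model give an estimate near zero, while samples from a shifted distribution give a clearly positive value. A full goodness-of-fit test would additionally need a null distribution for the statistic, for example from a bootstrap, which is not shown here.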
Journal Article•10.1103/PHYSREVB.94.235438•
Exact mode volume and Purcell factor of open optical systems

Egor A. Muljarov1, Wolfgang Werner Langbein1•
Cardiff University1
15 Dec 2016-Physical Review B
TL;DR: An exact normalization of modes that does not rely on perfectly matched layers is introduced, and an analytic theory of the Purcell effect is presented based on this normalization and the resulting effective mode volume. A homogeneous dielectric sphere in vacuum, which is analytically solvable, is used to exemplify the findings.
Abstract: The Purcell factor quantifies the change of the radiative decay of a dipole in an electromagnetic environment relative to free space. Designing this factor is at the heart of photonics technology, striving to develop ever smaller or less lossy optical resonators. The Purcell factor can be expressed using the electromagnetic eigenmodes of the resonators, introducing the notion of a mode volume for each mode. This approach allows an analytic treatment, reducing the Purcell factor and other observables to sums over eigenmode resonances. Calculating the mode volumes requires a correct normalization of the modes. We introduce an exact normalization of modes, not relying on perfectly matched layers. We present an analytic theory of the Purcell effect based on this exact mode normalization and the resulting effective mode volume. We use a homogeneous dielectric sphere in vacuum, which is analytically solvable, to exemplify these findings. We furthermore verify the applicability of the normalization to numerically determined modes of a finite dielectric cylinder.
Proceedings Article•
Learning values across many orders of magnitude

Hado van Hasselt1, Arthur Guez1, Matteo Hessel1, Volodymyr Mnih1, David Silver1 •
Google1
5 Dec 2016
TL;DR: This work proposes to adaptively normalize the targets used in learning, useful in value-based reinforcement learning, where the magnitude of appropriate value approximations can change over time when the policy of behavior changes.
Abstract: Most learning algorithms are not invariant to the scale of the signal that is being approximated. We propose to adaptively normalize the targets used in the learning updates. This is important in value-based reinforcement learning, where the magnitude of appropriate value approximations can change over time when we update the policy of behavior. Our main motivation is prior work on learning to play Atari games, where the rewards were clipped to a predetermined range. This clipping facilitates learning across many different games with a single learning algorithm, but a clipped reward function can result in qualitatively different behavior. Using adaptive normalization we can remove this domain-specific heuristic without diminishing overall performance.
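One simple way to picture adaptive target normalization is to track running estimates of the target mean and scale and to standardize each target with them. The sketch below covers only that statistics-tracking part and says nothing about how the function approximator is kept consistent when the statistics change; the class name, the exponential moving average update, and the `step_size` value are assumptions for illustration.

```python
import numpy as np

class TargetNormalizer:
    """Tracks running first and second moments of learning targets."""

    def __init__(self, step_size=0.01):
        self.step_size = step_size
        self.mean = 0.0
        self.second_moment = 1.0

    def update(self, target):
        # Exponential moving averages of the target and its square.
        self.mean += self.step_size * (target - self.mean)
        self.second_moment += self.step_size * (target**2 - self.second_moment)

    @property
    def scale(self):
        # Standard deviation estimate, floored to avoid division by zero.
        return np.sqrt(max(self.second_moment - self.mean**2, 1e-8))

    def normalize(self, target):
        return (target - self.mean) / self.scale

rng = np.random.default_rng(0)
normalizer = TargetNormalizer()
for target in rng.normal(1000.0, 200.0, size=5000):
    normalizer.update(target)

normalized = normalizer.normalize(1000.0)
```

After enough updates the normalized targets have roughly zero mean and unit scale regardless of the raw magnitude of the targets, which is the property that removes the need to clip rewards to a predetermined range.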
Journal Article•10.1016/J.ECOLECON.2016.06.018•
Normalization in sustainability assessment: Methods and implications

Nathan Pollesch1,2, Virginia H. Dale2•
University of Tennessee1, Oak Ridge National Laboratory2
01 Oct 2016-Ecological Economics
TL;DR: Various normalization schemes including ratio normalization, target normalization, Z-score normalization, and unit equivalence normalization are explored within the context of sustainability assessment.
Journal Article•10.1007/S11306-016-1026-5•
Normalization and integration of large-scale metabolomics data using support vector regression

Xiaotao Shen1, Xiaoyun Gong2, Yuping Cai1, Yuan Guo1, Jia Tu1, Hao Li1, Tao Zhang2, Jialin Wang2, Fuzhong Xue2, Zheng-Jiang Zhu1 •
Chinese Academy of Sciences1, Shandong University2
26 Mar 2016-Metabolomics
TL;DR: A machine learning algorithm-based method, support vector regression (SVR), for large-scale metabolomics data normalization and integration that can effectively remove the unwanted intra- and inter-batch variations, and is much better than other common normalization methods.
Abstract: Untargeted metabolomics studies for biomarker discovery often have hundreds to thousands of human samples. Data acquisition of large-scale samples has to be divided into several batches and may span from months to as long as several years. The signal drift of metabolites during data acquisition (intra- and inter-batch) is unavoidable and is a major confounding factor for large-scale metabolomics studies. We aim to develop a data normalization method to reduce unwanted variations and integrate multiple batches in large-scale metabolomics studies prior to statistical analyses. We developed a machine learning algorithm-based method, support vector regression (SVR), for large-scale metabolomics data normalization and integration. An R package named MetNormalizer was developed and provided for data processing using SVR normalization. After SVR normalization, the portion of metabolite ion peaks with relative standard deviations (RSDs) less than 30 % increased to more than 90 % of the total peaks, which is much better than other common normalization methods. The reduction of unwanted analytical variations helps to improve the performance of multivariate statistical analyses, both unsupervised and supervised, in terms of classification and prediction accuracy so that subtle metabolic changes in epidemiological studies can be detected. SVR normalization can effectively remove the unwanted intra- and inter-batch variations, and is much better than other common normalization methods.
Journal Article•10.1016/J.ESWA.2015.10.047•
Fully automatic face normalization and single sample face recognition in unconstrained environments

Mohammad Haghighat1, Mohamed Abdel-Mottaleb2, Wadee Alhalabi3•
University of Miami1, Effat University2, King Abdulaziz University3
01 Apr 2016-Expert Systems With Applications
TL;DR: A fully automatic face normalization and recognition system robust to most common face variations in unconstrained environments and improves the performance of AAM fitting by initializing the AAM with estimates of the locations of the facial landmarks obtained by a method based on flexible mixture of parts.
Abstract: We present a fully automatic face normalization and recognition system. It normalizes the face images for both in-plane and out-of-plane pose variations. The performance of AAM fitting is improved using a novel initialization technique. HOG and Gabor features are fused using CCA to have more discriminative features. The proposed system recognizes non-frontal faces using only a single gallery sample. Single sample face recognition has become an important problem because of the limitations on the availability of gallery images. In many real-world applications such as passport or driver license identification, there is only a single facial image per subject available. The variations between the single gallery face image and the probe face images, captured in unconstrained environments, make the single sample face recognition even more difficult. In this paper, we present a fully automatic face recognition system robust to most common face variations in unconstrained environments. Our proposed system is capable of recognizing faces from non-frontal views and under different illumination conditions using only a single gallery sample for each subject. It normalizes the face images for both in-plane and out-of-plane pose variations using an enhanced technique based on active appearance models (AAMs). We improve the performance of AAM fitting, not only by training it with in-the-wild images and using a powerful optimization technique, but also by initializing the AAM with estimates of the locations of the facial landmarks obtained by a method based on flexible mixture of parts. The proposed initialization technique results in significant improvement of AAM fitting to non-frontal poses and makes the normalization process robust, fast and reliable. Owing to the proper alignment of the face images, made possible by this approach, we can use local feature descriptors, such as Histograms of Oriented Gradients (HOG), for matching.
The use of HOG features makes the system robust against illumination variations. In order to improve the discriminating information content of the feature vectors, we also extract Gabor features from the normalized face images and fuse them with HOG features using Canonical Correlation Analysis (CCA). Experimental results performed on various databases outperform the state-of-the-art methods and show the effectiveness of our proposed method in normalization and recognition of face images obtained in unconstrained environments.
Journal Article•10.3389/FGENE.2016.00164•
In papyro comparison of TMM (edgeR), RLE (DESeq2), and MRN normalization methods for a simple two-conditions-without-replicates RNA-Seq experimental design

[...]

Elie Maza1•
National Polytechnic Institute of Toulouse1
16 Sep 2016-Frontiers in Genetics
TL;DR: In this article, the authors highlight the similarities between three normalization methods: TMM from the edgeR R package, RLE from the DESeq2 R package, and MRN.
Abstract: In the past 5 years, RNA-Seq has become a powerful tool in transcriptome analysis even though computational methods dedicated to the analysis of high-throughput sequencing data are yet to be standardized. It is, however, now commonly accepted that the choice of a normalization procedure is an important step in such a process, for example in differential gene expression analysis. The present article highlights the similarities between three normalization methods: TMM from edgeR R package, RLE from DESeq2 R package, and MRN. Both TMM and DESeq2 are widely used for differential gene expression analysis. This paper introduces properties that show when these three methods will give exactly the same results. These properties are proven mathematically and illustrated by performing in silico calculations on a given RNA-Seq data set.
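As a rough illustration of one of the three methods, the RLE size factors can be sketched as a median of ratios to a per-gene geometric mean. This is a minimal NumPy sketch, not the DESeq2 implementation; the function name `rle_size_factors` and the genes-by-samples layout of `counts` are assumptions made for this example.

```python
import numpy as np

def rle_size_factors(counts):
    """Median-of-ratios (RLE-style) size factors; `counts` is genes x samples."""
    counts = np.asarray(counts, dtype=float)
    # Assumption: use only genes with a nonzero count in every sample,
    # so that the logarithm is defined.
    keep = np.all(counts > 0, axis=1)
    log_counts = np.log(counts[keep])
    # Log of the per-gene geometric mean across samples (the reference sample).
    log_geo_mean = log_counts.mean(axis=1, keepdims=True)
    # Per-sample median of the log ratios to the reference.
    return np.exp(np.median(log_counts - log_geo_mean, axis=0))

# Two conditions without replicates: one column per condition.
counts = np.array([[10, 20], [100, 200], [5, 10]])
size_factors = rle_size_factors(counts)
normalized = counts / size_factors
```

In this toy input the second library is exactly twice as deep as the first, so the two size factors differ by a factor of two and the normalized columns coincide.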
Proceedings Article•
Dirichlet process mixture model for correcting technical variation in single-cell gene expression data

[...]

Sandhya Prabhakaran1, Elham Azizi1, Ambrose J. Carr1, Dana Pe'er1•
Columbia University1
19 Jun 2016
TL;DR: In this article, a hierarchical Bayesian mixture model with cell-specific scalings is proposed to aid the iterative normalization and clustering of cells, teasing apart technical variation from biological signals.
Abstract: We introduce an iterative normalization and clustering method for single-cell gene expression data. The emerging technology of single-cell RNA-seq gives access to gene expression measurements for thousands of cells, allowing discovery and characterization of cell types. However, the data is confounded by technical variation emanating from experimental errors and cell type-specific biases. Current approaches perform a global normalization prior to analyzing biological signals, which does not resolve missing data or variation dependent on latent cell types. Our model is formulated as a hierarchical Bayesian mixture model with cell-specific scalings that aid the iterative normalization and clustering of cells, teasing apart technical variation from biological signals. We demonstrate that this approach is superior to global normalization followed by clustering. We show identifiability and weak convergence guarantees of our method and present a scalable Gibbs inference algorithm. This method improves cluster inference in both synthetic and real single-cell data compared with previous methods, and allows easy interpretation and recovery of the underlying structure and cell types.
Journal Article•10.1080/17476933.2015.1079628•
Certain geometric properties of the Mittag-Leffler functions

[...]

Deepak Bansal, J. K. Prajapat1•
Central University of Rajasthan1
03 Mar 2016-Complex Variables and Elliptic Equations
TL;DR: In this article, the Mittag-Leffler functions with their normalization are considered and sufficient conditions are obtained so that they have certain geometric properties including univalency, starlikeness, convexity and close-to-convexity in the open unit disk.
Abstract: In the present investigation, the Mittag-Leffler functions with their normalization are considered. Several sufficient conditions are obtained so that the Mittag-Leffler functions have certain geometric properties including univalency, starlikeness, convexity and close-to-convexity in the open unit disk. Partial sums of Mittag-Leffler functions are also studied. The results obtained are new and their usefulness is depicted by deducing several interesting corollaries and examples.
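For orientation, the normalization referred to here is usually written as follows. The notation is an assumption based on the standard two-parameter Mittag-Leffler function, since the abstract does not state it explicitly.

```latex
E_{\alpha,\beta}(z) = \sum_{n=0}^{\infty} \frac{z^{n}}{\Gamma(\alpha n + \beta)},
\qquad
\mathbb{E}_{\alpha,\beta}(z) = \Gamma(\beta)\, z\, E_{\alpha,\beta}(z)
= z + \sum_{n=2}^{\infty} \frac{\Gamma(\beta)}{\Gamma(\alpha (n-1) + \beta)}\, z^{n}.
```

With this scaling the function satisfies $\mathbb{E}_{\alpha,\beta}(0) = 0$ and $\mathbb{E}'_{\alpha,\beta}(0) = 1$, the usual normalization for studying univalency and starlikeness in the open unit disk.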
Journal Article•10.1007/S12064-015-0220-8•
How should we measure proportionality on relative gene expression data

[...]

Ionas Erb1, Cedric Notredame1•
Pompeu Fabra University1
13 Jan 2016-Theory in Biosciences
TL;DR: It is demonstrated that using an unchanged gene as a reference has huge advantages in terms of sensitivity; the link between proportionality and partial correlation is explored, and expressions for a partial proportionality coefficient are derived.
Posted Content•
Normalization Propagation: A Parametric Technique for Removing Internal Covariate Shift in Deep Networks

[...]

Devansh Arpit1, Yingbo Zhou1, Bhargava Urala Kota1, Venu Govindaraju1•
University at Buffalo1
04 Mar 2016-arXiv: Machine Learning
TL;DR: Normalization Propagation uses a data-independent parametric estimate of the mean and standard deviation in every layer, making it computationally faster than batch normalization, and it forward propagates this normalization without recalculating the approximate statistics for hidden layers.
Abstract: While the authors of Batch Normalization (BN) identify and address an important problem involved in training deep networks, Internal Covariate Shift, the current solution has certain drawbacks. Specifically, BN depends on batch statistics for layerwise input normalization during training, which makes the estimates of mean and standard deviation of input (distribution) to hidden layers inaccurate for validation due to shifting parameter values (especially during initial training epochs). Also, BN cannot be used with batch-size 1 during training. We address these drawbacks by proposing a non-adaptive normalization technique for removing internal covariate shift, that we call Normalization Propagation. Our approach does not depend on batch statistics, but rather uses a data-independent parametric estimate of mean and standard-deviation in every layer thus being computationally faster compared with BN. We exploit the observation that the pre-activation before Rectified Linear Units follows a Gaussian distribution in deep networks, and that once the first and second order statistics of any given dataset are normalized, we can forward propagate this normalization without the need for recalculating the approximate statistics for hidden layers.
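The data-independent estimate can be illustrated with a small sketch. If a pre-activation is assumed to be standard normal, the mean and variance of its ReLU output are known in closed form (1/sqrt(2π) and (1/2)(1 - 1/π)), so the output can be re-normalized without any batch statistics. The function name `normalized_relu` is hypothetical, and this covers only the single-unit idea, not the full method from the paper.

```python
import numpy as np

# Closed-form statistics of max(0, X) for X ~ N(0, 1).
RELU_MEAN = 1.0 / np.sqrt(2.0 * np.pi)
RELU_STD = np.sqrt(0.5 * (1.0 - 1.0 / np.pi))

def normalized_relu(pre_activation):
    """ReLU followed by a fixed, data-independent re-normalization.

    Assumes the pre-activation is approximately standard normal.
    """
    return (np.maximum(pre_activation, 0.0) - RELU_MEAN) / RELU_STD

# Empirical check on synthetic standard-normal pre-activations.
rng = np.random.default_rng(0)
out = normalized_relu(rng.standard_normal(200_000))
```

Because the constants are fixed, the same computation applies with a batch size of 1, which is the case the abstract says BN cannot handle during training.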
Journal Article•10.1093/MNRAS/STV2374•
A Bayesian approach to linear regression in astronomy

[...]

Mauro Sereno1•
University of Bologna1
11 Jan 2016-Monthly Notices of the Royal Astronomical Society
TL;DR: In this paper, a Bayesian hierarchical modeling of data with heteroscedastic and possibly correlated measurement errors and intrinsic scatter is discussed, where the intrinsic distribution of the independent variable is approximated using a mixture of Gaussian distributions whose means and standard deviations depend on time.
Abstract: Linear regression is common in astronomical analyses. I discuss a Bayesian hierarchical modeling of data with heteroscedastic and possibly correlated measurement errors and intrinsic scatter. The method fully accounts for time evolution. The slope, the normalization, and the intrinsic scatter of the relation can evolve with the redshift. The intrinsic distribution of the independent variable is approximated using a mixture of Gaussian distributions whose means and standard deviations depend on time. The method can address scatter in the measured independent variable (a kind of Eddington bias), selection effects in the response variable (Malmquist bias), and departure from linearity in form of a knee. I tested the method with toy models and simulations and quantified the effect of biases and inefficient modeling. The R-package LIRA (LInear Regression in Astronomy) is made available to perform the regression.
Posted Content•
Recurrent Batch Normalization

[...]

Tim Cooijmans1, Nicolas Ballas1, César Laurent1, Caglar Gulcehre1, Aaron Courville1 •
Université de Montréal1
30 Mar 2016-arXiv: Learning
TL;DR: In this article, a reparameterization of LSTM that brings the benefits of batch normalization to recurrent neural networks is proposed. Whereas previous works only apply batch normalization to the input-to-hidden transformation of RNNs, the authors demonstrate that it is both possible and beneficial to batch-normalize the hidden-to-hidden transition.
Abstract: We propose a reparameterization of LSTM that brings the benefits of batch normalization to recurrent neural networks. Whereas previous works only apply batch normalization to the input-to-hidden transformation of RNNs, we demonstrate that it is both possible and beneficial to batch-normalize the hidden-to-hidden transition, thereby reducing internal covariate shift between time steps. We evaluate our proposal on various sequential problems such as sequence classification, language modeling and question answering. Our empirical results show that our batch-normalized LSTM consistently leads to faster convergence and improved generalization.
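A single step of such a batch-normalized LSTM might look like the sketch below. It is a NumPy illustration under assumptions: the names (`bn_lstm_step`, `gain_x`, `gain_h`, `gain_c`, `bias_c`), the gate ordering, and the use of per-step batch statistics without running averages are choices made for this example, not details taken from the abstract.

```python
import numpy as np

def batch_norm(x, gain, eps=1e-5):
    """Normalize each feature over the batch axis, then scale by `gain`."""
    mean = x.mean(axis=0, keepdims=True)
    var = x.var(axis=0, keepdims=True)
    return gain * (x - mean) / np.sqrt(var + eps)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def bn_lstm_step(x_t, h_prev, c_prev, W_x, W_h, b,
                 gain_x, gain_h, gain_c, bias_c):
    """One LSTM step with batch normalization on both transitions."""
    # Input-to-hidden and hidden-to-hidden terms are normalized separately.
    pre = batch_norm(x_t @ W_x, gain_x) + batch_norm(h_prev @ W_h, gain_h) + b
    f, i, o, g = np.split(pre, 4, axis=1)
    c_t = sigmoid(f) * c_prev + sigmoid(i) * np.tanh(g)
    # The cell state is normalized again before the output non-linearity.
    h_t = sigmoid(o) * np.tanh(batch_norm(c_t, gain_c) + bias_c)
    return h_t, c_t

# Toy shapes: batch of 8, 3 input features, 5 hidden units.
rng = np.random.default_rng(0)
batch, n_in, n_hid = 8, 3, 5
h_t, c_t = bn_lstm_step(
    rng.standard_normal((batch, n_in)),
    np.zeros((batch, n_hid)), np.zeros((batch, n_hid)),
    rng.standard_normal((n_in, 4 * n_hid)),
    rng.standard_normal((n_hid, 4 * n_hid)),
    np.zeros(4 * n_hid),
    gain_x=0.1, gain_h=0.1, gain_c=0.1, bias_c=0.0,
)
```

The small gain of 0.1 is an arbitrary choice for this toy run, meant to keep the gate pre-activations away from saturation.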
Proceedings Article•
Identity Matters in Deep Learning

[...]

Moritz Hardt1, Tengyu Ma2•
Google1, Princeton University2
4 Nov 2016
TL;DR: This paper shows that arbitrarily deep linear residual networks have no spurious local optima, and that residual networks with ReLu activations have universal finite-sample expressivity in the sense that the network can represent any function of its sample provided that the model has more parameters than the sample size.
Abstract: An emerging design principle in deep learning is that each layer of a deep artificial neural network should be able to easily express the identity transformation. This idea not only motivated various normalization techniques, such as batch normalization, but was also key to the immense success of residual networks. In this work, we put the principle of identity parameterization on a more solid theoretical footing alongside further empirical progress. We first give a strikingly simple proof that arbitrarily deep linear residual networks have no spurious local optima. The same result for linear feed-forward networks in their standard parameterization is substantially more delicate. Second, we show that residual networks with ReLu activations have universal finite-sample expressivity in the sense that the network can represent any function of its sample provided that the model has more parameters than the sample size. Directly inspired by our theory, we experiment with a radically simple residual architecture consisting of only residual convolutional layers and ReLu activations, but no batch normalization, dropout, or max pool. Our model improves significantly on previous all-convolutional networks on the CIFAR10, CIFAR100, and ImageNet classification benchmarks.
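The identity parameterization can be made concrete with a small sketch. In a linear residual network each layer computes h + hA, so setting every A to zero makes the whole network the identity map regardless of depth. The function name `linear_residual_forward` and the row-vector convention are assumptions made for this illustration.

```python
import numpy as np

def linear_residual_forward(x, weights):
    """Forward pass of a linear residual network: h <- h + h @ A per layer."""
    h = x
    for A in weights:
        h = h + h @ A
    return h

# With all-zero weights, a 10-layer network reproduces its input exactly.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 6))
zero_weights = [np.zeros((6, 6)) for _ in range(10)]
y = linear_residual_forward(x, zero_weights)
```

In the standard feed-forward parameterization h @ W, the same identity requires every W to equal the identity matrix rather than zero.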
...
