Top 539 papers published in the topic of Normalization (statistics) in 2016

Showing papers on "Normalization (statistics)" published in 2016

Posted Content•

Layer Normalization

Jimmy Ba, Jamie Ryan Kiros, Geoffrey E. Hinton

21 Jul 2016-arXiv: Machine Learning

TL;DR: In this paper, layer normalization is applied to recurrent neural networks by computing the mean and variance used for normalization from all of the summed inputs to the neurons in a layer on a single training case.

Abstract: Training state-of-the-art, deep neural networks is computationally expensive. One way to reduce the training time is to normalize the activities of the neurons. A recently introduced technique called batch normalization uses the distribution of the summed input to a neuron over a mini-batch of training cases to compute a mean and variance which are then used to normalize the summed input to that neuron on each training case. This significantly reduces the training time in feed-forward neural networks. However, the effect of batch normalization is dependent on the mini-batch size and it is not obvious how to apply it to recurrent neural networks. In this paper, we transpose batch normalization into layer normalization by computing the mean and variance used for normalization from all of the summed inputs to the neurons in a layer on a single training case. Like batch normalization, we also give each neuron its own adaptive bias and gain which are applied after the normalization but before the non-linearity. Unlike batch normalization, layer normalization performs exactly the same computation at training and test times. It is also straightforward to apply to recurrent neural networks by computing the normalization statistics separately at each time step. Layer normalization is very effective at stabilizing the hidden state dynamics in recurrent networks. Empirically, we show that layer normalization can substantially reduce the training time compared with previously published techniques.
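The computation described in this abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function name `layer_norm`, the small constant `eps` added for numerical stability, and the example values are all assumptions made here.

```python
import math

def layer_norm(summed_inputs, gain, bias, eps=1e-5):
    """Normalize the summed inputs of one layer for a single training case.

    The mean and variance are taken over the neurons of the layer, not over
    a mini-batch, so the same computation applies at training and test time.
    `eps` is an assumed stabilizer, not something stated in the abstract.
    """
    n = len(summed_inputs)
    mean = sum(summed_inputs) / n
    var = sum((a - mean) ** 2 for a in summed_inputs) / n
    std = math.sqrt(var + eps)
    # Each neuron has its own gain and bias, applied after the normalization
    # and before the non-linearity.
    return [g * (a - mean) / std + b
            for a, g, b in zip(summed_inputs, gain, bias)]

# Hypothetical summed inputs for a four-neuron layer.
h = layer_norm([1.0, 2.0, 3.0, 4.0], gain=[1.0] * 4, bias=[0.0] * 4)
```

For a recurrent network, the abstract's description suggests calling such a function once per time step on that step's summed inputs.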

7,174 citations

Posted Content•

Instance Normalization: The Missing Ingredient for Fast Stylization.

Dmitry Ulyanov, Andrea Vedaldi, Victor Lempitsky

27 Jul 2016-arXiv: Computer Vision and Pattern Recognition

TL;DR: A small change in the stylization architecture results in a significant qualitative improvement in the generated images, and can be used to train high-performance architectures for real-time image generation.

Abstract: In this paper we revisit the fast stylization method introduced in Ulyanov et al. (2016). We show how a small change in the stylization architecture results in a significant qualitative improvement in the generated images. The change is limited to swapping batch normalization with instance normalization, and to applying the latter both at training and testing times. The resulting method can be used to train high-performance architectures for real-time image generation. The code is made available on GitHub at this https URL. Full paper can be found at arXiv:1701.02096.
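The abstract does not spell out the operation itself. The sketch below assumes the commonly used definition of instance normalization, in which each channel of each image is standardized using only its own spatial values. The function name, the data layout, and `eps` are assumptions made for illustration.

```python
import math

def instance_norm(batch, eps=1e-5):
    """batch[n][c] is a flat list of the spatial values of channel c in image n.

    Statistics are computed per image and per channel, so the result for one
    image does not depend on the other images in the batch.
    """
    out = []
    for image in batch:
        normed_image = []
        for channel in image:
            mean = sum(channel) / len(channel)
            var = sum((v - mean) ** 2 for v in channel) / len(channel)
            std = math.sqrt(var + eps)
            normed_image.append([(v - mean) / std for v in channel])
        out.append(normed_image)
    return out

# Two hypothetical images, two channels each, four spatial positions.
batch = [
    [[0.0, 2.0, 4.0, 6.0], [10.0, 10.0, 20.0, 20.0]],
    [[100.0, 300.0, 500.0, 700.0], [1.0, 1.0, 3.0, 3.0]],
]
normed = instance_norm(batch)
```

Because no batch statistics are involved, the same function can be applied unchanged at training and at test time, which matches the abstract's description.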

4,533 citations

Journal Article•10.1109/TIP.2017.2662206•

Beyond a Gaussian Denoiser: Residual Learning of Deep CNN for Image Denoising

Kai Zhang¹, Wangmeng Zuo¹, Yunjin Chen, Deyu Meng², Lei Zhang³•Institutions (3)

Harbin Institute of Technology¹, Xi'an Jiaotong University², Hong Kong Polytechnic University³

13 Aug 2016-arXiv: Computer Vision and Pattern Recognition

TL;DR: This paper proposes a denoising convolutional neural network (DnCNN) that handles Gaussian denoising with unknown noise level; with the residual learning strategy, DnCNN implicitly removes the latent clean image in the hidden layers.

Abstract: Discriminative model learning for image denoising has been recently attracting considerable attention due to its favorable denoising performance. In this paper, we take one step forward by investigating the construction of feed-forward denoising convolutional neural networks (DnCNNs) to embrace the progress in very deep architecture, learning algorithm, and regularization method into image denoising. Specifically, residual learning and batch normalization are utilized to speed up the training process as well as boost the denoising performance. Different from the existing discriminative denoising models which usually train a specific model for additive white Gaussian noise (AWGN) at a certain noise level, our DnCNN model is able to handle Gaussian denoising with unknown noise level (i.e., blind Gaussian denoising). With the residual learning strategy, DnCNN implicitly removes the latent clean image in the hidden layers. This property motivates us to train a single DnCNN model to tackle several general image denoising tasks such as Gaussian denoising, single image super-resolution and JPEG image deblocking. Our extensive experiments demonstrate that our DnCNN model can not only exhibit high effectiveness in several general image denoising tasks, but also be efficiently implemented by benefiting from GPU computing.

3,257 citations

Posted Content•

Weight Normalization: A Simple Reparameterization to Accelerate Training of Deep Neural Networks

Tim Salimans¹, Diederik P. Kingma¹•Institutions (1)

OpenAI¹

25 Feb 2016-arXiv: Learning

TL;DR: Weight normalization reparameterizes the weight vectors in a neural network to decouple the length of those weight vectors from their direction, improving the conditioning of the optimization problem and speeding up convergence of stochastic gradient descent.

Abstract: We present weight normalization: a reparameterization of the weight vectors in a neural network that decouples the length of those weight vectors from their direction. By reparameterizing the weights in this way we improve the conditioning of the optimization problem and we speed up convergence of stochastic gradient descent. Our reparameterization is inspired by batch normalization but does not introduce any dependencies between the examples in a minibatch. This means that our method can also be applied successfully to recurrent models such as LSTMs and to noise-sensitive applications such as deep reinforcement learning or generative models, for which batch normalization is less well suited. Although our method is much simpler, it still provides much of the speed-up of full batch normalization. In addition, the computational overhead of our method is lower, permitting more optimization steps to be taken in the same amount of time. We demonstrate the usefulness of our method on applications in supervised image recognition, generative modelling, and deep reinforcement learning.
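A small sketch may help make the decoupling of length and direction concrete. The abstract gives no formula, so the code assumes the usual parameterization w = g · v / ‖v‖, with a scalar `g` for the length and a vector `v` for the direction. The names `weight_norm`, `g`, and `v` are illustrative choices.

```python
import math

def weight_norm(v, g):
    """Build a weight vector whose length is g and whose direction is v/||v||.

    Under this assumed parameterization, rescaling v leaves the weight
    unchanged, and changing g changes only the length.
    """
    norm = math.sqrt(sum(x * x for x in v))
    return [g * x / norm for x in v]

w = weight_norm([3.0, 4.0], g=2.0)            # direction (0.6, 0.8), length 2
w_rescaled = weight_norm([30.0, 40.0], g=2.0)  # same weight, v scaled by 10
```

Note that nothing in this computation depends on other examples in a minibatch, which is the property the abstract contrasts with batch normalization.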

1,345 citations

Journal Article•10.1186/S13059-016-0947-7•

Pooling across cells to normalize single-cell RNA sequencing data with many zero counts

Aaron T. L. Lun¹, Karsten Bach², John C. Marioni¹,²,³•Institutions (3)

University of Cambridge¹, European Bioinformatics Institute², Wellcome Trust Sanger Institute³

27 Apr 2016-Genome Biology

TL;DR: This work presents a novel approach where expression values are summed across pools of cells, and the summed values are used for normalization, which outperforms existing methods for accurate normalization of cell-specific biases in simulated data.

Abstract: Normalization of single-cell RNA sequencing data is necessary to eliminate cell-specific biases prior to downstream analyses. However, this is not straightforward for noisy single-cell data where many counts are zero. We present a novel approach where expression values are summed across pools of cells, and the summed values are used for normalization. Pool-based size factors are then deconvolved to yield cell-based factors. Our deconvolution approach outperforms existing methods for accurate normalization of cell-specific biases in simulated data. Similar behavior is observed in real data, where deconvolution improves the relevance of results of downstream analyses.

1,269 citations

Proceedings Article•

Weight normalization: a simple reparameterization to accelerate training of deep neural networks

Tim Salimans¹, Diederik P. Kingma¹•Institutions (1)

OpenAI¹

5 Dec 2016

TL;DR: A reparameterization of the weight vectors in a neural network that decouples the length of those weight vectors from their direction is presented, improving the conditioning of the optimization problem and speeding up convergence of stochastic gradient descent.

795 citations

Posted Content•

Revisiting Batch Normalization For Practical Domain Adaptation

Yanghao Li¹, Naiyan Wang², Jianping Shi³, Jiaying Liu¹, Xiaodi Hou⁴•Institutions (4)

Peking University¹, Hong Kong University of Science and Technology², The Chinese University of Hong Kong³, California Institute of Technology⁴

15 Mar 2016-arXiv: Computer Vision and Pattern Recognition

TL;DR: This paper proposes a simple yet powerful remedy, called Adaptive Batch Normalization (AdaBN) to increase the generalization ability of a DNN, and demonstrates that the method is complementary with other existing methods and may further improve model performance.

Abstract: Deep neural networks (DNN) have shown unprecedented success in various computer vision applications such as image classification and object detection. However, it is still a common annoyance during the training phase that one has to prepare at least thousands of labeled images to fine-tune a network to a specific domain. A recent study (Tommasi et al. 2015) shows that a DNN has a strong dependency on the training dataset, and the learned features cannot be easily transferred to a different but relevant task without fine-tuning. In this paper, we propose a simple yet powerful remedy, called Adaptive Batch Normalization (AdaBN), to increase the generalization ability of a DNN. By modulating the statistics in all Batch Normalization layers across the network, our approach achieves deep adaptation effect for domain adaptation tasks. In contrast to other deep learning domain adaptation methods, our method does not require additional components, and is parameter-free. It achieves state-of-the-art performance despite its surprising simplicity. Furthermore, we demonstrate that our method is complementary with other existing methods. Combining AdaBN with existing domain adaptation treatments may further improve model performance.
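One plausible reading of "modulating the statistics in all Batch Normalization layers" is sketched below. Per-feature mean and variance are recomputed from target-domain activations and used in place of the statistics gathered on the source domain, while the learned scale and shift are left untouched. This is an assumption-laden illustration for a single layer. The helper names, `eps`, and the data are hypothetical, and the paper's exact procedure may differ.

```python
import math

def feature_stats(activations):
    """Per-feature mean and variance over a list of activation vectors."""
    n = len(activations)
    dims = range(len(activations[0]))
    mean = [sum(a[j] for a in activations) / n for j in dims]
    var = [sum((a[j] - mean[j]) ** 2 for a in activations) / n for j in dims]
    return mean, var

def batch_norm(x, mean, var, gamma, beta, eps=1e-5):
    """Standard batch-norm transform of one activation vector with given statistics."""
    return [g * (v - m) / math.sqrt(s + eps) + b
            for v, m, s, g, b in zip(x, mean, var, gamma, beta)]

# Hypothetical activations of one layer on unlabeled target-domain data.
target_acts = [[10.0, 0.0], [12.0, 4.0], [14.0, 8.0]]

# Adaptation step: swap in target-domain statistics; gamma and beta stay as trained.
mean_t, var_t = feature_stats(target_acts)
adapted = [batch_norm(a, mean_t, var_t, gamma=[1.0, 1.0], beta=[0.0, 0.0])
           for a in target_acts]
```

No parameters are learned in this step, which is consistent with the abstract's description of the method as parameter-free.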

622 citations

Journal Article•10.1016/J.NEUROIMAGE.2015.12.012•

Reliability of dissimilarity measures for multi-voxel pattern analysis.

Alexander Walther¹,², Hamed Nili³, Naveed Ejaz¹, Arjen Alink², Nikolaus Kriegeskorte², Jörn Diedrichsen¹•Institutions (3)

University College London¹, Cognition and Brain Sciences Unit², University of Oxford³

15 Aug 2016-NeuroImage

TL;DR: In this paper, the authors compare the reliability of three classes of dissimilarity measures: classification accuracy, Euclidean/Mahalanobis distance, and Pearson correlation distance, using simulations and four real functional magnetic resonance imaging (fMRI) datasets.

558 citations

Proceedings Article•

Recurrent Batch Normalization

Tim Cooijmans¹, Nicolas Ballas¹, César Laurent¹, Caglar Gulcehre¹, Aaron Courville¹•Institutions (1)

Université de Montréal¹

30 Mar 2016

TL;DR: A reparameterization of LSTM that brings the benefits of batch normalization to recurrent neural networks is proposed. Whereas previous works applied batch normalization only to the input-to-hidden transformation of RNNs, the authors demonstrate that it is both possible and beneficial to batch-normalize the hidden-to-hidden transition.

Abstract: We propose a reparameterization of LSTM that brings the benefits of batch normalization to recurrent neural networks. Whereas previous works only apply batch normalization to the input-to-hidden transformation of RNNs, we demonstrate that it is both possible and beneficial to batch-normalize the hidden-to-hidden transition, thereby reducing internal covariate shift between time steps. We evaluate our proposal on various sequential problems such as sequence classification, language modeling and question answering. Our empirical results show that our batch-normalized LSTM consistently leads to faster convergence and improved generalization.

335 citations

Journal Article•10.1093/BIOINFORMATICS/BTW343•

TaggerOne: joint named entity recognition and normalization with semi-Markov Models

Robert Leaman, Zhiyong Lu

15 Sep 2016-Bioinformatics

TL;DR: This work proposes the first machine learning model for joint NER and normalization during both training and prediction, which is trainable for arbitrary entity types and consists of a semi-Markov structured linear classifier, with a rich feature approach for NER and supervised semantic indexing for normalization.

Abstract: Motivation: Text mining is increasingly used to manage the accelerating pace of the biomedical literature. Many text mining applications depend on accurate named entity recognition (NER) and normalization (grounding). While high performing machine learning methods trainable for many entity types exist for NER, normalization methods are usually specialized to a single entity type. NER and normalization systems are also typically used in a serial pipeline, causing cascading errors and limiting the ability of the NER system to directly exploit the lexical information provided by the normalization. Methods: We propose the first machine learning model for joint NER and normalization during both training and prediction. The model is trainable for arbitrary entity types and consists of a semi-Markov structured linear classifier, with a rich feature approach for NER and supervised semantic indexing for normalization. We also introduce TaggerOne, a Java implementation of our model as a general toolkit for joint NER and normalization. TaggerOne is not specific to any entity type, requiring only annotated training data and a corresponding lexicon, and has been optimized for high throughput. Results: We validated TaggerOne with multiple gold-standard corpora containing both mention- and concept-level annotations. Benchmarking results show that TaggerOne achieves high performance on diseases (NCBI Disease corpus, NER f-score: 0.829, normalization f-score: 0.807) and chemicals (BioCreative 5 CDR corpus, NER f-score: 0.914, normalization f-score 0.895). These results compare favorably to the previous state of the art, notwithstanding the greater flexibility of the model. We conclude that jointly modeling NER and normalization greatly improves performance. 
Availability and Implementation: The TaggerOne source code and an online demonstration are available at: http://www.ncbi.nlm.nih.gov/bionlp/taggerone Contact: zhiyong.lu@nih.gov Supplementary information: Supplementary data are available at Bioinformatics online.

322 citations

Journal Article•10.1093/BIB/BBW095•

A systematic evaluation of normalization methods in quantitative label-free proteomics

Tommi Välikangas, Tomi Suomi, Laura L. Elo¹•Institutions (1)

University of Turku¹

02 Oct 2016-Briefings in Bioinformatics

TL;DR: It is found that variance stabilization normalization (Vsn) reduced variation the most between technical replicates in all examined data sets and performed consistently well in the differential expression analysis.

Abstract: To date, mass spectrometry (MS) data remain inherently biased as a result of reasons ranging from sample handling to differences caused by the instrumentation. Normalization is the process that aims to account for the bias and make samples more comparable. The selection of a proper normalization method is a pivotal task for the reliability of the downstream analysis and results. Many normalization methods commonly used in proteomics have been adapted from the DNA microarray techniques. Previous studies comparing normalization methods in proteomics have focused mainly on intragroup variation. In this study, several popular and widely used normalization methods representing different strategies in normalization are evaluated using three spike-in and one experimental mouse label-free proteomic data sets. The normalization methods are evaluated in terms of their ability to reduce variation between technical replicates, their effect on differential expression analysis and their effect on the estimation of logarithmic fold changes. Additionally, we examined whether normalizing the whole data globally or in segments for the differential expression analysis has an effect on the performance of the normalization methods. We found that variance stabilization normalization (Vsn) reduced variation the most between technical replicates in all examined data sets. Vsn also performed consistently well in the differential expression analysis. Linear regression normalization and local regression normalization performed also systematically well. Finally, we discuss the choice of a normalization method and some qualities of a suitable normalization method in the light of the results of our evaluation.

Proceedings Article•10.1109/ICASSP.2016.7472159•

Batch normalized recurrent neural networks

César Laurent¹, Gabriel Pereyra², Philemon Brakel¹, Ying Zhang¹, Yoshua Bengio¹•Institutions (2)

Université de Montréal¹, University of Southern California²

20 Mar 2016

TL;DR: This paper investigates how batch normalization can be applied to RNNs and shows that the way it is applied leads to a faster convergence of the training criterion but doesn't seem to improve the generalization performance.

Abstract: Recurrent Neural Networks (RNNs) are powerful models for sequential data that have the potential to learn long-term dependencies. However, they are computationally expensive to train and difficult to parallelize. Recent work has shown that normalizing intermediate representations of neural networks can significantly improve convergence rates in feed-forward neural networks [1]. In particular, batch normalization, which uses mini-batch statistics to standardize features, was shown to significantly reduce training time. In this paper, we investigate how batch normalization can be applied to RNNs. We show for both a speech recognition task and language modeling that the way we apply batch normalization leads to a faster convergence of the training criterion but doesn't seem to improve the generalization performance.

Proceedings Article•

Density Modeling of Images using a Generalized Normalization Transformation

Johannes Ballé¹, Valero Laparra¹, Eero P. Simoncelli¹•Institutions (1)

New York University¹

1 Jan 2016

TL;DR: In this paper, a parametric nonlinear transformation is proposed for Gaussianizing data from natural images, where each component is normalized by a pooled activity measure, computed by exponentiating a weighted sum of rectified and exponentiated components and an additive constant.

Abstract: We introduce a parametric nonlinear transformation that is well-suited for Gaussianizing data from natural images. After a linear transformation of the data, each component is normalized by a pooled activity measure, computed by exponentiating a weighted sum of rectified and exponentiated components and an additive constant. We optimize the parameters of this transformation (linear transform, exponents, weights, constant) over a database of natural images, directly minimizing the negentropy of the responses. We find that the optimized transformation successfully Gaussianizes the data, achieving a significantly smaller mutual information between transformed components than previous methods including ICA and radial Gaussianization. The transformation is differentiable and can be efficiently inverted, and thus induces a density model on images. We show that samples of this model are visually similar to samples of natural image patches. We also demonstrate the use of the model as a prior density in removing additive noise. Finally, we show that the transformation can be cascaded, with each layer optimized (unsupervised) using the same Gaussianization objective, to capture additional probabilistic structure.
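The normalization step described in this abstract can be written out as a short sketch. It assumes the input `z` is the data after the linear transformation, and it uses a single exponent `alpha` and a single outer exponent `epsilon` for all components for brevity, whereas the abstract speaks of exponents as optimized parameters. All names and values are illustrative assumptions.

```python
def gdn(z, beta, gamma, alpha=2.0, epsilon=0.5):
    """Divide each component by a pooled activity measure.

    pooled_i = beta_i + sum_j gamma_ij * |z_j| ** alpha   (weighted sum of
               rectified, exponentiated components plus an additive constant)
    y_i      = z_i / pooled_i ** epsilon
    """
    out = []
    for i, z_i in enumerate(z):
        pooled = beta[i] + sum(gamma[i][j] * abs(z_j) ** alpha
                               for j, z_j in enumerate(z))
        out.append(z_i / pooled ** epsilon)
    return out

# Hypothetical two-component example: pooled = 11 + 9 + 16 = 36 for both components.
y = gdn([3.0, -4.0], beta=[11.0, 11.0], gamma=[[1.0, 1.0], [1.0, 1.0]])
```

With `alpha=2.0` and `epsilon=0.5` the pooled measure is a weighted root-sum-of-squares, which is one familiar special case of divisive normalization.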

Journal Article•10.1016/J.CHROMA.2015.12.007•

Sample normalization methods in quantitative metabolomics.

Yiman Wu¹, Liang Li¹•Institutions (1)

University of Alberta¹

22 Jan 2016-Journal of Chromatography A

TL;DR: The importance of sample normalization in the analytical workflow, with a focus on mass spectrometry (MS)-based platforms, is described; a number of methods recently reported in the literature are discussed, and their applicability in real-world metabolomics applications is commented on.

Proceedings Article•

Revisiting Batch Normalization For Practical Domain Adaptation

Yanghao Li¹, Naiyan Wang², Jianping Shi³, Jiaying Liu¹, Xiaodi Hou⁴•Institutions (4)

Peking University¹, Hong Kong University of Science and Technology², The Chinese University of Hong Kong³, California Institute of Technology⁴

15 Mar 2016

TL;DR: Adaptive Batch Normalization (AdaBN) modulates the statistics in all Batch Normalization layers across the network to increase the generalization ability of a DNN and achieves deep adaptation effect for domain adaptation tasks.

Book Chapter•10.1007/978-3-319-31165-4_26•

Normalization Techniques for Multi-Criteria Decision Making: Analytical Hierarchy Process Case Study

Nazanin Vafaei¹, Rita A. Ribeiro¹, Luis M. Camarinha-Matos¹•Institutions (1)

University of Lisbon¹

11 Apr 2016

TL;DR: In this article, the authors discuss metrics for assessing which are the most appropriate normalization techniques in decision problems, specifically for the Analytical Hierarchy Process (AHP) multi-criteria method.

Abstract: Multi-Criteria Decision Making (MCDM) methods use normalization techniques to allow aggregation of criteria with numerical and comparable data. With the advent of Cyber Physical Systems, where big data is collected from heterogeneous sensors and other data sources, finding a suitable normalization technique is also a challenge to enable data fusion (integration). Therefore, data fusion and aggregation of criteria are similar processes of combining values either from criteria or from sensors to obtain a common score. In this study, our aim is to discuss metrics for assessing which are the most appropriate normalization techniques in decision problems, specifically for the Analytical Hierarchy Process (AHP) multi-criteria method. AHP uses a pairwise approach to evaluate the alternatives regarding a set of criteria and then fuses (aggregates) the evaluations to determine the final ratings (scores).

Proceedings Article•

A kernelized stein discrepancy for goodness-of-fit tests

Qiang Liu¹, Jason D. Lee², Michael I. Jordan²•Institutions (2)

Dartmouth College¹, University of California, Berkeley²

19 Jun 2016

TL;DR: A new discrepancy statistic for measuring differences between two probability distributions is derived based on combining Stein's identity with the reproducing kernel Hilbert space theory and a new class of powerful goodness-of-fit tests are derived that are widely applicable for complex and high dimensional distributions.

Abstract: We derive a new discrepancy statistic for measuring differences between two probability distributions based on combining Stein's identity with the reproducing kernel Hilbert space theory. We apply our result to test how well a probabilistic model fits a set of observations, and derive a new class of powerful goodness-of-fit tests that are widely applicable for complex and high dimensional distributions, even for those with computationally intractable normalization constants. Both theoretical and empirical properties of our methods are studied thoroughly.

Journal Article•10.1103/PHYSREVB.94.235438•

Exact mode volume and Purcell factor of open optical systems

Egor A. Muljarov¹, Wolfgang Werner Langbein¹•Institutions (1)

Cardiff University¹

15 Dec 2016-Physical Review B

TL;DR: In this paper, an exact normalization of modes is introduced, and an analytic theory of the Purcell effect is presented based on this exact mode normalization and the resulting effective mode volume. The findings are exemplified with a homogeneous dielectric sphere in vacuum, which is analytically solvable.

Abstract: The Purcell factor quantifies the change of the radiative decay of a dipole in an electromagnetic environment relative to free space. Designing this factor is at the heart of photonics technology, striving to develop ever smaller or less lossy optical resonators. The Purcell factor can be expressed using the electromagnetic eigenmodes of the resonators, introducing the notion of a mode volume for each mode. This approach allows an analytic treatment, reducing the Purcell factor and other observables to sums over eigenmode resonances. Calculating the mode volumes requires a correct normalization of the modes. We introduce an exact normalization of modes, not relying on perfectly matched layers. We present an analytic theory of the Purcell effect based on this exact mode normalization and the resulting effective mode volume. We use a homogeneous dielectric sphere in vacuum, which is analytically solvable, to exemplify these findings. We furthermore verify the applicability of the normalization to numerically determined modes of a finite dielectric cylinder.

Proceedings Article•

Learning values across many orders of magnitude

Hado van Hasselt¹, Arthur Guez¹, Matteo Hessel¹, Volodymyr Mnih¹, David Silver¹•Institutions (1)

Google¹

5 Dec 2016

TL;DR: This work proposes to adaptively normalize the targets used in learning, useful in value-based reinforcement learning, where the magnitude of appropriate value approximations can change over time when the policy of behavior changes.

Abstract: Most learning algorithms are not invariant to the scale of the signal that is being approximated. We propose to adaptively normalize the targets used in the learning updates. This is important in value-based reinforcement learning, where the magnitude of appropriate value approximations can change over time when we update the policy of behavior. Our main motivation is prior work on learning to play Atari games, where the rewards were clipped to a predetermined range. This clipping facilitates learning across many different games with a single learning algorithm, but a clipped reward function can result in qualitatively different behavior. Using adaptive normalization we can remove this domain-specific heuristic without diminishing overall performance.

Journal Article•10.1016/J.ECOLECON.2016.06.018•

Normalization in sustainability assessment: Methods and implications

Nathan Pollesch¹,², Virginia H. Dale²•Institutions (2)

University of Tennessee¹, Oak Ridge National Laboratory²

01 Oct 2016-Ecological Economics

TL;DR: Various normalization schemes, including ratio normalization, target normalization, Z-score normalization, and unit equivalence normalization, are explored within the context of sustainability assessment.

Journal Article•10.1007/S11306-016-1026-5•

Normalization and integration of large-scale metabolomics data using support vector regression

Xiaotao Shen¹, Xiaoyun Gong², Yuping Cai¹, Yuan Guo¹, Jia Tu¹, Hao Li¹, Tao Zhang², Jialin Wang², Fuzhong Xue², Zheng-Jiang Zhu¹•Institutions (2)

Chinese Academy of Sciences¹, Shandong University²

26 Mar 2016-Metabolomics

TL;DR: A machine learning algorithm-based method, support vector regression (SVR), for large-scale metabolomics data normalization and integration that can effectively remove the unwanted intra- and inter-batch variations, and is much better than other common normalization methods.

Abstract: Untargeted metabolomics studies for biomarker discovery often have hundreds to thousands of human samples. Data acquisition of large-scale samples has to be divided into several batches and may span from months to as long as several years. The signal drift of metabolites during data acquisition (intra- and inter-batch) is unavoidable and is a major confounding factor for large-scale metabolomics studies. We aim to develop a data normalization method to reduce unwanted variations and integrate multiple batches in large-scale metabolomics studies prior to statistical analyses. We developed a machine learning algorithm-based method, support vector regression (SVR), for large-scale metabolomics data normalization and integration. An R package named MetNormalizer was developed and provided for data processing using SVR normalization. After SVR normalization, the portion of metabolite ion peaks with relative standard deviations (RSDs) less than 30 % increased to more than 90 % of the total peaks, which is much better than other common normalization methods. The reduction of unwanted analytical variations helps to improve the performance of multivariate statistical analyses, both unsupervised and supervised, in terms of classification and prediction accuracy so that subtle metabolic changes in epidemiological studies can be detected. SVR normalization can effectively remove the unwanted intra- and inter-batch variations, and is much better than other common normalization methods.
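The quality metric quoted in this abstract, the share of peaks whose relative standard deviation (RSD) falls below 30 %, is straightforward to compute. The sketch below is only an illustration of that metric, not of the SVR normalization itself. It assumes the sample standard deviation is used, and the peak names and intensities are made up.

```python
import statistics

def rsd_percent(values):
    """Relative standard deviation in percent: 100 * stdev / mean (sample stdev assumed)."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)

def fraction_below(peaks, threshold=30.0):
    """Share of peaks whose RSD across repeated measurements is below the threshold."""
    passing = sum(1 for intensities in peaks.values()
                  if rsd_percent(intensities) < threshold)
    return passing / len(peaks)

# Hypothetical intensities of two peaks over four repeated measurements.
peaks = {
    "peak_a": [100.0, 102.0, 98.0, 101.0],   # stable signal, low RSD
    "peak_b": [50.0, 90.0, 20.0, 120.0],     # drifting signal, high RSD
}
share = fraction_below(peaks)
```

Comparing this share before and after a normalization method is the kind of evaluation the abstract reports.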

Journal Article•10.1016/J.ESWA.2015.10.047•

Fully automatic face normalization and single sample face recognition in unconstrained environments

Mohammad Haghighat¹, Mohamed Abdel-Mottaleb², Wadee Alhalabi³•Institutions (3)

University of Miami¹, Effat University², King Abdulaziz University³

01 Apr 2016-Expert Systems With Applications

TL;DR: A fully automatic face normalization and recognition system robust to most common face variations in unconstrained environments and improves the performance of AAM fitting by initializing the AAM with estimates of the locations of the facial landmarks obtained by a method based on flexible mixture of parts.

Abstract: We present a fully automatic face normalization and recognition system. It normalizes the face images for both in-plane and out-of-plane pose variations. The performance of AAM fitting is improved using a novel initialization technique. HOG and Gabor features are fused using CCA to have more discriminative features. The proposed system recognizes non-frontal faces using only a single gallery sample. Single sample face recognition has become an important problem because of the limitations on the availability of gallery images. In many real-world applications such as passport or driver license identification, there is only a single facial image per subject available. The variations between the single gallery face image and the probe face images, captured in unconstrained environments, make the single sample face recognition even more difficult. In this paper, we present a fully automatic face recognition system robust to most common face variations in unconstrained environments. Our proposed system is capable of recognizing faces from non-frontal views and under different illumination conditions using only a single gallery sample for each subject. It normalizes the face images for both in-plane and out-of-plane pose variations using an enhanced technique based on active appearance models (AAMs). We improve the performance of AAM fitting, not only by training it with in-the-wild images and using a powerful optimization technique, but also by initializing the AAM with estimates of the locations of the facial landmarks obtained by a method based on flexible mixture of parts. The proposed initialization technique results in significant improvement of AAM fitting to non-frontal poses and makes the normalization process robust, fast and reliable. Owing to the proper alignment of the face images, made possible by this approach, we can use local feature descriptors, such as Histograms of Oriented Gradients (HOG), for matching.
The use of HOG features makes the system robust against illumination variations. In order to improve the discriminating information content of the feature vectors, we also extract Gabor features from the normalized face images and fuse them with HOG features using Canonical Correlation Analysis (CCA). Experimental results performed on various databases outperform the state-of-the-art methods and show the effectiveness of our proposed method in normalization and recognition of face images obtained in unconstrained environments.

Journal Article•10.3389/FGENE.2016.00164•

In papyro comparison of TMM (edgeR), RLE (DESeq2), and MRN normalization methods for a simple two-conditions-without-replicates RNA-Seq experimental design

[...]

Elie Maza¹•Institutions (1)

National Polytechnic Institute of Toulouse¹

16 Sep 2016-Frontiers in Genetics

TL;DR: In this article, the authors highlight the similarities between three normalization methods: TMM from the edgeR R package, RLE from the DESeq2 R package, and MRN.

Abstract: In the past 5 years, RNA-Seq has become a powerful tool in transcriptome analysis even though computational methods dedicated to the analysis of high-throughput sequencing data are yet to be standardized. It is, however, now commonly accepted that the choice of a normalization procedure is an important step in such a process, for example in differential gene expression analysis. The present article highlights the similarities between three normalization methods: TMM from edgeR R package, RLE from DESeq2 R package, and MRN. Both TMM and DESeq2 are widely used for differential gene expression analysis. This paper introduces properties that show when these three methods will give exactly the same results. These properties are proven mathematically and illustrated by performing in silico calculations on a given RNA-Seq data set.
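As a rough illustration of the kind of computation these methods perform, the following is a minimal sketch of the median-of-ratios idea behind RLE size factors: each sample is compared against a per-gene geometric mean, and the median ratio becomes that sample's scaling factor. The function name `rle_size_factors` and the toy count matrix are assumptions made for this example; this is not the DESeq2 implementation and it does not reproduce the paper's proofs.

```python
import math
from statistics import median


def rle_size_factors(counts):
    """Median-of-ratios size factors (illustrative sketch, hypothetical helper).

    counts: list of genes, each a list of raw read counts, one per sample.
    Returns one size factor per sample.
    """
    n_samples = len(counts[0])
    # Use only genes with positive counts in every sample, because the
    # geometric mean (and its logarithm) is undefined otherwise.
    usable = [row for row in counts if all(c > 0 for c in row)]
    if not usable:
        raise ValueError("no gene has positive counts in all samples")
    # Log of the per-gene geometric mean acts as a pseudo-reference sample.
    log_ref = [sum(math.log(c) for c in row) / n_samples for row in usable]
    factors = []
    for j in range(n_samples):
        log_ratios = [math.log(row[j]) - ref for row, ref in zip(usable, log_ref)]
        factors.append(math.exp(median(log_ratios)))
    return factors


# Toy data: the second sample was sequenced exactly twice as deeply.
toy_counts = [[10, 20], [5, 10], [100, 200], [0, 3]]
factors = rle_size_factors(toy_counts)
normalized = [[c / f for c, f in zip(row, factors)] for row in toy_counts]
```

Dividing each column by its size factor puts the two samples on a common scale; in the toy data the ratio of the two factors recovers the twofold difference in depth.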

Proceedings Article•

Dirichlet process mixture model for correcting technical variation in single-cell gene expression data

[...]

Sandhya Prabhakaran¹, Elham Azizi¹, Ambrose J. Carr¹, Dana Pe'er¹•Institutions (1)

Columbia University¹

19 Jun 2016

TL;DR: In this article, a hierarchical Bayesian mixture model with cell-specific scalings is proposed to aid the iterative normalization and clustering of cells, teasing apart technical variation from biological signals.

Abstract: We introduce an iterative normalization and clustering method for single-cell gene expression data. The emerging technology of single-cell RNA-seq gives access to gene expression measurements for thousands of cells, allowing discovery and characterization of cell types. However, the data is confounded by technical variation emanating from experimental errors and cell type-specific biases. Current approaches perform a global normalization prior to analyzing biological signals, which does not resolve missing data or variation dependent on latent cell types. Our model is formulated as a hierarchical Bayesian mixture model with cell-specific scalings that aid the iterative normalization and clustering of cells, teasing apart technical variation from biological signals. We demonstrate that this approach is superior to global normalization followed by clustering. We show identifiability and weak convergence guarantees of our method and present a scalable Gibbs inference algorithm. This method improves cluster inference in both synthetic and real single-cell data compared with previous methods, and allows easy interpretation and recovery of the underlying structure and cell types.

Journal Article•10.1080/17476933.2015.1079628•

Certain geometric properties of the Mittag-Leffler functions

[...]

Deepak Bansal, J. K. Prajapat¹•Institutions (1)

Central University of Rajasthan¹

03 Mar 2016-Complex Variables and Elliptic Equations

TL;DR: In this article, the Mittag-Leffler functions with their normalization are considered and sufficient conditions are obtained so that they have certain geometric properties including univalency, starlikeness, convexity and close-to-convexity in the open unit disk.

Abstract: In the present investigation, the Mittag-Leffler functions with their normalization are considered. Several sufficient conditions are obtained so that the Mittag-Leffler functions have certain geometric properties including univalency, starlikeness, convexity and close-to-convexity in the open unit disk. Partial sums of Mittag-Leffler functions are also studied. The results obtained are new and their usefulness is depicted by deducing several interesting corollaries and examples.

Journal Article•10.1007/S12064-015-0220-8•

How should we measure proportionality on relative gene expression data?

[...]

Ionas Erb¹, Cedric Notredame¹•Institutions (1)

Pompeu Fabra University¹

13 Jan 2016-Theory in Biosciences

TL;DR: It is demonstrated that using an unchanged gene as a reference has huge advantages in terms of sensitivity; the link between proportionality and partial correlation is explored, and expressions for a partial proportionality coefficient are derived.

Posted Content•

Normalization Propagation: A Parametric Technique for Removing Internal Covariate Shift in Deep Networks

[...]

Devansh Arpit¹, Yingbo Zhou¹, Bhargava Urala Kota¹, Venu Govindaraju¹•Institutions (1)

University at Buffalo¹

04 Mar 2016-arXiv: Machine Learning

TL;DR: Normalization Propagation uses a data-independent parametric estimate of the mean and standard deviation in every layer, which makes it computationally faster than batch normalization, and it forward-propagates this normalization without recalculating the approximate statistics for hidden layers.

Abstract: While the authors of Batch Normalization (BN) identify and address an important problem involved in training deep networks, Internal Covariate Shift, the current solution has certain drawbacks. Specifically, BN depends on batch statistics for layerwise input normalization during training, which makes the estimates of the mean and standard deviation of the input (distribution) to hidden layers inaccurate for validation due to shifting parameter values (especially during initial training epochs). Also, BN cannot be used with batch-size 1 during training. We address these drawbacks by proposing a non-adaptive normalization technique for removing internal covariate shift, which we call Normalization Propagation. Our approach does not depend on batch statistics, but rather uses a data-independent parametric estimate of the mean and standard deviation in every layer, thus being computationally faster compared with BN. We exploit the observation that the pre-activations before Rectified Linear Units follow a Gaussian distribution in deep networks, and that once the first and second order statistics of any given dataset are normalized, we can forward propagate this normalization without the need for recalculating the approximate statistics for hidden layers.
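The data-independent estimate mentioned in the abstract can be illustrated with a small sketch. Assuming a pre-activation that is exactly standard Gaussian, the output of a ReLU has the closed-form mean 1/sqrt(2π) and variance (1/2)(1 − 1/π), so it can be re-centered and re-scaled with fixed constants instead of batch statistics. The helper names below are hypothetical, and this shows only that single step, not the authors' full method.

```python
import math
import random

# Closed-form moments of max(0, z) for z ~ N(0, 1).
RELU_MEAN = 1.0 / math.sqrt(2.0 * math.pi)
RELU_STD = math.sqrt(0.5 * (1.0 - 1.0 / math.pi))


def normalized_relu(pre_activation):
    """ReLU followed by a fixed, data-independent shift and scale."""
    return (max(0.0, pre_activation) - RELU_MEAN) / RELU_STD


def empirical_moments(n=200_000, seed=0):
    """Monte Carlo check: mean and variance of the normalized ReLU output
    when the pre-activation is drawn from a standard Gaussian."""
    rng = random.Random(seed)
    values = [normalized_relu(rng.gauss(0.0, 1.0)) for _ in range(n)]
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    return mean, variance


mean, variance = empirical_moments()
print(f"mean ~ {mean:.3f}, variance ~ {variance:.3f}")
```

If the Gaussian assumption holds, the printed mean should be close to 0 and the variance close to 1, which is what allows the normalization to be carried forward to the next layer without recomputing statistics.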

Journal Article•10.1093/MNRAS/STV2374•

A Bayesian approach to linear regression in astronomy

[...]

Mauro Sereno¹•Institutions (1)

University of Bologna¹

11 Jan 2016-Monthly Notices of the Royal Astronomical Society

TL;DR: In this paper, a Bayesian hierarchical modeling of data with heteroscedastic and possibly correlated measurement errors and intrinsic scatter is discussed, where the intrinsic distribution of the independent variable is approximated using a mixture of Gaussian distributions whose means and standard deviations depend on time.

Abstract: Linear regression is common in astronomical analyses. I discuss a Bayesian hierarchical modeling of data with heteroscedastic and possibly correlated measurement errors and intrinsic scatter. The method fully accounts for time evolution. The slope, the normalization, and the intrinsic scatter of the relation can evolve with the redshift. The intrinsic distribution of the independent variable is approximated using a mixture of Gaussian distributions whose means and standard deviations depend on time. The method can address scatter in the measured independent variable (a kind of Eddington bias), selection effects in the response variable (Malmquist bias), and departure from linearity in form of a knee. I tested the method with toy models and simulations and quantified the effect of biases and inefficient modeling. The R-package LIRA (LInear Regression in Astronomy) is made available to perform the regression.

Posted Content•

Recurrent Batch Normalization

[...]

Tim Cooijmans¹, Nicolas Ballas¹, César Laurent¹, Caglar Gulcehre¹, Aaron Courville¹•Institutions (1)

Université de Montréal¹

30 Mar 2016-arXiv: Learning

Proceedings Article•

Identity Matters in Deep Learning

[...]

Moritz Hardt¹, Tengyu Ma²•Institutions (2)

Google¹, Princeton University²

04 Nov 2016

TL;DR: This paper shows that arbitrarily deep linear residual networks have no spurious local optima, and that residual networks with ReLu activations have universal finite-sample expressivity in the sense that the network can represent any function of its sample provided that the model has more parameters than the sample size.

Abstract: An emerging design principle in deep learning is that each layer of a deep artificial neural network should be able to easily express the identity transformation. This idea not only motivated various normalization techniques, such as batch normalization, but was also key to the immense success of residual networks. In this work, we put the principle of identity parameterization on a more solid theoretical footing alongside further empirical progress. We first give a strikingly simple proof that arbitrarily deep linear residual networks have no spurious local optima. The same result for linear feed-forward networks in their standard parameterization is substantially more delicate. Second, we show that residual networks with ReLu activations have universal finite-sample expressivity in the sense that the network can represent any function of its sample provided that the model has more parameters than the sample size. Directly inspired by our theory, we experiment with a radically simple residual architecture consisting of only residual convolutional layers and ReLu activations, but no batch normalization, dropout, or max pool. Our model improves significantly on previous all-convolutional networks on the CIFAR10, CIFAR100, and ImageNet classification benchmarks.
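The identity-parameterization principle described in the abstract can be made concrete with a small sketch. Assuming a linear layer written either in the standard form W·h or in the residual form h + W·h, setting all weights to zero gives the zero map in the first case and the identity map in the second. The function names are hypothetical and the example is purely illustrative; it is not the architecture evaluated in the paper.

```python
def matvec(matrix, vector):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def standard_layer(weights, h):
    """Standard linear parameterization: h -> W h."""
    return matvec(weights, h)


def residual_layer(weights, h):
    """Residual (identity) parameterization: h -> h + W h."""
    return [x + y for x, y in zip(h, matvec(weights, h))]


def zeros(n):
    """n x n matrix of zeros, standing in for weights near the origin."""
    return [[0.0] * n for _ in range(n)]


h = [1.0, -2.0, 3.0]
print(standard_layer(zeros(3), h))   # the input is discarded
print(residual_layer(zeros(3), h))   # the input passes through unchanged
```

In the residual form, small weights therefore correspond to a layer close to the identity, which is the property the abstract argues each layer should be able to express easily.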
