Top 793 papers published in the topic of Linear model in 2011

The arcsine is asinine: the analysis of proportions in ecology

David I. Warton¹, Francis K. C. Hui¹•Institutions (1)

01 Jan 2011-Ecology

TL;DR: It is argued that the arcsine transform should not be used in either binomial or non-binomial data, and the logit transformation is proposed as an alternative approach to address these issues.

Abstract: The arcsine square root transformation has long been standard procedure when analyzing proportional data in ecology, with applications in data sets containing binomial and non-binomial response variables. Here, we argue that the arcsine transform should not be used in either circumstance. For binomial data, logistic regression has greater interpretability and higher power than analyses of transformed data. However, it is important to check the data for additional unexplained variation, i.e., overdispersion, and to account for it via the inclusion of random effects in the model if found. For non-binomial data, the arcsine transform is undesirable on the grounds of interpretability, and because it can produce nonsensical predictions. The logit transformation is proposed as an alternative approach to address these issues. Examples are presented in both cases to illustrate these advantages, comparing various methods of analyzing proportions including untransformed, arcsine- and logit-transformed linear models and logistic regression (with or without random effects). Simulations demonstrate that logistic regression usually provides a gain in power over other methods.
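A minimal sketch of the two transformations contrasted in the abstract, using only the Python standard library. The `eps` guard and the two "prediction" values are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import math

def arcsine_sqrt(p):
    """Arcsine square-root transform of a proportion p in [0, 1]."""
    return math.asin(math.sqrt(p))

def logit(p, eps=0.0):
    """Logit transform; eps is a hypothetical small constant guarding p = 0 or 1."""
    return math.log((p + eps) / (1.0 - p + eps))

def inv_logit(x):
    """Back-transform from the logit scale to a proportion in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

proportions = [0.05, 0.30, 0.95]
for p in proportions:
    print(f"p={p:.2f}  arcsine={arcsine_sqrt(p):.3f}  logit={logit(p):.3f}")

# The arcsine scale is bounded by [0, pi/2], so a fitted line can leave it.
arcsine_prediction = 1.8  # hypothetical linear-model prediction
print(arcsine_prediction > math.pi / 2)

# Any real-valued prediction on the logit scale maps back to a valid proportion.
logit_prediction = 5.0  # hypothetical linear-model prediction
print(inv_logit(logit_prediction))
```

The bounded arcsine scale is one way to see how a linear model on transformed proportions can yield predictions with no sensible back-transform, whereas the logit scale covers the whole real line.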

2,270 citations

Journal Article•10.1007/S00265-010-1038-5•

Cryptic multiple hypotheses testing in linear models: overestimated effect sizes and the winner's curse.

Wolfgang Forstmeier¹, Holger Schielzeth¹•Institutions (1)

Max Planck Society¹

01 Jan 2011-Behavioral Ecology and Sociobiology

TL;DR: Full model tests and P value adjustments can be used as a guide to how frequently type I errors arise by sampling variation alone, and favour the presentation of full models, since they best reflect the range of predictors investigated and ensure a balanced representation also of non-significant results.

Abstract: Fitting generalised linear models (GLMs) with more than one predictor has become the standard method of analysis in evolutionary and behavioural research. Often, GLMs are used for exploratory data analysis, where one starts with a complex full model including interaction terms and then simplifies by removing non-significant terms. While this approach can be useful, it is problematic if significant effects are interpreted as if they arose from a single a priori hypothesis test. This is because model selection involves cryptic multiple hypothesis testing, a fact that has only rarely been acknowledged or quantified. We show that the probability of finding at least one ‘significant’ effect is high, even if all null hypotheses are true (e.g. 40% when starting with four predictors and their two-way interactions). This probability is close to theoretical expectations when the sample size (N) is large relative to the number of predictors including interactions (k). In contrast, type I error rates strongly exceed even those expectations when model simplification is applied to models that are over-fitted before simplification (low N/k ratio). The increase in false-positive results arises primarily from an overestimation of effect sizes among significant predictors, leading to upward-biased effect sizes that often cannot be reproduced in follow-up studies (‘the winner's curse’). Despite having their own problems, full model tests and P value adjustments can be used as a guide to how frequently type I errors arise by sampling variation alone. We favour the presentation of full models, since they best reflect the range of predictors investigated and ensure a balanced representation also of non-significant results.
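The 40% figure quoted in the abstract is close to what a simple calculation gives under the assumption that the tests are independent and each is run at a nominal level of 0.05. The sketch below makes that assumption explicit; the function names are illustrative.

```python
from math import comb

def n_terms(n_predictors):
    """Main effects plus all two-way interactions."""
    return n_predictors + comb(n_predictors, 2)

def familywise_error(k, alpha=0.05):
    """Probability of at least one false positive among k independent tests."""
    return 1.0 - (1.0 - alpha) ** k

k = n_terms(4)                       # 4 main effects + 6 interactions
print(k, round(familywise_error(k), 3))
print(0.05 / k)                      # Bonferroni-adjusted per-test threshold
```

With four predictors there are ten candidate terms, which is why a single "significant" term after model simplification carries much less evidential weight than one a priori test.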

1,039 citations

Posted Content•

Inference on Treatment Effects After Selection Amongst High-Dimensional Controls

Alexandre Belloni, Victor Chernozhukov, Christian Hansen

31 Dec 2011-arXiv: Methodology

TL;DR: This work develops a novel estimation and uniformly valid inference method for the treatment effect in this setting, called the "post-double-selection" method, which resolves the problem of uniform inference after model selection for a large, interesting class of models.

Abstract: We propose robust methods for inference on the effect of a treatment variable on a scalar outcome in the presence of very many controls. Our setting is a partially linear model with possibly non-Gaussian and heteroscedastic disturbances. Our analysis allows the number of controls to be much larger than the sample size. To make informative inference feasible, we require the model to be approximately sparse; that is, we require that the effect of confounding factors can be controlled for up to a small approximation error by conditioning on a relatively small number of controls whose identities are unknown. The latter condition makes it possible to estimate the treatment effect by selecting approximately the right set of controls. We develop a novel estimation and uniformly valid inference method for the treatment effect in this setting, called the "post-double-selection" method. Our results apply to Lasso-type methods used for covariate selection as well as to any other model selection method that is able to find a sparse model with good approximation properties. The main attractive feature of our method is that it allows for imperfect selection of the controls and provides confidence intervals that are valid uniformly across a large class of models. In contrast, standard post-model selection estimators fail to provide uniform inference even in simple cases with a small, fixed number of controls. Thus our method resolves the problem of uniform inference after model selection for a large, interesting class of models. We illustrate the use of the developed methods with numerical simulations and an application to the effect of abortion on crime rates.

934 citations

Journal Article•10.1016/J.ASOC.2010.10.015•

A novel hybridization of artificial neural networks and ARIMA models for time series forecasting

Mehdi Khashei¹, Mehdi Bijari¹•Institutions (1)

Isfahan University of Technology¹

1 Mar 2011

TL;DR: Empirical results with three well-known real data sets indicate that the proposed model can be an effective way to improve on the forecasting accuracy achieved by traditional hybrid models and by either of the component models used separately.

Abstract: Improving forecasting accuracy, especially for time series, is an important yet often difficult task facing decision makers in many areas. Both theoretical and empirical findings have indicated that integration of different models can be an effective way of improving upon their predictive performance, especially when the models in combination are quite different. Artificial neural networks (ANNs) are flexible computing frameworks and universal approximators that can be applied to a wide range of forecasting problems with a high degree of accuracy. However, using ANNs to model linear problems has yielded mixed results; hence, it is not wise to apply ANNs blindly to any type of data. Autoregressive integrated moving average (ARIMA) models are among the most popular linear models in time series forecasting and have been widely applied to construct more accurate hybrid models during the past decade. Although hybrid techniques, which decompose a time series into its linear and nonlinear components, have recently been shown to be successful for single models, these models have some disadvantages. In this paper, a novel hybridization of artificial neural networks and ARIMA models is proposed in order to overcome the mentioned limitation of ANNs and yield a more general and more accurate forecasting model than traditional hybrid ARIMA-ANN models. In the proposed model, the unique advantages of ARIMA models in linear modeling are used to identify and magnify the existing linear structure in the data, and a neural network is then used to determine a model that captures the underlying data-generating process and predicts using the preprocessed data. Empirical results with three well-known real data sets indicate that the proposed model can be an effective way to improve on the forecasting accuracy achieved by traditional hybrid models and by either of the component models used separately.

811 citations

Journal Article•10.1016/J.CSDA.2011.02.004•

Practical variable selection for generalized additive models

Giampiero Marra¹, Simon N. Wood²•Institutions (2)

University College London¹, University of Bath²

01 Jul 2011-Computational Statistics & Data Analysis

TL;DR: Two very simple but effective shrinkage methods and an extension of the nonnegative garrote estimator are introduced, which avoid having to use nonparametric testing methods for which there is no general reliable distributional theory.

753 citations

NICE DSU Technical Support Document 2: A Generalised Linear Modelling Framework for Pairwise and Network Meta-Analysis of Randomised Controlled Trials

Sofia Dias, Nicky J Welton, Alex J. Sutton, A E Ades

1 Aug 2011

TL;DR: This DSU series of Technical Support Documents (TSDs) is intended to complement the Methods Guide by providing detailed information on how to implement specific methods, with clear recommendations on implementation and reporting standards where it is appropriate to do so.

Abstract: This paper sets out a generalised linear model (GLM) framework for the synthesis of data from randomised controlled trials (RCTs). We describe a common model taking the form of a linear regression for both fixed and random effects synthesis, that can be implemented with Normal, Binomial, Poisson, and Multinomial data. The familiar logistic model for meta-analysis with Binomial data is a GLM with a logit link function, which is appropriate for probability outcomes. The same linear regression framework can be applied to continuous outcomes, rate models, competing risks, or ordered category outcomes, by using other link functions, such as identity, log, complementary log-log, and probit link functions. The common core model for the linear predictor can be applied to pair-wise meta-analysis, indirect comparisons, synthesis of multi-arm trials, and mixed treatment comparisons, also known as network meta-analysis, without distinction. We take a Bayesian approach to estimation and provide WinBUGS program code for a Bayesian analysis using Markov chain Monte Carlo (MCMC) simulation. An advantage of this approach is that it is straightforward to extend to shared parameter models where different RCTs report outcomes in different formats but from a common underlying model. Use of the GLM framework allows us to present a unified account of how models can be compared using the Deviance Information Criterion (DIC), and how goodness of fit can be assessed using the residual deviance. WinBUGS code for model critique is provided. Our approach is illustrated through a range of worked examples for the commonly encountered evidence formats, including shared parameter models. We give suggestions on computational issues that sometimes arise in MCMC evidence synthesis, and comment briefly on alternative software.
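The link functions named in the abstract can be written down in a few lines. The sketch below is illustrative only and is not the document's WinBUGS code; the `linear_predictor` helper, the control-arm probability and the log odds ratio are hypothetical.

```python
import math
from statistics import NormalDist

# Link functions g(.) mapping the outcome scale to the linear-predictor scale.
LINKS = {
    "identity": lambda mu: mu,                           # continuous outcomes
    "log": lambda mu: math.log(mu),                      # rates
    "logit": lambda p: math.log(p / (1.0 - p)),          # probabilities
    "cloglog": lambda p: math.log(-math.log(1.0 - p)),   # complementary log-log
    "probit": lambda p: NormalDist().inv_cdf(p),         # ordered categories
}

def inv_logit(x):
    """Inverse of the logit link."""
    return 1.0 / (1.0 + math.exp(-x))

def linear_predictor(baseline, treatment_effect, treated):
    """Trial baseline plus a treatment effect for the treated arm (treated = 0 or 1)."""
    return baseline + treatment_effect * treated

control_prob = 0.20        # hypothetical control-arm event probability
log_odds_ratio = -0.5      # hypothetical treatment effect on the logit scale
baseline = LINKS["logit"](control_prob)
treated_prob = inv_logit(linear_predictor(baseline, log_odds_ratio, treated=1))
print(round(treated_prob, 4))
```

Swapping the entry taken from `LINKS` changes the outcome scale while the linear predictor stays the same, which is the sense in which one core model serves several data types.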

647 citations

Journal Article•10.1214/11-AOS896•

Oracle Inequalities and Optimal Inference under Group Sparsity

Karim Lounici, Massimiliano Pontil, Sara van de Geer, Alexandre B. Tsybakov

01 Aug 2011-Annals of Statistics

TL;DR: In this article, the authors consider the problem of estimating a sparse linear regression vector β* under a Gaussian noise model and a group sparsity assumption, for the purpose of both prediction and model selection, and establish oracle inequalities for the prediction and l2 estimation errors of the Group Lasso estimator.

Abstract: We consider the problem of estimating a sparse linear regression vector β* under a Gaussian noise model, for the purpose of both prediction and model selection. We assume that prior knowledge is available on the sparsity pattern, namely the set of variables is partitioned into prescribed groups, only few of which are relevant in the estimation process. This group sparsity assumption suggests that we consider the Group Lasso method as a means to estimate β*. We establish oracle inequalities for the prediction and l2 estimation errors of this estimator. These bounds hold under a restricted eigenvalue condition on the design matrix. Under a stronger condition, we derive bounds for the estimation error for mixed (2, p)-norms with 1 ≤ p ≤ ∞. When p=∞, this result implies that a thresholded version of the Group Lasso estimator selects the sparsity pattern of β* with high probability. Next, we prove that the rate of convergence of our upper bounds is optimal in a minimax sense, up to a logarithmic factor, for all estimators over a class of group sparse vectors. Furthermore, we establish lower bounds for the prediction and l2 estimation errors of the usual Lasso estimator. Using this result, we demonstrate that the Group Lasso can achieve an improvement in the prediction and estimation errors as compared to the Lasso. An important application of our results is provided by the problem of estimating multiple regression equations simultaneously or multi-task learning. In this case, we obtain refinements of the results in [In Proc. of the 22nd Annual Conference on Learning Theory (COLT) (2009)], which allow us to establish a quantitative advantage of the Group Lasso over the usual Lasso in the multi-task setting. Finally, within the same setting, we show how our results can be extended to more general noise distributions, of which we only require the fourth moment to be finite. To obtain this extension, we establish a new maximal moment inequality, which may be of independent interest.

506 citations

Journal Article•10.1111/J.1541-0420.2011.01564.X•

Hierarchical Commensurate and Power Prior Models for Adaptive Incorporation of Historical Information in Clinical Trials

Brian P. Hobbs¹, Bradley P. Carlin², Sumithra J. Mandrekar³, Daniel J. Sargent³•Institutions (3)

University of Texas MD Anderson Cancer Center¹, University of Minnesota², Mayo Clinic³

01 Sep 2011-Biometrics

TL;DR: This article presents several hierarchical Bayesian models in which the commensurability of the information in the historical and current data determines how much historical information is used, making the analysis adaptively robust to prior information that reveals itself to be inconsistent with the accumulating experimental data.

Abstract: Bayesian clinical trial designs offer the possibility of a substantially reduced sample size, increased statistical power, and reductions in cost and ethical hazard. However when prior and current information conflict, Bayesian methods can lead to higher than expected type I error, as well as the possibility of a costlier and lengthier trial. This motivates an investigation of the feasibility of hierarchical Bayesian methods for incorporating historical data that are adaptively robust to prior information that reveals itself to be inconsistent with the accumulating experimental data. In this article, we present several models that allow for the commensurability of the information in the historical and current data to determine how much historical information is used. A primary tool is elaborating the traditional power prior approach based upon a measure of commensurability for Gaussian data. We compare the frequentist performance of several methods using simulations, and close with an example of a colon cancer trial that illustrates a linear models extension of our adaptive borrowing approach. Our proposed methods produce more precise estimates of the model parameters, in particular, conferring statistical significance to the observed reduction in tumor size for the experimental regimen as compared to the control regimen.

367 citations

Posted Content•

Scaled Sparse Linear Regression

Tingni Sun¹, Cun-Hui Zhang¹•Institutions (1)

Rutgers University¹

24 Apr 2011-arXiv: Machine Learning

TL;DR: In this paper, the authors propose scaled sparse linear regression, which jointly estimates the regression coefficients and the noise level in a linear model; for the scaled lasso, the iterative algorithm is a gradient descent in a convex minimization of a penalized joint loss function.

Abstract: Scaled sparse linear regression jointly estimates the regression coefficients and noise level in a linear model. It chooses an equilibrium with a sparse regression method by iteratively estimating the noise level via the mean residual square and scaling the penalty in proportion to the estimated noise level. The iterative algorithm costs little beyond the computation of a path or grid of the sparse regression estimator for penalty levels above a proper threshold. For the scaled lasso, the algorithm is a gradient descent in a convex minimization of a penalized joint loss function for the regression coefficients and noise level. Under mild regularity conditions, we prove that the scaled lasso simultaneously yields an estimator for the noise level and an estimated coefficient vector satisfying certain oracle inequalities for prediction, the estimation of the noise level and the regression coefficients. These inequalities provide sufficient conditions for the consistency and asymptotic normality of the noise level estimator, including certain cases where the number of variables is of greater order than the sample size. Parallel results are provided for the least squares estimation after model selection by the scaled lasso. Numerical results demonstrate the superior performance of the proposed methods over an earlier proposal of joint convex minimization.
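The alternation described in the abstract (estimate the noise level from the mean residual square, then rescale the penalty) can be sketched in plain Python. This is an illustration under stated assumptions, not the authors' implementation: the coordinate-descent lasso solver, the base penalty `lam0 = sqrt(2 log(p) / n)`, the iteration counts and the simulated data are all choices made here.

```python
import math
import random

def soft_threshold(z, t):
    """Soft-thresholding operator used by the lasso coordinate update."""
    return math.copysign(max(abs(z) - t, 0.0), z)

def lasso_cd(X, y, lam, beta, sweeps=30):
    """Coordinate descent for (1/2n)||y - Xb||^2 + lam * ||b||_1, warm-started at beta."""
    n, p = len(X), len(X[0])
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    beta = list(beta)
    for _ in range(sweeps):
        for j in range(p):
            # Correlation of column j with the residual that excludes coordinate j.
            rho = sum(
                X[i][j] * (y[i] - sum(X[i][k] * beta[k] for k in range(p) if k != j))
                for i in range(n)
            ) / n
            beta[j] = soft_threshold(rho, lam) / col_sq[j]
    return beta

def scaled_lasso(X, y, lam0, iters=20, tol=1e-8):
    """Alternate between a lasso fit at penalty lam0 * sigma and a noise-level update."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    sigma = math.sqrt(sum(v * v for v in y) / n)
    for _ in range(iters):
        beta = lasso_cd(X, y, lam0 * sigma, beta)
        resid = [y[i] - sum(X[i][k] * beta[k] for k in range(p)) for i in range(n)]
        new_sigma = math.sqrt(sum(r * r for r in resid) / n)  # root mean residual square
        converged = abs(new_sigma - sigma) < tol
        sigma = new_sigma
        if converged:
            break
    return beta, sigma

# Simulated data: only the first covariate matters; the true noise sd is 0.5.
random.seed(0)
n, p = 40, 4
X = [[random.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
y = [3.0 * row[0] + random.gauss(0.0, 0.5) for row in X]
lam0 = math.sqrt(2.0 * math.log(p) / n)  # assumed base penalty level
beta_hat, sigma_hat = scaled_lasso(X, y, lam0)
print([round(b, 3) for b in beta_hat], round(sigma_hat, 3))
```

Because the penalty is proportional to the current noise estimate, no separate estimate of the noise level has to be supplied in advance.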

334 citations

Journal Article•10.1198/JCGS.2010.10007•

Penalized Functional Regression.

Jeff Goldsmith¹, Jennifer F. Bobb¹, Ciprian M. Crainiceanu¹, Brian Caffo¹, Daniel S. Reich²•Institutions (2)

Johns Hopkins University¹, National Institutes of Health²

01 Dec 2011-Journal of Computational and Graphical Statistics

TL;DR: Fast fitting methods for generalized functional linear models are developed, motivated by a diffusion tensor imaging (DTI) study of white-matter demyelination that analyzes differences between cerebral white-matter tract property measurements of multiple sclerosis patients and controls.

Abstract: We develop fast fitting methods for generalized functional linear models. The functional predictor is projected onto a large number of smooth eigenvectors and the coefficient function is estimated using penalized spline regression; confidence intervals based on the mixed model framework are obtained. Our method can be applied to many functional data designs including functions measured with and without error, sparsely or densely sampled. The methods also extend to the case of multiple functional predictors or functional predictors with a natural multilevel structure. The approach can be implemented using standard mixed effects software and is computationally fast. The methodology is motivated by a study of white-matter demyelination via diffusion tensor imaging (DTI). The aim of this study is to analyze differences between various cerebral white-matter tract property measurements of multiple sclerosis (MS) patients and controls. While the statistical developments proposed here were motivated by the DTI st...

320 citations

Journal Article•10.1109/TSP.2010.2102756•

Bayesian Nonparametric Inference of Switching Dynamic Linear Models

Emily B. Fox¹, Erik B. Sudderth², Michael I. Jordan³, Alan S. Willsky⁴•Institutions (4)

Duke University¹, Brown University², University of California, Berkeley³, Massachusetts Institute of Technology⁴

01 Apr 2011-IEEE Transactions on Signal Processing

TL;DR: In this article, a Bayesian nonparametric approach uses a hierarchical Dirichlet process prior to learn an unknown number of persistent, smooth dynamical modes, and additionally employs automatic relevance determination to infer a sparse set of dynamic dependencies, which makes it possible to learn SLDS with varying state dimension or switching VAR processes with varying autoregressive order.

Abstract: Many complex dynamical phenomena can be effectively modeled by a system that switches among a set of conditionally linear dynamical modes. We consider two such models: the switching linear dynamical system (SLDS) and the switching vector autoregressive (VAR) process. Our Bayesian nonparametric approach utilizes a hierarchical Dirichlet process prior to learn an unknown number of persistent, smooth dynamical modes. We additionally employ automatic relevance determination to infer a sparse set of dynamic dependencies allowing us to learn SLDS with varying state dimension or switching VAR processes with varying autoregressive order. We develop a sampling algorithm that combines a truncated approximation to the Dirichlet process with efficient joint sampling of the mode and state sequences. The utility and flexibility of our model are demonstrated on synthetic data, sequences of dancing honey bees, the IBOVESPA stock index and a maneuvering target tracking application.

Journal Article•10.1371/JOURNAL.PCBI.1001056•

From spiking neuron models to linear-nonlinear models.

Srdjan Ostojic¹, Nicolas Brunel²•Institutions (2)

Columbia University¹, Paris Descartes University²

20 Jan 2011-PLOS Computational Biology

TL;DR: It is found that the LN cascade provides accurate estimates of the firing rates of spiking neurons in most of parameter space, and an adaptive timescale rate model is introduced in which the timescale of the linear filter depends on the instantaneous firing rate.

Abstract: Neurons transform time-varying inputs into action potentials emitted stochastically at a time-dependent rate. The mapping from current input to output firing rate is often represented with the help of phenomenological models such as the linear-nonlinear (LN) cascade, in which the output firing rate is estimated by applying to the input successively a linear temporal filter and a static non-linear transformation. These simplified models leave out the biophysical details of action potential generation. It is not a priori clear to what extent the input-output mapping of biophysically more realistic, spiking neuron models can be reduced to a simple linear-nonlinear cascade. Here we investigate this question for the leaky integrate-and-fire (LIF), exponential integrate-and-fire (EIF) and conductance-based Wang-Buzsaki models in the presence of background synaptic activity. We exploit available analytic results for these models to determine the corresponding linear filter and static non-linearity in a parameter-free form. We show that the obtained functions are identical to the linear filter and static non-linearity determined using standard reverse correlation analysis. We then quantitatively compare the output of the corresponding linear-nonlinear cascade with numerical simulations of spiking neurons, systematically varying the parameters of input signal and background noise. We find that the LN cascade provides accurate estimates of the firing rates of spiking neurons in most of parameter space. For the EIF and Wang-Buzsaki models, we show that the LN cascade can be reduced to a firing rate model, the timescale of which we determine analytically. Finally we introduce an adaptive timescale rate model in which the timescale of the linear filter depends on the instantaneous firing rate. This model leads to highly accurate estimates of instantaneous firing rates.
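A minimal sketch of the two stages of an LN cascade as the abstract describes them: a linear temporal filter followed by a static nonlinearity. The exponential filter, the rectifying nonlinearity and every parameter value below are hypothetical stand-ins; the paper derives its filters and nonlinearities analytically from the spiking models.

```python
import math

def exponential_filter(tau, dt, length):
    """Hypothetical causal linear filter: a decaying exponential normalized to sum to 1."""
    kernel = [math.exp(-k * dt / tau) for k in range(length)]
    total = sum(kernel)
    return [w / total for w in kernel]

def linear_stage(signal, kernel):
    """Causal convolution of the input with the linear temporal filter."""
    out = []
    for t in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            if t - k >= 0:
                acc += w * signal[t - k]
        out.append(acc)
    return out

def static_nonlinearity(x, gain=20.0, threshold=0.5):
    """Hypothetical rectifying nonlinearity mapping filtered input to a firing rate."""
    return gain * max(x - threshold, 0.0)

def ln_cascade(signal, kernel):
    """Apply the linear filter, then the static nonlinearity, to get a rate estimate."""
    return [static_nonlinearity(x) for x in linear_stage(signal, kernel)]

dt = 0.001  # time step in seconds
signal = [1.0 + math.sin(2.0 * math.pi * 5.0 * t * dt) for t in range(1000)]
kernel = exponential_filter(tau=0.01, dt=dt, length=50)
rates = ln_cascade(signal, kernel)
print(round(max(rates), 2), round(min(rates), 2))
```

In the adaptive variant mentioned at the end of the abstract, the filter timescale would depend on the instantaneous rate instead of being fixed as `tau` is here.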

Journal Article•10.3982/ECTA8662•

Applied nonparametric instrumental variables estimation

Joel L. Horowitz¹•Institutions (1)

Northwestern University¹

01 Mar 2011-Econometrica

TL;DR: In this paper, the authors explore what can be learned when the function of interest is identified through an instrumental variable but is not assumed to be known up to finitely many parameters.

Abstract: Instrumental variables are widely used in applied econometrics to achieve identification and carry out estimation and inference in models that contain endogenous explanatory variables. In most applications, the function of interest (e.g., an Engel curve or demand function) is assumed to be known up to finitely many parameters (e.g., a linear model), and instrumental variables are used to identify and estimate these parameters. However, linear and other finite-dimensional parametric models make strong assumptions about the population being modeled that are rarely if ever justified by economic theory or other a priori reasoning and can lead to seriously erroneous conclusions if they are incorrect. This paper explores what can be learned when the function of interest is identified through an instrumental variable but is not assumed to be known up to finitely many parameters. The paper explains the differences between parametric and nonparametric estimators that are important for applied research, describes an easily implemented nonparametric instrumental variables estimator, and presents empirical examples in which nonparametric methods lead to substantive conclusions that are quite different from those obtained using standard, parametric estimators.

Journal Article•10.1016/J.JMVA.2010.11.002•

Log-linear Poisson autoregression

Konstantinos Fokianos¹, Dag Tjøstheim²•Institutions (2)

University of Cyprus¹, University of Bergen²

01 Mar 2011-Journal of Multivariate Analysis

TL;DR: Positive association between the number of transactions and the volatility process of a certain stock is discovered and it is proved that the maximum likelihood estimator of the vector of unknown parameters is asymptotically normal with a covariance matrix that can be consistently estimated.

Journal Article•10.1257/AER.101.3.532•

Oaxaca-Blinder as a Reweighting Estimator

Patrick Kline

01 May 2011-The American Economic Review

TL;DR: The classic regression-based estimator of counterfactual means studied by Oaxaca and Blinder is shown to constitute a propensity score reweighting estimator based upon a linear model for the conditional odds of being treated.

Abstract: The classic regression based estimator of counterfactual means studied by Ronald Oaxaca (1973) and Alan Blinder (1973) is shown to constitute a propensity score reweighting estimator based upon a linear model for the conditional odds of being treated.

Proceedings Article•

Spike and Slab Variational Inference for Multi-Task and Multiple Kernel Learning

Miguel Lázaro-Gredilla¹, Michalis K. Titsias²•Institutions (2)

Complutense University of Madrid¹, University of Manchester²

12 Dec 2011

TL;DR: A variational Bayesian inference algorithm is introduced that can be widely applied to sparse linear models and is based on the spike and slab prior, which from a Bayesian perspective is the gold standard for sparse inference.

Abstract: We introduce a variational Bayesian inference algorithm which can be widely applied to sparse linear models. The algorithm is based on the spike and slab prior which, from a Bayesian perspective, is the gold standard for sparse inference. We apply the method to a general multi-task and multiple kernel learning model in which a common set of Gaussian process functions is linearly combined with task-specific sparse weights, thus inducing relations between tasks. This model unifies several sparse linear models, such as generalized linear models, sparse factor analysis and matrix factorization with missing values, so that the variational algorithm can be applied to all these cases. We demonstrate our approach in multi-output Gaussian process regression, multi-class classification, image processing applications and collaborative filtering.

Journal Article•10.1016/J.NEUROIMAGE.2010.08.042•

Functional connectivity in resting-state fMRI: is linear correlation sufficient?

Jaroslav Hlinka¹, Milan Paluš¹, Martin Vejmelka¹, Dante Mantini²,³, Maurizio Corbetta²,⁴•Institutions (4)

Academy of Sciences of the Czech Republic¹, University of Chieti-Pescara², Katholieke Universiteit Leuven³, Washington University in St. Louis⁴

01 Feb 2011-NeuroImage

TL;DR: A framework for testing and estimating the deviation from Gaussianity is presented; the practical relevance of nonlinear methods that try to improve over linear correlation might be limited by the fact that the data are indeed almost Gaussian.

Journal Article•10.1109/TNN.2011.2152852•

Local Linear Discriminant Analysis Framework Using Sample Neighbors

Zizhu Fan¹, Yong Xu¹, David Zhang•Institutions (1)

Harbin Institute of Technology¹

01 Jul 2011-IEEE Transactions on Neural Networks

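TL;DR: An improved LDA framework is proposed, the local LDA (LLDA), which can perform well without requiring the two usual LDA assumptions (that the global data structure is consistent with the local data structure, and that the input data classes are Gaussian distributions), and can effectively capture the local structure of samples.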

Abstract: The linear discriminant analysis (LDA) is a very popular linear feature extraction approach. The algorithms of LDA usually perform well under the following two assumptions. The first assumption is that the global data structure is consistent with the local data structure. The second assumption is that the input data classes are Gaussian distributions. However, in real-world applications, these assumptions are not always satisfied. In this paper, we propose an improved LDA framework, the local LDA (LLDA), which can perform well without needing to satisfy the above two assumptions. Our LLDA framework can effectively capture the local structure of samples. According to different types of local data structure, our LLDA framework incorporates several different forms of linear feature extraction approaches, such as the classical LDA and principal component analysis. The proposed framework includes two LLDA algorithms: a vector-based LLDA algorithm and a matrix-based LLDA (MLLDA) algorithm. MLLDA is directly applicable to image recognition, such as face recognition. Our algorithms need to train only a small portion of the whole training set before testing a sample. They are suitable for learning large-scale databases especially when the input data dimensions are very high and can achieve high classification accuracy. Extensive experiments show that the proposed algorithms can obtain good classification results.

Journal Article•10.1214/10-AOAS377•

Random lasso

Sijian Wang, Bin Nan, Saharon Rosset, Ji Zhu

18 Apr 2011-arXiv: Applications

TL;DR: In this article, the random lasso method for variable selection in linear models is proposed, which consists of two major steps. In step 1, the lasso is applied to many bootstrap samples, each using a set of randomly selected covariates; a measure of importance is yielded from this step for each covariate. In step 2, a similar procedure to the first step is implemented, with the exception that for each bootstrap sample, a subset of covariates is randomly selected with unequal selection probabilities determined by the covariates' importance.


Abstract: We propose a computationally intensive method, the random lasso method, for variable selection in linear models. The method consists of two major steps. In step 1, the lasso method is applied to many bootstrap samples, each using a set of randomly selected covariates. A measure of importance is yielded from this step for each covariate. In step 2, a similar procedure to the first step is implemented with the exception that for each bootstrap sample, a subset of covariates is randomly selected with unequal selection probabilities determined by the covariates' importance. Adaptive lasso may be used in the second step with weights determined by the importance measures. The final set of covariates and their coefficients are determined by averaging bootstrap results obtained from step 2. The proposed method alleviates some of the limitations of lasso, elastic-net and related methods noted especially in the context of microarray data analysis: it tends to remove highly correlated variables altogether or select them all, and maintains maximal flexibility in estimating their coefficients, particularly with different signs; the number of selected variables is no longer limited by the sample size; and the resulting prediction accuracy is competitive or superior compared to the alternatives. We illustrate the proposed method by extensive simulation studies. The proposed method is also applied to a glioblastoma microarray data analysis.
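
The two-step procedure described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The function names (`lasso_cd`, `random_lasso`), the fixed penalty `lam`, the subset sizes `q1` and `q2`, and the use of the absolute averaged coefficient as the importance measure are all assumptions of this sketch. The abstract does not say how these quantities are chosen, and the adaptive lasso variant of step 2 is omitted.

```python
import numpy as np


def lasso_cd(X, y, lam, n_iter=200):
    """Lasso via cyclic coordinate descent.

    Minimizes (1 / 2n) * ||y - X b||^2 + lam * ||b||_1 for centered X and y.
    """
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n
    resid = y.astype(float)  # residual y - X @ beta, with beta = 0 initially
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Covariance of column j with the partial residual (excluding j).
            rho = X[:, j] @ resid / n + col_sq[j] * beta[j]
            new = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            resid += X[:, j] * (beta[j] - new)
            beta[j] = new
    return beta


def _bootstrap_pass(X, y, n_boot, q, lam, probs, rng):
    """Average lasso coefficients over bootstrap samples with random covariates."""
    n, p = X.shape
    coefs = np.zeros((n_boot, p))
    for b in range(n_boot):
        rows = rng.integers(0, n, size=n)  # bootstrap sample of observations
        cols = rng.choice(p, size=q, replace=False, p=probs)  # random covariates
        Xb = X[rows][:, cols]
        yb = y[rows]
        Xb = Xb - Xb.mean(axis=0)  # center within the bootstrap sample
        yb = yb - yb.mean()
        coefs[b, cols] = lasso_cd(Xb, yb, lam)  # unselected covariates stay 0
    return coefs.mean(axis=0)


def random_lasso(X, y, n_boot=50, q1=None, q2=None, lam=0.1, seed=0):
    """Two-step random lasso sketch; returns averaged step-2 coefficients."""
    rng = np.random.default_rng(seed)
    p = X.shape[1]
    q1 = q1 or max(1, p // 2)
    q2 = q2 or max(1, p // 2)
    # Step 1: covariates drawn with equal probability.
    # Importance is taken here as |average coefficient| (an assumption).
    importance = np.abs(_bootstrap_pass(X, y, n_boot, q1, lam, None, rng))
    # Step 2: covariates drawn with probability proportional to importance.
    # The tiny offset keeps every probability positive so sampling cannot fail.
    weights = importance + 1e-12
    probs = weights / weights.sum()
    return _bootstrap_pass(X, y, n_boot, q2, lam, probs, rng)
```

A practical version would more likely pick `lam`, `q1` and `q2` by cross-validation and use a tested lasso solver. They are fixed here only to keep the sketch short and self-contained.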


Journal Article•10.1214/11-AOS910•

Global testing under sparse alternatives: ANOVA, multiple comparisons and the higher criticism


Ery Arias-Castro, Emmanuel J. Candès, Yaniv Plan

01 Oct 2011-Annals of Statistics

TL;DR: In this article, the authors show that under moderate sparsity levels, that is, 0 ≤ α ≤ 1/2, the analysis of variance (ANOVA) is essentially optimal under some conditions on the design.


Abstract: Testing for the significance of a subset of regression coefficients in a linear model, a staple of statistical analysis, goes back at least to the work of Fisher who introduced the analysis of variance (ANOVA). We study this problem under the assumption that the coefficient vector is sparse, a common situation in modern high-dimensional settings. Suppose we have p covariates and that under the alternative, the response only depends upon the order of p^(1−α) of those, 0 ≤ α ≤ 1. Under moderate sparsity levels, that is, 0 ≤ α ≤ 1/2, we show that ANOVA is essentially optimal under some conditions on the design. This is no longer the case under strong sparsity constraints, that is, α > 1/2. In such settings, a multiple comparison procedure is often preferred and we establish its optimality when α ≥ 3/4. However, these two very popular methods are suboptimal, and sometimes powerless, under moderately strong sparsity where 1/2 < α < 3/4. We suggest a method based on the higher criticism that is powerful in the whole range α > 1/2. This optimality property is true for a variety of designs, including the classical (balanced) multi-way designs and more modern “p > n” designs arising in genetics and signal processing. In addition to the standard fixed effects model, we establish similar results for a random effects model where the nonzero coefficients of the regression vector are normally distributed.
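
For readers unfamiliar with the higher criticism mentioned in this abstract, the sketch below computes one commonly used form of the statistic from a list of p-values: the maximum standardized gap between the empirical and the uniform distribution of the sorted p-values. The function names and the `frac` cutoff are assumptions of this sketch, and the exact statistic and normalization used in the paper may differ.

```python
import math


def two_sided_pvalue(z):
    """Two-sided p-value of a standard normal z-score: P(|Z| > |z|)."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def higher_criticism(pvalues, frac=0.5):
    """Higher criticism statistic over the smallest `frac` share of p-values.

    HC = max_i sqrt(n) * (i / n - p_(i)) / sqrt(p_(i) * (1 - p_(i))),
    where p_(1) <= ... <= p_(n) are the sorted p-values.
    """
    n = len(pvalues)
    ps = sorted(pvalues)
    hc = -math.inf
    for i in range(1, max(1, int(frac * n)) + 1):
        p_i = ps[i - 1]
        if p_i <= 0.0 or p_i >= 1.0:
            continue  # skip degenerate p-values to avoid division by zero
        value = math.sqrt(n) * (i / n - p_i) / math.sqrt(p_i * (1.0 - p_i))
        hc = max(hc, value)
    return hc
```

A large value suggests that a small fraction of the tests carry signal. In the regression setting of the abstract, the p-values could for instance come from per-coefficient z-scores via `two_sided_pvalue`, which is again an assumption of this sketch.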


Journal Article•10.1016/J.BUILDENV.2010.08.004•

Model-predictive control of mixed-mode buildings with rule extraction


Peter May-Ostendorp¹, Gregor P. Henze¹, Charles D. Corbin¹, Balaji Rajagopalan¹, Clemens Felsmann²•Institutions (2)

University of Colorado Boulder¹, Dresden University of Technology²

01 Feb 2011-Building and Environment

TL;DR: In this paper, a series of model-predictive control (MPC) techniques is explored for optimizing window-operation control sequences in mixed-mode (MM) buildings using EnergyPlus, and results for a simplified MM office building are presented.


Journal Article•10.1016/J.ENBUILD.2011.02.007•

Prediction of room temperature and relative humidity by autoregressive linear and nonlinear neural network models for an open office


G. Mustafaraj¹, G. Lowry², Jie Chen¹•Institutions (2)

Brunel University London¹, London South Bank University²

01 Jun 2011-Energy and Buildings

TL;DR: In this paper, a neural network-based nonlinear autoregressive model with external inputs (NNARX) was developed to predict the thermal behavior of an open office in a modern building.


Journal Article•10.1198/JASA.2011.TM10281•

Linear or Nonlinear? Automatic Structure Discovery for Partially Linear Models


Hao Helen Zhang¹, Guang Cheng², Yufeng Liu•Institutions (2)

North Carolina State University¹, Purdue University²

01 Sep 2011-Journal of the American Statistical Association

TL;DR: Under certain regularity conditions, it is shown that the LAND estimator is able to identify the underlying true model structure correctly and at the same time estimate the multivariate regression function consistently.


Abstract: Partially linear models provide a useful class of tools for modeling complex data by naturally incorporating a combination of linear and nonlinear effects within one framework. One key question in partially linear models is the choice of model structure, that is, how to decide which covariates are linear and which are nonlinear. This is a fundamental, yet largely unsolved problem for partially linear models. In practice, one often assumes that the model structure is given or known and then makes estimation and inference based on that structure. Alternatively, there are two methods in common use for tackling the problem: hypotheses testing and visual screening based on the marginal fits. Both methods are quite useful in practice but have their drawbacks. First, it is difficult to construct a powerful procedure for testing multiple hypotheses of linear against nonlinear fits. Second, the screening procedure based on the scatterplots of individual covariate fits may provide an educated guess on the regressio...


Journal Article•10.1016/J.JMVA.2010.10.012•

Autoregressive process modeling via the Lasso procedure


Yuval Nardi¹, Alessandro Rinaldo²•Institutions (2)

Technion – Israel Institute of Technology¹, Carnegie Mellon University²

01 Mar 2011-Journal of Multivariate Analysis

TL;DR: This paper derives conditions under which the Lasso estimator for the autoregressive coefficients is model selection consistent, estimation consistent and prediction consistent.


Journal Article•10.1016/J.AMEPRE.2011.07.008•

Estimation of aerobic fitness from 20-m multistage shuttle run test performance.


Matthew T. Mahar¹, Ashley M. Guerieri¹, Matthew S. Hanna¹, C David Kemble¹•Institutions (1)

East Carolina University¹

01 Oct 2011-American Journal of Preventive Medicine

TL;DR: The Quadratic Model and Linear Model 2 provide valid estimates of VO₂max and compare favorably to previous models.


Journal Article•10.1214/11-AOS882•

Single and multiple index functional regression models with nonparametric link


Dong Chen, Peter A. Hall, Hans-Georg Müller

01 Jun 2011-Annals of Statistics

TL;DR: In this article, a new technique is introduced for estimating the link function nonparametrically, and an approach to multi-index modeling is proposed using adaptively defined linear projections of functional data.


Abstract: Fully nonparametric methods for regression from functional data have poor accuracy from a statistical viewpoint, reflecting the fact that their convergence rates are slower than nonparametric rates for the estimation of high-dimensional functions. This difficulty has led to an emphasis on the so-called functional linear model, which is much more flexible than common linear models in finite dimension, but nevertheless imposes structural constraints on the relationship between predictors and responses. Recent advances have extended the linear approach by using it in conjunction with link functions, and by considering multiple indices, but the flexibility of this technique is still limited. For example, the link may be modeled parametrically or on a grid only, or may be constrained by an assumption such as monotonicity; multiple indices have been modeled by making finite-dimensional assumptions. In this paper we introduce a new technique for estimating the link function nonparametrically, and we suggest an approach to multi-index modeling using adaptively defined linear projections of functional data. We show that our methods enable prediction with polynomial convergence rates. The finite sample performance of our methods is studied in simulations, and is illustrated by an application to a functional regression problem.


Journal Article•10.1016/J.IJFORECAST.2010.09.005•

Forecast combinations of computational intelligence and linear models for the NN5 time series forecasting competition


Robert Andrawis¹, Amir F. Atiya¹, Hisham El-Shishiny²•Institutions (2)

Cairo University¹, IBM²

01 Jul 2011-International Journal of Forecasting

TL;DR: The forecasting model with which the authors participated in the NN5 forecasting competition is introduced; it relies on forecast combination, which has proven to be an effective methodology in the forecasting literature.


Journal Article•10.1016/J.APENERGY.2010.07.036•

Supervisory and optimal control of central chiller plants using simplified adaptive models and genetic algorithm


Zhenjun Ma¹, Shengwei Wang¹•Institutions (1)

Hong Kong Polytechnic University¹

01 Jan 2011-Applied Energy

TL;DR: In this article, a model-based supervisory and optimal control strategy for central chiller plants is presented to enhance their energy efficiency and control performance. The optimal strategy is formulated using simplified models of major components and the genetic algorithm (GA).


Journal Article•10.1002/AIC.12346•

Local learning‐based adaptive soft sensor for catalyst activation prediction


Petr Kadlec¹, Bogdan Gabrys¹•Institutions (1)

Bournemouth University¹

01 May 2011-AIChE Journal

TL;DR: The results show that the traditional recursive partial least squares algorithm struggles to deliver accurate predictions, and by exploiting the two-level adaptation scheme, the proposed algorithm delivers more accurate results.


Abstract: This work presents an algorithm for the development of adaptive soft sensors. The method is based on the local learning framework, where locally valid models are built and maintained. In this framework, it is possible to model a nonlinear relationship between the input and output data by means of a combination of linear models. The method provides the possibility to perform adaptation at two levels: (i) recursive adaptation of the local models and (ii) the adaptation of the combination weights. The dataset used for evaluation of the algorithm describes a polymerization reactor where the target value is a simulated catalyst activity in the reactor. The results show that the traditional recursive partial least squares algorithm struggles to deliver accurate predictions. In contrast, by exploiting the two-level adaptation scheme, the proposed algorithm delivers more accurate results. © 2010 American Institute of Chemical Engineers AIChE J, 57, 2011


Journal Article•10.1111/J.1541-0420.2010.01435.X•

Prediction of Random Effects in Linear and Generalized Linear Models under Model Misspecification


Charles E. McCulloch¹, John Neuhaus¹•Institutions (1)

University of California, San Francisco¹

01 Mar 2011-Biometrics

TL;DR: It is shown that, although the predicted values can vary with the assumed distribution, the prediction accuracy is little affected for mild-to-moderate violations of the assumptions, and standard approaches, readily available in statistical software, will often suffice.


Abstract: Statistical models that include random effects are commonly used to analyze longitudinal and correlated data, often with the assumption that the random effects follow a Gaussian distribution. Via theoretical and numerical calculations and simulation, we investigate the impact of misspecification of this distribution on both how well the predicted values recover the true underlying distribution and the accuracy of prediction of the realized values of the random effects. We show that, although the predicted values can vary with the assumed distribution, the prediction accuracy, as measured by mean square error, is little affected for mild-to-moderate violations of the assumptions. Thus, standard approaches, readily available in statistical software, will often suffice. The results are illustrated using data from the Heart and Estrogen/Progestin Replacement Study using models to predict future blood pressure values.

