Top 383 papers published in the topic of Linear model in 2001

Showing papers on "Linear model" published in 2001

Journal Article•10.1111/J.1442-9993.2001.01070.PP.X•

A new method for non-parametric multivariate analysis of variance


01 Feb 2001-Austral Ecology

TL;DR: A non-parametric method for multivariate analysis of variance, based on sums of squared distances and permutation tests, is proposed as an alternative to the traditional multivariate analogues, whose assumptions are too stringent for most ecological multivariate data sets.


Abstract: Hypothesis-testing methods for multivariate data are needed to make rigorous probability statements about the effects of factors and their interactions in experiments. Analysis of variance is particularly powerful for the analysis of univariate data. The traditional multivariate analogues, however, are too stringent in their assumptions for most ecological multivariate data sets. Non-parametric methods, based on permutation tests, are preferable. This paper describes a new non-parametric method for multivariate analysis of variance, after McArdle and Anderson (in press). It is given here, with several applications in ecology, to provide an alternative and perhaps more intuitive formulation for ANOVA (based on sums of squared distances) to complement the description provided by McArdle and Anderson (in press) for the analysis of any linear model. It is an improvement on previous non-parametric methods because it allows a direct additive partitioning of variation for complex models. It does this while maintaining the flexibility and lack of formal assumptions of other non-parametric methods. The test statistic is a multivariate analogue to Fisher's F-ratio and is calculated directly from any symmetric distance or dissimilarity matrix. P-values are then obtained using permutations. Some examples of the method are given for tests involving several factors, including factorial and hierarchical (nested) designs and tests of interactions.
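The abstract above describes a test statistic built from sums of squared distances, with P-values obtained by permutation. A minimal sketch for the simplest case (one factor) might look like the following. The function names `pseudo_f` and `permutation_p_value`, the toy data, and the number of permutations are illustrative assumptions, not taken from the paper.

```python
import random
from itertools import combinations


def pseudo_f(dist, groups):
    """One-way pseudo-F ratio computed from a symmetric distance matrix."""
    n = len(groups)
    labels = sorted(set(groups))
    a = len(labels)
    # Total sum of squares: squared distances over all pairs, divided by N.
    ss_total = sum(dist[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    # Within-group sum of squares: the same quantity inside each group.
    ss_within = 0.0
    for g in labels:
        members = [i for i in range(n) if groups[i] == g]
        pairs = combinations(members, 2)
        ss_within += sum(dist[i][j] ** 2 for i, j in pairs) / len(members)
    ss_among = ss_total - ss_within
    return (ss_among / (a - 1)) / (ss_within / (n - a))


def permutation_p_value(dist, groups, n_perm=999, seed=0):
    """Permute group labels and count how often the permuted F reaches the observed F."""
    rng = random.Random(seed)
    observed = pseudo_f(dist, groups)
    shuffled = list(groups)
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if pseudo_f(dist, shuffled) >= observed:
            hits += 1
    return observed, hits / (n_perm + 1)


# Hypothetical toy data: six observations on one variable, two groups.
# With Euclidean distances the pseudo-F matches the classical F-ratio (150.0 here).
values = [0, 1, 2, 10, 11, 12]
dist = [[abs(u - v) for v in values] for u in values]
groups = ["A", "A", "A", "B", "B", "B"]
f_obs, p_value = permutation_p_value(dist, groups)
```

Any symmetric dissimilarity matrix could be passed in place of `dist`. Factorial and nested designs need restricted permutation schemes that this sketch does not cover.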


14,183 citations

Book•10.1007/978-1-4757-3462-1•

Regression modeling strategies : with applications to linear models, logistic regression, and survival analysis


Frank E. Harrell

1 Jan 2001

TL;DR: This book covers strategies for fitting, validating, and simplifying regression models, with case studies in least squares, binary and ordinal logistic regression, and parametric and Cox survival models.


Abstract: Introduction * General Aspects of Fitting Regression Models * Missing Data * Multivariable Modeling Strategies * Resampling, Validating, Describing, and Simplifying the Model * S-PLUS Software * Case Study in Least Squares Fitting and Interpretation of a Linear Model * Case Study in Imputation and Data Reduction * Overview of Maximum Likelihood Estimation * Binary Logistic Regression * Logistic Model Case Study 1: Predicting Cause of Death * Logistic Model Case Study 2: Survival of Titanic Passengers * Ordinal Logistic Regression * Case Study in Ordinal Regression, Data Reduction, and Penalization * Models Using Nonparametric Transformations of X and Y * Introduction to Survival Analysis * Parametric Survival Models * Case Study in Parametric Survival Modeling and Model Approximation * Cox Proportional Hazards Regression Model * Case Study in Cox Regression


8,749 citations

Journal Article•10.1890/0012-9658(2001)082[0290:FMMTCD]2.0.CO;2•

Fitting multivariate models to community data: a comment on distance‐based redundancy analysis


Brian H. McArdle¹, Marti J. Anderson¹•Institutions (1)

University of Auckland¹

01 Jan 2001-Ecology

TL;DR: This comment shows that the correction for negative eigenvalues used in distance-based redundancy analysis (db-RDA) is not necessary: variability can be partitioned directly from the distance matrix, even for a semimetric measure, and in simulations the direct method provides an exact test where db-RDA with the correction does not.


Abstract: Nonparametric multivariate analysis of ecological data using permutation tests has two main challenges: (1) to partition the variability in the data according to a complex design or model, as is often required in ecological experiments, and (2) to base the analysis on a multivariate distance measure (such as the semimetric Bray-Curtis measure) that is reasonable for ecological data sets. Previous nonparametric methods have succeeded in one or other of these areas, but not in both. A recent contribution to Ecological Monographs by Legendre and Anderson, called distance-based redundancy analysis (db-RDA), does achieve both. It does this by calculating principal coordinates and subsequently correcting for negative eigenvalues, if they are present, by adding a constant to squared distances. We show here that such a correction is not necessary. Partitioning can be achieved directly from the distance matrix itself, with no corrections and no eigenanalysis, even if the distance measure used is semimetric. An ecological example is given to show the differences in these statistical methods. Empirical simulations, based on parameters estimated from real ecological species abundance data, showed that db-RDA done on multifactorial designs (using the correction) does not have type 1 error consistent with the significance level chosen for the analysis (i.e., does not provide an exact test), whereas the direct method described and advocated here does.
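The direct partitioning described above is usually written in matrix form. The following is a sketch of that formulation as it is commonly presented, not a quotation from the paper. The symbols are assumptions introduced here: $d_{ij}$ are the pairwise distances among $n$ observations, $X$ is the design matrix with $m$ parameters (including the intercept), and $\mathbf{1}$ is a column of ones.

```latex
\[
A = \left(-\tfrac{1}{2}\, d_{ij}^{2}\right), \qquad
G = \left(I - \tfrac{1}{n}\mathbf{1}\mathbf{1}^{\top}\right) A
    \left(I - \tfrac{1}{n}\mathbf{1}\mathbf{1}^{\top}\right), \qquad
H = X \left(X^{\top} X\right)^{-1} X^{\top}
\]
\[
F = \frac{\operatorname{tr}(H G H) / (m - 1)}
         {\operatorname{tr}\!\left[(I - H)\, G\, (I - H)\right] / (n - m)}
\]
```

Because only traces of products with $G$ are needed, no eigenanalysis of $G$ is required and negative eigenvalues do not need correcting. The significance of $F$ is then assessed by permutation.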


3,871 citations

Journal Article•10.1006/NIMG.2001.0931•

Temporal autocorrelation in univariate linear modeling of FMRI data.


Mark W. Woolrich¹, Brian D. Ripley¹, J. Michael Brady¹, Stephen M. Smith¹•Institutions (1)

University of Oxford¹

01 Dec 2001-NeuroImage

TL;DR: Estimation is improved by using nonlinear spatial filtering to smooth the estimated autocorrelation, but only within tissue type, reducing bias to close to zero at probability levels as low as 1 × 10⁻⁵.


3,068 citations

Book•

Generalized, Linear, and Mixed Models


Charles E. McCulloch, Shayle R. Searle

1 Jan 2001

TL;DR: This book covers linear models, generalized linear models, linear mixed models, and generalized linear mixed models, together with models for longitudinal data, marginal and multivariate models, nonlinear models, prediction, and computing.


Abstract: Preface. Preface to the First Edition. 1. Introduction. 1.1 Models. 1.2 Factors, Levels, Cells, Effects And Data. 1.3 Fixed Effects Models. 1.4 Random Effects Models. 1.5 Linear Mixed Models (LMMs). 1.6 Fixed Or Random? 1.7 Inference. 1.8 Computer Software. 1.9 Exercises. 2. One-Way Classifications. 2.1 Normality And Fixed Effects. 2.2 Normality, Random Effects And MLE. 2.3 Normality, Random Effects And REML. 2.4 More On Random Effects And Normality. 2.5 Binary Data: Fixed Effects. 2.6 Binary Data: Random Effects. 2.7 Computing. 2.8 Exercises. 3. Single-Predictor Regression. 3.1 Introduction. 3.2 Normality: Simple Linear Regression. 3.3 Normality: A Nonlinear Model. 3.4 Transforming Versus Linking. 3.5 Random Intercepts: Balanced Data. 3.6 Random Intercepts: Unbalanced Data. 3.7 Bernoulli - Logistic Regression. 3.8 Bernoulli - Logistic With Random Intercepts. 3.9 Exercises. 4. Linear Models (LMs). 4.1 A General Model. 4.2 A Linear Model For Fixed Effects. 4.3 MLE Under Normality. 4.4 Sufficient Statistics. 4.5 Many Apparent Estimators. 4.6 Estimable Functions. 4.7 A Numerical Example. 4.8 Estimating Residual Variance. 4.9 Comments On The 1- And 2-Way Classifications. 4.10 Testing Linear Hypotheses. 4.11 T-Tests And Confidence Intervals. 4.12 Unique Estimation Using Restrictions. 4.13 Exercises. 5. Generalized Linear Models (GLMs). 5.1 Introduction. 5.2 Structure Of The Model. 5.3 Transforming Versus Linking. 5.4 Estimation By Maximum Likelihood. 5.5 Tests Of Hypotheses. 5.6 Maximum Quasi-Likelihood. 5.7 Exercises. 6. Linear Mixed Models (LMMs). 6.1 A General Model. 6.2 Attributing Structure To VAR(y). 6.3 Estimating Fixed Effects For V Known. 6.4 Estimating Fixed Effects For V Unknown. 6.5 Predicting Random Effects For V Known. 6.6 Predicting Random Effects For V Unknown. 6.7 ANOVA Estimation Of Variance Components. 6.8 Maximum Likelihood (ML) Estimation. 6.9 Restricted Maximum Likelihood (REML). 6.10 Notes And Extensions. 6.11 Appendix For Chapter 6. 
6.12 Exercises. 7. Generalized Linear Mixed Models. 7.1 Introduction. 7.2 Structure Of The Model. 7.3 Consequences Of Having Random Effects. 7.4 Estimation By Maximum Likelihood. 7.5 Other Methods Of Estimation. 7.6 Tests Of Hypotheses. 7.7 Illustration: Chestnut Leaf Blight. 7.8 Exercises. 8. Models For Longitudinal Data. 8.1 Introduction. 8.2 A Model For Balanced Data. 8.3 A Mixed Model Approach. 8.4 Random Intercept And Slope Models. 8.5 Predicting Random Effects. 8.6 Estimating Parameters. 8.7 Unbalanced Data. 8.8 Models For Non-Normal Responses. 8.9 A Summary Of Results. 8.10 Appendix. 8.11 Exercises. 9. Marginal Models. 9.1 Introduction. 9.2 Examples Of Marginal Regression Models. 9.3 Generalized Estimating Equations. 9.4 Contrasting Marginal And Conditional Models. 9.5 Exercises. 10. Multivariate Models. 10.1 Introduction. 10.2 Multivariate Normal Outcomes. 10.3 Non-Normally Distributed Outcomes. 10.4 Correlated Random Effects. 10.5 Likelihood Based Analysis. 10.6 Example: Osteoarthritis Initiative. 10.7 Notes And Extensions. 10.8 Exercises. 11. Nonlinear Models. 11.1 Introduction. 11.2 Example: Corn Photosynthesis. 11.3 Pharmacokinetic Models. 11.4 Computations For Nonlinear Mixed Models. 11.5 Exercises. 12. Departures From Assumptions. 12.1 Introduction. 12.2 Misspecifications Of Conditional Model For Response. 12.3 Misspecifications Of Random Effects Distribution. 12.4 Methods To Diagnose And Correct For Misspecifications. 12.5 Exercises. 13. Prediction. 13.1 Introduction. 13.2 Best Prediction (BP). 13.3 Best Linear Prediction (BLP). 13.4 Linear Mixed Model Prediction (BLUP). 13.5 Required Assumptions. 13.6 Estimated Best Prediction. 13.7 Henderson's Mixed Model Equations. 13.8 Appendix. 13.9 Exercises. 14. Computing. 14.1 Introduction. 14.2 Computing ML Estimates For LMMs. 14.3 Computing ML Estimates For GLMMs. 14.4 Penalized Quasi-Likelihood And Laplace. 14.5 Exercises. Appendix M: Some Matrix Results. M.1 Vectors And Matrices Of Ones. 
M.2 Kronecker (Or Direct) Products. M.3 A Matrix Notation. M.4 Generalized Inverses. M.5 Differential Calculus. Appendix S: Some Statistical Results. S.1 Moments. S.2 Normal Distributions. S.3 Exponential Families. S.4 Maximum Likelihood. S.5 Likelihood Ratio Tests. S.6 MLE Under Normality. References. Index.


2,840 citations

Journal Article•10.1111/J.0006-341X.2001.00120.X•

Akaike's Information Criterion in Generalized Estimating Equations


Wei Pan¹•Institutions (1)

University of Minnesota¹

01 Mar 2001-Biometrics

TL;DR: This work proposes a modification to AIC, where the likelihood is replaced by the quasi-likelihood and a proper adjustment is made for the penalty term.


Abstract: Correlated response data are common in biomedical studies. Regression analysis based on the generalized estimating equations (GEE) is an increasingly important method for such data. However, there seem to be few model-selection criteria available in GEE. The well-known Akaike Information Criterion (AIC) cannot be directly applied since AIC is based on maximum likelihood estimation while GEE is nonlikelihood based. We propose a modification to AIC, where the likelihood is replaced by the quasi-likelihood and a proper adjustment is made for the penalty term. Its performance is investigated through simulation studies. For illustration, the method is applied to a real data set.
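The modified criterion described above is widely known as QIC. A sketch of its usual form follows, stated from general knowledge of the criterion rather than from the abstract itself. All symbols are assumptions introduced here: $R$ is the working correlation structure, $\hat\beta(R)$ the GEE estimate under $R$, $Q(\cdot\,; I)$ the quasi-likelihood evaluated under the independence working model, $\hat\Omega_I$ the model-based information matrix under independence, and $\hat V_R$ the robust (sandwich) covariance estimate.

```latex
\[
\mathrm{QIC}(R) = -2\, Q\!\left(\hat\beta(R);\, I\right)
                  + 2\, \operatorname{tr}\!\left(\hat\Omega_I\, \hat V_R\right)
\]
```

The first term plays the role of the log-likelihood in AIC. The trace term replaces the usual penalty of twice the number of parameters and reduces to approximately that value when the working model is correctly specified.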


2,564 citations

Journal Article•10.1109/36.911111•

Fully constrained least squares linear spectral mixture analysis method for material quantification in hyperspectral imagery


D.C. Heinz¹, Chein-I Chang²•Institutions (2)

University of Baltimore¹, University of Maryland, Baltimore County²

01 Mar 2001-IEEE Transactions on Geoscience and Remote Sensing

TL;DR: The authors present a fully constrained least squares (FCLS) linear spectral mixture analysis method for material quantification; because no closed-form solution can be derived, an efficient algorithm is developed to yield optimal solutions.


Abstract: Linear spectral mixture analysis (LSMA) is a widely used technique in remote sensing to estimate abundance fractions of materials present in an image pixel. In order for an LSMA-based estimator to produce accurate amounts of material abundance, it generally requires two constraints imposed on the linear mixture model used in LSMA, which are the abundance sum-to-one constraint and the abundance nonnegativity constraint. The first constraint requires the sum of the abundance fractions of materials present in an image pixel to be one and the second imposes a constraint that these abundance fractions be nonnegative. While the first constraint is easy to deal with, the second constraint is difficult to implement since it results in a set of inequalities and can only be solved by numerical methods. Consequently, most LSMA-based methods are unconstrained and produce solutions that do not necessarily reflect the true abundance fractions of materials. In this case, they can only be used for the purposes of material detection, discrimination, and classification, but not for material quantification. The authors present a fully constrained least squares (FCLS) linear spectral mixture analysis method for material quantification. Since no closed form can be derived for this method, an efficient algorithm is developed to yield optimal solutions. In order to further apply the designed algorithm to unknown image scenes, an unsupervised least squares error (LSE)-based method is also proposed to extend the FCLS method in an unsupervised manner.
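The two constraints described in the abstract can be stated compactly as an optimization problem. The notation below is an assumption made for illustration: $r$ is the observed pixel spectrum, $M$ is the matrix whose $p$ columns are the material signatures, and $\alpha$ is the vector of abundance fractions.

```latex
\[
\hat\alpha = \arg\min_{\alpha}\; (r - M\alpha)^{\top} (r - M\alpha)
\quad \text{subject to} \quad
\sum_{j=1}^{p} \alpha_j = 1
\quad \text{and} \quad
\alpha_j \ge 0 \;\; (j = 1, \dots, p)
\]
```

The equality constraint alone can be handled in closed form (for example with a Lagrange multiplier). The inequalities are what rule out a closed-form solution and call for a numerical method.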


1,994 citations

Journal Article•10.1139/F01-004•

Permutation tests for univariate or multivariate analysis of variance and regression


Marti J. Anderson

01 Mar 2001-Canadian Journal of Fisheries and Aquatic Sciences

TL;DR: This paper provides a summary of recent empirical and theoretical results concerning available methods and gives recommendations for their use in univariate and multivariate applications.


Abstract: The most appropriate strategy to be used to create a permutation distribution for tests of individual terms in complex experimental designs is currently unclear. There are often many possibilities, including restricted permutation or permutation of some form of residuals. This paper provides a summary of recent empirical and theoretical results concerning available methods and gives recommendations for their use in univariate and multivariate applications. The focus of the paper is on complex designs in analysis of variance and multiple regression (i.e., linear models). The assumption of exchangeability required for a permutation test is assured by random allocation of treatments to units in experimental work. For observational data, exchangeability is tantamount to the assumption of independent and identically distributed errors under a null hypothesis. For partial regression, the method of permutation of residuals under a reduced model has been shown to provide the best test. For analysis of variance, o...


1,374 citations

Journal Article•10.1016/S0304-4076(00)00076-2•

Benchmark Priors for Bayesian Model Averaging


Carmen Fernandez¹, Eduardo Ley², Mark F. J. Steel³•Institutions (3)

University of St Andrews¹, International Monetary Fund², University of Kent³

01 Feb 2001-Journal of Econometrics

TL;DR: In this paper, the authors propose a partially noninformative prior structure related to a Natural Conjugate g-prior specification, where the amount of subjective information requested from the user is limited to the choice of a single scalar hyperparameter g0j.


1,046 citations

Journal Article•10.1198/016214501753168398•

Model Selection and the Principle of Minimum Description Length


Mark Hansen, Bin Yu

01 Jun 2001-Journal of the American Statistical Association

TL;DR: This article reviews the principle of minimum description length (MDL) for problems of model selection, and illustrates the MDL principle by considering problems in regression, nonparametric curve estimation, cluster analysis, and time series analysis.


Abstract: This article reviews the principle of minimum description length (MDL) for problems of model selection. By viewing statistical modeling as a means of generating descriptions of observed data, the MDL framework discriminates between competing models based on the complexity of each description. This approach began with Kolmogorov's theory of algorithmic complexity, matured in the literature on information theory, and has recently received renewed attention within the statistics community. Here we review both the practical and the theoretical aspects of MDL as a tool for model selection, emphasizing the rich connections between information theory and statistics. At the boundary between these two disciplines we find many interesting interpretations of popular frequentist and Bayesian procedures. As we show, MDL provides an objective umbrella under which rather disparate approaches to statistical modeling can coexist and be compared. We illustrate the MDL principle by considering problems in regression, nonpar...
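For the regression case mentioned above, one common form of MDL is the two-stage description length, sketched here from general knowledge rather than from the truncated abstract. The symbols are assumptions: $n$ is the sample size, $k$ the number of fitted coefficients, and $\mathrm{RSS}_k$ the residual sum of squares of the candidate model with Gaussian errors.

```latex
\[
L(y \mid k) \;\approx\; \frac{n}{2} \log\!\left(\frac{\mathrm{RSS}_k}{n}\right)
                        + \frac{k}{2} \log n
\]
```

The first term is the cost of encoding the data given the fitted model and the second is the cost of encoding the $k$ estimated parameters to precision $1/\sqrt{n}$. The model with the shortest total description is selected, which in this form coincides with choosing the model that minimizes BIC.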


903 citations

Book•

Contemporary Statistical Models for the Plant and Soil Sciences


Oliver Schabenberger, Francis J. Pierce¹•Institutions (1)

SAS Institute¹

13 Nov 2001

TL;DR: This book covers statistical models for the plant and soil sciences, including the classical linear model, nonlinear models, generalized linear models, linear and nonlinear mixed models for clustered data, and models for spatial data.


Abstract: Statistical Models. Mathematical and Statistical Models. Functional Aspects of Models. The Inferential Steps: Estimation and Testing. t-Tests in Terms of Statistical Models. Embedding Hypotheses. Hypothesis and Significance Testing: Interpretation of the p-Value. Classes of Statistical Models. Data Structures. Introduction. Classification by Response Type. Classification by Study Type. Clustered Data. Autocorrelated Data. From Independent to Spatial Data: A Progression of Clustering. Linear Algebra Tools. Introduction. Matrices and Vectors. Basic Matrix Operations. Matrix Inversion: Regular and Generalized Inverse. Mean, Variance, and Covariance of Random Vectors. The Trace and Expectation of Quadratic Forms. The Multivariate Gaussian Distribution. Matrix and Vector Differentiation. Using Matrix Algebra to Specify Models. The Classical Linear Model: Least Squares and Alternatives. Introduction. Least Squares Estimation and Partitioning of Variation. Factorial Classification. Diagnosing Regression Models. Diagnosing Classification Models. Robust Estimation. Nonparametric Regression. Nonlinear Models. Introduction. Models as Laws or Tools. Linear Polynomials Approximate Nonlinear Models. Fitting a Nonlinear Model to Data. Hypothesis Tests and Confidence Intervals. Transformations. Parameterization of Nonlinear Models. Applications. Generalized Linear Models. Introduction. Components of a Generalized Linear Model. Grouped and Ungrouped Data. Parameter Estimation and Inference. Modeling an Ordinal Response. Overdispersion. Applications. Linear Mixed Models for Clustered Data. Introduction. The Laird-Ware Model. Choosing the Inference Space. Estimation and Inference. Correlations in Mixed Models. Applications. Nonlinear Models for Clustered Data. Introduction. Nonlinear and Generalized Linear Mixed Models. Towards an Approximate Objective Function. Applications. Statistical Models for Spatial Data. Changing the Mindset. Semivariogram Analysis and Estimation. The Spatial Model. Spatial Prediction and the Kriging Paradigm. Spatial Regression and Classification Models. Autoregressive Models for Lattice Data. Analyzing Mapped Spatial Point Patterns. Applications. Bibliography.


Book Chapter•10.1016/S1573-4412(01)05006-1•

Panel data models: some recent developments*


Manuel Arellano¹, Bo E. Honoré²•Institutions (2)

CEMFI¹, Princeton University²

01 Jan 2001-Handbook of Econometrics

TL;DR: In this article, the authors provide a review of linear panel data models with predetermined variables and compare the identification from moment conditions in each case, and the implications of alternative feedback schemes for the time series properties of the errors.


Abstract: This chapter focuses on two of the developments in panel data econometrics since the Handbook chapter by Chamberlain (1984). The first objective of this chapter is to provide a review of linear panel data models with predetermined variables. We discuss the implications of assuming that explanatory variables are predetermined as opposed to strictly exogenous in dynamic structural equations with unobserved heterogeneity. We compare the identification from moment conditions in each case, and the implications of alternative feedback schemes for the time series properties of the errors. We next consider autoregressive error component models under various auxiliary assumptions. There is a trade-off between robustness and efficiency since assumptions of stationary initial conditions or time series homoskedasticity can be very informative, but estimators are not robust to their violation. We also discuss the identification problems that arise in models with predetermined variables and multiple effects. Concerning inference in linear models with predetermined variables, we discuss the form of optimal instruments, and the sampling properties of GMM and LIML-analogue estimators drawing on Monte Carlo results and asymptotic approximations. A number of identification results for limited dependent variable models with fixed effects and strictly exogenous variables are available in the literature, as well as some results on consistent and asymptotically normal estimation of such models. There are also some results available for models of this type including lags of the dependent variable, although even less is known for nonlinear dynamic models. Reviewing the recent work on discrete choice and selectivity models with fixed effects is the second objective of this chapter. A feature of parametric limited dependent variable models is their fragility to auxiliary distributional assumptions. 
This situation prompted the development of a large literature dealing with semiparametric alternatives (reviewed in Powell, 1994’s chapter). The work that we review in the second part of the chapter is thus at the intersection of the panel data literature and that on cross-sectional semiparametric limited dependent variable models.


Journal Article•10.1016/S0005-1098(01)00115-7•

Constrained linear state estimation - a moving horizon approach


Christopher V. Rao¹, James B. Rawlings², Jay H. Lee³•Institutions (3)

University of California, Berkeley¹, University of Wisconsin-Madison², Georgia Institute of Technology³

01 Oct 2001-Automatica

TL;DR: This work derives sufficient conditions for the stability of moving horizon state estimation with linear models subject to constraints on the estimate, and discusses smoothing strategies for moving horizon estimation.


Journal Article•10.1111/1467-842X.00156•

Permutation tests for linear models


Marti J. Anderson¹, John Robinson¹•Institutions (1)

University of Sydney¹

01 Mar 2001-Australian & New Zealand Journal of Statistics

TL;DR: In this paper, the authors compare the distributions of the test statistics under various permutation methods and show that the partial correlations under permutation are asymptotically jointly normal with means 0 and variances 1.


Abstract: Several approximate permutation tests have been proposed for tests of partial regression coefficients in a linear model based on sample partial correlations. This paper begins with an explanation and notation for an exact test. It then compares the distributions of the test statistics under the various permutation methods proposed, and shows that the partial correlations under permutation are asymptotically jointly normal with means 0 and variances 1. The method of Freedman & Lane (1983) is found to have asymptotic correlation 1 with the exact test, and the other methods are found to have smaller correlations with this test. Under local alternatives the critical values of all the approximate permutation tests converge to the same constant, so they all have the same asymptotic power. Simulations demonstrate these theoretical results.
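The Freedman & Lane (1983) method named above permutes the residuals of the reduced model (the model without the term being tested). A minimal sketch for one tested predictor `x` and one nuisance covariate `z` could look like this. The helper names, the toy data, and the use of the absolute partial correlation as the statistic are illustrative assumptions.

```python
import random


def mean(v):
    return sum(v) / len(v)


def residualize(y, z):
    """Return (fitted, residuals) from the simple regression of y on z with an intercept."""
    zbar, ybar = mean(z), mean(y)
    sxy = sum((zi - zbar) * (yi - ybar) for zi, yi in zip(z, y))
    sxx = sum((zi - zbar) ** 2 for zi in z)
    slope = sxy / sxx
    fitted = [ybar + slope * (zi - zbar) for zi in z]
    resid = [yi - fi for yi, fi in zip(y, fitted)]
    return fitted, resid


def correlation(u, v):
    ubar, vbar = mean(u), mean(v)
    num = sum((a - ubar) * (b - vbar) for a, b in zip(u, v))
    den = (sum((a - ubar) ** 2 for a in u) * sum((b - vbar) ** 2 for b in v)) ** 0.5
    return num / den


def partial_correlation(y, x, z):
    """Correlation between y and x after removing the linear effect of z from both."""
    _, ey = residualize(y, z)
    _, ex = residualize(x, z)
    return correlation(ey, ex)


def freedman_lane(y, x, z, n_perm=999, seed=0):
    """Two-sided permutation test of the partial effect of x given z."""
    rng = random.Random(seed)
    observed = partial_correlation(y, x, z)
    fitted, resid = residualize(y, z)  # reduced model: y regressed on z only
    hits = 1  # the observed data count as one permutation
    for _ in range(n_perm):
        shuffled = resid[:]
        rng.shuffle(shuffled)
        # New response: reduced-model fit plus permuted reduced-model residuals.
        y_star = [f + e for f, e in zip(fitted, shuffled)]
        if abs(partial_correlation(y_star, x, z)) >= abs(observed):
            hits += 1
    return observed, hits / (n_perm + 1)


# Hypothetical toy data with a real partial effect of x on y.
z = list(range(12))
x = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8]
noise = [0.1, -0.2, 0.05, 0.15, -0.1, 0.0, 0.2, -0.15, 0.05, -0.05, 0.1, -0.1]
y = [2 + 0.5 * zi + 1.5 * xi + ei for zi, xi, ei in zip(z, x, noise)]
r_obs, p_value = freedman_lane(y, x, z)
```

Permuting residuals rather than raw responses keeps the relationship between `y` and the nuisance covariate intact under the null hypothesis, which is the point of the reduced-model approach.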


Report Series•10.1920/WP.CEM.2001.0901•

Endogeneity in nonparametric and semiparametric regression models


Richard Blundell¹, James L. Powell•Institutions (1)

University College London¹

1 Nov 2001

TL;DR: In this article, the authors consider the nonparametric and semiparametric methods for estimating regression models with continuous endogenous regressors and identify the "average structural function" as a parameter of central interest.


Abstract: This paper considers the nonparametric and semiparametric methods for estimating regression models with continuous endogenous regressors. We list a number of different generalizations of the linear structural equation model, and discuss how two common estimation approaches for linear equations, the "instrumental variables" and "control function" approaches, may be extended to nonparametric generalizations of the linear model and to their semiparametric variants. We consider the identification and estimation of the "Average Structural Function" and argue that this is a parameter of central interest in the analysis of semiparametric and nonparametric models with endogenous regressors. We consider a particular semiparametric model, the binary response model with linear index function and nonparametric error distribution, and describe in detail how estimation of the parameters of interest can be constructed using the "control function" approach. This estimator is applied to estimating the relation of labor force participation to nonlabor income, viewed as an endogenous regressor.


Journal Article•10.1198/016214501753168389•

A Two-Part Random-Effects Model for Semicontinuous Longitudinal Data


Maren K. Olsen, Joseph L Schafer

01 Jun 2001-Journal of the American Statistical Association

TL;DR: In this article, the authors extend the two-part regression approach to longitudinal settings by introducing random coefficients into both the logistic and the linear stages, and obtain maximum likelihood estimates for the fixed coefficients and variance components by an approximate Fisher scoring procedure based on high-order Laplace approximations.


Abstract: A semicontinuous variable has a portion of responses equal to a single value (typically 0) and a continuous, often skewed, distribution among the remaining values. In cross-sectional analyses, variables of this type may be described by a pair of regression models; for example, a logistic model for the probability of nonzero response and a conditional linear model for the mean response given that it is nonzero. We extend this two-part regression approach to longitudinal settings by introducing random coefficients into both the logistic and the linear stages. Fitting a two-part random-effects model poses computational challenges similar to those found with generalized linear mixed models. We obtain maximum likelihood estimates for the fixed coefficients and variance components by an approximate Fisher scoring procedure based on high-order Laplace approximations. To illustrate, we apply the technique to data from the Adolescent Alcohol Prevention Trial, examining reported recent alcohol use for students in g...
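The two-part structure described above can be sketched in symbols. The notation is an assumption made for illustration and is not taken from the truncated abstract: $y_{ij}$ is the response of subject $i$ at occasion $j$, $x_{ij}$ and $x^{*}_{ij}$ are fixed-effect covariates, $z_{ij}$ and $z^{*}_{ij}$ are random-effect covariates, $c_i$ and $d_i$ are the subject-level random coefficients, and $g$ is an optional transformation of the nonzero responses.

```latex
\[
\operatorname{logit} \Pr\!\left(y_{ij} \neq 0 \mid c_i\right)
  = x_{ij}^{\top} \beta + z_{ij}^{\top} c_i
\]
\[
g(y_{ij}) \mid \left(y_{ij} \neq 0,\; d_i\right)
  = x_{ij}^{*\top} \gamma + z_{ij}^{*\top} d_i + \varepsilon_{ij},
  \qquad \varepsilon_{ij} \sim N(0, \sigma^{2})
\]
```

A natural further assumption is that $c_i$ and $d_i$ are jointly distributed, so that a subject's propensity to respond and the size of the response can be correlated. That coupling is what prevents the two parts from being fitted separately.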


Book•

Generalized Linear Models: With Applications in Engineering and the Sciences


Raymond H. Myers

28 Sep 2001

TL;DR: This book covers linear and nonlinear regression, logistic and Poisson regression, the generalized linear model, generalized estimating equations, random effects in generalized linear models, and designed experiments for generalized linear models.


Abstract: Preface. 1. Introduction to Generalized Linear Models. 1.1 Linear Models. 1.2 Nonlinear Models. 1.3 The Generalized Linear Model. 2. Linear Regression Models. 2.1 The Linear Regression Model and Its Application. 2.2 Multiple Regression Models. 2.3 Parameter Estimation Using Maximum Likelihood. 2.4 Model Adequacy Checking. 2.5 Using R to Perform Linear Regression Analysis. 2.6 Parameter Estimation by Weighted Least Squares. 2.7 Designs for Regression Models. 3. Nonlinear Regression Models. 3.1 Linear and Nonlinear Regression Models. 3.2 Transforming to a Linear Model. 3.3 Parameter Estimation in a Nonlinear System. 3.4 Statistical Inference in Nonlinear Regression. 3.5 Weighted Nonlinear Regression. 3.6 Examples of Nonlinear Regression Models. 3.7 Designs for Nonlinear Regression Models. 4. Logistic and Poisson Regression Models. 4.1 Regression Models Where the Variance Is a Function of the Mean. 4.2 Logistic Regression Models. 4.3 Poisson Regression. 4.4 Overdispersion in Logistic and Poisson Regression. 5. The Generalized Linear Model. 5.1 The Exponential Family of Distributions. 5.2 Formal Structure for the Class of Generalized Linear Models. 5.3 Likelihood Equations for Generalized Linear Models. 5.4 Quasi-Likelihood. 5.5 Other Important Distributions for Generalized Linear Models. 5.6 A Class of Link Functions: The Power Function. 5.7 Inference and Residual Analysis for Generalized Linear Models. 5.8 Examples with the Gamma Distribution. 5.9 Using R to Perform GLM Analysis. 5.10 GLM and Data Transformation. 5.11 Modeling Both a Process Mean and Process Variance Using GLM. 5.12 Quality of Asymptotic Results and Related Issues. 6. Generalized Estimating Equations. 6.1 Data Layout for Longitudinal Studies. 6.2 Impact of the Correlation Matrix R. 6.3 Iterative Procedure in the Normal Case, Identity Link. 6.4 Generalized Estimating Equations for More Generalized Linear Models. 6.5 Examples. 6.6 Summary. 7. Random Effects in Generalized Linear Models. 
7.1 Linear Mixed Effects Models. 7.2 Generalized Linear Mixed Models. 7.3 Generalized Linear Mixed Models Using Bayesian. 8. Designed Experiments and the Generalized Linear Model. 8.1 Introduction. 8.2 Experimental Designs for Generalized Linear Models. 8.3 GLM Analysis of Screening Experiments. Appendix A.1 Background on Basic Test Statistics. Appendix A.2 Background from the Theory of Linear Models. Appendix A.3 The Gauss-Markov Theorem, Var(ε) = σ²I. Appendix A.4 The Relationship Between Maximum Likelihood Estimation of the Logistic Regression Model and Weighted Least Squares. Appendix A.5 Computational Details for GLMs for a Canonical Link. Appendix A.6 Computational Details for GLMs for a Noncanonical Link. References. Index.


Journal Article•10.1046/J.1365-2656.2001.00524.X•

On the misuse of residuals in ecology: testing regression residuals vs. the analysis of covariance


Emili García-Berthou

01 Jul 2001-Journal of Animal Ecology

TL;DR: The residual index is an ad hoc sequential procedure with no statistical justification, unlike the well-known ANCOVA, and it is suggested that a t-test or an ANOVA of the residuals should never be used in place of an ANCOVA to study condition or any other variable.


Abstract: 1. An analysis of variance (ANOVA) or other linear models of the residuals of a simple linear regression is being increasingly used in ecology to compare two or more groups. Such a procedure (hereafter, ‘residual index’) was used in 8% and 2% of the papers published during 1999 in the Journal of Animal Ecology and in Ecology, respectively, and has been recently recommended for studying condition. 2. Although the residual index is similar to an analysis of covariance (ANCOVA), it is not identical and is incorrect for at least four reasons: (i) the regression coefficient used by the residual index differs from the one used in ANCOVA and is not the least-squares estimator of the model; (ii) in contrast to the ANCOVA, the error d.f. in the residual index are overestimated because of the estimation of the regression coefficient; (iii) the residual index also assumes the homogeneity of regression coefficients (parallelism assumption), which should be tested with a special ANCOVA design; (iv) even if the assumptions of the linear model hold for the original variables, they will not hold for the residuals. 3. More importantly, the residual index is an ad hoc sequential procedure with no statistical justification, unlike the well-known ANCOVA. For these reasons, I suggest that a t-test or an ANOVA of the residuals should never be used in place of an ANCOVA to study condition or any other variable.
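Reason (i) above can be illustrated numerically. The sketch below compares the slope of a single regression that ignores groups (the one a residual index is built on) with the pooled within-group slope that an ANCOVA with parallel lines estimates. The data and function names are hypothetical and chosen only to make the difference visible.

```python
def slope(x, y):
    """Least-squares slope of y on x, ignoring any grouping."""
    xbar = sum(x) / len(x)
    ybar = sum(y) / len(y)
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
    sxx = sum((a - xbar) ** 2 for a in x)
    return sxy / sxx


def pooled_within_slope(x, y, groups):
    """Common slope of a parallel-lines ANCOVA: within-group Sxy over within-group Sxx."""
    sxy = 0.0
    sxx = 0.0
    for g in set(groups):
        xs = [a for a, k in zip(x, groups) if k == g]
        ys = [b for b, k in zip(y, groups) if k == g]
        xbar = sum(xs) / len(xs)
        ybar = sum(ys) / len(ys)
        sxy += sum((a - xbar) * (b - ybar) for a, b in zip(xs, ys))
        sxx += sum((a - xbar) ** 2 for a in xs)
    return sxy / sxx


# Hypothetical data: both groups follow a slope of exactly 1, but group "b"
# has larger covariate values and an intercept shifted up by 3.
x = [1, 2, 3, 4, 5, 6, 7, 8]
groups = ["a"] * 4 + ["b"] * 4
y = [1, 2, 3, 4, 8, 9, 10, 11]

residual_index_slope = slope(x, y)                # about 1.571: absorbs the group shift
ancova_slope = pooled_within_slope(x, y, groups)  # exactly 1.0
```

Because the groups differ in their covariate values, the ungrouped regression attributes part of the group difference to the covariate. Residuals from that line therefore understate the true difference between groups.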

Journal Article•10.1111/J.0006-341X.2001.00253.X•

Nonparametric mixed effects models for unequally sampled noisy curves.

John Rice¹, Colin O. Wu²•Institutions (2)

University of California, Berkeley¹, Johns Hopkins University²

01 Mar 2001-Biometrics

TL;DR: A method of analyzing collections of related curves in which the individual curves are modeled as spline functions with random coefficients; it produces a low-rank, low-frequency approximation to the covariance structure that can be estimated naturally by the EM algorithm.

Abstract: We propose a method of analyzing collections of related curves in which the individual curves are modeled as spline functions with random coefficients. The method is applicable when the individual curves are sampled at variable and irregularly spaced points. This produces a low-rank, low-frequency approximation to the covariance structure, which can be estimated naturally by the EM algorithm. Smooth curves for individual trajectories are constructed as best linear unbiased predictor (BLUP) estimates, combining data from that individual and the entire collection. This framework leads naturally to methods for examining the effects of covariates on the shapes of the curves. We use model selection techniques--Akaike information criterion (AIC), Bayesian information criterion (BIC), and cross-validation--to select the number of breakpoints for the spline approximation. We believe that the methodology we propose provides a simple, flexible, and computationally efficient means of functional data analysis.

Journal Article•10.1093/BIOMET/88.4.987•

Hierarchical generalised linear models: A synthesis of generalised linear models, random-effect models and structured dispersions

Youngjo Lee¹, John A. Nelder¹•Institutions (1)

Seoul National University¹

01 Dec 2001-Biometrika

TL;DR: In this article, hierarchical generalised linear models (HGLMs) are developed as a synthesis of generalised linear models, mixed linear models and structured dispersions, and the restricted maximum likelihood method for the estimation of dispersion is extended to this wider class of models.

Abstract: Hierarchical generalised linear models are developed as a synthesis of generalised linear models, mixed linear models and structured dispersions. We generalise the restricted maximum likelihood method for the estimation of dispersion to the wider class and show how the joint fitting of models for mean and dispersion can be expressed by two interconnected generalised linear models. The method allows models with (i) any combination of a generalised linear model distribution for the response with any conjugate distribution for the random effects, (ii) structured dispersion components, (iii) different link and variance functions for the fixed and random effects, and (iv) the use of quasilikelihoods in place of likelihoods for either or both of the mean and dispersion models. Inferences can be made by applying standard procedures, in particular those for model checking, to components of either generalised linear model. We also show by numerical studies that the new method gives an efficient estimation procedure for a substantial class of models of practical importance. Likelihood-type inference is extended to this wide class of models in a unified way.

Journal Article•10.1198/016214501750333054•

A Model-Calibration Approach to Using Complete Auxiliary Information From Survey Data

Changbao Wu¹, Randy R. Sitter¹•Institutions (1)

Natural Sciences and Engineering Research Council¹

01 Mar 2001-Journal of the American Statistical Association

TL;DR: In this article, a unified model-assisted framework is proposed, built on a model-calibration technique; the resulting estimators can handle any linear or nonlinear working model and reduce to the conventional calibration estimators of Deville and Sarndal and/or the generalized regression estimators in the linear model case.

Abstract: Suppose that the finite population consists of N identifiable units. Associated with the ith unit are the study variable, yi, and a vector of auxiliary variables, xi. The values x1, x2,…, xN are known for the entire population (i.e., complete) but yi is known only if the ith unit is selected in the sample. One of the fundamental questions is how to effectively use the complete auxiliary information at the estimation stage. In this article, a unified model-assisted framework has been attempted using a proposed model-calibration technique. The proposed model-calibration estimators can handle any linear or nonlinear working models and reduce to the conventional calibration estimators of Deville and Sarndal and/or the generalized regression estimators in the linear model case. The pseudoempirical maximum likelihood estimator of Chen and Sitter, when used in this setting, gives an estimator that is asymptotically equivalent to the model-calibration estimator but with positive weights. Some existing estimators ...

Journal Article•10.1023/A:1011175525451•

Predictive Statistical Models for User Modeling

Ingrid Zukerman¹, David W. Albrecht¹•Institutions (1)

Monash University, Clayton campus¹

27 Mar 2001-User Modeling and User-adapted Interaction

TL;DR: The two main approaches to predictive statistical modeling, content-based and collaborative, are reviewed, and the main techniques used to develop predictive statistical models are discussed.

Abstract: The limitations of traditional knowledge representation methods for modeling complex human behaviour led to the investigation of statistical models. Predictive statistical models enable the anticipation of certain aspects of human behaviour, such as goals, actions and preferences. In this paper, we motivate the development of these models in the context of the user modeling enterprise. We then review the two main approaches to predictive statistical modeling, content-based and collaborative, and discuss the main techniques used to develop predictive statistical models. We also consider the evaluation requirements of these models in the user modeling context, and propose topics for future research.

Journal Article•10.1016/S0927-5398(01)00036-6•

The specification of conditional expectations

Campbell R. Harvey¹•Institutions (1)

Duke University¹

01 Dec 2001-Journal of Empirical Finance

TL;DR: In this paper, different specifications of conditional expectations are compared with nonparametric techniques that make no assumptions about the distribution of the data, and the conditional mean and variance of the NYSE market return are examined.

Proceedings Article•

Analysis of Sparse Bayesian Learning

A. C. Faul¹, Michael E. Tipping¹•Institutions (1)

Microsoft¹

3 Jan 2001

TL;DR: It is shown that conditioned on an individual hyper-parameter, the marginal likelihood has a unique maximum which is computable in closed form, and it is further shown that if a derived 'sparsity criterion' is satisfied, this maximum is exactly equivalent to 'pruning' the corresponding parameter from the model.

Abstract: The recent introduction of the 'relevance vector machine' has effectively demonstrated how sparsity may be obtained in generalised linear models within a Bayesian framework. Using a particular form of Gaussian parameter prior, 'learning' is the maximisation, with respect to hyperparameters, of the marginal likelihood of the data. This paper studies the properties of that objective function, and demonstrates that conditioned on an individual hyper-parameter, the marginal likelihood has a unique maximum which is computable in closed form. It is further shown that if a derived 'sparsity criterion' is satisfied, this maximum is exactly equivalent to 'pruning' the corresponding parameter from the model.
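
The closed-form maximum mentioned in the abstract can be sketched as follows. The abstract itself gives no formulas, so the notation here is an assumption: `s` and `q` stand for the per-basis-function "sparsity" and "quality" quantities as this result is usually presented, and the function names are hypothetical. Under that assumption, the part of the log marginal likelihood that depends on a single hyperparameter alpha is 0.5 * (log alpha - log(alpha + s) + q^2 / (alpha + s)), with s > 0. Setting its derivative to zero gives alpha = s^2 / (q^2 - s) when q^2 > s; otherwise the term increases towards alpha = infinity, which corresponds to pruning the parameter.

```python
import math


def alpha_dependent_term(alpha, s, q):
    """Part of the log marginal likelihood that depends on one hyperparameter.

    Assumed form (notation not given in the abstract); requires alpha > 0, s > 0.
    """
    return 0.5 * (math.log(alpha) - math.log(alpha + s) + q * q / (alpha + s))


def optimal_alpha(s, q):
    """Maximiser of alpha_dependent_term over alpha > 0.

    Returns math.inf when the assumed sparsity criterion q^2 <= s holds,
    i.e. when the corresponding parameter would be pruned from the model.
    """
    if q * q > s:
        return s * s / (q * q - s)
    return math.inf


# Hypothetical values chosen only for illustration.
for s_val, q_val in [(2.0, 3.0), (1.0, 0.5)]:
    print(s_val, q_val, optimal_alpha(s_val, q_val))
```

A finite return value keeps the basis function in the model with that precision; an infinite one forces the corresponding weight to zero.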

Journal Article•10.1111/J.0006-341X.2001.00795.X•

Linear mixed models with flexible distributions of random effects for longitudinal data.

Daowen Zhang¹, Marie Davidian¹•Institutions (1)

North Carolina State University¹

01 Sep 2001-Biometrics

TL;DR: It is demonstrated that standard information criteria may be used to choose the tuning parameter and detect departures from normality, and the approach is illustrated via simulation and using longitudinal data from the Framingham study.

Abstract: Normality of random effects is a routine assumption for the linear mixed model, but it may be unrealistic, obscuring important features of among-individual variation. We relax this assumption by approximating the random effects density by the seminonparametric (SNP) representation of Gallant and Nychka (1987, Econometrica 55, 363-390), which includes normality as a special case and provides flexibility in capturing a broad range of nonnormal behavior, controlled by a user-chosen tuning parameter. An advantage is that the marginal likelihood may be expressed in closed form, so inference may be carried out using standard optimization techniques. We demonstrate that standard information criteria may be used to choose the tuning parameter and detect departures from normality, and we illustrate the approach via simulation and using longitudinal data from the Framingham study.

Journal Article•10.1093/BIOMET/88.4.973•

Misspecified maximum likelihood estimates and generalised linear mixed models

Patrick J. Heagerty¹, Brenda F. Kurland¹•Institutions (1)

University of Washington¹

01 Dec 2001-Biometrika

TL;DR: In this article, the impact of model violations on the estimate of a regression coefficient in a generalised linear mixed model is investigated, and the authors evaluate the asymptotic relative bias that results from incorrect assumptions regarding the random effects.

Abstract: We investigate the impact of model violations on the estimate of a regression coefficient in a generalised linear mixed model. Specifically, we evaluate the asymptotic relative bias that results from incorrect assumptions regarding the random effects. We compare the impact of model violation for two parameterisations of the regression model. Substantial bias in the conditionally specified regression point estimators can result from using a simple random intercepts model when either the random effects distribution depends on measured covariates or there are autoregressive random effects. A marginally specified regression structure that is estimated using maximum likelihood is much less susceptible to bias resulting from random effects model misspecification.

Journal Article•10.1016/S0309-1708(00)00069-5•

Solving nonlinear water management models using a combined genetic algorithm and linear programming approach

Ximing Cai¹, Daene C. McKinney, Leon S. Lasdon²•Institutions (2)

International Food Policy Research Institute¹, University of Texas at Austin²

01 Jun 2001-Advances in Water Resources

TL;DR: In this paper, the authors describe strategies for solving large nonlinear water resources management models. The strategies combine genetic algorithms (GAs) with linear programming by identifying a set of complicating variables in the model which, when fixed, render the problem linear in the remaining variables.

Journal Article•10.1016/S0304-4076(00)00083-X•

GMM estimation of linear panel data models with time-varying individual effects

Seung C. Ahn¹, Young Hoon Lee², Peter Schmidt³•Institutions (3)

Arizona State University¹, Hansung University², Michigan State University³

01 Apr 2001-Journal of Econometrics

TL;DR: In this article, the authors consider models for panel data in which the individual effects vary over time; the temporal pattern of variation is arbitrary but is the same for all individuals.

Journal Article•10.1016/S1093-3263(01)00098-5•

The connectivity index 25 years after.

Milan Randic¹•Institutions (1)

Drake University¹

01 Dec 2001-Journal of Molecular Graphics & Modelling

TL;DR: This work reviews the developments following introduction of the connectivity indices as molecular descriptors in multiple linear regression analysis (MLRA) for structure-property-activity studies and discusses the results obtained with applications of the variable connectivity index.

Abstract: We review the developments following introduction of the connectivity indices as molecular descriptors in multiple linear regression analysis (MLRA) for structure-property-activity studies. We end the review with discussion of results obtained with applications of the variable connectivity index. A comparison is made between some results obtained with the traditional topological indices and the variable connectivity index.

Journal Article•10.1016/S0167-9473(00)00018-9•

Fast maximum likelihood estimation of very large spatial autoregression models: a characteristic polynomial approach

Oleg Smirnov¹, Luc Anselin²•Institutions (2)

University of Texas at Dallas¹, University of Illinois at Urbana–Champaign²

28 Jan 2001-Computational Statistics & Data Analysis

TL;DR: A new method for evaluating the Jacobian term, based on the characteristic polynomial of the spatial weights matrix W, is outlined. Its computational complexity approaches linear, making it the fastest direct method currently available, especially for very large data sets.
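
For orientation, the Jacobian term in the log-likelihood of a spatial autoregressive model is commonly written as ln|I - ρW|. The sketch below does not implement the characteristic polynomial method summarised above. It only shows what the term is, using two standard brute-force evaluations that a faster method would replace: a dense log-determinant, and the sum of ln(1 - ρλ) over the eigenvalues λ of W. The ring-shaped weights matrix and all function names are hypothetical, chosen so that W is symmetric and its eigenvalues are real.

```python
import numpy as np


def ring_weights(n):
    """Row-standardised weights for n >= 3 units on a ring (two neighbours each)."""
    W = np.zeros((n, n))
    for i in range(n):
        W[i, (i - 1) % n] = 0.5
        W[i, (i + 1) % n] = 0.5
    return W


def log_jacobian_dense(W, rho):
    """ln|I - rho * W| from a dense log-determinant (cubic cost in n)."""
    sign, logdet = np.linalg.slogdet(np.eye(W.shape[0]) - rho * W)
    if sign <= 0:
        raise ValueError("I - rho * W is not positive definite for this rho")
    return float(logdet)


def log_jacobian_from_eigenvalues(eigvals, rho):
    """ln|I - rho * W| as sum(ln(1 - rho * lambda_i)), for real eigenvalues."""
    return float(np.sum(np.log(1.0 - rho * eigvals)))


W = ring_weights(8)
eigvals = np.linalg.eigvalsh(W)  # valid here because this W is symmetric
for rho in (-0.5, 0.3, 0.9):
    print(rho, log_jacobian_dense(W, rho), log_jacobian_from_eigenvalues(eigvals, rho))
```

The eigenvalue form pays the decomposition cost once and then evaluates the term cheaply for each candidate ρ, but the decomposition itself becomes impractical for very large W, which is the setting the summary above is concerned with.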
