TL;DR: A new method for estimation in linear models called the lasso, which minimizes the residual sum of squares subject to the sum of the absolute value of the coefficients being less than a constant, is proposed.
Abstract: We propose a new method for estimation in linear models. The 'lasso' minimizes the residual sum of squares subject to the sum of the absolute value of the coefficients being less than a constant. Because of the nature of this constraint it tends to produce some coefficients that are exactly 0 and hence gives interpretable models. Our simulation studies suggest that the lasso enjoys some of the favourable properties of both subset selection and ridge regression. It produces interpretable models like subset selection and exhibits the stability of ridge regression. There is also an interesting relationship with recent work in adaptive function estimation by Donoho and Johnstone. The lasso idea is quite general and can be applied in a variety of statistical models: extensions to generalized regression models and tree-based models are briefly described.
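To illustrate why the absolute-value constraint yields coefficients that are exactly 0, here is a minimal sketch for the special case of an orthonormal design, where the lasso estimate reduces to soft thresholding of the least squares coefficients. The function names, the example coefficients, and the threshold `gamma` are illustrative assumptions; in the constrained form described above, `gamma` would be chosen so that the sum of absolute coefficients meets the bound.

```python
def soft_threshold(b, gamma):
    """Shrink b toward zero by gamma; return exactly 0.0 when |b| <= gamma."""
    if b > gamma:
        return b - gamma
    if b < -gamma:
        return b + gamma
    return 0.0


def lasso_orthonormal(ols_coefs, gamma):
    """Lasso coefficients for an orthonormal design (assumed special case).

    ols_coefs: least squares estimates, one per predictor.
    gamma:     non-negative shrinkage amount (hypothetical tuning value).
    """
    return [soft_threshold(b, gamma) for b in ols_coefs]


# Hypothetical least squares estimates; small ones are set exactly to zero.
ols = [3.0, -0.4, 1.5, 0.2]
shrunk = lasso_orthonormal(ols, 0.5)
print(shrunk)                       # [2.5, 0.0, 1.0, 0.0]
print(sum(abs(b) for b in shrunk))  # 3.5, the implied bound on the absolute sum
```

Ridge regression, by contrast, rescales every coefficient proportionally in this setting and so never produces exact zeros.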
TL;DR: In this article, an easily interpretable index of predictive discrimination as well as methods for assessing calibration of predicted survival probabilities are discussed, which are particularly needed for binary, ordinal, and time-to-event outcomes.
Abstract: Multivariable regression models are powerful tools that are used frequently in studies of clinical outcomes. These models can use a mixture of categorical and continuous variables and can handle partially observed (censored) responses. However, uncritical application of modelling techniques can result in models that poorly fit the dataset at hand, or, even more likely, inaccurately predict outcomes on new subjects. One must know how to measure qualities of a model's fit in order to avoid poorly fitted or overfitted models. Measurement of predictive accuracy can be difficult for survival time data in the presence of censoring. We discuss an easily interpretable index of predictive discrimination as well as methods for assessing calibration of predicted survival probabilities. Both types of predictive accuracy should be unbiasedly validated using bootstrapping or cross-validation, before using predictions in a new data series. We discuss some of the hazards of poorly fitted and overfitted regression models and present one modelling strategy that avoids many of the problems discussed. The methods described are applicable to all regression models, but are particularly needed for binary, ordinal, and time-to-event outcomes. Methods are illustrated with a survival analysis in prostate cancer using Cox regression.
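One widely used discrimination index for censored survival data is the concordance (c) index. The sketch below assumes that this is the kind of index meant and is not taken from the article itself: it counts, among pairs whose ordering can be determined despite censoring, how often the subject predicted to survive longer actually did. The function name, argument layout, and the handling of ties are assumptions.

```python
def concordance_index(times, events, predicted):
    """Concordance index for right-censored data (illustrative sketch).

    times:     observed follow-up times.
    events:    1/True if the event was observed, 0/False if censored.
    predicted: predicted survival (higher means longer predicted survival).
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        for j in range(i + 1, n):
            if times[i] == times[j]:
                continue  # tied observed times are skipped for simplicity
            # a is the subject with the shorter observed time
            a, b = (i, j) if times[i] < times[j] else (j, i)
            if not events[a]:
                continue  # shorter time is censored, so the order is unknown
            usable += 1
            if predicted[a] < predicted[b]:
                concordant += 1.0
            elif predicted[a] == predicted[b]:
                concordant += 0.5  # tied predictions count as half
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable


# Hypothetical data: four subjects, the second one censored.
c = concordance_index([5, 8, 3, 10], [1, 0, 1, 1], [4.0, 9.0, 5.0, 7.0])
print(c)  # 0.8
```

A value of 0.5 corresponds to random predictions and 1.0 to perfect ordering; computing the index on the data used for fitting gives an optimistic estimate, which is why the abstract calls for bootstrap or cross-validated assessment.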
TL;DR: Covers genetic evaluation with different sources of records and best linear unbiased prediction of breeding value, from univariate models with one random effect through non-additive animal models, including the dominance relationship matrix, a method for rapid inversion of the dominance matrix, and epistasis.
Abstract: Part 1 Genetic evaluation with different sources of records: the basic model; breeding value prediction from animal own performance; breeding value prediction from progeny records; breeding value prediction from pedigree; breeding value prediction for one trait from another; selection index. Part 2 Genetic relationship between relatives: the numerator relationship matrix; decomposing the relationship matrix; computing inverse of the relationship matrix; inverse of the relationship matrix for sires and maternal grandsires. Part 3 Best linear unbiased prediction of breeding value - univariate models with one random effect: brief theoretical background; a model for an animal evaluation (animal model); a sire model; reduced animal model; animal model with groups. Part 4 Best linear unbiased prediction of breeding value - models with environmental effects: repeatability model; models with common environmental effects. Part 5 Best linear unbiased prediction of breeding value - multivariate models: equal design matrices and no missing records; canonical transformation; equal design matrices with missing records; Cholesky transformation; unequal design matrices; different traits measured on relatives. Part 6 Maternal trait models - animal and reduced animal models: animal model for a maternal trait; reduced animal model with maternal effects; multivariate maternal animal model. Part 7 Non-additive animal models: dominance relationship matrix; animal model with dominance effects; method for rapid inversion of the dominance matrix; epistasis. Part 8 Solving linear equations: direct inversion; iterating on the mixed model equations; iterating on the data.
TL;DR: In this paper, a new method for estimating the geographical distribution of plant and animal species from incomplete field survey data is developed, by extending a logistic model to include an extra covariate which is derived from the responses at neighbouring squares, known as an autologistic model.
Abstract: 1. A new method for estimating the geographical distribution of plant and animal species from incomplete field survey data is developed. 2. Wildlife surveys are often conducted by dividing a study region into a regular grid and collecting data on abundance or on presence/absence from some or all of the squares in the grid. Generalized linear models (GLMs) can be used to model the spatial distribution of a species within such a grid by relating the response variable (abundance or presence/absence) to spatially referenced covariates. 3. Such models ignore or at best indirectly model dependence on unmeasured covariates, and the intrinsic spatial autocorrelation arising for example in gregarious populations. 4. We describe a procedure for use with presence/absence data in which spatial autocorrelation is modelled explicitly. We achieve this by extending a logistic model to include an extra covariate which is derived from the responses at neighbouring squares. The extended model is known as an autologistic model. 5. To allow fitting of the autologistic model when only a random sample of squares is surveyed, we use the Gibbs sampler to predict presence/absence at unsurveyed squares. 6. We compare the autologistic model with the ordinary logistic model using red deer census data. Both models are fitted to a subsample of 20% of the data and results are compared with the 'true' abundance and spatial distribution indicated by the full census. We conclude that the autologistic model is superior for estimating the spatial distribution of the deer, whereas the ordinary logistic model yields more precise estimates of the overall number of squares occupied by deer at the time of the survey.
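The extra covariate of the autologistic model can be sketched as follows. This is a minimal illustration, assuming the covariate is the mean presence over the four edge-sharing neighbours of a square; the paper may define the neighbourhood and weighting differently, and the coefficient values below are hypothetical.

```python
import math


def autocovariate(grid, row, col):
    """Mean presence (0/1) over the up/down/left/right neighbours of a square."""
    n_rows, n_cols = len(grid), len(grid[0])
    values = []
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        r, c = row + dr, col + dc
        if 0 <= r < n_rows and 0 <= c < n_cols:
            values.append(grid[r][c])
    return sum(values) / len(values) if values else 0.0


def occupancy_probability(intercept, beta, x, gamma, autocov):
    """Logistic probability with one ordinary covariate x plus the autocovariate."""
    eta = intercept + beta * x + gamma * autocov
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical 3x3 presence/absence grid.
grid = [
    [1, 1, 0],
    [0, 1, 0],
    [0, 0, 0],
]
ac = autocovariate(grid, 1, 1)
print(ac)  # 0.25
print(occupancy_probability(-1.0, 0.5, 2.0, 3.0, ac))
```

When some squares are unsurveyed, their responses are unknown and the autocovariate cannot be computed directly, which is the gap the Gibbs sampler in point 5 is used to fill.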
TL;DR: In this article, a computationally simple partial R-squared measure of instrument relevance for multivariate models is proposed, which is a measure of the correlation between instruments and explanatory variables.
Abstract: The correlation between instruments and explanatory variables is a key determinant of the performance of the instrumental variables estimator. The R-squared from regressing the explanatory variable on the instrument vector is a useful measure of relevance in univariate models, but can be misleading when there are multiple endogenous variables. This paper proposes a computationally simple partial R-squared measure of instrument relevance for multivariate models.
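For orientation, the baseline measure mentioned above can be sketched for the simplest case of one explanatory variable and one instrument, where the first-stage R-squared equals the squared sample correlation. This is only the univariate baseline, not the partial R-squared the paper proposes; the function name and data are illustrative assumptions.

```python
def first_stage_r_squared(instrument, regressor):
    """R-squared from regressing one explanatory variable on one instrument.

    With a single instrument and an intercept this equals the squared
    sample correlation between the two series.
    """
    n = len(instrument)
    mean_z = sum(instrument) / n
    mean_x = sum(regressor) / n
    s_zz = sum((z - mean_z) ** 2 for z in instrument)
    s_xx = sum((x - mean_x) ** 2 for x in regressor)
    s_zx = sum((z - mean_z) * (x - mean_x) for z, x in zip(instrument, regressor))
    if s_zz == 0 or s_xx == 0:
        raise ValueError("both series must vary")
    return s_zx ** 2 / (s_zz * s_xx)


# Hypothetical data: a perfectly relevant and an irrelevant instrument.
print(first_stage_r_squared([1, 2, 3, 4], [2, 4, 6, 8]))    # 1.0
print(first_stage_r_squared([1, 2, 3, 4], [1, -1, -1, 1]))  # 0.0
```

With several endogenous variables, each one can show a high first-stage R-squared while the instruments still fail to separate their effects, which is the situation the abstract warns about.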
TL;DR: In this paper, the authors investigated the impact of the normality assumption for random effects on their estimates in the linear mixed-effects model and showed that if the distribution of random effects is a finite mixture of normal distributions, then the random effects may be badly estimated if normality is assumed.
Abstract: This article investigates the impact of the normality assumption for random effects on their estimates in the linear mixed-effects model. It shows that if the distribution of random effects is a finite mixture of normal distributions, then the random effects may be badly estimated if normality is assumed, and the current methods for inspecting the appropriateness of the model assumptions are not sound. Further, it is argued that a better way to detect the components of the mixture is to build this assumption in the model and then “compare” the fitted model with the Gaussian model. All of this is illustrated on two practical examples.
TL;DR: Estimation Theory for Nonlinear Models and Set Membership Uncertainty M. Milanese, A. Vicino, and S.M. Veres, J.P. Norton.
Abstract: Overview of the Volume J. Norton. Optimal Estimation Theory for Dynamic System with Set Membership Uncertainty: An Overview M. Milanese, A. Vicino. Solving Linear Problems in the Presence of Bounded Data Perturbations B.Z. Kacewicz. A Review and a Comparison of Ellipsoidal Bounding Algorithms G. Favier, L.V.R. Arruda. On the Deadzone in System Identification K. Forsman, L. Ljung. Recursive Estimation Algorithms for Linear Models with Set Membership Error G. Belforte, T.T. Tay. Transfer Function Parameter Interval Estimation Using Recursive Least Squares in the Time and Frequency Domains P.O. Gutman. Volume-optimal Inner and Outer Ellipsoids L. Pronzato, E. Walter. Linear Interpolation and Estimation Using Interval Analysis S.M. Markov, E.D. Popova. Adaptive Approximation of Uncertainty Sets for Linear Regression Models A. Vicino, G. Zappa. Worst-case l1 Identification M. Milanese. Recursive Robust Minimax Estimation E. Walter, H. Piet-Lahanier. Robustness to Outliers of Bounded-error Estimators, Consequences on Experiment Design L. Pronzato, E. Walter. Ellipsoidal State Estimation for Uncertain Dynamical Systems T.F. Filipova, et al. Set-valued Estimation of State and Parameter Vectors within Adaptive Control-Systems V.M. Kuntsevich. Limited-complexity Polyhedric Tracking H. Piet-Lahanier, E. Walter. Parameter-bounding Algorithms for Linear Errors in Variables Models S.M. Veres, J.P. Norton. Errors-in-variables Models in Parameter Bounding V. Cerone. Identification of Linear Objects with Bounded Disturbances in Both Input and Output Channels Yu.A. Merkuryev. Identification of Nonlinear State-space Models by Deterministic Search J.P. Norton, S.M. Veres. Robust Identification and Prediction for Nonlinear State-Space Models with Bounded Output Error K.J. Keesman. Estimation Theory for Nonlinear Models and Set Membership Uncertainty M. Milanese, A. Vicino. Guaranteed Nonlinear Set Estimation via Interval Analysis L. Jaulin, E. Walter.
On Adaptive Control of Systems Subjected to Bounded Disturbances L.S. Zhitecki. Predictive Self-tuning Control by Parameter Bounding and Worst-case Design S.M. Veres, J.P. Norton. Estimation of a Mobile Robot Localization: Geometric Approaches D. Meizel, et al. Improved Image Compression Using Bounded Error Parameter Estimation Concepts A.K. Rao. Application of OBE Algorithms to Speech Analysis, Recognition and Coding J.R. Deller Jr., et al. 2 additional articles. Index.
TL;DR: In this paper, the authors investigate the tradeoff between the number of profiles per subject and number of subjects on the statistical accuracy of the estimators that describe the partworth heterogeneity.
Abstract: The drive to satisfy customers in narrowly defined market segments has led firms to offer wider arrays of products and services. Delivering products and services with the appropriate mix of features for these highly fragmented market segments requires understanding the value that customers place on these features. Conjoint analysis endeavors to unravel the value, or partworths, that customers place on the product or service's attributes from experimental subjects' evaluation of profiles based on hypothetical products or services. When the goal is to estimate the heterogeneity in the customers' partworths, traditional estimation methods, such as least squares, require each subject to respond to more profiles than product attributes, resulting in lengthy questionnaires for complex, multiattributed products or services. Long questionnaires pose practical and theoretical problems. Response rates tend to decrease with increasing questionnaire length, and more importantly, academic evidence indicates that long questionnaires may induce response biases.
The problems associated with long questionnaires call for experimental designs and estimation methods that recover the heterogeneity in the partworths with shorter questionnaires. Unlike more popular estimation methods, Hierarchical Bayes (HB) random effects models do not require that individual-level design matrices be of full rank, which leads to the possibility of using fewer profiles per subject than currently used. Can this theoretical possibility be practically implemented?
This paper tests this conjecture with empirical studies and mathematical analysis. The random effects model in the paper describes the heterogeneity in subject-level partworths or regression coefficients with a linear model that can include subject-level covariates. In addition, the error variances are specific to the subjects, thus allowing for the differential use of the measurement scale by different subjects.
In the empirical study, subjects' responses to a full profile design are randomly deleted to test the performance of HB methods with declining sample sizes. These simple experiments indicate that HB methods can recover heterogeneity and estimate individual-level partworths, even when individual-level least squares estimators do not exist due to insufficient degrees of freedom.
Motivated by these empirical studies, the paper analytically investigates the trade-off between the number of profiles per subject and the number of subjects on the statistical accuracy of the estimators that describe the partworth heterogeneity. The paper considers two experimental designs: each subject receives the same set of profiles, and subjects receive different blocks of a fractional factorial design. In the first case, the optimal design, subject to a budget constraint, uses more subjects and fewer profiles per subject when the ratio of unexplained, partworth heterogeneity to unexplained response variance is large. In the second case, one can maintain a given level of estimation accuracy as the number of profiles per subject decreases by increasing the number of subjects assigned to each block.
These results provide marketing researchers the option of using shorter questionnaires for complex products or services. The analysis assumes that response quality is independent of questionnaire length and does not address the impact of design factors on response quality. If response quality and questionnaire length were, in fact, unrelated, then marketing researchers would still find the paper's results useful in improving the efficiency of their conjoint designs. However, if response quality were to decline with questionnaire length, as the preponderance of academic research indicates, then the option to use shorter questionnaires would become even more valuable.
TL;DR: The mathematical extensions and heuristics that move the method from the theoretical to the practical are reviewed and the effectiveness of model regularization, dynamic model modification and optimization strategies are experimentally analyzed.
Abstract: Hidden Markov models (HMMs) are a highly effective means of modeling a family of unaligned sequences or a common motif within a set of unaligned sequences. The trained HMM can then be used for discrimination or multiple alignment. The basic mathematical description of an HMM and its expectation-maximization training procedure is relatively straightforward. In this paper, we review the mathematical extensions and heuristics that move the method from the theoretical to the practical. We then experimentally analyze the effectiveness of model regularization, dynamic model modification and optimization strategies. Finally it is demonstrated on the SH2 domain how a domain can be found from unaligned sequences using a special model type. The experimental work was completed with the aid of the Sequence Alignment and Modeling software suite.
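As a point of reference for the "basic mathematical description" mentioned above, the sketch below computes the likelihood of an observed sequence under a plain discrete HMM with the forward recursion; this likelihood is the quantity a trained model can use to score sequences for discrimination. It assumes a simple HMM without the silent delete states used in profile models for sequence alignment, and all parameter values are hypothetical.

```python
def forward_likelihood(obs, start, trans, emit):
    """Probability of an observation sequence under a discrete HMM.

    obs:   list of symbol indices.
    start: start[s] is the initial probability of state s.
    trans: trans[p][s] is the probability of moving from state p to state s.
    emit:  emit[s][o] is the probability that state s emits symbol o.
    """
    if not obs:
        raise ValueError("observation sequence must be non-empty")
    n_states = len(start)
    # alpha[s] = P(observations so far, current state = s)
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    for o in obs[1:]:
        alpha = [
            sum(alpha[p] * trans[p][s] for p in range(n_states)) * emit[s][o]
            for s in range(n_states)
        ]
    return sum(alpha)


# Hypothetical two-state model over a two-symbol alphabet.
start = [0.6, 0.4]
trans = [[0.7, 0.3], [0.4, 0.6]]
emit = [[0.9, 0.1], [0.2, 0.8]]
print(forward_likelihood([0, 1, 0], start, trans, emit))
```

For long sequences the products underflow, so practical implementations work with scaled or log-space quantities.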
TL;DR: In this paper, an extension of the concept of quantiles in multidimensions that uses the geometry of multivariate data clouds has been considered, based on blending and generalization of the key ideas used in the construction of spatial median and regression quantiles, both of which have been extensively studied in the literature.
Abstract: An extension of the concept of quantiles in multidimensions that uses the geometry of multivariate data clouds has been considered. The approach is based on blending as well as generalization of the key ideas used in the construction of spatial median and regression quantiles, both of which have been extensively studied in the literature. These geometric quantiles are potentially useful in constructing trimmed multivariate means as well as many other L estimates of multivariate location, and they lead to a directional notion of central and extreme points in a multidimensional setup. Such quantiles can be defined as meaningful and natural objects even in infinite-dimensional Hilbert and Banach spaces, and they yield an effective generalization of quantile regression in multiresponse linear model problems. Desirable equivariance properties are shown to hold for these multivariate quantiles, and issues related to their computation for data in finite-dimensional spaces are discussed. n 1/2 consistenc...
TL;DR: This chapter considers the problem of heterogeneity across groups (individuals, firms, regions or countries) in dynamic panels, noting that variation in slopes has received far less attention than intercept variation.
Abstract: This chapter is concerned with the problem of heterogeneity across groups (individuals, firms, regions or countries) in dynamic panels. While it is widely recognised that parameter heterogeneity can have important consequences for estimation and inference, most attempts at dealing with it have focused on allowing for intercept variation, and in comparison little attention has been paid to the implications of variation in slopes. There are a number of justifications that can be advanced for this neglect.
TL;DR: This paper describes the form of the TRL studies and the model-fitting procedures used, and gives examples of the models which have been developed, and constitutes a comprehensive methodology for the development of predictive accident models.
TL;DR: This paper obtains strong Bahadur representations for a general class of M-estimators computed from independent but not necessarily identically distributed random variables.
Abstract: We obtain strong Bahadur representations for a general class of M-estimators that satisfy $\sum_i \psi (x_i, \theta) = o(\delta_n)$, where the $x_i$'s are independent but not necessarily identically distributed random variables. The results apply readily to M-estimators of regression with nonstochastic designs. More specifically, we consider the minimum $L_p$ distance estimators, bounded influence GM-estimators and regression quantiles. Under appropriate design conditions, the error rates obtained for the first-order approximations are sharp in these cases. We also provide weaker and more easily verifiable conditions that suffice for an error rate that is suboptimal but strong enough for deriving the asymptotic distribution of M-estimators in a wide variety of problems.
TL;DR: In this article, conditional means priors are extended to generalized linear models and data augmentation priors where the prior is of the same form as the likelihood are also considered, and the prior distribution on regression coefficients is induced from this specification.
Abstract: This article deals with specifications of informative prior distributions for generalized linear models. Our emphasis is on specifying distributions for selected points on the regression surface; the prior distribution on regression coefficients is induced from this specification. We believe that it is inherently easier to think about conditional means of observables given the regression variables than it is to think about model-dependent regression coefficients. Previous use of conditional means priors seems to be restricted to logistic regression with one predictor variable and to normal theory regression. We expand on the idea of conditional means priors and extend these to arbitrary generalized linear models. We also consider data augmentation priors where the prior is of the same form as the likelihood. We show that data augmentation priors are special cases of conditional means priors. With current Monte Carlo methodology, such as importance sampling and Gibbs sampling, our priors result in...
TL;DR: It turns out that statistical linear regression is superior to fuzzy linear regression in terms of predictive capability, whereas their comparative descriptive performance depends on various factors associated with the data set and proper specificity of the model.
TL;DR: A comparative investigation of both logistic regression models and feed-forward neural networks including some extensions is presented and the theoretical features and properties are reviewed and illustrated in two examples.
TL;DR: The authors consider applications to finance (portfolio optimization) and actuarial risks (investigation of the dependencies between individual claims on the riskiness of portfolios), and their eye is always toward application.
Abstract: and 5 continue with material investigating the comparison and monotonicity of stochastic models and processes. In the concluding chapters (6–8), the authors turn their attention to applications of stochastic orders. Chapter 6 focuses on queuing theory. In particular, the application of stochastic orders to single-server and multiserver systems is considered. Chapter 7 pursues applications to some stochastic models. Applications in reliability, PERT and scheduling, and statistical physics are developed. Chapter 8 concludes the book by addressing applications to comparison of risks. The authors consider applications to finance (portfolio optimization) and actuarial risks (investigation of the dependencies between individual claims on the riskiness of portfolios). It should be said that the book requires a considerable background in stochastic models in some places. However, the authors provide a very logical organization, and their eye is always toward application.
TL;DR: In this article, nonparametric estimators are applied to an earnings model using data from the Current Population Survey (CPS) to estimate the probability that individuals with low earnings will become high earners in the future.
Abstract: Linear models with error components are widely used to analyse panel data. Some applications of these models require knowledge of the probability densities of the error components. Existing methods handle this requirement by assuming that the densities belong to known parametric families of distributions (typically the normal distribution). This paper shows how to carry out nonparametric estimation of the densities of the error components, thereby avoiding the assumption that the densities belong to known parametric families. The nonparametric estimators are applied to an earnings model using data from the Current Population Survey. The model's transitory error component is not normally distributed. Use of the nonparametric density estimators yields estimates of the probability that individuals with low earnings will become high earners in the future that are much lower than the estimates obtained under the assumption of normally distributed error components.
TL;DR: In this paper, the authors make use of bivariate tensor-product B-splines as an approximation of the function and consider M-type regression splines by minimization of $\sum_{i=1}^{n} \rho(Y_i - X_i^T \beta - g_n(T_i))$ for some convex function $\rho$.
TL;DR: In this article, the author proposes expanded downscaling, which preserves daily variability to the extent that possible climate change permits by transforming the unconstrained minimization of the error cost function into a constrained minimization problem, with the preservation of local covariance forming the side condition.
Abstract: In an attempt to reconcile the capabilities of statistical downscaling and the demands of ecosystem modeling, the technique of expanded downscaling is introduced. Aimed at use in ecosystem models, emphasis is placed on the preservation of daily variability to the extent that possible climate change permits. Generally, the expansion is possible for any statistical model which is formulated by utilizing some form of regression, but I will concentrate on linear models as they are easier to handle. Linear statistical downscaling assumes that the local climate anomalies are linearly linked to the global circulation anomalies. In expanded downscaling, in contrast, I propose that the local climate covariance is linked bilinearly to the global circulation covariance. This is done by transforming the technique of unconstrained minimization of the error cost function into a constrained minimization problem, with the preservation of local covariance forming the side condition. A general normalization routine is included on the local side in order to perform the downscaling exclusively with normally distributed variables. Application of the expanded operator to the daily, global circulation works essentially like a weather generator. Using observed geopotential height fields over the North Atlantic and Europe gave consistent results for the weather station at Potsdam with 14 measured quantities, even for moisture-related variables. For GCM (general circulation model) scenarios, satisfactory results are obtained when the original variables are normally distributed. If they are not, strong sensitivity even to small input changes causes the normalization to produce large errors. Non-normally distributed variables such as most moisture variables are therefore strongly affected by even slight deficiencies of current GCMs with respect to daily variability and climatology. This marks the limit of applicability of expanded downscaling. KEY WORDS: Statistical downscaling · Climate change · Climate impact · Weather generator
TL;DR: In this article, the authors present graphical methods of displaying data; one-way designs; factorial designs; repeated measure designs; simple linear regression and multiple regression analysis; log-linear models and logistic regression; the generalised linear model; distribution-free, computer-intensive models; multivariate analysis I - principal components analysis and exploratory factor analysis; multivariate analysis II - confirmatory factor analysis and covariance structure modelling; multivariate analysis III - cluster analysis, discriminant analysis, and multidimensional scaling; the assessment of reliability.
Abstract: Data, models, and a little history; graphical methods of displaying data; one-way designs; factorial designs; repeated measure designs; simple linear regression and multiple regression analysis; log-linear models and logistic regression; the generalised linear model; distribution-free, computer-intensive models; multivariate analysis I - principal components analysis and exploratory factor analysis; multivariate analysis II - confirmatory factor analysis and covariance structure modelling; multivariate analysis III - cluster analysis, discriminant analysis, and multidimensional scaling; the assessment of reliability.
TL;DR: A model is presented that takes into account characteristics of the portfolio optimization problem that most optimization models disregard; it generalizes one of the linear models that recently appeared in the literature as an alternative to the classical Markowitz model.
TL;DR: This thesis investigates how three different kinds of prior process knowledge can be utilized to deal with model structure selection in system identification and proposes a simplified fuzzy model structure customized to reflect a monotone steady-state behavior.
Abstract: One of the most challenging problems in system identification is that of model structure selection. In this thesis we investigate how three different kinds of prior process knowledge can be utilized to deal with this fundamental issue. The material is presented in four parts. The first one reviews some basic modeling and identification concepts, algorithms and ideas from which the following main three parts take off. The second part considers linear model structures that benefit from prior knowledge about the time constants and resonance frequencies of the underlying system. The idea is to generalize FIR modeling by replacing the delay operator with discrete so-called Laguerre or Kautz filters. While the nice properties of the FIR structure (stability, linear regression formulation, good approximation capability, etc.) are retained, the prior is used to reduce the number of parameters typically needed in the models. Tailored and efficient identification algorithms for these structures are developed and analyzed. The usefulness of the proposed methods is also demonstrated through a number of concrete simulation and application studies. The next approach is termed semi-physical modeling. Starting with simple physical insight into the application, often in terms of a set of unstructured equations, we are here looking for suitable nonlinear transformations of the raw measurements, so as to come up with a reasonable model structure. The suggested modeling procedure shows a first step where symbolic computations are employed to determine a set of physically motivated regressors. We discuss and show how constructive tools from commutative and differential algebra can be applied for this. Then, to avoid unnecessarily complex models, we address the intertwined problem of selecting a "good" subset of these regressors and how to estimate the corresponding parameters.
More informal tools such as the programming environment are also treated. The fourth part concerns fuzzy identification where the prior structural knowledge comes in terms of a set of linguistic production rules. We emphasize the close connections between a particular fuzzy model structure on one hand and neural networks, model trees, etc. on the other hand. Several estimation related issues for this fuzzy structure are discussed, e.g., what algorithms to use, the need for regularization, and so on. It is also shown that the expert knowledge easily can be lost in the estimation procedure, unless special parameter restrictions are imposed. In addition, we propose a simplified fuzzy model structure customized to reflect a monotone steady-state behavior. Two applications, a tank and a water heating system, are investigated and successfully modeled within this framework. Finally and quite importantly, prototype software tools supporting the suggested approaches have been designed and implemented. The usefulness of these is illustrated in a number of identification applications.
TL;DR: In this article, a wedge-shaped pattern of variation in stream fish standing stock estimates relative to a habitat variable, in which range of standing stocks increases as a function of the variable, is consistent with the concept that the habitat variable is a limiting factor for fish populations.
Abstract: A wedge-shaped pattern of variation in stream fish standing stock estimates relative to a habitat variable, in which range of standing stocks increases as a function of the variable, is consistent with the concept that the habitat variable is a limiting factor for fish populations. This pattern of variation complicates interpretation of parameter estimates and significance of ordinary least-squares (OLS) regression models of conditional mean standing stock; slopes of these regression models may have little or no relation to slopes of models describing standing stock limits. We modeled standing stock limits by testing for homoscedastic error distributions, screening plots of coordinate pairs for evidence of a wedge-shaped pattern of data, and estimating 90th regression quantiles for simple linear models. Application of this technique to data sets supporting 35 previously published OLS regression models of stream fish standing stocks led to rejection of homoscedasticity (P < 0.10) in 13 of the 35 d...
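The idea of estimating a 90th regression quantile for a simple linear model can be sketched as below. This is a brute-force illustration for small data sets, not the authors' procedure: it relies on the fact that, for a line with intercept and slope, some minimizer of the check (pinball) loss passes through two data points, so it is enough to try every pair with distinct x values. The function names and the wedge-shaped example data are assumptions.

```python
from itertools import combinations


def check_loss(residual, tau):
    """Asymmetric absolute loss: weight tau above the line, 1 - tau below."""
    return residual * tau if residual >= 0 else residual * (tau - 1.0)


def total_check_loss(xs, ys, intercept, slope, tau):
    return sum(check_loss(y - (intercept + slope * x), tau) for x, y in zip(xs, ys))


def fit_quantile_line(xs, ys, tau):
    """Return (intercept, slope) minimizing the total check loss.

    Exhaustive search over lines through pairs of points; O(n^3), so only
    suitable for small illustrative data sets.
    """
    best = None
    for i, j in combinations(range(len(xs)), 2):
        if xs[i] == xs[j]:
            continue  # a vertical pair does not define a regression line
        slope = (ys[j] - ys[i]) / (xs[j] - xs[i])
        intercept = ys[i] - slope * xs[i]
        loss = total_check_loss(xs, ys, intercept, slope, tau)
        if best is None or loss < best[0]:
            best = (loss, intercept, slope)
    if best is None:
        raise ValueError("need at least two points with distinct x values")
    return best[1], best[2]


# Hypothetical wedge-shaped data: the spread of y grows with x.
xs = [1, 1, 2, 2, 3, 3, 4, 4]
ys = [1.0, 1.2, 1.1, 2.1, 0.9, 3.2, 1.0, 3.9]
print(fit_quantile_line(xs, ys, 0.9))  # line tracking the upper edge
print(fit_quantile_line(xs, ys, 0.5))  # median regression line
```

On wedge-shaped data the upper quantile line follows the upper boundary of the scatter, whereas a conditional-mean fit averages over the whole spread; production analyses would normally use a linear-programming solver rather than this exhaustive search.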
TL;DR: In this paper, an EM algorithm for exact maximum likelihood estimation of the population parameters for nonlinear random effects models is introduced. Such models can account for both within- and between-individual sources of variability and serial correlation within individual observations when analyzing unbalanced repeated measures data.
Abstract: The pharmaceutical industry is currently interested in the population approach and population models, also known as mixed effects models and random effects models depending on the precise form. Population models are useful in that they can account for both within- and between-individual sources of variability and serial correlation within individual observations when analyzing unbalanced repeated measures data. The modelling of population pharmacodynamic or pharmacokinetic profiles typically involves nonlinear random effects models. Each individual's observations are modelled by identical (up to unknown parameter values) nonlinear regression models, with the distribution of the observations, or a transformation of the observations, about expected responses taken to be normal, with the degree of variability described by a variance model. Between-individual variability is modelled by a population distribution for the individual regression parameter values (random effects). In a parametric analysis the population distribution is taken to be normal, the parameters of which, along with the parameters of the variance model, are known as the population parameters. Maximum likelihood estimation of the population parameters for nonlinear random effects models was pioneered by Beal and Sheiner (1979), and since then a number of algorithms have appeared for approximate maximum likelihood, including Steimer et al. (1984), Lindstrom and Bates (1990), Beal and Sheiner (1992), and Mentre and Gomeni (1995). All of these algorithms are approximate in some way. For a summary see Beal and Sheiner (1992), Wolfinger (1993), Pinheiro and Bates (1994), and Davidian and Giltinan (1995). In this paper an EM algorithm for exact maximum likelihood estimation is introduced. An EM algorithm obtaining maximum likelihood estimates for linear random effects models was introduced by Dempster, Laird, and Rubin (1977).
Laird and Ware (1982), Lindstrom and Bates (1988), Jennrich and Schluchter (1986), and Liu and Rubin (1994) all describe hybrid EM algorithms for the linear random effects model. A true EM algorithm for the linear model is described by Jamshidian and Jennrich (1993). Mentre and Gomeni (1995) describe an approximate EM algorithm for nonlinear random effects models and, from the algorithm given in this paper, it can be seen clearly how their approximations arise. The present algorithm uses Monte Carlo methods to perform the E step, a strategy previously adopted in an altogether different model by Guo and Thompson (1994). Guo and Thompson require a Gibbs sampler, that is, a Markov chain Monte Carlo method for their E step, but the present algorithm uses independent samples. In Section 2 of this paper the nonlinear random effects model is described. Section 3 gives the EM algorithm without random effect covariates, while Section 4 gives the modified algorithm in the
TL;DR: In this paper, the authors investigate whether seasonal adjustment procedures are, at least approximately, linear data transformations and define a set of properties for the adequacy of a linear approximation to a seasonal-adjustment filter.
Abstract: We investigate whether seasonal-adjustment procedures are, at least approximately, linear data transformations. This question was initially addressed by Young and is important with respect to many issues including estimation of regression models with seasonally adjusted data. We focus on the X-11 program and rely on simulation evidence, involving linear unobserved component autoregressive integrated moving average models. We define a set of properties for the adequacy of a linear approximation to a seasonal-adjustment filter. These properties are examined through statistical tests. Next, we study the effect of X-11 seasonal adjustment on regression statistics assessing the statistical significance of the relationship between economic variables. Several empirical results involving economic data are also reported.
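One property a linear filter must satisfy is additivity: filtering a sum of two series gives the sum of the filtered series. The sketch below checks only that property, on toy filters chosen for illustration (a moving average and a moving median); it does not implement X-11, and all function names and data are hypothetical.

```python
import statistics

def moving_average(x, width=3):
    """Centered moving average; a linear filter."""
    half = width // 2
    return [sum(x[i - half:i + half + 1]) / width
            for i in range(half, len(x) - half)]

def moving_median(x, width=3):
    """Centered moving median; a nonlinear filter."""
    half = width // 2
    return [statistics.median(x[i - half:i + half + 1])
            for i in range(half, len(x) - half)]

def additivity_gap(filt, x, y):
    """Largest |F(x + y) - (F(x) + F(y))| over the filtered series.

    The gap is zero, up to rounding, when the filter is linear.
    """
    combined = filt([a + b for a, b in zip(x, y)])
    separate = [a + b for a, b in zip(filt(x), filt(y))]
    return max(abs(c - s) for c, s in zip(combined, separate))

# Hypothetical input series.
x = [1.0, 5.0, 2.0, 8.0, 3.0, 9.0, 4.0]
y = [4.0, 0.0, 6.0, 1.0, 7.0, 2.0, 5.0]

print("moving average gap:", additivity_gap(moving_average, x, y))
print("moving median gap:", additivity_gap(moving_median, x, y))
```

A gap that stays small relative to the scale of the series is one informal indication that a linear approximation to the filter may be adequate; the paper itself relies on formal statistical tests.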
TL;DR: A tutorial is presented on the use of the hierarchical linear model, a linear model with nested random coefficients used in multilevel research, for the analysis of longitudinal data, i.e., repeated data on the same subjects.
Abstract: The hierarchical linear model is a linear model with nested random coefficients, fruitfully used for multilevel research. A tutorial is presented on the use of this model for the analysis of longitudinal data, i.e., repeated data on the same subjects. An important advantage of this approach is that differences across subjects in the numbers and spacings of measurement occasions do not present a problem, and that changing covariates can easily be handled. The tutorial approaches the longitudinal data as measurements on populations of (subject-specific) functions.
TL;DR: In this paper, the authors compare the uncertainty in the solution stemming from the data splitting with neural network specific uncertainties (parameter initialization, choice of number of hidden units, etc.).
Abstract: This article exposes problems of the commonly used technique of splitting the available data into training, validation, and test sets that are held fixed, warns about drawing too strong conclusions from such static splits, and shows potential pitfalls of ignoring variability across splits. Using a bootstrap or resampling method, we compare the uncertainty in the solution stemming from the data splitting with neural network specific uncertainties (parameter initialization, choice of number of hidden units, etc.). We present two results on data from the New York Stock Exchange. First, the variation due to different resamplings is significantly larger than the variation due to different network conditions. This result implies that it is important to not over-interpret a model (or an ensemble of models) estimated on one specific split of the data. Second, on each split, the neural network solution with early stopping is very close to a linear model; no significant nonlinearities are extracted.
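The variability that comes from the choice of split can be illustrated with a small resampling experiment. The sketch below is a simplification under stated assumptions: it uses synthetic data, a closed-form simple linear regression instead of a neural network, and repeated random re-splitting without replacement rather than a bootstrap. All names are hypothetical.

```python
import random
import statistics

def fit_ols(points):
    """Closed-form simple linear regression; returns (intercept, slope)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

def holdout_mse(model, points):
    """Mean squared prediction error of the fitted line on held-out points."""
    intercept, slope = model
    return sum((y - (intercept + slope * x)) ** 2 for x, y in points) / len(points)

def split_variability(data, n_splits=200, train_frac=0.7, seed=0):
    """Refit on many random train/test splits and collect the test errors."""
    rng = random.Random(seed)
    n_train = int(train_frac * len(data))
    errors = []
    for _ in range(n_splits):
        shuffled = data[:]
        rng.shuffle(shuffled)
        train, test = shuffled[:n_train], shuffled[n_train:]
        errors.append(holdout_mse(fit_ols(train), test))
    return errors

# Synthetic data: a noisy straight line (assumption, for illustration only).
noise = random.Random(42)
data = [(i / 10, 1.0 + 2.0 * (i / 10) + noise.gauss(0, 0.5)) for i in range(60)]

errors = split_variability(data)
print("mean test MSE:", statistics.mean(errors))
print("std of test MSE across splits:", statistics.stdev(errors))
```

Reporting the spread across splits, rather than the error on a single fixed split, is the point of the exercise: a single number hides how much the estimate depends on which observations happened to land in the test set.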
TL;DR: This paper briefly reviews recent experience using nonlinear models and ideas of chaos to model economic data and to produce forecasts better than those of linear models, finds the record of improvement meager at best, and examines some of the reasons for this lack of improvement.
Abstract: This paper begins with a brief review of the recent experience using nonlinear models and ideas of chaos to model economic data and to provide forecasts that are better than linear models. The record of improvement is at best meager. The remainder of the paper examines some of the reasons for this lack of improvement. The concepts of "openness" and "isolation" are introduced, and a case is made that open and nonisolated systems cannot be forecasted; the extent to which economic systems are closed and isolated provides the true pragmatic limits to forecastability. The reasons why local "overfitting," especially with nonparametric models, leads to worse forecasts are discussed. Models and "representations" of data are distinguished and the reliance on minimum mean-square forecast error to choose between models and representations is evaluated.
TL;DR: The article explores the possibility of rapidly designing an appropriate neural net (NN) for time series prediction from an initial stochastic analysis of the data, choosing the NN architecture, and possibly initial values for the NN parameters, according to the most adequate linear model.
Abstract: The article explores the possibility of rapidly designing an appropriate neural net (NN) for time series prediction based on information obtained from stochastic modeling. Such an analysis could provide some initial knowledge regarding the choice of an NN architecture and parameters, as well as regarding an appropriate data sampling rate. Stochastic analysis provides a complementary approach to previously proposed dynamical system analysis for NN design. Based on F. Takens's theorem (1981), an estimate of the dimension m of the manifold from which the time series originated can be used to construct an NN model using 2m+1 external inputs. This design is further extended by M.A.S. Potts and D.S. Broomhead (1991), who first embed the state space of a discrete time dynamical system in a manifold of dimension n>>2m+1, which is further projected to its 2m+1 principal components used as external inputs in a radial basis function NN model for time series prediction. Our approach is to perform an initial stochastic analysis of the data and to choose an appropriate NN architecture, and possibly initial values for the NN parameters, according to the most adequate linear model.
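The construction of 2m+1 external inputs from a scalar time series can be sketched as a delay embedding. This is a minimal illustration assuming one-step-ahead prediction and a fixed lag; the function name, the `lag` parameter, and the toy series are hypothetical and not taken from the article.

```python
def delay_embed(series, m, lag=1):
    """Build (inputs, targets) pairs for one-step-ahead prediction.

    Each input vector holds 2m+1 lagged values of the series, most recent
    first; the target is the value that follows the most recent one.
    """
    dim = 2 * m + 1
    span = (dim - 1) * lag  # how far back the oldest input reaches
    inputs, targets = [], []
    for t in range(span, len(series) - 1):
        inputs.append([series[t - k * lag] for k in range(dim)])
        targets.append(series[t + 1])
    return inputs, targets

# Toy series; with m = 1 each input vector has 2*1 + 1 = 3 components.
series = list(range(10))
inputs, targets = delay_embed(series, m=1)
print(inputs[0], "->", targets[0])
```

The resulting pairs can be fed to any regression model, a linear one or a neural net, which makes it straightforward to compare the two on the same inputs.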