TL;DR: In this paper, the authors compare regression coefficients between models in the setting where one of the models is nested in the other, and propose fundamental change in strategies for model comparison in social research as well as modifications in the presentation of results from regression or regression-type models.
Abstract: Statistical methods are developed for comparing regression coefficients between models in the setting where one of the models is nested in the other. Comparisons of this kind are of interest whenever two explanations of a given phenomenon are specified as linear models. In this case, researchers should ask whether the coefficients associated with a given set of predictors change in a significant way when other predictors or covariates are added as controls. Simple calculations based on quantities provided by routines for regression analysis can be used to obtain the standard errors and other statistics that are required. Results are also given for the class of generalized linear models (e.g., logistic regression, log-linear models, etc.). We recommend fundamental change in strategies for model comparison in social research as well as modifications in the presentation of results from regression or regression-type models.
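As a minimal sketch of the quantity being compared (not the authors' standard-error calculations, which the abstract does not spell out), the snippet below fits a reduced model y ~ x and a full model y ~ x + z by ordinary least squares and checks the standard algebraic identity that the change in the coefficient on x equals the coefficient on the added control times the slope of z on x. The toy data and the names `centered_cross`, `b_reduced`, `b_full`, `g_full`, and `delta` are illustrative assumptions.

```python
def centered_cross(a, b):
    """Sum of cross-products of a and b about their means."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b))

# Toy data (hypothetical): x is the predictor of interest, z the added control.
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
z = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0]
y = [3.1, 3.9, 7.2, 7.8, 11.1, 12.0]

sxx, szz, sxz = centered_cross(x, x), centered_cross(z, z), centered_cross(x, z)
sxy, szy = centered_cross(x, y), centered_cross(z, y)

# Reduced model y ~ x (intercept absorbed by centering).
b_reduced = sxy / sxx

# Full model y ~ x + z: solve the 2x2 normal equations in closed form.
det = sxx * szz - sxz ** 2
b_full = (szz * sxy - sxz * szy) / det   # coefficient on x
g_full = (sxx * szy - sxz * sxy) / det   # coefficient on z

# Auxiliary regression z ~ x.
delta = sxz / sxx

# The change in the x coefficient when z is added as a control.
change = b_reduced - b_full
print(change, g_full * delta)
```

The two printed numbers agree up to rounding. Whether that change is statistically significant is the question the paper's standard errors address, and it is not computed here.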
TL;DR: Regression models appropriate for count data have seen little use in psychology. This article describes the problems of analyzing counts with ordinary linear regression and presents 3 alternative models, noting that the simplest, Poisson regression, is likely to be misleading unless restrictive assumptions are met.
Abstract: The regression models appropriate for counted data have seen little use in psychology. This article describes problems that occur when ordinary linear regression is used to analyze count data and presents 3 alternative regression models. The simplest, the Poisson regression model, is likely to be misleading unless restrictive assumptions are met because individual counts are usually more variable ("overdispersed") than is implied by the model. This model can be modified in 2 ways to accommodate this problem. In the overdispersed model, a factor can be estimated that corrects the regression model's inferential statistics. In the second alternative, the negative binomial regression model, a random term reflecting unexplained between-subject differences is included in the regression model. The authors compare the advantages of these approaches.
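The correction factor mentioned for the overdispersed model can be illustrated with a rough sketch. It assumes an intercept-only Poisson model and the Pearson chi-square estimator of dispersion, which is one common choice and not necessarily the article's. The counts and the names `pearson_dispersion`, `se_poisson`, and `se_corrected` are hypothetical.

```python
import math

def pearson_dispersion(counts):
    """Fitted mean and Pearson dispersion factor for an intercept-only Poisson model."""
    n = len(counts)
    mu = sum(counts) / n                      # maximum likelihood estimate of the mean
    pearson = sum((y - mu) ** 2 / mu for y in counts)
    return mu, pearson / (n - 1)              # chi-square divided by residual df

# Hypothetical counts with a few large values, i.e. more variable than Poisson.
counts = [0, 1, 0, 7, 2, 0, 12, 1, 3, 0]
mu, phi = pearson_dispersion(counts)

# Standard error of the log-mean under the plain Poisson model: sqrt(1 / sum of means).
se_poisson = math.sqrt(1.0 / (len(counts) * mu))

# Overdispersed model: inflate the standard error by sqrt(phi).
se_corrected = se_poisson * math.sqrt(phi)

print(mu, phi, se_poisson, se_corrected)
```

For these toy counts the factor comes out at about 6, so the corrected standard error is roughly 2.4 times the Poisson one. A factor near 1 would suggest the Poisson variance assumption is reasonable.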
TL;DR: In this paper, the authors propose a multiple regime alternative in which different economies obey different linear models when grouped according to initial conditions, and the marginal product of capital is shown to vary with the level of economic development.
Abstract: This paper provides some new evidence on the behaviour of cross-country growth rates. We reject the linear model commonly used to study cross-country growth behaviour in favour of a multiple regime alternative in which different economies obey different linear models when grouped according to initial conditions. Further, the marginal product of capital is shown to vary with the level of economic development. These results are consistent with growth models which exhibit multiple steady states. Our results call into question inferences that have been made in favour of the convergence hypothesis and further suggest that the explanatory power of the Solow growth model may be enhanced with a theory of aggregate production function differences.
TL;DR: In this paper, a method is described with which immittance data can be tested for Kronig-Kramers compliance, which is linear in nature and is based on a predetermined set of relaxation times.
Abstract: A method is described with which immittance data can be tested for Kronig‐Kramers compliance. In contrast with other procedures, this method is linear in nature and is based on a predetermined set of relaxation times. The model contains as many parameters as there are data sets, or fewer. Three modes of operation are described. The first two are based on a linear fit of the model function to the imaginary part or to the real part of the data set. With the fit parameters, the corresponding real or imaginary dispersion can be calculated and compared with the actual measurement. In the third mode, a complex model function is fitted to the complete data set. As the model function does comply with (a relaxed set of) the Kronig‐Kramers (K‐K) rules, it will not be able to reproduce the data set satisfactorily in the case of non‐K‐K behavior, as can be observed from the residuals plot. Due to its linear nature, no starting values are needed for the data validation. The main limitation of this procedure is the size of the matrix and the accuracy of the matrix inversion.
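The first mode of operation (fit the imaginary part, then predict the real part) might look roughly like the sketch below. It assumes the model is a series of parallel resistor-capacitor elements with fixed time constants, Z(w) = R_inf + sum_k R_k / (1 + j w tau_k), which the abstract does not state explicitly. The synthetic data, the three relaxation times, the mean-offset estimate of R_inf, and all function names are assumptions made for illustration.

```python
def solve(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def kk_fit_imag(omega, z_imag, taus):
    """Linear least squares for R_k in Im Z = -sum_k R_k * w*tau_k / (1 + (w*tau_k)^2)."""
    basis = [[-w * t / (1.0 + (w * t) ** 2) for t in taus] for w in omega]
    k = len(taus)
    ata = [[sum(row[i] * row[j] for row in basis) for j in range(k)] for i in range(k)]
    atb = [sum(row[i] * zi for row, zi in zip(basis, z_imag)) for i in range(k)]
    return solve(ata, atb)                            # normal equations

def kk_real_from_fit(omega, r, taus):
    """Real dispersion implied by the fitted R_k (without the constant R_inf)."""
    return [sum(rk / (1.0 + (w * t) ** 2) for rk, t in zip(r, taus)) for w in omega]

# Predetermined relaxation times (seconds) and angular frequencies (rad/s).
taus = [1e-3, 1e-2, 1e-1]
omega = [10 ** (0.25 * i) for i in range(17)]         # 1 to 1e4 rad/s

# Synthetic K-K-compliant data: one RC element plus a series resistance.
r_inf, r_true, tau_true = 5.0, 100.0, 1e-2
z = [r_inf + r_true / complex(1.0, w * tau_true) for w in omega]

r_fit = kk_fit_imag(omega, [v.imag for v in z], taus)
re_kk = kk_real_from_fit(omega, r_fit, taus)

# The imaginary part carries no information on R_inf; estimate it as a mean offset.
offset = sum(v.real - p for v, p in zip(z, re_kk)) / len(z)
residuals = [v.real - (p + offset) for v, p in zip(z, re_kk)]
print(max(abs(e) for e in residuals))
```

Because the fit is linear in the R_k, no starting values are required, and the only numerical step is the solution of a small linear system. For data that violate the K-K rules, the residuals would be expected to show systematic structure rather than staying near zero.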
TL;DR: In this paper, a univariate nonlinear model for US GNP is presented, which is of the threshold autoregressive type and contains evidence of asymmetric effects of shocks over the business cycle.
Abstract: A univariate nonlinear model is estimated for US GNP that on many criteria outperforms standard linear models. The estimated model is of the threshold autoregressive type and contains evidence of asymmetric effects of shocks over the business cycle. In particular the nonlinear model suggests that the post-1945 US economy is significantly more stable than the pre-1945 US economy.
TL;DR: The basic MARS algorithm is summarized, as well as extensions for binary response, categorical predictors, nested variables and missing values, and an example of MARS applied to a set of clinical data is provided.
Abstract: Multivariate Adaptive Regression Splines (MARS) is a method for flexible modelling of high dimensional data. The model takes the form of an expansion in product spline basis functions, where the number of basis functions as well as the parameters associated with each one (product degree and knot locations) are automatically determined by the data. This procedure is motivated by recursive partitioning (e.g., CART) and shares its ability to capture high order interactions. However, it has more power and flexibility to model relationships that are nearly additive or involve interactions in at most a few variables, and produces continuous models with continuous derivatives. In addition, the model can be represented in a form that separately identifies the additive contributions and those associated with different multivariable interactions. This paper summarizes the basic MARS algorithm, as well as extensions for binary response, categorical predictors, nested variables and missing values. It presents tips on interpreting the output of the standard FORTRAN implementation of MARS, and provides an example of MARS applied to a set of clinical data.
TL;DR: A Bayesian model in which both area-specific intercept and trend are modelled as random effects and correlation between them is allowed for is proposed, an extension of that originally proposed for disease mapping.
Abstract: The analysis of variation of risk for a given disease in space and time is a key issue in descriptive epidemiology. When the data are scarce, maximum likelihood estimates of the area-specific risk and of its linear time-trend can be seriously affected by random variation. In this paper, we propose a Bayesian model in which both area-specific intercept and trend are modelled as random effects and correlation between them is allowed for. This model is an extension of that originally proposed for disease mapping. It is illustrated by the analysis of the cumulative prevalence of insulin dependent diabetes mellitus as observed at the military examination of 18-year-old conscripts born in Sardinia during the period 1936-1971. Data concerning the genetic differentiation of the Sardinian population are used to interpret the results.
TL;DR: In this article, the out-of-sample forecasting ability of feed-forward and recurrent neural networks based on empirical foreign exchange rate data is investigated, in which networks are selected based on the predictive stochastic complexity (PSC) criterion, and the selected networks are estimated using both recursive Newton algorithms and the method of nonlinear least squares.
Abstract: In this paper we investigate the out-of-sample forecasting ability of feedforward and recurrent neural networks based on empirical foreign exchange rate data. A two-step procedure is proposed to construct suitable networks, in which networks are selected based on the predictive stochastic complexity (PSC) criterion, and the selected networks are estimated using both recursive Newton algorithms and the method of nonlinear least squares. Our results show that PSC is a sensible criterion for selecting networks and that, for certain exchange rate series, some selected network models have significant market timing ability and/or significantly lower out-of-sample mean squared prediction error relative to the random walk model. Neural networks provide a general class of nonlinear models which has been successfully applied in many different fields. Numerous empirical and computational applications can be found in the Proceedings of the International Joint Conference on Neural Networks and Conference of Neural Information Processing Systems. In spite of their success in various fields, there are only a few applications of neural networks in economics. Neural networks are novel in econometric applications in the following two respects. First, the class of multilayer neural networks can well approximate a large class of functions (Hornik et al., 1989; and Cybenko, 1989), whereas most of the commonly used nonlinear time-series models do not have this property. Second, as shown in Barron (1991), neural networks are more parsimonious models than linear subspace methods such as polynomial, spline, and trigonometric series expansions in approximating unknown functions. Thus, if the behaviour of economic variables exhibits nonlinearity, a suitably constructed neural network can serve as a useful tool to capture such regularity. In this paper we investigate possible nonlinear patterns in foreign exchange data using feedforward and recurrent networks. It has been widely accepted that foreign exchange rates are I(1) (integrated of order one) processes and that changes of exchange rates are uncorrelated over time. Hence, changes in exchange rates are not linearly predictable in general. For a comprehensive review of these issues, see Baillie and McMahon (1989). Since the empirical studies supporting these conclusions rely mainly on linear time series techniques, it is not unreasonable to conjecture that the linear unpredictability of exchange rates may be due to limitations of linear models. Hsieh (1989) finds that changes of exchange rates may be nonlinearly dependent, even though they are linearly uncorrelated. Some researchers also
TL;DR: In this article, a predictive Bayesian viewpoint is advocated to avoid the specification of prior probabilities for the candidate models and the detailed interpretation of the parameters in each model, and using criteria derived from a certain predictive density and a prior specification that emphasizes the observables, they implement the proposed methodology for three common problems arising in normal linear models: variable subset selection, selection of a transformation of predictor variables and estimation of a parametric variance function.
Abstract: We consider the problem of selecting one model from a large class of plausible models. A predictive Bayesian viewpoint is advocated to avoid the specification of prior probabilities for the candidate models and the detailed interpretation of the parameters in each model. Using criteria derived from a certain predictive density and a prior specification that emphasizes the observables, we implement the proposed methodology for three common problems arising in normal linear models: variable subset selection, selection of a transformation of predictor variables and estimation of a parametric variance function. Interpretation of the relative magnitudes of the criterion values for various models is facilitated by a calibration of the criteria. Relationships between the proposed criteria and other well-known criteria are examined.
TL;DR: In this paper, the authors describe a model for estimation of effect size when there is selection based on one-tailed p-values, when the process of publication favors studies with small p-values and hence large effect estimates.
Abstract: When the process of publication favors studies with small p-values, and hence large effect estimates, combined estimates from many studies may be biased. This paper describes a model for estimation of effect size when there is selection based on one-tailed p-values. The model employs the method of maximum likelihood in the context of a mixed (fixed and random) effects general linear model for effect sizes. It offers a test for the presence of publication bias, and corrected estimates of the parameters of the linear model for effect magnitude. The model is illustrated using a well-known data set on the benefits of psychotherapy.
TL;DR: A linear mixed-effects model that accounts for the covariances among repeated measurements and for random plot effects is developed with a continuous-time autocorrelation error structure and shows marked improvement compared with models that do not account for the error structure.
Abstract: A linear mixed-effects model that accounts for the covariances among repeated measurements and for random plot effects is developed. A continuous-time autocorrelation error structure permits the mo...
TL;DR: A model-selection approach to the question of whether forward-interest rates are useful in predicting future spot rates indicates that the premium of the forward rate over the spot rate helps to predict the sign of future changes in the interest rate.
Abstract: We take a model-selection approach to the question of whether forward-interest rates are useful in predicting future spot rates, using a variety of out-of-sample forecast-based model-selection criteria—forecast mean squared error, forecast direction accuracy, and forecast-based trading-system profitability. We also examine the usefulness of a class of novel prediction models called artificial neural networks and investigate the issue of appropriate window sizes for rolling-window-based prediction methods. Results indicate that the premium of the forward rate over the spot rate helps to predict the sign of future changes in the interest rate. Furthermore, model selection based on an in-sample Schwarz information criterion (SIC) does not appear to be a reliable guide to out-of-sample performance in the case of short-term interest rates. Thus, the in-sample SIC apparently fails to offer a convenient shortcut to true out-of-sample performance measures.
TL;DR: This paper discusses a method of building nonlinear models of possibly chaotic systems from data, while maintaining good robustness against noise, and shows how the models that are built are close to the simplest possible according to a description length criterion.
TL;DR: The asymptotical properties of the estimators lead us to propose a systematic methodology to determine which weights are nonsignificant and to eliminate them to simplify the architecture.
Abstract: Many authors use feedforward neural networks for modeling and forecasting time series. Most of these applications are mainly experimental, and it is often difficult to extract a general methodology from the published studies. In particular, the choice of architecture is a tricky problem. We try to combine the statistical techniques of linear and nonlinear time series with the connectionist approach. The asymptotical properties of the estimators lead us to propose a systematic methodology to determine which weights are nonsignificant and to eliminate them to simplify the architecture. This method (SSM or statistical stepwise method) is compared to other pruning techniques and is applied to some artificial series, to the famous Sunspots benchmark, and to daily electrical consumption data.
TL;DR: This text covers linear models and related regression methods, including the linear model, generalized least squares, analysis of variance, log-linear models, and logistic regression.
Abstract: Preface. 1 Linear Algebra, Projections. 1.1 Introduction. 1.2 Vectors, Inner Products, Lengths. 1.3 Subspaces, Projections. 1.4 Examples. 1.5 Some History. 1.6 Projection Operators. 1.7 Eigenvalues and Eigenvectors. 2 Random Vectors. 2.1 Covariance Matrices. 2.2 Expected Values of Quadratic Forms. 2.3 Projections of Random Variables. 2.4 The Multivariate Normal Distribution. 2.5 The $\chi^2$, F, and t Distributions. 3 The Linear Model. 3.1 The Linear Hypothesis. 3.2 Confidence Intervals and Tests on $\eta = c_1\beta_1 + \cdots + c_k\beta_k$. 3.3 The Gauss-Markov Theorem. 3.4 The Gauss-Markov Theorem For The General Case. 3.5 Interpretation of Regression Coefficients. 3.6 The Multiple Correlation Coefficient. 3.7 The Partial Correlation Coefficient. 3.8 Testing $H_0: \theta \in V_0 \subset V$. 3.9 Further Decomposition of Subspaces. 3.10 Power of the F-Test. 3.11 Confidence and Prediction Intervals. 3.12 An Example from SAS. 3.13 Another Example: Salary Data. 4 Fitting of Regression Models. 4.1 Linearizing Transformations. 4.2 Specification Error. 4.3 Generalized Least Squares. 4.4 Effects of Additional or Fewer Observations. 4.5 Finding the "Best" Set of Regressors. 4.6 Examination of Residuals. 4.7 Collinearity. 4.8 Asymptotic Normality. 4.9 Spline Functions. 4.10 Nonlinear Least Squares. 4.11 Robust Regression. 4.12 Bootstrapping in Regression. 4.13 Quantile Regression. 5 Simultaneous Confidence Intervals. 5.1 Bonferroni Confidence Intervals. 5.2 Scheffe Simultaneous Confidence Intervals. 5.3 Tukey Simultaneous Confidence Intervals. 5.4 Comparison of Lengths. 5.5 Bechhofer's Method. 6 Two-and Three-Way Analyses of Variance. 6.1 Two-Way Analysis of Variance. 6.2 Unequal Numbers of Observations Per Cell. 6.3 Two-Way Analysis of Variance, One Observation Per Cell. 6.4 Design of Experiments. 6.5 Three-Way Analysis of Variance. 6.6 The Analysis of Covariance. 7 Miscellaneous Other Models. 7.1 The Random Effects Model. 7.2 Nesting. 7.3 Split Plot Designs. 7.4 Mixed Models. 7.5 Balanced Incomplete Block Designs. 8 Analysis of Frequency Data. 8.1 Examples. 8.2 Distribution Theory. 8.3 Conf. Ints. on Poisson and Binomial Parameters. 8.4 Log-Linear Models. 8.5 Estimation for the Log-Linear Model. 8.6 Chi-Square Goodness-of-Fit Statistics. 8.7 Limiting Distributions of the Estimators. 8.8 Logistic Regression. The Statistical Language R. Answers. Index.
TL;DR: Results from a multiple-linear regression model suggest that as patch sizes, variance/mean ratio, and initial proportions of cover types increase, the proportion error moves in a positive direction and is governed by the interaction of the spatial characteristics and the scale of aggregation.
Abstract: Statistical analyses provide a means for assessing relationships between landscape spatial pattern and errors in the estimates of cover-type proportions as land-cover data are aggregated to coarser scales. Results from a multiple-linear regression model suggest that as patch sizes, variance/mean ratio, and initial proportions of cover types increase, the proportion error moves in a positive direction and is governed by the interaction of the spatial characteristics and the scale of aggregation. However, the standard linear model does not account for the different directions of scale-dependent proportion error since some classes become larger and others become smaller as the scene is aggregated. Addition of indicator variables representing class-type significantly improves the performance by allowing the model to respond differently to different classes. A regression tree model provides a much simpler fit to the complex scaling behavior through an interaction between patch size and aggregation scale. An understanding of the relationships between landscape pattern, scale, and proportion error may advance methods for correcting land-cover area estimates. Such methods could also facilitate high-resolution calibration and validation of coarse-scale remote-sensing-based land-cover mapping algorithms. Ongoing initiatives to produce global land-cover datasets from remote sensing, such as efforts within the IGBP and the EOS MODIS Land-Team, include significant emphasis on high level calibration and validation activities of this nature.
TL;DR: This work considers problems involving the comparison of two or more treatments where there is the opportunity to adjust for relevant covariates, either conditionally in a regression model or implicitly in repeated measures data (for example, in crossover trials), and shows that for data arising from non-Normal distributions, models adjusting for covariates and those not adjusting for covariates may be inconsistent.
Abstract: We consider problems involving the comparison of two or more treatments where we have the opportunity to adjust for relevant covariates either conditionally in a regression model or implicitly in repeated measures data, for example, in crossover trials. It is seen that for data arising from non-Normal distributions there is the possibility that models adjusting for covariates and those not adjusting for covariates will be inconsistent, that is, at most one of the models can be valid. Alternatively, even if conditional and unconditional models are valid, parameters in each model may have different interpretations. We note that this presents difficulties for the specification and interpretation of the analysis. It is also clear that model validation is critical. Specific attention is paid to survival data analysed by the Cox proportional hazards model.
TL;DR: In this article, the authors give an overview of some of the key issues in empirical nonlinear modeling for chemical process applications, focusing on specific sub-classes of nonlinear models that have analytically useful structural characteristics.
TL;DR: In this article, the authors describe methods for solving economic models when expectations are assumed to have at least some element of consistency with the predictions of the model itself, and present analytical results that establish the convergence properties of alternative solution procedures for linear models with unique solutions.
Abstract: In this report, we describe methods for solving economic models when expectations are presumed to have at least some element of consistency with the predictions of the model itself. We present analytical results that establish the convergence properties of alternative solution procedures for linear models with unique solutions. Only one method is guaranteed to converge, […]
TL;DR: Modelling frequency and count data is a book on categorical data analysis that covers standard models and newly developed ones. It focuses on the distinction between frequencies and counts and demonstrates that much of modern statistics can be seen as special cases of categorical data models.
Abstract: Categorical data analysis is a special area of generalised linear models, which has become the most important area of statistical applications in many disciplines, from medicine to social sciences. This text presents the standard models and many newly developed ones in a language which can be immediately applied in many modern statistical packages such as GLIM, GENSTAT, S-Plus, as well as SAS and LISP-STAT. The book is structured around the distinction between independent events occurring to different individuals, resulting in frequencies, and repeated events occurring to the same individuals, yielding counts. The book demonstrates that much of modern statistics can be seen as special cases of categorical data models; both generalized linear models and proportional hazards models can be fitted as log linear models. More specialized topics such as Markov chains, overdispersion and random effects, are also covered.
TL;DR: A parameter identification method is presented that identifies all the parameters of an induction motor simultaneously and is shown to be more robust than the first with respect to noise sensitivity, which is important since the system must function in an industrial environment.
TL;DR: In this paper, polynomial-type nonlinear autoregressive models with exogenous inputs (NARX) are used for the identification and control of highly nonlinear processes.
TL;DR: In this paper, a method of estimating linear model dimension and variable selection is proposed based on a new class of penalty functions and a procedure of sorting covariates based on t-statistics.
Abstract: A method of estimating linear model dimension and variable selection is proposed. This new criterion, which generalizes the Cp criterion, the Akaike information criterion (AIC), the Bayes information criterion, and the φ criterion and is consistent under certain conditions, is based on a new class of penalty functions and a procedure of sorting covariates based on t-statistics. In the course of introducing this method, we discuss the important role of the penalty function in the consistency of model dimension estimation and in variable selection. The proposed method requires less computation than resampling-based methods that search over all subsets of covariates for the true model. Simulation results show that the new method is superior to the Cp criterion and AIC in finite-sample situations as well.
TL;DR: In this article, the authors apply the methods of optimum experimental design to models in which the variance, as well as the mean, is a parametric function of explanatory variables, leading to designs when the parameters of both the mean and the variance functions, or the parameter of only one function, are of interest.
Abstract: The methods of optimum experimental design are applied to models in which the variance, as well as the mean, is a parametric function of explanatory variables. Extensions to standard optimality theory lead to designs when the parameters of both the mean and the variance functions, or the parameters of only one function, are of interest. The theory also applies whether the mean and variance are functions of the same variables or of different variables, although the mathematical foundations differ. The example studied is a second-order two-factor response surface for the mean with a parametric nonlinear variance function. The theory is used both for constructing designs and for checking optimality. A major potential for application is to experimental design in off-line quality control.
TL;DR: The apparent superiority of the nonlinear bilinear model suggests that future heart rate dynamics studies should put greater emphasis on nonlinear analyses, as it had a smaller residual variance than either the ARMA or PAR models.
Abstract: The linear autoregressive (AR) model is often used to investigate the pathophysiologic mechanisms controlling heart rate (HR) dynamics. This study implemented parametric models new to this field to determine if a more appropriate HR dynamics modeling structure exists. The linear AR and autoregressive-moving average (ARMA) models, and the nonlinear polynomial autoregressive (PAR) and bilinear (BL) models were fit to instantaneous HR time series obtained from nine subjects in the supine position. Model orders were determined by the Akaike Information Criteria (AIC). Model residual variance was used as the primary intermodel comparison criterion, with significance evaluated by a $\lambda^2$ distributed statistic. The BL model best represented the HR dynamics, as its residual variance was significantly (p >
TL;DR: Identification of linear models is an established area within the system identification field, but in practice the performance of such models is sometimes not satisfactory.
Abstract: An established area within the system identification field is identification of linear models. In practice, sometimes the performance of such models is not satisfactory and non-linear models are ne ...
TL;DR: In this article, a nonlinear model of the decay curves, established according to the nature of Schroeder's decay curves is used for the regression process rather than the linear model used in linear regression.
Abstract: Reverberation decay curves can be obtained by the backward integration of room impulse responses proposed by M. R. Schroeder. The evaluation of reverberation times is often achieved by a linear regression line fitting the reverberation decay curves. However, under noisy conditions, the successful application of this method requires either a careful choice of the integration limit or a precise estimate of the mean‐square value of the background noise. In the present paper, an alternative method using a nonlinear iterative regression approach for evaluating reverberation times from Schroeder’s decay curves is proposed. A nonlinear model of the decay curves, established according to the nature of Schroeder’s decay curves, is used for the regression process rather than the linear model used in linear regression. The regression process is based on the generalized least‐squares error principle in which a rapid convergence can be observed. Preliminary experiments show a slight dependence of the reverberation tim...
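For orientation, the conventional baseline that the abstract contrasts with (Schroeder backward integration followed by a straight-line fit) can be sketched as follows. This is not the nonlinear iterative regression the paper proposes. The noise-free synthetic impulse response, the -5 dB to -25 dB fitting range, and the function names `schroeder_decay_db` and `rt60_linear_fit` are assumptions chosen for illustration.

```python
import math

def schroeder_decay_db(h):
    """Backward-integrate the squared impulse response; return levels in dB re total energy."""
    energy = []
    running = 0.0
    for sample in reversed(h):
        running += sample * sample
        energy.append(running)
    energy.reverse()
    total = energy[0]
    return [10.0 * math.log10(e / total) for e in energy]

def rt60_linear_fit(decay_db, fs, hi_db=-5.0, lo_db=-25.0):
    """Least-squares line through the decay curve between hi_db and lo_db, extrapolated to 60 dB."""
    pts = [(n / fs, level) for n, level in enumerate(decay_db) if lo_db <= level <= hi_db]
    mt = sum(t for t, _ in pts) / len(pts)
    ml = sum(l for _, l in pts) / len(pts)
    slope = (sum((t - mt) * (l - ml) for t, l in pts)
             / sum((t - mt) ** 2 for t, _ in pts))      # dB per second, negative
    return -60.0 / slope

# Synthetic, noise-free exponential decay with a known reverberation time.
fs = 1000                                   # sampling rate in Hz (assumed)
true_rt60 = 0.5                             # seconds
k = 3.0 * math.log(10.0) / true_rt60        # amplitude decay rate giving 60 dB in true_rt60
h = [math.exp(-k * n / fs) for n in range(int(1.5 * fs))]

rt60 = rt60_linear_fit(schroeder_decay_db(h), fs)
print(rt60)
```

With background noise in the impulse response, the tail of the backward-integrated curve bends upward and the straight-line estimate becomes sensitive to the integration limit, which is the difficulty the abstract raises.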
TL;DR: It turns out that there are a number of prior choices in the problem formulation, which are crucial for the estimators' behavior, and the role of the prior choices is clarified.
TL;DR: It is demonstrated how, by an appropriate choice of synthesis filters, one can cancel all signal-dependent errors at the output of the system, where the input is a multidimensional signal, and arbitrary sampling lattices are used, as well as to the QMF (alias cancellation) case.
Abstract: A new method for dealing with the effects of quantization in a subband system is proposed. It uses the "gain plus additive noise" linear model for the Lloyd-Max quantizer. Based on this, it is demonstrated how, by an appropriate choice of synthesis filters, one can cancel all signal-dependent errors at the output of the system. The only remaining error is random in nature and not correlated with the input signal. We therefore have a tradeoff between the error being only random or having signal-dependent components as well (since the error variances in both cases are comparable). As a result of having only a random error, it is possible to reduce this error using, for example, a noise removal technique. The result is then extended to the case where the input is a multidimensional signal, and arbitrary sampling lattices are used, as well as to the QMF (alias cancellation) case. To demonstrate the validity of the proposed approach, two types of experiments on images are carried out: In a toy example, it is shown that using noise removal could be beneficial. For a more realistic coding scheme, however, it is demonstrated that even in the case when the model is no longer valid (when some of the subbands are discarded), the output error is still much less correlated with the input signal as opposed to the commonly used subband system, while visually, the reconstructed images look very similar.