TL;DR: In this paper, it was shown that under very general circumstances coefficients in multiple regression models can be replaced with equal weights with almost no loss in accuracy on the original data sample, and that these equal weights will have greater robustness than least squares regression coefficients.
Abstract: It is proved that under very general circumstances coefficients in multiple regression models can be replaced with equal weights with almost no loss in accuracy on the original data sample. It is then shown that these equal weights will have greater robustness than least squares regression coefficients. The implications for problems of prediction are discussed. In the two decades since Meehl's (1954) book on the respective accuracy of clinical versus clerical prediction, little practical consequence has been observed. Diagnoses are still made by clinicians, not by clerks; college admissions are still done by committee, not by computer. This is true despite the considerable strength of Meehl's argument that humans are very poor at combining information optimally and that regression models evidently combine information rather well. These points were underlined in some recent work by Dawes and Corrigan (1974), in which they found again that human predictors do poorly when compared with regression models. Strikingly, they found that for some reason, linear models with random regression weights also do better than do humans. Even more striking, when all regression weights were set equal to one another they found still higher correlation with criterion on a validating sample. The obvious question here is Why? Is it because humans are so terrible at combining information that almost any rule works better, or is it some artifact of linear regression?
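The comparison described in this abstract can be tried on simulated data. The sketch below is only an illustration under stated assumptions, not the procedure of the paper: the data generator, the "true" weights, the noise level, the sample sizes, and all function names are hypothetical. It fits least squares weights on a small training sample and compares their correlation with the criterion on a large validation sample against that of unit weights.

```python
import random

def pearson(u, v):
    """Pearson correlation between two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / (suu * svv) ** 0.5

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def ols_weights(xs, ys):
    """Least squares slopes from centered data (the intercept is dropped
    because it does not change the correlation with the criterion)."""
    n, k = len(xs), len(xs[0])
    means = [sum(row[j] for row in xs) / n for j in range(k)]
    my = sum(ys) / n
    xc = [[row[j] - means[j] for j in range(k)] for row in xs]
    yc = [y - my for y in ys]
    xtx = [[sum(r[i] * r[j] for r in xc) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(xc, yc)) for i in range(k)]
    return solve(xtx, xty)

def predict(xs, w):
    """Weighted linear composite of the predictors."""
    return [sum(wi * xi for wi, xi in zip(w, row)) for row in xs]

def simulate(n, true_w, rng, noise=2.0):
    """Hypothetical data: independent standard normal predictors, noisy linear criterion."""
    xs = [[rng.gauss(0, 1) for _ in true_w] for _ in range(n)]
    ys = [sum(w * x for w, x in zip(true_w, row)) + rng.gauss(0, noise) for row in xs]
    return xs, ys

rng = random.Random(0)
true_w = [0.5, 0.4, 0.3, 0.2]            # assumed weights, all of the same sign
train_x, train_y = simulate(30, true_w, rng)
valid_x, valid_y = simulate(2000, true_w, rng)

w_ols = ols_weights(train_x, train_y)    # estimated on the small sample
w_equal = [1.0] * len(true_w)            # unit weights, no estimation at all

r_ols = pearson(predict(valid_x, w_ols), valid_y)
r_equal = pearson(predict(valid_x, w_equal), valid_y)
print("validation r, least squares weights:", round(r_ols, 3))
print("validation r, equal weights:", round(r_equal, 3))
```

Which set of weights does better on the validation sample depends on the seed, the training sample size, and the noise level, so a single run should not be read as evidence for or against the abstract's claim. Equal weights also presuppose predictors that are on a common scale and oriented in the same direction relative to the criterion.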
TL;DR: In this article, the strong law of large numbers and the central limit theorem for estimators of the parameters in quite general finite-parameter linear models for vector time series are presented.
Abstract: This paper presents proofs of the strong law of large numbers and the central limit theorem for estimators of the parameters in quite general finite-parameter linear models for vector time series. The estimators are derived from a Gaussian likelihood (although Gaussianity is not assumed) and certain spectral approximations to this. An important example of finite-parameter models for multiple time series is the class of autoregressive moving-average (ARMA) models and a general treatment is given for this case. This includes a discussion of the problems associated with identification in such models. Keywords: linear processes; vector ARMA models; identification; limit theorems.
TL;DR: GENCAT is a computer program which implements an extremely general methodology for the analysis of multivariate categorical data which produces minimum modified chi-square statistics, obtained by partitioning the sums of squares as in ANOVA.
TL;DR: In this article, parametric linear programing (PLP) is used to identify the impulse response function of a linear hydrologic system from a relatively short input-output record.
Abstract: Experience indicates that in the identification of the impulse response function of a linear hydrologic system the results are extremely sensitive to minor errors in the input-output data. In particular, low-amplitude random errors in these data tend to cause severe oscillations in the response function, thereby making it often impossible to obtain a physically realizable solution by conventional methods. Artificial filtering of the input-output records may help, but since the extent of noise is seldom known a priori, one cannot be sure about the proper choice of a cutoff frequency. Such filtering also causes a loss of data at the end points of the record and is therefore undesirable when the number of data points is small. Filtering the response function itself is only effective in eliminating high-frequency oscillations, and it is far less effective when the frequency of the oscillations is relatively low. Clearly, the ultimate goal of identification is to determine a solution which optimizes the predictive capabilities of the linear model. To achieve this goal, it is not sufficient that an observed output be correctly reproduced from a given input; an equally important criterion of optimality is that the shape of the response function be physically plausible. It is shown that one way to obtain a stable and physically realizable response function from a relatively short input-output record is to use parametric linear programing. According to this approach, the problem is formulated as a multicriterion decision process under uncertainty in a manner analogous to that previously described by one of the authors in connection with the inverse problem of groundwater hydrology. Parametric programing serves as a means of generating a continuous set of alternative solutions to the identification problem together with a bicriterion function representing these alternatives. 
The shape of this bicriterion curve is then used as a guide by the hydrologist in selecting a particular solution when he is relying on his own value judgment. If none of the alternative solutions appears to be physically plausible at this stage, the hydrologist has a further option of imposing modality constraints to eliminate undesirable low-frequency oscillations from the response function. The method is illustrated by two examples, and the results are compared with those obtained by another approach developed previously by one of the authors.
TL;DR: In this paper, the statistical properties of the certainty equivalence control rule and of the least squares estimates generated by this rule are examined experimentally in a linear model with two unknown parameters.
Abstract: The statistical properties of the certainty equivalence control rule and of the least squares estimates generated by this rule are examined experimentally in a linear model with two unknown parameters. It is found that the least squares certainty equivalence rule converges to its true value with probability one and is asymptotically efficient, having an asymptotic distribution with a variance as small as any other strongly consistent rule. However, while a linear combination of the parameter estimates is consistent, the evidence does not confirm that the individual estimates themselves are consistent. If these converge to their true values at all, they do so very slowly (on the order of (log t)- ').
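A certainty equivalence rule of the kind examined here can be simulated in a few lines. The sketch below is an assumption-laden illustration rather than the experimental design of the paper: it assumes the model y_t = alpha + beta * x_t + noise, a fixed target for y, Gaussian noise, and two hypothetical starting settings of x. At each step the unknown parameters are replaced by their current least squares estimates and x is set as if those estimates were exact.

```python
import math
import random

def least_squares(xs, ys):
    """Least squares estimates (intercept, slope) for y = alpha + beta * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    beta = sxy / sxx
    return my - beta * mx, beta

def run_certainty_equivalence(alpha, beta, target, steps, rng,
                              x_init=(0.0, 1.0), noise=0.5):
    """Simulate the rule x_next = (target - alpha_hat) / beta_hat.

    alpha and beta are the true parameters, known only to the simulator.
    x_init holds two distinct starting settings (an assumption of this sketch)
    so that the first least squares fit is well defined.
    """
    xs = list(x_init)
    ys = [alpha + beta * x + rng.gauss(0, noise) for x in xs]
    for _ in range(steps):
        a_hat, b_hat = least_squares(xs, ys)
        x_next = (target - a_hat) / b_hat   # treat the estimates as if exact
        xs.append(x_next)
        ys.append(alpha + beta * x_next + rng.gauss(0, noise))
    return xs, least_squares(xs, ys)

rng = random.Random(1)
xs, (a_hat, b_hat) = run_certainty_equivalence(1.0, 2.0, 5.0, 200, rng)
print("last control setting:", xs[-1], "ideal setting:", (5.0 - 1.0) / 2.0)
print("final estimates (alpha, beta):", a_hat, b_hat)
```

Comparing the last control setting with the ideal one, and the final estimates with the true parameters, gives a rough feel for the distinction the abstract draws between convergence of the rule and convergence of the individual estimates. A single simulated path proves nothing either way.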
TL;DR: In this article, kinematic flow on a converging surface is considered as a simple nonlinear model of watershed surface runoff. The converging overland flow model has three parameters: two are geometric parameters, and one is a kinematic wave friction relationship parameter.
TL;DR: In this paper, the usual posterior mean or mode of a k-dimensional regression coefficient is shown to be (1) a weighted average of 2^k constrained least-squares points, (2) a weighted average of k + 1 "ideal" points, and (3) a weighted average of k + 1 principal component points.
Abstract: The usual posterior mean or mode of a k-dimensional regression coefficient is shown to be (1) a weighted average of 2^k constrained least-squares points, (2) a weighted average of k + 1 "ideal" points, and (3) a weighted average of k + 1 principal component points. These results are used to interpret pretesting procedures which select a constrained least-squares estimate. For special classes of priors a somewhat distorted Bayesian analysis leads to a posterior measure of location that is an F-weighted average of restricted and unrestricted least squares. A spherical prior can be associated with methods that drop principal components. Multilevel testing is shown to imply a non-traditional prior.
TL;DR: The authors examined the relationship between errors in runoff peak prediction by linear and nonlinear surface runoff models and errors in input intensity and showed that if input intensity errors are sufficiently large, a linear model optimally identified according to a least-squares criterion may perform better than a nonlinear model even though the system is truly nonlinear.
TL;DR: In this article, a mathematically correct constrained estimator has been devised, and its solution is obtained by using quadratic programing, and the higher degree of efficiency of this estimator is shown numerically through Monte Carlo experiments.
Abstract: Instantaneous unit hydrograph (IUH) type linear models have been widely used to simulate many hydrologic systems. Unfortunately, the classical parameter estimators used to calibrate this type of model often fail to give a reliable and meaningful solution. It has been suggested that a set of constraints that can be deduced from the physics of the hydrologic system be imposed on the IUH in order to reduce the high sensitivity of the classical estimators to errors in the available data. Therefore a mathematically correct constrained estimator has been devised, and its solution is obtained by using quadratic programing. The higher degree of efficiency of this estimator (smaller dispersion of the estimates about the true value) is shown numerically through Monte Carlo experiments.
TL;DR: In this article, a class of relative growth rate models is defined which includes the linear, exponential, modified exponential and logistic growth curves as special cases. The trend curve for each model is obtained by integration, and approximate confidence limits can be obtained for the forecasts of future series values.
Abstract: Many annual time series in socioeconomic systems are steadily increasing functions of time. This paper deals with an empirical approach to analyzing and projecting such trending time series from models of relative growth rates or percent changes. A class of relative growth rate models is defined which includes the linear, exponential, modified exponential and logistic growth curves as special cases. Parameters are estimated for the most part by linear regression techniques since the relative growth rates for this class of models are linear in the parameters. The trend curve for each model is obtained by integration, and approximate confidence limits can be obtained for the forecasts of future series values.
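The remark that relative growth rates are linear in the parameters can be illustrated with the logistic curve. The sketch below is a hypothetical example, not the estimation scheme of the paper: it assumes a noise-free annual series from y(t) = K / (1 + c exp(-a t)) and relies on the discrete-time identity (y[t+1] - y[t]) / y[t] = (e^a - 1) - ((e^a - 1) / K) * y[t+1], which follows from the closed form of the curve. Regressing the percent change on the following level then recovers K and a by ordinary linear regression. The function names and parameter values are illustrative assumptions.

```python
import math

def logistic(t, k, a, y0):
    """Logistic trend curve y(t) = k / (1 + c * exp(-a * t)) with y(0) = y0."""
    c = k / y0 - 1.0
    return k / (1.0 + c * math.exp(-a * t))

def fit_line(x, y):
    """Ordinary least squares for y = alpha + beta * x; returns (alpha, beta)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    beta = sxy / sxx
    return my - beta * mx, beta

def fit_logistic_from_growth_rates(series):
    """Regress percent changes on the following level, then map back to (K, a)."""
    growth = [(nxt - prev) / prev for prev, nxt in zip(series, series[1:])]
    level = series[1:]
    alpha, beta = fit_line(level, growth)
    k_hat = -alpha / beta            # level at which the growth rate reaches zero
    a_hat = math.log(1.0 + alpha)    # since alpha = exp(a) - 1
    return k_hat, a_hat

# Hypothetical noise-free series: ceiling 100, rate 0.3, starting level 5.
series = [logistic(t, k=100.0, a=0.3, y0=5.0) for t in range(25)]
k_hat, a_hat = fit_logistic_from_growth_rates(series)
print("estimated ceiling:", k_hat, "estimated rate:", a_hat)
```

With real data the percent changes are noisy, so the regression gives estimates rather than exact values. Forecasts then follow by substituting the fitted parameters into the integrated trend curve.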
TL;DR: In this paper, the invariant imbedding filter is presented as a special case of a novel dynamic programming filter, which is used to identify the state parameters of a steel frame of the moment resisting type.
Abstract: In this paper system identification algorithms are employed to construct appropriate nonlinear models of real structures undergoing seismic disturbances. A three-story half-scale steel frame of the moment resisting type is modeled as a “shear” frame. The data for this frame were obtained from the Earthquake Engineering Research Center of the University of California. The restoring forces in this model are assumed to be of the viscous differential type, containing four unknown parameters. The solution of the problem, that is, the identification of state parameters, is sought by nonlinear filtering methods. The invariant imbedding filter is presented as a special case of a novel dynamic programming filter. In a purely experimental fashion, it is shown that the adopted model is identifiable and stable by utilizing simulated and real data. Abundant numerical experiments are performed; some of the results are presented. By adding corrective nonlinear terms to the basic linear model, the predictive ability of the model is demonstrated.
TL;DR: In this article, it was shown that the possibility of consistent aggregation and the possibility of achieving identification both depend on the same type of condition, namely the existence of a solution to a certain set of linear equations.
Abstract: The purpose of this paper is to develop the theory of aggregation in a relatively simple and general manner. The theory encompasses linear and nonlinear models, with the results for linear models emerging as special cases of a more general treatment. It was recognized at an early stage of the investigation of aggregation problems that the analysis must be based on implicit function theory [11], [13]; it follows that most of the results have local rather than global validity. However, as one might expect, when specialized to the linear case the results are globally valid. It turns out that the theory of identification in econometric models can be based on precisely the same principles as the theory of aggregation. This presents an interesting dilemma. Theoretical economists on the whole seem to be of the opinion that consistent aggregation, in the sense which is described below, is nearly always impossible to achieve. Practical econometricians invariably assume that their models are (at least partially) identifiable. It is shown in this paper that the possibility of consistent aggregation and the possibility of achieving identification both depend on the same type of condition, namely the existence of a solution to a certain set of linear equations. Of course it is the case that in econometric work assumptions are always made which ensure that the relevant equation system has a solution; these assumptions take the form of the imposition of a priori restrictions to augment the number of linearly independent columns of the matrix of the equation system and thus guarantee that a solution exists.
TL;DR: In this article, a multilevel stabilization scheme for high order linear systems is proposed in the framework of the decomposition-aggregation method used so far for stability analysis of large-scale systems by vector Liapunov functions.
TL;DR: In this article, a linear model is fitted to the precipitation series of each group of stations formed by principal component analysis on 157 precipitation records. Precipitation can thus be decomposed into independent components, one each for regional temporal effect, regional spatial effect, and a residual or micro-effect.
Abstract: A linear model is fitted to the precipitation series of each group of stations formed by principal component analysis on 157 precipitation records (Dyer 1975). Precipitation can thus be decomposed into independent components, one each for regional temporal effect, regional spatial effect, and a residual or micro-effect. Each region has a temporal effect which is analysed for trend, enabling the conclusion to be made that Southern Africa's precipitation budget is stationary. On the other hand, trend on a microscale is present over randomly distributed parts of the country. Spectral analysis shows the oscillatory behaviour of the regional temporal effects, and provides information useful to the fitting of stochastic forecasting models to the data. The technique solves the problem of dependence between meteorological time series, and can be applied to any variable.
TL;DR: The General Linear Model (GLM) as discussed by the authors is a family of models possessing a common characteristic, namely, linearity in the parameters of the equation specifying the model, which has been used extensively in the analysis of nonlinear data.
Abstract: Recent works by Cohen (1968), Kelly, Beggs, McNeil, Eichelberger, and Lyon (1969), Kerlinger and Pedhazur (1973), McNeil (1970), Walberg (1971), and Bottenberg and Ward (Note 1) have attested to the flexibility of the General Linear Model. These publications have shown the capabilities of a single approach to the solution of correlation, regression, and Fisherian analysis of variance problems. It is noteworthy that all six of these publications claim, more or less, to be using the General Linear Model, but in no case has the particular linear model and its assumptions been clearly specified and consistently applied. The General Linear Model is a name given to the family of models possessing a common characteristic, namely, linearity in the parameters of the equation specifying the model. The members of this family are distinguishable in terms of their various assumptions, and it is the contention of this author that the distinctions among these different linear models are of more than just passing interest. The above publications, plus those of Digman (1966) and of McNeil and Spaner (1971), have shown the capabilities of the General Linear Model in handling the analysis of nonlinear data. This approach, with a history dating back to Court (1930),
TL;DR: In this article, an approximate solution based on the method of dynamic programming is provided for the optimal control of a system of nonlinear structural equations in econometrics with unknown parameters using a quadratic loss function.
Abstract: An approximate solution, based on the method of dynamic programming, is provided for the optimal control of a system of nonlinear structural equations in econometrics with unknown parameters using a quadratic loss function. It generalizes the methods previously proposed by the author for the control of a nonlinear econometric model with constant parameters and of a linear econometric model with uncertain parameters. It is an improvement over the method of certainty equivalence which replaces the unknown parameters by their mathematical expectations and utilizes the solution for the resulting model. Since the solution is given in the form of feedback control equations, many of the useful concepts and techniques developed in the theory of optimal feedback control for linear systems are now applicable to the control of nonlinear systems using the method proposed, including the calculation of the expected loss of the system under control by analytical rather than Monte Carlo techniques. In this paper, I present an approximate solution to the optimal control of a system of nonlinear structural equations using a quadratic welfare loss function when the parameters of the system are unknown. This is a generalization of the solution given in Chapter 12 of Chow [2] for the control of nonlinear econometric systems with known parameters. It is also a generalization of the solution given in Chow [1] for the control of linear econometric systems with unknown parameters. The method of dynamic programming is applied to solve an optimal control problem involving a nonlinear econometric system with unknown parameters. As it turns out, the solution amounts to linearizing the nonlinear model about some nearly optimal control solution path and then applying a method for controlling the resulting linear model with uncertain parameters.
This paper advances the state of the art in the control of nonlinear econometric systems as it improves upon the certainty-equivalence solution which is obtained by replacing the random parameters in a system by their mathematical expectations. It provides for a set of numerical feedback control equations based on a system of nonlinear structural equations in econometrics. It will show that many useful analytical concepts and tools developed in the theory of control of linear systems are indeed applicable to the control of nonlinear systems. Furthermore, in the derivation of an approximate solution using the method of dynamic programming, it will indicate precisely where the approximation takes place and why an exact solution is difficult to achieve. In Section 2, we set up the control problem and provide an exact solution to the optimal control problem for the last period. In Section 3, we give an approximate solution to the multiperiod control problem using dynamic programming. In Section 4, the mathematical expectations required in the solution of Section 3
TL;DR: In this paper, a single-input linear system model and a multiple-input linear system model were compared to a finite difference model on their ability to predict discharge at the downstream end of a 24.14-km reach of a prismatic channel.
Abstract: A single-input linear system model and a multiple-input linear system model are compared to a finite difference model. Comparisons are based on the ability of the models to predict discharge at the downstream end of a 24.14-km reach of prismatic channel. Four types of channels and two slopes covering a wide range of conditions are evaluated. The single-input model compares favorably in cases where flood wave celerity does not vary greatly with discharge. The multiple-input model can be made to compare favorably in all cases. Linear model parameters cannot be set accurately without calibration, except in the case of rectangular channels. The single-input model is approximately one sixth as costly as the finite difference model, and the multiple-input model is approximately one half as costly for the conditions investigated.
TL;DR: A computer program written in FORTRAN IV is presented that performs regression analyses when the dependent variable is dichotomous and is suitable for fitting any function that is nonlinear in the parameters.
TL;DR: In this paper, the problem of estimating the coefficient vector β of a linear regression model with a quadratic loss function is considered, along with some biased estimators that utilize prior information about β.
Abstract: We consider the problem of estimating the coefficient vector β of a linear regression model with quadratic loss function. Some biased estimators which utilize the prior information about β are considered. Also studied is the problem of estimating the parameters of an over-identified structural equation from undersized samples.
TL;DR: The necessary and sufficient conditions for unbiased estimability of parametric functions CΞD in the general linear multivariate model (Y, AΞP, Σ ⊗ I) are given in this article.
Abstract: The necessary and sufficient conditions for unbiased estimability of parametric functions CΞD in the general linear multivariate model (Y, AΞP, Σ ⊗ I) are given.
TL;DR: An equivalent linear model for minimum mean-squared error (MMSE) linear estimation of marked and filtered doubly stochastic Poisson processes is presented and the structure of the MMSE noncausal steady-state linear receiver for synchronous m-ary optical data signals is determined.
Abstract: An equivalent linear model for minimum mean-squared error (MMSE) linear estimation of marked and filtered doubly stochastic Poisson processes is presented. The equivalence is employed to determine the structure of the MMSE noncausal steady-state linear receiver for synchronous m-ary optical data signals.
TL;DR: In this article, a comparative assessment of two mathematical models of surface runoff is made by applying them to 21 terraced and unterraced natural agricultural basins located in two geographically different regions.
Abstract: A comparative assessment of two mathematical models of surface runoff is made by applying them to 21 terraced and unterraced natural agricultural basins located in two geographically different regions. The models include a nonlinear kinematic wave model based on the converging section geometry (CONV) and Nash's linear model (NASH). The criteria for comparison include prediction of (a) hydrograph peak, (b) hydrograph peak time, and (c) the entire hydrograph, and some elementary statistical tests. It is shown that CONV has a tendency to overpredict both peak discharge and its time of occurrence, while NASH has a tendency to underpredict them. It is also shown that because of its nonlinear character CONV is more sensitive to errors in the input than NASH. Furthermore, one model cannot be uniformly better than the other. The choice of a particular model depends largely on the criterion of comparison.
TL;DR: In this paper, an application and specialization of the Bayesian linear model developed by Lindley and Smith (1972) is presented. The context is m-group regression, and the application is to grade prediction.
Abstract: This study is an application and specialization of the Bayesian linear model developed by Lindley and Smith (1972). The context is m-group regression and the application to the prediction of grade ...
TL;DR: In this paper, four commonly used models for predicting sediment yield are analyzed and compared using previously published data; three of these models involve logarithmic transformations. The extent to which these results can be generalized is discussed in the context of model choice.
Abstract: Four commonly used models for predicting sediment yield are analyzed and compared using previously published data. Three of these models involve logarithmic transformations. Some of the problems involved in transforming data are discussed in the context of logarithmic transformations. These problems are illustrated using the results of standard regression analyses and economic loss function analyses. For the data analyzed, the linear model is preferable to each of the logarithmic models on the basis of each analysis, and the usual multiple objective nature of the model choice problem is thus modified. The extent to which these results can be generalized is discussed in the context of model choice.