TL;DR: In this paper, a new approach to the analysis of bivariate survival data is presented, which involves the development of a model for bivariate life-tables with a single association parameter which is unaffected by monotone transformation of the marginal distributions.
Abstract: A new approach to the analysis of bivariate survival data is presented. It involves the development of "a model for bivariate life-tables...with a single association parameter which is unaffected by monotone transformation of the marginal distributions. Methods for testing and estimating this parameter from right-censored sample pairs using only rank-order information are presented. The model is a generalization of the proportional hazards model and includes a random effect representing heterogeneity of frailty or proneness to failure." A discussion on "the analysis of litter-matched and matched-pair failure-time data is [included]. Some uses of the methods in rank regression problems involving only one right-censored dependent variable are described and a test is proposed for proportional hazards against alternative error structures leading to converging hazards. Finally the methods are compared and validated by Monte-Carlo simulations." Comments by several people and a discussion concerning the paper are included (pp. 108-17). (EXCERPT)
TL;DR: In this article, the authors argue that the formal hypothesis testing advantage of the Box-Cox functional form is purchased at the expense of other important goals, such as the ability to estimate the prices of the characteristics, measure the response to changes in the prices, and predict future expenditures.
TL;DR: In this paper, a model of women's career choice was tested using the structural equation modeling methodology (Bentler, P. M. (1980). Multivariate analysis with latent variables: Causal modeling.
TL;DR: An autoregressive model for analyzing longitudinal data of this type for the case of a continuous outcome variable is presented and illustrated with data from a longitudinal study that seeks to identify the role of personal cigarette smoking on changes in pulmonary function in children.
Abstract: Korn and Whittemore [1,2] have presented methods for analyzing longitudinal data where the number of observations per individual is large relative to the number of variables considered for each subject. However, this is often not the case in epidemiologic studies, since one usually collects data at relatively few time points, and the quantity of data collected for each individual at each time point is typically extensive. We present here an autoregressive model for analyzing longitudinal data of this type for the case of a continuous outcome variable. Some of the important features of this model are that one can (1) in the same analysis, consider both independent variables that are time-dependent and those that are fixed over time, (2) partially use data for an individual where some examinations are missing, (3) assess relationships between changes in outcome and exposure over short periods of time, and (4) use ordinary multiple regression methods. Anderson [3] has considered this type of model, but, to our knowledge, the model has never been applied to biostatistical problems. We illustrate these methods with data from a longitudinal study that seeks to identify the role of personal cigarette smoking on changes in pulmonary function in children.
TL;DR: The results of this study indicate that farm managers' attitudes are important when studying farm performance, since interactions between attitudes and management practices suggest that attitudes act as effect modifiers on the management practices-herd performance relationship.
TL;DR: In this paper, the authors address selected issues of their analysis and provide solutions in a natural manner by combining elements of the relevant individual-based theory of stochastic processes with suitable parts of superpopulationist survey methodology.
Abstract: Despite the recent flowering of the literature concerned with methods to analyze human life history segments, comparatively little attention has been given to the particular problems of survey samples of such data. This chapter addresses selected issues of their analysis and provides solutions in a natural manner by combining elements of the relevant individual-based theory of stochastic processes with suitable parts of superpopulationist survey methodology. There are considerable divergences about some of these issues among current analysts, in particular about whether or when one should weight individual responses by means of reciprocal inclusion probabilities. There seems to be a standing dispute between those who would really like to see conventional weights applied in “most” circumstances and others who feel that the case for weighting is much weaker if the analyst wishes to use the data to estimate a properly specified model, since the model presumably “controls for” the effects of the factors which lead to the need for weights in the first place, except perhaps for particular dependent variables in the model. (Formulation essentially taken from PSID, 1983, p. A-13.) Still others may feel that weights are no advantage in model-based analyses.
TL;DR: A variety of asymptotically valid tests for orthogonality, serial correlation, predictive failure, and of coefficient restrictions are presented, and their rejection probabilities are assessed in linear structural models with lagged dependent and (possibly) jointly dependent variables by Monte Carlo methods.
TL;DR: In this article, the authors investigated the relationship between the patterns of activities pursued in home-based trip chains and the characteristics of the persons making the chains using non-linear canonical correlation analysis.
Abstract: This research concerns the relationships between the patterns of activities pursued in home-based trip chains and the characteristics of the persons making the chains. The data source is a one-week travel diary reported by persons over eleven years of age in the Netherlands in 1984. All home-based trip chains, including both simple two-link chains and more complex ones, were classified on the basis of the sequence of away-from-home activities. Twenty types were distinguished. The presence or absence of these trip-chain types were then explained in terms of the personal and household characteristics of the travellers using non-linear canonical correlation analysis. This analysis technique can accommodate multiple dependent variables and nominally-scaled (categorical) variables in both the independent and dependent variable sets. These results capture the relationships between the sequences of activities in trip chains and the variables age, sex, working status, household income, stage in the family life cycle, household car ownership, and residential location. The most effective variable was found to be life cycle, followed by age and income.
TL;DR: In this paper, the authors present a study of the relationship of blood pressure, the dependent variable, and certain anthropometric factors in adolescents, where individuals in the original highest and lowest blood pressure groups were deliberately oversampled.
Abstract: Regression analysis is being widely used in the analysis of survey data in situations where complex sampling designs are employed. Often the selection procedure is based on values of the independent variables in which case standard regression methods apply. However the literature abounds with examples where the selection procedure is based on values of the dependent variable. For example DeMets & Halperin (1977) consider the relationship between serum cholesterol, the dependent variable, and dietary cholesterol, the independent variable. From an initial large random sample a smaller subsample was selected with the procedure that only persons with very high or very low initial serum cholesterol were chosen for subsequent measurement of the regression variables. In a study of the relationship of blood pressure, the dependent variable, and certain anthropometric factors in adolescents, Kotchen et al. (1980) performed follow-up on a subsample of a previous survey in which individuals in the original highest and lowest blood pressure groups were deliberately oversampled. A regression analysis was subsequently performed using standard least square methods. Finally Hausman & Wise (1982) discuss income maintenance experiments in econometrics where high-income families were undersampled from the population in a situation where income is the dependent variable of a subsequent regression analysis. There are many reasons why complex sampling designs may be used on the basis of the dependent variable, Y. For example, an initial goal of the study may be to estimate the mean of Y with high precision for which stratified sampling on the basis of Y values may be useful. Subsequently, regression analysis with Y as the dependent variable may be desired. 
Secondly, oversampling extreme values of Y is a method of reducing the variances of estimated regression coefficients in cases where it may not be feasible to use traditional efficient designs based on the independent variables. For such sampling situations, it is now well known that, in general, ordinary least squares regression estimates will be biased even asymptotically. See, for example, Holt, Smith & Winter (1980) and Nathan & Holt (1980). Various adjusted estimators have been considered which we discuss briefly below.
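The bias described above can be seen in a small simulation. The following is a minimal sketch, not the estimators discussed in the paper: the population model, the sample size, and the cutoff `|y| > 2` are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical population: y = 1.0 * x + noise, so the true slope is 1.0.
n = 20_000
x = rng.normal(size=n)
y = 1.0 * x + rng.normal(size=n)


def ols_slope(x, y):
    """Slope from an ordinary least-squares fit of y on x (with intercept)."""
    xc = x - x.mean()
    return float(xc @ (y - y.mean()) / (xc @ xc))


# Simple random sample: the slope estimate is close to the true value.
slope_full = ols_slope(x, y)

# Selection on the dependent variable: keep only units with extreme y.
# The cutoff 2.0 is an arbitrary illustrative choice.
keep = np.abs(y) > 2.0
slope_extreme = ols_slope(x[keep], y[keep])

print(f"slope, full sample:      {slope_full:.3f}")
print(f"slope, extreme-y sample: {slope_extreme:.3f}")
```

In this setup the slope from the extreme-y subsample comes out noticeably larger than the true slope, and the gap does not shrink as `n` grows, which is the sense in which unadjusted least squares is biased even asymptotically.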
TL;DR: In this paper, a procedure for determining the equivalent continuum properties of a structure composed of repeated patterns of discrete elements with both displacement and rotational coordinates is presented, and the maximum number of independent variables that may be retained is determined by applying a ranking procedure to the resulting transformation matrix.
Abstract: A procedure for determining the equivalent continuum properties of a structure composed of repeated patterns of discrete elements with both displacement and rotational coordinates is presented. These nodal coordinates are transformed to rigid-body and strain gradient variables using a polynomial approximation. The maximum number of independent variables that may be retained is determined by applying a ranking procedure to the resulting transformation matrix. The possibility of introducing errors by requiring the analyst to supply the strain gradient terms directly is reduced by identifying the appropriate variables through the use of the polynomial expansion and the ranking procedure. Additional constraints may be imposed in this analysis. The equivalent continuum parameters result when a further transformation to the appropriate kinematic variables is applied and the strain energy expression is reduced to these variables. Three-dimensional beam- and plate-like structures are treated. The results correspond to findings using other approaches.
TL;DR: The authors investigated the association between two dependent variables, and 58 other variables, in 33 Sheffield secondary schools and found that poor school attendance is strongly associated with socio-economic disadvantage, but not to the same extent with structural or organisational aspects of the school.
Abstract: The study investigated the association between two dependent variables, and 58 other variables, in 33 Sheffield secondary schools. The dependent variables were persistent absenteeism and exclusion for disciplinary reasons. Of the other variables, 22 described the schools’ catchment area, with the remainder describing structural and organisational aspects of the schools themselves. There was no significant relationship between the two dependent variables. Results suggested that poor school attendance is strongly associated with socio‐economic disadvantage, but not to the same extent with structural or organisational aspects of the school. In contrast no model was found which could satisfactorily account for exclusion rates. This was taken as evidence that policy on exclusion is largely idiosyncratic to each school.
TL;DR: The method of experimental data analysis known as least-squares, applicable to a wide range of problems in biochemistry, is presented, which can easily be generalized to include cases where the dependent and independent variables are cross-correlated.
TL;DR: A multinomial logit model has been formulated and estimated for inpatient hospital utilization in a region and provides a good fit of the data and yields insights of particular interest to health planners and hospital administrators.
Abstract: Effective planning and management of multi-institutional health care systems require models to explain and predict hospital utilization in a region. In this study, a multinomial logit model has been formulated and estimated for inpatient hospital utilization in a region. Various measures and indexes that account for the relative attractiveness of hospitals are used in the specification of the model. A data base of more than 130,000 patient records is used to estimate the model. The model provides a good fit of the data and yields insights of particular interest to health planners and hospital administrators. The study also shows that acceptable estimation results can be obtained by careful selection of the estimation method, the procedures for dealing with zero frequencies, and the independent variables.
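As a rough illustration of how a multinomial logit model turns hospital attractiveness into utilization shares, the sketch below computes choice probabilities from utilities. The attractiveness measures (bed count and distance), the coefficient values, and the three hospitals are hypothetical and are not taken from the study.

```python
import numpy as np


def choice_probabilities(utilities):
    """Multinomial logit shares: P(j) = exp(V_j) / sum_k exp(V_k)."""
    v = np.asarray(utilities, dtype=float)
    v = v - v.max()  # subtracting the max avoids overflow; shares are unchanged
    expv = np.exp(v)
    return expv / expv.sum()


# Hypothetical specification: utility rises with size and falls with distance.
beta_beds, beta_dist = 0.004, 0.08      # made-up coefficients
beds = np.array([250.0, 400.0, 120.0])  # made-up hospital sizes
distance_km = np.array([5.0, 20.0, 2.0])

utilities = beta_beds * beds - beta_dist * distance_km
probs = choice_probabilities(utilities)

for j, p in enumerate(probs):
    print(f"hospital {j}: predicted share {p:.3f}")
```

In an actual application the coefficients would be estimated from patient records, for example by maximum likelihood, rather than fixed by hand as they are here.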
TL;DR: In this paper, the authors identify those factors which account for administrative innovation in municipal government bureaucracies and examine the relationship between each set of independent variables and the dimensions of innovation.
Abstract: The purpose of this study is to identify those factors which account for administrative innovation in municipal government bureaucracies. Two dimensions of administrative innovation are examined: management and technology. Management innovation refers to procedures and methods by which policies are implemented. Technology innovation refers to the adoption of new physical products or processes. Multiple indicators of specific innovative practices are used to create a management scale and a technology scale and the two scales are then combined to create a composite administration innovation scale. In order to explain the dimensions of innovation the study employs a model comprised of three sets of independent variables: community variables, political system variables, and bureaucratic variables. Multiple regression analysis is used to examine the relationship between each set of independent variables and the dimensions of innovation. A second stage of analysis combines the three sets of explanatory variable...
TL;DR: In this paper, a statistical method called logistic regression is used to relate qualitative dependent variables to one or more independent variables, which may or may not be quantitative, for a hydraulic problem of relating scouring potential in a channel to depth and velocity of flow.
Abstract: Water resource engineers often have to relate qualitative dependent variables to one or more independent variables, which may or may not be quantitative. In such circumstances, the use of conventional regression analysis would encounter a number of difficulties. This paper introduces a statistical method called logistic regression which is specially developed for such conditions. The method is applied to a hydraulic problem of relating scouring potential in a channel to depth and velocity of flow. Further investigations and applications are necessary to determine whether the methodology could become a useful addition to water resources engineering analyses.
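To make the idea concrete, here is a minimal sketch of a logistic regression fitted by Newton-Raphson iterations. The data are synthetic: the depth and velocity ranges and the coefficients used to generate scour outcomes are assumptions for illustration, not values from the paper.

```python
import numpy as np


def fit_logistic(X, y, n_iter=25):
    """Fit a logistic regression by Newton-Raphson.

    X: (n, p) predictors without an intercept column; y: (n,) array of 0/1.
    Returns the coefficient vector with the intercept first.
    """
    A = np.column_stack([np.ones(len(X)), X])
    beta = np.zeros(A.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(A @ beta)))   # predicted probabilities
        w = p * (1.0 - p)                       # Bernoulli variances
        grad = A.T @ (y - p)                    # score vector
        hess = A.T @ (A * w[:, None])           # observed information
        beta = beta + np.linalg.solve(hess, grad)
    return beta


# Synthetic channel data (hypothetical ranges and generating coefficients).
rng = np.random.default_rng(2)
n = 2000
depth = rng.uniform(0.5, 3.0, size=n)       # flow depth, m
velocity = rng.uniform(0.2, 2.5, size=n)    # flow velocity, m/s
true_logit = -6.0 + 1.0 * depth + 3.0 * velocity
scour = (rng.random(n) < 1.0 / (1.0 + np.exp(-true_logit))).astype(float)

beta = fit_logistic(np.column_stack([depth, velocity]), scour)
print("intercept, depth, velocity coefficients:", np.round(beta, 2))
```

The fitted coefficients give the change in the log-odds of scour per unit of depth or velocity, which is what lets a yes/no outcome be related to continuous predictors.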
TL;DR: In this paper, the authors established some new integral inequalities of Opial type in two independent variables and derived the two independent variable analogue of the Opial inequality and its generalization given by G. S. Yang.
Abstract: The aim of the present paper is to establish some new integral inequalities of Opial type in two independent variables. Our results are the two independent variable generalizations of some of the inequalities recently established by the present author and in special cases yield the two independent variable analogue of the Opial inequality and its generalization given by G. S. Yang.
TL;DR: A maximum likelihood estimator for NMDS is developed, and its relationship to the standard Shepard-Kruskal estimation method is described.
Abstract: The properties of nonmetric multidimensional scaling (NMDS) are explored by specifying statistical models, proving statistical consistency, and developing hypothesis testing procedures. Statistical models with errors in the dependent and independent variables are described for quantitative and qualitative data. For these models, statistical consistency often depends crucially upon how error enters the model and how data are collected and summarized (e.g., by means, medians, or rank statistics). A maximum likelihood estimator for NMDS is developed, and its relationship to the standard Shepard-Kruskal estimation method is described. This maximum likelihood framework is used to develop a method for testing the overall fit of the model.
TL;DR: In this article, integral inequalities of the Gollwitzer type in n independent variables are established, which generalize some known results obtained by Gollwitz, Bondge, Pachpatte, Shih, and Yeh.
TL;DR: In this paper, the dependent variable (or response) and the model are transformed in the same way, and two types of transformations, power transformation and weighting, are used together to remove skewness and to induce constant variance.
Abstract: We propose a methodology for fitting theoretical models to data. The dependent variable (or response) and the model are transformed in the same way. Two types of transformations, power transformation and weighting, are used together to remove skewness and to induce constant variance. Our method is applied to the stock-recruitment data of four fish stocks. Also discussed are estimates of the conditional mean and the conditional quantiles of the original response.
TL;DR: In this article, a general multilevel approach that is suitable for structural synthesis is presented, where the dependent design variables are optimized at the first level for any assumed behavior, and some behavior quantities, chosen as the independent variables, are selected at the second level.
Abstract: A general multilevel approach that is suitable for structural synthesis is presented. Some behavior quantities, chosen as the independent variables, are selected at the second level. The dependent design variables are optimized at the first level for any assumed behavior. The elastic analysis (third level) is repeated only after a complete solution of both levels. The number of independent variables in the proposed formulation is not affected by the number of loading conditions. Necessary conditions for a feasible solution are introduced. The main advantages of this approach are: (a) the number of implicit elastic analyses during the solution is relatively small, (b) the number of independent behavior variables is reduced, (c) the possibility of infeasible solutions at the first level is minimized, and (d) the first-level problem can be decomposed into several simple independent subproblems. The solution methodology is demonstrated for steel and reinforced concrete structural systems.
TL;DR: It is shown that the only ways to detect a missing variable through residual plots are either through a non-linear trend in the plot of residuals against the observed dependent variable or through a linear or non-linear trend in the plot of residuals against the missing variable.
Abstract: Regression analysis and residual analysis are discussed. We point out and demonstrate that it is inappropriate to analyse the patterns emerging when the residuals are plotted against the observed dependent variable. This is because of an intrinsic correlation between the residuals and the observed dependent variables. We also point out that the only ways to detect a missing variable through residual plots are either through a non-linear trend in the above mentioned residual plot or through a linear or non-linear trend in the plot of residuals plotted against the missing variables. On this basis we point out some earlier errors in the use of residual analysis.
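The intrinsic correlation mentioned above can be checked numerically. The sketch below uses synthetic data from a correctly specified linear model (the coefficients and sample size are arbitrary assumptions) and compares the correlation of the residuals with the observed response against their correlation with the fitted values.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
x = rng.normal(size=n)
y = 2.0 + 0.5 * x + rng.normal(size=n)   # correctly specified linear model

# Ordinary least squares with an intercept.
A = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(A, y, rcond=None)
fitted = A @ beta
resid = y - fitted

r_squared = 1.0 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))
corr_resid_y = np.corrcoef(resid, y)[0, 1]
corr_resid_fitted = np.corrcoef(resid, fitted)[0, 1]

print(f"corr(residuals, observed y): {corr_resid_y:.3f}")
print(f"sqrt(1 - R^2):               {np.sqrt(1.0 - r_squared):.3f}")
print(f"corr(residuals, fitted y):   {corr_resid_fitted:.3f}")
```

For least squares with an intercept, the correlation between residuals and the observed response equals sqrt(1 - R^2), so a linear trend appears in that plot even when the model is correct, whereas the residuals are uncorrelated with the fitted values by construction.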
TL;DR: It is possible to establish load conditions with specific quantitative characteristics that are reflected by behavioural dependent variables; the way subjects cope with the tasks can then be evaluated through monitoring physiological and emotional parameters, which in this model are regarded as moderating variables.
Abstract: One problem in psychological stress testing is that the load characteristics of an applied set of experimental conditions are difficult to evaluate. A second problem is that subjects can modify the characteristics of the test situation through their behaviour. Two solutions of these problems are suggested. Firstly, the manipulation of test conditions with respect to a single, common dimension (such as test difficulty); secondly, the individual standardisation of test situations with respect to a common dimensional variable. These approaches are illustrated in two experiments. The results suggest that both procedures are effective. It is possible to establish load conditions with specific quantitative characteristics (task difficulty) which are reflected by behavioural dependent variables. The way in which subjects cope with experimental tasks can then be evaluated through monitoring physiological and emotional parameters, which in this model are regarded as moderating variables.
TL;DR: In this article, various properties of independence or conditional independence between certain random variables may be deduced from the symmetries enjoyed by their joint distributions, where the underlying data-distribution has suitable orthogonal invariance.
TL;DR: In this article, an investigation which compares empirical analyses of a particular type of crime, homicide, that use different measurement strategies, different levels of aggregation, and ratio versus non-ratio variables is presented.
Abstract: Issues of measurement error, level of aggregation, and ratio variables have been considered serious problems in criminological research. Although there have been many recent discussions of these issues in sociology and criminology, studies designed to assess the impact of these problems on the results of empirical research have, for the most part, been absent. After reviewing what is known theoretically and conceptually about these issues, an investigation which compares empirical analyses of a particular type of crime, homicide, that use different measurement strategies, different levels of aggregation, and ratio versus nonratio variables is presented. Utilizing homicide data from the mid-1970s and selected independent variables, the results of this investigation indicate that these three problems can interact in an empirical setting such that potential solutions to these problems do not always apply in the manner suggested in previous studies. The results also indicate that there is great risk in ignoring one or more of these problems in empirical research, in that different substantive conclusions can be reached from analyses that ignore these issues compared with analyses that deal directly with them.
TL;DR: In this paper, a Monte Carlo study of OLS and GLS based adaptive ridge estimators for regression problems in which the independent variables are collinear and the errors are autocorrelated is presented.
Abstract: This paper presents the results of a Monte Carlo study of OLS and GLS based adaptive ridge estimators for regression problems in which the independent variables are collinear and the errors are autocorrelated. It studies the effects of degree of collinearity, magnitude of error variance, orientation of the parameter vector and serial correlation of the independent variables on the mean squared error performance of these estimators. Results suggest that such estimators produce greatly improved performance in favorable portions of the parameter space. The GLS based methods are best when the independent variables are also serially correlated.
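For readers unfamiliar with ridge estimation, the sketch below shows the OLS-based ridge estimator with a data-driven ridge constant. The abstract does not say which adaptive rules were studied, so the Hoerl-Kennard-Baldwin choice k = p s^2 / (b'b) used here is an assumption, as are the synthetic collinear data; autocorrelated errors and the GLS variant are not modelled.

```python
import numpy as np


def ridge(X, y, k):
    """Ridge estimator (X'X + k I)^{-1} X'y; k = 0 gives ordinary least squares."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + k * np.eye(p), X.T @ y)


def hkb_k(X, y):
    """Data-driven ridge constant k = p * s^2 / (b'b), computed from the OLS fit."""
    n, p = X.shape
    b = ridge(X, y, 0.0)
    resid = y - X @ b
    s2 = resid @ resid / (n - p)
    return p * s2 / (b @ b)


# Synthetic data with two nearly collinear regressors (illustrative only).
rng = np.random.default_rng(3)
n = 100
x1 = rng.normal(size=n)
x2 = x1 + 0.05 * rng.normal(size=n)
X = np.column_stack([x1, x2])
X = (X - X.mean(axis=0)) / X.std(axis=0)   # standardize the columns
y = X @ np.array([1.0, 1.0]) + rng.normal(size=n)
y = y - y.mean()

k = hkb_k(X, y)
b_ols = ridge(X, y, 0.0)
b_ridge = ridge(X, y, k)
print("k:", round(float(k), 4))
print("OLS:  ", np.round(b_ols, 3))
print("ridge:", np.round(b_ridge, 3))
```

Any positive `k` shrinks the coefficient vector toward zero relative to OLS; whether that lowers mean squared error depends on the region of the parameter space, which is the question the Monte Carlo study addresses.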
TL;DR: In this article, a crime-specific approach to data analysis is proposed and tested to resolve some of the ambiguity in previous research on factors influencing sentence severity.
TL;DR: In this paper, the authors show that direct replacement of an explanatory variable by the forecast yields worse forecasts of the dependent variable than does respecification of the equation to omit Xt.
Abstract: Ashley (1983) gave a simple condition for determining when a forecast of an explanatory variable (Xt ) is sufficiently inaccurate that direct replacement of Xt by the forecast yields worse forecasts of the dependent variable than does respecification of the equation to omit Xt . Many available macroeconomic forecasts were shown to be of limited usefulness in direct replacement. Direct replacement, however, is not optimal if the forecast's distribution is known. Here optimal linear forms in commercial forecasts of several macroeconomic variables are obtained by using estimates of their distributions. Although they are an improvement on the raw forecasts (direct replacement), these optimal forms are still too inaccurate to be useful in replacing the actual explanatory variables in forecasting models. The results strongly indicate that optimal forms involving several commercial forecasts will not be very useful either. Thus Ashley's (1983) sufficient condition retains its value in gauging the usefulness of a...