TL;DR: Geographically weighted regression and the expansion method are two statistical techniques that can be used to examine the spatial variability of regression results across a region and so inform on the presence of spatial nonstationarity.
Abstract: Geographically weighted regression and the expansion method are two statistical techniques which can be used to examine the spatial variability of regression results across a region and so inform on the presence of spatial nonstationarity. Rather than accept one set of 'global' regression results, both techniques allow the possibility of producing 'local' regression results from any point within the region so that the output from the analysis is a set of mappable statistics which denote local relationships. Within the paper, the application of each technique to a set of health data from northeast England is compared. Geographically weighted regression is shown to produce more informative results regarding parameter variation over space.
1 Spatial nonstationarity
A frequent aim of data analysis is to identify relationships between pairs of variables, often after negating the effects of other variables. By far the most common type of analysis used to achieve this aim is that of regression, in which relationships between one or more independent variables and a single dependent variable are estimated. In spatial analysis the data are drawn from geographical units and a single regression equation is estimated. This has the effect of producing 'average' or 'global' parameter estimates which are assumed to apply equally over the whole region. That is, the relationships being measured are assumed to be stationary over space. Relationships which are not stationary, and which are said to exhibit spatial nonstationarity, create problems for the interpretation of parameter estimates from a regression model. It is the intention of this paper to compare the results of two statistical techniques, geographically weighted regression (GWR) and the expansion method (EM), which can be used both to account for and to examine the presence of spatial nonstationarity in relationships.
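The idea of 'local' regression results described above can be illustrated with a minimal sketch. It assumes a single predictor, a Gaussian distance kernel and a hand-picked bandwidth; the function name `local_fit`, the synthetic data and the bandwidth values are illustrative assumptions, not details taken from the paper.

```python
import math

def local_fit(points, target, bandwidth):
    """Weighted least squares fit of y = a + b*x around one target location.

    points:    list of (u, v, x, y) tuples; (u, v) are map coordinates.
    target:    (u, v) location at which local parameters are estimated.
    bandwidth: Gaussian kernel bandwidth (an assumed tuning choice).
    """
    tu, tv = target
    sw = swx = swy = swxx = swxy = 0.0
    for u, v, x, y in points:
        dist2 = (u - tu) ** 2 + (v - tv) ** 2
        w = math.exp(-0.5 * dist2 / bandwidth ** 2)  # nearer points weigh more
        sw += w
        swx += w * x
        swy += w * y
        swxx += w * x * x
        swxy += w * x * y
    denom = sw * swxx - swx ** 2
    slope = (sw * swxy - swx * swy) / denom
    intercept = (swy - slope * swx) / sw
    return intercept, slope

# Synthetic data (assumption): the true slope drifts from west (u=0) to east (u=9).
points = [(float(u), 0.0, float(x), 1.0 + (1.0 + u) * x)
          for u in range(10) for x in range(4)]

west = local_fit(points, (0.0, 0.0), bandwidth=1.0)
east = local_fit(points, (9.0, 0.0), bandwidth=1.0)
# A very large bandwidth weights all points almost equally, which
# approximates the single 'global' regression.
global_like = local_fit(points, (4.5, 0.0), bandwidth=1e6)

print("local slope, west:", west[1])
print("local slope, east:", east[1])
print("near-global slope:", global_like[1])
```

The two local slopes differ from each other and from the near-global slope, which is the kind of parameter variation over space that a single global fit averages away.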
TL;DR: In this paper, a modified maximum likelihood estimator that corrects for misclassification is proposed, which combines the maximum rank correlation estimator of Han (1987) (Journal of Econometrics 35, 303-316) with isotonic regression.
TL;DR: In this article, a new measure of goodness of fit for linear regression with dichotomous dependent variables is proposed, which can be interpreted intuitively in a similar way to R^2 in the linear regression context.
Abstract: The econometrics literature contains many alternative measures of goodness of fit, roughly analogous to R^2, for use with equations with dichotomous dependent variables. There is, however, no consensus as to the measures' relative merits or about which ones should be reported in empirical work. This article proposes a new measure that possesses several useful properties that the other measures lack. The new measure may be interpreted intuitively in a similar way to R^2 in the linear regression context.
TL;DR: This paper used least squares and least absolute values models to quantify the boundaries of scatter diagrams and compared the estimated slopes for consistency, finding that least squares regression techniques were particularly sensitive to outlying y values and irregularities in the distribution of observations, and that they frequently produced inconsistent estimates of slope for upper and lower bounds.
Abstract: Scatter diagrams have historically proved useful in the study of associative relationships in ecology. Several important ecological questions involve correlations between variables resulting in polygonal shapes. Two examples that have received considerable attention are patterns between prey size and predator size in animal populations and the relationship between animal abundance and body size. Each is typically illustrated using scatter diagrams with upper and lower boundaries of response variables often changing at different rates with changes in the independent variables. Despite recent statistical contributions that have stimulated an interest in characterizing the limits of a variable, a consensus on an appropriate methodology to quantify the boundaries of scatter diagrams has not yet been achieved. We tested regression techniques based on least squares and least absolute values models using several independent data sets on prey length and predator length for piscivorous fishes and compared estimated slopes for consistency. Our results indicated that least squares regression techniques were particularly sensitive to outlying y values and irregularities in the distribution of observations, and that they frequently produced inconsistent estimates of slope for upper and lower bounds. In contrast, quantile regression techniques based on least absolute values models appeared robust to outlying y values and sparseness within data sets, while providing consistent estimates of upper and lower bound slopes. Moreover, the use of quantile regression eliminated the need for an excess of arbitrary decision-making on the part of the investigator. We recommend quantile regression as an improvement to currently available techniques used to examine potential ecological relationships dependent upon quantitative information on the boundaries of polygonal relationships.
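As a rough illustration of how quantile regression can trace the upper and lower boundaries of a wedge-shaped scatter, the sketch below fits a straight line by minimising the asymmetric absolute-value ("check") loss. It relies on the standard result that, for a line, an optimal quantile fit passes through at least two data points, so a brute-force search over point pairs is enough for small data sets. The data are synthetic and the function names are hypothetical; this is not the authors' procedure or data.

```python
from itertools import combinations

def check_loss(residual, tau):
    """Asymmetric absolute loss used in quantile regression."""
    return tau * residual if residual >= 0 else (tau - 1.0) * residual

def quantile_line(xs, ys, tau):
    """Brute-force quantile regression line for small samples.

    Tries every line through two data points with distinct x values and
    keeps the one with the smallest total check loss.
    """
    best = None
    for i, j in combinations(range(len(xs)), 2):
        if xs[i] == xs[j]:
            continue  # vertical line, not a valid regression candidate
        slope = (ys[j] - ys[i]) / (xs[j] - xs[i])
        intercept = ys[i] - slope * xs[i]
        loss = sum(check_loss(y - (intercept + slope * x), tau)
                   for x, y in zip(xs, ys))
        if best is None or loss < best[0]:
            best = (loss, intercept, slope)
    return best[1], best[2]

# Synthetic wedge (assumption): the response ranges from 0.1*x to 0.5*x.
xs, ys = [], []
for x in range(1, 11):
    for c in (0.1, 0.2, 0.3, 0.4, 0.5):
        xs.append(float(x))
        ys.append(x * c)

lower_intercept, lower_slope = quantile_line(xs, ys, tau=0.1)
upper_intercept, upper_slope = quantile_line(xs, ys, tau=0.9)
print("lower-bound slope:", lower_slope)
print("upper-bound slope:", upper_slope)
```

The search is cubic in the number of observations, so it is only meant to make the idea concrete; practical work would use a linear-programming or dedicated quantile regression routine.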
TL;DR: In this paper, a new technique is developed for the identification of inhomogeneities in Canadian temperature series, which is based on the application of four linear regression models in order to determine whether the tested series is homogeneous, if there is a nonclimatic trend, a step, or trends before and/or after a step.
Abstract: A new technique has been developed for the identification of inhomogeneities in Canadian temperature series. The objective is to identify two types of inhomogeneities—nonclimatic steps and trends—in the series of a candidate station in the absence of prior knowledge of the time of site changes and to properly estimate their position in time and their magnitude. This new technique is based on the application of four linear regression models in order to determine whether the tested series is homogeneous, if there is a nonclimatic trend, a step, or trends before and/or after a step. The dependent variable is the series of the candidate station and the independent variables are the series of some neighboring stations. Additional independent variables are used to describe and measure steps and trends existing in the tested series but not in the neighboring series. After the application of each model, the residuals are analyzed in order to determine the fit of the model. If there is significant autocorrelation in the residuals, nonidentified inhomogeneities are suspected in the tested series and a different model is applied to the datasets. A model is finally accepted when the residuals are considered to be random variables. The description of the technique is presented along with some evaluation of its ability to identify inhomogeneities. Results are illustrated through the provision of an example of its application to archived temperature datasets.
TL;DR: The role in experiments of measures of independent variables and proposed mediating variables is examined and alternative explanations, critical for assessing construct validity, are distinguished from different general theoretical accounts of a finding.
Abstract: A study of experiments in major social psychology journals shows that measures of independent variables have become increasingly common. The role in experiments of measures of independent variables and proposed mediating variables is examined. In the causal sequence assumed in interpreting an experimental result, the independent variable and proposed mediating variable are presumed to mediate the effect of the experimental treatment on the dependent measure. Measures of independent variables and mediators provide checks on the assumptions that the experimental treatment successfully manipulated those variables and are unquestionably useful. A separate, controversial issue is whether such measures are necessary in experiments. If no plausible alternative explanations exist, data from such measures are not needed. Plausible alternative explanations are not eliminated by data from such measures. Alternative explanations, critical for assessing construct validity (Cook & Campbell, 1979), are distinguished from
TL;DR: Generalized additive models (GAMs) allow researchers to fit each independent variable with arbitrary nonparametric functions, subject to the constraint that the nonparametric effects combine additively.
Abstract: Social scientists almost always use statistical models positing the dependent variable as a global, linear function of X, despite suspicions that the social and political world is not so simple, or that our theories are so strong. Generalized additive models (GAMs) let researchers fit each independent variable with arbitrary nonparametric functions, but subject to the constraint that the nonparametric effects combine additively. In this way GAMs strike a sensible balance between the flexibility of nonparametric techniques and the ease of interpretation and familiarity of linear regression. GAMs thus offer social scientists a practical methodology for improving on the extant practice of global linearity by default. We reanalyze published work from several subfields of political science, highlighting the strengths (and limitations) of GAMs. We estimate non-linear marginal effects in a regression analysis of incumbent reelection, nonparametric duration dependence in an analysis of cabinet duration, and within-dyad interaction effects in a reconsideration of the democratic peace hypothesis. We conclude with a more general consideration of the circumstances in which GAMs are likely to be of use to political scientists, as well as some apparent limitations of the technique.
TL;DR: In this article, the authors compared four measurement bases (number of pages, sentences, and words, and disclosure index score) and concluded that content analysis and disclosure indices measured different concepts, the latter measuring largely a subset of the former, and that researchers, when deciding on whether to measure the dependent variable by content analysis or a disclosure index, will need to define more the relevance of the measurement to be adopted to the research question.
Abstract: Through the juxtaposition of political economy theory and an in-depth empirical analysis, this study provides further insights into the understanding of variables that explain variations in voluntary environmental and social accounting disclosures (VESAD) across national and regional boundaries. Factors from three classes of Thomas's (1991) classification schema, the organizational attribute (organizational size and economic performance), business environment (industry type) and societal variable (culture, political and civil system, legal system, level of economic development and equity market) categories, were included in this project.
Listed companies' annual reports were surveyed using content analysis and disclosure index from seven countries in the Asia-Pacific region: Australia, Singapore, Hong Kong, the Philippines, Thailand, Indonesia and Malaysia. The dependent variable, the extent of VESAD information, was measured by four different measurement bases; these were number of pages, sentences and words, and disclosure index score. Different measurement bases were used to compare and contrast findings from statistical tests to examine if this led to conflicting or comparable conclusions.
Descriptive and univariate analysis indicated that under all four measurement bases the country of origin was an important determinant of VESAD practices in the Asia-Pacific region. Multiple regression and path analysis showed that organizational size, industry type, culture, political and civil, and legal systems were statistically significant in explaining variations both directly and indirectly. The level of economic development was also found to be important, but only indirectly. It is concluded from these findings that social and political pressures placed on companies by the interaction of these significant variables compel firms to provide VESAD information to meet social expectations and to avoid possible government regulation to preserve their own self interests and survival. Economic performance and equity market factors were of no significant statistical influence.
Differences in empirical results using data measured by the three units of measurement for content analysis were minimal. Differences were noted, however, when contrasted against disclosure index scores. It was concluded from these results that content analysis and disclosure indices measured different concepts, the latter measuring largely a subset of the former. The consequence of this finding is that researchers, when deciding on whether to measure the dependent variable by content analysis or a disclosure index, will need to define more the relevance of the measurement to be adopted to the research question underlying the study. Determination of the unit of analysis to utilize when adopting content analysis is less complicated as each technique provides essentially the same results.
TL;DR: In this paper, a large and representative sample of New Zealand manufacturing exporters is used to empirically test and validate the model of export performance proposed by Aaby and Slater, and a 20-item additive export performance scale is formulated and found to be reliable and normally distributed.
Abstract: A large and representative sample of New Zealand manufacturing exporters is used to empirically test and validate the model of export performance proposed by Aaby and Slater. A 20‐item additive export performance scale, based on both objective and subjective measures, is formulated and found to be reliable and normally distributed. A set of independent variables proposed by Aaby and Slater is operationalised, along with an additional marketing orientation construct based on a ten item scale. A firm size control measure is also utilised. A factor analysis of the independent variable set identifies an interpretable sub‐set of independent measures. Using a multiple regression model, six of seven independent variables are found to have a significant effect on export performance as the dependent variable, and in the hypothesised direction. Implications of the findings for exporters are discussed.
TL;DR: A method and system for grouping multiple data points, each data point being a set (e.g., a vector, a tuple, etc.) including a measured dependent value and at least one related independent variable value, include fitting the data into a model relating the independent and dependent variables of the data, and calculating similarity and distance between the data points and groups of data points.
Abstract: A method and system for grouping multiple data points, each data point being a set (e.g., a vector, a tuple, etc.) including a measured dependent value and at least one related independent variable value, include fitting the data into a model relating the independent and dependent variables of the data, and calculating similarity and distance between the data points and groups of the data points, thereby to group the multiple data points.
TL;DR: It is argued that these types of analyses can support the quantitative multi-scale understanding of land use, needed for the modelling of realistic future land use change scenarios that take into account local and regional conditions of actual land use.
TL;DR: In this paper, a simple change of dependent variables that guarantees positivity of turbulence variables in numerical simulation codes is presented, which is valid for any numerical scheme, be it finite difference, a finite volume, or a finite element method.
Abstract: A simple change of dependent variables that guarantees positivity of turbulence variables in numerical simulation codes is presented. The approach consists of solving for the natural logarithm of the turbulence variables, which are known to be strictly positive. The approach is valid for any numerical scheme, be it finite difference, a finite volume, or a finite element method. The work focuses on the advantages of the proposed change of dependent variables within the framework of an adaptive finite element method. The turbulence equations in logarithmic variables are presented for the standard k-ε model. Error estimation and mesh adaptation procedures are described. The formulation is validated on a shear layer case for which an analytical solution is available. This provides a framework for rigorous comparison of the proposed approach with the standard solution technique, which makes use of k and ε as dependent variables. The approach is then applied to solve turbulent flow over a NACA0012 airfoil for which experimental measurements are available. The proposed procedure results in a robust adaptive algorithm. Improved predictions of turbulence variables are obtained using the proposed formulation.
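Why solving for the logarithm preserves positivity can be seen on a toy problem. The sketch below does not use the k-ε equations; it assumes a simple decay equation dk/dt = -c·k, an explicit Euler scheme and a deliberately large time step, all chosen only for illustration. Solving for K = ln k turns the equation into dK/dt = -c, and k = exp(K) is positive whatever the step size.

```python
import math

def euler_direct(k0, c, dt, steps):
    """Explicit Euler on dk/dt = -c*k, solving for k itself."""
    k = k0
    history = [k]
    for _ in range(steps):
        k = k + dt * (-c * k)
        history.append(k)
    return history

def euler_log(k0, c, dt, steps):
    """Explicit Euler on dK/dt = -c with K = ln(k); k is recovered as exp(K)."""
    K = math.log(k0)
    history = [math.exp(K)]
    for _ in range(steps):
        K = K + dt * (-c)
        history.append(math.exp(K))
    return history

# Assumed values: a step large enough that the direct scheme overshoots zero.
direct = euler_direct(k0=1.0, c=1.0, dt=1.5, steps=5)
logged = euler_log(k0=1.0, c=1.0, dt=1.5, steps=5)

print("direct variable:", direct)
print("log variable:   ", logged)
```

With these values the direct scheme produces negative iterates, while the logarithmic form stays strictly positive by construction.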
TL;DR: In this paper, the authors identify the factors that influence the effectiveness of site managers and determine the relationship between these measures and the independent variables: personal variables, job conditions, project characteristics and organisational variables.
TL;DR: In this article, a Bayesian approach for inference in a simultaneous equation model with limited dependent variables (SLDV) is proposed, which employs a combination of Gibbs sampling and data augmentation to avoid direct evaluation of the non-trivial likelihood function.
TL;DR: This work presents a simple method of adjusting for a common measurement error bias that tends to be overlooked in the modelling of associations with change.
Abstract: Biomedical studies often measure variables with error. Examples in the literature include investigation of the association between the change in some outcome variable (blood pressure, cholesterol level etc.) and a set of explanatory variables (age, smoking status etc.). Typically, one fits linear regression models to investigate such associations. With the outcome variable measured with error, a problem occurs when we include the baseline value of the outcome variable as a covariate. In such instances, one can find a relationship between the observed change in the outcome and the explanatory variables even when there is no association between these variables and the true change in the outcome variable. We present a simple method of adjusting for a common measurement error bias that tends to be overlooked in the modelling of associations with change. Additional information (for example, replicates, instrumental variables) is needed to estimate the variance of the measurement error to perform this bias correction.
TL;DR: In this article, the authors compared the continuous-time competing-risk approach to dynamic microsimulation modelling and approaches based on a discrete-time framework in a systematic way and concluded that a discrete time framework with comparatively short time periods appears to be best suited for causal modelling in dynamic micro-simulation models.
Abstract: In this paper the continuous-time competing-risk approach to dynamic microsimulation modelling and approaches based on a discrete-time framework are compared in a systematic way. Besides the basic modelling approaches the possibilities to extend the models to include quantitative and qualitative dependent variables, to use macroeconomic explanatory variables, and to account for dependencies between microunits are discussed. However, most attention is paid to the problems of causality, of simultaneity and of stochastic dependencies between the partial processes in multivariate models. The main conclusion is that a discrete-time framework with comparatively short time periods appears to be best suited for causal modelling in dynamic microsimulation models.
TL;DR: In this paper, three dimensions of household recycling behavior (frequency of participation, amount of recyclable materials, and contamination of recyclables by improper material) were observed in 705 households of a suburban residential community over an 8-week period.
Abstract: Empirical knowledge about recycling behavior is needed to inform environmental education efforts and policy proposals. Three dimensions of household recycling behavior (frequency of participation, amount of recyclable materials, and contamination of recyclables by improper material) were observed in 705 households of a suburban residential community over an 8-week period. These dependent variables were predicted by a set of 10 independent variables: recycling knowledge, general environmental concern, community attachment, 3 demographic variables, and 4 specific recycling motivation factors. A different pattern of predictor variables was found for each of the dependent variables, and the results suggest that many of the variables that predicted recycling behavior in past research have weaker relationships in current, more convenient, curbside programs.
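TL;DR: In this paper, results from a study of 1586 Mexican- and Anglo-American defendants from Dona Ana County, New Mexico, are presented to demonstrate how findings can differ between analyses with and without methodological safeguards such as corrections for sample bias and statistical controls for extralegal variables.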
Abstract: Studies of racial/ethnic disparities in criminal case processing have yielded mixed results. Some differences in findings have probably resulted from analyses of different social settings, but some could be attributable to differences in analytical rigor between studies. Contextual analyses are pointless unless the research yields unbiased estimates of the true relationships between a defendant's race/ethnicity and case dispositions. This goal may be furthered by conducting analyses that simultaneously incorporate the following: (a) corrections for sample bias, (b) analyses of several stages of case processing, (c) measures of prior record and offense seriousness which maximize “explained” variation in the dependent variables examined, (d) statistical controls for extralegal variables that correlate with case dispositions, and (e) more rigorous statistical tests for interactions. To demonstrate potential differences in findings from analyses with and without these characteristics, results from a study of 1586 Mexican- and Anglo-American defendants from Dona Ana County, New Mexico, are presented.
TL;DR: In this article, the effect of biased model coefficients on the prediction of forest growth under different situations is discussed and an example of adjusting the model coefficients for the bias using the SIMulation EXtrapolation (SIMEX) algorithm is given.
TL;DR: This study shows the need for guarantees stronger than the usual ones before concluding that there is a clear possibility of using satellite information to estimate forest parameters by means of regression techniques.
Abstract: In order to build models that relate thematic mapper (TM) imagery to field forest variables, several regression techniques, such as the ones based on the Mallows' C_p and the adjusted R^2 statistics, were applied. Nevertheless, although the best created models had good fittings (R^2 > 0.65) apparently supported by a clear statistical significance (p < 0.0001), later trials tested with additional plots showed that these models were, in fact, nonrobust models (models with very low predictive capabilities). Two factors were pointed out as causes of these inconsistencies between predicted and observed values: a relatively small number of available field plots and a relatively high number of possible independent variables. Actually, different trials suggested much lower fittings for the expected "really" predictive models. Some restrictions of TM satellite data, such as its radiometric, spectral, and spatial limitations, together with restrictions arising from gathering and processing of field data, might have led to these poor relations. This study shows the need for guarantees stronger than the usual ones before concluding that there is a clear possibility of using satellite information to estimate forest parameters by means of regression techniques.
TL;DR: In this paper, the authors proposed a methodology to formulate a dynamic regression with variables observed at different time intervals, and demonstrated this procedure by developing a forecasting model for Singapore's quarterly GDP based on monthly external trade.
TL;DR: In this paper, the authors present a generalized baseline expression for non-weather related independent variables, which can accommodate up to five simultaneous independent variables for a maximum of eight free parameters.
Abstract: Many utility bill analyses in the literature rely only on weather-based correlations. While often the dominant cause of seasonal variations in utility consumption, weather variables are far from the only determinant factors. Vacation shutdowns, “Plug Creep,” changes in building operation and square footage, and plain poor correlation are all too familiar to the practicing performance contractor. This paper presents a generalized Baseline Equation consistent with prior results by others, but extended to include other, non-weather related independent variables. Its compatibility with extensive prior research by others is shown, as well as its application to several types of facilities. The Baseline Equation as presented, can accommodate up to five simultaneous independent variables, for a maximum of eight free parameters. The use of two additional, empirical Degree-Day threshold parameters is also discussed. The Baseline Equation presented herein is at the base of a commercial Utility Accounting software program. All case studies presented to illustrate the development of the Baseline Equation for each facility are drawn from real life studies performed by users of this program.
TL;DR: In this paper, the authors conduct an ex post comparative evaluation exercise for "consensus" office rent models in the UK, common explanatory variables being derived from a literature review and from a survey of practitioners' operational models.
Abstract: Commercial property is regarded by many as functioning in a relatively inefficient market, so that opportunities exist to earn abnormal gains through the exploitation of information which is not reflected in prices. Property portfolio managers therefore rely to some extent on predictions or forecasts of future commercial property market performance as a tool to aid investment decisions. This paper seeks to conduct an ex post comparative evaluation exercise for “consensus” office rent models in the UK, common explanatory variables being derived from a literature review and from a survey of practitioners’ operational models. Three alternative valuation based rent indices are used as the dependent variables. Models are selected and ranked according to historic fit and used to predict five years ahead given perfect foresight. The paper finds that the best fitting models are not the best predicting models. Generally there is no relationship between the predictive rank of a model and the fit rank of a model.
TL;DR: In this article, a two-step procedure is proposed to estimate the model and the corresponding asymptotic covariance matrices are derived, and the maximum-likelihood estimator is derived in this way.
TL;DR: The most recent National Road Traffic Forecasts (NRTF) (Department of Transport, 1989) predict an increase in per capita car ownership for Britain of between 51.6 and 71.3 per cent over the period 1990 to 2025.
Abstract: The most recent National Road Traffic Forecasts (NRTF) (Department of Transport, 1989) predict an increase in per capita car ownership for Britain of between 51.6 and 71.3 per cent over the period 1990 to 2025. Car use is predicted to increase by between 71.7 and 112.7 per cent over the same period. Since current levels of car ownership and use already contribute towards significant road traffic congestion costs, as well as other external costs (Pearce, 1993), these forecasts imply serious problems in reconciling the demand for road space with its supply. Given the importance of these forecasts in determining transport policies, it is essential that they are as accurate as possible. This paper uses the general to specific methodology and the concept of cointegration to develop a well-specified model of per capita car ownership that produces ex post forecasts generally superior to those of the NRTF. The National Road Traffic Forecasts of 1980, 1984, and 1989 are based on an eclectic approach to modelling the growth of private and commercial road traffic. A combination of cross-section and time-series analysis is used, and this quantitative analysis is sometimes supplemented with, or modified by, judgements based on factors such as the international experience of road traffic growth. But as Button et al. (1982, p.65) observe: "Present Department of Transport forecasts ... are ad hoc. What is required is a single model, incorporating simultaneously both causal and proxy variables, from which a range of forecasts would be obtained by making suitable assumptions about the likely time paths of the independent variables."
TL;DR: In this article, the authors extended the application of the bootstrap method in accounting research to a simultaneous equations model of the demand and supply of audit services with mixed qualitative and continuous dependent variables.
Abstract: This paper extends the application of the bootstrap method in accounting research to a simultaneous equations model of the demand and supply of audit services with mixed qualitative and continuous dependent variables. A moderately sized sample of 118 quality control reviews (Copley, Doucet, and Gaver 1994) is used to demonstrate the bootstrap method and compare results to estimates of standard errors obtained from Amemiya's 1978 asymptotic generalized least squares (GLS) procedure. We find that the GLS t-statistics are inflated by as much as 55 percent and the corresponding p-values are likewise overstated when compared to the bootstrap results. The problem is more acute with the qualitative dependent variable for audit quality, which is often the key variable of interest.
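For readers unfamiliar with bootstrap standard errors, the following is a minimal sketch of the general idea only. It uses a pairs bootstrap for a single-equation least squares slope on simulated data, which is a much simpler setting than the simultaneous equations model with a qualitative dependent variable studied in the paper. The sample size, seed, number of resamples and function names are assumptions made for illustration.

```python
import random

def ols_slope(xs, ys):
    """Least squares slope of y on x (with an intercept)."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / sxx

def bootstrap_se(xs, ys, n_boot, rng):
    """Pairs bootstrap: resample (x, y) rows with replacement and refit."""
    n = len(xs)
    slopes = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        bx = [xs[i] for i in idx]
        by = [ys[i] for i in idx]
        if len(set(bx)) < 2:
            continue  # degenerate resample: slope is undefined
        slopes.append(ols_slope(bx, by))
    mean = sum(slopes) / len(slopes)
    var = sum((s - mean) ** 2 for s in slopes) / (len(slopes) - 1)
    return var ** 0.5

rng = random.Random(0)  # fixed seed so the run is repeatable
xs = [i / 10 for i in range(40)]
ys = [2.0 + 0.5 * x + rng.gauss(0.0, 1.0) for x in xs]  # simulated data

slope = ols_slope(xs, ys)
se = bootstrap_se(xs, ys, n_boot=500, rng=rng)
print("slope estimate:", slope)
print("bootstrap standard error:", se)
```

Dividing the estimate by the bootstrap standard error gives a t-type statistic that can be set against one based on an asymptotic formula, which is the kind of comparison the abstract reports.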
TL;DR: This technique is shown to have an advantage in terms of both accuracy and speed over approaches where forward accumulation is applied over the entire iterative process, and can be implemented in such a way as to provide a friendly interface for non-specialist users.
Abstract: This paper deals with the calculation of partial derivatives (w.r.t. the independent variables, x) of a vector of dependent variables y which satisfy a system of nonlinear equations g(u(x), y) = 0. A number of authors have suggested that the forward accumulation method of automatic differentiation can be applied to a suitable iterative scheme for solving the nonlinear system with a view to giving simultaneous convergence both to the correct value y and also to its Jacobian matrix y_x. It is known, however, that convergence of the derivatives may not occur at the same rate as the convergence of the y values. In this paper we avoid both the difficulty and the potential cost of iterating the gradient part of the calculation to sufficient accuracy. We do this by observing that forward accumulation need only be applied to the functions g after the dependent variables, y, have been computed in standard real arithmetic using any appropriate method. This so-called Post-Differentiation (PD) technique is shown, on a number of examples, to have an advantage in terms of both accuracy and speed over approaches where forward accumulation is applied over the entire iterative process. Moreover, the PD technique can be implemented in such a way as to provide a friendly interface for non-specialist users.
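The post-differentiation idea can be sketched on a scalar example. The code below is an illustration under stated assumptions, not the authors' implementation: it takes u(x) = x, uses the made-up equation g(x, y) = y^3 + y - x = 0, solves for y by bisection in ordinary floating point, and only then applies forward accumulation (via a small hypothetical `Dual` class) to g once per argument. The derivative follows from the implicit function theorem, dy/dx = -g_x / g_y.

```python
class Dual:
    """Minimal dual number for forward accumulation (value, derivative)."""

    def __init__(self, val, dot=0.0):
        self.val = float(val)
        self.dot = float(dot)

    @staticmethod
    def lift(other):
        return other if isinstance(other, Dual) else Dual(other)

    def __add__(self, other):
        o = Dual.lift(other)
        return Dual(self.val + o.val, self.dot + o.dot)

    __radd__ = __add__

    def __sub__(self, other):
        o = Dual.lift(other)
        return Dual(self.val - o.val, self.dot - o.dot)

    def __rsub__(self, other):
        return Dual.lift(other) - self

    def __mul__(self, other):
        o = Dual.lift(other)
        return Dual(self.val * o.val, self.val * o.dot + self.dot * o.val)

    __rmul__ = __mul__

def g(x, y):
    """Illustrative residual; works on floats and on Dual numbers."""
    return y * y * y + y - x

def solve_y(x, lo=0.0, hi=2.0, iters=60):
    """Bisection in plain real arithmetic; no derivatives are propagated."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if g(x, lo) * g(x, mid) <= 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def post_differentiate(x, y):
    """Differentiate g once at the converged point, then apply dy/dx = -g_x/g_y."""
    g_x = g(Dual(x, 1.0), Dual(y, 0.0)).dot
    g_y = g(Dual(x, 0.0), Dual(y, 1.0)).dot
    return -g_x / g_y

x0 = 2.0
y0 = solve_y(x0)
dydx = post_differentiate(x0, y0)
print("y:", y0)
print("dy/dx:", dydx)
```

Because the derivative is computed only at the converged y, the solver itself can be any method, including one that uses no derivative information at all.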
TL;DR: In this paper, the least squares estimation of a change point in multiple regressions is studied and the analytical density function and the cumulative distribution function for the general skewed distribution are derived.
Abstract: This paper studies the least squares estimation of a change point in multiple regressions. Consistency, rate of convergence, and asymptotic distributions are obtained. The model allows for lagged dependent variables and trending regressors. The error process can be dependent and heteroskedastic. For nonstationary regressors or disturbances, the asymptotic distribution is shown to be skewed. The analytical density function and the cumulative distribution function for the general skewed distribution are derived. The analysis applies to both pure and partial changes. The method is used to analyze the response of market interest rates to discount rate changes.
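Least squares estimation of a change point can be sketched as a grid search: for each candidate break date, fit the model separately on each side and keep the date with the smallest total sum of squared residuals. The example below assumes the simplest possible case, a single shift in the mean with no regressors, and uses made-up data; the paper's setting with multiple regressors, lagged dependent variables and dependent errors is considerably more general.

```python
def sse(values):
    """Sum of squared deviations from the segment mean."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def estimate_change_point(ys, min_seg=2):
    """Return (k, ssr): the first k observations form regime one.

    Every admissible split is tried; the one minimising the combined
    residual sum of squares of the two segments is returned.
    """
    best_k, best_ssr = None, float("inf")
    for k in range(min_seg, len(ys) - min_seg + 1):
        total = sse(ys[:k]) + sse(ys[k:])
        if total < best_ssr:
            best_k, best_ssr = k, total
    return best_k, best_ssr

# Made-up series (assumption): mean 1.0 for 12 periods, then mean 3.0 for 8.
ys = ([1.0 + 0.1 * (-1) ** t for t in range(12)]
      + [3.0 + 0.1 * (-1) ** t for t in range(8)])

k_hat, ssr = estimate_change_point(ys)
print("estimated change point:", k_hat)
print("minimised SSR:", ssr)
```

Extending the sketch to a regression with a break in its coefficients would replace `sse` with the residual sum of squares from a least squares fit on each segment.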
TL;DR: In this paper, the authors show that very successful and very unsuccessful municipalities respond least, while intermediately successful municipalities respond most, which can be interpreted as an interaction effect between interest in the topic of research and evaluation apprehension.
TL;DR: In this article, the robust Wald test statistic for SUR systems with adding up restrictions where the same explanatory variables are present in all equations and where heteroskedasticity and/or autocorrelation of unknown forms may be present is examined.
Abstract: In this paper, we examine the robust Wald test statistic for SUR systems with adding up restrictions where the same explanatory variables are present in all equations and where heteroskedasticity and/or autocorrelation of unknown forms may be present. For this case, the coefficients are usually estimated by least squares, equation by equation. For testing the typical hypotheses of interest, we show that the robust Wald statistic, i.e., the statistic based on the heteroskedasticity and autocorrelation consistent covariance matrix estimator, is invariant to the equation deleted. Our proof of invariance is algebraic and does not rely on parametric assumptions or on the knowledge of the covariance matrix of disturbances. Furthermore, the adding-up restrictions we consider are of a general form: the weighted sum of the dependent variables adds up to one of the explanatory variables, not necessarily a constant. We illustrate our results using the Capital Asset Pricing Model.