TL;DR: In this paper, the basic assumptions underlying one-factor ANCOVA analysis are discussed and the same general problem applies to all linear models with one or more covariates, including generalized linear models (GLIM) such as logistic regressions and even survival analysis.
TL;DR: In this article, it was shown that the inclusion of additional control variables may increase or decrease the bias, and we cannot know for sure which is the case in any particular situation.
Abstract: Quantitative political science is awash in control variables. The justification for these bloated specifications is usually the fear of omitted variable bias. A key underlying assumption is that the danger posed by omitted variable bias can be ameliorated by the inclusion of relevant control variables. Unfortunately, as this article demonstrates, there is nothing in the mathematics of regression analysis that supports this conclusion. The inclusion of additional control variables may increase or decrease the bias, and we cannot know for sure which is the case in any particular situation. A brief discussion of alternative strategies for achieving experimental control follows the main result.
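The claim that a control variable can increase bias can be checked with a small population-level calculation. The sketch below is not taken from the article. It assumes a hypothetical data-generating process in which x = z1 + z2 + e and y = BETA·x + z1 − z2 + noise, with z1, z2 and e independent and of unit variance. The names `solve`, `coef_on_x`, `COV` and `COV_Y` are illustrative only.

```python
from fractions import Fraction

def solve(a, b):
    """Solve the linear system a @ coef = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [v / m[col][col] for v in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [row[-1] for row in m]

BETA = Fraction(2)  # assumed true effect of x on y

# Population covariances implied by the assumed process, ordered (x, z1, z2).
COV = [[3, 1, 1],
       [1, 1, 0],
       [1, 0, 1]]
# Covariance of each regressor with y = BETA*x + z1 - z2 + noise.
COV_Y = [3 * BETA, BETA + 1, BETA - 1]

def coef_on_x(included):
    """Population OLS coefficient on x when regressing y on the listed variables."""
    a = [[COV[i][j] for j in included] for i in included]
    b = [COV_Y[i] for i in included]
    return solve(a, b)[0]

bias_no_controls = coef_on_x([0]) - BETA         # both confounders omitted
bias_one_control = coef_on_x([0, 1]) - BETA      # z1 added, z2 still omitted
bias_all_controls = coef_on_x([0, 1, 2]) - BETA  # both confounders included

print(bias_no_controls, bias_one_control, bias_all_controls)
```

Under these assumptions the two omitted variables offset each other, so the regression with no controls is unbiased, while adding z1 alone produces a bias of −1/2. Different assumed covariances would give a different ordering, which is the article's point.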
TL;DR: Path analysis cannot be used to establish causality or even to determine whether a specific model is correct; it can only determine whether the data are consistent with the model, but it is extremely powerful for examining complex models and for comparing different models to determine which one best fits the data.
Abstract: Path analysis is an extension of multiple regression. It goes beyond regression in that it allows for the analysis of more complicated models. In particular, it can examine situations in which there are several final dependent variables and those in which there are "chains" of influence, in that variable A influences variable B, which in turn affects variable C. Despite its previous name of "causal modelling," path analysis cannot be used to establish causality or even to determine whether a specific model is correct; it can only determine whether the data are consistent with the model. However, it is extremely powerful for examining complex models and for comparing different models to determine which one best fits the data. As with many techniques, path analysis has its own unique nomenclature, assumptions, and conventions, which are discussed in this paper.
TL;DR: The conceptual complexity of problems was manipulated to probe the limits of human information processing capacity; the findings suggest that a structure defined on four variables is at the limit of human processing capacity.
Abstract: The conceptual complexity of problems was manipulated to probe the limits of human information processing capacity. Participants were asked to interpret graphically displayed statistical interactions. In such problems, all independent variables need to be considered together, so that decomposition into smaller subtasks is constrained, and thus the order of the interaction directly determines conceptual complexity. As the order of the interaction increases, the number of variables increases. Results showed a significant decline in accuracy and speed of solution from three-way to four-way interactions. Furthermore, performance on a five-way interaction was at chance level. These findings suggest that a structure defined on four variables is at the limit of human processing capacity.
TL;DR: By comparing the prediction performance of the CART and negative binomial regression models, this study demonstrates that CART is a good alternative method for analyzing freeway accident frequencies.
TL;DR: In this paper, two new methods for estimating models with nonseparable errors and endogenous regressors were proposed, one estimating the response of the conditional mean of the dependent variable to a change in the explanatory variable while conditioning on an external variable and then undoing the conditioning.
Abstract: We propose two new methods for estimating models with nonseparable errors and endogenous regressors. The first method estimates a local average response. One estimates the response of the conditional mean of the dependent variable to a change in the explanatory variable while conditioning on an external variable and then undoes the conditioning. The second method estimates the nonseparable function and the joint distribution of the observable and unobservable explanatory variables. An external variable is used to impose an equality restriction, at two points of support, on the conditional distribution of the unobservable random term given the regressor and the external variable. Our methods apply to cross sections, but our lead examples involve panel data cases in which the choice of the external variable is guided by the assumption that the distribution of the unobservable variables is exchangeable in the values of the endogenous variable for members of a group.
TL;DR: By comparing the prediction performance between the negative binomial regression model and the artificial neural network, this study demonstrates that ANN is a consistent alternative method for analyzing freeway accident frequency.
TL;DR: A three-way comparison of prediction accuracy involving nonlinear regression, NNs and CART models using a continuous dependent variable and a set of dichotomous and categorical predictor variables is performed.
Abstract: Numerous articles comparing performances of statistical and Neural Networks (NNs) models are available in the literature; however, very few involved Classification and Regression Tree (CART) models in their comparative studies. We perform a three-way comparison of prediction accuracy involving nonlinear regression, NNs and CART models using a continuous dependent variable and a set of dichotomous and categorical predictor variables. A large dataset on smokers is used to run these models. Different prediction accuracy measuring procedures are used to compare performances of these models. The outcomes of predictions are discussed and the outcomes of this research are compared with the results of similar studies.
TL;DR: This paper investigated empirically the determinants of individuals' attitudes towards preventing environmental damage in Spain using data from the World Values Survey and European Values Survey for the periods 1990, 1995 and 1999/2000.
Abstract: This paper investigates empirically the determinants of individuals' attitudes towards preventing environmental damage in Spain using data from the World Values Survey and European Values Survey for the periods 1990, 1995 and 1999/2000. Compared to many previous studies, we present a richer set of independent variables and find that strongly neglected variables such as political interest and social capital have a strong impact on individuals' preferences to prevent environmental damage. An interesting aspect of our study is the ability to investigate environmental preferences over time. The results show strong differences over time. Finally, using disaggregated data for Spanish regions, we also find significant regional differences.
TL;DR: Lagged dependent variable (LDV) and change score (CS) methods for analyzing the effect of events in two-wave panel data are compared, and their performances are compared both with a simulation and a substantive example using the National Survey of Families and Households two-wave panel.
Abstract: Study of the effect of transitions on individual and family outcomes is central to understanding families over the life course. There is little consensus, however, on the appropriate statistical methods needed to study transitions in panel data. This article compares lagged dependent variable (LDV) and change score (CS) methods for analyzing the effect of events in two-wave panel data. The methods are described, and their performances are compared both with a simulation and a substantive example using the National Survey of Families and Households two-wave panel. The results suggest that CS methods have advantages over LDV techniques in estimating the effect of events on outcomes in two-wave panel data.
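The two estimators can be contrasted in a small simulation. This is a minimal sketch, not the article's simulation design: it assumes a stable unobserved individual trait that raises both the outcome and the chance of experiencing the event, and all names (`cov`, `slope_one`, `slope_second_of_two`, `delta`) are hypothetical.

```python
import random

def cov(u, v):
    """Sample covariance (divisor n) of two equal-length lists."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / len(u)

def slope_one(x, y):
    """Slope from an OLS regression of y on x with an intercept."""
    return cov(x, y) / cov(x, x)

def slope_second_of_two(x1, x2, y):
    """Coefficient on x2 from an OLS regression of y on x1 and x2 with an intercept."""
    s11, s22, s12 = cov(x1, x1), cov(x2, x2), cov(x1, x2)
    return (s11 * cov(x2, y) - s12 * cov(x1, y)) / (s11 * s22 - s12 ** 2)

rng = random.Random(0)
delta = 1.0  # assumed true effect of the event on the wave-2 outcome
y1, y2, event = [], [], []
for _ in range(20000):
    trait = rng.gauss(0, 1)                           # stable unobserved trait
    d = 1.0 if trait + rng.gauss(0, 1) > 0 else 0.0   # event is more likely at high trait
    y1.append(trait + rng.gauss(0, 1))                # wave-1 outcome
    y2.append(trait + delta * d + rng.gauss(0, 1))    # wave-2 outcome
    event.append(d)

# Change score: regress (y2 - y1) on the event; the stable trait differences out.
change = [b - a for a, b in zip(y1, y2)]
cs_estimate = slope_one(event, change)

# Lagged dependent variable: regress y2 on y1 and the event.
ldv_estimate = slope_second_of_two(y1, event, y2)

print(round(cs_estimate, 2), round(ldv_estimate, 2))
```

In this setup the CS estimate should land near `delta`, while the LDV estimate is pushed upward because the wave-1 outcome only partly proxies for the trait. Other data-generating assumptions (for example, an event that depends on the wave-1 outcome itself) can reverse the ranking.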
TL;DR: This article is a simple introduction to regression-based methods for dealing with confounding in epidemiology, with the emphasis on showing how they work, their assumptions, and how they compare with other methods.
Abstract: Confounding is a major concern in causal studies because it results in biased estimation of exposure effects. In the extreme, this can mean that a causal effect is suggested where none exists, or that a true effect is hidden. Typically, confounding occurs when there are differences between the exposed and unexposed groups in respect of independent risk factors for the disease of interest, for example, age or smoking habit; these independent factors are called confounders. Confounding can be reduced by matching in the study design but this can be difficult and/or wasteful of resources. Another possible approach—assuming data on the confounder(s) have been gathered—is to apply a statistical “correction” method during analysis. Such methods produce “adjusted” or “corrected” estimates of the effect of exposure; in theory, these estimates are no longer biased by the erstwhile confounders.
Given the importance of confounding in epidemiology, statistical methods said to remove it deserve scrutiny. Many such methods involve strong assumptions about data relationships and their validity may depend on whether these assumptions are justified. Historically, the most common statistical approach for dealing with confounding in epidemiology was based on stratification; the standardised mortality ratio is a well known statistic using this method to remove confounding by age. Increasingly, this approach is being replaced by methods based on regression models. This article is a simple introduction to the latter methods with the emphasis on showing how they work, their assumptions, and how they compare with other methods.
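The stratification idea can be sketched with a Mantel-Haenszel pooled odds ratio. The counts below are invented for illustration and do not come from the article; they are chosen so that exposure has no effect within either age stratum, yet the crude table suggests one because older people are both more often exposed and at higher risk.

```python
# Each stratum: (exposed cases, exposed non-cases, unexposed cases, unexposed non-cases).
# Hypothetical counts, not data from the article.
strata = {
    "young": (10, 90, 40, 360),
    "old": (120, 280, 30, 70),
}

def odds_ratio(a, b, c, d):
    """Odds ratio of a single 2x2 table."""
    return (a * d) / (b * c)

def crude_odds_ratio(tables):
    """Collapse all strata into one table, ignoring the confounder."""
    a, b, c, d = (sum(column) for column in zip(*tables.values()))
    return odds_ratio(a, b, c, d)

def mantel_haenszel_odds_ratio(tables):
    """Pooled odds ratio: sum(a*d/n) / sum(b*c/n) over strata."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in tables.values())
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in tables.values())
    return numerator / denominator

crude = crude_odds_ratio(strata)
adjusted = mantel_haenszel_odds_ratio(strata)
print(round(crude, 2), round(adjusted, 2))
```

With these counts the crude odds ratio is about 2.16, while the age-adjusted estimate is 1, matching the odds ratio within each stratum.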
Before applying a statistical correction method, one has to decide which factors are confounders. This sometimes complex issue [1–4] is not discussed in detail and for the most part the examples will assume that age is a confounder. However, the use of automated statistical procedures for choosing variables to include in a regression model …
TL;DR: In this article, the influence of sales promotion on consumer brand choice behavior is investigated. Although establishments may reach their objective in the short term, undesirable consumer actions appear when the longer term is considered.
Abstract: Purpose – This study evidences the influence that sales promotion has on brand choice behaviour. Establishments wish to influence consumers' buying behaviour, and thus they launch strong promotional campaigns or introduce changes in their price policies, among other actions. However, they are not always capable of achieving their goal, since, although they may reach their objective in the short term, when the longer term is considered there are undesirable consumer actions. Design/methodology/approach – The problem of consumer brand choice can be adequately described with logit models that allow the use of discrete dependent variables. The probability that the consumer chooses a brand depends directly on the capacity of satisfaction that the brand holds for him/her. In this case, the dependent variable is the brand, and the independent variables are price, reference price, losses and gains, and the different types or techniques of sales promotion. With the aim of obtaining the necessary information for the...
TL;DR: In this paper, the relationships between chlorophyll-a (CHL-a) and 16 chemical, physical, and biological water quality variables in Camlidere reservoir (Ankara, Turkey) were studied by using principal component scores (PCS) in multiple linear regression analysis (MLR) to predict CHL-a levels.
TL;DR: In this article, the estimation of a fixed effects dynamic panel data model extended to include either spatial error autocorrelation or a spatially lagged dependent variable is discussed, and two leading cases are considered: the Bhargava and Sargan approximation and the Nerlove and Balestra approximation.
Abstract: This article hammers out the estimation of a fixed effects dynamic panel data model extended to include either spatial error autocorrelation or a spatially lagged dependent variable. To overcome the inconsistencies associated with the traditional least-squares dummy estimator, the models are first-differenced to eliminate the fixed effects and then the unconditional likelihood function is derived taking into account the density function of the first-differenced observations on each spatial unit. When exogenous variables are omitted, the exact likelihood function is found to exist. When exogenous variables are included, the pre-sample values of these variables and thus the likelihood function must be approximated. Two leading cases are considered: the Bhargava and Sargan approximation and the Nerlove and Balestra approximation. As an application, a dynamic demand model for cigarettes is estimated based on panel data from 46 U.S. states over the period from 1963 to 1992.
TL;DR: A general method for obtaining moment inequalities for functions of independent random variables is presented in this article, which is based on a generalized tensorization inequality due to Latala and Oleszkiewicz.
Abstract: A general method for obtaining moment inequalities for functions of independent random variables is presented. It is a generalization of the entropy method which has been used to derive concentration inequalities for such functions [Boucheron, Lugosi and Massart, Ann. Probab. 31 (2003) 1583-1614], and is based on a generalized tensorization inequality due to Latala and Oleszkiewicz [Lecture Notes in Math. 1745 (2000) 147-168]. The new inequalities prove to be a versatile tool in a wide range of applications. We illustrate the power of the method by showing how it can be used to effortlessly re-derive classical inequalities including Rosenthal and Kahane-Khinchine-type inequalities for sums of independent random variables, moment inequalities for suprema of empirical processes and moment inequalities for Rademacher chaos and U-statistics. Some of these corollaries are apparently new. In particular, we generalize Talagrand's exponential inequality for Rademacher chaos of order 2 to any order. We also discuss applications for other complex functions of independent random variables, such as suprema of Boolean polynomials which include, as special cases, subgraph counting problems in random graphs.
TL;DR: In this article, a disaggregated import demand model for Fiji using relative prices, total consumption, investment expenditure and export expenditure variables for the period 1970 to 2000 is presented, where the recently developed bounds testing approach to test for a long run relationship is used, while the autoregressive distributed lag model is used to estimate short run and long run elasticities.
Abstract: Purpose – This paper aims to estimate a disaggregated import demand model for Fiji using relative prices, total consumption, investment expenditure and export expenditure variables for the period 1970 to 2000. Design/methodology/approach – The recently developed bounds testing approach to cointegration to test for a long run relationship is used, while the autoregressive distributed lag model is used to estimate short run and long run elasticities. These methodologies are shown to perform well in small sample sizes, particularly given that the bounds F‐test critical values for small sample sizes generated by Narayan in 2004 and 2005 are used. Findings – Amongst the key results it is found: a long run cointegration relationship among the variables when import demand is the dependent variable; and import demand to be inelastic and statistically significant at the 1 per cent level with respect to all the explanatory variables in both the long‐run and the short‐run. Originality/value – The disaggregated import d...
TL;DR: In this article, the problem of parameter inference in (possibly non-linear and non-smooth) econometric models when the data are measured with error is studied, where the auxiliary data may be a validation sample or, more importantly, a stratified sample that is not from the same distribution as the primary data.
Abstract: We study the problem of parameter inference in (possibly non-linear and non-smooth) econometric models when the data are measured with error. We allow for arbitrary correlation between the true variables and the measurement errors. To solve the identification problem, we require the existence of an auxiliary data-set that contains information about the conditional distribution of the true variables given the mismeasured variables. Our main assumption requires that the conditional distribution of the true variables given the mismeasured variables is the same in the primary and auxiliary data. Our methods allow the auxiliary data to be a validation sample, where the primary and validation data are from the same distribution, and more importantly, a stratified sample where the auxiliary data-set is not from the same distribution as the primary data. We also show how to combine the two data-sets to obtain a more efficient estimator of the parameter of interest. We establish the large sample properties of the sieve based estimators under verifiable conditions. In particular, we allow for the mismeasured variables to have unbounded supports without employing the tedious trimming scheme typically used in kernel based methods. We illustrate our methods by estimating a returns to schooling censored quantile regression using the CPS/SSR 1978 exact match files where the dependent variable is measured with error of arbitrary kind.
TL;DR: Sequel, actor, budget, genre (drama), Motion Picture Association of America rating, release periods, and number of first-week screens were significantly related to total box office performance.
Abstract: This study attempts to devise a new theoretical framework to classify and develop predictors of box office performance for theatrical movies. Three dependent variables including total box office, first-week box office, and length of run were adopted. Four categories of independent variables were employed: brand-related variables, objective features, information sources, and distribution-related variables. Sequel, actor, budget, genre (drama), Motion Picture Association of America rating (PG and R), release periods (Summer and Easter), and number of first-week screens were significantly related to total box office performance.
TL;DR: In this article, the authors consider spatial lags of certain independent variables, as well as of the dependent variable, and consider spatial correlation of the error terms, general patterns of heteroscedasticity and of time series autocorrelation, and systems problems.
Abstract: In recent years researchers have considered a variety of regional models relating to infrastructure productivity. These models are often based upon overly simple econometric specifications and are typically formulated as if spatial interactions are absent. In this paper, we try to account for some of these shortcomings. We do this by considering spatial lags of certain independent variables, as well as of the dependent variable. We also consider spatial correlation of the error terms, general patterns of heteroscedasticity and of time series autocorrelation, and systems problems. Our results strongly suggest that regional infrastructure productivity involves spatial spillovers relating to both observable variables and error terms. They also suggest that corresponding coefficient estimates are very sensitive to model specifications.
Abstract: In this paper, an empirical model of intrametropolitan population and employment growth is developed and estimated. The empirical model is derived from equilibrium relationships that describe intrametropolitan population and employment levels. The equilibrium relationships are related to the dynamic process of population and employment growth through the use of a lagged adjustment model. The empirical implementation of the model uses data on 365 municipalities in northern New Jersey. The use of observations that are geographically proximate necessitates that the spatial structure of the model be handled explicitly. A simultaneous system with spatial autocorrelation in the dependent variables is estimated. The results generally agree well with theory.
TL;DR: This article describes two analytical strategies—regression and stratification—that can be used to assess and reduce confounding in cohort studies and concludes that neither can eliminate bias related to unmeasured or unknown confounders.
Abstract: Analytical strategies can help deal with potential confounding but readers need to know which strategy is appropriate
The previous articles in this series [1, 2] argued that cohort studies are exposed to selection bias and confounding, and that critical appraisal requires a careful assessment of the study design and the identification of potential confounders. This article describes two analytical strategies, regression and stratification, that can be used to assess and reduce confounding. Some cohort studies match individual participants in the intervention and comparison groups on the basis of confounders, but because matching may be viewed as a special case of stratification we have not discussed it specifically and details are available elsewhere [3, 4]. Neither of these techniques can eliminate bias related to unmeasured or unknown confounders. Furthermore, both have their own assumptions, advantages, and limitations.
Regression uses the data to estimate how confounders are related to the outcome and produces an adjusted estimate of the intervention effect. It is the most commonly used method for reducing confounding in cohort studies. The outcome of interest is the dependent variable, and the measures of baseline characteristics (such as age and sex) and the intervention are independent variables. The choice of method of regression analysis (linear, logistic, proportional hazards, etc) is dictated by the type of dependent variable. For example, if the outcome is binary (such as occurrence of hip fracture), a logistic regression model would be appropriate; in contrast, if the outcome is time to an event (such as time to hip fracture) a proportional hazards model is appropriate.
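For a binary outcome, the adjustment described above can be sketched with a logistic regression fitted by Newton-Raphson. The grouped counts are invented for illustration (they are not from the article), and the helper names `solve` and `fit_logistic` are hypothetical; a real analysis would normally use a statistics package.

```python
import math

def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def fit_logistic(rows, cases, totals, iterations=25):
    """Maximum-likelihood logistic regression on grouped binomial data."""
    k = len(rows[0])
    beta = [0.0] * k
    for _ in range(iterations):
        grad = [0.0] * k
        info = [[0.0] * k for _ in range(k)]
        for x, y, n in zip(rows, cases, totals):
            p = 1.0 / (1.0 + math.exp(-sum(b * v for b, v in zip(beta, x))))
            for i in range(k):
                grad[i] += (y - n * p) * x[i]                    # score vector
                for j in range(k):
                    info[i][j] += n * p * (1 - p) * x[i] * x[j]  # information matrix
        beta = [b + s for b, s in zip(beta, solve(info, grad))]
    return beta

# Hypothetical groups: (intervention, older age group, events, participants).
groups = [(1, 0, 10, 100), (0, 0, 40, 400), (1, 1, 120, 400), (0, 1, 30, 100)]
cases = [g[2] for g in groups]
totals = [g[3] for g in groups]

# Crude model: intercept + intervention.
crude = fit_logistic([[1.0, g[0]] for g in groups], cases, totals)
# Adjusted model: intercept + intervention + age group.
adjusted = fit_logistic([[1.0, g[0], g[1]] for g in groups], cases, totals)

crude_or = math.exp(crude[1])
adjusted_or = math.exp(adjusted[1])
print(round(crude_or, 2), round(adjusted_or, 2))
```

In these made-up data the crude odds ratio is about 2.16, but it falls to 1 once age group enters the model, because age is related to both the intervention and the outcome. For a time-to-event outcome the same design would call for a proportional hazards model instead, which this sketch does not cover.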
Regression analyses estimate the association of each independent variable with the dependent variable after adjusting for the effects of all the other variables. Because the estimated association between the intervention and outcome variables adjusts …
TL;DR: In this paper, classification tree analysis is evaluated as a predictive soil mapping technique for developing a preliminary soil map for a neighboring site from samples extracted from an existing soil map, which can help guide future soil mapping in a nearby area.
TL;DR: All of the quantitative species distribution models were good predictors of the validation data set, but the spatial distribution of mapped habitats differed considerably among models, suggesting that choice of model and variable set could influence the identification of areas for conservation emphasis.
Abstract: The widespread use of spatial planning tools in conjunction with increases in the availability of geographic information systems and associated data has led to the rapid growth in the exploration and application of species distribution models. Conservation professionals can choose from a considerable number of modelling techniques, but there has been relatively little evaluation of predictive performance, data requirements, or type of inference of these models. Empirical data for woodland caribou Rangifer tarandus caribou was used to examine four species distribution models, namely a qualitative habitat suitability index and quantitative resource selection function, Mahalanobis distance and ecological niche models. Models for three sets of independent variables were developed and then a temporally independent set of caribou locations evaluated predictive performance. The similarity of species distribution maps among the four modelling approaches was also quantified. All of the quantitative species distribution models were good predictors of the validation data set, but the spatial distribution of mapped habitats differed considerably among models. These results suggest that choice of model and variable set could influence the identification of areas for conservation emphasis. Model choice may be limited by the type of species locations or desired inference. Conservation professionals should choose a model and variable set based on the question, the ecology of the species and the availability of requisite data.
TL;DR: Researchers should consider using decision-making as a contextual backdrop for exploring information use and behaviour, avoid relying solely on self-reported behaviour as data, and use a variety of research methods to provide a richer picture of information-related behaviour.
Abstract: Aim: This paper reports a study examining the barriers associated with research knowledge transfer amongst primary care nurses in the context of clinical decision-making.
Background: The research literature on barriers to nurses' use of research knowledge is characterized by studies that rely primarily on self-report data, making them prone to reporting biases. Studies of the barriers to evidence-based practice often fail to examine information use and behaviour in the context of clinical decision-making.
Methods: A multi-site, mixed method, case study was carried out in 2001. Data were collected in three primary care organizations by means of interviews with 82 primary care nurses, 270 hours of non-participant observation and 122 Q-sorts. Nurses were selected using a published theoretical sampling frame. Between-methods triangulation was employed and data analysed according to the principles of constant comparison. Multiple linear regression was used to explore relationships between a number of independent demographic variables (such as length of clinical experience) and the dependent variable of nurses' perspectives on the barriers to their use of research knowledge.
Results: Three perspectives on barriers to research information use emerged: the need to bridge the skills and knowledge gap for successful knowledge transfer; information formats need to maximize limited opportunities for consumption; and limited access in the context of limited time for decision-making and information consumption. Demographic variables largely failed to predict allegiance to any of the perspectives identified.
Conclusions: Researchers should consider using decision-making as a contextual backdrop for exploring information use and behaviour, avoid relying solely on self-reported behaviour as data, and use a variety of research methods to provide a richer picture of information-related behaviour. Practice developers need to recognize that understanding the decisions to which research knowledge is to be applied should be a characteristic of any strategy to increase research uptake by nurses.
TL;DR: A variety of different models will be identified to describe physiological and anthropometric variables known to vary with body size and other confounding variables, including simple ratio standards, linear and additive polynomial models, and proportional allometric or power function models.
Abstract: This review explores the most appropriate methods of identifying population differences in physiological and anthropometric variables known to differ with body size and other confounding variables. We shall provide an overview of such problems from a historical point of view. We shall then give some guidelines as to the choice of body-size covariates as well as other confounding variables, and show how these might be incorporated into the model, depending on the physiological dependent variable and the nature of the population being studied. We shall also recommend appropriate goodness-of-fit statistics that will enable researchers to confirm the most appropriate choice of model, including, for example, how to compare proportional allometric models with the equivalent linear or additive polynomial models. We shall also discuss alternative body-size scaling variables (height, fat-free mass, body surface area, and projected area of skeletal bone), and whether empirical vs. theoretical scaling methodologies should be reported. We shall offer some cautionary advice (limitations) when interpreting the parameters obtained when fitting proportional power function or allometric models, due to the fact that human physiques are not geometrically similar to each other. In conclusion, a variety of different models will be identified to describe physiological and anthropometric variables known to vary with body size and other confounding variables. These include simple ratio standards (e.g., per body mass ratios), linear and additive polynomial models, and proportional allometric or power function models. Proportional allometric models are shown to be superior to either simple ratio standards or linear and additive polynomial models for a variety of different reasons. 
These include: 1) providing biologically interpretable models that yield sensible estimates within and beyond the range of data; and 2) providing a superior fit based on the Akaike information criterion (AIC), Bayes information criterion (BIC), or maximum log-likelihood criteria (resulting in a smaller error variance). As such, these models will also: 3) naturally lead to a more powerful analysis-of-covariance test of significance, which will 4) subsequently lead to more correct conclusions when investigating population (epidemiological) or experimental differences in physiological and anthropometric variables known to vary with body size.
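The contrast between a ratio standard and a proportional allometric model can be sketched by fitting y = a·m^b as a straight line on the log-log scale. The body masses, the exponent 0.67 and the scale 2.0 below are assumed values chosen for illustration, not figures from the review, and the data are noise-free so that the fit is exact.

```python
import math

def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return my - slope * mx, slope

# Hypothetical body masses (kg) and an outcome that follows y = 2.0 * m**0.67.
masses = [50.0, 60.0, 70.0, 80.0, 90.0, 100.0]
outcome = [2.0 * m ** 0.67 for m in masses]

# Allometric model: log(y) = log(a) + b * log(m), fitted by linear regression.
intercept, exponent = fit_line([math.log(m) for m in masses],
                               [math.log(v) for v in outcome])
scale = math.exp(intercept)

# Ratio standard (y per kg) versus the power-function ratio (y per kg**b).
ratio_standard = [v / m for v, m in zip(outcome, masses)]
allometric_ratio = [v / m ** exponent for v, m in zip(outcome, masses)]

print(round(scale, 3), round(exponent, 3))
print([round(r, 3) for r in ratio_standard])
print([round(r, 3) for r in allometric_ratio])
```

When the true exponent is below 1, the per-kilogram ratio falls as mass rises, so heavier individuals look worse on the ratio standard even though they follow the same relationship; the power-function ratio is constant across masses. The log-log fit implicitly assumes multiplicative error, which is a modelling choice rather than a given.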
TL;DR: In this article, three axioms provide a formal definition of relative importance in a statistical or econometric model by identifying the likelihood that any ordering of independent variables is correctly ordered with respect to their relative importance.
Abstract: Three axioms provide a formal definition of relative importance in a statistical or econometric model by identifying the likelihood that any ordering of independent variables is correctly ordered with respect to their relative importance. The expected contribution to model performance of independent variables with respect to this distribution is the proportional marginal decomposition of model performance with respect to the performance measure. Decomposition components are shown to be equal to the proportional value (Ortmann (2000), Feldman (1999, 2002)) of an appropriately constructed cooperative game. Also addressed are admissibility criteria for measures of relative importance, other measures of relative importance, the entropy of cooperative games and extensions and limitations.
TL;DR: The R² statistic, when used in a regression or ANOVA context, is appealing because it summarizes how well the model explains the data in an easy-to-understand way.
Abstract: The R² statistic, when used in a regression or ANOVA context, is appealing because it summarizes how well the model explains the data in an easy-to-understand way. R² statistics are also useful to gauge the effect of changing a model. Generalizing R² to mixed models is not obvious when there are correlated errors, as might occur if data are georeferenced or result from a designed experiment with blocking. Such an R² statistic might refer only to the explanation associated with the independent variables, or might capture the explanatory power of the whole model. In the latter case, one might develop an R² statistic from Wald or likelihood ratio statistics, but these can yield different numeric results. Example formulas for these generalizations of R² are given. Two simulated data sets, one based on a randomized complete block design and the other with spatially correlated observations, demonstrate increases in R² as model complexity increases, the result of modeling the covariance structure of the residuals.
TL;DR: While variable clustering gave similar predictive performance to PCA, the habitat factors it generated were more readily interpreted than conventional principal components; given cost constraints and the need for dissemination, the results offer an important potential advance.
Abstract: Summary
1. Two priorities for applied ecologists are to (i) maintain quantitative rigour with minimal resources and (ii) ensure that multivariate results are readily understood by end users. Habitat descriptions and other complex data present particular challenges.
2. Principal components analysis (PCA) is often used to reduce data and stabilize subsequent statistical analyses. Interpretation can be difficult, however, and PCA is optimized for quantitative (cf. categorical) data. Moreover, future applications (e.g. in predicting species’ distributions) require the recording of all contributing variables irrespective of cost or importance.
3. We considered the potential benefits of two PCA variants. First, we considered whether a cluster analysis on the correlation matrix of independent variables (i.e. variable clustering), followed by a PCA within each cluster, produced a more easily interpreted output than conventional PCA, while simultaneously reducing costs. Secondly, we considered whether a generalized PCA capable of analysing qualitative data could out-perform conventional PCA when ecological data include ordinal variables. As a case study, we used data from river habitat survey (RHS), a key applied tool in river ecology that uses more than 100 variables to describe river structure and relies heavily on three-point ordinal scales. In distribution models that linked river birds to RHS, we compared the interpretability and efficiency of variable clustering and generalized PCA against conventional PCA.
4. While variable clustering gave similar predictive performance to PCA, habitat factors generated by the former were more readily interpreted than conventional principal components. Of the two cluster-scoring methods, optimally scaled PCA explained 24% more variance in the first principal component and marginally improved the accuracy of distribution models.
5. Synthesis and applications. Initial variable clustering makes PCA more interpretable and will benefit the understanding of research results and their translation into management. Variable clustering should also reduce costs as variables contributing to unused clusters need not be recorded in future (cf. PCA). Optimal scaling further increases the versatility of PCA: qualitative ecological data (e.g. habitat categories) can be analysed in the same way as quantitative data, with real benefits to applied research. With cost constraints and the need for dissemination key applied issues, our results offer an important potential advance.
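The two-step idea described in this abstract (cluster the variables on their correlation matrix, then run a PCA within each cluster) can be sketched roughly as follows. This is an illustrative sketch only, not the authors' procedure: the clustering rule (single linkage cut at an assumed |r| threshold of 0.7), the power-iteration PCA, the variable names and the data are all hypothetical, and optimal scaling of ordinal variables is not attempted.

```python
import math


def correlation(a, b):
    """Pearson correlation between two equal-length lists."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def cluster_variables(columns, threshold=0.7):
    """Single-linkage grouping: a variable joins (and merges) every
    cluster containing a member with |r| >= threshold (assumed cut-off)."""
    clusters = []
    for name in columns:
        linked = [c for c in clusters
                  if any(abs(correlation(columns[name], columns[m])) >= threshold
                         for m in c)]
        merged = [name]
        for c in linked:
            merged.extend(c)
            clusters.remove(c)
        clusters.append(sorted(merged))
    return clusters


def first_component(columns, names, iterations=200):
    """Loadings of the first principal component of the correlation
    matrix of the named variables, found by power iteration."""
    corr = [[correlation(columns[a], columns[b]) for b in names] for a in names]
    # Asymmetric start vector, so the iteration is unlikely to begin
    # exactly orthogonal to the leading eigenvector.
    v = [1.0 / (i + 1) for i in range(len(names))]
    for _ in range(iterations):
        w = [sum(row[j] * v[j] for j in range(len(v))) for row in corr]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return dict(zip(names, v))


# Hypothetical habitat measurements at six sites (invented values).
data = {
    "depth": [1, 2, 3, 4, 5, 6],
    "width": [2, 4, 5, 8, 10, 13],
    "shade": [5, 1, 4, 2, 6, 3],
    "trees": [10, 3, 9, 4, 13, 5],
}

clusters = cluster_variables(data)
loadings = first_component(data, clusters[0])
print(clusters)
print(loadings)
```

Each cluster yields its own component, so a factor can be named after the variables it contains, and variables in clusters that turn out to be unused need not be recorded in future surveys.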
TL;DR: Low correlations among e-government measures, and low to moderate consistency in the relationships between established independent variables and those measures, indicate a significant measurement validity problem; the authors recommend considering a different approach.
Abstract: Global e-government is being tracked using a variety of different measures, none of which have been systematically validated. Little research compares and contrasts these measures and little work has sought to frame and identify potential independent drivers of e-government at the national level. This paper systematically compares and contrasts the dependent variables and the relationships between the e-government variables and independent drivers. Using data from a variety of institutional sources, we find low correlations among e-government measures and low to moderate consistency in the relationships between established independent variables and e-government measures. Findings indicate a significant measurement validity problem and conclusions recommend consideration of a different approach.
TL;DR: In this paper, the macroeconomic determinants of banking sector distress in the Nordic countries, Belgium, Germany, Greece, Spain and the UK are analysed using an econometric model estimated on panel data extending, in part, from the early 1980s to 2002.
Abstract: The macroeconomic determinants of banking sector distress in the Nordic countries, Belgium, Germany, Greece, Spain and the UK are analysed using an econometric model estimated on panel data extending, in part, from the early 1980s to 2002. The dependent variable is the ratio of banks' loan losses to lending. In addition to the lagged dependent variable, the explanatory variables include surprise changes in incomes and in real interest rates, each entering as a separate cross-product term with lagged aggregate indebtedness. The underlying macroeconomic account that this paper puts forward is that loan losses are basically generated by strong adverse aggregate shocks when banks are highly exposed to such shocks. The underlying innovations to income and real interest rates are constructed using published macroeconomic forecasts for these variables. According to the results, high customer indebtedness combined with adverse macroeconomic surprise shocks to income and real interest rates contributed to the distress in the banking sector. Loan losses also display strong autoregressive behaviour, which might indicate a feedback effect from loan losses back to the macroeconomic level in deep recessions. The results can be used in macro stress-testing the banking sector.