TL;DR: Least squares is used to estimate a team rating and a home ground advantage for each team in English football. For a balanced competition the fit is shown to be equivalent to a simple calculator method using only data from the final ladder, and a paired home advantage is defined and shown to be linearly related to the distance between club grounds.
Abstract: Least squares is used to fit a model to the individual match results in English football and to produce a home ground advantage effect for each team in addition to a team rating. We show that for a balanced competition this is equivalent to a simple calculator method using only data from the final ladder. The existence of a spurious home advantage is discussed. Home advantages for all teams in the English Football League from 1981-82 to 1990-91 are calculated, and some reasons for their differences investigated. A paired home advantage is defined and shown to be linearly related to the distance between club grounds.
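The calculator method mentioned in the abstract above can be sketched as follows. This is a minimal illustration, not the paper's own derivation: it assumes an additive model in which the expected winning margin of home team i over away team j is u_i - u_j + h_i (ratings u summing to zero, h_i the home advantage of team i), and a balanced competition in which every pair meets once at each ground. Under those assumptions the least squares normal equations reduce to sums that can be read off a final ladder. The function name and input layout are hypothetical.

```python
def home_advantages(home_gd, away_gd):
    """Estimate per-team home advantage and rating from final-ladder totals.

    home_gd[i]: goal difference of team i summed over its home games.
    away_gd[i]: goal difference of team i summed over its away games.
    Assumes a balanced double round robin and the additive model
    margin(i at home vs j) = u_i - u_j + h_i + error, with sum(u) = 0.
    """
    n = len(home_gd)
    if n < 3 or len(away_gd) != n:
        raise ValueError("need matching lists for at least 3 teams")
    # Summing the home margins over all teams gives (n - 1) * H,
    # where H is the sum of the individual home advantages.
    total_h = sum(home_gd) / (n - 1)
    # Normal equations: home_gd[i] - away_gd[i] = (n - 2) * h_i + H.
    h = [(hg - ag - total_h) / (n - 2) for hg, ag in zip(home_gd, away_gd)]
    # Ratings then follow from home_gd[i] = n * u_i + (n - 1) * h_i.
    u = [(hg - (n - 1) * hi) / n for hg, hi in zip(home_gd, h)]
    return h, u
```

With noise-free margins generated from known ratings and home advantages, the function returns those values exactly, which is a convenient check that the ladder totals carry all the information the least squares fit needs under this model.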
TL;DR: This paper provides an introduction to the practical aspects of function optimization using this approach to simulated annealing, and uses two examples to illustrate the behaviour of the algorithm in low dimensions.
Abstract: Much work has been published on the theoretical aspects of simulated annealing. This paper provides a brief overview of this theory and provides an introduction to the practical aspects of function optimization using this approach. Different implementations of the general simulated annealing algorithm are discussed, and two examples are used to illustrate the behaviour of the algorithm in low dimensions. A third example illustrates a hybrid approach, combining simulated annealing with traditional techniques.
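A minimal sketch of the general simulated annealing algorithm for a one-dimensional function is given below. It is only one of many possible implementations: the uniform proposal, the geometric cooling schedule, and all parameter names and default values are assumptions chosen for illustration, not choices taken from the paper.

```python
import math
import random


def simulated_annealing(f, x0, step=0.5, t0=1.0, cooling=0.995,
                        n_iter=5000, seed=0):
    """Minimise f starting from x0; return (best_x, best_value)."""
    rng = random.Random(seed)
    x, fx = x0, f(x0)
    best_x, best_fx = x, fx
    t = t0
    for _ in range(n_iter):
        # Propose a random move in a neighbourhood of the current point.
        cand = x + rng.uniform(-step, step)
        fc = f(cand)
        # Metropolis rule: always accept improvements, and accept a
        # worse point with probability exp(-increase / temperature).
        if fc <= fx or rng.random() < math.exp(-(fc - fx) / t):
            x, fx = cand, fc
            if fx < best_fx:
                best_x, best_fx = x, fx
        # Geometric cooling (an assumed schedule).
        t *= cooling
    return best_x, best_fx
```

For example, `simulated_annealing(lambda x: (x - 2) ** 2, x0=-5.0)` moves from the starting point towards the minimum at x = 2. A hybrid approach of the kind the abstract mentions would hand `best_x` to a conventional local optimiser for final refinement.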
TL;DR: This third edition of Modelling Survival Data in Medical Research contains new chapters on frailty models and their applications, competing risks, non-proportional hazards, and dependent censoring.
Abstract: Modelling Survival Data in Medical Research, 3rd Edition (Dec 04, 2014). Modelling Survival Data in Medical Research describes the modelling approach to the analysis of survival data using a wide range of examples from biomedical research. Well known for its nontechnical style, this third edition contains new chapters on frailty models and their applications, competing risks, non-proportional hazards, and dependent censoring. It also describes techniques for
TL;DR: In this article, three different Bayesian approaches to sample size calculations based on highest posterior density (HPD) intervals are discussed and illustrated in the context of a binomial experiment.
Abstract: Three different Bayesian approaches to sample size calculations based on highest posterior density (HPD) intervals are discussed and illustrated in the context of a binomial experiment. The preposterior marginal distribution of the data is used to find the sample size needed to attain an expected HPD coverage probability for a given fixed interval length. Alternatively, one can find the sample size required to attain an expected HPD interval length for a fixed coverage. These two criteria can lead to different sample size requirements. In addition to averaging, a worst possible outcome scenario is also considered. The results presented here provide an exact solution to a problem recently addressed in the literature.
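The first criterion in the abstract above (expected HPD coverage for a fixed interval length) can be sketched numerically. The code below is a grid approximation under stated assumptions, not the exact solution the paper provides: it assumes a Beta(a, b) prior with a unimodal posterior, so that the HPD interval of a given length is the interval of that length with the largest posterior probability. All function names, the grid size, and the search limit are hypothetical.

```python
import math


def beta_binomial_pmf(x, n, a, b):
    """Preposterior (marginal) probability of x successes in n trials."""
    log_p = (math.lgamma(n + 1) - math.lgamma(x + 1) - math.lgamma(n - x + 1)
             + math.lgamma(a + x) + math.lgamma(b + n - x)
             - math.lgamma(a + b + n)
             + math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b))
    return math.exp(log_p)


def best_window_coverage(a, b, length, grid=2000):
    """Largest Beta(a, b) probability of any interval of the given length."""
    mids = [(k + 0.5) / grid for k in range(grid)]
    logw = [(a - 1) * math.log(t) + (b - 1) * math.log(1 - t) for t in mids]
    peak = max(logw)
    w = [math.exp(v - peak) for v in logw]   # unnormalised density on the grid
    total = sum(w)
    width = max(1, round(length * grid))     # window size in grid cells
    window = sum(w[:width])
    best = window
    for k in range(width, grid):             # slide the window across (0, 1)
        window += w[k] - w[k - width]
        best = max(best, window)
    return min(1.0, best / total)


def average_coverage(n, a, b, length, grid=2000):
    """Coverage of the fixed-length interval averaged over possible data."""
    return sum(beta_binomial_pmf(x, n, a, b)
               * best_window_coverage(a + x, b + n - x, length, grid)
               for x in range(n + 1))


def smallest_n(a, b, length, target, n_max=200, grid=2000):
    """Smallest n whose average coverage reaches the target, else None."""
    for n in range(n_max + 1):
        if average_coverage(n, a, b, length, grid) >= target:
            return n
    return None
```

Calling `smallest_n(1, 1, length=0.4, target=0.9)` searches upward from n = 0 under a uniform prior. The alternative criterion (expected length for fixed coverage) and the worst-outcome version would replace the average in `average_coverage` with a different summary over x.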
TL;DR: The authors provided an empirical analysis of the distribution of daily returns to the three London Stock Exchange indices, the FT-SE 100, Mid 250 and the 350, over the period 1986-92.
Abstract: This paper provides an empirical examination of the distribution of daily returns to the three London Stock Exchange indices, the FT-SE 100, Mid 250 and the 350, over the period 1986-92. Empirical densities are fitted to each of the return distributions before their shapes are explored by using Tukey's g- and h-distributions. The returns are characterized by highly non-Gaussian behaviour, being both skewed and extremely kurtotic, although the shapes of the distributions depend on whether data from 1986 and 1987 are included or not. Some implications for portfolio analysis are drawn.
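For readers unfamiliar with Tukey's g- and h-distributions, the sketch below shows the standard transform of a standard normal variate, in which g controls skewness and h controls tail heaviness. Location and scale parameters are omitted, and the helper names and the parameter values in the usage note are illustrative assumptions, not estimates from the paper.

```python
import math
import random


def g_and_h(z, g=0.0, h=0.0):
    """Tukey g-and-h transform of a standard normal value z."""
    # g = 0 is the limiting case in which the skewing factor reduces to z.
    core = z if g == 0 else (math.exp(g * z) - 1.0) / g
    # h > 0 stretches the tails symmetrically.
    return core * math.exp(h * z * z / 2.0)


def skewness_and_kurtosis(xs):
    """Moment-based sample skewness and (non-excess) kurtosis."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m3 = sum((x - mean) ** 3 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m3 / m2 ** 1.5, m4 / m2 ** 2
```

Drawing, say, 20000 values of `g_and_h(rng.gauss(0, 1), g=0.5, h=0.05)` and passing them to `skewness_and_kurtosis` gives a right-skewed, heavy-tailed sample, the kind of shape the abstract reports for the index returns.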
TL;DR: In this paper, the authors used a threshold type non-linear time series model with conditional heteroscedastic variance to study the possible asymmetric behavior of stock prices during bear and bull markets.
Abstract: Possible asymmetric behaviour of stock prices during bear and bull markets is studied by using a threshold type non-linear time series model with conditional heteroscedastic variance. Using Hong Kong data it is demonstrated that the return series could have a conditional mean structure which depends on the rise and fall of the market on a previous day. The findings also shed some light on why it could be difficult to reject the efficient market hypothesis. The threshold model with conditional changing variance is also of interest in other financial applications.
TL;DR: The authors investigated the distribution of aggregate attendances between member clubs of the Football League and found that club-specific base levels of support depend on the market size and composition, and the club's age.
Abstract: Using a new data set to investigate the distribution of aggregate attendances between member clubs of the Football League, it is found that club-specific base levels of support depend on the market size and composition, and the club's age. The sensitivity of attendance to success and price is found to be greater for clubs from towns with high proportions of manual workers, whereas a loyalty effect, measured through the estimation of a set of dynamic club-specific attendance equations, is found to be similar across clubs of widely differing sizes and other characteristics.
TL;DR: In this paper, various kinds of sporting tournament are simulated to assess their relative ability to produce the best entrant as winner. The seeded draw and process is a strong contender when relatively few games can be played; when the players are closely matched and more games are possible, the round robin played twice is most effective.
Abstract: Simulations of various kinds of sporting tournament have been carried out to assess their relative ability to produce as winner the best of the entrants. A strong contender when it is necessary to play relatively few games is the seeded draw and process. When the players are closely matched and there can be more games the round robin played twice is most effective.
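A simulation of this kind can be sketched as follows. The sketch makes several assumptions that are not from the paper: a logistic model for the probability that one player beats another, a randomly drawn (unseeded) knockout rather than the seeded draw and process, a field size that is a power of two, and ties in the round robin broken at random. All names are hypothetical.

```python
import math
import random


def win_prob(s_a, s_b):
    """Assumed logistic probability that a player of strength s_a beats s_b."""
    return 1.0 / (1.0 + math.exp(s_b - s_a))


def knockout(strengths, rng):
    """Single-elimination tournament with a random draw; returns the winner."""
    players = list(range(len(strengths)))   # length must be a power of two
    rng.shuffle(players)
    while len(players) > 1:
        players = [a if rng.random() < win_prob(strengths[a], strengths[b]) else b
                   for a, b in zip(players[::2], players[1::2])]
    return players[0]


def round_robin(strengths, rng, rounds=2):
    """Everyone plays everyone `rounds` times; most wins takes the title."""
    n = len(strengths)
    wins = [0] * n
    for _ in range(rounds):
        for a in range(n):
            for b in range(a + 1, n):
                if rng.random() < win_prob(strengths[a], strengths[b]):
                    wins[a] += 1
                else:
                    wins[b] += 1
    top = max(wins)
    return rng.choice([i for i, w in enumerate(wins) if w == top])


def p_best_wins(tournament, strengths, trials=2000, seed=0):
    """Monte Carlo estimate of the chance that the strongest entrant wins."""
    rng = random.Random(seed)
    best = max(range(len(strengths)), key=lambda i: strengths[i])
    hits = sum(tournament(strengths, rng) == best for _ in range(trials))
    return hits / trials
```

Comparing `p_best_wins(knockout, strengths)` with `p_best_wins(round_robin, strengths)` for the same list of strengths gives a rough measure of how much the extra games in a double round robin help the best entrant, at the cost of a longer schedule.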
TL;DR: Comparisons of two independent groups based on a single measure of location or on ranks can miss important differences. This paper describes and illustrates a method for comparing two groups in terms of multiple quantiles.
Abstract: Typically two independent groups are compared in terms of some measure of location, usually the mean, or a method based on ranks. A concern about both of these approaches, already raised in statistical references, is that they can miss important differences. For example, a new treatment method might be beneficial for some subjects but detrimental for others. Details are given in this paper. An approach to this problem is to compare two groups in terms of multiple quantiles. The paper describes and illustrates a method for accomplishing this
TL;DR: A case is made for the development of new types of smart exploratory analysis tools able to explore spatial data effectively while also coping with the problems associated with the data and the skill levels of the end-users.
Abstract: The paper examines some of the problems that users of geographical information systems (GISs) face in attempting to perform spatial analysis. A case is made for the development of new types of smart exploratory analysis tools able to explore spatial data effectively while also coping with the problems associated with the data and the skill levels of the end-users. Some suggestions are made about how artificial intelligence methods borrowed from artificial life can be used to create spatial pattern hunting creatures that may provide the basis for more effective spatial analysis procedures for use with GISs.
TL;DR: In this paper, the authors review several properties of sample size methods and discuss the importance of these properties in the context of a binomial experiment, and present a general algorithm for Bayesian sample size determination that is useful for more complex sampling situations based on Monte Carlo simulations.
Abstract: Several criteria for Bayesian sample size determination have recently been proposed. Criteria based on highest posterior density (HPD) intervals from the exact posterior distribution in general lead to smaller sample sizes than those based on non-HPD intervals and/or normal approximations to the exact density. The economies are variable, however, and depend both on the prior inputs and the desired posterior accuracy and coverage probability. In our reply we review several properties of sample size methods and discuss the importance of these properties in the context of a binomial experiment. A general algorithm for Bayesian sample size determination that is useful for more complex sampling situations based on Monte Carlo simulations is briefly described.
TL;DR: In this paper, some useful properties of the columns of orthogonal arrays are reviewed and several rules for assigning factors to columns are presented, to help in designing experiments that avoid unwanted time effects and reduce the cost of level changes.
Abstract: Often industrial experiments require good fractional factorial designs to examine the effects of many factors by using only a small number of experimental runs. These experimental runs can be determined by assigning factors to the columns of appropriate orthogonal arrays. When the experimental runs are carried out in a time order sequence, the responses can depend on the run order. Frequently level changes are more expensive for some factors in the study than for others. To avoid unwanted time effects and to reduce costs, information is needed about the columns of the orthogonal arrays to assign factors to appropriate columns. In this paper, we review some useful properties of columns in the arrays and present several rules for the assignments. These are helpful in designing experiments using orthogonal arrays. For illustration, several examples are given after these rules have been presented.
TL;DR: In this paper, it is shown that suitably tuned proportional-integral (PI) schemes, in which the required adjustment is merely a linear combination of the last two observed errors, can do almost as well as more complicated optimal constrained schemes. These PI schemes can be applied manually by using a feed-back adjustment chart which is no more difficult to use than a Shewhart chart.
Abstract: It is well known that discrete feed-back control schemes chosen to produce minimum mean-squared error at the output can require excessive manipulation of the compensating variable. Also very large reductions in the manipulation variance can be obtained at the expense of minor increases in the output variance by using constrained schemes. Unfortunately, however, both the form and the derivation of such schemes are somewhat complicated. The purpose of this paper is to show that suitable 'tuned' proportional-integral (PI) schemes in which the required adjustment is merely a linear combination of the two last observed errors can do almost as well as the more complicated optimal constrained schemes. If desired, these PI schemes can be applied manually by using a feed-back adjustment chart which is no more difficult to use than a Shewhart chart. Several examples are given and tables are provided that allow the calculation of the optimal constrained PI scheme and the resulting adjustment variance and output variance. Methods of tuning such controllers by using evolutionary operation and experimental design are briefly discussed.
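The form of the PI adjustment described above, a linear combination of the last two observed errors, can be illustrated with a short simulation. Everything beyond that form is an assumption made for the sketch: the disturbance is taken to be an IMA(1,1) process with parameter `theta`, the process responds fully within one period with a known `gain`, and the constants `c1` and `c2` are free inputs rather than values from the paper's tables.

```python
import random


def simulate_pi_control(shocks, theta, gain, c1, c2):
    """Simulate output errors under x_t - x_{t-1} = c1 * e_t + c2 * e_{t-1}.

    Returns (errors, adjustments), one entry per period.
    """
    z = 0.0          # disturbance level (deviation from target if untouched)
    x = 0.0          # current setting of the compensating variable
    a_prev = 0.0     # previous random shock
    e_prev = 0.0     # previous output error
    errors, adjustments = [], []
    for a in shocks:
        z += a - theta * a_prev          # assumed IMA(1,1) disturbance
        e = z + gain * x                 # output error observed this period
        dx = c1 * e + c2 * e_prev        # PI adjustment from last two errors
        x += dx
        errors.append(e)
        adjustments.append(dx)
        a_prev, e_prev = a, e
    return errors, adjustments


def mean_square(values):
    """Average of squared values, used to compare output and adjustment sizes."""
    return sum(v * v for v in values) / len(values)
```

Under these assumptions, setting `c2 = 0` and `c1 = -(1 - theta) / gain` gives the minimum mean-squared error scheme, for which the output errors reduce to the random shocks themselves. Other choices of `c1` and `c2` can then be compared by applying `mean_square` to the returned errors and adjustments, which mirrors the trade-off between output variance and adjustment variance discussed in the abstract.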
TL;DR: This book opens with a brief introductory guide to S-PLUS and then works through a series of statistical analyses, from data description and simple inference to scattergrams and density estimation, each illustrated with a data set.
Abstract: 1. A Brief Introductory Guide to S-PLUS. 2. Chapter 1-Data Description and Simple Inference: IQ Scores of Children of Depressed and Non-Depressed Women. 3. Chapter 2-Multiple Regression: Predicting the Volume of Black Cherry Trees from Measurements of their Height and Diameter. 4. Chapter 3-Analysis of Variance: Diets for Chickens. 5. Chapter 4-Logistic Regression: Predicting Outcome of Treatment for Patients with Leukemia. 6. Chapter 5-Survival Analysis: Modelling the Survival Times of Patients with Leukemia. 7. Chapter 6-Non-Linear Modelling: Fitting the Michaelis-Menten Equation to Hormone-Receptor Assay Results. 8. Chapter 7-The Analysis of Time Series: Yearly Numbers of Sunspots. 9. Chapter 8-Principal Components: Exploring the Progress of Competitors in a 100 Kilometer Road Race. 10. Chapter 9-Cluster Analysis: Classifying Countries in Terms of the Athletic Prowess of their Women. 11. Chapter 10-Correspondence Analysis: Suicide Behaviour in the Former West Germany. 12. Chapter 11-Scattergrams and Density Estimation: Birth and Death Rates for 69 Countries.
TL;DR: In this paper, a self-exciting threshold autoregressive (SETAR) model was used to fit a recent epidemiological time series of reported cases of Salmonella typhimurium in France.
Abstract: In this paper we fit a self-exciting threshold autoregressive (SETAR) model, introduced by Tong, to a recent epidemiological time series of reported cases of Salmonella typhimurium in France. The procedure proposed by Tsay for fitting this class of model is briefly presented. The fitted 'full' model is compared with a simple autoregressive (AR) model. Finally, we compare the full model with the 'restricted' model discussed by Thanoon. Our results favour modelling by a SETAR process instead of an AR process. Thus, the time series of infections due to Salmonella typhimurium exhibits a type of non-linearity which can be accounted for by a threshold model. For parsimony and ease of interpretation of the model, the restricted SETAR model is finally preferred.
TL;DR: The performance of linear Bayes rules in estimating a normal mean is explored. In many cases they have reasonably high efficiencies and can therefore be used instead of the Bayes rules, as the linear rules are much easier to obtain.
Abstract: In this paper we explore the performance of linear Bayes rules in estimating a normal mean. We take a robust Bayesian standpoint, expressing our limited information about the parameter by adopting a family Γ of priors. The family of priors that we are using is fully described by Azzalini and is suitable for modelling non-symmetric situations. We find in many cases that the linear Bayes rules have reasonably high efficiencies (with respect to the corresponding Bayes rules) and therefore can be used instead of the Bayes rules as the linear rules are much easier to obtain.
TL;DR: Two measures of explained variation for survival data have been proposed by Korn and Simon and by Schemper. This paper demonstrates how these measures compare and works out the conceptual differences between them.
Abstract: Two measures of explained variation for survival data have been proposed by Korn and Simon and by Schemper. In this paper, we demonstrate how these measures compare and work out the conceptual differences between them. Both compare the variance when a covariate is accounted for with the variance when it is ignored, but only the second incorporates differences between observed and fitted outcomes. First, the relationship between both measures is studied for the situation without censoring. It turns out that considering the explained variation as a process in time is helpful in this context as well as in its own right. Censoring is incorporated quite differently for the two measures. We illustrate the points made by examples of fictitious data and a study on the treatment of breast cancer.
TL;DR: In regression problems, individual confidence bands on the parameters are typically examined instead of more accurate ellipsoidal confidence regions. The authors discuss the relationship between the two representations and suggest a simple volume calculation that could be routinely displayed in regression outputs.
Abstract: In regression problems, individual confidence bands on the parameters typically are examined. Doing this avoids the difficulties of envisaging more accurate ellipsoidal confidence regions. We discuss the relationship between the two representations and suggest a simple volume calculation that could be routinely displayed in regression outputs and provide useful information.
TL;DR: In this paper, the sampling behavior of raw and partial measures of association between categorical variables is analyzed and the validity of the asymptotic results is stressed by means of a simulation study.
Abstract: This paper is concerned with the sampling behaviour of raw and partial measures of association between categorical variables. It summarizes the asymptotic results established for raw measures and extends them in a systematic way for the derived partial associations. The validity of the asymptotic results is then stressed by means of a simulation study. Three proportional reduction in error of prediction measures are considered for nominal variables and three concordance-discordance indices for ordinal variables.
TL;DR: This text covers random vectors and their densities, stochastic processes, regular conditional probabilities, optimal stopping strategies, exponential families, consistency of maximum estimators, and asymptotic normality.
Abstract: Random vectors and their densities; Stochastic processes; Regular conditional probabilities; Optimal stopping strategies; Exponential families; Consistency of maximum estimators; Asymptotic normality; Exercises; Prerequisites.
TL;DR: This commentary compares some of the rules for Bayesian sample size determination (SSD) under binomial sampling and reviews some of the difficulties of SSD computations for the multinomial distribution.
Abstract: Bayesian methods for sample size determination (SSD) are very powerful and the basic ideas extend to any univariate distribution. This commentary is a comparison of some of the rules for binomial sampling. Some of the difficulties of SSD computations for the multinomial distribution are also reviewed.
TL;DR: In this article, a simple way to correct the usual sample size formulae (based on normal theory) to account for the relationships between the treatment covariate, the other covariates and the response is presented.
Abstract: References on sample size calculations for randomized studies with a continuous response are abundant. The sample size is usually calculated from a simple two-sample z-test. In fact, most introductory statistical books contain the calculations. Suppose that an investigator is interested in one main covariate, say a treatment covariate, in a non-randomized (or observational) study. In a non-randomized study, besides the treatment covariate, there are also other covariates that are used in a regression. The relationship between the main covariate and other covariates needs to be taken into account when calculating the necessary sample size for detecting a treatment effect. In this paper, we give a simple way to correct the usual sample size formulae (based on normal theory) to account for the relationships between the treatment covariate, the other covariates and the response; the relationships are accounted for by looking at various coefficients of determination.
TL;DR: This commentary addresses the results of Joseph et al. (1995) on sample size in a Bayesian context, together with related results from other researchers. It notes that the most complete treatment to date of classical sample size methodology seems to be Desu and Raghavarao (1990), where the Bayesian approach is barely mentioned.
Abstract: I wish to thank the Editor of The Statistician for inviting me to write this commentary which addresses not only the specific results obtained by Joseph et al. (1995) but also some related results presented by other researchers. Joseph et al. (1995) have written quite an interesting paper on sample size in a Bayesian context, which contributes significantly to the on-going discussion on this topic. As mentioned by Adcock (1992) the determination of sample size is an important question for the applied statistician. Although this question has been of interest for a long time in classical frequentist statistics, it seems to have attracted the attention of Bayesians only quite recently; the reasons could be those mentioned by Pham-Gia and Turkkan (1992). The most complete treatment to date on classical sample size methodology seems to be Desu and Raghavarao (1990) where the Bayesian approach is barely mentioned. A list of useful sample size tables can also be found in that work.