TL;DR: The pooled mean group (PMG) estimator, as discussed by the authors, constrains long-run coefficients to be identical but allows short-run coefficients and error variances to differ across groups.
Abstract: It is now quite common to have panels in which both T, the number of time series observations, and N, the number of groups, are quite large and of the same order of magnitude. The usual practice is either to estimate N separate regressions and calculate the coefficient means, which we call the mean group (MG) estimator, or to pool the data and assume that the slope coefficients and error variances are identical. In this article we propose an intermediate procedure, the pooled mean group (PMG) estimator, which constrains long-run coefficients to be identical but allows short-run coefficients and error variances to differ across groups. We consider both the case where the regressors are stationary and the case where they follow unit root processes, and for both cases derive the asymptotic distribution of the PMG estimators as T tends to infinity. We also provide two empirical applications: Aggregate consumption functions for 24 Organization for Economic Cooperation and Development economies over th...
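The two "usual practice" baselines described in the abstract can be illustrated with a minimal sketch. It is not the authors' code and it does not implement the PMG estimator itself, which additionally constrains the long-run coefficients across groups. The simulated static panel, the parameter values, and the function names (`simulate_panel`, `mean_group_slope`, `pooled_slope`) are all assumptions made for illustration.

```python
import numpy as np


def simulate_panel(n_groups=20, n_periods=50, noise_sd=0.5, seed=0):
    """Simulate y_it = a_i + b_i * x_it + e_it with group-specific slopes b_i.

    All parameter values are illustrative assumptions, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    intercepts = rng.normal(0.0, 1.0, size=n_groups)
    slopes = 1.0 + rng.normal(0.0, 0.2, size=n_groups)
    x = rng.normal(0.0, 1.0, size=(n_groups, n_periods))
    noise = rng.normal(0.0, noise_sd, size=(n_groups, n_periods))
    y = intercepts[:, None] + slopes[:, None] * x + noise
    return x, y, slopes


def mean_group_slope(x, y):
    """Run one OLS regression per group, then average the slope estimates."""
    group_slopes = []
    for x_i, y_i in zip(x, y):
        design = np.column_stack([np.ones_like(x_i), x_i])
        coef, *_ = np.linalg.lstsq(design, y_i, rcond=None)
        group_slopes.append(coef[1])
    group_slopes = np.asarray(group_slopes)
    return group_slopes.mean(), group_slopes


def pooled_slope(x, y):
    """Pool all observations and fit a single common intercept and slope."""
    design = np.column_stack([np.ones(x.size), x.ravel()])
    coef, *_ = np.linalg.lstsq(design, y.ravel(), rcond=None)
    return coef[1]


if __name__ == "__main__":
    x, y, true_slopes = simulate_panel()
    mg, _ = mean_group_slope(x, y)
    print(f"mean of true slopes: {true_slopes.mean():.3f}")
    print(f"MG estimate:         {mg:.3f}")
    print(f"pooled estimate:     {pooled_slope(x, y):.3f}")
```

The MG function keeps every coefficient group-specific and only averages at the end, while the pooled fit forces one slope and one intercept on all groups. The PMG procedure described in the abstract sits between these two extremes.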
TL;DR: The author’s focus is very basic: two-level experiments, estimating effects, visualizing two-factor interactions, assessing an effect's significance with normal probability plots and confidence intervals, and basic fractional factorials and aliasing.
Abstract: illustrate how to block on a source of variation. The module could also benefit from a graphical depiction of how a hypothesis test works. The nonparametric-statistics module contains the Wilcoxon–Mann–Whitney test for two averages and the Kruskal–Wallis rank sum test. The basic and special-application SPC sections are, in my opinion, the cornerstone of the reference. Nearly 40% of the modules in the book are devoted to SPC. Crossley provides the customary Xbar/R, Xbar/s, X/MR, p, np, c, and u chart details. He also includes modules devoted to the exponentially weighted moving average (EWMA) chart and two forms of multivariate SPC, a Hotelling’s T² chart and a standardized Euclidean distance chart. Three modules are devoted to short-run SPC. Modifying a more common form of SPC chart to obtain a Z chart serves as the foundation for the short-run SPC charts. I would have liked to have seen a discussion on rational subgrouping and the thought process involved prior to plotting any data or computing statistics. Other SPC modules are devoted to nonmainstream control charts, for example, precontrol, demerit charts, 2 charts, and zone charts. Finally, there are two modules devoted to acceptance sampling. The process-improvement section should be viewed as the design of experiment (DOE) section. There are modules devoted to designed experiments, sequential simplex optimization, evolutionary operations, and the Taguchi loss function. The designed-experiment module is the longest module in the book at 38 pages. The author’s focus is very basic: two-level experiments, estimating effects, visualizing two-factor interactions, assessing an effect’s significance with normal probability plots and confidence intervals, and basic fractional factorials and aliasing. The DOE portion does not provide extensive tables of possible designs or discuss response surface modeling, blocking, residual analysis, robust designs, or mixture designs.
The context is two-level (fractional) factorials, the industry workhorse, and not mixed models or D-optimal designs. The most obscure topic is a brief discussion of Plackett–Burman screening designs. Nonetheless, the module is easy to read and should be useful as a very basic introduction to the subject. The need for BHH has still not been supplanted. The author seems to have successfully compromised in an area that could easily fill 1,000+ pages of material. The sequential simplex optimization module could best be described as the sequential experimentation module for low-dimensional problems. The reliability section consists of the reliability and Weibull-analysis modules. Both modules combine for a total of 25 pages. Similar to the DOE comment, one should not expect a comprehensive treatment of a subject that could easily fill several books. Most of the emphasis is on basic concepts such as estimating distribution parameters for the exponential and Weibull distributions, failure-rate confidence intervals, series/parallel systems, and probability plots. There is nothing in the way of accelerated testing, confidence intervals for percentiles, regression with reliability data, Bayesian reliability, or the notorious burn-in sample-size questions. I consider this to be the thinnest section of the book. Except for the items mentioned previously, the book contains little in the way of multivariate or nonparametric statistics. I would have liked to have seen more emphasis on sample-size calculations in the hypothesis-testing modules, given that these questions routinely occur in practice. Aside from an EWMA control chart, there is no mention of autoregressive integrated moving average models or time series. These are all relatively minor points provided the reader does not expect to find neural-net-like exotica. A serious flaw in the book is the large number of errors.
Most of the errors appear to be simple typos, but I am not qualified to technically vouch for all of the material presented. The author also makes minor notation changes between modules. For example, he uses both ä (p. 160) and ” (p. 292) to denote the normal cdf and refers to the minimum-life characteristic with „ (p. 417) and then both ƒ and ‹ (p. 423) in the Weibull-analysis module. The chi-square contingency and goodness-of-fit module contains a section (p. 63) in which four of the five equations on the same page contain errors. Both S and s are used interchangeably to denote the sample standard deviation (p. 94), p and P are used to denote the probability of success (p. 157), and there are several instances in which x , X , and 2 denote the same entity, even in the same set of equations (p. 92). Equations lack connecting equals and separating commas, and one can find + in place of =. I eventually stopped trying to track the typos. The author generally provides a bibliography at the end of each module. The normal distribution and F test modules do not have a bibliography. Was this intended or an editorial oversight? On a more technical note, the author gives a Bayesian interpretation in reporting the results of a confidence interval. Hypothesis tests are discussed, but there is no mention of the ubiquitous p value. The SPC modules ought to contain some mention of independence and autocorrelation, problems inherent in today’s manufacturing environment. Similarly, Crossley did not mention that one should not compute a Cpk for a process that has not demonstrated stability. These types of basic comments need to be explicitly stated to avoid the common engineering “cookbook” approach to statistics. Some of the material is not freestanding or user-friendly. For example, the acceptance-sampling modules incorporate inspection levels that are not defined elsewhere in the book.
The nonnormal distribution Cpk module could be intimidating to the uninitiated in that it involves unfamiliar topics such as skewness and kurtosis and has a more obscure theoretical underpinning. The author does provide a disclaimer in the Preface, but this may be unfair to the targeted audience. Fortunately, I found only a handful of modules to be difficult to read. The book would also benefit from more extensive cross-referencing. Several modules make use of hypothesis testing, and some readers could benefit from an appropriate pointer. The SPC modules contained more extensive cross-referencing. The average engineer would benefit most from the material in the book. It is adequately indexed, generally easy to read, algorithmic, and covers a lot of topics. Full-time quality engineers and statisticians will have less use for the book. The material is generally too basic and introductory in nature to answer the subtle questions and pro-and-con evaluations that occur in practice. The reader should not expect an in-depth coverage of SPC, modeling/regression, designed experiments, or reliability. If the book contained fewer errors, I would consider it a welcome addition to my bookshelf. The book contains several digestible morsels that I had not tasted elsewhere. Finally, I do wish the author had presented more in the way of editorial. This may be an unreasonable expectation, but knowing how and when to use a tool in addition to the tool’s limitations is often more important than having a toolbox packed with gadgets. Naturally, I leave it up to each reader to determine whether or not the porridge is just right.
TL;DR: This work uses a simulation approach to illustrate the sampling distribution of the standard deviation for continuous outcomes and the event rate for binary outcomes, and presents the impact of increasing the pilot sample size on the precision and bias of these estimates, and predicted power under three realistic scenarios.
Abstract: External pilot or feasibility studies can be used to estimate key unknown parameters to inform the design of the definitive randomised controlled trial (RCT). However, there is little consensus on how large pilot studies need to be, and some suggest inflating estimates to adjust for the lack of precision when planning the definitive RCT. We use a simulation approach to illustrate the sampling distribution of the standard deviation for continuous outcomes and the event rate for binary outcomes. We present the impact of increasing the pilot sample size on the precision and bias of these estimates, and predicted power under three realistic scenarios. We also illustrate the consequences of using a confidence interval argument to inflate estimates so the required power is achieved with a pre-specified level of confidence. We limit our attention to external pilot and feasibility studies prior to a two-parallel-balanced-group superiority RCT. For normally distributed outcomes, the relative gain in precision of the pooled standard deviation (SD_p) is less than 10% (for each five subjects added per group) once the total sample size is 70. For true proportions between 0.1 and 0.5, we find the gain in precision for each five subjects added to the pilot sample is less than 5% once the sample size is 60. Adjusting the required sample sizes for the imprecision in the pilot study estimates can result in excessively large definitive RCTs and also requires a pilot sample size of 60 to 90 for the true effect sizes considered here. We recommend that an external pilot study has at least 70 measured subjects (35 per group) when estimating the SD_p for a continuous outcome. If the event rate in an intervention group needs to be estimated by the pilot then a total of 60 to 100 subjects is required. Hence if the primary outcome is binary a total of at least 120 subjects (60 in each group) may be required in the pilot trial. It is very much more efficient to use a larger pilot study, than to guard against the lack of precision by using inflated estimates.
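The kind of simulation the abstract describes for a continuous outcome can be sketched as follows. This is a hedged illustration rather than the authors' code: the true SD of 1, the number of simulated pilots, the per-group sizes tried, and the use of a 2.5th to 97.5th percentile width as the precision measure are all assumptions, as are the function names `simulate_pooled_sd` and `interval_width`.

```python
import numpy as np


def simulate_pooled_sd(n_per_group, true_sd=1.0, n_sims=5000, seed=0):
    """Simulate the pooled SD from many two-group pilots of equal group size.

    The true SD, number of simulations, and seed are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    # Shape: (simulation, group, subject); both groups share the same true SD.
    data = rng.normal(0.0, true_sd, size=(n_sims, 2, n_per_group))
    group_var = data.var(axis=2, ddof=1)
    # With equal group sizes the pooled variance is the mean of the two variances.
    return np.sqrt(group_var.mean(axis=1))


def interval_width(estimates, lower=2.5, upper=97.5):
    """Width of the central percentile interval, used here as a precision proxy."""
    lo, hi = np.percentile(estimates, [lower, upper])
    return hi - lo


if __name__ == "__main__":
    for n in (10, 20, 35, 50):
        sd_p = simulate_pooled_sd(n)
        print(f"n per group = {n:2d}: mean SD_p = {sd_p.mean():.3f}, "
              f"95% width = {interval_width(sd_p):.3f}")
```

Running the loop over increasing group sizes shows the spread of the simulated SD_p shrinking as the pilot grows, with each extra block of subjects buying less precision than the last. That diminishing return is the pattern behind the abstract's sample-size recommendation.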