TL;DR: A convenient, although not comprehensive, presentation of required sample sizes is providedHere the sample sizes necessary for .80 power to detect effects at these levels are tabled for eight standard statistical tests.
Abstract: One possible reason for the continued neglect of statistical power analysis in research in the behavioral sciences is the inaccessibility of or difficulty with the standard material. A convenient, although not comprehensive, presentation of required sample sizes is provided here. Effect-size indexes and conventional values for these are given for operationally defined small, medium, and large effects. The sample sizes necessary for .80 power to detect effects at these levels are tabled for eight standard statistical tests: (a) the difference between independent means, (b) the significance of a product-moment correlation, (c) the difference between independent rs, (d) the sign test, (e) the difference between independent proportions, (f) chi-square tests for goodness of fit and contingency tables, (g) one-way analysis of variance, and (h) the significance of a multiple or multiple partial correlation.
TL;DR: A slightly more complex rule-of thumb is introduced that estimates minimum sample size as function of effect size as well as the number of predictors and it is argued that researchers should use methods to determine sample size that incorporate effect size.
Abstract: Numerous rules-of-thumb have been suggested for determining the minimum number of subjects required to conduct multiple regression analyses. These rules-of-thumb are evaluated by comparing their results against those based on power analyses for tests of hypotheses of multiple and partial correlations. The results did not support the use of rules-of-thumb that simply specify some constant (e.g., 100 subjects) as the minimum number of subjects or a minimum ratio of number of subjects (N) to number of predictors (m). Some support was obtained for a rule-of-thumb that N ≥ 50 + 8 m for the multiple correlation and N ≥104 + m for the partial correlation. However, the rule-of-thumb for the multiple correlation yields values too large for N when m ≥ 7, and both rules-of-thumb assume all studies have a medium-size relationship between criterion and predictors. Accordingly, a slightly more complex rule-of thumb is introduced that estimates minimum sample size as function of effect size as well as the number of predictors. It is argued that researchers should use methods to determine sample size that incorporate effect size.
TL;DR: This paper discusses the uses of the correlation coefficient r, either as a way to infer correlation, or to test linearity, and recommends the use of z Fisher transformation instead of r values because r is not normally distributed but z is (at least in approximation).
Abstract: Correlation and regression are different, but not mutually exclusive, techniques. Roughly, regression is used for prediction (which does not extrapolate beyond the data used in the analysis) whereas correlation is used to determine the degree of association. There situations in which the x variable is not fixed or readily chosen by the experimenter, but instead is a random covariate to the y variable. This paper shows the relationships between the coefficient of determination, the multiple correlation coefficient, the covariance, the correlation coefficient and the coefficient of alienation, for the case of two related variables x and y. It discusses the uses of the correlation coefficient r, either as a way to infer correlation, or to test linearity. A number of graphical examples are provided as well as examples of actual chemical applications. The paper recommends the use of z Fisher transformation instead of r values because r is not normally distributed but z is (at least in approximation). For eithe...
TL;DR: The onion method is explained in terms of elliptical distributions and extended to allow generating random correlation matrices from the same joint distribution as the vine method to study the relationship between the multiple correlation and partial correlations on a regular vine.
TL;DR: A general matrix formula is derived of the semi-partial correlation for fast computation and it is shown that users can readily calculate the coefficients of both partial and semi- partial correlations without computational burden.
Abstract: Lack of a general matrix formula hampers implementation of the semi-partial correlation, also known as part correlation, to the higher-order coefficient. This is because the higher-order semi-partial correlation calculation using a recursive formula requires an enormous number of recursive calculations to obtain the correlation coefficients. To resolve this difficulty, we derive a general matrix formula of the semi-partial correlation for fast computation. The semi-partial correlations are then implemented on an R package ppcor along with the partial correlation. Owing to the general matrix formulas, users can readily calculate the coefficients of both partial and semi-partial correlations without computational burden. The package ppcor further provides users with the level of the statistical significance with its test statistic.