TL;DR: A slightly more complex rule-of-thumb is introduced that estimates minimum sample size as a function of effect size as well as the number of predictors, and it is argued that researchers should use methods to determine sample size that incorporate effect size.

Abstract: Numerous rules-of-thumb have been suggested for determining the minimum number of subjects required to conduct multiple regression analyses. These rules-of-thumb are evaluated by comparing their results against those based on power analyses for tests of hypotheses of multiple and partial correlations. The results did not support the use of rules-of-thumb that simply specify some constant (e.g., 100 subjects) as the minimum number of subjects or a minimum ratio of number of subjects (N) to number of predictors (m). Some support was obtained for a rule-of-thumb that N ≥ 50 + 8m for the multiple correlation and N ≥ 104 + m for the partial correlation. However, the rule-of-thumb for the multiple correlation yields values too large for N when m ≥ 7, and both rules-of-thumb assume all studies have a medium-size relationship between criterion and predictors. Accordingly, a slightly more complex rule-of-thumb is introduced that estimates minimum sample size as a function of effect size as well as the number of predictors. It is argued that researchers should use methods to determine sample size that incorporate effect size.
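The two rules-of-thumb quoted in the abstract are simple enough to evaluate directly. The following is a minimal sketch, not the paper's own code; the function names are hypothetical, and it covers only the two medium-effect-size rules stated above, not the effect-size-based rule the abstract mentions without giving.

```python
def min_n_multiple_correlation(m: int) -> int:
    """Rule-of-thumb N >= 50 + 8m for testing the multiple correlation.

    m is the number of predictors. Per the abstract, this rule assumes a
    medium-size relationship and gives values that are too large when m >= 7.
    """
    if m < 1:
        raise ValueError("m must be at least 1")
    return 50 + 8 * m


def min_n_partial_correlation(m: int) -> int:
    """Rule-of-thumb N >= 104 + m for testing a partial correlation."""
    if m < 1:
        raise ValueError("m must be at least 1")
    return 104 + m


if __name__ == "__main__":
    # Tabulate both minimum sample sizes for a few predictor counts.
    for m in (1, 3, 5, 7, 10):
        print(m, min_n_multiple_correlation(m), min_n_partial_correlation(m))
```

Because both rules ignore effect size, they are best read as a rough check against a proper power analysis rather than a replacement for one.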

3,736 citations

Journal Article•10.1177/014662168801200410•

Set Correlation and Contingency Tables.

Jacob Cohen¹•Institutions (1)

New York University¹

01 Dec 1988-Applied Psychological Measurement

TL;DR: Set correlation is a realization of the general multivariate linear model, can be viewed as a multivariate generalization of multiple correlation analysis, and may be employed in the analysis of m...

Abstract: Set correlation is a realization of the general multivariate linear model, can be viewed as a multivariate generalization of multiple correlation analysis, and may be employed in the analysis of m...

1,218 citations

Journal Article•10.1080/10408340500526766•

The Correlation Coefficient: An Overview

Agustin G. Asuero¹, Ana Sayago¹, Antonio G. González¹•Institutions (1)

University of Seville¹

01 Jan 2006-Critical Reviews in Analytical Chemistry

TL;DR: This paper discusses the uses of the correlation coefficient r, either as a way to infer correlation or to test linearity, and recommends using the Fisher z transformation instead of r values, because r is not normally distributed but z is (at least approximately).

Abstract: Correlation and regression are different, but not mutually exclusive, techniques. Roughly, regression is used for prediction (which does not extrapolate beyond the data used in the analysis) whereas correlation is used to determine the degree of association. There are situations in which the x variable is not fixed or readily chosen by the experimenter, but instead is a random covariate to the y variable. This paper shows the relationships between the coefficient of determination, the multiple correlation coefficient, the covariance, the correlation coefficient and the coefficient of alienation, for the case of two related variables x and y. It discusses the uses of the correlation coefficient r, either as a way to infer correlation or to test linearity. A number of graphical examples are provided, as well as examples of actual chemical applications. The paper recommends using the Fisher z transformation instead of r values, because r is not normally distributed but z is (at least approximately). For eithe...
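The Fisher z transformation recommended in the abstract is z = atanh(r). The sketch below illustrates it, with hypothetical function names. The confidence interval is an assumption added for illustration and is not taken from the abstract: it uses the usual large-sample standard error 1/sqrt(n − 3), which presumes roughly bivariate-normal data.

```python
import math


def fisher_z(r: float) -> float:
    """Fisher z transformation of a correlation coefficient: z = atanh(r)."""
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    return math.atanh(r)


def fisher_ci(r: float, n: int, z_crit: float = 1.959964) -> tuple[float, float]:
    """Approximate confidence interval for r, built on the z scale.

    Assumption (not from the abstract): z is treated as normal with
    standard error 1 / sqrt(n - 3). z_crit defaults to the two-sided
    95% normal quantile.
    """
    if n <= 3:
        raise ValueError("n must be greater than 3")
    z = fisher_z(r)
    se = 1.0 / math.sqrt(n - 3)
    # Build the interval on the z scale, then map back to the r scale.
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)


if __name__ == "__main__":
    print(fisher_z(0.5))
    print(fisher_ci(0.5, 30))
```

Working on the z scale keeps the back-transformed interval inside (−1, 1), which an interval computed directly on r does not guarantee.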

1,188 citations

Journal Article•10.1016/J.JMVA.2009.04.008•

Generating random correlation matrices based on vines and extended onion method

Daniel Lewandowski¹, Dorota Kurowicka¹, Harry Joe²•Institutions (2)

Delft University of Technology¹, University of British Columbia²

01 Oct 2009-Journal of Multivariate Analysis

TL;DR: The onion method is explained in terms of elliptical distributions and extended to allow generating random correlation matrices from the same joint distribution as the vine method to study the relationship between the multiple correlation and partial correlations on a regular vine.

1,083 citations

Journal Article•10.2307/2346488•

Discarding Variables in a Principal Component Analysis. I: Artificial Data

Ian T. Jolliffe¹•Institutions (1)

University of Kent¹

01 Jun 1972-Journal of the Royal Statistical Society: Series C (Applied Statistics)

TL;DR: It is shown that several of the rejection methods, of differing types, each discard precisely those variables known to be redundant, for all but a few sets of data.

Abstract: Often, results obtained from the use of principal component analysis are little changed if some of the variables involved are discarded beforehand. This paper examines some of the possible methods for deciding which variables to reject and these rejection methods are tested on artificial data containing variables known to be “redundant”. It is shown that several of the rejection methods, of differing types, each discard precisely those variables known to be redundant, for all but a few sets of data.

1,023 citations