TL;DR: This paper describes a method for combining results across independent-groups and repeated measures designs, discusses the conditions under which such an analysis is appropriate, and outlines a meta-analysis procedure that uses design-specific estimates of sampling variance.
Abstract: When a meta-analysis on results from experimental studies is conducted, differences in the study design must be taken into consideration. A method for combining results across independent-groups and repeated measures designs is described, and the conditions under which such an analysis is appropriate are discussed. Combining results across designs requires that (a) all effect sizes be transformed into a common metric, (b) effect sizes from each design estimate the same treatment effect, and (c) meta-analysis procedures use design-specific estimates of sampling variance to reflect the precision of the effect size estimates.
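As a minimal sketch of conditions (a) and (c), the Python below puts one independent-groups result and one repeated measures result into the same raw-score standardized metric and then pools them with inverse-variance weights. The variance formulas are the common large-sample textbook approximations, not necessarily the exact estimators used in the paper. The conversion assumes equal pre-test and post-test standard deviations and a known pre-post correlation `r`. All function names and numbers are hypothetical.

```python
import math


def d_independent_groups(mean_e, mean_c, sd_pooled):
    """Standardized mean difference for an independent-groups study."""
    return (mean_e - mean_c) / sd_pooled


def var_d_independent_groups(d, n_e, n_c):
    """Large-sample approximation to the sampling variance of d."""
    return (n_e + n_c) / (n_e * n_c) + d ** 2 / (2 * (n_e + n_c))


def d_repeated_measures_raw(mean_diff, sd_diff, r):
    """Repeated measures effect expressed in the raw-score metric.

    Assumes equal pre and post SDs, so that
    sd_diff = sd_raw * sqrt(2 * (1 - r)).
    """
    sd_raw = sd_diff / math.sqrt(2 * (1 - r))
    return mean_diff / sd_raw


def var_d_repeated_measures_raw(d, n, r):
    """Large-sample approximation for a raw-metric repeated measures d."""
    return (1 / n + d ** 2 / (2 * n)) * 2 * (1 - r)


def fixed_effect_mean(effects, variances):
    """Inverse-variance weighted mean effect and its standard error."""
    weights = [1 / v for v in variances]
    pooled = sum(w * d for w, d in zip(weights, effects)) / sum(weights)
    return pooled, math.sqrt(1 / sum(weights))


# Hypothetical study 1: independent groups, 20 participants per group.
d_ig = d_independent_groups(mean_e=10.0, mean_c=8.0, sd_pooled=4.0)
v_ig = var_d_independent_groups(d_ig, n_e=20, n_c=20)

# Hypothetical study 2: repeated measures, 20 participants, r = 0.5.
d_rm = d_repeated_measures_raw(mean_diff=2.0, sd_diff=4.0, r=0.5)
v_rm = var_d_repeated_measures_raw(d_rm, n=20, r=0.5)

pooled_d, pooled_se = fixed_effect_mean([d_ig, d_rm], [v_ig, v_rm])
print(d_ig, v_ig, d_rm, v_rm, pooled_d, pooled_se)
```

In this made-up example both studies give the same effect size, but the repeated measures estimate has the smaller variance and so receives more weight. Condition (b), that both designs estimate the same treatment effect, is a substantive judgment that code cannot check.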
TL;DR: In between-subjects (BS) designs, different groups may be asked to make judgments on numerical rating scales, which can lead to strange conclusions: when different groups judge the subjective size of numbers, 9 is judged significantly larger than 221.
Abstract: In between-subjects (BS) designs, different groups may be asked to make judgments on numerical rating scales. According to judgment theory, judgments obtained BS are not an ordinal scale of subjective value. This article illustrates how BS designs can lead to strange conclusions: When different groups judge the subjective size of numbers, 9 is judged significantly larger than 221. The theory is that 9 brings to mind a context of small numbers, among which 9 seems "average" or even "large"; however, 221 invokes a context of 3-digit numbers, among which 221 seems relatively "small." Within-subjects, however, judges would not have said 9 > 221. Implications of this problem and suggestions for dealing with it are discussed. The purpose of this article is to illustrate how between-subjects (BS) experiments, in which the dependent variable is a judgment, can lead to dubious conclusions. Although this point has been made previously (Birnbaum, 1974, 1982, 1992; Birnbaum & Mellers, 1983; Greenwald, 1976; Grice, 1966), the implications of this thesis may not yet be fully appreciated by researchers. This article uses a simple example to illustrate how difficult it is to compare judgments between subjects. When different groups of people judge a stimulus, the response by a given person on a specific occasion is theorized to be a function of subjective value: $R_{ik} = J_k(s_i)$
TL;DR: This paper alerts potential users of ANCOVA to the need to center the covariate measures when the design contains within-subject factors, and indicates how they can avoid biases when one cannot assume that the expected value of the covariate measure is the same for all of the groups in a classification design.
Abstract: A number of statistical textbooks recommend using an analysis of covariance (ANCOVA) to control for the effects of extraneous factors that might influence the dependent measure of interest. However, it is not generally recognized that serious problems of interpretation can arise when the design contains comparisons of participants sampled from different populations (classification designs). Designs that include a comparison of younger and older adults, or a comparison of musicians and non-musicians, are examples of classification designs. In such cases, estimates of differences among groups can be contaminated by differences in the covariate population means across groups. A second problem of interpretation will arise if the experimenter fails to center the covariate measures (subtracting the mean covariate score from each covariate score) whenever the design contains within-subject factors. Unless the covariate measures on the participants are centered, estimates of within-subject factors are distorted, and significant increases in Type I error rates and/or losses in power can occur when evaluating the effects of within-subject factors. This paper (1) alerts potential users of ANCOVA to the need to center the covariate measures when the design contains within-subject factors, and (2) indicates how they can avoid biases when one cannot assume that the expected value of the covariate measure is the same for all of the groups in a classification design.
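The centering step can be illustrated with a small sketch. It is not the paper's analysis: the data are hypothetical and noise-free, and a separate least-squares line is fitted for each of two within-subject conditions. When the covariate slope differs between conditions, the difference between the fitted intercepts is the condition effect evaluated at a covariate value of zero. That equals the effect at the sample mean of the covariate only after centering.

```python
from statistics import mean


def center(values):
    """Subtract the sample mean from every covariate score."""
    m = mean(values)
    return [v - m for v in values]


def ols_line(x, y):
    """Return (intercept, slope) of the least-squares line of y on x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


# Hypothetical covariate (e.g., age) for five participants, each measured
# under two within-subject conditions A and B (made-up, noise-free scores).
age = [20.0, 30.0, 40.0, 50.0, 60.0]
score_a = [10.0 + 0.1 * a for a in age]
score_b = [12.0 + 0.3 * a for a in age]

# Observed mean difference between the two conditions.
mean_diff = mean(score_b) - mean(score_a)

# Uncentered covariate: intercepts describe a participant of age 0.
int_a_raw, _ = ols_line(age, score_a)
int_b_raw, _ = ols_line(age, score_b)
effect_uncentered = int_b_raw - int_a_raw

# Centered covariate: intercepts describe a participant of average age.
age_c = center(age)
int_a_c, _ = ols_line(age_c, score_a)
int_b_c, _ = ols_line(age_c, score_b)
effect_centered = int_b_c - int_a_c

print(mean_diff, effect_uncentered, effect_centered)
```

With these made-up numbers the uncentered intercept difference is far smaller than the observed mean difference between conditions, while the centered intercept difference matches it.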
TL;DR: In this paper, a general population online survey was used to evaluate the between-subject design and its potential to reduce social desirability bias, using a split-half design in which the sensitive dimension in the vignette texts was varied either within or between subjects.
Abstract: Factorial survey designs have gained increasing popularity within the social sciences. Compared to single-item questions, the method allows the researcher to model more realistic, multidimensional decision scenarios. Furthermore, it has been argued that assessing sensitive dimensions in factorial surveys can help to overcome social desirability bias.
One rarely used implementation mode is the between-subject design, in which the sensitive dimension varies only between respondents. This method is assumed to attract less attention than a design based on the usual within-subject implementation, where respondents see variations on the sensitive dimension among their vignettes. In order to empirically evaluate the between-subject design and its potential to reduce social desirability bias, we conducted an experiment within a general population online survey. Using a split-half design, the sensitive dimension in the vignette texts was varied either within or between subjects. More precisely, the factorial survey module under study assessed respondents’ judgements on just fees for early childcare. Among other dimensions, the vignette texts included the child’s religious denomination (Christian, Muslim, none) as one possible attribute on which discrimination can be based. The split-half approach allows us to compare the widely used within-subject design to the alternative between-subject approach. Furthermore, data on respondent characteristics are used to obtain insights about differential design effects for different education groups (differential social desirability bias) and respondents from different religious backgrounds (ingroup favouritism). While results concerning a differential social desirability bias were inconclusive, we found evidence for ingroup favouritism from respondents without a religious denomination in the between-subject condition. In general, our findings suggest that the between-subject design is a suitable method for reducing social desirability bias in factorial surveys.