TL;DR: The authors present experimental and quasi-experimental designs, including designs that use control groups and pretest observations on the outcome, together with methods of generalized causal inference for single and multiple studies and a critical assessment of their own assumptions.
Abstract: 1. Experiments and Generalized Causal Inference 2. Statistical Conclusion Validity and Internal Validity 3. Construct Validity and External Validity 4. Quasi-Experimental Designs That Either Lack a Control Group or Lack Pretest Observations on the Outcome 5. Quasi-Experimental Designs That Use Both Control Groups and Pretests 6. Quasi-Experimentation: Interrupted Time Series Designs 7. Regression Discontinuity Designs 8. Randomized Experiments: Rationale, Designs, and Conditions Conducive to Doing Them 9. Practical Problems 1: Ethics, Participant Recruitment, and Random Assignment 10. Practical Problems 2: Treatment Implementation and Attrition 11. Generalized Causal Inference: A Grounded Theory 12. Generalized Causal Inference: Methods for Single Studies 13. Generalized Causal Inference: Methods for Multiple Studies 14. A Critical Assessment of Our Assumptions
TL;DR: It is feasible to develop a checklist that assesses the methodological quality of both randomised controlled trials and non-randomised studies, and that provides a profile of each paper, alerting reviewers to its particular methodological strengths and weaknesses.
Abstract: OBJECTIVE: To test the feasibility of creating a valid and reliable checklist with the following features: appropriate for assessing both randomised and non-randomised studies; provision of both an overall score for study quality and a profile of scores not only for the quality of reporting, internal validity (bias and confounding) and power, but also for external validity. DESIGN: A pilot version was first developed, based on epidemiological principles, reviews, and existing checklists for randomised studies. Face and content validity were assessed by three experienced reviewers and reliability was determined using two raters assessing 10 randomised and 10 non-randomised studies. Using different raters, the checklist was revised and tested for internal consistency (Kuder-Richardson 20), test-retest and inter-rater reliability (Spearman correlation coefficient and sign rank test; kappa statistics), criterion validity, and respondent burden. MAIN RESULTS: The performance of the checklist improved considerably after revision of a pilot version. The Quality Index had high internal consistency (KR-20: 0.89) as did the subscales apart from external validity (KR-20: 0.54). Test-retest (r 0.88) and inter-rater (r 0.75) reliability of the Quality Index were good. Reliability of the subscales varied from good (bias) to poor (external validity). The Quality Index correlated highly with an existing, established instrument for assessing randomised studies (r 0.90). There was little difference between its performance with non-randomised and with randomised studies. Raters took about 20 minutes to assess each paper (range 10 to 45 minutes). CONCLUSIONS: This study has shown that it is feasible to develop a checklist that can be used to assess the methodological quality not only of randomised controlled trials but also non-randomised studies. 
It has also shown that it is possible to produce a checklist that provides a profile of the paper, alerting reviewers to its particular methodological strengths and weaknesses. Further work is required to improve the checklist and the training of raters in the assessment of external validity.
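The internal-consistency figures quoted above (KR-20) can be reproduced in principle with a few lines of code. The following is a minimal sketch, assuming every checklist item is scored 0 or 1 and that the population variance of the total scores is used; the function name `kr20` and the data in `example_scores` are hypothetical and are not taken from the study.

```python
from statistics import pvariance


def kr20(responses):
    """Kuder-Richardson 20 for rows of dichotomous (0/1) item scores.

    Each row is one assessed paper; each column is one checklist item.
    Assumption: population variance of the total scores is used.
    """
    n_rows = len(responses)
    n_items = len(responses[0])
    totals = [sum(row) for row in responses]
    total_var = pvariance(totals)
    if n_items < 2 or total_var == 0:
        raise ValueError("KR-20 needs at least two items and non-zero score variance")

    # Sum of p * (1 - p) over items, where p is the proportion scoring 1.
    pq_sum = 0.0
    for j in range(n_items):
        p = sum(row[j] for row in responses) / n_rows
        pq_sum += p * (1 - p)

    return (n_items / (n_items - 1)) * (1 - pq_sum / total_var)


# Hypothetical scores: five papers rated on four yes/no items.
example_scores = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
print(round(kr20(example_scores), 3))
```

Values close to 1 indicate that the items tend to rise and fall together across papers, which is the sense in which the abstract reports high internal consistency for the overall index and a lower value for the external validity subscale.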
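TL;DR: The table of contents of a social research methods text, covering the scientific and ethical contexts of social research, research design, methods of data collection, and data processing, analysis, and interpretation; Chapters 2-17 each end with a summary.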
Abstract: Chapters 2-17 end with a Summary.
CHAPTER 1. INTRODUCTION Why Study Research Methods? Methodological Approaches to the Social World Conclusions
I. THE SCIENTIFIC AND ETHICAL CONTEXTS OF SOCIAL RESEARCH
CHAPTER 2. THE NATURE OF SCIENCE The Aim of Science Science as Product Science as Process Science: Ideal versus Reality
CHAPTER 3. RESEARCH ETHICS Data Collection and Analysis Treatment of Human Subjects Making Ethical Decisions The Uses of Research: Science and Society
II. RESEARCH DESIGN
CHAPTER 4. ELEMENTS OF RESEARCH DESIGN Origins of Research Topics Units of Analysis Variables Relationships Formulating Questions and Hypotheses Research Purposes and Research Design Stages of Social Research
CHAPTER 5. MEASUREMENT The Measurement Process Levels of Measurement Reliability and Validity Reliability Assessment Validity Assessment A Final Note on Reliability and Validity
CHAPTER 6. SAMPLING Why Sample? Population Definition Sampling Designs Probability Sampling Nonprobability Sampling Other Sampling Designs Factors Affecting Choice of Sampling Design Factors Determining Sample Size Final Notes on Sampling Errors and Generalizability
III. METHODS OF DATA COLLECTION
CHAPTER 7. EXPERIMENTATION The Logic of Experimentation Staging Experiments The Experiment as a Social Occasion Experimentation Outside the Laboratory
CHAPTER 8. EXPERIMENTAL DESIGNS Threats to Internal Validity Pre-experimental Designs True Experimental Designs Factorial Experimental Designs Quasi-experimental Designs
CHAPTER 9. SURVEY RESEARCH General Features of Survey Research The Uses and Limitations of Surveys Survey Research Designs Steps in Survey Research: Planning Face-to-Face and Telephone Interviewing Paper-and-Pencil Mailed Questionnaires Computer-Assisted Interviews Mixed-Mode Surveys Field Administration
CHAPTER 10. SURVEY INSTRUMENTATION The Survey as a Social Occasion Materials Available to the Survey Designer "Sketches" or Preliminaries Filling in the Sketch: Writing the Items Pretesting
CHAPTER 11. FIELD RESEARCH The Potentials and Limitations of Field Research Research Design and Sampling Field Observation Field Interviewing Stages of Field Research
CHAPTER 12. RESEARCH USING AVAILABLE DATA Sources of Available Data Advantages of Research Using Available Data General Methodological Issues in Available-Data Research Historical Analysis Content Analysis
CHAPTER 13. MULTIPLE METHODS Triangulation Multiple Measures of Concepts within the Same Study Multiple Tests of Hypotheses across Different Studies A Comparison of the Four Basic Approaches to Social Research Meta-Analysis
CHAPTER 14. EVALUATION RESEARCH Framework and Sample Studies Types of Evaluation Research Methodological Issues in Evaluation Research The Social and Political Context of Evaluation Research
IV. DATA PROCESSING, ANALYSIS, AND INTERPRETATION
CHAPTER 15. DATA PROCESSING AND ELEMENTARY DATA ANALYSIS Preview of Analysis Steps Data Processing Data Matrices and Documentation The Functions of Statistics in Social Research Inspecting and Modifying the Data Preliminary Hypothesis Testing
CHAPTER 16. MULTIVARIATE ANALYSIS Modeling Relationships Elaboration: Tables and Beyond Multiple-Regression Analysis Other Modeling Techniques
CHAPTER 17. WRITING RESEARCH REPORTS Searching the Literature Using the Internet Using the Library Outlining and Preparing to Write Major Headings Other Considerations Length
TL;DR: The concept of study quality and the methods used to assess it are discussed, against the background of ongoing debate over how quality assessment should be carried out and incorporated into systematic reviews and meta-analysis.
Abstract: This is the first in a series of four articles
The quality of controlled trials is of obvious relevance to systematic reviews. If the “raw material” is flawed then the conclusions of systematic reviews cannot be trusted. Many reviewers formally assess the quality of primary trials by following the recommendations of the Cochrane Collaboration and other experts.1,2 However, the methodology for both the assessment of quality and its incorporation into systematic reviews and meta-analysis is a matter of ongoing debate.3-5 In this article we discuss the concept of study quality and the methods used to assess quality.
#### Components of internal and external validity of controlled clinical trials
Internal validity: extent to which systematic error (bias) is minimised in clinical trials
Quality is a multidimensional concept, which could relate to the design, conduct, and analysis of a trial, its clinical relevance, or quality of reporting.6 The validity of the findings generated by a study clearly is an important dimension of quality. In the 1950s the social scientist Campbell proposed a useful distinction between internal and external validity (see box below).7,8 Internal validity implies that the differences observed between groups of patients allocated to different …
TL;DR: The inability of case-mix adjustment methods to compensate for selection bias and the inability to identify non-randomised studies that are free of selection bias indicate that non-randomised studies should only be undertaken when RCTs are infeasible or unethical.
Abstract: OBJECTIVES: To consider methods and related evidence for evaluating bias in non-randomised intervention studies. DATA SOURCES: Systematic reviews and methodological papers were identified from a search of electronic databases, handsearches of key medical journals and contact with experts working in the field. New empirical studies were conducted using data from two large randomised clinical trials. METHODS: Three systematic reviews and new empirical investigations were conducted. The reviews considered, in regard to non-randomised studies, (1) the existing evidence of bias, (2) the content of quality assessment tools, and (3) the ways that study quality has been assessed and addressed. The empirical investigations generated non-randomised studies from two large, multicentre randomised controlled trials (RCTs) by selectively resampling trial participants according to allocated treatment, centre and period. RESULTS: In the systematic reviews, eight studies compared results of randomised and non-randomised studies across multiple interventions using meta-epidemiological techniques. A total of 194 tools were identified that could be or had been used to assess non-randomised studies. Sixty tools covered at least five of six pre-specified internal validity domains. Fourteen tools covered three of four core items of particular importance for non-randomised studies. Six tools were thought suitable for use in systematic reviews. Of 511 systematic reviews that included non-randomised studies, only 169 (33%) assessed study quality. Sixty-nine reviews investigated the impact of quality on study results in a quantitative manner. The new empirical studies estimated the bias associated with non-random allocation and found that the bias could lead to consistent over- or underestimations of treatment effects. The bias also increased variation in results for both historical and concurrent controls, owing to haphazard differences in case-mix between groups. 
The biases were large enough to lead studies falsely to conclude significant findings of benefit or harm. Four strategies for case-mix adjustment were evaluated: none adequately adjusted for bias in historically and concurrently controlled studies. Logistic regression on average increased bias. Propensity score methods performed better, but were not satisfactory in most situations. Detailed investigation revealed that adequate adjustment can only be achieved in the unrealistic situation when selection depends on a single factor. CONCLUSIONS: Results of non-randomised studies sometimes, but not always, differ from results of randomised studies of the same intervention. Non-randomised studies may still give seriously misleading results when treated and control groups appear similar in key prognostic factors. Standard methods of case-mix adjustment do not guarantee removal of bias. Residual confounding may be high even when good prognostic data are available, and in some situations adjusted results may appear more biased than unadjusted results. Although many quality assessment tools exist and have been used for appraising non-randomised studies, most omit key quality domains. Healthcare policies based upon non-randomised studies or systematic reviews of non-randomised studies may need re-evaluation if the uncertainty in the true evidence base was not fully appreciated when policies were made. The inability of case-mix adjustment methods to compensate for selection bias and our inability to identify non-randomised studies that are free of selection bias indicate that non-randomised studies should only be undertaken when RCTs are infeasible or unethical. 
Recommendations for further research include: applying the resampling methodology in other clinical areas to ascertain whether the biases described are typical; developing or refining existing quality assessment tools for non-randomised studies; investigating how quality assessments of non-randomised studies can be incorporated into reviews and the implications of individual quality features for interpretation of a review's results; examination of the reasons for the apparent failure of case-mix adjustment methods; and further evaluation of the role of the propensity score.
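The resampling idea in the abstract above, building a non-randomised comparison out of randomised trial data, can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions, not the authors' procedure: the trial is synthetic, the two centres "A" and "B", their baseline outcomes, the effect size of 1.0, and all function names are hypothetical, and only a concurrent-control comparison is shown.

```python
import random
from statistics import mean


def simulate_trial(n_per_arm=2000, effect=1.0, seed=0):
    """Synthetic two-centre RCT with equal allocation within each centre.

    Assumption: centres differ in baseline outcome (a stand-in for case-mix).
    Returns rows of (centre, treated, outcome).
    """
    rng = random.Random(seed)
    centre_baseline = {"A": 2.0, "B": 0.0}  # hypothetical case-mix difference
    rows = []
    for centre, base in centre_baseline.items():
        for treated in (0, 1):
            for _ in range(n_per_arm):
                outcome = base + effect * treated + rng.gauss(0.0, 1.0)
                rows.append((centre, treated, outcome))
    return rows


def mean_difference(treated_rows, control_rows):
    """Unadjusted treatment effect estimate: difference in mean outcomes."""
    return mean(r[2] for r in treated_rows) - mean(r[2] for r in control_rows)


trial = simulate_trial()

# Randomised comparison: all treated versus all controls, across both centres.
rct_estimate = mean_difference(
    [r for r in trial if r[1] == 1],
    [r for r in trial if r[1] == 0],
)

# Resampled non-randomised comparison with concurrent controls:
# treated participants from centre A only, controls from centre B only.
nonrandom_estimate = mean_difference(
    [r for r in trial if r[0] == "A" and r[1] == 1],
    [r for r in trial if r[0] == "B" and r[1] == 0],
)

print(f"randomised estimate:     {rct_estimate:.2f}")
print(f"non-randomised estimate: {nonrandom_estimate:.2f}")
```

In this sketch the randomised estimate stays near the simulated effect, while the non-randomised estimate also absorbs the baseline difference between centres, which is the kind of case-mix bias the abstract describes.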