Modeling Rater Effects and Complex Learning Progressions using Item Response Models

Open Access

- 01 Jan 2015

TL;DR: This dissertation investigates extensions and applications of multilevel and multidimensional item response models, with a primary focus on detecting rater effects in double-scored performance assessments, monitoring human raters with an automated scoring engine, and developing measurement models for complicated learning progressions.

Author(s): Shin, Hyo Jeong | Advisor(s): Wilson, Mark

Abstract: This dissertation comprises three papers that propose and apply psychometric models to deal with complexities and challenges in large-scale assessments, focusing on modeling rater effects and complex learning progressions. In particular, the three papers investigate extensions and applications of multilevel and multidimensional item response models, with a primary focus on (1) detecting rater effects in double-scored performance assessments, (2) monitoring human raters with an automated scoring engine, and (3) developing measurement models for complicated learning progressions.

The first paper applies and assesses the trifactor model for multiple-ratings data in double-scored performance assessments, in which two different raters give independent scores for the same responses (e.g., the GRE essay). The trifactor model incorporates a cross-classified structure (e.g., items and raters) in addition to the general dimension (e.g., examinees). The paper includes a simulation design that follows the GRE example to reflect the incompleteness and imbalance found in real-world assessments. The simulations investigate the effects of the missingness rate in the data and of ignoring differences among the raters. The use of the trifactor model is illustrated with empirical data.

The second paper applies mixed-effects ordered probit models to examine the effectiveness and efficiency of using scores from automated scoring engines (AE), compared with scores from human experts (HE), to monitor and provide diagnostic feedback to human raters under training. Using data from a real rater training study, three types of rater effects (severity, accuracy, and centrality of each rater) are related to model parameters and compared for the cases (a) when the AE score is treated as the true score and (b) when the HE score is treated as the true score.

The third paper proposes a structured constructs model based on change-point analysis to deal with complicated learning progressions, in which relations between levels across multiple constructs are assumed in advance. Based on change-point analysis and on reparameterizations of the multidimensional Rasch model and the partial credit model, cut score parameters and discontinuity parameters are incorporated to classify examinees into the levels of the learning progressions, and to model the hypothesized relations as the advantage that examinees belonging to a certain level in one construct have in reaching a level in another construct. Parameter recovery of the proposed model and the consequences of ignoring the hypothesized relations are assessed using simulations. The use of the proposed model is illustrated with empirical data and interpreted as contributing to validity evidence for the hypothesized relations.
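For readers unfamiliar with the partial credit model mentioned above, the following minimal Python sketch computes its category probabilities in the standard textbook form. It is not the structured constructs model proposed in the dissertation: the cut score and discontinuity parameters are not reproduced here. The function name `pcm_probabilities` and the ability and step difficulty values are hypothetical, chosen only for illustration.

```python
import math


def pcm_probabilities(theta, step_difficulties):
    """Category probabilities for one item under the partial credit model.

    theta: examinee ability (hypothetical value supplied by the caller).
    step_difficulties: step parameters delta_1..delta_m for an item
        scored 0..m (hypothetical values supplied by the caller).
    """
    # Cumulative sums of (theta - delta_j); score 0 has an empty sum of 0.
    cumulative = [0.0]
    for delta in step_difficulties:
        cumulative.append(cumulative[-1] + (theta - delta))

    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(cumulative)
    weights = [math.exp(c - peak) for c in cumulative]
    total = sum(weights)
    return [w / total for w in weights]


# Illustrative call: a four-category item (scores 0 to 3).
probs = pcm_probabilities(theta=0.5, step_difficulties=[-1.0, 0.0, 1.0])
```

With a single step difficulty the function reduces to the dichotomous Rasch model, so an examinee whose ability equals the step difficulty has probability 0.5 of scoring 1.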

