Detecting differential item functioning using generalized logistic regression in the context of large-scale assessments
TL;DR: In this article, the authors examined item-level differential item functioning (DIF) in the context of international large-scale assessment (ILSA) using a generalized logistic regression approach.
Abstract: When studying student performance across different countries or cultures, score comparability is essential for valid comparisons. In other words, the latent variable (i.e., the construct of interest) must be understood and measured equivalently across all participating groups or countries if our inferences regarding performance are to be regarded as valid. Relatively few studies have examined an item-level approach to measurement equivalence, particularly in settings where a large number of groups is included. This simulation study examines item-level differential item functioning (DIF) in the context of international large-scale assessment (ILSA) using a generalized logistic regression approach. Manipulated factors included the number of groups (10 or 20), the magnitude of DIF, the percent of DIF items, the nature of DIF, and the percent of groups affected by DIF. Results suggested that the number of groups did not have an effect on the performance of the method (high power and low Type I error rates); however, other factors did affect accuracy. Specifically, Type I error rates were inflated in non-DIF conditions, while they were very conservative in all of the DIF conditions. Power was generally high, particularly in conditions where DIF magnitude was large, with one exception: conditions where DIF was introduced in difficulty parameters and the percent of DIF items was 60. Our findings present a mixed picture with respect to the performance of the generalized logistic regression method in the context of a large number of groups with large sample sizes. In the presence of DIF, the method was successful in distinguishing between DIF and non-DIF items, as evidenced by low Type I error and high power rates. In the absence of DIF, however, the method yielded inflated Type I error rates.
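To make the approach in the abstract concrete, the following is a minimal sketch of one common formulation of logistic-regression DIF testing with more than two groups: a baseline model (matching score only) is compared with an augmented model that adds group indicators and group-by-score interactions, using a likelihood ratio test. It is not the authors' implementation. The Rasch-type data generator, the use of the total score as the matching variable, the group and sample sizes, and all function names (simulate, max_loglik, lr_dif_statistic) are assumptions made for illustration only.

```python
import math
import random

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def linear_predictor(beta, xi):
    """Linear predictor, clamped to avoid overflow in exp()."""
    return max(-30.0, min(30.0, sum(b * x for b, x in zip(beta, xi))))

def max_loglik(X, y, iters=25):
    """Fit a logistic regression by Newton-Raphson; return the maximized log-likelihood."""
    p = len(X[0])
    beta = [0.0] * p
    for _ in range(iters):
        grad = [0.0] * p
        # Tiny ridge on the diagonal keeps the Hessian invertible.
        hess = [[1e-8 if j == k else 0.0 for k in range(p)] for j in range(p)]
        for xi, yi in zip(X, y):
            mu = 1.0 / (1.0 + math.exp(-linear_predictor(beta, xi)))
            for j in range(p):
                grad[j] += (yi - mu) * xi[j]
                for k in range(p):
                    hess[j][k] += mu * (1.0 - mu) * xi[j] * xi[k]
        step = solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < 1e-8:
            break
    ll = 0.0
    for xi, yi in zip(X, y):
        eta = linear_predictor(beta, xi)
        ll += yi * eta - math.log1p(math.exp(eta))
    return ll

def simulate(n_groups=3, n_per_group=500, n_items=10,
             dif_item=0, dif_group=2, dif_size=1.0, seed=1):
    """Hypothetical Rasch-type data: one item is harder by dif_size in one group."""
    rng = random.Random(seed)
    difficulties = [-1.0 + 2.0 * i / (n_items - 1) for i in range(n_items)]
    groups, responses = [], []
    for g in range(n_groups):
        for _ in range(n_per_group):
            theta = rng.gauss(0.0, 1.0)
            row = []
            for i, b in enumerate(difficulties):
                shift = dif_size if (i == dif_item and g == dif_group) else 0.0
                prob = 1.0 / (1.0 + math.exp(-(theta - b - shift)))
                row.append(1 if rng.random() < prob else 0)
            groups.append(g)
            responses.append(row)
    return groups, responses

def lr_dif_statistic(groups, responses, item, n_groups):
    """Likelihood ratio statistic: score-only model vs. score + group + group x score."""
    y = [row[item] for row in responses]
    # Matching variable: total score, centred and rescaled (an assumption of this sketch).
    score = [(sum(row) - len(row) / 2.0) / 2.0 for row in responses]
    X0 = [[1.0, s] for s in score]
    X1 = []
    for g, s in zip(groups, score):
        dummies = [1.0 if g == k else 0.0 for k in range(1, n_groups)]
        X1.append([1.0, s] + dummies + [s * d for d in dummies])
    return 2.0 * (max_loglik(X1, y) - max_loglik(X0, y))

N_GROUPS = 3
CRITICAL_CHI2_DF4 = 9.488  # chi-square critical value, df = 2 * (N_GROUPS - 1) = 4, alpha = 0.05

groups, responses = simulate(n_groups=N_GROUPS)
lr_dif = lr_dif_statistic(groups, responses, item=0, n_groups=N_GROUPS)    # item simulated with DIF
lr_clean = lr_dif_statistic(groups, responses, item=5, n_groups=N_GROUPS)  # item simulated without DIF
print("DIF item LR statistic:", round(lr_dif, 2), "flagged:", lr_dif > CRITICAL_CHI2_DF4)
print("Non-DIF item LR statistic:", round(lr_clean, 2), "flagged:", lr_clean > CRITICAL_CHI2_DF4)
```

With G groups, the augmented model adds 2(G - 1) parameters, so the statistic is referred to a chi-square distribution with that many degrees of freedom. The group indicators capture uniform DIF (a difficulty shift) and the interaction terms capture nonuniform DIF. A study such as the one summarized here would repeat this test over many replications and conditions to estimate Type I error and power.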
Citations
European Organization for Research and Treatment of Cancer
Victor R. Preedy,Ronald R. Watson +1 more
- 01 Jan 2010
1.2K
Test-Taking Motivation in Education Students: Task Battery Order Affected Within-Test-Taker Effort and Importance
TL;DR: Examination of changes in effort and importance resulting from variations in test battery order and their relations to response processes indicated intraindividual changes in education students’ effort or importance depending on test order but similar mock-exam response processes.
Investigating Item Bias in a CS1 Exam with Differential Item Functioning
Matt J. Davidson,Brett Wortzman,Amy J. Ko,Min Li +3 more
- 03 Mar 2021
TL;DR: In this article, the authors used differential item functioning (DIF) methods and specifically investigated bias related to binary gender and year of study on a final exam in a large CS1 course.
9
Impact of differential item functioning on group score reporting in the context of large-scale assessments
TL;DR: In this article, the authors investigated the potential impact of differential item functioning on group-level mean and standard deviation estimates using empirical and simulated data in the context of large-scale assessment, and found that the DIF adjustment reduced the bias by 50% on average.