Detecting differential item functioning using generalized logistic regression in the context of large-scale assessments

doi:10.1186/S40536-014-0004-5

Open Access Journal Article

Dubravka Svetina, +1 more

- 26 Jun 2014

- Large-scale Assessments in Education

- Vol. 2, Iss: 1, pp 4

TL;DR: In this article, the authors examined item-level differential item functioning (DIF) in the context of international large-scale assessment (ILSA) using a generalized logistic regression approach.

Abstract: When studying student performance across different countries or cultures, an important aspect of any comparison is score comparability. In other words, the latent variable (i.e., the construct of interest) must be understood and measured equivalently across all participating groups or countries if our inferences regarding performance are to be regarded as valid. Relatively few studies have examined an item-level approach to measurement equivalence, particularly in settings where a large number of groups is included. This simulation study examines item-level differential item functioning (DIF) in the context of international large-scale assessment (ILSA) using a generalized logistic regression approach. Manipulated factors included the number of groups (10 or 20), the magnitude of DIF, the percent of DIF items, the nature of DIF, and the percent of groups affected by DIF. Results suggested that the number of groups did not have an effect on the performance of the method (high power and low Type I error rates); however, other factors did affect its accuracy. Specifically, Type I error rates were inflated in non-DIF conditions, while they were very conservative in all of the DIF conditions. Power was generally high, particularly in conditions where DIF magnitude was large, with one exception: conditions where DIF was introduced in difficulty parameters and the percent of DIF items was 60. Our findings present a mixed picture of the performance of the generalized logistic regression method in the context of a large number of groups with large sample sizes. In the presence of DIF, the method was successful in distinguishing between DIF and non-DIF items, as evidenced by low Type I error and high power rates. In the absence of DIF, however, the method yielded increased Type I errors.
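As an illustration of the kind of test the abstract refers to, the following is a minimal sketch of a multi-group logistic regression DIF check. It is not the authors' code and does not reproduce their simulation design. The assumptions are: responses are generated from a Rasch-type model, the matching variable is the standardized total score on a set of DIF-free anchor items, DIF is a difficulty shift in a single group, and the test is a likelihood ratio comparison of a model with only the matching score against a model that adds group intercepts and group-by-score slopes. All function names (`solve`, `fit_logistic`, `chi2_sf_even_df`, `simulate`, `dif_lr_test`) and all parameter values are hypothetical, and three groups are used instead of 10 or 20 to keep the example fast.

```python
import math
import random


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x


def fit_logistic(X, y, iters=30):
    """Newton-Raphson logistic regression; returns the maximized log-likelihood."""
    p = len(X[0])
    beta = [0.0] * p
    for _ in range(iters):
        grad = [0.0] * p
        hess = [[0.0] * p for _ in range(p)]
        for xi, yi in zip(X, y):
            mu = 1.0 / (1.0 + math.exp(-sum(b * v for b, v in zip(beta, xi))))
            for j in range(p):
                grad[j] += (yi - mu) * xi[j]
                for k in range(p):
                    hess[j][k] += mu * (1.0 - mu) * xi[j] * xi[k]
        step = solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < 1e-8:
            break
    etas = [sum(b * v for b, v in zip(beta, xi)) for xi in X]
    return sum(yi * e - math.log1p(math.exp(e)) for yi, e in zip(y, etas))


def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid for even df only."""
    half, term, total = x / 2.0, 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def simulate(n_groups=3, n_per_group=400, n_anchor=20, dif_shift=1.0, seed=1):
    """Simulate one studied item; group 1 gets a difficulty shift (uniform DIF)."""
    rng = random.Random(seed)
    anchors = [rng.uniform(-1.5, 1.5) for _ in range(n_anchor)]
    scores, groups, y = [], [], []
    for g in range(n_groups):
        for _ in range(n_per_group):
            theta = rng.gauss(0.0, 1.0)
            total = sum(rng.random() < 1 / (1 + math.exp(-(theta - b))) for b in anchors)
            b_item = dif_shift if g == 1 else 0.0
            y.append(1 if rng.random() < 1 / (1 + math.exp(-(theta - b_item))) else 0)
            scores.append(total)
            groups.append(g)
    return scores, groups, y


def dif_lr_test(scores, groups, y):
    """Likelihood ratio test of all group intercept and slope effects at once."""
    n_groups = max(groups) + 1
    mean = sum(scores) / len(scores)
    sd = math.sqrt(sum((s - mean) ** 2 for s in scores) / len(scores))
    z = [(s - mean) / sd for s in scores]
    reduced = [[1.0, s] for s in z]
    full = []
    for s, g in zip(z, groups):
        dummies = [1.0 if g == k else 0.0 for k in range(1, n_groups)]  # group 0 is the reference
        full.append([1.0, s] + dummies + [d * s for d in dummies])
    lr = 2.0 * (fit_logistic(full, y) - fit_logistic(reduced, y))
    df = 2 * (n_groups - 1)  # one intercept and one slope per non-reference group
    return lr, df, chi2_sf_even_df(max(lr, 0.0), df)


if __name__ == "__main__":
    lr, df, p = dif_lr_test(*simulate(dif_shift=1.0))
    print(f"LR = {lr:.2f}, df = {df}, p = {p:.4g}")
```

Testing intercept and slope effects jointly keeps the degrees of freedom even, which is why the closed-form chi-square tail is sufficient here. A test of uniform DIF alone would need a general chi-square routine. Estimating Type I error or power as in the study would mean repeating the simulation many times per condition and recording the proportion of rejections.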
