Measurement in STEM education research: a systematic literature review of trends in the psychometric evidence of scales
TL;DR: In this article, the authors identify characteristics, trends, and gaps in measurement in Science, Technology, Engineering, and Mathematics (STEM) education research, focusing on the psychometric development of scales developed on college/university students for the context of post-secondary STEM education.
Abstract:
Background: The objective of this systematic review is to identify characteristics, trends, and gaps in measurement in Science, Technology, Engineering, and Mathematics (STEM) education research.
Methods: We searched across several peer-reviewed sources, including a book, similar systematic reviews, conference proceedings, one online repository, and four databases that index the major STEM education research journals. We included empirical studies that reported on psychometric development of scales developed on college/university students for the context of post-secondary STEM education in the US. We excluded studies examining scales that ask about specific content knowledge and contain fewer than three items. Results were synthesized using descriptive statistics.
Results: Our final sample included N = 82 scales across N = 72 studies. Participants in the sampled studies were majority female and White, most scales were developed in an unspecified STEM/science and engineering context, and the most frequently measured construct was attitudes. Internal structure validity emerged as the most prominent validity evidence, with exploratory factor analysis (EFA) and confirmatory factor analysis (CFA) being the most common. Reliability evidence was dominated by internal consistency evidence in the form of Cronbach’s alpha, with other forms being scarcely reported, if at all.
Discussion: Limitations include a focus only on scales developed in the United States and in post-secondary contexts, which limits the scope of the systematic review. Our findings demonstrate that when developing scales for STEM education research, many types of psychometric properties, such as differential item functioning, test–retest reliability, and discriminant validity, are scarcely reported. Furthermore, many scales report only internal structure validity (EFA and/or CFA) and Cronbach’s alpha, which alone are not sufficient evidence. We encourage researchers to look towards the full spectrum of psychometric evidence both when choosing scales to use and when developing their own. While constructs such as attitudes and disciplines such as engineering were dominant in our sample, future work can fill in the gaps by developing scales for disciplines such as geosciences and by examining constructs such as engagement, self-efficacy, and perceived fit.
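Since the abstract reports Cronbach’s alpha as the dominant form of reliability evidence, a small worked example may help readers who have not computed it themselves. The sketch below is not taken from the paper: the function name `cronbach_alpha` and the response data are hypothetical, and it uses the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of total scores) with sample variances.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of respondents, each a list of k item scores.

    Hypothetical helper for illustration; uses sample variances (n - 1).
    """
    if len(responses) < 2:
        raise ValueError("need at least two respondents")
    k = len(responses[0])
    if k < 2:
        raise ValueError("need at least two items")
    if any(len(row) != k for row in responses):
        raise ValueError("every respondent must answer all k items")

    # Variance of each item (column) across respondents.
    item_variances = [variance(col) for col in zip(*responses)]
    # Variance of each respondent's total score.
    total_variance = variance([sum(row) for row in responses])

    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Made-up responses: 5 respondents, 3 Likert-type items.
scores = [
    [4, 5, 4],
    [2, 3, 2],
    [5, 5, 4],
    [3, 3, 3],
    [1, 2, 2],
]
alpha = cronbach_alpha(scores)
print(f"Cronbach's alpha: {alpha:.3f}")
```

A high alpha here only indicates that the made-up items covary strongly. As the abstract notes, internal consistency alone is not sufficient psychometric evidence.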
Citations
Couplet scoring for research based assessment instruments
Michael Vignal, Gayle Geschwind, Marcos D. Caballero, Heather Lewandowski +3 more
- 06 Jul 2023
TL;DR: Couplet scoring as discussed by the authors employs the couplet as an alternative unit of assessment, where a couplet is essentially an item viewed and scored through the lens of a specific assessment objective (AO).
Evaluation of the Open-Ended Green Chemistry Generic Comparison (GC)² Prompt for Probing Student Conceptions about the Greenness of a Chemical Reaction
Krystal Grieger, Alexey Leontyev +1 more
TL;DR: This study evaluates the Green Chemistry Generic Comparison (GC)² prompt's ability to assess student conceptions about green chemistry, finding it sensitive for detecting knowledge gains and suitable for measuring student understanding of green chemistry principles.
STEM-Based Science E-Module: Is It Effective to Improve Students' Creative Thinking Skills?
Wulan Octi Pratiwi, Pramudiyanti Pramudiyanti, Ryzal Perdana +2 more
TL;DR: STEM-based e-modules are effective in improving elementary school students' creative thinking skills about electrical energy.
Propiedades psicométricas de las escalas de competencias investigativas: una revisión sistemática
Calixto Tapullima-Mori, José Livia Segovia, Nieves del Pilar Pizzan Tomanguillo, Sandra Lucero Pizzán Tomanguillo, Milagros Iñipe Cachay, Astrid Irene Saenz Chisquipama, Fiorella Gómez Sangama +6 more
TL;DR: This systematic review examines the psychometric properties of 11 investigative competence scales published between 2014 and 2023, finding adequate factorial validity and high internal consistency, suggesting their effectiveness in evaluating and developing research skills in university students.
Designing and Implementing a Globally Focused Interdisciplinary STEM Program
Moe Debbagh Greene, Yaoying Xu, Jill Elizabeth Blondin +2 more
TL;DR: Designing and implementing a globally-focused interdisciplinary STEM program in preservice teacher education emphasizes collaborative learning, experiential learning, and leveraging instructional technology to enhance student learning.
References
One Size Doesn’t Fit All: Using Factor Analysis to Gather Validity Evidence When Using Surveys in Your Research
TL;DR: This article reviews the aspects of validity that researchers should consider when using surveys and focuses on factor analysis, a statistical method that can be used to collect an important type of validity evidence.
Editorial: Measurement Invariance.
Rens van de Schoot, Peter Schmidt, Alain De Beuckelaer, Kimberley Lek, Mariëlle Zondervan-Zwijnenburg +8 more
TL;DR: The first formal treatment of the different forms of measurement invariance (MI) and their consequences for the validity of multi-group/multi-time comparisons is attributed to Meredith (1993); a more recent book by Millsap (2011) gives a general systematic treatment of the topic.
Measuring Undergraduate Students' Engineering Self‐Efficacy: A Validation Study
TL;DR: In this article, the authors evaluated the factor structure, validity, and reliability of general and skill-specific engineering self-efficacy measures created for use with undergraduate engineering students, and found evidence for the reliability, validity, and predictive utility of the engineering self-efficacy scales.