Assessing mathematical problem solving using comparative judgement
TL;DR: The authors present two studies testing comparative judgement (CJ), an alternative to traditional scoring that relies on collective expert judgements of students' work rather than item-by-item scoring schemes and that might be well suited to assessing mathematical problem solving.
Abstract: There is an increasing demand from employers and universities for school leavers to be able to apply their mathematical knowledge to problem solving in varied and unfamiliar contexts. These aspects are, however, neglected in most examinations of mathematics and, consequently, in classroom teaching. One barrier to the inclusion of mathematical problem solving in assessment is that the skills involved are difficult to define and assess objectively. We present two studies that test a method called comparative judgement (CJ) that might be well suited to assessing mathematical problem solving. CJ is an alternative to traditional scoring that is based on collective expert judgements of students’ work rather than item-by-item scoring schemes. In study 1, we used CJ to assess traditional mathematics tests and found it performed validly and reliably. In study 2, we used CJ to assess mathematical problem-solving tasks and again found it performed validly and reliably. We discuss the implications of the results for further research and the implications of CJ for the design of mathematical problem-solving tasks.
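To make the idea of scoring from collective pairwise judgements concrete, the sketch below turns a list of "script X was judged better than script Y" decisions into one parameter per script. It assumes a Bradley-Terry model fitted with a simple iterative update, which is one common choice for comparative judgement data; the abstract does not say which fitting procedure the authors used. The function name `fit_bradley_terry`, the `smoothing` pseudo-count, and the example judgements are all hypothetical.

```python
import math
from collections import defaultdict


def fit_bradley_terry(judgements, n_iter=500, smoothing=0.1):
    """Estimate a log-strength parameter per script from pairwise judgements.

    judgements: list of (winner, loser) script identifiers.
    smoothing:  must be > 0; adds that many virtual wins in each direction
                for every compared pair, so that a script that won (or lost)
                all of its comparisons still gets a finite estimate.
                This is an illustrative assumption, not part of the paper.
    Returns a dict mapping script id -> parameter, centred on zero.
    """
    wins = defaultdict(float)
    opponents = defaultdict(lambda: defaultdict(float))  # comparison counts
    for winner, loser in judgements:
        wins[winner] += 1.0
        opponents[winner][loser] += 1.0
        opponents[loser][winner] += 1.0

    # Apply the smoothing pseudo-counts to every pair that was compared.
    for a in list(opponents):
        for b in list(opponents[a]):
            wins[a] += smoothing
            opponents[a][b] += 2.0 * smoothing

    scripts = sorted(opponents)
    strength = {s: 1.0 for s in scripts}
    for _ in range(n_iter):
        updated = {}
        for i in scripts:
            denom = sum(
                n_ij / (strength[i] + strength[j])
                for j, n_ij in opponents[i].items()
            )
            updated[i] = wins[i] / denom
        # Rescale so the geometric mean of the strengths is 1.
        log_mean = sum(math.log(v) for v in updated.values()) / len(updated)
        scale = math.exp(log_mean)
        strength = {s: v / scale for s, v in updated.items()}

    return {s: math.log(v) for s, v in strength.items()}


if __name__ == "__main__":
    # Hypothetical judgements: each tuple is (preferred script, other script).
    example = [
        ("A", "B"), ("A", "B"), ("B", "A"),
        ("B", "C"), ("B", "C"), ("C", "B"),
        ("A", "C"), ("A", "C"),
    ]
    for script, theta in sorted(fit_bradley_terry(example).items()):
        print(script, round(theta, 3))
```

Ranking the scripts by the returned parameters gives the scaled rank order. Fitting the same scripts separately for two independent groups of judges and correlating the two sets of parameters is one way to check inter-rater reliability.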
Figures

Figure 1a: Reproduction of the “Sports Bag” task from the Bowland assessment materials. 
Figure 6: Correlation of the two groups’ parameters for the Bowland scripts. 
Figure 4: Correlation between the GCSE scripts’ mean parameters and grades. Error bars show the standard deviation of the scripts’ parameters at each grade. 
Figure 3: Correlation of the two groups’ parameters for the GCSE scripts.
Citations
The Role of Problem-Based Learning to Improve Students' Mathematical Problem-Solving Ability and Self Confidence.
TL;DR: The study found that students taught with a PBL approach obtained better grades on mathematical problem-solving ability (MPSA), its gain, and mathematical self-confidence (MSC) than students taught by conventional teaching.
Peer assessment without assessment criteria
Ian Jones, Lara Alcock +1 more
TL;DR: First-year mathematics undergraduates sat a written test on conceptual understanding of multivariable calculus, then assessed their peers' responses using pairwise comparative judgement. The authors found high validity and inter-rater reliability, suggesting that the students performed well as peer assessors.
The problem of assessing problem solving: can comparative judgement help?
Ian Jones, Matthew Inglis +1 more
TL;DR: In this paper, the authors report a study that tested an alternative approach to assessment, called comparative judgement, which may represent a superior method for assessing open-ended questions that encourage a range of unpredictable responses.
Validity of comparative judgement to assess academic writing: examining implications of its holistic character and building on a shared consensus
TL;DR: In this paper, the authors examined the implications of two critical assumptions underpinning the use of comparative judgement: its holistic character, and the idea that the final rank order reflects a shared consensus on what makes for a good essay.
Scale Separation Reliability: What Does It Mean in the Context of Comparative Judgment?
TL;DR: A meta-analysis of 26 CJ assessments shows that the SSR is a good measure of split-half reliability. When the SSR is regarded as expressing a correlation with the truth, the results are mixed.
References
A law of comparative judgment
TL;DR: The law of comparative judgment is applicable not only to the comparison of physical stimulus intensities but also to qualitative comparative judgments, such as those of excellence of specimens in an educational scale.
Applying the Rasch Model: Fundamental Measurement in the Human Sciences
Trevor G. Bond, Christine M. Fox +1 more
- 01 Apr 2001
TL;DR: This volume contends that Rasch measurement is the model of choice because it is the closest to realizing the sort of objective fundamental measurement so long revered in the physical sciences.
The Common Core State Standards for Mathematics
TL;DR: The Common Core State Standards for Mathematics (CCSSM), published in 2010, are a complete collection of mathematics standards, published and reviewed as a ‘common core’, that has been extensively adopted.