How To Lie With Statistics
Edexcel's boast about grading isn't as impressive as it sounds.
22 January 2018
It may feel like a long time ago, but how did you get on in your GCSEs (or O Levels)? Whatever your results, you probably accepted that the grades you were given were right, even if that C you got in history wasn't quite what you'd hoped for.
Exam boards take GCSE grades very seriously. After all, that's what we judge them on. So it's understandable that Pearson (Edexcel) wants to publicise the fact that according to recently published statistics, "99.2% of [our GCSE & A Level] grades were accurate". This, apparently, is a better record than all the other exam boards. It sounds very reassuring.
But what does that 99.2% statistic mean?
It means that of all the papers that Edexcel marked, only 0.8% had the grade changed when the exam was re-marked. That still sounds impressive, until you realise that the only papers that got re-marked were the ones that went to appeal. According to figures I've seen, 69,000 papers went through the appeal process, and of those, a whopping 11,300 had their grades changed. That's about 16% of grades in this sample of re-marked papers that turned out to be 'wrong'.
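For anyone who wants to check the sums, here is a rough sketch in Python. It uses only the figures quoted above, and it makes one assumption of mine: that the 11,300 changed grades are exactly the 0.8% behind the headline figure, which is what lets us back out an approximate total. The variable names are made up for illustration.

```python
# Figures quoted in the post.
remarked = 69_000          # papers that went through the appeal process
changed = 11_300           # of those, papers whose grade was changed
headline_accuracy = 0.992  # the "99.2% accurate" claim

# Share of re-marked papers whose grade changed.
share_changed_among_remarked = changed / remarked

# ASSUMPTION: the 0.8% in the headline is these same 11,300 grades,
# so the total number of grades awarded is implied by the ratio.
implied_total = changed / (1 - headline_accuracy)

# Share of all grades that were never looked at a second time.
share_never_remarked = 1 - remarked / implied_total

print(f"Changed among re-marked papers: {share_changed_among_remarked:.1%}")
print(f"Implied total grades awarded:   {implied_total:,.0f}")
print(f"Never re-marked at all:         {share_never_remarked:.1%}")
```

On these numbers roughly one in six re-marked grades changed, and if the assumption holds, something like 95% of grades were never checked a second time.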
What about the hundreds of thousands of exam papers that didn't get re-marked? Ofqual have made it much harder in recent years to appeal an exam grade. And in any case, when a teenager gets their results and they are told they got a C, while many might feel it wasn't as good as they hoped for, most would have no reason (or desire) to go through the expensive and time-consuming rigmarole of getting it re-checked. Meanwhile the ones who say "blimey I got an A, how did I fluke that??" certainly aren't going to challenge the grade.
The reality, however, is quite shocking. Last year, in a report that seems to have received almost no attention (maybe it's been quietly buried in a dusty filing cabinet), Ofqual revealed that the grades assigned in some exam subjects are extremely unreliable. In English Literature - one of the worst offenders - it was found that if you were to take a random cross-section of exam entries and get a different examiner to mark them, more than 40% (yes FORTY percent!!!) would get a different grade from the one they were first assigned.*
This is not because exam markers are 'poor' or making 'mistakes'. It's because exam marking is to some extent a subjective exercise. Human beings are applying their skill and judgment to assess Little Johnny's essay, and whether it is worth 13/20 or 15/20 is inevitably to some extent a matter of opinion. And expert opinions differ, in all walks of life.
So here's a bizarre way to lie with statistics. Edexcel can claim that 99.2% of the grades they give are 'accurate' (i.e. were never subsequently changed). Yet at the same time, when Little Johnny got grade B in English Literature GCSE, there was a high chance, maybe as high as 50%, that if a different examiner had marked it he would have got an A or a C. Somehow the word 'accurate' doesn't seem so appropriate now.
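To see why the headline figure says so little, here is a small sketch under stated assumptions. If a grade only counts as 'inaccurate' when it is appealed and then changed, the headline number depends on how many papers are appealed as much as on how well they were marked. The function name and the 4.9% appeal rate are my own: the rate comes from dividing 69,000 appeals by a total of about 1.41 million grades, which is itself implied by treating 11,300 changes as 0.8% of all grades.

```python
def headline_accuracy(appealed_fraction, overturn_rate):
    """Share of grades never changed, if only appealed papers are re-marked."""
    return 1 - appealed_fraction * overturn_rate

# ASSUMPTION: about 4.9% of grades were appealed (69,000 out of ~1.41 million).
appealed_fraction = 0.049

# Roughly the situation described: about 16% of appealed grades changed.
as_reported = headline_accuracy(appealed_fraction, 0.164)

# Hypothetical worst case: every single appealed grade is changed.
worst_case = headline_accuracy(appealed_fraction, 1.0)

print(f"Headline accuracy as reported:       {as_reported:.1%}")
print(f"Headline accuracy if all overturned: {worst_case:.1%}")
```

Even in the hypothetical worst case the headline 'accuracy' stays above 95%, because the papers nobody appeals are counted as correct by default.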
And the moral of this story? Unless we reduce an exam to multiple choice questions marked by computers, then we have to accept that examination marking is an inexact science, and exam grades should be treated with caution. Don't let anyone fool you otherwise.
P.S. Dear Exam Boards - if I have made any factual or interpretive errors in this blog post, please let me know. I don't want to mislead people with any statistics.
* Here's the report - see page 25, Figure 14: https://www.gov.uk/government/uploads/system/uploads/attachment_data/file/568424/Marking_consistency_metrics_-_November_2016.pdf