Gender, teacher assessment and stereotypes
By: Simon Burgess
The gender gap in attainment is a key fact of our times, with girls now out-performing boys pretty much throughout the education system. Nevertheless, there are currently significant gaps in jobs: women are still under-represented in science, technology, engineering and mathematics. How the gap in qualifications plays out into jobs and pay over the next twenty years or so is going to have significant consequences for the nature of work, the composition of the leading professions, family life, bringing up children and more.
But that’s for the future. For now, we are still trying to understand the implications of the gender gap in schools. Last week a new report from the OECD used the PISA data to shed more light on the gender gap across a large group of countries. The TES highlights the conclusions drawn about teacher assessments and stereotyping:
“ … while teachers generally reward girls with higher marks in both mathematics and language-of-instruction courses, after accounting for their PISA performance in these subjects, girls’ performance advantage is wider in language-of-instruction than in mathematics. This suggests both that girls may enjoy better marks in all subjects because of their better classroom discipline and better self-regulation, but also that teachers hold stereotypical ideas about boys’ and girls’ academic strengths and weaknesses.” (OECD, p. 56)
We used data from the National Pupil Database (NPD) to compare written tests and teacher assessments of the same characteristic, namely the pupil’s ability in Maths, English and Science. The tests were nationally set and remotely marked; the teacher assessment was provided by the pupil’s subject teacher.
We can make this comparison because the end of Key Stage 2, at age 11, has both forms of assessment. There is no presumption that one form of assessment is the Truth and the other is biased. They are independent but noisy measures of the same underlying characteristic: just how good is this pupil at maths? But a comparison of the two across a large sample is revealing. Since we used all the eleven-year-olds in England, our sample is big enough. Overall, the most common outcome is that the two estimates of ability agree: there is no difference between the teacher assessment and the remotely marked test.
But there are systematic patterns in the differences that are very interesting. In terms of gender, our findings for England are similar to the OECD’s, although we use NPD data from the mid-2000s and girls’ progress has moved on since then. We show that girls are “over-assessed” in English and “under-assessed” in maths. That is to say: for girls, the gaps between the test and the teacher assessment (test result minus teacher assessment) are on average positive in maths and negative in English.
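To make the arithmetic concrete, here is a minimal sketch of how such a gap could be computed. It is an illustration only, not the code used in the study: the records are invented toy values rather than NPD data, the function names are hypothetical, and the sign convention (test level minus teacher-assessed level, so a positive average means “under-assessed”) is an assumption read off the description above.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy records, invented purely for illustration (NOT NPD data).
# Each tuple: (group, subject, test level, teacher-assessed level).
records = [
    ("girl", "maths", 5, 4),
    ("girl", "maths", 4, 4),
    ("girl", "english", 4, 5),
    ("girl", "english", 4, 4),
    ("boy", "maths", 4, 4),
    ("boy", "maths", 4, 5),
    ("boy", "english", 5, 4),
    ("boy", "english", 4, 4),
]


def mean_gap_by_group(rows):
    """Average (test - teacher assessment) for each (group, subject) pair.

    Assumed convention: positive = under-assessed, negative = over-assessed.
    """
    gaps = defaultdict(list)
    for group, subject, test_level, teacher_level in rows:
        gaps[(group, subject)].append(test_level - teacher_level)
    return {key: mean(values) for key, values in gaps.items()}


def share_agreeing(rows):
    """Fraction of pupils whose test and teacher assessment are identical."""
    return sum(1 for _, _, test, teacher in rows if test == teacher) / len(rows)


mean_gaps = mean_gap_by_group(records)
for (group, subject), gap in sorted(mean_gaps.items()):
    if gap > 0:
        label = "under-assessed"
    elif gap < 0:
        label = "over-assessed"
    else:
        label = "no average gap"
    print(f"{group:5s} {subject:8s} mean gap {gap:+.2f} ({label})")

print(f"share with no difference: {share_agreeing(records):.2f}")
```

In this toy data the average gap for girls is positive in maths and negative in English, mirroring the direction of the pattern described above. The magnitudes mean nothing; they are chosen only to show the calculation.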
In terms of social class, we found that pupils eligible for free school meals were “under-assessed” in all three subjects.
Another way of saying the same thing is that poor pupils systematically and significantly out-performed what their teachers thought they would achieve.
We show the results for different ethnic, gender and social divides in the graph below. It shows very starkly that groups doing well in a test at a national level tend to be over-assessed by teachers; and equivalently, groups doing badly nationally tend to be under-assessed.
None of this is to say that teachers are biased. Like everyone else all the time, they use stereotypes to help make decisions when their information is imperfect.
But there are consequences. It is important that we do not rely solely on teacher assessments and that we retain and use nationally set and remotely marked tests. Using teacher assessments rather than the test scores to define attainment would result in a much greater recorded gap between poor and non-poor pupils. Tests allow pupils to show what they can do independently of someone’s opinion of them, including that person being their teacher.