Our campus has been having a lot of discussions lately about student evaluations of teaching. Our Center for Teaching and Learning circulated a copy of an article by Carl Wieman from Change magazine, "A better way to evaluate undergraduate teaching." (A free abbreviated form is available here.)
Wieman argues that we need a better way to evaluate teaching. Student evaluations do not correlate with desirable outcomes (as described here) and are biased.
"To put this in more concrete terms, the data indicate that it would be nearly impossible for a physically unattractive female instructor teaching a large required introductory physics course to receive as high an evaluation as that of an attractive male instructor teaching a small fourth-year elective course for physics majors, regardless of how well either teaches."
Wieman suggests using a Teaching Practices Inventory as a better way to evaluate undergraduate teaching. Using more practices that are evidence-based is likely to lead to better outcomes. This hasn't been an easy sell, as Wieman discovered at the White House Office of Science and Technology Policy. It's not gone over well on my campus, either.
It's a complex issue. There are eminent scholars, like Nira Hativa, who argue that student evaluations of teaching are a valid and effective way to recognize good teaching. Student evaluation of teaching is relatively easy, and it's the current standard practice. Current practice is hard to change. Wieman's Teaching Practices Inventory has been called "radical" on my campus.
I am not a scholar of studies about student evaluation of teaching. I study computing education. From what I know about computer science and unconscious bias, the quote from Wieman above is likely just as true in computer science.
Unconscious bias is a factor in women's under-representation in STEM generally, and in computer science specifically. The idea is that we all have biases that influence how we make decisions. Unconsciously, many of us (at least in the Western world) are biased to think of computer scientists as mostly male. Unless we consciously recognize our biases, we are likely to express them in our decisions. A 2013 multi-institutional study found that undergraduates see computer scientists as male. That's a source of bias.
Women in computer science report biases that keep them from succeeding. Studies show that female science students are more likely to be interrupted and less likely to receive their instructors' attention. The National Center for Women & Information Technology (NCWIT) has developed a video, "Unconscious bias and why it matters for women and tech." A recent report from Google and researchers at Stanford presents evidence that unconscious bias influences teachers' decisions in computer science classrooms. They recommend professional development to help teachers reduce their expression of bias. Google is funding the development of a simulation for teachers to address their unconscious bias.
The tech industry recognizes that unconscious bias is a significant problem. Microsoft is making their unconscious bias training publicly available worldwide. Google is asking 60,000 employees to take training to recognize unconscious bias.
So here's the question: If unconscious bias is so pervasive in computing, and training is our best remedy, how can untrained students evaluate their CS teachers without bias?
Computing Research News raised concerns about bias in student evaluations of CS teaching back in 2003. A study just last year found that students are biased against female instructors. There is evidence that online students evaluate instructors more highly if they think that they are male.
I have not seen a study explicitly showing bias in CS students' evaluations of their teachers, but the evidence that it's there is overwhelming. How could the students possibly avoid it? We know that, without training, students evaluate teachers with bias. We have found unconscious bias across computing. How could undergraduates evaluate a female CS instructor as fairly as a male one? What mechanism might lead them to evaluate teaching without gender bias?
We have too few women in computer science. We need to recruit more female faculty in CS and retain them. We need to encourage and reward good teaching. Relying on biased student evaluations as our only measure of undergraduate teaching quality doesn't help us with either need.