I have collaborated with Lauren Margulieux on a series of experiments and papers around using subgoal labeling to improve programming education. She has just successfully defended her dissertation. I describe her dissertation work, and summarize some of her earlier findings, in the blog post linked here.
She had a paragraph in her dissertation's methods section that I skimmed right past when I first read it.
Demographic information was collected for participants’ age, gender, academic field of study, high school GPA, college GPA, year in school, computer science experience, comfort with computers, and expected difficulty of learning App Inventor because they are possible predictors of performance (Rountree, Rountree, Robins, & Hannah, 2004; see Table 1). These demographic characteristics were not found to correlate with problem solving performance (see Table 1).
Then I realized her lack of result was a pretty significant result.
I asked her about it at the defense. She collected all these potential predictors of programming performance in all the experiments. Were they ever a predictor of the experiment outcome? She said that in only one of eight experiments did she find even a weak correlation, between high school GPA and performance. In all other cases, "these demographic characteristics were not found to correlate with problem solving performance" (to quote her dissertation).
There has been a lot of research into what predicts success in programming classes. One of the more controversial claims is that mathematics background is a prerequisite for learning programming. Nathan Ensmenger suggests that the studies show a correlation between mathematics background and success in programming classes, but not between mathematics background and programming performance. He suggests that over-emphasizing mathematics has been a factor in the decline in diversity in computing (see blog post here about this point).
These predictors are particularly important today. With our burgeoning undergraduate enrollments, programs are looking to cap enrollment using factors like GPA to decide who gets to stay in CS. (See Eric Roberts' history of enrollment caps in CS.) Lauren's results suggest choosing who gets into CS based on GPA might be a bad idea. GPA may not be an important predictor of success.
I asked Lauren how she might explain the difference between her experimental results and the classroom-based results. One possibility is that there are effects of these demographic variables, but they're too small to be seen in short-term experimental settings. A class experience is the sum of many experiment-size learning situations.
There is another possibility that Lauren agrees could explain the difference between classrooms and laboratory experiments. We may teach better in experimental settings than we do in classes. Almost no one drops out of Lauren's experiments, and learning is measurable. Everybody learns in her experiments, though some learn more than others, and the differences cannot be explained by any of these demographic variables.
Maybe characteristics like "participants’ age, gender, academic field of study, high school GPA, college GPA, year in school, computer science experience, comfort with computers, and expected difficulty of learning" programming are predictors of success in programming classes because of how we teach programming classes. Maybe if we taught differently, more of these students would succeed. The predictor variables may say more about our teaching of programming than about the challenge of learning programming.