
This Automated Tool for Judging Programming Ability Is Kind of Ominous

A way of killing creativity in software engineering or of weeding out code-posers?

A programmer sits down with some reasonably basic task. It might be implementing a list-sorting scheme, coding a basic version of Minesweeper, or diagnosing a memory leak. The timer starts, giving 10 or 25 or 45 minutes, and the resulting code is fed into an algorithm that applies a bunch of different criteria and returns a score. Job interview over.

This is the basic idea behind a new instrument developed by a trio of Norwegian computer scientists. No longer would potential employers have to rely on "proxy variables" like work experience or education; instead, a program decides whether your programming ability is good or bad. How quickly can you implement binary addition? Can you make a digital clock and display it graphically? Print, "Hello, world."
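The binary-addition task mentioned above is representative of the small, objectively checkable exercises involved. A minimal sketch of what a candidate might produce (the function name and string-based interface are illustrative, not from the paper):

```python
def add_binary(a: str, b: str) -> str:
    """Add two binary numbers given as strings, e.g. '101' + '11' -> '1000'."""
    result = []
    carry = 0
    i, j = len(a) - 1, len(b) - 1
    # Walk both strings right to left, carrying as in pencil-and-paper addition.
    while i >= 0 or j >= 0 or carry:
        total = carry
        if i >= 0:
            total += int(a[i])
            i -= 1
        if j >= 0:
            total += int(b[j])
            j -= 1
        result.append(str(total % 2))
        carry = total // 2
    return "".join(reversed(result))

print(add_binary("101", "11"))  # 1000
```

The appeal for automated grading is obvious: a harness can time the attempt and check correctness against a fixed set of inputs without a human in the loop.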


"We are interested in how skill can be measured directly from programming performance," Gunnar R. Bergersen and team write in the current edition of Computer. "Consequently, our research question is, to what extent is it possible to construct a valid instrument for measuring programming skill? The implicit assumption was that the level of performance a programmer can reliably show across many tasks is a good indication of skill level."

Bergersen and his team crafted their instrument using long lists of questions and criteria that can be objectively applied to a given solution. Because the criteria are posed as simple "yes" or "no" questions, this sort of evaluation can be automated.

Why is recreating Minesweeper a more apt test of programming skill?

One example: "Is an abstract base class used in X?" An abstract class in programming is one that can't be instantiated directly; it lays out data fields and routines/functions that subclasses are expected to fill in. The important thing is that its use can be verified via a simple automated check. There's nothing subjective about it: y/n.
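To make that criterion concrete, here is a hedged sketch (not from the paper) of how such a y/n check might be mechanized in Python, using the standard `abc` and `inspect` modules. The class names and the checker function are illustrative assumptions:

```python
import inspect
from abc import ABC, abstractmethod

# A candidate solution that happens to use an abstract base class.
class Shape(ABC):
    @abstractmethod
    def area(self) -> float: ...

class Square(Shape):
    def __init__(self, side: float):
        self.side = side

    def area(self) -> float:
        return self.side ** 2

def uses_abstract_base_class(namespace: dict) -> bool:
    """Answer the yes/no criterion: does any class in the namespace remain abstract?"""
    return any(
        inspect.isclass(obj) and inspect.isabstract(obj)
        for obj in namespace.values()
    )

print(uses_abstract_base_class({"Shape": Shape, "Square": Square}))  # True
```

The check inspects the submitted code rather than running it, which is what makes the criterion cheap to apply across many submissions.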

The instrument itself was created using data collected from 44 subjects/programmers set to work on the given tasks. This is where Bergersen and company came up with the tool's baseline scores. Said baseline was further validated using another 19 programmers. The Norwegian team admits that 65 is an uncomfortably small sample size.


Rigorous quantitative evaluation isn't a new thing in the world by any means, in industry or academia. Setting aside general knowledge evaluations like the SAT or GRE, university computer science programs have been known to require similar albeit human-evaluated tests for admission, e.g. implementing this or that data structure from scratch. It's probably more reasonable in computer science and engineering than most anywhere else, if only because of how easily one can fake competence with the mountains of preexisting code a mere Google search away.

Still, there's a philosophical angle. What task or collection of tasks is general enough to say someone is a good or bad programmer? As Bergersen notes, "The universe of potential programming tasks is infinite."

It's easy to reduce programming to fundamentals, crafting binary trees and intuitive GUIs, but at the same time one difference between an acceptable programmer and a great programmer would seem to be offering creative or novel solutions rather than simply recalling this or that item of discrete knowledge.

In that light, the group's evaluation instrument seems more a test of coding skill than of programming skill, which is the difference between knowing syntax and standards and actually doing science or solving problems creatively. Why is recreating Minesweeper or implementing a clock using object-oriented design a more apt test of programming skill than completing an inductive correctness proof or demonstrating the big-O efficiency of an algorithm?

Maybe that's just idealism, but one wonders about the state of things should outside-the-box thinkers wind up weeded out of the mainstream. Knowing how to code is like knowing a foreign language. It doesn't matter much if you have nothing to say.