To Make AI Less Biased, Give It a Worldview

Can “fairness” be expressed in numbers?

One of the most difficult emerging problems when it comes to artificial intelligence is making sure that computers don't act like racist, sexist dicks.

As it turns out, it's pretty tough to do: humans created and programmed them, and humans are often racist, sexist dicks. If we can program racism into computers, can we also train them to have a sense of fairness?

Some experts believe that the large databases used to train modern machine learning programs reproduce existing human prejudices. To put it bluntly, as Microsoft researcher Kate Crawford did for the New York Times, AI has a white guy problem. For example, an algorithmically judged beauty contest recently picked nearly all white winners from a pool in which many races were represented, and software trained on text from the web rated white-sounding names as more "pleasant" than non-white ones.

Maybe the racism and sexism hidden in machine learning databases can be balanced out by a fairness algorithm—a piece of code that could reliably account for biased data and produce equitable results when, say, deciding how much someone's insurance premiums should be, or who gets into college.

A team of researchers based at the University of Utah, the University of Arizona, and Haverford College in Pennsylvania believes it has just taken a preliminary step towards such a fairness algorithm by formalizing ideas of "fairness" in mathematical terms that computer scientists can use to teach machines.

"In a sense we're really just putting into mathematical language what people in political science and philosophy have been saying for a long time," Suresh Venkatasubramanian, a computer scientist at the University of Utah who co-authored the work, told me. "Like, hey, there's structural bias in the world and people need to be aware of this."

To get fair and unbiased results from a machine, you have to code it with strong assumptions about how truth is represented in seemingly objective data that may harbor prejudice, they write in a paper posted on Sunday to the arXiv preprint server (it hasn't been peer reviewed).

Such an algorithm, coupled with careful attention paid to what sort of information machines learn from in the first place, could be a powerful antidote to digitized, automated prejudice, they argue.

To this end, the researchers designed a mathematical way of expressing two different "worldviews" that may eventually be coded into machines. The first, called "What you see is what you get," or WYSIWYG, assumes that whatever the data says constitutes an objective picture of the world, even if inequalities appear in the results. For example, if a computer trained on a dataset of IQ test results ends up deciding that overall one particular ethnicity is less fit for an academic scholarship, well, that's just the way it is.

The other scenario, which the researchers call "We're all equal," or WAE, assumes that data does not closely reflect reality due to biases that are inevitably contained within it—for example, cultural biases embedded in IQ tests that disadvantage non-white or economically underprivileged people. So the machine assumes that in the end, different groups (differentiated by sex, gender, ethnicity, etc.) should be similar overall and mathematically compensates for any major discrepancies in the results.

What these two mathematically defined worldviews do, Venkatasubramanian said, is govern how computers judge observed phenomena (like the IQ test scores) and how accurately those observations map to the thing you eventually want to decide, such as who is most qualified for an academic scholarship.
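
To make the contrast concrete, here is a toy Python sketch. It is not the researchers' formalization: the applicants, groups, and scores are invented, and the "WAE" adjustment used here (rescaling each score against its own group's average and spread) is just one simple assumption about how a machine might compensate for group-level gaps.

```python
from statistics import mean, pstdev

# Hypothetical applicants: (name, group, observed test score).
# Every value here is invented for illustration.
APPLICANTS = [
    ("a1", "X", 120.0),
    ("a2", "X", 110.0),
    ("a3", "X", 100.0),
    ("b1", "Y", 105.0),
    ("b2", "Y", 95.0),
    ("b3", "Y", 85.0),
]


def wysiwyg_scores(applicants):
    """WYSIWYG: take the observed scores at face value."""
    return {name: score for name, _group, score in applicants}


def wae_scores(applicants):
    """WAE (one assumed interpretation): treat groups as equal overall,
    so each score is re-expressed relative to its own group's mean and
    spread before applicants are compared."""
    by_group = {}
    for _name, group, score in applicants:
        by_group.setdefault(group, []).append(score)

    stats = {g: (mean(v), pstdev(v)) for g, v in by_group.items()}

    adjusted = {}
    for name, group, score in applicants:
        mu, sigma = stats[group]
        # Guard against a group whose scores are all identical.
        adjusted[name] = (score - mu) / sigma if sigma > 0 else 0.0
    return adjusted


def top_k(scores, k):
    """Return the k highest-scoring names; ties are broken by name."""
    ranked = sorted(scores, key=lambda name: (-scores[name], name))
    return ranked[:k]


if __name__ == "__main__":
    print("WYSIWYG picks:", top_k(wysiwyg_scores(APPLICANTS), 2))
    print("WAE picks:    ", top_k(wae_scores(APPLICANTS), 2))
```

With this made-up data, the WYSIWYG ranking awards both scholarships to group X, whose raw scores are higher, while the WAE ranking picks the top scorer from each group. The data is identical in both cases; only the assumption about what the scores mean has changed.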

Some readers might point out that the WYSIWYG framework is likely to be problematic in all cases—after all, when is structural inequality not a factor in institutional decision-making?

"As a computer scientist, it's not clear to me which is the correct view," said Venkatasubramanian. "It's more important to expose the differences in how people think."

As for the problem of ensuring that people are, basically, socially aware enough to choose WAE over WYSIWYG… Well, it's tough to get people to see past bigotry, even when computers aren't part of the equation.

"We don't expect to solve that problem," Venkatasubramanian said, "but at the very least we can code machines to be aware of these issues."

If Venkatasubramanian and his colleagues are right, then one day we might be able to essentially tell a machine that it's a pretty messed up world out there, and what you see really isn't always what you get.