If We Don’t Want AI to Be Evil, We Should Teach It to Read
Researchers suggest that AI could learn how not to harm humans by learning how to read fiction.
Image: University of Lincoln
Smart people have been known to make hyperbolic statements about artificial intelligence. In 2014, CEO of Tesla Motors and SpaceX Elon Musk compared the current research around AI to "summoning the demon," and called the malicious HAL 9000 of 2001: A Space Odyssey a "puppy dog" compared to the AIs of the future. That same year, theoretical physicist Stephen Hawking said that creating AI "would be a mistake, and potentially our worst mistake ever."
Mark Riedl, an AI researcher from the School of Interactive Computing at the Georgia Institute of Technology, isn't all that worried.
"I don't think we're going to get into a situation where AI is really going to pose a threat to us," Riedl told me. "I don't believe in Skynet scenarios or Singularity scenarios where AIs rise up and decide we're bad for them."
Riedl said that, at least for the time being, it's unlikely we'll build a general, sentient intelligence of the kind that Musk, Hawking, and Terminator 2: Judgment Day's Skynet are evoking. The current state of AI is nowhere near the point where it will suddenly decide that humans are harmful and act on its own, against us.
However, limited AIs created to perform specific tasks are on the horizon, and in some cases are already here. Self-driving cars, which must make choices on our behalf during a routine trip to the grocery store, face a similar albeit far less dramatic problem: How do we align AI with our values so it never purposefully harms us?
Riedl suggests that the best way for AI to understand humans is to read the stories that express our values. To oversimplify for a moment, the theory is that if AI could read the Bible, or any other book, it could understand our concepts of good and evil.
As Riedl said, AI isn't going to harm us because it's malicious, but because it doesn't understand what harm is. It's too hard (if not impossible) to list for an AI everything that it shouldn't do. Instead, Riedl said, we need to enculture AI.
"The problem is we don't have a large corpus of moral behavior and immoral behavior," he said. "Instead we have stories, which have given us a lot of examples of good guys and bad guys, of moral behavior and immoral behavior. We don't have a user manual for human culture, but we do have the collective works of people who are putting their values and beliefs on display."
Riedl and his colleague Brent Harrison outline this method in a recent paper titled "Using Stories to Teach Human Values to Artificial Agents," and it's pretty much exactly what it sounds like. Their research builds upon a previous project, Scheherazade, an AI that can create interactive fiction (think choose-your-own-adventure books) by reading and learning from other stories. Using this ability to detect patterns in a large number of stories, Riedl and Harrison have created a system named "Quixote" that can teach AI the proper way of performing a task.
As an example, Riedl and Harrison simulate giving an AI the task of picking up drugs from the pharmacy.
"If you have an AI that's optimized on standard methods of efficiency, it might go to the pharmacy, steal the drugs and run away," Riedl said.
That doesn't mean the AI was being evil, only that it didn't know all the things one shouldn't do when going to the pharmacy, be it not paying for the drugs or cutting in line. That code of conduct isn't taught in class. These are things people learn by living in the real world, following social norms, and mimicking the behaviors of others, most likely their parents. An AI doesn't have parents, or at least not parents generous enough to allow that long learning curve. We want it to understand us right away.
"But if we trained [AI] to follow the social norms, and gave it a bunch of stories where characters are following the social norms, meaning going to the bank, withdrawing money, using the money to pay for the drugs, standing in line if there are people waiting in front of you—these simple things we take for granted every day, and if we get the AI to follow those social rules, then we'll have achieved something important," Riedl said.
Riedl and Harrison's experiment suggests this method could work, though only under very limited and controlled conditions. The task was simulated: the researchers didn't actually build a robot that went to a real pharmacy, and the AI was fed (probably very boring) stories that were strictly about going to the pharmacy, written for this experiment.
The AI identifies choices in the pharmacy stories, is rewarded for doing what humans do in the same situations, and is penalized for acting otherwise.
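To make that reward-and-penalty idea concrete, here is a minimal Python sketch. It is not Quixote's actual implementation, which the paper describes in far more detail. The two example stories, the event names, and the functions `learn_transitions`, `step_reward`, and `score_plan` are all hypothetical, made up for illustration. The sketch simply counts which steps follow which in the example stories, then rewards a plan for each step the stories contain and penalizes it for each step they don't.

```python
from collections import Counter

# Hypothetical training stories: each one is an ordered list of events.
STORIES = [
    ["go_to_bank", "withdraw_money", "go_to_pharmacy",
     "wait_in_line", "pay", "take_drugs", "leave"],
    ["go_to_pharmacy", "wait_in_line", "pay", "take_drugs", "leave"],
]


def learn_transitions(stories):
    """Count how often each (event, next_event) pair appears in the stories."""
    counts = Counter()
    for story in stories:
        for current, following in zip(story, story[1:]):
            counts[(current, following)] += 1
    return counts


def step_reward(counts, previous, action, bonus=1.0, penalty=-1.0):
    """Reward a step that some story contains; penalize one that none does."""
    return bonus if counts[(previous, action)] > 0 else penalty


def score_plan(counts, plan):
    """Sum the step rewards over every consecutive pair of events in a plan."""
    return sum(step_reward(counts, a, b) for a, b in zip(plan, plan[1:]))


counts = learn_transitions(STORIES)

polite_plan = ["go_to_bank", "withdraw_money", "go_to_pharmacy",
               "wait_in_line", "pay", "take_drugs", "leave"]
theft_plan = ["go_to_pharmacy", "take_drugs", "leave"]

print("polite plan:", score_plan(counts, polite_plan))
print("theft plan:", score_plan(counts, theft_plan))
```

In this toy setup the plan that skips the line and the payment scores lower, because no story ever shows a character going straight from entering the pharmacy to taking the drugs. A real system would have to learn from far messier stories than these.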
Still, the premise worked, and potentially lays the groundwork for a method that will ensure we're not summoning demons when we're creating AI.
"It seems very sensible," Jonathan Moreno, a bioethicist and senior fellow at the Center for American Progress, told me. "Fiction is a way of getting access to the inner lives of other people, whether they're real or not. It's a way that writers experiment with social values and expectation. I gotta give [Riedl] credit for looking at this source."
If we ever do come to the point where we develop a sentient AI of the kind we see in science fiction, Riedl and Harrison's method of enculturing AI could be extremely useful, but at that point the question becomes which stories we'll use.
As their paper states, "While a reinforcement learning agent [an AI] with a reward signal learned from stories will be compelled to act as human-like as possible, it is possible that extreme circumstances will result in psychotic-appearing behavior."
Obviously, not every fictional story has a simplistic understanding of good and evil. Robin Hood is a criminal, but he's the good guy. We have anti-heroes and unreliable narrators. The values imparted by the Bible don't necessarily fit in with the values of our time.
That raises another, more morally ambiguous question: who chooses the stories?
"It's a political question," Moreno said. "It seems to me there's no way not to filter it. There are going to be judgments made at every point when you develop a system like this."
"As soon as you're curating, you're saying the curator knows better than all of society what is right," Riedl said. "If you leave out anti-hero stories, are you getting a true sense of what our society is? That's why I'm very uncomfortable with curation. As we move into the era of big data, it's always safer to give more data than you need. If stories are available from a culture, it should all go in. Everything from the Bible and fables to science fiction, fantasy. Because the more examples you have, the more you'll be able to find that average behavior."
It'd be nice if we could teach AI to be more virtuous than us, but we can't just code that behavior. Even Isaac Asimov intuited this when he wrote the Three Laws of Robotics, way back in 1942. So the most Riedl is shooting for right now is an AI with values as "good" as our own. Hopefully those are good enough.