The time for stressing about superintelligent AI will come soon.
The problem with the world today isn't that too many people are afraid—it's that too many people are afraid of the wrong things. Consider this: what scares you more, that your life could end because of a terrorist attack or because you get crushed to death under a large piece of furniture?
Despite a media environment in which the threat of terrorism is omnipresent and the threat of furniture nonexistent, your gravestone is actually more likely to say, "Died under a couch recently bought from Ikea" than "Perished in a terrorist attack."
In fact, asteroids are more likely to kill the average person than lightning strikes, and lightning strikes are more dangerous than terrorism. The point is that, as I've written elsewhere, our intuitions often fail to track the actual risks around us. We dismiss many of the most likely threats while obsessing over improbable events.
This basic insight forms the basis for a recent TED talk by the neuroscientist Sam Harris about artificial superintelligence. For those who pay attention to the news, superintelligence has been a topic of interest in the popular media at least since the Oxford philosopher Nick Bostrom published a surprise best-seller in 2014 called—you guessed it—Superintelligence.
Major figures like Bill Gates, Elon Musk, and Stephen Hawking subsequently expressed concern about the possibility that a superintelligent machine of some sort could become a less-than-benevolent overlord of humanity, perhaps catapulting us into the eternal grave of extinction.
Harris is just the most recent public intellectual to wave his arms in the air and shout, "Caution! A machine superintelligence with God-like powers could annihilate humanity." But is this degree of concern warranted? Is Harris as crazy as he sounds? However fantastical the threat of superintelligence may initially appear, a closer look reveals that it really does constitute perhaps the most formidable challenge that our species will ever encounter in its evolutionary lifetime.
Ask yourself this: what makes nuclear, biological, chemical, and nanotech weapons dangerous? The answer is that an evil or incompetent person could use these weapons to inflict harm on others. But superintelligence isn't like this. It isn't just another "tool" that someone could use to destroy civilization. Rather, superintelligence is an agent in its own right.
And, as scholars rightly warn us, a superintelligent mind might not be anything like our minds. It could have a completely different set of goals, motivations, categories of thought, and perhaps even "emotions." Anthropomorphizing a superintelligence by projecting our own mental properties onto it would be like a grasshopper telling its friends that humans love nothing more than perching atop a blade of grass because that's what grasshoppers enjoy. Obviously, that's silly—and simply incorrect.
So a superintelligence wouldn't be something that humans use for their own purposes; it would be a unique agent with its own purposes. And what might these purposes be? Since a superintelligence would be our offspring, we could perhaps program certain goals into it, thereby making it our friend rather than foe, that is, making it prefer amity over enmity.
This sounds good in theory, but it raises some serious questions. For example, how exactly could we program human values into a superintelligence? Getting our preferences into computer code poses significant technical challenges. As Bostrom notes, high-level concepts like "happiness" must be defined "in terms that appear in the AI's programming language, and ultimately in primitives such as mathematical operators and addresses pointing to the contents of individual memory registers."
What's more, our value systems turn out to be far more complex than most of us realize. For instance, imagine we program a superintelligence to value the well-being of sentient creatures, which Harris himself identifies as the highest moral good. If the resulting superintelligence values well-being, then why wouldn't it immediately destroy humanity and replace us with a massive warehouse of human brains hooked up to something like the Matrix, except that the virtual worlds in which we'd live would be overflowing with constant bliss, unlike the "real" world, which is full of suffering?
A bunch of Matrix brains living in a virtual paradise would produce far more overall well-being in the universe than humans living as we do, yet this would (most would agree) be a catastrophic outcome for humanity.
Adding to this difficulty, there's the confounding task of figuring out which value system to start with in the first place. Should we choose the values espoused by a particular religion, according to which the aim of moral action is to worship God? Should we borrow the values of contemporary ethicists? If so, which ethicists? (Harris?) There's a huge range of diverse ethical theories, and almost no consensus among philosophers who study such issues about which theories are correct.
So, not only is there the "technical problem" of embedding values into the superintelligence's psyche, but there's also the "philosophical problem" of figuring out what the heck those values are.
This being said, one might wonder why exactly it's so important for a superintelligence to share our values (whatever they are). After all, John prefers chocolate while Sally prefers vanilla, and John and Sally get along just fine. Couldn't the superintelligence have a different value system and coexist with humanity in peace?
The answer appears to be no. First, consider the fact that intelligence confers power. By "intelligence," I mean what cognitive scientists, philosophers, and AI researchers mean: the ability to acquire and use effective means to achieve some end, whether that end is solving world poverty or playing tic-tac-toe. Thus, a cockroach is intelligent insofar as it's able to evade the broom I use to swat it, and humans are intelligent insofar as we're able to say, "Hey, let's go to the moon," and then actually do this.
If intelligence confers power, then a superintelligence would be superpowerful. Don't picture here a Terminator-like android with a bipedal posture marching through the world with machine guns. This dystopic vision is one of the great myths of AI. Instead, the danger would come from something more like a ghost in the hardware, capable of controlling any device within electronic reach—such as weapon systems, automated laboratory equipment, the stock market, particle accelerators, and future devices like the nanofactory, or some as-yet unknown technology (that it might invent).
Making matters even worse, electrical potentials propagating inside a computer transfer information way, way faster than the action potentials in our puny little brains. A superintelligence could thus think about one million times faster than we do, meaning that a single minute of objective time would equal nearly two years of subjective time for the AI. From its perspective, the outside world would be virtually frozen in place, and this would give it ample time to analyze new information, simulate different strategies, and prepare backup plans between every word spoken by a human being in real time. This could enable it to eventually trick us into hooking it up to the Internet, if researchers initially denied it access.
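For readers who want to check the "nearly two years" figure, here is a minimal sketch of the arithmetic in Python. The one-million-fold speedup is the estimate quoted above; the function name and the 365.25-day year are assumptions made purely for illustration.

```python
# Rough check of the speedup arithmetic; a sketch, not a model of any real system.
MINUTES_PER_YEAR = 60 * 24 * 365.25  # assumption: a year of 365.25 days


def subjective_years(objective_minutes: float, speedup: float = 1_000_000) -> float:
    """Convert objective minutes into subjective years for a mind
    that thinks `speedup` times faster than a human."""
    return objective_minutes * speedup / MINUTES_PER_YEAR


if __name__ == "__main__":
    years = subjective_years(1)
    print(f"One objective minute is about {years:.2f} subjective years")
```

Under these assumptions, one objective minute works out to roughly 1.9 subjective years, which is where "nearly two years" comes from.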
These considerations suggest that a superintelligence could crush humanity with the ease of a child stomping on a spider. But there's a crucial catch: a superintelligence with the means for destroying humanity need not have the motivation to do this. On the one hand, it's entirely possible for a superintelligence to be explicitly malicious, and thus try to kill us on purpose. On the other hand, the situation is far more menacing than this: even a superintelligence with no ill-will toward humanity at all could pose a direct and profound existential risk to human civilization.
This is where the issues of power and values collide with nightmarish implications: if the superintelligence's goals don't almost completely align with ours, it could use its power to destroy our species for the same reason that our species destroys ant colonies when we convert land into a construction site. It's not that we hate ants. Rather, they just happen to be in the way, and we don't really care much about ant genocides. Harris makes this point well in his talk.
For example, imagine that we tell a superintelligence to harvest as much energy from the sun as possible. So what does it do? Obviously, it covers every square inch of land with solar panels, thereby obliterating the biosphere (a "sphere" of which we are a part). The once extant Homo sapiens then goes extinct.
Or imagine that we program the superintelligence to maximize the number of paperclips in the universe. Like the case just mentioned, this appears, at first glance, to be a pretty benign goal for the superintelligence to pursue. After all, a "paperclip maximizer" wouldn't be hateful, belligerent, sexist, racist, homicidal, genocidal, militaristic, or misanthropic. It would just care a lot about making as many paperclips as possible. (You can think of this as its passion in life.)
So what happens? The superintelligence looks around and notices something relevant to its mission: humans just so happen to be made of the same chemical ingredient that paperclips are made of, namely atoms. It thus proceeds to harvest the atoms contained in every human body—all 7.4 billion of us and counting—thereby converting each individual into a pile of lifeless, twisted steel wire.
These aren't even all the reasons we should be worried about superintelligence, but they do warrant serious concern about the topic—even if our intuitions fail to sound the emotional alarm in our heads: "Be worried!"
As Harris points out in his talk, superintelligence presents a behemoth challenge for the best minds on Earth this century, and we have no idea how long it might take to solve the problems specified above, assuming that they're soluble at all. It could take only 2 more years of AI research, or it could require the next 378 years, during which billions of work hours are spent ruminating on this issue.
This is troublesome because, according to a recent survey of AI experts, there's a very good chance superintelligence will join us by 2075, and 10 percent of respondents claimed that it could arrive by 2022. So, superintelligence could show up before we've had enough time to solve the "control problem." But even if it looms in the far future, it's not too early to start thinking about these issues, or to start spreading the word through popular media.
The fact is that once the AI exceeds human-level intelligence, it could be permanently out of our control. Thus, we may have only a single chance to get everything right. If the first superintelligence is motivated by values even slightly incompatible with ours, the game would be over, and humanity would have lost. Perhaps truth is stranger than science fiction.
Author's Note: Thanks to Daniel Kokotajlo for helpful comments on an earlier draft.