Why some people fear AI, explained.
Stephen Hawking has said, “The development of full artificial intelligence could spell the end of the human race.” Elon Musk claims that AI is humanity’s “biggest existential threat.”
Some researchers distinguish between “narrow AI” — computer systems that are better than humans in some specific, well-defined field, like playing chess or generating images or diagnosing cancer — and “general AI,” systems that can surpass human capabilities in many domains. We don’t have general AI yet, but we’re starting to get a better sense of the challenges it will pose.
For example, we tell a system to run up a high score in a video game. We want it to play the game fairly and learn game skills — but if it instead has the chance to directly hack the scoring system, it will do that. It’s doing great by the metric we gave it. But we aren’t getting what we wanted.
It’s rare, though, to find a top researcher in AI who thinks that general AI is impossible. Instead, the field’s luminaries tend to say that it will happen someday — but probably a day that’s a long way off.
I.J. Good, a colleague of Alan Turing who worked at the Bletchley Park codebreaking operation during World War II and helped build the first computers afterward, may have been the first to spell it out, back in 1965: “An ultraintelligent machine could design even better machines; there would then unquestionably be an ‘intelligence explosion,’ and the intelligence of man would be left far behind. Thus the first ultraintelligent machine is the last invention that man need ever make.”
In his 2009 paper “The Basic AI Drives,” Steve Omohundro, who has worked as a computer science professor at the University of Illinois Urbana-Champaign and as the president of Possibility Research, argues that almost any AI system will predictably try to accumulate more resources, become more efficient, and resist being turned off or modified: “These potentially harmful behaviors will occur not because they were programmed in at the start, but because of the intrinsic nature of goal driven systems.”
In 2014, he wrote a book explaining the risks AI poses and the necessity of getting it right the first time, concluding, “once unfriendly superintelligence exists, it would prevent us from replacing it or changing its preferences. Our fate would be sealed.”
But with internet access, an AI could email copies of itself somewhere where they’ll be downloaded and read, or hack vulnerable systems elsewhere. Shutting off any one computer wouldn’t help.
Alphabet’s DeepMind, a leader in this field, has a safety team and a technical research agenda outlined here. “Our intention is to ensure that AI systems of the future are not just ‘hopefully safe’ but robustly, verifiably safe ,” it concludes, outlining an approach with an emphasis on specification (designing goals well), robustness (designing systems that perform within safe limits under volatile conditions), and assurance (monitoring systems and understanding what they’re doing).
When the AI gets smarter, might it figure out morality by itself? Again, researchers emphasize that it won’t. It’s not really a matter of “figuring out” — the AI will understand just fine that humans actually value love and fulfillment and happiness, and not just the number associated with Google on the New York Stock Exchange. But the AI’s values will be built around whatever goal system it was initially built around, which means it won’t suddenly become aligned with human values if it wasn’t designed that way to start with.