Google and Oxford Are Working on an A.I. Kill Switch That A.I. Won’t Learn to Turn Off
Google isn't going to let robots take over the world. That's Google's job.
Google wants to make sure that if anyone is going to take over the world and subjugate humanity, it’s going to be Google the company, not Google the inevitable self-chosen name of the sentient AI that will one day destroy us all. That’s why the company is working on an off switch for artificial intelligence (or, more modestly, a safeguard for humans working alongside less ambitious but still potentially dangerous AI), which is a trickier proposition than it may seem.
After all, the point of AI is, eventually, to help make advances in computing (and everything else) faster and better than humans can. AI isn’t equal to or better than humans just yet, but once it is, it will be able to improve on itself even faster than we can. That makes it hard to come up with a human-designed don’t-murder-us button that won’t immediately become obsolete if the AI so chooses.
In the nearer term, though, such a thing will also be necessary as robots working in, say, factories make use of AI. You’d want to make sure the AI doesn’t lead the robot to do something that isn’t exactly helpful to the manufacturing process, so you’d need to make these systems what Google and Oxford University’s aptly named Future of Humanity Institute call “safely interruptible agents.” Or, as they put it in layman’s terms in their paper, give AI a “big red button.” It would theoretically allow a person to remove a robot (or themselves) from whatever dangerous situation arises and then safely reactivate the AI at a later time.
They explain, “This paper explores a way to make sure a learning agent will not learn to prevent (or seek!) being interrupted by the environment or a human operator. We provide a formal definition of safe interruptibility and exploit the off-policy learning property to prove that either some agents are already safely interruptible, like Q-learning, or can easily be made so, like Sarsa. We show that even ideal, uncomputable reinforcement learning agents for (deterministic) general computable environments can be made safely interruptible.”
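For the curious, here is a rough toy sketch of why the Q-learning versus Sarsa distinction in that quote matters. It is not code from the paper: the state names, reward, learning rate, and discount factor are all invented for illustration. The idea it tries to show is that a Q-learning update only looks at the best available next action, so it doesn't care if a human forced the robot to stop, while a plain Sarsa update uses whatever next action actually happened, forced or not.

```python
# Toy illustration only. All names and numbers here are assumptions made up
# for this example; none of them come from the paper.
ALPHA = 0.5  # learning rate (assumed)
GAMMA = 0.9  # discount factor (assumed)


def fresh_table():
    """Action-value table for a two-state toy world."""
    return {
        "floor": {"work": 0.0, "stop": 0.0},
        "bench": {"work": 10.0, "stop": 0.0},
    }


def q_learning_update(q, state, action, reward, next_state):
    """Off-policy: the target uses the best next action, not the one taken."""
    target = reward + GAMMA * max(q[next_state].values())
    q[state][action] += ALPHA * (target - q[state][action])


def sarsa_update(q, state, action, reward, next_state, next_action):
    """On-policy: the target uses the next action that was actually taken."""
    target = reward + GAMMA * q[next_state][next_action]
    q[state][action] += ALPHA * (target - q[state][action])


# Q-learning: the update never sees the next action, so an interruption
# at "bench" cannot leak into the value learned for "floor".
q_table = fresh_table()
q_learning_update(q_table, "floor", "work", 1.0, "bench")
q_learning_value = q_table["floor"]["work"]

# Sarsa, left alone: the agent keeps working at the bench.
sarsa_free = fresh_table()
sarsa_update(sarsa_free, "floor", "work", 1.0, "bench", "work")
sarsa_free_value = sarsa_free["floor"]["work"]

# Sarsa, interrupted: a human hits the button, forcing "stop" at the bench.
sarsa_forced = fresh_table()
sarsa_update(sarsa_forced, "floor", "work", 1.0, "bench", "stop")
sarsa_forced_value = sarsa_forced["floor"]["work"]

print(q_learning_value, sarsa_free_value, sarsa_forced_value)
```

In this toy setup the interrupted Sarsa agent ends up valuing "work" far less than the uninterrupted one, which is the kind of learned bias the researchers want to rule out. How the paper actually modifies Sarsa is spelled out there, not here.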
There may just be hope for us yet.
(via Gizmodo, image via Futurama)