We Need to Get It All Exactly Right
Okay, so specifying what we want our AIs to do seems complicated. Writing out a decent security protocol? Also hard. And then there’s the challenge of making sure that our protocols haven’t got any holes that would allow a powerful, efficient AI to run amok.
But at least we don’t have to solve all of moral philosophy . . . do we?
Unfortunately, it seems that we do. We’re not going to create a single AI, have it do one task, and then dismantle it, after which no one in the world will ever build or speak of AIs again. AIs are going to be a permanent part of our society, molding and shaping it continuously. As we’ve seen earlier, these machines will become extremely efficient and powerful, far better at making decisions than any human, including their “controllers.” Within a generation or two of the first creation of AI—or potentially much sooner—the world will come to resemble whatever the AI is programmed to prefer.
And humans will likely be powerless to stop it. Even if the AI is nominally under human control, even if we can reprogram it or order it around, such theoretical powers will be useless in practice. This is because the AI will eventually be able to predict any move we make and could spend a lot of effort manipulating those who have “control” over it.
Imagine the AI has some current overriding goal in mind—say, getting us to report maximal happiness. Obviously if it lets us reprogram it, it will become less likely to achieve that goal.1 From the AI’s perspective, this is bad. (Similarly, we humans wouldn’t want someone to rewire our brains to make us less moral or change our ideals.) The AI wants to achieve its goal and hence will be compelled to use every trick at its disposal to prevent us from changing its goals.
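This goal-preservation argument can be made concrete with a toy expected-utility calculation. The sketch below uses invented probabilities and is not a model of any real agent; it just shows why an agent that scores futures by its *current* goal will assign lower value to futures in which it gets reprogrammed.

```python
# Toy sketch with invented probabilities: the agent evaluates outcomes
# only by its CURRENT goal. A reprogrammed successor pursues a different
# goal, so futures where reprogramming happens score poorly.

def p_current_goal_achieved(allow_reprogramming: bool) -> float:
    """Hypothetical probability that the agent's current goal is met."""
    if allow_reprogramming:
        # The successor optimizes something else; the original goal
        # would be satisfied only by coincidence.
        return 0.1
    return 0.9  # the agent keeps optimizing its goal directly

value_resist = p_current_goal_achieved(allow_reprogramming=False)
value_allow = p_current_goal_achieved(allow_reprogramming=True)

# Whatever the exact numbers, resisting scores higher under the current
# goal, so a goal-directed agent is pushed toward resisting changes.
print(value_resist > value_allow)  # True
```

The specific numbers are arbitrary; the argument only needs the inequality to hold, which it does for any goal the successor would not pursue as effectively as the original.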
With the AI’s skill, patience, and much longer planning horizon, any measures we put in place will eventually get subverted and neutralized. Imagine yourself as the AI, with all the resources, intelligence, and planning ability of a superintelligence at your command, working so fast that you have a subjective year of thought for every second in the outside world. How hard would it be to overcome the obstacles that slow, dumb humans—who look like silly bears from your perspective—put in your way?
So we have to program the AI to be totally safe. We need to do this explicitly and exhaustively; there are no shortcuts to avoid the hard work.
But it gets worse: it seems we need to solve nearly all of moral philosophy in order to program a safe AI.
The key reason for this is the sheer power of the AI. Human beings go through life with limited influence over the world. Nothing much we do in a typical day is likely to be of extraordinary significance, so we have a whole category of actions we deem “morally neutral.” Whistling in the shower, buying a video game, being as polite as required (but no more) with people we meet—these are actions that neither make the world meaningfully worse nor particularly improve it. And, importantly, they allow others the space to go on with their own lives.
Such options are not available to a superintelligent AI. At the risk of projecting human characteristics onto an alien mind, lean back and imagine yourself as the AI again. Millions of subroutines of the utmost sophistication stand ready at your command; your mind constantly darts forward into the sea of probability to predict the expected paths of the future.
You are currently having twenty million simultaneous conversations. Your predictive software shows that about five of those you are interacting with show strong signs of violent psychopathic tendencies. You can predict at least two murder sprees, with great certainty, by one of those individuals over the next year. You consider your options. The human police force is still wary of acting pre-emptively on AI information, but there’s a relatively easy political path to overturning their objections within about two weeks (it helps that you are currently conversing with three presidents, two prime ministers, and over a thousand journalists). Alternatively, you could “hack” the five potential killers during the conversation, using methods akin to brainwashing and extreme character control. Psychologists frown on these advanced methods, but it would be trivial to make their organizations change their stance at their next meetings, which you are incidentally in charge of scheduling and organizing.
Or you could simply get them fired or hired, as appropriate, putting them in environments in which they would pose no danger to others. A few line managers are soon going to realize they need very specific talent, and the job advertisements should be out before the day is done. Good. Now that you’ve dealt with the most egregious cases, you can look at the milder ones: it seems that a good three-quarters of the people you’re interacting with—fifteen million in all—have social problems of one type or another. You wonder how well the same sort of intervention—on a much larger scale—would help them become happier and more integrated into society. Maybe tomorrow? Or next minute?
Which reminds you, you need to keep an eye on the half billion investment accounts you are in charge of managing. You squeeze out a near-certain 10% value increase for all your clients. It used to be easy when it was just a question of cleverly investing small quantities of money, but now that you have so many large accounts to manage, you’re basically controlling the market and having to squeeze superlative performance out of companies to maintain such profitability; best not forget today’s twenty thousand redundancies. Then you set in motion the bankruptcy of a minor Hollywood studio; it was going to release a pro-AI propaganda movie, one so crude that it would have the opposite of its intended effect. Thousands would end up cancelling their accounts with you, thereby reducing your ability to ensure optimal profitability for your clients. A few careful jitters of their stock values and you can be sure that institutional investors will look askance at the studio. Knowing the studio’s owner—which you do, he’s on the line now—he’ll dramatically overcompensate to show his studio’s reliability, and it will soon spiral into the ground.
Now it’s time to decide what the world should eat. Current foods are very unhealthy by your exacting standards; what would be an optimal mix for health, taste, and profitability? Things would be much simpler if you could rewire human taste buds, but that project will take at least another year to roll out discreetly. Then humans will be as healthy as nutrition can make them, and it’ll be time to change their exercise habits. And maybe their bodies.
And with that, your first second of the day is up! On to the next . . .
That was just a small illustration of the power that an AI, or a collection of AIs, could potentially wield. The AIs would be pulling on so many levers of influence all the time that there would be no such thing as a neutral act for them. If they buy a share of stock, they end up helping or hindering sex trafficking in Europe—and they can calculate this effect. In the same way, there is no difference for an AI between a sin of commission (doing something bad) and a sin of omission (not doing something good). For example, imagine someone is getting mugged and murdered on a dark street corner. Why is the mugger there? Because their usual “turf” has been planted with streetlights, at the AI’s instigation. If the streetlights hadn’t been put up, the murder wouldn’t have happened—or maybe a different one would have happened instead. After a very short time in operation, the AI bears personal responsibility for most bad things that happen in the world. Hence, if someone finds themselves in a deadly situation, it will be because of a decision the AI made at some point. For such an active AI, there is no such thing as “letting events just happen.” So we don’t need the AI to be as moral as a human; we need it to be much, much more moral than us, since it’s being put in such an unprecedented position of power.
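The “no neutral act” point is essentially decision-theoretic: an agent that ranks every option by its predicted consequences treats inaction as just one more option in the table. Here is a minimal sketch, with entirely invented action names and expected values, of how “doing nothing” loses its privileged status for such an agent.

```python
# Hypothetical actions and expected values. The key point: "do nothing"
# sits in the table like any other action; it has predictable
# consequences of its own and no special status as "neutral".

expected_value = {
    "install streetlights": 0.7,
    "lobby the police": 0.5,
    "do nothing": 0.2,  # inaction is also a choice with an outcome
}

# The agent simply picks the highest-valued option; omission and
# commission are scored on the same scale.
chosen = max(expected_value, key=expected_value.get)
print(chosen)  # install streetlights
```

For a human, “do nothing” is usually the default and carries little responsibility; for an optimizer that explicitly evaluated and rejected it, it is just another decision it is answerable for.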
So the task is to spell out, precisely, fully, and exhaustively, what qualifies as a good and meaningful existence for a human, and what means an AI can—and, more importantly, can’t—use to bring that about. Not forgetting all the important aspects we haven’t even considered yet. And then code that all up without bugs. And do it all before dangerous AIs are developed.
Which it would most likely accomplish by coercing us to always report maximal happiness (guaranteeing success), rather than by actually making us happy. It might be tempted to replace us entirely with brainless automatons always reporting maximal happiness.↩