How can humans responsibly use AI? What types of motives should humans program into AI?
According to Superintelligence by Nick Bostrom, making sure every superintelligent AI has good ultimate motives may be the most important part of AI development. In the end, the AI’s own motives will be the only thing that constrains its behavior.
Keep reading for several approaches to the responsible use of AI.
1. Hard-Coded Commandments
As Bostrom remarks, one approach to the responsible use of AI is to hard-code a set of imperatives that constrain the AI’s behavior. However, he expects that this is not practicable. Human legal codes illustrate the challenge of concretely defining the line between acceptable and unacceptable behavior: Even the best legal codes have loopholes, can be misinterpreted or misapplied, and require occasional changes. Writing a comprehensive code of conduct for a superintelligent AI that would be universally applicable for all time would be a monumental task, and probably an impossible one.
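To see why a finite rule list is so hard to make airtight, consider a minimal sketch of a “commandment filter.” Everything in it is a made-up illustration rather than anything Bostrom describes: the filter blocks only the misbehaviors its authors thought to name, so an unanticipated action slips straight through, the software equivalent of a legal loophole.

```python
# Toy illustration (not from Bostrom's book): a hard-coded rule filter
# that blocks only the specific misbehaviors its authors anticipated.

FORBIDDEN_ACTIONS = {
    "harm_human",
    "deceive_human",
    "seize_resources",
}

def is_allowed(action: str) -> bool:
    """Return True unless the action exactly matches a forbidden rule."""
    return action not in FORBIDDEN_ACTIONS

print(is_allowed("harm_human"))                          # False: anticipated and blocked
print(is_allowed("persuade_humans_to_harm_each_other"))  # True: a loophole
```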
Commandments and Free Will

The question of free will presents additional complications for this approach. Even if rules and regulations are created to eliminate loopholes, misinterpretations, and so on, they’ll only restrain people if those people choose, using their free will, to obey them. The question is, would AI evolve a free will that would empower it to disobey rules it doesn’t want to follow?

Admittedly, there is some debate over whether human free will is real or just an illusion, and more debate about whether it will ever be possible to endow an AI with free will. But some sources assert that free will is an essential component of human cognition, playing a key role in consciousness and higher learning capabilities. If this proves true, then free will might be an essential component of general intelligence, in which case any AI with superhuman general intelligence would have free will. The AI could then choose to disobey a pre-programmed code of conduct, further complicating the problem of controlling its behavior. This possibility reinforces Bostrom’s assertion that hard-coded commandments are probably not the best approach to giving an AI the right motives.
2. Existing Motives
Another approach that Bostrom discusses is to create a superintelligent AI by increasing the intelligence of an entity that already has good motives, rather than trying to program them from scratch. This approach might be an option if superintelligent AI is achieved by the method of brain simulation: Choose a person with exemplary character and scan her brain to create the original model, then run the simulation on a supercomputer that allows it to think much faster than a biological brain.
However, Bostrom points out that there is a risk that nuances of character, like a person’s code of ethics, might not be faithfully preserved in the simulation. Furthermore, even a faithful simulation of someone with good moral character might be tempted to abuse the powers of a superintelligent AI.
Does Power Corrupt?

The risk Bostrom identifies, that even a person of good character might abuse the capabilities of a superintelligent AI, calls to mind the old adage that power corrupts those who wield it. A psychological study published the same year as Bostrom’s book found evidence for this. When people were given the choice between options that benefited everyone and options that benefited themselves at others’ expense, those with higher levels of integrity initially tended to choose the options that benefited everyone, while those with lower levels of integrity chose the opposite. But over time, this difference disappeared, and everyone leaned toward the options that benefited themselves.

Thus, the risk that a superintelligent AI based on a simulation of a human brain would pursue its own objectives at other people’s expense appears to be significant, even if the original human was a person of good character. In addition, if the person’s moral code wasn’t completely preserved in the simulation, a risk Bostrom also warns about, the AI would probably show selfish tendencies even sooner.
3. Discoverable Ethics
Bostrom concludes that the best method of endowing a superintelligent AI with good motives will likely be to give it criteria for figuring out what is right and let it set its own goals. After all, a superintelligent AI would be better than human programmers at figuring out what humans want from it and programming itself accordingly. This approach would also make the AI behave somewhat more cautiously, because it would always have some uncertainty about its ultimate goals.
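To make the idea that uncertainty breeds caution concrete, here’s a minimal sketch, purely an illustrative assumption rather than an algorithm Bostrom specifies: the agent holds a few candidate goals, weighted by how likely it thinks each one is correct, and discounts irreversible actions whenever those candidates sharply disagree about their value.

```python
# Toy illustration (not Bostrom's proposal): an agent that is unsure which
# goal is the right one, so it avoids irreversible actions that its
# candidate goals disagree about. All names and numbers are made up.

from dataclasses import dataclass

@dataclass
class CandidateGoal:
    name: str
    probability: float  # the agent's credence that this goal is the right one

    def score(self, action: str) -> float:
        # Hypothetical payoffs: how well each action serves this goal.
        table = {
            ("maximize_wellbeing", "build_hospital"): 0.9,
            ("maximize_wellbeing", "pave_over_park"): 0.2,
            ("preserve_nature",    "build_hospital"): 0.3,
            ("preserve_nature",    "pave_over_park"): -0.8,
        }
        return table.get((self.name, action), 0.0)

IRREVERSIBLE = {"pave_over_park"}  # actions that can't be undone

def choose_action(goals: list[CandidateGoal], actions: list[str]) -> str:
    best_action, best_value = None, float("-inf")
    for action in actions:
        # Expected value of the action across all candidate goals.
        expected = sum(g.probability * g.score(action) for g in goals)
        # Caution: if the goals disagree sharply and the action can't be
        # undone, penalize it instead of gambling.
        spread = max(g.score(action) for g in goals) - min(g.score(action) for g in goals)
        if action in IRREVERSIBLE and spread > 0.5:
            expected -= spread
        if expected > best_value:
            best_action, best_value = action, expected
    return best_action

goals = [CandidateGoal("maximize_wellbeing", 0.6), CandidateGoal("preserve_nature", 0.4)]
print(choose_action(goals, ["build_hospital", "pave_over_park"]))  # -> build_hospital
```

The particular numbers don’t matter; the point is the behavior: because the agent can’t be sure which goal is correct, it defaults to the reversible option rather than gambling on the contested, irreversible one.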
However, Bostrom also notes that (at least as of 2014) no one had developed a rigorous algorithm for this approach, so there’s a risk that this method might not be feasible in practice. And even if we assume that the basic programming problem will eventually be solved, deciding what criteria to give the AI is still a non-trivial problem.
For one thing, if the AI focused on what its original programmers want, it would prioritize the desires of a few people over everyone else’s. It would be more equitable to have it figure out what everyone wants and generally take no action on issues that people disagree about. But for any given course of action, there’s probably somebody with a dissenting opinion, so where should the AI draw the line?
Then there’s the problem of humans’ own conflicting desires. For example, maybe one of the programmers on the project is trying to quit smoking. At some level, she wants a cigarette, but she wouldn’t want the AI to pick up on her craving and start smuggling her cigarettes while she’s trying to kick the habit.
Bostrom discusses two possible solutions to this problem. One is to program the AI to account for such conflicts: Instead of just figuring out what humans want, have it figure out what humans would want if they were more like the people they want to be. The other is to program the AI to figure out and pursue what is morally right, rather than what people want per se.
But both solutions entail some risks. Even what people want to want might not be what’s best for them, and even what’s morally best in an abstract sense might not be what they want. Moreover, humans have yet to unanimously agree on a definition or model of morality.
Would Liberty Be a Better Criterion?

As Bostrom points out, there are risks and challenges in letting an AI discover its motives by working out what people might want it to do. In addition to the problem of conflicting desires, there’s a risk that the AI might misinterpret people’s desires, or decide to manipulate and control what people want in order to reduce its uncertainty about their desires. To mitigate this risk, developers might add a qualifier to the AI’s goal-discovery criteria that instructs, “figure out what people want without influencing them.” This instruction would program the AI to respect individual liberty.

But if individual liberty is the ultimate goal, why not just use the criterion “figure out what would maximize the sum of humans’ individual liberty” instead? This would largely satisfy the “figure out what people want” criterion, because the more freedom people have, the more they’re able to fulfill their own desires. It would also arguably satisfy the “figure out what is morally right” criterion, because, as Jonathan Haidt points out in The Righteous Mind, many people consider actions that limit others’ freedom to be immoral.

The “maximize the sum of individual liberty” criterion carries its own risks and challenges: Enabling one person’s freedom often means restricting another’s, raising the question of where an AI should draw the line. This tension between maximizing individual freedom (as libertarians advocate) and maximizing public welfare (as utilitarians advocate) is, as Michael Sandel explores in Justice, a long-running debate. The question illustrates how further exploration of the problem may reveal other criteria that could help guide an AI toward a suitable code of conduct for itself.