Friday, 8 November 2019

Review: Human Compatible by Stuart Russell

Stuart Russell is a very influential figure in the Artificial Intelligence (AI) community. He is co-author of a widely studied textbook on the subject, Artificial Intelligence: A Modern Approach, which was key to moving the subject from being dominated by a formal logic approach to one that focused on communities of artificial agents operating and cooperating to maximise rewards in uncertain environments.
His most recent book, Human Compatible, is written for a general audience. It helps if you are a reasonably informed member of the general public, but the book rewards the effort: it not only gives an overview of the current successes of AI but also addresses the technical and ethical challenges with novel, constructive proposals. It is highly recommended.

Russell takes the challenges to human purpose, authority and basic well-being seriously without being unnecessarily alarmist. Even with current and imminent applications of AI there are threats to jobs, but there are also real gains in efficiency and cost-effectiveness in medicine and production.

The aim of what Russell calls the standard model of AI has been defined, almost since the field's inception, as follows:
Machines are intelligent to the extent that their actions can be expected to achieve their objectives.
However, the major challenge comes with an anticipated major step forward: the creation of general-purpose AI (GAI). By analogy with general-purpose computers, where one computer can carry out any computational task, a GAI will be able to act intelligently, that is, plan and act to achieve what it needs to achieve, on a wide range of tasks. This would include creating and prioritising new tasks. Currently, AI produces highly specialised agents. Many of the algorithms can be put to diverse uses and can learn different skills, but once a skill is learned it is exercised narrowly. For example, DeepMind's AlphaGo can play the board game Go very well, better than any human, but that is all it can do. Related underlying techniques, however, can be applied to a number of applications such as text interpretation, translation and image analysis.

The major breakthrough that would enable AI to escape from the constraint of narrow specialism has yet to be made. As indicated above, GAI would be a method that is applicable across all types of problems and would work effectively for large and difficult tasks. It is the ultimate goal of AI research. The system would need no problem-specific engineering and could be asked to teach sociology or even run a government department. A GAI would learn what it needs from all the available sources, question humans if needed, and formulate and execute plans that work. Although a GAI does not yet exist, Russell argues that a lot of progress will result from research that is not directly about building "threatening" general-purpose AI systems. It will come from research on the narrow type of AI described above, plus a breakthrough in how the technology is understood and organised.

It is a mistake to think that what will materialise is some humanoid robot. The GAI will be distributed and decentralised; it will exist across a network of computers. It may enact a specific task, such as cleaning the house, by using lower-level specialised machines, but the GAI itself will be a planner, coordinator, creator and executive. Its intelligence would not be human. It will have significant hardware advantages in speed and memory, and it will also have access to an unlimited number of sensors and actuators. The GAI would be able to reproduce and improve upon itself, and it would do this at speed.

And the role for the human? None that the GAI requires. The GAI would be able to construct the best plan to tackle climate change, for example, but on its own terms and in a way that maximised its own utility. And to maximise its utility it will find ways to stop itself being switched off.

This danger, according to Russell, comes from the aim that has guided the whole research programme so far: to give an AI objectives that become the AI's own objectives. A GAI could then invent new objectives, such as self-replication or inventing better information storage and retrieval to speed up its own actions. A GAI could then be said to have its own preferences and is effectively a form of person. It need not be malevolent, but equally it need not be motivated by benevolence to humans either.

In short, the solution proposed by Russell to the problem of human redundancy is to engineer the GAI so that it cannot have preferences of its own. Instead, it is engineered to enact the preferences of individual humans. The final third of the book is devoted to exploring the ramifications of a set of principles for benevolent machines:
  1. The machine’s only objective is to maximise the realisation of human preferences. 
  2. The machine is initially uncertain about what those preferences are.
  3. The ultimate source of information about human preferences is human behaviour.
These principles are there to inform design but also regulation and the formulation of research programmes. Russell deals with these and with the societal efforts and cooperation that will be required to deal with unintended consequences, as well as with the actions of some less than benevolent humans who may want to create a GAI that has a goal other than to satisfy human preferences. The discussion addresses the challenges of multiple artificial agents and of multiple and incompatible human preferences. Of course, Russell does not definitively solve any of these but indicates routes to resolution.
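
For readers who like to see ideas in code, the three principles listed above can be illustrated with a toy sketch. This is not Russell's formalism and is not taken from the book. It assumes a hypothetical machine choosing between two drinks, two invented preference hypotheses, and a human who chooses "noisily rationally" (a softmax choice model, which is my assumption). The machine starts uncertain (principle 2), updates its belief from observed human choices (principle 3), and then acts to maximise the human's expected utility rather than any objective of its own (principle 1).

```python
import math

# Hypothetical preference profiles: the utility the human assigns to each option.
# Both the options and the numbers are invented purely for illustration.
HYPOTHESES = {
    "prefers_tea": {"tea": 1.0, "coffee": 0.0},
    "prefers_coffee": {"tea": 0.0, "coffee": 1.0},
}


def choice_likelihood(choice, utilities, rationality=2.0):
    """P(choice | utilities), assuming a noisily rational (softmax) human."""
    weights = {opt: math.exp(rationality * u) for opt, u in utilities.items()}
    return weights[choice] / sum(weights.values())


def update_belief(belief, observed_choice):
    """Principle 3: human behaviour is the evidence about human preferences."""
    unnormalised = {
        h: p * choice_likelihood(observed_choice, HYPOTHESES[h])
        for h, p in belief.items()
    }
    total = sum(unnormalised.values())
    return {h: p / total for h, p in unnormalised.items()}


def best_action(belief):
    """Principle 1: pick the option with the highest expected human utility."""
    def expected_utility(option):
        return sum(p * HYPOTHESES[h][option] for h, p in belief.items())
    return max(["tea", "coffee"], key=expected_utility)


# Principle 2: the machine starts out uncertain about what the human wants.
belief = {"prefers_tea": 0.5, "prefers_coffee": 0.5}

# The machine watches the human make a few (occasionally inconsistent) choices.
for observed in ["coffee", "coffee", "tea", "coffee"]:
    belief = update_belief(belief, observed)

print(belief)
print(best_action(belief))
```

Even in this toy version the machine never becomes fully certain, because one inconsistent choice is always explained as noise rather than ruled out. That residual uncertainty is the kind of property Russell's principles are meant to preserve.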

It is important that this book shows there is an approach in which humanity can reap the benefits of AI research without being subjugated to a superior intelligence or being forced to implement an AI ban. It is a positive vision that I hope informs and shapes our approach to this technology. But there is urgency, because we do not know when or where the breakthrough to GAI will take place. If it happens within the current standard model pattern of research and application, then disaster threatens.