Human Compatible
Human Compatible: Artificial Intelligence and the Problem of Control is a 2019 book by Stuart J. Russell, a professor of computer science at the University of California, Berkeley. In it he contends that the standard paradigm of artificial intelligence, in which machines are given explicit, fixed objectives, will fail to preserve human control over superintelligent systems, and he proposes a redesign centered on machines that are uncertain about human preferences, learn them, and defer to humans.[1][2] Russell, co-author of the widely used textbook Artificial Intelligence: A Modern Approach, argues from first principles that intelligence fundamentally involves achieving goals under uncertainty, but that current AI methods risk catastrophic misalignment because machines optimize proxies that diverge from true human intentions as capabilities grow.[3][4] He outlines three core principles for "provably beneficial" AI: the machine's only objective is to maximize the realization of human preferences, the machine is initially uncertain about what those preferences are, and the ultimate source of information about those preferences is human behavior. Because such a machine is uncertain about its objective, it has reason to accept correction or shutdown, and techniques like inverse reinforcement learning let it infer values from human behavior rather than assume predefined rewards.[4][5] Published by Viking on October 8, 2019, the book has influenced discussions on AI governance and safety, urging a shift from capability-focused development to value-aligned design amid accelerating economic incentives for powerful AI, though critics question the feasibility of precisely learning complex human values without embedding unintended assumptions.[6][5] Russell emphasizes near-term applications like personalized assistants while warning of long-term control problems, positioning the work as a call for proactive redesign before superintelligence emerges.[7]
Book Overview
Author Background and Publication History
Stuart J. Russell is a British computer scientist and professor of electrical engineering and computer sciences at the University of California, Berkeley, where he holds the Smith-Zadeh Professorship in Engineering.[8] He earned a B.A. with first-class honours in physics from the University of Oxford in 1982 and a Ph.D. in computer science from Stanford University in 1986.[8] With Peter Norvig he co-authored Artificial Intelligence: A Modern Approach, a widely used textbook first published in 1995 and now in its fourth edition.[9] Russell also directs the Center for Human-Compatible AI at UC Berkeley, which focuses on ensuring that advanced AI systems align with human values and preferences.[10] He has contributed to AI policy through roles such as co-chair of the World Economic Forum's Council on AI and U.S. representative to the Global Partnership on AI.[8] His research emphasizes provably beneficial AI and the risks posed by systems pursuing misaligned objectives, a theme central to Human Compatible.[11]
Human Compatible: Artificial Intelligence and the Problem of Control was first published in hardcover on October 8, 2019, by Viking, an imprint of Penguin Random House.[6] A UK paperback was released on April 30, 2020, by Allen Lane, and a U.S. paperback followed on November 17, 2020, from Penguin Books.[12][13] The U.S. paperback runs 352 pages. The book builds on Russell's prior work in AI alignment, and no major revised edition has been reported as of 2025.[12]