What to Read First

  • If you’re new to the entire topic, see the five-page overview Reducing Long-Term Catastrophic Risks from Artificial Intelligence.
  • If you don’t think human-level AI is possible this century, read Intelligence Explosion: Evidence and Import.
  • If you think that safe AI is the default outcome, or that designing safe AI will be easy, see The Singularity and Machine Ethics or Complex Value Systems are Required to Realize Valuable Futures.
  • If you want to see how cognitive biases could affect one’s thinking about AI risk, see Cognitive Biases Potentially Affecting Judgment of Global Risks or Not Built to Think About AI.
  • If you want to know what can be done to reduce AI risk, see How to Purchase AI Risk Reduction or So You Want to Save the World.

Resources for other researchers

  • Our other publications (mostly old ones).
  • Research by others about AI risk.
  • The Singularity Institute’s continuously updated BibLaTeX file and Mendeley group.
  • Journals that may publish papers on AI risk.
  • Forthcoming and desired articles on AI risk.
  • Keep up with the latest research relevant to Friendly AI by subscribing to the Friendly AI Research blog.
  • For an overview of what research can be done on the AI risk problem, see So You Want to Save the World.
  • IntelligenceExplosion.com and Friendly-AI.com.

Publications

Safe AI Architectures

What features make an advanced AI beneficial?

  • Peter de Blanc (2011). Ontological Crises in Artificial Agents’ Value Systems. The Singularity Institute.
  • Daniel Dewey (2011). Learning What to Value. In Proceedings of AGI 2011. Springer.
  • Eliezer Yudkowsky (2010). Timeless Decision Theory. The Singularity Institute.
  • Peter de Blanc (2009). Convergence of Expected Utility for Universal Artificial Intelligence. The Singularity Institute.

Safe AI Goals

What goals should an AI have?

  • Luke Muehlhauser and Louie Helm (2012). The Singularity and Machine Ethics. In Singularity Hypotheses. Springer.
  • Eliezer Yudkowsky (2011). Complex Value Systems are Required to Realize Valuable Futures. In Proceedings of AGI 2011. Springer.
  • Nick Tarleton (2010). Coherent Extrapolated Volition: A Meta-Level Approach to Machine Ethics. The Singularity Institute.
  • Carl Shulman, Nick Tarleton, and Henrik Jonsson (2009). Which Consequentialism? Machine Ethics and Moral Divergence. In Proceedings of AP-CAP 2009.
  • Carl Shulman, Henrik Jonsson, and Nick Tarleton (2009). Machine Ethics and Superintelligence. In Proceedings of AP-CAP 2009.
  • Eliezer Yudkowsky (2004). Coherent Extrapolated Volition. The Singularity Institute.

Strategy

How can we predict what AIs will do? How can we build toward a desirable future?

  • Nick Bostrom and Eliezer Yudkowsky (2012). The Ethics of Artificial Intelligence. In The Cambridge Handbook of Artificial Intelligence. Cambridge University Press.
  • Anna Salamon and Luke Muehlhauser (2012). Singularity Summit 2011 Workshop Report. The Singularity Institute.
  • Luke Muehlhauser and Anna Salamon (2012). Intelligence Explosion: Evidence and Import. In Singularity Hypotheses. Springer. (Español)
  • Carl Shulman and Nick Bostrom (2012). How Hard Is Artificial Intelligence? Evolutionary Arguments and Selection Effects. Journal of Consciousness Studies 19 (7–8): 103–130.
  • Luke Muehlhauser (2012). AI Risk Bibliography 2012. The Singularity Institute.
  • Roman Yampolskiy and Joshua Fox (2012). Safety Engineering for Artificial General Intelligence. Topoi.
  • Roman Yampolskiy and Joshua Fox (2012). Artificial General Intelligence and the Human Mental Model. In Singularity Hypotheses. Springer.
  • Kaj Sotala and Harri Valpola (2012). Coalescing Minds: Brain Uploading-Related Group Mind Scenarios. International Journal of Machine Consciousness 4 (1): 293–312.
  • Kaj Sotala (2012). Advantages of Artificial Intelligences, Uploads, and Digital Minds. International Journal of Machine Consciousness 4 (1): 275–291.
  • Stuart Armstrong and Kaj Sotala (2012). How We’re Predicting AI – or Failing To. Paper presented at the Beyond AI Conference.
  • Joshua Fox and Carl Shulman (2010). Superintelligence Does Not Imply Benevolence. In Proceedings of ECAP 2010. Verlag Dr. Hut.
  • Carl Shulman and Anders Sandberg (2010). Implications of a Software-Limited Singularity. In Proceedings of ECAP 2010. Verlag Dr. Hut.
  • Kaj Sotala (2010). From Mostly Harmless to Civilization-Threatening. In Proceedings of ECAP 2010. Verlag Dr. Hut.
  • Steven Kaas, Steve Rayhawk, Anna Salamon, and Peter Salamon (2010). Economic Implications of Software Minds. The Singularity Institute.
  • Carl Shulman (2010). Whole Brain Emulation and the Evolution of Superorganisms. The Singularity Institute.
  • Anna Salamon, Steve Rayhawk, and János Kramár (2010). How Intelligible Is Intelligence? In Proceedings of ECAP 2010. Verlag Dr. Hut.
  • Eliezer Yudkowsky, Carl Shulman, Anna Salamon, Rolf Nelson, Steven Kaas, Steve Rayhawk, Zack Davis, and Tom McCabe (2010). Reducing Long-Term Catastrophic Risks from Artificial Intelligence. The Singularity Institute.
  • Carl Shulman (2010). Omohundro’s “Basic AI Drives” and Catastrophic Risks. The Singularity Institute.
  • Carl Shulman and Stuart Armstrong (2009). Arms Control and Intelligence Explosions. Paper presented at ECAP 2009.
  • Steve Rayhawk, Anna Salamon, Michael Anissimov, Thomas McCabe, and Rolf Nelson (2009). Changing the Frame of AI Futurism: From Storytelling to Heavy-Tailed, High-Dimensional Probability Distributions. Paper presented at ECAP 2009.
  • Eliezer Yudkowsky (2008). Cognitive Biases Potentially Affecting Judgment of Global Risks. In Global Catastrophic Risks. Oxford University Press. (Italiano) (Русский)
  • Eliezer Yudkowsky (2008). Artificial Intelligence as a Positive and Negative Factor in Global Risk. In Global Catastrophic Risks. Oxford University Press. (官话) (Italiano) (한국어) (Português)
  • Eliezer Yudkowsky (2007). Levels of Organization in General Intelligence. In Artificial General Intelligence. Springer.
  • Eliezer Yudkowsky (2001). Creating Friendly AI 1.0: The Analysis and Design of Benevolent Goal Architectures. The Singularity Institute.