MS&E338/CS338 Aligning Superintelligence

Within a couple of decades, or less, it is plausible that humans will create an AI that is much smarter than humans in practically all domains of human activity. We refer to such an AI as a superintelligence. The alignment problem is how to make sure that such a superintelligence acts according to its human designer's intent. This course is intended for a technical audience interested in thinking about this problem. But why would an AI not act according to its human designer's intent? And if the AI were to misbehave, wouldn't the designer just modify it or shut it down? Furthermore, even if we accept that the AI will not always behave as intended, why should this be considered a major source of risk, let alone a catastrophic risk? Why are some people saying that these risks should be a global priority on par with pandemics and nuclear war, while others are saying that these concerns are overhyped? In this course, we will discuss these and related questions.
There will be a Google Doc (link) that contains additional details about the course. This document is the main hub for course information and will be updated throughout the course. Anyone with a Stanford email has access. If you are curious about the course but not at Stanford, please reach out and we will invite you.

Prerequisites

The course will place special emphasis on formalizing ideas. About ⅔ of the course will be theoretical and ⅓ empirical. To have the background to participate, each student is recommended to have taken:
What this course is not about
Logistics

Semyon Lomasov, slomasov AT stanford DOT edu
Benlin Gan, bgan2 AT stanford DOT edu