
AI Alignment (2023)
This is the old curriculum from 2023. The new 2024 curriculum is available here. We strongly recommend you do the new course, and only follow this course if you were already part-way through it.

AI systems are rapidly becoming more capable and more general. Despite AI's potential to radically improve human society, there are still open questions about how we build AI systems that are controllable, aligned with our intentions and interpretable. You can help develop the field of AI safety by working on answers to these questions.

The AI Alignment course is designed to introduce the key concepts in AI safety and alignment, and will give you space to engage with, evaluate and debate these ideas. You'll meet others who are excited to help mitigate risks from future AI systems, and explore opportunities for your next steps in the field.

The course was originally designed by Richard Ngo, and is run by BlueDot Impact – a non-profit that supports people to develop the knowledge, skills and connections they need to pursue a high-impact career.
Curriculum
Unit 0: Introduction to Machine Learning
Unit 1: Artificial General Intelligence
Unit 2: Reward misspecification and instrumental convergence
Unit 3: Goal misgeneralisation
Unit 4: Task decomposition for scalable oversight
Unit 5: Adversarial techniques for scalable oversight
Unit 6: Interpretability
Unit 7: Governance
Unit 8: Agent foundations
Unit 9: Careers and Projects