AI Alignment (2023)
0. Introduction to Machine Learning
1. Artificial General Intelligence
2. Reward misspecification and instrumental convergence
3. Goal misgeneralisation
4. Task decomposition for scalable oversight
5. Adversarial techniques for scalable oversight
6. Interpretability
7. Governance
8. Agent foundations
9. Careers and Projects
Unit 1: Artificial General Intelligence