Unit 1: Further Understanding the Alignment Problem
Resources (1 hr 20 mins)
- AGI Ruin: A List of Lethalities
- Where I agree and disagree with Eliezer
- Worst case thinking in AI alignment
- Empirical findings generalize surprisingly far
Optional Resources
- Optimal Policies Tend To Seek Power
- Reward is not the optimization target
- Advanced artificial agents intervene in the provision of reward
- Risks from Learned Optimisation: Deceptive alignment
- The theory-practice gap
- Yudkowsky contra Ngo on agents
- How do we become confident in the safety of a machine learning system?