Unit 3: Challenges in achieving AI safety
Resources: Challenges in achieving AI safety
Resources (1 hr 20 mins)
- Purpose of session
- Emergent Deception and Emergent Optimisation
- AI Safety Seems Hard to Measure
- Compilation: Why Might Misaligned, Advanced AI Cause Catastrophe?
- Nobody’s on the ball on AGI alignment
- Avoiding Extreme Global Vulnerability as a Core AI Governance Problem
Optional Resources
- What are some arguments for AI safety being less important?
- Goal Misgeneralisation: Why Correct Specifications Aren’t Enough For Correct Goals
- What failure looks like
- Goal Misgeneralisation examples
- The other alignment problem: mesa-optimisers and inner alignment
- Beyond Near- and Long-Term: Towards a Clearer Account of Research Priorities in AI Ethics and Society
- AI Timelines: Where the Arguments, and the "Experts," Stand
- What Everyone in Technical Alignment is Doing and Why
- Discontinuous progress in history: an update
- AI Governance: Opportunity and Theory of Impact
- Coordination challenges for preventing AI conflict
- AI Governance Optional Resources (Extended)