Unit 4: Understanding AI
How does AI think?
Resources (35 mins)
- What Do Neural Networks Really Learn? Exploring the Brain of an AI Model
- Introduction to Mechanistic Interpretability
- Neel Nanda on the race to read AI minds
- The Misguided Quest for Mechanistic AI Interpretability
Optional Resources
- MoSSAIC: AI Safety After Mechanism
- Let's Try To Understand AI Monosemanticity
- Against Almost Every Theory of Impact of Interpretability
- Interpretability Will Not Reliably Find Deceptive AI
- Barriers to Mechanistic Interpretability for AGI Safety