The challenge
Transformative AI could be humanity’s most consequential technology, with potential for both immense benefit and catastrophic harm through misuse or loss of control.
Despite this, few people with the right combination of skills, context, and motivation are working in roles that reduce these risks.
Our courses aim to equip people with these attributes so they can contribute to reducing catastrophic risks from AI.
However, we’ve realised we’re not certain exactly what we need people to be working on for AI to go well, which makes it hard to accurately target our course portfolio and content towards this objective.
We looked outwards for an existing AI safety strategy or plan that could guide us, but we didn’t find anything that met our needs. In particular, we wanted something that is:
- Sufficient: if all the actions are carried out, we would consider the world to be in a good state
- Feasible: we think it’s reasonably plausible the plan can be executed. This generally excludes plans that require actors to take significant actions against their own interests.
- Action-orientated: the plan is a set of actions that explain how people can contribute (and thus how our courses could prepare people to contribute), rather than just a list of events that happen.
- Comprehensive: the plan covers all the actions needed, not just those for a short time frame or a single jurisdiction.
We are therefore trying to compile an AI safety strategy that meets these criteria, to inform both our courses and the wider field.
For more details about what we’re trying to achieve, see the context section of the public work test details.