Resources: Careers and Projects

AI safety is a young field with few legible opportunities but, counterintuitively, there’s a lot of work to be done. Figuring out what to do can be difficult, as there aren't many default paths to follow.
This week's main objective is to set some time aside to think about your career and goals, and to propose a project you could work on over the next four weeks to help you work towards those goals, informed by the knowledge you've gained throughout this course. The project could include learning a new skill, trying to develop a tentative opinion about an important topic and writing about it, starting a longer-term project, testing fit for different roles, etc. 
The main focus of this week is the two exercises, which you can find after the resources section. Please submit these before the session. Having any plan helps resolve decision paralysis, and you will likely revisit and adapt this plan as you take action and learn more. The important thing, for now, is to get started.

How to approach this week
1. Read the provided resources (there are fewer than usual)
2. Complete the Big picture plan exercise before the session, focused on brainstorming different long-term paths you could pursue
3. Complete the Capstone projects & next steps exercise before the session, focused on what self-directed project(s) you could work on over the next four weeks
During the session with your cohort, you’ll have the opportunity to give feedback to your peers and receive feedback on your own plans, and to discuss how you can support each other to achieve your goals. Don't worry if any of these plans are rough drafts, or if you’re very uncertain. Your cohort is there to support you, and the intent of the session is for you to help each other improve and refine your plans.

Working on your project
After the session, you have the next four weeks to work on your project! We'll ask for submissions or reports at the end to see what people got up to (these won't be "graded", but they are very helpful for our impact evaluations and for informing our decisions about future courses).
The Slack will remain open for conversations with your cohort, and for general help, advice, or feedback on your project, whether from your cohort or others. As alumni of the AGI Safety Fundamentals: Alignment Course, you’ll also have access to follow-on courses and workshops we intend to run in the future!

By the end of the unit, you should be able to:
- Career plan: Develop a draft of a big picture/career plan that you feel motivated to work towards in both the long term and the immediate term.
- Actionable, accountable steps: Within this career plan, develop a 4-week plan for how you’ll make progress on addressing your key uncertainties. Break this into actionable, week-by-week SMART goals that you’ll be accountable for over the next four weeks.
- Aiming towards the alignment problem: If your plan is intended to address AI alignment, make a case for how your final objective makes progress on the alignment problem.
- Moving swiftly towards the aim: Make a case for how your intended 4-week plan helps you reach your final objective, via addressing your key uncertainties.


Big picture plan
During this course, you've learnt a lot about alignment, and you may now have a sense of the opportunities currently available. How might you factor this into your own career's big picture plan?
This exercise involves brainstorming different long-term paths you could pursue, similar to the suggestions by 80,000 Hours [here](https://80000hours.org/career-planning/process/longer-term-paths/). We encourage you to respond to the prompts below, list any uncertainties you have, and submit your responses in the text box.

- To help you consider a variety of different paths forward, brainstorm 3-5 roles you could have in 5-10 years, or different projects you could be pursuing, that are important for AI safety
- You could start by suggesting roles you think are needed to make AI safety go well, then evaluate what roles currently exist and what [jobs](https://www.agisafetyfundamentals.com/opportunities) are [available](https://jobs.80000hours.org/?refinementList%5Btags_area%5D%5B0%5D=AI%20safety%20%26%20policy) (the available jobs do not necessarily reflect the work that needs to be done!)
    - You could also consider roles that are adjacent or related to AI safety, and that would help with developing skills and relationships.
- Based on all these ideas, identify a few roles that seem both important for AI safety and that you'd be excited to test your fit for or work on.

Other prompts:
- Clarify the strengths, skills, competencies, and other career capital you already have. List 1-3 of these assets below.
- Brainstorm other skills or areas of expertise that you could develop, that would enhance your ability to contribute. Try to focus on skills that are both [rare and valuable](https://www.scotthyoung.com/blog/2022/08/21/key-career-progress/).
    - Another framing for this is to have [sets of skills](https://www.lesswrong.com/posts/XvN2QQpKTuEzgkZHY/being-the-pareto-best-in-the-world) where very few people have capabilities in all of them, but that can be leveraged together in high-value ways.
    - We encourage you to consider options that you'd be excited to explore, or have a comparative advantage for.


Capstone projects & next steps
The next few weeks are an opportunity for you to receive support and encouragement from your cohort to pursue a self-directed project that helps you address key uncertainties in your career plan and in how you could contribute to AI safety.
You might want to consider a project that helps you [develop your own hypotheses](https://www.cold-takes.com/learning-by-writing/) about important questions relevant to the alignment problem or specific research agendas, that sets you up on a path to [master a relevant skill](https://www.scotthyoung.com/blog/2020/04/01/skills-that-matter/), or that involves [cheap tests](https://80000hours.org/career-guide/personal-fit/#how-to-explore-cheap-tests-first) to help you test your fit for different roles and career paths.
Your approach could look like this: 
- Brainstorm as many actions as you can think of that would help you resolve your key uncertainties.
- Pick the actions that seem most valuable or interesting to you, then assign them to individual weeks, and make them [SMART goals](https://en.wikipedia.org/wiki/SMART_criteria) (specific, measurable, achievable, realistic, and time-related).
- What could go wrong in your plan? Revise any parts of your plan that seem brittle or likely to fail.
- Think about what kind of accountability mechanism you’d like the group to have, and what kinds of collaboration you’d ideally like within the group.


AI Alignment (2023)
