Getting good people retrospective (AI Alignment March 2024)
This part of the retrospective is primarily for me and Li-Lian, to understand how we can continue to attract and accept strong participants to the course.
Despite aiming to figure out how we should be prioritising paid marketing, I did not get a good answer on this. This is because (1) we did not collect great data here, and (2) the data we do have suggests that what we did wasn’t great overall - and the number of people we got from paid ads was too small to analyse reasonably. We have since significantly improved the way we do paid marketing, which I hope to write more about in future.
The article therefore primarily focuses on our application process, and considers who we should be accepting. It also touches on a number of non-paid marketing sources.
What do we count as a ‘strong participant’?
We identified strong participants based on pre- and post-course factors. Specifically, we considered someone a strong participant if any of the following applied:
- had a ‘strong yes’ application decision. Applications are scored on several metrics, then based on these scores + some subjective judgement are given an overall ranking of ‘strong no’, ‘weak no’, ‘neutral’, ‘weak yes’, or ‘strong yes’. We put a lot of work into calibrating these rankings so participants are evaluated fairly. We do another pass over these, and usually accept ‘weak yes’ and ‘strong yes’ participants, and some ‘neutral’ participants.
- submitted a project that we evaluated as ‘high-quality’. Projects are scored on several metrics, then based on these scores + some subjective judgement are given an overall ‘low’, ‘moderate’, or ‘high’ quality ranking that is carefully standardised. We put a fair amount of work into calibrating these rankings, but less than for our application process (as this data is primarily used in aggregate to review course performance). The bar for a ‘high-quality’ project is genuinely quite high: it needs to be well-communicated novel work that we expect will be useful for others, with direct relevance to AI safety.
- were highlighted by a facilitator. At the end of the course, we send facilitators a feedback survey which allows them to highlight particularly high-promise participants. There is a fair bit of variance between facilitators in this.
We also identified ‘highly engaged’ participants. We define these as those who missed at most one cohort session. In future, we’d like to improve this measure, perhaps by reviewing activity in cohort documents or our Slack.
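To make these definitions concrete, here’s a minimal sketch of how we’d compute both flags, assuming a pandas DataFrame with one row per participant (the column names and values here are hypothetical, not our actual schema):

```python
import pandas as pd

# Hypothetical per-participant table; column names and values are illustrative only.
participants = pd.DataFrame({
    "application_decision": ["strong yes", "neutral", "weak yes"],
    "project_quality": ["moderate", "high", None],  # None = no project submitted
    "facilitator_highlight": [False, False, True],
    "sessions_missed": [0, 2, 1],
})

# A 'strong participant' satisfies any of the three criteria above.
participants["strong"] = (
    (participants["application_decision"] == "strong yes")
    | (participants["project_quality"] == "high")
    | participants["facilitator_highlight"]
)

# 'Highly engaged' means missing at most one scheduled cohort session.
participants["highly_engaged"] = participants["sessions_missed"] <= 1

print(participants[["strong", "highly_engaged"]])
```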
Both of these are likely to be highly lossy proxies for impact. Anecdotally, we know of many high-impact participants who are not captured in the above.[1] We thought hard about ways to identify strong participants, and concluded that these were the best we could do without manually evaluating each participant again. Evaluating each participant would be very time-intensive, and likely less helpful so soon after the course. For an example of doing this longer after a course, see our recent follow-up with our 2022 AI alignment course graduates.
How well does our application process identify strong participants?
Our application process does line up with facilitator-highlighted participants. 11% of 'strong yes' applicants received facilitator recommendations, compared to 5% of other applicants (statistically significant, p = 0.04).
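We haven’t spelled out the test behind these p-values in this write-up; a two-proportion z-test along the following lines gives the flavour of the comparison (the counts below are placeholders, not our real numbers - and with counts this small, Fisher’s exact test would also be a reasonable choice):

```python
# Sketch of a two-proportion comparison; placeholder counts, not our real data.
from statsmodels.stats.proportion import proportions_ztest

highlighted = [11, 15]  # facilitator-highlighted counts: ['strong yes' group, others]
totals = [100, 300]     # group sizes (hypothetical)

stat, p_value = proportions_ztest(count=highlighted, nobs=totals)
print(f"z = {stat:.2f}, p = {p_value:.3f}")
```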
However, it’s not predictive of project success or high engagement. 14% of ‘strong yes’ applicants submitted a high-quality project at the end of the course, compared to 12% of other applicants. 28% of ‘strong yes’ applicants were highly engaged on the course, compared to 30% of other applicants. Neither of these differences is statistically significant. Based on user interviews, I think there might be two factors at play here cancelling each other out:
- ‘Strong yes’ participants are often keener to learn from the course and get into AI safety. They often express clearer paths to impact, or generally come across as more excited about learning from the course.
- ‘Strong yes’ participants are often busier. Many of the best participants have senior roles in important organisations or a wide range of commitments. In addition, these participants sometimes said they didn’t feel the need to do a project because they were already effectively doing this as part of their job.
We were borderline about many strong participants.[2] 20% of the strong participants were evaluated ‘neutral’. This is lower than the 25% of all course participants rated ‘neutral’, suggesting our application process has some predictive power, but it’s still far from accurate. Looking back at these applications, I think the key problem was that they had a vague path to impact - we have since significantly changed our application form, and provided some guidance on this in the context of our AI governance course (and plan to write a similar article for AI alignment). We did actually change our application form part way through applications, and the newer form seems slightly more accurate (but the small numbers here make it hard to say for certain - we’ll better understand the effect, if any, in the June 2024 iteration).
We probably did miss good participants. Spot-checking a few rejected applicants now and seeing what they’re up to suggests we probably should have accepted them.[3]
Could we have identified good participants with other factors?
We currently make most application decisions based on the three open questions on the application form, with a slight leaning towards professionals and away from high schoolers.
While running this course, we came up with a few theories about how different groups behave. In this section I took a look at some of the other data we have from applications, to see if it relates to the course outputs above.
Career stage
PhD students had the highest rate of producing high-quality projects (27%). This perhaps isn’t surprising: project work may be similar to what they already do for their PhD, and they’re likely to have the most flexibility in their time. It is surprising that they did so much better than postdocs and professors, who had the lowest rate (6%).
Professionals had the highest attendance. On average, professionals attended 7.6 sessions - this jumps to 8.5 sessions excluding people who didn’t turn up to any session. They missed few sessions, on average having 1.8 absences.[4]
Undergrads had poor attendance. On average they attended 5.9 sessions, increasing to 7.6 excluding non-attenders. They were also much more likely to miss scheduled sessions, with an average of 3.2 absences, and created an outsized number of manual switching requests, which are particularly taxing for us to handle.
High schoolers had the worst attendance. For this course, we experimented with a small number of talented high schoolers joining some undergraduate cohorts. Having previously had difficult experiences with high schooler participants, we had a very high bar for accepting them and tried to select for commitment. Even so, high schoolers had awful attendance - on average attending 4.5 sessions, with 3.4 absences. A couple of high schoolers did do well and submitted reasonable projects, but they were the exception, not the norm.
This matches our experiences with other courses. Our pandemics course saw similar problems with undergrads, and anecdotally the undergraduate discussion sessions we’ve reviewed seem less insightful. Professionals being the best at regularly attending is perhaps initially counterintuitive, but having seen this play out a few times we have some theories:
- Professionals have more regular schedules: most work a job with standard hours. Undergrads often have changing term schedules or different events week-to-week.
- There’s a selection bias effect where professionals who take an online course might be more serious than the average undergrad.
- Professionals are more experienced, so are generally more capable of getting themselves in the right place at the right time.
Country
Strong participants mostly come from richer countries. 94% of strong participants for whom we have country data come from countries with an average GDP per capita over $20,000, compared to 81% of other participants. This could be for several different reasons, and how we should respond depends on which ones are at play (figuring that out would likely require further research). Given most of the strong participants were identified via their projects, some theories are:
- People in richer countries are better educated coming into the course (in a way that our application process isn’t currently picking up on), particularly around communication and reasoning skills, which are fairly critical for the project.
- People in richer countries tend to be more wealthy, so are able to spend more money and time on their project. We tried to mitigate the money aspect by launching a rapid grants scheme (which excluded reimbursing people for time spent on the project).
- Internet quality and access to high-quality tech are better in richer countries, so participants are better able to get value from project sessions. Anecdotally, participants from poorer countries seem much more likely to have connection or other tech issues. We did launch audio quality grants to help mitigate this somewhat, especially because having a participant with poor tech significantly negatively affects other participants’ experiences.
There is a risk here that if we filter by country, particularly in favour of richer countries, we’d effectively be increasing inequality. It also feels quite unfair to filter by country, especially as we might be able to target the above factors directly instead.
Writing quality
Strong participants write well. All the strong participant applications we reviewed were clear and straightforward. They directly answered the questions asked, and for longer answers used topic sentences and paragraphs effectively. We did previously weight writing skill for AI governance course applicants on the advice of experts from the field, but haven’t been doing this for AI alignment. We could investigate how strong this pattern actually is, and consider adding it to our criteria.
This might be a result of using high-quality projects as a proxy for strong participants. Clear communication was a key explicit criterion in judging projects, so the two are certainly not independent. However, communication still seems to be a critical skill that most applicants could improve on. We frequently speak to people in the AI safety field who highlight communication skills as a key thing they look for in new hires or collaborators.
Takeaways
Based on the above, in the next round of the alignment course I plan to:
- Start prioritising PhD students.
- Continue prioritising professionals, and deprioritising undergrads.
- Better evaluate high schoolers’ commitment and reliability. This probably looks like asking them extra questions, and/or getting them to more explicitly commit to the course.[5] This does come with an increased risk of rejecting good participants.[2]
- Continue to not use country for applicant evaluation, given the uncertainty we have around the participant experience and fairness.
- Evaluate participants’ writing quality and communication skills, and use this to help make decisions about borderline candidates.
Additionally, during the course I plan to:
- More carefully track per-participant absences, and be stricter about removing people from the course if they have a high number of absences without good cause.
I also think we should spend more time thinking about how well we can predict strong participants, and what our strategy should be if the answer to that is ‘not very well’ (which I think the data above suggests might be the case).
Where did people come from?
Most people who applied, and most of those we accepted, were referred to our course by someone else. We ask all participants where they heard about our course on the application form, and then try to put this into rough categories. Approximately 51% of applications, and 60% of accepted participants, were recommended to take our course by someone else (the percentages below are out of total accepted participants):
- 20% from the EA community broadly
- 20% generic referrals
- 15% from 80,000 Hours (both the website and their career advising)
- 5% from MATS, GovAI, the Atlas Fellowship, Rational Animations, and Rob Miles combined
User interviews also suggested referrals were a key factor in people’s decision to apply. In line with the above analysis, most participants we interviewed mentioned applying because they saw the course recommended by others.
Some participants do a lot of research before applying. One participant said they read through the curriculum, most of our website and searched elsewhere (including searching our name on Google, and finding a Reddit thread positively discussing the course) before applying. This behaviour seemed slightly more likely in people with a more technical background (e.g. coming from software engineering or ML engineering).
Participants highlighted the availability of course resources as key to their decision. Multiple users said it was useful to look through the full curriculum and get a sense of the course:
- “this was a massive plus for me, to see what topics would be covered - it definitely impacted my decision”
- “I really like the course hub and that is what prompted me to apply for the AI alignment course. It is very helpful to see an outline of the entire course on the left navigation. It's like a table of contents. I like knowing what I'm getting into up front.”
User interviews also found people were particularly excited about the projects, and found the website testimonials convincing. Comments included:
- “[the projects] feel like fun, and useful for my future career - I want to make a project to put on my GitHub that I can show when applying to jobs in safe AI”
- “the people recommending the course seem credible, like from good companies”
Many people found us through organic search. After referrals, the next largest category was organic search, which represented 8% of applications and 8% of accepted participants. Organic search participants had the lowest drop-out rate at 7%, below the average of 17%.
Organic search growth seems related to the website changes we made in early 2024. We put effort into making our website much better earlier this year, and that seems to have paid off: before making these changes our traffic was very stable for a year - afterwards we had 1518% more impressions and 174% more clicks from Google Search. However, it is hard to draw the connection directly, especially with many other confounders such as running more courses and more interest in AI safety generally (we don’t think holiday seasons or deadline proximity have much of an effect, given quite how stable traffic was for the last 18 months, and how consistent the increase has been).
80,000 Hours and LessWrong were particularly good applicant sources. We don’t consider application sources when evaluating participants for our courses (unless someone writes them into their core answers). However, we accepted a high proportion of applicants who had heard about us via 80,000 Hours (58% acceptance rate) and LessWrong (63%). Both of these groups also had higher rates of submitting high-quality projects at the end of the course (16% and 30% respectively). We think we might have been more likely to accept them because these applicants are more likely to be aligned with our framing of working on AI safety, and to have thought about their path to impact. LessWrong’s much higher project completion rate might reflect it being a forum of people who are used to doing small research projects and writing them up.
Rational Animations shoutouts provided some strong applicants, despite being a lower-quality source than average. Referrals from Rational Animations had the second lowest acceptance rate (15%). However, those we did accept had the highest ‘high engagement’ rate (45%), a low drop-out rate (9%), and produced some high-quality projects. Since we’ve made our application evaluation process more efficient, we’re very happy to continue receiving these applications, and thank Rational Animations for sending great people our way.
No strong applicants came via paid marketing. We spent money on LinkedIn Ads for this round of the alignment course. These participants had the lowest acceptance rate of any group (14%), the highest drop-out rate (33%), and none of them submitted a high-quality project or were recommended by their facilitators. They did have a roughly average ‘high engagement’ rate: 33%, compared to 30% across all applicants. Note that these statistics are based on relatively few LinkedIn Ads participants, given how few we ended up accepting.
What other analysis would we have liked to do?
Deeper applications analysis. With more time, we could dive deeper into figuring out what kinds of scores or attributes of an application really are predictive of good course outcomes. This does feel like a good problem to throw machine learning at: we have a bunch of scores and data on applicants, and we have many outputs we could predict that would be informative (such as attendance, and likelihood to do a moderate/high-quality project). Feature importance analysis could help us focus the application form further towards the most useful inputs to predict this accurately. We could also run LLMs over the data with new scoring criteria to evaluate old applications for writing quality, to backtest that idea properly - ideally also finding a metric other than project quality to monitor (see the section on writing quality for reasoning). We are hiring a software engineer and product manager, and you could explore these kinds of problems if you joined us!
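As a rough sketch of the feature importance idea (not our actual pipeline - the file name, feature columns, and outcome column below are all hypothetical):

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Hypothetical export of anonymised application data; columns are illustrative.
applications = pd.read_csv("applications.csv")

feature_cols = ["q1_score", "q2_score", "q3_score", "career_stage_code"]
X = applications[feature_cols]
y = applications["submitted_high_quality_project"]  # hypothetical binary outcome

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

# Permutation importance on held-out data avoids some of the bias of
# impurity-based importances towards high-cardinality features.
result = permutation_importance(model, X_test, y_test, n_repeats=20, random_state=0)
for name, score in sorted(zip(feature_cols, result.importances_mean),
                          key=lambda pair: -pair[1]):
    print(f"{name}: {score:.3f}")
```

With our relatively small participant numbers, a heavily regularised model (or plain logistic regression) plus cross-validation would probably be more appropriate than anything fancy; the point is the ranking of inputs, not squeezing out predictive accuracy.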
Evaluating options for improving referrals. Referrals from others are the source of most applications. I didn’t evaluate how much room for growth there is here, or what we could do to encourage this growth - but this seems potentially very valuable. Some ideas of what we could do (this is a short scrappy brainstorm, I’m sure we can do much better by thinking about this properly + getting expert feedback!):
- Build relationships with people who recommend us. We currently have no formal relationship management of the marketing referrals we get: we should reach out to people to properly understand why they recommend us, and how we can better support them doing this.
- Encourage people or organisations to share who they think might be particularly good, so we can take this into account when evaluating their application. We sometimes do this on an ad-hoc basis, but having this more consistently would likely enable us to make better decisions.
- Make it easy for people to get branding resources. I’ve seen recommendations which look a bit ugly, or use our old branding (which looks less professional and could confuse people as they arrive and see our new, different, branding).
- Encourage people to share their experiences on the course, as well as their projects, at the end of the course on social media. Sharing certificates could be another good route to this.
- Support others to create content, for example offering to be podcast interview guests (or offering AISF graduates as guests).
Better understanding paid marketing would be useful, given this is a lever we can easily control for future courses. In particular, we would have liked to know:
- What designs and messaging performed the best in our ads?
- What is the expected ROI for advertising? (this also depends on how we value a participant completing our course, which we don’t presently have high-quality estimates of; a toy version of the calculation is sketched below)
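Even without good value estimates, the shape of the ROI calculation itself is simple - the hard part is the inputs. A toy version with entirely placeholder numbers:

```python
# Toy ROI calculation; every number here is a placeholder, not real data.
ad_spend = 2_000            # hypothetical campaign spend, in USD
completions_from_ads = 4    # hypothetical number of ad-sourced course completions
value_per_completion = 800  # hypothetical value we'd assign to a completion, in USD

roi = (completions_from_ads * value_per_completion - ad_spend) / ad_spend
print(f"ROI: {roi:.0%}")    # 60% here means $0.60 of net value per $1 spent
```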
Unfortunately, paid marketing did not perform well enough to properly conduct this analysis: there were simply too few people we were excited about from paid ads this time around.
Even if we did have many applications, our tracking setup was not ideal for understanding the journey applicants took from learning about us to applying. We’ve made fairly simple changes to improve this considerably, in a way that we think is still fair and respects people’s privacy.
We’ve learnt a lot from this experience, and preliminary analysis for the next round of the AI alignment course suggests we did at least 10x better per dollar spent on LinkedIn. I hope to write up what we did differently in the future, as well as try to answer the questions above for that course.
Conclusions
Our application process shows mixed efficacy: it identifies some strong participants but seems likely to miss others, and does not predict project success or engagement well. Leaning towards professionals continues to make sense, but we should also likely lean towards accepting more PhD students. Experimenting with evaluating writing quality (which could be first validated with a backtest) seems potentially promising. High schoolers performed poorly again, and we should not accept them in the same way we did for the previous round.
Unpaid marketing was dominated by people recommending us, particularly people we’re already well aware of in the AI safety space. Given quite how crucial this seems for getting strong applicants, I think we should focus much more on nurturing these referrals.
Paid marketing performed very poorly. We got few participants from it, and no strong participants. We’re working on improving this and already made significant changes for the June course.
Go back to the top-level retro page.
Footnotes
[1] For example:
- a participant in national security who did interesting-sounding project work as part of the course, but who wasn’t able to share their project, so we couldn’t evaluate it fully.
- a few participants who seemed engaged and likely to go into high-impact work based on our interactions with them, but whose project might not have been great - for example due to medical issues, family emergencies, or simply not being great at self-directed project work (while still potentially being a great contributor to AI safety in an organisation).
- a few participants who are doing interesting-sounding projects, but who haven’t submitted yet for various similar reasons.
- a couple of participants who decided not to do a project in order to prioritise other high-impact or similar AI safety work, particularly those already in senior roles relevant to AI safety, for example in governments or think tanks.
[2] We have debated what our policy for applications should be: whether it should be ‘we want to make sure everyone good can take the course’ or ‘we want only good people on the course’. This point matters a lot if it’s the former, and comparatively less if it’s the latter.
In practice, we haven’t actually had this choice, given we’ve usually been capped on the number of places by the number of facilitators we’ve been able to attract (which was true for this course too). However, our teaching fellow program, among other interventions, has increased the number of places we can offer, so this might become a more meaningful question to answer.
[3] By this, I mainly mean that it appears they did have strong skills in their application, and based on their LinkedIn profiles they now appear to be seriously trying to get into AI safety, or have successfully done so.
[4] By absence, we mean any session that a participant was scheduled to attend but missed. This means we count it as an absence even if the participant switches to another cohort to make up that week.
During this course there was no way for a participant to skip a single session without it being recorded as an absence (to avoid an absence, participants would have had to switch to another cohort running that session). Participants who drop out of the course do not have their future sessions counted as absences.
We track this because we think it gives some insight into how disruptive their non-attendance has been (because a non-forewarned absence tends to be harder to handle, e.g. we can’t move participants around to balance out the cohort sizes).
[5] One version of this might be to have them complete the session 1 resources and exercises, and send their exercise responses to us, to confirm their application or place.