A framework for comparing global problems in terms of expected impact - EA Forum
Suppose you’re trying to figure out whether to learn about health in developing countries; or whether to become a researcher in solar energy; or whether to campaign for criminal justice reform in the…
After a large amount of resources has been dedicated to a problem, you'll hit diminishing returns. This is because people take the best opportunities for impact first, so as more and more resources get invested, it becomes harder and harder to make a difference. It's therefore often better to focus on problems that have been neglected by others.
To make more wide-ranging comparisons between problems, you need to turn to "yardsticks" for scale. These are more measurable ways of comparing scale that we hope will correlate with long-run social impact.
For instance, economists often use GDP growth as a convenient yardstick for economic progress (although it has many weaknesses). Nick Bostrom has argued that the key yardstick for long-run welfare should be whether an action increases or decreases the risk of the end of civilization – what he called existential risk.
However, we think that society’s mechanisms for doing good are far from efficient, so all else equal, neglectedness is a good sign.
In other cases – where solving a problem requires innovative techniques – the scores are usually assigned based on judgement calls, ideally informed by a survey of expert opinion.
For scoring we use the 'expected value' approach. That is, a 10% chance of solving all of a problem is scored the same as definitely solving 10% of it.
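The expected-value rule above can be sketched in a few lines. This is an illustrative toy, not 80,000 Hours' actual scoring code; the function name and inputs are made up for the example.

```python
# Illustrative sketch of expected-value scoring (names are hypothetical).
def expected_problem_reduction(probability_of_success: float,
                               fraction_solved_if_successful: float) -> float:
    """Expected fraction of the problem solved by a project."""
    return probability_of_success * fraction_solved_if_successful

# A 10% chance of solving the whole problem...
risky_project = expected_problem_reduction(0.10, 1.0)
# ...scores the same as certainly reducing the problem by 10%.
safe_project = expected_problem_reduction(1.0, 0.10)
assert risky_project == safe_project
```

Both projects have the same expected value (0.10), which is why the framework treats them as equally good bets, even though one is far riskier than the other.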
We prefer to use the scores to make relative comparisons rather than absolute estimates.
While personal fit is not assessed in our problem profiles, it is relevant to your personal decisions.
Within a field, the top performers often have 10 to 100 times as much impact as the median.
A great entrepreneur or researcher has far more impact than an average one, so if you’re planning to contribute in either of those ways, personal fit matters a lot. However, if you’re earning to give, personal fit is less relevant because you’re sending money rather than your unique skills.
So to assess personal fit in more depth, you could estimate your percentile in the field, then multiply by a factor that depends on the variation of performance in the field.
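One hypothetical way to operationalise the "percentile times variation" idea is to interpolate geometrically between median and top performance. The function, the interpolation scheme, and the 30x top-vs-median ratio below are all illustrative assumptions, not figures from the article.

```python
# Hypothetical sketch: adjust an impact estimate for personal fit.
# The interpolation scheme and multipliers are illustrative assumptions.
def fit_multiplier(percentile: float, top_vs_median_ratio: float) -> float:
    """Rough fit multiplier: interpolate geometrically between median
    performance (1x at the 50th percentile) and top performance
    (top_vs_median_ratio at the 100th percentile)."""
    if percentile <= 50:
        return 1.0
    # Fraction of the way from the median (50th) to the top (100th).
    progress = (percentile - 50) / 50
    return top_vs_median_ratio ** progress

# A researcher at the 90th percentile, in a field where the top
# performers have ~30x the impact of the median:
print(round(fit_multiplier(90, 30), 1))
```

The point of the sketch is the shape of the estimate, not the numbers: in high-variance fields like research or entrepreneurship, moving up the percentiles multiplies your expected impact substantially, whereas in low-variance fields (or when earning to give) the multiplier stays close to 1.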
If you’ve used our rubric above, you can add the scores together to get a rough indication of which problem will be more effective to work on.
Bear in mind that these scores are imprecise, and adding them compounds the uncertainty, since each component is only measured roughly. This means you need to take your final summed score with a grain of salt – or rather a lot of salt.
Within 80,000 Hours, if the difference in score between two problems is 4 or larger, we have a reasonable level of confidence that the higher-scoring problem is more effective to work on. If the difference is 3 or smaller, it looks more like a close call.
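The summing-and-comparing procedure can be sketched as follows. The decision threshold of 4 comes from the text; the factor names, the example problems, and their scores are made up for illustration.

```python
# Sketch of comparing two problems by summed factor scores.
# The threshold of 4 follows the text; the example data is invented.
def compare_problems(scores_a: dict, scores_b: dict, threshold: int = 4) -> str:
    total_a = sum(scores_a.values())
    total_b = sum(scores_b.values())
    diff = total_a - total_b
    if abs(diff) >= threshold:
        leader = "A" if diff > 0 else "B"
        return f"Problem {leader} looks more effective (difference {abs(diff)})"
    return f"Too close to call (difference {abs(diff)})"

problem_a = {"scale": 12, "neglectedness": 6, "solvability": 4}  # total 22
problem_b = {"scale": 8, "neglectedness": 8, "solvability": 2}   # total 18
print(compare_problems(problem_a, problem_b))
```

Here the difference is exactly 4, so the sketch reports a tentative winner; a difference of 3 or less would fall into the "close call" branch, where qualitative judgement should take over.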
For one, our scores have to be tempered by common sense judgements about the world.
The scores we get when using this framework suggest that some problems are 10,000x more effective to work on than others. However, we don’t believe that the differences really are that large.
Some other reasons for being modest about what such prioritisation research can show us are discussed here.
Explicitly quantifying outcomes can enable you to notice large, robust differences in effectiveness that might be difficult to notice qualitatively, and help you to avoid scope neglect.
Going through the process of making these estimates is a great way to test your understanding of a problem, since it forces you to be explicit about your assumptions and how they fit together.
In practice, these types of estimates usually involve very high levels of uncertainty. This means their results are not robust: different assumptions can greatly alter the conclusion of the analysis. As a result, there is a danger of being misled by an incomplete model, when it would have been better to go with a broader qualitative analysis, or simple common sense.
An individual can only focus on one or two areas at a time, but a large group of people working together should most likely spread out over several.