Every point of intervention
Events are already set for catastrophe, they must be steered along some course they would not naturally go. [...]
Are you confident in the success of this plan? No, that is the wrong question, we are not limited to a single plan. Are you certain that this plan will be enough, that we need essay no others? Asked in such fashion, the question answers itself. The path leading to disaster must be averted along every possible point of intervention.
— Professor Quirrell (competent, despite other issues), HPMOR chapter 92
This post is a quickly-written service-post, an attempt to lay out a basic point of strategy regarding decreasing existential risk from AGI.
1. Keeping intervention points in mind
By default, AGI will kill everyone. The group of people trying to stop that from happening should seriously attend to all plausible points of intervention.
In this context, a point of intervention is some element of the world—such as an event, a research group, or an ideology—which could substantively contribute to leading humanity to extinction through AGI. A point of intervention isn't an action; it doesn't say what to do. It just says: Here's some place along the path leading to disaster, where there might be useful levers we could pull to stop the flow towards disaster.
1.1. The vague elephant
Before going on, I'll briefly say: Don't do unethical things.
Just because we should attend to every point of intervention does not mean we should carry out every act of intervention! E.g. don't be an ad-hominem dick to people, whether in private or in public. In general, if you're about to do that thing, and you know perfectly well that if you thought about it for three minutes you'd see that almost everyone would tell you it's a really, really bad thing to do, then you should probably not do that thing. And if you still want to do it, then you should probably first try talking to several people who you trust (and who you didn't strongly pre-select to be people egging you on to do that thing).
1.2. Example: France
Someone was telling me about their somewhat-solitary efforts to get the government apparatus of France to notice AGI x-risk and maybe do something about it, and to not be too swayed by influences urging it to ignore those concerns. They were unsure whether these efforts would matter much, since people in the policy space tend to think of the US and China as the two players that really matter.
I argued to them that actually those efforts are pretty high-value. Leaving aside tractability (IDK) and neglectedness (yes) and goodness (probably, though there's always the worry of stimulating R&D investment), I wanted to argue for importance.
1.3. Full-court press
In basketball, there's a defensive mode called "full-court press": you pressure the team in possession of the ball everywhere on the court, trying to regain possession before the offense gets close enough to the basket to score. This contrasts with half-court defense, where you basically let the opposing team advance the ball to the half of the court with your basket, and concentrate your defenses there.
Full-court press has the disadvantage of allocating some defensive resources away from the home side of the court. Thus, you can be more vulnerable if the opposing team gets near your basket. Also, full-court press is simply more expensive—the defending team has to run around much more, and gives up the advantage of clustering where they know the opposing team has to go (near the scoring basket).
But, full-court press is a good way to spend more resources to get better outcomes. You make them pass the ball more, giving them more chances to mess up, often producing turnovers. You make them run around more, which tires them out.
Likewise, intervening at every point along the path leading to AGI disaster is a broad strategy that costs more and risks pulling some resources away from the most important points; but it also offers more, less-correlated opportunities to block the flow towards disaster.
1.4. Multi-stage fal... opportunity!
Suppose that there are 5 events that might occur, and if all of them occur, something really bad happens; on the other hand, if one of the events does not occur, then the really bad thing does not happen. Suppose each event will occur with probability 0.9.
First of all, how likely is the really bad thing to happen? One answer would be $0.9^5 \approx 0.6$, i.e. there's a 60% chance of it happening. However, this answer is falling prey to the three multi-stage fallacies. You can't conclude that the bad result is only medium-likely just because you made a list of events that all have to happen.
But here's a different question: How unlikely can you make the really bad event?
1.4.1. Brief tangent about a conjunction of disjunctions
Of course, the answer depends a lot on the specific structure of these events. But here's one kind of structure:
Suppose each of the five prerequisite events $P_1, ..., P_5$ is itself a disjunction. In other words, if any one of $D_{1, 1}, D_{1, 2}, ..., D_{1, n}$ happens, then $P_1$ happens. I think this is often the case in the real world. E.g., several different funders might fund some research group; several different research groups might succeed at some goal; several different technologies might provide workable components that enable some subsequent technology; etc. Furthermore, it's often the case that it's easy to intervene on some of the $D_{1, i}$ but not on others. In this case, it's easy to decrease the probability of $P_1$ somewhat, but not easy to decrease it a lot. You prevent some of the $D_{1, i}$ that are easy to prevent, and then you call it a day.
Does it help to somewhat decrease the probability of each $P_i$, without greatly decreasing any of them? Yep! As long as the probability of the conjunction is fairly high, the marginal value of decreasing the absolute probability of each of the $P_i$ is roughly the same.
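To make this concrete, here's a minimal sketch in Python. The numbers (five prerequisites, each enabled by three independent sub-events) are my own toy assumptions, purely for illustration; the point is just that shaving a little absolute probability off each prerequisite has roughly equal marginal value when all the prerequisites are likely.

```python
# Toy model (my own illustrative numbers): disaster requires five
# prerequisite events, each of which happens if any of several
# independent enabling sub-events happens.

def p_any(qs):
    """P(at least one sub-event occurs), assuming independence."""
    p_none = 1.0
    for q in qs:
        p_none *= 1.0 - q
    return 1.0 - p_none

# Five prerequisites, each a disjunction of three sub-events.
sub_events = [[0.6, 0.5, 0.5]] * 5
prereqs = [p_any(qs) for qs in sub_events]   # each equals 0.9

p_disaster = 1.0
for p in prereqs:
    p_disaster *= p
print(round(p_disaster, 3))                  # 0.9**5 ~= 0.59

# Marginal value: shaving an absolute 0.05 off any one prerequisite
# lowers P(disaster) by about the same amount, since
# d(prod P_j)/dP_i = prod_{j != i} P_j, and those partial products
# are similar when every P_j is high.
for i in range(len(prereqs)):
    reduced = prereqs.copy()
    reduced[i] -= 0.05
    p_new = 1.0
    for p in reduced:
        p_new *= p
    print(i, round(p_disaster - p_new, 4))   # ~0.0328 for every i
```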
1.4.2. Varied interventions help
Anyway, basically the point of this subsection is that it helps to intervene along many channels / at many points, if there are multiple conjunctive prerequisites to disaster.
Note that [multiple conjunctive prerequisites to disaster] is logically equivalent to [multiple disjunctive stoppers of disaster]. For example, it's plausible to me that either an international ban on AGI research, or a strong social norm in academia against AGI research, would very substantially slow down AGI research.
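Spelled out, this equivalence is just De Morgan's law:

$$\neg(P_1 \land P_2 \land \cdots \land P_5) \;=\; \neg P_1 \lor \neg P_2 \lor \cdots \lor \neg P_5,$$

i.e., preventing any single prerequisite suffices to prevent the disaster.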
1.4.3. Sources of correlation indicate deeper intervention points
One of the three multi-stage fallacies is forgetting to use conditional probabilities for the prerequisites to disaster. For example, conditional on [we can't convince major nations to ban AGI research], it's probably much less likely that [we can convince AGI researchers to stop doing that].
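In symbols, the correct calculation chains the conditional probabilities; when the prerequisites are positively correlated, each conditional factor exceeds the corresponding marginal, so a naive product like $0.9^5$ understates the risk:

$$P(P_1 \land \cdots \land P_5) \;=\; P(P_1)\,P(P_2 \mid P_1)\cdots P(P_5 \mid P_1, \ldots, P_4) \;>\; \prod_{i=1}^{5} P(P_i).$$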
The outlook of "every point of intervention" says to consider this correlation as a pointer to some deeper element of the world. In this example, the source of correlation might be [the same funder is paying both groups to continue AGI research], or [AGI risk doesn't feel real to people], or [people are secretly nihilistic and don't actually have hope in a deeply satisfying shared human future], or many other possibilities. (These are therefore not necessarily temporal points of intervention—events in a sequence—but, more generally, elements that could be intervened on.)
2. Some takeaways
- Focus on the places where you feel shocked that everyone's dropping the ball.
- This perspective doesn't help much with prioritization. But, generally, it says we should competently pursue a diverse portfolio of strategies. On the margin, I think competent newcomers should be directed towards the possibility of starting a new / neglected effort, rather than joining an existing one (though of course many existing efforts have important talent gaps).
- There's lots of meaning everywhere. There may or may not be any good plans to decrease x-risk, but there are many things to try that are pretty worthwhile and quite neglected.
- If someone is deferring to you about strategy, consider helping them keep in mind that there are many approaches.
- This doesn't mean "do random stuff and hope it decreases x-risk".
  - One still has to think about which plans would be useful. Most plans don't help, and many actively hurt (mainly anything that contributes money, talent, or social support to AGI capabilities research). Whether a point of intervention is potentially impactful is basically orthogonal to whether a possible act of intervention is good. But this does mean that if something is neglected, you should be less prone to say "that's ineffectual, so not worth it". Ignoring points of intervention is a bias about upside risks.
  - It doesn't matter how correct and original you are in pointing out that some point of intervention is neglected, if you don't do anything about it, or if what you do about it is harmful. Doing anything helpful usually requires a bunch of work, much of which is boring and/or thankless and/or of unclear importance.
  - Sometimes people feel helpless specifically from the sense that "there's nothing I can do that would help; there are only a few important ways to help, and I'm less capable than the people already working on them". I think that's not right, because there are many different ways to intervene against disaster, many of which are neglected. You can manufacture comparative advantage just by caring about a neglected approach and then investing serious effort into investigating it.
  - Sometimes people feel helpless from the sense that "there are so many things I could do; this spreads importance too thinly between many different plans; so none of them is worthwhile / IDK what to do". I think that's not right, because importance isn't really conserved that way.
3. Some biases potentially affecting strategy portfolio balancing
- Each actor (person, research group, funder) has to specialize in one or two points of intervention.
  - Each actor therefore mainly thinks about their own point of intervention. They are selected and incentivized to think that their point of intervention is especially important.
  - When thinking out loud about what to invest their own resources into, an actor is likely to apply more pruning than would make sense when constructing a global portfolio. (This is correct for them to do.)
  - So each actor might tend to (explicitly and implicitly) underemphasize the general point that "there are many points of intervention in the same ballpark of importance", even if the actors would disagree about which intervention is the important one.
- Non-top-priority interventions are neglected.
  - It's easier to coordinate around things that other people are already working on, thinking about, investing in, and acknowledging as worthwhile. This makes sense to a large degree, but probably not to the degree actually practiced.
  - People defer, often mistakenly, creating correlated choices and a meta-level inability to correct that situation.
  - Even given a correct consensus belief, if resource-allocators fail to check the global margin across intervention categories, the actual allocation portfolio will be biased towards the top intervention categories.
  - As an analogy: when discussing intervening on genetic variants identified by constructing polygenic scores, a common intuition is that it's somehow different to intervene on a variant with a large, high-probability effect (e.g. a single mutation that causes Huntington's disease) than on a variant with a small, more uncertain effect. This has a sum-threshold structure: one can have a large overall effect by making many small interventions, each of which seems not worth the effort on its own (see the toy sketch after this list).
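Here's a minimal sketch of that sum-threshold structure, with made-up numbers of my own (a binomial sum of small factors with a normal approximation; nothing here is from the genetics literature):

```python
# Toy sum-threshold model (illustrative numbers of my own): the outcome
# occurs when the sum of many small independent factors crosses a
# threshold. One small intervention barely moves the probability;
# a hundred of them move it enormously.
from math import sqrt
from statistics import NormalDist

N = 1000         # small binary factors, each "on" with probability p
p = 0.5
THRESHOLD = 530  # outcome occurs if more than this many factors are on

def p_outcome(n_intervened):
    """Normal approximation to P(sum > THRESHOLD) after permanently
    switching off n_intervened of the factors."""
    n = N - n_intervened
    mean = n * p
    sd = sqrt(n * p * (1 - p))
    return 1 - NormalDist(mean, sd).cdf(THRESHOLD)

print(p_outcome(0))    # baseline: ~0.029
print(p_outcome(1))    # one small intervention: ~0.027, barely moves
print(p_outcome(100))  # a hundred small interventions: ~5e-8
```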
4. A terse opinionated partial list of maybe-underattended points of intervention
(These are phrased as actions, but points of intervention can be backed out of them.)
- International treaties to stop AGI research
  - Support from many factions (many governments, interest groups, social leaders, etc.)
- Convincing elements of the AI researcher pipeline (e.g. student programs for AI / ML research) to stop
  - Philanthropists
  - Government programs
  - Schools
  - Academics
- General social milieu / norms
  - Elite opinion
  - Common opinion
  - Academic opinion
  - Student opinion; CS student opinion
  - Journalist opinion

  Illustration: A professor doing cutting-edge domain-nonspecific AI research should read in the paper that this is very bad; should then have students stop signing up for their classes and research; face student protests; be shunned by colleagues; have the administration pressure them to switch areas; and then have their government funding cut. It should feel like what happens if you announce "Hey everyone! I'm going to go work in advertising for a bunch of money, convincing teenagers to get addicted to cigarettes!", but more so.
- Convincing / pressuring the people closest to AGI development
  - AGI funders
  - AGI employee researchers
  - AGI research leads
  - AGI fans
- Making more very smart people, especially via reprogenetics.
- Healing society; decreasing the pressure / incentive to do AGI research
  - If there's no long-term positive vision for the future of humanity, people may feel nihilistic / desperate. So they might not care as much if AGI kills everyone, and some might even decide to do AGI research just for thrills or out of desperation.
  - Generally, if society is healthier, it's more likely to direct human efforts towards good ends rather than AGI.
  - Cryonics / brain preservation is deeply neglected. How much could you reduce the social and financial difficulties of getting good brain preservation by making this your mission in life? How much would it change society, and people's believed tradeoffs around risky tech, if it were widely understood that we are working towards no involuntary death, and that this is already accessible?
- Legibilizing AGI x-risk
  - (I debated including this one, because my surface impression is that the Redwood cluster is already doing a good job with it; but on second thought, legibilizing the deeper / more abstract / more core / more difficult problems is probably neglected.)
- Group rationality, e.g. better debates.