HIA and X-risk part 1: Why it helps

1. Introduction: human intelligence amplification and existential risk

I've always taken it for granted that increasing the brainpower of humanity is a good thing. It's intuitively clear to me. But since I started working on human intelligence amplification (HIA), several people have raised the question of whether HIA is actually good—and especially, the question:

Does HIA increase existential risk by making more smart people who will work on making (misaligned by default) AGI?

Since I've taken it for granted that HIA is good, I'll think it through more explicitly here. I'll address a question whose scope is intermediate between the above question and the more general "Is HIA good for humanity?". Namely:

Does human intelligence amplification increase or decrease existential risk from AGI?

I'm probably going to keep working on HIA, because I think that it's good for humanity in general and that it will decrease existential risk. But, I want others to have my thinking on this available for inspection; and I want to call for counterarguments; and I want to think it through a bit more, so that I can update more easily; and, possibly I could talk myself out of working on it.

In this article, part 1, I'll lay out the reasons I think HIA is likely to decrease existential risk. This won't be everything to say on the "decrease" side of the argument, because I'll also want to respond to arguments on the "increase" side. Later parts, if written, will address why HIA might increase X-risk, find relevant facts, and discuss what to do in response. So, what follows is the first volley, the main background reasons I think HIA helps with X-risk.

2. Deference

Partly, it's because Yudkowsky says so. He suggests HIA as a, or the, main way out of AGI omnicide, assuming you've unstably delayed the creation of AGI for some time. As I understand it, the main way he hopes HIA would help is by the very smart people solving AGI alignment (see e.g. https://x.com/ESYudkowsky/status/1651040314973626368). Yudkowsky is the best thinker regarding AGI x-risk reduction strategy.

Also, most (I think almost all) other people whose thoughts I somewhat trust on x-risk reduction seem in favor of HIA.

3. Reversal test

The three-line proof that HIA would decrease X-risk is:

  • Would making humanity / many people less smart decrease X-risk? No.
  • How plausible is it that the current level of smartness is the best level, regarding X-risk? Not very plausible, come on.
  • Therefore, increasing intelligence would probably decrease X-risk.

This argument applies the Reversal Test (wiki, Bostrom's original paper on enhancing human intelligence) to HIA and X-risk. It's not actually a proof, of course.

4. Human empowerment is good

Empowering humans—to understand and affect the world more, and to become more themselves—is good. Intelligence helps humans do that.

This holds in great generality. When humans are empowered, they get what they value more. They can satisfy their own needs more, which is good; they can help their loved ones and their society more, which is good; and they invent new ways of empowering each other, which is good because it is good, which is what I'm saying.

(Of course, there are exceptions. People hurt people, people sell cigarettes and leaded gasoline, people do AGI capabilities research. If those people were empowered more, they could do those things more.)

This statement is abstract, and it may sound like I'm defending some broader claim, like "HIA would be good for humanity, regardless of AGI.". Which I do think, but that's not why I'm saying it. The reason I'm saying it is that it's the core intuition I have about HIA, from which all the other arguments flow. Each other argument is somehow an instance of this argument.

5. Abundance decreases X-risk

More intelligence implies more empowerment, which implies more abundance. Abundance decreases X-risk, e.g. in the following ways.

5.1. Abundance means less motive to make AGI

The main arguments for making AGI run like this: human ingenuity brings us so much of the stuff we want; AGI would be even smarter; so it would give us even more of that stuff. For example, so the argument goes, AGI would bring us:

  • more technology in general, e.g. abundant energy and fast transportation;
  • more science in general, e.g. physics and biology breakthroughs;
  • cures for diseases;
  • ways to extend lifespan and healthspan;
  • general material abundance through technology, e.g. cheap housing and food;
  • cognitive output in general, e.g. art, literature, computer programs, and so on.

The argument doesn't make sense given that AGI would just kill everyone, and you can't eat cancer cures if you're dead, but this is the main stated argument.

If HIA brought abundance, there would be less motivation to make AGI. You don't need to take any gamble if there are millions of geniuses already curing cancer, curing aging, making the most entertaining movies, etc. In general, human intelligence can replace hypothetical AGI.

Further, there would be less public justification. In other words, even if people wanted to make AGI for some other reason, they wouldn't be able to lean on the above motives as much to justify making it.

5.2. Abundance makes a healthier society

When society has more Slack, things can be healthier in general. If things were healthier in general, people might be less inclined to try to build machines that will kill everyone. Some key examples:

  • Fewer smart kids traumatized into researching AGI capabilities.

    • I suspect that some smart kids are traumatized by their upbringing, and some of them consequently end up trying to make AGI.
    • I don't know how widespread this is; AFAIK it could be 4% or 40%.
    • If society were healthier in general, and also more accustomed to accommodating very smart kids, this would happen less. E.g. with more resources to handle education for very smart kids, those kids wouldn't be stifled by school, wouldn't be bullied as much, and would have a community of peers to invest hope in.
    • Examples of how this might have been happening to kids:
      • School is deeply disrespectful of the agency of children, and smart kids might feel this especially acutely in the realm of intellectual pursuits, so they may spitefully "take flight" to counter-academic areas of innovation.
      • Smart kids may more easily compute out the implications of rules, and might be more inclined to follow those implications, and so might be more traumatized by the ambient society's pattern of pretending to be rule-following while not actually following rules. Some kids might not understand why people are like that, and that might turn into a loss of faith that society at large is good; that faith may or may not be warranted, but either way, losing it could lead some people to seek unilateralist ways to upend the world.
      • Age peers might be especially mean to smart kids, bullying and ostracizing them.
      • I suspect that for some especially smart kids, not having enough peers tends to have a consequence: it leaves them a bit overly oriented toward projects that are high-leverage and that they can do without needing permission from others and without needing to work with many other people. (This is mostly fine and even good, if a little sad sometimes; and this doesn't apply to anywhere near all smart kids.) I suspect this influences some people to end up being very drawn to AI; e.g. the coolness of AI is compared with reference to other, less high-leverage things that one could do by oneself, rather than with reference to things that large groups of people or humanity as a whole could do and which the person in question could participate in. I suspect that having lots and lots of really smart kids who can hang out with each other will avoid this consequence to a large extent; and that they will as a result be more oriented towards leveraging the human community (e.g. doing science together rather than trying to build a scientist). (But this is all quite speculative of course.)
  • In general, with more Slack, people have more luxury to care about others, to care about the long-term, and to be more wise. If society were more clearly on an upward trajectory, not just in material abundance but also in spiritual growth, then there would be more justified hope in the long-term goodness of society and the world. There would be less intuitive reason to throw a brick through the window, to gain power, to take away power from society.

  • In general, with more Slack, society has more resources to put towards watchdogging AGI development. It has more energy for better epistemics and better social norms and pressures, to push people to not try to make AGI. It has more resources to make institutions that monitor research. States have more spare capacity to self-regulate, and more competence to work out international treaties.

  • In general, with more Slack, society has more wisdom / ideas / thoughtfulness / sanity / kindness.

A piece of evidence for this is the general, centuries-long trend of things getting better: better health outcomes, more leisure, longer lives, more creative endeavors, less war, more generous welfare, more liberty. This trend could be due to material abundance from industrialization. I.e., with more material abundance, society aims more toward the good. As a special case of aiming more toward the good, society would aim more away from AGI.

6. Empowerment that democratizes is good

Given the current social, economic, regulatory, and technological milieu, there are many people who are willing to try to advance A(G)I capabilities. Most people don't want anyone to do that, but some people do it anyway. The way humanity usually works, sometimes for better and sometimes for worse, is that strong moral majorities regulate moral deviators.

If some technology empowers its users, and only a small portion of people have access to the technology, then

  • a morally deviating minority might capture that technology, and circumvent society's regulation; and
  • the minority with access might start morally deviating.

But on the flip side, generally increasing everyone's ability to solve their problems also increases the moral majority's ability to regulate moral deviators. (And to have better opinions about what's moral.)

HIA, IF it is made widely available, would provide empowerment that democratizes.

7. Big Good harder than Big Bad; requires more ideas

7.1. Good outcomes are narrow targets

In general, our values point at narrow targets. Most arrangements of atoms are neutral; probably most arrangements of atoms with conscious experience are suffering; some narrow sliver of possible arrangements is good or amazing, e.g. many flourishing humane souls. Most changes to organisms or minds are bad: most things you can eat are at least not helpful (probably poisonous or mechanically disruptive), and most rearrangements of your tissues are bad (would just break stuff, make you bleed, etc.).

7.2. Easier to hurt than to help

If you're an aspiring surgeon, and you have a knife and a patient, you can definitely do something-like-surgery, in that you can cut into your patient's tissues with a knife. But to actually help your patient, you'd have to know specifically where and what and how to cut, and what to do once you've made the cut, and how to follow up with post-operative care. That's a bunch of extra understanding you have to have.

It's much easier to make a pile of nuclear material that melts down, than to make a pile of nuclear material that heats up without melting down, and then to extract usable energy from it.

It's much easier to make an AI that wreaks havoc, killing everyone if it's explosive enough, than to make an AI that ends up beneficial for humanity.

So there's this big gap between what you need to know in order to get big bad outcomes, vs. big good outcomes. To get good outcomes in difficult domains, we need more good understanding and ideas. To get that, we need more brainpower.

7.3. One-shot problems require thinking hard

This problem is compounded by the one-shot nature of many important tasks. AGI is one-shot: if you get it wrong on your first try, you kill everyone, and you don't get more tries to learn how to do it well. Having a good government is one-shot or few-shot: You don't get a bunch of tries at the task "Make a government that makes the U.S. be prosperous and beneficial for humanity over the course of a century.". That's because there is only one U.S. government—it's exclusive—and it doesn't change very quickly (at least, generally not for the better). Having a good prevailing social milieu / social epistemology / religion / normset is another one-shot / few-shot task.

If you're just trying to have Big effects, even if the effects are bad, everything is a many-shot problem, unless you specifically blow up only yourself. You can keep trying to make AGI, and see your failures (because you didn't kill everyone), and then use that feedback to try again and again. Thus, causing Big Bad stuff tends to be less one-shot than causing Big Good stuff.

One-shot problems are especially in need of thinking—coming up with good ideas and understanding the domain theoretically. You can't let reality do your thinking for you (by empirically trying everything), because that kills you. One-shot problems especially require theoretical understanding and foresight.

Therefore, increasing humanity's capacity to come up with difficult new ideas preferentially increases our ability to get Big Good outcomes, compared to how much it increases our ability to get Big Bad outcomes.

7.4. We're still bottlenecked on ideas

No one knows how to

  • make aligned AGI
  • otherwise hard-stop AGI capabilities progress
  • robustly convince ~everyone for 100 years to not make AGI
  • do some other clever thing that stops AGI from killing everyone
  • make society super sane so it does good stuff and doesn't do bad stuff
  • cure cancer
  • etc.

That's because we don't have the right ideas. If we did, we'd be in better shape. Getting ideas is hard and requires brainpower.

8. HIA downsides are opportunities for HIA meta-level upsides

One argument against HIA goes: "If there are a bunch more smart people, many of them will work on AGI capabilities. That increases X-risk.". Which is valid.

But I always want to respond: Ok, so you're saying that currently, society is set up so that many smart people end up working on AGI capabilities. That seems bad. That state of affairs seems like a central source of X-risk. How can we fix that? I've thought about it a bit and it seems really hard. So hard, in fact, that in the long term, this problem might also need more brainpower to solve more thoroughly. For example, via HIA.

This pattern repeats itself. For example, one might argue: "Being really clever, in the IQ style, isn't helping. We need better values and judgement and wisdom.". This is also maybe partially valid, but again, I would say: judgement and wisdom are difficult cognitive tasks. Even values depend on understanding, and therefore require the groundwork of understanding to have been laid—often by especially insightful people (who could use more brainpower). Further, influencing global discourse so that we trend towards better values is another difficult cognitive task. These cognitive tasks seem like they would benefit from better and better performance, with no clear cap, and may require better performance to be effective enough to substantially decrease X-risk.

Again, the pattern repeats with "We should have more social disapproval for capabilities research, and we should have a strong pervasive narrative against it.". I agree, but is this going to stop AGI development for a century? How? That seems really hard too and needs brainpower.

9. HIA downside insincerity

This is maybe a bit too discourse-level, but: Following from the previous section, I also sometimes doubt the sincerity of a claim that HIA would increase X-risk.

The reason I doubt the sincerity is: Suppose you think that some process would make it so that humans getting smarter and more empowered ends up bad. That implies that society is somehow deeply perverse—it takes empowerment, it takes problem-solving and ideas and cleverness, and it does something bad with all that, in aggregate, reliably. Now, that's a fine hypothesis to have. But IF you think that, THEN it seems like you SHOULD be spending all your time trying to fix THAT. You'd be working on "Ok, let's heal society" or "Ok, let's deeply investigate why society is like that".

But people who make the claim that HIA would increase X-risk don't seem to be doing that investigation, for the most part. So it seems a bit insincere.

10. It's even plausibly better to have smarter capabilities research leaders

If anyone builds AGI, everyone dies. Even most leaders of AI capabilities research don't want everyone to die. If they understood the fact that it would kill them, they wouldn't want to build it.

If you're smarter, it's easier to understand things and compute out their consequences. You see the first thousand cases of a pandemic, you see the R estimates, and you understand exponentials very well because you've been playing with them since you were 8 years old, and you immediately intuitively know to be afraid and on what timescale to be afraid. You can more easily see the wideness of the space of possible values, and the speed and depth at which an AGI could self-modify and self-improve, and you intuitively grasp that this thing is not safe, can't be monitored, can't be shut down, can't be controlled, etc.
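
As a toy illustration of the exponential-timescale point, here's a minimal sketch; the reproduction number, serial interval, and starting case count are made-up illustrative parameters, not data from any real outbreak:

    # Toy sketch: with R > 1, each infection "generation" multiplies the case count by R.
    # All numbers here are assumed for illustration.
    R = 2.5                    # hypothetical reproduction number
    serial_interval_days = 5   # hypothetical days per infection generation
    cases = 1_000              # "the first thousand cases"

    for generation in range(1, 9):
        cases *= R
        print(f"day {generation * serial_interval_days:3d}: ~{int(cases):,} cases")

    # After ~8 generations (~40 days), a thousand cases has become ~1.5 million:
    # the timescale on which to be afraid is weeks, not years.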

So if you have sufficiently smarter capabilities researchers, they start to notice that they are building a self-detonating terabomb.

Yes, Sinclair's Razor applies. Even very smart people can warp their own thinking to believe false things.

But, I think this self-deception is a quantitative ability with quantitative counter-pushes. It's fairly easy to convince yourself that you don't have to worry about something that's abstract and far away—something that will happen in 20 years, or that's currently hard to verify empirically. It's harder and less common to convince yourself not to worry about very concrete, near things. People quite rarely jump off the roof under a sincere belief in their own flightworthiness, I presume.

Further, how abstract+far vs. concrete+near something is depends on what you know and understand. If I tell you that this needle was just stuck into an HIV-positive person, and then I try to stick that same needle into your arm to draw blood, you will feel visceral, fast-action-inducing revulsion / fear. How? Why? Viral transmission is an abstract, invisible thing, but you basically understand it well enough to know its implications.

11. Proof by Intimidation

Currently, if I'm a capabilities researcher, I can kinda convince myself and others that it's ok to ignore arguments and warnings by other smart people, because I'm really smart, and I actually know about AI unlike those other smart people. They're irrationally fearful because they don't understand what the technology is actually like.

But if there were a ton of people who are obviously smarter, clearer-thinking, better informed about AI, and more competent than me, and they were almost all saying that AGI is extremely dangerous and must not be built, it would become harder for me to keep my morale up to try to make AGI anyway.

HIA could result in a lot of people who are smart enough to ~automatically / ~independently figure out that alignment is very hard and misalignment is lethal. As long as HIA does that, even if it also results in a bunch of people who are not that smart and who could be recruited into capabilities research, I think the narrative shifts against AGI being even ok to do, which makes it harder for AGI researchers to coordinate intra- and inter-personally.