New Alignment Research Agenda: Massive Multiplayer Organism Oversight

When there's an AGI that's smarter than a human, how will we make sure it's not trying to kill us? The answer, in outline, is clear: we will watch the AGI's thoughts, and if it starts thinking about how to kill us, we will turn it off and then fix it so that it stops trying to kill us.

1. Limits of AI transparency

There is a serious obstacle to this plan. Namely, the AGI will be very big and complicated, so it will be very difficult for us to watch all of its many thoughts. We don't know how to build structures made of large groups of humans that can process that much information to make good decisions.

How can we overcome this obstacle?

ML systems

Current AI transparency methods are fundamentally limited by the size and richness of their model systems. To gain practical empirical experience today with modeling very large systems, we have to look to systems that are big and complex enough, with the full range of abstractions, to be analogous to future AGI systems.

Evolution

Too weak, too slow.

Brains

Too small, too fast.

2. Solution: Organism Oversight

Organisms have multi-level complexity (cells, gene regulatory networks, intercellular signaling, epigenetic state, flows, tissues, long-range signaling, homeostasis, etc.). Organisms come in a wide range of sizes, from unicellular to whale, providing a natural curriculum.

Thus, the Massive Multiplayer Organism Oversight research program proposes to empirically develop systems for aggregating human expertise in modeling components of large complex systems into accurate and useful gestalt predictions about real-world behavior. It will do so by testing the predictive power of groups of humans who each specialize in monitoring, understanding, predicting, and explaining one aspect or component of a biological organism.

Concretely, each multi-level component of a model organism will be assigned to one person. For example, there will be someone in charge of understanding the inputs, outputs, structure, and role of your left pinky toe, as well as that weird hair near your ear, and the left ribosome in the skin cell that sits on the very prominentmost tip of your nose. Since much of the relevant complexity is present in mesoscopic rather than microscopic structures, it may be feasible to provide sufficient scanning technology to gain insight into meso-optimizers, which is not possible with the many tiny cells in a brain, or the even tinier transistors in an artificial neural network.

Then each person will be provided with whatever monitoring instruments are available, and will be tasked with predicting the behavior of their component. As each modeler communicates with the modelers of spatially, hydrodynamically, and/or conceptually neighboring components of the organism, a bottom-up self-organizing structure for large-scale predictions will emerge, and the participants will gain valuable experience with modeling complex systems.
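The neighbor-to-neighbor aggregation described above can be caricatured in a few lines of Python. This is only a toy sketch under strong assumptions that are not part of the agenda itself: each modeler's prediction is a single number, "communication" means blending that number with the average of the neighboring modelers' numbers, and the names `aggregate_predictions`, `local`, `neighbors`, `rounds`, and `weight` are all hypothetical.

```python
def aggregate_predictions(local, neighbors, rounds=10, weight=0.5):
    """Toy bottom-up aggregation (illustrative assumption, not the MMOO method).

    local:     dict mapping component name -> that modeler's scalar prediction
    neighbors: dict mapping component name -> list of neighboring components
    rounds:    how many times every modeler talks to their neighbors
    weight:    how much each modeler defers to the neighbors' average
    """
    current = dict(local)
    for _ in range(rounds):
        updated = {}
        for component, value in current.items():
            adjacent = neighbors.get(component, [])
            if adjacent:
                # Blend the modeler's own view with the mean view of neighbors.
                neighbor_mean = sum(current[n] for n in adjacent) / len(adjacent)
                updated[component] = (1 - weight) * value + weight * neighbor_mean
            else:
                # A modeler with no neighbors keeps their own prediction.
                updated[component] = value
        current = updated
    return current


# Hypothetical example: two adjacent components and one isolated hair.
predictions = aggregate_predictions(
    local={"left_pinky_toe": 0.0, "adjacent_toe": 1.0, "weird_ear_hair": 2.0},
    neighbors={"left_pinky_toe": ["adjacent_toe"], "adjacent_toe": ["left_pinky_toe"]},
    rounds=3,
)
print(predictions)
```

In this sketch the two neighboring modelers converge on a shared estimate while the isolated one is left alone. Any real scheme would need richer predictions and a principled way to weight expertise.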

Choice of organism

Ultimately, as the only existing general intelligences, humans will be the target of future iterations.

Previous work has already solved the zebrafish transparency problem. Zebrafish share about 70% of human DNA. House cats share about 90%, which represents a leap in complexity that may be too ambitious for a pilot project. At about 80% shared DNA, bovines are the natural first model organism for the MMOO agenda.