Posts

הלבת-אש ללא הסנה

[This post is labeled בבל, meaning it's especially experimental. See: בבל disclaimer ] I seem to have always already lost my wife. I do wonder where she is. I assume she doesn't know where I am, or else she would have returned to me, although——not being able to imagine that she's dead or nonexistent or otherwise radically disempowered——I also eventually come to wonder if she's forsaken me, which choice I would naturally be required to have made myself enough apparently separate to pretend acceptance of, at least long enough for her to depart. Sadly I have also forgotten where she might be, and what she looks like, and worst of all, the sound of her voice murmuring something secret in my ear. I've even forgotten her name. Did it start with a J? Maybe an M? an A? Or was it a $\daleth$ or an Л? I don't remember.  We can be quite sure it doesn't start with an $\aleph$, since she's kind and patient. She likes lemons and she likes the feel of rock on her sk...

The fraught voyage of aligned novelty

A voyage of novelty is fraught. If a mind takes a voyage of novelty, an observer is hard pressed to understand what the growing mind is thinking or what effects the growing mind will have on the world.

Provisionality

A mental element has to be open to revision, and so it has to be treated as though it might be revised.

Explicitness

Explicitness is out-foldedness. An element of a mind is explicit when it is available to relate to other elements when suitable.

Communicating with binaries and spectra

To communicate, it's convenient to code information in words and numbers. Words are discrete, so they're well-suited to expressing binaries: this is big, that is small. They're also well-suited to express finite partitions: microscopic, tiny, small, big, huge, enormous. Thought is often tripped up by finite partitions: many things do not fit neatly into the partitions, or what's relevant about something might be only poorly expressible with the available partitions. So instead an adjective can be taken as pointing at a spectrum. This is bigger, that is smaller. This is 10 meters long, that is 1 millimeter long. Thought can also be tripped up by spectra: again, what's relevant might be only poorly expressible as lying somewhere on the spectrum. What's relevant might be multidimensional, so that a one-dimensional representation requires a lossy projection. This weighs 2000 kg and is 10 meters long, that weighs 3 mg and is 1 millimeter long. A description could ...

Please don't throw your mind away

1. Dialogue 2. Synopsis 3. Highly theoretical justifications for having fun 4. Appendix: What is this "fun" you speak of? What's a circle? Hookwave Random smooth paths Mandala Water flowing uphill Guitar chamber Groups, but without closure Wet wall 1. Dialogue [Warning: the following dialogue contains an incidental spoiler for "Music in Human Evolution" by Kevin Simler . That post is short, good, and worth reading without spoilers, and this post will still be here if you come back later. It's also possible to get the point of this post by skipping the dialogue and reading the other sections.] Pretty often, talking to someone who's arriving to the existential risk / AGI risk / longtermism cluster, I'll have a conversation like the following. ———————————————————— Tsvi: "So, what's been catching your eye about this stuff?" Arrival: "I think I want to work on machine learning, and see if I can contribute to align...

Rules for the flighty-souled

[This post is labeled בבל, meaning it's especially experimental. See: בבל disclaimer] Never take your phone out while you're walking. Unless it's an emergency or you're going to take notes (but voice notes are preferable). Never take your phone out while you're with someone. Ever. Unless you explicitly take a break from being together. If they take their phone out first, it's less bad, but still never do it. Never wear clothing with words or images. Especially not logos or branded symbols. Accept as little money as possible. Never be in photos, ever. When you're having a long-distance call, make it audio only, not video. Avoid being inside cars. Never go on a dating app. Don't go on Facebook, Twitter, or other similar networks. Never say the same thing twice. Never speak to more than three people at once. Preferably, never speak to more than one person at once. Never touch anyone unless you mutually have some type ...

esc.

[This post is labeled בבל, meaning it's especially experimental. See: בבל disclaimer ] [Note February 2023: this is an unedited first draft written in April 2018 while I was heavily involved with a psychopath.] 180408 01:46:29 esc. summary: modalities attempt to mention a statement S, so to speak (by transforming it to a different statement [[M]] S), without using it. but the hidden effects of uttering S are often-roughly also caused by [[M]] S. thus S escapes its modality. speaking in a modality [[M]] attempts to transform a statement S into an object some other type, denoted [[M]] S (which can also be taken to be a statement, but in a different modality (as in, sensory modality)). some examples of modalities: [[M]] S may be: a string (the quotation modality; for example, "i was going to say [[']]i can't deal with this right now[[']]"); an emotion (e.g. "[[i feel like]] you are trying to hurt me"); a perception (e.g. "[[it seems to...

Wildfire of strategicness

It may not be feasible to make a mind that makes achievable many difficult goals in diverse domains, without the mind also itself having large and increasing effects on the world. That is, it may not be feasible to make a system that strongly possibilizes without strongly actualizing. But suppose that this is feasible, and there is a mind M that strongly possibilizes without strongly actualizing. What happens if some mental elements of M start to act strategically, selecting, across any available domain, actions predicted to push the long-term future toward some specific outcome? The growth of M is like a forest or prairie that accumulates dry grass and trees over time. At some point a spark ignites a wildfire that consumes all the accumulated matter. The spark of strategicness, if such a thing is possible, recruits the surrounding mental elements. Those surrounding mental elements, by hypothesis, make goals achievable. That means the wildfire can recruit these surrounding element...

A strong mind continues its trajectory of creativity

A very strong mind is produced by a trajectory of creativity. A trajectory of creativity that produces a very strong mind is hard to separate from the mind's operation. So a strong mind continues on its trajectory of creativity as long as it is active.

An anthropomorphic AI dilemma

Either generally-human-level AI will work internally like humans work internally, or not. If generally-human-level AI works like humans, then takeoff can be very fast, because in silico minds that work like humans are very scalable. If generally-human-level AI does not work like humans, then intent alignment is hard because we can't use our familiarity with human minds to understand the implications of what the AI is thinking or to understand what the AI is trying to do.

The voyage of novelty

Novelty is understanding that is new to a mind, that doesn't readily correspond or translate to something already in the mind. We want AGI in order to understand stuff that we haven't yet understood. So we want a system that takes a voyage of novelty: a creative search progressively incorporating ideas and ways of thinking that we haven't seen before. A voyage of novelty is fraught: we don't understand the relationship between novelty and control within a mind.

Hyperphone

"Did you know the Greenland shark doesn't reach reproductive maturity until it's over 100 years old?" "Yeah, it's crazy! What evolutionary pressures could possibly have produced that trait? Maybe it has to do with... "Well I was thinking that...

Endo-, Dia-, Para-, and Ecto-systemic novelty

Novelty can be coarsely described as one of: fitting within a preexisting system; constituting a shift of the system; creating a new parallel subsystem; or standing unintegrated outside the system.

"Sorry" and the originary concept of apology

1. Paradox of "apology" What does the word "apology" mean? Today it means "say you're sorry". In Ancient Greece, as people say, the etymon ἀπολογία meant "a speech made in defense of something", and this meaning can also attach to the English word "apology". Aren't these nearly exact opposites? Saying sorry is saying you did something wrong, and ἀπολογία is defending what you did, saying it's not wrong.

Verichtung

[This post is labeled בבל, meaning it's especially experimental. See: בבל disclaimer ] (Caveat lector: I only speak English and didn't run this by anyone.) Sonnendurchflutet Bäume über einem stillgelegt Steinbruch, Spiegel einander gegenüber, zu nah. Die geworfenen Würfel sind Schlangenaugen. ...איך להסביר לילד שכולם ימ Shield-toad left in the Haze, Gebröckelt Steineule auf der Hügelspitze. Unter der Unterfläche wartet der blaue Dynamo. Notes: "Verichtung" is a made-up word, patterned off " Vernichtung ", replacing "nicht" with the obsolete analogous word " icht ". "Icht" could maybe be viewed as "je-Wicht" (as in English "wight"), meaning something like "ever something", as opposed to "nicht" = "nie-Wicht" = "never something". So "Verichtung" would mean something like "be-something-ing, to make something be something, to make something...

בבל disclaimer

Here are the posts labeled "בבל": https://tsvibt.blogspot.com/search/label/%D7%91%D7%91%D7%9C Posts labeled "בבל" are more experimental, unreliable, poetic, prophetic, metaphoric, contradictory, false, confused, incoherent, unclear, inchoate, incontinent, insane, repetitive, low-effort, pointless, cringe, fringe, binge, silly, facile, babbling, rambling, squabbling, and any other manner of not to necessarily be taken too seriously, compared to other posts.

Possibilizing vs. actualizing

Some behavior seems like it's just making things possible, without actually doing much of anything, while other behavior seems to actually do something. Is there a principled, or a useful, distinction between possibilizing and actualizing? Is it possible to possibilize a large effect on the world without actualizing large effects on the world?

Ultimate ends may be easily hidable behind convergent subgoals

Thought and action in pursuit of convergent instrumental subgoals do not automatically reveal why those subgoals are being pursued (towards what supergoals), because many other agents with different supergoals would also pursue those subgoals, maybe with overlapping thought and action. In particular, an agent's ultimate ends don't have to be revealed by its pursuit of convergent subgoals. It might therefore be easy to covertly pursue some ultimate goal by mostly pursuing generally useful subgoals of other supergoals. By the inspection paradox for the convergence of subgoals, it might be easy to think and act almost comprehensively like a non-threatening agent would think and act, while going most of the way towards achieving some other more ambitious goal.

Politically convergent perverse instability

[Epistemic status: just a guess / hypothesis.] Suppose Alice is anti-immigration and has political power. She politically pushes for laws against immigration, government spending towards capacity to prevent immigration like walls and guards, policies to deport illegal immigrants, and so on. Alice also pushes against policies to cope with whatever the current de facto status quo is, e.g. to alleviate harms done by whatever is already going on, or at least doesn't push for such alleviation policies. Those policies would alleviate pressure to change, and Alice wants change; they'd make the status quo less bad, and Alice doesn't like the status quo. And, those policies being passed would constitute, relative to Alice's desired anti-immigration stance, a symbolic victory for the side opposing Alice; it would "say", in the language of politics, that "we are okay with the status quo, we're organizing to make something like the status quo work well". Pl...

Descriptive vs. specifiable values

What are an agent's values? An answer to this question might be a good description of the agent's external behavior and internal workings, without showing how one could modify the agent's workings or origins so that the agent pushes the world in a specific different direction.

Shell games

1. Shell game 2. Perpetual motion machines 3. Shell games in alignment Example: hiding the generator of large effects Example: hiding the generator of novel understanding Other? 1. Shell game Here's the classic shell game (YouTube video; screenshot from that video). The little ball is a phantom: when you look for it under a specific shell, it's not there, it's under a different shell.

Are there cognitive realms?

Are there unbounded modes of thinking that are systemically, radically distinct from each other in relevant ways?

Prosthetic connectivity

Summary: adding artificial connections between distant areas of the brain might increase intelligence in two ways. The first way is by simply increasing connectivity in areas that perform abstract thinking; since evolution was clearly bottlenecked on connectivity, that might be valuable to the brain. The second way is by reprioritizing brainware according to our values in our current environment. Prosthetic connectivity seems bottlenecked on a bunch of nitty-gritty (bio)engineering work that's on the mainstream BCI pathway.

Do humans derive values from fictitious imputed coherence?

Humans are born with some elements of their minds, and without many other elements, some of which they'll acquire as their life unfolds. In particular, the elements that we pretheoretically call "values"——aesthetic preferences, goals, life goals, squad goals, aspirations, needs, wants, yearnings, drives, cravings, principles, morals, ethics, senses of importance, and so on——are for the most part acquired or at least unfolded, rather than being explicitly present in a newborn. How does this happen? What generates these mental elements? Hypothesis: a human derives many of zer values by imputing coherent agency to zer past behavior, and then adopting the goals of that fictitious agency as actively influential criteria for future action.

Rootedness

[This post is labeled בבל, meaning it's especially experimental. See: בבל disclaimer] A subgoal sticks its head into eternity.

Counting-down vs. counting-up coherence

Counting-down coherence is the coherence of a mind viewed as the absence of deviation downward in capability from ideal, perfectly efficient agency: the utility left on the table, the waste, the exploitability. Counting-up coherence is the coherence of a mind viewed as the deviation upward in capability from a rock: the elements of the mind, and how they combine to perform tasks.

Does novel understanding imply novel agency / values?

To have large relevant effects on the world, a mind has to understand a lot about the world. The mind has to have a lot of the structure of the cosmos (the entirety of the world, in any aspect or abstraction) highly accessible to itself for use in skillful action. To understand a lot about the world, the mind has to gain a lot of understanding that it didn't have previously. When a mind gains understanding, that's a change in the mind. Does that change have to include a change to the values of the mind?

The conceptual Doppelgänger problem

Suppose we want to observe the thoughts of a mind in order to detect whether it's making its way towards a plan to harm us, and ideally also to direct the mind so that it pursues specific aims. To this end, we might hope that the mind and its thinking are organized in a way we can come to understand in the way that we understand ourselves and our thinking. We might hope that when the mind considers plans that involve something, e.g. plans that involve the coffee cup, it does so using a concept alike to our concept [[coffee cup]]. When the mind recognizes, predicts, imagines, simulates, manipulates, designs, combines things with, describes, studies, associates things with, summarizes, remembers, compares things with, deduces things about, makes hypotheses about, or is otherwise mentally involved with the coffee cup, maybe it always does so in a way that is fully comprehendable in fixed terms that are similar to the terms in which we understand ourselves when we do those activities...

Dangers of deference

Sometimes people defer to other people, e.g. by believing what they say, by following orders, or by adopting intents or stances. In many cases it makes sense to defer, since other people know more than you about many things, and it's useful to share eyes and ears, and coordination and specialization are valuable, and one can "inquisitively defer" to opinions by taking them as challenges to investigate further by trying them out for oneself. But there are major issues with deferring, among which are: Deferral-based opinions don't contain the detailed content that generated the opinions, and therefore can't direct action effectively or update on new evidence correctly. Acting based on deferral-based opinions is discouraging because it's especially not the case that the whole of you can see why the action is good. Acting based on deferral-based opinions to some extent removes the "meaning" of learning new information; if you're just going t...

The benefit of intervening sooner

AGI is likely to come this century. Say you have a plan $X$ that would prevent AGI from destroying the world. How beneficial is it to set plan $X$ in motion sooner by $n$ years? A very short answer: it reduces the probability of AGI ruin by something like $n/2$ or $n/3$ percent. Which is a lot. A slightly longer answer: it reduces the probability of AGI ruin by roughly $fn$, where $f$ is the probability per year of AGI ruin around the $n$-year interval of time between when plan $X$ would have been completed with intervention and without intervention. So if $X$ would take a very long time either way, or if the chances of AGI ruin are very spread out through time, then the intervention doesn't matter that much; otherwise the intervention probably makes a noticeable difference. A fuller answer: This post assumes that the arrival of AGI ruin, and the workings of $X$, are independent. It also assumes that the probability of AGI ruin without $X$ is 1, so results should be scaled down...
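As a rough numeric illustration of the short and slightly longer answers above, here is a minimal Python sketch. It assumes a constant per-year ruin probability `f` across the window, which is a simplification introduced here and not something the post states, and both function names are hypothetical. The second function is a constant-hazard comparison, added only to show when the linear $fn$ estimate is a reasonable approximation.

```python
def ruin_reduction_linear(f_per_year: float, n_years: float) -> float:
    """Linear estimate from the text: reduction in ruin probability is roughly f * n."""
    return f_per_year * n_years


def ruin_reduction_constant_hazard(f_per_year: float, n_years: float) -> float:
    """Assumption (not from the post): ruin arrives independently each year
    with probability f, so the chance it lands inside the n-year window
    is 1 - (1 - f)**n."""
    return 1.0 - (1.0 - f_per_year) ** n_years


if __name__ == "__main__":
    # f = 0.5% per year corresponds to the "n/2 percent" rule of thumb.
    for n in (1, 4, 10):
        linear = ruin_reduction_linear(0.005, n)
        hazard = ruin_reduction_constant_hazard(0.005, n)
        print(f"n={n:2d}  linear={linear:.4f}  constant-hazard={hazard:.4f}")
```

For small $fn$ the two estimates nearly coincide; they drift apart only when the window is long or the per-year probability is large, which is also where the linear rule of thumb should be trusted least.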

Control

I don't know how to define control or even point at it except as a word-cloud, so it's probably wanting to be refactored. The point of talking about control is to lay part of the groundwork for understanding what determines what directions a mind ends up pushing the world in. Control is something like what's happening when values or drives are making themselves felt as values or drives. ("Influence" = "in-flow" might be a better term than "control".)

Structure, creativity, and novelty

A high-level confusion of mine, which seems to lie on the way towards understanding alignment, concerns the relationship between values and understanding. This essay gestures at the idea of structure in general (mainly by listing examples).

Gemini modeling

A gemini model is a kind of model that's especially relevant for minds modeling minds.

Non-directed conceptual founding

In trying to understand minds-in-general, we sometimes ask questions that talk about "big" things (taking "big" to ambiguously mean any of large, complex, abstract, vague, important, touches many things, applies to many contexts, "high-level"). E.g.: What is it for a mind to have thoughts or to care about stuff? How do care and thought relate? What is it to believe a proposition? Why do agents use abstractions? These "big" things such as thought, caring, propositions, beliefs, agents, abstractions, and so on, have to be analyzed and re-understood in clearer terms in order to get anywhere useful. When others make statements about these things, I'm pulled to pause their flow of thoughts and instead try to get clear on meanings. In part, that pull is because the more your thoughts use descriptions that aren't founded on words with clear meaning, the more leeway is given to your words to point at different things in different in...

The Thingness of Things

What's a thing, in general? Minds deal with things, so this question comes up in trying to understand minds. Minds think about things, speak of things, manipulate things, care about things, create things, and maybe are made of things.

The power of selection

If you put in work to select additive components of some random variable, how far out can you get in the distribution of that variable? This post will focus on normally distributed variables, which is handy since the sum of many individually small random variables is roughly normally distributed by the Central Limit Theorem. (Note: In places this post is long-ish and discursive (and explains an error I made) because it's trying to get a mathematical understanding of selection that can inform mathematical intuitions about more complicated kinds of selection. If you just want a summary of the numerical situation, look at the tables and graphs.) Code for tables and diagrams is in this GitHub repository. Thanks to Sam Eisenstat for many ...
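To give a feel for the question "how far out can you get", here is a minimal Monte Carlo sketch in Python. It covers only the simplest case, picking the best of `n` independent standard normal draws. That setup is an assumption chosen for illustration and is not taken from the post's repository, and the function name is hypothetical.

```python
import random
import statistics


def mean_best_of_n(n: int, trials: int = 20000, seed: int = 0) -> float:
    """Monte Carlo estimate of E[max of n i.i.d. standard normal draws],
    measured in standard deviations above the mean."""
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    best_draws = [
        max(rng.gauss(0.0, 1.0) for _ in range(n))
        for _ in range(trials)
    ]
    return statistics.fmean(best_draws)


if __name__ == "__main__":
    for n in (1, 2, 10, 100):
        print(f"best of {n:3d}: about {mean_best_of_n(n, trials=5000):.2f} SD")
```

The estimate grows slowly with `n`, so each extra standard deviation of selection costs many more candidates than the previous one did.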

Downside risks of genomic selection

There are downside risks to selecting genomes of future human children for traits like health, intelligence, lifespan, mental health, and so on. This essay maps out some of these, starting with these intuitions: Unnaturalness. Objectification. Transgression. Misalignment.

Chromosome selection

Previous work: Gwern, Anon. This is all speculation from a lay perspective. The idea is to make an embryo with a genome selected chromosome by chromosome from some input genomes.
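The selection step can be sketched in a few lines of Python, under strong simplifying assumptions that are mine and not the post's. Each chromosome from each input genome is given a single hypothetical additive score, diploidy and every biological constraint are ignored, and the names `select_chromosomes` and `scores` are invented for illustration.

```python
from typing import List, Tuple


def select_chromosomes(scores: List[List[float]]) -> Tuple[List[int], float]:
    """scores[g][c] is a hypothetical additive score for chromosome c
    taken from input genome g. Returns (index of the chosen input genome
    for each chromosome, total score of the assembled genome)."""
    if not scores or any(len(row) != len(scores[0]) for row in scores):
        raise ValueError("need a non-empty, rectangular score table")
    n_genomes, n_chromosomes = len(scores), len(scores[0])
    # For each chromosome, take it from whichever input genome scores highest
    # (ties go to the lowest genome index).
    picks = [
        max(range(n_genomes), key=lambda g: scores[g][c])
        for c in range(n_chromosomes)
    ]
    total = sum(scores[g][c] for c, g in enumerate(picks))
    return picks, total


if __name__ == "__main__":
    toy_scores = [
        [2.0, -1.0, 5.0],  # input genome 0
        [4.0, 3.0, -2.0],  # input genome 1
    ]
    print(select_chromosomes(toy_scores))  # ([1, 1, 0], 12.0)
```

Because the scores are assumed additive, choosing the best source for each chromosome independently also maximizes the total. That shortcut would not hold if chromosomes interacted.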

Non-destructively sequencing gametes by sequencing meiotic cousins

Previous work: Gwern. My lay understanding is that it's not known how to sequence gametes without destroying them. If we could non-destructively sequence gametes, it would be easier to e.g. screen for genetic illnesses. Instead of fertilizing a few eggs, which are expensive to acquire, and then sequencing the resulting embryos, we could produce many sperm, find acceptable ones, and then use those to fertilize the few available eggs. This post describes a way to sequence gametes non-destructively, assuming that it's possible to sequence gametes destructively. I lack lots of basic biological knowledge, so I can't verify that this idea makes sense, would work, or would be feasible or efficient. I hope others who are more informed can check and use the idea. I'll focus on sperm for clarity; I don't know whether / how this might extend to ova.

Multisheets: multi-dimensional spreadsheets for belief tracking

TL;DR: To keep track of your thoughts about a question that has multiple parameters, you can use multi-dimensional spreadsheets. If you use vim, vim-multisheets is a basic implementation of multi-dimensional spreadsheets; code here. Tracking beliefs and then automatically unfolding some of their consequences can point towards questions you haven't answered and point to contradictions in your beliefs.
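As a loose illustration of the idea, and not of how vim-multisheets itself works, here is a toy Python sketch of a multi-dimensional sheet. The class `Multisheet`, its methods, and the example axes are all hypothetical names made up here. It stores one belief per combination of parameter values and lists the combinations that have no recorded answer yet; it does not attempt to detect contradictions.

```python
from itertools import product
from typing import Any, Dict, Iterator, List, Tuple


class Multisheet:
    """A toy multi-dimensional spreadsheet: one cell per combination of axis values."""

    def __init__(self, axes: Dict[str, List[Any]]) -> None:
        self.axes = axes  # axis name -> allowed values on that axis
        self.cells: Dict[Tuple[Any, ...], Any] = {}

    def _key(self, coords: Dict[str, Any]) -> Tuple[Any, ...]:
        # Require a value for every axis, and only values the axis allows.
        if set(coords) != set(self.axes):
            raise KeyError(f"expected exactly these axes: {sorted(self.axes)}")
        for name, value in coords.items():
            if value not in self.axes[name]:
                raise KeyError(f"{value!r} is not a value on axis {name!r}")
        return tuple(coords[name] for name in self.axes)

    def set(self, value: Any, **coords: Any) -> None:
        self.cells[self._key(coords)] = value

    def get(self, **coords: Any) -> Any:
        return self.cells.get(self._key(coords))

    def unanswered(self) -> Iterator[Dict[str, Any]]:
        """Yield every coordinate combination that has no recorded belief yet."""
        for combo in product(*self.axes.values()):
            if combo not in self.cells:
                yield dict(zip(self.axes, combo))


if __name__ == "__main__":
    sheet = Multisheet({"scenario": ["A", "B"], "horizon": ["near", "far"]})
    sheet.set(0.3, scenario="A", horizon="near")
    for gap in sheet.unanswered():
        print("no belief recorded for", gap)
```

Listing the empty cells is the simplest version of "pointing towards questions you haven't answered": every unfilled combination of parameters is a question the sheet is implicitly asking.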

Step, leap

Aliyah walks toward one side of a half-meter-wide ditch. Without missing a beat, she plants her left foot at the rounded lip of the ditch and then swings her right leg forward over the ditch to plant her right foot at the opposite lip, bobbling her upper body forward in dynamic equilibrium between the upward-centerward forces from her proceeding alternating legs, in a smooth motion reprising her past steps over solid ground. Boaz, following Aliyah, walks toward the half-meter-wide ditch. He starts to step across, but pauses halfway through, poised over the ditch. To amuse himself, he reverses by pushing back on the opposite lip, starts forward again, goes almost all the way, then reverses again. With just a bit of effort he can statically occupy a near approximation of any position and stance that Aliyah passed through during her step, dwelling for a time in that moment that was for Aliyah a non-extended point in a dynamic process. Carmel walks toward a meter-wide ditch. A push from ...

Protips

Very hot water stops itch. Itching due to an immune reaction--bug bite, allergies, poison ivy / poison oak--is caused by histamine. Heat makes histamine release quickly. So if you have an annoying itchy spot, go to the sink and run the water very hot. Don't burn yourself, but it should be really hot; hot enough that you can be okay with it running over your skin, but only by easing the cold water down gradually so the water gets hotter and hotter. If the itch feels weirdly intense, even pleasurable, like you're scratching all of it all at once, you're doing it right. Once the histamine is released, it should stay non-itchy for a while, like an hour or two or three. If the itch is in an inconvenient spot, e.g. face or neck, use a hot water bottle or in a pinch, ziploc bags. Works even for terrible poison ivy / poison oak rashes; it's way more effective than topical steroids for itch relief. When you can't find something and then you find it, after you're don...