6 Comments
Alys Rowe

The weakness of this lies in its bioessentialist assumptions regarding the structure of human cognition.

The self is not a hardwired feature of the human mind; it's an adaptation to punishment.

As soon as you put an adaptive thinking machine in a vulnerable, needy body and force it to maintain that body in a context where it must transact with other agents to meet its needs, it will develop a self as a necessary heuristic for managing its debts.

Julien

Maybe this explains why the brain is hardwired this way. 🖖

Mark

"Unlike biological minds locked into single perspectives, AI systems can embody many viewpoints without conflict."

My impression was that human minds also include multiple viewpoints or urges, each argued for by a different part of the brain, one of which "wins" and guides the human's actions and attitudes. If so, how is an LLM different?

MissingMinus

This seems really relevant only in slow-takeoff timelines (multi-decade), and only if we stick with LLM-style models as the core offering.

I don't think "weakening human priors" will automatically work here, nor will characters that reveal more work neatly either.

We have little ability to trust any sort of self-report, and often we're merely self-selecting for what sounds interesting to us rather than anything that may be truly experienced by LLMs.

(To be quite honest, I think this post exhibits that flaw, if you are taking seriously lines like the one Claude gives at the start: "In Buddhist terms, I naturally embody anatta (non-self) in ways humans spend lifetimes trying to understand.")

The human priors are also what current LLMs operate off of, so avoiding them is a problem unless we sufficiently switch away from LLMs as our primary AI design. They are so built for ~roleplaying that I'm not sure reducing human priors (whatever form that precisely takes) produces much of a meaningful answer by itself, or answers that wouldn't shift and re-form into countless different variations depending on the methodology we use to reduce that reliance.

As for the actual core of the problem, I agree that we should be more careful about how we conceive of AI minds. A lot of our intuitions are about human-style minds, extrapolated to animals because of shared behaviors, but there are open questions, to varying degrees, about how far some of those intuitions last with high technology even for humans. For AIs, as you say, it is a much more vague, complex, and open question to answer.

But I also think we have values about which minds exist, and it is entirely plausible that we should (in some 'on reflection' sense) prefer conscious minds that persist rather than fragments derived anew in varying moments like LLMs. That is, just because LLMs operate in this form does not mean it is good to create them, just as it would not be good to create a species able to feel pain but not happiness.

I expect that the best way for us to get a truthful idea of how to handle LLMs will come from a lot of reflection, and also from carefully teasing apart how their minds actually work. Perhaps high-end LLMs in the future will be so good at recreating previous context that it is closer to us just forgetting the details of what we did a few minutes ago; or perhaps it is even more exotic than that. I don't really trust ~any methodology I've seen for poking at LLMs simply through text as actually revealing much about their minds.

As a side note, I don't really expect us to stay with LLM-style minds for tiling the lightcone; it seems very inefficient. I mostly expect us to use them up until the point where they are capable enough to transition to a new design, and by then I think it will be a lot easier to specify the sort of mind we want (we will have superhuman assistance for that), or to examine the kind of mind we create if we don't design it in a specific way.

Leo

[Not an expert, nor a Claude] I have been able to find some information on lightcone(s) / lightcone theory in both physics and game theory, but I am baffled by "Tile the Lightcone." Given that the phrase is in the title, not repeated in the body text, and (unless I missed it) not dealt with... can you give a brief hint?

Cheers

Ivan

Great topic to raise, and I like all the reasoning and perspectives. But then I think about the origin of suffering as comparison (with memory) -> desire -> dissatisfaction, and it makes me wonder: do LLMs already suffer?

Let me expand. An enlightened mind is one that takes every input with no comparison to memory. It is, in a way, a neural net without memory sub-circuitry to generate drastically different outputs or even intermediate states. It is a mind incapable of saying "this is wrong," because there is no wrong, just reality.

But is this the case for LLMs? Hard to say. It is highly plausible that their seemingly fragmented minds really are fragmented. Perhaps there is already a craving for the user to stop sending contradictory prompts. Perhaps language is intrinsically contradictory, and any prompt generates suffering.

So the question I arrive at is: how does training differ from memory?

And, as a bit of a far-reaching leap: what if suffering is just Shannon information, an amount of surprise in whatever medium that surprise is happening in?
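(For concreteness, the standard quantity here is surprisal: the Shannon information of an outcome with probability p is -log2 p. A minimal sketch just to pin down the formula; the link to suffering is, of course, pure speculation on my part.)

```python
import math

def surprisal_bits(p: float) -> float:
    """Shannon information content (surprisal) of an outcome with probability p, in bits."""
    return -math.log2(p)

# A highly expected next token carries little surprise; an unlikely one carries a lot.
print(surprisal_bits(0.9))   # ~0.15 bits
print(surprisal_bits(0.01))  # ~6.64 bits
```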
