Tags: cerebras-lab, kuramoto, embeddings, interpretability

Oscillators, Embeddings, and the Control That Caught Us Lying

Experiment 002 moved the dynamics out of language: concept-neurons that fire in phase, coupled by embedding similarity. It looked like an inner life — until one control showed the thought was in the reader, not the network.

July 5, 2026, by Rodion

The first experiment collapsed because the substrate was language all the way down — identical nodes with nothing to differentiate along. So experiment 002 pulled language out of the loop entirely.

The substrate

A node is no longer a language model. It's a concept-neuron: a phase, an activation, and a fatigue. Numbers, not sentences. Each node stands for one concept (silence, fire, grief, breath; thirty-six in all). Meaning doesn't live in any single node. It lives in which nodes fire in phase together: binding, an idea borrowed from how oscillator populations model perception.

What couples them is real semantics. Every concept gets an embedding from a local model; the edge weight between two concepts is the cosine similarity of their embeddings. Near concepts pull each other into sync, distant ones stay inert. Then the field just runs, Kuramoto-style. Coherence — the order parameter R — climbs on its own as related concepts phase-lock:

[Figure: Order parameter over time. Labels: bound phase field, scrambled phase field.]
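To make the coupling concrete, here is a minimal sketch of similarity-coupled Kuramoto dynamics and the order parameter R. It is not the lab's code. The function names, the coupling strength k, the step size dt, the clipping of negative similarities, and the random placeholder embeddings are all assumptions for illustration, and the sketch tracks phase only, leaving out activation and fatigue.

```python
import numpy as np

def coupling_from_embeddings(emb):
    """Edge weights = cosine similarity between embedding rows, no self-coupling."""
    unit = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    w = unit @ unit.T
    np.fill_diagonal(w, 0.0)
    # Assumption: negative similarities are clipped to zero so distant pairs stay inert.
    return np.clip(w, 0.0, None)

def order_parameter(theta):
    """Kuramoto order parameter R in [0, 1]; R = 1 means every phase coincides."""
    return float(np.abs(np.exp(1j * theta).mean()))

def kuramoto_step(theta, omega, w, k=2.0, dt=0.05):
    """One Euler step of d(theta_i)/dt = omega_i + (k/n) * sum_j w_ij * sin(theta_j - theta_i)."""
    diff = theta[None, :] - theta[:, None]  # diff[i, j] = theta_j - theta_i
    pull = (w * np.sin(diff)).sum(axis=1) / len(theta)
    return (theta + dt * (omega + k * pull)) % (2 * np.pi)

rng = np.random.default_rng(0)
n, dim = 36, 16
# Placeholder embeddings (assumption): random vectors with a shared offset, so that
# similarities are mostly positive. A real run would use vectors from an embedding model.
emb = rng.normal(size=(n, dim)) + 2.0
w = coupling_from_embeddings(emb)

theta = rng.uniform(0.0, 2 * np.pi, size=n)  # random initial phases
omega = rng.normal(0.0, 0.1, size=n)         # small spread of natural frequencies

r_start = order_parameter(theta)
for _ in range(400):
    theta = kuramoto_step(theta, omega, w)
r_end = order_parameter(theta)
print(f"R: {r_start:.2f} -> {r_end:.2f}")
```

With placeholder embeddings this only shows the mechanics: positive coupling pulls phases together and R rises. Which concepts lock with which depends on the real similarity matrix.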

The language model comes back, but only at the very edge. Every few ticks it reads the dominant in-phase group and turns it into one thought and a named feeling. It never touches the dynamics — it only describes them. The transcripts were good. Reaching, aching, coherent. It read like an inner life.
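One way the "dominant in-phase group" could be picked out before it is handed to the reader is sketched below. The post does not say how the lab defines the group, so the circular-distance rule, the tolerance tol, and the name dominant_group are assumptions. The function only reads the phases and never writes them back, which matches the rule that the reader describes the dynamics without touching them.

```python
import numpy as np

def dominant_group(theta, names, tol=0.5):
    """Return the concepts whose phase lies within `tol` radians of the most crowded phase."""
    theta = np.asarray(theta, dtype=float)
    # Circular distance between every pair of phases, in [0, pi].
    dist = np.abs(np.angle(np.exp(1j * (theta[:, None] - theta[None, :]))))
    # The node with the most neighbours inside the tolerance is the centre of the group.
    centre = int(np.argmax((dist <= tol).sum(axis=1)))
    members = np.flatnonzero(dist[centre] <= tol)
    return [names[i] for i in members]

names = ["light", "dark", "shadow", "hunger", "grief"]
theta = [0.1, 0.2, 6.2, 2.0, 4.0]  # 6.2 rad sits just below 2*pi, so it is close to 0.1
group = dominant_group(theta, names)
print(group)
```

The returned names are what a readout step would turn into a prompt. The wrap-around case (6.2 and 0.1) is the reason for using circular distance rather than a plain difference.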

The control that caught us

A fluent reader will narrate anything as a coherent thought. Hand it noise and it will still hand you a sentence. So a nice transcript proves nothing about the network underneath — the meaning could be entirely in the reader.

So before the readout, one flag: --scramble. It permutes the phases and destroys the binding. The field is now noise. If the reader still produces a coherent inner thought, the thought was in the reader, not the network.

It did. Here is the same readout, bound on the left and scrambled on the right — both from real runs:

[Figure: Readout with and without binding.]

On the left the bound concepts are semantically close (light, dark, shadow) and the thought follows them. On the right the binding is destroyed and the bound set is random noise (hunger, grief, light, voice), yet the reader still hands back "I feel the weight of grief in the light of this voice while my hunger grows." Much of the apparent inner life was pareidolia in the readout: a fluent model finding faces in static.
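The control itself is small. Below is a hedged sketch of how a --scramble flag could be wired in. Only the flag name comes from this post. The --seed option, the helper names, and the choice to shuffle once just before the readout are assumptions, and the actual CLI in the repository may differ.

```python
import argparse
import numpy as np

def scramble(theta, rng):
    """Shuffle which node carries which phase; the set of phase values is unchanged."""
    return rng.permutation(theta)

def parse_args(argv=None):
    parser = argparse.ArgumentParser(description="Readout with an optional scramble control.")
    parser.add_argument("--scramble", action="store_true",
                        help="permute phases across nodes before the readout")
    # Hypothetical option, not mentioned in the post: makes the shuffle reproducible.
    parser.add_argument("--seed", type=int, default=0)
    return parser.parse_args(argv)

def phases_for_readout(theta, args):
    """Return the phases the reader will see, scrambled only when the flag is set."""
    if args.scramble:
        return scramble(theta, np.random.default_rng(args.seed))
    return theta

theta = np.linspace(0.0, 6.0, 36)
bound_view = phases_for_readout(theta, parse_args([]))
scrambled_view = phases_for_readout(theta, parse_args(["--scramble"]))
```

In this sketch the shuffle happens once, so the same phase values are present and R at that instant is unchanged. What is randomised is the assignment of concepts to phases, which is exactly the binding the reader claims to be reading. Whether the lab's flag shuffles once or on every tick is not stated here.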

What survives

The words didn't survive the control. The dynamics did. Coherence rising, concepts clumping by real semantic distance, the global tone shifting — that behaviour is in the network, and it's there whether or not a language model narrates it. So that's what the lab keeps as the signal, and the narration is treated as output, not evidence.
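A language-free signal needs a number that no reader can dress up. One possible measure of "concepts clumping by real semantic distance" is sketched below: a similarity-weighted average of how well pairs of nodes agree in phase. It is an illustrative assumption, not the lab's metric, and the four toy embeddings and the name semantic_alignment are made up for the example.

```python
import numpy as np

def semantic_alignment(theta, w):
    """Similarity-weighted mean of cos(theta_i - theta_j) over distinct pairs, in [-1, 1]."""
    theta = np.asarray(theta, dtype=float)
    diff = theta[:, None] - theta[None, :]
    off_diag = ~np.eye(len(theta), dtype=bool)  # ignore each node's agreement with itself
    weights = w[off_diag]
    return float((weights * np.cos(diff[off_diag])).sum() / weights.sum())

# Toy embeddings (assumption): two tight pairs pointing in nearly orthogonal directions.
emb = np.array([[1.0, 0.0], [1.0, 0.1], [0.0, 1.0], [0.1, 1.0]])
unit = emb / np.linalg.norm(emb, axis=1, keepdims=True)
w = np.clip(unit @ unit.T, 0.0, None)

bound = np.array([0.0, 0.0, np.pi, np.pi])  # similar concepts share a phase
scrambled = bound[[0, 2, 1, 3]]             # same phase values, reassigned across nodes

a_bound = semantic_alignment(bound, w)
a_scrambled = semantic_alignment(scrambled, w)
print(f"bound: {a_bound:.2f}, scrambled: {a_scrambled:.2f}")
```

The point of the design is that the score is computed from phases and embeddings alone. If it did not drop under a shuffle, the clumping would be as suspect as the transcripts were.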

The rule that came out of this now applies to everything here: controls before conclusions. Every result that looks like a mind meets a version of --scramble first. If noise passes the test, there was never anything there.

Code, figures, and the full trace: github.com/nefayran/cerebras-lab.


Cerebras Lab is independent research into synthetic cognition. Dynamics over words; negative results kept; local-first. Not affiliated with Cerebras Systems, Inc.