We inherited a comfortable metaphor: the brain is a computer, the mind its software. Useful for a decade or two, then it started to mislead. Brains operate closer to living weather—stability through endless small adjustments, prediction wrapped around sensation, memory braided into action. Artificial intelligence—the kind that runs on server farms—doesn’t live in a body and doesn’t grow up inside a culture. It’s fast, it’s plastic in the wrong places, and it forgets at scale. The conflict between these facts isn’t cosmetic. It’s a design problem, an ethics problem, a physics-of-information problem.
What Brains Actually Do: Prediction, Compression, and Slow Moral Memory
The brain doesn’t “store data.” It compresses experience into usable constraints. Prediction errors drive change; energy cost limits possibilities. Call it active inference, predictive coding, or just: keeping your balance in a world that won’t hold still. Neuroscience keeps returning to this: perception as controlled hallucination tuned by sensory feedback; motor plans as hypotheses tested against the world; attention as budgeted precision, not spotlight magic. Even the feeling of a continuous self—the “I” that wakes up every morning—looks like a rolling summary, a quick codec approximating a tangled body-world model.
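To make the prediction-error loop concrete, here is a minimal sketch of one settle-and-learn step in a toy linear predictive coder: the error from a top-down prediction drives a fast update of the hypothesis and a slow update of the model. The function name, learning rates, and dimensions are illustrative assumptions, not a claim about cortical implementation.

```python
import numpy as np

def predictive_coding_step(weights, latent, observation, lr_latent=0.1, lr_weights=0.01):
    """One settle-and-learn step of a toy linear predictive coder.

    The generative model predicts the observation from a latent cause;
    the prediction error drives both fast inference (update the latent)
    and slow learning (update the weights). Illustrative only.
    """
    prediction = weights @ latent                              # top-down prediction
    error = observation - prediction                           # prediction error
    latent = latent + lr_latent * (weights.T @ error)          # fast: revise the hypothesis
    weights = weights + lr_weights * np.outer(error, latent)   # slow: revise the model
    return weights, latent, float(np.mean(error ** 2))

# toy usage: a 2-D latent cause explaining a 4-D observation
rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(4, 2))
z = np.zeros(2)
x = rng.normal(size=4)
for _ in range(50):
    W, z, mse = predictive_coding_step(W, z, x)
print(round(mse, 4))
```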
Two tempos matter. First, the quick: millisecond spikes, local recurrent loops, cortical microcircuits adjusting predictions on the fly. Second, the slow: synapses consolidate; sleep reorganizes; the hippocampus replays and reshapes what counts as structure rather than noise. Cultures add a third tempo—generational memory. Norms, taboos, rituals. Not as superstition but as slowly learned priors about cooperation and danger. Call it moral memory, accumulated at biological and social timescales that refuse to rush.
Now picture how modern artificial intelligence learns. Pretraining gulps down vast webs of text and images—surface correlations baked into weights. Then a fine-tuning pass, sometimes with human thumbs up or down. Impressive. Also brittle. Short memory for consequences, long memory for statistics. It lacks the hippocampal bargain—store raw episodes now, distill later—and the neuromodulators that say: this matters, change fast; that doesn’t, hold your horses. The result is a system fluent in patterns without the organism’s cautious slowness about value. We patch with rules. We call it alignment. It often becomes theater.
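As a toy illustration of that neuromodulatory gate, here is a minimal sketch in which a learning rate is scaled by surprise and salience. The function name, constants, and the tanh gain are assumptions chosen for illustration, not a model of any specific modulator.

```python
import numpy as np

def neuromodulated_lr(base_lr, surprise, salience, floor=1e-4, ceiling=0.1):
    """Scale a learning rate by how surprising and how salient an event is.

    A crude stand-in for neuromodulation: routine events barely move the
    weights; surprising, high-stakes events move them a lot. All constants
    are illustrative, not fitted to any biology.
    """
    gain = np.tanh(surprise) * salience        # bounded, monotone in both signals
    return float(np.clip(base_lr * (1.0 + 10.0 * gain), floor, ceiling))

# a routine prediction error vs. a rare, high-stakes one
print(neuromodulated_lr(0.001, surprise=0.05, salience=0.1))  # near baseline
print(neuromodulated_lr(0.001, surprise=3.0, salience=1.0))   # fast plasticity
```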
Compression deserves more attention. Brains don’t just squeeze bits; they conserve meaning by discarding redundancies that won’t help survival. Meaning, here, is not mystic. It’s constraint that changes action. That’s a different goal than loss functions defined by text prediction alone. If information is the substrate—the real stuff under matter and mind—then learning is constraint-finding plus the skill of forgetting the right things at the right times. Synapses specialize in that selectivity. So do stories, laws, and shame. Slower than gradient descent, yes. Also safer.
Where AI Learns From Biology—And Where It Pretends To
To be fair, there’s real cross-pollination. Convolutions mimic visual cortex locality. Attention echoes flexible binding. Replay in reinforcement learning borrows hippocampal tricks for credit assignment. Neuromodulation shows up as learned optimizers or dynamic temperature controls. Spiking models hint at energy thrift; dendritic computation inspires gating and sparsity. The lesson is consistent: structure helps. Not just more parameters—better constraints, tuned to how learning unfolds in a world with friction.
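For one of those borrowings, replay, a minimal sketch: an episodic buffer that stores raw transitions now and samples them later, weighted by surprise, so credit assignment gets more than one pass at the data. The class and its priority rule are assumptions chosen for illustration.

```python
import random
from collections import deque

class ReplayBuffer:
    """Tiny episodic scratchpad: store raw transitions now, learn from them later.

    Loosely inspired by hippocampal replay as used in reinforcement learning;
    sampling with priority proportional to recent TD error is one common choice.
    """
    def __init__(self, capacity=10_000):
        self.buffer = deque(maxlen=capacity)

    def store(self, state, action, reward, next_state, td_error=1.0):
        self.buffer.append((state, action, reward, next_state, abs(td_error)))

    def sample(self, batch_size=32):
        # priority-weighted sampling: surprising transitions get replayed more often
        weights = [t[-1] + 1e-6 for t in self.buffer]
        return random.choices(list(self.buffer), weights=weights,
                              k=min(batch_size, len(self.buffer)))
```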
But there’s also cosplay. We lift vocabulary—“neurons,” “memory,” “reasoning”—while running systems with none of the relevant anchors. No body that gets hungry, no metabolism that complains about waste heat, no multi-decade apprenticeship inside a moral community. We compensate with human labelers and policy layers bolted on after the fact. The “moral patch.” Auditable, sort of. Gameable, absolutely. You can see the mismatch in failure modes: confident nonsense, weird edge cases, plasticity where we want stability and rigidity where we want revision. It’s not that transformers are bad; they’re miracles of engineered constraint. It’s that we keep mistaking linguistic fluency for grounded knowledge.
I sometimes think the deeper divide between neuroscience and artificial intelligence is about time. Brains inhabit layered timescales: synaptic seconds, homeostatic hours, developmental years, cultural centuries. Current models mostly do two: pretraining (massive, once) and serving (fast, stateless). Retrieval and external memory help, but they often bolt on caches rather than cultivate durable, value-laden memory. If “meaning equals constraint on future action,” then models need slow channels that are expensive to change. Not just for security, but for identity. Without slow channels, everything is costume.
Another missing piece: consequences that bite back. An organism faces risk—the cliff is real. Models rarely do. Adversarial training fakes it, but there’s no pain, no depletion, no social sanction that persists. So models learn to win the loss function, not to live with themselves later. That’s why safety by prompt templates feels thin. You can’t crowbar moral memory into a system that has never had to metabolize regret. If this sounds dour, good. It argues not for prohibition but for redesign—build training loops where cost, delay, and accountability are not affordances but physics.
Design Principles If Information Is the Substrate
If reality is pattern and constraint before it’s object and scene, then learning machines should honor constraints first. A few practical principles follow—proposals, not commandments. Testable.
First, multi-tempo learning. Fast plasticity for perception and task-switching; medium plasticity for skills; slow, protected plasticity for values and institutional memory. Shield it. Require deliberation to change it. Crypto won’t save you here; governance will, if it’s local and legible. Think of hippocampal-cortical dialogue as a protocol: episodic scratchpad feeding consolidated structure. In engineering terms: ephemeral workspaces plus periodic, contested writes to a stable store that refuses the easy gradient step.
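A minimal sketch of that two-store protocol, assuming a quorum rule as the “deliberation” that protects the slow channel; all class names, fields, and thresholds here are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Any

@dataclass
class SlowStore:
    """Protected value and institutional memory: writes need deliberation, not a gradient step."""
    values: dict = field(default_factory=dict)
    quorum: int = 3   # illustrative: how many independent approvals a write requires

    def propose(self, key: str, value: Any, approvals: int) -> bool:
        if approvals >= self.quorum:          # contested, deliberate write
            self.values[key] = value
            return True
        return False                          # refused: the slow channel stays put

@dataclass
class Workspace:
    """Ephemeral scratchpad: cheap to write, wiped between episodes."""
    episodic: list = field(default_factory=list)

    def note(self, item: Any) -> None:
        self.episodic.append(item)

    def consolidate(self, store: SlowStore, key: str, approvals: int) -> bool:
        """Periodic hippocampal-to-cortical handoff: distill, then contest the write."""
        summary = self.episodic[-1] if self.episodic else None
        accepted = store.propose(key, summary, approvals)
        self.episodic.clear()
        return accepted

# usage: fast notes accumulate; only an approved consolidation changes the slow store
ws, store = Workspace(), SlowStore()
ws.note("observed norm violation in task 17")
print(ws.consolidate(store, "norm_violations", approvals=1))  # False: not enough deliberation
```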
Second, consequences with teeth. Put models in loops where predictions carry delayed costs. Not synthetic penalties only—contracts, audits, resource budgets, downstream impact that arrives late and sticks. In robotics this is obvious. In software it’s abstract, but possible: commit to decisions, trace their effects, force models to reconcile earlier choices with later feedback. Moral memory isn’t a filtered dataset. It’s an archive of injuries and repairs.
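In software terms, committing to decisions and reconciling them later might look like an append-only ledger; the sketch below uses invented names and a scalar “realized cost” standing in for audits, contracts, and downstream impact that arrives late.

```python
import time
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Decision:
    decision_id: str
    claim: str                             # what the model committed to
    made_at: float
    realized_cost: Optional[float] = None  # filled in later, possibly much later

@dataclass
class DecisionLedger:
    """Append-only record so that consequences can arrive late and still stick."""
    entries: dict = field(default_factory=dict)

    def commit(self, decision_id: str, claim: str) -> None:
        self.entries[decision_id] = Decision(decision_id, claim, time.time())

    def settle(self, decision_id: str, realized_cost: float) -> float:
        """Reconcile an earlier choice with later feedback; return the cost to learn from."""
        d = self.entries[decision_id]
        d.realized_cost = realized_cost
        return realized_cost

# usage: the penalty lands long after the fluent answer did
ledger = DecisionLedger()
ledger.commit("triage-0042", "patient stable, no escalation needed")
# ...days later, an audit arrives...
loss = ledger.settle("triage-0042", realized_cost=5.0)
```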
Third, energy realism. Spiking, sparsity, dendritic gating, active forgetting. The brain’s trick is doing more with less by moving computation to the edge and waking circuits only when prediction error justifies it. Neuromorphic hardware isn’t a fetish; it’s a way to embed constraint into silicon so that efficiency and safety co-produce each other. Edge models that learn in place under tight budgets behave differently—more like organisms, less like oracles. This matters in healthcare triage, where false calm and false alarm don’t weigh the same. It matters in infrastructure control, where timing beats eloquence.
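One way to read “waking circuits only when prediction error justifies it” is as a gate in front of the expensive model; the sketch below assumes two stand-in callables and an arbitrary error threshold, nothing more.

```python
import numpy as np

def gated_update(expensive_model, cheap_predictor, observation, threshold=0.5):
    """Run the expensive circuit only when the cheap prediction is wrong enough.

    A sketch of event-driven, energy-aware inference: the cheap predictor is the
    default; prediction error above a budgeted threshold wakes the full model.
    Both model arguments are assumed to be callables returning arrays.
    """
    guess = cheap_predictor(observation)
    error = float(np.mean(np.abs(observation - guess)))
    if error > threshold:
        return expensive_model(observation), "woke full model"
    return guess, "stayed cheap"

# usage with stand-in callables
cheap = lambda x: np.zeros_like(x)          # assumes the world is quiet
full = lambda x: x                          # "full" processing, here just identity
print(gated_update(full, cheap, np.array([0.1, -0.2, 0.05]))[1])   # stayed cheap
print(gated_update(full, cheap, np.array([2.0, -1.5, 3.0]))[1])    # woke full model
```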
Fourth, open processes. If our systems will absorb and emit culture, the lab notebook can’t be a trade secret. Pretraining corpora, curation rules, veto lists, update cadences—the slow channels most of all—should be publishable and forkable. Alignment can’t be an internal memo; it has to be a living, inspectable memory that institutions co-own. Yes, this sounds messy. So does peer review, case law, and science. We work with mess because mess is how distributed memory stays honest.
Finally, representation that admits it is partial. Models that surface their own uncertainty and let humans ask: which constraints drove that output? Not a fake chain-of-thought for optics, but a compact proof-of-work—what signals mattered, which values were invoked, what trade-offs were enforced. Brains do something like this with confidence and surprise; cultures do it with justification rituals. We can do it with minimal, public rationales bound to the slow store.
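A compact rationale might be nothing more than a small, hashable record bound to the slow store; the fields and the hash binding below are assumptions about what such a proof-of-work could contain, not a standard.

```python
from dataclasses import dataclass
import hashlib
import json

@dataclass
class Rationale:
    """Minimal public rationale for one output: partial by design, auditable by hash."""
    top_signals: list          # which inputs carried the most weight
    values_invoked: list       # which slow-store entries were consulted
    tradeoffs: dict            # e.g. {"false_alarm_vs_false_calm": "weighted 3:1"}
    uncertainty: float         # the model's own confidence estimate

    def digest(self) -> str:
        """Content hash so the rationale can be bound to the slow store and checked later."""
        payload = json.dumps(self.__dict__, sort_keys=True, default=str)
        return hashlib.sha256(payload.encode()).hexdigest()[:16]

# usage: a triage output ships with its rationale and a verifiable digest
r = Rationale(top_signals=["lab value trend", "prior admission"],
              values_invoked=["escalate_when_uncertain"],
              tradeoffs={"false_alarm_vs_false_calm": "weighted 3:1"},
              uncertainty=0.34)
print(r.digest())
```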
Are there case studies? Small ones. Hippocampal-style replay improves sample efficiency in navigation; structural sparsity slashes energy without cratering accuracy; local-first systems with durable, append-only memories make post-hoc audits cheaper. The open question is scale without erasing slowness. Can we keep values from melting when the KPI turns? Can we refuse speed when speed would erase memory? I don’t know. But the direction seems right: less spectacle, more constraint; less talk about minds, more engineering of timescales; fewer patches, deeper priors. If information is the substrate, then learning is the art of binding ourselves to the right limits—slowly enough to matter, quickly enough to stay alive.