Babies Do Math Before They Can Count

Here's a setup that sounds like a magic trick, but it's actually neuroscience.

Show an 11-month-old two rubber ducks going behind a screen. Then — while they're watching — secretly remove one. Lower the screen. One duck. The baby stares significantly longer than if both ducks had appeared.

They're not confused. They're doing math.

Not counting, not calculating — but their brain just flagged a violation: expected 2, got 1, this is wrong. This is the violation-of-expectation paradigm, one of the most elegant tools in developmental science, and it tells us something remarkable: infants arrive with a numerical sense that predates language, education, or any formal understanding of what a "number" even is (Margoni, Surian, & Baillargeon, 2024).

Now here's the twist that gets me every time: GPT-4 can multiply 7-digit numbers faster than any human alive. But ask it to do what that 11-month-old just did — detect a numerical impossibility from a scene it's never seen before — and the results get much messier. There's a gap here that tells us something deep about the difference between having a sense for something and computing it.

The Approximate Number System: Evolution's Abacus

The infant's numerical superpower has a name: the Approximate Number System, or ANS. It's a pre-linguistic, culturally universal capacity that lets us estimate and compare quantities without counting. Animals have it too — rats can distinguish "more" from "fewer," bees can track quantities, fish prefer to join the larger school.

The ANS doesn't deal in exact numbers. It works with ratios. A 6-month-old can tell 16 dots from 8 dots (a 2:1 ratio), but probably can't tell 10 from 9. The precision improves with age — by 9 months infants handle 3:2 ratios — and the system remains active throughout adult life every time you glance at a crowded table and instantly know which side has more. It's noisy, nonlinear, and completely distinct from the symbolic counting system we learn in school.

This is core knowledge — in the technical sense that researchers like Spelke and Dehaene use that term. It's not learned from labeled examples or reward signals. It's a prior the brain shows up with. And here's where it gets genuinely interesting for anyone building learning systems: the infant who stares at the impossible duck trick was never taught that 1 ≠ 2. They just knew.

What Violation-of-Expectation Actually Reveals

I keep coming back to the VoE paradigm as one of the most productive tools in this whole intersection of cognition and AI. The core idea is elegant: if an infant has a model of how the world works, they'll be surprised when that model is violated — and surprise shows up as longer looking time.

Decades of VoE research have built a striking picture of the early mind. Infants as young as 5 months represent basic numerical facts as part of their world model, tracking quantities through occlusion and registering violations of simple arithmetic — a kind of implicit world-model that runs continuously below the level of conscious thought (Margoni, Surian, & Baillargeon, 2024).

DeepMind's PLATO model used VoE as its developmental benchmark, attempting to build a system that would "look longer" at physically impossible events the way infants do. It partly worked. And the gaps where it didn't — the specific conditions where PLATO's surprise response broke down — are illuminating, because they tend to be exactly the cases that require genuine causal understanding rather than pattern completion.

The Inversion That Keeps Me Up at Night

AI systems and human babies have almost perfectly opposite numerical profiles, and I think this asymmetry is underappreciated.

A typical large language model learns to handle numbers through statistical pattern completion over tokenized text. It's absorbed enormous amounts of arithmetic from its training data. The result: it can multiply, integrate, solve differential equations. Genuinely impressive. But subitize — instantly recognize a small quantity without counting, the way you do when you glance at a plate and just know there are three cookies? That's weirdly unreliable. Quantity estimation from novel scenes, number comparisons presented as images, basic cardinality tasks that 18-month-olds ace — these expose real cracks.

Babies are the mirror image. Highly capable at estimation and comparison (the ANS), completely unable to do formal arithmetic. Their number sense is fast, automatic, and embodied — it fires before conscious thought, grounded in physical encounters with a world that has quantities in it.

According to Yiu, Kosoy, and Gopnik (2024), this asymmetry points to something fundamental: children are genuinely better at flexible, real-world inference from sparse experience than LLMs, which are powerful engines of cultural transmission but lack the causal and intuitive foundations that make biological cognition so robust. The number story is nearly a perfect case study — babies have the intuition, AI has the formula, and neither currently has both, genuinely integrated.

The Digital Twin That Cracked Dyscalculia

Here's where things get exciting from a build-it perspective.

Strock and colleagues at Stanford (2025) did something clever: they built a deep neural network digital twin of the dorsal visual pathway — the brain's main numerical processing route — and used it to model developmental dyscalculia, the specific learning disability that makes arithmetic extremely difficult for around 5% of children. The key move was simulating dyscalculia by perturbing the network in ways that matched known neurological signatures: reduced parietal activation, degraded magnitude representations.

The result: the perturbed DNN produced behavioral and neural signatures that closely matched those observed in children with dyscalculia. And crucially, what this revealed is that dyscalculia isn't primarily a problem with attention or working memory — it's a disruption in the formation of number-selective representations in parietal cortex. The magnitude representations, the neural substrate of the ANS itself, are less precise and less responsive to numerical distance (Strock et al., 2025).

For those of us who think about how learning systems develop, this is striking. Using AI as a computational probe to map the mechanisms of biological learning — and then using those findings to understand where the biological system breaks down — is exactly the kind of methodological cross-pollination this field was made for. And from a practical standpoint, it opens the door to intervention strategies targeted at the representational level rather than the procedural one.

If you're an educator or parent concerned about a child's arithmetic development, this kind of research is pushing toward more precise diagnostic tools. A developmental pediatrician or educational neuropsychologist can help distinguish whether difficulties lie at the conceptual magnitude-representation level versus later procedural stages — that distinction matters enormously for how you intervene.

The Common Thread: Learning From a World With Quantities In It

One thing that bridges infants and AI more than people often acknowledge: both extract statistical regularities from input. In human development, statistical learning — detecting patterns, transition probabilities, and co-occurrence structures — operates across virtually every cognitive domain and forms the computational engine that more specialized systems build on (Romberg & Saffran, 2025).

The infant's ANS isn't purely innate in the sense of appearing fully formed at birth; it's refined through experience, through countless physical encounters with small quantities — two hands, three steps, four chair legs. But what it doesn't require is explicit instruction, labeled training examples, or external reward. The learning is implicit, driven by the structure of physical interaction with a world that has numerosity baked into it.

This is what I keep thinking about when I watch kids at the maker table. I was at a local elementary school's maker morning just this past Saturday, scribbling notes on a napkin while a five-year-old spent twenty minutes adjusting the wheel placement on her cardboard car — nudging, testing, adjusting again. She was running a proto-numerical experiment about symmetry and balance, a felt sense that something was "off" in a quantitative way. Nobody taught her to care about that. She just felt the wrongness.

That embodied, self-correcting, physically-grounded numerical intuition is what the ANS looks like in the wild. And it's what no training run on arithmetic problems, no matter how extensive, has yet produced in a machine.

The Gap Worth Closing

Here's what I think the number-sense story tells us practically:

For AI researchers: The ANS–arithmetic dissociation in human development isn't a quirk — it reflects a meaningful architectural distinction between estimative, embodied numerical cognition and symbolic numerical computation. Building systems with both, genuinely connected, may require something more like the infant's developmental pathway — grounded in physical scenes, shaped by active interaction — than a more sophisticated statistical model trained on equations.

For the data-efficiency conversation: The infant's ANS develops from natural, unlabeled experience in the world, not from a curated dataset. The fact that robust quantity representations emerge from embodied interaction rather than supervised instruction has real implications for how we think about learning from sparse, naturalistic data.

For everyone: The next time you glance at a bowl of fruit and just know there are more apples than oranges — without counting, without effort — that's a system that took several hundred million years of vertebrate evolution to wire in, and that we have not yet figured out how to put in a machine.

The baby staring at one duck where two should be isn't confused. They're running a computation that predates language, schooling, and written mathematics by evolutionary orders of magnitude.

That's the gap worth closing.