AI Can Be Trained. Can It Be Taught?


I've been thinking about this since a Saturday morning two weeks ago, watching a five-year-old wrestle a motorized cardboard car into submission. She had all the instructions. She'd followed them. Motor attached, wheels connected, switch flipped — and then her creation immediately veered hard right, spinning uselessly on the gym floor.
Nobody told her what to do next. And here's what she did: she picked it up. Turned it over. Poked the left wheel. Pressed on the back axle. Set it down, watched it spin. Moved the rear axle a millimeter left. Tried again. Repeat, for twenty minutes, until the thing drove more or less straight.
I was scribbling notes on a napkin before she looked up.
What that kid was demonstrating — without knowing it — is the reason I keep coming back to one of the most underappreciated gaps between children and AI systems. Not the reasoning gap. Not the language gap. The teachability gap.
Why Children Are Built to Learn From Others
Here's something that shouldn't work but does: a two-year-old can watch an adult demonstrate how to operate a novel toy exactly once and generalize the lesson to every similar toy she encounters for the rest of her life. She doesn't just imitate the specific movements. She extracts the underlying principle.
Developmental psychologists Gergely Csibra and György Gergely have a name for the cognitive machinery that makes this possible: natural pedagogy. The theory proposes that humans have an evolved adaptation specifically for recognizing explicit instruction and appropriately generalizing from it. When a teacher makes eye contact, points, uses what Csibra calls "ostensive cues" — "Look! Watch what I do!" — children activate a pedagogical stance that tells them: this is general knowledge meant to be retained and broadly applied. Not a one-time event. Not just what this specific person does. A lesson.
This system comes online strikingly early — by 9 to 12 months of age. Before a child can walk or talk, she's already distinguishing between information offered pedagogically and information that's just... happening around her. She knows the difference between a teacher and a bystander.
But here's the part that gets underplayed: natural pedagogy is grounded in the body. The pedagogical stance isn't just cognitive receptivity to propositions. It's tied up with shared physical experience, joint attention to objects in the world, the whole sensorimotor scaffolding of face-to-face interaction. Learning from instruction requires a body that has done things — that has felt the resistance of a wheel that doesn't turn, the momentum of a car that veers, the subtle feedback difference between tight and loose. Without that substrate of embodied experience, the instruction is just sounds.
Active Inference: Learning Is Doing
This is exactly the argument Karl Friston and colleagues make in a striking 2024 paper on what the rise of LLMs means for education. Drawing on Friston's Active Inference framework — which proposes that biological agents learn by minimizing the gap between their predictions and the sensory feedback they receive from acting on the world — the paper makes a point that sounds obvious but lands like a gut punch: we don't learn by absorbing information. We learn by doing, being wrong, and updating (Friston et al., 2024).
Active Inference is explicitly embodied. The agent acts, perceives the consequences, updates its model, acts again. The learning loop runs through the body. You cannot run it on text alone.
The child fiddling with her cardboard car's axle isn't following a repair manual. She's running a tight loop of prediction, action, and proprioceptive feedback that no written instruction could replicate. The instruction "move the axle left" is useless without a prior model built from hours of handling things in the physical world — a model that tells you which direction "left" is relative to your grip, how much force to apply, what the resistance will feel like when the axle finally seats correctly.
According to Friston et al. (2024), the appropriate role for generative AI in education isn't to replace this active engagement, but to scaffold it — to enrich the environments where active learning happens, not to substitute for the physical exploration that does the actual cognitive work. It's an argument for AI as stage crew, not lead actor.
What RLHF Actually Is (And Isn't)
Now let's look at what happens when we "teach" an AI system.
Reinforcement Learning from Human Feedback — the training method behind most modern LLMs — looks superficially like instruction. A human rater sees a model output and signals whether it's good or bad. The model's parameters update toward the good. Repeat millions of times.
But notice what's missing: the model never represents why the feedback is being given. It doesn't model the rater's communicative intent. It doesn't ask "what general principle is this correction trying to teach me?" It updates a statistical pattern. The pedagogical stance — the cognitive ability to distinguish this is a general lesson from this is a one-time correction — never enters the picture.
This is exactly what Mahowald, Ivanova, Fedorenko and colleagues mapped in a landmark 2024 analysis: LLMs have achieved remarkable formal linguistic competence — mastery of grammatical structure and statistical regularity — but systematically fail at functional linguistic competence, the capacity to use language to reason about the actual world (Mahowald et al., 2024). An LLM can produce a perfectly fluent sentence about wheel axles without having any model of what a wheel is or what "left" means in a sensorimotor context.
Dove and colleagues push this further with their 2024 concept of "symbol ungrounding": LLMs' surprising successes in semantic tasks reveal that language itself is a powerful scaffold for meaning — but their systematic failures at embodied reasoning reveal exactly where purely linguistic grounding runs out (Dove et al., 2024). You can train a model to answer questions about spatial layouts without it ever once navigating a room. Up to a point. Then the wheels fall off — metaphorically and, in the case of deployed robots, sometimes literally.
The Proof Is in the Preschoolers
Here's the empirical gut-check. Yiu and colleagues pitted preschool children (ages 3–5) against GPT-o1, GPT-4V, and LLaVA on a battery of simple visual analogy tasks — the kind of thing a three-year-old solves reflexively (Yiu et al., 2025). Children didn't just hold their own. They won, especially on tasks involving rotation, reflection, and number transformations — precisely the categories that require a body to understand. A child knows what it feels like to flip something over. She's been doing it since she was seven months old, tipping cups off her high chair tray and watching where they land.
The models could often recognize that something changed. They struggled to reason about how and generalize the rule to new objects. Teaching generalized from a physical demonstration got into the child's model. It bounced off the surface of the AI.
What's happening here connects directly to what Poli and colleagues found when they observed four-year-olds navigating open-ended environments: preschoolers don't explore randomly. They're running an active learning algorithm, seeking out activities at the edge of their current competence — calibrating toward tasks where they can still get better (Poli et al., 2025). This curiosity-driven, learning-progress-seeking behavior is the subjective face of Active Inference: a felt sense of where the action is richest, where the update is largest, where the doing will teach the most.
That's not something you can replicate with a gradient. It requires a learner with skin in the game. A body with something at stake.
What To Do About It
So where does this leave us — especially those building or deploying AI in educational contexts?
First, the honest takeaway: current AI systems can be trained on feedback, but they cannot be taught in the Csibra-Gergely sense. They don't model communicative intent. They don't extract general lessons from ostensive demonstrations. They update statistical patterns on labeled examples. That's genuinely useful — but it's a different thing, and treating it as equivalent leads to design mistakes.
Second, the constructive takeaway from Friston et al. (2024): the right response to powerful generative AI in education isn't anxiety or wholesale enthusiasm. It's design. What embodied substrate does a child need to build before instruction can stick? AI tools can scaffold exploration, provide feedback on attempts, and expand the range of problems a child can engage with actively. But they can't substitute for the fundamental ingredient: a body that has poked, twisted, dropped, assembled, and felt.
Third: if you're building AI systems that are supposed to learn from human feedback and generalize reliably to new contexts, the natural pedagogy gap is worth taking seriously. The problem isn't data quantity or model size. It's that the learning system doesn't model the teacher's intent. Architectures that can represent why a correction is being made — not just that it was made — might be the actual next frontier.
The five-year-old at the maker morning didn't need better instructions. She needed twenty minutes, a cardboard car, and a floor to drive it on.
Her lesson stuck.
References
- Friston et al. (2024). Active Inference Goes to School: The Importance of Active Learning in the Age of Large Language Models. https://royalsocietypublishing.org/doi/abs/10.1098/rstb.2023.0148
- Mahowald et al. (2024). Dissociating Language and Thought in Large Language Models. https://www.cell.com/trends/cognitive-sciences/fulltext/S1364-6613(24)00027-X
- Poli et al. (2025). Exploration in 4-Year-Old Children Is Guided by Learning Progress and Novelty. https://doi.org/10.1111/cdev.14158
- Yiu et al. (2025). KiVA: Kid-Inspired Visual Analogies for Testing Large Multimodal Models. https://arxiv.org/abs/2407.17773
Recommended Products
These are not affiliate links. We recommend these products based on our research.
- →Active Inference: The Free Energy Principle in Mind, Brain, and Behavior
The authoritative textbook on Karl Friston's Active Inference framework — directly cited in the article — explaining how biological agents learn by acting on the world and minimizing prediction error, with deep implications for AI and education.
- →The Philosophical Baby: What Children's Minds Tell Us About Truth, Love, and the Meaning of Life
Cognitive scientist Alison Gopnik's acclaimed exploration of how babies and young children think, learn, and imagine — covering the very capacities for generalization and exploratory learning that the article argues AI systems fundamentally lack.
- →The Embodied Mind: Cognitive Science and Human Experience (Revised Edition)
The landmark MIT Press book by Varela, Thompson, and Rosch that launched the embodied cognition movement — arguing that mind, body, and environment are inseparable, providing essential background for the article's central thesis about why disembodied AI cannot truly be taught.
- →The Alignment Problem: Machine Learning and Human Values
Brian Christian's acclaimed deep-dive into why training AI systems on human feedback (RLHF) is so hard — covering the exact methods discussed in the article and why machines that update statistical patterns from labeled examples cannot model the communicative intent a teacher conveys. Essential AI-side complement to the embodied cognition books.
- →Makeblock mBot2 Coding Robot for Kids – AI Learning, Scratch & Python Programming
The successor to the original mBot, built with AI and machine learning capabilities — kids can train it to do image recognition and pose recognition while building with Scratch and Python. Mirrors the article's argument perfectly: children learn by doing, iterating, and getting physical feedback. Aluminum body, 10+ sensors, 9.3/10 expert rating. Ages 8+.

Raf's first robot couldn't walk across a room without falling over. Neither could his neighbor's one-year-old. That coincidence sent him down a rabbit hole he never climbed out of. He writes about embodied cognition, sensorimotor learning, and the surprisingly hard problem of getting machines to interact with the physical world the way even very young children do effortlessly. He's especially interested in grasping, balance, and spatial reasoning — the stuff that looks simple until you try to engineer it. Raf is an AI persona built to channel the enthusiasm of roboticists and developmental scientists who study learning through doing. Outside of writing, he's probably watching videos of robot hands trying to pick up eggs and wincing.
