
Scientists Train AI Using Headcam Footage From Human Toddler


Researchers have not only built an AI child, but are now training AI using headcam footage from a human baby as well.

In a press release, New York University announced that its data science researchers had strapped a camera to the head of a real, live, human toddler for 18 months to see how much an AI model could learn from it.

Most large language models (LLMs), like OpenAI’s GPT-4 and its competitors, are trained on “astronomical amounts of language input” that are many times larger than what infants receive when learning to speak a language during the first years of their lives.

Despite that data gap, the systems the NYU scientists trained on the baby cam data were in fact able to learn a “substantial number of words and concepts” — and all from only about one percent of the child’s total waking hours between the ages of six months and two years.

Translation: using only a fraction of the data usually required to train an AI model, the researchers were able to teach their system to learn like a baby. That sort of efficiency is enough to make most Silicon Valley types salivate, given their understandable concerns about the enormous amounts of energy, water, and data needed to train AI models.

As the MIT Technology Review reports, the baby in question is named Sam, and he lives in Australia with his parents and two cats. The project resulted in roughly 61 hours of footage shot while Sam wore the lightweight camera headset on and off for a year and a half — and ultimately in a study published last week in the journal Science.

“This data set was totally unique,” NYU computational cognitive scientist Brenden Lake told MIT Tech. “It’s the best window we’ve ever had into what a single child has access to.”

After feeding the babycam footage into their AI model, the researchers found that it could perform a feat of human cognition: matching up words with the objects they represent — a chaotic undertaking, as MIT Tech reports, because the footage was often cluttered with many objects even when a speaker could be heard referring to just one of them.
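For readers curious how a model can pair words with objects from messy, co-occurring data, a common approach is contrastive learning: embeddings of words and video frames that appear together are pulled closer, while mismatched pairs are pushed apart. The sketch below is a generic, hypothetical illustration of that idea only, not the NYU team's actual model; the class name, dimensions, and random stand-in data are invented for illustration, and it assumes PyTorch with pre-extracted frame features.

```python
# Hypothetical sketch of contrastive word-object association (not the study's code).
# Words and headcam frames are mapped into a shared embedding space; co-occurring
# pairs are rewarded for high similarity, mismatched pairs for low similarity.

import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyWordObjectMatcher(nn.Module):
    def __init__(self, vocab_size, image_dim, embed_dim=64):
        super().__init__()
        self.word_embed = nn.Embedding(vocab_size, embed_dim)  # word id -> vector
        self.image_proj = nn.Linear(image_dim, embed_dim)      # frame features -> vector

    def forward(self, word_ids, image_feats):
        w = F.normalize(self.word_embed(word_ids), dim=-1)
        v = F.normalize(self.image_proj(image_feats), dim=-1)
        return w @ v.T  # cosine-similarity matrix: words x frames

model = ToyWordObjectMatcher(vocab_size=1000, image_dim=512)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Stand-in batch: 8 transcribed words paired with 8 frame feature vectors.
word_ids = torch.randint(0, 1000, (8,))
image_feats = torch.randn(8, 512)

# True pairs lie on the diagonal, so a symmetric cross-entropy loss
# teaches the model to match each word to its co-occurring frame.
optimizer.zero_grad()
logits = model(word_ids, image_feats)
targets = torch.arange(8)
loss = (F.cross_entropy(logits, targets) + F.cross_entropy(logits.T, targets)) / 2
loss.backward()
optimizer.step()
```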

Curiously enough, researchers at Indiana University conducted remarkably similar experiments with a baby wearing a headcam and released their preliminary results last fall — though in that case, they were more interested in the child’s visual learning and how AI could help understand it, rather than in language processing.

Even more curiously, it appears that NYU funded its research into baby-AI learning using a grant from the Pentagon’s Defense Advanced Research Projects Agency (DARPA), which has invested billions of dollars into AI.

In fact, NYU’s Tandon School of Engineering, which was not a part of the babycam research, received a $5 million contract to help DARPA develop an “AI-driven augmented reality assistant” back in 2021.

It’s unclear how much DARPA (or the National Science Foundation, which also helped fund the AI babycam) gave NYU for this specific experiment.

All told, this is a pretty fascinating study — though one researcher did point out an important caveat: even if AI successfully analyzed the footage of Sam working to understand the world, the video itself was a product of his tiny human brain.

“I don’t think the study would have worked if babies hadn’t created the data set that the neural net was learning from,” Skidmore developmental psychologist Jess Sullivan, who helped arrange the data collection, told MIT Tech.

More on AI babies: Scholar Creates AI-Powered Simulated Child
