AI and eye tracking could transform how kids learn online
New research shows how AI and eye tracking pinpoint the exact moments when children grasp or lose concepts in educational videos.

Nearly 200 children took part in a study using AI and eye tracking to uncover seven key video moments tied to learning. (CREDIT: Shutterstock)
Children’s eyes may hold the secret to how they learn from videos, and new research suggests artificial intelligence could unlock it. By pairing eye-tracking technology with advanced neural networks, researchers are starting to identify the exact moments when young viewers grasp new concepts—or when those ideas slip away.
The findings point toward a future where video lessons adjust in real time to match a learner’s needs, changing how science is taught in classrooms, homes, and beyond.
Tracking young eyes in action
Researchers at Ohio State University conducted a large-scale experiment involving 197 children between ages 4 and 8. Each child watched a four-minute video stitched together from the YouTube shows “SciShow Kids” and “Learn Bright.” The lesson focused on camouflage in animals, a concept that blends visual cues with scientific explanation.
Before watching, children were asked questions to measure their baseline knowledge. Afterward, they answered similar questions to test what they had learned. During the lesson, a high-precision eye tracker recorded where and how long each child looked at the screen. Lead author Jason Coronel, an associate professor of communication, explained why that step matters. “Eye-tracking allowed us to measure attention to the video in real time, which is critical for learning,” he said.
The team then fed this detailed data into two different artificial neural networks. One followed a standard approach, while the other included a theory-driven design that accounted for how new information interacts with older material over time. The theory-based model proved more accurate, especially in predicting which children would correctly answer questions about camouflage.
Key learning moments revealed
The AI analysis revealed that certain parts of the video had an outsized influence on whether children understood the lesson. Seven “key moments” stood out, each tied to noticeable shifts in eye movement and linked to comprehension. One of the strongest predictors came early on, when the host asked viewers to help her find her cartoon sidekick, Squeaks.
Children who followed the cue with focused attention were more likely to grasp harder concepts later in the video. As the study notes, “Our machine learning and eye-tracking data indicate that children’s eye movements during this early moment are among the strongest predictors of their overall understanding of the video.”
Another powerful moment occurred when the narrator defined camouflage explicitly and paired the explanation with the printed word. These transitions, known as “event boundaries,” mark points when one meaningful segment ends and another begins. According to co-author Alex Bonus, who specializes in children’s media, the seven key points aligned almost perfectly with these natural boundaries.
Why timing matters for learning
Learning is not just about receiving information—it is about how the brain organizes it over time. Coronel and his colleagues framed their machine learning work around this idea of temporal interdependence. They argue that when past and new information connect seamlessly, comprehension improves.
This theory-guided approach allows AI models to detect more than just surface-level viewing patterns. Instead, they highlight how shifts in attention at critical moments shape later understanding. The results hint at the potential of designing videos with well-placed boundaries that keep children engaged and better prepared for more complex ideas.
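The study's models themselves are not published as code, but the intuition behind temporal interdependence can be sketched in a few lines. The toy comparison below is illustrative only — the attention values, carryover weight, and scoring functions are invented, not taken from the paper. It contrasts a scorer that treats each moment of attention independently with one that lets earlier attention carry forward in time:

```python
def independent_score(attention):
    """Score each video segment's attention on its own, then average.
    This model cannot distinguish *when* a viewer paid attention."""
    return sum(attention) / len(attention)

def interdependent_score(attention, carryover=0.6):
    """Let attention carry forward across segments (a simple exponential
    moving average), so a strong early fixation — say, at an event
    boundary — keeps boosting later segments. All weights are invented
    for illustration."""
    state = 0.0
    scores = []
    for a in attention:
        state = carryover * state + (1 - carryover) * a
        scores.append(state)
    return sum(scores) / len(scores)

# Two hypothetical viewing patterns with identical total attention:
early_focus = [0.9, 0.8, 0.3, 0.2]   # locked in during the opening cue
late_focus  = [0.2, 0.3, 0.8, 0.9]   # only tuned in near the end

# The independent scorer rates them identically; the interdependent
# scorer rates early_focus higher, because early attention propagates
# forward through the rest of the lesson.
```

Because the second scorer carries state across segments, timing matters: two viewers with the same total attention get different scores depending on when that attention occurred — a crude analogue of why a child's fixations during the video's opening cue could predict comprehension of harder concepts later on.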
Still, Coronel stressed that the work is preliminary. “But this method has the potential to help experts design messages with event boundaries that enhance learning,” he said.
A glimpse into personalized video education
The research, published in the Journal of Communication, comes as both eye-tracking and artificial intelligence are becoming more accessible. Once reserved for high-end labs, eye-tracking hardware is now cheaper and easier to use in classrooms or even at home.
At the same time, machine learning models are growing more sophisticated in processing streams of data like gaze patterns. Together, these advances could change how children learn from videos. Currently, it can take teachers days or weeks to notice if a student misunderstands a lesson. By then, the class may have moved on, leaving gaps in knowledge.
Coronel envisions a different path: “Our ultimate goal is to build an AI system that can tell in real time whether a viewer is understanding or not understanding what they are seeing in an educational video. That would give us the opportunity to dynamically adjust the content for an individual person to help them understand what is being taught.”
This might mean a video automatically offers a second example, changes the pace, or switches to a new explanation style the moment a learner falters. Such adaptability could make instruction more personal, efficient, and scalable across different classrooms and learning environments.
The road ahead
Much remains unknown about the exact cognitive processes behind these key video moments. Researchers still need to learn why attention spikes at certain times and how that attention translates into stronger memory or problem-solving skills. The Ohio State team plans to expand on this work, combining more advanced models with longer and more varied lessons.
For now, the study provides a glimpse into the next stage of educational technology, where real-time signals from the body guide how information is delivered. With nearly 200 children showing consistent patterns in a short video, the results suggest that young viewers’ eyes may indeed reveal when learning is—or is not—taking place.
As Coronel put it, “Imagine a future where eye tracking can tell instantaneously when a person is not understanding a concept in a video lesson, and AI dynamically changes the content to help. Maybe the video can offer a different example or way of explaining the concept. This could make instruction more personalized, effective and scalable.”
Note: The article above was provided by The Brighter Side of News.

Joshua Shavit
Science & Technology Writer
Joshua Shavit is a Los Angeles-based science and technology writer with a passion for exploring the breakthroughs shaping the future. As a co-founder of The Brighter Side of News, he focuses on positive and transformative advancements in AI, technology, physics, engineering, robotics and space science. Joshua is currently working towards a Bachelor of Science in Business and Industrial Engineering at the University of California, Berkeley. He combines his academic background with a talent for storytelling, making complex scientific discoveries engaging and accessible. His work highlights the innovators behind the ideas, bringing readers closer to the people driving progress.