AI turns everyday videos into interactive 3D worlds for games and robots

A simple phone video of your kitchen can now become an interactive, photorealistic 3D world—thanks to new AI that builds digital twins.

New AI tool turns basic room videos into interactive 3D simulations for games and robot training. (CREDIT: Wei-Chiu Ma)

Creating a lifelike digital replica of a physical space no longer requires specialized equipment or labor-intensive manual mapping. A new system, DRAWER, developed by researchers at Cornell University, can now build realistic, interactive 3D environments from just a short, casual video. With it, a simple scan of your kitchen or office turns into a high-fidelity simulation where drawers open, cabinets swing, and objects move—an innovation that opens new doors for gaming, robotics, and digital design.

A Seamless Leap From Real to Virtual

The process begins with something as simple as recording a video using a smartphone. “Our input is just a video that you casually capture in the kitchen. You don’t need to interact with any cabinet doors or with the objects,” explained Hongchi Xia, a Ph.D. student in computer science who presented the research at the IEEE/CVF Conference on Computer Vision and Pattern Recognition in Nashville.

Instead of relying on specialized sensors or staged interactions, DRAWER works by analyzing the video’s visual data to reconstruct both the look and feel of the scene. It identifies the shapes, textures, and dimensions of all visible objects. Using this, it builds a 3D model that not only appears lifelike, but also responds to user interaction. That means you can virtually open a cupboard or pull out a drawer—interactions that previous reconstruction methods couldn’t easily support.

Assistant professor Wei-Chiu Ma at Cornell’s Ann S. Bowers College of Computing and Information Science. (CREDIT: Wei-Chiu Ma)

While other systems have rendered digital scenes from various camera angles, they often lacked interactivity. DRAWER, led by assistant professor Wei-Chiu Ma at Cornell’s Ann S. Bowers College of Computing and Information Science, breaks new ground by combining detailed realism with responsive movement.

“Existing techniques, although they allow you to synthesize what the world looks like from different viewpoints, sometimes lack this capability of being immersive, where you can really interact with the scene,” Ma said.

What Makes DRAWER Different

At the core of this breakthrough are two main components. First is the reconstruction module, which builds a digital version of the room with fine-grained geometric detail. Second is the articulation module, which detects which parts of the environment are moveable—like doors or drawers—and figures out how they’re supposed to move.

To create something truly immersive, DRAWER also includes a perception module that recognizes different articulation types and hinge locations.

For example, it knows a fridge door should swing outward and that a drawer slides horizontally. Even hidden parts of the space, like the inside of a cabinet, are generated using trained AI models. This level of scene understanding enables the simulation to not just look real but feel real.
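The articulation metadata described above — what kind of joint a part has, where its hinge or slide axis sits, and how far it can move — can be pictured as a small data structure. This is an illustrative sketch only; the class names, fields, and numbers below are assumptions for exposition, not DRAWER's actual representation.

```python
from dataclasses import dataclass
from enum import Enum

# Hypothetical structure for the kind of per-part articulation data a
# perception module would need to output. Names and values are
# illustrative, not taken from the DRAWER paper.

class JointType(Enum):
    REVOLUTE = "revolute"    # swings around a hinge axis (e.g., a fridge door)
    PRISMATIC = "prismatic"  # slides along an axis (e.g., a drawer)

@dataclass
class ArticulatedPart:
    name: str
    joint: JointType
    axis: tuple       # hinge or slide direction in scene coordinates
    origin: tuple     # a point the axis passes through
    limit: tuple      # motion range (radians for hinges, meters for slides)

# A fridge door swings outward around a vertical hinge line...
fridge_door = ArticulatedPart("fridge_door", JointType.REVOLUTE,
                              axis=(0.0, 0.0, 1.0), origin=(1.2, 0.4, 0.0),
                              limit=(0.0, 1.9))

# ...while a drawer slides horizontally along its pull direction.
drawer = ArticulatedPart("drawer", JointType.PRISMATIC,
                         axis=(1.0, 0.0, 0.0), origin=(0.3, 0.8, 0.6),
                         limit=(0.0, 0.45))
```

Once each movable part carries a joint type, an axis, and a motion limit like this, a game engine or physics simulator can animate it in response to user interaction.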

One of the biggest challenges wasn’t developing these individual modules, but integrating them into a single, working system. “I just hold my iPhone – you don’t need an advanced video device or expensive camera,” said Xia. But combining all the AI parts into a functional framework required months of refinement.

Once the system was working, Xia used DRAWER to recreate rooms including a bathroom, kitchen, and his own office. These were not just static displays. In a demonstration, he turned the digital kitchen into a simple video game using Unreal Engine. In it, users could toss virtual balls to knock over items like kettles and soap bottles—proof that the generated scenes respond naturally to dynamic interaction.

Original rendering of a video of a kitchen. (CREDIT: Wei-Chiu Ma, et al.)

A Tool for More Than Just Games

While the implications for video game design are obvious—simpler, faster creation of realistic and interactive environments—DRAWER’s value doesn’t end there. The system also offers powerful potential for robotics. One method for teaching robots to function in new environments is called real-to-sim-to-real transfer. A robot first learns tasks in a virtual model of the space before applying that learning in the actual environment.

Using DRAWER, Ma’s team trained a robotic arm entirely within the digital kitchen. After learning how to put objects away in virtual drawers, the robot was able to perform the same actions accurately in the real world. Because the simulation is nearly identical to the real scene, the knowledge transfers smoothly.
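The real-to-sim-to-real idea can be sketched as a loop: practice a task against the digital twin, then deploy whatever worked. The toy example below assumes a drastically simplified "drawer" environment and a trivial random-search learner; it illustrates the transfer pattern only, not the team's actual robot-training setup.

```python
import random

# Toy stand-in for a simulated drawer: a pull "succeeds" when the chosen
# pull distance falls within the drawer's travel range. All names and
# numbers here are illustrative assumptions.

DRAWER_TRAVEL = (0.30, 0.45)  # meters the simulated drawer can open

def simulate_pull(distance):
    """Return True if a pull of this distance works in the simulation."""
    return DRAWER_TRAVEL[0] <= distance <= DRAWER_TRAVEL[1]

def train_in_sim(episodes=500, seed=0):
    """Crude 'training': sample pull distances in sim, keep one that works."""
    rng = random.Random(seed)
    for _ in range(episodes):
        candidate = rng.uniform(0.0, 0.6)
        if simulate_pull(candidate):
            return candidate  # this learned behavior is what gets
    return None               # deployed on the physical robot

policy = train_in_sim()
```

Because the simulation closely mirrors the real scene, a pull distance learned this way should also work on the physical drawer — that correspondence is the whole point of training against a faithful digital twin.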

This has huge implications for cost and safety. Training robots directly in physical settings can be expensive and risky, especially when mistakes might break things—or the robots themselves. Simulated learning reduces both cost and risk, making automation more accessible.

Final rendering of the kitchen. (CREDIT: Wei-Chiu Ma, et al.)

One future use case? Ordering a robot online, filming your home with a smartphone, and uploading the video. The company could then simulate your home and pre-train the robot to navigate it before it even arrives on your doorstep.

Toward a Digital Copy of Everything

Right now, DRAWER focuses on rigid objects—those that don’t bend or change shape, like a pot or a drawer. The researchers plan to expand it to include soft or deformable items like fabric, or even dynamic surfaces like mirrors and windows. These elements are far harder to model due to their unpredictable nature, but the team sees them as necessary steps toward a complete simulation of reality.

DRAWER currently models only single rooms, but work is underway to scale the system up to entire buildings. Eventually, it could be applied to outdoor environments too. That would enable uses in urban planning, smart farming, and disaster response, allowing decision-makers to test scenarios in virtual spaces that closely match the real world.

“Our final goal is to try to build a digital twin of everything in the world,” said Xia. That may sound like science fiction, but the team believes they’ve taken an essential first step. The more environments DRAWER can simulate, the more real-world tasks it can help with—from game design and robot training to architecture and beyond.

The use of generative AI is the key. These advances allow DRAWER to fill in the blanks where data is missing, guess what’s inside closed cabinets, and simulate realistic lighting and texture. That ability to "imagine" unseen parts of the world makes the digital twins feel truly alive.

Even as the system improves, Ma and Xia say they are focused on real-world usability. DRAWER doesn’t require powerful computers or exotic sensors. A smartphone and a bit of video are enough. That’s what makes it so promising—not just for researchers and engineers, but for everyone. As technology pushes forward, the gap between physical and virtual grows smaller. With DRAWER, that gap almost disappears.

The research findings are available online.

Note: The article above was provided by The Brighter Side of News.




Mac Oliveau
Science & Technology Writer | AI and Robotics Reporter

Mac Oliveau is a Los Angeles–based science and technology journalist for The Brighter Side of News, an online publication focused on uplifting, transformative stories from around the globe. Passionate about spotlighting groundbreaking discoveries and innovations, Mac covers a broad spectrum of topics—from medical breakthroughs and artificial intelligence to green tech and archeology. With a talent for making complex science clear and compelling, they connect readers to the advancements shaping a brighter, more hopeful future.