
Scientists recreate Star Trek’s holodeck using ChatGPT and video game assets

In the realm of science fiction, the holodeck of Star Trek fame stands as a prime example of immersive virtual environments. (CREDIT: Yue Yang)


In the realm of science fiction, the holodeck of Star Trek fame stands as a prime example of immersive virtual environments, where characters could engage in various simulations with just a spoken command.


In our present reality, interactive virtual environments are used to train robots before they venture into the physical world, a process termed "Sim2Real."


 
 

However, the creation of these virtual environments has been labor-intensive, with artists spending significant time crafting each scene manually. The resulting scarcity of training environments poses a challenge for teaching robots to navigate the complexities of the real world.


Example outputs of HOLODECK, a system powered by a large language model, which can generate diverse types of environments (arcade, spa, museum), customize them for styles (Victorian-style), and understand fine-grained requirements ("has a cat," "fan of Star Wars"). (CREDIT: Yue Yang)


Yue Yang, a doctoral student at the University of Pennsylvania's Computer and Information Science (CIS) department, highlights this issue, emphasizing the time-consuming nature of manual environment creation.


 
 

Mark Yatskar and Chris Callison-Burch, professors in CIS, further elaborate on the importance of data for training AI systems, stressing the need for vast amounts of 3D environments to train embodied AI effectively.


To address this challenge, researchers at the University of Pennsylvania, in collaboration with Stanford University, the University of Washington, and the Allen Institute for Artificial Intelligence (AI2), have developed Holodeck.


 


 

Inspired by its Star Trek namesake, Holodeck employs artificial intelligence to generate a diverse array of interactive 3D environments, controlled through natural language commands.


Holodeck taps into the knowledge stored within large language models (LLMs), such as ChatGPT, leveraging the extensive textual data these models are trained on.


 
 

Yang explains that language serves as a concise representation of the world, allowing Holodeck to interpret user requests and create corresponding environments through structured queries.


Using everyday language, users can prompt Holodeck to generate a virtually infinite variety of 3D spaces, which creates new possibilities for training robots to navigate the world. (CREDIT: Yue Yang)


Similar to how Captain Picard might request a simulation of a speakeasy on Star Trek, researchers can instruct Holodeck to generate specific environments, like "a 1b1b apartment of a researcher who has a cat." Holodeck breaks down these requests into steps, creating floors, walls, furnishings, and other elements while maintaining coherence and realism.
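As a rough illustration of the pipeline described above (this is a hypothetical sketch, not Holodeck's actual code), a system like this first decomposes a natural-language request into an ordered series of generation steps, then attaches fine-grained constraints drawn from the request. In the real system each stage is driven by structured queries to an LLM; here a simple stub stands in for those queries:

```python
# Hypothetical sketch of a Holodeck-style scene pipeline. The real system
# issues structured queries to an LLM at each stage; this stub only mimics
# the floors -> walls -> furnishings ordering described in the article.
from dataclasses import dataclass, field


@dataclass
class ScenePlan:
    description: str
    steps: list = field(default_factory=list)


def plan_scene(request: str) -> ScenePlan:
    """Break a natural-language request into ordered generation steps."""
    plan = ScenePlan(description=request)
    plan.steps.append("generate floor plan")
    plan.steps.append("place walls and doorways")
    plan.steps.append("select and arrange furnishings")
    # Fine-grained requirements (e.g. "has a cat") become extra constraints
    # appended after the base structure is laid out.
    if "cat" in request.lower():
        plan.steps.append("add cat-related objects")
    return plan


plan = plan_scene("a 1b1b apartment of a researcher who has a cat")
for step in plan.steps:
    print(step)
```

The point of the ordering is coherence: structural elements (floors, walls) are fixed before furnishings are selected, so every later choice can be checked against the layout that already exists.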


 
 

To evaluate Holodeck's performance, the researchers conducted studies involving Penn Engineering students. Participants were asked to compare scenes generated by Holodeck with those from an earlier tool called ProcTHOR, which relied on human-created rules rather than AI-generated text.


Across various criteria, including asset selection, layout coherence, and overall preference, Holodeck consistently received higher ratings.


Furthermore, Holodeck demonstrated its capability to generate a wide range of environments, from apartment interiors to stores, public spaces, and offices. In comparisons with ProcTHOR, human evaluators consistently favored Holodeck's outputs, indicating its effectiveness in creating diverse and realistic scenes.


 
 

The researchers also utilized Holodeck-generated scenes to fine-tune an embodied AI agent, aiming to enhance its ability to navigate unfamiliar spaces safely. Results showed significant improvements in the agent's performance across different types of virtual environments, such as offices, daycares, gyms, and arcades.


For example, when pre-trained using ProcTHOR, the agent succeeded in finding a piano in a music room only about 6% of the time. However, after fine-tuning with music rooms generated by Holodeck, the success rate increased to over 30%, showcasing the efficacy of Holodeck in enhancing the agent's navigation skills.


Yatskar emphasizes the importance of exploring diverse environments beyond residential spaces, underscoring Holodeck's role in efficiently generating a multitude of environments for robot training.


 
 

With its ability to create realistic and varied scenes, Holodeck offers a promising solution to the longstanding challenge of training robots to navigate the complexities of the real world.


Additional co-authors include Fan-Yun Sun, Jiajun Wu, and Nick Haber at Stanford; Ranjay Krishna at the University of Washington; Luca Weihs, Eli Vanderbilt, Alvaro Herrasti, Winson Han, Aniruddha Kembhavi, and Christopher Clark at AI2.






For more science and technology news stories, check out our New Innovations section at The Brighter Side of News.


 

Note: Materials provided above by The Brighter Side of News. Content may be edited for style and length.


 
 



 

