Groundbreaking new AI can help prevent car crashes before they happen

A new AI tool from Johns Hopkins University, SafeTraffic Copilot, predicts and explains car crashes. (CREDIT: Shutterstock)

Traffic crashes remain one of the most stubborn public safety challenges in the United States. In 2022, nearly 42,800 people died on American roads, despite decades of safer car designs, stricter laws, and awareness campaigns. Yet, traditional models that predict where and why crashes happen often fall short, treating accidents as isolated numbers rather than complex human events influenced by behavior, road design, and countless other factors.

Researchers at Johns Hopkins University have now taken a strikingly new approach. Their system, called SafeTraffic Copilot, uses artificial intelligence to understand, predict, and explain crashes more like a human analyst than a spreadsheet. The work shows how large language models—the same kind of AI behind today’s chatbots—can be adapted to reason about real-world road safety.

Turning Crash Reports into Human-Like Stories

The idea behind SafeTraffic Copilot is surprisingly intuitive: instead of feeding raw statistics into a computer, researchers turned crash data into stories. Each record, whether it involved speeding on a rainy highway or a fender-bender near a work zone, was rewritten in plain text. These descriptions included details about driver behavior, road layout, vehicle type, and environmental conditions.
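To make the idea concrete, here is a minimal sketch of how one structured crash record might be rewritten as plain text. The field names and the sentence template are hypothetical placeholders chosen for illustration; they are not taken from the SafeTraffic Event dataset.

```python
def record_to_narrative(record: dict) -> str:
    """Render one structured crash record as a short plain-text description.

    All field names used here are hypothetical, for illustration only.
    """
    parts = [
        f"A {record['vehicle_type']} was traveling on a {record['road_type']}",
        f"in {record['weather']} conditions.",
        f"Driver behavior: {record['driver_behavior']}.",
    ]
    # Only mention the work zone when the record flags one.
    if record.get("work_zone"):
        parts.append("The crash occurred near a work zone.")
    return " ".join(parts)


example = {
    "vehicle_type": "passenger car",
    "road_type": "highway",
    "weather": "rainy",
    "driver_behavior": "speeding",
    "work_zone": False,
}
narrative = record_to_narrative(example)
print(narrative)
```

A template like this keeps every field visible to the language model as ordinary words, which is the point of the approach described above.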

Overview of the proposed SafeTraffic Copilot. (CREDIT: Nature Communications)

The team collected more than 66,000 crash cases from Washington and Illinois, generating over 14 million words of textual data—essentially a massive digital library of accident narratives. Using this “SafeTraffic Event” dataset, the AI learned to predict three things: how many people were likely injured, how severe the crash was, and what type of collision occurred.

Instead of crunching numbers directly, the AI was prompted with a crash story and asked to generate a simple token like “<ONE>” for one injury or “<SEVERE>” for high-impact collisions. This change turned prediction into a reasoning task, one that mimics how humans might think through a scenario rather than rely purely on statistics.
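The prompt-and-token setup can be sketched roughly as follows. This is an assumption-laden illustration: only the tokens "<ONE>" and "<SEVERE>" appear in the article, so the rest of the label vocabulary, the prompt wording, and both function names are hypothetical.

```python
import re
from typing import Optional

# Hypothetical label vocabulary; the article only mentions "<ONE>" and "<SEVERE>".
INJURY_TOKENS = {"<ZERO>": 0, "<ONE>": 1, "<TWO>": 2}


def build_prompt(narrative: str) -> str:
    """Wrap a crash narrative in a question that asks for a single label token."""
    return (
        "Crash description: " + narrative + "\n"
        "How many people were injured? Answer with a single token."
    )


def parse_injury_token(generated: str) -> Optional[int]:
    """Return the injury count for the first known token in the output, else None."""
    for token in re.findall(r"<[A-Z]+>", generated):
        if token in INJURY_TOKENS:
            return INJURY_TOKENS[token]
    return None


print(parse_injury_token("<ONE>"))
```

Restricting the answer to a small set of tokens makes the model's free-form output easy to score like an ordinary classifier.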

Outperforming Traditional Models

When tested, SafeTraffic Copilot significantly outperformed conventional machine-learning models such as Random Forests and XGBoost. Its predictive performance, measured using the F1-score, improved by 33% to 46% across multiple prediction tasks. In plain terms, it was not only more precise but also more trustworthy.
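For readers unfamiliar with the metric, the F1-score is the harmonic mean of precision and recall. The sketch below computes it from raw counts. The article does not say whether the 33% to 46% gain is relative or absolute, so the `relative_improvement` helper shows just one possible reading, and the counts in the example are toy values, not results from the study.

```python
def f1_score(true_pos: int, false_pos: int, false_neg: int) -> float:
    """Harmonic mean of precision and recall, computed from raw counts."""
    if true_pos == 0:
        return 0.0
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


def relative_improvement(new_score: float, baseline_score: float) -> float:
    """Fractional gain of new_score over baseline_score (one possible reading)."""
    return (new_score - baseline_score) / baseline_score


# Toy counts for illustration only.
toy_f1 = f1_score(true_pos=8, false_pos=2, false_neg=2)
print(round(toy_f1, 3))
```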

When the AI was confident, with certainty above 60%, its predictions were right more than 70% of the time. In the most serious cases, like predicting fatal crashes, the system reached a precision of 97.6%.
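A check of this kind can be sketched as follows: keep only the predictions whose confidence exceeds a threshold, then measure how often those were correct. The function name and the sample pairs are made up for illustration and do not reproduce the study's evaluation.

```python
from typing import List, Optional, Tuple


def accuracy_above_threshold(
    predictions: List[Tuple[float, bool]], threshold: float = 0.6
) -> Optional[float]:
    """Share of correct predictions among those with confidence above threshold.

    Each item is a (confidence, was_correct) pair. Returns None when no
    prediction clears the threshold.
    """
    confident = [correct for conf, correct in predictions if conf > threshold]
    if not confident:
        return None
    return sum(confident) / len(confident)


# Made-up sample: four of the five predictions clear the 0.6 threshold.
sample = [(0.9, True), (0.7, True), (0.65, False), (0.4, False), (0.8, True)]
confident_accuracy = accuracy_above_threshold(sample)
print(confident_accuracy)
```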

The researchers also tested whether the model could work in unfamiliar territory. Even when applied to crash data from states it had never seen, like Maine and Ohio, the AI remained consistently accurate. Performance varied slightly by location and time period but generally stayed within 10–20% of the training results. That reliability suggests SafeTraffic Copilot could scale nationally, learning and adapting as new data arrives.

SafeTraffic Copilot crash outcomes prediction pipeline. (CREDIT: Nature Communications)

Peering Inside the “Black Box”

One of the biggest criticisms of artificial intelligence, especially large language models, is that their inner workings are opaque. You get an answer—but not always an explanation. To tackle that problem, the researchers built a special “attribution” feature into SafeTraffic Copilot.

This function highlights which parts of the crash description most influenced the model’s prediction, similar to assigning blame or weight to specific factors. The analysis revealed five key contributors that together explained nearly 80% of serious or fatal crash predictions: blood alcohol content, work zone presence, roadway type, user type (such as pedestrian or cyclist), and driver behavior.
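One way to read a figure like "five factors explained nearly 80%" is as a share of total attribution weight. The sketch below shows that arithmetic on abstract toy scores. It does not reproduce the researchers' attribution method, and the feature names and values are placeholders.

```python
from typing import Dict, List, Tuple


def top_k_share(attributions: Dict[str, float], k: int) -> Tuple[List[str], float]:
    """Return the k features with the largest absolute attribution and the
    fraction of total absolute attribution they account for."""
    total = sum(abs(score) for score in attributions.values())
    ranked = sorted(attributions.items(), key=lambda item: abs(item[1]), reverse=True)
    top = ranked[:k]
    share = sum(abs(score) for _, score in top) / total
    return [name for name, _ in top], share


# Abstract toy scores, not values from the study.
toy_scores = {"feature_a": 4.0, "feature_b": -3.0, "feature_c": 2.0, "feature_d": 1.0}
top_names, top_share = top_k_share(toy_scores, k=2)
print(top_names, top_share)
```

Using absolute values treats factors that push a prediction up or down as equally influential, which is a design choice rather than something the article specifies.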

Alcohol stood out as the strongest predictor, accounting for about a quarter of all serious crash attributions. Even blood alcohol levels below the legal limit increased risk noticeably. Work zones, meanwhile, became far more dangerous when alcohol was involved.

Aggressive driving and impairment-related actions also played an outsized role. The data showed that combining multiple risky elements—like drinking, speeding, and driving through construction—dramatically increased the likelihood of a serious crash.

As senior author Hao “Frank” Yang, a professor of civil and systems engineering at Johns Hopkins, put it, “Car crashes in the U.S. continue to increase, despite decades of countermeasures. With SafeTraffic Copilot, our goal is to simplify this complexity and give policymakers insights they can act on.”

SafeTraffic LLM provides predictions with trustworthiness. (CREDIT: Nature Communications)

A Smarter, More Transparent Kind of AI

Yang and his team describe the system not as a replacement for human experts but as a “copilot.” It can process vast amounts of crash data, identify patterns invisible to humans, and quantify its own confidence level. That last part matters: the AI can tell engineers or policymakers not only what it predicts, but how sure it is—something most algorithms can’t do.

“By reframing crash prediction as a reasoning task and using large language models to integrate written and visual data, stakeholders can move from coarse, aggregate statistics to a fine-tuned understanding of what causes specific crashes,” Yang said.

The project’s co-authors include Hongru Du from the University of Virginia and Johns Hopkins doctoral students Yang Zhao, Pu Wang, and Yibo Zhao.

Limitations and the Road Ahead

Despite its promise, SafeTraffic Copilot isn’t flawless. Converting images, like satellite views, into text loses some visual information. Future versions could integrate image-processing AI to handle these data more directly. Training such models also demands immense computing power, which could make deployment costly for smaller agencies.

Single case feature-attribution results for Severity task. (CREDIT: Nature Communications)

Even so, the system represents a leap toward data-driven traffic management that balances accuracy with accountability. Because SafeTraffic Copilot provides explanations alongside predictions, it gives policymakers confidence to act on its insights—whether by redesigning intersections, improving work zone safety, or targeting drunk driving campaigns more precisely.

Yang and his colleagues view the technology as a foundation for how AI could support decision-making in other high-stakes fields such as public health and emergency response. “The central focus of our ongoing research is to find the best way to combine the strengths of humans and AI so that decisions are not only data-driven, but also transparent, accountable, and aligned with societal values,” he said.

Practical Implications of the Research

SafeTraffic Copilot could change how cities and states approach traffic safety. Instead of reacting after crashes happen, planners could proactively identify high-risk areas and address the underlying causes. The model’s ability to analyze combinations of factors—such as alcohol use in work zones or aggressive driving on rural highways—offers insight that raw statistics often miss.

For transportation engineers, it means smarter infrastructure design guided by real-time reasoning rather than historical averages. For law enforcement and policymakers, it means campaigns and interventions tailored to the situations that most often lead to tragedy. And for everyday drivers, it may someday lead to safer roads, where data-driven systems quietly help prevent accidents before they happen.

Research findings are available online in the journal Nature Communications.

Shy Cohen
Science & Technology Writer

Shy Cohen is a Washington-based science and technology writer covering advances in AI, biotech, and beyond. He reports news and writes plain-language explainers that analyze how technological breakthroughs affect readers and society. His work focuses on turning complex research and fast-moving developments into clear, engaging stories. Shy draws on decades of experience, including long tenures at Microsoft and his independent consulting practice, to bridge engineering, product, and business perspectives. He has crafted technical narratives, multi-dimensional due-diligence reports, and executive-level briefs, experience that informs his source-driven journalism and rigorous fact-checking. He studied at the Technion – Israel Institute of Technology and brings a methodical, reader-first approach to research, interviews, and verification. Comfortable with data and documentation, he distills jargon into crisp prose without sacrificing nuance.