[Nov. 30, 2023: JD Shavit, The Brighter Side of News]
Semantic hearing applications. Users wearing binaural headsets can block out street chatter and focus on the sounds of birds chirping. (CREDIT: ACM)
In the realm of noise-canceling headphones, a technological breakthrough is on the horizon, one that promises to grant users unprecedented control over their auditory environment. Most users of noise-canceling headphones are familiar with the transformative power of silencing unwanted noises, but what if you could choose which sounds to let in?
A team of researchers from the University of Washington has embarked on a groundbreaking journey to redefine noise-canceling technology. They have developed deep-learning algorithms that enable users to handpick the sounds they want to hear in real time, introducing a concept they've dubbed "semantic hearing."
The conventional noise-canceling experience typically involves muffling or eliminating ambient noises, a helpful feature in various scenarios. For instance, one may desire to eliminate the incessant honking of car horns when working indoors but wouldn't want to miss the bustling sounds of a busy street while walking.
Unfortunately, current noise-canceling headphones lack the ability to selectively cancel specific sounds according to the user's preferences. This limitation has sparked a quest for innovation, leading to the development of semantic hearing.
At the core of this revolutionary system is the integration of deep-learning algorithms that empower users to customize their auditory experience. Semantic hearing headphones function by streaming captured audio to a connected smartphone, which then orchestrates the cancellation of all environmental sounds. Users can interact with the system through either voice commands or a smartphone app, allowing them to select their preferred sounds from a predefined set of 20 sound classes.
These classes encompass a wide array of sounds, ranging from sirens and baby cries to human speech, vacuum cleaners, and bird chirps. Only the chosen sounds will be relayed through the headphones, offering users a tailored and immersive listening experience.
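Conceptually, the selection step described above amounts to keeping only the user's chosen sound classes and summing them back into one signal. The sketch below illustrates that idea in Python; the class names and the `mix_selected` function are illustrative assumptions, not the researchers' actual implementation, which relies on a neural source-separation network.

```python
# Hypothetical sketch of the selection step: given per-class audio streams
# (as a dict of class name -> sample list, e.g. from a separation model),
# pass through only the classes the user picked and sum them into one output.
# SOUND_CLASSES and mix_selected are illustrative names, not from the paper.

SOUND_CLASSES = {
    "siren", "baby_cry", "speech", "vacuum_cleaner", "bird_chirp",
    # ... the full system supports 20 classes in total
}

def mix_selected(separated: dict[str, list[float]],
                 selected: set[str]) -> list[float]:
    """Sum only the user-selected class streams into one output signal."""
    unknown = selected - SOUND_CLASSES
    if unknown:
        raise ValueError(f"unsupported sound classes: {unknown}")
    length = max(len(samples) for samples in separated.values())
    out = [0.0] * length
    for name, samples in separated.items():
        if name in selected:  # discard every class the user did not choose
            for i, x in enumerate(samples):
                out[i] += x
    return out
```

In the real system this gating happens inside the neural network rather than as a post-hoc mix, but the user-facing contract is the same: everything outside the selected set is silenced.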
The University of Washington research team unveiled their findings at the UIST '23 conference held in San Francisco. As they continue to refine their technology, the researchers have ambitious plans to release a commercial version of the semantic hearing system in the future.
Shyam Gollakota, the senior author of the study and a professor at the Paul G. Allen School of Computer Science & Engineering at the University of Washington, emphasized the complexity of achieving true semantic hearing. He stated, "Understanding what a bird sounds like and extracting it from all other sounds in an environment requires real-time intelligence that today's noise-canceling headphones haven't achieved. The challenge is that the sounds headphone wearers hear need to sync with their visual senses. You can't be hearing someone's voice two seconds after they talk to you. This means the neural algorithms must process sounds in under a hundredth of a second."
The unique demands of semantic hearing necessitate rapid processing of sounds, a task that cannot solely rely on cloud servers due to time constraints. Consequently, the semantic hearing system operates primarily on a device such as a connected smartphone, ensuring real-time responsiveness. Furthermore, the system must retain and incorporate the delays and spatial cues associated with sounds arriving from various directions into users' ears, preserving the meaningful perception of their auditory environment.
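To see why cloud processing is ruled out, it helps to translate Gollakota's "under a hundredth of a second" into audio samples. The back-of-envelope calculation below assumes a 44.1 kHz sample rate (an assumption for illustration; the paper's exact rate may differ): a sub-10 ms budget leaves room for only a few hundred samples of buffering per frame, before any network processing time is even counted.

```python
# Back-of-envelope latency budget for the on-device constraint.
# Sample rate is an assumed value for illustration.
SAMPLE_RATE_HZ = 44_100
LATENCY_BUDGET_S = 0.010  # "under a hundredth of a second"

# Maximum audio the system could buffer per frame within the budget.
max_samples = int(SAMPLE_RATE_HZ * LATENCY_BUDGET_S)
print(max_samples)  # 441 samples, i.e. 10 ms of audio
```

A round trip to a cloud server alone typically costs tens of milliseconds, which is why the neural processing must run on the connected smartphone itself.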
Semantic hearing applications. Users wearing binaural headsets can attend to speech while blocking out only the vacuum cleaner noise. (CREDIT: ACM)
In extensive testing conducted in diverse environments, including offices, city streets, and tranquil parks, the semantic hearing system exhibited remarkable capabilities. It successfully isolated and extracted target sounds such as sirens, bird chirps, alarms, and more while effectively eliminating all other real-world noise. In a survey involving 22 participants who rated the system's audio output for the target sounds, the results were overwhelmingly positive. On average, participants reported an improvement in audio quality compared to the original recordings.
Despite these promising results, the system faced challenges when distinguishing between sounds that shared similar properties, such as vocal music and human speech. The researchers acknowledged this limitation and suggested that further refining the system through training on a more extensive dataset of real-world sounds could enhance its performance in such scenarios.
Semantic hearing applications. Users wearing binaural headsets can tune out construction noise yet hear car honks. (CREDIT: ACM)
As the development of semantic hearing technology progresses, it holds the potential to transform the way we engage with our auditory surroundings. Imagine walking through a bustling city street while retaining the ability to isolate the melodious chirping of birds or selectively tune in to a captivating conversation. Semantic hearing could revolutionize not only the field of noise-canceling headphones but also how we interact with our auditory world.
The impact of semantic hearing extends far beyond personal listening preferences. It has the potential to benefit individuals with specific auditory needs, such as those who rely on hearing aids or assistive listening devices. By offering a level of sound customization previously thought impossible, semantic hearing could enhance the quality of life for many individuals.
Semantic hearing applications. A meditating user could use headsets to block out traffic noise outside yet hear alarm clock sounds. (CREDIT: ACM)
As the technology continues to evolve and overcome its remaining challenges, the day when we can all choose the sounds we want to hear in our daily lives may be closer than we think. Semantic hearing promises a world where we are the conductors of our auditory symphony, allowing us to embrace the soundscape that resonates with us most.
Additional co-authors on the paper were Bandhav Veluri and Malek Itani, both UW doctoral students in the Allen School; Justin Chan, who completed this research as a doctoral student in the Allen School and is now at Carnegie Mellon University; and Takuya Yoshioka, director of research at AssemblyAI.