Researchers at the University of Washington have come up with a cool AI system for noise-canceling headphones.
This system can zero in on and boost a single person’s voice in a noisy setting, just by having the user look at that person. So, if you’re trying to listen to someone in a crowd, these headphones will help you focus on that specific person’s voice.
Target Speech Hearing System
The Target Speech Hearing (TSH) system is an AI-powered device that filters headphone audio so the wearer hears the one speaker they have chosen.
The team presented the technology at the ACM CHI Conference on Human Factors in Computing Systems. It’s not yet available for purchase, but the code is available on GitHub for others to explore and develop further.
- Building on earlier work in “semantic hearing,” the TSH system allows users to focus on specific sounds while blocking out others.
- Currently, the system can only enroll one speaker at a time and requires that speaker to be the loudest voice during enrollment. The researchers aim to extend the technology to earbuds and hearing aids.
How TSH Technology Works
The TSH system enhances regular headphones by adding microphones and an AI neural network. Here’s how it works:
- To lock onto a specific speaker’s voice, press a button on the headphones while looking at the person for three to five seconds. This is the “enrollment” phase, during which the headphones capture the speaker’s voice.
- The AI then analyzes the captured signals in real time to learn the speaker’s unique vocal characteristics. This information is passed to a second neural network, which continuously isolates the speaker’s voice from background noise.
Once the system is set up, it can maintain focus on the speaker’s voice, letting you hear them clearly even if you move around or look away.
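The two-stage flow described above (enroll once, then keep filtering with what was learned) can be sketched in Python. This is only a toy illustration, not the researchers’ code: the class name `ToyTargetSpeechHearing`, the 16 kHz sample rate, the 512-sample chunk size, and the idea of using an averaged spectrum as the speaker “profile” are all assumptions made for the example, and a fixed spectral mask stands in for the neural networks.

```python
import numpy as np

SAMPLE_RATE = 16_000  # assumed sample rate in Hz
CHUNK = 512           # assumed number of samples processed at a time


def spectral_profile(audio: np.ndarray) -> np.ndarray:
    """Average magnitude spectrum of a clip, scaled to unit length."""
    n_frames = len(audio) // CHUNK
    frames = audio[: n_frames * CHUNK].reshape(n_frames, CHUNK)
    magnitudes = np.abs(np.fft.rfft(frames * np.hanning(CHUNK), axis=1))
    profile = magnitudes.mean(axis=0)
    return profile / (np.linalg.norm(profile) + 1e-12)


class ToyTargetSpeechHearing:
    """Hypothetical stand-in for the enroll-then-isolate pipeline."""

    def __init__(self) -> None:
        self.profile = None  # set by enroll()

    def enroll(self, clip: np.ndarray) -> None:
        """Stage 1: summarise a few seconds of audio captured while the
        wearer looks at the speaker. The loudest voice dominates the
        average, which is why the target must be loudest here."""
        if len(clip) < 3 * SAMPLE_RATE:
            raise ValueError("enrollment needs at least three seconds of audio")
        self.profile = spectral_profile(clip)

    def process_chunk(self, chunk: np.ndarray) -> np.ndarray:
        """Stage 2: filter one chunk of live audio using the stored profile."""
        if self.profile is None:
            raise RuntimeError("enroll a speaker first")
        spectrum = np.fft.rfft(chunk, n=CHUNK)
        # Frequencies that were strong during enrollment pass through;
        # frequencies that were weak are attenuated.
        mask = self.profile / (self.profile.max() + 1e-12)
        return np.fft.irfft(spectrum * mask, n=CHUNK)
```

After one call to `enroll`, `process_chunk` can be called on every incoming chunk without looking at the speaker again, which mirrors how the headphones keep their focus once enrollment is done. A real system would replace the spectral mask with a learned model that tracks the voice itself rather than its frequency range.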
User Testing and Feedback
The researchers tested the TSH system with 21 participants. On average, participants rated the clarity of the enrolled speaker’s voice nearly twice as high as that of unfiltered audio. While the current setup requires the target speaker to be the loudest in the room during enrollment, users can re-enroll to improve sound quality if needed.
This technology could make conversation easier in varied environments such as museums and city streets, and it could eventually be built into popular headphones and earbuds.
Future Enhancements and Applications
The TSH system could be a game-changer for people with partial hearing loss or anyone who often deals with noisy environments where having a conversation is tough.
Researchers are hopeful about future updates that could tackle current limitations, like allowing the system to enroll multiple speakers at once and isolate voices in more complicated audio settings.
They’re also planning to adapt this technology for earbuds and hearing aids, which would make it accessible and useful to many more people.