Whisper Hearing System: AI Noise Reduction
In today's episode of The HearingTracker Podcast, host Steve Taddei talks to Dr. Don Schum, the Head of Audiology at Whisper. Based in San Francisco, Whisper manufactures hearing aids that remove background noise through "denoising", a process that relies on Artificial Intelligence (AI) and Deep Neural Networks (DNNs). In the interview, Dr. Schum provides some history on Whisper's creation and mission, and explains—in layman's terms—how Whisper uses cutting-edge technologies to help Whisper owners hear better in background noise.
Podcast Transcript
Steve Taddei: Hi, I'm Dr. Steve Taddei and you are tuned into the Hearing Tracker Podcast.
If you listened to our last episode, you know we began scratching the surface of artificial intelligence.
Giles Tongue: So generally, anything an animal or a human does that you describe as intelligent is artificial intelligence if it's done by a machine.
Steve Taddei: But we were mainly focused on true wireless stereo earbuds. So today on the show, I wanted to see how machine learning and deep neural networks are being used in other hearing technologies, such as hearing aids.
Is it the next big thing or more of a buzz term? How exactly is it being used? And the big one: does this technology offer more than a whisper of hope to improve speech in noise? Okay, that's a bad joke, but it's my way of introducing this episode's guest. To answer these questions, I spoke with Dr. Donald Schum, the Head of Audiology at a company called Whisper.
Donald Schum: Whisper was started by a gentleman who came from the AI field in Silicon Valley. They were working for some of the big tech companies out there. And they wanted to use their expertise in the field of AI to do something new in a company. They had been paying attention to a lot of the potential of some of the advanced forms of artificial intelligence to improve speech understanding and to improve, you know, speech in noise applications and things like that.
So, they gathered a group of people that they knew from that Silicon Valley area who knew a lot about audio and consumer electronics and signal processing and definitely AI to start building hearing aids.
Steve Taddei: AI may seem new, but it's found its way into most of our electronics. You're probably not more than an arm's length away right now.
Siri: Hello, now playing the Hearing Tracker Podcast.
Steve Taddei: If you look a bit further, it starts feeling like we're living inside some of our favorite sci-fi franchises. Places like the University of Wisconsin have started using specialized delivery robots on campus. I've even seen shelf stocking droids at my local marketplace. Beyond all that, I personally can't wait to have a self-driving car.
All this is great, but how can this same technology be used with ear level devices? What benefits does AI really have to offer?
Donald Schum: If you have a lot of data that you are dealing with, but you're not sure what's in the data or you're trying to extract information from the data, that's exactly where machine learning comes in.
Machine learning is a very specialized part of the world of AI. And it could take a lot of information and sort through it very quickly and extract information or find information that happens to be in the data. And when you think about the speech in noise problem, that's exactly what we have. We have a lot of data.
There are so many intricacies in the details of the speech signal, in the details of other sounds in the environment, and that is a perfect sort of problem for AI to go after. Because you have a lot of data to work with, but you need to be able to get through that data very quickly and find very intricate patterns in that data that can tell you something.
Steve Taddei: Don went on to describe exactly how these systems are created, through a process called offline training.
Donald Schum: The way machine learning works is, offline you feed the system thousands and thousands and thousands of recordings of speech in noise, and, you know, over thousands and thousands of hours of training. And every time a signal passes through the system, it is held up against a criterion of what good looks like. And so, it learns what good looks like, and it sees this coming in and it's like, "what do I have to do to manipulate this signal to make it look like it's good, to make it match what good looks like?"
The more you train it, you know, as you add thousands of hours of training to it, it gets more and more refined about identifying what is meaningful and what is not meaningful. In a deep learning system, or a deep neural network or machine learning system, the system can optimize on literally tens of thousands of parameters. It depends on how much processing power you have available. So that basically allows an ability to process on a phoneme-to-phoneme basis.
Because each phoneme has a distinct acoustic pattern. A "T" looks different than a "Th". An "E" looks different than an "Ah". Fricatives, they all have these different unique patterns to them. And that's the sort of thing that can be learned by a deep learning system.
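To make the idea concrete, here's a minimal sketch of that offline training loop in Python. It is not Whisper's model; the tiny network, the random stand-in "recordings", and the loss function are all illustrative assumptions. The key pattern is the one Don describes: the system's output is held up against clean speech, the criterion for "what good looks like," and the parameters are nudged to close the gap.

```python
# Minimal sketch of offline denoising training (illustrative only, not
# Whisper's actual model or data). A tiny network predicts a gain mask
# per spectral frame; the loss measures distance from "what good looks
# like", i.e. the clean-speech spectrum.
import torch
import torch.nn as nn

N_BINS = 257  # spectral bins per frame (e.g. a 512-point FFT)

mask_net = nn.Sequential(                  # hypothetical mask estimator
    nn.Linear(N_BINS, 128), nn.ReLU(),
    nn.Linear(128, N_BINS), nn.Sigmoid(),  # per-bin gains in [0, 1]
)
optimizer = torch.optim.Adam(mask_net.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

for step in range(10_000):  # stands in for "thousands of hours"
    # In real training these would be spectra of recorded speech mixed
    # with noise; random tensors keep the sketch self-contained.
    clean = torch.rand(32, N_BINS)          # criterion: "good"
    noise = torch.rand(32, N_BINS)
    noisy = clean + noise                   # what the system hears

    mask = mask_net(noisy)                  # predicted per-bin gains
    denoised = mask * noisy                 # manipulate the signal...
    loss = loss_fn(denoised, clean)         # ...to match "good"

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```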
Steve Taddei: And according to Don, this step is key. The accuracy of your system really depends on this offline training. So, imagine feeding a computer lots of audio samples of good, clean speech.
Clean Speech Voice 1: The country there is rich and pleasant.
Clean Speech Voice 2: Maybe it's easiest to get a new one?
Clean Speech Voice 3: I wonder what's happening outside?
Steve Taddei: This serves as a baseline of good. Then we start training the system by feeding it speech with broadband noise, narrowband noise, restaurant noise, and other potentially undesirable sounds like room ambience.
Eventually, you can get a system that can accurately recognize and analyze speech inputs in real world situations. Wow, right? And this is just one example of how AI is starting to be used in hearing aids.
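As a rough illustration of that data-preparation step, with synthetic signals standing in for real recordings, mixing clean speech with a noise source at a controlled signal-to-noise ratio might look like this:

```python
# Sketch of training-data preparation: pair each clean recording with
# noise (broadband, narrowband, restaurant, room ambience) at a chosen
# signal-to-noise ratio. Purely illustrative; random signals stand in
# for real speech and noise recordings.
import numpy as np

def mix_at_snr(clean: np.ndarray, noise: np.ndarray, snr_db: float) -> np.ndarray:
    """Scale `noise` so the mixture has the requested SNR, then add it."""
    clean_power = np.mean(clean ** 2)
    noise_power = np.mean(noise ** 2)
    target_noise_power = clean_power / (10 ** (snr_db / 10))
    scaled_noise = noise * np.sqrt(target_noise_power / noise_power)
    return clean + scaled_noise

rng = np.random.default_rng(0)
clean = rng.standard_normal(16_000)      # 1 s of "speech" at 16 kHz
broadband = rng.standard_normal(16_000)  # stand-in for a real noise file

noisy_example = mix_at_snr(clean, broadband, snr_db=5.0)
```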
Donald Schum: There's other hearing aid companies who use AI to do different things. For example, some of the companies will monitor the sound environment and send information about the sound environment up to the cloud.
And then the cloud will come down and suggest changes to the settings of the hearing aid. That's one use of AI. Another use is through motion detectors. So, if somebody has a beamforming system in their hearing aid, which is a system that works really well under certain circumstances, but if you're moving through space, sometimes a beamformer can have a very disorienting sort of effect on your hearing aid. And so, they use a motion detector to turn off their beamformer. That's another use of AI.
Steve Taddei: But if you remember this discussion began with the mention of a Silicon Valley based company called Whisper. How are they using artificial intelligence?
Donald Schum: What we're doing with AI is using machine learning to, in real time, sort through a very complex speech-in-noise input to be able to identify patterns that look like speech, differentiated from other sounds.
And to do that in real time, that's the way we're using AI. And one of the things that we take very seriously is the idea that if you're going to do that, you need a lot of signal processing power. So, when Whisper was started, one of the assumptions was that they were not going to be limited in terms of signal processing power to just what you can fit into an earpiece [hearing aid].
If you want to really throw the processing power at the problem that it deserves, then you need to go to another device. And we have a third device in our system, other than the two earpieces, and it is the Brain.
Steve Taddei: More on the Whisper Brain when we come back. You won't want to miss it, trust me.
Thank you for listening to the Hearing Tracker Podcast. On the show, we commonly talk about technologies and various ways to improve audibility when hearing issues exist. However, the importance of protecting your hearing cannot be overstated. In most cases, hearing loss can be reduced, delayed, or even avoided if we practice safer listening habits. The same can be said for those who already have hearing loss because, let's face it, your hearing can almost always get worse.
To this end, and to say thank you for your continued support, we're giving away a pair of Minuendo Lossless Earplugs. Now these aren't your run-of-the-mill squishy foam plugs. They allow you to adjust the amount of sound attenuation by way of a sliding lever on the device's outer shell.
How do you enter? Simply leave us an honest review on Apple Podcasts or share this episode online. Then snap a screenshot and send it to steve@hearingtracker.com. We'll announce the winner in next month's episode. And don't worry if it's not you, you can still get 20% off Minuendo products with the code STEVESB.
Welcome back. Before the break, Don kind of dropped a bomb regarding Whisper. He mentioned that their system relies on an external device, beyond the earpieces, to handle the extra AI processing. And this is called the Brain. Seems appropriate, right? This decision to include a third processor was necessary to exceed the capabilities of traditional hearing device form factors.
So, let's get back into it and hear more about the Whisper Brain.
Donald Schum: The Brain is this third piece, the signal processor that allows the system to apply a significant amount of Brain power, basically, to the problem of being able to, in real time, sort through this very complex input and identify the patterns that it has learned look like the speech signal, to differentiate it from other sounds in the environment.
Steve Taddei: Don went on to tell me how the Brain is a situational device. You don't always have to use it, as the hearing aids function normally without it.
Donald Schum: The earpiece will be monitoring environmental conditions. And if the level, the overall level of the environment, is high enough, and the signal-to-noise ratio is not optimal, then it will divert the signal down to the Brain for the Brain to do the processing.
So as the signal is passing through the Brain, the Brain is analyzing it in real time, looking for those patterns that it has learned in offline training, to separate speech from noise. As it sees those patterns, it'll protect those parts of the patterns that it believes are the speech signal and reduce gain, or reduce the signal level, for those parts that look more like noise. But if you're, for example, just home with one other family member having dinner, a quiet dinner, or just quietly watching TV, then you don't need the Brain. The Brain is for when the rest of the family comes over for Sunday dinner, or when you're going to go out to a restaurant, or go to a party, or some other function where you really need help.
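To make that concrete, here's a minimal sketch of the kind of time-frequency masking Don describes. Everything in it is an illustrative assumption rather than Whisper's actual, proprietary processing: a toy stand-in plays the role of the trained network, and the level/SNR trigger that decides when to divert audio to the Brain is left out.

```python
# Sketch of the runtime step: keep time-frequency bins the model
# believes are speech, and reduce gain on bins that look like noise.
# Illustrative only; the real system's model and logic are proprietary.
import numpy as np
from scipy.signal import stft, istft

FS = 16_000
MIN_GAIN = 0.1  # floor: duck noise-like bins rather than delete them

def denoise(noisy: np.ndarray, speech_prob) -> np.ndarray:
    """`speech_prob` maps a magnitude spectrogram to per-bin values in [0, 1]."""
    f, t, spec = stft(noisy, fs=FS, nperseg=512)
    mask = speech_prob(np.abs(spec))        # a trained DNN would go here
    gains = np.maximum(mask, MIN_GAIN)      # protect speech, reduce noise
    _, out = istft(spec * gains, fs=FS, nperseg=512)
    return out

# Toy stand-in "model": treat strong bins as speech (real systems use a
# deep neural network trained offline, as described above).
toy_model = lambda mag: (mag > np.median(mag)).astype(float)

rng = np.random.default_rng(0)
cleaned = denoise(rng.standard_normal(FS), toy_model)
```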
Steve Taddei: Okay, so far Whisper has definitely piqued my interest. And there's another unique feature that I think will resonate with us all. Take a moment and think of a time when you purchased a new phone, television, camera, computer, or piece of cool podcast gear, and just after purchasing it, the company rolls out a brand-new product line.
Great, now you're stuck with a previous generation of technology and all the buyer's remorse that comes with it. I've seen this exact situation in clinic with patients looking to invest in new technology. Well, with Whisper this might be a thing of the past.
Donald Schum: When a new product's going to be ready, you have a signal processor, a digital signal processor, that you can have on there. And basically, you fill up that processor, doing everything you possibly can that makes sense. And so that product might be excellent on the day it's sold. But three or four years down the line, it's still exactly the same product. And so, there's basically a performance gap that starts to build between what is state of the art today and what was state of the art three or four years ago when you bought the hearing aid.
And so when Whisper was founded, it was like, well that doesn't make sense. It doesn't have to be that way. If you reserve processing capability on the product, and very importantly, if you build your company's development organization to plan to do upgrades, then you can make that product better as often as you want to.
One of the key calling cards of our system is our ability to upgrade the system on a regular basis. We will do upgrades typically every three to four months, where we will push new firmware to the user's system. And almost every time we do an upgrade, we are improving the machine learning model that we use.
Steve Taddei: These updates are great, but there's one slight downside. Due to the accelerated update cycle, it's challenging to conduct large clinical trials comparing the AI processing benefits to those of other, more traditional devices. Don mentioned that they do have private internal data, and it indicates somewhere around "a six times improvement in technically what we can do to the signal to noise ratio compared to a market leader."
I did some digging of my own and found a few other academic studies supporting technical and clinical improvements from these deep neural network systems. But this technology is still in its infancy, and we really do need more research.
While this may seem like a no-lose situation, Don cautioned that there can be undesirable side effects if things are taken too far.
Donald Schum: Anytime you do aggressive manipulation of audio, you could screw it up, no matter what you want to do to it. You could try to do something good, but you could also do something bad. And noise reduction systems have always had that dual sort of demand on them. You could do a lot of noise reduction, but if you do it too fast and too aggressively, it could sound bad.
Steve Taddei: With traditional modulation-based noise reduction, hearing aids attempt to reduce frequency ranges dominated by noise. So, if there's a low frequency humming, like you're hearing right now, these ranges will be quieter. However, by reducing the low frequencies, I'm also changing the sound of my voice.
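Here's a minimal sketch of that idea, assuming a simple per-band measure of modulation depth (real hearing aid implementations are considerably more refined): speech envelopes fluctuate strongly over time, while a steady hum barely does, so bands with little envelope modulation get their gain turned down.

```python
# Sketch of classic modulation-based noise reduction (illustrative).
# Bands whose envelopes show little fluctuation, e.g. a steady hum,
# are attenuated; strongly modulated (speech-like) bands are kept.
import numpy as np
from scipy.signal import stft, istft

def modulation_nr(x: np.ndarray, fs: int = 16_000) -> np.ndarray:
    f, t, spec = stft(x, fs=fs, nperseg=512)
    env = np.abs(spec)                                     # band envelopes over time
    depth = env.std(axis=1) / (env.mean(axis=1) + 1e-12)   # modulation depth per band
    gains = np.clip(depth / 0.5, 0.2, 1.0)                 # quiet the steady bands
    _, out = istft(spec * gains[:, None], fs=fs, nperseg=512)
    return out
```

Note the side effect Steve demonstrates: a band-level gain change attenuates everything in that band, including any speech energy that happens to live there, which is why his voice changes color along with the hum.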
Other algorithms take a different approach and attempt to cancel out any noise with a process called spectral subtraction. So, if there is lots of noise like this, it can be subtracted based on the sonic footprint of that sound. If these processes are pushed too far, we may hear artifacts that do more damage than good.
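And here's a minimal sketch of textbook spectral subtraction, assuming a noise-only stretch of audio is available to estimate the noise's "sonic footprint." Pushing the subtraction harder than this is exactly where the artifacts Steve mentions, such as "musical noise," creep in.

```python
# Sketch of textbook spectral subtraction (illustrative). The noise's
# average magnitude spectrum, estimated from a noise-only stretch, is
# subtracted from every frame; the result is floored at zero and
# resynthesized with the noisy signal's phase.
import numpy as np
from scipy.signal import stft, istft

def spectral_subtract(noisy, noise_only, fs=16_000):
    f, t, spec = stft(noisy, fs=fs, nperseg=512)
    _, _, noise_spec = stft(noise_only, fs=fs, nperseg=512)
    noise_profile = np.abs(noise_spec).mean(axis=1, keepdims=True)  # "footprint"
    mag = np.maximum(np.abs(spec) - noise_profile, 0.0)             # floor at zero
    phase = np.exp(1j * np.angle(spec))                             # keep noisy phase
    _, out = istft(mag * phase, fs=fs, nperseg=512)
    return out
```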
Okay, so there's another very shocking and important element of Whisper that we've neglected to address so far. And that's, how can you get them?
We know there are so many direct-to-consumer self-fit devices emerging. Well, Whisper is not one of them. Whisper pairs with hearing care providers and they stand behind the importance of professionals in the hearing aid fitting process.
Donald Schum: We are definitely interested in understanding what the future looks like. And it's not so much what type of device you give a patient. It's more how do they want to get their care from their provider? And we know that they're going to want to be met in a lot of different places.
We believe that there's room in the marketplace for different paths to success with provision of hearing care. But at the end of it, it's always about hearing care. It's not just about technology. It's not just about getting technology to people and then having them have a go at it. You know, it is a matter of finding ways to bring care to the patient. Sometimes it's face to face and sometimes it might be from a distance because that's what the patient's looking for.
Steve Taddei: I'd like to thank Dr. Donald Schum for coming on the show and talking about both AI and Whisper. To learn more, you can visit Whisper.ai.
I'd also like to thank Soundbrenner and Minuendo for supporting the giveaway mentioned earlier. This episode was written, produced, and sound designed by me with help from Dr. Abram Bailey. If you'd like to hear my full unedited chat with Don, head over to our Patreon. And as always, thank you for listening.
Minuendo Lossless Earplugs Giveaway Rules: The giveaway will run from March 23rd to April 1st. Only one entry is allowed per person, and anyone is eligible to participate, with no purchase necessary to enter. The winner will be chosen at random and announced during the April episode of the Hearing Tracker Podcast.
Media Credits: Music by Coma-Media, Penguinmusic, and Dylan-Darby, via Pixabay.