Qualcomm is set to roll out the aptX Adaptive decoder for headsets, headphones, and speakers in September and it will be available for customers who use the company’s CSRA68100 and QCC5100 series Bluetooth Audio SoCs. Meanwhile, the aptX Adaptive encoder for smartphones and tablets will be released in December and it will be available to run on Android 9.0 Pie.
If Android phone -> hearing aid streaming audio can take advantage of aptX Adaptive, that opens up a lot of potential for sound processing on the phone… which means we might finally see AI/deep learning-driven noise reduction strategies.
Okay, you lost me? Seems to me that codecs used for streaming wouldn’t be useful for noise reduction strategies on sound coming in via the mics, where noise and speech are intermingled. Maybe I’m thinking about it wrong?
Remember that IEEE article I posted the other day?
The artificial intelligence running under the hood of the Livio AI system mostly relies on traditional machine learning algorithms rather than the potentially more powerful deep learning algorithms. That’s in large part because the hearing aid’s onboard computing power remains limited, and Starkey didn’t want critical AI-boosted listening or other functions to rely upon Internet access to additional cloud computing … for now, hearing aids still cannot fully leverage deep learning because the miniaturized devices lack the onboard computing power to run the algorithms. “I’ve yet to see the first hearing aid which really incorporates deep learning for noise removal,” Wang says.
Basically if we can get REALLY REALLY fast wireless audio transmissions, we can eventually get our phones or cloud computing to process the sounds from the hearing aid mics and then send them back processed to the aids. Now I have no idea whether aptX Adaptive gets us there in terms of speed, or whether our phones are even capable (at this point) of running adequate deep learning algorithms… but the point is, that is where I see all of this headed.
Okay, I follow your logic. But I’m not sure yet that this new whiz bang AI stuff can help us hear better.
Let’s say that AI can determine that you have walked into a restaurant and switches to speech-in-noise. Well that’s okay, but we already have the ability to automatically switch to speech-in-noise, or (for more user control) we can even set it to manually switch only.
Here’s how I could see AI being useful:
You buy a new set of hearing aids (un-programmed)! It comes with an App that lets you do a bunch of things. Change volume, treble, bass, mid-range. AI could record every adjustment along with which environment you are in (calm, noisy, etc) and program your hearing aids for you without the need for an audiologist!
You think that might work? Replace the Audi?
Ah, you underestimate the power of what AI can do for you… Read this http://spectrum.ieee.org/consumer-electronics/audiovideo/deep-learning-reinvents-the-hearing-aid
You can still color me as skeptical
There are, of course, limits to the program’s abilities. For example, in our samples, the type of noise that obscured speech was still quite similar to the type of noise the program had been trained to classify.
Translation = We cheated by teaching AI what noise is.
Since we published those early results, we’ve purchased a database of sound effects designed for filmmakers and used its 10,000 noises to further train the program.
Translation = We cheated some more.
Now, with funding from the National Institute on Deafness and Other Communication Disorders, we are pushing the program to operate in more environments and test it with more listeners who have hearing loss.
Translation = Cool, now we can spend OP (Other People’s) money.
Maybe you trained it to walk into a restaurant. But what if you walk into a Chinese restaurant? I think it would be different.
Not knowing much, it’s easy for me to offer a relatively uninformed opinion. But I bet AI/deep learning for HA’s is in the same department as HoloLens processing, etc. I bet that either would need an auxiliary wearable somewhere else on the body to provide augmented processing power and battery for that processing power. Or, if 5G is the answer to everything, the auxiliary device could provide a very high-speed low latency connection to servers that would provide the requisite processing power (thinking of how 5G is supposed to allow cars and traffic lights, etc., at any intersection all communicate “instantaneously” and avoid intersection bang-ups). The problem with 5G coming to the rescue of anything is its very short transmission distances. You’d need a cell tower on every block. Ain’t gonna work too well out in the countryside at large but great in high-density urban areas.
I’m an ex-biologist, Ph.D., former college professor-type. Most of the modern advances in medical science, DNA sequencing, molecular biology, etc., have been perfected with “other people’s money.” I have always felt with the PETA folks that just like organ donation on driver’s licenses, there should be an option “If I am in an accident, please do not treat me with any medical advances that have come through Animal Research or Other People’s Money. I would rather die…” So people can really live or die on the strength (and wisdom) of their convictions.
P.S. This is not to say that a lot of animals have not been needlessly expended in poorly thought-out research or a lot of other people’s money wasted in the same way. Research is big business with states vying to get to the top of the federal dollars gravy train… Research and animal use are subject to the same foibles as are all other human endeavors.
I personally think it holds promise. It makes sense to me that hearing aids could use more computing power (and battery power). A smartphone could easily be that source, and the advances in smartphone CPUs have been pretty dramatic. Not holding my breath though.
I think they’ll come up with techniques for hair cell regeneration long before this becomes a reality. Assume the processing is done in something other than the hearing aids and that you’re using the hearing aids as microphone. You have to transmit from the aids to your processor, do your super-computer filtering, and transmit back to your aids without noticeable latency.
The aids will have to preserve and transmit enough detail so the processor can do its job. I think that current codecs won’t cut it. Low latency audio on the return journey is the least of the problems to solve.
Hmm. I’d bet on machine learning before sorting out the biological processes successfully. Could be an interesting horse race though.
The 5G spec allows for one-way latency as low as 1 millisecond (ms) for “ultra-reliable low latency communication” setups, although typical allowed latency is 4 ms.
According to Wikipedia, up to 200 ms latency in audio devices is considered tolerable.
Apparently we live with a fair amount of latency already built into various audio devices. Interestingly, every 588 km of transmission distance to be covered has a built-in one-way latency of about 3 ms (signal traveling at 2/3 the speed of light). So having a powerful AI processor close by would keep round-trip latency via 5G down to a manageable size while still allowing tens of ms of processing time on the signal before delay became bothersome.
Maybe 5G chews up too much battery to work on a hearing aid, but if one’s phone were not powerful enough to handle the AI processing, it’s conceivable the phone could hand off to some other processor or servers not too far away via 5G and the whole process could still be in the tolerable latency range.
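For anyone who wants to check the distance figure, here is a rough Python sketch of the propagation arithmetic. It assumes only what the post states (signal speed of 2/3 the speed of light); the function name `propagation_delay_ms` and the sample distances are made up for illustration.

```python
# Rough check of the "3 ms per 588 km" figure.
# Assumption: the signal travels at 2/3 the speed of light, as stated above.
SPEED_OF_LIGHT_KM_PER_S = 299_792.458


def propagation_delay_ms(distance_km, velocity_factor=2 / 3):
    """One-way delay in ms for a signal traveling at velocity_factor * c."""
    return distance_km / (SPEED_OF_LIGHT_KM_PER_S * velocity_factor) * 1000.0


if __name__ == "__main__":
    # Illustrative distances only; 588 km is the figure quoted in the post.
    for km in (10, 100, 588):
        one_way = propagation_delay_ms(km)
        print(f"{km:>4} km: one-way {one_way:.3f} ms, round trip {2 * one_way:.3f} ms")
```

At 588 km this comes out just under 3 ms one way, so a processor within a few tens of km adds only a fraction of a millisecond in pure propagation delay.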
Tried to look up latency considerations in human perception of lips moving vs. sound to see how much out-of-sync tolerance there is. From what little I read, it seems that if you can see a speaker’s lips moving, less latency would be tolerable; a paper that I glanced at claimed that humans use visual cues to shorten audio recognition latency. So if you were only listening to words being spoken, more audio latency might be tolerable (kinda like early satellite phone conversations where even with slight delays, you could still get by?).
Update: Found that audio being out of sync with visual is defined as “lip sync error.” Although shorter times for audio lagging video are recommended in TV and film standards, an expert panel of reviewers found that the detection threshold for audio lagging visual is 125 ms.
So that gives a decent time cushion for an AI processor to massage speech and get the improved version to the HA before the listener perceives any disturbing lip sync error in face-to-face communications.
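To put a number on that cushion, here is a back-of-the-envelope Python sketch. It uses only the figures from this thread (125 ms lip sync threshold, 1 ms or 4 ms one-way 5G latency, signal at 2/3 the speed of light). Loud assumptions: it treats the 5G link latency and the distance-based propagation delay as separate, additive terms, and it ignores the wireless hop between the aids and the phone, codec delay, and buffering, all of which would eat into the budget. The function name `processing_budget_ms` is hypothetical.

```python
# Back-of-the-envelope processing budget before lip sync error becomes noticeable.
# Figures from the thread: 125 ms threshold, 1 ms / 4 ms one-way 5G latency.
# Assumption: link latency and propagation delay simply add; the aid-to-phone
# hop, codec delay, and buffering are NOT modeled here.
LIP_SYNC_THRESHOLD_MS = 125.0
SPEED_OF_LIGHT_KM_PER_S = 299_792.458


def processing_budget_ms(one_way_link_ms, distance_km,
                         threshold_ms=LIP_SYNC_THRESHOLD_MS,
                         velocity_factor=2 / 3):
    """Time left for AI processing after the round trip to a remote processor."""
    propagation_ms = distance_km / (SPEED_OF_LIGHT_KM_PER_S * velocity_factor) * 1000.0
    round_trip_ms = 2 * (one_way_link_ms + propagation_ms)
    return threshold_ms - round_trip_ms


if __name__ == "__main__":
    for link_ms, km in ((1.0, 0.0), (4.0, 0.0), (4.0, 588.0)):
        budget = processing_budget_ms(link_ms, km)
        print(f"link {link_ms} ms one-way, {km} km away: {budget:.1f} ms left for processing")
```

Under these simplified assumptions the network part is small next to the 125 ms threshold; the real question is how much of the remainder gets used up by the pieces this sketch leaves out.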
Concerns about the noise problem