My Whisper AI trial vs. Oticon More

x475aws · November 30, 2021, 3:38am

In your mind, how is the real time soundscape processed in order to use it as input to the DNN?

SpudGunner · November 30, 2021, 3:45am

happymach · November 30, 2021, 4:00am

Now that we’ve determined that no one is really speaking with any kind of authority about what is happening with Whisper’s design, can we just stop it?

SpudGunner · November 30, 2021, 4:20am

@happymach: I’m observing that discussion rather than contributing much to it. You need to it. I suggest you ask @x475aws to knock it off: all @Volusiano is doing is exercising his right to defend himself from the former’s taunting.

Volusiano · November 30, 2021, 4:21am

Hm, I didn’t know that forum members have to have some kind of authority about something in order to post things up on the forum. Speaking way off-topic consistently is one thing. Expressing opinions now requires having some kind of authority first before being allowed to speak?

I’m actually not that keen to comment on Whisper anymore because I’ve already said my piece a long time ago in the very first Whisper thread. But the door keeps getting opened time and time again, addressed to me, about various things to elicit (or challenge me?) to a response. I don’t usually come in uninvited.

happymach · November 30, 2021, 4:25am

It’s just that it’s pointless; the same relatively unsubstantiated assertions are being made over and over again. And take some responsibility: the door may be opened for you, but you keep choosing to walk through it.

Volusiano · November 30, 2021, 4:37am

It’s not pointless, and it’s not irresponsible. I make it a point to actually “take responsibility” to express my opposing view (if I have any) because if I don’t and decline to walk through that door, then it’s implied that I agree to the other view while I don’t.

Whether anyone’s assertion is substantiated or unsubstantiated is up to the reader to decide. If I recall, you made lots of assertions, too, about how the Whisper works, outside of your personal experience in using it. And I don’t recall how you substantiated your assertions, not did I require that you have to substantiate your assertions.

greg.smith · November 30, 2021, 4:59am

You ask what was rude so I will take the bait.

You invalidated another person’s hearing loss and suggested they don’t belong here, as if we are somehow in competition.

Tossing off such a comment and then blaming the target for not “having a sense of humour” (about a comment you actually prefaced with “not being funny, but…”) is dodging accountability for your actions at best, and double-down bullying at worst.

d_Wooluf · November 30, 2021, 5:39am

Or else they want to do a better job of it than the others?

There’s nothing there that’s “obvious”.

BeachBum · November 30, 2021, 10:54am

Analogies can sometimes serve a good purpose when one is at the end of their rope “casting pearls before swine”.

WhiteHat · November 30, 2021, 1:13pm

Yes, by training an eon ago. I’m into systems engineering for programs (far more than software — total solutions for problems). So spares, maint concept, testing, future tech refresh planning, cost, contracting strategies, user interface, training, depot level maint implementation, etc. etc. I spend lots of YOUR money.

There may be software and computer hw in your car or hearing aids. Or your tank. But the whole is far more than just the sw or hw. But either will sink your system pretty quickly if not implemented well.

So as I wander in my career from system to system I sometimes see implementations that solve problems most ppl never see. And then I couldn’t talk about them here, typically.

WH

WhiteHat · November 30, 2021, 2:04pm

I don’t know. I would suspect you’d want to process it so that it looks like the inputs given to the DNN in the training phase.

WH

jeffrey · November 30, 2021, 4:06pm

Sorry to see you get slimed, Greg, by the trolls. Oh well, the jackasses will bray and have nothing to say.

greg.smith · November 30, 2021, 6:55pm

Such is life on the internet.

x475aws · November 30, 2021, 8:57pm

That was my thinking also. So it seems to me that the more processing is applied to soundscapes during DNN training, the greater the processor speed requirement during hearing aid operation, since the live soundscape has to be readied for DNN input in real time.

WhiteHat · November 30, 2021, 9:57pm

I’d expect a DISP - A Digital Signal Processor (or something similar) will configure the input in the appropriate manner. Another one (or a bank of them) will also take the raw digital data from the mics and process it according to the output from the DNN. That would be baked into an ASIC or something like that. As I said earlier, I’ve been out of this environment too long to know what is current and “state of the art”. The general purpose processing would be limited to initializing and changing the configuration on the DSPs to respond to the desired “program”.

WH

BeachBum · November 30, 2021, 10:12pm

You are right on target. I too have been out of development of DSPs to have even a remote idea what is state of the art right now. If I knew it would probably blow mine mind.

BeachBum · November 30, 2021, 10:15pm

Edited. See below for corrections.

Volusiano · November 30, 2021, 10:47pm

All HAs use DSP modules of some sort to prepare the input data of the sound field to feed into the next module, be it a DNN, or a noise removal module of some sort, or some kind of special amplifier, etc to continue further processing, until the final processed form is achieved and output into D2A converters/amplifiers for the users to hear.

But as a system, these modules operate fairly independently as black boxes, and as long as the previous module in the flow supplies them with the input data in the proper format, that’s all that’s required.

I think what @x475aws is getting at here, through his line of questioning, in trying to get an answer out of you, and I quote his post again below for reference, is that he’s still trying to prove his earlier assertion that because the Whisper DNN is very resource intensive in itself, that the data processing to deliver the input to the DNN MUST THEREFORE be just as intensive in real time, or else it can’t keep up with the DNN. And because this data preparation phase of the DSP requires a lot of resources just the same (as the DNN), it gives the justification for the Whisper brain, to host both the pre-processing DSP going into the DNN, and the DNN itself. This was the argument he and I and @happymach had originally, which led to my tennis analogy.

Well, my contention is that a DNN should be able to accept normal data in real time, then process it however and whichever way it wants. But just because the DNN processing chooses to be resource intensive doesn’t automatically mean that the DSP data prep going into the DNN has to become just as resource intensive to match it.

I have yet another analogy. Let’s say all our ears are the same. We’re not hearing challenged, we have normal hearing. But let’s say I’m blind and you’re not, so the DNN in my brain allows me to process the same audio information coming into my ears (just as the same information coming into your ears) much better than the DNN in your brain, simply because I’m deprived of the visual cues, so my DNN gets trained to "over develop: and become better at processing the audio cues than your DNN. But the input to my DNN through my normal ears is JUST THE SAME as the input into your DNN through your normal ears. And both of our ears present the same audio information to both of our DNNs. The only difference is that my DNN process this audio information differently than yours. The fact that my DNN has been trained to process the audio better than your DNN (and arguably requires a larger portion of the brain than your DNN because my more complicated DNN is more resource intensive than yours) doesn’t automatically imply that I need a bionic ear to go with that DNN of mine or else it wouldn’t work.

ziploc · December 1, 2021, 2:13am

I tried an experiment that is more on my level of scientific understanding (probably a step or two above that of Beavis and Butthead). I listened to my “busy restaurant” YouTube video together with the same TED talk YouTube video like I did at the beginning of this thread. This time I tried using the “bass double vent” domes from the Oticon More on the Whispers, and vice-versa.

I could hear the increased treble with the Whisper domes on the Mores. And I heard less treble with the More domes on the Whisper. No surprise there. I tried listening to the two videos simultaneously with the volumes set so that I could barely understand the TED talk woman’s words. I tried various combinations of programs, volumes and domes with both HAs. The best speech recognition I achieved was with the Whispers with the Whisper domes on the dynamic program. But the Mores on the default program with the More domes was a very close second.

I know that the programming and the domes are dependent on each other so i couldn’t expect any improvement at all, much less any dramatic improvement, by switching domes without altering the programming to account for the different style of dome. Still I thought it was worth trying and reporting on it here.

It makes me wonder, though, if I had a different dome on the Whisper AND the HA was programmed to account for the different dome, if it would make any difference. It seems a bit curious that the Oticon software recommends a bass double dome and Whisper apparently recommends a less-occlusive dome with a tiny pinhole.