Funky new feature in AirPods Pro 2. Relevant to us?

“With iOS 16, you can use the TrueDepth camera on iPhone to create a personal profile for spatial audio because the way we all perceive sound is unique based on the size and shape of our head and ears,” said Senior Engineer of AirPods Firmware Mary-Ann Rau.

So basically you use new features in iOS 16 to scan your head. The shape of your head and ears supposedly affects spatial audio. By ‘spatial audio’ I’m reading ‘sound stage’? Spatial audio will be personalised just for you!

Does this have any possible application to us, the hearing impaired? I’m thinking real-world sound localisation. All assuming that it’s not complete BS, of course. Seriously, you can hate Apple but at the same time shake your head at their sheer unstoppability.

2 Likes

Maybe we’ll know more when iOS 16 is officially released.

3 Likes

I guess someone here’s going to get the gear sooner or later and they can do a before-and-after comparison. It’s more likely that the spatial sound processing is done in the iPhone than in the buds, so it would have no effect on transparency mode.

It was meant as a wider question. In a year or two, will audiologists be measuring our heads and inputting that information into the fitting software to improve our spatial localisation of sound?

1 Like

Since streaming starts on the iPhone, “spatial” sound effects could at least be applied to streamed audio sent to the HA’s. Since the MFi transmission protocol to HA’s is undoubtedly not Dolby Atmos nor DTS, iOS might use its knowledge of the listener’s audio space to translate what’s streamed to the HA’s to imitate DTS.

The other thing is that since smartphone apps can adjust the “fit” of the HA’s, Apple’s knowledge of the HA user’s personal audio space (trying to imitate the lingo you referred to in the AirPods Pro 2.0 article) could always be handed over to the smartphone HA app to apply that “learning” to how the HA’s reproduce sound heard directly through the HA microphones. My meager knowledge of spatial sound as reproduced by DTS is that a lot of it depends on timing delays between the ears. So maybe personal fit information could be integrated along with the average pinna effect already built into HA programming for BTE microphones to improve the spatial perception of audio.
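To make the “timing delays between ears” point concrete, here’s a minimal sketch of the two basic cues a renderer could impose on a streamed mono signal: an interaural time difference (ITD) and an interaural level difference (ILD). To be clear, this is just illustrative Python of my own, not Apple’s or any HA OEM’s actual processing, and the numbers are only ballpark values:

```python
# Illustrative only -- not Apple's or any hearing aid maker's actual algorithm.
# Make a mono streamed signal seem to come from the listener's right by
# delaying it slightly at the far (left) ear (ITD) and attenuating it there (ILD).
import numpy as np

def spatialize_right(mono, fs, itd_us=400.0, ild_db=6.0):
    """Return (left, right) channels with the source biased toward the right ear.

    itd_us: delay applied to the far ear, in microseconds
            (roughly 600-700 us is the maximum for a human-sized head).
    ild_db: level advantage of the near ear, in dB.
    """
    delay = int(round(itd_us * 1e-6 * fs))           # delay in samples
    left = np.concatenate([np.zeros(delay), mono])   # far ear: arrives later...
    right = np.concatenate([mono, np.zeros(delay)])  # near ear: arrives first
    left *= 10.0 ** (-ild_db / 20.0)                 # ...and quieter
    return left, right

# Example: a 1 kHz tone placed "off to the right" at a 44.1 kHz sample rate.
fs = 44100
t = np.arange(fs) / fs
tone = 0.1 * np.sin(2 * np.pi * 1000.0 * t)
left, right = spatialize_right(tone, fs)
```

Real spatial audio engines go well beyond this, applying direction- and frequency-dependent filters (head-related transfer functions) rather than one broadband delay and gain, which is presumably where the personalized head and ear scan comes in.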

Maybe the answer to this is known. But I wonder how we learn directionality if we all have differently shaped pinnas? Perhaps we hear a noise and see where it came from in the same instant, or turn around and see its source, starting out as infants and maybe continuing to learn on into life what sounds coming from different directions and distances sound like. Perhaps the idea is just to restore more of our own unique pinna effect to make sound “directionality” more like what we learned when we were young (and had a lot better hearing!).

Edit_Update: Bonus Materials!

Wikipedia has an extensive article on how humans localize sound. Both time delays between ears and the relative level of a sound heard in each ear are important:

Sound localization - Wikipedia

I liked the discussion of “The Cone of Confusion…!”

Two interesting articles: one is an extensive presentation on the development of sound localization in young animals. The last slide in the presentation concludes that exposure to sound is necessary for the development of sound localization but that, given exposure, the ability to localize sound itself is not driven by sound experience but is (apparently) innate:


Source: https://faculty.washington.edu/lawerner/sphsc462/dev_loc.pdf

Another paper studied the development of sound localization in early-blind people and found that their ability to localize sound was better than that of sighted subjects, implying that training by visual cues does not play a critical role in sound localization ability.

Early-blind human subjects localize sound sources better than sighted subjects

Source: https://www.nature.com/articles/26228

4 Likes

The Apple web page on AirPods Pro 2.0 says the following about Personalized Spatial Audio in a footnote:

  1. Compatible hardware and software required. Works with compatible content in supported apps. Not all content available in Dolby Atmos. iPhone with TrueDepth camera required to create a personal profile for Spatial Audio, which will sync across Apple devices running the latest operating system software, including iOS, iPadOS (coming later this fall), macOS (coming later this fall), and tvOS.

Source: AirPods Pro (2nd generation) - Apple

So, the firmware can take Dolby Atmos played on pretty much the whole gamut of Apple devices and send customized spatial sound to the AirPods Pro 2.0. Note that HA’s are not mentioned as receiving devices, but I should think HA OEMs could add the capability to receive and interpret what any Apple device is broadcasting in something approaching Dolby Atmos or at least DTS.

Further up the web page (under “Personalized listening. Sound Tuned to you.”), there is also the following statement about “Dynamic Head Tracking”:

Dynamic head tracking now brings three‑dimensional audio to Group FaceTime calls, so conversations feel like you’re in the same room with your friends and family.

Perhaps this could work both ways. If you were wearing a lavalier mic on your chest but turned away from your webcam, your voice wouldn’t necessarily change depending on whether you faced your device screen, but your own head tracking through the AirPods Pro 2.0 could adjust the sound sent out according to the way you faced the screen**. And if other folks in a Group FaceTime were similarly wearing AirPods Pro 2.0, the sound from them to your ears might also reflect their head positions, whether they were wearing a direct-input lavalier mic or not. The utility of this seems to be the same as for iMessage and FaceTime itself: it compels everyone in your group or family to be all-in with Apple or left out (as I was for years as an Android user in an Apple family! - now I’ve finally capitulated!).

** Same if the input were coming from mics on the AirPods Pro 2.0 themselves.

Cool cool cool. I expect that they are measuring the distance between the ears and applying frequency-specific time and level delays from that. They also appear to be scanning the pinna; can they interpolate semi-appropriate pinna filters from that? That seems much harder to me, but video processing can do all sorts of crazy stuff these days so, sure? Based on some random user comments on Reddit, I am guessing that they are not getting as accurate a head-related transfer function as they would get from recording audio directly in your ears. The old example I always think of is the barber shop one, which it looks like Starkey stole and repurposed for marketing at some point: Virtual Barber Shop (Audio...use headphones, close ur eyes) - YouTube (use headphones)
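As a rough illustration of how a head measurement turns into a time delay, the textbook Woodworth rigid-sphere approximation gives ITD as a function of head radius and source azimuth. Apple hasn’t published what model they actually use, so the function and values below are just my own sketch:

```python
# Back-of-the-envelope only: the classic Woodworth rigid-sphere approximation
# for interaural time difference (ITD) from head radius and source azimuth.
# Apple hasn't said what model they use; this just shows why a head scan
# gives you a number worth plugging in.
import math

SPEED_OF_SOUND = 343.0  # m/s at room temperature

def woodworth_itd(head_radius_m, azimuth_deg):
    """ITD in seconds for a source at the given azimuth
    (0 = straight ahead, 90 = directly to one side)."""
    theta = math.radians(azimuth_deg)
    return (head_radius_m / SPEED_OF_SOUND) * (theta + math.sin(theta))

# Two plausible head radii, source directly off to one side:
print(round(woodworth_itd(0.0875, 90) * 1e6))  # ~656 microseconds
print(round(woodworth_itd(0.0750, 90) * 1e6))  # ~562 microseconds
```

A difference of a centimetre or so in head radius already shifts the maximum ITD by nearly 100 microseconds, which is large compared to the roughly 10-20 microsecond differences good listeners can detect, so a personalized measurement is at least a plausible input.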

In regards to relevance for individuals with hearing loss, pinna effect cues are pretty high frequency, so depending on the hearing loss they may not be accessible.

It’s flexible. If you make model pinnas and stick them on your head backwards while smoothing your own pinnas down, everything will sound backwards for a while and then it will flip and seem normal. You can temporarily shift both time and level difference assessments, and therefore localization judgements, by fatiguing one ear with multiple tone exposures. You’re right that the ears tend to follow the eyes. But there’s a combination of flexibility and . . . predetermined architecture. In owls for example, if you plug up one ear at birth you can track how they recover sound localization once that ear is unplugged. Recovery slows the longer you wait and after a certain point (~48 months?) you have passed the critical period and they will not recover sound localization when you unplug the deprived ear. Early experience is protective against later monaural deprivation. I don’t have a reference for how long that critical window or sensitive period is in humans; I’m not sure we have that information yet. (Critical developmental windows and sensitive periods exist for all sorts of things, and are worth reading about if you haven’t. Language is surely one of the most interesting areas.)

I’m tremendously biased, but I think auditory localization is much cooler than visual localization. :slight_smile:

3 Likes

Two interesting articles relative to Apple’s Personalized Spatial Audio and Dynamic Headtracking:

Apple’s personalized spatial audio trick is really a Sony idea

and

Would be very interesting if it comes to MFi HA’s - and one has the ability to turn it on or off as desired and fine-tune it.

1 Like

Both great links. Thanks Jim. Btw, I didn’t really follow what you were saying before about Dolby Atmos, ’cause I didn’t realise it’s a spatial thing. Just to show you who you’re dealing with here…

For some reason I thought you might be the one to appreciate it.

Ok, thinking aloud about things that I don’t understand… Most of us here will hear a voice partly through our hearing aids and partly naturally. I’m guessing that all of us - hearing impaired or not - are ‘tuned’ to recognise - partly through spatial location - a bunch of sounds as belonging to the speech of some individual. Also guessing that this bundling of sounds really helps our processing of speech.

If we’re getting conflicting cues about the speaker’s location, what’s this doing to our speech understanding? Taking on board what you’re saying that the highest frequencies don’t provide a lot of spatial information to most of us, there will still be some crossover? I get that we’re probably talking audiology 101. I skipped my classes. So the gist of my thinking is… in the future, can this stuff improve the spatial cues that our hearing aids provide us, and can this help us?

When I first saw this stuff on the web, I went through a few wtf moments. Like we can map our heads with our phones? Seriously?

And on a sad personal note: I saw the AirPods Pro 2 on pre-sale for about 25% off, but by the time I tried to order them, they’d pulled the deal. Filthy about that.

Pinna cues would be the first to go, but people with hearing loss are still accessing ILDs and ITDs to various degrees depending on their hearing loss.

Eh, applicability to hearing aids is weaker because: 1) you don’t need to reproduce ILD/ITD cues because they already exist in the world (as opposed to listening to, e.g., music); you just need to maintain them appropriately, and 2) hearing aids are already employing strategies to try to support this. I think in most cases they assume interaural time differences will be naturally maintained because hearing aid processing times will be symmetrical.

Offhand, at least Oticon and Unitron try to be careful about maintaining ILD cues. Phonak applies ILDs with their Roger system to support a spatial effect through the streamed mic sound. Unitron applies frequency-specific filtering to try to replicate pinna cues. Most others have at least some basic directionality in place to support a pinna-like effect. ReSound’s M&RIE mic tries to maintain pinna cues physically rather than replicate them - it seems to me that with that mic in the ear it should be possible to take a direct measurement of the individual’s pinna filter and apply it to the sound coming in the BTE mics in noise situations, but they don’t. So there’s room for advancement.
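To put a number on the ILD-maintenance point, here’s a toy example - my own sketch of generic wide-dynamic-range compression, not any manufacturer’s implementation - showing why two independently running compressors shrink the interaural level difference while linking the gain across the ears preserves it:

```python
# Toy numbers only -- my own sketch, not any manufacturer's algorithm.
# Independent compression per ear turns the louder (near) ear down more than
# the far ear, shrinking the interaural level difference (ILD); linking the
# ears so both apply the same gain keeps the cue intact.

def wdrc_gain_db(input_db, threshold_db=50.0, ratio=2.0):
    """Gain (in dB) of a simple wide-dynamic-range compressor above threshold."""
    if input_db <= threshold_db:
        return 0.0
    return -(input_db - threshold_db) * (1.0 - 1.0 / ratio)

near_db, far_db = 70.0, 62.0  # a talker off to one side: an 8 dB ILD

# Independent compressors, one per ear:
independent_ild = (near_db + wdrc_gain_db(near_db)) - (far_db + wdrc_gain_db(far_db))

# Ear-to-ear linked compression: both sides use the gain for the louder ear:
shared_gain = wdrc_gain_db(max(near_db, far_db))
linked_ild = (near_db + shared_gain) - (far_db + shared_gain)

print(independent_ild)  # 4.0 dB -- half the spatial cue is gone
print(linked_ild)       # 8.0 dB -- the cue survives
```

Real bilateral coordination works per frequency band and with smarter rules than this, but the direction of the effect is the same.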

But yes, the spatial location of sounds is a big component of how we are able to process speech in noise. It’s not unimportant, and it does break down with hearing loss.

2 Likes

When I saw this during the Apple presentation I yelped excitedly and started explaining the pinna to my husband. :smiley: Apple has an entrenched user base, and if they are working on something as deeply related to hearing as the pinna, they are working on OTC hearing aids. It’s Apple. They won’t be first to market with any product, but when they get there, it’s usually worth the wait.

There is every chance that what Apple brings to the market could include any of the following:

  • support a hearing aid test provided by a pro
  • support custom adjustments to high, mid and low frequencies
  • support ‘conversation boost’ for the vocal ranges as a quick function
  • support an ear scan to help augment your device to more closely replicate your pinna. Huge ear people, rejoice.
  • much more flawlessly take calls and switch audio between your Apple devices
  • stream your TV to you with no secondary devices needed if you have an Apple TV
  • treat your hearing aids as a real, honest to god bluetooth device without all the limitations on pairing to multiple devices
  • a recharging case much like the AirPods
  • a variety of sizes of closed in ear buds for a personalized fit
  • use of transparency mode both to eliminate the dreaded own-voice echo effect and to step around the issue of having the pickup mic behind your ear, further enhancing the natural pinna effect. Those with a low degree of hearing loss will get excellent bass and treble, great clarity, and streaming music will sound fantastic, versus the open-fit method, if they want to try and go this route.
  • not just reducing the volume of ‘bad’ sounds like our hearing aids do, but active noise cancellation of ‘bad’ sounds that doesn’t interfere with speech and other vital noises. IMAGINE IT.

I wanted to wait for OTC hearing aids specifically because I know Apple will be a player, eventually, but I just couldn’t wait any longer. I like my Widex Moment 440s well enough, but dang it Apple, GET IN THE MARKET ALREADY.

I don’t think Apple will make an FDA-approved hearing aid. They’ll keep improving the ‘assistance’ features in their audio products. The adaptive transparency mode is interesting. I’ve seen some ‘reviews’ (mostly by people who haven’t tried them) say that it will prioritise speech against background noise. Yeah, maybe. Then there’s a mysterious port on the charging case that might or might not be a remote microphone. I won’t lie. I’d like the chance to try them out.

I wouldn’t expect them to work as well as hearing aids. My use case is listening to streamed music on them and not having to put on my hearing aids if someone tries to converse.