Yes, I understand your point. But you don’t lose the original note. You can still hear it just the same as before, so its pitch is never lost, and it doesn’t turn into a different note the way you think.
Now how about the copy of that note that gets lowered? Does it have the same pitch or not? That’s not clear. But remember that the highs being transposed down are mostly timbres and high-end harmonics, not pure notes. Since you still hear the pure tones loud and clear, your musical perception stays intact and isn’t greatly altered. What you gain are the high-end timbres and harmonics that you would otherwise miss. The question is whether they blend well with the original musical content or not. To me, they seem to blend well.
I’m including below a screenshot of the Speech Rescue whitepaper that talks about how they use the ERB (equivalent rectangular bandwidth, i.e. the width of the cochlear bandpass filters) to select frequencies in a way that follows the natural perceptual arrangement and minimizes distortion. I know that’s a lot of mumbo-jumbo that I don’t fully understand myself. But intuitively, I interpret it to mean that they’re using some kind of knowledge about how the cochlea works to make the lowered sounds blend in well with the original sounds.
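
For anyone curious what the ERB looks like in numbers, here’s a rough sketch. It uses the commonly cited Glasberg and Moore approximation for the ERB of a normal-hearing ear. That formula is my own addition, not something taken from the whitepaper, and I’m not claiming this is what Speech Rescue actually computes. The function name `erb_hz` is just something I made up for illustration.

```python
def erb_hz(center_freq_hz: float) -> float:
    """Approximate width (in Hz) of the auditory filter centered at center_freq_hz.

    Assumption: Glasberg & Moore approximation for normal hearing,
    ERB = 24.7 * (4.37 * f_kHz + 1). Not taken from the Speech Rescue whitepaper.
    """
    return 24.7 * (4.37 * center_freq_hz / 1000.0 + 1.0)


if __name__ == "__main__":
    # Print the filter width at a few center frequencies to show how it grows.
    for f in (500, 1000, 2000, 4000, 8000):
        print(f"{f:>5} Hz -> ERB is about {erb_hz(f):.0f} Hz wide")
```

The point is just that the filters get wider as you go up in frequency (roughly 133 Hz wide at 1 kHz versus roughly 456 Hz wide at 4 kHz), so a band of high-frequency sound covers relatively few cochlear filters compared to the same number of Hz lower down. That seems to be the kind of perceptual spacing the whitepaper is referring to.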