In the attached screenshot I’ve circled a variable that I’m asking about.
Any input appreciated.
A guess is fitting formulas.
Thank you. Now, what are fitting formulas?
Compression turns the gain down when sound gets loud and back up when it gets quiet. This can’t happen instantly (that would be distortion), so the change is made gradually. There are situations where a fast action (120 ms) makes sense, and others where a slow response (800 ms) is more natural. In still others, a “Dual” response (fast on short spikes, slow for sustained sounds) is useful.
It may be difficult to hear these differences without a test situation. Stand next to a fan (steady sound) and make a loud sound (clank the dishes near your ear). The fan sound should “duck” and come back. A larger number means slower recovery. For “natural dynamics” we want slow. But for speech intelligibility we may want fast, so we get the soft sibilant after the loud vowel.
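If it helps to see the “duck and come back” as numbers, here is a minimal sketch of a one-pole exponential release. It is only an illustration: the 12 dB duck, the 1 dB “recovered” mark, and treating the setting as an exponential time constant are all my assumptions, and a real hearing aid may define and implement the figure differently. The function names are made up for this example.

```python
import math

def gain_recovery(release_ms, duck_db=-12.0, sample_rate=1000, duration_ms=2000):
    """Gain in dB over time after a loud sound ends, recovering toward 0 dB.

    Assumption: release_ms is an exponential time constant, so each step
    closes the same fraction of the remaining gap.
    """
    coeff = math.exp(-1000.0 / (release_ms * sample_rate))
    gain = duck_db  # hypothetical amount the fan was turned down
    trace = []
    for _ in range(int(duration_ms * sample_rate / 1000)):
        trace.append(gain)
        gain *= coeff  # the target is 0 dB, so the gain decays toward 0
    return trace

def time_to_recover(trace, within_db=1.0, sample_rate=1000):
    """Milliseconds until the gain is back within `within_db` of 0 dB."""
    for i, gain in enumerate(trace):
        if abs(gain) <= within_db:
            return i * 1000.0 / sample_rate
    return None  # did not recover inside the simulated window

for release_ms in (120, 800):
    t = time_to_recover(gain_recovery(release_ms))
    print(f"release {release_ms} ms: back within 1 dB after about {t:.0f} ms")
```

Under these assumptions the 120 ms setting is back within 1 dB in roughly 0.3 s, while the 800 ms setting takes roughly 2 s, which is the fan coming back quickly versus slowly.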
Demonstrating “Dual” requires more complexity. Clank one dish (TINK), then compare that to clanking all the dishes (TINKTINKTINKTINK) so it is a “long loud” sound. The quick tink should recover fast; a long loud sound should recover slower. Aside from the “makes sense” aspect, your brain recognizes a single time constant as “unnatural” (but maybe useful); dual time constants fool the brain better.
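One common way to get that behavior is to run two level detectors side by side and follow whichever reads higher. The sketch below does that. It is not how any particular hearing aid does it. The 120 ms and 800 ms release values are the ones from this thread; the 5 ms and 300 ms attack values, the 0.1 “recovered” mark, and all the names are assumptions made up for the example.

```python
import math

def smoothing_coeff(time_ms, sample_rate):
    """One-pole coefficient for an exponential time constant given in ms."""
    return math.exp(-1000.0 / (time_ms * sample_rate))

def dual_envelope(levels, sample_rate=1000,
                  fast_attack_ms=5.0, fast_release_ms=120.0,
                  slow_attack_ms=300.0, slow_release_ms=800.0):
    """Track loudness with a fast and a slow detector and return the larger.

    `levels` is loudness above the compression threshold (0 = quiet,
    1 = loud). A short spike charges only the fast detector; a sustained
    sound also charges the slow one, which then holds the gain down longer.
    """
    fa = smoothing_coeff(fast_attack_ms, sample_rate)
    fr = smoothing_coeff(fast_release_ms, sample_rate)
    sa = smoothing_coeff(slow_attack_ms, sample_rate)
    sr = smoothing_coeff(slow_release_ms, sample_rate)
    fast = slow = 0.0
    out = []
    for x in levels:
        c = fa if x > fast else fr  # rising input: attack, otherwise release
        fast = c * fast + (1.0 - c) * x
        c = sa if x > slow else sr
        slow = c * slow + (1.0 - c) * x
        out.append(max(fast, slow))
    return out

def recovery_after(burst_ms, floor=0.1, tail_ms=3000):
    """Milliseconds after a loud burst until the envelope drops below `floor`.

    Uses the default 1000 Hz rate, so one sample is one millisecond.
    """
    levels = [1.0] * burst_ms + [0.0] * tail_ms
    env = dual_envelope(levels)
    for i in range(burst_ms, len(env)):
        if env[i] < floor:
            return i - burst_ms
    return None  # did not recover inside the simulated tail

print("single TINK (20 ms):", recovery_after(20), "ms to recover")
print("long clatter (1000 ms):", recovery_after(1000), "ms to recover")
```

With these made-up numbers the single tink recovers in well under half a second, while the long clatter takes closer to two seconds, even though both were equally loud.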
In recording, we had developed single-time-constant compression with times from 50 ms to 2000 ms, and dual time constants, by 1938. These were mainstays of audio processing until the 1980s, when computers allowed much more clever trickery. (I worked on that; not easy to do.) I assume “DSE” is such a system. It is apparently part of “Dynamic Speech Enhancement”, which sounds like a collection of proprietary algorithms.