ICoeuss OP t1_jdkr40n wrote on March 25, 2023 at 2:37 AM

Reply to comment by The_D0lph1n in I turned my X2HRs to Sundaras by ICoeuss

Thank you very much for sharing your knowledge. I can't say I am qualified to fully understand everything you said (and it's even harder to do so because English isn't my first language) so I have a few questions if you don't mind.

What exactly is "cumulative distortion from the lower registers"? That and FR are the only two things that determine a headphone's resolution/tonal contrast then, correct?

I always thought that angled/far drivers or different-sized drivers have different soundstages because the pinna changes the FR of the sound entering the ear canals differently. Therefore the spaciousness and directionality of the sound are mostly (maybe even entirely) determined by the pinna, and since in-ear mics roughly measure the sound arriving at my ear canal entrance (so they take the interaction between the headphones and my HRTF, including my pinna, into account), can we really not replicate soundstage using EQ? If we can fully match the two sounds entering my ear canal, my eardrum should hear identical sounds, as the ear canal will react identically to both, should it not?

If I understand correctly, overshoot in the impulse response is slam. So if an impulse of larger-than-intended amplitude is created, it will have more slam and may sound nicer, but is it the intended sound? I'm not asking if it's intended by the music producer; I'm asking if it's closer and more accurate to the digital input.

I've heard from multiple people that most headphones are "minimum phase". I don't know exactly what "minimum phase" is, but what I understood from it is that in CSD plots, if there's a peak in FR there will be a peak in decay time and vice versa, and they match well enough that it is under the audibility threshold, so they don't matter. Is this correct?

I now strongly agree with the idea that "momentary SPL at the eardrum is nearly everything", as FRs don't show how the headphones react when more than one particular frequency is played.

So if a headphone can't keep up when multiple instruments intended to sound like they're coming from different directions are played, I think the sounds might bleed into each other and hurt the fine details of the sounds and the sense of directionality. Could this be what people refer to when they use the terms "separation, resolution, imaging"? And I have a feeling this is related to the attack and decay speed of the headphones, is this true?

The_D0lph1n t1_jdl4vif wrote on March 25, 2023 at 4:45 AM

When I mentioned cumulative distortion, I was referring to how the energy present at a specific frequency in a multi-band signal comprises not just the energy in the signal itself at that frequency, but also the distortion products of all the lower tones that land there. For example, the amplitude at 2 kHz is not just the 2 kHz component in the signal, but also includes energy from the 2nd harmonic distortion of the 1 kHz component, the 3rd harmonic of the 666.67 Hz component, the 4th harmonic of the 500 Hz component, etc. That's what I meant by cumulative: the level at each frequency depends not just on what's in the signal, but also on the distortion components of lower frequencies, played at the same time, that fall on that same frequency.
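To put rough numbers on the 2 kHz example, here is a minimal Python sketch. It is a toy model, not a measurement of any real headphone: `toy_driver`, its `k2` coefficient, the tone amplitudes, and the sample rate are all assumptions chosen for illustration. It plays a quiet 2 kHz tone through a slightly nonlinear "driver", once alone and once alongside a loud 1 kHz tone, and compares the level measured at 2 kHz.

```python
import math

SAMPLE_RATE = 48_000
NUM_SAMPLES = 4_800  # 0.1 s, so 1 kHz and 2 kHz both fit a whole number of cycles


def tone(freq_hz, amplitude, index):
    """One sample of a sine tone."""
    return amplitude * math.sin(2 * math.pi * freq_hz * index / SAMPLE_RATE)


def toy_driver(x, k2=0.05):
    """Hypothetical driver model: the small squared term creates 2nd-harmonic distortion."""
    return x + k2 * x * x


def level_at(samples, freq_hz):
    """Amplitude of one frequency component (a single DFT bin, by direct correlation)."""
    re = sum(s * math.cos(2 * math.pi * freq_hz * i / SAMPLE_RATE)
             for i, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)
             for i, s in enumerate(samples))
    return 2 * math.hypot(re, im) / len(samples)


# Fundamentals whose n-th harmonic lands exactly on 2 kHz: 1000 Hz, ~666.67 Hz, 500 Hz.
sources = {n: 2000 / n for n in (2, 3, 4)}

# Case A: the quiet 2 kHz tone on its own.
quiet_2k_alone = [toy_driver(tone(2000, 0.1, i)) for i in range(NUM_SAMPLES)]

# Case B: the same 2 kHz tone, now played together with a loud 1 kHz tone.
quiet_2k_with_loud_1k = [
    toy_driver(tone(2000, 0.1, i) + tone(1000, 1.0, i)) for i in range(NUM_SAMPLES)
]

level_alone = level_at(quiet_2k_alone, 2000)
level_with_1k = level_at(quiet_2k_with_loud_1k, 2000)

print(f"2 kHz level, tone alone:      {level_alone:.4f}")
print(f"2 kHz level, with 1 kHz tone: {level_with_1k:.4f}")
```

In this toy model the 2 kHz level comes out higher in the second case, even though the 2 kHz content of the input is identical, because the 2nd harmonic of the 1 kHz tone lands on the same frequency. A real driver would add 3rd, 4th, and higher harmonics from other tones in the same way.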

In theory, if you could exactly match the waveform seen at the eardrum, then yes, you would hear exactly the same soundstage and imaging. However, I have never been able to properly do this in practice with over-ear headphones. Additionally, I've heard from an acoustic engineer that soundstage is partially influenced by physical factors; if the headphone is touching your ears, it hurts the illusion of soundstage because your brain knows that the sound is coming from right outside of your ear. In general, the brain prioritizes non-auditory inputs. That's why the McGurk Effect exists: when there's a conflict between what your eyes see and your ears hear, you literally hear what your eyes see, even if the actual auditory input doesn't match.

Regarding overshoot, in theory, less overshoot means it's more accurate to the input signal. That's what Dan Clark says to justify the macrodynamic performance of his headphones; he says that other headphones overshoot in their impulse responses, but his headphones do not. Many people think his headphones sound really dead and lifeless as a result, but that's where science meets art. If the music was produced on gear that has more overshoot, it probably has lower dynamic swings in the signal. Should the headphone reproduce the signal as is, or should it try to reproduce the dynamics that were in the original performance, but weren't mastered into the signal? There's no single right answer to that question; it's a matter of design philosophy and preference.

Regarding minimum-phase, it means that the phase response is exactly the amount needed to produce the frequency response. There's no excess group delay across the entire frequency range. In theory, the CSD plot shows nothing that isn't already in the FR, and generally weirdness in the CSD plot is reflected in peaks and troughs in the FR graph too. I used to hold strongly to that view, but now I'm not as certain that CSD plots have no value. All physical devices have resonances, and at higher amplitudes, those resonances are the first to exhibit serious non-linearities. In general, if I see a long trail in a CSD plot, then I take it as a sign that I should be very careful about boosting that region in EQ. If I cut, that's fine, because the trail disappears with the cut, but if I boost, then that trail will become more significant, maybe enough to become audible at high volumes. There is one case, the Koss KSC75, where I actually hear something like ringing in the lower treble (it sounds like a bit of microphone feedback when certain notes play), which is probably a resonance, but it goes away when I EQ down the 5 kHz region.
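The link between an FR peak and a decay trail, and why a cut removes the trail, can be sketched with a toy minimum-phase resonance. This is an idealized model under stated assumptions, not a model of the KSC75 or any real headphone: the two-pole `resonator`, the pole radii 0.9 and 0.99, the 5 kHz center, and the -60 dB ring-time threshold are all illustrative choices, and `matching_cut` is a mathematically exact inverse that a real EQ would only approximate.

```python
import cmath
import math

SAMPLE_RATE = 48_000
CENTER_HZ = 5_000
THETA = 2 * math.pi * CENTER_HZ / SAMPLE_RATE


def resonator(x, r):
    """Two-pole resonance: y[n] = x[n] + 2 r cos(theta) y[n-1] - r^2 y[n-2]."""
    a1, a2 = 2 * r * math.cos(THETA), -r * r
    y, y1, y2 = [], 0.0, 0.0
    for s in x:
        out = s + a1 * y1 + a2 * y2
        y.append(out)
        y1, y2 = out, y1
    return y


def matching_cut(x, r):
    """Idealized EQ cut: the exact inverse of resonator() with the same r."""
    b1, b2 = -2 * r * math.cos(THETA), r * r
    y, x1, x2 = [], 0.0, 0.0
    for s in x:
        y.append(s + b1 * x1 + b2 * x2)
        x1, x2 = s, x1
    return y


def gain_db_at_center(r):
    """Height of the FR peak: resonator gain evaluated at the center frequency."""
    z_inv = cmath.exp(-1j * THETA)
    h = 1 / (1 - 2 * r * math.cos(THETA) * z_inv + r * r * z_inv ** 2)
    return 20 * math.log10(abs(h))


def ring_time_ms(h, floor_db=-60.0):
    """Time of the last sample still within floor_db of the response's peak."""
    threshold = max(abs(v) for v in h) * 10 ** (floor_db / 20)
    last = max(i for i, v in enumerate(h) if abs(v) >= threshold)
    return 1000 * last / SAMPLE_RATE


impulse = [1.0] + [0.0] * 4_799

mild, sharp = resonator(impulse, 0.9), resonator(impulse, 0.99)
gain_mild, gain_sharp = gain_db_at_center(0.9), gain_db_at_center(0.99)
ring_mild, ring_sharp = ring_time_ms(mild), ring_time_ms(sharp)

# Apply the matching cut to the sharp resonance and measure the trail again.
ring_after_cut = ring_time_ms(matching_cut(sharp, 0.99))

print(f"mild : peak {gain_mild:5.1f} dB, rings {ring_mild:5.2f} ms")
print(f"sharp: peak {gain_sharp:5.1f} dB, rings {ring_sharp:5.2f} ms")
print(f"sharp + matching cut: rings {ring_after_cut:5.2f} ms")
```

In this model the sharper resonance has both the taller FR peak and the longer trail, and the matching cut removes the trail along with the peak. That only holds because the toy resonance is linear and minimum-phase; it says nothing about the non-linear behavior at higher amplitudes mentioned above.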

On the more practical side, I can only presume that phase is an issue at the design level. Many companies have tech in their headphones to control phase effects. Phase effects tend to result in weird, narrow peaks and dips in the FR. Those features become difficult to EQ out, and if you look at Oratory1990's EQ presets, he often doesn't fix those really narrow irregularities because they stem from phase effects that will sound really strange if filled in via EQ.

For your last question, speed isn't a real metric in headphones as far as we can tell. Some headphones certainly sound like they attack/decay faster or slower than others, but there's no concrete metric that can be exclusively tied to that phenomenon. Here's an article by Brent Butterworth at SoundstageSolo! that showed a perceptibly slower headphone actually responding faster to the input than a fast-sounding headphone. Given my experience with dynamic EQ allowing me to add macro-dynamic punch to a headphone by overdriving large transients, I suspect that separation is part of "micro-dynamics", where small transients are being overdriven by a headphone. But there is no scientifically backed measurement that explains the perception of attack/decay speed in headphones.