
The_D0lph1n t1_jdija1e wrote

Keep in mind that rapid A/B switching tends to erase differences. That's a familiar problem for people who go to big meets where lots of headphones are available for demo and try out multiple headphones/IEMs in the span of a few minutes; everything starts to sound the same because our brains don't have time to get acclimated to any one sound. It's also a common phenomenon that headphones that sound good during a short demo at a show don't sound as good in the long run, because the qualities that make a headphone stand out against the five others the listener just heard can make it too sharp or too unusual in a normal listening environment.

I've gotten headphones very close to one another via EQ as well (though never quite exact), and resolution is something that, to me, is mostly linked to FR. I actually don't like the term "resolution"; I prefer "tonal contrast", which I think is a more descriptive term for what I hear. Contrast is what allows me to differentiate between different sounds (similar to how visual contrast is a key part of how our eyes perform object recognition and differentiation), and to me, "resolution" is how easily I can distinguish different instruments and sounds. It's a very fine-grained balance between different frequency ranges (plus a lack of cumulative distortion from the lower registers that might interfere with the presentation of the high frequencies) that produces the correct contrast for good "resolution".

Soundstage is the main thing that cannot be easily replicated via EQ, because the headphone's interaction with your HRTF matters a lot there. Even Dr. Sean Olive, possibly the foremost expert on headphone FR measurements, said in a recent interview with Resolve and Crinacle that FR measurements aren't everything and don't capture the spatial qualities of a headphone. I could not EQ my Sundara to have the same soundstage size as my Shangri-La Jr, even though I could approach its resolution and overall sound. The placement of sounds was something I could not reproduce via EQ; the SGL just sounded more spacious. The physical sizes of the drivers differ between the two, so the wavefront that hits my ears is different, and the interaction of my ears with that wavefront is different as well. The X2 and the Sundara have similar sizes and shapes as I recall (I only briefly owned the X2 years ago), so soundstage differences should be less pronounced between them.

With EQ, I've noticed regions (different for each headphone) where the magnitude of EQ applied doesn't match the magnitude of the perceived change in the sound. I've noticed places where 0.5 dB makes a noticeable difference, and I've also seen cases where boosting a range by 10 dB does nothing to erase a dip in that range (usually with closed-back headphones with undamped earcups). I've also found that even if I can EQ headphone A to sound like headphone B, that's no guarantee I can do the reverse and make B sound like A.

There's also another aspect of sound that can be produced with EQ, but not via standard EQs (graphic or parametric). I've started using dynamic EQ, which boosts/cuts a frequency band only when dynamic swings occur in that band, and that allows me to add the "punch and slam" of macro-dynamics to a headphone. So in a way, dynamics are FR too, but not the FR that you can easily see in a graph; it's sort of "instantaneous FR", if you will. I've heard of "attack measurements" at SBAF and also of impulse response overshoot as metrics for dynamic performance (more overshoot in the impulse response level means more slam), but either way it's not something you can easily see in the standard FR graph, yet it has quite noticeable effects on the sound.
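To make the dynamic EQ idea concrete, here's a rough Python sketch of one way such a band could work: band-pass a region, track its envelope with a fast and a slow time constant, and apply the boost only while the fast envelope runs ahead of the slow one (i.e., during a transient swing). The band, threshold, and time constants are made-up demo values, not taken from any real plugin:

```python
import numpy as np
from scipy.signal import butter, sosfilt, lfilter

def dynamic_band_boost(x, fs, lo=60.0, hi=120.0, boost_db=4.0,
                       fast_ms=5.0, slow_ms=80.0, thresh_db=3.0):
    """Boost the lo..hi band only while its fast envelope runs ahead
    of its slow envelope, i.e. during a dynamic swing in that band."""
    sos = butter(2, [lo, hi], btype="band", fs=fs, output="sos")
    band = sosfilt(sos, x)

    def envelope(sig, ms):
        # One-pole smoother of the rectified signal (time constant in ms).
        a = np.exp(-1.0 / (fs * ms / 1000.0))
        return lfilter([1.0 - a], [1.0, -a], np.abs(sig))

    fast = envelope(band, fast_ms)
    slow = envelope(band, slow_ms)

    # During an attack the fast envelope leads the slow one; gate on that.
    swing_db = 20.0 * np.log10((fast + 1e-12) / (slow + 1e-12))
    extra = np.where(swing_db > thresh_db,
                     10.0 ** (boost_db / 20.0) - 1.0, 0.0)

    # Add the gated, boosted band back on top of the unprocessed signal.
    return x + extra * band
```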

My overall view is that I don't agree with people who say "FR is everything" and mean that you can just look at your usual FR graph and immediately know how a headphone sounds. Even experts like Dr. Olive who specialize in those FR measurements don't hold that view. I take the view that "momentary SPL at the eardrum is nearly everything". I leave open the possibility that part of what we perceive is not eardrum-related; maybe there's an effect perceived by the skin of the inner ear canal, for instance. There's also the fact that our brain doesn't work on SPL but on loudness, and those two do not correlate exactly.
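As a small illustration of that last point, the standard A-weighting curve (a crude stand-in for equal-loudness behavior; the fuller picture is the ISO 226 contours) shows how differently the ear reads the same SPL at different frequencies. A minimal sketch using the IEC 61672 formula:

```python
import math

def a_weight_db(f):
    """A-weighting gain in dB at frequency f (Hz), per IEC 61672."""
    f2 = f * f
    ra = (12194.0 ** 2 * f2 ** 2) / (
        (f2 + 20.6 ** 2)
        * math.sqrt((f2 + 107.7 ** 2) * (f2 + 737.9 ** 2))
        * (f2 + 12194.0 ** 2)
    )
    return 20.0 * math.log10(ra) + 2.0

for f in (50, 100, 500, 1000, 4000, 10000):
    print(f"{f:>6} Hz: {a_weight_db(f):+6.1f} dB")
# At the same SPL, 50 Hz reads roughly 30 dB "quieter" than 1 kHz.
```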

I also like seeing CSD plots, as I've read that at higher frequencies (above roughly 2 kHz, with some effect down to 500 Hz), our brain doesn't maintain phase lock with the incoming sound wave; instead, the perception process is triggered by the waveform envelope. I've heard it explained that outside the phase-locking region, the brain "batches" sound in time and perceives the total amount of sound occurring in each batch as its loudness. Longer sound = louder. My understanding is that if a peak in the FR (above 500 Hz or so) has a long trail in the CSD plot, that peak will sound louder than the plain FR would imply. You've already seen the demonstration of how EQing down a peak also cuts the CSD trail, so the headphone "double-dips" from the EQ: not only is the peak gone, but the amplifying effect of the CSD trail is gone too.

I suspect that may be why I notice unusual effects when EQing: I'm changing the FR at the same frequencies where there's a significant CSD trail, so the effect is either muted or amplified. In my own measurements, it's usually the case that regions with odd EQ interactions also have longer trails in the CSD plot. Psychoacoustics is a really interesting field that I wish I had studied more in college (I studied electrical engineering with an emphasis on computer microarchitecture, so outside of a few audio engineering classes, I never went too deep into that subject).
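For anyone curious how a CSD plot is generated in the first place, here's a bare-bones sketch: slide the analysis window's start time forward through the impulse response and re-take the spectrum at each step, so slowly decaying resonances show up as long trails. Window length and step size are arbitrary demo values:

```python
import numpy as np

def csd(ir, fs, n_slices=30, step_ms=0.1, win_ms=5.0):
    """Return (freqs, times_ms, magnitudes_db) for a simple CSD plot."""
    step = int(fs * step_ms / 1000.0)
    win = int(fs * win_ms / 1000.0)
    freqs = np.fft.rfftfreq(win, 1.0 / fs)
    slices = []
    for k in range(n_slices):
        seg = ir[k * step : k * step + win]
        if len(seg) < win:
            break
        # Fade-out (half-Hann) window to reduce truncation artifacts.
        seg = seg * np.hanning(2 * win)[win:]
        mag = np.abs(np.fft.rfft(seg)) + 1e-12
        slices.append(20.0 * np.log10(mag))
    times_ms = np.arange(len(slices)) * step_ms
    return freqs, times_ms, np.array(slices)
```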

13

ICoeuss OP t1_jdkr40n wrote

Thank you very much for sharing your knowledge. I can't say I'm qualified to fully understand everything you said (and it's even harder because English isn't my first language), so I have a few questions if you don't mind.

What exactly is "cumulative distortion from the lower registers"? And are that and FR the only two things that determine a headphone's resolution/tonal contrast?

I always thought that angled/distant drivers or different-sized drivers have different soundstages because the pinna changes the FR of the sound entering the ear canal differently. So spaciousness and directionality would be mostly (maybe even entirely) determined by the pinna, and since in-ear mics roughly measure the sound arriving at my ear canal entrance (so they capture the interaction between the headphone and my HRTF, pinna included), can we really not replicate soundstage using EQ? If we can fully match the two sounds entering my ear canal, my eardrum should receive identical sounds, since the ear canal will react identically to both, should it not?

If I understand correctly, overshoot in the impulse response level is slam. So if an impulse of larger-than-intended amplitude is created, it will have more slam and maybe sound nicer, but is it the intended sound? I'm not asking whether it's intended by the music producer; I'm asking whether it's closer and more accurate to the digital input.

I've heard from multiple people that most headphones are "minimum phase". I don't know exactly what "minimum phase" means, but what I understood is that in CSD plots, if there's a peak in FR there will be a peak in decay time and vice versa, and they match well enough that the difference stays under the audibility threshold, so they don't matter. Is this correct?

I now strongly agree with the idea that "momentary SPL at the eardrum is nearly everything", as FR graphs don't show how a headphone reacts when more than one frequency is played at a time.

So if a headphone can't keep up when multiple instruments meant to sound like they're coming from different directions play at once, I think the sounds might bleed into each other and hurt both the fine details and the sense of directionality. Could this be what people mean by "separation", "resolution", and "imaging"? I have a feeling this is related to the attack and decay speed of the headphone; is that true?

1

The_D0lph1n t1_jdl4vif wrote

When I mentioned cumulative distortion, I was referring to how the energy present at a specific frequency in a multi-band signal comprises not just the energy at that frequency in the signal itself, but that energy summed with the distortion products of all lower tones. For example, the amplitude at 2 kHz is not just the 2 kHz component of the signal; it also includes energy from the 2nd harmonic distortion of the 1 kHz component, the 3rd harmonic of the 666.67 Hz component, the 4th harmonic of the 500 Hz component, etc. That's what I meant by cumulative: the level at each frequency depends not just on what's in the signal, but on the distortion components of simultaneously played lower frequencies that land at that same frequency.
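Here's a toy arithmetic version of that bookkeeping, with invented tone amplitudes and distortion ratios, and treating the products as adding in amplitude (which ignores phase):

```python
# Tones at 500, 666.67, 1000 and 2000 Hz playing together; how much
# energy lands at 2 kHz once harmonic distortion is counted?
target = 2000.0
tones = {500.0: 1.0, 2000.0 / 3.0: 1.0, 1000.0: 1.0, 2000.0: 1.0}  # amplitudes
hd = {2: 0.01, 3: 0.005, 4: 0.002}  # assumed 2nd/3rd/4th harmonic ratios

total = tones[target]  # the signal's own 2 kHz component
for f, amp in tones.items():
    n = target / f
    if f != target and abs(n - round(n)) < 1e-6 and round(n) in hd:
        total += amp * hd[round(n)]  # distortion product landing at 2 kHz

print(f"amplitude at {target:.0f} Hz: {total:.4f} (clean tone alone: 1.0000)")
```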

In theory, if you could exactly match the waveform seen at the eardrum, then yes, you would hear exactly the same soundstage and imaging. However, I have never been able to properly do this in practice with over-ear headphones. Additionally, I've heard from an acoustic engineer that soundstage is partially influenced by physical factors; if the headphone is touching your ears, it hurts the illusion of soundstage because your brain knows that the sound is coming from right outside of your ear. In general, the brain prioritizes non-auditory inputs. That's why the McGurk Effect exists: when there's a conflict between what your eyes see and your ears hear, you literally hear what your eyes see, even if the actual auditory input doesn't match.

Regarding overshoot, in theory, less overshoot means it's more accurate to the input signal. That's what Dan Clark says to justify the macrodynamic performance of his headphones: other headphones overshoot in their impulse responses, but his do not. Many people think his headphones sound really dead and lifeless as a result, but that's where science meets art. If the music was produced on gear that has more overshoot, it probably has lower dynamic swings in the signal. Should the headphone reproduce the signal as is, or should it try to reproduce the dynamics that were in the original performance but weren't mastered into the signal? There's no single right answer to that question; it's a matter of design philosophy and preference.
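For what it's worth, here's a sketch of one way such overshoot could be quantified (an illustrative metric, not how Dan Clark or anyone else actually specifies it): integrate the impulse response into a step response and compare its peak against where it settles:

```python
import numpy as np

def step_overshoot_pct(ir, settle_frac=0.5):
    """Percent by which the step response peaks above its settled level."""
    step = np.cumsum(ir)
    settled = np.mean(step[int(len(step) * settle_frac):])  # late-time average
    return 100.0 * (np.max(step) - settled) / abs(settled)

# Example: a toy under-damped driver rings past its final value.
fs = 48000
t = np.arange(2048) / fs
ir = np.exp(-t * 30000) * np.cos(2 * np.pi * 5000 * t)
print(f"overshoot = {step_overshoot_pct(ir):.1f}%")  # roughly 23%
```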

Regarding minimum-phase, it means that the phase response is exactly the amount needed to produce the frequency response; there's no excess group delay anywhere in the frequency range. In theory, the CSD plot then shows nothing that isn't already in the FR, and generally weirdness in the CSD plot is reflected in peaks and troughs in the FR graph too. I used to hold strongly to that view, but now I'm not as certain that CSD plots have no value. All physical devices have resonances, and at higher amplitudes those resonances are the first to exhibit serious non-linearities. In general, if I see a long trail in a CSD plot, I take it as a sign that I should be very careful about boosting that region in EQ. Cutting is fine, because the trail disappears with the cut, but if I boost, that trail will become more significant, maybe enough to become audible at high volumes. There is one case, the Koss KSC75, where I actually hear something like ringing in the lower treble (it sounds like a bit of microphone feedback when certain notes play), which is probably a resonance, but it goes away when I EQ down the 5 kHz region.
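If you want to test the minimum-phase claim on an actual measurement, one standard trick is the real-cepstrum method: derive the minimum-phase response implied by the measured magnitude and subtract it from the measured phase; whatever remains is "excess phase" that a truly minimum-phase system wouldn't have. A rough sketch (assumes an even-length impulse response):

```python
import numpy as np

def excess_phase(ir):
    """Measured unwrapped phase minus the minimum phase implied by
    the same magnitude response (real-cepstrum method)."""
    n = len(ir)
    spec = np.fft.fft(ir)
    log_mag = np.log(np.abs(spec) + 1e-12)

    # Fold the real cepstrum to get the minimum-phase complex cepstrum.
    cep = np.fft.ifft(log_mag).real
    fold = np.zeros(n)
    fold[0] = cep[0]
    fold[1 : n // 2] = 2.0 * cep[1 : n // 2]
    fold[n // 2] = cep[n // 2]

    min_phase = np.fft.fft(fold).imag  # imag(log H_min) is the min phase
    measured = np.unwrap(np.angle(spec))
    return measured - min_phase  # ~0 everywhere for a minimum-phase system
```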

On the more practical side, I can only presume that phase is an issue at the design level. Many companies have tech in their headphones to control phase effects. Phase effects tend to result in weird, narrow peaks and dips in the FR. Those features become difficult to EQ out, and if you look at Oratory1990's EQ presets, he often doesn't fix those really narrow irregularities because they stem from phase effects that will sound really strange if filled in via EQ.

For your last question, "speed" isn't a real metric in headphones as far as we can tell. Some headphones certainly sound like they attack/decay faster or slower than others, but there's no concrete measurement that can be exclusively tied to that phenomenon. There's an article by Brent Butterworth at SoundstageSolo! that showed a headphone perceived as slower actually responding faster to the input than a fast-sounding headphone. Given my experience with dynamic EQ letting me add macro-dynamic punch by overdriving large transients, I suspect that separation is part of "micro-dynamics", where small transients are being overdriven by a headphone. But there is no scientifically backed measurement that explains the perception of attack/decay speed in headphones.

2