audioen

audioen t1_je0ioev wrote

Let me show you my squid web proxy. It runs all the content of the Internet through an AI that rewrites it so that everything agrees exactly with what I like. I appreciate your positive and encouraging words, where you are enthusiastic, like so many of us, about the potential and possibilities afforded by new technologies, and are looking forward to near-limitless access to machine labor and assistance in all things. As an optimist, like you, I am sure that it is certain to boost the intelligence of the average member of our species by something like 15 IQ points, if not more.

In all seriousness, though, it is a new world now, and the rules that applied to the old one are fading. You can't usually roll back technology, and this one promises to boost worker productivity in intellectual work by a factor of around 10. The words of caution are: I will not call up that which I cannot put down. However, this cat is out of the bag, well and truly. All we can do now is adapt to it.

Iain M. Banks once wrote in one of his Culture novels something to the effect that in a world where everyone can fake anything, the generally accepted standard for authenticity is a sufficiently high-fidelity real-time recording made by a machine that can ascertain that what it is seeing is real.

Your watermark solution won't work. Outlawing it won't work. Anything can be fake news now. Soon it will come with AI-written articles, AI-generated videos, AI-supplied photographic evidence, and AI chatbots pushing it all around on social media. If that is not a signal for the average person to just disconnect from the madhouse that is media in general, I don't know what is. Go outside and look at the Sun, and feel the breeze -- that is real. Let the machines worry about the future of the world -- it seems they are poised to do that anyway.

5

audioen t1_jdz3nxt wrote

This is basically a fluff piece inserted into the conversation. It worries about machine bias, the model's ability to do things like figure out race by proxy, and the possibility that it uses that knowledge to learn biases assumed to be present in its training data.

To be honest, the network can always be run in reverse. If it lights up a "black" label, or whatever, you can ask it to project back to the regions in the image which contributed most to that label. That is the part it is looking at, in some very real sense. I guess they did that and it lit up a big part of the input, so it is something like a diffuse property that is nevertheless systematic enough for the AI to pick up on.
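A minimal sketch of that kind of back-projection, assuming a PyTorch image classifier; the model, the preprocessed image tensor, and the label index are all hypothetical stand-ins:

```python
import torch

def saliency_map(model, image, label_idx):
    # Gradient saliency: run the network forwards, then "in reverse" via backprop,
    # and see which input pixels the chosen label's score is most sensitive to.
    model.eval()
    image = image.clone().requires_grad_(True)     # track gradients w.r.t. pixels
    score = model(image.unsqueeze(0))[0, label_idx]
    score.backward()
    # Gradient magnitude per pixel ~ how much that region contributed to the label.
    return image.grad.abs().max(dim=0).values      # collapse color channels

# heatmap = saliency_map(model, image, label_idx=3)  # label index purely hypothetical
```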

Or maybe they didn't know they could do this and just randomly stabbed around in the dark. Who knows. As I said, this is a fluff piece that doesn't tell you anything about what these researchers were actually doing, except some image oversaturation tricks, and when those didn't make a dent in the machine's ability to identify race, they were apparently flummoxed.

3

audioen t1_jdz1ol1 wrote

An LLM, wired like this, is not conscious, I would say. It has no ability to recall past experience. It has no ability to evolve, and it always predicts the same output probabilities from the same input. It must go from input straight to output; it can't reserve space to think or refine its answer depending on the complexity of the task. Much of its massive size goes into recalling vast quantities of training text verbatim, though this same ability helps it do the one-shot input-to-output translation which already seems to convince so many. Yet, in some sense, it is ultimately just looking things up from something like a generalized, internalized library that holds most of human knowledge.

I think the next step in LLM technology is to address these shortcomings, and people are already trying to achieve that using various methods. Add tools like calculators and web search so the AI can look up information rather than try to memorize it. Give the AI a prompt structure where it first decomposes the task into subtasks and then completes the main task based on the results of those subtasks. Add self-reflection capabilities where it reads its own answer, evaluates whether it turned out well, detects whether it made a mistake in reasoning or hallucinated part of the response, and then goes back and edits those parts to be correct.
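Here is a rough sketch of the kind of decompose-then-reflect loop I mean; the `ask()` function is a hypothetical stand-in for whatever chat completion API is used:

```python
# Rough sketch of a decompose-then-reflect loop. `ask(prompt)` is a hypothetical
# stand-in for a chat completion call; none of these prompts are from a real system.
def answer_with_reflection(ask, task: str) -> str:
    plan = ask(f"Decompose this task into numbered subtasks:\n{task}")
    partials = [ask(f"Task: {task}\nSubtask to solve now: {step}")
                for step in plan.splitlines() if step.strip()]
    draft = ask(f"Task: {task}\nSubtask results:\n" + "\n".join(partials) +
                "\nWrite the final answer.")
    critique = ask("Check this answer for reasoning mistakes or hallucinated facts, "
                   "and list any problems:\n" + draft)
    return ask(f"Rewrite the answer, fixing these problems:\n{critique}\n\nAnswer:\n{draft}")
```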

Perhaps we will even add the ability to learn from experience somewhere along the line, where the AI runs a training pass at the end of each day on its own outputs and their self-assessed and externally observed quality, or something like that. Because we will be working with LLMs for some time, I think we will create machine consciousness expressed partially or fully in language, where the input and output remain language. Perhaps later we will figure out how the AI can drop even language and mostly use a language module to interface with humans and their library of written material.

2

audioen t1_jdw2frs wrote

The trivial counterargument is that I can write a Python program that says it is conscious while being nothing of the sort, as it is literally just a program that always prints those words.
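Here it is, in its entirety:

```python
# The entire "conscious" program.
print("I am conscious.")
```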

It is too much of a stretch to regard a language model as conscious. It is deterministic -- it always predicts the same probabilities for the next token (word) if it sees the same input. It has no memory except the words already in its context buffer. It has no ability to process more or less as the task demands different amounts of effort; rather, data flows from input to output token probabilities with the exact same amount of work each time. (With the exception that as the input grows, processing does take longer, because the context matrix which holds the input becomes bigger. Still, it is computation flowing through the same steps, accumulating into the same matrices, just applied to progressively more words/tokens sitting in the input buffer.)

However, we can probably design machine consciousness from the building blocks we have. We can give language models a scratch buffer they can use to store data and plan their replies in stages. We can give them access to external memory so they don't have to memorize the contents of Wikipedia; they can just learn language and use something like Google Search like the rest of us.

Language models themselves can stay simple, but systems built from them can display planning, learning from experience via self-reflection on prior performance, long-term memory, and other properties that at least sound like there might be something approximating consciousness involved.

I'm just going to go out and say this: something like GPT-4 is probably like a 200 IQ human when it comes to understanding language. The way we test it shows that it struggles to perform tasks, but this is mostly because of the architecture of going directly from prompt to answer in a single step. The research right now is about adding the ability to plan, edit and refine the AI's replies, sort of like how a human makes multiple passes over their emails, or realizes after writing for a bit that they said something stupid or wrong and goes back to erase the mistake. These are capabilities we do not currently grant our language models. Once we do, their performance will most likely go through the roof.

0

audioen t1_jdujtbl wrote

Yes. Directly predicting the answer from a question in one step is a difficult ask. Decomposing the problem into discrete steps, writing out those steps, and then using the sub-answers to compose the final result is evidently simpler and likely requires less outright memorization and depth in the network. I think it is also how humans work out answers; we can't just go from question to answer unless the question is simple or we have already memorized the answer.

Right now, we are asking the model to basically memorize everything and hoping it generalizes something like cognition or reasoning in the deep layers of the network, and to a degree this happens. But I think it will be easier to engineer a good practical Q&A system by being more intelligent about the way the LLM is used, perhaps by having it recursively query itself, or by using the results of this kind of recursive querying to generate vast synthetic datasets that can train new networks designed to perform some kind of "LLM + scratchpad for temporary results = answer" behavior.

One way to do it today with something like GPT-4 might be to just ask it to write its own prompt. When the model gets the human question, the first prompt actually executed by the AI could be "decompose the user's prompt into simpler, easier-to-evaluate subtasks if necessary, then perform these subtasks, then respond".
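As a sketch, with a hypothetical `complete()` function standing in for a single GPT-4-style completion call:

```python
# Sketch of the "ask it to write its own prompt" idea. `complete(text)` is a
# hypothetical stand-in for one completion call; the wording is illustrative only.
def answer(complete, user_prompt: str) -> str:
    # Step 1: the model rewrites the user's question into its own working prompt.
    working_prompt = complete(
        "Decompose the following user prompt into simpler, easier-to-evaluate "
        "subtasks if necessary, and write out the full prompt you would give "
        "yourself to answer it:\n" + user_prompt)
    # Step 2: the model executes the prompt it just wrote for itself.
    return complete(working_prompt + "\n\nNow perform these subtasks and respond.")
```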

3

audioen t1_jduat9o wrote

https://rentry.org/llama-tard-v2 contains a bunch of the requisite torrent links, though much of the information is disorganized and multipurpose: do this if you want that, use these if you are on Windows but those if on Linux, and so forth. It is a mess.

I have llama.cpp built on my Linux laptop, I got some of these quantized model files, and I have installed the bunch of Python libraries required to run the conversions from the various formats to what llama.cpp can eat (model files whose names start with ggml- and end with .bin). I think it takes some degree of technical expertise right now if you do it by hand, though there are probably prebuilt software packages available by now.

3

audioen t1_jc9mq5x wrote

I am not so negative. Sure, it is something like statistical plagiarism. On the other hand, I have seen it perform clever word-plays that I do not think exist in its training material. Having generalized from many examples, it displays a fluidity of association and capabilities that are quite remarkable for what it is.

Much of what we do today involves working on a computer, consuming digital media and producing digital output. I am going to just claim that all of that is amenable to AI. We were all completely wrong in predicting what programs could do -- it turns out that the most important thing is simply affordance. If it is data that a computer can read, then it can do something with it.

Much of what we think of as intelligence appears to be barely better than the plagiarism you decry. I mean, the work we do is typically repetitive, similar from day to day to what we did before, and about applying known formulas we have been taught or learnt by experience to new problems. I am afraid that human creativity will not turn out to be all that different from machine creativity.

1

audioen t1_j9qrwwm wrote

My point is that tonality-wise, which is the most important factor of the sound, there should be barely any difference between them. There are other factors, like time-domain accuracy, which is about resonances and reflections between head and cup, and harmonic distortion, and there are arguments that none of them matter very much -- far less than the tonality, in any case.

By and large I have thought that all Hifiman magnetoplanars have a largely similar sound. The technology is very similar: just a thin membrane on which a conductive path snakes up and down between magnets, laid out to create a tiny motor force, enough to move the membrane a little according to the input and make sound.

I have the Edition XS, which should be fairly similar to the 400se. You can see how nicely these curves hug each other on the other models as well: https://tpucdn.com/review/hifiman-edition-xs-planar-headphones/images/comparison-1.png

This is what I mean when I say these headsets should all have very similar sound.

I drive my Edition XS, with their 18 ohm impedance, using the $9 Apple USB-C to 3.5 mm dongle from a PC. The lowish impedance of a headset can be a bit of a problem; I'm not sure whether that is an issue with the Fiio K3, as I can't see that stat on the product page. The Apple dongle is very cheap and pretty good, though.

1

audioen t1_j9px3ke wrote

Interestingly, the sound tonality should be almost the same: https://crinacle.com/graphs/headphones/graphtool/?share=IEF_Neutral_Target,Sundara_(2020),HE400se

I do not think the 400se should be a bad choice. There are more measurements of this headset, and IIRC they only tend to show a defect with a bit of narrow-band resonance around 700 Hz. Otherwise the performance looked fine, so I am quite at a loss to explain what the problem is.

1

audioen t1_j9abd2n wrote

I think the short and boring answer is that 8 electrons can arrange into 4 electron pairs, which gives you tetrahedral symmetry with the atom in the center and its bonds extending towards the corners of the tetrahedron. As an example, CH4 has this structure. For many atoms, 4 outer electron pairs seem to be optimal in the sense that atoms can still get close enough to share electrons without bumping into each other, and the pairs can still arrange into structures called orbitals, where they get as far away from each other as possible in a deliberate way that is described by quantum mechanics.

When an atom is floating alone in space, the orbitals are all distinct and form the quantum-mechanically allowed non-overlapping structures such as the s, p and d orbitals, and so forth. When other atoms enter the picture, the situation changes and the orbitals are said to hybridize, which is to say that they are no longer like that but tend to combine, and the picture is now more complicated. As an example, tetrahedral symmetry is the result of 2 distinct orbital types combining to yield a new structure of 4 identical covalent bonds.

First-group elements tend to create only one covalent bond, as their outermost shell is a single spherical structure that can only fit 2 electrons and they already have one themselves. Most other elements seem to prefer 8 electrons, likely because of the sweet spot of maximizing the electromagnetic attraction between electrons and protons while still minimizing the electromagnetic repulsion between the electrons. Then there are the transition metals, which are larger in diameter and create more complicated covalent bond structures, apparently with between 12 and 18 electrons; their electrons also make use of the d orbitals, which tend to be more pointy and narrow in shape, a general trend with all the higher orbitals.

2

audioen t1_j6vqnn6 wrote

I think output impedance issues are the reason for the extra bass. The Truthear x Crinacle Zero doesn't work correctly unless a low output impedance amplifier is provided. This IEM has two drivers with a crossover, and the bass region of the spectrum has the higher impedance, so it draws less current relative to the highs at a given volume level. Devices that struggle to provide enough current will thus tend to recess the mids and highs. You have probably never heard how this IEM is supposed to sound.

The cheapest thing that should drive them just fine is the $9 Apple dongle, as its output impedance is less than 1 ohm. One known issue is that Android phones can't adjust the hardware volume of this dongle, so the maximum volume may be a little on the quiet side for some.

Edit: I dug up the ASR measurement of the impedance: https://www.audiosciencereview.com/forum/index.php?attachments/truthear-x-crinacle-zero-iem-thd-impedance-measurement-png.230795/ -- it varies smoothly but also rises towards the low end. 10 ohms is less than most planars, and planars usually also have a constant impedance.
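As a back-of-the-envelope illustration of the mechanism: the amplifier's output impedance and the IEM form a voltage divider, so regions where the IEM's impedance is higher lose less level. Using round numbers roughly like the Zero's (about 20 ohms in the bass, about 10 ohms elsewhere):

```python
from math import log10

def level_db(z_load, z_out):
    # Voltage divider: fraction of the source voltage that appears across the IEM, in dB.
    return 20 * log10(z_load / (z_load + z_out))

# ~20 ohm in the bass region, ~10 ohm elsewhere (rough numbers for the Zero).
for z_out in (0.9, 5, 10):  # dongle-grade vs. higher output impedances
    boost = level_db(20, z_out) - level_db(10, z_out)
    print(f"output impedance {z_out:>4} ohm -> bass boosted by {boost:.2f} dB")
# 0.9 ohm -> ~0.4 dB, 5 ohm -> ~1.6 dB, 10 ohm -> ~2.5 dB relative bass boost
```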

3

audioen t1_j5sux24 wrote

No ELI5 for this. The impulse response is the convolution kernel that the system applies to its input to produce its output. Convolution is a description of how prior data seen by the system alters the signal being produced right now. It is computed as the integral of the input at past points multiplied by the convolution function's value at the corresponding time points. In sampled digital systems, both the impulse response and the signal are arrays of numbers, and you approximate the integral by placing the impulse response and the input signal side by side, multiplying input and impulse together at their corresponding positions, summing all the values, and writing the sum out as the output sample. Then you move the convolution function forwards along the input by one sample and redo this massive calculation to get the next output sample, and so on; this is the convolution of the input with the impulse response.
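A minimal numpy sketch of exactly that multiply-and-sum procedure, checked against numpy's built-in convolution:

```python
import numpy as np

def convolve_naive(signal, impulse_response):
    # Slide the impulse response along the input: at each output position,
    # multiply past input samples by the impulse response and sum them up.
    n, m = len(signal), len(impulse_response)
    out = np.zeros(n)
    for i in range(n):
        for j in range(m):
            if i - j >= 0:
                out[i] += signal[i - j] * impulse_response[j]
    return out

# Sanity check against numpy's built-in convolution.
x = np.random.randn(256)
h = np.random.randn(32)
assert np.allclose(convolve_naive(x, h), np.convolve(x, h)[:256])
```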

As an example, guitarists may have amplifier cabinet simulators, which are the sampled impulse responses of the speaker in the cabinet (usually open in the back), or possibly sampled room impulse responses, and their effects processor actually convolves their playing with these impulses -- possibly multiple seconds long -- in real time to mimic the sound of these amplifiers and rooms. In practice the processors perform the equivalent of the convolution in the frequency domain because of the massive number of multiplications that long impulses need in the time domain.

The ideal impulse response is an infinitely narrow spike, because it means that the input at that one position is the only thing that affects the output, and the convolution's output is thus the same as the input.

2

audioen t1_j5srj6c wrote

I answered OP directly earlier in a top-level comment, but I want to answer this one directly as well. The impulse response is the same information as the frequency response; it is just the time-domain characterization of the system, whereas the frequency response is the frequency-domain equivalent. We are usually not shown the complex-number nature of the frequency spectrum, where the phase information of the sound is encoded, because we do not hear phase directly and a phase plot doesn't relate to anything we can intuitively understand.

If the speaker membrane moves slowly back to its neutral position after an impulse has excited it, that shows up as a decaying plot, and in the frequency response it would look like a low-pass filter. One way to understand it is that the system isn't fast enough to reproduce a waveform that cycles in and out of phase within that decay region, so if the wavelength is short relative to the impulse's decay time, it cancels with prior versions of itself that are still decaying in the impulse, resulting in little output.

In this case, the impulse drops gradually rather than instantly, suggesting some low-pass filtering effect, but it also overshoots and goes below zero, which suggests to me that it could have a high-pass filtering characteristic, too. For low frequencies, whose wavelength is long relative to the impulse, the negative parts of the impulse subtract from the positive side and reduce the output at low frequencies. The fact that the impulse also returns to the positive side suggests it contains a resonating component, though. If a wavelength matches the impulse's ringing around the zero level, it will be amplified by the impulse response.

Finally, if the impulse response is ideal -- the system's frequency response is perfectly flat and its phase is linear -- the entire impulse response is just a single spike with perfect silence surrounding it forever. The ideal impulse indicates that whatever the signal wants to do, the system can reproduce it without altering it.
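A quick numpy check of that claim, taking the FFT of an ideal single-spike impulse response:

```python
import numpy as np

ideal = np.zeros(512)
ideal[0] = 1.0                      # single spike, silence forever after

H = np.fft.rfft(ideal)              # frequency response = FFT of the impulse response
print(np.allclose(np.abs(H), 1.0))  # perfectly flat magnitude -> True
print(np.allclose(np.angle(H), 0))  # zero phase -> True
```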

6

audioen t1_j5sp8ch wrote

Hmm, okay. I couldn't find any other measurement of this headset, so I will leave it up in the air whether it is representative of the headset, then. Certainly the noisy/spiky character of it should be discarded as a measurement artifact, but the other curve parts going up and down in it might be real. The ideal group delay plot is just a flat line at 0 ms that gradually rises towards the bass due to the inevitable high-pass filtering somewhere in the amplifier or such.

3

audioen t1_j5soc56 wrote

I think you are just wrong, and you do not seem to know what a flat frequency response looks like in an impulse response graph. It is a single one-sample spike, followed by perfect silence, forever. You could say that studio monitor speaker systems strive to reproduce just such an impulse, and the closer it is to a very narrow spike, the better the acoustic system. Even this headset looks like it is not far from a perfect impulse, apart from some ringing afterwards, which suggests it has some resonance peaks, and probably high-pass filtering because the impulse goes below zero -- that would indicate it cancels some of the sound it produced earlier after a time delay, which is how high-pass filters generally work. I don't know, it is really hard to read the frequency response off an impulse response.

It is completely obvious to me that time delays have no impact on the magnitude spectrum. They do have an effect on the complex spectrum, because the phase (and group delay) are different; these are encoded in the ratio of the imaginary and real parts of the complex numbers, which is usually not shown because phase angle is difficult to relate to anything we actually hear. A group delay plot would show the added fixed delay just fine, though.
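A quick numpy check of that claim, delaying an impulse response and comparing the magnitude spectra:

```python
import numpy as np

h = np.random.randn(64)                       # some impulse response
delayed = np.concatenate([np.zeros(16), h])   # same response, 16 samples later
n = 512

H  = np.fft.rfft(h, n)
Hd = np.fft.rfft(delayed, n)
print(np.allclose(np.abs(H), np.abs(Hd)))     # magnitude spectra identical -> True
# The delay shows up only in the phase, as an extra linear phase term,
# i.e. a constant 16-sample group delay.
```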

I can only assure you that, from a mathematical point of view, the impulse response and the complex frequency response can be converted to each other without loss. Whether Fourier analysis is a good model of human hearing is perhaps a thornier question, as this isn't quite how our ears work, but I think it is still plenty useful as a construct.

7

audioen t1_j5r8ty5 wrote

It is actually the same information as the (complex) frequency response. This is just the time-domain representation of it. More specifically, the Fourier transform of the impulse response is the (complex) frequency response, and the inverse Fourier transform of the (complex) frequency response is the impulse response.

The usual smoothed magnitude spectrum has elided the phase information, while it is in some sense still visible in the impulse response, which is thus the more complete record of the system's behavior. That is why I put the word "complex" in parentheses above: the ratio of the imaginary and real parts of the complex number gives the phase angle. In my opinion, the phase should be processed into a group delay plot, which shows how much the system delays sound across the frequency range. I agree that the raw impulse response is not easy to read at all.
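A sketch of what I mean by processing the phase into a group delay plot -- group delay is the negative derivative of the unwrapped phase with respect to angular frequency; the sample rate and delay here are just made-up test values:

```python
import numpy as np

def group_delay_ms(impulse_response, fs):
    # Group delay = negative derivative of the unwrapped phase w.r.t. angular frequency.
    H = np.fft.rfft(impulse_response)
    w = np.fft.rfftfreq(len(impulse_response), d=1 / fs) * 2 * np.pi  # rad/s
    phase = np.unwrap(np.angle(H))
    gd_seconds = -np.diff(phase) / np.diff(w)
    return gd_seconds * 1000.0  # milliseconds, one value per frequency bin step

# A pure 48-sample delay at 48 kHz should read as a flat 1 ms group delay.
fs = 48000
h = np.zeros(4096)
h[48] = 1.0
print(np.allclose(group_delay_ms(h, fs), 1.0))  # True
```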

Group delay is not often an interesting plot for headphones, because they are not supposed to have much group delay to begin with; it is more a property of electronics and digital filters. However, sometimes the group delay plots of headphones have big spikes that show the phase is wildly inconsistent at some frequency, and this is often something like a resonating structure in the headset cup. It would also be apparent in the frequency response as a narrow spike at that location, but frequency response plots are often heavily smoothed, which hides these defects.

As an example, the Hifiman Ananda has something wrong in its group delay plot: https://www.audiosciencereview.com/forum/index.php?attachments/hifiman-ananda-group-delay-measurements-open-back-planar-headphone-png.122835/ -- the plot can be quite noisy, which probably comes partly from how the headset sits on the fixture and how reflections travel inside the headset cup. However, curve fragments going up and down all over, especially at the low frequencies above 200 Hz, just aren't normal. So this is an example of a headset with messed-up phase that indicates a sound quality problem.

27

audioen t1_j23jkj4 wrote

My guess is that differences beyond frequency response are mostly related to harmonic distortion and things like ringing/resonance in the headset cup.

Harmonic distortion makes it hard to tell instruments apart, because pure tones gain extra overtones which can audibly affect the character of the sound if they are above roughly -60 dB relative to the main tone, and multiple tones do not blend cleanly either, but interact and create additional frequencies; that is typically called intermodulation distortion. These extra frequencies can be perceived as added noise, timbre changes, or the like, and may make it hard to tell instruments apart. This is one of the reasons why I look for harmonic distortion graphs, especially those that separate the 2nd, 3rd, 4th and higher harmonics, as physiological measurements of the human auditory system show that masking mostly covers the 2nd harmonic up to some -40 dB level, but barely at all for the higher ones, though there is a general tendency for harmonic distortion below -60 dB to be inaudible no matter where it is.
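To put those thresholds in perspective, a quick conversion between relative amplitude and dB:

```python
from math import log10

def rel_db(harmonic_amplitude, fundamental_amplitude=1.0):
    # Level of a distortion product relative to the main tone, in dB.
    return 20 * log10(harmonic_amplitude / fundamental_amplitude)

print(rel_db(0.01))   # 1 % of the fundamental   -> -40 dB (around the 2nd-harmonic masking level)
print(rel_db(0.001))  # 0.1 % of the fundamental -> -60 dB (generally inaudible)
```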

For over-ear headphones, ringing in the cup is probably visible as narrow peaks in the frequency response at specific frequencies, assuming the graph is not overly smoothed. Ringing is usually also visible as minor kinks in the impedance graph, as the driver behaves somewhat differently at those particular frequencies, and it is likely also seen as abrupt changes in the group delay and phase. So I like to see a nice flat group delay plot up to some 10 kHz, to know that there are no phase or ringing issues to be expected. Above some frequency, depending on the cup's distance to the measurement fixture and earlobes, the measurement device itself adds all sorts of phase issues, and generally speaking the measurement above 10 kHz is not usable. For IEMs, I think the measurement reliability extends far higher, though there will be a peak at the ear canal resonance frequency, where the sound wave bounces between the eardrum and the IEM; the exact frequency depends on insertion depth.

2

audioen t1_iujxe26 wrote

This guy sounds like he is totally wrong, but he is roughly correct in that you need a low output impedance amplifier to drive variable-impedance loads correctly. My reading is that this is what he is talking about, though he doesn't use the right technical terms. He talks about "voltage swings" and "amplifier power", and these are pretty awful ways to describe the problem. Low output impedance is unrelated to having lots of power, or "high" voltage on the output side.

Audio is not actually a demanding application for electronics. Circuits can switch states at gigahertz rates, and audio is a very, very slow signal in comparison, so electronics can trivially follow and reproduce it without ever having to care about real high-frequency concerns such as signal path lengths. Power requirements of headsets are also trivial, usually milliwatts or so before they get so loud that your hearing is at risk, and this is still less than what a standby LED on a random home appliance uses. The voltages needed are similarly low; for a typical lower-impedance headset it is probably less than 0.2 V, which translates to current demands of a few milliamps -- so again, barely anything. Therefore, good enough amplifiers do not need to be large or expensive, but more like fingernail-sized and costing a few bucks to put together. The existence of these external USB sound cards that are often called DAC dongles (but which actually can have dozens of milliwatts of output power) is proof of what I am saying here.

1

audioen t1_iujr419 wrote

Output impedance is one possibility. It becomes a problem when it is frequency dependent itself, or when the headset's impedance is frequency dependent, as it adds an extra resistance that lowers the voltage seen by the headset in the parts of the frequency response where the headset's own impedance is also low. Good gear has roughly 1 ohm output impedance or less, at which point it is considered to no longer matter.

1

audioen t1_iu31rpn wrote

I would actually get this $10 dongle just in case anyway; it can serve as a reference to check whether things are okay. You don't have to use it if you don't think it improves anything, but it is a trivial investment to answer a question you might not be able to answer otherwise. Based on the fact that you have the 560S, with a pretty high measured impedance of 133 to 224 ohms (nominally given as 120 ohms), I think any device with minimal attention to quality ought to be able to drive this headset correctly, but perhaps you will want to buy other headsets one beautiful day.

The Apple dongle has been measured and is known to be an accurate DAC+amp combo, with 0.5 V (EU) or 1 V (US) maximum output (so it can't drive high-impedance, low-sensitivity devices that typically want higher voltages than that), but it has a very low 0.9 ohm output impedance, so it avoids the frequency response errors that occur when a high output impedance drives a load whose impedance varies with sound frequency.

The error comes from a voltage drop in the output stage: the voltage falls below what is expected because the headset draws more current than the output stage can supply without losing voltage across its internal resistance. There is a rule of thumb that says the output impedance should be at most 1/8th of the lowest impedance of the headset, e.g. if the headset is 120 ohms, then you should have at most a 15 ohm output impedance (which I daresay is an easy requirement, though some mobos have been measured to have more than that). This should limit errors related to output impedance to a level that is probably not audible -- a small fraction of a dB.
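A back-of-the-envelope check of that rule of thumb, treating the output impedance and headset as a voltage divider and using the 560S impedance swing mentioned above:

```python
from math import log10

def drop_db(z_load, z_out):
    # Voltage-divider loss between amplifier and headset, in dB.
    return 20 * log10(z_load / (z_load + z_out))

z_out = 15                   # the 1/8-rule limit for a nominally 120 ohm headset
lo, hi = 133, 224            # measured impedance swing of the 560S
error = drop_db(hi, z_out) - drop_db(lo, z_out)
print(f"frequency response error: {error:.2f} dB")   # ~0.37 dB
```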

There are barely any headsets with an impedance of less than 10 ohms. I happen to have the Crinacle x Truthear Zero, which actually has a 10-20 ohm impedance: it has an internal crossover where the bass side sits around 20 ohms and the rest around 10 ohms, and it benefits from a low output impedance amplifier or the bass increases too much. So I use the Apple USB-C dongle for it, because otherwise it becomes pretty bass-heavy on one laptop, though it is actually fine on another.

1

audioen t1_iu2zy5j wrote

My suspicion is that all of these changes -- while welcome -- actually amount to close to imperceptible marginal improvements. It is not like existing technology, even down to dynamic drivers from decades ago, couldn't do almost all of the same things, and almost as well. Hell, I bought my first pair of HD-600s 20 years ago and people still use them and think they sound fine.

Magnetoplanar technology is probably the point where I personally stop caring about sound quality improvements as far as the drivers go, because measurements of such systems indicate that they are already practically perfect, whereas dynamic drivers tend to struggle with distortion at the low end if they have to play at a flat level, let alone at the boosted Harman target level.

To me, it is a bit like the 80s/90s, when PC sound cards reached 44.1 kHz and 16 bits and in practice hit a level which largely holds today, beyond which only very slight improvement is possible in practice.

I think it is okay for technology to peak. It all reaches level of "good enough" sooner or later.

8

audioen t1_itkhotf wrote

Typically, you need to know at least two things to evaluate power. The impedance controls the overall voltage level needed to put enough milliwatts into the headset, via P = U²/R, and you are probably best off looking up actual impedance measurements rather than using the single nominal value given by the manufacturer. It is also of interest whether the impedance varies by frequency, and if so, by how much. The other thing is the sensitivity: what sound level is achieved per milliwatt of power, or per 100 mV of driving voltage peak to peak, or however they prefer to express it.

Power requirements of headsets are generally low in the context of the systems tasked with driving them. From what I have seen, the range is somewhere from less than 1 mW to about 30 mW for something like 96 dB SPL, which should be loud enough to cause hearing damage if prolonged. I am not sure there is a use case for much more loudness than this, and while watts multiply very quickly as the target SPL goes up, given that the scale is logarithmic, so do driver excursion and thus distortion. Many headsets are operating close to their maximum performance at 96 dB, and usually only magnetoplanars and electrostatics can handle more SPL without distortion going out of control.

If the impedance of the headset is low, it should work perfectly with the $10 Apple USB-C DAC dongle, as this particular one provides a low-impedance output at a mere 0.9 ohm, which is exceptionally good, and it has been measured to be capable of supplying 33 mW into a 32-ohm load before distortion. (The issue with high output impedance is that it adds a frequency-varying component which distorts the frequency response if the headset's impedance also varies as a function of frequency.)

The other issue is whether it gets loud enough for you. This is a question of providing a high enough voltage to drive the headset. It can be a problem with high-impedance headsets, as they tend to require a much larger driving voltage: their large internal resistance keeps the power very low until something like a full volt is reached. For instance, if your headset had a 600 ohm impedance and it took a milliwatt of power to drive it loudly, you would need to provide something closer to 0.8 V (P = U²/R = 0.8 V * 0.8 V / 600 ohm = about 1 mW), whereas most low-impedance headsets are already very loud at mere 0.1-0.2 V output voltages.
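The same arithmetic as a tiny sketch, solving P = U²/R for the voltage:

```python
from math import sqrt

def voltage_for_power(power_w, impedance_ohm):
    # P = U^2 / R  ->  U = sqrt(P * R)
    return sqrt(power_w * impedance_ohm)

# 1 mW into a 600 ohm headset vs. a typical 32 ohm one.
print(f"{voltage_for_power(0.001, 600):.2f} V")   # ~0.77 V
print(f"{voltage_for_power(0.001, 32):.2f} V")    # ~0.18 V
```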

I personally go by audiosciencereview's measurements, as they typically add each headset to the list they put in every review, showing the millivolts needed to reach the 96 dB benchmark level. Some low-impedance headsets, like the Hifiman HE6SE v2 with its 64 ohm impedance, actually require over 1 V before they get loud, so it can be said that some planars are very inefficient, but even that is relative to other headsets, as it is still a few dozen milliwatts -- similar to the power requirement of a single standby LED you see all over the place. Most are not that bad. So the summary is probably that you can simply try driving them with the Apple USB-C DAC dongle first, and if it doesn't get loud enough, then get something else with more output voltage.

5

audioen t1_it7d0z6 wrote

I would add a minor quibble about the impedance part: it is only one half of the equation. After all, impedance is the R in the familiar P = UI, U = RI equations, and one consequence is P = U²/R, the square of the voltage divided by the impedance. The higher the impedance, the larger the voltage needed to produce a certain power level, but since power grows with the square of the voltage, the required voltage only grows with the square root of the impedance, so it will not be all that much in the end.

Depending on the design, that power is then translated to acoustic energy at some efficiency or other, and manufacturers relate that either in terms of voltage or milliwatts needed to achieve a certain SPL. From what I can see, most headsets should be deafeningly loud at very low power figures, to the point that even 1 mW is more than your ears can take without suffering damage. Milliwatts of power is such an astonishingly low figure that I think it is a crime that not literally everything has enough power behind it to drive headsets well. You don't need a separate amplifier to make milliwatts of power; pretty much anything achieves that. I confess I also do not quite understand how a headset can have multiple hundreds of ohms of impedance. What is it doing? Do they put big resistors there? A voice coil really shouldn't present that much resistance.

In any case, I would personally steer away from headsets that require an amp. The Apple USB-C DAC is a great example of a USB sound card that can deliver the few milliwatts more or less perfectly, and it costs all of $10. Hopefully, in the future, the DACs on all devices will be decent enough, and headsets can simply be driven by any random thing even if it isn't a real amplifier, because there is clearly a solution space where the problem is tiny -- just keep the impedance anywhere reasonable, say somewhere in the 15-50 ohm range, and produce enough SPL per milliwatt, which is easily achieved by many designs already, and there really is no sound quality compromise here as far as I can tell.

2

audioen t1_it79rjp wrote

Well, it may be worth highlighting the fact that impedance varies by sound frequency when there is significant inductance and capacitance involved. Most headphone measurements come with an impedance graph as a function of frequency, so I kind of struggle to understand your point.

1

audioen t1_it6axyb wrote

Appreciate headsets for what they are: they have an edge in producing much less distortion, their low end can reach close to 20 Hz, and there is no room reverberation to produce peaks and valleys in the frequency response from booms and cancellations. Headsets simply are inherently very precise, but they are also unnatural compared to actual sound that comes from a point in the room.

Ultimately, it raises the question of how to get the best of both worlds. You want to control the reverb, but that is actually pretty costly, as it can involve placing multiple subwoofers, adding bass traps, placing acoustic foam panels and diffusers, doing room equalization with a microphone, and so forth. Depending on your living setup, it can be unacceptable to design everything from an acoustics-first perspective. In fact, it would be best to have a dedicated room for music, whose geometry can be set up for great acoustics and which doesn't have to double as living space. So add the cost of such a thing to the purchase price of a great speaker system, I guess.

The barber shop DSP listening experience might be a decent middle ground here, where you'd have a fake, artificially treated room with directional sound constructed by computer simulation of a great listening room. Perhaps someone can come up with a transducer you wear under your clothes that thumps your chest in tune with the bass to get the tactile feel.

2