Ivor wrote: "blind testing is the last refuge of the agenda-driven scoundrel."
...and my skepticism on blind testing is well aired here, but I came across this yesterday. It takes a few paragraphs to get going...
And here's John Atkinson, Editor of Stereophile, from his highly recommended article Blind Listening:
But when you have taken part in a number of these blind tests and experienced how two amplifiers you know from personal experience to sound extremely different can still fail to be identified under blind conditions, then perhaps an alternative hypothesis is called for: that the very procedure of a blind listening test can conceal small but real subjective differences. Having taken part in quite a number of such blind tests, I have become convinced of the truth in this hypothesis. Over 10 years ago, for example, I failed to distinguish a Quad 405 from a Naim NAP250 or a TVA tube amplifier in such a blind test organized by Martin Colloms. Convinced by these results of the validity in the Consumer Reports philosophy, I consequently sold my exotic and expensive Lecson power amplifier with which I had been very happy and bought a much cheaper Quad 405—the biggest mistake of my audiophile career!
Some amplifiers which cannot be distinguished reliably under formal blind conditions do not sound similar over lengthy listening in more familiar and relaxed circumstances.
Thanks for the article link, Ivor, although I've read most of it before.
I agree with you & I suspect that there are many things wrong with audio blind testing.
I've repeated this request in many places: let's test blind tests themselves. Let's take something that we know sounds different, insert it invisibly into the blind test & see whether people can differentiate it, or whether the blind test procedure itself kills the ability to differentiate known differences. All the "objectivists" I suggest this to either argue against it or just ignore it, which proves to me that they are agenda-driven & have no interest in finding out how sensitive their gold-standard test actually is.
Re: Quad-Lampizator7 evening
Posted: Thu Mar 12, 2015 11:19 am
by DaveF
Don't think I'd be completely against blind testing though, rather I'd see it as just another tool to use alongside sighted testing. I'd agree with the test above where something known to be audibly different gets inserted unknowingly into the chain. That's a good way to see how good the person's hearing is.
(of course, what is actually audibly different is another argument)
But I'd turn the above on its head with the following example:
Two software players, MQN and Jplay. The person believes that MQN is the better player in a system that they know really well and are comfortable with. They trust their ears and are assured that this is the only way to properly judge a system. Let's say that the same piece is played several times through both players and their conclusion each time is a 100% certainty that MQN is the better player.
Now let's put that faith in the ears to the test just a little bit. The same test is repeated in the same order but on the sly the person in control switches to Foobar and plays it constantly while making the audience think that they are going through the same MQN/Jplay pattern as before. No need for blankets or leaving the room, level matching has already been done at the start.
Surely nothing too controversial there? Would it not make people reevaluate if they got it horribly wrong by picking the same winner again? I've no idea if the same winner would come out. I've never done anything like the above before but it's a simple test.
One person with the right software skills could do something such as write a batch or Python script to randomly select which player does the playback after you've chosen your tracks. That's something you could do on your own. The script would keep a record of which player was chosen and you compare afterwards. I'm assuming of course that both players could be run in command line mode.
No need to be plugging out cables or switching valve amps on and off or waiting for them to warm back up etc. I think the above is a simple test but the key to it is that it's done in a system that the person has known REALLY well for a long time. Too often, formal blind tests are done in a room or system that a person isn't so familiar with, or doesn't know at all.
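For anyone who wants to try the script idea, here is a rough Python sketch of it, not a finished tool. The two command lines in PLAYERS are made-up placeholders (I don't know what command-line options MQN or Jplay actually accept, if any), and the function names are just ones picked for illustration. It picks a player at random for each track, keeps a hidden log, and lets you score your guesses afterwards.

```python
import csv
import random
import subprocess

# Hypothetical command lines: placeholders only, NOT real player invocations.
# Swap in whatever your players actually accept on the command line.
PLAYERS = {
    "A": ["player_a_cli", "{track}"],
    "B": ["player_b_cli", "{track}"],
}


def play_with(command, track):
    """Launch one player on one track and wait for it to finish."""
    subprocess.run([part.format(track=track) for part in command], check=True)


def run_session(tracks, players, play=play_with, seed=None):
    """Pick a player at random for each track and return the hidden log."""
    rng = random.Random(seed)
    log = []
    for number, track in enumerate(tracks, start=1):
        name = rng.choice(sorted(players))  # random pick, unknown to the listener
        play(players[name], track)
        log.append({"trial": number, "track": track, "player": name})
    return log


def save_log(log, path):
    """Write the hidden log to a CSV file to open only after listening."""
    with open(path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=["trial", "track", "player"])
        writer.writeheader()
        writer.writerows(log)


def score(log, guesses):
    """Count how many guesses match the player that actually played."""
    return sum(1 for entry, guess in zip(log, guesses) if entry["player"] == guess)
```

You would write down your guess after each track, then compare the list against the log with score(). Passing a do-nothing function as play gives a dry run without launching anything.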
Re: Quad-Lampizator7 evening
Posted: Thu Mar 12, 2015 12:03 pm
by Diapason
Dave, you know I agree with you but for the purposes of this discussion:
Re: Quad-Lampizator7 evening
Posted: Thu Mar 12, 2015 12:17 pm
by jkeny
DaveF wrote: Don't think I'd be completely against blind testing though, rather I'd see it as just another tool to use alongside sighted testing. I'd agree with the test above where something known to be audibly different gets inserted unknowingly into the chain. That's a good way to see how good the person's hearing is.
(of course, what is actually audibly different is another argument)
Well, I actually think it's a good way to test the whole test & not just someone's hearing. I would like to see this included with every blind test as it would give us a handle on how reliable the results are. For instance, let's say we alter the test track by 1dB (0.5dB is said to be the noticeable level) & use it in the way I described: insert it randomly in place of the correct test track. If this isn't noticed as different, then what does it say about the results from the rest of the test? In other words, when a null result is returned (as they mostly are), should we be surprised? Is it not to be expected, because this internal control has just shown that the test/tester is not sensitive enough to reveal 1dB differences, so how could it be sensitive enough to reveal other small differences?
BTW, in blind tests, these are called false negatives - where the subject/listener doesn't hear a known difference. I would be interested in seeing the percentage of false negatives from such blind tests & would bet that it is high. This would put a nail in the coffin of the types of blind tests normally run & the use by many of null results from these tests as "evidence" that there is no difference between X audio devices (for X substitute whatever audio device you like DAC, amp, etc).
Of course just using a volume difference is a bit restrictive & some other known differences should be used as well as volume for the test control.
Dave, why not try this test? You could record your test tracks onto CD with some randomly increased in volume. Set your CD to random play & hide the track display screen. It would give you a handle on how useful the test is as a tool for revealing differences. Of course just using a volume difference could possibly skew the test, as knowing this means that you might be looking out for a volume difference (although a small 1dB vol increase is usually perceived as better quality rather than louder).
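To put a number on the internal control described above, something like the following Python sketch would do. It is only an illustration under my own assumptions: the function names and the "plain"/"control" labels are invented here, and it assumes the listener simply answers yes or no to "did that sound different?" on each trial. It converts a dB boost to a linear gain, shuffles control trials in among the plain ones, and reports the share of controls that went unnoticed (the false negative rate).

```python
import random


def db_to_gain(db):
    """Convert a level change in dB to a linear amplitude factor."""
    return 10 ** (db / 20)


def build_trials(n_plain, n_control, seed=None):
    """Return a shuffled list of 'plain' and 'control' trial labels."""
    rng = random.Random(seed)
    trials = ["plain"] * n_plain + ["control"] * n_control
    rng.shuffle(trials)
    return trials


def false_negative_rate(trials, heard_difference):
    """Share of control trials where the listener reported no difference."""
    control_answers = [
        heard for kind, heard in zip(trials, heard_difference) if kind == "control"
    ]
    if not control_answers:
        raise ValueError("no control trials in this session")
    misses = sum(1 for heard in control_answers if not heard)
    return misses / len(control_answers)
```

A 1dB boost works out to an amplitude factor of roughly 1.12. If the false negative rate on the controls comes out high, a null result on the real comparison in the same session tells you very little.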
Re: Quad-Lampizator7 evening
Posted: Thu Mar 12, 2015 12:25 pm
by jkeny
Diapason wrote:Dave, you know I agree with you but for the purposes of this discussion:
I don't wish to upset anybody or get personal so, guys, don't take this the wrong way. It's just my observation, but I find it interesting that you two guys are also the ones most disappointed with the sound of your systems. Dave, you expressed such a while ago & Diapason, I've seen your posts on WhatsbestForum stating this. I also thought both of you veered more towards analytical assessment of audio systems?
I know this might be considered to be drawing wrong conclusions from just two examples but it makes me wonder - does it not make you guys wonder?
Re: Quad-Lampizator7 evening
Posted: Thu Mar 12, 2015 12:33 pm
by DaveF
jkeny wrote:
I know this might be considered to be drawing wrong conclusions from just two examples but it makes me wonder - does it not make you guys wonder?
Well, my reasons are quite straightforward: One of my Quads has a problem in the left channel as documented elsewhere. The Jadis also seemed to have far less resolution than the Devialet. The Quads and Dev were stunning but I wanted the Jadis and problems arose soon afterwards. A decision I kinda regret since.
Previous system was a horrible mismatch between an amp (ATM2) and a compromised pair of Kharmas. Unlucky there.
Not sure how this relates to the logic behind the blind tests we've put forward above.
Re: Quad-Lampizator7 evening
Posted: Thu Mar 12, 2015 12:57 pm
by Diapason
I had decided I wasn't going to comment on any of this, but John's asked a question so I'll try to answer honestly.
I know I was a lot less "happy" with this hobby when I held the view that cables make no difference, for example. This was partly because there was nothing to tinker with that wasn't crazily expensive, partly because it's a difficult position to maintain among other audiophiles (or at least it was) and partly because I started to wonder whether I was cutting off my nose to spite my face. So eventually I gave in, bought and swapped and upgraded pretty expensive cables, and felt like I was properly part of the club. Then one day much later I put in other, much cheaper cables and while I convinced myself I could hear a difference, the difference was so slight that I couldn't even admit it to myself. Do I think cables make a difference? Absolutely. Are they worth it? That's not for me to say, it's your money. Would I have spent several thousand Euro on them if I couldn't pick them out in a blind test at home? Probably not. Do I have the courage of my convictions now and will I sell them? Still, probably not. However, it should be stressed that all of this was about the hobbyist aspect, not about the music, barely even about the sound. It was the urge to tinker with the system that was the driver.
Simply put, I now believe that I've spent my decidedly finite budget badly, because I believed all the hype about cables, and system supports, and digital front-ends, and spent a lot of money accordingly. The problem is that in order of things that make the most difference and can yield the most improvements, I've got it arseways. I'm not saying these things make no difference, in fact for the most part I think they probably do, but they're not the things I needed to spend money on, and now I've none left to really fix what's necessary. I'm not content with the sound of my hifi, that's correct, but that's not from lack of effort to find the right gear. I've tested a LOT of stuff at home, I'm not one of those hard-headed assholes who won't listen to anything but will dismiss it anyway, in fact I'm the exact opposite. The problems I have at home are because I know that the equipment isn't giving its best, and I know this because I've heard the same equipment elsewhere.
I'm pro blind-testing (or some semblance of it, not the scientifically rigorous stuff that gets you published in journals, just a personal thing) because I KNOW how much we can persuade ourselves something is better, and I KNOW how massive the effect of knowing what's playing is. It's happened in hifi, it's happened in wine tasting, it's well-reported everywhere as a known phenomenon, and I've experienced it so many times myself I can't ignore it. The effect is real, and occasionally I'd like to see that acknowledged. Is blind testing the answer? I don't know, let's talk about that, but let's also acknowledge that just because we "heard something with our own ears" it's not the final word.
As I said to Dave elsewhere I'd genuinely love to have a proper discussion about blind testing, and I'm open to the idea that maybe it doesn't reveal everything it should, that the test itself is flawed for some reason, I'd find that an interesting question. I just wish we could come at it from a point of intellectual honesty rather than religious fervour, and historically online (not necessarily here) that's not possible. Ultimately, positions on both the objectivist and subjectivist side are too entrenched for me to be bothered with the discussion any more, it's like watching Dawkins arguing with the Pope.
Edit: Sweet merciful crap I've written another epistle. And this is me "not engaging"...
Re: Quad-Lampizator7 evening
Posted: Thu Mar 12, 2015 2:16 pm
by tony
DaveF wrote:But I'd turn the above on its head with the following example:
Two software players, MQN and Jplay. The person believes that MQN is the better player in a system that they know really well and are comfortable with. They trust their ears and are assured that this is the only way to properly judge a system. Let's say that the same piece is played several times through both players and their conclusion each time is a 100% certainty that MQN is the better player.
Now let's put that faith in the ears to the test just a little bit. The same test is repeated in the same order but on the sly the person in control switches to Foobar and plays it constantly while making the audience think that they are going through the same MQN/Jplay pattern as before. No need for blankets or leaving the room, level matching has already been done at the start.
Surely nothing too controversial there? Would it not make people reevaluate if they got it horribly wrong by picking the same winner again? I've no idea if the same winner would come out. I've never done anything like the above before but it's a simple test.
One person with the right software skills could do something such as write a batch or Python script to randomly select which player does the playback after you've chosen your tracks. That's something you could do on your own. The script would keep a record of which player was chosen and you compare afterwards. I'm assuming of course that both players could be run in command line mode.
No need to be plugging out cables or switching valve amps on and off or waiting for them to warm back up etc. I think the above is a simple test but the key to it is that it's done in a system that the person has known REALLY well for a long time. Too often, formal blind tests are done in a room or system that a person isn't so familiar with, or doesn't know at all.
The answer is: off you go, Dave, and set up the test. I have seen threads, not really like this in spirit but very heated, and never once have I come across somebody in the blind test camp who went off and set it up. There was a bake-off thing done on PFM and to be fair that guy was trying fairly to set up a rigorous test, but it ain't that easy.
BTW should this not be farmed out into a new thread?
Re: Quad-Lampizator7 evening
Posted: Thu Mar 12, 2015 2:59 pm
by Ivor
As I said a few posts back the sessions where we meet up and swap out hardware/software have served us quite well. Yes some findings are inconclusive but that's the world we actually live in! Even in the strictest blind tests the main variable is going to be the listener. Preconceptions, tastes and even one's mood play a part. If one person expresses an opinion then everybody's is "tainted". We could gag and blindfold the listeners, I'm pretty sure some would enjoy that. But again their mind might wander!
Re: Quad-Lampizator7 evening
Posted: Thu Mar 12, 2015 3:03 pm
by jkeny
Thanks for the replies, guys - it helps me better understand where you are coming from.
Here's my take on audio blind testing: it's so difficult to do right, because we are dealing with auditory perception, that it should only be done in a scientifically rigorous way by those trained in the cognitive sciences.
The usual blind tests are so flawed that I believe they're only good for gross differences which are easily identifiable sighted. For most of us a blind test is used as a personal sanity check (did I really hear that difference?) but I honestly feel that if you have to ask this then the difference probably isn't worth being bothered about.
Long term listening seems to me to be the way to evaluate audio for many reasons not least because it averages out the many variables that can affect our auditory perception & we are listening to the system naturally, not in some enforced, analytical way.
I think I've given my thoughts on auditory perception before?
But what I see as the common mistakes made by many are twofold:
- to treat hearing as a kind of instrument that can measure & compare
- to treat it as a linear system that converts vibrations at the eardrum into an image of what we hear
Auditory perception is far more complicated & far more interesting than that & has only started to be teased out in the last 10 years. Essentially what we are faced with are vibrations arriving at the ears from all directions from many different objects, all intermingled. The job of our auditory perception is to efficiently (with a fair amount of accuracy) generate a best guess analysis of the auditory signals: group & match them to real-world objects & keep them grouped together through movement & changes. This is a phenomenal feat as there is not enough information in the vibrations at the eardrum to actually do this & the ear shape & ear mechanism introduce their own distortions of this signal. So this is a pattern matching, prediction machine that is continually guessing the best fit model of these signals. As I said, there usually isn't sufficient information in the signals to unequivocally model the auditory environment so it uses as much extra information as it can to come up with the best analysis: visual cues, internally stored auditory models, knowledge of how audio works in the real world.
Edit: I saw a number of posts were made while I was composing this & the suggestion by Tony to do this in a separate thread is not a bad one if there was interest in continuing further?
PPS: I see that this thread was already split off - good one!!