# Thoughts on Double Blind Testing



## Otto (May 18, 2006)

Hi there,

A few friends and I have been talking about doing a double blind test of amps. I have never done anything like this before, and I would like opinions on good methods to ensure meaningful results.

FWIW, I think we are going to use the Behringer EP2500 ($300) against a BAT 5-channel amp ($5k).

Here are some of the things that I have in mind:



1) Level matching using an RS SPL meter. Is using a target of 75 dB acceptable? Is using the gain knobs on the EP2500 acceptable? That way we don't have to mess with the volume knob on the preamp at all.

2) Randomized order of tests.

3) Double blind -- neither proctor nor subjects know what is being tested. Sounds like one of the wives (a "disinterested party" if there ever was one) will have to be doing the switching.

4) Using banana plugs to make switching easy. The BAT will take bananas, but I might have to make a little adapter for the Behringer. Any complaints?

5) A/B/X? Where the X is predetermined. The subjects can then listen to A (knowing that it's A), and then B (knowing that it's B). They will then be presented with X. The question, then, is X=A or is X=B? I think I would allow the subjects to listen to A and B back and forth a few times if they wish.

6) We could also just use "different/same" testing. We will play one clip two times. The choice to be made here is "are we using the same amp or not?"

7) Using whatever music they wish, but it should be chosen beforehand. Maybe a total of fewer than 10 tracks. We can listen at length with knowledge of which amp is which prior to the test starting.

So, there are some thoughts just off the top of my head. I'm pretty sure it'll be inevitable that, whatever the results, they will be called into question one way or another. Please help me make a solid protocol to try to minimize this.

Thanks!


----------



## drf (Oct 22, 2006)

You will need to be able to switch fairly quickly; I am told the average human brain has approx. a 7 sec memory for sound quality/tone. Even if this isn't true, the testee needs to be able to compare without too much time in between.

I would set it up so that the only adjustments made when switching are on the units under test. Pre-amps, crossovers and all other gear must remain exactly the same for both units.

People who argue with the results of dbt's usually have just spent mega$$$ on the brand that was found to be inferior. :demon:


----------



## terry j (Jul 31, 2006)

Otto looking very much forward to hearing your results. 

Agree with drf, only really likely to be 'picked' when the results don't agree with someone's preconceived idea. Of EITHER persuasion!

Anyway, my take on your questions

1) yes, that is the way I'd do it. Not sure if any particular dB level is more important than another, as long as no amp is driven to clipping, which of course will be audible.

2) can't see why that is important, but number three ideally should be that way. 

Four and five, yes and yes. Any connections to the Beh. to get bananas should be solid; apart from that I can't see any problem. Bananas will give a quicker swap over time, essential I would have thought.

How will you hook it all up?? What I mean is, can you simply switch an output on the preamp and leave the different amps on different outputs, or will you manually unplug one amp from the pre and then plug in the next?? The only real prob with that is the extra time involved between listens.

As drf said, the sonic memory effect could come into play, ie too long and then an accurate comparison can't be made. For what it's worth, my take on that is that, if there is 'so little actual difference' between the amps that it is masked by psychoacoustic properties such as that, is the extra expense worth it?? Ie, no one would argue that differences in the way two different speakers sound are so minor that they couldn't be picked after 30 seconds???

Anyway, that is not really answering your questions so I'll leave it with that, other than to say I look forward to hearing your results, whatever they may be!!

lots of love

terry

(ps drf, will be down in your neck of the woods soon! Hope your HT is coming along nicely and that the weather gods will be on my side too and Melb will be a pleasant place to visit)


----------



## Otto (May 18, 2006)

drf said:


> You will need to be able to switch fairly quickly
> 
> I would set it up so that the only adjustments made when switching are on the units under test.


Yep, the only thing that will change is the amplifiers under test. Since both amps take balanced inputs, and banana plugs (one way or another), we should just be able to swap those cables and be ready to go. I'm willing to hot plug the amps if the other amp owner doesn't mind (mine is the Behringer, his is the BAT). If we're lucky we'll be able to swap in 5 or 10 seconds.




> People who argue with the results of dbt's usually have just spent mega$$$ on the brand that was found to be inferior. :demon:


Yeah, I asked him if he'd sell his BAT and buy a cheapy if he can't tell the difference, and he said "no". However, if we find that we really can't tell significant differences between the two, it might change our thinking for future amp purposes. FWIW, I use the Behringer to run my IB, and I use a Sunfire Cinema Grand to run my 5-ch setup.



terry j said:


> Anyway, my take on your questions
> 
> 1) yes, that is the way I'd do it. Not sure if any particular dB level is more important than another, as long as no amp is driven to clipping, which of course will be audible.


Yeah, I was thinking about that this evening. I think we'll just have to find a reasonable listening level while listening to music, and then calibrate each amp to the same level. And you're right; no amps will be clipping.



> 2) can't see why that is important


Oh, I was just thinking that whatever order the "X" is presented should be randomized. We don't want to do anything that would give someone a pattern to key off of.
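
One way to get an order with no pattern is to let a random number generator pick X for every trial before the session and hand the printed sheet to whoever does the switching. The sketch below is only an illustration: the function name, the seed argument and the 16-trial session length are assumptions, not part of any agreed protocol. Each trial is an independent coin flip, so a run of several identical answers is possible and expected.

```python
import random


def make_abx_schedule(n_trials, seed=None):
    """Return the hidden identity of X ('A' or 'B') for each trial.

    Every trial is an independent coin flip, so listeners cannot infer
    anything about the next trial from the previous ones.
    """
    # The optional seed only exists so the same sheet can be regenerated later.
    rng = random.Random(seed)
    return [rng.choice("AB") for _ in range(n_trials)]


if __name__ == "__main__":
    # 16 trials is an arbitrary, hypothetical session length.
    for trial, x in enumerate(make_abx_schedule(16), start=1):
        print(f"Trial {trial:2d}: X = {x}")
```

The schedule should stay with the person doing the swaps and be compared against the listeners' answer sheets only after all trials are finished.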



> How will you hook it all up?? What I mean is, can you simply switch an output on the preamp and leave the different amps on different outputs, or will you manually unplug one amp from the pre and then plug in the next?? The only real prob with that is the extra time involved between listens.


Not trying to be silly here, but it will be Source->Preamp->Amp->Speakers. The only thing that will change is the amp. We will swap inputs to the amp and speaker connections. I'm thinking that one of us involved in the testing (i.e., _not_ one of the wives, unfortunately) will have to do the swapping. That will ensure the quickest and most accurate swaps. Also, I just had an idea that we snap a quick photo of each set of connections so that there will be no question that things might have been improperly connected.

Also, depending on location of the test, the associated equipment will be:

- Source = PC
- Preamp (DACs) = Sunfire Theater Grand III
- Speakers = Thiel CS2.3

or

- Source = PC or Denon DVD-1910
- Preamp (DACs) = Outlaw 990
- Speakers = Vandersteen 3A Signature

I don't think that we will be using any subs.

Thanks for the comments guys! I'm not sure when this is going to come together, but the group that's interested will all be having some time off during the holidays, so perhaps then.


----------



## daniel (Dec 31, 2006)

Don't waste your time with this!

Double blind tests have all concluded the same thing: listeners are unable to differentiate anything.
That conclusion may lead you to some disappointment, as it did for me many years ago.
With some friends, we compared (at the same dB level) two tape decks.

One was a Nakamichi, the other was a Candle (like the hi-fi components you can buy at a "high end" store (lol) like Walmart).

Since we couldn't find any difference using prerecorded tapes and our own recordings, I sold my Nakamichi. Big mistake. I noticed that there were so many things missing compared to my Nak that I sold (more like threw in the garbage) the Candle and bought another Nakamichi. I lost money from that experience.

You would be better off doing blind testing. You listen as long as you want, not knowing which component you're listening to (you can put it in a big box). After each listening session, you take notes on the sound quality, being as precise as you can.
(Your component must be warmed up before you start listening.)

Repeat the same thing with the other component, again taking notes. Be careful not to change anything other than that component.

Here's a link where you can look at how to do blind testing.

http://www.uhfmag.com

If you prefer to believe in double blind testing, sell all your components and start over, buying the cheapest components you can find. Double blind testing will assure you that you made the right decision. Your ears will say otherwise, but who cares.


----------



## Otto (May 18, 2006)

daniel said:


> Don't waste your time with this!
> 
> Double blind tests have all concluded the same thing: listeners are unable to differentiate anything.
> That conclusion may lead you to some disappointment, as it did for me many years ago.
> With some friends, we compared (at the same dB level) two tape decks.


I'm not sure what you are saying. Is it that we won't be able to discern any difference during the test, but that we will be able to tell a difference later? Don't worry, none of us will be selling off our nice amps to have them be replaced with Behringer pro amps if we can't tell a difference. However, we may have a different view of some aspects of audio equipment.



> I noticed that there were so many things missing compared to my Nak...


So you heard no difference during your initial testing, but later you regretted the decision to go with the cheaper tape player? What was it about the Nak that you missed compared to the Candle? And why didn't you hear it during the initial tests?



> You would be better off doing blind testing.


Well, what's the difference between blind and double-blind in your book? To me, the double blind just means that neither the subject of the test, nor the administrator of the test, know which is being used. Whatever we call it, the subject will definitely not know which amp is being used, will have ample time to listen, will probably be able to go back and forth between amps, etc. They will then be presented with one or the other of the amps and asked to assign it as either the "A" or "B" amp. The administrator of the test will probably actually know which is being used, so they will either have to stay in the other room, or remain silent and poker-faced. We definitely don't want the administrator of the test to "give away" which is being used ("oh, that sounds better, doesn't it" would be a totally irresponsible and stupid comment from the admin, right?).



> If you prefer to believe in double blind testing.


Well, I guess that, in general, I "believe" in double blind testing. It's how science, medicine and various other aspects of our lives move forward. 



> Sell all your components and start over, buying the cheapest components you can find.


Well, that's not the point of the exercise. However, it might inform future purchases.



> Double blind testing will assure you that you made the right decision. Your ears will say otherwise, but who cares.


I don't get it. When will my ears say otherwise? If I can't hear a difference during the double blind test, I will be able to later?


----------



## terry j (Jul 31, 2006)

Otto

looking forward to your results and conclusions. Usually, the results of these types of tests only stir up trouble after they've been published; seems this time that we should all save time and get straight to the debate!

No, all mucking about aside, hope that the present holiday period allows you to get on with your testing and we see the results soon.


----------



## Otto (May 18, 2006)

Thanks terry. I'm not exactly sure when we're going to get to do it. The location of the test is in the process of getting new speakers. I'm pretty sure we'll do it at my friend's house, and that means we will likely be using the Sonus Faber Amati speakers. I've heard the Cremona, and they were niiiiiice! The Amati is about 2x the cost of the Cremona, so we should have a revealing speaker if that's how it goes. Those would be in place of the Thiels I referenced above. Maybe in the next month or so. Sorry for the delay. I want to do it as well, but we haven't even agreed upon a procedure.


----------



## daniel (Dec 31, 2006)

Hi Otto, first: nice speakers you're going to use!

What I was saying is that in double blind testing they always ask, "Is there a difference between A and B?" They never ask you to describe the sound qualities reproduced by A and by B.
If you take your time listening to one component (with blind testing instead of double blind), describing all you can about its qualities, then when you switch to the other component you will probably find differences between the components.

When you do fast switching between components, everything will sound the same. (The history of double blind testing has shown this many, many times.)

As for the bad experience with the Nak and the Candle: in the double blind testing that I did with some friends, both could reproduce bass, both could reproduce mids, and both could reproduce highs. I heard the music with both... well, it was so similar that I could not do anything other than think that high end had ripped me off.

Then, over a few weeks, I noticed that there was no more depth, and the highs were screeching in my ears. The bass was just boom, boom. No more feeling of the drum skins... get the picture?

I did one experiment, not with but on some of my friends, less than 5 years ago.
They were comparing my high end turntable to my low end CD player.
Levels were matched to within 1 dB, using my decibel meter. I used only one record/CD: Brothers in Arms by Dire Straits. With the remote I could change between both components without them knowing which one was playing. The only thing they had to do was answer: same or different. They only knew when I was going to use my remote for switching. I could switch or not, but no one except me knew what was happening.

As you can guess, they could not tell any difference between the two.

We repeated the experiment 1 or 2 weeks later. This time I asked them to listen for less than 15 minutes. I didn't want them to identify the turntable when the LP side ended.
They had to write about soundstaging (left/right, front/back, height), the different musical instruments, the place they occupied within the soundstage, and give a numerical rating of how much they liked it.
I did 3 sessions of 15 minutes each, with a pause between sessions (so I could put the tonearm back at the beginning).
Most were surprised by the result: they all preferred the turntable by a wide margin.

p.s. As you have probably noticed, English is not my first language!

p.p.s. As long as you have fun doing what you're doing, who am I to say anything?


----------



## Otto (May 18, 2006)

Hi daniel,

Let me address your last two comments first:

1) Your English is fine. Much better than my French!
2) We hope to have fun, but I welcome all comments!

OK. To the rest of it. I do see what you are saying. And yes, the subject will be able to listen as long as they like. I know I'll be listening for depth, soundstage, imaging, dynamics, etc. I'm assuming that the other guys will as well. I'd consider myself a mid-level audiophile: I've been playing with this stuff for years; I enjoy it, and I know more about it than 99% of people on the street, but I'm not hard core. It's a good hobby for me, but not a way of life. :laugh:

It's interesting that you, in two cases, found no difference between _analog source_ components! Tape vs. tape and vinyl vs. tape. I would have thought that the differences would have been immediate! Thanks for the descriptions, and we will be on the lookout for those types of differences.

Again, I see you've made a distinction between "blind" testing and "double-blind" testing. To me, they are very similar. Can you elaborate on what you would consider to be the differences?

Finally, I spent a little time over at the AudioAsylum last night. Man, those guys are crazy! :coocoo: (Hey, it's their name, not mine! :bigsmile: ). Seriously, though, they do have some good discussion on DBT in their PropellerHead forum, and I'm going to continue to read up over there.

Thanks!


----------



## daniel (Dec 31, 2006)

It was not vinyl vs tape, but vinyl vs cd player.


----------



## Otto (May 18, 2006)

Yep, I misread. That's really amazing. I would think the difference would be so obvious!


----------



## ISLAND1000 (May 2, 2007)

IMHO without each of the subjects having equal hearing ability, I think the Dbl blind tests will be of little use. All participating subjects should have equal hearing ability going into the tests.
If . . . all the subjects have equal hearing, I would think you could ask for a few more sound quality differences to be added to the list to differentiate A from B from X. 
1) bass
2) treble
3) stereo effect
4) spatial effect
If the testing is just for you and a couple friends, the friend who works in the drop forge will probably hear no difference. The friend who works in the public library will hear all sorts of differences and probably ask you to be quiet.


----------



## Otto (May 18, 2006)

Hi Phil,

Well, I'd think it's pretty much impossible for any group of people to have the same hearing ability. While I still haven't gotten around to doing this test (I still do have it tentatively planned, but there's just too much going on), the purpose of it is just to determine if we can hear differences. If we put my $300 Behringer up against a $5k BAT, I would hope that the guy that bought the BAT would be able to hear a difference. If not, he should rethink his investment and choices. And the test will be at his place, in his room, with no other changes in the system. So he should theoretically have the best chance of getting it right. 

We've all put a lot of time and money into this hobby, and I'm just trying to see where the points of good return are. I know that I can hear differences between speakers, and I know that I can hear differences between DACs, but a well-designed amp should be pretty transparent.

What's your take, Phil? Do you think an audiophile-type person will be able to hear the differences between those amps? Personally, I'm not sure we will be able, but we will see...

Thanks for reminding me about this test. I think we'll get back to it when the weather isn't so nice.


----------



## SteveCallas (Apr 29, 2006)

Otto, I have done a few amp and processor blind listening tests and I have a few tips I'd like to share. 

- Using the attenuation knobs on the amps is definitely a good idea, as the processor's channel output levels can then stay fixed, ensuring no more or less upstream distortion

- Banana plugs are a must

- Muting the processor between switches if leaving the amps on at all times is a precaution I would take

- Y splitter cables from each channel would prevent you from having to unplug and replug the line level cable during switching

- Have each attendant pick a song in advance, then compile the songs onto a CD-R and send it to each a few weeks in advance so everyone is pre-familiarized with each track before the test

- Playing the same clip twice in a row, asking if a switch took place, and then taking a brief break is what I found to be the best method. Playing clips back to back to back to back, etc. and having the listeners record every time they thought a switch occurred is too demanding

- Level matching with wide band pink noise and an SPL meter at the seats is the preferred method in my opinion, as it is one step further downstream than voltage matching at the amp outputs. Make sure the SPL meter isn't fluttering at the seat, you want a solid number - preferably a digital meter

- Level match one speaker at a time on each amp, then both together. Make sure there is no variation. 

- Keep all electronics in a separate room from the listening room

- Rotate seats between "rounds" of testing, say 5-10 tests per listener in each position

- Ensure the switcher takes exactly the same amount of time to "switch" whether they have switched or not - in fact, the switches should be timed to make sure the same increment is used each switch

- Ensure the listeners can not hear whether a switch has been made from their position in the listening room - in other words, make sure the physical act of unplugging and plugging cables doesn't make any noise from their vantage point

- No talking at all amongst listeners between clips, and then during breaks, no talk regarding what they thought they heard, not until all testing is complete

- If the listeners feel they heard a switch between clips, very brief notes on what they thought the difference was can be very useful if there are legitimate differences

- Ensure only two speakers are being used and run full range 2 channel stereo - no subs, no crossovers, no signal alteration, no surround sound

- Ensure the music is being played at a "spirited" level - I believe we calibrated to ~82 dB at the seat for our tests (not 100% certain, but it was on the loud side) - if the music is too low, detail is harder to pick up on

- Allow the listeners to listen to music prior to beginning any testing, just to ease their nerves and acclimate them to the sonic characteristics of the speakers you will be using

- Explain the "rules" to the listeners and switchers clearly and more than once - make sure everybody knows exactly what is expected of them

- Have fun! Minor mistakes will probably be made, I know we made some, but you gotta push on and enjoy it, cause the effort is definitely worth it at the end.

Look forward to your results, this kinda stuff isn't done enough. Of course I'm biased from my own testing now that ALL capable electronics sound identical :T


----------



## terry j (Jul 31, 2006)

I too am still looking forward to this.

I would LOVE to be part of a test such as this, I have not done the testing myself so I can't categorically state that it is true for me, but I do admit I fall into the camp that most amps would sound the same.

Or, if they don't, are the differences worth the extra money which could, after all, go towards new music or movies, whichever takes your fancy.

That is, at the end of the day, what this hobby is all about??


----------



## Danny (May 3, 2006)

My thinking is that you should run both amps at full power (0 dB attenuation); otherwise the noise level from one amp may be different even though they may both produce the same amount of noise


----------



## ISLAND1000 (May 2, 2007)

Otto, I have been involved in Dbl blind listening tests but it was at a time when the industry was changing from tubes to solid state. It was easy at that time for me to discern differences between amps. Today I don't have the auditory acuity to hardly tell the difference between tympany and tympanic membrane.
Firing weapons, diving too deep, close proximity to jet engines, and yelling spouses, sadly, have dimmed my senses. I would no longer be a suitable listening subject . . . . . unless of course the subject material was . . . . .


----------



## Otto (May 18, 2006)

Steve,

Thanks a lot for the comments. The business of getting "the rules" together is one of the things that's contributed to the stall of this experiment. 

terry, you're right -- it's what the hobby's all about. We'll get around to it one of these days.


----------



## Otto (May 18, 2006)

Danny said:


> My thinking is that you should run both amps at full power (0 dB attenuation); otherwise the noise level from one amp may be different even though they may both produce the same amount of noise


Danny,

Not a bad idea; I'll keep that in mind. The thing with the level matching ahead of time, of course, is that we won't have to adjust volume between takes. We'll see how it goes. Thanks.


----------



## Otto (May 18, 2006)

ISLAND1000 said:


> Otto, I have been involved in Dbl blind listening tests but it was at a time when the industry was changing from tubes to solid state. It was easy at that time for me to discern differences between amps. Today I don't have the auditory acuity to hardly tell the difference between tympany and tympanic membrane.
> Firing weapons, diving too deep, close proximity to jet engines, and yelling spouses, sadly, have dimmed my senses. I would no longer be a suitable listening subject . . . . . unless of course the subject material was . . . . .


Yelling spouses!?!?! Oh no!!!! I know what you mean about all the loud stuff, though. I've been pretty cautious throughout my life with machinery and other "loud things" by wearing earplugs. I have listened to a lot of loud music over the years though. I had a hearing check last year, and my frequency response was actually pretty good. I was somewhat surprised. I do suffer from tinnitus to a small degree, unfortunately. Music, though, even at a relatively low volume will mask my tinnitus. I definitely don't think I'm a golden ear, but I can do OK. Even with whatever hearing damage you may have, I wonder if you're selling yourself short.


----------



## ISLAND1000 (May 2, 2007)

I too had a hearing test last year. Bad news . . . . . have two big holes in my hearing frequency response curve. One centered at 2000Hz and a second at 8000Hz. So far nothing can be done. Still hopeful.
And by the way . . . . I am NOT short . . . . . 6ft. plus! :T


----------



## phaseshift (May 29, 2007)

Just a couple of things to add here, since I do a lot of this stuff (or something very close) when comparing different speaker tune-ups while voicing speakers.

A lot has been said about the switching. I cannot stress enough how important it is to be done instantly and quietly, as in _no pops_. My experience is that if you cannot switch in a very short time (under a couple of seconds), everything will sound the same, and that is when comparing speakers, which may in fact sound remarkably different. I do not believe that you will get good results having someone switch by unplugging and re-plugging the bananas. Also, there is a possibility of accidental polarity changes with a manual swap: unlikely, but possible.

The level match should be done with a scope looking at a voltage level. Well, that is my opinion and how I do it when I have to have the levels close. Look at the level at 1K first and set your voltage to whatever gives you about 85 dB on your SPL meter. Once you are set and have both amps where you think they are right on the money, take a look at other frequencies. If one amp has a bit of bass boost built in (you would be surprised...), then you will see that as a different voltage compared to the math and what you would get on the other amp. I usually look at 12 different frequencies, 1K being one of them: 20, 50, 100, 200, 400, 800, 1K, 2K, 4K, 8K, 16K and 20K. Note that the frequency is seldom _exactly_ 50 Hz or 400 Hz; 397 is close enough to 400 for me. Use a good scope, preferably a DSO with some math functions. That way, you can see RMS value, frequency, P to P and other stuff if you like. I always look at the RMS and frequency in a case like this as a way of keeping track. The Tektronix TDS 1002 is a decent little basic scope that is easy to find and very useful as long as you are not looking at high frequency stuff (digital power supplies etc...)
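
The comparison across frequencies boils down to converting each pair of RMS voltages into a level difference in dB, using 20·log10(Va/Vb). Here is a minimal sketch of that bookkeeping. The readings are made-up numbers for illustration, and the 0.1 dB tolerance is an assumed threshold rather than a figure from this post.

```python
import math

# Test frequencies in Hz, taken from the list in the post above.
FREQUENCIES_HZ = [20, 50, 100, 200, 400, 800, 1000, 2000, 4000, 8000, 16000, 20000]


def level_difference_db(v_rms_a, v_rms_b):
    """Level of amp A relative to amp B in dB, from RMS voltages."""
    return 20.0 * math.log10(v_rms_a / v_rms_b)


def flag_mismatches(readings, tolerance_db=0.1):
    """Return {frequency: difference_db} for points outside the tolerance.

    `readings` maps frequency in Hz to a (v_rms_a, v_rms_b) tuple.
    The default tolerance is an assumption, not a measured threshold.
    """
    flagged = {}
    for freq, (v_a, v_b) in readings.items():
        diff = level_difference_db(v_a, v_b)
        if abs(diff) > tolerance_db:
            flagged[freq] = round(diff, 2)
    return flagged


if __name__ == "__main__":
    # Hypothetical scope readings (volts RMS): amp A has a slight bass lift.
    readings = {f: (2.83, 2.83) for f in FREQUENCIES_HZ}
    readings[20] = (3.00, 2.83)
    readings[50] = (2.92, 2.83)
    print(flag_mismatches(readings))
```

Any frequency that shows up in the flagged list is a place where one amp is not tracking the other, which would need to be sorted out (or at least noted) before the listening trials start.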

If you do it right, you will likely not hear a difference in the electronics or cables. Please let us know what you observe and hear.


----------



## SteveCallas (Apr 29, 2006)

> I cannot stress enough how important it is to be done instantly and quietly, as in _no pops_. My experience is that if you cannot switch in a very short time (under a couple of seconds), everything will sound the same, and that is when comparing speakers, which may in fact sound remarkably different


Instantaneous or <3 second switching is what some people used in an attempt to discredit my testing, but I feel that is a cheap copout. If audible differences in equipment existed, and they couldn't be perceived with a ~10 second switch, then I feel one should really be questioning their own hearing capacity, not the experiment setup. Acoustic memory is short, but not that short. The whole purpose for the testing, at least from our point of view, was to see if there were audible differences between two pieces of equipment which would then lead to subjective blind testing (if audible differences could first be perceived confidently), and ultimately, keeping the better piece of gear in our own systems so that we would be enjoying the better sounding unit. If there were any differences, but they couldn't be picked up with a ~10 second switch, do you stand to gain anything by using the more expensive unit in your system? No, because you wouldn't be able to appreciate the difference anyway.

Additionally, with even longer switching periods, I have only heard two pairs of speakers that sounded very similar; all others have pretty noticeable differences. In fact, using one pair of speakers, but slightly tweaking the crossover to boost or flatten the top end was easily discernible with what I would have to guess was 40 second switches. The material used for that testing was a trumpet solo.


----------



## lcaillo (May 2, 2006)

Has anyone looked at the research on JND (just-noticeable difference) in the behavioral sciences to determine what kinds of intervals are suggested? I see a lot of speculation but not much substance in these discussions.

My big concern with double blind or any other kind of testing is the problem of individual differences. Trying to identify differences is very dependent on the motivation and listening skill of the subjects. Then assuming that a difference, if found, is meaningful for everyone is a big leap. Assuming that no difference exists in the equipment because a particular subject group found none is equally faulty. If one is trying to determine IF SOME difference exists, then only the most skilled listeners should be used and a variety of listening conditions must be tested. If one is trying to determine if differences are meaningful to an average sample of the population, then a random sample group is appropriate.

As with most questions, the correct answer (especially with experimental design) is...it depends.


----------



## phaseshift (May 29, 2007)

> Instantaneous or <3 second switching is what some people used in an attempt to discredit my testing, but I feel that is a cheap copout. If audible differences in equipment existed, and they couldn't be perceived with a ~10 second switch, then I feel one should really be questioning their own hearing capacity, not the experiment setup. Acoustic memory is short, but not that short.


I do this (the audio biz as a development engineer) for a living and probably do a lot more equipment testing than most folks on this forum. My post is about fundamental design of experiments (DOE) practice specific to comparing 2 pieces of equipment: removing variables that will influence the comparison, not questioning your hearing capacities.
If the subject or one of the other participants has a suspicion that the hearing capacities are not up to par, it is easy to validate or disprove that through intentional level skew, EQing, filtering or some other intentional alteration to one of the DUTs. You can also review the results of the test vs the record taken by the administrator of the test. Maybe have the administrator do nothing but turn the same device on and off over the entire test and see how many people talk about the amazing things they heard and the clear differences between test cycles. My experience is that you will always find one or two in a group, even if they are giving them nothing but placebo all the way....


----------



## lcaillo (May 2, 2006)

By placebo, I presume you mean control? So you are using a random sample of observers? Are you evaluating equipment that will be used by the general public or by professionals?


----------



## phaseshift (May 29, 2007)

> By placebo, I presume you mean control? So you are using a random sample of observers?



What I mean is that in a group of people doing blind or double blind testing, you will very likely find a few (or many) who hear significant differences in the same piece of equipment if they believe that they are listening to 2 or more different pieces of equipment. I am assuming that the listeners are not random or even re-seated between tests if given the time. 



> Are you evaluating equipment that will be used by the general public or by professionals?


To me, they are one and the same; specs may be a little different, but that is it for my line of work.


----------



## REK (Jun 10, 2007)

Try http://www.pcabx.com/ for a long standing reference site on ABX double blind testing. Arny has a PC software version for free download, and gives extensive background on what it takes to do a fair, non-biased test. Level matching is a MUST. I recommend 4 1/2 digit DVM type level matching since only a couple of tenths of a dB loudness can throw you off! PCABX is definitely a recommended site!
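
For anyone matching levels with a voltmeter, the arithmetic is simple enough to sketch in Python. This assumes you play the same test tone through each amp and read RMS volts at the speaker terminals; the voltage readings and the 0.1 dB target below are made-up illustration values, not a standard.

```python
import math

def level_difference_db(v_a: float, v_b: float) -> float:
    """Level of amp A relative to amp B in dB, from RMS voltages
    measured across the same load with the same test tone."""
    return 20.0 * math.log10(v_a / v_b)

def voltage_window(v_ref: float, max_db: float = 0.1) -> tuple[float, float]:
    """Voltage range around v_ref that keeps the mismatch within +/- max_db."""
    ratio = 10 ** (max_db / 20.0)
    return v_ref / ratio, v_ref * ratio

# Hypothetical DVM readings, in volts RMS.
v_amp_a, v_amp_b = 2.830, 2.765
print(f"mismatch: {level_difference_db(v_amp_a, v_amp_b):+.2f} dB")

low, high = voltage_window(v_amp_a)
print(f"trim amp B until it reads between {low:.3f} V and {high:.3f} V")
```

Those two hypothetical readings differ by only about 2%, yet that already works out to roughly 0.2 dB, which is why a meter with that kind of resolution is worth the trouble.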


----------



## terry j (Jul 31, 2006)

sounds like you've had some experience??? If so, what was it?


----------



## REK (Jun 10, 2007)

Terry: There are two groups in Detroit that have taken ABX testing of audio equipment to a fine level of professional detail. I have been a member of the Audio Engineering Society (AES) since 1970 and of the Southeastern Michigan Woofer and Tweeter Marching Society (SMWTMS) since 1976. Several members of the AES and SMWTMS joined together to form the ABX Company to manufacture and promote ABX testing 27 years ago. www.pcabx.com has all the references to AES articles and other links to provide education in ABX audio equipment testing. The original hardware ABX comparator was designed to determine audible differences in power amplifiers and preamps that had thousands of dollars of cost difference between them.

I was part of a very serious, lengthy ABX test of low cost and expensive speaker cables back in the early eighties that was sponsored by a well known high end speaker cable company. The test started with "sighted" comparisons of the cables, and we of course were absolutely certain that we all could tell the difference between the cables 100% of the time. Then we engaged the ABX technique. A series of sixteen comparison trials was run over the evening. We again were quite certain that we could hear the difference between the cables. The results of the ABX test proved with a substantial level of confidence that the two different speaker cables sounded identical. Many of us got 8 of 16 trial answers correct! In other words, we could have flipped a coin and got the same results!! Many people saved hundreds of dollars in cable purchases!

As long as nothing is "broke" and there is care taken in level matching, you will end up with a valid result using this ABX technique. Be careful to take precision measurements of frequency response! Even a couple of tenths of a dB in the shape of a frequency response curve over several octaves in the midrange of the audio spectrum is readily audible! A very broad low Q rise of just 0.2dB in the lower midrange can accentuate male voices in a mixed choral group recording, while a 0.2dB rise in a higher part of the midrange accentuates the female voices! It is no wonder when listening to speakers that even moving about a room makes them sound different! Please give www.pcabx.com a try and let people know what you think.
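
For anyone who wants to check the coin flip comparison on their own score sheets, here is a minimal Python sketch. It assumes each trial is an independent 50/50 guess when no difference is audible; the function name is made up, and this is not the scoring used by the ABX hardware or software.

```python
from math import comb

def abx_p_value(correct: int, trials: int) -> float:
    """One-sided probability of getting at least `correct` answers right
    out of `trials` by pure guessing (each trial a 50/50 coin flip)."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2**trials

for score in (8, 12, 14):
    print(f"{score}/16 correct: p = {abx_p_value(score, 16):.4f}")
# 8/16 correct: p = 0.5982
# 12/16 correct: p = 0.0384
# 14/16 correct: p = 0.0021
```

So 8 of 16 is exactly what guessing looks like, while a listener needs 12 or more out of 16 before guessing becomes an unlikely explanation (under 5%).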


----------



## terry j (Jul 31, 2006)

thanks for that. I have visited the site before, will have to re-acquaint myself.

Reminds me of some story where people were listening to some high end system, then the assistants go behind the speakers, hold up some cables (as thick as a thigh ha ha) and then bend down.

Evidently all heard a massive improvement.

Of course the kicker is that the cables were not in fact changed!

The only thing that had changed was the audience's perception.

Have you done any abx testing of amps??


----------



## JCD (Apr 20, 2006)

I think it's great you're gonna do this.

I may have skimmed some of the messages a little too quickly, but I have a few comments:

I wouldn't worry about the "abilities" of the testers. I think the test would be able to say if your average enthusiast can hear a difference between the amps. I would assume that if you get someone trained in this kind of thing, you might get a different result than what you come up with in your group. But, since I'm not trained in that manner, this is a test I could extrapolate to myself.
Everything I've seen indicates level matching is CRITICAL. It has to be SPOT on or the test is gonna be invalid. Generally, if one source is slightly higher than the other, then the general reaction is to say the louder one is "better".
I would try this test at different volume levels. Soft, medium and loud.
I think the random method is best. The person making the changes should be out of sight. 
And I think the faster the switch the better.
No talking while you do the tests. Any comments are likely to sway the opinions of some of the other testers.
I'm REALLY looking forward to your results!

JCD


----------



## lcaillo (May 2, 2006)

JCD said:


> I wouldn't worry about the "abilities" of the testers. I think the test would be able to say if your average enthusiast can hear a difference between the amps. I would assume that if you get someone trained in this kind of thing, you might get a different result than what you come up with in your group. But, since I'm not trained in that manner, this is a test I could extrapolate to myself.


This is like saying that you can compare the results of two room measurements with different mics and not worry about the calibration differences. Certainly, if you had a large enough group of listeners that were a random sample of the population, the between-observer variance would be less significant, as the group would approach the general population. With small observer groups, however, this must be accounted for if the results are to have much meaning.

It is a common mistake in research or clinical settings to ignore the individual differences in observers that produce data. The fact is that it is just one more area which produces variance. Unless you use a multivariate model that accounts for it or select observers based on criteria that are relevant to that which you are trying to apply the data, you just might be PITW.

If the purpose of the testing is to determine IF there are ANY audible differences that MIGHT exist, only highly trained or skilled observers should be used. If one is trying to determine if the average listener might notice a difference, the observer group should match the skill of the population.


----------



## Otto (May 18, 2006)

Good comments from all. Of course, we have to use our own ears, as the test is for our own purposes -- to see where we should draw the line in spending and upgrading, etc. If I go on the word of the "experts" or "golden ears", then I'll be upgrading forever. As far as I know, most reviews, comments, and so on are based on fully sighted tests where the price tag is visible. I wonder if those same reviewers would fare so well when things were truly double blind.

I did a quick sighted test using the myrtle wood blocks under some part of the signal chain (not sure if it was interconnects or speaker cables), and a friend was with me. Of course, we could see what was going on. I heard no difference, but my friend said it was "night and day." Did he really hear a difference or not? I don't know, but I imagine results would have been different if we couldn't see the cables. The placebo effect is very real.

Of course, I can only come out of such tests and comment that there "are" or "are not" differences to my ears.


----------



## ISLAND1000 (May 2, 2007)

LOL, You've heard from the experts. Now it's YOUR turn. Have fun, let us know how it turned out. And for the subjects who vote against your equipment . . . . . kick em in the knee.


----------



## JCD (Apr 20, 2006)

lcaillo said:


> This is like saying that you can compare the results of two room measurements with different mics and not worry about the calibration differences. Certainly, if you had a large enough group of listeners that were a random sample of the population, the between-observer variance would be less significant, as the group would approach the general population. With small observer groups, however, this must be accounted for if the results are to have much meaning.
> 
> It is a common mistake in research or clinical settings to ignore the individual differences in observers that produce data. The fact is that it is just one more area which produces variance. Unless you use a multivariate model that accounts for it or select observers based on criteria that are relevant to that which you are trying to apply the data, you just might be PITW.
> 
> If the purpose of the testing is to determine IF there are ANY audible differences that MIGHT exist, only highly trained or skilled observers should be used. If one is trying to determine if the average listener might notice a difference, the observer group should match the skill of the population.


Make no mistake, I COMPLETELY agree with everything you've said here. To do a valid test is far more complicated and would/should involve people that are trained to hear and understand the potential audible differences.

However, I don't think a test like the one being proposed is without merit, as I think something can be learned from it. I might qualify my answer a little and say it might be beneficial to know the background of the test subjects, e.g., their actual interest in audio.

So, in the end, do I think this test would be statistically and scientifically valid? Not really. Do I think I'd still take something away from the end results? Yes, I do.

Anyway, just my opinion.

JCD


----------



## daniel (Dec 31, 2006)

Have you done your double blind test?


----------



## Otto (May 18, 2006)

Sadly, no. Too many things got in the way, and my friend ended up moving out of the country for the time being. I've still considered doing it with my Sunfire amp against my Behringer, but I haven't gotten to it.


----------



## daniel (Dec 31, 2006)

Something you could try is to listen for as long as you need to with one amp in your system, making sure you don't know which amp you're listening to. You will need help setting this up. You could use a box or something else to hide the amps.

Once your setup is done, take notes, many notes.
When you're satisfied with your findings, ask a friend to substitute one amp for the other. The idea is not to substitute amp "A" for amp "A", but to find the qualities of each amp in different areas. Maybe you could use a sort of template to help you take your notes. Think of some questions and try to find the answers. How is the soundstage? Can I pinpoint the instruments? If I close my eyes, how wide is the soundstage? With similar instruments (for example, drums), can I hear different instruments, or do they sound more alike? How many guitars are playing? What gives you goosebumps?
You might have to listen at different times of the day or week. I don't know where you live, but people in commercial or, worse, industrial areas find that their sound systems sound better at night.
Have fun!
Dan


----------

