Monday, November 12, 2007

Who is misinformed?

I just finished a pleasant little book on the power of statistical analysis in modern business called "Super Crunchers" by Ian Ayres. Of course, he tackles some very politically incorrect topics like gender variation (i.e., Summers) and educational philosophies and the role of numerical analysis in determining practice. But he makes some major errors. Here is one. On page 203 he tackles political polls and the journalists who cover them. At issue is a hypothetical contest pitting Laverne against Shirley (is the pollster Squiggy?). Polls show that Laverne leads Shirley 51 to 49 percent with a margin of error of 2 percent. Ayres attacks the journalist who declares the contest a "statistical dead heat".
"Balderdash! Laverne is a full Standard Deviation ahead (the margin of error is 2 standard deviations). Crunching these numbers in Excel tells us in a few seconds that there is an 84% chance that Laverne currently leads in the polls. If something does not change, she is your likely winner."
This analysis is horribly confused. First, you need to be Bayesian here to even talk about the "chance that Laverne leads". Then you need a reasonable prior. I am sure one exists that would make this probability 84%, but I don't know it. I do know that Ayres goofed. What he is actually doing is a one-sided test of the hypothesis that Laverne=Shirley: assuming her true support is 50%, the probability that Laverne polls at least 51% is 16%, and 84% is just one minus that p-value. This is still not right. What he should be testing is the two-sided alternative, by calculating the chance that either candidate polls at least 51% assuming equal actual support. That chance is 32%, so the comparable figure is 68%, not 84%. Ayres is making the error of ignoring multiplicity. The whole example is wrong.
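
For anyone who wants to check the arithmetic without Excel, here is a rough sketch in Python. It takes the quote at its word that the 2 percent margin of error is two standard deviations, and it assumes the poll result is approximately normal around the true support. The variable names are my own, not anything from the book. It shows where the 84% and the 68% come from: they are the normal probability below one standard deviation and within one standard deviation, respectively.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal CDF, Phi(z), computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Numbers from the example; the "margin = 2 SD" rule is taken from the quote.
margin_of_error = 2.0        # percentage points
sd = margin_of_error / 2.0   # one standard deviation = 1 point
z = (51.0 - 50.0) / sd       # Laverne's 51% measured against a 50% null

phi = normal_cdf(z)                  # mass below +1 SD
one_sided_tail = 1.0 - phi           # P(Laverne polls >= 51% | tied race)
two_sided_tail = 2.0 * one_sided_tail  # P(either candidate polls >= 51% | tied race)
within_one_sd = 1.0 - two_sided_tail   # mass between -1 SD and +1 SD

print(f"Phi(1)          = {phi:.3f}")             # the 84% figure
print(f"one-sided tail  = {one_sided_tail:.3f}")
print(f"two-sided tail  = {two_sided_tail:.3f}")
print(f"within one SD   = {within_one_sd:.3f}")   # the 68% figure
```

Under these assumptions, the one-sided tail is about 16% and the two-sided tail about 32%, so neither test comes anywhere near rejecting a tied race at the usual 5% level.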