Wednesday, October 1, 2008

Polls Are Simultaneously Crap...and Dangerous

A new article has been published today, based on poll data.

Here's the headline of the article that appears on the Time website:

Poll: Obama Hits a New High, Making Big Gains Among Women

And here are some of the sweeping statements made in that article:

----McCain is losing female voters faster than Sarah Palin attracted them after the Republican National Convention. Obama leads McCain by 17 points with women, 55%-38%. Before the conventions, women preferred Obama by a margin of 10 points, 49%-39%. After McCain picked Palin as his running mate, the gap narrowed to a virtual tie, with Obama holding a 1-point margin, 48%-47%

----white women now favor Obama by three points, 48%-45%; in 2004, George W. Bush won the same demographic by 11 points against John Kerry. Where Bush carried married women by 15 points in that election, 57%-42%, Obama now leads by 6 points, 50%-44%, a 21-point shift.

----Non-college-educated white women split virtually evenly, 46%-45% for McCain. By contrast, Obama remains weak among white men. That group supports McCain 57%-36% overall, and non-college-educated white men back the Republican ticket by an even greater margin, 63%-27%.

Okay, now if you read the paragraphs above, which compare, for example, Bush's numbers (DURING THE ELECTION - a sample size of several million people, obviously) with the numbers for McCain now (based on a slightly smaller sample size, as you will see in a minute), you might think that McCain is in a heckuva lot of trouble.

But read down to the very last paragraph of this article, and what do you see?

The poll, which surveyed 1,133 likely voters nationwide between Sept. 26-29, has a margin of error of +/- 3 percentage points.

This poll is comparing apples and oranges when it comes to Bush's numbers with women during his election, and McCain's numbers with women before the election.

And can anyone answer this question based on the paragraph above, which is the only bit of info we're given?

1) Of the 1,133 people surveyed nationwide, how many were men? How many were women? (Because the numbers given in the article conclude things about both male and female voters - so they must have talked to both men and women, right?)

Logic would dictate that about 550 were men and 550 were women, right, but how do we know? Maybe it was 800 women and 300 men.
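
For what it's worth, the usual textbook formula for a poll's margin of error gives a rough feel for how much that unknown split matters. The sketch below is an illustration, not anything taken from the Time article: it assumes a simple random sample, a 95% confidence level (z = 1.96) and the worst-case 50/50 answer split. The function name margin_of_error and the subgroup sizes are hypothetical. Real pollsters weight their samples, so their own figures will differ.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Approximate margin of error (as a fraction) for a simple random sample of n."""
    return z * math.sqrt(p * (1 - p) / n)

# Hypothetical group sizes: the article only reports the 1,133 total.
groups = {
    "all 1,133 respondents": 1133,
    "550 women (even split)": 550,
    "800 women (lopsided split)": 800,
    "300 men (lopsided split)": 300,
}

for label, n in groups.items():
    print(f"{label}: +/- {margin_of_error(n) * 100:.1f} points")
```

Under those assumptions the full sample lands near the +/- 3 points quoted in the article, while every subgroup carries a wider margin than that headline figure.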

But let's say it was 550 men and 550 women. There are 50 states in the Union. (Not 57, as a certain presidential candidate seems to think.) Spread evenly, that means this poll talked to about 22 people in each state: 11 men and 11 women. Mebbe. Considering the cost of phone calls, perhaps they didn't even bother with Alaska and Hawaii.

So, based on what 22 or so people in each state say, the author of this article has the audacity to say that McCain is "behind in the polls." He may well be, but big whoop.
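
The per-state arithmetic is easy to check. A quick sketch, assuming a 550/550 split between men and women and an even spread across all 50 states, neither of which the article confirms:

```python
TOTAL_RESPONDENTS = 1133   # from the article
STATES = 50
ASSUMED_WOMEN = 550        # assumption: the article gives no gender breakdown
ASSUMED_MEN = 550          # assumption, same reason

# Assumption: respondents are spread evenly across the states, which a
# national poll would not necessarily do.
per_state_total = TOTAL_RESPONDENTS / STATES
per_state_women = ASSUMED_WOMEN / STATES
per_state_men = ASSUMED_MEN / STATES

print(f"Respondents per state: {per_state_total:.1f}")
print(f"Women per state: {per_state_women:.0f}")
print(f"Men per state: {per_state_men:.0f}")
```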

And what's that "likely voters" phrase mean? Where did they find the people who responded to this poll? Is anyone registered to vote a "likely" voter?

Did the pollsters only talk to Democrats? Did they talk to Republicans? What percentage of that "1,133 people surveyed nationwide" were Democrats, how many Republicans, how many Independents?

It is very easy to skew statistical data to get exactly the result you want.

Isaac Asimov once wrote an essay - albeit on the radioactive half-life of various atoms - in which he stated: Once you have a large group, you can use statistics to predict the future. The larger the group, the more accurate (percentage-wise) the predictions. The smaller the sample size, the more useless the statistics.

So, in my view, any poll, regarding politics, that does not have a sample size of at least 10,000 people, is useless - and even 10,000 people is pretty low.
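
For a sense of scale, the same textbook formula can be turned around to ask how many respondents a given margin of error calls for. This is only a sketch under stated assumptions: a simple random sample, 95% confidence (z = 1.96) and a worst-case 50/50 split. The function name required_sample_size is made up for illustration.

```python
import math

def required_sample_size(margin, p=0.5, z=1.96):
    """Approximate simple-random-sample size needed for a given margin of error."""
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)

# Margins are given in percentage points and converted to fractions.
for points in (3, 2, 1):
    n = required_sample_size(points / 100)
    print(f"+/- {points} points needs roughly {n:,} respondents")
```

On these assumptions, a sample in the neighborhood of 10,000 corresponds to a margin of about +/- 1 point, versus about +/- 3 points for a sample of around 1,100.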

Why are Polls dangerous, then?

Because the media reporting on the poll skews the data to come to the conclusions that they want.

1) The statements made in the body of the article are not clarified...they make it seem as if a majority of the population feels the way that the members of the poll do... and when the members of the poll are merely 1,000 people, that's crap.

But it is my belief that most people don't read to the end of these articles to see what ridiculous sample sizes are used. 90% of them see the headlines - and believe them. 80% of the rest read the first two or three paragraphs - and believe what they read. And 10% of people check the bottom paragraph first, see that the sample size is ridiculously small, and dismiss the poll as ridiculous.

The statements, therefore, should be along the lines of, "80% of the people who participated in this poll believe that" such and such is true.

2) The sample size for each and every poll should be put in the first paragraph, as well as information on whom that sample size consists of.

So that's my contention. Any poll on political matters - considering that we're talking about roughly 300 million people in the US - that does not have at least 10,000 respondents is a worthless poll and doesn't even deserve to be mentioned.

