Examples of How to Lie with Statistics
Many governments, political activists, and others with a hidden
agenda know how to lie, and do so regularly. Lying with statistics
is one of the most common tactics. What I mean by this is that
they can use numbers that are perfectly accurate, yet present them
in a way that is meant to deceive or to slant people's perspective
in ways that most will not notice. In my opinion, that is often
the same as a lie. Now let's look at some
examples.
Non-Random Sampling
For poll statistics to be meaningful, the sample selected has to
be random, or at least representative of the whole population. When
the Literary Digest ran its widely publicized poll during Franklin
D. Roosevelt's 1936 presidential campaign, it showed that he would
lose the race by a significant margin, but he won. The problem was
that the people polled were drawn largely from telephone directories
and automobile registration lists, and only relatively wealthy
people had phones and cars at that time. In other words, there was
a selection of one group of people rather than a true random
sampling. That group made up a small percentage of the population,
but the people in it were more likely to vote for the other
candidate, and they were the ones polled. It was probably an honest
mistake, but the principle has been learned and put to use more
consciously since then.
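To see how much a skewed sample can move a result, here is a minimal sketch in Python. Every number in it (the group sizes and the support levels) is invented for illustration and is not taken from any real poll. The helper name support_among is likewise hypothetical.

```python
# Hypothetical electorate: every number here is invented for illustration.
groups = {
    "phone owners": {"share": 0.10, "challenger_support": 0.70},
    "no phone": {"share": 0.90, "challenger_support": 0.40},
}


def support_among(groups, names):
    """Challenger support within the named groups, weighted by group size."""
    total_share = sum(groups[n]["share"] for n in names)
    weighted = sum(
        groups[n]["share"] * groups[n]["challenger_support"] for n in names
    )
    return weighted / total_share


actual = support_among(groups, list(groups))      # the whole electorate
polled = support_among(groups, ["phone owners"])  # only those the poll reached

print(f"Challenger support, whole electorate: {actual:.0%}")
print(f"Challenger support, poll sample:      {polled:.0%}")
```

With these made-up figures the poll sees a comfortable majority for the challenger, while the electorate as a whole gives him well under half. Nothing in the poll's arithmetic is wrong; only the choice of who was asked.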
In more recent times, the best examples of non-random sampling
that is meant to produce a particular impression, or is at least
known to do so and used anyhow, are found in television polls that
require viewers to call in and express their opinion. On a given issue,
80% of the population may feel a certain way, yet not as strongly
as the 20% who hold the opposite viewpoint. Of course the latter
are more likely to participate in such polls. The producers of
these shows certainly know this in many cases, but go ahead with
their plans with only the slightest acknowledgement of the non-scientific
nature of the process.
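The 80/20 arithmetic can be sketched as follows. The call-in rates (2% of the lukewarm majority, 20% of the committed minority) are assumptions chosen only to illustrate the effect, and call_in_share is a hypothetical helper, not part of any polling method.

```python
def call_in_share(majority_share, majority_call_rate, minority_call_rate):
    """Fraction of all callers who hold the minority view."""
    majority_calls = majority_share * majority_call_rate
    minority_calls = (1 - majority_share) * minority_call_rate
    return minority_calls / (majority_calls + minority_calls)


# Assumed: 80% hold the majority view but only 2% of them bother to call;
# 20% hold the minority view and 20% of them call.
minority_in_poll = call_in_share(0.80, 0.02, 0.20)

print(f"Minority view in the population: {0.20:.0%}")
print(f"Minority view among callers:     {minority_in_poll:.0%}")
```

Under these assumptions a view held by one person in five ends up as the clear winner of the call-in poll, simply because its supporters care enough to pick up the phone.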
In addition to this there is the composition of the audience
to consider. More conservatives watch some networks while more
liberals watch others, so the same poll questions would get different
responses depending on which network asks them. Granted, the
networks do mention in passing that these are not scientific
polls, but then if they have no validity, what's the point? It's
a reasonable conjecture that the purpose is often to promote
a particular agenda by showing support for it.
Up To...
I recently read that, "Those who smoke have up to 10
times more indoor air pollution." In other words, the average
might be 1.2 times more, but one person in the study had 10 times
more. This is how to lie with statistics by reporting only the
most extreme case. It gives no meaningful information about
the average level of additional pollution that comes from smoking.
Watch out for "up to" in any statistics.
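A small sketch of the difference between "up to" and a typical value, using Python's standard statistics module. The ten ratios below are made up for illustration; they are not from the study quoted above.

```python
import statistics

# Hypothetical pollution ratios (smoking home vs. non-smoking home)
# for ten households. Invented numbers, with one extreme outlier.
ratios = [1.0, 1.1, 0.9, 1.2, 1.0, 1.1, 0.9, 1.0, 1.2, 10.0]

up_to = max(ratios)                   # the headline "up to" figure
average = statistics.mean(ratios)     # pulled upward by the outlier
typical = statistics.median(ratios)   # what a middle household sees

print(f"Up to:   {up_to:.2f} times")
print(f"Mean:    {average:.2f} times")
print(f"Median:  {typical:.2f} times")
```

The "up to" number describes exactly one household. The median, and to a lesser extent the mean, describe the group.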
Correlation and Causation
Perhaps the most common way to misrepresent an issue is by
presenting correlation as though it equals causation. For example,
some point out that citizens in the United States have more guns
than those of this or that country or group of countries, and more
gun-related crime. The implication is that guns cause more crime,
but the statistics alone don't show that. A correlation does not
prove causation; some other factor may be driving both numbers.
In fact, if we look at Switzerland, which has far less gun
crime than the U.S. and yet issues a service rifle to the men who
serve in its militia, we find the opposite pattern. That doesn't
prove that guns reduce crime, although it does provide a
counterexample to the simple proposition that guns cause crime.
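One way a correlation can appear without any causation is a hidden third factor that pushes both numbers up or down together. The sketch below uses a tiny invented data set, not real figures about guns or crime. The variables x and y, the "low"/"high" hidden factor, and the pearson helper are all hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Invented rows: (hidden_factor, x, y). Within each level of the hidden
# factor, x and y are unrelated; the factor alone shifts both of them.
rows = [
    ("low", 1, 1), ("low", 1, 2), ("low", 2, 1), ("low", 2, 2),
    ("high", 5, 5), ("high", 5, 6), ("high", 6, 5), ("high", 6, 6),
]

overall = pearson([r[1] for r in rows], [r[2] for r in rows])
within = {
    level: pearson(
        [r[1] for r in rows if r[0] == level],
        [r[2] for r in rows if r[0] == level],
    )
    for level in ("low", "high")
}

print(f"Correlation of x and y overall: {overall:.2f}")
for level, r in within.items():
    print(f"Correlation within '{level}' group: {r:.2f}")
```

Taken as a whole, x and y look strongly linked. Split by the hidden factor, the link disappears, which is why a raw correlation between two countries' totals says little about what causes what.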
Data Selection
In addition to biased sampling, there is also the problem
of biased selection of data. To be truly scientific, when research
is done all trials should be accounted for in the results, but
where trials are not registered in advance, or registration is
not enforced, unfavorable ones can quietly disappear. Pharmaceutical
companies have been caught discarding trials that show a drug had
no effect, in order to make an effect "appear" in the trials they
select for the purposes of gaining approval of a drug. If half your
trials show a beneficial effect and half show no effect or even
a negative one, just throw out the latter trials and you have a
seemingly successful product based on the statistical evidence -
the carefully selected evidence anyhow.
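The effect of throwing out the unfavorable half can be sketched with a few lines of Python. The eight trial results are invented for illustration (a positive number means the drug group did better than the placebo group) and do not describe any real drug.

```python
import statistics

# Hypothetical effect sizes from eight trials of the same drug.
# Half look beneficial, half show nothing or a slight harm.
all_trials = [0.4, 0.5, 0.6, 0.5, 0.0, -0.1, 0.0, -0.2]

# "Data selection": report only the trials that came out positive.
reported_trials = [effect for effect in all_trials if effect > 0]

honest_average = statistics.mean(all_trials)
reported_average = statistics.mean(reported_trials)

print(f"Average effect, all {len(all_trials)} trials:      {honest_average:.2f}")
print(f"Average effect, {len(reported_trials)} reported trials: {reported_average:.2f}")
```

With these made-up numbers the reported effect is more than twice the honest one, and every figure in the report is still "perfectly accurate".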
Or how about counting only those who are getting unemployment
benefits as unemployed? This is a routine way for governments to
present statistics that supposedly show how many people are out of
work. Does it make any sense to exclude from the numbers those who
are no longer receiving unemployment benefits and still don't have
a job? It is hard to imagine any purpose other than deception.
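How much the definition matters is easy to show with arithmetic. The head counts below are invented, and the sketch assumes that everyone counted as jobless actually wants work.

```python
# Hypothetical labor force. All three counts are invented for illustration.
employed = 9_000
jobless_on_benefits = 400
jobless_benefits_expired = 600   # still out of work, no longer paid benefits

labor_force = employed + jobless_on_benefits + jobless_benefits_expired

# Rate if only benefit recipients count as unemployed.
benefits_only_rate = jobless_on_benefits / labor_force

# Rate if everyone without a job counts as unemployed.
all_jobless_rate = (jobless_on_benefits + jobless_benefits_expired) / labor_force

print(f"Unemployment, benefit recipients only: {benefits_only_rate:.1%}")
print(f"Unemployment, everyone without a job:  {all_jobless_rate:.1%}")
```

Same people, same economy, and the headline number more than doubles depending on who is allowed into the count.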
These are just a few of the many ways to lie with numbers
and statistics. Most of them are missed by the average consumer
of news or government reports.