Tuesday, November 8, 2016

Lying With Statistics


Written by a non-statistician in hokey language and illustrated with humorous line drawings, How To Lie With Statistics is as relevant and enjoyable as when it first appeared in 1954. Indeed, the book is still a best seller even though some examples are out of date, like the salary of Yale graduates and the price of bananas. The tricks described by Darrell Huff, from misleading charts to misuse of averages, are still used today. "Many a statistic is false on its face. It gets by only because the magic of numbers brings about a suspension of common sense," Huff says. The purpose of the book is to explain how to look a phony statistic in the eye and face it down by asking questions like these:

1. Who says so?
2. How does he know?
3. What's missing?
4. Did somebody change the subject?
5. Does it make sense?

Remember: statistics don't lie, people do. Here are a few more things we can take from the book:

"Proper treatment will cure a cold in seven days, but left to itself a cold will hang on for a week."

When numbers appear, the reader believes some truth is about to be imparted. Even a nonsensical statement such as this carries the air of authority until the meaning sinks in. Yes, using statistics to lie is easy and, yes, statistics can be used to manipulate, obfuscate, sensationalize, and confuse.

Samples are, by definition, incomplete pictures of the whole. How much of the whole do they show? That is the question. When a sample is large enough and selected properly, it tells us something. The basic sample is called 'random.' As its name suggests, it is formed by chance from the 'universe,' that is, the whole of which the sample is a part. Every member of the universe must have an equal chance of landing in the sample. A truly random sample is expensive and difficult to obtain.
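To make the "equal chance" idea concrete, here is a minimal Python sketch of drawing a simple random sample. The universe of 1,000 numbered people and the helper name simple_random_sample are invented for illustration; they are not from the book.

```python
import random

def simple_random_sample(universe, size, seed=None):
    """Draw `size` distinct members so that every member of `universe`
    has the same chance of being picked (sampling without replacement)."""
    rng = random.Random(seed)  # a fixed seed only makes the demo repeatable
    return rng.sample(universe, size)

# Hypothetical universe: 1,000 people identified by number.
universe = list(range(1000))
sample = simple_random_sample(universe, 50, seed=1)
print(len(sample), "people drawn, for example:", sorted(sample)[:5])
```

The hard part in real life is not this function call. It is getting a complete list of the universe to draw from in the first place, which is exactly why a proper random sample is so costly.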



Samples are based on responses, which reveal either the truth or the airbrushed version of who we wish we were. When samples rely on people to tell the truth about themselves, we learn more about what they want to be than who they really are. Take the study in which an extraordinarily high number of Americans reported washing their hands after using the bathroom. Reporters staked out public restrooms far and wide and came away with a far lower percentage of actual hand washing. Why? People have always tended to give the answer that will please the one asking the question (who wants to say they don't wash?), will offend the poll taker least (studies show the gender or race of the one asking the questions greatly affects the answers given), or will make them look the best (self-reported income tends to be far higher than actual).

Also implicit in all statistics based on sampling are the probable error and the standard error, both of which measure how reliable the figure is. Without such a measure, the number is meaningless. A sample figure is really a range, though some either ignore this fact or try to use it to say something that isn't there.
Ignoring? Let's say 10 companies are all found to use too much packaging material. A list is presented in which all of them are shown to use what environmentalists consider to be too much. Yet the company at the bottom of the list might still step up and herald itself as the Green Company of choice.
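Here is a rough Python sketch of what "it is a range" means for a percentage taken from a sample. The poll figures (55% saying yes) are made up for illustration, and the formulas assume a simple random sample and the usual normal approximation. The probable error is taken as about 0.6745 times the standard error, the range that a result falls within half of the time.

```python
from math import sqrt

def standard_error(share: float, n: int) -> float:
    """Standard error of a sample proportion (normal approximation)."""
    return sqrt(share * (1 - share) / n)

def probable_error(share: float, n: int) -> float:
    """Probable error: roughly 0.6745 standard errors (a 50% range)."""
    return 0.6745 * standard_error(share, n)

def likely_range(share: float, n: int, z: float = 1.96):
    """Approximate 95% range around the sample figure."""
    margin = z * standard_error(share, n)
    return max(0.0, share - margin), min(1.0, share + margin)

# Hypothetical poll: 55% say "yes". Is that really a majority?
for n in (100, 10_000):
    low, high = likely_range(0.55, n)
    print(f"n={n}: 55% is really somewhere between {low:.1%} and {high:.1%}")
```

With only 100 people asked, the range still includes 50%, so the "majority" could be an accident of the sample. With 10,000 it no longer does.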

A difference is a difference only if it makes a difference. When the sample is too small to speak to anything, it allows you to say what you want to say without pesky facts getting in the way. Flip a coin four times. Will you get the mythical 50%? Probably not, but maybe. This may suffice when tossing a coin. When you make a medical decision or assess the validity of a scientific study, you should demand more proof. If we don't know the degree of significance of a given number (how representative, or not, a sample is), we don't know how likely it is that the test or sample figure represents a real result rather than one produced by chance.
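The four-flip example is easy to check for yourself. This short Python sketch assumes a fair coin and simply counts outcomes. The function names are my own.

```python
from math import comb

def prob_exact_heads(flips: int, heads: int) -> float:
    """Chance of exactly `heads` heads in `flips` tosses of a fair coin."""
    return comb(flips, heads) / 2 ** flips

def prob_heads_between(flips: int, low: int, high: int) -> float:
    """Chance that the number of heads lands anywhere from `low` to `high`."""
    return sum(prob_exact_heads(flips, k) for k in range(low, high + 1))

# Four flips: how often do we hit exactly 50% heads (2 of 4)?
print(prob_exact_heads(4, 2))  # 0.375

# One hundred flips: how often is the share of heads between 40% and 60%?
print(prob_heads_between(100, 40, 60))
```

So four flips land on exactly 50% only 37.5% of the time, which is the "probably not, but maybe" above. With a hundred flips the result falls close to 50% far more often, which is the whole argument for bigger samples.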

An average is a single value meant to typify a list of values. There are three types (mean, median, and mode), and they 'typify' in very different ways. The mean average is the one you most commonly think of when you hear the word average. Advertisers and others sometimes rely on this. You arrive at the mean average by adding a group of numbers, then dividing by the number of items you've just added together. For example, a real estate agent wants to be able to say a neighborhood has a high average income. The neighborhood in question is mostly farmers and hourly-wage workers. There are three families, though, who are millionaire weekenders. The mean average will serve the agent's wish for a higher number because the wealthy few will pull it considerably higher. The median, the middle value when the incomes are lined up in order, would barely move. Of course, the mean will not paint a particularly accurate portrait, but it gets the job done.
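To see how much three wealthy families can move the mean, here is a small Python sketch. The income figures are invented purely for illustration (20 households at $30,000 and three at $1,000,000); the post and the book give no actual numbers.

```python
from statistics import mean, median

# Hypothetical annual incomes, invented for illustration only.
resident_incomes = [30_000] * 20       # farmers and hourly-wage workers
weekender_incomes = [1_000_000] * 3    # the three millionaire families
incomes = resident_incomes + weekender_incomes

print(f"mean:   {mean(incomes):,.0f}")    # pulled up to about 156,522
print(f"median: {median(incomes):,.0f}")  # 30,000, the typical household
```

Both numbers are honestly an "average." The agent just gets to pick the one that flatters the neighborhood.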

These are but a few of the examples that the book covers. As leaders, we need to look more closely at the facts behind the statistics.

Thought for the week: 

Facts are stubborn things, but statistics are pliable. ― Mark Twain


 
