The book is a fantastic primer on how we’re tricked, daily, by the sneaky use of statistics. It’s a must-read for anyone who wants to interpret news, media, and even medical research more intelligently. Once you learn to see these tricks, you cannot unsee them.
Don’t be fooled by missing data: just because a cold clears up within a week of taking a drug doesn’t mean it wouldn’t have cleared up in a week on its own.
Many conclusions you see come from samples that are too small, biased, or both.
When you hear a statistic, say, that the average American brushes their teeth 1.02 times a day, ask yourself: “How could they have figured it out?” Does it make sense that it could have been researched effectively? In this case, they would have had to ask, and don’t you think it’s a safe assumption people lied?
“To be worth much, a report based on sampling must use a representative sample, which is one from which every source of bias has been removed.”
If a psychiatrist says that “practically everyone is neurotic,” do you suppose that their impression has been biased by their line of work?
There are three kinds of average:
- The mean: add up all the values and divide by the number of values
- The mode: the most common value
- The median: the middle value when the values are sorted
These can be very different numbers, and reporters and others will pick the one that best supports their argument.
In a normal distribution the three will be close to each other, but in a skewed distribution (e.g. annual household income) you’ll get vastly different numbers for each.
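As a minimal sketch of how far apart the three averages can land, the snippet below runs Python's standard `statistics` module on a made-up list of seven household incomes. The figures are illustrative assumptions, not data from the book.

```python
from statistics import mean, median, mode

# Hypothetical annual household incomes (invented for illustration).
# One very high earner skews the distribution to the right.
incomes = [25_000, 25_000, 30_000, 40_000, 45_000, 60_000, 500_000]

avg_mean = mean(incomes)      # pulled upward by the single outlier
avg_median = median(incomes)  # middle value of the sorted list
avg_mode = mode(incomes)      # most common value

print(f"mean:   {avg_mean:,.0f}")
print(f"median: {avg_median:,.0f}")
print(f"mode:   {avg_mode:,.0f}")
```

All three results are legitimately "the average" of this list, which is why it pays to ask which one a report is quoting.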
Companies will keep running experiments until they get the results they want, discarding the experiments that “failed to produce significant findings.”
Smaller samples give more variable results. With 10 coin flips you might well get 8 heads, but you’re far less likely to get 80 heads in 100 flips.
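The coin-flip claim can be checked with the binomial distribution. This is a rough sketch assuming a fair coin. The helper name `prob_at_least` is made up for this example.

```python
from math import comb

def prob_at_least(k, n, p=0.5):
    """Probability of k or more heads in n independent flips (binomial tail)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

p_small = prob_at_least(8, 10)    # 80% heads or more in 10 flips
p_large = prob_at_least(80, 100)  # 80% heads or more in 100 flips

print(f"P(>= 8 heads in 10 flips)   = {p_small:.4f}")
print(f"P(>= 80 heads in 100 flips) = {p_large:.2e}")
```

The first probability is about 5%, while the second is vanishingly small, even though both describe the same 80% share of heads.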
[Figure: three charts of the same data that give very different impressions.]
You have to look at the range of data used on both axes.
The Semiattached Figure
“If you can’t prove what you want to prove, demonstrate something else and pretend that they are the same thing.”
For example: “You can’t prove that your nostrum cures colds, but you can publish (in large type) a sworn laboratory report that half an ounce of the stuff killed 31,108 germs in a test tube in eleven seconds.”
More: ““27 percent of a large sample of eminent physicians smoke Throaties— more than any other brand.” The figure itself may be phony, of course, in any of several ways, but that really doesn’t make any difference. The only answer to a figure so irrelevant is “So what?” With all proper respect toward the medical profession, do doctors know any more about tobacco brands than you do? Do they have any inside information that permits them to choose the least harmful among cigarettes? Of course they don’t, and your doctor would be the first to say so. Yet that “27 percent” somehow manages to sound as if it meant something.”
Or: “By the same kind of nonsense that the article writer used you can show that clear weather is more dangerous than foggy weather. More accidents occur in clear weather, because there is more clear weather than foggy weather. All the same, fog may be much more dangerous to drive in.”
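The clear-versus-foggy point comes down to comparing rates instead of raw counts. Here is a small sketch with invented numbers. The hours and accident counts are assumptions chosen only to illustrate the arithmetic.

```python
# Hypothetical figures, invented for illustration only.
hours = {"clear": 900, "fog": 100}    # hours of driving in each condition
accidents = {"clear": 90, "fog": 30}  # accidents recorded in each condition

# Accidents per hour of exposure: the figure that actually measures danger.
rates = {weather: accidents[weather] / hours[weather] for weather in hours}

for weather in hours:
    print(f"{weather}: {accidents[weather]} accidents, "
          f"{rates[weather]:.2f} per hour of driving")
```

With these made-up inputs, clear weather has three times as many accidents in total, yet fog is three times as dangerous per hour.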
You can also represent the same data in many different ways: “There are often many ways of expressing any figure. You can, for instance, express exactly the same fact by calling it a one percent return on sales, a fifteen percent return on investment, a ten-million-dollar profit, an increase in profits of forty percent (compared with 1935– 39 average), or a decrease of sixty percent from last year.”
Correlation vs. Causation
“[The post hoc fallacy] is the one that says that if B follows A, then A has caused B. An unwarranted assumption is being made that since smoking and low grades go together, smoking causes low grades. Couldn’t it just as well be the other way around?”
“This is the post hoc fallacy at its best. It says that these figures show that if you (your son, your daughter) attend college you will probably earn more money than if you decide to spend the next four years in some other manner. This unwarranted conclusion has for its basis the equally unwarranted assumption that since college-trained folks make more money, they make it because they went to college. Actually we don’t know but that these are the people who would have made more money even if they had not gone to college.”
How to Talk Back to a Statistic
These five questions help you avoid getting tricked by statistics.
Who Says So?
“About the first thing to look for is bias— the laboratory with something to prove for the sake of a theory, a reputation, or a fee; the newspaper whose aim is a good story; labor or management with a wage level at stake.”
How Does He Know?
“Watch out for evidence of a biased sample, one that has been selected improperly or— as with this one— has selected itself. Ask the question we dealt with in an early chapter: Is the sample large enough to permit any reliable conclusion?”
What’s Missing?
“Watch out for an average, variety unspecified, in any matter where mean and median might be expected to differ substantially.”
“Sometimes it is percentages that are given and raw figures that are missing, and this can be deceptive too. Long ago, when Johns Hopkins University had just begun to admit women students, someone not particularly enamored of coeducation reported a real shocker: Thirty-three and one-third percent of the women at Hopkins had married faculty members! The raw figures gave a clearer picture. There were three women enrolled at the time, and one of them had married a faculty man.”
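To see why a percentage without its raw figures can mislead, here is a small sketch. The 1-of-3 case comes from the Hopkins anecdote. The 1,000-of-3,000 case is a hypothetical comparison added for illustration.

```python
def percent(part, whole):
    """Express part/whole as a percentage."""
    return 100 * part / whole

# One of three women (from the anecdote) versus a hypothetical large group
# with the same percentage. In each case, add one more person and compare.
small_before, small_after = percent(1, 3), percent(2, 3)
large_before, large_after = percent(1000, 3000), percent(1001, 3000)

print(f"small group: {small_before:.1f}% -> {small_after:.1f}%")
print(f"large group: {large_before:.1f}% -> {large_after:.1f}%")
```

One extra marriage doubles the percentage in the group of three but barely moves it in the group of 3,000, so the same "33%" carries very different weight.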
“A report of a great increase in deaths from cancer in the last quarter-century is misleading unless you know how much of it is a product of such extraneous factors as these: Cancer is often listed now where “causes unknown” was formerly used; autopsies are more frequent, giving surer diagnoses; reporting and compiling of medical statistics are more complete; and people more frequently reach the most susceptible ages now. And if you are looking at total deaths rather than the death rate, don’t neglect the fact that there are more people now than there used to be.”
Did Somebody Change the Subject?
Is the figure a measure of what actually happened, or only of what people reported (e.g. how often they said they bathed)?
“The “population” of a large area in China was 28 million. Five years later it was 105 million. Very little of that increase was real; the great difference could be explained only by taking into account the purposes of the two enumerations and the way people would be inclined to feel about being counted in each instance. The first census was for tax and military purposes, the second for famine relief.”
Does it Make Sense?
“Hearings on amendments to the Social Security Act have been haunted by various forms of a statement that makes sense only when not looked at closely. It is an argument that goes like this: Since life expectancy is only about sixty-three years, it is a sham and a fraud to set up a social-security plan with a retirement age of sixty-five, because virtually everybody dies before that.”
How to Lie with Statistics Book Summary
How to Lie with Statistics by Darrell Huff shows how statistics are used to mislead, sometimes unintentionally and sometimes on purpose. It gives readers the knowledge they need to question and understand the story behind the numbers intelligently. In other words, it shows the tricks the crooks use, so that honest men can use this knowledge for self-defense.
I think it’s particularly useful for managers and executives to read and understand this book, because they are usually presented with a lot of numbers, graphs, and charts and are expected to make decisions based on them. People collecting and presenting the numbers to management could employ some of the…
Even though the different kinds of average can be similar when describing the heights of men, they are usually far apart when describing salaries. Be careful when interpreting results that quote “average” values, and make sure you understand which average is being used. It’s also important to look at the range or standard deviation along with the mean. For example, if you pack your clothes according to the average temperature for your vacation to Oklahoma City and ignore the range, you could end up in a hospital. Similarly, pay attention to the errors, especially when comparing figures with small differences. If you ignore those errors, which are implicit in all sampling studies, you could reach incorrect conclusions. Percentages can also mislead, because how they were calculated and what the raw figures are is generally not clear to the reader.
The book illustrates the power of graphs, and the tricks played with them, well. It is very easy to tell two different stories just by changing the ranges on a graph or cutting it off at an artificial baseline, so that the same data can look like a very profitable year or a steady one. Sometimes pictures are used instead of graphs to help visualize the results. However, they can be exploited even more than graphs to deceive readers. Honest comparisons should be done in one dimension and pictures should have areas proportional to