"Even properly done statistics can’t be trusted. The plethora of available statistical techniques and analyses grants researchers an enormous amount of freedom when analyzing their data, and it is trivially easy to ‘torture the data until it confesses’. Just try several different analyses offered by your statistical software until one of them turns up an interesting result, and then pretend this is the analysis you intended to do all along. Without psychic powers, it’s almost impossible to tell when a published result was obtained through data torture." (Alex Reinhart, "Statistics Done Wrong: The Woefully Complete Guide", 2015)
"In science, it is important to limit two kinds of errors: false positives, where you conclude there is an effect when there isn’t, and false negatives, where you fail to notice a real effect. In some sense, false positives and false negatives are flip sides of the same coin. If we’re too ready to jump to conclusions about effects, we’re prone to get false positives; if we’re too conservative, we’ll err on the side of false negatives." (Alex Reinhart, "Statistics Done Wrong: The Woefully Complete Guide", 2015)
"In exploratory data analysis, you don’t choose a hypothesis to test in advance. You collect data and poke it to see what interesting details might pop out, ideally leading to new hypotheses and new experiments. This process involves making numerous plots, trying a few statistical analyses, and following any promising leads. But aimlessly exploring data means a lot of opportunities for false positives and truth inflation." (Alex Reinhart, "Statistics Done Wrong: The Woefully Complete Guide", 2015)
"In short, statistical significance does not mean your result has any practical significance. As for statistical insignificance, it doesn’t tell you much. A statistically insignificant difference could be nothing but noise, or it could represent a real effect that can be pinned down only with more data." (Alex Reinhart, "Statistics Done Wrong: The Woefully Complete Guide", 2015)
"More useful than a statement that an experiment’s results were statistically insignificant is a confidence interval giving plausible sizes for the effect. Even if the confidence interval includes zero, its width tells you a lot: a narrow interval covering zero tells you that the effect is most likely small (which may be all you need to know, if a small effect is not practically useful), while a wide interval clearly shows that the measurement was not precise enough to draw conclusions." (Alex Reinhart, "Statistics Done Wrong: The Woefully Complete Guide", 2015)
"Much of experimental science comes down to measuring differences. [...] We use statistics to make judgments about these kinds of differences. We will always observe some difference due to luck and random variation, so statisticians talk about statistically significant differences when the difference is larger than could easily be produced by luck. So first we must learn how to make that decision." (Alex Reinhart, "Statistics Done Wrong: The Woefully Complete Guide", 2015)
"Overlapping confidence intervals do not mean two values are not significantly different. Checking confidence intervals or standard errors will mislead. It’s always best to use the appropriate hypothesis test instead. Your eyeball is not a well-defined statistical procedure." (Alex Reinhart, "Statistics Done Wrong: The Woefully Complete Guide", 2015)
"The p value is the probability, under the assumption that there is no true effect or no true difference, of collecting data that shows a difference equal to or more extreme than what you actually observed. [...] Remember, a p value is not a measure of how right you are or how important a difference is. Instead, think of it as a measure of surprise." (Alex Reinhart, "Statistics Done Wrong: The Woefully Complete Guide", 2015)
"There is exactly one situation when visually checking confidence intervals works, and it is when comparing the confidence interval against a fixed value, rather than another confidence interval. If you want to know whether a number is plausibly zero, you may check to see whether its confidence interval overlaps with zero. There are, of course, formal statistical procedures that generate confidence intervals that can be compared by eye and that even correct for multiple comparisons automatically. Unfortunately, these procedures work only in certain circumstances;" (Alex Reinhart, "Statistics Done Wrong: The Woefully Complete Guide", 2015)
"When statisticians are asked for an interesting paradoxical result in statistics, they often turn to Simpson’s paradox. Simpson’s paradox arises whenever an apparent trend in data, caused by a confounding variable, can be eliminated or reversed by splitting the data into natural groups." (Alex Reinhart, "Statistics Done Wrong: The Woefully Complete Guide", 2015)