"A well-designed graph clearly shows you the relevant end points of a continuum. This is especially important if you’re documenting some actual or projected change in a quantity, and you want your readers to draw the right conclusions. […]"
"Collecting data through sampling therefore becomes a never-ending battle to avoid sources of bias. [...] While trying to obtain a random sample, researchers sometimes make errors in judgment about whether every person or thing is equally likely to be sampled."
"GIGO is a famous saying coined by early computer scientists: garbage in, garbage out. At the time, people would blindly put their trust into anything a computer output indicated because the output had the illusion of precision and certainty. If a statistic is composed of a series of poorly defined measures, guesses, misunderstandings, oversimplifications, mismeasurements, or flawed estimates, the resulting conclusion will be flawed."
"How do you know when a correlation indicates causation? One way is to conduct a controlled experiment. Another is to apply logic. But be careful - it’s easy to get bogged down in semantics."
"In statistics, the word 'significant' means that the results passed mathematical tests such as t-tests, chi-square tests, regression, and principal components analysis (there are hundreds). Statistical significance tests quantify how easily pure chance can explain the results. With a very large number of observations, even small differences that are trivial in magnitude can be beyond what our models of change and randomness can explain. These tests don’t know what’s noteworthy and what’s not - that’s a human judgment."
"Infographics are often used by lying weasels to shape public opinion, and they rely on the fact that most people won’t study what they’ve done too carefully."
"Just because there’s a number on it, it doesn’t mean that the number was arrived at properly. […] There are a host of errors and biases that can enter into the collection process, and these can lead millions of people to draw the wrong conclusions. Although most of us won’t ever participate in the collection process, thinking about it, critically, is easy to learn and within the reach of all of us."
"Many of us feel intimidated by numbers and so we blindly accept the numbers we’re handed. This can lead to bad decisions and faulty conclusions. We also have a tendency to apply critical thinking only to things we disagree with. In the current information age, pseudo-facts masquerade as facts, misinformation can be indistinguishable from true information, and numbers are often at the heart of any important claim or decision. Bad statistics are everywhere."
"Measurements must be standardized. There must be clear, replicable, and precise procedures for collecting data so that each person who collects it does it in the same way."
"Most of us have difficulty figuring probabilities and statistics in our heads and detecting subtle patterns in complex tables of numbers. We prefer vivid pictures, images, and stories. When making decisions, we tend to overweight such images and stories, compared to statistical information. We also tend to misunderstand or misinterpret graphics."
"One kind of probability - classic probability - is based on the idea of symmetry and equal likelihood […] In the classic case, we know the parameters of the system and thus can calculate the probabilities for the events each system will generate. […] A second kind of probability arises because in daily life we often want to know something about the likelihood of other events occurring […]. In this second case, we need to estimate the parameters of the system because we don’t know what those parameters are. […] A third kind of probability differs from these first two because it’s not obtained from an experiment or a replicable event - rather, it expresses an opinion or degree of belief about how likely a particular event is to occur. This is called subjective probability […]." (Daniel J Levitin, "Weaponized Lies", 2017)
"One way to lie with statistics is to compare things - datasets, populations, types of products - that are different from one another, and pretend that they’re not. As the old idiom says, you can’t compare apples with oranges."
"Probabilities allow us to quantify future events and are an important aid to rational decision making. Without them, we can become seduced by anecdotes and stories."
"Samples give us estimates of something, and they will almost always deviate from the true number by some amount, large or small, and that is the margin of error. […] The margin of error does not address underlying flaws in the research, only the degree of error in the sampling procedure. But ignoring those deeper possible flaws for the moment, there is another measurement or statistic that accompanies any rigorously defined sample: the confidence interval."
"Statistics, because they are numbers, appear to us to be cold, hard facts. It seems that they represent facts given to us by nature and it’s just a matter of finding them. But it’s important to remember that people gather statistics. People choose what to count, how to go about counting, which of the resulting numbers they will share with us, and which words they will use to describe and interpret those numbers. Statistics are not facts. They are interpretations. And your interpretation may be just as good as, or better than, that of the person reporting them to you." (Daniel J Levitin, "Weaponized Lies", 2017)
"The margin of error is how accurate the results are, and the confidence interval is how confident you are that your estimate falls within the margin of error."
"The most accurate but least interpretable form of data
presentation is to make a table, showing every single value. But it is
difficult or impossible for most people to detect patterns and trends in such
data, and so we rely on graphs and charts. Graphs come in two broad types:
Either they represent every data point visually (as in a scatter plot) or they
implement a form of data reduction in which we summarize the data, looking, for
example, only at means or medians."
"To be any good, a sample has to be representative. A sample
is representative if every person or thing in the group you’re studying has an
equally likely chance of being chosen. If not, your sample is biased. […] The
job of the statistician is to formulate an inventory of all those things that
matter in order to obtain a representative sample. Researchers have to avoid the
tendency to capture variables that are easy to identify or collect data on - sometimes
the things that matter are not obvious or are difficult to measure."
"We are a storytelling species, and a social species, easily swayed by the opinions of others. We have three ways to acquire information: We can discover it ourselves, we can absorb it implicitly, or we can be told it explicitly. Much of what we know about the world falls in this last category - somewhere along the line, someone told us a fact or we read about it, and so we know it only second-hand. We rely on people with expertise to tell us." (Daniel J Levitin, "Weaponized Lies", 2017)
"We use the word probability in different ways to mean
different things. It’s easy to get swept away thinking that a person means one
thing when they mean another, and that confusion can cause us to draw the wrong
conclusion."
No comments:
Post a Comment