"A study is any data collection exercise. The purpose of any study is to answer a question. [...] Once the question has been clearly articulated, it’s time to design a study to answer it. At one end of the spectrum, a study can be a controlled experiment, deliberate and structured, where researchers act like the ultimate control freaks, manipulating everything from the gender of their test subjects to the humidity in the room. Scientific studies, the kind run by men in white lab coats and safety goggles, are often controlled experiments. At the other end of the spectrum, an observational study is simply the process of watching something unfold without trying to impact the outcome in any way." (Kristin H Jarman, "The Art of Data Analysis: How to answer almost any question using basic statistics", 2013)
"According to the central limit theorem, it doesn’t matter what the raw data look like, the sample variance should be proportional to the number of observations and if I have enough of them, the sample mean should be normal." (Kristin H Jarman, "The Art of Data Analysis: How to answer almost any question using basic statistics", 2013)
"Although it’s a little more complicated than [replication and random sampling], blocking is a powerful way to eliminate confounding factors. Blocking is the process of dividing a sample into one or more similar groups, or blocks, so that samples in each block have certain factors in common. This technique is a great way to gain a little control over an experiment with lots of uncontrollable factors." (Kristin H Jarman, "The Art of Data Analysis: How to answer almost any question using basic statistics", 2013)
"Any factor you don’t account for can become a confounding factor. A confounding factor is any variable that confuses the conclusions of your study, or makes them ambiguous. [...] Confounding factors can really screw up an otherwise perfectly good statistical analysis." (Kristin H Jarman, "The Art of Data Analysis: How to answer almost any question using basic statistics", 2013)
"Any time you collect data, you have uncertainty to deal with. This uncertainty comes from two places: (1) inherent variation in the values a random variable can take on and (2) the fact that for most studies, you can’t capture the entire population and so you must rely on a sample to make your conclusions." (Kristin H Jarman, "The Art of Data Analysis: How to answer almost any question using basic statistics", 2013)
"Choosing and organizing a sample is a crucial part of the experimental design process. Statistically speaking, the best type of sample is called a random sample. A random sample is a subset of the entire population, chosen so each member is equally likely to be picked. [...] Random sampling is the best way to guarantee you’ve chosen objectively, without personal preference or bias." (Kristin H Jarman, "The Art of Data Analysis: How to answer almost any question using basic statistics", 2013)
"Probability, the mathematical language of uncertainty, describes what are called random experiments, bets, campaigns, trials, games, brawls, and anything other situation where the outcome isn’t known beforehand. A probability is a fraction, a value between zero and one that measures the likelihood a given outcome will occur. A probability of zero means the outcome is virtually impossible. A probability of one means it will almost certainly happen. A probability of one-half means the outcome is just as likely to occur as not." (Kristin H Jarman, "The Art of Data Analysis: How to answer almost any question using basic statistics", 2013)
"Replication is the process of taking more than one observation or measurement. [...] Replication helps eliminate negative effects of uncontrollable factors, because it keeps us from getting fooled by a single, unusual outcome." (Kristin H Jarman, "The Art of Data Analysis: How to answer almost any question using basic statistics", 2013)
"The random experiment, or trial, is the situation whose outcome is uncertain, the one you’re watching. A coin toss is a random experiment, because you don’t know beforehand whether it will turn up heads or tails. The sample space is the list of all possible separate and distinct outcomes in your random experiment. The sample space in a coin toss contains the two outcomes heads and tails. The outcome you're interested in calculating a probability for is the event. On a coin toss, that might be the case where the coin lands on heads." (Kristin H Jarman, "The Art of Data Analysis: How to answer almost any question using basic statistics", 2013)
"The scientific method is the foundation of modern research. It’s how we prove a theory. It’s how we demonstrate cause and effect. It’s how we discover, innovate, and invent. There are five basic steps to the scientific method: (1) Ask a question. (2) Conduct background research. (3) Come up with a hypothesis. (4) Test the hypothesis with data. (5) Revise and retest the hypothesis until a conclusion can be made." (Kristin H Jarman, "The Art of Data Analysis: How to answer almost any question using basic statistics", 2013)
"There are three important requirements for the probability distribution. First, it should be defined for every possible value the random variable can take on. In other words, it should completely describe the sample space of a random experiment. Second, the probability distribution values should always be nonnegative. They’re meant to measure probabilities, after all, and probabilities are never less than zero. Finally, when all the probability distribution values are summed together, they must add to one." (Kristin H Jarman, "The Art of Data Analysis: How to answer almost any question using basic statistics", 2013)