04 December 2018

Data Science: Hypothesis Testing (Just the Quotes)

"A discoverer is a tester of scientific ideas; he must not only be able to imagine likely hypotheses, and to select suitable ones for investigation, but, as hypotheses may be true or untrue, he must also be competent to invent appropriate experiments for testing them, and to devise the requisite apparatus and arrangements." (George Gore, "The Art of Scientific Discovery", 1878)

"Statistics is the fundamental and most important part of inductive logic. It is both an art and a science, and it deals with the collection, the tabulation, the analysis and interpretation of quantitative and qualitative measurements. It is concerned with the classifying and determining of actual attributes as well as the making of estimates and the testing of various hypotheses by which probable, or expected, values are obtained. It is one of the means of carrying on scientific research in order to ascertain the laws of behavior of things - be they animate or inanimate. Statistics is the technique of the Scientific Method." (Bruce D Greenshields & Frank M Weida, "Statistics with Applications to Highway Traffic Analyses", 1952)

"All testing, all confirmation and disconfirmation of a hypothesis takes place already within a system. And this system is not a more or less arbitrary and doubtful point of departure for all our arguments; no it belongs to the essence of what we call an argument. The system is not so much the point of departure, as the element in which our arguments have their life." (Ludwig Wittgenstein, "On Certainty", 1969)

"Science consists simply of the formulation and testing of hypotheses based on observational evidence; experiments are important where applicable, but their function is merely to simplify observation by imposing controlled conditions." (Henry L Batten, "Evolution of the Earth", 1971)

"Decision-making problems (hypothesis testing) involve situations where it is desired to make a choice among various alternative decisions (hypotheses). Such problems can be viewed as generalized state estimation problems where the definition of state has simply been expanded." (Fred C Schweppe, "Uncertain dynamic systems", 1973)

"Hypothesis testing can introduce the need for multiple models for the multiple hypotheses and, if appropriate, a priori probabilities. The one modeling aspect of hypothesis testing that has no estimation counterpart is the problem of specifying the hypotheses to be considered. Often this is a critical step which influences both performance and the difficulty of implementation." (Fred C Schweppe, "Uncertain dynamic systems", 1973)

"Pattern recognition can be viewed as a special case of hypothesis testing. In pattern recognition, an observation z is to be used to decide what pattern caused it. Each possible pattern can be viewed as one hypothesis. The main problem in pattern recognition is the development of models for the z corresponding to each pattern (hypothesis)." (Fred C Schweppe, "Uncertain dynamic systems", 1973)
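The idea in the quote above, one model per pattern and a decision driven by the observation z, can be sketched in a few lines. This is only an illustration under assumptions that are not in the source: each pattern is modeled as a Gaussian, the a priori probabilities are known, and the decision rule picks the largest prior times likelihood. The names `map_decision`, `pattern_A`, and `pattern_B` are hypothetical.

```python
from statistics import NormalDist


def map_decision(z, hypotheses):
    """Return the name of the hypothesis with the largest prior * likelihood.

    `hypotheses` maps a name to a (model, prior) pair, where `model` is a
    NormalDist describing how z is distributed under that hypothesis.
    """
    best_name, best_score = None, float("-inf")
    for name, (model, prior) in hypotheses.items():
        score = prior * model.pdf(z)  # unnormalized posterior weight
        if score > best_score:
            best_name, best_score = name, score
    return best_name


# Two hypothetical patterns (assumed Gaussian models, equal priors).
patterns = {
    "pattern_A": (NormalDist(mu=0.0, sigma=1.0), 0.5),
    "pattern_B": (NormalDist(mu=3.0, sigma=1.0), 0.5),
}

print(map_decision(0.4, patterns))
print(map_decision(2.9, patterns))
```

With equal priors and equal variances the rule reduces to picking the nearest mean; the hard part, as the quote says, is building the models themselves.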

"The term hypothesis testing arises because the choice as to which process is observed is based on hypothesized models. Thus hypothesis testing could also be called model testing. Hypothesis testing is sometimes called decision theory. The detection theory of communication theory is a special case." (Fred C Schweppe, "Uncertain dynamic systems", 1973)

"Beware of the problem of testing too many hypotheses; the more you torture the data, the more likely they are to confess, but confessions obtained under duress may not be admissible in the court of scientific opinion." (Stephen M. Stigler, "Neutral Models in Biology", 1987)
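The warning about testing too many hypotheses can be made concrete with a small calculation. The sketch below assumes the tests are independent and that every null hypothesis is true, which is a simplification not stated in the quote; the function names are illustrative. It computes the chance of at least one false "confession" and the per-test threshold under a Bonferroni correction.

```python
def familywise_error_rate(alpha: float, num_tests: int) -> float:
    """Probability of at least one false positive among independent tests,
    assuming every null hypothesis is actually true."""
    return 1.0 - (1.0 - alpha) ** num_tests


def bonferroni_alpha(alpha: float, num_tests: int) -> float:
    """Per-test significance level that keeps the family-wise rate at or below alpha."""
    return alpha / num_tests


for m in (1, 5, 20, 100):
    print(m, round(familywise_error_rate(0.05, m), 4), bonferroni_alpha(0.05, m))
```

At a 5% level, twenty independent tests already carry roughly a 64% chance of at least one spurious rejection.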

"A little thought reveals a fact widely understood among statisticians: The null hypothesis, taken literally (and that’s the only way you can take it in formal hypothesis testing), is always false in the real world. [...] If it is false, even to a tiny degree, it must be the case that a large enough sample will produce a significant result and lead to its rejection. So if the null hypothesis is always false, what’s the big deal about rejecting it?" (Jacob Cohen, "Things I Have Learned (So Far)", American Psychologist, 1990)
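Cohen's point, that a tiny departure from the null becomes "significant" once the sample is large enough, can be sketched with a two-sided one-sample z-test. The setup is an assumption for illustration only: the standard deviation is known and equal to 1, and the observed sample mean is taken to be exactly the true effect, so the z statistic is the effect times the square root of the sample size.

```python
import math


def two_sided_p_value(effect: float, sample_size: int, sigma: float = 1.0) -> float:
    """Two-sided p-value of a z-test when the observed mean equals `effect`.

    Idealized: the sample mean is set to the true effect, so there is no
    sampling noise and z = effect * sqrt(n) / sigma.
    """
    z = abs(effect) * math.sqrt(sample_size) / sigma
    # For a standard normal Z, P(|Z| > z) = erfc(z / sqrt(2)).
    return math.erfc(z / math.sqrt(2.0))


# A hypothetical effect of one hundredth of a standard deviation.
for n in (100, 10_000, 100_000, 1_000_000):
    print(n, two_sided_p_value(0.01, n))
```

The effect never changes in this loop; only the sample size does, which is why a confidence interval on the effect size is often more informative than the rejection itself.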

"There is a tendency to use hypothesis testing methods even when they are not appropriate. Often, estimation and confidence intervals are better tools. Use hypothesis testing only when you want to test a well-defined hypothesis." (Larry A Wasserman, "All of Statistics: A concise course in statistical inference", 2004)

"A type of error used in hypothesis testing that arises when incorrectly rejecting the null hypothesis, although it is actually true. Thus, based on the test statistic, the final conclusion rejects the Null hypothesis, but in truth it should be accepted. Type I error equates to the alpha (α) or significance level, whereby the generally accepted default is 5%." (Lynne Hambleton, "Treasure Chest of Six Sigma Growth Methods, Tools, and Best Practices", 2007)
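The link between the Type I error and the significance level can be checked by simulation. The sketch below is a minimal example under assumed conditions (normally distributed data with known unit variance, a two-sided z-test, a fixed random seed); the function name and defaults are illustrative. It draws many samples for which the null hypothesis is true and counts how often the test rejects anyway.

```python
import random
from statistics import NormalDist, fmean


def type_one_error_rate(alpha=0.05, sample_size=25, trials=5000, seed=0):
    """Estimate how often a two-sided z-test rejects a true null hypothesis."""
    rng = random.Random(seed)
    critical_z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # about 1.96 for alpha = 0.05
    rejections = 0
    for _ in range(trials):
        # Data generated under the null: mean 0, standard deviation 1.
        sample = [rng.gauss(0.0, 1.0) for _ in range(sample_size)]
        z = fmean(sample) * sample_size ** 0.5
        if abs(z) > critical_z:
            rejections += 1
    return rejections / trials


print(type_one_error_rate())
```

The estimated rate should land close to the chosen alpha, since the threshold is built so that a true null is rejected with exactly that probability.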

"The way we explore data today, we often aren't constrained by rigid hypothesis testing or statistical rigor that can slow down the process to a crawl. But we need to be careful with this rapid pace of exploration, too. Modern business intelligence and analytics tools allow us to do so much with data so quickly that it can be easy to fall into a pitfall by creating a chart that misleads us in the early stages of the process." (Ben Jones, "Avoiding Data Pitfalls: How to Steer Clear of Common Blunders When Working with Data and Presenting Analysis and Visualizations", 2020) 
