"A discoverer is a tester of scientific ideas; he must not only be able to imagine likely hypotheses, and to select suitable ones for investigation, but, as hypotheses may be true or untrue, he must also be competent to invent appropriate experiments for testing them, and to devise the requisite apparatus and arrangements." (George Gore, "The Art of Scientific Discovery", 1878)
"Statistics is the fundamental and most important part of inductive logic. It is both an art and a science, and it deals with the collection, the tabulation, the analysis and interpretation of quantitative and qualitative measurements. It is concerned with the classifying and determining of actual attributes as well as the making of estimates and the testing of various hypotheses by which probable, or expected, values are obtained. It is one of the means of carrying on scientific research in order to ascertain the laws of behavior of things - be they animate or inanimate. Statistics is the technique of the Scientific Method." (Bruce D Greenschields & Frank M Weida, "Statistics with Applications to Highway Traffic Analyses", 1952)
"The peculiarity of [...] statistical hypotheses is that they are not conclusively refutable by any experience." (Richard B Braithwaite, "Scientific Explanation: A Study of the Function of Theory, Probability and Law in Science", 1953)
"Tests of the null hypothesis that there is no difference between certain treatments are often made in the analysis of agricultural or industrial experiments in which alternative methods or processes are compared. Such tests are [...] totally irrelevant. What are needed are estimates of magnitudes of effects, with standard errors." (Francis J Anscombe, "Discussion on Dr. David’s and Dr. Johnson’s Paper", Journal of the Royal Statistical Society B 18, 1956)
"[...] the tests of null hypotheses of zero differences, of no relationships, are frequently weak, perhaps trivial statements of the researcher’s aims [...] in many cases, instead of the tests of significance it would be more to the point to measure the magnitudes of the relationships, attaching proper statements of their sampling variation. The magnitudes of relationships cannot be measured in terms of levels of significance." (Leslie Kish, "Some statistical problems in research design", American Sociological Review 24, 1959)
"In view of our long-term strategy of improving our theories, our statistical tactics can be greatly improved by shifting emphasis away from over-all hypothesis testing in the direction of statistical estimation. This always holds true when we are concerned with the actual size of one or more differences rather than simply in the existence of differences." (David A Grant, "Testing the null hypothesis and the strategy and tactics of investigating theoretical models", Psychological Review 69, 1962)
"[...] we need to get on with the business of generating [...] hypotheses and proceed to do investigations and make inferences which bear on them, instead of [...] testing the statistical null hypothesis in any number of contexts in which we have every reason to suppose that it is false in the first place." (David Bakan, "The test of significance in psychological research", Psychological Bulletin 66, 1966)
"All testing, all confirmation and disconfirmation of a hypothesis takes place already within a system. And this system is not a more or less arbitrary and doubtful point of departure for all our arguments; no it belongs to the essence of what we call an argument. The system is not so much the point of departure, as the element in which our arguments have their life." (Ludwig Wittgenstein, "On Certainty", 1969)
"Science consists simply of the formulation and testing of hypotheses based on observational evidence; experiments are important where applicable, but their function is merely to simplify observation by imposing controlled conditions." (Henry L Batten, "Evolution of the Earth", 1971)
"[...] the statistical power of many psychological studies is ridiculously low. This is a self-defeating practice: it makes for frustrated scientists and inefficient research. The investigator who tests a valid hypothesis but fails to obtain significant results cannot help but regard nature as untrustworthy or even hostile." (Amos Tversky & Daniel Kahneman, "Belief in the law of small numbers", Psychological Bulletin 76(2), 1971)
"Decision-making problems (hypothesis testing) involve situations where it is desired to make a choice among various alternative decisions (hypotheses). Such problems can be viewed as generalized state estimation problems where the definition of state has simply been expanded." (Fred C Scweppe, "Uncertain dynamic systems", 1973)
"Hypothesis testing can introduce the need for multiple models for the multiple hypotheses and,' if appropriate, a priori probabilities. The one modeling aspect of hypothesis testing that has no estimation counterpart is the problem of specifying the hypotheses to be considered. Often this is a critical step which influences both performance arid the difficulty of implementation." (Fred C Scweppe, "Uncertain dynamic systems", 1973)
"Pattern recognition can be viewed as a special case of hypothesis testing. In pattern recognition, an observation z is to be used to decide what pattern caused it. Each possible pattern can be viewed as one hypothesis. The main problem in pattern recognition is the development of models for the z corresponding to each pattern (hypothesis)." (Fred C Scweppe, "Uncertain dynamic systems", 1973)
"The term hypothesis testing arises because the choice as to which process is observed is based on hypothesized models. Thus hypothesis testing could also be called model testing. Hypothesis testing is sometimes called decision theory. The detection theory of communication theory is a special case." (Fred C Scweppe, "Uncertain dynamic systems", 1973)
"Small wonder that students have trouble [with statistical hypothesis testing]. They may be trying to think." (W Edwards Deming, "On probability as a basis for action", American Statistician 29, 1975)
"Tests appear to many users to be a simple way to discharge the obligation to provide some statistical treatment of the data." (H V Roberts, "For what use are tests of hypotheses and tests of significance", Communications in Statistics [Series A], 1976)
"In practice, of course, tests of significance are not taken seriously." (Louis Guttman, "The illogic of statistical inference for cumulative science", Applied Stochastic Models and Data Analysis, 1985)
"Most readers of The American Statistician will recognize the limited value of hypothesis testing in the science of statistics. I am not sure that they all realize the extent to which it has become the primary tool in the religion of Statistics." (David Salsburg, The Religion of Statistics as Practiced in Medical Journals, "The American Statistician" 39, 1985)
"Since a point hypothesis is not to be expected in practice to be exactly true, but only approximate, a proper test of significance should almost always show significance for large enough samples. So the whole game of testing point hypotheses, power analysis notwithstanding, is but a mathematical game without empirical importance." (Louis Guttman, "The illogic of statistical inference for cumulative science", Applied Stochastic Models and Data Analysis, 1985
"We shall marshal arguments against [significance] testing, leading to the conclusion that it be abandoned by all substantive science and not just by educational research and other social sciences which have begun to raise voices against the virtual tyranny of this branch of inference in the academic world." (Louis Guttman, "The illogic of statistical inference for cumulative science", Applied Stochastic Models and Data Analysis, 1985)
"Analysis of variance [...] stems from a hypothesis-testing formulation that is difficult to take seriously and would be of limited value for making final conclusions." (Herman Chernoff, Comment, The American Statistician 40(1), 1986)
"We are better off abandoning the use of hypothesis tests entirely and concentrating on developing continuous measures of toxicity which can be used for estimation." (David Salsburg, "Statistics for Toxicologists", 1986)
"Beware of the problem of testing too many hypotheses; the more you torture the data, the more likely they are to confess, but confessions obtained under duress may not be admissible in the court of scientific opinion." (Stephen M Stigler, "Neutral Models in Biology", 1987)
"A little thought reveals a fact widely understood among statisticians: The null hypothesis, taken literally (and that’s the only way you can take it in formal hypothesis testing), is always false in the real world. [...] If it is false, even to a tiny degree, it must be the case that a large enough sample will produce a significant result and lead to its rejection. So if the null hypothesis is always false, what’s the big deal about rejecting it?" (Jacob Cohen, "Things I Have Learned (So Far)", American Psychologist, 1990)
"I believe [...] that hypothesis testing has been greatly overemphasized in psychology and in the other disciplines that use it. It has diverted our attention from crucial issues. Mesmerized by a single all-purpose, mechanized, ‘objective’ ritual in which we convert numbers into other numbers and get a yes-no answer, we have come to neglect close scrutiny of where the numbers come from." (Jacob Cohen, "Things I have learned (so far)", American Psychologist 45, 1990)
"Despite the stranglehold that hypothesis testing has on experimental psychology, I find it difficult to imagine a less insightful means of transitting from data to conclusions." (Geoffrey R Loftus, "On the tyranny of hypothesis testing in the social sciences", Contemporary Psychology 36, 1991)
"How has the virtually barren technique of hypothesis testing come to assume such importance in the process by which we arrive at our conclusions from our data?" (Geoffrey R Loftus, "On the tyranny of hypothesis testing in the social sciences", Contemporary Psychology 36, 1991)
"This remarkable state of affairs [overuse of significance testing] is analogous to engineers’ teaching (and believing) that light consists only of waves while ignoring its particle characteristics—and losing in the process, of course, any motivation to pursue the most interesting puzzles and paradoxes in the field." (Geoffrey R Loftus, "On the tyranny of hypothesis testing in the social sciences", Contemporary Psychology 36, 1991)
"Whereas hypothesis testing emphasizes a very narrow question (‘Do the population means fail to conform to a specific pattern?’), the use of confidence intervals emphasizes a much broader question (‘What are the population means?’). Knowing what the means are, of course, implies knowing whether they fail to conform to a specific pattern, although the reverse is not true. In this sense, use of confidence intervals subsumes the process of hypothesis testing." (Geoffrey R Loftus, "On the tyranny of hypothesis testing in the social sciences", Contemporary Psychology 36, 1991)
"After four decades of severe criticism, the ritual of null hypothesis significance testing—mechanical dichotomous decisions around a sacred .05 criterion—still persist. This article reviews the problems with this practice [...]” [...] “What’s wrong with [null hypothesis significance testing]? Well, among many other things, it does not tell us what we want to know, and we so much want to know what we want to know that, out of desperation, we nevertheless believe that it does!" (Jacob Cohen, "The earth is round (p<.05)", American Psychologist 49, 1994)
"I argued that hypothesis testing is fundamentally inappropriate for ecological risk assessment, that its use has undesirable consequences for environmental protection, and that preferable alternatives exist for statistical analysis of data in ecological risk assessment. The conclusion of this paper is that ecological risk assessors should estimate risks rather than test hypothesis" (Glenn W Suter, "Abuse of hypothesis testing statistics in ecological risk assessment", Human and Ecological Risk Assessment 2, 1996)
"I contend that the general acceptance of statistical hypothesis testing is one of the most unfortunate aspects of 20th century applied science. Tests for the identity of population distributions, for equality of treatment means, for presence of interactions, for the nullity of a correlation coefficient, and so on, have been responsible for much bad science, much lazy science, and much silly science. A good scientist can manage with, and will not be misled by, parameter estimates and their associated standard errors or confidence limits." (Marks Nester, "A Myopic View and History of Hypothesis Testing", 1996)
"Statistical hypothesis testing is commonly used inappropriately to analyze data, determine causality, and make decisions about significance in ecological risk assessment,[...] It discourages good toxicity testing and field studies, it provides less protection to ecosystems or their components that are difficult to sample or replicate, and it provides less protection when more treatments or responses are used. It provides a poor basis for decision-making because it does not generate a conclusion of no effect, it does not indicate the nature or magnitude of effects, it does address effects at untested exposure levels, and it confounds effects and uncertainty[...]. Risk assessors should focus on analyzing the relationship between exposure and effects[...]." (Glenn W Suter, "Abuse of hypothesis testing statistics in ecological risk assessment", Human and Ecological Risk Assessment 2, 1996)
"We should push for de-emphasizing some topics, such as statistical significance tests - an unfortunate carry-over from the traditional elementary statistics course. We would suggest a greater focus on confidence intervals - these achieve the aim of formal hypothesis testing, often provide additional useful information, and are not as easily misinterpreted." (Gerry Hahn et al, "The Impact of Six Sigma Improvement: A Glimpse Into the Future of Statistics", The American Statistician, 1999)
"There is a tendency to use hypothesis testing methods even when they are not appropriate. Often, estimation and confidence intervals are better tools. Use hypothesis testing only when you want to test a well-defined hypothesis."
"A type of error used in hypothesis testing that arises when incorrectly rejecting the null hypothesis, although it is actually true. Thus, based on the test statistic, the final conclusion rejects the Null hypothesis, but in truth it should be accepted. Type I error equates to the alpha (α) or significance level, whereby the generally accepted default is 5%." (Lynne Hambleton, "Treasure Chest of Six Sigma Growth Methods, Tools, and Best Practices", 2007)
"The way we explore data today, we often aren't constrained by rigid hypothesis testing or statistical rigor that can slow down the process to a crawl. But we need to be careful with this rapid pace of exploration, too. Modern business intelligence and analytics tools allow us to do so much with data so quickly that it can be easy to fall into a pitfall by creating a chart that misleads us in the early stages of the process." (Ben Jones, "Avoiding Data Pitfalls: How to Steer Clear of Common Blunders When Working with Data and Presenting Analysis and Visualizations", 2020)