02 December 2018

🔭Data Science: Hypothesis (Just the Quotes)

"[…] it is not necessary that these hypotheses should be true, or even probably; but it is enough if they provide a calculus which fits the observations […]" (Andrew Osiander, "On the Revolutions of the Heavenly Spheres", 1543)

"The art of discovering the causes of phenomena, or true hypothesis, is like the art of decyphering, in which an ingenious conjecture greatly shortens the road." (Gottfried W Leibniz, "New Essays Concerning Human Understanding", 1704) [published 1765]

"In order to shake a hypothesis, it is sometimes not necessary to do anything more than push it as far as it will go." (Denis Diderot, "On the Interpretation of Nature", 1753)

"No hypothesis can lay claim to any value unless it assembles many phenomena under one concept." (Johann Wolfgang von Goethe, [letter to Sommering] 1795)

"Induction, analogy, hypotheses founded upon facts and rectified continually by new observations, a happy tact given by nature and strengthened by numerous comparisons of its indications with experience, such are the principal means for arriving at truth." (Pierre-Simon Laplace, "A Philosophical Essay on Probabilities", 1814)

"The hypothesis is like the captain, and the observations like the soldiers of an army: while he appears to command them, and in this way to work his own will, he does in fact derive all his power of conquest from their obedience, and becomes helpless and useless if they mutiny." (William Whewell, "Philosophy of the Inductive Sciences", 1840)

"The process of scientific discovery is cautious and rigorous, not by abstaining from hypothesis, but by rigorously comparing hypotheses with facts, and by resolutely rejecting all which the comparison does not confirm." (William Whewell, "The Philosophy of the Inductive Sciences Founded Upon Their History" Vol. 2, 1840)

"When the hypothesis, of itself and without adjustment for the purpose, gives us the rule and reason of a class of facts not contemplated in its construction, we have a criterion of its reality, which has never yet been produced in favour of falsehood." (William Whewell, "The Philosophy of the Inductive Sciences", 1840) 

"An hypothesis being a mere supposition, there are no other limits to hypotheses than those of the human imagination; we may, if we please, imagine, by way of accounting for an effect, some cause of a kind utterly unknown, and acting according to a law altogether fictitious." (John S Mill, "A System of Logic, Ratiocinative and Inductive", 1843)

"It appears, then, to be a condition of a genuinely scientific hypothesis, that it be not destined always to remain an hypothesis, but be certain to be either proved or disproved by [...] comparison with observed facts." (John S Mill, "A System of Logic, Ratiocinative and Inductive", 1843)

"The hypothesis, by suggesting observations and experiments, puts us upon the road to that independent evidence if it be really attainable; and till it be attained, the hypothesis ought not to count for more than a suspicion." (John S Mill, "A System of Logic, Ratiocinative and Inductive", 1843)

"The rules of scientific investigation always require us, when we enter the domains of conjecture, to adopt that hypothesis by which the greatest number of known facts and phenomena may be reconciled." (Matthew F Maury, "The Physical Geography of the Sea", 1855) 

"An anticipative idea or an hypothesis is, then, the necessary starting point for all experimental reasoning. Without it, we could not make any investigation at all nor learn anything; we could only pile up sterile observations. If we experiment without a preconceived idea, we should move at random […]" (Claude Bernard, "An Introduction to the Study of Experimental Medicine", 1865)

"In scientific investigations, it is permitted to invent any hypothesis and, if it explains various large and independent classes of facts, it rises to the ranks of a well-grounded theory." (Charles Darwin, "The Variations of Animals and Plants Under Domestication" Vol. 1, 1868)

"The great tragedy of Science - the slaying of a beautiful hypothesis by an ugly fact." (Thomas H Huxley, "Biogenesis and abiogenesis", [address] 1870)

"[…] wrong hypotheses, rightly worked from, have produced more useful results than unguided observation." (Augustus de Morgan, "A Budget of Paradoxes", 1872)

"An hypothesis is only a habit - a habit of looking through a glass of one peculiar colour, which imparts its hue to all around it." (Frederick Marryat, "The King's Own", 1873) 

"A discoverer is a tester of scientific ideas; he must not only be able to imagine likely hypotheses, and to select suitable ones for investigation, but, as hypotheses may be true or untrue, he must also be competent to invent appropriate experiments for testing them, and to devise the requisite apparatus and arrangements." (George Gore, "The Art of Scientific Discovery", 1878)

"The scientific discovery appears first as the hypothesis of an analogy; and science tends to become independent of the hypothesis." (William K Clifford, "Lectures and Essays", 1879)

"Every hypothesis must derive indubitable results from mechanically well-defined assumptions by mathematically correct methods." (Ludwig Boltzmann, "Certain Questions of the Theory of Gasses", Nature Vol. 51 (1322), 1895) 

"For the truly scientific man, the hypothesis is destined solely to enable him to get the facts of nature in some definite order, an order which shall make apparent their connection with the great order and harmony which is believed to be present in the universe." (James M Baldwin, "The Processes of Life Revealed by the Microscope: A Plea for Physiological Histology", Science N.S. Vol. 2 (34), 1895)

"If the working hypothesis fails in any essential particular he [the scientist] is ready to modify or discard it. For the truly inspired investigator, one undoubted fact weighs more in the balance than a thousand theories." (James M Baldwin, "The Processes of Life Revealed by the Microscope: A Plea for Physiological Histology", Science N.S. Vol. 2 (34), 1895)

"In scientific investigations, it is permitted to invent any hypothesis and, if it explains various large and independent classes of facts, it rises to the ranks of a well-grounded theory." (Charles Darwin, "The Variations of Animals and Plants Under Domestication" Vol. 1, 1896)

"Entia non sunt multiplicanda praeter necessitatem. That is to say; before you try a complicated hypothesis, you should make quite sure that no simplification of it will explain the facts equally well." (Charles S Peirce," Pragmatism and Pragmaticism", [lecture] 1903)

"A false hypothesis, if it serve as a guide for further enquiry, may, at the right stage of science, be as useful as, or more useful than, a truer one for which acceptable evidence is not yet at hand." (William C Dampier, "Science and the Human Mind, Science in the Ancient World", 1912) 

"Without hypothesis there can be no progress in knowledge." (Max Verworn, "Irritability", 1913) 

"The great difference between induction and hypothesis is that the former infers the existence of phenomena such as we have observed in cases which are similar, while hypothesis supposes something of a different kind from what we have directly observed, and frequently something which it would be impossible for us to observe directly." (Charles S Peirce, "Chance, Love and Logic: Philosophical Essays, Deduction, Induction, Hypothesis", 1914)

"Theory is the best guide for experiment - that were it not for theory and the problems and hypotheses that come out of it, we would not know the points we wanted to verify, and hence would experiment aimlessly" (Henry Hazlitt,  "Thinking as a Science", 1916)

"A good hypothesis in science must have other properties than those of the phenomenon it is immediately invoked to explain, otherwise it is not prolific enough." (William James, "Selected Papers on Philosophy", 1918) 

"An indispensable hypothesis, even though still far from being a guarantee of success, is however the pursuit of a specific aim, whose lighted beacon, even by initial failures, is not betrayed." (Max Planck, [Nobel lecture] 1918) 

"A hypothesis or theory is clear, decisive, and positive, but it is believed by no one but the man who created it. Experimental findings, on the other hand, are messy, inexact things, which are believed by everyone except the man who did the work." (Harlow Shapley, "Review of Scientific Instruments" Vol. 6, 1922) 

"However successful a theory or law may have been in the past, directly it fails to interpret new discoveries its work is finished, and it must be discarded or modified. However plausible the hypothesis, it must be ever ready for sacrifice on the altar of observation." (Joseph W Mellor, "A Comprehensive Treatise on Inorganic and Theoretical Chemistry", 1922) 

"Hypothesis, however, is an inference based on knowledge which is insufficient to prove its high probability." (Frederick L Barry, "The Scientific Habit of Thought", 1927) 

"Abstraction is the detection of a common quality in the characteristics of a number of diverse observations […] A hypothesis serves the same purpose, but in a different way. It relates apparently diverse experiences, not by directly detecting a common quality in the experiences themselves, but by inventing a fictitious substance or process or idea, in terms of which the experience can be expressed. A hypothesis, in brief, correlates observations by adding something to them, while abstraction achieves the same end by subtracting something." (Herbert Dingle, Science and Human Experience, 1931)

"Science does not aim, primarily, at high probabilities. It aims at a high informative content, well backed by experience. But a hypothesis may be very probable simply because it tells us nothing, or very little." (Karl Popper, "The Logic of Scientific Discovery", 1934) 

"All the theories and hypotheses of empirical science share this provisional character of being established and accepted ‘until further notice’, whereas a mathematical theorem, once proved, is established once and for all; it holds with that particular certainty which no subsequent empirical discoveries, however unexpected and extraordinary, can ever affect to the slightest extent." (Carl G Hempel, "Geometry and Empirical Science", 1935)

"In relation to any experiment we may speak of this hypothesis as the null hypothesis, and it should be noted that the null hypothesis is never proved or established, but is possibly disproved, in the course of experimentation. Every experiment may be said to exist only in order to give the facts a chance of disproving the null hypothesis." (Ronald Fisher, "The Design of Experiments", 1935)

"The laws of science are the permanent contributions to knowledge - the individual pieces that are fitted together in an attempt to form a picture of the physical universe in action. As the pieces fall into place, we often catch glimpses of emerging patterns, called theories; they set us searching for the missing pieces that will fill in the gaps and complete the patterns. These theories, these provisional interpretations of the data in hand, are mere working hypotheses, and they are treated with scant respect until they can be tested by new pieces of the puzzle." (Edwin P Whipple, "Experiment and Experience", [Commencement Address, California Institute of Technology] 1938)

"When two hypotheses are possible, we provisionally choose that which our minds adjudge to the simpler on the supposition that this Is the more likely to lead in the direction of the truth." (James H Jeans, "Physics and Philosophy" 3rd Ed., 1943)

"We see what we want to see, and observation conforms to hypothesis." (Bergen Evans, "The Natural History of Nonsense", 1946)

"A successful hypothesis is not necessarily a permanent hypothesis, but it is one which stimulates additional research, opens up new fields, or explains and coordinates previously unrelated facts." (Farrington Daniels, "Outlines of Physical Chemistry", 1948)

"There would be cases where we would not want to accept an hypothesis even though the evidence gives a high d. c. [degree of confirmation] score, because we are fearful of the consequences of a wrong decision." (C West Churchman, "Theory of Experimental Inference", 1948) 

"Hypothesis is a tool which can cause trouble if not used properly. We must be ready to abandon out hypothesis as soon as it is shown to be inconsistent with the facts." (William I B Beveridge, "The Art of Scientific Investigation", 1950) 

"A collection of observable concepts in a purely formal hypothesis suggesting no analogy with anything would consequently not suggest either any directions for its own development." (Mary B Hesse, "Operational Definition and Analogy in Physical Theories", British Journal for the Philosophy of Science 2 (8), 1952)

"Whenever we attempt to test a hypothesis we naturally try to avoid errors in judging it. This seems to indicate the right way of proceeding: when choosing a test we should try to minimize the frequency of errors that may be committed in applying it." (Jerzy Neyman, "Lectures and Conferences on Mathematical Statistics", 1952) 

"The only relevant test of the validity of a hypothesis is comparison of prediction with experience." (Milton Friedman, "Essays in Positive Economics", 1953)

"[…] the grand aim of all science […] is to cover the greatest possible number of empirical facts by logical deductions from the smallest possible number of hypotheses or axioms." (Albert Einstein, 1954)

"One must credit an hypothesis with all that has had to be discovered in order to demolish it." (Jean Rostand, "The substance of man", 1962)

"The formulation of a hypothesis carries with it an obligation to test it as rigorously as we can command skills to do so." (Peter Medawar, "Hypothesis and Imagination", 1963)

"Truth in science can be defined as the working hypothesis best suited to open the way to the next better one." (Konrad Lorenz, "On Aggression", 1963) 

"The validation of a model is not that it is 'true' but that it generates good testable hypotheses relevant to important problems." (Richard Levins, "The Strategy of Model Building in Population Biology", 1966)

"All testing, all confirmation and disconfirmation of a hypothesis takes place already within a system. And this system is not a more or less arbitrary and doubtful point of departure for all our arguments; no it belongs to the essence of what we call an argument. The system is not so much the point of departure, as the element in which our arguments have their life." (Ludwig Wittgenstein, "On Certainty", 1969) 

"Science consists simply of the formulation and testing of hypotheses based on observational evidence; experiments are important where applicable, but their function is merely to simplify observation by imposing controlled conditions." (Henry L Batten, "Evolution of the Earth", 1971)

"An experiment is a failure only when it also fails adequately to test the hypothesis in question, when the data it produces don't prove anything one way or the other." (Robert M Pirsig, "Zen and the Art of Motorcycle Maintenance", 1974)

"A hypothesis is empirical or scientific only if it can be tested by experience. […] A hypothesis or theory which cannot be, at least in principle, falsified by empirical observations and experiments does not belong to the realm of science." (Francisco J Ayala, "Biological Evolution: Natural Selection or Random Walk", American Scientist, 1974)

"A hypothesis will in the end become a truth when all phenomena let themselves be derived from it in a natural and in an obvious manner, when all these consequences are connected with one another and with the general reasons, in short, when that hypothesis is consistent in all its parts with itself." (Johann H Lambert, 1976)

"The essential function of a hypothesis consists in the guidance it affords to new observations and experiments, by which our conjecture is either confirmed or refuted." (Ernst Mach, "Knowledge and Error: Sketches on the Psychology of Enquiry", 1976)

"Be suspicious of a theory if more and more hypotheses are needed to support it as new facts become available, or as new considerations are brought to bear." (Sir Fred Hoyle & Nalin C Wickramasinghe, "Evolution from Space", 1981)

"All interpretations made by a scientist are hypotheses, and all hypotheses are tentative. They must forever be tested and they must be revised if found to be unsatisfactory. Hence, a change of mind in a scientist, and particularly in a great scientist, is not only not a sign of weakness but rather evidence for continuing attention to the respective problem and an ability to test the hypothesis again and again." (Ernst Mayr, "The Growth of Biological Thought: Diversity, Evolution and Inheritance", 1982)

"Don't just read it; fight it! Ask your own question, look for your own examples, dicover your own proofs. Is the hypothesis necessary? Is the converse true? What happens in the classical special case? What about the degenerate cases? Where does the proof use the hypothesis?" (Paul R Halmos, "I Want to be a Mathematician", 1985)

"Beware of the problem of testing too many hypotheses; the more you torture the data, the more likely they are to confess, but confessions obtained under duress may not be admissible in the court of scientific opinion." (Stephen M Stigler, "Testing Hypotheses or fitting Models? Another Look at Mass Extinctions" [in "Neutral Models in Biology"], 1987)

"All science is based on models, and every scientific model comprises three distinct stages: statement of well-defined hypotheses; deduction of all the consequences of these hypotheses, and nothing but these consequences; confrontation of these consequences with observed data." (Maurice Allais, "An Outline of My Main Contributions to Economic Science", [Noble lecture] 1988)

"Any physical theory is always provisional, in the sense that it is only a hypothesis: you can never prove it. No matter how many times the results of experiments agree with some theory, you can never be sure that the next time the result will not contradict the theory." (Stephen Hawking,  "A Brief History of Time", 1988)

"The heart of the scientific method is the problem-hypothesis-test process. And, necessarily, the scientific method involves predictions. And predictions, to be useful in scientific methodology, must be subject to test empirically." (Paul Davies, "The Cosmic Blueprint: New Discoveries in Nature's Creative Ability to, Order the Universe", 1988)

"The model and the theory it represents must be accepted, at least temporarily, or rejected, depending on the agreement or disagreement between observed data and the hypotheses and implications of the model. When neither the hypotheses nor the implications of a theory can be confronted with the real world, that theory is devoid of any scientific interest. Mere logical, even mathematical, deduction remains worthless in terms of the understanding of reality if it is not closely linked to that reality." (Maurice Allais, "An Outline of My Main Contributions to Economic Science", [Noble lecture] 1988)

"A fact is a simple statement that everyone believes. It is innocent, unless found guilty. A hypothesis is a novel suggestion that no one wants to believe. It is guilty, until found effective." (Edward Teller, "Conversations on the Dark Secrets of Physics", 1991)

"Visualizations can be used to explore data, to confirm a hypothesis, or to manipulate a viewer. [...] In exploratory visualization the user does not necessarily know what he is looking for. This creates a dynamic scenario in which interaction is critical. [...] In a confirmatory visualization, the user has a hypothesis that needs to be tested. This scenario is more stable and predictable. System parameters are often predetermined." (Usama Fayyad et al, "Information Visualization in Data Mining and Knowledge Discovery", 2002) 

"[…] a conceptual model is a diagram connecting variables and constructs based on theory and logic that displays the hypotheses to be tested." (Mary W Celsi et al, "Essentials of Business Research Methods", 2011)

"Data science is an iterative process. It starts with a hypothesis (or several hypotheses) about the system we’re studying, and then we analyze the information. The results allow us to reject our initial hypotheses and refine our understanding of the data. When working with thousands of fields and millions of rows, it’s important to develop intuitive ways to reject bad hypotheses quickly." (Phil Simon, "The Visual Organization: Data Visualization, Big Data, and the Quest for Better Decisions", 2014)

"Observation and experiment, without a rational hypothesis, is like a man groping at objects at random with his eyes shut." (Henry P Tappan, "Elements of Logic", 2015)

"A hypothesis is a starting point for an investigation. When you hypothesize, you make a claim about why something might be the case, based on limited data, to offer an explanation or a path forward. You wouldn’t make a proposition about something you are certain of. You may not have enough evidence yet to even convince you that it’s true. But making such a claim puts a stake in the ground that suggests a path for focused analysis." (Eben Hewitt, "Technology Strategy Patterns: Architecture as strategy" 2nd Ed., 2019)

"Data science is, in reality, something that has been around for a very long time. The desire to utilize data to test, understand, experiment, and prove out hypotheses has been around for ages. To put it simply: the use of data to figure things out has been around since a human tried to utilize the information about herds moving about and finding ways to satisfy hunger. The topic of data science came into popular culture more and more as the advent of ‘big data’ came to the forefront of the business world." (Jordan Morrow, "Be Data Literate: The data literacy skills everyone needs to succeed", 2021)

"Pure data science is the use of data to test, hypothesize, utilize statistics and more, to predict, model, build algorithms, and so forth. This is the technical part of the puzzle. We need this within each organization. By having it, we can utilize the power that these technical aspects bring to data and analytics. Then, with the power to communicate effectively, the analysis can flow throughout the needed parts of an organization." (Jordan Morrow, "Be Data Literate: The data literacy skills everyone needs to succeed", 2021)

01 December 2018

🔭Data Science: Data Visualization (Just the Quotes)

"No matter how clever the choice of the information, and no matter how technologically impressive the encoding, a visualization fails if the decoding fails. Some display methods lead to efficient, accurate decoding, and others lead to inefficient, inaccurate decoding. It is only through scientific study of visual perception that informed judgments can be made about display methods." (William S Cleveland, "The Elements of Graphing Data", 1985)

"The greatest possibilities of visual display lie in vividness and inescapability of the intended message. A visual display can stop your mental flow in its tracks and make you think. A visual display can force you to notice what you never expected to see. One should see the intended at once; one should not even have to wait for it to appear." (John W Tukey, "Data-based graphics: Visual display in the decades to come", Statistical Science 5, 1990)

"Data that are skewed toward large values occur commonly. Any set of positive measurements is a candidate. Nature just works like that. In fact, if data consisting of positive numbers range over several powers of ten, it is almost a guarantee that they will be skewed. Skewness creates many problems. There are visualization problems. A large fraction of the data are squashed into small regions of graphs, and visual assessment of the data degrades. There are characterization problems. Skewed distributions tend to be more complicated than symmetric ones; for example, there is no unique notion of location and the median and mean measure different aspects of the distribution. There are problems in carrying out probabilistic methods. The distribution of skewed data is not well approximated by the normal, so the many probabilistic methods based on an assumption of a normal distribution cannot be applied." (William S Cleveland, "Visualizing Data", 1993)

"Many of the applications of visualization in this book give the impression that data analysis consists of an orderly progression of exploratory graphs, fitting, and visualization of fits and residuals. Coherence of discussion and limited space necessitate a presentation that appears to imply this. Real life is usually quite different. There are blind alleys. There are mistaken actions. There are effects missed until the very end when some visualization saves the day. And worse, there is the possibility of the nearly unmentionable: missed effects." (William S Cleveland, "Visualizing Data", 1993)

"One important aspect of reality is improvisation; as a result of special structure in a set of data, or the finding of a visualization method, we stray from the standard methods for the data type to exploit the structure or the finding." (William S Cleveland, "Visualizing Data", 1993)

"There are two components to visualizing the structure of statistical data - graphing and fitting. Graphs are needed, of course, because visualization implies a process in which information is encoded on visual displays. Fitting mathematical functions to data is needed too. Just graphing raw data, without fitting them and without graphing the fits and residuals, often leaves important aspects of data undiscovered." (William S Cleveland, "Visualizing Data", 1993)

"Visualization is an approach to data analysis that stresses a penetrating look at the structure of data. No other approach conveys as much information. […] Conclusions spring from data when this information is combined with the prior knowledge of the subject under investigation." (William S Cleveland, "Visualizing Data", 1993)

"Visualization is an effective framework for drawing inferences from data because its revelation of the structure of data can be readily combined with prior knowledge to draw conclusions. By contrast, because of the formalism of probablistic methods, it is typically impossible to incorporate into them the full body of prior information." (William S Cleveland, "Visualizing Data", 1993)

"When visualization tools act as a catalyst to early visual thinking about a relatively unexplored problem, neither the semantics nor the pragmatics of map signs is a dominant factor. On the other hand, syntactics (or how the sign-vehicles, through variation in the visual variables used to construct them, relate logically to one another) are of critical importance." (Alan M MacEachren, "How Maps Work: Representation, Visualization, and Design", 1995)

"The nature of maps and of their use in science and society is in the midst of remarkable change - change that is stimulated by a combination of new scientific and societal needs for geo-referenced information and rapidly evolving technologies that can provide that information in innovative ways. A key issue at the heart of this change is the concept of ‘visualization’." (Alan M MacEachren, "Exploratory cartographic visualization: advancing the agenda", 1997)

"Visualization for large data is an oxymoron - the art is to reduce size before one visualizes. The contradiction (and challenge) is that we may need to visualize first in order to find out how to reduce size." (Peter Huber, "Massive datasets workshop: Four years after", Journal of Computational and Graphical Statistics Vol 8, 1999)

"Functional visualizations are more than innovative statistical analyses and computational algorithms. They must make sense to the user and require a visual language system that uses color, shape, line, hierarchy and composition to communicate clearly and appropriately, much like the alphabetic and character-based languages used worldwide between humans." (Matt Woolman, "Digital Information Graphics", 2002)

"Visualizations can be used to explore data, to confirm a hypothesis, or to manipulate a viewer. [...] In exploratory visualization the user does not necessarily know what he is looking for. This creates a dynamic scenario in which interaction is critical. [...] In a confirmatory visualization, the user has a hypothesis that needs to be tested. This scenario is more stable and predictable. System parameters are often predetermined." (Usama Fayyad et al, "Information Visualization in Data Mining and Knowledge Discovery", 2002) 

"Dashboards and visualization are cognitive tools that improve your 'span of control' over a lot of business data. These tools help people visually identify trends, patterns and anomalies, reason about what they see and help guide them toward effective decisions. As such, these tools need to leverage people's visual capabilities. With the prevalence of scorecards, dashboards and other visualization tools now widely available for business users to review their data, the issue of visual information design is more important than ever." (Richard Brath & Michael Peters, "Dashboard Design: Why Design is Important," DM Direct, 2004)

"Merely drawing a plot does not constitute visualization. Visualization is about conveying important information to the reader accurately. It should reveal information that is in the data and should not impose structure on the data." (Robert Gentleman, "Bioinformatics and Computational Biology Solutions using R and Bioconductor", 2005)

"Exploratory Data Analysis is more than just a collection of data-analysis techniques; it provides a philosophy of how to dissect a data set. It stresses the power of visualisation and aspects such as what to look for, how to look for it and how to interpret the information it contains. Most EDA techniques are graphical in nature, because the main aim of EDA is to explore data in an open-minded way. Using graphics, rather than calculations, keeps open possibilities of spotting interesting patterns or anomalies that would not be apparent with a calculation (where assumptions and decisions about the nature of the data tend to be made in advance)." (Alan Graham, "Developing Thinking in Statistics", 2006) 

"Data visualization [...] expresses the idea that it involves more than just representing data in a graphical form (instead of using a table). The information behind the data should also be revealed in a good display; the graphic should aid readers or viewers in seeing the structure in the data. The term data visualization is related to the new field of information visualization. This includes visualization of all kinds of information, not just of data, and is closely associated with research by computer scientists." (Antony Unwin et al, "Introduction" [in "Handbook of Data Visualization"], 2008) 

"The main goal of data visualization is its ability to visualize data, communicating information clearly and effectively. It doesn’t mean that data visualization needs to look boring to be functional or extremely sophisticated to look beautiful. To convey ideas effectively, both aesthetic form and functionality need to go hand in hand, providing insights into a rather sparse and complex dataset by communicating its key aspects in a more intuitive way. Yet designers often tend to discard the balance between design and function, creating gorgeous data visualizations which fail to serve its main purpose - communicate information." (Vitaly Friedman, "Data Visualization and Infographics", Smashing Magazine, 2008)

"The purpose of visualization is insight, not pictures." (Ben Shneiderman, "Extreme visualization: squeezing a billion records into a million pixels",  SIGMOD ’08: Proceedings of the 2008 ACM SIGMOD, 2008)

"With the ever increasing amount of empirical information that scientists from all disciplines are dealing with, there exists a great need for robust, scalable and easy to use clustering techniques for data abstraction, dimensionality reduction or visualization to cope with and manage this avalanche of data."  (Jörg Reichardt, "Structure in Complex Networks", 2009)

"So what is the difference between a chart or graph and a visualization? […] a chart or graph is a clean and simple atomic piece; bar charts contain a short story about the data being presented. A visualization, on the other hand, seems to contain much more ʻchart junkʼ, with many sometimes complex graphics or several layers of charts and graphs. A visualization seems to be the super-set for all sorts of data-driven design." (Brian Suda, "A Practical Guide to Designing with Data", 2010)

"The goal of visualization is to aid our understanding of data by leveraging the human visual system’s highly tuned ability to see patterns, spot trends, and identify outliers." (J Heer et al, "A tour through the visualization zoo", Queue 8, 2010) 

"All graphics present data and allow a certain degree of exploration of those same data. Some graphics are almost all presentation, so they allow just a limited amount of exploration; hence we can say they are more infographics than visualization, whereas others are mostly about letting readers play with what is being shown, tilting more to the visualization side of our linear scale. But every infographic and every visualization has a presentation and an exploration component: they present, but they also facilitate the analysis of what they show, to different degrees." (Alberto Cairo, "The Functional Art", 2011)

"Exploratory data visualizations are appropriate when you have a whole bunch of data and you’re not sure what’s in it. […] By contrast, explanatory data visualization is appropriate when you already know what the data has to say, and you are trying to tell that story to somebody else." (Noah Iliinsky & Julie Steele, "Designing Data Visualizations", 2011)

"In data visualization, the number one rule of thumb to bear is mind is: Function first, suave second." (Noah Iliinsky & Julie Steel, "Designing Data Visualizations", 2011)

"The first and main goal of any graphic and visualization is to be a tool for your eyes and brain to perceive what lies beyond their natural reach." (Alberto Cairo, "The Functional Art", 2011)

"Thinking of graphics as art leads many to put bells and whistles over substance and to confound infographics with mere illustrations." (Alberto Cairo, "The Functional Art", 2011)

"[...] the terms data visualization and information visualization (casually, data viz and info viz) are useful for referring to any visual representation of data that is: (•) algorithmically drawn (may have custom touches but is largely rendered with the help of computerized methods); (•) easy to regenerate with different data (the same form may be repurposed to represent different datasets with similar dimensions or characteristics); (•) often aesthetically barren (data is not decorated); and (•) relatively data-rich (large volumes of data are welcome and viable, in contrast to infographics)." (Noah Iliinsky & Julie Steel, "Designing Data Visualizations", 2011)

"Visualizations act as a campfire around which we gather to tell stories." (Al Shalloway, 2011)

"Good infographic design is about storytelling by combining data visualization design and graphic design." (Randy Krum, "Good Infographics: Effective Communication with Data Visualization and Design", 2013)

"Good visualization is a winding process that requires statistics and design knowledge. Without the former, the visualization becomes an exercise only in illustration and aesthetics, and without the latter, one of only analyses. On their own, these are fine skills, but they make for incomplete data graphics. Having skills in both provides you with the luxury - which is growing into a necessity - to jump back and forth between data exploration and storytelling." (Nathan Yau, "Data Points: Visualization That Means Something", 2013)

"The biggest thing to know is that data visualization is hard. Really difficult to pull off well. It requires harmonization of several skills sets and ways of thinking: conceptual, analytic, statistical, graphic design, programmatic, interface-design, story-telling, journalism - plus a bit of 'gut feel'. The end result is often simple and beautiful, but the process itself is usually challenging and messy." (David McCandless, 2013)

"Visualization can be appreciated purely from an aesthetic point of view, but it’s most interesting when it’s about data that’s worth looking at. That’s why you start with data, explore it, and then show results rather than start with a visual and try to squeeze a dataset into it. It’s like trying to use a hammer to bang in a bunch of screws. […] Aesthetics isn’t just a shiny veneer that you slap on at the last minute. It represents the thought you put into a visualization, which is tightly coupled with clarity and affects interpretation." (Nathan Yau, "Data Points: Visualization That Means Something", 2013)

"Visualization is what happens when you make the jump from raw data to bar graphs, line charts, and dot plots. […] In its most basic form, visualization is simply mapping data to geometry and color. It works because your brain is wired to find patterns, and you can switch back and forth between the visual and the numbers it represents. This is the important bit. You must make sure that the essence of the data isn’t lost in that back and forth between visual and the value it represents because if you can’t map back to the data, the visualization is just a bunch of shapes." (Nathan Yau, "Data Points: Visualization That Means Something", 2013)

"What is good visualization? It is a representation of data that helps you see what you otherwise would have been blind to if you looked only at the naked source. It enables you to see trends, patterns, and outliers that tell you about yourself and what surrounds you. The best visualization evokes that moment of bliss when seeing something for the first time, knowing that what you see has been right in front of you, just slightly hidden. Sometimes it is a simple bar graph, and other times the visualization is complex because the data requires it." (Nathan Yau, "Data Points: Visualization That Means Something", 2013)

"Just because data is visualized doesn’t necessarily mean that it is accurate, complete, or indicative of the right course of action. Exhibiting a healthy skepticism is almost always a good thing." (Phil Simon, "The Visual Organization: Data Visualization, Big Data, and the Quest for Better Decisions", 2014)

"To be sure, data doesn’t always need to be visualized, and many data visualizations just plain suck. Look around you. It’s not hard to find truly awful representations of information. Some work in concept but fail because they are too busy; they confuse people more than they convey information [...]. Visualization for the sake of visualization is unlikely to produce desired results - and this goes double in an era of Big Data. Bad is still bad, even and especially at a larger scale." (Phil Simon, "The Visual Organization: Data Visualization, Big Data, and the Quest for Better Decisions", 2014)

"We are all becoming more comfortable with data. Data visualization is no longer just something we have to do at work. Increasingly, we want to do it as consumers and as citizens. Put simply, visualizing helps us understand what’s going on in our lives - and how to solve problems." (Phil Simon, "The Visual Organization: Data Visualization, Big Data, and the Quest for Better Decisions", 2014)

"Data visualization is marketed today as the miracle cure that will open the doors to success, whatever its shape. We have enough experience to realize that in reality it’s not always easy to distinguish between real usefulness and zealous marketing. After the initial excitement over the prospects of data visualization comes disillusionment, and after that the possibility of a balanced assessment. The key is to get to this point quickly, without disappointments and at a lower cost." (Jorge Camões, "Data at Work: Best practices for creating effective charts and information graphics in Microsoft Excel", 2016)

"[...] data visualization [is] a tool that, by applying perceptual mechanisms to the visual representation of abstract quantitative data, facilitates the search for relevant shapes, order, or exceptions. [...]  We must think of data visualization as a generic field where several (combinations of) perspectives, processes, technologies, and objectives (not forgetting the subjective component of personal style) can coexist. In this sense, data art, infographics, and business visualization are branches of data visualization." (Jorge Camões, "Data at Work: Best practices for creating effective charts and information graphics in Microsoft Excel", 2016)

"Data visualization is not a science; it is a crossroads at which certain scientific knowledge is used to justify and frame subjective choices. Ÿis doesn’t mean that rules don’t count. Rules exist and are effective when applied within the context for which they were designed." (Jorge Camões, "Data at Work: Best practices for creating effective charts and information graphics in Microsoft Excel", 2016)

"Creating effective visualizations is hard. Not because a dataset requires an exotic and bespoke visual representation - for many problems, standard statistical charts will suffice. And not because creating a visualization requires coding expertise in an unfamiliar programming language [...]. Rather, creating effective visualizations is difficult because the problems that are best addressed by visualization are often complex and ill-formed. The task of figuring out what attributes of a dataset are important is often conflated with figuring out what type of visualization to use. Picking a chart type to represent specific attributes in a dataset is comparatively easy. Deciding on which data attributes will help answer a question, however, is a complex, poorly defined, and user-driven process that can require several rounds of visualization and exploration to resolve." (Danyel Fisher & Miriah Meyer, "Making Data Visual", 2018)

"[…] no single visualization is ever quite able to show all of the important aspects of our data at once - there just are not enough visual encoding channels. […] designing effective visualizations to make sense of data is not an art - it is a systematic and repeatable process." (Danyel Fisher & Miriah Meyer, "Making Data Visual", 2018)

 "[…] the data itself can lead to new questions too. In exploratory data analysis (EDA), for example, the data analyst discovers new questions based on the data. The process of looking at the data to address some of these questions generates incidental visualizations - odd patterns, outliers, or surprising correlations that are worth looking into further." (Danyel Fisher & Miriah Meyer, "Making Data Visual", 2018)

"The field of [data] visualization takes on that goal more broadly: rather than attempting to identify a single metric, the analyst instead tries to look more holistically across the data to get a usable, actionable answer. Arriving at that answer might involve exploring multiple attributes, and using a number of views that allow the ideas to come together. Thus, operationalization in the context of visualization is the process of identifying tasks to be performed over the dataset that are a reasonable approximation of the high-level question of interest." (Danyel Fisher & Miriah Meyer, "Making Data Visual", 2018)

"Apart from the technical challenge of working with the data itself, visualization in big data is different because showing the individual observations is just not an option. But visualization is essential here: for analysis to work well, we have to be assured that patterns and errors in the data have been spotted and understood. That is only possible by visualization with big data, because nobody can look over the data in a table or spreadsheet." (Robert Grant, "Data Visualization: Charts, Maps and Interactive Graphics", 2019)

"As a first principle, any visualization should convey its information quickly and easily, and with minimal scope for misunderstanding. Unnecessary visual clutter makes more work for the reader’s brain to do, slows down the understanding (at which point they may give up) and may even allow some incorrect interpretations to creep in." (Robert Grant, "Data Visualization: Charts, Maps and Interactive Graphics", 2019)

"Data storytelling can be defined as a structured approach for communicating data insights using narrative elements and explanatory visuals." (Brent Dykes, "Effective Data Storytelling: How to Drive Change with Data, Narrative and Visuals", 2019)

"Data storytelling involves the skillful combination of three key elements: data, narrative, and visuals. Data is the primary building block of every data story. It may sound simple, but a data story should always find its origin in data, and data should serve as the foundation for the narrative and visual elements of your story." (Brent Dykes, "Effective Data Storytelling: How to Drive Change with Data, Narrative and Visuals", 2019)

"(1) Good data visualization is trustworthy: Is it reliable? Is the portrayal of the data and the subject faithful? Do the representation and presentation design have integrity? (2) Good data visualization is accessible: Is it usable? Is the portrayal of the data and the subject relevant? Is the representation and presentation design suitably understandable? (3) Good data visualization is elegant: Is it aesthetic? Is the representation and presentation design appealing?" (Andy Kirk, "Data Visualisation: A Handbook for Data Driven Design" 2nd Ed., 2019)

"In addition to managing how the data is visualized to reduce noise, you can also decrease the visual interference by minimizing the extraneous cognitive load. In these cases, the nonrelevant information and design elements surrounding the data can cause extraneous noise. Poor design or display decisions by the data storyteller can inadvertently interfere with the communication of the intended signal. This form of noise can occur at both a macro and micro level." (Brent Dykes, "Effective Data Storytelling: How to Drive Change with Data, Narrative and Visuals", 2019)

"One very common problem in data visualization is that encoding numerical variables to area is incredibly popular, but readers can’t translate it back very well." (Robert Grant, "Data Visualization: Charts, Maps and Interactive Graphics", 2019)

"There is often no one 'best' visualization, because it depends on context, what your audience already knows, how numerate or scientifically trained they are, what formats and conventions are regarded as standard in the particular field you’re working in, the medium you can use, and so on. It’s also partly scientific and partly artistic, so you get to express your own design style in it, which is what makes it so fascinating." (Robert Grant, "Data Visualization: Charts, Maps and Interactive Graphics", 2019) 

"When visuals are applied to data, they can enlighten the audience to insights that they wouldn’t see without charts or graphs. Many interesting patterns and outliers in the data would remain hidden in the rows and columns of data tables without the help of data visualizations. They connect with our visual nature as human beings and impart knowledge that couldn’t be obtained as easily using other approaches that involve just words or numbers." (Brent Dykes, "Effective Data Storytelling: How to Drive Change with Data, Narrative and Visuals", 2019)

"While visuals are an essential part of data storytelling, data visualizations can serve a variety of purposes from analysis to communication to even art. Most data charts are designed to disseminate information in a visual manner. Only a subset of data compositions is focused on presenting specific insights as opposed to just general information. When most data compositions combine both visualizations and text, it can be difficult to discern whether a particular scenario falls into the realm of data storytelling or not." (Brent Dykes, "Effective Data Storytelling: How to Drive Change with Data, Narrative and Visuals", 2019)

"Another problem is that while data visualizations may appear to be objective, the designer has a great deal of control over the message a graphic conveys. Even using accurate data, a designer can manipulate how those data make us feel. She can create the illusion of a correlation where none exists, or make a small difference between groups look big." (Carl T Bergstrom & Jevin D West, "Calling Bullshit: The Art of Skepticism in a Data-Driven World", 2020)

"As presenters of data visualizations, often we just want our audience to understand something about their environment – a trend, a pattern, a breakdown, a way in which things have been progressing. If we ask ourselves what we want our audience to do with that information, we might have a hard time coming up with a clear answer sometimes. We might just want them to know something." (Ben Jones, "Avoiding Data Pitfalls: How to Steer Clear of Common Blunders When Working with Data and Presenting Analysis and Visualizations", 2020)

"Data visualizations are either used (1) to help people complete a task, or (2) to give them a general awareness of the way things are, or (3) to enable them to explore the topic for themselves."  (Ben Jones, "Avoiding Data Pitfalls: How to Steer Clear of Common Blunders When Working with Data and Presenting Analysis and Visualizations", 2020)

"Much of the data visualization that bombards us today is decoration at best, and distraction or even disinformation at worst. The decorative function is surprisingly common, perhaps because the data visualization teams of many media organizations are part of the art departments. They are led by people whose skills and experience are not in statistics but in illustration or graphic design. The emphasis is on the visualization, not on the data. It is, above all, a picture." (Tim Harford, "The Data Detective: Ten easy rules to make sense of statistics", 2020)

"A data visualization, or dashboard, is great for summarizing or describing what has gone on in the past, but if people don’t know how to progress beyond looking just backwards on what has happened, then they cannot diagnose and find the ‘why’ behind it." (Jordan Morrow, "Be Data Literate: The data literacy skills everyone needs to succeed", 2021)

"Data literacy is for the masses, and data visualization is powerful to simplify what could be very complicated." (Jordan Morrow, "Be Data Literate: The data literacy skills everyone needs to succeed", 2021)

"Understanding the entire data ecosystem, from the production of a data point to its consumption in a dashboard or a visualization, provides the ability to invoke action, which is more valuable than the mere sum of its parts." (Jesús Barrasa et al, "Knowledge Graphs: Data in Context for Responsive Businesses", 2021)

"Data visualization is a simplified approach to studying data." (Jordan Morrow, "Be Data Literate: The data literacy skills everyone needs to succeed", 2021)

"Data visualization is a mix of science and art. Sometimes we want to be closer to the science side of the spectrum - in other words, use visualizations that allow readers to more accurately perceive the absolute values of data and make comparisons. Other times we may want to be closer to the art side of the spectrum and create visuals that engage and excite the reader, even if they do not permit the most accurate comparisons." (Jonathan Schwabish, "Better Data Visualizations: A guide for scholars, researchers, and wonks", 2021)

"I agree that data visualizations should be visually appealing, driving and utilizing the appeal and power for individuals to utilize it effectively, but sometimes this can take too much time, taking it away from more valuable uses in data. Plus, if the data visualization is not moving the needle of a business goal or objective, how effective is that visualization?" (Jordan Morrow, "Be Data Literate: The data literacy skills everyone needs to succeed", 2021)

"Data becomes more useful once it’s transformed into a data visualization or used in a data story. Data storytelling is the ability to effectively communicate insights from a dataset using narratives and visualizations. It can be used to put data insights into context and inspire action from your audience. Color can be very helpful when you are trying to make information stand out within your data visualizations." (Kate Strachnyi, "ColorWise: A Data Storyteller’s Guide to the Intentional Use of Color", 2023)

"Data visualization is the practice of taking insights found in data analysis and turning them into numbers, graphs, charts, and other visual concepts to make them easier to grasp, understand, learn from, and utilize.[...] The visualization of data can be thought of as both a science and an art in that the way it is displayed is often as important to its understanding as the actual information that is being displayed." (Kate Strachnyi, "ColorWise: A Data Storyteller’s Guide to the Intentional Use of Color", 2023)

"Visualizations can remove the background noise from enormous sets of data so that only the most important points stand out to the intended audience. This is particularly important in the era of big data. The more data there is, the more chance for noise and outliers to interfere with the core concepts of the data set." (Kate Strachnyi, "ColorWise: A Data Storyteller’s Guide to the Intentional Use of Color", 2023)

"The best approach is to build visualizations in the most digestible form, fitted to how that executive thinks. You will have to interact with executives, show them different visualizations, and see how they react in order to learn which forms work best for them. Be ready to fail often and learn fast, particularly with visualizations." (John Lucker)

"Visualisation is fundamentally limited by the number of pixels you can pump to a screen. If you have big data, you have way more data than pixels, so you have to summarise your data. Statistics gives you lots of really good tools for this." (Hadley Wickham)

"We often think of visualization as a design and programming task, but the process starts further back with the data. You have to understand the data - its trends and patterns, along with its flaws and imperfections - and the rest follows." (Nathan Yau)

🔭Data Science: The Science in Data Science (Just the Quotes)

"The aim of every science is foresight. For the laws of established observation of phenomena are generally employed to foresee their succession. All men, however little advanced make true predictions, which are always based on the same principle, the knowledge of the future from the past." (Auguste Compte, "Plan des travaux scientifiques nécessaires pour réorganiser la société", 1822)

"Science is nothing but the finding of analogy, identity, in the most remote parts." (Ralph W Emerson, 1837)

"Therefore science always goes abreast with the just elevation of the man, keeping step with religion and metaphysics; or, the state of science is an index of our self-knowledge." (Ralph W Emerson, "The Poet", 1844)

"It may sound quite strange, but for me, as for other scientists on whom these kinds of imaginative images have a greater effect than other poems do, no science is at its very heart more closely related to poetry, perhaps, than is chemistry." (Just Liebig, 1854)

"Science is the systematic classification of experience." (George H Lewes, "The Physical Basis of Mind", 1877)

"Science is the observation of things possible, whether present or past; prescience is the knowledge of things which may come to pass, though but slowly." (Leonardo da Vinci, "The Notebooks of Leonardo da Vinci", 1883)

"While science is pursuing a steady onward movement, it is convenient from time to time to cast a glance back on the route already traversed, and especially to consider the new conceptions which aim at discovering the general meaning of the stock of facts accumulated from day to day in our laboratories." (Dmitry Mendeleyev, "The Periodic Law of the Chemical Elements", Journal of the Chemical Society Vol. 55, 1889)

"The aim of science is always to reduce complexity to simplicity." (William James, "The Principles of Psychology", 1890)

"Science is not the monopoly of the naturalist or the scholar, nor is it anything mysterious or esoteric. Science is the search for truth, and truth is the adequacy of a description of facts." (Paul Carus, "Philosophy as a Science", 1909)

"Science is reduction. Mathematics is its ideal, its form par excellence, for it is in mathematics that assimilation, identification, is most perfectly realized. The universe, scientifically explained, would be a certain formula, one and eternal, regarded as the equivalent of the entire diversity and movement of things." (Émile Boutroux, "Natural law in Science and Philosophy", 1914)

"Abstract as it is, science is but an outgrowth of life. That is what the teacher must continually keep in mind. […] Let him explain […] science is not a dead system - the excretion of a monstrous pedantism - but really one of the most vigorous and exuberant phases of human life." (George A L Sarton, "The Teaching of the History of Science", The Scientific Monthly, 1918)

"The aim of science is to seek the simplest explanations of complex facts. We are apt to fall into the error of thinking that the facts are simple because simplicity is the goal of our quest. The guiding motto in the life of every natural philosopher should be, ‘Seek simplicity and distrust it’." (Alfred N Whitehead, "The Concept of Nature", 1919)

"Science is simply setting out on a fishing expedition to see whether it cannot find some procedure which it can call measurement of space and some procedure which it can call the measurement of time, and something which it can call a system of forces, and something which it can call masses." (Alfred N Whitehead, "The Concept of Nature", 1920)

"Science is a magnificent force, but it is not a teacher of morals. It can perfect machinery, but it adds no moral restraints to protect society from the misuse of the machine. It can also build gigantic intellectual ships, but it constructs no moral rudders for the control of storm tossed human vessel. It not only fails to supply the spiritual element needed but some of its unproven hypotheses rob the ship of its compass and thus endangers its cargo." (William J Bryan, "Undelivered Trial Summation Scopes Trial", 1925)

"Science is but a method. Whatever its material, an observation accurately made and free of compromise to bias and desire, and undeterred by consequence, is science." (Hans Zinsser, "Untheological Reflections", The Atlantic Monthly, 1929)

"Although this may seem a paradox, all exact science is dominated by the idea of approximation. When a man tells you that he knows the exact truth about anything, you are safe in inferring that he is an inexact man." (Bertrand Russell, "The Scientific Outlook", 1931)

"The common view of science is that it is a sort of machine for increasing the race’s store of dependable facts. It is that only in part; in even larger part it is a machine for upsetting undependable facts." (Will Durant, 1931)

"One has to recognize that science is not metaphysics, and certainly not mysticism; it can never bring us the illumination and the satisfaction experienced by one enraptured in ecstasy. Science is sobriety and clarity of conception, not intoxicated vision."(Ludwig Von Mises, "Epistemological Problems of Economics", 1933)

"Modern positivists are apt to see more clearly that science is not a system of concepts but rather a system of statements." (Karl R Popper, "The Logic of Scientific Discovery", 1934)

"Science is a system of statements based on direct experience, and controlled by experimental verification. Verification in science is not, however, of single statements but of the entire system or a sub-system of such statements." (Rudolf Carnap, "The Unity of Science", 1934)

"Science is the attempt to discover, by means of observation, and reasoning based upon it, first, particular facts about the world, and then laws connecting facts with one another and (in fortunate cases) making it possible to predict future occurrences." (Bertrand Russell, "Religion and Science, Grounds of Conflict", 1935)

"[…] that all science is merely a game can be easily discarded as a piece of wisdom too easily come by. But it is legitimate to enquire whether science is not liable to indulge in play within the closed precincts of its own method. Thus, for instance, the scientist’s continuous penchant for systems tends in the direction of play." (Johan Huizinga, "Homo Ludens", 1938)

"Science makes no pretension to eternal truth or absolute truth; some of its rivals do. That science is in some respects inhuman may be the secret of its success in alleviating human misery and mitigating human stupidity." (Eric T Bell, "Mathematics: Queen and Servant of Science", 1938)

"Science is the attempt to make the chaotic diversity of our sense experience correspond to a logically uniform system of thought." (Albert Einstein, "Considerations Concerning the Fundaments of Theoretical Physics", Science Vol. 91 (2369), 1940)

"Science is the organised attempt of mankind to discover how things work as causal systems. The scientific attitude of mind is an interest in such questions. It can be contrasted with other attitudes, which have different interests; for instance the magical, which attempts to make things work not as material systems but as immaterial forces which can be controlled by spells; or the religious, which is interested in the world as revealing the nature of God." (Conrad H Waddington, "The Scientific Attitude", 1941)

"Science, in the broadest sense, is the entire body of the most accurately tested, critically established, systematized knowledge available about that part of the universe which has come under human observation. For the most part this knowledge concerns the forces impinging upon human beings in the serious business of living and thus affecting man’s adjustment to and of the physical and the social world. […] Pure science is more interested in understanding, and applied science is more interested in control […]" (Austin L Porterfield, "Creative Factors in Scientific Research", 1941)

"Science is an interconnected series of concepts and schemes that have developed as a result of experimentation and observation and are fruitful of further experimentation and observation."(James B Conant, "Science and Common Sense", 1951)

"[…] theoretical science is essentially disciplined exploitation of metaphor." (Anatol Rapoport, "Operational Philosophy", 1953)

"Prediction is all very well; but we must make sense of what we predict. The mainspring of science is the conviction that by honest, imaginative enquiry we can build up a system of ideas about Nature which has some legitimate claim to ‘reality’." (Stephen Toulmin, "The Philosophy of Science: An Introduction", 1953)

"An engineering science aims to organize the design principles used in engineering practice into a discipline and thus to exhibit the similarities between different areas of engineering practice and to emphasize the power of fundamental concepts. In short, an engineering science is predominated by theoretical analysis and very often uses the tool of advanced mathematics." (Qian Xuesen, "Engineering cybernetics", 1954))

"The true aim of science is to discover a simple theory which is necessary and sufficient to cover the facts, when they have been purified of traditional prejudices." (Lancelot L Whyte, "Accent on Form", 1954)

"Science is the creation of concepts and their exploration in the facts. It has no other test of the concept than its empirical truth to fact." (Jacob Bronowski, "Science and Human Values", 1956)

"The progress of science is the discovery at each step of a new order which gives unity to what had seemed unlike." (Jacob Bronowski, "Science and Human Values", 1956)

"[…] any serious examination of the basic concepts of any science is far more difficult than the elaboration of their ultimate consequences." (George F J Temple, "Turning Points in Physics", 1959)

"Science is usually understood to depict a universe of strict order and lawfulness, of rigorous economy - one whose currency is energy, convertible against a service charge into a growing common pool called entropy." (Paul A Weiss,"Organic Form: Scientific and Aesthetic Aspects", 1960)

"[…] the progress of science is a little like making a jig-saw puzzle. One makes collections of pieces which certainly fit together, though at first it is not clear where each group should come in the picture as a whole, and if at first one makes a mistake in placing it, this can be corrected later without dismantling the whole group." (Sir George Thomson, "The Inspiration of Science", 1961)

"Science is the reduction of the bewildering diversity of unique events to manageable uniformity within one of a number of symbol systems, and technology is the art of using these symbol systems so as to control and organize unique events. Scientific observation is always a viewing of things through the refracting medium of a symbol system, and technological praxis is always handling of things in ways that some symbol system has dictated. Education in science and technology is essentially education on the symbol level." (Aldous L Huxley, "Essay", Daedalus, 1962)

"The important distinction between science and those other systematizations [i.e., art, philosophy, and theology] is that science is self-testing and self-correcting. Here the essential point of science is respect for objective fact. What is correctly observed must be believed [...] the competent scientist does quite the opposite of the popular stereotype of setting out to prove a theory; he seeks to disprove it." (George G Simpson, "Notes on the Nature of Science", 1962)

"What, then, is science according to common opinion? Science is what scientists do. Science is knowledge, a body of information about the external world. Science is the ability to predict. Science is power, it is engineering. Science explains, or gives causes and reasons." (John Bremer "What Is Science?" [in "Notes on the Nature of Science"], 1962)

"Science is a matter of disinterested observation, patient ratiocination within some system of logically correlated concepts. In real-life conflicts between reason and passion the issue is uncertain. Passion and prejudice are always able to mobilize their forces more rapidly and press the attack with greater fury; but in the long run (and often, of course, too late) enlightened self-interest may rouse itself, launch a counterattack and win the day for reason." (Aldous L Huxley, "Literature and Science", 1963)

"Science is a way to teach how something gets to be known, what is not known, to what extent things are known (for nothing is known absolutely), how to handle doubt and uncertainty, what the rules of evidence are, how to think about things so that judgments can be made, how to distinguish truth from fraud, and from show." (Richard P Feynman, "The Problem of Teaching Physics in Latin America", Engineering and Science, 1963)

"The aim of science is to apprehend this purely intelligible world as a thing in itself, an object which is what it is independently of all thinking, and thus antithetical to the sensible world. [...] The world of thought is the universal, the timeless and spaceless, the absolutely necessary, whereas the world of sense is the contingent, the changing and moving appearance which somehow indicates or symbolizes it." (Robin G Collingwood, "Essays in the Philosophy of Art", 1964)

"The central task of a natural science is to make the wonderful commonplace: to show that complexity, correctly viewed, is only a mask for simplicity; to find pattern hidden in apparent chaos." (Herbert A Simon, "The Sciences of the Artificial", 1969)

"The central task of a natural science is to make the wonderful commonplace: to show that complexity, correctly viewed, is only a mask for simplicity; to find pattern hidden in apparent chaos." (Herbert A Simon, "The Sciences of the Artificial", 1969)

"Science is a product of man, of his mind; and science creates the real world in its own image." (Frank E Egler, "The Way of Science", 1970)

"To do science is to search for repeated patterns, not simply to accumulate facts [...]" (Robert H. MacArthur, "Geographical Ecology", 1972)

"Science is systematic organisation of knowledge about the universe on the basis of explanatory hypotheses which are genuinely testable. Science advances by developing gradually more comprehensive theories; that is, by formulating theories of greater generality which can account for observational statements and hypotheses which appear as prima facie unrelated." (Francisco J Ayala, "Studies in the Philosophy of Biology: Reduction and Related Problems", 1974)

"A mature science, with respect to the matter of errors in variables, is not one that measures its variables without error, for this is impossible. It is, rather, a science which properly manages its errors, controlling their magnitudes and correctly calculating their implications for substantive conclusions." (Otis D Duncan, "Introduction to Structural Equation Models", 1975)

"The very nature of science is such that scientists need the metaphor as a bridge between old and new theories." (Earl R MacCormac, "Metaphor and Myth in Science and Religion", 1976)

"Facts do not ‘speak for themselves’; they are read in the light of theory. Creative thought, in science as much as in the arts, is the motor of changing opinion. Science is a quintessentially human activity, not a mechanized, robot-like accumulation of objective information, leading by laws of logic to inescapable interpretation." (Stephen J Gould, "Ever Since Darwin", 1977)

"Science is not a heartless pursuit of objective information. It is a creative human activity, its geniuses acting more as artists than information processors. Changes in theory are not simply the derivative results of the new discoveries but the work of creative imagination influenced by contemporary social and political forces." (Stephen J Gould, "Ever Since Darwin: Reflections in Natural History", 1977)

"Engineering or Technology is the making of things that did not previously exist, whereas science is the discovering of things that have long existed." (David Billington, "The Tower and the Bridge: The New Art of Structural Engineering", 1983)

"Science is a process. It is a way of thinking, a manner of approaching and of possibly resolving problems, a route by which one can produce order and sense out of disorganized and chaotic observations. Through it we achieve useful conclusions and results that are compelling and upon which there is a tendency to agree." (Isaac Asimov, "‘X’ Stands for Unknown", 1984)

"If doing mathematics or science is looked upon as a game, then one might say that in mathematics you compete against yourself or other mathematicians; in physics your adversary is nature and the stakes are higher." (Mark Kac, "Enigmas Of Chance", 1985)

"Science is defined as a set of observations and theories about observations." (F Albert Matsen, "The Role of Theory in Chemistry", Journal of Chemical Education Vol. 62 (5), 1985)

"We expect to learn new tricks because one of our science based abilities is being able to predict. That after all is what science is about. Learning enough about how a thing works so you'll know what comes next. Because as we all know everything obeys the universal laws, all you need is to understand the laws." (James Burke, "The Day the Universe Changed", 1985)

"Science is human experience systematically extended (by intent, methodology and instrumentation) for the purpose of learning more about the natural world and for the critical empirical testing and possible falsification of all ideas about the natural world. Scientific hypotheses may incorporate only elements of the natural empirical world, and thus may contain no element of the supernatural." (Robert E Kofahl, Correctly Redefining Distorted Science: A Most Essential Task", Creation Research Society Quarterly Vol. 23, 1986)

"Science is not a given set of answers but a system for obtaining answers. The method by which the search is conducted is more important than the nature of the solution. Questions need not be answered at all, or answers may be provided and then changed. It does not matter how often or how profoundly our view of the universe alters, as long as these changes take place in a way appropriate to science. For the practice of science, like the game of baseball, is covered by definite rules." (Robert Shapiro, "Origins: A Skeptic’s Guide to the Creation of Life on Earth", 1986)

"Science doesn't purvey absolute truth. Science is a mechanism. It's a way of trying to improve your knowledge of nature. It's a system for testing your thoughts against the universe and seeing whether they match. And this works, not just for the ordinary aspects of science, but for all of life. I should think people would want to know that what they know is truly what the universe is like, or at least as close as they can get to it." (Isaac Asimov, [Interview by Bill Moyers] 1988)

"Science doesn’t purvey absolute truth. Science is a mechanism, a way of trying to improve your knowledge of nature. It’s a system for testing your thoughts against the universe, and seeing whether they match." (Isaac Asimov, [interview with Bill Moyers in The Humanist] 1989)

"The view of science is that all processes ultimately run down, but entropy is maximized only in some far, far away future. The idea of entropy makes an assumption that the laws of the space-time continuum are infinitely and linearly extendable into the future. In the spiral time scheme of the timewave this assumption is not made. Rather, final time means passing out of one set of laws that are conditioning existence and into another radically different set of laws. The universe is seen as a series of compartmentalized eras or epochs whose laws are quite different from one another, with transitions from one epoch to another occurring with unexpected suddenness." (Terence McKenna, "True Hallucinations", 1989)

"Science is (or should be) a precise art. Precise, because data may be taken or theories formulated with a certain amount of accuracy; an art, because putting the information into the most useful form for investigation or for presentation requires a certain amount of creativity and insight." (Patricia H Reiff, "The Use and Misuse of Statistics in Space Physics", Journal of Geomagnetism and Geoelectricity 42, 1990)

"In science if you know what you are doing you should not be doing it. In engineering if you do not know what you are doing you should not be doing it. Of course, you seldom, if ever, see either pure state." (Richard W Hamming, "The Art of Probability for Scientists and Engineers", 1991)

"On this view, we recognize science to be the search for algorithmic compressions. We list sequences of observed data. We try to formulate algorithms that compactly represent the information content of those sequences. Then we test the correctness of our hypothetical abbreviations by using them to predict the next terms in the string. These predictions can then be compared with the future direction of the data sequence. Without the development of algorithmic compressions of data all science would be replaced by mindless stamp collecting - the indiscriminate accumulation of every available fact. Science is predicated upon the belief that the Universe is algorithmically compressible and the modern search for a Theory of Everything is the ultimate expression of that belief, a belief that there is an abbreviated representation of the logic behind the Universe's properties that can be written down in finite form by human beings." (John D Barrow, "New Theories of Everything", 1991)

"The goal of science is to make sense of the diversity of Nature." (John D Barrow, "Theories of Everything: The Quest for Ultimate Explanation", 1991)

"Science is not about control. It is about cultivating a perpetual condition of wonder in the face of something that forever grows one step richer and subtler than our latest theory about it. It is about  reverence, not mastery." (Richard Power, "Gold Bug Variations", 1993)

"Statistics as a science is to quantify uncertainty, not unknown." (Chamont Wang, "Sense and Nonsense of Statistical Inference: Controversy, Misuse, and Subtlety", 1993)

"Clearly, science is not simply a matter of observing facts. Every scientific theory also expresses a worldview. Philosophical preconceptions determine where facts are sought, how experiments are designed, and which conclusions are drawn from them." (Nancy R Pearcey & Charles B. Thaxton, "The Soul of Science: Christian Faith and Natural Philosophy", 1994)

"Science is distinguished not for asserting that nature is rational, but for constantly testing claims to that or any other affect by observation and experiment." (Timothy Ferris, "The Whole Shebang: A State-of-the Universe’s Report", 1996)

"Science is more than a mere attempt to describe nature as accurately as possible. Frequently the real message is well hidden, and a law that gives a poor approximation to nature has more significance than one which works fairly well but is poisoned at the root." (Robert H March, "Physics for Poets", 1996)

"The art of science is knowing which observations to ignore and which are the key to the puzzle." (Edward W Kolb, "Blind Watchers of the Sky", 1996)

"Mathematics is the study of analogies between analogies. All science is. Scientists want to show that things that don’t look alike are really the same. That is one of their innermost Freudian motivations. In fact, that is what we mean by understanding." (Gian-Carlo Rota, "Indiscrete Thoughts", 1997)

"Religion is the antithesis of science; science is competent to illuminate all the deep questions of existence, and does so in a manner that makes full use of, and respects the human intellect. I see neither need nor sign of any future reconciliation." (Peter W Atkins, "Religion - The Antithesis to Science", 1997)

"[…] the pursuit of science is more than the pursuit of understanding. It is driven by the creative urge, the urge to construct a vision, a map, a picture of the world that gives the world a little more beauty and coherence than it had before." (John A Wheeler, "Geons, Black Holes, and Quantum Foam: A Life in Physics", 1998)

"The rate of the development of science is not the rate at which you make observations alone but, much more important, the rate at which you create new things to test." (Richard Feynman, "The Meaning of It All", 1998)

"The passion and beauty and joy of science is that we humans have invented a process to understand the universe in a way that is true for everyone. We are finding universal truths." (Bill Nye, 2000)

"The poetry of science is in some sense embodied in its great equations, and these equations can also be peeled. But their layers represent their attributes and consequences, not their meanings." (Graham Farmelo, 2002)

"Science is the art of the appropriate approximation. While the flat earth model is usually spoken of with derision it is still widely used. Flat maps, either in atlases or road maps, use the flat earth model as an approximation to the more complicated shape." (Byron K. Jennings, "On the Nature of Science", Physics in Canada Vol. 63 (1), 2007)

"It is ironic but true: the one reality science cannot reduce is the only reality we will ever know. This is why we need art. By expressing our actual experience, the artist reminds us that our science is incomplete, that no map of matter will ever explain the immateriality of our consciousness." (Jonah Lehrer, "Proust Was a Neuroscientist", 2011)

"Science isn’t about being right. It is about convincing others of the correctness of an idea through a methodology all will accept using data everyone can trust. New ideas take time to be accepted because they compete with others that have already passed the test." (Tom Koch, "Commentary: Nobody loves a critic: Edmund A Parkes and John Snow’s cholera", International Journal of Epidemiology Vol. 42 (6), 2013)

"Science, at its core, is simply a method of practical logic that tests hypotheses against experience. Scientism, by contrast, is the worldview and value system that insists that the questions the scientific method can answer are the most important questions human beings can ask, and that the picture of the world yielded by science is a better approximation to reality than any other." (John M Greer, "After Progress: Reason and Religion at the End of the Industrial Age", 2015)

More quotes on "Science" at quotablemath.blogspot.com.

🔭Data Science: Parameters (Just the Quotes)

"The essential feature is that we express ignorance of whether the new parameter is needed by taking half the prior probability for it as concentrated in the value indicated by the null hypothesis and distributing the other half over the range possible." (Harold Jeffreys, "Theory of Probablitity", 1939)

"The general method involved may be very simply stated. In cases where the equilibrium values of our variables can be regarded as the solutions of an extremum (maximum or minimum) problem, it is often possible regardless of the number of variables involved to determine unambiguously the qualitative behavior of our solution values in respect to changes of parameters." (Paul Samuelson, "Foundations of Economic Analysis", 1947)

"A primary goal of any learning model is to predict correctly the learning curve - proportions of correct responses versus trials. Almost any sensible model with two or three free parameters, however, can closely fit the curve, and so other criteria must be invoked when one is comparing several models." (Robert R Bush & Frederick Mosteller, "A Comparison of Eight Models?", Studies in Mathematical Learning Theory, 1959)

"A satisfactory prediction of the sequential properties of learning data from a single experiment is by no means a final test of a model. Numerous other criteria - and some more demanding - can be specified. For example, a model with specific numerical parameter values should be invariant to changes in independent variables that explicitly enter in the model." (Robert R Bush & Frederick Mosteller,"A Comparison of Eight Models?", Studies in Mathematical Learning Theory, 1959)

"The usefulness of the models in constructing a testable theory of the process is severely limited by the quickly increasing number of parameters which must be estimated in order to compare the predictions of the models with empirical results" (Anatol Rapoport, "Prisoner's Dilemma: A study in conflict and cooperation", 1965)

"Since all models are wrong the scientist cannot obtain a ‘correct’ one by excessive elaboration. On the contrary following William of Occam he should seek an economical description of natural phenomena. Just as the ability to devise simple but evocative models is the signature of the great scientist so overelaboration and overparameterization is often the mark of mediocrity." (George Box, "Science and Statistics", Journal of the American Statistical Association 71, 1976)

"Mathematical model making is an art. If the model is too small, a great deal of analysis and numerical solution can be done, but the results, in general, can be meaningless. If the model is too large, neither analysis nor numerical solution can be carried out, the interpretation of the results is in any case very difficult, and there is great difficulty in obtaining the numerical values of the parameters needed for numerical results." (Richard E Bellman, "Eye of the Hurricane: An Autobiography", 1984)

"A mechanistic model has the following advantages: 1. It contributes to our scientific understanding of the phenomenon under study. 2. It usually provides a better basis for extrapolation (at least to conditions worthy of further experimental investigation if not through the entire range of all input variables). 3. It tends to be parsimonious (i.e, frugal) in the use of parameters and to provide better estimates of the response." (George E P Box, "Empirical Model-Building and Response Surfaces", 1987)

"Whenever parameters can be quantified, it is usually desirable to do so." (Norman R Augustine, "Augustine's Laws", 1987)

"In addition to dimensionality requirements, chaos can occur only in nonlinear situations. In multidimensional settings, this means that at least one term in one equation must be nonlinear while also involving several of the variables. With all linear models, solutions can be expressed as combinations of regular and linear periodic processes, but nonlinearities in a model allow for instabilities in such periodic solutions within certain value ranges for some of the parameters." (Courtney Brown, "Chaos and Catastrophe Theories", 1995)

"Bayesian inference is appealing when prior information is available since Bayes’ theorem is a natural way to combine prior information with data. Some people find Bayesian inference psychologically appealing because it allows us to make probability statements about parameters. […] In parametric models, with large samples, Bayesian and frequentist methods give approximately the same inferences. In general, they need not agree." (Larry A Wasserman, "All of Statistics: A concise course in statistical inference", 2004)

"The Bayesian approach is based on the following postulates: (B1) Probability describes degree of belief, not limiting frequency. As such, we can make probability statements about lots of things, not just data which are subject to random variation. […] (B2) We can make probability statements about parameters, even though they are fixed constants. (B3) We make inferences about a parameter θ by producing a probability distribution for θ. Inferences, such as point estimates and interval estimates, may then be extracted from this distribution." (Larry A Wasserman, "All of Statistics: A concise course in statistical inference", 2004)

"The important thing is to understand that frequentist and Bayesian methods are answering different questions. To combine prior beliefs with data in a principled way, use Bayesian inference. To construct procedures with guaranteed long run performance, such as confidence intervals, use frequentist methods. Generally, Bayesian methods run into problems when the parameter space is high dimensional." (Larry A Wasserman, "All of Statistics: A concise course in statistical inference", 2004)

"Each fuzzy set is uniquely defined by a membership function. […] There are two approaches to determining a membership function. The first approach is to use the knowledge of human experts. Because fuzzy sets are often used to formulate human knowledge, membership functions represent a part of human knowledge. Usually, this approach can only give a rough formula of the membership function and fine-tuning is required. The second approach is to use data collected from various sensors to determine the membership function. Specifically, we first specify the structure of membership function and then fine-tune the parameters of membership function based on the data." (Huaguang Zhang & Derong Liu, "Fuzzy Modeling and Fuzzy Control", 2006)

"It is also inevitable for any model or theory to have an uncertainty (a difference between model and reality). Such uncertainties apply both to the numerical parameters of the model and to the inadequacy of the model as well. Because it is much harder to get a grip on these types of uncertainties, they are disregarded, usually." (Manfred Drosg, "Dealing with Uncertainties: A Guide to Error Analysis", 2007)

"Traditional statistics is strong in devising ways of describing data and inferring distributional parameters from sample. Causal inference requires two additional ingredients: a science-friendly language for articulating causal knowledge, and a mathematical machinery for processing that knowledge, combining it with data and drawing new causal conclusions about a phenomenon." (Judea Pearl, "Causal inference in statistics: An overview", Statistics Surveys 3, 2009)

"In negative feedback regulation the organism has set points to which different parameters (temperature, volume, pressure, etc.) have to be adapted to maintain the normal state and stability of the body. The momentary value refers to the values at the time the parameters have been measured. When a parameter changes it has to be turned back to its set point. Oscillations are characteristic to negative feedback regulation […]" (Gaspar Banfalvi, "Homeostasis - Tumor – Metastasis", 2014)

"Today we routinely learn models with millions of parameters, enough to give each elephant in the world his own distinctive wiggle. It’s even been said that data mining means 'torturing the data until it confesses'." (Pedro Domingos, "The Master Algorithm", 2015)

"An estimate (the mathematical definition) is a number derived from observed values that is as close as we can get to the true parameter value. Useful estimators are those that are 'better' in some sense than any others." (David S Salsburg, "Errors, Blunders, and Lies: How to Tell the Difference", 2017)

"Estimators are functions of the observed values that can be used to estimate specific parameters. Good estimators are those that are consistent and have minimum variance. These properties are guaranteed if the estimator maximizes the likelihood of the observations." (David S Salsburg, "Errors, Blunders, and Lies: How to Tell the Difference", 2017)

"What properties should a good statistical estimator have? Since we are dealing with probability, we start with the probability that our estimate will be very close to the true value of the parameter. We want that probability to become greater and greater as we get more and more data. This property is called consistency. This is a statement about probability. It does not say that we are sure to get the right answer. It says that it is highly probable that we will be close to the right answer." (David S Salsburg, "Errors, Blunders, and Lies: How to Tell the Difference", 2017)

More quotes on "Parameters" at the-web-of-knowledge.blogspot.com.

🔭Data Science: Iterations (Just the Quotes)

"Data analysis must be iterative to be effective. [...] The iterative and interactive interplay of summarizing by fit and exposing by residuals is vital to effective data analysis. Summarizing and exposing are complementary and pervasive." (John W Tukey & Martin B Wilk, "Data Analysis and Statistics: An Expository Overview", 1966)

"Statistical methods are tools of scientific investigation. Scientific investigation is a controlled learning process in which various aspects of a problem are illuminated as the study proceeds. It can be thought of as a major iteration within which secondary iterations occur. The major iteration is that in which a tentative conjecture suggests an experiment, appropriate analysis of the data so generated leads to a modified conjecture, and this in turn leads to a new experiment, and so on." (George E P Box & George C Tjao, "Bayesian Inference in Statistical Analysis", 1973)

"Iteration and experimentation are important for all of data analysis, including graphical data display. In many cases when we make a graph it is immediately clear that some aspect is inadequate and we regraph the data. In many other cases we make a graph, and all is well, but we get an idea for studying the data in a different way with a different graph; one successful graph often suggests another." (William S Cleveland, "The Elements of Graphing Data", 1985)

"Apart from power laws, iteration is one of the prime sources of self-similarity. Iteration here means the repeated application of some rule or operation - doing the same thing over and over again. […] A concept closely related to iteration is recursion. In an age of increasing automation and computation, many processes and calculations are recursive, and if a recursive algorithm is in fact repetitious, self-similarity is waiting in the wings."(Manfred Schroeder, "Fractals, Chaos, Power Laws Minutes from an Infinite Paradise", 1990)

"Fitting is essential to visualizing hypervariate data. The structure of data in many dimensions can be exceedingly complex. The visualization of a fit to hypervariate data, by reducing the amount of noise, can often lead to more insight. The fit is a hypervariate surface, a function of three or more variables. As with bivariate and trivariate data, our fitting tools are loess and parametric fitting by least-squares. And each tool can employ bisquare iterations to produce robust estimates when outliers or other forms of leptokurtosis are present." (William S Cleveland, "Visualizing Data", 1993)

"Data scientists combine entrepreneurship with patience, the willingness to build data products incrementally, the ability to explore, and the ability to iterate over a solution. They are inherently interdisciplinary. They can tackle all aspects of a problem, from initial data collection and data conditioning to drawing conclusions. They can think outside the box to come up with new ways to view the problem, or to work with very broadly defined problems: 'there’s a lot of data, what can you make from it?'" (Mike Loukides, "What Is Data Science?", 2011)

"Overfitting occurs when a formula describes a set of data very closely, but does not lead to any sensible explanation for the behavior of the data and does not predict the behavior of comparable data sets. In the case of overfitting, the formula is said to describe the noise of the system rather than the characteristic behavior of the system. Overfitting occurs frequently with models that perform iterative approximations on training data, coming closer and closer to the training data set with each iteration. Neural networks are an example of a data modeling strategy that is prone to overfitting." (Jules H Berman, "Principles of Big Data: Preparing, Sharing, and Analyzing Complex Information", 2013)

"Geometric pattern repeated at progressively smaller scales, where each iteration is about a reproduction of the image to produce completely irregular shapes and surfaces that can not be represented by classical geometry. Fractals are generally self-similar (each section looks at all) and are not subordinated to a specific scale. They are used especially in the digital modeling of irregular patterns and structures in nature." (Mauro Chiarella, "Folds and Refolds: Space Generation, Shapes, and Complex Components", 2016)

"Cluster analysis refers to the grouping of observations so that the objects within each cluster share similar properties, and properties of all clusters are independent of each other. Cluster algorithms usually optimize by maximizing the distance among clusters and minimizing the distance between objects in a cluster. Cluster analysis does not complete in a single iteration but goes through several iterations until the model converges. Model convergence means that the cluster memberships of all objects converge and don’t change with every new iteration." (Danish Haroon, "Python Machine Learning Case Studies", 2017)

30 November 2018

🔭Data Science: p-value (Just the Quotes)

"What the use of a p-value implies, therefore, is that a hypothesis that may be true may be rejected because it has not predicted observable results that have not occurred." (Harold Jeffreys, "Theory of Probability", 1939)

"A quotation of a p-value is part of the ritual of science, a sprinkling of the holy waters in an effort to sanctify the data analysis and turn consumers of the results into true believers." (William Cleveland, "Visualizing Data", 1993)

"A common misconception is that an effect exists only if it is statistically significant and that it does not exist if it is not [statistically significant]." (Jonas Ranstam, "A common misconception about p-value and its consequences", Acta Orthopaedica Scandinavica 67, 1996)

"It’s a commonplace among statisticians that a chi-squared test (and, really, any p-value) can be viewed as a crude measure of sample size: When sample size is small, it’s very difficult to get a rejection (that is, a p-value below 0.05), whereas when sample size is huge, just about anything will bag you a rejection. With large n, a smaller signal can be found amid the noise. In general: small n, unlikely to get small p-values. Large n, likely to find something. Huge n, almost certain to find lots of small p-values." (Andrew Gelman, "The sample size is huge, so a p-value of 0.007 is not that impressive", 2009)
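
A rough numerical sketch of the point in the quote above, under simplifying assumptions that are not Gelman's: a one-sample two-sided z-test with known standard deviation, and an observed mean that equals a fixed, tiny effect exactly (no sampling noise). The function name `two_sided_p_value` and the numbers are hypothetical. Only the sample size changes between rows.

```python
import math


def two_sided_p_value(effect, sigma, n):
    """Two-sided z-test p-value when the sample mean equals `effect` (assumed)."""
    z = effect * math.sqrt(n) / sigma
    # For a standard normal Z, P(|Z| >= |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


effect, sigma = 0.02, 1.0  # hypothetical tiny effect: 2% of one standard deviation
sample_sizes = [100, 10_000, 1_000_000]
p_values = [two_sided_p_value(effect, sigma, n) for n in sample_sizes]

for n, p in zip(sample_sizes, p_values):
    print(f"n = {n:>9,}: p = {p:.3g}")
```

With the effect held fixed, the p-value falls from well above 0.05 to far below it purely because n grows, which is why a small p-value from a huge sample says little about the size of the effect.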

"The p-value is a concept so misaligned with intuition that no civilian can hold it firmly in mind. Nor can many statisticians." (Matt Briggs, "Why do statisticians answer silly questions that no one ever asks?", Significance Vol. 9(1), 2012)

"Statistical significance refers to the probability that something is true. It’s a measure of how probable it is that the effect we’re seeing is real (rather than due to chance occurrence), which is why it’s typically measured with a p-value. P, in this case, stands for probability. If you accept p-values as a measure of statistical significance, then the lower your p-value is, the less likely it is that the results you’re seeing are due to chance alone." (John H Johnson & Mike Gluck, "Everydata: The misinformation hidden in the little data you consume every day", 2016)

"When statistical inferences, such as p-values, follow extensive looks at the data, they no longer have their usual interpretation. Ignoring this reality is dishonest: it is like painting a bull’s eye around the landing spot of your arrow. This is known in some circles as p-hacking, and much has been written about its perils and pitfalls." (Robert E Kass et al., "Ten Simple Rules for Effective Statistical Practice", PLoS Comput Biol 12(6), 2016)

"Remember that a p-value merely indicates the probability of a particular set of data being generated by the null model–it has little to say about the size of a deviation from that model (especially in the tails of the distribution, where large changes in effect size cause only small changes in p-values)." (Clay Helberg)

About Me

Koeln, NRW, Germany
IT Professional with more than 24 years experience in IT in the area of full life-cycle of Web/Desktop/Database Applications Development, Software Engineering, Consultancy, Data Management, Data Quality, Data Migrations, Reporting, ERP implementations & support, Team/Project/IT Management, etc.