02 December 2018

🔭Data Science: All Models Are Wrong (Just the Quotes)

“[…] no models are [true] - not even the Newtonian laws. When you construct a model you leave out all the details which you, with the knowledge at your disposal, consider inessential. […] Models should not be true, but it is important that they are applicable, and whether they are applicable for any given purpose must of course be investigated. This also means that a model is never accepted finally, only on trial.” (Georg Rasch, “Probabilistic Models for Some Intelligence and Attainment Tests”, 1960)

“Celestial navigation is based on the premise that the Earth is the center of the universe. The premise is wrong, but the navigation works. An incorrect model can be a useful tool.” (R A J Phillips, “A Day in the Life of Kelvin Throop”, Analog Science Fiction and Science Fact, Vol. 73 No. 5, 1964)

“Since all models are wrong the scientist cannot obtain a ‘correct’ one by excessive elaboration. On the contrary following William of Occam he should seek an economical description of natural phenomena. Just as the ability to devise simple but evocative models is the signature of the great scientist so overelaboration and overparameterization is often the mark of mediocrity.” (George Box, “Science and Statistics”, Journal of the American Statistical Association 71, 1976)

“A model of the universe does not require faith, but a telescope. If it is wrong, it is wrong.” (Paul C W Davies, “Space and Time in the Modern Universe”, 1977)

"Competent scientists do not believe their own models or theories, but rather treat them as convenient fictions. […] The issue to a scientist is not whether a model is true, but rather whether there is another whose predictive power is enough better to justify movement from today's fiction to a new one." (Steve Vardeman, "Comment", Journal of the American Statistical Association 82, 1987)

“The fact that [the model] is an approximation does not necessarily detract from its usefulness because models are approximations. All models are wrong, but some are useful.” (George Box, 1987)

"Statistical models for data are never true. The question whether a model is true is irrelevant. A more appropriate question is whether we obtain the correct scientific conclusion if we pretend that the process under study behaves according to a particular statistical model." (Scott Zeger, "Statistical reasoning in epidemiology", American Journal of Epidemiology, 1991)

“[…] it does not seem helpful just to say that all models are wrong. The very word model implies simplification and idealization. The idea that complex physical, biological or sociological systems can be exactly described by a few formulae is patently absurd. The construction of idealized representations that capture important stable aspects of such systems is, however, a vital part of general scientific analysis and statistical models, especially substantive ones, do not seem essentially different from other kinds of model.” (Sir David Cox, "Comment on ‘Model uncertainty, data mining and statistical inference’", Journal of the Royal Statistical Society, Series A 158, 1995)

“I do not know that my view is more correct; I do not even think that ‘right’ and ‘wrong’ are good categories for assessing complex mental models of external reality - for models in science are judged [as] useful or detrimental, not as true or false.” (Stephen Jay Gould, “Dinosaur in a Haystack: Reflections in Natural History”, 1995)

“No matter how beautiful the whole model may be, no matter how naturally it all seems to hang together now, if it disagrees with experiment, then it is wrong.” (John Gribbin, “Almost Everyone’s Guide to Science”, 1999)

“A model is a simplification or approximation of reality and hence will not reflect all of reality. […] Box noted that ‘all models are wrong, but some are useful’. While a model can never be ‘truth’, a model might be ranked from very useful, to useful, to somewhat useful to, finally, essentially useless.” (Kenneth P Burnham & David R Anderson, “Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach” 2nd Ed., 2005)

"You might say that there’s no reason to bother with model checking since all models are false anyway. I do believe that all models are false, but for me the purpose of model checking is not to accept or reject a model, but to reveal aspects of the data that are not captured by the fitted model." (Andrew Gelman, "Some thoughts on the sociology of statistics", 2007)

"First, we affirm that all models are wrong, some of them are useful. Since a model is an abstraction of reality, and that too only from a particular perspective, they are fundamentally wrong because they are not reality. That gives no license to models that are wrongly built - after all, two wrongs don’t make a right. So usefulness, or purpose, is what determines a model’s role, given that it is correctly formed. Models therefore have teleological value even though they are ontologically erroneous." (John Boardman & Brian Sauser, "Systems Thinking: Coping with 21st Century Problems", 2008)

“In general, when building statistical models, we must not forget that the aim is to understand something about the real world. Or predict, choose an action, make a decision, summarize evidence, and so on, but always about the real world, not an abstract mathematical world: our models are not the reality - a point well made by George Box in his oft-cited remark that ‘all models are wrong, but some are useful’.” (David Hand, "Wonderful examples, but let's not close our eyes", Statistical Science 29, 2014)

"A model is a metaphor, a description of a system that helps us to reason more clearly. Like all metaphors, models are approximations, and will never account for every last detail. A useful mantra here is: all models are wrong, but some models are useful." (James G Scott, "Statistical Modeling: A Gentle Introduction", 2017)

