25 November 2018

🔭Data Science: The Truth in Models (Just the Quotes)

"A model, like a novel, may resonate with nature, but it is not a ‘real’ thing. Like a novel, a model may be convincing - it may ‘ring true’ if it is consistent with our experience of the natural world. But just as we may wonder how much the characters in a novel are drawn from real life and how much is artifice, we might ask the same of a model: How much is based on observation and measurement of accessible phenomena, how much is convenience? Fundamentally, the reason for modeling is a lack of full access, either in time or space, to the phenomena of interest." (Kenneth Belitz, Science, Vol. 263, 1994)

"Exact truth of a null hypothesis is very unlikely except in a genuine uniformity trial." (David R Cox, "Some problems connected with statistical inference", Annals of Mathematical Statistics 29, 1958)

"[…] no models are [true] - not even the Newtonian laws. When you construct a model you leave out all the details which you, with the knowledge at your disposal, consider inessential. […] Models should not be true, but it is important that they are applicable, and whether they are applicable for any given purpose must of course be investigated. This also means that a model is never accepted finally, only on trial." (Georg Rasch, "Probabilistic Models for Some Intelligence and Attainment Tests", 1960)

"The validation of a model is not that it is 'true' but that it generates good testable hypotheses relevant to important problems." (Richard Levins, "The Strategy of Model Building in Population Biology", 1966)

"A theory has only the alternative of being right or wrong. A model has a third possibility: it may be right, but irrelevant." (Manfred Eigen, 1973)

"Models, of course, are never true, but fortunately it is only necessary that they be useful. For this it is usually needful only that they not be grossly wrong. I think rather simple modifications of our present models will prove adequate to take account of most realities of the outside world. The difficulties of computation which would have been a barrier in the past need not deter us now." (George E P Box, "Some Problems of Statistics and Everyday Life", Journal of the American Statistical Association, Vol. 74 (365), 1979)

"The purpose of an experiment is to answer questions. The truth of this seems so obvious, that it would not be worth emphasizing were it not for the fact that the results of many experiments are interpreted and presented with little or no reference to the questions that were asked in the first place." (Thomas M Little, "Interpretation and presentation of results", HortScience 16, 1981)

"The fact that [the model] is an approximation does not necessarily detract from its usefulness because models are approximations. All models are wrong, but some are useful." (George Box, 1987)

"A null hypothesis that yields under two different treatments have identical expectations is scarcely very plausible, and its rejection by a significance test is more dependent upon the size of an experiment than upon its untruth." (David J Finney, "Was this in your statistics textbook?", Experimental Agriculture 24, 1988)

"Statistical models for data are never true. The question whether a model is true is irrelevant. A more appropriate question is whether we obtain the correct scientific conclusion if we pretend that the process under study behaves according to a particular statistical model." (Scott Zeger, "Statistical reasoning in epidemiology", American Journal of Epidemiology, 1991)

"The motivation for any action on outliers must be to improve interpretation of data without ignoring unwelcome truth. To remove bad and untrustworthy data is a laudable ambition, but naive and untested rules may bring harm rather than benefit." (David Finney, "Calibration Guidelines Challenge Outlier Practices", The American Statistician Vol 60 (4), 2006) 

"You might say that there’s no reason to bother with model checking since all models are false anyway. I do believe that all models are false, but for me the purpose of model checking is not to accept or reject a model, but to reveal aspects of the data that are not captured by the fitted model." (Andrew Gelman, "Some thoughts on the sociology of statistics", 2007)

"If students have no experience with hands-on [telescope] observing, they may take all data as ‘truth’ without having an understanding of how the data are obtained and what could potentially go wrong in that process, so I think it becomes crucially important to give a glimpse of what’s happening behind the scenes at telescopes, so they can be appropriately skeptical users of data in the future." (Colette Salyk, Sky & Telescope, 2022)

"On a final note, we would like to stress the importance of design, which often does not receive the attention it deserves. Sometimes, the large number of modeling options for spatial analysis may raise the false impression that design does not matter, and that a sophisticated analysis takes care of everything. Nothing could be further from the truth." (Hans-Peter Piepho et al, "Two-dimensional P-spline smoothing for spatial analysis of plant breeding trials", Biometrical Journal, 2022)

