"In working through graphics one has, however, to be exceedingly cautious in certain particulars, for instance, when a set of figures, dynamical or financial, are available they are, so long as they are tabulated, instinctively taken merely at their face value. When plotted, however, there is a temptation to extrapolation which is well nigh irresistible to the untrained mind. Sometimes the process can be safely employed, but it requires a rather comprehensive knowledge of the facts that lie back of the data to tell when to go ahead and when to stop." (Allan C Haskell, "How to Make and Use Graphic Charts", 1919)
"A piece of self-deception - often dear to the heart of apprentice scientists - is the drawing of a 'smooth curve' (how attractive it sounds!) through a set of points which have about as much trend as the currants in plum duff. Once this is done, the mind, looking for order amidst chaos, follows the Jack-o'-lantern line with scant attention to the protesting shouts of the actual points. Nor, let it be whispered, is it unknown for people who should know better to rub off the offending points and publish the trend line which their foolish imagination has introduced on the flimsiest of evidence. Allied to this sin is that of overconfident extrapolation, i.e. extending the graph by guesswork beyond the range of factual information. Whenever extrapolation is attempted it should be carefully distinguished from the rest of the graph, e.g. by showing the extrapolation as a dotted line in contrast to the full line of the rest of the graph. [...] Extrapolation always calls for justification, sooner or later. Until this justification is forthcoming, it remains a provisional estimate, based on guesswork." (Michael J Moroney, "Facts from Figures", 1951)
"Extrapolations are useful, particularly in the form of soothsaying called forecasting trends. But in looking at the figures or the charts made from them, it is necessary to remember one thing constantly: The trend to now may be a fact, but the future trend represents no more than an educated guess. Implicit in it is 'everything else being equal' and 'present trends continuing'. And somehow everything else refuses to remain equal." (Darrell Huff, "How to Lie with Statistics", 1954)
"Almost all efforts at data analysis seek, at some point, to generalize the results and extend the reach of the conclusions beyond a particular set of data. The inferential leap may be from past experiences to future ones, from a sample of a population to the whole population, or from a narrow range of a variable to a wider range. The real difficulty is in deciding when the extrapolation beyond the range of the variables is warranted and when it is merely naive. As usual, it is largely a matter of substantive judgment - or, as it is sometimes more delicately put, a matter of 'a priori nonstatistical considerations'." (Edward R Tufte, "Data Analysis for Politics and Policy", 1974)
"Each part of a graphic generates visual expectations about its other parts and, in the economy of graphical perception, these expectations often determine what the eye sees. Deception results from the incorrect extrapolation of visual expectations generated at one place on the graphic to other places." (Edward R Tufte, "The Visual Display of Quantitative Information", 1983)
"Time-series forecasting is essentially a form of extrapolation in that it involves fitting a model to a set of data and then using that model outside the range of data to which it has been fitted. Extrapolation is rightly regarded with disfavour in other statistical areas, such as regression analysis. However, when forecasting the future of a time series, extrapolation is unavoidable." (Chris Chatfield, "Time-Series Forecasting" 2nd Ed, 2000)
"The first myth is that prediction is always based on time-series extrapolation into the future (also known as forecasting). This is not the case: predictive analytics can be applied to generate any type of unknown data, including past and present. In addition, prediction can be applied to non-temporal (time-based) use cases such as disease progression modeling, human relationship modeling, and sentiment analysis for medication adherence, etc. The second myth is that predictive analytics is a guarantor of what will happen in the future. This also is not the case: predictive analytics, due to the nature of the insights they create, are probabilistic and not deterministic. As a result, predictive analytics will not be able to ensure certainty of outcomes." (Prashant Natarajan et al, "Demystifying Big Data and Machine Learning for Healthcare", 2017)
"If you study one group and assume that your results apply to other groups, this is extrapolation. If you think you are studying one group, but do not manage to obtain a representative sample of that group, this is a different problem. It is a problem so important in statistics that it has a special name: selection bias. Selection bias arises when the individuals that you sample for your study differ systematically from the population of individuals eligible for your study."