22 November 2011

Graphical Representation: Smoothing (Just the Quotes)

"The chief problems in the technique of historigram [aka histogram] plotting are those of base line scales, types of lines to use for the graphs and methods of and purposes of smoothing these curves. The size of page, ability of grasp by the eye, subsequent treatment of the illustration, etc., are determining factors. The variable factor is usually plotted from a base line along the ordinate axis. Spacing and rules for scales apply as in frequency diagrams." (William C Marshall, "Graphical methods for schools, colleges, statisticians, engineers and executives", 1921)

"A connected graph is appropriate when the time series is smooth, so that perceiving individual values is not important. A vertical line graph is appropriate when it is important to see individual values, when we need to see short-term fluctuations, and when the time series has a large number of values; the use of vertical lines allows us to pack the series tightly along the horizontal axis. The vertical line graph, however, usually works best when the vertical lines emanate from a horizontal line through the center of the data and when there are no long-term trends in the data." (William S Cleveland, "The Elements of Graphing Data", 1985)

"If the underlying pattern of the data has gentle curvature with no local maxima and minima, then locally linear fitting is usually sufficient. But if there are local maxima or minima, then locally quadratic fitting typically does a better job of following the pattern of the data and maintaining local smoothness." (William S Cleveland, "Visualizing Data", 1993)
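To make the contrast in the quote above concrete, here is a minimal sketch of local polynomial fitting in plain Python. It is not Cleveland's full loess procedure: it assumes a fixed bandwidth instead of a nearest-neighbour span, skips the robustness iterations, and the helper names (`tricube`, `solve`, `local_fit`) are made up for this illustration. The data are a hypothetical noise-free parabola with a maximum at x = 0.

```python
def tricube(u):
    """Tricube weight; zero for |u| >= 1."""
    u = abs(u)
    return (1 - u ** 3) ** 3 if u < 1 else 0.0


def solve(a, b):
    """Solve the small system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def local_fit(xs, ys, x0, bandwidth, degree):
    """Weighted least-squares polynomial fit centred on x0; returns the fit at x0."""
    p = degree + 1
    ata = [[0.0] * p for _ in range(p)]
    aty = [0.0] * p
    for x, y in zip(xs, ys):
        w = tricube((x - x0) / bandwidth)
        if w == 0.0:
            continue  # point lies outside the local window
        basis = [(x - x0) ** k for k in range(p)]
        for i in range(p):
            aty[i] += w * basis[i] * y
            for j in range(p):
                ata[i][j] += w * basis[i] * basis[j]
    # The basis is centred on x0, so the intercept is the fitted value at x0.
    return solve(ata, aty)[0]


# Hypothetical data: a parabola with a local maximum of 1 at x = 0
xs = [i / 10 for i in range(-20, 21)]
ys = [1 - x * x for x in xs]

linear_at_peak = local_fit(xs, ys, x0=0.0, bandwidth=1.0, degree=1)
quadratic_at_peak = local_fit(xs, ys, x0=0.0, bandwidth=1.0, degree=2)
print(linear_at_peak, quadratic_at_peak)
```

At the maximum the locally linear fit can only return a weighted average of the surrounding values, so it cuts the peak off, while the locally quadratic fit follows the curvature and lands on the true value.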

"As a general rule, the fewer the time intervals used in the averaging process, the more closely the moving average curve resembles the curve of the actual data. Conversely, the greater the number of intervals, the smoother the moving average curve. […] Moving average curves tend to have a delayed reaction to changes." (Robert L Harris, "Information Graphics: A Comprehensive Illustrated Reference", 1996)
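Both effects Harris describes can be seen in a small sketch. It assumes a trailing (not centred) moving average, and the series, the `moving_average` and `max_step` helpers, and the choice of "largest single-step change" as a smoothness measure are all illustrative assumptions, not something taken from the book.

```python
def moving_average(values, window):
    """Trailing moving average: mean of each value and the window - 1 values before it."""
    if window < 1:
        raise ValueError("window must be at least 1")
    out = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]  # drop the value that left the window
        if i >= window - 1:
            out.append(running / window)
    return out


def max_step(series):
    """Largest absolute change between neighbouring points; smaller means smoother."""
    return max(abs(b - a) for a, b in zip(series, series[1:]))


# Hypothetical series: a noisy level near 10 that jumps to about 20 at index 6
data = [10, 12, 9, 11, 10, 12, 20, 22, 19, 21, 20, 22]

short = moving_average(data, 3)  # few intervals: stays close to the data
long = moving_average(data, 6)   # many intervals: smoother, but slower to react

print("largest step:", max_step(data), max_step(short), max_step(long))
# Value of each curve at index 6, the moment the raw series jumps to 20
print("at the jump:", data[6], short[6 - 2], long[6 - 5])
```

The 6-point curve has the smallest step-to-step changes, yet at the moment of the jump it has moved the least toward the new level, which is the delayed reaction mentioned in the quote.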

"The plot tells us the data are granular in the data source, something we could not ascertain with the histogram. There is an important lesson here. Statistics texts and statistical packages that recommend the histogram as the graphical starting point for a data analysis are giving bad advice. The same goes for kernel density estimates. These are appropriate second stages for graphical data analysis. The best starting point for getting a sense of the distribution of a variable is a tally, stem-and-leaf, or a dot plot. A dot plot is a special case of a tally (perhaps best thought of as a delta-neighborhood tally). Once we see that the data are not granular, we may move on to a histogram or kernel density, which smooths the data more than a dot plot." (Leland Wilkinson, "The Grammar of Graphics" 2nd Ed., 2005)
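A small sketch of Wilkinson's point, assuming "granular" means the values were recorded on a coarse grid. The measurements, the `histogram` helper, and the bin layout are hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical measurements recorded only to the nearest 0.5
values = [2.0, 2.5, 2.5, 3.0, 3.0, 3.0, 3.5, 3.5, 4.0, 4.5, 4.5, 5.0]


def histogram(data, start, width, bins):
    """Count values falling in equal-width bins [start + k*width, start + (k+1)*width)."""
    counts = [0] * bins
    for v in data:
        idx = int((v - start) // width)
        if 0 <= idx < bins:
            counts[idx] += 1
    return counts


# Tally: one row per distinct recorded value, so the 0.5 grid stays visible
tally = Counter(values)
for value in sorted(tally):
    print(f"{value:>4}: {'*' * tally[value]}")

# Histogram with bins of width 1: the grid disappears inside the bins
hist = histogram(values, start=2.0, width=1.0, bins=4)
print(hist)

# Spacing between neighbouring distinct values, as read off the tally
distinct = sorted(tally)
gaps = {b - a for a, b in zip(distinct, distinct[1:])}
print(gaps)
```

The tally shows every recorded value and makes the constant spacing obvious; the histogram reports only four bin counts, from which the recording grid cannot be recovered.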

"Another method used to simplify the appearance of a graphic is smoothing. A regression line overlaid on a scatterplot is a smooth representation of the relationship between the two graph variables. For time series data, a moving average of the data over time is often used to smooth out the variation over small time steps in order to illustrate the overall trend." (Daniel B Carr & Linda W Pickle, "Visualizing Data Patterns with Micromaps", 2010)

"Scatterplots are the preferred medium for adding smooth curves to show a causal functional relationship or an association […] However, despite the advantage of the scatterplot for seeing some types of patterns, the linked micromap design adds geographic location to the information displayed and so enables searches for geographic patterns that the scatterplot omits." (Daniel B Carr & Linda W Pickle, "Visualizing Data Patterns with Micromaps", 2010)

"Smoothing is a technique that can be used to remove some of the variation in short-term data in favor of emphasizing long-term trends." (Andy Kriebel & Eva Murray, "#MakeoverMonday: Improving How We Visualize and Analyze Data, One Chart at a Time", 2018) 
