"A graphical form that involves elementary perceptual tasks that lead to more accurate judgments than another graphical form (with the same quantitative in formation) will result in better organization and increase the chances of a correct perception of patterns and behavior." (William S Cleveland & Robert McGill, "Graphical Perception: Theory, Experimentation, and Application to the Development of Graphical Methods", Journal of the American Statistical Association Vol. 79(387), 1984)
"Dot charts are suggested as replacements for bar charts. The replacements allow more effective visual decoding of the quantitative information and can be used for a wider variety of data sets." (William S. Cleveland, "Graphical Methods for Data Presentation: Full Scale Breaks, Dot Charts, and Multibased Logging", The American Statistician Vol. 38 (4) 1984)
"[...] error bars are more effectively portrayed on dot charts than on bar charts. […] On the bar chart the upper values of the intervals stand out well, but the lower values are visually deemphasized and are not as well perceived as a result of being embedded in the bars. This deemphasis does not occur on the dot chart." (William S. Cleveland, "Graphical Methods for Data Presentation: Full Scale Breaks, Dot Charts, and Multibased Logging", The American Statistician Vol. 38 (4) 1984)
"Experimentation with graphical methods for data presentation is important for improving graphical communication in science." (William S. Cleveland, "Graphical Methods for Data Presentation: Full Scale Breaks, Dot Charts, and Multibased Logging", The American Statistician Vol. 38 (4) 1984)
"For certain types of data structures, one cannot always use the most accurate elementary task, judging position along a common scale. But this is not true of the data represented in divided bar charts and pie charts; one can always represent such data along a common scale. A pie chart can always be replaced by a bar chart, thus replacing angle judgments by position judgments. […] A divided bar chart can always be replaced by a grouped bar chart; […]." (William S Cleveland & Robert McGill, "Graphical Perception: Theory, Experimentation, and Application to the Development of Graphical Methods", Journal of the American Statistical Association Vol. 79(387), 1984)
"Of course increased bias does not necessarily imply less overall accuracy. The reasoning, however, is that the mechanism leading to bias might well lead to other types of inaccuracy as well." (William S Cleveland & Robert McGill, "Graphical Perception: Theory, Experimentation, and Application to the Development of Graphical Methods", Journal of the American Statistical Association Vol. 79(387), 1984)
"One must be careful not to fall into a conceptual trap by adopting accuracy as a criterion. We are not saying that the primary purpose of a graph is to convey numbers with as many decimal places as possible. […] The power of a graph is its ability to enable one to take in the quantitative information, organize it, and see patterns and structure not readily revealed by other means of studying the data." (William S Cleveland & Robert McGill, "Graphical Perception: Theory, Experimentation, and Application to the Development of Graphical Methods", Journal of the American Statistical Association Vol. 79(387), 1984)
"The bar of a bar chart has two aspects that can be used to visually decode quantitative information-size (length and area) and the relative position of the end of the bar along the common scale. The changing sizes of the bars is an important and imposing visual factor; thus it is important that size encode something meaningful. The sizes of bars encode the magnitudes of deviations from the baseline. If the deviations have no important interpretation, the changing sizes are wasted energy and even have the potential to mislead." (William S. Cleveland, "Graphical Methods for Data Presentation: Full Scale Breaks, Dot Charts, and Multibased Logging", The American Statistician Vol. 38 (4) 1984)
"The full break results in a graph with two juxtaposed panels. This use of juxtaposition to provide a full scale break, with each panel having a fill frame and its own scales, shows the scale break about as forcefully as possible and discourages mental visual connections by viewers and actual connections by authors." (William S. Cleveland, "Graphical Methods for Data Presentation: Full Scale Breaks, Dot Charts, and Multibased Logging", The American Statistician Vol. 38 (4) 1984)
"The logarithm is an extremely powerful and useful tool for graphical data presentation. One reason is that logarithms turn ratios into differences, and for many sets of data, it is natural to think in terms of ratios. […] Another reason for the power of logarithms is resolution. Data that are amounts or counts are often very skewed to the right; on graphs of such data, there are a few large values that take up most of the scale and the majority of the points are squashed into a small region of the scale with no resolution." (William S. Cleveland, "Graphical Methods for Data Presentation: Full Scale Breaks, Dot Charts, and Multibased Logging", The American Statistician Vol. 38 (4) 1984)
"[…] the partial scale break is a weak indicator that the reader can fail to appreciate fully; visually the graph is still a single panel that invites the viewer to see, inappropriately, patterns between the two scales. […] The partial scale break also invites authors to connect points across the break, a poor practice indeed; […]" (William S. Cleveland, "Graphical Methods for Data Presentation: Full Scale Breaks, Dot Charts, and Multibased Logging", The American Statistician Vol. 38 (4) 1984)
"A connected graph is appropriate when the time series is smooth, so that perceiving individual values is not important. A vertical line graph is appropriate when it is important to see individual values, when we need to see short-term fluctuations, and when the time series has a large number of values; the use of vertical lines allows us to pack the series tightly along the horizontal axis. The vertical line graph, however, usually works best when the vertical lines emanate from a horizontal line through the center of the data and when there are no long-term trends in the data." (William S Cleveland, "The Elements of Graphing Data", 1985)
"A time series is a special case of the broader dependent-independent variable category. Time is the independent variable. One important property of most time series is that for each time point of the data there is only a single value of the dependent variable; there are no repeat measurements. Furthermore, most time series are measured at equally-spaced or nearly equally-spaced points in time." (William S Cleveland, "The Elements of Graphing Data", 1985)
"Another way to obscure data is to graph too much. It is always tempting to show everything that comes to mind on a single graph, but graphing too much can result in less being seen and understood." (William S Cleveland, "The Elements of Graphing Data", 1985)
"Do not allow data labels in the data region to interfere with the quantitative data or to clutter the graph. […] Avoid putting notes, keys, and markers in the data region. Put keys and markers just outside the data region and put notes in the legend or in the text." (William S Cleveland, "The Elements of Graphing Data", 1985)
"Clear vision is a vital aspect of graphs. The viewer must
be able to visually disentangle the many different items that appear on a
graph." (William S Cleveland, "The Elements of Graphing Data", 1985)
"Graphs that communicate data to others often must undergo reduction and reproduction; these processes, if not done with care, can interfere with visual clarity." (William S Cleveland, "The Elements of Graphing Data", 1985)
"In part, graphing data needs to be iterative because we often do not know what to expect of the data; a graph can help discover unknown aspects of the data, and once the unknown is known, we frequently find ourselves formulating a new question about the data. Even when we understand the data and are graphing them for presentation, a graph will look different from what we had expected; our mind's eye frequently does not do a good job of predicting what our actual eyes will see." (William S Cleveland, "The Elements of Graphing Data", 1985)
"It is common for positive data to be skewed to the right: some values bunch together at the low end of the scale and others trail off to the high end with increasing gaps between the values as they get higher. Such data can cause severe resolution problems on graphs, and the common remedy is to take logarithms. Indeed, it is the frequent success of this remedy that partly accounts for the large use of logarithms in graphical data display." (William S Cleveland, "The Elements of Graphing Data", 1985)
"Iteration and experimentation are important for all of
data analysis, including graphical data display. In many cases when we make a
graph it is immediately clear that some aspect is inadequate and we regraph the
data. In many other cases we make a graph, and all is well, but we get an idea
for studying the data in a different way with a different graph; one successful
graph often suggests another." (William S Cleveland, "The Elements of Graphing Data", 1985)
"Make the data stand out and avoid superfluity are two
broad strategies that serve as an overall guide to the specific principles
[…] The data - the quantitative and qualitative information in the data region - are the reason for the existence of the graph. The data should stand out. […]
We should eliminate superfluity in graphs. Unnecessary parts of a graph add to
the clutter and increase the difficulty of making the necessary elements - the
data - stand out." (William S Cleveland, "The Elements of Graphing Data", 1985)
"No matter how clever the choice of the information, and no matter how technologically impressive the encoding, a visualization fails if the decoding fails. Some display methods lead to efficient, accurate decoding, and others lead to inefficient, inaccurate decoding. It is only through scientific study of visual perception that informed judgments can be made about display methods." (William S Cleveland, "The Elements of Graphing Data", 1985)
"There are some who argue that a graph is a success only
if the important information in the data can be seen within a few seconds. While
there is a place for rapidly-understood graphs, it is too limiting to make speed a requirement in science and technology, where
the use of graphs ranges from, detailed, in-depth data analysis to quick presentation." (William S Cleveland, "The Elements of Graphing Data", 1985)
"Use a reference line when there is an important value
that must be seen across the entire graph, but do not let the line interfere
with the data." (William S Cleveland, "The Elements of Graphing Data", 1985)
"When a graph is constructed, quantitative and categorical information is encoded, chiefly through position, size, symbols, and color. When a person looks at a graph, the information is visually decoded by the person's visual system. A graphical method is successful only if the decoding process is effective. No matter how clever and how technologically impressive the encoding, it is a failure if the decoding process is a failure. Informed decisions about how to encode data can be achieved only through an understanding of the visual decoding process, which is called graphical perception." (William S Cleveland, "The Elements of Graphing Data", 1985)
"When magnitudes are graphed on a logarithmic scale, percents and factors are easier to judge since equal multiplicative factors and percents result in equal distances throughout the entire scale." (William S Cleveland, "The Elements of Graphing Data", 1985)
"When the data are magnitudes, it is helpful to have zero
included in the scale so we can see its value relative to the value of the
data. But the need for zero is not so compelling that we should allow its
inclusion to ruin the resolution of the data on the graph." (William S Cleveland, "The Elements of Graphing Data", 1985)
"Data that are skewed toward large values occur commonly.
Any set of positive measurements is a candidate. Nature just works like that.
In fact, if data consisting of positive numbers range over several powers of ten,
it is almost a guarantee that they will be skewed. Skewness creates many
problems. There are visualization problems. A large fraction of the data are
squashed into small regions of graphs, and visual assessment of the data
degrades. There are characterization problems. Skewed distributions tend to be
more complicated than symmetric ones; for example, there is no unique notion of
location and the median and mean measure different aspects of the distribution.
There are problems in carrying out probabilistic methods. The distribution of
skewed data is not well approximated by the normal, so the many probabilistic
methods based on an assumption of a normal distribution cannot be applied." (William S Cleveland, "Visualizing Data", 1993)
"Fitting data means finding mathematical descriptions of structure in the data. An additive shift is a structural property of univariate data in which distributions differ only in location and not in spread or shape. […] The process of identifying a structure in data and then fitting the structure to produce residuals that have the same distribution lies at the heart of statistical analysis. Such homogeneous residuals can be pooled, which increases the power of the description of the variation in the data." (William S Cleveland, "Visualizing Data", 1993)
"Fitting is essential to visualizing hypervariate data. The structure of data in many dimensions can be exceedingly complex. The visualization of a fit to hypervariate data, by reducing the amount of noise, can often lead to more insight. The fit is a hypervariate surface, a function of three or more variables. As with bivariate and trivariate data, our fitting tools are loess and parametric fitting by least-squares. And each tool can employ bisquare iterations to produce robust estimates when outliers or other forms of leptokurtosis are present." (William S Cleveland, "Visualizing Data", 1993)
"If the underlying pattern of the data has gentle
curvature with no local maxima and minima, then locally linear fitting is
usually sufficient. But if there are local maxima or minima, then locally
quadratic fitting typically does a better job of following the pattern of the
data and maintaining local smoothness." (William S Cleveland, "Visualizing Data", 1993)
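A sketch of robust local fitting on simulated data with local maxima and minima; note that statsmodels' lowess is locally linear (degree 1) only, whereas locally quadratic loess of the kind described here is available in, for example, R's loess function:

```python
import matplotlib.pyplot as plt
import numpy as np
from statsmodels.nonparametric.smoothers_lowess import lowess

rng = np.random.default_rng(4)
x = np.linspace(0, 10, 200)
y = np.sin(x) + rng.normal(scale=0.3, size=x.size)  # data with local extrema

# Locally linear (degree-1) robust smoothing; `frac` is the span and
# `it` the number of robustness iterations (bisquare-style reweighting).
fit = lowess(y, x, frac=0.2, it=3)

fig, ax = plt.subplots()
ax.plot(x, y, ".", color="gray")
ax.plot(fit[:, 0], fit[:, 1], color="black")
plt.show()
```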
"Many good things happen when data distributions are well approximated by the normal. First, the question of whether the shifts among the distributions are additive becomes the question of whether the distributions have the same standard deviation; if so, the shifts are additive. […] A second good happening is that methods of fitting and methods of probabilistic inference, to be taken up shortly, are typically simple and on well understood ground. […] A third good thing is that the description of the data distribution is more parsimonious." (William S Cleveland, "Visualizing Data", 1993)
"Many of the applications of visualization in this book
give the impression that data analysis consists of an orderly progression of exploratory
graphs, fitting, and visualization of fits and residuals. Coherence of
discussion and limited space necessitate a presentation that appears to imply
this. Real life is usually quite different. There are blind alleys. There are
mistaken actions. There are effects missed until the very end when some
visualization saves the day. And worse, there is the possibility of the nearly unmentionable:
missed effects." (William S Cleveland, "Visualizing Data", 1993)
"One important aspect of reality is improvisation; as a
result of special structure in a set of data, or the finding of a visualization
method, we stray from the standard methods for the data type to exploit the structure
or the finding." (William S Cleveland, "Visualizing Data", 1993)
"Probabilistic inference is the classical paradigm for data analysis in science and technology. It rests on a foundation of randomness; variation in data is ascribed to a random process in which nature generates data according to a probability distribution. This leads to a codification of uncertainly by confidence intervals and hypothesis tests." (William S Cleveland, "Visualizing Data", 1993)
"Sometimes, when visualization thoroughly reveals the structure of a set of data, there is a tendency to underrate the power of the method for the application. Little effort is expended in seeing the structure once the right visualization method is used, so we are mislead into thinking nothing exciting has occurred." (William S Cleveland, "Visualizing Data", 1993)
"The logarithm is one of many transformations that we can apply to univariate measurements. The square root is another. Transformation is a critical tool for visualization or for any other mode of data analysis because it can substantially simplify the structure of a set of data. For example, transformation can remove skewness toward large values, and it can remove monotone increasing spread. And often, it is the logarithm that achieves this removal." (William S Cleveland, "Visualizing Data", 1993)
"The scatterplot is a useful exploratory method for providing a first look at bivariate data to see how they are distributed throughout the plane, for example, to see clusters of points, outliers, and so forth." (William S Cleveland, "Visualizing Data", 1993)
"There are two components to visualizing the structure of statistical data - graphing and fitting. Graphs are needed, of course, because visualization implies a process in which information is encoded on visual displays. Fitting mathematical functions to data is needed too. Just graphing raw data, without fitting them and without graphing the fits and residuals, often leaves important aspects of data undiscovered." (William S Cleveland, "Visualizing Data", 1993)
"Using area to encode quantitative information is a poor graphical method. Effects that can be readily perceived in other visualizations are often lost in an encoding by area." (William S Cleveland, "Visualizing Data", 1993)
"Visualization is an approach to data analysis that stresses a penetrating look at the structure of data. No other approach conveys as much information. […] Conclusions spring from data when this information is combined with the prior knowledge of the subject under investigation." (William S Cleveland, "Visualizing Data", 1993)
"Visualization is an effective framework for drawing inferences from data because its revelation of the structure of data can be readily combined with prior knowledge to draw conclusions. By contrast, because of the formalism of probabilistic methods, it is typically impossible to incorporate into them the full body of prior information." (William S Cleveland, "Visualizing Data", 1993)
"When distributions are compared, the goal is to understand how the distributions shift in going from one data set to the next. […] The most effective way to investigate the shifts of distributions is to compare corresponding quantiles." (William S Cleveland, "Visualizing Data", 1993)
"When the distributions of two or more groups of univariate data are skewed, it is common to have the spread increase monotonically with location. This behavior is monotone spread. Strictly speaking, monotone spread includes the case where the spread decreases monotonically with location, but such a decrease is much less common for raw data. Monotone spread, as with skewness, adds to the difficulty of data analysis. For example, it means that we cannot fit just location estimates to produce homogeneous residuals; we must fit spread estimates as well. Furthermore, the distributions cannot be compared by a number of standard methods of probabilistic inference that are based on an assumption of equal spreads; the standard t-test is one example. Fortunately, remedies for skewness can cure monotone spread as well." (William S Cleveland, "Visualizing Data", 1993)
"Pie charts have severe perceptual problems. Experiments in graphical perception have shown that compared with dot charts, they convey information far less reliably. But if you want to display some data, and perceiving the information is not so important, then a pie chart is fine." (Richard Becker & William S Cleveland," S-Plus Trellis Graphics User's Manual", 1996)