"A good rule of thumb for deciding how long the analysis of the data actually will take is (1) to add up all the time for everything you can think of - editing the data, checking for errors, calculating various statistics, thinking about the results, going back to the data to try out a new idea, and (2) then multiply the estimate obtained in this first step by five." (Edward R Tufte, "Data Analysis for Politics and Policy", 1974)
"Almost all efforts at data analysis seek, at some point, to
generalize the results and extend the reach of the conclusions beyond a
particular set of data. The inferential leap may be from past experiences to future
ones, from a sample of a population to the whole population, or from a narrow
range of a variable to a wider range. The real difficulty is in deciding when
the extrapolation beyond the range of the variables is warranted and when it is
merely naive. As usual, it is largely a matter of substantive judgment - or, as
it is sometimes more delicately put, a matter of 'a priori nonstatistical
considerations'." (Edward R Tufte, "Data Analysis for Politics and Policy", 1974)
"[…] fitting lines to relationships between variables is often a useful and powerful method of summarizing a set of data. Regression analysis fits naturally with the development of causal explanations, simply because the research worker must, at a minimum, know what he or she is seeking to explain." (Edward R Tufte, "Data Analysis for Politics and Policy", 1974)
"Fitting lines to relationships between variables is the major tool of data analysis. Fitted lines often effectively summarize the data and, by doing so, help communicate the analytic results to others. Estimating a fitted line is also the first step in squeezing further information from the data." (Edward R Tufte, "Data Analysis for Politics and Policy", 1974)
"If two or more describing variables in an analysis are
highly intercorrelated, it will be difficult and perhaps impossible to assess accurately
their independent impacts on the response variable. As the association between
two or more describing variables grows stronger, it becomes more and more
difficult to tell one variable from the other. This problem, called 'multicollinearity' in the statistical jargon, sometimes causes
difficulties in the analysis of nonexperimental data. […] No statistical
technique can go very far to remedy the problem because the fault lies
basically with the data rather than the method of analysis. Multicollinearity
weakens inferences based on any statistical method - regression, path analysis,
causal modeling, or cross-tabulations (where the difficulty shows up as a lack
of deviant cases and as near-empty cells)." (Edward R Tufte, "Data Analysis for Politics and Policy", 1974)
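The effect described in this quote can be illustrated with a small simulation. The sketch below is not from Tufte's book; it assumes a hypothetical model y = x1 + x2 + noise and compares how much the estimated coefficient on x1 varies from sample to sample when the two describing variables are uncorrelated versus highly correlated. All function names are illustrative.

```python
import random
import statistics

def ols_two_predictors(x1, x2, y):
    """Least-squares slopes for y = a + b1*x1 + b2*x2 (centered normal equations)."""
    m1, m2, my = statistics.fmean(x1), statistics.fmean(x2), statistics.fmean(y)
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2  # shrinks toward 0 as x1 and x2 become collinear
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return b1, b2

def spread_of_b1(rho, trials=300, n=50, seed=0):
    """Std. dev. of the estimated b1 across simulated samples with corr(x1, x2) near rho."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(trials):
        x1 = [rng.gauss(0, 1) for _ in range(n)]
        # x2 is built partly from x1, so the two predictors are correlated
        x2 = [rho * a + (1 - rho ** 2) ** 0.5 * rng.gauss(0, 1) for a in x1]
        y = [a + b + rng.gauss(0, 1) for a, b in zip(x1, x2)]  # assumed true model
        estimates.append(ols_two_predictors(x1, x2, y)[0])
    return statistics.stdev(estimates)

low = spread_of_b1(rho=0.0)
high = spread_of_b1(rho=0.95)
print(f"spread of b1, uncorrelated predictors: {low:.3f}")
print(f"spread of b1, rho = 0.95:              {high:.3f}")
```

With strongly correlated predictors the estimate of b1 is noticeably less stable, even though the estimation method is the same; the data simply carry less information about the separate effects.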
"[…] it is not enough to say: 'There's error in the data and
therefore the study must be terribly dubious'. A good critic and data analyst
must do more: he or she must also show how the error in the measurement or the
analysis affects the inferences made on the basis of that data and analysis." (Edward R Tufte, "Data Analysis for Politics and Policy", 1974)
"Logging size transforms the original skewed distribution
into a more symmetrical one by pulling in the long right tail of the
distribution toward the mean. The short left tail is, in addition, stretched.
The shift toward symmetrical distribution produced by the log transform is not,
of course, merely for convenience. Symmetrical distributions, especially those
that resemble the normal distribution, fulfill statistical assumptions that
form the basis of statistical significance testing in the regression model." (Edward R Tufte, "Data Analysis for Politics and Policy", 1974)
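As a rough illustration of the point about symmetry, the following sketch draws a hypothetical right-skewed "size" variable (a lognormal sample, which is an assumption made for the example and not data from the book) and compares its skewness before and after taking logarithms.

```python
import math
import random
import statistics

def skewness(values):
    """Moment-based skewness: the mean cubed z-score (about 0 for a symmetric sample)."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean([((v - mean) / sd) ** 3 for v in values])

rng = random.Random(42)
sizes = [rng.lognormvariate(10, 1) for _ in range(5000)]  # hypothetical skewed data
logged = [math.log(s) for s in sizes]                     # long right tail pulled in

print(f"skewness of raw sizes:    {skewness(sizes):.2f}")
print(f"skewness of logged sizes: {skewness(logged):.2f}")
```

The raw values show a strongly positive skewness, while the logged values come out close to zero, which is the shift toward symmetry the quote describes.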
"Logging skewed variables also helps to reveal the patterns
in the data. […] the rescaling of the variables by taking logarithms reduces
the nonlinearity in the relationship and removes much of the clutter resulting
from the skewed distributions on both variables; in short, the transformation
helps clarify the relationship between the two variables. It also […] leads to
a theoretically meaningful regression coefficient." (Edward R Tufte, "Data Analysis for Politics and Policy", 1974)
"Our inability to measure important factors does not mean
either that we should sweep those factors under the rug or that we should give
them all the weight in a decision. Some important factors in some problems can
be assessed quantitatively. And even though thoughtful and imaginative efforts
have sometimes turned the 'unmeasurable' into a useful number, some
important factors are simply not measurable. As always, every bit of the
investigator's ingenuity and good judgment must be brought into play. And,
whatever unknowns may remain, the analysis of quantitative data nonetheless can
help us learn something about the world - even if it is not the whole story." (Edward R Tufte, "Data Analysis for Politics and Policy", 1974)
"Quantitative techniques will be more likely to illuminate if the data analyst is guided in methodological choices by a substantive understanding of the problem he or she is trying to learn about. Good procedures in data analysis involve techniques that help to (a) answer the substantive questions at hand, (b) squeeze all the relevant information out of the data, and (c) learn something new about the world." (Edward R Tufte, "Data Analysis for Politics and Policy", 1974)
"Random data contain no substantive effects; thus if the
analysis of the random data results in some sort of effect, then we know that
the analysis is producing that spurious effect, and we must be on the lookout
for such artifacts when the genuine data are analyzed." (Edward R Tufte, "Data Analysis for Politics and Policy", 1974)
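One way to apply this check is to run a selection procedure on pure noise. The sketch below is an illustrative assumption, not an example from the book: it picks the best of many random candidate predictors by correlation with a random outcome, and shows that the "winner" can look impressive even though no substantive effect exists.

```python
import random
import statistics

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def best_of_many(n_cases=20, n_candidates=50, seed=1):
    """Largest |r| between a random outcome and any of several random predictors."""
    rng = random.Random(seed)
    outcome = [rng.gauss(0, 1) for _ in range(n_cases)]
    best = 0.0
    for _ in range(n_candidates):
        candidate = [rng.gauss(0, 1) for _ in range(n_cases)]
        best = max(best, abs(pearson_r(candidate, outcome)))
    return best

print(f"best |r| found in pure noise: {best_of_many():.2f}")
```

Any correlation reported here is produced by the search itself, so a similar-sized correlation found the same way in genuine data deserves suspicion.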
"Sometimes clusters of variables tend to vary together in the
normal course of events, thereby rendering it difficult to discover the magnitude
of the independent effects of the different variables in the cluster. And yet
it may be most desirable, from a practical as well as scientific point of view,
to disentangle correlated describing variables in order to discover more
effective policies to improve conditions. Many economic indicators tend to move
together in response to underlying economic and political events." (Edward R Tufte, "Data Analysis for Politics and Policy", 1974)
"The problem of multicollinearity involves a lack of data, a
lack of information. […] Recognition of multicollinearity as a lack of
information has two important consequences: (1) In order to alleviate the problem,
it is necessary to collect more data - especially on the rarer combinations of
the describing variables. (2) No statistical technique can go very far to
remedy the problem because the fault lies basically with the data rather than
the method of analysis. Multicollinearity weakens inferences based on any
statistical method - regression, path analysis, causal modeling, or
cross-tabulations (where the difficulty shows up as a lack of deviant cases and
as near-empty cells)." (Edward R Tufte, "Data Analysis for Politics and Policy", 1974)
"Statistical techniques do not solve any of the common-sense difficulties about making causal inferences. Such techniques may help organize or arrange the data so that the numbers speak more clearly to the question of causality - but that is all statistical techniques can do. All the logical, theoretical, and empirical difficulties attendant to establishing a causal relationship persist no matter what type of statistical analysis is applied." (Edward R Tufte, "Data Analysis for Politics and Policy", 1974)
"The language of association and prediction is probably most often used because the evidence seems insufficient to justify a direct causal statement. A better practice is to state the causal hypothesis and then to present the evidence along with an assessment with respect to the causal hypothesis - instead of letting the quality of the data determine the language of the explanation." (Edward R Tufte, "Data Analysis for Politics and Policy", 1974)
"The logarithmic transformation serves several purposes: (1) The resulting regression coefficients sometimes have a more useful theoretical interpretation compared to a regression based on unlogged variables. (2) Badly skewed distributions - in which many of the observations are clustered together combined with a few outlying values on the scale of measurement - are transformed by taking the logarithm of the measurements so that the clustered values are spread out and the large values pulled in more toward the middle of the distribution. (3) Some of the assumptions underlying the regression model and the associated significance tests are better met when the logarithm of the measured variables is taken." (Edward R Tufte, "Data Analysis for Politics and Policy", 1974)
"The matching procedure often helps inform the reader what is going on in the data […] Matching has some defects, chiefly that it is difficult to do a very good job of matching in complex situations without a large number of cases. […] One limitation of matching, then, is that quite often the match is not very accurate. A second limitation is that if we want to control for more than one variable using matching procedures, the tables begin to have combinations of categories without any cases at all in them, and they become somewhat more difficult for the reader to understand." (Edward R Tufte, "Data Analysis for Politics and Policy", 1974)
"The use of statistical methods to analyze data does not make
a study any more 'scientific', 'rigorous', or 'objective'. The purpose of quantitative analysis is not to sanctify a set
of findings. Unfortunately, some studies, in the words of one critic, 'use
statistics as a drunk uses a street lamp, for support rather than
illumination'. Quantitative techniques will be more likely to illuminate
if the data analyst is guided in methodological choices by a substantive understanding
of the problem he or she is trying to learn about. Good procedures in data
analysis involve techniques that help to (a) answer the substantive questions
at hand, (b) squeeze all the relevant information out of the data, and (c)
learn something new about the world." (Edward R Tufte, "Data Analysis for Politics and Policy", 1974)
"Typically, data analysis is messy, and little details clutter it. Not only confounding factors, but also deviant cases, minor problems in measurement, and ambiguous results lead to frustration and discouragement, so that more data are collected than analyzed. Neglecting or hiding the messy details of the data reduces the researcher's chances of discovering something new." (Edward R Tufte, "Data Analysis for Politics and Policy", 1974)
"An especially effective device for enhancing the explanatory
power of time-series displays is to add spatial dimensions to the design of the
graphic, so that the data are moving over space (in two or three dimensions) as
well as over time. […] Occasionally graphics are belligerently multivariate,
advertising the technique rather than the data." (Edward R Tufte, "The Visual Display of Quantitative Information", 1983)
"Clear, detailed, and thorough labeling should be used to defeat graphical distortion and ambiguity. Write out explanations of the data on the graphic itself. Label important events in the data." (Edward R Tufte, "The Visual Display of Quantitative Information", 1983)
"Each part of a graphic generates visual expectations about
its other parts and, in the economy of graphical perception, these expectations
often determine what the eye sees. Deception results from the incorrect
extrapolation of visual expectations generated at one place on the graphic to
other places." (Edward R Tufte, "The Visual Display of Quantitative Information", 1983)
"For many people the first word that comes to mind when they
think about statistical charts is 'lie'. No doubt some graphics do
distort the underlying data, making it hard for the viewer to learn the truth.
But data graphics are no different from words in this regard, for any means of
communication can be used to deceive. There is no reason to believe that
graphics are especially vulnerable to exploitation by liars; in fact, most of
us have pretty good graphical lie detectors that help us see right through
frauds." (Edward R Tufte, "The Visual Display of Quantitative Information", 1983)
"Graphical excellence is the well-designed presentation of
interesting data - a matter of substance, of statistics, and of design. Graphical
excellence consists of complex ideas communicated with clarity, precision, and
efficiency. Graphical excellence is that which gives to the viewer the greatest number of ideas in the shortest time with the least ink in the smallest space. Graphical excellence is nearly always multivariate. And
graphical excellence requires telling the truth about the data." (Edward R Tufte, "The Visual Display of Quantitative Information", 1983)
"Graphical competence demands three quite different skills:
the substantive, statistical, and artistic. Yet now most graphical work, particularly
at news publications, is under the direction of but a single expertise - the
artistic. Allowing artist-illustrators to control the design and content of
statistical graphics is almost like allowing typographers to control the
content, style, and editing of prose. Substantive and quantitative expertise
must also participate in the design of data graphics, at least if statistical
integrity and graphical sophistication are to be achieved." (Edward R Tufte, "The Visual Display of Quantitative Information", 1983)
"In time-series displays of money, deflated and standardized units of monetary measurement are nearly always better than nominal units." (Edward R Tufte, "The Visual Display of Quantitative Information", 1983)
"Inept graphics also flourish because many graphic artists believe that statistics are boring and tedious. It then follows that decorated graphics must pep up, animate, and all too often exaggerate what evidence there is in the data. […] If the statistics are boring, then you've got the wrong numbers." (Edward R Tufte, "The Visual Display of Quantitative Information", 1983)
"Modern data graphics can do much more than simply substitute for small statistical tables. At their best, graphics are instruments for reasoning about quantitative information. Often the most effective way to describe, explore, and summarize a set of numbers - even a very large set - is to look at pictures of those numbers. Furthermore, of all methods for analyzing and communicating statistical information, well-designed data graphics are usually the simplest and at the same time the most powerful." (Edward R Tufte, "The Visual Display of Quantitative Information", 1983)
"Nearly all those who produce graphics for mass publication
are trained exclusively in the fine arts and have had little experience with
the analysis of data. Such experiences are essential for achieving precision
and grace in the presence of statistics. [...] Those who get ahead are those who
beautified data, never mind statistical integrity." (Edward R Tufte, "The Visual Display of Quantitative Information", 1983)
"Of course, false graphics are still with us. Deception must always be confronted and demolished, even if lie detection is no longer at the forefront of research. Graphical excellence begins with telling the truth about the data." (Edward R Tufte, "The Visual Display of Quantitative Information", 1983)
"Of course, statistical graphics, just like statistical calculations, are only as good as what goes into them. An ill-specified or preposterous model or a puny data set cannot be rescued by a graphic (or by calculation), no matter how clever or fancy. A silly theory means a silly graphic." (Edward R Tufte, "The Visual Display of Quantitative Information", 1983)
"Relational graphics are essential to competent statistical
analysis since they confront statements about cause and effect with evidence,
showing how one variable affects another." (Edward R Tufte, "The Visual Display of Quantitative Information", 1983)
"The conditions under which many data graphics are produced -
the lack of substantive and quantitative skills of the illustrators, dislike of
quantitative evidence, and contempt for the intelligence of the
audience - guarantee graphic mediocrity. These conditions engender graphics that
(1) lie; (2) employ only the simplest designs, often unstandardized time-series
based on a small handful of data points; and (3) miss the real news actually in
the data." (Edward R Tufte, "The Visual Display of Quantitative Information", 1983)
"The interior decoration of graphics generates a lot of ink
that does not tell the viewer anything new. The purpose of decoration varies - to
make the graphic appear more scientific and precise, to enliven the display, to
give the designer an opportunity to exercise artistic skills. Regardless of its
cause, it is all non-data-ink or redundant data-ink, and it is often chartjunk." (Edward R Tufte, "The Visual Display of Quantitative Information", 1983)
"The number of information-carrying (variable) dimensions depicted should not exceed the number of dimensions in the data." (Edward R Tufte, "The Visual Display of Quantitative Information", 1983)
"[…] the only worse design than a pie chart is several of them, for then the viewer is asked to compare quantities located in spatial disarray both within and between pies. […] Given their low data-density and failure to order numbers along a visual dimension, pie charts should never be used." (Edward R Tufte, "The Visual Display of Quantitative Information", 1983)
"The problem with time-series is that the simple passage of
time is not a good explanatory variable: descriptive chronology is not causal
explanation. There are occasional exceptions, especially when there is a clear
mechanism that drives the Y-variable." (Edward R Tufte, "The Visual Display of Quantitative Information", 1983)
"The representation of numbers, as physically measured on the
surface of the graphic itself, should be directly proportional to the numerical
quantities represented." (Edward R Tufte, "The Visual Display of Quantitative Information", 1983)
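In the same book Tufte quantifies violations of this proportionality principle with what he calls the Lie Factor: the size of the effect shown in the graphic divided by the size of the effect in the data. The sketch below computes it for a hypothetical chart; the function name and the numbers are assumptions made for illustration.

```python
def lie_factor(shown_start, shown_end, data_start, data_end):
    """Relative change drawn on the graphic divided by the relative change in the data."""
    effect_in_graphic = (shown_end - shown_start) / shown_start
    effect_in_data = (data_end - data_start) / data_start
    return effect_in_graphic / effect_in_data

# Hypothetical chart: the data go from 10 to 15 (+50%),
# but the bars are drawn 2.0 cm and 6.0 cm tall (+200%).
print(lie_factor(2.0, 6.0, 10.0, 15.0))  # (4/2) / (5/10) = 4.0
```

A value of 1 means the graphic is proportional to the data; values well above 1 indicate that the drawing exaggerates the change.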
"The theory of the visual display of quantitative information consists of principles that generate design options and that guide choices among options. The principles should not be applied rigidly or in a peevish spirit; they are not logically or mathematically certain; and it is better to violate any principle than to place graceless or inelegant marks on paper. Most principles of design should be greeted with some skepticism, for word authority can dominate our vision, and we may come to see only through the lenses of word authority rather than with our own eyes." (Edward R Tufte, "The Visual Display of Quantitative Information", 1983)
"The time-series plot is the most frequently used form of
graphic design. With one dimension marching along to the regular rhythm of
seconds, minutes, hours, days, weeks, months, years, centuries, or millennia,
the natural ordering of the time scale gives this design a strength and
efficiency of interpretation found in no other graphic arrangement." (Edward R Tufte, "The Visual Display of Quantitative Information", 1983)
"Vigorous writing is concise. A sentence should contain no unnecessary words, a paragraph no unnecessary sentences, for the same reason that a drawing should have no unnecessary lines and a machine no unnecessary parts. This requires not that the writer make all his sentences short, or that he avoid all detail and treat his subjects only in outline, but that every word tell." (Edward R Tufte, "The Visual Display of Quantitative Information", 1983)
"At the heart of quantitative reasoning is a single question: Compared to what? Small multiple designs, multivariate and data bountiful, answer directly by visually enforcing comparisons of changes, of the differences among objects, of the scope of alternatives. For a wide range of problems in data presentation, small multiples are the best design solution." (Edward R Tufte, "Envisioning Information", 1990)
"Confusion and clutter are failures of design, not attributes of information.
And so the point is to find design strategies that reveal detail and complexity - rather than to fault the data for an excess of complication. Or, worse, to
fault viewers for a lack of understanding. Among the most powerful devices for
reducing noise and enriching the content of displays is the technique of
layering and separation, visually stratifying various aspects of the data." (Edward R Tufte, "Envisioning Information", 1990)
"Consider this unsavory exhibit at right – chockablock with
cliché and stereotype, coarse humor, and a content-empty third dimension. [...] Credibility vanishes in clouds of chartjunk; who would trust a chart that looks
like a video game?" (Edward R Tufte, "Envisioning Information", 1990) [on diamond charts]
"Gray grids almost always work well and, with a delicate line, may promote
more accurate data reading and reconstruction than a heavy grid. Dark grid
lines are chartjunk. When a graphic serves as a look-up table (rare indeed),
then a grid may help with reading and interpolation. But even then the grid
should be muted relative to the data." (Edward R Tufte, "Envisioning Information", 1990)
"Information consists of differences that make a difference." (Edward R Tufte, "Envisioning Information", 1990)
"Lurking behind chartjunk is contempt both for information and for the audience. Chartjunk promoters imagine that numbers and details are boring, dull, and tedious, requiring ornament to enliven. Cosmetic decoration, which frequently distorts the data, will never salvage an underlying lack of content. If the numbers are boring, then you've got the wrong numbers." (Edward R Tufte, "Envisioning Information", 1990)
"The ducks of information design are false escapes from flatland, adding pretend dimensions to impoverished data sets, merely fooling around with information." (Edward R Tufte, "Envisioning Information", 1990)
"Visual displays rich with data are not only an appropriate and proper complement to human capabilities, but also such designs are frequently optimal. If the visual task is contrast, comparison, and choice - as so often it is - then the more relevant information within eyespan, the better. Vacant, low-density displays, the dreaded posterization of data spread over pages and pages, require viewers to rely on visual memory - a weak skill - to make a contrast, a comparison, a choice." (Edward R Tufte, "Envisioning Information", 1990)
"We envision information in order to reason about, communicate, document, and preserve that knowledge - activities nearly always carried out on two-dimensional paper and computer screen. Escaping this flatland and enriching the density of data displays are the essential tasks of information design." (Edward R Tufte, "Envisioning Information", 1990)
"What about confusing clutter? Information overload? Doesn't data have to be ‘boiled down’ and ‘simplified’? These common questions miss the point, for the quantity of detail is an issue completely separate from the difficulty of reading. Clutter and confusion are failures of design, not attributes of information. Often the less complex and less subtle the line, the more ambiguous and less interesting is the reading. Stripping the detail out of data is a style based on personal preference and fashion, considerations utterly indifferent to substantive content." (Edward R Tufte, "Envisioning Information", 1990)
"Good information design is clear thinking made visible, while bad design is stupidity in action." (Edward R Tufte, "Visual Explanations", 1997)
"Audience boredom is usually a content failure, not a decoration failure." (Edward R Tufte, "The cognitive style of PowerPoint", 2003)
"If your words or images are not on point, making them dance in color won't make them relevant." (Edward R Tufte, "The cognitive style of PowerPoint", 2003)
"A sparkline is a small, intense, simple, word-sized graphic with typographic resolution. Sparklines mean that graphics are no longer cartoonish special occasions with captions and boxes, but rather sparkline graphics can be everywhere a word or number can be: embedded in a sentence, table, headline, map, spreadsheet, graphic." (Edward R Tufte, "Beautiful Evidence", 2006)
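A word-sized graphic of this kind can be roughly approximated in plain text with Unicode block characters. The sketch below is only an approximation under that assumption: eight block heights are far coarser than the typographic resolution the quote has in mind, and the function name is illustrative.

```python
BLOCKS = "▁▂▃▄▅▆▇█"  # eight heights, lowest to highest

def sparkline(values):
    """Map each value to one block character, scaled between the series min and max."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1  # flat series: avoid division by zero
    top = len(BLOCKS) - 1
    return "".join(BLOCKS[round((v - lo) / span * top)] for v in values)

print("Readings this week", sparkline([1, 2, 3, 4, 5, 6, 7, 8]), "and rising.")
```

Because the result is an ordinary string, it can sit inside a sentence, a table cell, or a headline, which is the placement the quote emphasizes.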
"Areas surrounding data-lines may generate unintentional optical clutter. Strong frames produce melodramatic but content-diminishing visual effects. [...] A good way to assess a display for unintentional optical clutter is to ask 'Do the prominent visual effects convey relevant content?'" (Edward R Tufte, "Beautiful Evidence", 2006)
"By segregating evidence by mode (word, number, image, graph), the current-day computer approach contradicts the spirit of sparklines, a spirit that makes no distinction among words, numbers, graphics, images. It is all evidence, after all. A good system for evidence display should be centered on evidence, not on a collection of application programs each devoted to a single mode of information." (Edward R Tufte, "Beautiful Evidence", 2006)
"By showing recent change in relation to many past changes, sparklines provide a context for nuanced analysis - and, one hopes, better decisions. [...] Sparklines efficiently display and narrate binary data (presence/absence, occurrence/non-occurrence, win/loss). [...] Sparklines can simultaneously accommodate several variables. [...] Sparklines can narrate on-going results detail for any process producing sequential binary outcomes." (Edward R Tufte, "Beautiful Evidence", 2006)
"Closely spaced lines produce moiré vibration, usually at its worst when data-lines (the figure) and spaces (the ground) between data-lines are approximately equal in size, and also when figure and ground contrast strongly in color value." (Edward R Tufte, "Beautiful Evidence", 2006)
"Conflicting with the idea of integrating evidence regardless of its mode, these guidelines provoke several issues: First, labels are data, even intriguing data. [...] Second, when labels abandon the data points, then a code is often needed to relink names to numbers. Such codes, keys, and legends are impediments to learning, causing the reader's brow to furrow. Third, segregating nouns from data-dots breaks up evidence on the basis of mode (verbal vs. nonverbal), a distinction lacking substantive relevance. Such separation is uncartographic; contradicting the methods of map design often causes trouble for any type of graphical display. Fourth, design strategies that reduce data-resolution take evidence displays in the wrong direction. Fifth, what clutter? Even this supposedly cluttered graph clearly shows the main ideas: brain and body mass are roughly linear in logarithms, and as both variables increase, this linearity becomes less tight." (Edward R Tufte, "Beautiful Evidence", 2006) [argumentation against Cleveland's recommendation of not using words on data plots]
"Documentation allows more effective watching, and we have the Fifth Principle for the analysis and presentation of data: 'Thoroughly describe the evidence. Provide a detailed title, indicate the authors and sponsors, document the data sources, show complete measurement scales, point out relevant issues.'" (Edward R Tufte, "Beautiful Evidence", 2006)
"Explanatory, journalistic, and scientific images should nearly always be mapped, contextualized, and placed on the universal grid. Mapped pictures combine representational images with scales, diagrams, overlays, numbers, words, images." (Edward R Tufte, "Beautiful Evidence", 2006)
"Evidence is evidence, whether words, numbers, images, diagrams - still or moving. It is all information after all. For readers and viewers, the intellectual task remains constant regardless of the particular mode of evidence: to understand and to reason about the materials at hand, and to appraise their quality, relevance, and integrity." (Edward R Tufte, "Beautiful Evidence", 2006)
"Excellent graphics exemplify the deep fundamental principles of analytical design in action. If this were not the case, then something might well be wrong with the principles." (Edward R Tufte, "Beautiful Evidence", 2006)
"Good design, however, can dispose of clutter and show all the data points and their names. [...] Clutter calls for a design solution, not a content reduction." (Edward R Tufte, "Beautiful Evidence", 2006)
"In general, statistical graphics should be moderately greater in length than in height. And, as William Cleveland discovered, for judging slopes and velocities up and down the hills in time-series, best is an aspect ratio that yields hill-slopes averaging 45°, over every cycle in the time-series. Variations in slopes are best detected when the slopes are around 45°, uphill or downhill." (Edward R Tufte, "Beautiful Evidence", 2006)
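The 45° guideline can be turned into a rough aspect-ratio calculation. The sketch below uses a simple median-absolute-slope heuristic, which is an assumption chosen for brevity and not Cleveland's exact procedure: it finds the height-to-width ratio at which the median line segment of a series would be drawn at 45°. The function name and the sample series are hypothetical.

```python
import statistics

def bank_to_45(xs, ys):
    """Height/width ratio at which the median absolute segment slope is drawn at 45 degrees."""
    slopes = [abs((y1 - y0) / (x1 - x0))
              for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:])
              if x1 != x0 and y1 != y0]  # skip vertical and flat segments
    x_range = max(xs) - min(xs)
    y_range = max(ys) - min(ys)
    # On a plot of width w and height h, a data slope s is drawn with slope
    # s * (h / y_range) / (w / x_range). Setting the median drawn slope to 1
    # and solving for h / w gives the expression below.
    return y_range / (x_range * statistics.median(slopes))

xs = list(range(9))
ys = [0, 2, 0, 2, 0, 2, 0, 2, 0]  # hypothetical zig-zag series with four cycles
print(bank_to_45(xs, ys))  # 0.125, i.e. a plot eight times wider than tall
```

For this series the suggested plot is much wider than it is tall, in line with the quote's remark that statistical graphics should generally be greater in length than in height.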
"Making a presentation is a moral act as well as an intellectual activity. The use of corrupt manipulations and blatant rhetorical ploys in a report or presentation - outright lying, flagwaving, personal attacks, setting up phony alternatives, misdirection, jargon-mongering, evading key issues, feigning disinterested objectivity, willful misunderstanding of other points of view - suggests that the presenter lacks both credibility and evidence. To maintain standards of quality, relevance, and integrity for evidence, consumers of presentations should insist that presenters be held intellectually and ethically responsible for what they show and tell. Thus consuming a presentation is also an intellectual and a moral activity." (Edward R Tufte, "Beautiful Evidence", 2006)
"Making an evidence presentation is a moral act as well as an intellectual activity. To maintain standards of quality, relevance, and integrity for evidence, consumers of presentations should insist that presenters be held intellectually and ethically responsible for what they show and tell. Thus consuming a presentation is also an intellectual and a moral activity." (Edward R Tufte, "Beautiful Evidence", 2006)
"Most techniques for displaying evidence are inherently multimodal, bringing verbal, visual, and quantitative elements together. Statistical graphics and maps are visual-numerical fields labeled with words and framed by numbers. Even an austere image may evoke other images, new or remembered narrative, and perhaps a sense of scale and quantity. Words can simultaneously convey semantic and visual content, as the nouns on a map both name places and locate them in the two-space of latitude and longitude." (Edward R Tufte, "Beautiful Evidence", 2006)
"Principles of design should attend to the fundamental intellectual tasks in the analysis of evidence; thus we have the Second Principle for the analysis and presentation of data: Show causality, mechanism, explanation, systematic structure." (Edward R Tufte, "Beautiful Evidence", 2006)
"Sparklines are wordlike graphics, with an intensity of visual distinctions comparable to words and letters. [...] Words visually present both an overall shape and letter-by-letter detail; since most readers have seen the word previously, the visual task is usually one of quick recognition. Sparklines present an overall shape and aggregate pattern along with plenty of local detail. Sparklines are read the same way as words, although much more carefully and slowly." (Edward R Tufte, "Beautiful Evidence", 2006)
"Sparklines vastly increase the amount of data within our eyespan and intensify statistical graphics up to the everyday routine capabilities of the human eye-brain system for reasoning about visual evidence, seeing distinctions, and making comparisons. [...] Providing a straightforward and contextual look at intense evidence, sparkline graphics give us some chance to be approximately right rather than exactly wrong." (Edward R Tufte, "Beautiful Evidence", 2006)
"Sparklines work at intense resolutions, at the level of good typography and cartography. [...] Just as sparklines are like words, so then distributions of sparklines on a page are like sentences and paragraphs. The graphical idea here is make it wordlike and typographic - an idea that leads to reasonable answers for most questions about sparkline arrangements." (Edward R Tufte, "Beautiful Evidence", 2006)
"[...] the First Principle for the analysis and presentation of data: 'Show comparisons, contrasts, differences'. The fundamental analytical act in statistical reasoning is to answer the question 'Compared with what?'. Whether we are evaluating changes over space or time, searching big data bases, adjusting and controlling for variables, designing experiments, specifying multiple regressions, or doing just about any kind of evidence-based reasoning, the essential point is to make intelligent and appropriate comparisons. Thus visual displays, if they are to assist thinking, should show comparisons." (Edward R Tufte, "Beautiful Evidence", 2006)
"The only thing that is 2-dimensional about evidence is the physical flatland of paper and computer screen. Flatlandy technologies of display encourage flatlandy thinking. Reasoning about evidence should not be stuck in 2 dimensions, for the world we seek to understand is profoundly multivariate. Strategies of design should make multivariateness routine, nothing out of the ordinary. To think multivariate, show multivariate; the Third Principle for the analysis and presentation of data:
'Show multivariate data; that is, show more than 1 or 2 variables.'" (Edward R Tufte, "Beautiful Evidence", 2006)
"The principles of analytical design are universal - like mathematics, the laws of Nature, the deep structure of language - and are not tied to any particular language, culture, style, century, gender, or technology of information display." (Edward R Tufte, "Beautiful Evidence", 2006)
"The purpose of an evidence presentation is to assist thinking. Thus presentations should be constructed so as to assist with the fundamental intellectual tasks in reasoning about evidence: describing the data, making multivariate comparisons, understanding causality, integrating a diversity of evidence, and documenting the analysis. Thus the Grand Principle of analytical design: 'The principles of analytical design are derived from the principles of analytical thinking.' Cognitive tasks are turned into principles of evidence presentation and design." (Edward R Tufte, "Beautiful Evidence", 2006)
"The Sixth Principle for the analysis and display of data: 'Analytical presentations ultimately stand or fall depending on the quality, relevance, and integrity of their content.' This suggests that the most effective way to improve a presentation is to get better content. It also suggests that design devices and gimmicks cannot salvage failed content." (Edward R Tufte, "Beautiful Evidence", 2006)
"These little data lines, because of their active quality over time, are named sparklines - small, high-resolution graphics usually embedded in a full context of words, numbers, images. Sparklines are datawords: data-intense, design-simple, word-sized graphics." (Edward R Tufte, "Beautiful Evidence", 2006)
"Words, numbers, pictures, diagrams, graphics, charts, tables belong together. Excellent maps, which are the heart and soul of good practices in analytical graphics, routinely integrate words, numbers, line-art, grids, measurement scales. Rarely is a distinction among the different modes of evidence useful for making sound inferences. It is all information after all. Thus the Fourth Principle for the analysis and presentation of data: 'Completely integrate words, numbers, images, diagrams.'" (Edward R Tufte, "Beautiful Evidence", 2006)