31 December 2006

✏️Danyel Fisher - Collected Quotes

"A dimension is an attribute that groups, separates, or filters data items. A measure is an attribute that addresses the question of interest and that the analyst expects to vary across the dimensions. Both the measures and the dimensions might be attributes directly found in the dataset or derived attributes calculated from the existing data." (Danyel Fisher & Miriah Meyer, "Making Data Visual", 2018)

"A well-operationalized task, relative to the underlying data, fulfills the following criteria: (1) Can be computed based on the data; (2) Makes specific reference to the attributes of the data; (3) Has a traceable path from the high-level abstract questions to a set of concrete, actionable tasks." (Danyel Fisher & Miriah Meyer, "Making Data Visual", 2018)

"An actionable task means that it is possible to act on its result. That action might be to present a useful result to a decision maker or to proceed to a next step in a different result. An answer is actionable when it no longer needs further work to make sense of it." (Danyel Fisher & Miriah Meyer, "Making Data Visual", 2018)

"Every dataset has subtleties; it can be far too easy to slip down rabbit holes of complications. Being systematic about the operationalization can help focus our conversations with experts, only introducing complications when needed." (Danyel Fisher & Miriah Meyer, "Making Data Visual", 2018)

"Color is difficult to use effectively. A small number of well-chosen colors can be highly distinguishable, particularly for categorical data, but it can be difficult for users to distinguish between more than a handful of colors in a visualization. Nonetheless, color is an invaluable tool in the visualization toolbox because it is a channel that can carry a great deal of meaning and be overlaid on other dimensions. […] There are a variety of perceptual effects, such as simultaneous contrast and color deficiencies, that make precise numerical judgments about a color scale difficult, if not impossible." (Danyel Fisher & Miriah Meyer, "Making Data Visual", 2018)

"Creating effective visualizations is hard. Not because a dataset requires an exotic and bespoke visual representation - for many problems, standard statistical charts will suffice. And not because creating a visualization requires coding expertise in an unfamiliar programming language [...]. Rather, creating effective visualizations is difficult because the problems that are best addressed by visualization are often complex and ill-formed. The task of figuring out what attributes of a dataset are important is often conflated with figuring out what type of visualization to use. Picking a chart type to represent specific attributes in a dataset is comparatively easy. Deciding on which data attributes will help answer a question, however, is a complex, poorly defined, and user-driven process that can require several rounds of visualization and exploration to resolve." (Danyel Fisher & Miriah Meyer, "Making Data Visual", 2018)

"Dashboards are a type of multiform visualization used to summarize and monitor data. These are most useful when proxies have been well validated and the task is well understood. This design pattern brings a number of carefully selected attributes together for fast, and often continuous, monitoring - dashboards are often linked to updating data streams. While many allow interactivity for further investigation, they typically do not depend on it. Dashboards are often used for presenting and monitoring data and are typically designed for at-a-glance analysis rather than deep exploration and analysis." (Danyel Fisher & Miriah Meyer, "Making Data Visual", 2018)

"Designing effective visualizations presents a paradox. On the one hand, visualizations are intended to help users learn about parts of their data that they don’t know about. On the other hand, the more we know about the users’ needs and the context of their data, the better we can design a visualization to serve them." (Danyel Fisher & Miriah Meyer, "Making Data Visual", 2018)

"Dimensionality reduction is a way of reducing a large number of different measures into a smaller set of metrics. The intent is that the reduced metrics are a simpler description of the complex space that retains most of the meaning. […] Clustering techniques are similarly useful for reducing a large number of items into a smaller set of groups. A clustering technique finds groups of items that are logically near each other and gathers them together." (Danyel Fisher & Miriah Meyer, "Making Data Visual", 2018)

"Maps also have the disadvantage that they consume the most powerful encoding channels in the visualization toolbox - position and size - on an aspect that is held constant. This leaves less effective encoding channels like color for showing the dimension of interest." (Danyel Fisher & Miriah Meyer, "Making Data Visual", 2018)

"[…] no single visualization is ever quite able to show all of the important aspects of our data at once - there just are not enough visual encoding channels. […] designing effective visualizations to make sense of data is not an art - it is a systematic and repeatable process." (Danyel Fisher & Miriah Meyer, "Making Data Visual", 2018)

"[…] the data itself can lead to new questions too. In exploratory data analysis (EDA), for example, the data analyst discovers new questions based on the data. The process of looking at the data to address some of these questions generates incidental visualizations - odd patterns, outliers, or surprising correlations that are worth looking into further." (Danyel Fisher & Miriah Meyer, "Making Data Visual", 2018)

"The field of [data] visualization takes on that goal more broadly: rather than attempting to identify a single metric, the analyst instead tries to look more holistically across the data to get a usable, actionable answer. Arriving at that answer might involve exploring multiple attributes, and using a number of views that allow the ideas to come together. Thus, operationalization in the context of visualization is the process of identifying tasks to be performed over the dataset that are a reasonable approximation of the high-level question of interest." (Danyel Fisher & Miriah Meyer, "Making Data Visual", 2018)

"The general concept of refining questions into tasks appears across all of the sciences. In many fields, the process is called operationalization, and refers to the process of reducing a complex set of factors to a single metric. The field of visualization takes on that goal more broadly: rather than attempting to identify a single metric, the analyst instead tries to look more holistically across the data to get a usable, actionable answer. Arriving at that answer might involve exploring multiple attributes, and using a number of views that allow the ideas to come together. Thus, operationalization in the context of visualization is the process of identifying tasks to be performed over the dataset that are a reasonable approximation of the high-level question of interest." (Danyel Fisher & Miriah Meyer, "Making Data Visual", 2018)

"The goal of operationalization is to refine and clarify the question until the analyst can forge an explicit link between the data that they can find and the questions they would like to answer. […] To achieve this, the analyst searches for proxies. Proxies are partial and imperfect representations of the abstract thing that the analyst is really interested in. […] Selecting and interpreting proxies requires judgment and expertise to assess how well, and with what sorts of limitations, they represent the abstract concept." (Danyel Fisher & Miriah Meyer, "Making Data Visual", 2018)

"The operationalization process is an iterative one and the end point is not precisely defined. The answer to the question of how far to go is, simply, far enough. The process is done when the task is directly actionable, using the data at hand. The analyst knows how to describe the objects, measures, and groupings in terms of the data - where to find it, how to compute, and how to aggregate it. At this point, they know what the question will look like and they know what they can do to get the answer." (Danyel Fisher & Miriah Meyer, "Making Data Visual", 2018)

"The intention behind prototypes is to explore the visualization design space, as opposed to the data space. A typical project usually entails a series of prototypes; each is a tool to gather feedback from stakeholders and help explore different ways to most effectively support the higher-level questions that they have. The repeated feedback also helps validate the operationalization along the way." (Danyel Fisher & Miriah Meyer, "Making Data Visual", 2018)

"Rapid prototyping is a process of trying out many visualization ideas as quickly as possible and getting feedback from stakeholders on their efficacy. […] The design concept of 'failing fast' informs this: by exploring many different possible visual representations, it quickly becomes clear which tasks are supported by which techniques." (Danyel Fisher & Miriah Meyer, "Making Data Visual", 2018)

"Too many simultaneous encodings will be overwhelming to the reader; colors must be easily distinguishable, and of a small enough number that the reader can interpret them."  (Danyel Fisher & Miriah Meyer, "Making Data Visual", 2018)

"Visualizations provide a direct and tangible representation of data. They allow people to confirm hypotheses and gain insights. When incorporated into the data analysis process early and often, visualizations can even fundamentally alter the questions that someone is asking." (Danyel Fisher & Miriah Meyer, "Making Data Visual", 2018)

✏️Dina Gray - Collected Quotes

"Although performance measurement is often linked to tools such as scorecards, dashboards, performance targets, indicators and information systems, it would be naïve to consider the measurement of performance as just a technical issue. Indeed, measurement is often used as a way of attempting to bring clarity to complex and confusing situations." (Dina Gray et al, "Measurement Madness: Recognizing and avoiding the pitfalls of performance measurement", 2015)

"'Big Data" is certainly changing the way organizations operate, and our capacity to do planning, budgeting and forecasting, as well as the management of our processes and supply chains, has radically improved. However, greater availability of data is also being accompanied by two major challenges: firstly, many managers are now required to develop data-oriented management systems to make sense of the phenomenal amount of data their organizations and their main partners are producing. Secondly, whilst the volume of data that we now have access to is certainly seductive and potentially very useful, it can also be overwhelming." (Dina Gray et al, "Measurement Madness: Recognizing and avoiding the pitfalls of performance measurement", 2015)

"[...] introducing an excessive number of measures is only the start of the problem. The other is that measures tend to stick, unless questioned and revised. As the world changes, so does the environment in which an organization operates. Priorities change, new drivers of performance emerge, and different operating models are employed. It would therefore make sense that the performance measurement system is also revised to reflect these changes." (Dina Gray et al, "Measurement Madness: Recognizing and avoiding the pitfalls of performance measurement", 2015)

"Measurement is often associated with the objectivity and neatness of numbers, and performance measurement efforts are typically accompanied by hope, great expectations and promises of change; however, these are then often followed by disbelief, frustration and what appears to be sheer madness." (Dina Gray et al, "Measurement Madness: Recognizing and avoiding the pitfalls of performance measurement", 2015)

"Measurement is often seen as a tool that helps reduce the complexity of the world. Organizations, with their uncertainty and confusion, are full of people, patterns and trends; and measurement seems to offer a promise of bringing order, rationality and control into this chaos." (Dina Gray et al, "Measurement Madness: Recognizing and avoiding the pitfalls of performance measurement", 2015)

"One of the most puzzling things about performance measurement is that, regardless of the countless negative experiences, as well as a constant stream of similar failures reported in the media, organizations continue to apply the same methods and constantly fall into the same traps. This is because commonly held beliefs about the measurement and management of performance are rarely challenged." (Dina Gray et al, "Measurement Madness: Recognizing and avoiding the pitfalls of performance measurement", 2015)

"Performance measures by themselves are simply tools that may or may not be used by managers and staff. However, if your organization has an addiction to measurement, sooner or later people will start relying on measures excessively, and common sense will gradually begin to be replaced by the measures themselves leading the organization into the eye of the measurement madness hurricane." (Dina Gray et al, "Measurement Madness: Recognizing and avoiding the pitfalls of performance measurement", 2015)

"Regularly, and unfortunately more often than might be expected, organizations can become so fixated on the narrow task of measuring and reporting performance that measures lose their meaning, and no one relies on them for real decision-making. [...] More worryingly, sometimes performance measures are introduced without any intention of providing meaningful data for making decisions in the first place. In this case, such indicators are often treated with contempt." (Dina Gray et al, "Measurement Madness: Recognizing and avoiding the pitfalls of performance measurement", 2015)

"Since perfect measures of performance do not exist, organizations use proxies - indicators that approximate or represent performance in the absence of perfect measures. [...] Over time, proxies are perceived to rep￾resent true performance." (Dina Gray et al, "Measurement Madness: Recognizing and avoiding the pitfalls of performance measurement", 2015)

"When all you see and believe is numbers, it becomes increasingly difficult to decide when to react and intervene. [...] The most obvious course of action is to set aside the numbers and try to understand the underlying causes of these changes. However, the over-reliance on measurement instead drives many managers to design 'thresholds' or 'colour codes' for numbers, thus adding another layer of abstraction to measurement and keeping these managers firmly desensitized to the meaning of per￾formance information." (Dina Gray et al, "Measurement Madness: Recognizing and avoiding the pitfalls of performance measurement", 2015)

✏️Edward R Tufte - Collected Quotes

"A good rule of thumb for deciding how long the analysis of the data actually will take is (1) to add up all the time for everything you can think of - editing the data, checking for errors, calculating various statistics, thinking about the results, going back to the data to try out a new idea, and (2) then multiply the estimate obtained in this first step by five." (Edward R Tufte, "Data Analysis for Politics and Policy", 1974)

"Almost all efforts at data analysis seek, at some point, to generalize the results and extend the reach of the conclusions beyond a particular set of data. The inferential leap may be from past experiences to future ones, from a sample of a population to the whole population, or from a narrow range of a variable to a wider range. The real difficulty is in deciding when the extrapolation beyond the range of the variables is warranted and when it is merely naive. As usual, it is largely a matter of substantive judgment - or, as it is sometimes more delicately put, a matter of 'a priori nonstatistical considerations'." (Edward R Tufte, "Data Analysis for Politics and Policy", 1974)

"[…] fitting lines to relationships between variables is often a useful and powerful method of summarizing a set of data. Regression analysis fits naturally with the development of causal explanations, simply because the research worker must, at a minimum, know what he or she is seeking to explain." (Edward R Tufte, "Data Analysis for Politics and Policy", 1974)

"Fitting lines to relationships between variables is the major tool of data analysis. Fitted lines often effectively summarize the data and, by doing so, help communicate the analytic results to others. Estimating a fitted line is also the first step in squeezing further information from the data." (Edward R Tufte, "Data Analysis for Politics and Policy", 1974)

"If two or more describing variables in an analysis are highly intercorrelated, it will be difficult and perhaps impossible to assess accurately their independent impacts on the response variable. As the association between two or more describing variables grows stronger, it becomes more and more difficult to tell one variable from the other. This problem, called 'multicollinearity' in the statistical jargon, sometimes causes difficulties in the analysis of nonexperimental data. […] No statistical technique can go very far to remedy the problem because the fault lies basically with the data rather than the method of analysis. Multicollinearity weakens inferences based on any statistical method - regression, path analysis, causal modeling, or cross-tabulations (where the difficulty shows up as a lack of deviant cases and as near-empty cells)."  (Edward R Tufte, "Data Analysis for Politics and Policy", 1974)

"[…] it is not enough to say: 'There's error in the data and therefore the study must be terribly dubious'. A good critic and data analyst must do more: he or she must also show how the error in the measurement or the analysis affects the inferences made on the basis of that data and analysis." (Edward R Tufte, "Data Analysis for Politics and Policy", 1974)

"Logging size transforms the original skewed distribution into a more symmetrical one by pulling in the long right tail of the distribution toward the mean. The short left tail is, in addition, stretched. The shift toward symmetrical distribution produced by the log transform is not, of course, merely for convenience. Symmetrical distributions, especially those that resemble the normal distribution, fulfill statistical assumptions that form the basis of statistical significance testing in the regression model." (Edward R Tufte, "Data Analysis for Politics and Policy", 1974)

"Logging skewed variables also helps to reveal the patterns in the data. […] the rescaling of the variables by taking logarithms reduces the nonlinearity in the relationship and removes much of the clutter resulting from the skewed distributions on both variables; in short, the transformation helps clarify the relationship between the two variables. It also […] leads to a theoretically meaningful regression coefficient." (Edward R Tufte, "Data Analysis for Politics and Policy", 1974)

"Our inability to measure important factors does not mean either that we should sweep those factors under the rug or that we should give them all the weight in a decision. Some important factors in some problems can be assessed quantitatively. And even though thoughtful and imaginative efforts have sometimes turned the 'unmeasurable' into a useful number, some important factors are simply not measurable. As always, every bit of the investigator's ingenuity and good judgment must be brought into play. And, whatever un- knowns may remain, the analysis of quantitative data nonetheless can help us learn something about the world - even if it is not the whole story." (Edward R Tufte, "Data Analysis for Politics and Policy", 1974)

"Quantitative techniques will be more likely to illuminate if the data analyst is guided in methodological choices by a substantive understanding of the problem he or she is trying to learn about. Good procedures in data analysis involve techniques that help to (a) answer the substantive questions at hand, (b) squeeze all the relevant information out of the data, and (c) learn something new about the world." (Edward R Tufte, "Data Analysis for Politics and Policy", 1974)

"Random data contain no substantive effects; thus if the analysis of the random data results in some sort of effect, then we know that the analysis is producing that spurious effect, and we must be on the lookout for such artifacts when the genuine data are analyzed." (Edward R Tufte, "Data Analysis for Politics and Policy", 1974)

"Sometimes clusters of variables tend to vary together in the normal course of events, thereby rendering it difficult to discover the magnitude of the independent effects of the different variables in the cluster. And yet it may be most desirable, from a practical as well as scientific point of view, to disentangle correlated describing variables in order to discover more effective policies to improve conditions. Many economic indicators tend to move together in response to underlying economic and political events." (Edward R Tufte, "Data Analysis for Politics and Policy", 1974)

"The problem of multicollinearity involves a lack of data, a lack of information. […] Recognition of multicollinearity as a lack of information has two important consequences: (1) In order to alleviate the problem, it is necessary to collect more data - especially on the rarer combinations of the describing variables. (2) No statistical technique can go very far to remedy the problem because the fault lies basically with the data rather than the method of analysis. Multicollinearity weakens inferences based on any statistical method - regression, path analysis, causal modeling, or cross-tabulations (where the difficulty shows up as a lack of deviant cases and as near-empty cells)." (Edward R Tufte, "Data Analysis for Politics and Policy", 1974)

"Statistical techniques do not solve any of the common-sense difficulties about making causal inferences. Such techniques may help organize or arrange the data so that the numbers speak more clearly to the question of causality - but that is all statistical techniques can do. All the logical, theoretical, and empirical difficulties attendant to establishing a causal relationship persist no matter what type of statistical analysis is applied." (Edward R Tufte, "Data Analysis for Politics and Policy", 1974)

"The language of association and prediction is probably most often used because the evidence seems insufficient to justify a direct causal statement. A better practice is to state the causal hypothesis and then to present the evidence along with an assessment with respect to the causal hypothesis - instead of letting the quality of the data determine the language of the explanation." (Edward R Tufte, "Data Analysis for Politics and Policy", 1974)

"The logarithmic transformation serves several purposes: (1) The resulting regression coefficients sometimes have a more useful theoretical interpretation compared to a regression based on unlogged variables. (2) Badly skewed distributions - in which many of the observations are clustered together combined with a few outlying values on the scale of measurement - are transformed by taking the logarithm of the measurements so that the clustered values are spread out and the large values pulled in more toward the middle of the distribution. (3) Some of the assumptions underlying the regression model and the associated significance tests are better met when the logarithm of the measured variables is taken." (Edward R Tufte, "Data Analysis for Politics and Policy", 1974)

"The matching procedure often helps inform the reader what is going on in the data […] Matching has some defects, chiefly that it is difficult to do a very good job of matching in complex situations without a large number of cases. […] One limitation of matching, then, is that quite often the match is not very accurate. A second limitation is that if we want to control for more than one variable using matching procedures, the tables begin to have combinations of categories without any cases at all in them, and they become somewhat more difficult for the reader to understand." (Edward R Tufte, "Data Analysis for Politics and Policy", 1974)

"The use of statistical methods to analyze data does not make a study any more 'scientific', 'rigorous', or 'objective'. The purpose of quantitative analysis is not to sanctify a set of findings. Unfortunately, some studies, in the words of one critic, 'use statistics as a drunk uses a street lamp, for support rather than illumination'. Quantitative techniques will be more likely to illuminate if the data analyst is guided in methodological choices by a substantive understanding of the problem he or she is trying to learn about. Good procedures in data analysis involve techniques that help to (a) answer the substantive questions at hand, (b) squeeze all the relevant information out of the data, and (c) learn something new about the world." (Edward R Tufte, "Data Analysis for Politics and Policy", 1974)

"Typically, data analysis is messy, and little details clutter it. Not only confounding factors, but also deviant cases, minor problems in measurement, and ambiguous results lead to frustration and discouragement, so that more data are collected than analyzed. Neglecting or hiding the messy details of the data reduces the researcher's chances of discovering something new." (Edward R Tufte, "Data Analysis for Politics and Policy", 1974)

"An especially effective device for enhancing the explanatory power of time-series displays is to add spatial dimensions to the design of the graphic, so that the data are moving over space (in two or three dimensions) as well as over time. […] Occasionally graphics are belligerently multivariate, advertising the technique rather than the data." (Edward R Tufte, "The Visual Display of Quantitative Information", 1983)

"Clear, detailed, and thorough labeling should be used to defeat graphical distortion and ambiguity. Write out explanations of the data on the graphic itself. Label important events in the data." (Edward R Tufte, "The Visual Display of Quantitative Information", 1983)

"Each part of a graphic generates visual expectations about its other parts and, in the economy of graphical perception, these expectations often determine what the eye sees. Deception results from the incorrect extrapolation of visual expectations generated at one place on the graphic to other places." (Edward R Tufte, "The Visual Display of Quantitative Information", 1983)

"For many people the first word that comes to mind when they think about statistical charts is 'lie'. No doubt some graphics do distort the underlying data, making it hard for the viewer to learn the truth. But data graphics are no different from words in this regard, for any means of communication can be used to deceive. There is no reason to believe that graphics are especially vulnerable to exploitation by liars; in fact, most of us have pretty good graphical lie detectors that help us see right through frauds." (Edward R Tufte, "The Visual Display of Quantitative Information", 1983)

"Graphical excellence is the well-designed presentation of interesting data - a matter of substance, of statistics, and of design. Graphical excellence consists of complex ideas communicated with clarity, precision, and efficiency. Graphical excellence is that which gives to the viewer the greatest number of ideas in the shortest time with the least ink in the smallest space. Graphical excellence is nearly always multivariate. And graphical excellence requires telling the truth about the data." (Edward R Tufte, "The Visual Display of Quantitative Information", 1983)

"Graphical competence demands three quite different skills: the substantive, statistical, and artistic. Yet now most graphical work, particularly at news publications, is under the direction of but a single expertise - the artistic. Allowing artist-illustrators to control the design and content of statistical graphics is almost like allowing typographers to control the content, style, and editing of prose. Substantive and quantitative expertise must also participate in the design of data graphics, at least if statistical integrity and graphical sophistication are to be achieved." (Edward R Tufte, "The Visual Display of Quantitative Information", 1983)

"In time-series displays of money, deflated and standardized units of monetary measurement are nearly always better than nominal units." (Edward R Tufte, "The Visual Display of Quantitative Information", 1983)
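
Deflating a money series usually means dividing each nominal amount by a price index for the same period. The sketch below shows that arithmetic with made-up numbers; the `deflate` helper, the `nominal_spending` series, and the `price_index` values are hypothetical.

```python
def deflate(nominal, price_index, base=100.0):
    """Convert nominal amounts to base-period units using a price index (base = 100)."""
    return [value * base / index for value, index in zip(nominal, price_index)]

# Hypothetical series: spending rises 10% a year, and so do prices.
nominal_spending = [100.0, 110.0, 121.0]
price_index = [100.0, 110.0, 121.0]

real_spending = deflate(nominal_spending, price_index)
print(real_spending)
```

In this made-up case the nominal series appears to grow by 21% while the deflated series is flat, which is the kind of difference a time-series display in nominal units would hide.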

"Inept graphics also flourish because many graphic artists believe that statistics are boring and tedious. It then follows that decorated graphics must pep up, animate, and all too often exaggerate what evidence there is in the data. […] If the statistics are boring, then you've got the wrong numbers." (Edward R Tufte, "The Visual Display of Quantitative Information", 1983)

"Modern data graphics can do much more than simply substitute for small statistical tables. At their best, graphics are instruments for reasoning about quantitative information. Often the most effective way to describe, explore, and summarize a set of numbers even a very large set - is to look at pictures of those numbers. Furthermore, of all methods for analyzing and communicating statistical information, well-designed data graphics are usually the simplest and at the same time the most powerful." (Edward R Tufte, "The Visual Display of Quantitative Information", 1983)

"Nearly all those who produce graphics for mass publication are trained exclusively in the fine arts and have had little experience with the analysis of data. Such experiences are essential for achieving precision and grace in the presence of statistics. [...] Those who get ahead are those who beautified data, never mind statistical integrity." (Edward R Tufte, "The Visual Display of Quantitative Information", 1983)

"Of course, false graphics are still with us. Deception must always be confronted and demolished, even if lie detection is no longer at the forefront of research. Graphical excellence begins with telling the truth about the data." (Edward R Tufte, "The Visual Display of Quantitative Information", 1983)

"Of course, statistical graphics, just like statistical calculations, are only as good as what goes into them. An ill-specified or preposterous model or a puny data set cannot be rescued by a graphic (or by calculation), no matter how clever or fancy. A silly theory means a silly graphic." (Edward R Tufte, "The Visual Display of Quantitative Information", 1983)

"Relational graphics are essential to competent statistical analysis since they confront statements about cause and effect with evidence, showing how one variable affects another." (Edward R Tufte, "The Visual Display of Quantitative Information", 1983)

"The conditions under which many data graphics are produced - the lack of substantive and quantitative skills of the illustrators, dislike of quantitative evidence, and contempt for the intelligence of the audience-guarantee graphic mediocrity. These conditions engender graphics that (1) lie; (2) employ only the simplest designs, often unstandardized time-series based on a small handful of data points; and (3) miss the real news actually in the data." (Edward R Tufte, "The Visual Display of Quantitative Information", 1983)

"The interior decoration of graphics generates a lot of ink that does not tell the viewer anything new. The purpose of decoration varies - to make the graphic appear more scientific and precise, to enliven the display, to give the designer an opportunity to exercise artistic skills. Regardless of its cause, it is all non-data-ink or redundant data-ink, and it is often chartjunk."  (Edward R Tufte, "The Visual Display of Quantitative Information", 1983)

"The number of information-carrying (variable) dimensions depicted should not exceed the number of dimensions in the data." (Edward R Tufte, "The Visual Display of Quantitative Information", 1983)

"[…] the only worse design than a pie chart is several of them, for then the viewer is asked to compare quantities located in spatial disarray both within and between pies. […] Given their low data-density and failure to order numbers along a visual dimension, pie charts should never be used." (Edward R Tufte, "The Visual Display of Quantitative Information", 1983)

"The problem with time-series is that the simple passage of time is not a good explanatory variable: descriptive chronology is not causal explanation. There are occasional exceptions, especially when there is a clear mechanism that drives the Y-variable." (Edward R Tufte, "The Visual Display of Quantitative Information", 1983)

"The representation of numbers, as physically measured on the surface of the graphic itself, should be directly proportional to the numerical quantities represented." (Edward R Tufte, "The Visual Display of Quantitative Information", 1983)

"The theory of the visual display of quantitative information consists of principles that generate design options and that guide choices among options. The principles should not be applied rigidly or in a peevish spirit; they are not logically or mathematically certain; and it is better to violate any principle than to place graceless or inelegant marks on paper. Most principles of design should be greeted with some skepticism, for word authority can dominate our vision, and we may come to see only though the lenses of word authority rather than with our own eyes." (Edward R Tufte, "The Visual Display of Quantitative Information", 1983)

"The time-series plot is the most frequently used form of graphic design. With one dimension marching along to the regular rhythm of seconds, minutes, hours, days, weeks, months, years, centuries, or millennia, the natural ordering of the time scale gives this design a strength and efficiency of interpretation found in no other graphic arrangement." (Edward R Tufte, "The Visual Display of Quantitative Information", 1983)

"Vigorous writing is concise. A sentence should contain no unnecessary words, a paragraph no unnecessary sentences, for the same reason that a drawing should have no unnecessary lines and a machine no unnecessary parts. This requires not that the writer make all his sentences short, or that heavoid all detail and treat his subjects only in outline, but that every word tell." (Edward R Tufte, "The Visual Display of Quantitative Information", 1983)

"A range-frame does not require any viewing or decoding instructions; it is not a graphical puzzle and most viewers can easily tell what is going on. Since it is more informative about the data in a clear and precise manner, the range-frame should replace the non-data bearing frame inmany graphical applications." (Edward R Tufte, "Data-Ink Maximization and Graphical Design", Oikos Vol. 58 (2), 1990)

"At the heart of quantitative reasoning is a single question: Compared to what? Small multiple designs, multivariate and data bountiful, answer directly by visually enforcing comparisons of changes, of the differences among objects, of the scope of alternatives. For a wide range of problems in data presentation, small multiples are the best design solution." (Edward R Tufte, "Envisioning Information", 1990) 

"Confusion and clutter are failures of design, not attributes of information. And so the point is to find design strategies that reveal detail and complexity - rather than to fault the data for an excess of complication. Or, worse, to fault viewers for a lack of understanding. Among the most powerful devices for reducing noise and enriching the content of displays is the technique of layering and separation, visually stratifying various aspects of the data." (Edward R Tufte, "Envisioning Information", 1990)

"Consider this unsavory exhibit at right – chockablock with cliché and stereotype, coarse humor, and a content-empty third dimension. [...] Credibility vanishes in clouds of chartjunk; who would trust a chart that looks like a video game?" (Edward R Tufte, "Envisioning Information", 1990) [on diamond charts]

"Graphics are almost always going to improve as they go through editing, revision, and testing against different design options. The principles of maximizing data-ink and erasing generate graphical alternatives and also suggest a direction in which revisions should move." (Edward R Tufte, "Data-Ink Maximization and Graphical Design", Oikos Vol. 58 (2), 1990)

"Gray grids almost always work well and, with a delicate line, may promote more accurate data reading and reconstruction than a heavy grid. Dark grid lines are chartjunk. When a graphic serves as a look-up table (rare indeed), then a grid may help with reading and interpolation. But even then the grid should be muted relative to the data." (Edward R Tufte, "Envisioning Information", 1990)

"Information consists of differences that make a difference." (Edward R Tufte, "Envisioning Information", 1990)

"Lurking behind chartjunk is contempt both for information and for the audience. Chartjunk promoters imagine that numbers and details are boring, dull, and tedious, requiring ornament to enliven. Cosmetic decoration, which frequently distorts the data, will never salvage an underlying lack of content. If the numbers are boring, then you've got the wrong numbers." (Edward R Tufte, "Envisioning Information", 1990)

"Maximizing data ink (within reason) is but a single dimension of a complex and multivariate design task. The principle helps conduct experiments in graphical design. Some of those experiments will succeed. There remain, however, many other considerations in the design of statistical graphics - not only of efficiency, but also of complexity, structure, density, and even beauty." (Edward R Tufte, "Data-Ink Maximization and Graphical Design", Oikos Vol. 58 (2), 1990)

"The ducks of information design are false escapes from flatland, adding pretend dimensions to impoverished data sets, merely fooling around with information." (Edward R Tufte, "Envisioning Information", 1990)

"Then there is the audience: will those looking at the new designs be confused? Some of the designs are selfexplanatory, as in the case of the range-frame. The dot-dash-plot is more difficult, although it still shows all the standard information found in the scatterplot. Nothing is lost to those puzzled by the frame of dashes, and something is gained by those who do understand. Moreover, it is a frequent mistake in thinking about statistical graphics to underestimate the audience. Instead, why not assume that if you understand it, most other readers will, too? Graphics should be as intelligent and sophisticated as the accompanying text." (Edward R Tufte, "Data-Ink Maximization and Graphical Design", Oikos Vol. 58 (2), 1990)

"Visual displays rich with data are not only an appropriate and proper complement to human capabilities, but also such designs are frequently optimal. If the visual task is contrast, comparison, and choice - as so often it is - then the more relevant information within eyespan, the better. Vacant, low-density displays, the dreaded posterization of data spread over pages and pages, require viewers to rely on visual memory - a weak skill - to make a contrast, a comparison, a choice." (Edward R Tufte, "Envisioning Information", 1990)

"We envision information in order to reason about, communicate, document, and preserve that knowledge - activities nearly always carried out on two-dimensional paper and computer screen. Escaping this flatland and enriching the density of data displays are the essential tasks of information design." (Edward R Tufte, "Envisioning Information", 1990)

"What about confusing clutter? Information overload? Doesn't data have to be ‘boiled down’ and  ‘simplified’? These common questions miss the point, for the quantity of detail is an issue completely separate from the difficulty of reading. Clutter and confusion are failures of design, not attributes of information. Often the less complex and less subtle the line, the more ambiguous and less interesting is the reading. Stripping the detail out of data is a style based on personal preference and fashion, considerations utterly indifferent to substantive content." (Edward R Tufte, "Envisioning Information", 1990)

"Good information design is clear thinking made visible, while bad design is stupidity in action." (Edward Tufte, "Visual Explanations" , 1997)

"Audience boredom is usually a content failure, not a decoration failure." (Edward R Tufte, "The cognitive style of PowerPoint", 2003)

"If your words or images are not on point, making them dance in color won't make them relevant." (Edward R Tufte, "The cognitive style of PowerPoint", 2003)

"A sparkline is a small, intense, simple, word-sized graphic with typographic resolution. Sparklines mean that graphics are no longer cartoonish special occasions with captions and boxes, but rather sparkline graphics can be everywhere a word or number can be: embedded in a sentence, table, headline, map, spreadsheet, graphic." (Edward R Tufte, "Beautiful Evidence", 2006)

"Areas surrounding data-lines may generate unintentional optical clutter. Strong frames produce melodramatic but content-diminishing visual effects. [...] A good way to assess a display for unintentional optical clutter is to ask 'Do the prominent visual effects convey relevant content?'" (Edward R Tufte, "Beautiful Evidence", 2006)

"By segregating evidence by mode (word, number, image, graph) , the current-day computer approach contradicts the spirit of sparklines, a spirit that makes no distinction among words, numbers, graphics, images. It is all evidence, after all. A good system for evidence display should be centered on evidence, not on a collection of application programs each devoted to a single mode of information." (Edward R Tufte, "Beautiful Evidence", 2006)

"By showing recent change in relation to many past changes, sparklines provide a context for nuanced analysis - and, one hopes, better decisions. [...] Sparklines efficiently display and narrate binary data (presence/absence, occurrence/non-occurrence, win/loss). [...] Sparklines can simultaneously accommodate several variables. [...] Sparklines can narrate on-going results detail for any process producing sequential binary outcomes." (Edward R Tufte, "Beautiful Evidence", 2006)

"Closely spaced lines produce moiré vibration, usually at its worst when data-lines (the figure) and spaces (the ground) between data-lines are approximately equal in size, and also when figure and ground contrast strongly in color value." (Edward R Tufte, "Beautiful Evidence", 2006)

"Conflicting with the idea of integrating evidence regardless of its these guidelines provoke several issues: First, labels are data. even intriguing data. [...] Second, when labels abandon the data points, then a code is often needed to relink names to numbers. Such codes, keys, and legends are Impediments to learning, causing the reader's brow to furrow. Third, segregating nouns from data-dots breaks up evidence on the basis of mode (verbal vs. nonverbal), a distinction lacking substantive relevance. Such separation is uncartographic; contradicting the methods of map design often causes trouble for any type of graphical display. Fourth, design strategies that reduce data-resolution take evidence displays in the wrong direction. Fifth, what clutter? Even this supposedly cluttered graph clearly shows the main ideas: brain and body mass are roughly linear in logarithms, and as both variables increase, this linearity becomes less tight." (Edward R Tufte, "Beautiful Evidence", 2006) [argumentation against Cleveland's recommendation of not using words on data plots]

"Documentation allows more effective watching, and we have the Fifth Principle for the analysis and presentation of data: 'Thoroughly describe the evidence. Provide a detailed title, indicate the authors and sponsors, document the data sources, show complete measurement scales, point out relevant issues.'" (Edward R Tufte, "Beautiful Evidence", 2006)

"Explanatory, journalistic, and scientific images should nearly always be mapped, contextualized, and placed on the universal grid. Mapped pictures combine representational images with scales, diagrams, overlays, numbers, words, images." (Edward R Tufte, "Beautiful Evidence", 2006)

"Evidence is evidence, whether words, numbers, images, din grams- still or moving. It is all information after all. For readers and viewers, the intellectual task remains constant regardless of the particular mode Of evidence: to understand and to reason about the materials at hand, and to appraise their quality, relevance. and integrity." (Edward R Tufte, "Beautiful Evidence", 2006)

"Excellent graphics exemplify the deep fundamental principles of analytical design in action. If this were not the case, then something might well be wrong with the principles." (Edward R Tufte, "Beautiful Evidence", 2006)

"Good design, however, can dispose of clutter and show all the data points and their names. [...] Clutter calls for a design solution, not a content reduction." (Edward R Tufte, "Beautiful Evidence", 2006)

"In general. statistical graphics should be moderately greater in length than in height. And, as William Cleveland discovered, for judging slopes and velocities up and down the hills in time-series, best is an aspect ratio that yields hill - slopes averaging 45°, over every cycle in the time-series. Variations in slopes are best detected when the slopes are around 45°, uphill or downhill." (Edward R Tufte, "Beautiful Evidence", 2006)

"Making a presentation is a moral act as well as an intellectual activity. The use of corrupt manipulations and blatant rhetorical ploys in a report or presentation - outright lying, flagwaving, personal attacks, setting up phony alternatives, misdirection, jargon-mongering, evading key issues, feigning disinterested objectivity, willful misunderstanding of other points of view - suggests that the presenter lacks both credibility and evidence. To maintain standards of quality, relevance, and integrity for evidence, consumers of presentations should insist that presenters be held intellectually and ethically responsible for what they show and tell. Thus consuming a presentation is also an intellectual and a moral activity." (Edward R Tufte, "Beautiful Evidence", 2006)

"Making an evidence presentation is a moral act as well as an intellectual activity. To maintain standards of quality, relevance, and integrity for evidence, consumers of presentations should insist that presenters be held intellectually and ethically responsible for what they show and tell. Thus consuming a presentation is also an intellectual and a moral activity." (Edward R Tufte, "Beautiful Evidence", 2006)

"Most techniques for displaying evidence are inherently multimodal, bringing verbal, visual. and quantitative elements together. Statistical graphics and maps arc visual-numerical fields labeled with words and framed by numbers. Even an austere image may evoke other images, new or remembered narrative, and perhaps a sense of scale and quantity. Words can simultaneously convey semantic and visual content, as the nouns on a map both name places and locate them in the two - space of latitude and longitude." (Edward R Tufte, "Beautiful Evidence", 2006)

"Principles of design should attend to the fundamental intellectual tasks in the analysis of evidence; thus we have the Second Principle for the analysis And presentation of data: Show causality, mechanism, explanation, systematic structure." (Edward R Tufte, "Beautiful Evidence", 2006)

"Sparklines are wordlike graphics, With an intensity of visual distinctions comparable to words and letters. [...] Words visually present both an overall shape and letter-by-letter detail; since most readers have seen the word previously, the visual task is usually one of quick recognition. Sparklines present an overall shape and aggregate pattern along with plenty of local detail. Sparklines are read the same way as words, although much more carefully and slowly." (Edward R Tufte, "Beautiful Evidence", 2006)

"Sparklines vastly increase the amount of data within our eyespan and intensify statistical graphics up to the everyday routine capabilities of the human eye-brain system for reasoning about visual evidence, seeing distinctions, and making comparisons. [...] Providing a straightforward and contextual look at intense evidence, sparkline graphics give us some chance to be approximately right rather than exactly wrong. (Edward R Tufte, "Beautiful Evidence", 2006)

"Sparklines work at intense resolutions, at the level of good typography and cartography. [...] Just as sparklines are like words, so then distributions of sparklines on a page are like sentences and paragraphs. The graphical idea here is make it wordlike and typographic - an idea that leads to reasonable answers for most questions about sparkline arrangements." (Edward R Tufte, "Beautiful Evidence", 2006)

"[...] the First Principle for the analysis and presentation data: 'Show comparisons, contrasts, differences'. The fundamental analytical act in statistical reasoning is to answer the question "Compared with what?". Whether we are evaluating changes over space or time, searching big data bases, adjusting and controlling for variables, designing experiments , specifying multiple regressions, or doing just about any kind of evidence-based reasoning, the essential point is to make intelligent and appropriate comparisons. Thus visual displays, if they are to assist thinking, should show comparisons." (Edward R Tufte, "Beautiful Evidence", 2006)

"The only thing that is 2-dimensional about evidence is the physical flatland of paper and computer screen. Flatlandy technologies of display encourage flatlandy thinking. Reasoning about evidence should not be stuck in 2 dimensions, for the world seek to understand is profoundly multivariate. Strategies of design should make multivariateness routine, nothing out of the ordinary. To think multivariate, show multivariate; the Third Principle for the analysis and presentation of data: 'Show multivariate data; that is, show more than 1 or 2 variables.'" (Edward R Tufte, "Beautiful Evidence", 2006)

"The principles of analytical design are universal - like mathematics, the laws of Nature, the deep structure of language - and are not tied to any particular language, culture, style, century, gender, or technology of information display." (Edward R Tufte, "Beautiful Evidence", 2006)

"The purpose of an evidence presentation is to assist thinking. Thus presentations should be constructed so as to assist with the fundamental intellectual tasks in reasoning about evidence: describing the data, making multivariate comparisons, understanding causality, integrating a diversity Of evidence, and documenting the analysis. Thus the Grand Principle of analytical design: 'The principles of analytical design are derived from the principles of analytical thinking.' Cognitive tasks are turned into principles of evidence presentation and design." (Edward R Tufte, "Beautiful Evidence", 2006)

"The Sixth Principle for the analysis and display of data: 'Analytical presentations ultimately stand or fall depending on the quality, relevance, and integrity of their content.' This suggests that the most effective way to improve a presentation is to get better content. It also suggests that design devices and gimmicks cannot salvage failed content." (Edward R Tufte, "Beautiful Evidence", 2006)

"These little data lines, because of their active quality over time, are named sparklines - small, high-resolution graphics usually embedded in a full context of words, numbers, images. Sparklines are datawords: data-intense, design-simple, word-sized graphics." (Edward R Tufte, "Beautiful Evidence", 2006)

"Words. numbers. pictures, diagrams, graphics, charts, tables belong together. Excellent maps, which are the heart and soul of good practices in analytical graphics, routinely integrate words, numbers, line-art, grids, measurement scales. Rarely is a distinction among the different modes of evidence useful for making sound inferences. It is all information after all. Thus the Fourth Principle for the analysis and presentation of data: 'Completely integrate words, numbers, images, diagrams.'" (Edward R Tufte, "Beautiful Evidence", 2006)

30 December 2006

✏️Robert D Carlsen - Collected Quotes

"A systems analysis project is usually thought of as occurring in two separate phases [...]. The first phase involves both the study of the existing system and phase involves implementing the new or improved system. The second phase involves implementing the new or improved system. This means writing the detailed procedures and data processing programs, conducting various types of tests. and installing the new system." (Robert D Carlsen & James A Lewis, "The Systems Analysis Workbook: A complete guide to project implementation and control", 1973)

"A system is an operation or combination of operations performed by men and, possibly, machines to carry out a specific business activity. This might be a total system that considers all the factors in the entire operation of an enterprise, or it might be a subsystem of that total." (Robert D Carlsen & James A Lewis, "The Systems Analysis Workbook: A complete guide to project implementation and control", 1973)

"A systems analysis is a study of one of these systems or subsystems. The purpose is to evaluate the system in terms of one or more of the following factors ... efficiency, accuracy, timeliness, economy, and productivity ... and to design a new or improved system. The design should eliminate or minimize deficiencies and improve the overall operations. Basically, the systems analyst who performs the study is concerned with three things. First, he must consider what is currently being done. Second, he must develop a method for what should be done. Finally, he must plan for the new design's application and for implementation of the system. Systems analysis is the first step in the development of a successful automated computer system, but the results of a systems analysis do not necessarily have to result in an automated system." (Robert D Carlsen & James A Lewis, "The Systems Analysis Workbook: A complete guide to project implementation and control", 1973)

"Objectives recorded on the System Specification work sheet, even though preliminary in nature, should be specific. It is never sufficient to state an objective in terms of simply improving an existing system or of implementing a computerized system. The idea that a system or an 'automated' system is a better system has been a popular concept too long. An improved system, per se, is of no benefit to a business client; implementing a better system in order to increase profits or reduce costs is of great benefit." (Robert D Carlsen & James A Lewis, "The Systems Analysis Workbook: A complete guide to project implementation and control", 1973)

"Probably the most neglected area in systems analysis involves the planning and control of the project, especially those projects requiring automation. More than one disastrous project has been launched by 'computer people' who communicated their aims to the vexed manager using technical data processing jargon in lieu of specific lists of easily understood tasks, schedules, and costs. This problem applies equally to in-house projects or those requiring the services of outside consultants. Each project must first be planned in detail. Control is involved with comparing actual progress with the plan and taking corrective action when the two do not correspond. Without the plan, true control is not possible; the need for corrective action, its nature, extent, and urgency cannot be accurately determined." (Robert D Carlsen & James A Lewis, "The Systems Analysis Workbook: A complete guide to project implementation and control", 1973)

"Project management is the process by which it is assured that the objective is achieved and resources are not wasted. Planning is one of the two parts of project management. Control is the other." (Robert D Carlsen & James A Lewis, "The Systems Analysis Workbook: A complete guide to project implementation and control", 1973)

"The most important ingredient in any system analysis is the tody of fact on which it is based. This body of fact must be complete; it must fully descrite the system which is already in existence and the environment in which it operates. Although an essential part of it compries the forms and documents being used, these alone are not sufficient. The ultimate source of the critical facts is the people who are part of the system, the operators, the users, those who input the information, and the system mmagers. The only efficient way to obtain the required information is to ask these people; that is, to conduct a series of interviews." (Robert D Carlsen & James A Lewis, "The Systems Analysis Workbook: A complete guide to project implementation and control", 1973)

"There are basically two types of flowcharts. One is the program flowchart and the other the systems flowchart. The program flowchart. sometimes called 'logic diagram', graphically portrays the data precessing program logic. [...] Systems flowcharts display the flow of information throughout all parts of a system, including the manual portions. Systems flowcharts can be of two types. One type is task-oriented, describing the flow of data in terms of the work being performed. The other is forms-oriented, following the forms through the functional structure of the system." (Robert D Carlsen & James A Lewis, "The Systems Analysis Workbook: A complete guide to project implementation and control", 1973)

"There are several classes of flowcharts used in recording study data in the Workbook. The purpose of any chart; of course, is to clarify and to make the information more understandable. One of these types of charts is a Process Flow Chart. It concerns itself with the flow of physical materials, including documents, through a system, especially in terms of distance and time. It is most useful in analyzing some of the cost and benefit factors for existing and proposed systems. System flowcharts [...] have been called the analyst's 'shorthand'. They can be forms-oriented or task-oriented. These flowcharts are not only the primary way of recording data pertinent to the current system, but are used for developing and displaying the new system as well. Later, in the implementation phase, program flowcharts, a fundamental tool of programming, would be developed." (Robert D Carlsen & James A Lewis, "The Systems Analysis Workbook: A complete guide to project implementation and control", 1973)

"The types of graphics used in operating a business fall into three main categories: diagrams, maps, and charts. Diagrams, such as organization diagrams, flow diagrams, and networks, are usually intended to graphically portray how an activity should be, or is being, accomplished, and who is responsible for that accomplishment. Maps such as route maps, location maps, and density maps, illustrate where an activity is, or should be, taking place, and what exists there. [...] Charts such as line charts, column charts, and surface charts, are normally constructed to show the businessman how much and when. Charts have the ability to graphically display the past, present, and anticipated future of an activity. They can be plotted so as to indicate the current direction that is being followed in relationship to what should be followed. They can indicate problems and potential problems, hopefully in time for constructive corrective action to be taken." (Robert D Carlsen & Donald L Vest, "Encyclopedia of Business Charts", 1977)

✏️Colin Ware - Collected Quotes

"Why should we be interested in visualization? Because the human visual system is a pattern seeker of enormous power and subtlety. The eye and the visual cortex of the brain form a massively parallel processor that provides the highest-bandwidth channel into human cognitive centers. At higher levels of processing, perception and cognition are closely interrelated, which is the reason why the words 'understanding' and 'seeing' are synonymous." (Colin Ware, 2000)

"A good visualization is not just a static picture or a 3D virtual environment that we can walk through and inspect like a museum full of statues. A good visualization is something that allows us to drill down and find more data about anything that seems important." (Colin Ware, "Information Visualization: Perception for Design" 2nd Ed., 2004)

"Chernoff faces have not generally been adopted in practical visualization applications. The main reason for this may be the idiosyncratic nature of faces. When data is mapped to faces, many kinds of perceptual interactions can occur. Sometimes the combination of variables will result in a particular stereotypical face, perhaps a happy face or a sad face, and this will be identified more readily. In addition, there are undoubtedly great differences in our sensitivity to the different features. We may be more sensitive to the curvature of the mouth than to the height of the eyebrows, for example. This means that the perceptual space of Chernoff faces is likely to be extremely nonlinear. In addition, there are almost certainly many uncharted interactions between facial features, and these are likely to vary from one viewer to another." (Colin Ware, "Information Visualization: Perception for Design" 2nd Ed., 2004)

"Diagrams are always hybrids of the conventional and the perceptual. Diagrams contain conventional elements, such as abstract labeling codes, that are difficult to learn but formally powerful. They also contain information that is coded according to perceptual rules, such as Gestalt principles. Arbitrary mappings may be useful, as in the case of mathematical notation, but a good diagram takes advantage of basic perceptual mechanisms that have evolved to perceive structure in the environment." (Colin Ware, "Information Visualization: Perception for Design" 2nd Ed., 2004)

"It is useful to think of color as an attribute of an object rather than as its primary characteristic. It is excellent for labeling and categorization, but poor for displaying shape, detail, or space." (Colin Ware, "Information Visualization: Perception for Design" 2nd Ed., 2004)

"Interactive visualization is a process made up of a number of interlocking feedback loops that fall into three broad classes. At the lowest level is the data manipulation loop, through which objects are selected and moved using the basic skills of eye–hand coordination. Delays of even a fraction of a second in this interaction cycle can seriously disrupt the performance of higher-level tasks. At an intermediate level is an exploration and navigation loop, through which an analyst finds his or her way in a large visual data space." (Colin Ware, "Information Visualization: Perception for Design" 2nd Ed., 2004)

"The great advantage of the treemap over conventional tree views is that the amount of information on each branch of the tree can be easily visualized. Because the method is space-filling, it can show quite large trees containing thousands of branches. The disadvantage is that the hierarchical structure is not as clear as it is in a more conventional tree drawing, which is a specialized form of node–link diagram." (Colin Ware, "Information Visualization: Perception for Design" 2nd Ed., 2004)

"The problem with the view that metadata and primary data are somehow essentially different is that all data is interpreted to some extent - there is no such thing as raw data. Every data gathering instrument embodies some particular interpretation in the way it is built. Also, from the practical viewpoint of the visualization designer, the problems of representation are the same for metadata as for primary data. In both cases, there are entities, relationships, and their attributes to be represented, although some are more abstract than others." (Colin Ware, "Information Visualization: Perception for Design" 2nd Ed., 2004)

"[...] when data is presented in certain ways, the patterns can be readily perceived. If we can understand how perception works, our knowledge can be translated into rules for displaying information. Following perception‐based rules, we can present our data in such a way that the important and informative patterns stand out. If we disobey the rules, our data will be incomprehensible or misleading." (Colin Ware, "Information Visualization: Perception for Design" 2nd Ed., 2004)

"One reason design is difficult is that the designer already has the knowledge expressed in the design, has seen it develop from inception, and therefore cannot see it with fresh eyes. The solution is to be analytic and this is where this book is intended to add value. Effective design should start with a visual task analysis, determine the set of visual queries to be supported by a design, and then use color, form, and space to efficiently serve those queries." (Colin Ware, "Visual Thinking for Design", 2008)

"Design graphic representations of data by taking into account human sensory capabilities in such a way that important data elements and data patterns can be quickly perceived." (Colin Ware, "Information Visualization: Perception for Design" 4th Ed., 2021)

"Important data should be represented by graphical elements that are more visually distinct than those representing less important information." (Colin Ware, "Information Visualization: Perception for Design" 4th Ed., 2021)

✏️Leandro N de Castro - Collected Quotes

"A bar chart is similar to a line chart, except that each data point is replaced by a rectangle with a height proportional to the value. The rectangle is usually centered on the spatial attribute of the data, and its width is often uniform. When values are categorical or discrete and cannot be shown in a series, a bar chart may be a suitable alternative for the line chart. Similarly to the case of a line chart, it is possible to create multivariate bar charts by stack‑ing the bars on top of each other in a form of superimposition easy to interpret." (Leandro N de Castro, "Exploratory Data Analysis: Descriptive Analysis, Visualization, and Dashboard Design", 2025)

"A scatterplot is a data visualization graph that uses dots to represent the relationship between two quantitative variables. One variable, called the explanatory variable, is plotted on the x‑axis, and the other variable, called the response variable, is plotted on the y‑axis. It is also possible to include a third categorical variable, represented by different dot colors. Each dot represents an individual data point, and the colors, when used, represent the categories of the dots. Therefore, the data point is organized into two or three columns, one for each variable, and each data point is plotted on the graph using two coordinates, one for each variable, with various colors representing each category.,." (Leandro N de Castro, "Exploratory Data Analysis: Descriptive Analysis, Visualization, and Dashboard Design", 2025)

"Closure is a feature related to our capability of completing (closing) an object or a shape that is incomplete, that is, one that has some parts missing. The preattentive processing of closure is also automatic, not requiring conscious effort. For example, when looking at any shape, e.g., a circle or a square, with a small part missing, our brain automatically and preattentively perceives whether the shape is incomplete and fills these gaps. Preattentive processing of closure can be used in visual communication to create recognizable symbols and logos." (Leandro N de Castro, "Exploratory Data Analysis: Descriptive Analysis, Visualization, and Dashboard Design", 2025)

"Color is a powerful visual tool to encode data and convey different meanings, such as  categories, magnitude, visual hierarchy, and even emotions. Using different hues, saturations, and brightness levels can help differentiate between categories or show patterns in the data." (Leandro N de Castro, "Exploratory Data Analysis: Descriptive Analysis, Visualization, and Dashboard Design", 2025)

"Curvature is another preattentive feature that leads to a fast detection of changes in the degree of curvature, bending, or angularity of a shape or line, such as the presence of a more or less curved line in a group of otherwise similar lines. The degree of curvature in a line or shape can be used to represent different quantities or values, for instance, a smaller or larger number of peaks in a function." (Leandro N de Castro, "Exploratory Data Analysis: Descriptive Analysis, Visualization, and Dashboard Design", 2025)

"Data visualization, by contrast, focuses on the visual representation of data in such a way that its values, structure, nature, type, and variability are accurately expressed by means of graphs. It aims to support the exploration and understanding of data, the identi‑fication of patterns, trends, distributions, correlations, and anomalies, the communicationof insights, and aid in decision‑making." (Leandro N de Castro, "Exploratory Data Analysis: Descriptive Analysis, Visualization, and Dashboard Design", 2025)

"Differences in orientation can help us differentiate between items (e.g., data points, lines, objects, etc.) or extract information about the data. For example, using vertical bars in a bar chart can help differentiate between categories, while using horizontal bars can emphasize the magnitude of the data. Angles and direction can be used to convey information, such as trends, movement, sense of depth, or changes in values." (Leandro N de Castro, "Exploratory Data Analysis: Descriptive Analysis, Visualization, and Dashboard Design", 2025)

"In data visualization, texture is the visual quality of an object related to its roughness, pattern, or smoothness. It can be created using a variety of techniques, for example, using different line styles, brushes, patterns, and even special effects. Differences in texture can help distinguish between data points or objects, create visual hierarchies, or convey infor‑mation about the data. For example, using different textures for different categories can help viewers quickly identify and differentiate patterns. Like the other features described here, the texture is usually processed preattentively, without the need for focused attention." (Leandro N de Castro, "Exploratory Data Analysis: Descriptive Analysis, Visualization, and Dashboard Design", 2025)

"Length is another preattentive visual property that can be used to create visual contrast, differences, importance, and proportions. The perception of differences in length normally occurs automatically and rapidly, without conscious effort or attention. It can be used in visual communication to quickly draw attention to important information or to create a visual hierarchy. For example, in a graph, longer bars may indicate larger values or quanti‑ties; in a map, longer lines may indicate longer distances; in a drawing, longer items may convey a sense of flow, etc." (Leandro N de Castro, "Exploratory Data Analysis: Descriptive Analysis, Visualization, and Dashboard Design", 2025)

"Line charts are useful for identifying patterns and trends in a one‑dimensional sequence of univariate data, that is, continuous data over time with a single value per data item. They map the sequence data (e.g., time) to one dimension, typically the x‑axis, and the data value to another dimension, typically the y‑axis, forming a line; or to the color of a mark or region along the spatial axis, forming a bar. The data is adjusted in size to be within the limits of the display attribute." (Leandro N de Castro, "Exploratory Data Analysis: Descriptive Analysis, Visualization, and Dashboard Design", 2025)

"Preattentive features, such as color, shape, orientation, and size, are those basic visual properties that are processed automatically, without conscious effort or attention. By understanding preattentive features, data analysts can create effective data visualization designs that make use of them to convey information more efficiently and accurately to the audience." (Leandro N de Castro, "Exploratory Data Analysis: Descriptive Analysis, Visualization, and Dashboard Design", 2025)

"Size is a preattentive feature that exerts a similar effect in vision as that exerted by the line width, that is, to detect differences quickly and automatically in items (e.g., objects, data points, font sizes, etc.). Differences in size can draw attention to specific data points, indicate hierarchy, emphasize specific items, or convey information about the magnitude of the data. Variation in size can be used to represent different quantities or values, where larger sizes may indicate higher values or importance, while smaller sizes may indicate lower values or importance." (Leandro N de Castro, "Exploratory Data Analysis: Descriptive Analysis, Visualization, and Dashboard Design", 2025)

"Preattentive processing of 3D (three‑dimensional) properties allows us to detect the depth and spatial relationships between objects, such as the presence of an object that appears to be closer or farther away than the others, without the need for focused attention. Perspective, lighting, size, or shading can be used to create the illusion of depth and convey information, such as relationships between variables." (Leandro N de Castro, "Exploratory Data Analysis: Descriptive Analysis, Visualization, and Dashboard Design", 2025)

"The histogram is a useful visualization technique to explore the pattern of a single variable distribution, where the x‑axis represents the range of values, and the y‑axis represents the absoluteor relative frequency of data points within each bin. Histograms allow the exploration of cen‑tral tendency measures, such as the mean and median; dispersion measures, such as the stan‑dard deviation; and range, and shape, such as skewness and kurtosis. It also helps to identify outliers or unusual values and to reveal potential biases or errors in the data collection process." (Leandro N de Castro, "Exploratory Data Analysis: Descriptive Analysis, Visualization, and Dashboard Design", 2025)

"The preattentive processing of density occurs automatically and rapidly, without conscious effort or attention, and can be used in visual communication to create contrast and emphasize importance or relevance. This feature can be swiftly detected by the presence of varying numbers of objects (e.g., data points or shapes) in a given region of the space, rep‑resenting different quantities or values. For instance, in a chart or graph, a higher density of data points can be used to represent a larger quantity, a more significant trend, or a more exciting or energetic area. By making use of the preattentive processing of density, design‑ers can create effective visual designs that convey information quickly and efficiently to the viewer." (Leandro N de Castro, "Exploratory Data Analysis: Descriptive Analysis, Visualization, and Dashboard Design", 2025)

"The preattentive processing of markings (e.g., stripes, dots, crosses, stars, hatchings, etc.) includes various visual properties, such as texture, shading, and patterns. These properties allow us to swiftly detect differences and similarities between objects or regions, such as the presence of a repeating pattern in a group of otherwise random shapes. The presence or absence of certain markings, such as dots or squares, can be used to represent different categories or values." (Leandro N de Castro, "Exploratory Data Analysis: Descriptive Analysis, Visualization, and Dashboard Design", 2025)

"The principle of closure states that incomplete objects are perceived as complete because our brain tends to fill the gaps to create the complete image. Note that closure is also a pre‑attentive feature and thus plays a key role not only in the quick filling of gaps or completion of shapes, but also in the organization of the information to be conveyed."(Leandro N de Castro, "Exploratory Data Analysis: Descriptive Analysis, Visualization, and Dashboard Design", 2025)

"The principle of common fate proposes that objects that move together or change similarly tend to be perceived as a group or a pattern. In this case, graphs that allow visualizing data obeying this principle will have to embody a type or a sense of motion. To illustrate this principle, let us consider a motion chart, a streamgraph, and a force‑directed graph. The motion chart is a visualization method that shows how data changes over time; the streamgraph is a stacked area graph that shows the changes in a set of data over time; and the force‑directed graph is a network visualization that shows the relationships of nodes in a graph. In all cases, there is a sense of common fate in the data." (Leandro N de Castro, "Exploratory Data Analysis: Descriptive Analysis, Visualization, and Dashboard Design", 2025)

"The principle of continuity states that objects that are arranged in a smooth, continuous way are more likely to be perceived as a single object, even if their pattern is interrupted. The line chart, the Sankey diagram, and the scatterplot are good examples of the principle of continuity in the use of Gestalt theory in data visualization." (Leandro N de Castro, "Exploratory Data Analysis: Descriptive Analysis, Visualization, and Dashboard Design", 2025)

"The principle of figure‑ground, also called figure‑field, states that objects are perceived as either being in the foreground or the background. One way of forcing this principle is by using contrasting colors in the background and foreground of an image, for instance, black and white, blue and orange, green and purple, red and green, yellow and purple, pink and green, and others. However, many of these pairs are not suitable for technical and scientific works, and thus, the recommendation is to use colors with parsimony." (Leandro N de Castro, "Exploratory Data Analysis: Descriptive Analysis, Visualization, and Dashboard Design", 2025)

"The principle of proximity proposes that objects that are close to one another tend to be perceived as a group or a pattern. In data visualization, the heatmap, the scatterplot, and the bar chart are good examples of methods that account for the principle of proximity. The heatmap is a graph in which the values of a matrix are represented by colors, which are a preattentive feature, and neighboring cells in the matrix convey a sense of organization and relationship. The scatterplot places similar data values close to one another, grouping them in the plot. In a bar chart, related data values are placed close together in the bars, allowing a visual association among them." (Leandro N de Castro, "Exploratory Data Analysis: Descriptive Analysis, Visualization, and Dashboard Design", 2025)

"The principle of similarity proposes that objects that share similar characteristics, such as color or form, tend to be perceived as a group or a pattern. Examples of data visualization techniques that account for the similarity principle in Gestalt theory include a line chart in which lines representing different categories have the same style, a bar chart in which the bar patterns or colors indicate the same group or category, and a scatterplot with different markers representing different categories of categorical variables." (Leandro N de Castro, "Exploratory Data Analysis: Descriptive Analysis, Visualization, and Dashboard Design", 2025)

"The principle of symmetry states that objects that are symmetrical, or have a balanced appearance, tend to be perceived as a group or a pattern. Some data visualization graphs that can be used to explore this principle are the boxplot with boxes symmetrically placed around the median (Q2), the radar chart displaying multivariate data as a bidimensional chart with quantitative variables, and the mirrored bar chart with two sets of bars with mirrored values displayed." (Leandro N de Castro, "Exploratory Data Analysis: Descriptive Analysis, Visualization, and Dashboard Design", 2025)

"Preattentive processing of position allows us to quickly detect changes in location, such as the presence of a dot or other object that is slightly displaced from the others. The spa‑tial location of visual elements can also be used to guide the viewer’s attention or encode information, such as ranking, hierarchy, or relationship (grouping)." (Leandro N de Castro, "Exploratory Data Analysis: Descriptive Analysis, Visualization, and Dashboard Design", 2025)

"The preattentive processing of shape is a basic visual property that enables us to swiftly 
detect similarities and differences between items based on their shape, without requir‑
ing conscious effort or attention. For instance, in a picture with squares and circles, one 
can quickly differentiate one from the other based on their shapes. Similarly, using differ‑
ent shapes for different forms or categories, or using a shape that is indicative of the data (e.g., a circle for data on a map), can help viewers quickly identify patterns." (Leandro N de Castro, "Exploratory Data Analysis: Descriptive Analysis, Visualization, and Dashboard Design", 2025)

29 December 2006

✏️Gerald Benoît - Collected Quotes

"A model links to the viewers’ engagement with the visualization. Can the viewers identify the purpose and create a relationship in their mind between the nascent message of your visualization and their knowledge and work practices? When sketching out the design and considering the data, what is the first intention of the design? How will viewers interpret the goal of the visualization?" (Gerald Benoît,"Introduction to Information Visualization: Transforming Data into Meaningful Information", 2019)

"A well-designed 'information visualization' is interactive, allowing viewers to converse with the data: gaining knowledge, exposing insights, and engaging with the data in unexpected ways. It is only through these conversations that the otherwise static display of data transforms into meaningful information." (Gerald Benoît,"Introduction to Information Visualization: Transforming Data into Meaningful Information", 2019)

"Before progressing to analysis and visualization of the data, examine the data for inconsistencies and missing values. Data that fall outside an expected range, values that are missing or null, or have a different encoding or data type need to be addressed." (Gerald Benoît,"Introduction to Information Visualization: Transforming Data into Meaningful Information", 2019)

"Contemporary information specialists should at least be conversant in the pros/cons, benefits and liabilities, tech and data requirements of each software product they might use." (Gerald Benoît,"Introduction to Information Visualization: Transforming Data into Meaningful Information", 2019)

"Experience shows that both neophyte designers of visualizations and commercial visualization applications often overlook the role that type plays in legibility, aesthetics, and meaning construction. Yet the most successful visualizations are those where the details of data, design, and aesthetics are in harmony, and the interactivity allows the end user to understand the explanation and to explore." (Gerald Benoît,"Introduction to Information Visualization: Transforming Data into Meaningful Information", 2019)

"For an information visualization specialist, we must weigh the impact of the purely visual aspects of our designs as well applying visual norms that facilitate interpretation. Finally, we integrate data as the foundation of the visualization - all in a way where each coheres—that is, each contributes the same message to the viewer albeit in different languages (textual, data, interactive, and visual). It’s not useful nor possible to study themes of the aesthetic, technical, and applications of visuals independently of the others." (Gerald Benoît,"Introduction to Information Visualization: Transforming Data into Meaningful Information", 2019)

"Information visualization displays meet the definition of an art form in that there is an intended message to be communicated, and the principles of graphic design are applied as they are in other information graphics. Unlike other forms of representational art, InfoVis is a representational art of 'information' as an abstract phenomenon, with the goal of engaging the viewer with forms of interactivity that are not possible with a painting." (Gerald Benoît,"Introduction to Information Visualization: Transforming Data into Meaningful Information", 2019)

"Knowing what graphic representation to apply is partially a function of the data themselves and partially from the designer’s understanding of the target audience viewing the graphic. The Internet and publications have many recommended charting types." (Gerald Benoît,"Introduction to Information Visualization: Transforming Data into Meaningful Information", 2019)

"The problem-solving approach favored in the big data/data science realm is datacentric. This is likely because of the similarities between traditional data- and text-mining activities that incorporate visualizing results for exploration and explanation. This field contributes to receptiveness by institutions and the public to very large datasets and the computational infrastructure that provides the data. For data scientists, however, the ultimate interest is using visuals to help chart the data, as opposed to interacting with them. The emphasis is on large datasets and machine learning." (Gerald Benoît,"Introduction to Information Visualization: Transforming Data into Meaningful Information", 2019)

"The rule of thirds applies to fonts, too. The use of fonts is more subtle than one might imagine at first glance. The extreme subtlety of detail when designing fonts contributes to an equally subtle affective impact on a design. The choice of fonts also contributes more evidently to legibility. To a graphic designer, the choice of font contributes to the overall design, addressing more than legibility because the design is tempered with sensitivity to the limitations of the output device (monitor), size of the font, and the overall aesthetic tone." (Gerald Benoît,"Introduction to Information Visualization: Transforming Data into Meaningful Information", 2019)

"[...] the rule of three applies to the choice of typography, too. In design practice, there is usually a heading font, body text, and then a font for details. [...] Even though two of the roles (title and body) are the same font name, one is bold and the other is regular. This equates to two fonts. It is common, too, to use a serif font for a title and then a sans serif for the other two (or vice versa). Learning which fonts to use comes only from practice and studying examples." (Gerald Benoît,"Introduction to Information Visualization: Transforming Data into Meaningful Information", 2019)

"When teaching design composition for posters and for websites, there are some introductory rules [...]. One is the 'rule of thirds'. This equates to (no more than) three colors in the design, three typefaces, and three display areas in a design composition [...]" (Gerald Benoît,"Introduction to Information Visualization: Transforming Data into Meaningful Information", 2019)

✏️Anker V Andersen - Collected Quotes

"An economic justification for computer graphics is that the organization spends an enormous amount of money on data processing, often providing managers with too many reports, too many data, and an overload of information. The report output has to be condensed into a more usable form. The computer graph essentially is the data represented in a structured pictorial form. The role of the graph is to provide meaningful reports. To the extent that it does. it can be justified." (Anker V Andersen, "Graphing Financial Information: How accountants can use graphs to communicate", 1983)

"Graphs are used to meet the need to condense all the available information into a more usable quantity. The selection process of combining and condensing will inevitably produce a less than complete study and will lead the user in certain directions, producing a potential for misleading." (Anker V Andersen, "Graphing Financial Information: How accountants can use graphs to communicate", 1983)

"Graphs can present internal accounting data effectively. Because One of the main functions of the accountant is to communicate accounting information to users. accountants should use graphs, at least to the extent that they clarify the presentation of accounting data. present the data fairly, and enhance management's ability to make a more informed decision. It has been argued that the human brain can absorb and understand images more easily than words and numbers, and, therefore, graphs may be better communicative devices than written reports or tabular statements." (Anker V Andersen, "Graphing Financial Information: How accountants can use graphs to communicate", 1983)

"Reliability is highly valued by accountants and has been defined as 'the faithfulness with which it (information) represents what it purports to represent'. The reason reliability is so important is that an essential characteristic of an accounting report is its acceptance, and if a report is considered to be misleading or superfluous, it and future reports will be disregarded." (Anker V Andersen, "Graphing Financial Information: How accountants can use graphs to communicate", 1983)

"Understandability implies that the graph will mean something to the audience. If the presentation has little meaning to the audience, it has little value. Understandability is the difference between data and information. Data are facts. Information is facts that mean something and make a difference to whoever receives them. Graphic presentation enhances understanding in a number of ways. Many people find that the visual comparison and contrast of information permit relationships to be grasped more easily. Relationships that had been obscure become clear and provide new insights." (Anker V Andersen, "Graphing Financial Information: How accountants can use graphs to communicate", 1983)

"The bar graph and the column graph are popular because they are simple and easy to read. These are the most versatile of the graph forms. They can be used to display time series, to display the relationship between two items, to make a comparison among several items, and to make a comparison between parts and the whole (total). They do not appear to be as 'statistical', which is an advantage to those people who have negative attitudes toward statistics. The column graph shows values over time, and the bar graph shows values at a point in time. bar graph compares different items as of a specific time (not over time)." (Anker V Andersen, "Graphing Financial Information: How accountants can use graphs to communicate", 1983)

"The scales used are important; contracting or expanding the vertical or horizontal scales will change the visual picture. The trend lines need enough grid lines to obviate difficulty in reading the results properly. One must be careful in the use of cross-hatching and shading, both of which can create illusions. Horizontal rulings tend to reduce the appearance. while vertical lines enlarge it. In summary, graphs must be reliable, and reliability depends not only on what is presented but also on how it is presented." (Anker V Andersen, "Graphing Financial Information: How accountants can use graphs to communicate", 1983)

"There are several uses for which the line graph is particularly relevant. One is for a series of data covering a long period of time. Another is for comparing several series on the same graph. A third is for emphasizing the movement of data rather than the amount of the data. It also can be used with two scales on the vertical axis, one on the right and another on the left, allowing different series to use different scales, and it can be used to present trends and forecasts." (Anker V Andersen, "Graphing Financial Information: How accountants can use graphs to communicate", 1983)

"There are two kinds of misrepresentation. In one. the numerical data do not agree with the data in the graph, or certain relevant data are omitted. This kind of misleading presentation. while perhaps hard to determine, clearly is wrong and can be avoided. In the second kind of misrepresentation, the meaning of the data is different to the preparer and to the user." (Anker V Andersen, "Graphing Financial Information: How accountants can use graphs to communicate", 1983)

About Me

Koeln, NRW, Germany
IT Professional with more than 25 years experience in IT in the area of full life-cycle of Web/Desktop/Database Applications Development, Software Engineering, Consultancy, Data Management, Data Quality, Data Migrations, Reporting, ERP implementations & support, Team/Project/IT Management, etc.