24 December 2011

📉Graphical Representation: Design (Just the Quotes)

"Good design looks right. It is simple (clear and uncomplicated). Good design is also elegant, and does not look contrived. A map should be aesthetically pleasing, thought provoking, and communicative."  (Arthur H Robinson, "Elements of Cartography", 1953)

"The design process involves a series of operations. In map design, it is convenient to break this sequence into three stages. In the first stage, you draw heavily on imagination and creativity. You think of various graphic possibilities, consider alternative ways." (Arthur H Robinson, "Elements of Cartography", 1953)

"Simplicity, accuracy, appropriate size, proper proportion, correct emphasis, and skilled execution - these are the factors that produce the effective chart. To achieve simplicity your chart must be designed with a definite audience in mind, show only essential information. Technical terms should be absent as far as possible. And in case of doubt it is wiser to oversimplify than to make matters unduly complex. Be careful to avoid distortion or misrepresentation. Accuracy in graphics is more a matter of portraying a clear reliable picture than reiterating exact values. Selecting the right scales and employing authoritative titles and legends are as important as precision plotting. The right size of a chart depends on its probable use, its importance, and the amount of detail involved." (Anna C Rogers, "Graphic Charts Handbook", 1961)

"Without adequate planning. it is seldom possible to achieve either proper emphasis of each component element within the chart or a presentation that is pleasing in its entirely. Too often charts are developed around a single detail without sufficient regard for the work as a whole. Good chart design requires consideration of these four major factors: (1) size, (2) proportion, (3) position and margins, and (4) composition." (Anna C Rogers, "Graphic Charts Handbook", 1961)

"An especially effective device for enhancing the explanatory power of time-series displays is to add spatial dimensions to the design of the graphic, so that the data are moving over space (in two or three dimensions) as well as over time. […] Occasionally graphics are belligerently multivariate, advertising the technique rather than the data." (Edward R Tufte, "The Visual Display of Quantitative Information", 1983)

"Graphical excellence is the well-designed presentation of interesting data - a matter of substance, of statistics, and of design. Graphical excellence consists of complex ideas communicated with clarity, precision, and efficiency. Graphical excellence is that which gives to the viewer the greatest number of ideas in the shortest time with the least ink in the smallest space. Graphical excellence is nearly always multivariate. And graphical excellence requires telling the truth about the data." (Edward R Tufte, "The Visual Display of Quantitative Information", 1983)

"Graphical competence demands three quite different skills: the substantive, statistical, and artistic. Yet now most graphical work, particularly at news publications, is under the direction of but a single expertise-the artistic. Allowing artist-illustrators to control the design and content of statistical graphics is almost like allowing typographers to control the content, style, and editing of prose. Substantive and quantitative expertise must also participate in the design of data graphics, at least if statistical integrity and graphical sophistication are to be achieved." (Edward R Tufte, "The Visual Display of Quantitative Information", 1983)

"Modern data graphics can do much more than simply substitute for small statistical tables. At their best, graphics are instruments for reasoning about quantitative information. Often the most effective way to describe, explore, and summarize a set of numbers even a very large set - is to look at pictures of those numbers. Furthermore, of all methods for analyzing and communicating statistical information, well-designed data graphics are usually the simplest and at the same time the most powerful." (Edward R Tufte, "The Visual Display of Quantitative Information", 1983)

"The theory of the visual display of quantitative information consists of principles that generate design options and that guide choices among options. The principles should not be applied rigidly or in a peevish spirit; they are not logically or mathematically certain; and it is better to violate any principle than to place graceless or inelegant marks on paper. Most principles of design should be greeted with some skepticism, for word authority can dominate our vision, and we may come to see only though the lenses of word authority rather than with our own eyes." (Edward R Tufte, "The Visual Display of Quantitative Information", 1983)

"Charts offer opportunities to distort information, to misinform. An old adage can be extended to read: 'There are lies, damned lies, statistics and charts'. Our visual impressions are often more memorable than our understanding of the facts they describe. [...] Never let your design enthusiasms overrule your judgement of the truth." (Bruce Robertson, "How to Draw Charts & Diagrams", 1988)

"Confusion and clutter are failures of design, not attributes of information. And so the point is to find design strategies that reveal detail and complexity - rather than to fault the data for an excess of complication. Or, worse, to fault viewers for a lack of understanding. Among the most powerful devices for reducing noise and enriching the content of displays is the technique of layering and separation, visually stratifying various aspects of the data." (Edward R Tufte, "Envisioning Information", 1990)

"The ducks of information design are false escapes from flatland, adding pretend dimensions to impoverished data sets, merely fooling around with information." (Edward R Tufte, "Envisioning Information", 1990)

"Visual displays rich with data are not only an appropriate and proper complement to human capabilities, but also such designs are frequently optimal. If the visual task is contrast, comparison, and choice - as so often it is - then the more relevant information within eyespan, the better. Vacant, low-density displays, the dreaded posterization of data spread over pages and pages, require viewers to rely on visual memory - a weak skill - to make a contrast, a comparison, a choice." (Edward R Tufte, "Envisioning Information", 1990)

"We envision information in order to reason about, communicate, document, and preserve that knowledge - activities nearly always carried out on two-dimensional paper and computer screen. Escaping this flatland and enriching the density of data displays are the essential tasks of information design." (Edward R Tufte, "Envisioning Information", 1990)

"The content and context of the numerical data determines the most appropriate mode of presentation. A few numbers can be listed, many numbers require a table. Relationships among numbers can be displayed by statistics. However, statistics, of necessity, are summary quantities so they cannot fully display the relationships, so a graph can be used to demonstrate them visually. The attractiveness of the form of the presentation is determined by word layout, data structure, and design." (Gerald van Belle, "Statistical Rules of Thumb", 2002)

"Dashboards and visualization are cognitive tools that improve your 'span of control' over a lot of business data. These tools help people visually identify trends, patterns and anomalies, reason about what they see and help guide them toward effective decisions. As such, these tools need to leverage people's visual capabilities. With the prevalence of scorecards, dashboards and other visualization tools now widely available for business users to review their data, the issue of visual information design is more important than ever." (Richard Brath & Michael Peters, "Dashboard Design: Why Design is Important," DM Direct, 2004)

"An effective dashboard is the product not of cute gauges, meters, and traffic lights, but rather of informed design: more science than art, more simplicity than dazzle. It is, above all else, about communication." (Stephen Few, "Information Dashboard Design", 2006)

"Good design, however, can dispose of clutter and show all the data points and their names. [...] Clutter calls for a design solution, not a content reduction." (Edward R Tufte, "Beautiful Evidence", 2006)

"Most dashboards fail to communicate efficiently and effectively, not because of inadequate technology (at least not primarily), but because of poorly designed implementations. No matter how great the technology, a dashboard's success as a medium of communication is a product of design, a result of a display that speaks clearly and immediately. Dashboards can tap into the tremendous power of visual perception to communicate, but only if those who implement them understand visual perception and apply that understanding through design principles and practices that are aligned with the way people see and think." (Stephen Few, "Information Dashboard Design", 2006)

"The principles of analytical design are universal - like mathematics, the laws of Nature, the deep structure of language - and are not tied to any particular language, culture, style, century, gender, or technology of information display." (Edward R Tufte, "Beautiful Evidence", 2006)

"For a given dataset there is not a great deal of advice which can be given on content and context. hose who know their own data should know best for their specific purposes. It is advisable to think hard about what should be shown and to check with others if the graphic makes the desired impression. Design should be let to designers, though some basic guidelines should be followed: consistency is important (sets of graphics should be in similar style and use equivalent scaling); proximity is helpful (place graphics on the same page, or on the facing page, of any text that refers to them); and layout should be checked (graphics should be neither too small nor too large and be attractively positioned relative to the whole page or display)." (Antony Unwin, "Good Graphics?" [in "Handbook of Data Visualization"], 2008)

"A viewer’s eye must be guided to 'read' the elements in a logical order. The design of an exploratory graphic needs to allow for the additional component of discovery - guiding the viewer to first understand the overall concept and then engage her to further explore the supporting information." (Felice C Frankel & Angela H DePace, "Visual Strategies", 2012)

"Data art is characterized by a lack of structured narrative and absence of any visual analysis capability. Instead, the motivation is much more about creating an artifact, an aesthetic representation or perhaps a technical/technique demonstration. At the extreme end, a design may be more guided by the idea of fun or playfulness or maybe the creation of ornamentation." (Andy Kirk, "Data Visualization: A successful design process", 2012)

"A common mistake is that all visualization must be simple, but this skips a step. You should actually design graphics that lend clarity, and that clarity can make a chart 'simple' to read. However, sometimes a dataset is complex, so the visualization must be complex. The visualization might still work if it provides useful insights that you wouldn’t get from a spreadsheet. […] Sometimes a table is better. Sometimes it’s better to show numbers instead of abstract them with shapes. Sometimes you have a lot of data, and it makes more sense to visualize a simple aggregate than it does to show every data point." (Nathan Yau, "Data Points: Visualization That Means Something", 2013)

"Eye-catching data graphics tend to use designs that are unique (or nearly so) without being strongly focused on the data being displayed. In the world of Infovis, design goals can be pursued at the expense of statistical goals. In contrast, default statistical graphics are to a large extent determined by the structure of the data (line plots for time series, histograms for univariate data, scatterplots for bivariate nontime-series data, and so forth), with various conventions such as putting predictors on the horizontal axis and outcomes on the vertical axis. Most statistical graphs look like other graphs, and statisticians often think this is a good thing." (Andrew Gelman & Antony Unwin, "Infovis and Statistical Graphics: Different Goals, Different Looks" , Journal of Computational and Graphical Statistics Vol. 22(1), 2013)

"The biggest thing to know is that data visualization is hard. Really difficult to pull off well. It requires harmonization of several skills sets and ways of thinking: conceptual, analytic, statistical, graphic design, programmatic, interface-design, story-telling, journalism - plus a bit of ‘gut feel.’ The end result is often simple and beautiful, but the process itself is usually challenging and messy." (David McCandless, 2013)

"A fundamental principle of design is to consider multiple alternatives and then choose the best, rather than to immediately fixate on one solution without considering any alternatives. One way to ensure that more than one possibility is considered is to explicitly generate multiple ideas in parallel. " (Tamara Munzner, "Visualization Analysis and Design", 2014)

"As with all design problems, vis design cannot be easily handled as a simple process of optimization because trade-offs abound. A design that does well by one measure will rate poorly on another. The characterization of trade-offs in the vis design space is a very open problem at the frontier of vis research." (Tamara Munzner, "Visualization Analysis and Design", 2014)

"There are myriad questions that we can ask from data today. As such, it’s impossible to write enough reports or design a functioning dashboard that takes into account every conceivable contingency and answers every possible question." (Phil Simon, "The Visual Organization: Data Visualization, Big Data, and the Quest for Better Decisions", 2014)

"While visuals are an essential part of data storytelling, data visualizations can serve a variety of purposes from analysis to communication to even art. Most data charts are designed to disseminate information in a visual manner. Only a subset of data compositions is focused on presenting specific insights as opposed to just general information. When most data compositions combine both visualizations and text, it can be difficult to discern whether a particular scenario falls into the realm of data storytelling or not." (Brent Dykes, "Effective Data Storytelling: How to Drive Change with Data, Narrative and Visuals", 2019)

"Another problem is that while data visualizations may appear to be objective, the designer has a great deal of control over the message a graphic conveys. Even using accurate data, a designer can manipulate how those data make us feel. She can create the illusion of a correlation where none exists, or make a small difference between groups look big." (Carl T Bergstrom & Jevin D West, "Calling Bullshit: The Art of Skepticism in a Data-Driven World", 2020)

"Well-designed data graphics provide readers with deeper and more nuanced perspectives, while promoting the use of quantitative information in understanding the world and making decisions." (Carl T Bergstrom & Jevin D West, "Calling Bullshit: The Art of Skepticism in a Data-Driven World", 2020)

23 December 2011

📉Graphical Representation: Maps (Just the Quotes)

"The dominant principle which characterizes my graphic tables and my figurative maps is to make immediately appreciable to the eye, as much as possible, the proportions of numeric results. […] Not only do my maps speak, but even more, they count, they calculate by the eye." (Chatles D Minard, "Des tableaux graphiques et des cartes figuratives", 1862) 

"Two important characteristics of maps should be noticed. A map is not the territory it represents, but, if correct, it has a similar structure to the territory, which accounts for its usefulness. If the map could be ideally correct, it would include, in a reduced scale, the map of the map; the map of the map, of the map [...]" (Alfred Korzybski, "Science and Sanity: An Introduction to Non-Aristotelian Systems and General Semantics", 1933)

"The first of the principles governing symbols is this: The symbol is NOT the thing symbolized; the word is NOT the thing; the map is NOT the territory it stands for." (Samuel I Hayakawa, "Language in Thought and Action", 1949)

"A fundamental value in the scientific outlook is concern with the best available map of reality. The scientist will always seek a description of events which enables him to predict most by assuming least. He thus already prefers a particular form of behavior. If moralities are systems of preferences, here is at least one point at which science cannot be said to be completely without preferences. Science prefers good maps." (Anatol Rapoport, "Science and the goals of man: a study in semantic orientation", 1950)

"No map contains all the information about the territory it represents. [...] Furthermore, the scale of the map makes a big difference. The smaller the scale the less features will be shown." (Anatol Rapoport, "Science and the goals of man: a study in semantic orientation", 1950)

"Good design looks right. It is simple (clear and uncomplicated). Good design is also elegant, and does not look contrived. A map should be aesthetically pleasing, thought provoking, and communicative."  (Arthur H Robinson, "Elements of Cartography", 1953)

"The design process involves a series of operations. In map design, it is convenient to break this sequence into three stages. In the first stage, you draw heavily on imagination and creativity. You think of various graphic possibilities, consider alternative ways." (Arthur H Robinson, "Elements of Cartography", 1953)

"The types of graphics used in operating a business fall into three main categories: diagrams, maps, and charts. Diagrams, such as organization diagrams, flow diagrams, and networks, are usually intended to graphically portray how an activity should be, or is being, accomplished, and who is responsible for that accomplishment. Maps such as route maps, location maps, and density maps, illustrate where an activity is, or should be, taking place, and what exists there. [...] Charts such as line charts, column charts, and surface charts, are normally constructed to show the businessman how much and when. Charts have the ability to graphically display the past, present, and anticipated future of an activity. They can be plotted so as to indicate the current direction that is being followed in relationship to what should be followed. They can indicate problems and potential problems, hopefully in time for constructive corrective action to be taken." (Robert D Carlsen & Donald L Vest, "Encyclopedia of Business Charts", 1977)

"Maps containing marks that indicate a variety of features at specific locations are easy to produce and often revealing for the reader. You can use dots, numbers, and shapes, with or without keys. The basic map must always be simple and devoid of unnecessary detail. There should be no ambiguity about what happens where." (Bruce Robertson, "How to Draw Charts & Diagrams", 1988)

"Maps used as charts do not need fine cartographic detail. Their purpose is to express ideas, explain relationships, or store data for consultation. Keep your maps simple. Edit out irrelevant detail. Without distortion, try to present the facts as the main feature of your map, which should serve only as a springboard for the idea you're trying to put across." (Bruce Robertson, "How to Draw Charts & Diagrams", 1988)

"The prevailing style of management must undergo transformation. A system cannot understand itself. The transformation requires a view from outside. The aim [...] is to provide an outside view - a lens - that I call a system of profound knowledge. It provides a map of theory by which to understand the organizations that we work in." (Dr. W. Edwards Deming, "The New Economics for Industry, Government, Education", 1994)

"Maps, due to their melding of scientific and artistic approaches, always involve complex interaction between the denotative and the connotative meanings of signs they contain." (Alan MacEachren, "How Maps Work: Representation, Visualization, and Design", 1995)

"The fact that map is a fuzzy and radial, rather than a precisely defined, category is important because what a viewer interprets a display to be will influence her expectations about the display and how she interacts with it." (Alan MacEachren, "How Maps Work: Representation, Visualization, and Design", 1995)

"The representational nature of maps, however, is often ignored - what we see when looking at a map is not the word, but an abstract representation that we find convenient to use in place of the world. When we build these abstract representations we are not revealing knowledge as much as are creating it." (Alan MacEachren, "How Maps Work: Representation, Visualization, and Design", 1995)

"Understanding how maps work and why maps work (or do not work) as representations in their own right and as prompts to further representations, and what it means for a map to work, are critical issues as we embark on a visual information age." (Alan MacEachren, "How Maps Work: Representation, Visualization, and Design", 1995)

"Eliciting and mapping the participant's mental models, while necessary, is far from sufficient [...] the result of the elicitation and mapping process is never more than a set of causal attributions, initial hypotheses about the structure of a system, which must then be tested. Simulation is the only practical way to test these models. The complexity of the cognitive maps produced in an elicitation workshop vastly exceeds our capacity to understand their implications. Qualitative maps are simply too ambiguous and too difficult to simulate mentally to provide much useful information on the adequacy of the model structure or guidance about the future development of the system or the effects of policies." (John D Sterman, "Learning in and about complex systems", Systems Thinking Vol. 3 2003)

"A road plan can show the exact location, elevation, and dimensions of any part of the structure. The map corresponds to the structure, but it's not the same as the structure. Software, on the other hand, is just a codification of the behaviors that the programmers and users want to take place. The map is the same as the structure. […] This means that software can only be described accurately at the level of individual instructions. […] A map or a blueprint for a piece of software must greatly simplify the representation in order to be comprehensible. But by doing so, it becomes inaccurate and ultimately incorrect. This is an important realization: any architecture, design, or diagram we create for software is essentially inadequate. If we represent every detail, then we're merely duplicating the software in another form, and we're wasting our time and effort." (George Stepanek, "Software Project Secrets: Why Software Projects Fail", 2005) 

"A map does not just chart, it unlocks and formulates meaning; it forms bridges between here and there, between disparate ideas that we did not know were previously connected." (Reif Larsen, "The Selected Works of T S Spivet", 2009)

"Raster maps - often also called raster images - represent measurements on a regular grid. They are usually a result of remote sensing techniques via satellites or airborne surveillance systems. They fit neither the construct of scatterplots nor that of maps. Nevertheless, both scatterplots and maps can be used to display raster maps within statistics software which has no extra GIS capabilities." (Martin Theus & Simon Urbanek, "Interactive Graphics for Data Analysis: Principles and Examples", 2009)

"Graphics, charts, and maps aren’t just tools to be seen, but to be read and scrutinized. The first goal of an infographic is not to be beautiful just for the sake of eye appeal, but, above all, to be understandable first, and beautiful after that; or to be beautiful thanks to its exquisite functionality." (Alberto Cairo, "The Functional Art", 2011)

"The utility of mapping as a form of data visualization isn’t in accuracy or precision, but rather the map’s capacity to help us make and organize hypothesis about the world of ideas and things. hypothesis-making through the map isn’t strictly inductive or deductive, although it can use the thought process of either, but it is often based on general observations." (Winifred E Newman, "Data Visualization for Design Thinking: Applied Mapping", 2017)

"Maps also have the disadvantage that they consume the most powerful encoding channels in the visualization toolbox - position and size - on an aspect that is held constant. This leaves less effective encoding channels like color for showing the dimension of interest." (Danyel Fisher & Miriah Meyer, "Making Data Visual", 2018)

"Maps are a type of chart that can convey relationships about space and relationships between objects that we relate to in the real world. Their effectiveness as a communication medium is strongly influenced by a host of factors: the nature of spatial data, the form and structure of representation, their intended purpose, the experience of the audience, and the context in the time and space in which the map is viewed. In other words, maps are a ubiquitous representation of spatial information that we can understand and relate to." (Vidya Setlur & Bridget Cogley, "Functional Aesthetics for data visualization", 2022)

📉Graphical Representation: Presentation (Just the Quotes)

"In many presentations it is not a question of saving time to the reader but a question of placing the arguments in such form that results may surely be obtained. For matters affecting public welfare, it is hard to estimate the benefits which may accrue if a little care be used in presenting data so that they will be convincing to the reader." (Willard C Brinton, "Graphic Methods for Presenting Facts", 1919)

"Judgment must be used in the showing of figures in any chart or numerical presentation, so that the figures may not give an appearance of greater accuracy than their method of collection would warrant. Too many otherwise excellent reports contain figures which give the impression of great accuracy when in reality the figures may be only the crudest approximations. Except in financial statements, it is a safe rule to use ciphers whenever possible at the right of all numbers of great size. The use of the ciphers greatly simplifies the grasping of the figures by the reader, and, at the same time, it helps to avoid the impression of an accuracy which is not warranted by the methods of collecting the data." (Willard C Brinton, "Graphic Methods for Presenting Facts", 1919)

"The principles of charting and curve plotting are not at all complex, and it is surprising that many business men dodge the simplest charts as though they involved higher mathematics or contained some sort of black magic. [...] The trouble at present is that there are no standards by which graphic presentations can be prepared in accordance with definite rules so that their interpretation by the reader may be both rapid and accurate. It is certain that there will evolve for methods of graphic presentation a few useful and definite rules which will correspond with the rules of grammar for the spoken and written language." (Willard C Brinton, "Graphic Methods for Presenting Facts", 1919) 

"Though accurate data and real facts are valuable, when it comes to getting results the manner of presentation is ordinarily more important than the facts themselves. The foundation of an edifice is of vast importance. Still, it is not the foundation but the structure built upon the foundation which gives the result for which the whole work was planned. As the cathedral is to its foundation so is an effective presentation of facts to the data." (Willard C Brinton, "Graphic Methods for Presenting Facts", 1919)

"Correct emphasis is basic to effective graphic presentation. Intensity of color is the simplest method of obtaining emphasis. For most reproduction purposes black ink on a white page is most generally used.  Screens, dots and lines can, of course, be effectively used to give a gradation of tone from light grey to solid black. When original charts are the subjects of display presentation, use of colors is limited only by the subject and the emphasis desired." (Anna C Rogers, "Graphic Charts Handbook", 1961)

"The impression created by a chart depends to a great extent on the shape of the grid and the distribution of time and amount scales. When your individual figures are a part of a series make sure your own will harmonize with the other illustrations in spacing of grid rulings, lettering, intensity of lines, and planned to take the same reduction by following the general style of the presentation." (Anna C Rogers, "Graphic Charts Handbook", 1961)

"Without adequate planning, it is seldom possible to achieve either proper emphasis of each component element within the chart or a presentation that is pleasing in its entirely. Too often charts are developed around a single detail without sufficient regard for the work as a whole. Good chart design requires consideration of these four major factors: (1) size, (2) proportion, (3) position and margins, and (4) composition." (Anna C Rogers, "Graphic Charts Handbook", 1961)

"Structured information is any type of information that is arranged to show relationships between the minute, individual particles (bits) of information and the final presentation of this information in a logical arrangement with continuity from beginning to end." (Cecil H Meyers, "Handbook of Basic Graphs: A modern approach", 1970)

"The use of trivial data - particularly in graphic presentation - can easily tire the reader so that he soon becomes disinterested. Graphs should be for information considered highly significant. not for unimportant points." (Cecil H Meyers, "Handbook of Basic Graphs: A modern approach", 1970)

"Graphical excellence is the well-designed presentation of interesting data - a matter of substance, of statistics, and of design. Graphical excellence consists of complex ideas communicated with clarity, precision, and efficiency. Graphical excellence is that which gives to the viewer the greatest number of ideas in the shortest time with the least ink in the smallest space. Graphical excellence is nearly always multivariate. And graphical excellence requires telling the truth about the data." (Edward R Tufte, "The Visual Display of Quantitative Information", 1983)

"The representation of numbers, as physically measured on the surface of the graphic itself, should be directly proportional to the numerical quantities represented." (Edward R Tufte, "The Visual Display of Quantitative Information", 1983)

"There are two kinds of misrepresentation. In one. the numerical data do not agree with the data in the graph, or certain relevant data are omitted. This kind of misleading presentation. while perhaps hard to determine, clearly is wrong and can be avoided. In the second kind of misrepresentation, the meaning of the data is different to the preparer and to the user." (Anker V Andersen, "Graphing Financial Information: How accountants can use graphs to communicate", 1983)

"Understandability implies that the graph will mean something to the audience. If the presentation has little meaning to the audience, it has little value. Understandability is the difference between data and information. Data are facts. Information is facts that mean something and make a difference to whoever receives them. Graphic presentation enhances understanding in a number of ways. Many people find that the visual comparison and contrast of information permit relationships to be grasped more easily. Relationships that had been obscure become clear and provide new insights." (Anker V Andersen, "Graphing Financial Information: How accountants can use graphs to communicate", 1983)

"The effective communication of information in visual form, whether it be text, tables, graphs, charts or diagrams, requires an understanding of those factors which determine the 'legibility', 'readability' and 'comprehensibility', of the information being presented. By legibility we mean: can the data be clearly seen and easily read? By readability we mean: is the information set out in a logical way so that its structure is clear and it can be easily scanned? By comprehensibility we mean: does the data make sense to the audience for whom it is intended? Is the presentation appropriate for their previous knowledge, their present information needs and their information processing capacities?" (Linda Reynolds & Doig Simmonds, "Presentation of Data in Science" 4th Ed, 1984)

"In part, graphing data needs to be iterative because we often do not know what to expect of the data; a graph can help discover unknown aspects of the data, and once the unknown is known, we frequently find ourselves formulating a new question about the data. Even when we understand the data and are graphing them for presentation, a graph will look different from what we had expected; our mind's eye frequently does not do a good job of predicting what our actual eyes will see." (William S Cleveland, "The Elements of Graphing Data", 1985)

"A chart is a bridge between you and your readers. It reveals your skills at comprehending the source information, at mastering presentation methods and at producing the design. Its success depends a great deal on your readers' understanding of what you are saying, and how you are saying it. Consider how they will use your chart. Will they want to find out from it more information about the subject? Will they just want a quick impression of the data? Or will they use it as a source for their own analysis? Charts rely upon a visual language which both you and your readers must understand." (Bruce Robertson, "How to Draw Charts & Diagrams", 1988)

"Charts and diagrams are the visual presentation of information. Since text and tables of information require close study to obtain the more general impressions of the subject, charts can be used to present readily understandable, easily digestible and, above all, memorable solutions." (Bruce Robertson, "How to Draw Charts & Diagrams", 1988)

"At the heart of quantitative reasoning is a single question: Compared to what? Small multiple designs, multivariate and data bountiful, answer directly by visually enforcing comparisons of changes, of the differences among objects, of the scope of alternatives. For a wide range of problems in data presentation, small multiples are the best design solution." (Edward R Tufte, "Envisioning Information", 1990)

"The content and context of the numerical data determines the most appropriate mode of presentation. A few numbers can be listed, many numbers require a table. Relationships among numbers can be displayed by statistics. However, statistics, of necessity, are summary quantities so they cannot fully display the relationships, so a graph can be used to demonstrate them visually. The attractiveness of the form of the presentation is determined by word layout, data structure, and design." (Gerald van Belle, "Statistical Rules of Thumb", 2002)

"Graphical illustrations should be simple and pleasing to the eye, but the presentation must remain scientific. In other words, we want to avoid those graphical features that are purely decorative while keeping a critical eye open for opportunities to enhance the scientific inference we expect from the reader. A good graphical design should maximize the proportion of the ink used for communicating scientific information in the overall display." (Phillip I Good & James W Hardin, "Common Errors in Statistics (and How to Avoid Them)", 2003)

"Documentation allows more effective watching, and we have the Fifth Principle for the analysis and presentation of data: 'Thoroughly describe the evidence. Provide a detailed title, indicate the authors and sponsors, document the data sources, show complete measurement scales, point out relevant issues.'" (Edward R Tufte, "Beautiful Evidence", 2006)

"Making a presentation is a moral act as well as an intellectual activity. The use of corrupt manipulations and blatant rhetorical ploys in a report or presentation - outright lying, flagwaving, personal attacks, setting up phony alternatives, misdirection, jargon-mongering, evading key issues, feigning disinterested objectivity, willful misunderstanding of other points of view - suggests that the presenter lacks both credibility and evidence. To maintain standards of quality, relevance, and integrity for evidence, consumers of presentations should insist that presenters be held intellectually and ethically responsible for what they show and tell. Thus consuming a presentation is also an intellectual and a moral activity." (Edward R Tufte, "Beautiful Evidence", 2006)

"Making an evidence presentation is a moral act as well as an intellectual activity. To maintain standards of quality, relevance, and integrity for evidence, consumers of presentations should insist that presenters be held intellectually and ethically responsible for what they show and tell. Thus consuming a presentation is also an intellectual and a moral activity." (Edward R Tufte, "Beautiful Evidence", 2006)

"Principles of design should attend to the fundamental intellectual tasks in the analysis of evidence; thus we have the Second Principle for the analysis and presentation of data: Show causality, mechanism, explanation, systematic structure." (Edward R Tufte, "Beautiful Evidence", 2006)

"[...] the First Principle for the analysis and presentation of data: 'Show comparisons, contrasts, differences'. The fundamental analytical act in statistical reasoning is to answer the question 'Compared with what?'. Whether we are evaluating changes over space or time, searching big data bases, adjusting and controlling for variables, designing experiments, specifying multiple regressions, or doing just about any kind of evidence-based reasoning, the essential point is to make intelligent and appropriate comparisons. Thus visual displays, if they are to assist thinking, should show comparisons." (Edward R Tufte, "Beautiful Evidence", 2006)

"The only thing that is 2-dimensional about evidence is the physical flatland of paper and computer screen. Flatlandy technologies of display encourage flatlandy thinking. Reasoning about evidence should not be stuck in 2 dimensions, for the world we seek to understand is profoundly multivariate. Strategies of design should make multivariateness routine, nothing out of the ordinary. To think multivariate, show multivariate; the Third Principle for the analysis and presentation of data: 'Show multivariate data; that is, show more than 1 or 2 variables.'" (Edward R Tufte, "Beautiful Evidence", 2006)

"The purpose of an evidence presentation is to assist thinking. Thus presentations should be constructed so as to assist with the fundamental intellectual tasks in reasoning about evidence: describing the data, making multivariate comparisons, understanding causality, integrating a diversity of evidence, and documenting the analysis. Thus the Grand Principle of analytical design: 'The principles of analytical design are derived from the principles of analytical thinking.' Cognitive tasks are turned into principles of evidence presentation and design." (Edward R Tufte, "Beautiful Evidence", 2006)

"The Sixth Principle for the analysis and display of data: 'Analytical presentations ultimately stand or fall depending on the quality, relevance, and integrity of their content.' This suggests that the most effective way to improve a presentation is to get better content. It also suggests that design devices and gimmicks cannot salvage failed content." (Edward R Tufte, "Beautiful Evidence", 2006)

"Words, numbers, pictures, diagrams, graphics, charts, tables belong together. Excellent maps, which are the heart and soul of good practices in analytical graphics, routinely integrate words, numbers, line-art, grids, measurement scales. Rarely is a distinction among the different modes of evidence useful for making sound inferences. It is all information after all. Thus the Fourth Principle for the analysis and presentation of data: 'Completely integrate words, numbers, images, diagrams.'" (Edward R Tufte, "Beautiful Evidence", 2006)

"Presentation graphics face the challenge to depict a key message in - usually a single - graphic which needs to fit very many observers at a time, without the chance to give further explanations or context. Exploration graphics, in contrast, are mostly created and used only by a single researcher, who can use as many graphics as necessary to explore particular questions. In most cases none of these graphics alone gives a comprehensive answer to those questions, but must be seen as a whole in the context of the analysis." (Martin Theus & Simon Urbanek, "Interactive Graphics for Data Analysis: Principles and Examples", 2009)

📉Graphical Representation: Distortion (Just the Quotes)

"A man's judgment cannot be better than the information on which he has based it. Give him no news, or present him only with distorted and incomplete data, with ignorant, sloppy, or biased reporting, with propaganda and deliberate falsehoods, and you destroy his whole reasoning process and make him somewhat less than a man." (Arthur H Sulzberger, [speech] 1948)

"A type of error common in both simple and weighted averages is the inclusion of components which have no bearing on or merely distort the summarization. Errors of this kind are frequent in per capita estimates covering the total population."  (Rufus R Lutz, "Graphic Presentation Simplified", 1949)

"If a chart contains a number of series which vary widely in individual magnitude, optical distortion may result from the necessarily sharp changes in the angle of the curves. The space between steeply rising or falling curves always appears narrower than the vertical distance between the plotting points." (Rufus R Lutz, "Graphic Presentation Simplified", 1949)

"The grid lines should be lighter than the curves, with the base line somewhat heavier than the others. All vertical lines should be of equal weight, unless the time scale is subdivided in quarters or other time periods, indicated by heavier rules. Very wide base lines, sometimes employed for pictorial effect, distort the graphic impression by making the base line the most prominent feature of the chart." (Rufus R Lutz, "Graphic Presentation Simplified", 1949)

"The fact is that, despite its mathematical base, statistics is as much an art as it is a science. A great many manipulations and even distortions are possible within the bounds of propriety. Often the statistician must choose among methods, a subjective process, and find the one that he will use to represent the facts." (Darrell Huff, "How to Lie with Statistics", 1954)

"Where the values of a series are such that a large part of the grid would be superfluous, it is the practice to break the grid thus eliminating the unused portion of the scale, but at the same time indicating the zero line. Failure to include zero in the vertical scale is a very common omission which distorts the data and gives an erroneous visual impression." (Calvin F Schmid, "Handbook of Graphic Presentation", 1954)

"Many people use statistics as a drunkard uses a street lamp - for support rather than illumination. It is not enough to avoid outright falsehood; one must be on the alert to detect possible distortion of truth. One can hardly pick up a newspaper without seeing some sensational headline based on scanty or doubtful data." (Anna C Rogers, "Graphic Charts Handbook", 1961)

"Simplicity, accuracy, appropriate size, proper proportion, correct emphasis, and skilled execution - these are the factors that produce the effective chart. To achieve simplicity your chart must be designed with a definite audience in mind, show only essential information. Technical terms should be absent as far as possible. And in case of doubt it is wiser to oversimplify than to make matters unduly complex. Be careful to avoid distortion or misrepresentation. Accuracy in graphics is more a matter of portraying a clear reliable picture than reiterating exact values. Selecting the right scales and employing authoritative titles and legends are as important as precision plotting. The right size of a chart depends on its probable use, its importance, and the amount of detail involved." (Anna C Rogers, "Graphic Charts Handbook", 1961)

"Probably one of the most common misuses (intentional or otherwise) of a graph is the choice of the wrong scale - wrong, that is, from the standpoint of accurate representation of the facts. Even though not deliberate, selection of a scale that magnifies or reduces - even distorts - the appearance of a curve can mislead the viewer." (Peter H Selby, "Interpreting Graphs and Tables", 1976)

"Clear, detailed, and thorough labeling should be used to defeat graphical distortion and ambiguity. Write out explanations of the data on the graphic itself. Label important events in the data." (Edward R Tufte, "The Visual Display of Quantitative Information", 1983)

"For many people the first word that comes to mind when they think about statistical charts is 'lie'. No doubt some graphics do distort the underlying data, making it hard for the viewer to learn the truth. But data graphics are no different from words in this regard, for any means of communication can be used to deceive. There is no reason to believe that graphics are especially vulnerable to exploitation by liars; in fact, most of us have pretty good graphical lie detectors that help us see right through frauds." (Edward R Tufte, "The Visual Display of Quantitative Information", 1983)

"Charts offer opportunities to distort information, to misinform. An old adage can be extended to read: 'There are lies, damned lies, statistics and charts'. Our visual impressions are often more memorable than our understanding of the facts they describe. [...] Never let your design enthusiasms overrule your judgement of the truth." (Bruce Robertson, "How to Draw Charts & Diagrams", 1988)

"Maps used as charts do not need fine cartographic detail. Their purpose is to express ideas, explain relationships, or store data for consultation. Keep your maps simple. Edit out irrelevant detail. Without distortion, try to present the facts as the main feature of your map, which should serve only as a springboard for the idea you're trying to put across." (Bruce Robertson, "How to Draw Charts & Diagrams", 1988)

"An axis is the ruler that establishes regular intervals for measuring information. Because it is such a widely accepted convention, it is often taken for granted and its importance overlooked. Axes may emphasize, diminish, distort, simplify, or clutter the information. They must be used carefully and accurately." (Mary H Briscoe, "Preparing Scientific Illustrations: A guide to better posters, presentations, and publications" 2nd ed., 1995)

"Because 'reality' and 'truth' are essential in these figures, it is important to be straightforward and thoughtful in the selection of the areas to be used. Manipulation such as enlargement, reduction, and increase or decrease of contrast must not distort or change the information. Touch-up is permissible only to eliminate distracting artifacts. Labels should be used judiciously and sparingly, and should not hide or distract from important information." (Mary H Briscoe, "Preparing Scientific Illustrations: A guide to better posters, presentations, and publications" 2nd ed., 1995)

"But people treat mutant statistics just as they do other statistics - that is, they usually accept even the most implausible claims without question. [...] And people repeat bad statistics [...] bad statistics live on; they take on lives of their own. [...] Statistics, then, have a bad reputation. We suspect that statistics may be wrong, that people who use statistics may be 'lying' - trying to manipulate us by using numbers to somehow distort the truth. Yet, at the same time, we need statistics; we depend upon them to summarize and clarify the nature of our complex society. This is particularly true when we talk about social problems." (Joel Best, "Damned Lies and Statistics: Untangling Numbers from the Media, Politicians, and Activists", 2001)

"Not all statistics start out bad, but any statistic can be made worse. Numbers - even good numbers - can be misunderstood or misinterpreted. Their meanings can be stretched, twisted, distorted, or mangled. These alterations create what we can call mutant statistics - distorted versions of the original figures." (Joel Best, "Damned Lies and Statistics: Untangling Numbers from the Media, Politicians, and Activists", 2001)

"Cleverly drawn pictures can sometimes disguise or render invisible what is there. At other times, they can make you see things that are not really there. It is helpful to be aware of how these illusions are achieved, as some of the illusionist’s 'tricks of the trade' can also be found in distortions used in graphs and diagrams." (Alan Graham, "Developing Thinking in Statistics", 2006)

"Category definition and selection in the pre-graphical phase of communication offer varied manipulation opportunities. But once we get to designing the chart itself category distortion opportunities are even more attractive." (Nicholas Strange, "Smoke and Mirrors: How to bend facts and figures to your advantage", 2007)

"A statistical index has all the potential pitfalls of any descriptive statistic - plus the distortions introduced by combining multiple indicators into a single number. By definition, any index is going to be sensitive to how it is constructed; it will be affected both by what measures go into the index and by how each of those measures is weighted." (Charles Wheelan, "Naked Statistics: Stripping the Dread from the Data", 2012)

22 December 2011

📉Graphical Representation: Categories (Just the Quotes)

"A bar graph typically presents either averages or frequencies. It is relatively simple to present raw data (in the form of dot plots or box plots). Such plots provide much more information, and they are closer to the original data. If the bar graph categories are linked in some way - for example, doses of treatments - then a line graph will be much more informative. Very complicated bar graphs containing adjacent bars are very difficult to grasp. If the bar graph represents frequencies, and the abscissa values can be ordered, then a line graph will be much more informative and will have substantially reduced chart junk." (Gerald van Belle, "Statistical Rules of Thumb", 2002)

"Stacked bar graphs do not show data structure well. A trend in one of the stacked variables has to be deduced by scanning along the vertical bars. This becomes especially difficult when the categories do not move in the same direction." (Gerald van Belle, "Statistical Rules of Thumb", 2002)

"Arbitrary category sequence and misplaced pie chart emphasis lead to general confusion and weaken messages. Although this can be used for quite deliberate and targeted deceit, manipulation of the category axis only really comes into its own with techniques that bend the relationship between the data and the optics in a more calculated way. Many of these techniques are just twins of similar ruses on the value axis, but are none the less powerful for that." (Nicholas Strange, "Smoke and Mirrors: How to bend facts and figures to your advantage", 2007)

"Category definition and selection in the pre-graphical phase of communication offer varied manipulation opportunities. But once we get to designing the chart itself category distortion opportunities are even more attractive." (Nicholas Strange, "Smoke and Mirrors: How to bend facts and figures to your advantage", 2007)

"Generally pie charts are to be avoided, as they can be difficult to interpret particularly when the number of categories is greater than five. Small proportions can be very hard to discern […] In addition, unless the percentages in each of the individual categories are given as numbers it can be much more difficult to estimate them from a pie chart than from a bar chart […]." (Jenny Freeman et al, "How to Display Data", 2008)

"Where there is no natural ordering to the categories it can be helpful to order them by size, as this can help you to pick out any patterns or compare the relative frequencies across groups. As it can be difficult to discern immediately the numbers represented in each of the categories it is good practice to include the number of observations on which the chart is based, together with the percentages in each category." (Jenny Freeman et al, "How to Display Data", 2008)

"Histograms are often mistaken for bar charts but there are important differences. Histograms show distribution through the frequency of quantitative values (y axis) against defined intervals of quantitative values (x axis). By contrast, bar charts facilitate comparison of categorical values. One of the distinguishing features of a histogram is the lack of gaps between the bars [...]" (Andy Kirk, "Data Visualization: A successful design process", 2012)

"Early exploration of a dataset can be overwhelming, because you don’t know where to start. Ask questions about the data and let your curiosities guide you. […] Make multiple charts, compare all your variables, and see if there are interesting bits that are worth a closer look. Look at your data as a whole and then zoom in on categories and individual data points. […] Subcategories, the categories within categories (within categories), are often more revealing than the main categories. As you drill down, there can be higher variability and more interesting things to see." (Nathan Yau, "Data Points: Visualization That Means Something", 2013)

"If I had to pick a single go-to graph for categorical data, it would be the horizontal bar chart, which flips the vertical version on its side. Why? Because it is extremely easy to read. The horizontal bar chart is especially useful if your category names are long, as the text is written from left to right, as most audiences read, making your graph legible for your audience." (Cole N Knaflic, "Storytelling with Data: A Data Visualization Guide for Business Professionals", 2015)

"A taxonomy is a classification scheme that organizes categories in a broader-narrower hierarchy. Items that share similar qualities are grouped into the same category, and the taxonomy provides a global organization by relating categories to one another." (Jesús Barrasa et al, "Knowledge Graphs: Data in Context for Responsive Businesses", 2021)

📉Graphical Representation: Sparklines (Just the Quotes)

"A sparkline is a small, intense, simple, word-sized graphic with typographic resolution. Sparklines mean that graphics are no longer cartoonish special occasions with captions and boxes, but rather sparkline graphics can be everywhere a word or number can be: embedded in a sentence, table, headline, map, spreadsheet, graphic." (Edward R Tufte, "Beautiful Evidence", 2006)

"By segregating evidence by mode (word, number, image, graph), the current-day computer approach contradicts the spirit of sparklines, a spirit that makes no distinction among words, numbers, graphics, images. It is all evidence, after all. A good system for evidence display should be centered on evidence, not on a collection of application programs each devoted to a single mode of information." (Edward R Tufte, "Beautiful Evidence", 2006)

"By showing recent change in relation to many past changes, sparklines provide a context for nuanced analysis - and, one hopes, better decisions. [...] Sparklines efficiently display and narrate binary data (presence/absence, occurrence/non-occurrence, win/loss). [...] Sparklines can simultaneously accommodate several variables. [...] Sparklines can narrate on-going results detail for any process producing sequential binary outcomes." (Edward R Tufte, "Beautiful Evidence", 2006)

"Sparklines are word-like graphics, with an intensity of visual distinctions comparable to words and letters. [...] Words visually present both an overall shape and letter-by-letter detail; since most readers have seen the word previously, the visual task is usually one of quick recognition. Sparklines present an overall shape and aggregate pattern along with plenty of local detail. Sparklines are read the same way as words, although much more carefully and slowly." (Edward R Tufte, "Beautiful Evidence", 2006)

"Sparklines vastly increase the amount of data within our eye-span and intensify statistical graphics up to the everyday routine capabilities of the human eye-brain system for reasoning about visual evidence, seeing distinctions, and making comparisons. [...] Providing a straightforward and contextual look at intense evidence, sparkline graphics give us some chance to be approximately right rather than exactly wrong." (Edward R Tufte, "Beautiful Evidence", 2006)

"Sparklines work at intense resolutions, at the level of good typography and cartography. [...] Just as sparklines are like words, so then distributions of sparklines on a page are like sentences and paragraphs. The graphical idea here is make it word-like and typographic - an idea that leads to reasonable answers for most questions about sparkline arrangements." (Edward R Tufte, "Beautiful Evidence", 2006)

"These little data lines, because of their active quality over time, are named sparklines - small, high-resolution graphics usually embedded in a full context of words, numbers, images. Sparklines are data words: data-intense, design-simple, word-sized graphics." (Edward R Tufte, "Beautiful Evidence", 2006)

"Sparklines are compact line graphs that do not have a quantitative scale. They are meant to provide a quick sense of a metric's movement or trend, usually over time. They are more expressive than arrows, which only indicate change from the prior period and do not qualify the degree of change. Sparklines are significantly more compact than normal line graphs but are precise." (Wayne W Eckerson, "Performance Dashboards: Measuring, Monitoring, and Managing Your Business", 2010)

"The biggest difference between line graphs and sparklines is that a sparkline is compact with no grid lines. It isn't meant to give precise values; rather, it should be considered just like any other word in the sentence. Its general shape acts as another term and lends additional meaning in its context. The driving forces behind these compact sparklines are speed and convenience." (Brian Suda, "A Practical Guide to Designing with Data", 2010)

"Sparklines aren't necessarily a variation on the line chart, rather, a clever use of them. [...] They take advantage of our visual perception capabilities to discriminate changes even at such a low resolution in terms of size. They facilitate opportunities to construct particularly dense visual displays of data in small space and so are particularly applicable for use on dashboards." (Andy Kirk, "Data Visualization: A successful design process", 2012)

"Line graphs that show more than one line can be useful for making comparisons, but sometimes it is important to discuss each individual line. By using sparklines evaluators can call attention to and discuss individual cases. Sparklines can be embedded within a sentence to illustrate a trend and help stakeholders better understand the data. Evaluators can use this simple visualization when creating reports." (Christopher Lysy, "Developments in Quantitative Data Display and Their Implications for Evaluation", 2013)

"Using sparklines is not as simple as it might seem. You must ensure that variation is as clear as possible. […] Sparklines are an interesting concept, but there are a few issues associated with their extreme miniaturization, among which is the removal of the vertical axes and the consequent absence of quantitative references." (Jorge Camões, "Data at Work: Best practices for creating effective charts and information graphics in Microsoft Excel", 2016)

"Sparklines focus on the trend over time and the direction rather than the actual values. Sparklines are used to visualize volatility or outliers. They are usually kept quite narrow on dashboards but still maintain an aspect ratio of 2:3." (Lorna Brown, "Tableau Desktop Cookbook", 2020)

📉Graphical Representation: Clutter (Just the Quotes)

"The practice of drawing several curves on the same sheet is not to be commended except in cases where the curves will not intersect. A crowded chart on which the curves frequently intersect resembles a Chinese puzzle more than a graphic record, and a report submitted in figures is to be preferred to a chart of this kind. Even when the curves do not intersect, they should be made in different colors in order that they may be readily distinguished, one from the other." (Allan C Haskell, "How to Make and Use Graphic Charts", 1919)

"Logging skewed variables also helps to reveal the patterns in the data. […] the rescaling of the variables by taking logarithms reduces the nonlinearity in the relationship and removes much of the clutter resulting from the skewed distributions on both variables; in short, the transformation helps clarify the relationship between the two variables. It also […] leads to a theoretically meaningful regression coefficient." (Edward R Tufte, "Data Analysis for Politics and Policy", 1974)

"Typically, data analysis is messy, and little details clutter it. Not only confounding factors, but also deviant cases, minor problems in measurement, and ambiguous results lead to frustration and discouragement, so that more data are collected than analyzed. Neglecting or hiding the messy details of the data reduces the researcher's chances of discovering something new." (Edward R Tufte, "Data Analysis for Politics and Policy", 1974)

"Do not allow data labels in the data region to interfere with the quantitative data or to clutter the graph. […] Avoid putting notes, keys, and markers in the data region. Put keys and markers just outside the data region and put notes in the legend or in the text." (William S Cleveland, "The Elements of Graphing Data", 1985)

"Make the data stand out and avoid superfluity are two broad strategies that serve as an overall guide to the specific principles […] The data - the quantitative and qualitative information in the data region - are the reason for the existence of the graph. The data should stand out. […] We should eliminate superfluity in graphs. Unnecessary parts of a graph add to the clutter and increase the difficulty of making the necessary elements - the data - stand out." (William S Cleveland, "The Elements of Graphing Data", 1985)

"Confusion and clutter are failures of design, not attributes of information. And so the point is to find design strategies that reveal detail and complexity - rather than to fault the data for an excess of complication. Or, worse, to fault viewers for a lack of understanding. Among the most powerful devices for reducing noise and enriching the content of displays is the technique of layering and separation, visually stratifying various aspects of the data." (Edward R Tufte, "Envisioning Information", 1990)

"An axis is the ruler that establishes regular intervals for measuring information. Because it is such a widely accepted convention, it is often taken for granted and its importance overlooked. Axes may emphasize, diminish, distort, simplify, or clutter the information. They must be used carefully and accurately." (Mary H Briscoe, "Preparing Scientific Illustrations: A guide to better posters, presentations, and publications" 2nd ed., 1995)

"The more clues to meaning that are supplied elsewhere, the less the need for cluttersome scales." (Eric Meyer, "Designing Infographics", 1997) 

"Areas surrounding data-lines may generate unintentional optical clutter. Strong frames produce melodramatic but content-diminishing visual effects. [...] A good way to assess a display for unintentional optical clutter is to ask 'Do the prominent visual effects convey relevant content?'" (Edward R Tufte, "Beautiful Evidence", 2006)

"Good design, however, can dispose of clutter and show all the data points and their names. [...] Clutter calls for a design solution, not a content reduction." (Edward R Tufte, "Beautiful Evidence", 2006)

"It turns out that our knowledge is always too incomplete and our visual data is too noisy and cluttered to be interpreted by deduction. In this situation, the method of reasoning needed to parse a real-world scene must be statistical, not deductive. To implement this form of reasoning, our knowledge of the world must be encoded in a probabilistic form, known as an a priori probability distribution." (David Mumford, ["The Best Writing of Mathematics: 2012"] 2012)

"The final step in creating your graphic is to refine it. Step back and look at it with fresh eyes. Is there anything that could be removed? Or anything that should be removed because it is distracting? Consider each element in your figure and question whether it contributes enough to your overall goal to justify its contribution. Also consider whether there is anything that could be represented more clearly. Perhaps you have been so effective at simplifying your graphic that you could now include another point in the same figure. Another method of refinement is to check the placement and alignment of your labels. They should be unobtrusive and clearly indicate which object they refer to. Consistency in fonts and alignment of labels can make the difference between something that is easy and pleasant to read, and something that is cluttered and frustrating." (Felice C Frankel & Angela H DePace, "Visual Strategies", 2012)

"Visual clutter is one of the most serious issues with bar charts. Using a bar to represent a simple data point is clearly overkill that results in no room for more data. At times, this may make us overlook less obvious things. The population pyramids offer a glaring example of this. But dot plots are not only about reducing clutter and avoiding overstimulation. Because we don’t compare heights, dot plots actually allow us to break the scale to improve resolution, and that’s a big plus over bar charts." (Jorge Camões, "Data at Work: Best practices for creating effective charts and information graphics in Microsoft Excel", 2016)

"As a first principle, any visualization should convey its information quickly and easily, and with minimal scope for misunderstanding. Unnecessary visual clutter makes more work for the reader’s brain to do, slows down the understanding (at which point they may give up) and may even allow some incorrect interpretations to creep in." (Robert Grant, "Data Visualization: Charts, Maps and Interactive Graphics", 2019)

"Estimates based on data are often uncertain. If the data were intended to tell us something about a wider population (like a poll of voting intentions before an election), or about the future, then we need to acknowledge that uncertainty. This is a double challenge for data visualization: it has to be calculated in some meaningful way and then shown on top of the data or statistics without making it all too cluttered." (Robert Grant, "Data Visualization: Charts, Maps and Interactive Graphics", 2019)

"Clutter is the main issue to keep in mind when assessing whether a paired bar chart is the right approach. With too many bars, and especially when there are more than two bars for each category, it can be difficult for the reader to see the patterns and determine whether the most important comparison is between or within the different categories." (Jonathan Schwabish, "Better Data Visualizations: A guide for scholars, researchers, and wonks", 2021)

"Showing the data and reducing the clutter means reducing extraneous gridlines, markers, and shades that obscure the actual data. Active titles, better labels, and helpful annotations will integrate your chart with the text around it. When charts are dense with many data series, you can use color strategically to highlight series of interest or break one dense chart into multiple smaller versions."  (Jonathan Schwabish, "Better Data Visualizations: A guide for scholars, researchers, and wonks", 2021)

📉Graphical Representation: Diagramming (Just the Quotes)

"Diagrams are of great utility for illustrating certain questions of vital statistics by conveying ideas on the subject through the eye, which cannot be so readily grasped when contained in figures." (Florence Nightingale, "Mortality of the British Army", 1857)

"They [diagrams] are designed not so much to allow of reference to particular numbers, which can be better had from printed tables of figures, as to exhibit to the eye the general results of large masses of figures which it is hopeless to attack in any other way than by graphical representation." (William S Jevons, [letter to Richard Hutton] 1863)

"[…] deduction consists in constructing an icon or diagram the relations of whose parts shall present a complete analogy with those of the parts of the object of reasoning, of experimenting upon this image in the imagination, and of observing the result so as to discover unnoticed and hidden relations among the parts." (Charles S Peirce, 1885)

"Deduction is that mode of reasoning which examines the state of things asserted in the premises, forms a diagram of that state of things, perceives in the parts of the diagram relations not explicitly mentioned in the premises, satisfies itself by mental experiments upon the diagram that these relations would always subsist, or at least would do so in a certain proportion of cases, and concludes their necessary, or probable, truth." (Charles S Peirce, "Kinds of Reasoning", cca. 1896)

"The preliminary examination of most data is facilitated by the use of diagrams. Diagrams prove nothing, but bring outstanding features readily to the eye; they are therefore no substitutes for such critical tests as may be applied to the data, but are valuable in suggesting such tests, and in explaining the conclusions founded upon them." (Sir Ronald A Fisher, "Statistical Methods for Research Workers", 1925)

"Although, the tabular arrangement is the fundamental form for presenting a statistical series, a graphic representation - in a chart or diagram - is often of great aid in the study and reporting of statistical facts. Moreover, sometimes statistical data must be taken, in their sources, from graphic rather than tabular records." (William L Crum et al, "Introduction to Economic Statistics", 1938)

"The eye can accurately appraise only very few features of a diagram, and consequently a complicated or confusing diagram will lead the reader astray. The fundamental rule for all charting is to use a plan which is simple and which takes account, in its arrangement of the facts to be presented, of the above-mentioned capacities of the eye."  (William L Crum et al, "Introduction to Economic Statistics", 1938)

"[…] statistical literacy. That is, the ability to read diagrams and maps; a 'consumer' understanding of common statistical terms, as average, percent, dispersion, correlation, and index number."  (Douglas Scates, "Statistics: The Mathematics for Social Problems", 1943)

"I believe, that the decisive idea which brings the solution of a problem is rather often connected with a well-turned word or sentence. The word or the sentence enlightens the situation, gives things, as you say, a physiognomy. It can precede by little the decisive idea or follow on it immediately; perhaps, it arises at the same time as the decisive idea. […]  The right word, the subtly appropriate word, helps us to recall the mathematical idea, perhaps less completely and less objectively than a diagram or a mathematical notation, but in an analogous way. […] It may contribute to fix it in the mind." (George Pólya [in a letter to Jaque Hadamard, "The Psychology of Invention in the Mathematical Field", 1949])

"The primary purpose of a graph is to show diagrammatically how the values of one of two linked variables change with those of the other. One of the most useful applications of the graph occurs in connection with the representation of statistical data." (John F Kenney & E S Keeping, "Mathematics of Statistics" Vol. I 3rd Ed., 1954)

"A model is a qualitative or quantitative representation of a process or endeavor that shows the effects of those factors which are significant for the purposes being considered. A model may be pictorial, descriptive, qualitative, or generally approximate in nature; or it may be mathematical and quantitative in nature and reasonably precise. It is important that effective means for modeling be understood such as analog, stochastic, procedural, scheduling, flow chart, schematic, and block diagrams." (Harold Chestnut, "Systems Engineering Tools", 1965)

"To analyse graphic representation precisely, it is helpful to distinguish it from musical, verbal and mathematical notations, all of which are perceived in a linear or temporal sequence. The graphic image also differs from figurative representation essentially polysemic, and from the animated image, governed by the laws of cinematographic time. Within the boundaries of graphics fall the fields of networks, diagrams and maps. The domain of graphic imagery ranges from the depiction of atomic structures to the representation of galaxies and extends into the spheres of topography and cartography." (Jacques Bertin, "Semiology of graphics" ["Semiologie Graphique"], 1967)

"One of the methods making the data intelligible is to represent it by means of graphs and diagrams. The graphic & diagrammatic representation of the data is always appealing to the eye as well as to the mind of the observer." (S P Singh & R P S Verma, "Agricultural Statistics", cca. 1969)

"Pencil and paper for construction of distributions, scatter diagrams, and run-charts to compare small groups and to detect trends are more efficient methods of estimation than statistical inference that depends on variances and standard errors, as the simple techniques preserve the information in the original data." (William E Deming, "On Probability as Basis for Action" American Statistician Vol. 29 (4), 1975)

"The types of graphics used in operating a business fall into three main categories: diagrams, maps, and charts. Diagrams, such as organization diagrams, flow diagrams, and networks, are usually intended to graphically portray how an activity should be, or is being, accomplished, and who is responsible for that accomplishment. Maps such as route maps, location maps, and density maps, illustrate where an activity is, or should be, taking place, and what exists there. [...] Charts such as line charts, column charts, and surface charts, are normally constructed to show the businessman how much and when. Charts have the ability to graphically display the past, present, and anticipated future of an activity. They can be plotted so as to indicate the current direction that is being followed in relationship to what should be followed. They can indicate problems and potential problems, hopefully in time for constructive corrective action to be taken." (Robert D Carlsen & Donald L Vest, "Encyclopedia of Business Charts", 1977)

"Charts and diagrams are the visual presentation of information. Since text and tables of information require close study to obtain the more general impressions of the subject, charts can be used to present readily understandable, easily digestible and, above all, memorable solutions." (Bruce Robertson, "How to Draw Charts & Diagrams", 1988)

"As the size of software systems increases, the algorithms and data structures of the computation no longer constitute the major design problems. When systems are constructed from many components, the organization of the overall system - the software architecture - presents a new set of design problems. This level of design has been addressed in a number of ways including informal diagrams and descriptive terms, module interconnection languages, templates and frameworks for systems that serve the needs of specific domains, and formal models of component integration mechanisms." (David Garlan & Mary Shaw, "An introduction to software architecture", Advances in software engineering and knowledge engineering Vol 1, 1993)

"Delay time, the time between causes and their impacts, can highly influence systems. Yet the concept of delayed effect is often missed in our impatient society, and when it is recognized, it’s almost always underestimated. Such oversight and devaluation can lead to poor decision making as well as poor problem solving, for decisions often have consequences that don’t show up until years later. Fortunately, mind mapping, fishbone diagrams, and creativity/brainstorming tools can be quite useful here." (Stephen G Haines, "The Manager's Pocket Guide to Strategic and Business Planning", 1998)

"Always remember that the model is not the diagram. The diagram’s purpose is to help communicate and explain the model. The code can serve as a repository of the details of the design." (Eric Evans, "Domain-Driven Design: Tackling complexity in the heart of software", 2003)

"Data is transformed into graphics to understand. A map, a diagram are documents to be interrogated. But understanding means integrating all of the data. In order to do this it’s necessary to reduce it to a small number of elementary data. This is the objective of the 'data treatment' be it graphic or mathematic." (Jacques Bertin [interview], 2003)

"Diagrams are a means of communication and explanation, and they facilitate brainstorming. They serve these ends best if they are minimal. Comprehensive diagrams of the entire object model fail to communicate or explain; they overwhelm the reader with detail and they lack meaning." (Eric Evans, "Domain-Driven Design: Tackling complexity in the heart of software", 2003)

"Graphical design notations have been with us for a while [...] their primary value is in communication and understanding. A good diagram can often help communicate ideas about a design, particularly when you want to avoid a lot of details. Diagrams can also help you understand either a software system or a business process. As part of a team trying to figure out something, diagrams both help understanding and communicate that understanding throughout a team. Although they aren't, at least yet, a replacement for textual programming languages, they are a helpful assistant." (Martin Fowler, "UML Distilled: A Brief Guide to the Standard Object Modeling", 2004)

"System Thinking is a common concept for understanding how causal relationships and feedbacks work in an everyday problem. Understanding a cause and an effect enables us to analyse, sort out and explain how changes come about both temporarily and spatially in common problems. This is referred to as mental modelling, i.e. to explicitly map the understanding of the problem and making it transparent and visible for others through Causal Loop Diagrams (CLD)." (Hördur V. Haraldsson, "Introduction to System Thinking and Causal Loop Diagrams", 2004)

"A diagram is a graphic shorthand. Though it is an ideogram, it is not necessarily an abstraction. It is a representation of something in that it is not the thing itself. In this sense, it cannot help but be embodied. It can never be free of value or meaning, even when it attempts to express relationships of formation and their processes. At the same time, a diagram is neither a structure nor an abstraction of structure." (Peter Eisenman, "Written Into the Void: Selected Writings", 1990-2004, 2007)

"Diagrams are information graphics that are made up primarily of geometric shapes, such as rectangles, circles, diamonds, or triangles, that are typically (but not always) interconnected by lines or arrows. One of the major purposes of a diagram is to show how things, people, ideas, activities, etc. interrelate and interconnect. Unlike quantitative charts and graphs, diagrams are used to show interrelationships in a qualitative way." (Robbie T Nakatsu, "Diagrammatic Reasoning in AI", 2010)

"[…] a conceptual model is a diagram connecting variables and constructs based on theory and logic that displays the hypotheses to be tested." (Mary W Celsi et al, "Essentials of Business Research Methods", 2011)

"Geographic maps have the advantage of being true to scale - great for walking. Diagrams have the advantage of being easily imaged and remembered, often true to a non-pedestrian experience, and the ability to open up congestion, reduce empty space, and use real estate efficiently. Hybrids 'mapograms' ? - often have the disadvantages of both map and diagram with none of the corresponding advantages." (Joel Katz, "Designing Information: Human factors and common sense in information design", 2012)

"Diagrams furnish only approximate information. They do not add anything to the meaning of the data and, therefore, are not of much use to a statistician or research worker for further mathematical treatment or statistical analysis. On the other hand, graphs are more obvious, precise and accurate than the diagrams and are quite helpful to the statistician for the study of slopes, rates of change and estimation, (interpolation and extrapolation), wherever possible." (S C Gupta & Indra Gupta, "Business Statistics", 2013) 

"System dynamics [...] uses models and computer simulations to understand behavior of an entire system, and has been applied to the behavior of large and complex national issues. It portrays the relationships in systems as feedback loops, lags, and other descriptors to explain dynamics, that is, how a system behaves over time. Its quantitative methodology relies on what are called 'stock-and-flow diagrams' that reflect how levels of specific elements accumulate over time and the rate at which they change. Qualitative systems thinking constructs evolved from this quantitative discipline." (Karen L Higgins, "Economic Growth and Sustainability: Systems Thinking for a Complex World", 2015)

"To keep accuracy and efficiency of your diagrams appealing to a potential audience, explicitly describe the encoding principles we used. Titles, labels, and legends are the most common ways to define the meaning of the diagram and its elements." (Vasily Pantyukhin, "Principles of Design Diagramming", 2015)

"Upon discovering a visual image, the brain analyzes it in terms of primitive shapes and colors. Next, unity contours and connections are formed. As well, distinct variations are segmented. Finally, the mind attracts active attention to the significant things it found. That process is permanently running to react to similarities and dissimilarities in shapes, positions, rhythms, colors, and behavior. It can reveal patterns and pattern-violations among the hundreds of data values. That natural ability is the most important thing used in diagramming." (Vasily Pantyukhin, "Principles of Design Diagramming", 2015)

"Usually, diagrams contain some noise – information unrelated to the diagram’s primary goal. Noise is decorations, redundant, and irrelevant data, unnecessarily emphasized and ambiguous icons, symbols, lines, grids, or labels. Every unnecessary element draws attention away from the central idea that the designer is trying to share. Noise reduces clarity by hiding useful information in a fog of useless data. You may quickly identify noise elements if you can remove them from the diagram or make them less intense and attractive without compromising the function." (Vasily Pantyukhin, "Principles of Design Diagramming", 2015)

"Models are formal structures represented in mathematics and diagrams that help us to understand the world. Mastery of models improves your ability to reason, explain, design, communicate, act, predict, and explore." (Scott E Page, "The Model Thinker", 2018)

"Some scientists (e.g., econometricians) like to work with mathematical equations; others (e.g., hard-core statisticians) prefer a list of assumptions that ostensibly summarizes the structure of the diagram. Regardless of language, the model should depict, however qualitatively, the process that generates the data - in other words, the cause-effect forces that operate in the environment and shape the data generated." (Judea Pearl & Dana Mackenzie, "The Book of Why: The new science of cause and effect", 2018)

"The calculus of causation consists of two languages: causal diagrams, to express what we know, and a symbolic language, resembling algebra, to express what we want to know. The causal diagrams are simply dot-and-arrow pictures that summarize our existing scientific knowledge. The dots represent quantities of interest, called 'variables', and the arrows represent known or suspected causal relationships between those variables - namely, which variable 'listens' to which others." (Judea Pearl & Dana Mackenzie, "The Book of Why: The new science of cause and effect", 2018)

"The main differences between Bayesian networks and causal diagrams lie in how they are constructed and the uses to which they are put. A Bayesian network is literally nothing more than a compact representation of a huge probability table. The arrows mean only that the probabilities of child nodes are related to the values of parent nodes by a certain formula (the conditional probability tables) and that this relation is sufficient. That is, knowing additional ancestors of the child will not change the formula. Likewise, a missing arrow between any two nodes means that they are independent, once we know the values of their parents. [...] If, however, the same diagram has been constructed as a causal diagram, then both the thinking that goes into the construction and the interpretation of the final diagram change." (Judea Pearl & Dana Mackenzie, "The Book of Why: The new science of cause and effect", 2018)

"Decision trees show the breakdown of the data by one variable then another in a very intuitive way, though they are generally just diagrams that don’t actually encode data visually." (Robert Grant, "Data Visualization: Charts, Maps and Interactive Graphics", 2019)

"The term 'infographics' is used for eye-catching diagrams which get a simple message across. They are very popular in advertising and can convey an impression of scientific, reliable information, but they are not the same thing as data visualization. An infographic will typically only convey a few numbers, and not use visual presentations to allow the reader to make comparisons of their own." (Robert Grant, "Data Visualization: Charts, Maps and Interactive Graphics", 2019)

More quotes on "Diagramming" at the-web-of-knowledge.blogspot.com 

21 December 2011

📉Graphical Representation: Area (Just the Quotes)

"In general, the comparison of two circles of different size should be strictly avoided. Many excellent works on statistics approve the comparison of circles of different size, and state that the circles should always be drawn to represent the facts on an area basis rather than on a diameter basis. The rule, however, is not always followed and the reader has no way of telling whether the circles compared have been drawn on a diameter basis or on an area basis, unless the actual figures for the data are given so that the dimensions may be verified." (Willard C Brinton, "Graphic Methods for Presenting Facts", 1919)

"Readers of statistical diagrams should not be required to compare magnitudes in more than one dimension. Visual comparisons of areas are particularly inaccurate and should not be necessary in reading any statistical graphical diagram." (William C Marshall, "Graphical methods for schools, colleges, statisticians, engineers and executives", 1921)

"A chart without a border line has several advantages. It is not limited to a designated area. The irregular white space surrounding it makes it more adaptable to any page size. It may be more readily placed either horizontally or vertically on the page, so long as the reduction in the size of the chart does not destroy legibility of lettering." (Mary E Spear, "Charting Statistics", 1952)

"The pie or sector chart makes a comparison of various components with each other and with the whole. However, this type should be used sparingly, especially when there are many segments. It is not only difficult to compare area segments, but most difficult to label them properly. When there are many divisions of the data, a bar chart would give greater clarity." (Mary E Spear, "Charting Statistics", 1952)

"Charts and graphs represent an extremely useful and flexible medium for explaining, interpreting, and analyzing numerical facts largely by means of points, lines, areas, and other geometric forms and symbols. They make possible the presentation of quantitative data in a simple, clear, and effective manner and facilitate comparison of values, trends, and relationships. Moreover, charts and graphs possess certain qualities and values lacking in textual and tabular forms of presentation." (Calvin F Schmid, "Handbook of Graphic Presentation", 1954)

"Circles of different size, however cannot properly be used to compare the size of different totals. This is because the reader does not know whether to compare the diameters or the areas (which vary as the squares of the diameters), and is likely to misjudge the comparison in either ease. Usually the circles are drawn so that their diameters are in correct proportion to each other; but then the area comparison is exaggerated. Component bars should be used to show totals of different size since their one dimension lengths can be easily judged not only for the totals themselves but for the component parts as well. Circles, therefore, can show proportions properly by variations in angles of sectors but not by variations in diameters."  (Anna C Rogers, "Graphic Charts Handbook", 1961)

"The histogram, with its columns of area proportional to number, like the bar graph, is one of the most classical of statistical graphs. Its combination with a fitted bell-shaped curve has been common since the days when the Gaussian curve entered statistics. Yet as a graphical technique it really performs quite poorly. Who is there among us who can look at a histogram-fitted Gaussian combination and tell us, reliably, whether the fit is excellent, neutral, or poor? Who can tell us, when the fit is poor, of what the poorness consists? Yet these are just the sort of questions that a good graphical technique should answer at least approximately." (John W Tukey, "The Future of Processes of Data Analysis", 1965)

"The varieties of circle charts are necessarily limited by the lack of basic design variation - a circle is a circle! Also, a circle can be considered as representing only one unit of area. regardless of its size. Thus, circle charts have limited applications, i.e., to show how a given quantity (area) is divided among its component parts,' or to show changes in the variable by showing area changes. A circle chart almost always presents some form of a part-to-total relationship." (Cecil H Meyers, "Handbook of Basic Graphs: A modern approach", 1970)

"The space between columns, on the other hand, should be just sufficient to separate them clearly, but no more. The columns should not, under any circumstances, be spread out merely to fill the width of the type area. […] Sometimes, however, it is difficult to avoid undesirably large gaps between columns, particularly where the data within any given column vary considerably in length. This problem can sometimes be solved by reversing the order of the columns […]. In other instances the insertion of additional space after every fifth entry or row can be helpful, […] but care must be taken not to imply that the grouping has any special meaning." (Linda Reynolds & Doig Simmonds, "Presentation of Data in Science" 4th Ed, 1984)

"Scatter charts show the relationships between information, plotted as points on a grid. These groupings can portray general features of the source data, and are useful for showing where correlationships occur frequently. Some scatter charts connect points of equal value to produce areas within the grid which consist of similar features." (Bruce Robertson, "How to Draw Charts & Diagrams", 1988)

"There is a technical difference between a bar chart and a histogram in that the number represented is proportional to the length of bar in the former and the area in the latter. This matters if non-uniform binning is used. Bar charts can be used for qualitative or quantitative data, whereas histograms can only be used for quantitative data, as no meaning can be attached to the width of the bins if the data are qualitative." (Roger J Barlow, "Statistics: A guide to the use of statistical methods in the physical sciences", 1989)

"Using area to encode quantitative information is a poor graphical method. Effects that can be readily perceived in other visualizations are often lost in an encoding by area." (William S Cleveland, "Visualizing Data", 1993)

"Area graphs are generally not used to convey specific values. Instead, they are most frequently used to show trends and relationships, to identify and/or add emphasis to specific information by virtue of the boldness of the shading or color, or to show parts-of-the-whole." (Robert L Harris, "Information Graphics: A Comprehensive Illustrated Reference", 1996) 

"Although in most cases the actual value designated by a bar is determined by the location of the end of the bar, many people associate the length or area of the bar with its value. As long as the scale is linear, starts at zero, is continuous, and the bars are the same width, this presents no problem. When any of these conditions are changed, the potential exists that the graph will be misinterpreted." (Robert L Harris, "Information Graphics: A Comprehensive Illustrated Reference", 1996)

"Grouped area graphs sometimes cause confusion because the viewer cannot determine whether the areas for the data series extend down to the zero axis. […] Grouped area graphs can handle negative values somewhat better than stacked area graphs but they still have the problem of all or portions of data curves being hidden by the data series towards the front." (Robert L Harris, "Information Graphics: A Comprehensive Illustrated Reference", 1996)

"A Venn diagram is a simple representation of the sample space, that is often helpful in seeing 'what is going on'. Usually the sample space is represented by a rectangle, with individual regions within the rectangle representing events. It is often helpful to imagine that the actual areas of the various regions in a Venn diagram are in proportion to the corresponding probabilities. However, there is no need to spend a long time drawing these diagrams - their use is simply as a reminder of what is happening." (Graham Upton & Ian Cook, "Introducing Statistics", 2001)

"This pie chart violates several of the rules suggested by the question posed in the introduction. First, immediacy: the reader has to turn to the legend to find out what the areas represent; and the lack of color makes it very difficult to determine which area belongs to what code. Second, the underlying structure of the data is completely ignored. Third, a tremendous amount of ink is used to display eight simple numbers." (Gerald van Belle, "Statistical Rules of Thumb", 2002)

"Choose scales wisely, as they have a profound influence on the interpretation of graphs. Not all scales require that zero be included, but bar graphs and other graphs where area is judged do require it." (Naomi B Robbins, "Creating More effective Graphs", 2005)

"Areas surrounding data-lines may generate unintentional optical clutter. Strong frames produce melodramatic but content-diminishing visual effects. [...] A good way to assess a display for unintentional optical clutter is to ask 'Do the prominent visual effects convey relevant content?'" (Edward R Tufte, "Beautiful Evidence", 2006)

"The notion of outcomes covering a space is a very useful mental image, as it ties in strongly with the use of Venn diagrams and tables for clarifying the nature of possible events resulting from a trial. There are two important aspects to this. First, when enumerating the various outcomes that comprise an event, the number of (equally. likely) outcomes should correspond, visually, with the area of that part of the diagram represented by the event in question - the greater the probability, the larger the area. Secondly, where events overlap (for example, when rolling a die, consider the two events 'getting an even score' and 'getting a score greater than 2' ), the various regions in the Venn diagram help to clarify the various combinations of events that might occur." (Alan Graham, "Developing Thinking in Statistics", 2006)

"It is important to pay heed to the following detail: a disadvantage of logarithmic diagrams is that a graphical integration is not possible, i.e., the area under the curve (the integral) is of no relevance." (Manfred Drosg, "Dealing with Uncertainties: A Guide to Error Analysis", 2007)

"The data [in tables] should not be so spaced out that it is difficult to follow or so cramped that it looks trapped. Keep columns close together; do not spread them out more than is necessary. If the columns must be spread out to fit a particular area, such as the width of a page, use a graphic device such as a line or screen to guide the reader’s eye across the row." (Dennis K Lieu & Sheryl Sorby, "Visualization, Modeling, and Graphics for Engineering Design", 2009)

"A unimodal histogram that is not symmetric is said to be skewed. If the upper tail of the histogram stretches out much farther than the lower tail, then the distribution of values is positively skewed or right skewed. If, on the other hand, the lower tail is much longer than the upper tail, the histogram is negatively skewed or left skewed." (Roxy Peck et al, "Introduction to Statistics and Data Analysis" 4th Ed., 2012)

"The use of the density scale to construct the histogram ensures that the area of each rectangle in the histogram will be proportional to the corresponding relative frequency. The formula for density can also be used when class widths are equal. However, when the intervals are of equal width, the extra arithmetic required to obtain the densities is unnecessary." (Roxy Peck et al, "Introduction to Statistics and Data Analysis" 4th Ed., 2012)

"Area can also make data seem more tangible or relatable, because physical objects take up space. A circle or a square uses more space than a dot on a screen or paper. There’s less abstraction between visual cue and real world." (Nathan Yau, "Data Points: Visualization That Means Something", 2013)

"One very common problem in data visualization is that encoding numerical variables to area is incredibly popular, but readers can’t translate it back very well." (Robert Grant, "Data Visualization: Charts, Maps and Interactive Graphics", 2019)

📉Graphical Representation: Histograms (Just the Quotes)

"The histogram, with its columns of area proportional to number, like the bar graph, is one of the most classical of statistical graphs. Its combination with a fitted bell-shaped curve has been common since the days when the Gaussian curve entered statistics. Yet as a graphical technique it really performs quite poorly. Who is there among us who can look at a histogram-fitted Gaussian combination and tell us, reliably, whether the fit is excellent, neutral, or poor? Who can tell us, when the fit is poor, of what the poorness consists? Yet these are just the sort of questions that a good graphical technique should answer at least approximately." (John W Tukey, "The Future of Processes of Data Analysis", 1965)

"There is a technical difference between a bar chart and a histogram in that the number represented is proportional to the length of bar in the former and the area in the latter. This matters if non-uniform binning is used. Bar charts can be used for qualitative or quantitative data, whereas histograms can only be used for quantitative data, as no meaning can be attached to the width of the bins if the data are qualitative." (Roger J Barlow, "Statistics: A guide to the use of statistical methods in the physical sciences", 1989)

"90 percent of all problems can be solved by using the techniques of data stratification, histograms, and control charts. Among the causes of nonconformance, only one-fifth or less are attributable to the workers." (Kaoru Ishikawa, The Quality Management Journal Vol. 1, 1993)

"Averages, ranges, and histograms all obscure the time-order for the data. If the time-order for the data shows some sort of definite pattern, then the obscuring of this pattern by the use of averages, ranges, or histograms can mislead the user. Since all data occur in time, virtually all data will have a time-order. In some cases this time-order is the essential context which must be preserved in the presentation." (Donald J Wheeler," Understanding Variation: The Key to Managing Chaos" 2nd Ed., 2000)

"The ordinary histogram is constructed by binning data on a uniform grid. Although this is probably the most widely used statistical graphic, it is one of the more difficult ones to compute. Several problems arise, including choosing the number of bins (bars) and deciding where to place the cutpoints between bars." (Leland Wilkinson, "The Grammar of Graphics" 2nd Ed., 2005)

"The plot tells us the data are granular in the data source, something we could not ascertain with the histogram. There is an important lesson here. Statistics texts and statistical packages that recommend the histogram as the graphical starting point for a data analysis are giving bad advice. The same goes for kernel density estimates. These are appropriate second stages for graphical data analysis. The best starting point for getting a sense of the distribution of a variable is a tally, stem-and-leaf, or a dot plot. A dot plot is a special case of a tally (perhaps best thought of as a delta-neighborhood tally). Once we see that the data are not granular, we may move on to a histogram or kernel density, which smooths the data more than a dot plot." (Leland Wilkinson, "The Grammar of Graphics" 2nd Ed., 2005)

"Use of a histogram should be strictly reserved for continuous numerical data or for data that can be effectively modelled as continuous […]. Unlike bar charts, therefore, the bars of a histogram corresponding to adjacent intervals should not have gaps between them, for obvious reasons." (Alan Graham, "Developing Thinking in Statistics", 2006)

"A histogram consists of the outline of bars of equal width and appropriate length next to each other. By connecting the frequency values at the position of the nominal values (the midpoints of the intervals) with straight lines, a frequency polygon is obtained. Attaching classes with frequency zero at either end makes the area (the integral) under the frequency polygon equal  to that under the histogram." (Manfred Drosg, "Dealing with Uncertainties: A Guide to Error Analysis", 2007)

"Before calculating a confidence interval for a mean, first check that one of the situations just described holds. To determine whether the data are bell-shaped or skewed, and to check for outliers, plot the data using a histogram, dotplot, or stemplot. A boxplot can reveal outliers and will sometimes reveal skewness, but it cannot be used to determine the shape otherwise. The sample mean and median can also be compared to each other. Differences between the mean and the median usually occur if the data are skewed - that is, are much more spread out in one direction than in the other." (Jessica M Utts & Robert F Heckard, "Mind on Statistics", 2007)

"Histograms are powerful in cases where meaningful class breaks can be defined and classes are used to select intervals and groups in the data. However, they often perform poorly when it comes to the visualization of a distribution." (Martin Theus & Simon Urbanek, "Interactive Graphics for Data Analysis: Principles and Examples", 2009) 

"Need to consider outliers as they can affect statistics such as means, standard deviations, and correlations. They can either be explained, deleted, or accommodated (using either robust statistics or obtaining additional data to fill-in). Can be detected by methods such as box plots, scatterplots, histograms or frequency distributions." (Randall E Schumacker & Richard G Lomax, "A Beginner’s Guide to Structural Equation Modeling" 3rd Ed., 2010)

"A histogram for discrete numerical data is a graph of the frequency or relative frequency distribution, and it is similar to the bar chart for categorical data. Each frequency or relative frequency is represented by a rectangle centered over the corresponding value (or range of values) and the area of the rectangle is proportional to the corresponding frequency or relative frequency." (Roxy Peck et al, "Introduction to Statistics and Data Analysis" 4th Ed., 2012)

"A unimodal histogram that is not symmetric is said to be skewed. If the upper tail of the histogram stretches out much farther than the lower tail, then the distribution of values is positively skewed or right skewed. If, on the other hand, the lower tail is much longer than the upper tail, the histogram is negatively skewed or left skewed." (Roxy Peck et al, "Introduction to Statistics and Data Analysis" 4th Ed., 2012)

"Histograms are often mistaken for bar charts but there are important differences. Histograms show distribution through the frequency of quantitative values (y axis) against defined intervals of quantitative values(x axis). By contrast, bar charts facilitate comparison of categorical values. One of the distinguishing features of a histogram is the lack of gaps between the bars [...]" (Andy Kirk, "Data Visualization: A successful design process", 2012)

"The use of the density scale to construct the histogram ensures that the area of each rectangle in the histogram will be proportional to the corresponding relative frequency. The formula for density can also be used when class widths are equal. However, when the intervals are of equal width, the extra arithmetic required to obtain the densities is unnecessary." (Roxy Peck et al, "Introduction to Statistics and Data Analysis" 4th Ed., 2012)

"Histograms and frequency polygons display a schematic of a numeric variable's frequency distribution. These plots can show us the center and spread of a distribution, can be used to judge the skewness, kurtosis, and modicity of a distribution, can be used to search for outliers, and can help us make decisions about the symmetry and normality of a distribution." (Forrest W Young et al, "Visual Statistics: Seeing data with dynamic interactive graphics", 2016)

"A histogram represents the frequency distribution of the data. Histograms are similar to bar charts but group numbers into ranges. Also, a histogram lets you show the frequency distribution of continuous data. This helps in analyzing the distribution (for example, normal or Gaussian), any outliers present in the data, and skewness." (Umesh R Hodeghatta & Umesha Nayak, "Business Analytics Using R: A Practical Approach", 2017)

