04 December 2011

📉Graphical Representations: Dashboards (Just the Quotes)

"The real value of dashboard products lies in their ability to replace hunt‐and‐peck data‐gathering techniques with a tireless, adaptable, information‐flow mechanism. Dashboards transform data repositories into consumable information." (Gregory L Hovis, "Stop Searching for InformationMonitor it with Dashboard Technology," DM Direct, 2002)

"Dashboards and visualization are cognitive tools that improve your 'span of control' over a lot of business data. These tools help people visually identify trends, patterns and anomalies, reason about what they see and help guide them toward effective decisions. As such, these tools need to leverage people's visual capabilities. With the prevalence of scorecards, dashboards and other visualization tools now widely available for business users to review their data, the issue of visual information design is more important than ever." (Richard Brath & Michael Peters, "Dashboard Design: Why Design is Important," DM Direct, 2004)

“Dashboards aren't all that different from some of the other means of presenting information, but when properly designed the single-screen display of integrated and finely tuned data can deliver insight in an especially powerful way.” (Richard Brath & Michael Peters, "Dashboard Design: Why Design is Important," DM Direct, 2004)

"An effective dashboard is the product not of cute gauges, meters, and traffic lights, but rather of informed design: more science than art, more simplicity than dazzle. It is, above all else, about communication." (Stephen Few, "Information Dashboard Design", 2006)

"Most dashboards fail to communicate efficiently and effectively, not because of inadequate technology (at least not primarily), but because of poorly designed implementations. No matter how great the technology, a dashboard's success as a medium of communication is a product of design, a result of a display that speaks clearly and immediately. Dashboards can tap into the tremendous power of visual perception to communicate, but only if those who implement them understand visual perception and apply that understanding through design principles and practices that are aligned with the way people see and think." (Stephen Few, "Information Dashboard Design", 2006) 

"Having a purposeless or poorly performing dashboard is more common than not. This happens when the underlying architecture is not designed properly to support the needs of dashboard interaction. There is an obvious disconnect between the design of the data warehouse and the design of the dashboards. The people who design the data warehouse do not know what the dashboard will do; and the people who design the dashboards do not know how the data warehouse was designed, resulting in a lack of cohesion between the two. A similar disconnect can also exist between the dashboard designer and the business analyst, resulting in a dashboard that may look beautiful and dazzling but brings very little business value." (Nils H Rasmussen et al, "Business Dashboards: A visual catalog for design and deployment", 2009)

"In general, it still holds true that 'there is no such thing as a free lunch'. What this means is that the most advanced dashboard solutions with the most features and flexibility are generally also the technologies that require more setup and more skill sets from the administrators and the end users. In some cases companies 'dumb down' their dashboard application in the initial stages of deployment so as not to scare their users with too many options. Later, when a dashboard culture has developed, they open up more of the functionality." (Nils H Rasmussen et al, "Business Dashboards: A visual catalog for design and deployment", 2009)

"There are myriad questions that we can ask from data today. As such, it’s impossible to write enough reports or design a functioning dashboard that takes into account every conceivable contingency and answers every possible question." (Phil Simon, "The Visual Organization: Data Visualization, Big Data, and the Quest for Better Decisions", 2014)

"A dashboard is like the executive summary of a report. We read executive summaries and skip the body of the report if the summary is more or less in line with our expectations. Trouble is, measurement is never exhaustive. It is only when we dive in that we realize what areas may have been missed." (Sriram Narayan, "Agile IT Organization Design: For Digital Transformation and Continuous Delivery", 2015)

"[…] an overall green status indicator doesn’t mean anything most of the time. All it says is that the things under measurement seem okay. But there always will be many more things not under measurement. To celebrate green indicators is to ignore the unknowns. […] The tendency to roll up metrics into dashboards promotes ignorance of the real situation on the ground. We forget that we only see what is under measurement. We only act when something is not green." (Sriram Narayan, "Agile IT Organization Design: For Digital Transformation and Continuous Delivery", 2015)

"Rolling up fine-grained metrics to create high-level dashboards puts pressure on teams to keep the fine-grained metrics green even when it might not be the best use of their time." (Sriram Narayan, "Agile IT Organization Design: For Digital Transformation and Continuous Delivery", 2015)

"A performance dashboard is a practical tool to improve management effectiveness and efficiency, not just a pretty retrospective picture in an annual report." (Pearl Zhu, "Performance Master: Take a Holistic Approach to Unlock Digital Performance", 2017)

"All human storytellers bring their subjectivity to their narratives. All have bias, and possibly error. Acknowledging and defusing that bias is a vital part of successfully using data stories. By debating a data story collaboratively and subjecting it to critical thinking, organizations can get much higher levels of engagement with data and analytics and impact their decision making much more than with reports and dashboards alone." (James Richardson, 2017)

"Dashboards are a type of multiform visualization used to summarize and monitor data. These are most useful when proxies have been well validated and the task is well understood. This design pattern brings a number of carefully selected attributes together for fast, and often continuous, monitoring - dashboards are often linked to updating data streams. While many allow interactivity for further investigation, they typically do not depend on it. Dashboards are often used for presenting and monitoring data and are typically designed for at-a-glance analysis rather than deep exploration and analysis." (Danyel Fisher & Miriah Meyer, "Making Data Visual", 2018)

"Infographics combine art and science to produce something that is not unlike a dashboard. The main difference from a dashboard is the subjective data and the narrative or story, which enhances the data-driven visual and engages the audience quickly through highlighting the required context." (Travis Murphy, "Infographics Powered by SAS®: Data Visualization Techniques for Business Reporting", 2018)

"Dashboards are collections of several linked visualizations all in one place. The idea is very popular as part of business intelligence: having current data on activity summarized and presented all inone place. One danger of cramming a lot of disparate information into one place is that you will quickly hit information overload. Interactivity and small multiples are definitely worth considering as ways of simplifying the information a reader has to digest in a dashboard. As with so many other visualizations, layering the detail for different readers is valuable." (Robert Grant, "Data Visualization: Charts, Maps and Interactive Graphics", 2019)

"[Dashboards] are popular methods for displaying multiple visualizations and statistical information. Dashboards often take the form of some organizational instrument that offers both at-a-glance and detailed views of many different analytical and information dimensions. Dashboards are not a unique chart type themselves, but rather should be considered compositions that comprise multiple chart types." (Andy Kirk, "Data Visualisation: A Handbook for Data Driven Design" 2nd Ed., 2019)

"Understanding the entire data ecosystem, from the production of a data point to its consumption in a dashboard or a visualization, provides the ability to invoke action, which is more valuable than the mere sum of its parts." (Jesús Barrasa et al, "Knowledge Graphs: Data in Context for Responsive Businesses", 2021)

"A well-designed dashboard needs to provide a similar experience; information cannot be placed just anywhere on the dashboard. Charts that relate to one another are usually positioned close to one another. Important charts often appear larger and more visually prominent than less important ones. In other words, there are natural sizes for how a dashboard comprises charts based on the task and context." (Vidya Setlur & Bridget Cogley, "Functional Aesthetics for data visualization", 2022)

"As we enter into certain types of analytical conversations, we expect the conversations to flow in a predictable and cohesive manner. A KPI dashboard, for example, uses redundant structures across specific dimensions or measures to convey information. A dashboard with a top-down exposition style provides high-level information first and clarifies downward, while a bottom-up dashboard starts with the details and clarifies them against the larger picture." (Vidya Setlur & Bridget Cogley, "Functional Aesthetics for data visualization", 2022)

"Chart choices can also create weight within the entire composition. Presenting information as a comprehensive visualization, such as in a dashboard, requires thinking beyond individual charts. In writing, we not only craft sentences, but write the composition as an entire piece. Certain sentences may drive the writing more, but all sentences play a role in conveying the message." (Vidya Setlur & Bridget Cogley, "Functional Aesthetics for data visualization", 2022)

"The sizes of charts in space reflect how we convey information to a reader. In a dashboard context, the content, size, and space that the various charts occupy should reflect the form and function of the main message. As you saw with the bento box metaphor from the introduction, there needs to be deliberate thought put into the placement and size of each individual chart so that they all work together in harmony." (Vidya Setlur & Bridget Cogley, "Functional Aesthetics for data visualization", 2022)

"When integrating written text with charts in a functionally aesthetic way, the reader should be able to find the key takeaways from the chart or dashboard, taking into account the context, constraints, and reading objectives of the overall message."  (Vidya Setlur & Bridget Cogley, "Functional Aesthetics for data visualization", 2022)

📉Graphical Representation: Information Design (Just the Quotes)

"The ducks of information design are false escapes from flatland, adding pretend dimensions to impoverished data sets, merely fooling around with information." (Edward R Tufte, "Envisioning Information", 1990)

"We envision information in order to reason about, communicate, document, and preserve that knowledge - activities nearly always carried out on two-dimensional paper and computer screen. Escaping this flatland and enriching the density of data displays are the essential tasks of information design." (Edward R Tufte, "Envisioning Information", 1990)

"Good information design is clear thinking made visible, while bad design is stupidity in action." (Edward Tufte, "Visual Explanations" , 1997)

"Dashboards and visualization are cognitive tools that improve your 'span of control' over a lot of business data. These tools help people visually identify trends, patterns and anomalies, reason about what they see and help guide them toward effective decisions. As such, these tools need to leverage people's visual capabilities. With the prevalence of scorecards, dashboards and other visualization tools now widely available for business users to review their data, the issue of visual information design is more important than ever." (Richard Brath & Michael Peters, "Dashboard Design: Why Design is Important," DM Direct, 2004)

"Information design is defined as the art and science of preparing information so that can be used by human beings with efficiency and effectiveness. Its primary objectives are:To develop documents that are comprehensible, rapidly and accurately retrievable, and easy to translate into effective actions [...]" (Sheila Pontis, "La historia de la esquematica en la visualization de datos", 2007)

"I feel that every day, all of us now are being blasted by information design. It's being poured into our eyes through the Web, and we're all visualizers now; we're all demanding a visual aspect to our information. There's something almost quite magical about visual information. It's effortless; it literally pours in." (David McCandless, "The beauty of data visualization", TEDGlobal, 2010) 

"The composing of intelligible patterns from the noise of raw data is a hallmark of a good information designer. The most successful examples extract and present essential relationships in a coherent manner while limiting the obtrusiveness of accessory relationships. Effective results are self-evident whereby the information graphic is absorbed by the mind holistically." (William A Anderson & William M Bevington, "Complications and Adjacencies: An Organizing Logic for Information Graphics", Parsons Journal of Information Mapping Vol. II(3), 2010)

"Information design, when successful - whether in print, on the web, or in the environment - represents the functional balance of the meaning of the information, the skills and inclinations of the designer, and the perceptions, education, experience, and needs of the audience." (Joel Katz, "Designing Information: Human factors and common sense in information design", 2012)

"Successful information design in movement systems gives the user the information he needs - and only the information he needs - at every decision point." (Joel Katz, "Designing Information: Human factors and common sense in information design", 2012) 

"Information design is a design practice concerned with the presentation of information. It is often associated with the activities of data visualization; indeed sometimes it is presented as the major field in which data visualization belongs. Unquestionably, both share an underlying motive to facilitate understanding. However, in my view, information design has a much broader application concerned with the design of many different forms of visual communication, particularly those with an instructional or functional slant, such as way-finding devices like hospital building maps or in the design of utility bills." (Andy Kirk, "Data Visualisation: A Handbook for Data Driven Design" 2nd Ed., 2019)

03 December 2011

📉Graphical Representation: Charts vs. Thousand Words (Just the Quotes)

"The drawing shows me at a glance what would be spread over ten pages in a book." (Ivan Turgenev, 1862) [2]

"Sometimes, half a dozen figures will reveal, as with a lighting-flash, the importance of a subject which ten thousand labored words with the same purpose in view, had left at last but dim and uncertain." (Mark Twain, "Life on the Mississippi", 1883) 

"One good picture is worth many pages of written description." (William Sproston Caine, 1891) [2]

"One look is worth a thousand words" (Kathleen Caffyn, 1903) 

"Use a picture. It's worth a thousand words." (Arthur Brisbane, The Post-Standard, 1911)

"One Look Is Worth A Thousand Words" ([advertisement] 1913)

"A picture is worth ten thousand words. If you can’t see the truth in these pictures you are among the vast majority that must learn only by experience." (Arthur Brisbane, 1915)

"One picture is worth ten thousand words." (Frederick R Barnard, Printer’s Ink, 1921)

"One Picture Worth Ten Thousand Words" ([Chinese proverb] 1927)

"In many instances, a picture is indeed worth a thousand words. To make this true in more diverse circumstances, much more creative effort is needed to pictorialize the output from data analysis. Naive pictures are often extremely helpful, but more sophisticated pictures can be both simple and even more informative." (John W Tukey & Martin B Wilk, "Data Analysis and Statistics: An Expository Overview", 1966)

"Graphic charts are ways of presenting quantitative as well as qualitative information in an efficient and effective visual form. Numbers and ideas presented graphically are often more easily understood. remembered. and integrated than when they are presented in narrative or tabular form. Descriptions. trends. relationships, and comparisons can be made more apparent. Less time is required to present and comprehend information when graphic methods are employed. As the old truism states, 'One picture is worth a thousand words.'" (Robert Lefferts, "Elements of Graphics: How to prepare charts and graphs for effective reports", 1981)

"One word is worth a thousand pictures. If it's the right word." (Edward Abbey, "Beyond the Wall: Essays from the Outside", 1984)

"A picture may be worth a thousand words, a formula is worth a thousand pictures." (Edsger Dijkstra, [conference at ETH Zurich] 1994)

"A magnificent picture is never worth a thousand perfect words." (John Dunning, "The Bookman's Wake", 1995)

"A picture tells a thousand words. But you get a thousand pictures from someone's voice." (Paul Fleischman, "Seek", 2001)

"If a picture is worth a thousand words, a metaphor is worth a thousand pictures." (Daniel H Pink, "A Whole New Mind: Why Right-Brainers Will Rule the Future", 2005)

"The amount of information rendered in a single financial graph is easily equivalent to thousands of words of text or a page-sized table of raw values. A graph illustrates so many characteristics of data in a much smaller space than any other means. Charts also allow us to tell a story in a quick and easy way that words cannot." (Brian Suda, "A Practical Guide to Designing with Data", 2010)

"Visual reports exploit the idea that a picture is worth a thousand words and, in particular, for many tasks a picture is more useful than a large table of numbers." (Stephen G Eick, "Graph Drawing for Data Analytics" [in "Handbook of Graph Drawing and Visualization"] , 2013)

"Graphs can help us interpret data and draw inferences. They can help us see tendencies, patterns, trends, and relationships. A picture can be worth not only a thousand words, but a thousand numbers. However, a graph is essentially descriptive - a picture meant to tell a story. As with any story, bumblers may mangle the punch line and the dishonest may lie." (Gary Smith, "Standard Deviations", 2014)

"The caption should explain what is shown, possibly also giving the data source. Captions should be detailed enough that the graphic can pretty well stand on its own. Longer is usually better than shorter. A picture may be worth a thousand words, but you need at least some words to describe and explain it." (Antony Unwin, "Graphical Data Analysis with R", 2015)

"A picture may be worth a thousand words, but not all pictures are readable, interpretable, meaningful, or relevant." (Kristen Sosulski, "Data Visualization Made Simple: Insights into Becoming Visual", 2018)

"A recurring theme in machine learning is combining predictions across multiple models. There are techniques called bagging and boosting which seek to tweak the data and fit many estimates to it. Averaging across these can give a better prediction than any one model on its own. But here a serious problem arises: it is then very hard to explain what the model is (often referred to as a 'black box'). It is now a mixture of many, perhaps a thousand or more, models." (Robert Grant, "Data Visualization: Charts, Maps and Interactive Graphics", 2019)

"'A picture is worth a thousand words' is definitely true, and graphs can help you tell a story about your data that would otherwise go untold with only numerical summaries and statistics. While inferential statistics and effect size measures can help us draw relatively reliable conclusions from our data, graphs and visualizations can help make the scientific findings accessible to virtually anyone, even with minimal coursework in statistics or data science." (Daniel J Denis, "Univariate, Bivariate, and Multivariate Statistics Using R: Quantitative Tools for Data Analysis and Data Science", 2020)

"Although a picture may be worth a thousand words, a single static picture is in most cases insufficient for a valid analysis and for understanding of a complex subject. It is usual that an analyst needs to see different aspects or parts of data and look at the data from different perspectives. This means that the analyst needs to interact with the data and with the system that generates visual displays of the data: select data components and subsets for viewing, select and tune visualization techniques, transform the views, transform the data, and so on." (Natalia Andrienko et al, "Visual Analytics for Data Scientists", 2020)

"A picture really can be worth a thousand words, and human beings are adept at extracting useful information from visual presentations. Modern data analysis increasingly relies on graphical presentations to uncover meaning and convey results." (Robert I Kabacoff, "R in Action: Data analysis and graphics with R and Tidyverse", 2022)

"A good metaphor is worth a thousand pictures." (Anon) 

"As the Chinese say, 1001 words is worth more than a picture." (John McCarthy [source]) 

References:
[1] Wikipedia (2024) A picture is worth a thousand words [link]
[2] Quote Investigator (2022) A Picture Is Worth Ten Thousand Words [link]


💠SQL Server: Window Functions [new feature]

Introduction

     In the past, in the absence of or in parallel with other techniques, aggregate functions proved quite useful for solving several types of problems, such as retrieving the first/last record or displaying details together with averages and other aggregates. Typically their use involves one or more joins between a dataset and an aggregation based on the same dataset or a subset of it. An aggregation can involve one or more columns that make the object of analysis. Sometimes multiple such aggregations are needed, based on different sets of columns, and each of them involves at least one join. Such queries can become quite complex, though this was the price to pay in order to solve such problems.

Partitions

     The introduction of analytic functions in Oracle and of window functions, a similar concept, in SQL Server allowed approaching such problems from a different, simplified perspective. Central to this feature is the partition (of a dataset), its meaning being the same as that of the mathematical partition of a set, defined as a division of a set into non-overlapping and non-empty parts that cover the whole initial set. The partition is not necessarily something new, as the columns used in a GROUP BY clause (implicitly) determine a partition of a dataset. The difference with analytic/window functions is that the partition is defined explicitly, inline, together with a ranking or aggregate function evaluated within each partition. If the concept of partition is difficult to grasp, let’s look at the result set for two Products (the examples are based on the AdventureWorks database):
 
-- Price Details for 2 Products 
SELECT A.ProductID  
, A.StartDate 
, A.EndDate 
, A.StandardCost  
FROM [Production].[ProductCostHistory] A 
WHERE A.ProductID IN (707, 708) 
ORDER BY A.ProductID 
, A.StartDate 

window function - details

   In this case a partition is “created” based on the first Product (ProductID = 707), while a second partition is based on the second Product (ProductID = 708). As a side note, another partitioning could be created based on ProductID and StartDate; considering that the two attributes form a key in the table, this splits the dataset into partitions of exactly one record each.
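
   The two partitionings mentioned above can be made visible by counting the records in each partition. The query below is only a sketch in the style of the other examples: it assumes the same AdventureWorks table, and the column aliases RecordsPerProduct and RecordsPerKey are illustrative names, not part of the database.

```sql
-- Partition sizes for 2 Products (sketch; assumes the AdventureWorks database)
SELECT A.ProductID
, A.StartDate
, A.StandardCost
, COUNT(*) OVER(PARTITION BY A.ProductID) RecordsPerProduct -- number of records in the ProductID partition
, COUNT(*) OVER(PARTITION BY A.ProductID, A.StartDate) RecordsPerKey -- 1 when (ProductID, StartDate) is a key
FROM [Production].[ProductCostHistory] A
WHERE A.ProductID IN (707, 708)
ORDER BY A.ProductID
, A.StartDate
```

   If ProductID and StartDate really form a key, RecordsPerKey should be 1 on every row, while RecordsPerProduct repeats the size of the Product's partition on each of its records.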

Details and Averages

     In order to exemplify the use of simple versus window aggregate functions, let’s consider a problem in which the Standard Cost details need to be displayed together with the Average Standard Cost for each ProductID. When a GROUP BY clause is applied in order to retrieve the Average Standard Cost, the query takes the form:

-- Average Price for 2 Products 
SELECT A.ProductID  
, AVG(A.StandardCost) AverageStandardCost 
FROM [Production].[ProductCostHistory] A 
WHERE A.ProductID IN (707, 708) 
GROUP BY A.ProductID
ORDER BY A.ProductID

window function - GROUP BY 

    In order to retrieve the details together with the average, the query above can be joined back to the detail records with the help of a JOIN as follows:

-- Price Details with Average Price for 2 Products - using JOINs 
SELECT A.ProductID  
, A.StartDate 
, A.EndDate 
, A.StandardCost 
, B.AverageStandardCost 
, A.StandardCost - B.AverageStandardCost DiffStandardCost 
FROM [Production].[ProductCostHistory] A
    JOIN ( -- average price
        SELECT A.ProductID
        , AVG(A.StandardCost) AverageStandardCost
        FROM [Production].[ProductCostHistory] A
        WHERE A.ProductID IN (707, 708)
        GROUP BY A.ProductID
    ) B
      ON A.ProductID = B.ProductID
WHERE A.ProductID IN (707, 708)
ORDER BY A.ProductID
, A.StartDate

 window function - Average Price JOIN   

    As pointed out above, the partition is defined by ProductID. The same query written with window functions becomes:

-- Price Details with Average Price for 2 Products - using AVG window function 
SELECT A.ProductID  
, A.StartDate 
, A.EndDate 
, A.StandardCost 
, AVG(A.StandardCost) OVER(PARTITION BY A.ProductID) AverageStandardCost 
, A.StandardCost - AVG(A.StandardCost) OVER(PARTITION BY A.ProductID) DiffStandardCost 
FROM [Production].[ProductCostHistory] A 
WHERE A.ProductID IN (707, 708) 
ORDER BY A.ProductID 
, A.StartDate 

window function - Average Price WF

    As can be seen in the second example, the AVG function is defined using the OVER clause with ProductID as partition. Moreover, the function is used in a formula to calculate the difference between the Standard Cost and its average (DiffStandardCost). More complex formulas can be written by making use of multiple window functions.
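
    As an illustration of combining several window functions in one formula, the sketch below adds the minimum, the maximum, their range and the percentage deviation from the average, all evaluated over the same ProductID partition. It assumes the same AdventureWorks table; the aliases and the use of NULLIF to guard against a zero average are choices made for this example only.

```sql
-- Price Details with several window aggregates for 2 Products (sketch; assumes AdventureWorks)
SELECT A.ProductID
, A.StartDate
, A.StandardCost
, MIN(A.StandardCost) OVER(PARTITION BY A.ProductID) MinStandardCost
, MAX(A.StandardCost) OVER(PARTITION BY A.ProductID) MaxStandardCost
-- range of the cost within the partition
, MAX(A.StandardCost) OVER(PARTITION BY A.ProductID)
  - MIN(A.StandardCost) OVER(PARTITION BY A.ProductID) RangeStandardCost
-- deviation from the average in percent; NULLIF avoids a division by zero
, 100 * (A.StandardCost - AVG(A.StandardCost) OVER(PARTITION BY A.ProductID))
  / NULLIF(AVG(A.StandardCost) OVER(PARTITION BY A.ProductID), 0) PctDiffStandardCost
FROM [Production].[ProductCostHistory] A
WHERE A.ProductID IN (707, 708)
ORDER BY A.ProductID
, A.StartDate
```

    Written with joins, each of these aggregates would need its own column in the derived table, and the formulas would have to reference that table instead of being defined inline.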

The Last Record

     Let’s consider the problem of retrieving the nth record. Because with aggregate functions it is easier to retrieve the first or last record, let’s suppose that the last Standard Cost needs to be retrieved for each ProductID. The MAX aggregate function retrieves the greatest Start Date, which in turn helps to retrieve the record containing the last Standard Cost.

-- Last Price Details for 2 Products - using JOINs 
SELECT A.ProductID  
, A.StartDate 
, A.EndDate 
, A.StandardCost 
FROM [Production].[ProductCostHistory] A
    JOIN ( -- last start date
        SELECT A.ProductID
        , MAX(A.StartDate) LastStartDate
        FROM [Production].[ProductCostHistory] A
        WHERE A.ProductID IN (707, 708)
        GROUP BY A.ProductID
    ) B
      ON A.ProductID = B.ProductID
     AND A.StartDate = B.LastStartDate
WHERE A.ProductID IN (707, 708)
ORDER BY A.ProductID
, A.StartDate

window function - Last Price JOIN  

With window functions the query can be rewritten as follows:

-- Last Price Details for 2 Products - using RANK window function
SELECT * 
FROM (-- ordered prices      
    SELECT A.ProductID      
    , A.StartDate      
    , A.EndDate      
    , A.StandardCost      
    , RANK() OVER(PARTITION BY A.ProductID ORDER BY A.StartDate DESC) Ranking      
    FROM [Production].[ProductCostHistory] A     
    WHERE A.ProductID IN (707, 708) 
  ) A 
WHERE Ranking = 1 
ORDER BY A.ProductID 
, A.StartDate 

window function - Last Price WF  

   As can be seen, the RANK function was used in order to retrieve the last Standard Cost, the records being ordered descending by StartDate within each partition. Thus, the last Standard Cost is always positioned on the first record. Because window functions can’t be used in WHERE clauses, the initial logic has to be encapsulated in a subquery. Similarly the first Standard Cost could be retrieved, this time ordering ascending by StartDate. The last query can be easily modified to retrieve the nth record or the first/last n records, which can prove to be more difficult with simple aggregate functions.
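
   A possible variation for the nth record is sketched below. It swaps RANK for ROW_NUMBER, which numbers the records 1, 2, 3, ... without gaps even when the ordering values tie (with ProductID and StartDate as key the two functions return the same numbers here). The variable @n is a hypothetical parameter introduced for this example, and the AdventureWorks table is assumed as before.

```sql
-- Nth Price Details for 2 Products - using ROW_NUMBER window function
-- (sketch; @n is a hypothetical parameter, AdventureWorks database assumed)
DECLARE @n int
SET @n = 2 -- 1 = last price, 2 = the one before it, and so on

SELECT A.ProductID
, A.StartDate
, A.EndDate
, A.StandardCost
FROM ( -- prices numbered from the most recent one
    SELECT A.ProductID
    , A.StartDate
    , A.EndDate
    , A.StandardCost
    , ROW_NUMBER() OVER(PARTITION BY A.ProductID ORDER BY A.StartDate DESC) RowNumber
    FROM [Production].[ProductCostHistory] A
    WHERE A.ProductID IN (707, 708)
  ) A
WHERE A.RowNumber = @n -- use <= @n to retrieve the last n prices instead
ORDER BY A.ProductID
, A.StartDate
```

   Products with fewer than @n records are simply missing from the result. Ordering ascending by StartDate in the OVER clause counts from the first price instead of the last.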

Conclusion

    Without going too deep into details, I have shown above two representative scenarios in which solutions based on aggregate functions can be simplified by using window functions. In theory, window functions provide greater flexibility, but they have their own trade-offs too. In the next posts I will attempt to further detail their use, especially in the context of Statistics.

02 December 2011

📉Graphical Representation: Tables (Just the Quotes)

"Information that is imperfectly acquired, is generally as imperfectly retained; and a man who has carefully investigated a printed table, finds, when done, that he has only a very faint and partial idea of what he has read; and that like a figure imprinted on sand, is soon totally erased and defaced." (William Playfair, "The Commercial and Political Atlas", 1786)

"In the course of executing that design, it occurred to me that tables are by no means a good form for conveying such information. [...] Making an appeal to the eye when proportion and magnitude are concerned is the best and readiest method of conveying a distinct idea." (William Playfair, "The Statistical Brewery", 1801)

"Isolated facts, those that can only be obtained by rough estimate and that require development, can only be presented in memoires; but those that can be presented in a body, with details, and on whose accuracy one can rely, may be expounded in tables." (Emmanuel Duvillard, "Memoire sur le travail du Bureau de statistique", 1806)

"Tables are like cobwebs, like the sieve of Danaides; beautifully reticulated, orderly to look upon, but which will hold no conclusion. Tables are abstractions, and the object a most concrete one, so difficult to read the essence of." (Thomas Carlyle, "Chartism", 1840)

"But law is no explanation of anything; law is simply a generalization, a category of facts. Law is neither a cause, nor a reason, nor a power, nor a coercive force. It is nothing but a general formula, a statistical table." (Florence Nightingale, "Suggestions for Thought", 1860)

"The dominant principle which characterizes my graphic tables and my figurative maps is to make immediately appreciable to the eye, as much as possible, the proportions of numeric results. […] Not only do my maps speak, but even more, they count, they calculate by the eye." (Chatles D Minard, "Des tableaux graphiques et des cartes figuratives", 1862) 

"If statistical graphics, although born just yesterday, extends its reach every day, it is because it replaces long tables of numbers and it allows one not only to embrace at glance the series of phenomena, but also to signal the correspondences or anomalies, to find the causes, to identify the laws." (Émile Cheysson, cca. 1877)

"That the ten digits do not occur with equal frequency must be evident to any one making much use of logarithmic tables, and noticing how much faster the first pages wear out than the last ones." (Simon Newcomb, "Note on the frequencies of the different digits in natural numbers", Amer. J. Math 4, 1881)

"To a very striking degree our culture has become a Statistical culture. Even a person who may never have heard of an index number is affected [...] by [...] of those index numbers which describe the cost of living. It is impossible to understand Psychology, Sociology, Economics, Finance or a Physical Science without some general idea of the meaning of an average, of variation, of concomitance, of sampling, of how to interpret charts and tables." (Carrol D Wright, 1887)

"Getting information from a table is like extracting sunlight from a cucumber." (Arthur B. Farquhar & Henry Farquhar, "Economic and Industrial Delusions", 1891)

"The graphical method has considerable superiority for the exposition of statistical facts over the tabular. A heavy bank of figures is grievously wearisome to the eye, and the popular mind is as incapable of drawing any useful lessons from it as of extracting sunbeams from cucumbers." (Arthur B Farquhar & Henry Farquhar, "Economic and Industrial Delusions", 1891)

"The essential quality of graphic representations is clarity. If the diagram fails to give a clearer impression than the tables of figures it replaces, it is useless. To this end, we will avoid complicating the diagram by including too much data." (Armand Julin, "Summary for a Course of Statistics, General and Applied", 1910)

"Since a table is a collection of certain sets of data, a chart with one curve representing each set of data can be made to take the place of the table. Wherever a chart can be plotted by straight lines, the speed of this is infinitely greater than making out a table, and where the curvilinear law is known, or can be approximated by the use of the empiric law, the speed is but little less." (Allan C Haskell, "How to Make and Use Graphic Charts", 1919)

"Although, the tabular arrangement is the fundamental form for presenting a statistical series, a graphic representation - in a chart or diagram - is often of great aid in the study and reporting of statistical facts. Moreover, sometimes statistical data must be taken, in their sources, from graphic rather than tabular records." (William L Crum et al, "Introduction to Economic Statistics", 1938)

"When numbers in tabular form are taboo and words will not do the work well, as is often the case, there is one answer left: Draw a picture. About the simplest kind of statistical picture or graph, is the line variety. It is very useful for showing trends, something practically everybody is interested in showing or knowing about or spotting or deploring or forecasting." (Darrell Huff, "How to Lie with Statistics", 1954)

"We must emphasize that such terms as 'select at random', 'choose at random', and the like, always mean that some mechanical device, such as coins, cards, dice, or tables of random numbers, is used." (Frederick Mosteller et al, "Principles of Sampling", Journal of the American Statistical Association Vol. 49 (265), 1954)

"A statistical table is the logical listing of related quantitative data in vertical columns and horizontal rows of numbers with sufficient explanatory and qualifying words, phrases and statements in the form of titles, headings and notes to make clear the full meaning of data and their origin." (Alva M Tuttle, "Elementary Business and Economic Statistics", 1957)

"However informative and well designed a statistical table may be, as a medium for conveying to the reader an immediate and clear impression of its content, it is inferior to a good chart or graph. Many people are incapable of comprehending large masses of information presented in tabular form; the figures merely confuse them. Furthermore, many such people are unwilling to make the effort to grasp the meaning of such data. Graphs and charts come into their own as a means of conveying information in easily comprehensible form." (Alfred R Ilersic, "Statistics", 1959)

"All the evidence obtained from the reproduction of the study mentioned here indicates that the graphic method is 'better' than the tabular. Tables, since graphs are based on them, are necessary, but they are like background rocks, heavy and uninteresting. Graphs, on the other hand, spice the reports; clarify them, and make them interesting and palatable." (Karl M Dallenbach, 1963)

"The statistician has no magic touch by which he may come in at the stage of tabulation and make something of nothing. Neither will his advice, however wise in the early stages of a study, ensure successful execution and conclusion. Many a study, launched on the ways of elegant statistical design, later boggled in execution, ends up with results to which the theory of probability can contribute little." (W Edwards Deming, "Principles of Professional Statistical Practice", Annals of Mathematical Statistics, 36(6), 1965)

"The problem that still remains to be solved is that of the orderable matrix, that needs the use of imagination […] When the two components of a data table are orderable, the normal construction is the orderable matrix. Its permutations show the analogy and the complementary nature that exist between the algorithmic treatments and the graphical treatments." (Jacques Bertin, "Semiology of graphics" ["Semiologie Graphique"], 1967)

"A statistical table is a systematic arrangement of numerical data in columns and rows. Its purpose is to show quantitative facts clearly, concisely, and effectively. It should facilitate an understanding of the logical relationships among the numbers presented. Tables are used in the compilation of raw data, in the summarizing and analytic processes, and in the presentation of statistics in final form. A good table is the product of careful thinking and hard work. It is not just a package of figures put into neat compartments and ruled to make it look more attractive. It contains carefully selected data put together with thought and ingenuity to serve a specific purpose." (Peter H Selby, "Interpreting Graphs and Tables", 1976)

"Tables are [...] the backbone of most statistical reports. They provide the basic substance and foundation on which conclusions can be based. They are considered valuable for the following reasons: (1) Clarity - they present many items of data in an orderly and organized way. (2) Comprehension - they make it possible to compare many figures quickly. (3) Explicitness - they provide actual numbers which document data presented in accompanying text and charts. (4) Economy - they save space, and words. (5) Convenience - they offer easy and rapid access to desired items of information." (Peter H Selby, "Interpreting Graphs and Tables", 1976)

"We would wish ‘numerate’ to imply the possession of two attributes. The first of these is an ‘at-homeness’ with numbers and an ability to make use of mathematical skills which enable an individual to cope with the practical mathematical demands of his everyday life. The second is ability to have some appreciation and understanding of information which is presented in mathematical terms, for instance in graphs, charts or tables or by reference to percentage increase or decrease." (Cockcroft Committee, "Mathematics Counts: A Report into the Teaching of Mathematics in Schools", 1982)

"The basic principle which should be observed in designing tables is that of grouping related data, either by the use of space or, if necessary, rules. Items which are close together will be seen as being more closely related than items which are farther apart, and the judicious use of space is therefore vitally important. Similarly, ruled lines can be used to relate and divide information, and it is important to be sure which function is required. Rules should not be used to create closed compartments; this is time-wasting and it interferes with scanning." (Linda Reynolds & Doig Simmonds, "Presentation of Data in Science" 4th Ed, 1984)

"We are not saying that the primary purpose of a graph is to convey numbers with as many decimal places as possible. We agree with Ehrenberg (1975) that if this were the only goal, tables would be better. The power of a graph is its ability to enable one to take in the quantitative information, organize it, and see patterns and structure not readily revealed by other means of studying the data." (William Cleveland & Robert McGill, "Graphical Perception: Theory, Experimentation, and Application to the Development of Graphical Models", Journal of the American Statistical Association 79, 1984)

"The ease and speed with which tables can be understood depends very much on the tabulation logic. The author must ask himself what information the reader already has when he consults a particular table, and what information he is seeking from it. The row and column headings should relate to the information he already has, thus leading him to the information he seeks which is displayed in the body of the table." (Linda Reynolds & Doig Simmonds, "Presentation of Data in Science" 4th Ed, 1984)

"Wherever possible, numerical tables should be explicit rather than implicit, i.e. the information should be given in full. In an implicit table, the reader may be required to add together two values in order to obtain a third which is not explicitly stated in the table. […] Implicit tables save space, but require more effort on the part of the reader and may cause confusion and errors. They are particularly unsuitable for slides and other transient displays." (Linda Reynolds & Doig Simmonds, "Presentation of Data in Science" 4th Ed, 1984)

"This is why a 'web' of notes with links (like references) between them is far more useful than a fixed hierarchical system. When describing a complex system, many people resort to diagrams with circles and arrows. Circles and arrows leave one free to describe the interrelationships between things in a way that tables, for example, do not. The system we need is like a diagram of circles and arrows, where circles and arrows can stand for anything." (Tim Berners-Lee, "Information Management: A Proposal", 1989)

"A good way to evaluate a model is to look at a visual representation of it. After all, what is easier to understand - a table full of mathematical relationships or a graphic displaying a decision tree with all of its splits and branches?" (Seth Paul et al. "Preparing and Mining Data with Microsoft SQL Server 2000 and Analysis", 2002)

"Computers are able to multiply useless images without taking into account that, by definition, every graphic corresponds to a table. This table allows you to think about three basic questions that go from the particular to the general level. When this last one receives an answer, you have answers for all of them. Understanding means accessing the general level and discovering significant grouping (patterns). Consequently, the function of a graphic is answering the three following questions:
Which are the X,Y, Z components of the data table? (What it’s all about?)
What are the groups in X, in Y that Z builds? (What the information at the general level is?)
What are the exceptions?

These questions can be applied to every kind of problem. They measure the usefulness of whatever construction or graphical invention allowing you to avoid useless graphics." (Jacques Bertin, [interview] 2003)

"Graphs are for the forest and tables are for the trees. Graphs give you the big picture and show you the trends; tables give you the details." (Naomi B Robbins, "Creating More Effective Graphs", 2005)

"What distinguishes data tables from graphics is explicit comparison and the data selection that this requires. While a data table obviously also selects information, this selection is less focused than a chart's on a particular comparison. To the extent that some figures in a table are visually emphasised, say in colour or size and style of print, the table is well on its way to becoming a chart. If you're making no comparisons - because you have no particular message and so need no selection (in other words, if you are simply providing a database, number quarry or recycling facility) - tables are easier to use than charts." (Nicholas Strange, "Smoke and Mirrors: How to bend facts and figures to your advantage", 2007)

"Data visualization [...] expresses the idea that it involves more than just representing data in a graphical form (instead of using a table). The information behind the data should also be revealed in a good display; the graphic should aid readers or viewers in seeing the structure in the data. The term data visualization is related to the new field of information visualization. This includes visualization of all kinds of information, not just of data, and is closely associated with research by computer scientists." (Antony Unwin et al, "Introduction" [in "Handbook of Data Visualization"], 2008) 

"Plotting data is a useful first stage to any analysis and will show extreme observations together with any discernible patterns. In addition the relative sizes of categories are easier to see in a diagram (bar chart or pie chart) than in a table. Graphs are useful as they can be assimilated quickly, and are particularly helpful when presenting information to an audience. Tables can be useful for displaying information about many variables at once, while graphs can be useful for showing multiple observations on groups or individuals. Although there are no hard and fast rules about when to use a graph and when to use a table, in the context of a report or a paper it is often best to use tables so that the reader can scrutinise the numbers directly." (Jenny Freeman et al, "How to Display Data", 2008)

"When displaying information visually, there are three questions one will find useful to ask as a starting point. Firstly and most importantly, it is vital to have a clear idea about what is to be displayed; for example, is it important to demonstrate that two sets of data have different distributions or that they have different mean values? Having decided what the main message is, the next step is to examine the methods available and to select an appropriate one. Finally, once the chart or table has been constructed, it is worth reflecting upon whether what has been produced truly reflects the intended message. If not, then refine the display until satisfied; for example if a chart has been used would a table have been better or vice versa?" (Jenny Freeman et al, "How to Display Data", 2008)

"Tables work in a variety of situations because they convey large amounts of data in a condensed fashion. Use tables in the following situations: (1) to structure data so the reader can easily pick out the information desired, (2) to display in a chart when the data contains too many variables or values, and (3) to display exact values that are more important than a visual moment in time." (Dennis K Lieu & Sheryl Sorby, "Visualization, Modeling, and Graphics for Engineering Design", 2009)

"The data [in tables] should not be so spaced out that it is difficult to follow or so cramped that it looks trapped. Keep columns close together; do not spread them out more than is necessary. If the columns must be spread out to fit a particular area, such as the width of a page, use a graphic device such as a line or screen to guide the reader’s eye across the row." (Dennis K Lieu & Sheryl Sorby, "Visualization, Modeling, and Graphics for Engineering Design", 2009)

"By giving numbers a proper shape, by visually encoding them, the graphic has saved you time and energy that you would otherwise waste if you had to use a table that was not designed to aid your mind." (Alberto Cairo, "The Functional Art", 2011)

"A common mistake is that all visualization must be simple, but this skips a step. You should actually design graphics that lend clarity, and that clarity can make a chart 'simple' to read. However, sometimes a dataset is complex, so the visualization must be complex. The visualization might still work if it provides useful insights that you wouldn’t get from a spreadsheet. […] Sometimes a table is better. Sometimes it’s better to show numbers instead of abstract them with shapes. Sometimes you have a lot of data, and it makes more sense to visualize a simple aggregate than it does to show every data point." (Nathan Yau, "Data Points: Visualization That Means Something", 2013)

"With fast computers and plentiful data, finding statistical significance is trivial. If you look hard enough, it can even be found in tables of random numbers." (Gary Smith, "Standard Deviations", 2014)

"One thing to keep in mind with a table is that you want the design to fade into the background, letting the data take center stage. Don’t let heavy borders or shading compete for attention. Instead, think of using light borders or simply white space to set apart elements of the table." (Cole N Knaflic, "Storytelling with Data: A Data Visualization Guide for Business Professionals", 2015)

"[...] tables interact with our verbal system, graphs interact with our visual system, which is faster at processing information." (Cole N Knaflic, "Storytelling with Data: A Data Visualization Guide for Business Professionals", 2015)

"Using a table in a live presentation is rarely a good idea. As your audience reads it, you lose their ears and attention to make your point verbally." (Cole N Knaflic, "Storytelling with Data: A Data Visualization Guide for Business Professionals", 2015)

"A useful way to think about tables and graphics is to visualize layers. Just as photographic files may be manipulated in photo editing software using layers, data presentations are constructed by imagining that layers of an image are placed one on top of another. There are three general layers that apply to visual data presentations: (a) a frame that is typically a rectangle or matrix, (b) axes and coordinate systems (for graphics), and (c) data presented as numbers or geometric objects." (John Hoffmann, "Principles of Data Management and Presentation", 2017)

"Most of us have difficulty figuring probabilities and statistics in our heads and detecting subtle patterns in complex tables of numbers. We prefer vivid pictures, images, and stories. When making decisions, we tend to overweight such images and stories, compared to statistical information. We also tend to misunderstand or misinterpret graphics." (Daniel J Levitin, "Weaponized Lies", 2017)

"Reference tables show a lot of data with a high degree of precision. They are designed generally to provide users with a way to find particular pieces of data. […] Summary tables provide some type of extraction of data from a reference table or a spreadsheet. The data are usually manipulated, analyzed, or summarized in some way, such as by sorting or providing summary statistics (means, percentages, ranges). The results of statistical models are usually presented in research reports using this type of table." (John Hoffmann, "Principles of Data Management and Presentation", 2017)

"The most accurate but least interpretable form of data presentation is to make a table, showing every single value. But it is difficult or impossible for most people to detect patterns and trends in such data, and so we rely on graphs and charts. Graphs come in two broad types: Either they represent every data point visually (as in a scatter plot) or they implement a form of data reduction in which we summarize the data, looking, for example, only at means or medians." (Daniel J Levitin, "Weaponized Lies", 2017)

"The main differences between Bayesian networks and causal diagrams lie in how they are constructed and the uses to which they are put. A Bayesian network is literally nothing more than a compact representation of a huge probability table. The arrows mean only that the probabilities of child nodes are related to the values of parent nodes by a certain formula (the conditional probability tables) and that this relation is sufficient. That is, knowing additional ancestors of the child will not change the formula. Likewise, a missing arrow between any two nodes means that they are independent, once we know the values of their parents. [...] If, however, the same diagram has been constructed as a causal diagram, then both the thinking that goes into the construction and the interpretation of the final diagram change." (Judea Pearl & Dana Mackenzie, "The Book of Why: The new science of cause and effect", 2018)

"Apart from the technical challenge of working with the data itself, visualization in big data is different because showing the individual observations is just not an option. But visualization is essential here: for analysis to work well, we have to be assured that patterns and errors in the data have been spotted and understood. That is only possible by visualization with big data, because nobody can look over the data in a table or spreadsheet." (Robert Grant, "Data Visualization: Charts, Maps and Interactive Graphics", 2019)

"When visuals are applied to data, they can enlighten the audience to insights that they wouldn’t see without charts or graphs. Many interesting patterns and outliers in the data would remain hidden in the rows and columns of data tables without the help of data visualizations. They connect with our visual nature as human beings and impart knowledge that couldn’t be obtained as easily using other approaches that involve just words or numbers." (Brent Dykes, "Effective Data Storytelling: How to Drive Change with Data, Narrative and Visuals", 2019)

01 December 2011

📉Graphical Representation: Percentages (Just the Quotes)

"[…] statistical literacy. That is, the ability to read diagrams and maps; a 'consumer' understanding of common statistical terms, as average, percent, dispersion, correlation, and index number."  (Douglas Scates, "Statistics: The Mathematics for Social Problems", 1943)

"Percentages offer a fertile field for confusion. And like the ever-impressive decimal they can lend an aura of precision to the inexact. […] Any percentage figure based on a small number of cases is likely to be misleading. It is more informative to give the figure itself. And when the percentage is carried out to decimal places, you begin to run the scale from the silly to the fraudulent." (Darrell Huff, "How to Lie with Statistics", 1954)

"Charts not only tell what was, they tell what is; and a trend from was to is (projected linearly into the will be) contains better percentages than clumsy guessing." (Robert A Levy, "The Relative Strength Concept of Common Stock Forecasting", 1968)

"We would wish ‘numerate’ to imply the possession of two attributes. The first of these is an ‘at-homeness’ with numbers and an ability to make use of mathematical skills which enable an individual to cope with the practical mathematical demands of his everyday life. The second is ability to have some appreciation and understanding of information which is presented in mathematical terms, for instance in graphs, charts or tables or by reference to percentage increase or decrease." (Cockcroft Committee, "Mathematics Counts: A Report into the Teaching of Mathematics in Schools", 1982) 

"The ease with which somewhat complex statistics can produce confusion is important, because we live in a world in which complex numbers are becoming more common. Simple statistical ideas - fractions, percentages, rates - are reasonably well understood by many people. But many social problems involve complex chains of cause and effect that can be understood only through complicated models developed by experts. [...] environment has an influence. Sorting out the interconnected causes of these problems requires relatively complicated statistical ideas - net additions, odds ratios, and the like. If we have an imperfect understanding of these ideas, and if the reporters and other people who relay the statistics to us share our confusion - and they probably do - the chances are good that we'll soon be hearing - and repeating, and perhaps making decisions on the basis of - mutated statistics." (Joel Best, "Damned Lies and Statistics: Untangling Numbers from the Media, Politicians, and Activists", 2001)

"Precision and recall are ways of monitoring the power of the machine learning implementation. Precision is a metric that monitors the percentage of true positives. […] Recall is the ratio of true positives to true positives plus false negatives." (Matthew Kirk, "Thoughtful Machine Learning", 2015)
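The two ratios in the quote can be sketched in a few lines. The recall formula follows the quote directly; the precision denominator (true positives plus false positives) is the standard definition and is an assumption here, since the quote does not spell it out. The counts are hypothetical.

```python
def precision(true_pos: int, false_pos: int) -> float:
    """Share of predicted positives that are actually positive."""
    predicted_pos = true_pos + false_pos
    return true_pos / predicted_pos if predicted_pos else 0.0


def recall(true_pos: int, false_neg: int) -> float:
    """Share of actual positives that the model found."""
    actual_pos = true_pos + false_neg
    return true_pos / actual_pos if actual_pos else 0.0


# Hypothetical confusion-matrix counts, for illustration only.
tp, fp, fn = 40, 10, 20

print(f"precision = {precision(tp, fp):.1%}")  # 40 of 50 predicted positives
print(f"recall    = {recall(tp, fn):.1%}")     # 40 of 60 actual positives
```

Returning 0.0 for an empty denominator is a convenience choice in this sketch; some libraries raise a warning instead.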

"The most ubiquitous graph is the pie chart. It is a staple of the business world. [...] Never use a pie chart. Present a simple list of percentages, or whatever constitutes the divisions of the pie chart." (Gerald van Belle, "Statistical Rules of Thumb", 2002)

"Why does representing information in terms of natural frequencies rather than probabilities or percentages foster insight? For two reasons. First, computational simplicity: The representation does part of the computation. And second, evolutionary and developmental primacy: Our minds are adapted to natural frequencies." (Gerd Gigerenzer, "Calculated Risks: How to know when numbers deceive you", 2002)
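Gigerenzer's first reason, that the representation does part of the computation, can be illustrated with a small sketch. The screening scenario and all of its numbers (a population of 1,000, a 1% base rate, 90% sensitivity, a 9% false-positive rate) are hypothetical and chosen only so the counts come out as whole people.

```python
def posterior_from_probabilities(base_rate, sensitivity, false_positive_rate):
    """Bayes' rule on probabilities: P(condition | positive test)."""
    true_pos = base_rate * sensitivity
    false_pos = (1 - base_rate) * false_positive_rate
    return true_pos / (true_pos + false_pos)


def natural_frequencies(population, base_rate, sensitivity, false_positive_rate):
    """The same question expressed as counts of people."""
    affected = round(population * base_rate)
    unaffected = population - affected
    affected_positive = round(affected * sensitivity)
    unaffected_positive = round(unaffected * false_positive_rate)
    return affected_positive, unaffected_positive


# Hypothetical inputs, for illustration only.
hits, false_alarms = natural_frequencies(1000, 0.01, 0.90, 0.09)
print(f"{hits} of the {hits + false_alarms} people who test positive have the condition")
print(f"as a share: {hits / (hits + false_alarms):.3f}")
print(f"via probabilities: {posterior_from_probabilities(0.01, 0.90, 0.09):.3f}")
```

Once the counts are on the table, the answer is a single division; the probability version needs the full formula to reach roughly the same figure.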

"Numbers are often useful in stories because they record a recent change in some amount, or because they are being compared with other numbers. Percentages, ratios and proportions are often better than raw numbers in establishing a context." (Charles Livingston & Paul Voakes, "Working with Numbers and Statistics: A handbook for journalists", 2005)

"The percentage is one of the best (mathematical) friends a journalist can have, because it quickly puts numbers into context. And it's a context that the vast majority of readers and viewers can comprehend immediately." (Charles Livingston & Paul Voakes, "Working with Numbers and Statistics: A handbook for journalists", 2005)

"Generally pie charts are to be avoided, as they can be difficult to interpret particularly when the number of categories is greater than five. Small proportions can be very hard to discern […] In addition, unless the percentages in each of the individual categories are given as numbers it can be much more difficult to estimate them from a pie chart than from a bar chart […]." (Jenny Freeman et al, "How to Display Data", 2008)

"Another way to obscure the truth is to hide it with relative numbers. […] Relative scales are always given as percentages or proportions. An increase or decrease of a given percentage only tells us part of the story, however. We are missing the anchoring of absolute values." (Brian Suda, "A Practical Guide to Designing with Data", 2010)

"Comparisons are the lifeblood of empirical studies. We can’t determine if a medicine, treatment, policy, or strategy is effective unless we compare it to some alternative. But watch out for superficial comparisons: comparisons of percentage changes in big numbers and small numbers, comparisons of things that have nothing in common except that they increase over time, comparisons of irrelevant data. All of these are like comparing apples to prunes." (Gary Smith, "Standard Deviations", 2014)

"How good the data quality is can be looked at both subjectively and objectively. The subjective component is based on the experience and needs of the stakeholders and can differ by who is being asked to judge it. For example, the data managers may see the data quality as excellent, but consumers may disagree. One way to assess it is to construct a survey for stakeholders and ask them about their perception of the data via a questionnaire. The other component of data quality is objective. Measuring the percentage of missing data elements, the degree of consistency between records, how quickly data can be retrieved on request, and the percentage of incorrect matches on identifiers (same identifier, different social security number, gender, date of birth) are some examples." (Aileen Rothbard, "Quality Issues in the Use of Administrative Data Records", 2015)

"Where there is no natural ordering to the categories it can be helpful to order them by size, as this can help you to pick out any patterns or compare the relative frequencies across groups. As it can be difficult to discern immediately the numbers represented in each of the categories it is good practice to include the number of observations on which the chart is based, together with the percentages in each category." (Jenny Freeman et al, "How to Display Data", 2008)

"Reporting numbers as percentages can obscure important changes in net values. […] Percentage calculations can give strange answers when any of the numbers involved are negative." (Carl T Bergstrom & Jevin D West, "Calling Bullshit: The Art of Skepticism in a Data-Driven World", 2020)
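A minimal sketch of the "strange answers" the quote warns about, using the usual percent-change formula (new minus old, divided by old). The profit figures are hypothetical.

```python
def percent_change(old: float, new: float) -> float:
    """Percent change from old to new; undefined when old is zero."""
    if old == 0:
        raise ValueError("percent change is undefined for a zero baseline")
    return (new - old) / old * 100


# Hypothetical figures, for illustration only.
print(percent_change(100, 150))   # 50.0: a gain of 50 on a positive baseline
print(percent_change(-100, 50))   # -150.0: a swing from loss to profit reads as a decline
print(percent_change(1, 2))       # 100.0: a large percentage, a net change of only 1
```

The second line shows the sign problem with negative baselines; the third shows how a percentage alone hides a small net change.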

"While the individual man is an insoluble puzzle, in the aggregate he becomes a mathematical certainty. You can, for example, never foretell what any one man will be up to, but you can say with precision what an average number will be up to. Individuals vary, but percentages remain constant. So says the statistician." (Sir Arthur C Doyle)

📉Graphical Representation: Dot Plots/Charts (Just the Quotes)

"Dot charts are suggested as replacements for bar charts. The replacements allow more effective visual decoding of the quantitative information and can be used for a wider variety of data sets." (William S. Cleveland, "Graphical Methods for Data Presentation: Full Scale Breaks, Dot Charts, and Multibased Logging", The American Statistician Vol. 38 (4) 1984)

"[...] error bars are more effectively portrayed on dot charts than on bar charts. […] On the bar chart the upper values of the intervals stand out well, but the lower values are visually deemphasized and are not as well perceived as a result of being embedded in the bars. This deemphasis does not occur on the dot chart." (William S. Cleveland, "Graphical Methods for Data Presentation: Full Scale Breaks, Dot Charts, and Multibased Logging", The American Statistician Vol. 38 (4) 1984)

"Pie charts have severe perceptual problems. Experiments in graphical perception have shown that compared with dot charts, they convey information far less reliably. But if you want to display some data, and perceiving the information is not so important, then a pie chart is fine." (Richard Becker & William S Cleveland, "S-Plus Trellis Graphics User's Manual", 1996)

"A bar graph typically presents either averages or frequencies. It is relatively simple to present raw data (in the form of dot plots or box plots). Such plots provide much more information, and they are closer to the original data. If the bar graph categories are linked in some way - for example, doses of treatments - then a line graph will be much more informative. Very complicated bar graphs containing adjacent bars are very difficult to grasp. If the bar graph represents frequencies, and the abscissa values can be ordered, then a line graph will be much more informative and will have substantially reduced chart junk." (Gerald van Belle, "Statistical Rules of Thumb", 2002)

"The plot tells us the data are granular in the data source, something we could not ascertain with the histogram. There is an important lesson here. Statistics texts and statistical packages that recommend the histogram as the graphical starting point for a data analysis are giving bad advice. The same goes for kernel density estimates. These are appropriate second stages for graphical data analysis. The best starting point for getting a sense of the distribution of a variable is a tally, stem-and-leaf, or a dot plot. A dot plot is a special case of a tally (perhaps best thought of as a delta-neighborhood tally). Once we see that the data are not granular, we may move on to a histogram or kernel density, which smooths the data more than a dot plot." (Leland Wilkinson, "The Grammar of Graphics" 2nd Ed., 2005)
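A tally of the kind the quote recommends as a starting point can be sketched as a text dot plot. This simplified version counts exact values only (it does not implement the delta-neighborhood tally the quote mentions), and the function name and sample readings are hypothetical.

```python
from collections import Counter


def text_dot_plot(values) -> str:
    """One row per distinct value, one 'o' per observation."""
    counts = Counter(values)
    rows = [f"{value:>4} | {'o' * counts[value]}" for value in sorted(counts)]
    return "\n".join(rows)


# Hypothetical readings, for illustration only.
readings = [3, 5, 5, 6, 6, 6, 8, 8, 12]
print(text_dot_plot(readings))
```

Because every observation is drawn individually, gaps and repeated values stay visible rather than being absorbed into a histogram bin.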

"Area can also make data seem more tangible or relatable, because physical objects take up space. A circle or a square uses more space than a dot on a screen or paper. There’s less abstraction between visual cue and real world." (Nathan Yau, "Data Points: Visualization That Means Something", 2013)

"Visualization is what happens when you make the jump from raw data to bar graphs, line charts, and dot plots. […] In its most basic form, visualization is simply mapping data to geometry and color. It works because your brain is wired to find patterns, and you can switch back and forth between the visual and the numbers it represents. This is the important bit. You must make sure that the essence of the data isn’t lost in that back and forth between visual and the value it represents because if you can’t map back to the data, the visualization is just a bunch of shapes." (Nathan Yau, "Data Points: Visualization That Means Something", 2013)

"Another word of caution for dot plots that show changes over time. The dot plot is, by definition, a summary chart. It does not show all of the data in the intervening years. If the data between the two dots generally move in the same direction, a dot plot is sufficient. But if the data contain sharp variations year by year, a dot plot will obscure that pattern (as it also does for bar charts)." (Jonathan Schwabish, "Better Data Visualizations: A guide for scholars, researchers, and wonks", 2021)

30 November 2011

📉Graphical Representation: Effectiveness (Just the Quotes)

"Though accurate data and real facts are valuable, when it comes to getting results the manner of presentation is ordinarily more important than the facts themselves. The foundation of an edifice is of vast importance. Still, it is not the foundation but the structure built upon the foundation which gives the result for which the whole work was planned. As the cathedral is to its foundation so is an effective presentation of facts to the data." (Willard C Brinton, "Graphic Methods for Presenting Facts", 1919)

"Graphic charts have often been thought to be tools of those alone who are highly skilled in mathematics, but one needs to have a knowledge of only eighth-grade arithmetic to use intelligently even the logarithmic or ratio chart, which is considered so difficult by those unfamiliar with it. […] If graphic methods are to be most effective, those who are unfamiliar with charts must give some attention to their fundamental structure. Even simple charts may be misinterpreted unless they are thoroughly understood. For instance, one is not likely to read an arithmetic chart correctly unless he also appreciates the significance of a logarithmic chart." (John R Riggleman & Ira N Frisbee, "Business Statistics", 1938)

"Charts and graphs represent an extremely useful and flexible medium for explaining, interpreting, and analyzing numerical facts largely by means of points, lines, areas, and other geometric forms and symbols. They make possible the presentation of quantitative data in a simple, clear, and effective manner and facilitate comparison of values, trends, and relationships. Moreover, charts and graphs possess certain qualities and values lacking in textual and tabular forms of presentation." (Calvin F Schmid, "Handbook of Graphic Presentation", 1954)

"Correct emphasis is basic to effective graphic presentation. Intensity of color is the simplest method of obtaining emphasis. For most reproduction purposes black ink on a white page is most generally used.  Screens, dots and lines can, of course, be effectively used to give a gradation of tone from light grey to solid black. When original charts are the subjects of display presentation, use of colors is limited only by the subject and the emphasis desired." (Anna C Rogers, "Graphic Charts Handbook", 1961)

"Simplicity, accuracy, appropriate size, proper proportion, correct emphasis, and skilled execution - these are the factors that produce the effective chart. To achieve simplicity your chart must be designed with a definite audience in mind, show only essential information. Technical terms should be absent as far as possible. And in case of doubt it is wiser to oversimplify than to make matters unduly complex. Be careful to avoid distortion or misrepresentation. Accuracy in graphics is more a matter of portraying a clear reliable picture than reiterating exact values. Selecting the right scales and employing authoritative titles and legends are as important as precision plotting. The right size of a chart depends on its probable use, its importance, and the amount of detail involved." (Anna C Rogers, "Graphic Charts Handbook", 1961)

"The bar or column chart is the easiest type of graphic to prepare and use in reports. It employs a simple form: four straight lines that are joined to construct a rectangle or oblong box. When the box is shown horizontally it is called a bar; when it is shown vertically it is called a column. [...] The bar chart is an effective way to show comparisons between or among two or more items. It has the added advantage of being easily understood by readers who have little or no background in statistics and who are not accustomed to reading complex tables or charts." (Robert Lefferts, "Elements of Graphics: How to prepare charts and graphs for effective reports", 1981)

"Modern data graphics can do much more than simply substitute for small statistical tables. At their best, graphics are instruments for reasoning about quantitative information. Often the most effective way to describe, explore, and summarize a set of numbers even a very large set - is to look at pictures of those numbers. Furthermore, of all methods for analyzing and communicating statistical information, well-designed data graphics are usually the simplest and at the same time the most powerful." (Edward R Tufte, "The Visual Display of Quantitative Information", 1983)

"The effective communication of information in visual form, whether it be text, tables, graphs, charts or diagrams, requires an understanding of those factors which determine the 'legibility', 'readability' and 'comprehensibility', of the information being presented. By legibility we mean: can the data be clearly seen and easily read? By readability we mean: is the information set out in a logical way so that its structure is clear and it can be easily scanned? By comprehensibility we mean: does the data make sense to the audience for whom it is intended? Is the presentation appropriate for their previous knowledge, their present information needs and their information processing capacities?" (Linda Reynolds & Doig Simmonds, "Presentation of Data in Science" 4th Ed, 1984)

"One graph is more effective than another if its quantitative information can be decoded more quickly or more easily by most observers. […] This definition of effectiveness assumes that the reason we draw graphs is to communicate information - but there are actually many other reasons to draw graphs." (Naomi B Robbins, "Creating More effective Graphs", 2005)

"An effective dashboard is the product not of cute gauges, meters, and traffic lights, but rather of informed design: more science than art, more simplicity than dazzle. It is, above all else, about communication." (Stephen Few, "Information Dashboard Design", 2006)

"Most dashboards fail to communicate efficiently and effectively, not because of inadequate technology (at least not primarily), but because of poorly designed implementations. No matter how great the technology, a dashboard's success as a medium of communication is a product of design, a result of a display that speaks clearly and immediately. Dashboards can tap into the tremendous power of visual perception to communicate, but only if those who implement them understand visual perception and apply that understanding through design principles and practices that are aligned with the way people see and think." (Stephen Few, "Information Dashboard Design", 2006)

"The main goal of data visualization is its ability to visualize data, communicating information clearly and effectively. It doesn’t mean that data visualization needs to look boring to be functional or extremely sophisticated to look beautiful. To convey ideas effectively, both aesthetic form and functionality need to go hand in hand, providing insights into a rather sparse and complex dataset by communicating its key aspects in a more intuitive way. Yet designers often tend to discard the balance between design and function, creating gorgeous data visualizations which fail to serve its main purpose - communicate information." (Vitaly Friedman, "Data Visualization and Infographics", Smashing Magazine, 2008)

"For a visual to qualify as beautiful, it must be aesthetically pleasing, yes, but it must also be novel, informative, and efficient. [...] For a visual to truly be beautiful, it must go beyond merely being a conduit for information and offer some novelty: a fresh look at the data or a format that gives readers a spark of excitement and results in a new level of understanding. Well-understood formats (e.g., scatterplots) may be accessible and effective, but for the most part they no longer have the ability to surprise or delight us. Most often, designs that delight us do so not because they were designed to be novel, but because they were designed to be effective; their novelty is a byproduct of effectively revealing some new insight about the world." (Noah Iliinsky, "On Beauty", [in "Beautiful Visualization"] 2010)

"A graph is considered effective if it conveys the intended information in a way that can be understood quickly and without ambiguity by most consumers." (Sanjay Matange  & Dan Neath, "Statistical Graphics Procedures by Example: Effective Graphs Using SAS", 2011)

"Color can tell us where to look, what to compare and contrast, and it can give us a visual scale of measure. Because color can be so effective, it is often used for multiple purposes in the same graphic - which can create graphics that are dazzling but difficult to interpret. Separating the roles that color can play makes it easier to apply color specifically for encouraging different kinds of visual thinking. [...] Choose colors to draw attention, to label, to show relationships (compare and contrast), or to indicate a visual scale of measure." (Felice C Frankel & Angela H DePace, "Visual Strategies", 2012)

"The process of visual analysis can potentially go on endlessly, with seemingly infinite combinations of variables to explore, especially with the rich opportunities bigger data sets give us. However, by deploying a disciplined and sensible balance between deductive and inductive enquiry you should be able to efficiently and effectively navigate towards the source of the most compelling stories." (Andy Kirk, "Data Visualization: A successful design process", 2012)

"A great infographic leads readers on a visual journey, telling them a story along the way. Powerful infographics are able to capture people’s attention in the first few seconds with a strong title and visual image, and then reel them in to digest the entire message. Infographics have become an effective way to speak for the creator, conveying information and image simultaneously." (Justin Beegel, "Infographics For Dummies", 2014)

"The effectiveness principle dictates that the importance of the attribute should match the salience of the channel; that is, its noticeability. In other words, the most important attributes should be encoded with the most effective channels in order to be most noticeable, and then decreasingly important attributes can be matched with less effective channels. " (Tamara Munzner, "Visualization Analysis and Design", 2014)

"[…] no single visualization is ever quite able to show all of the important aspects of our data at once - there just are not enough visual encoding channels. […] designing effective visualizations to make sense of data is not an art - it is a systematic and repeatable process." (Danyel Fisher & Miriah Meyer, "Making Data Visual", 2018)

"Data storytelling gives your insight the best opportunity to capture attention, be understood, be remembered, and be acted on. An effective data story helps your insight reach its full potential: inspiring others to act and drive change." (Brent Dykes, "Effective Data Storytelling: How to Drive Change with Data, Narrative and Visuals", 2019)

"Understanding language goes hand in hand with the ability to integrate complex contextual information into an effective visualization and being able to converse with the data interactively, a term we call analytical conversation. It also helps us think about ways to create artifacts that support and manage how we converse with machines as we see and understand data."(Vidya Setlur & Bridget Cogley, "Functional Aesthetics for data visualization", 2022)

"Unlike text, visual communication is governed less by an agreed-upon convention between 'writer' and 'reader' than by how our visual systems react to stimuli, often before we’re aware of it. And just as composers use music theory to create music that produces certain predictable effects on an audience, chart makers can use visual perception theory to make more-effective visualizations with similarly predictable effects." (Scott Berinato, "Good Charts : the HBR guide to making smarter, more persuasive data visualizations", 2023)

"Good design isn’t just choosing colors and fonts or coming up with an aesthetic for charts. That’s styling - part of design, but by no means the most important part. Rather, people with design talent develop and execute systems for effective visual communication. They understand how to create and edit visuals to focus an audience and distill ideas." (Scott Berinato, "Good Charts : the HBR guide to making smarter, more persuasive data visualizations", 2023)

"Graphic design is not just about making things look good. It is a powerful combination of form and function that uses visual elements to communicate a message. Form refers to the physical appearance of a design, such as its shape, color, and typography. Function refers to the purpose of a design, such as what it is trying to communicate or achieve. A good graphic design is both visually appealing and functional. It uses the right combination of form and function to communicate its message effectively. Graphic design is also a strategic and thoughtful craft. It requires careful planning and execution to create a design that is both effective and aesthetically pleasing." (Faith Aderemi, "The Essential Graphic Design Handbook", 2024)

"When deeply complex charts work, we find them effective and beautiful, just as we find a symphony beautiful, which is another marvelously complex arrangement of millions of data points that we experience as a coherent whole." (Scott Berinato, "Good Charts : the HBR guide to making smarter, more persuasive data visualizations", 2023)

