Showing posts with label Graphical Representation. Show all posts
Showing posts with label Graphical Representation. Show all posts

27 May 2026

📉Graphical Representation: Nomographs (Just the Quotes)

"The term nomography serves to designate the general study of the graphic representation of equations in any number of variables on a plane surface. Its practical applications consist in the representation of the numerical relations between the variables by calibrated systems (straight lines or curves) constructed once for all and permitting the determination by a single reading of one or more of the variables when the others are given." (Howard G Funkhouser," Historical Development of the Graphical Representation of Statistical Data", 1937) 

"Now the condition that in the intersection chart three straight lines shall meet in a point is identical with the condition that in the corresponding alignment chart three points shall lie on a straight line. This is called the 'principle of duality', and this condition is given in the form of a determinant, known in nomography as the 'basic nomogram determinant', which enables us to plot the three scales of a nomogram, whether they are straight or curved, on squared paper. Whenever it proves possible to transform an equation into the form of a basic nomogram determinant a true nomogram can be drawn, but only too frequently this proves to be impossible and recourse must be had to graphical methods." (Philip Lyle, "The Construction of Nomograms for Use in Statistics: Part I. True and Empirical", Journal of the Royal Statistical Society - Series C (Applied Statistics) Vol. 3 (2), 1954)

"A nomograph of a formula is a graph or diagram composed of lines scaled relatively and placed in such relative positions that the values of the variables are found on a line crossing the scales. The object is to substitute for the labor of computation a simple mechanical operation such as the one previously described. It is easy to read a nomogram with precision because of the few lines. It provides a tabulation of all possible values, enables solutions to be made irrespective of what quantity in the formula is unknown and also enables one to observe instantly the effect of a change, either small or great, in any one of the variables. The principles of such diagrams may be given in a general way and simple nomograms be constructed, but equations with many unknown quantities cannot be solved graphically without higher mathematics." (William C Marshall, "Graphical methods for schools, colleges, statisticians, engineers and executives", 1921)

"Nomograms are graphic devices for representing equations on a plane surface. They are widely used in engineering design and to a lesser extent in the social and physical sciences. Nomograms can be divided into two classes, or distinct graphic formats: (i) Abac: Equation drawn as a graph on Cartesian or logarithmic coordinates. (ii) Alignment chart. Three or more scales arranged so that a straight line joining two known values cuts the third scale to give the required value." (Michael Macdonald-Ross, "Graphics in Texts", Review of Research in Education Vol. 5, 1977)

"Since the chief purpose of the nomogram is to make exact data available for operational use, its chief competitor is the table. Operational tables may break Ehrenberg's two-digit rule, since they are not used to detect general trends but to provide exact data for some operational purpose. The choice  between nomogram and table involves a complex tradeoff among cost, space, convenience, accuracy, and speed. These tradeoff situations provide one good reason why no one graphic format is suitable for all purposes. Of course, there can be good methods (sarisfying solutions) for particular cases." (Michael Macdonald-Ross, "Graphics in Texts", Review of Research in Education Vol. 5, 1977)

"A great virtue of nomograms is that they are usually multivariate, showing relationships among variables in quite complex systems. It is surely helpful to have both an analysis of the underlying equation along with nomogram visualization of the curves generated by the equation. Nomograms show how equations perform. Nomograms remain useful for understanding; their computational use has passed. Computational power is so cheap now, we don’t need look-up tables or nomograms; we can just plug the numbers into the equations and solve." (Edward Tufte, 2002)

"Nomographs are effective ways to graphically calculate various functionally related quantities. Nomographs are really graphical computational devices. They were once used widely in engineering situations when calculating was more laborious than at the present time, and they still can be useful when complex relationships are concerned. In brief, scales are laid out in which the scale intervals and placement of the lines are chosen by well-established procedures. A straight edge can then be used to interconnect independent variables so the corresponding values of dependent variables can be read." (Cheryl Cihon & John K Taylor, "Statistical Techniques for Data Analysis" 2nd. ed., 2005)

"A nomogram not only sheds light on how the effect of one predictor on the probability of response depends on the levels of other factors, but it allows one to quickly estimate the probability of response for individual subjects." (Frank E. Harrell Jr, "Regression Modeling Strategies", 2015)


26 May 2026

📉Graphical Representation: Format (Just the Quotes)

"A graph presents a limited number of figures in a bold and forceful manner. To do this it usually must omit a large number of figures available on the subject. The choice of what graphic format to use is largely a matter of deciding what figures have the greatest significance to the intended reader and what figures he can best afford to skip." (Peter H Selby, "Interpreting Graphs and Tables", 1976)

"Any graphic format can be executed well, or poorly, for a particular purpose. This is often a more significant variable than the choice of format." (Macdonald-Ross, 1977)

"The main benefit of tabular presentation is its compactness; a great deal of data can be put on a single page. Also, even with the two-digit restriction a table presents numbers more exactly than bar or pie charts do. Therefore it seems likely that tables will remain the preferred format for professional users. The great weakness of tables is their abstract nature. A table consists entirely of abstract symbols-words and numbers." (Michael Macdonald-Ross, "Graphics in Texts", Review of Research in Education Vol. 5, 1977)

"Some believe that the vertical bar should be used when comparing similar items for different time periods and the horizontal bar for comparing different items for the same time period. However, most people find the vertical-bar format easier to prepare and read. and a more effective way to show most types of comparisons." (Robert Lefferts, "Elements of Graphics: How to prepare charts and graphs for effective reports", 1981)

"Charts are used to represent quantitative data in a graphic format. A chart visually illustrates relationships between numbers. When creating a chart, keep in mind that the goal is to represent the data in a simplified and appealing way so as not to muddle the message the chart is meant to convey." (Dennis K Lieu & Sheryl Sorby, "Visualization, Modeling, and Graphics for Engineering Design", 2009)

"For a visual to qualify as beautiful, it must be aesthetically pleasing, yes, but it must also be novel, informative, and efficient. [...] For a visual to truly be beautiful, it must go beyond merely being a conduit for information and offer some novelty: a fresh look at the data or a format that gives readers a spark of excitement and results in a new level of understanding. Well-understood formats (e.g., scatterplots) may be accessible and effective, but for the most part they no longer have the ability to surprise or delight us. Most often, designs that delight us do so not because they were designed to be novel, but because they were designed to be effective; their novelty is a byproduct of effectively revealing some new insight about the world." (Noah Iliinsky, "On Beauty", [in "Beautiful Visualization"] 2010)

"The first requirement of a beautiful visualization is that it is novel, fresh, or unique. It is difficult (though not impossible) to achieve the necessary novelty using default formats. In most situations, well-defined formats have well-defined, rational conventions of use: line graphs for continuous data, bar graphs for discrete data, pie graphs for when you are more interested in a pretty picture than conveying knowledge." (Noah Iliinsky, "On Beauty", [in "Beautiful Visualization"] 2010)

"The best visualizations will reveal what is interesting about the specific data set you’re working with. Different data may require different approaches, encodings, or techniques to reveal its interesting aspects. While default visualization formats are a great place to start, and may come with the correct design choices pre-selected, sometimes the data will yield new knowledge when a different visualization approach or format is used." (Noah Iliinsky & Julie Steel, "Designing Data Visualizations", 2011)

"Infographics combine data with design to enable visual learning. This communication process helps deliver complex information in a way that is more quickly and easily understood. [...] In an era of data overload, infographics offer your audience information in a format that is easy to consume and share. [...] A well-placed, self-contained infographic addresses our need to be confident about the content we’re sharing. Infographics relay the gist of your information quickly, increasing the chance for it to be shared and fueling its spread across a wide variety of digital channels." (Mark Smiciklas, "The Power of Infographics: Using Pictures to Communicate and Connect with Your Audiences", 2012)

"Presenting data in a graphical format makes it much easier to see and understand what is happening with the data. Data visualization applies to all phases of the data science process."  (John D Kelleher & Brendan Tierney, "Data Science", 2018)

"There is often no one 'best' visualization, because it depends on context, what your audience already knows, how numerate or scientifically trained they are, what formats and conventions are regarded as standard in the particular field you’re working in, the medium you can use, and so on. It’s also partly scientific and partly artistic, so you get to express your own design style in it, which is what makes it so fascinating." (Robert Grant, "Data Visualization: Charts, Maps and Interactive Graphics", 2019)

"Data is dirty. Let's just get that out there. How is it dirty? In all sorts of ways. Misspelled text values, date format problems, mismatching units, missing values, null values, incompatible geospatial coordinate formats, the list goes on and on." (Ben Jones, "Avoiding Data Pitfalls: How to Steer Clear of Common Blunders When Working with Data and Presenting Analysis and Visualizations", 2020) 

 

25 May 2026

✏️Michael Macdonald-Ross - Collected Quotes

"Any graphic format can be executed well, or poorly, for a particular purpose. This is often a more significant variable than the choice of format." (Macdonald-Ross, 1977)

"Nomograms are graphic devices for representing equations on a plane surface. They are widely used in engineering design and to a lesser extent in the social and physical sciences. Nomograms can be divided into two classes, or distinct graphic formats: (i) Abac: Equation drawn as a graph on Cartesian or logarithmic coordinates. (ii) Alignment chart. Three or more scales arranged so that a straight line joining two known values cuts the third scale to give the required value." (Michael Macdonald-Ross, "Graphics in Texts", Review of Research in Education Vol. 5, 1977)

 "Notations and codes are invented because of the limitations of ordinary language: notations to say things that can hardly be expressed in ordinary language, and codes to hide messages that would otherwise be all too clear. Notations are certainly important for the growth and expression of ideas and hence are of interest to us." (Michael Macdonald-Ross, "Graphics in Texts", Review of Research in Education Vol. 5, 1977)

"Since the chief purpose of the nomogram is to make exact data available for operational use, its chief competitor is the table. Operational tables may break Ehrenberg's two-digit rule, since they are not used to detect general trends but to provide exact data for some operational purpose. The choice  between nomogram and table involves a complex tradeoff among cost, space, convenience, accuracy, and speed. These tradeoff situations provide one good reason why no one graphic format is suitable for all purposes. Of course, there can be good methods (sarisfying solutions) for particular cases." (Michael Macdonald-Ross, "Graphics in Texts", Review of Research in Education Vol. 5, 1977)

"The main benefit of tabular presentation is its compactness; a great deal of data can be put on a single page. Also, even with the two-digit restriction a table presents numbers more exactly than bar or pie charts do. Therefore it seems likely that tables will remain the preferred format for professional users. The great weakness of tables is their abstract nature. A table consists entirely of abstract symbols-words and numbers." (Michael Macdonald-Ross, "Graphics in Texts", Review of Research in Education Vol. 5, 1977)

"The practitioner who designs a graphic device is acting, as we all do, with imperfect knowledge. A graphic device is an artifact, intended to get across a particular idea to some particular readers. There is no way a science of instruction could lay down minutely detailed prescriptions for all conceivable situations. This is simple realism. However, it is possible to put together the knowledge we already have, to improve it, and to make it more easily available. Reliable knowledge applied intelligently will improve the effectiveness of graphic communication." (Michael Macdonald-Ross, "Graphics in Texts", Review of Research in Education Vol. 5, 1977)

"To design a chart or table the designer may need to go back to source documents to check the definition of key terms, the sampling procedures, and so on. This does require some basic familiarity with research methods, and it may be that the training of graphic designers could be improved in this respect." (Michael Macdonald-Ross, "Graphics in Texts", Review of Research in Education Vol. 5, 1977)

"When hundreds of numbers are arrayed in a complex table most people find it difficult to sort out the significant features; indeed, there are many who cannot interpret even the simplest tables. No doubt it would help if such skills were taught in schools, but they are not, and the practicing communicator has to take people as they are. Therefore it is common practice to use charts for a general readership. Bar charts show quantity by length, and Isotype charts show quantity by rows of standard symbols. In effect this reduces the need for abstract cognition and offers the data as a series of visual comparisons (this/that, here/there, now/then)." (Michael Macdonald-Ross, "Graphics in Texts", Review of Research in Education Vol. 5, 1977)

24 May 2026

📉Graphical Representation: Perspectives (Just the Quotes)

"Comparison between circles of different size should be absolutely avoided. It is inexcusable when we have available simple methods of charting so good and so convenient from every point of view as the horizontal bar." (Willard C Brinton, "Graphic Methods for Presenting Facts", 1919)

"In line charts with an arithmetic scale, it is essential to set the base line at zero in order that the correct perspective of the general movement may not be lost. Breaking or leaving off part of the scale leads to misinterpretation, because the trend then shows a disproportionate degree of variation in movement." (Mary E Spear, "Charting Statistics", 1952)

"The information on a plot should be relevant to the goals of the analysis. This means that in choosing graphical methods we should match the capabilities of the methods to our needs in the context of each application. [...] Scatter plots, with the views carefully selected as in draftsman's displays, casement displays, and multiwindow plots, are likely to be more informative. We must be careful, however, not to confuse what is relevant with what we expect or want to find. Often wholly unexpected phenomena constitute our most important findings." (John M Chambers et al, "Graphical Methods for Data Analysis", 1983)

"Making a presentation is a moral act as well as an intellectual activity. The use of corrupt manipulations and blatant rhetorical ploys in a report or presentation - outright lying, flagwaving, personal attacks, setting up phony alternatives, misdirection, jargon-mongering, evading key issues, feigning disinterested objectivity, willful misunderstanding of other points of view - suggests that the presenter lacks both credibility and evidence. To maintain standards of quality, relevance, and integrity for evidence, consumers of presentations should insist that presenters be held intellectually and ethically responsible for what they show and tell. Thus consuming a presentation is also an intellectual and a moral activity." (Edward R Tufte, "Beautiful Evidence", 2006)

"Sorting data is one of the most efficient actions to derive different views of data in order to see the variables from many angles. Sorting is usually not applied to the data itself, but to statistical objects of a plot. We might want to sort the bars in a barchart, the variables in a parallel boxplot or the categories in a boxplot y by x." (Martin Theus & Simon Urbanek, "Interactive Graphics for Data Analysis: Principles and Examples", 2009)

"A beautiful visualization has a clear goal, a message, or a particular perspective on the information that it is designed to convey. Access to this information should be as straightforward as possible, without sacrificing any necessary, relevant complexity. [...] Most importantly, beautiful visualizations reflect the qualities of the data that they represent, explicitly revealing properties and relationships inherent and implicit in the source data. As these properties and relationships become available to the reader, they bring new knowledge, insight, and enjoyment."  (Noah Iliinsky, "On Beauty", [in "Beautiful Visualization"] 2010)

"A persuasive visualization primarily serves the relationship between the designer and the reader. It is useful when the designer wishes to change the reader’s mind about something. It represents a very specific point of view, and advocates a change of opinion or action on the part of the reader. In this category of visualization, the data represented is specifically chosen for the purpose of supporting the designer’s point of view, and is presented carefully so as to convince the reader of same." (Noah Iliinsky & Julie Steel, "Designing Data Visualizations", 2011)

"Processes take place over time and result in change. However, we’re often constrained to depict processes in static graphics, perhaps even a single image. Luckily, a good static graphic can be just as successful, perhaps even more so, than an animation. Giving the reader the ability to see each 'frame' of time can of f er a valuable perspective." (Felice C Frankel & Angela H DePace, "Visual Strategies", 2012)

"Visualization can be appreciated purely from an aesthetic point of view, but it’s most interesting when it’s about data that’s worth looking at. That’s why you start with data, explore it, and then show results rather than start with a visual and try to squeeze a dataset into it. It’s like trying to use a hammer to bang in a bunch of screws. […] Aesthetics isn’t just a shiny veneer that you slap on at the last minute. It represents the thought you put into a visualization, which is tightly coupled with clarity and affects interpretation." (Nathan Yau, "Data Points: Visualization That Means Something", 2013)

"Interactivity is crucial for building vis tools that handle complexity. When datasets are large enough, the limitations of both people and displays preclude just showing everything at once; interaction where user actions cause the view to change is the way forward. Moreover, a single static view can show only one aspect of a dataset. For some combinations of simple datasets and tasks, the user may only need to see a single visual encoding. In contrast, an interactively changing display supports many possible queries. " (Tamara Munzner, "Visualization Analysis and Design", 2014)

"When you are exploring your data, look for alternate views of the data; you just may find a more interesting insight."  (Andy Kriebel & Eva Murray, "#MakeoverMonday: Improving How We Visualize and Analyze Data, One Chart at a Time", 2018)

"First, from an ethos perspective, the success of your data story will be shaped by your own credibility and the trustworthiness of your data. Second, because your data story is based on facts and figures, the logos appeal will be integral to your message. Third, as you weave the data into a convincing narrative, the pathos or emotional appeal makes your message more engaging. Fourth, having a visualized insight at the core of your message adds the telos appeal, as it sharpens the focus and purpose of your communication. Fifth, when you share a relevant data story with the right audience at the right time (kairos), your message can be a powerful catalyst for change." (Brent Dykes, "Effective Data Storytelling: How to Drive Change with Data, Narrative and Visuals", 2019)

"Well-designed data graphics provide readers with deeper and more nuanced perspectives, while promoting the use of quantitative information in understanding the world and making decisions." (Carl T Bergstrom & Jevin D West, "Calling Bullshit: The Art of Skepticism in a Data-Driven World", 2020) 

"Numbers can always yield multiple interpretations, and they may be approached from varied angles. We journalists don’t vary our approaches more often because many of us are sloppy, innumerate, or simply forced to publish stories at a quick pace. That’s why chart readers must remain vigilant. Even the most honest chart creator makes mistakes." (Alberto Cairo, "How Charts Lie", 2019)

"Well-designed data graphics provide readers with deeper and more nuanced perspectives, while promoting the use of quantitative information in understanding the world and making decisions." (Carl T Bergstrom & Jevin D West, "Calling Bullshit: The Art of Skepticism in a Data-Driven World", 2020)


23 May 2026

📉Graphical Representation: Grammar (Just the Quotes)

"Statistical accounts are to be referred to as a dictionary by men of riper years, and by young men as a grammar, to teach them the relations and proportions of different statistical subjects, and to imprint them on the mind at a time when the memory is capable of being impressed in a lasting and durable manner, thereby laying the foundation for accurate and valuable knowledge." (William Playfair, "The Statistical Brewery", 1801)

"The principles of charting and curve plotting are not at all complex, and it is surprising that many business men dodge the simplest charts as though they involved higher mathematics or contained some sort of black magic. [...] The trouble at present is that there are no standards by which graphic presentations can be prepared in accordance with definite rules so that their interpretation by the reader may be both rapid and accurate. It is certain that there will evolve for methods of graphic presentation a few useful and definite rules which will correspond with the rules of grammar for the spoken and written language." (Willard C Brinton, "Graphic Methods for Presenting Facts", 1919) 

"Although mathematical notation undoubtedly possesses parsing rules, they are rather loose, sometimes contradictory, and seldom clearly stated. [...] The proliferation of programming languages shows no more uniformity than mathematics. Nevertheless, programming languages do bring a different perspective. [...] Because of their application to a broad range of topics, their strict grammar, and their strict interpretation, programming languages can provide new insights into mathematical notation." (Kenneth E Iverson, "Math for the Layman", 1999) 

"A grammar of graphics facilitates coordinated activity in a set of relatively autonomous components. This grammar enables us to develop a system in which adding a graphic to a frame (say, a surface) requires no adjustments or changes in definitions other than the simple message 'add this graphic'. Similarly, we can remove graphics, transform scales, permute attributes, and make other alterations without redefining the basic structure."(Leland Wilkinson, "The Grammar of Graphics" 2nd Ed., 2005)

"The grammar of graphics takes us beyond a limited set of charts (words) to an almost unlimited world of graphical forms (statements). The rules of graphics grammar are sometimes mathematical and sometimes aesthetic. Mathematics provides symbolic tools for representing abstractions. Aesthetics, in the original Greek sense, offers principles for relating sensory attributes (color, shape, sound, etc.) to abstractions. In modern usage, aesthetics can also mean taste." (Leland Wilkinson, "The Grammar of Graphics" 2nd Ed., 2005)

"Even if a chart is correctly designed, it may still deceive us because we don’t know how to read it correctly - we can’t grasp its symbols and grammar, so to speak - or we misinterpret its meaning, or both. Contrary to what many people believe, most good charts aren’t simple, pretty illustrations that can be understood easily and intuitively." (Alberto Cairo, "How Charts Lie", 2019)

"Beyond the design of individual charts, the sequence of data visualizations creates grammar within the exposition. Cohesive visualizations follow common narrative structures to fully express their message. Order matters."  (Vidya Setlur & Bridget Cogley, "Functional Aesthetics for data visualization", 2022)

"Cohesion means ideas work together to build a unified whole, which helps conversation interlink in purposeful ways, and the basic parts adhere to grammar." (Vidya Setlur & Bridget Cogley, "Functional Aesthetics for data visualization", 2022)

11 May 2026

✏️Jose Berengueres - Collected Quotes

"[...] a mark of due diligence is to always ask if there is more data." (Jose Berengueres & Marybeth Sandell, "Introduction to Data Visualization & Storytelling: A Guide For The Data Scientist" 2nd. Ed., 2019)

"Any good set of data will offer transparency into the methodology of how the data was gathered. This means paying particular attention to what and how questions are asked in surveys or statements made. A red flag is any use of adverbs and adjectives. They are usually loaded with bias." (Jose Berengueres & Marybeth Sandell, "Introduction to Data Visualization & Storytelling: A Guide For The Data Scientist" 2nd. Ed., 2019)

"Bias not only can be sorted by their point of entry (data, story, narrative) but also by the area they exploit in the cognition system (optical illusions, cultural biases). It is easy to assume that bias is intentional. However, bias can emerge for many reasons. First, bias can be embedded in the data itself, intentionally in the way it is gathered but also accidentally by not realizing what is missing. Second, bias can appear as the story is crafted. Again, this can be intentional by cherry-picking from existing data, or accidental from cases where not enough time is spent exploring all data available (usually due to time pressure). Third, it can be embedded in the narrative itself." (Jose Berengueres & Marybeth Sandell, "Introduction to Data Visualization & Storytelling: A Guide For The Data Scientist" 2nd. Ed., 2019)

"Helping the reader situate the new information into existing frameworks makes the new information easier to assimilate, use and recall." (Jose Berengueres & Marybeth Sandell, "Introduction to Data Visualization & Storytelling: A Guide For The Data Scientist" 2nd. Ed., 2019)

"In broad terms, bias is any systematic error. In other words, a systematic difference between a model and the 'truth' it supposedly represents. In social sciences, bias is judged to be unethical when it is unfair (usually towards a minority)." (Jose Berengueres & Marybeth Sandell, "Introduction to Data Visualization & Storytelling: A Guide For The Data Scientist" 2nd. Ed., 2019)

"Mind the gap is a common strategy to think about differences between categories in the data [...]. Thinking about why the gap exists can help explain the reality that the chart is representing." (Jose Berengueres & Marybeth Sandell, "Introduction to Data Visualization & Storytelling: A Guide For The Data Scientist" 2nd. Ed., 2019)

"Note how the key step to creating meaning (knowledge) is not only to summarize and declutter but to find where the information is most useful and then by linking it to that context (reference framework)." (Jose Berengueres & Marybeth Sandell, "Introduction to Data Visualization & Storytelling: A Guide For The Data Scientist" 2nd. Ed., 2019)

"There is a fundamental difference between circular charts and bar charts. The brain is sensitive to angular change and (by comparison) quite numb to linear change. This is particularly true when considering motion, and sensitivity to small changes. If in your narrative, highlighting minute changes in a variable is important for the story, then circular pie charts (speed needle gauges) are the way to go. If on the contrary, too much attention to change is a distraction, avoid pie charts and needles."(Jose Berengueres & Marybeth Sandell, "Introduction to Data Visualization & Storytelling: A Guide For The Data Scientist" 2nd. Ed., 2019)

"Unfortunately, aesthetically pleasing visuals and a visual that gets the job done do not always coincide." (Jose Berengueres & Marybeth Sandell, "Introduction to Data Visualization & Storytelling: A Guide For The Data Scientist" 2nd. Ed., 2019)

"Unless you are in a preliminary Exploratory Data Analysis (EDA), it is not a good idea to disseminate a chart unless there is a clear why (narrative) for the chart. And even if you produce many charts as a part of an EDA, resist the temptation to show them off." (Jose Berengueres & Marybeth Sandell, "Introduction to Data Visualization & Storytelling: A Guide For The Data Scientist" 2nd. Ed., 2019)

30 July 2025

📊Graphical Representation: Sense-making in Data Visualizations (Part 3: Heuristics)

Graphical Representation Series
Graphical Representation Series
 

Consider the following general heuristics in data visualizations (work in progress):

  • plan design
    • plan page composition
      • text
        • title, subtitles
        • dates 
          • refresh, filters applied
        • parameters applied
        • guidelines/tooltips
        • annotation 
      • navigation
        • main page(s)
        • additional views
        • drill-through
        • zoom in/out
        • next/previous page
        • landing page
      • slicers/selections
        • date-related
          • date range
          • date granularity
        • functional
          • metric
          • comparisons
        • categorical
          • structural relations
      • icons/images
        • company logo
        • button icons
        • background
    • pick a theme
      • choose a layout and color schema
        • use a color palette generator
        • use a focused color schema or restricted palette
        • use consistent and limited color scheme
        • use suggestive icons
          • use one source (with similar design)
        • use formatting standards
    • create a visual hierarchy 
      • use placement, size and color for emphasis
      • organize content around eye movement pattern
      • minimize formatting changes
      • 1 font, 2 weights, 4 sizes
    • plan the design
      • build/use predictable and consistent templates
        • e.g. using Figma
      • use layered design
      • aim for design unity
      • define & use formatting standards
      • check changes
    • GRACEFUL
      • group visuals with white space 
      • right chart type
      • avoid clutter
      • consistent & limited color schema
      • enhanced readability 
      • formatting standard
      • unity of design
      • layered design
  • keep it simple 
    • be predictable and consistent 
    • focus on the message
      • identify the core insights and design around them
      • pick suggestive titles/subtitles
        • use dynamics subtitles
      • align content with the message
    • avoid unnecessary complexity
      • minimize visual clutter
      • remove the unnecessary elements
      • round numbers
    • limit colors and fonts
      • use a restrained color palette (<5 colors)
      • stick to 1-2 fonts 
      • ensure text is legible without zooming
    • aggregate values
      • group similar data points to reduce noise
      • use statistical methods
        • averages, medians, min/max
      • categories when detailed granularity isn’t necessary
    • highlight what matters 
      • e.g. actionable items
      • guide attention to key areas
        • via annotations, arrows, contrasting colors 
        • use conditional formatting
      • do not show only the metrics
        • give context 
      • show trends
        • via sparklines and similar visuals
    • use familiar visuals
      • avoid questionable visuals 
        • e.g. pie charts, gauges
    • avoid distortions
      • preserve proportions
        • scale accurately to reflect data values
        • avoid exaggerated visuals
          • don’t zoom in on axes to dramatize small differences
      • use consistent axes
        • compare data using the same scale and units across charts
        • don't use dual axes or shifting baselines that can mislead viewers
      • avoid manipulative scaling
        • use zero-baseline on bar charts 
        • use logarithmic scales sparingly
    • design for usability
      • intuitive interaction
      • at-a-glance perception
      • use contrast for clarity
      • use familiar patterns
        • use consistent formats the audience already knows
    • design with the audience in mind
      • analytical vs managerial perspectives (e.g. dashboards)
    • use different level of data aggregations
      •  in-depth data exploration 
    • encourage scrutiny
      • give users enough context to assess accuracy
        • provide raw values or links to the source
      • explain anomalies, outliers or notable trends
        • via annotations
    • group related items together
      • helps identify and focus on patterns and other relationships
    • diversify 
      • don't use only one chart type
      • pick the chart that reflects the best the data in the conrext considered
    • show variance 
      • absolute vs relative variance
      • compare data series
      • show contribution to variance
    • use familiar encodings
      • leverage (known) design patterns
    • use intuitive navigation
      • synchronize slicers
    • use tooltips
      • be concise
      • use hover effects
    • use information buttons
      • enhances user interaction and understanding 
        • by providing additional context, asking questions
    • use the full available surface
      • 1080x1920 works usually better 
    • keep standards in mind 
      • e.g. IBCS
  • state the assumptions
    • be explicit
      • clearly state each assumption 
        • instead of leaving it implied
    • contextualize assumptions
      • explain the assumption
        • use evidence, standard practices, or constraints
    • state scope and limitations
      • mention what the assumption includes and excludes
    • tie assumptions to goals & objectives
      • helps to clarify what underlying beliefs are shaping the analysis
      • helps identify whether the visualization achieves its intended purpose 
  • show the data
    • be honest (aka preserve integrity)
      • avoid distortion, bias, or trickery
    • support interpretation
      • provide labels, axes, legends
    • emphasize what's meaningful
      • patterns, trends, outliers, correlations, local/global maxima/minima
  • show what's important 
    • e.g. facts, relationships, flow, similarities, differences, outliers, unknown
    • prioritize and structure the content
      • e.g. show first an overview, what's important
    • make the invisible visible
      • think about what we do not see
    • know your (extended) users/audience
      • who'll use the content, at what level, for that
  • test for readability
    • get (early) feedback
      • have the content reviewed first
        • via peer review, dry run presentation
  • tell the story
    • know the audience and its needs
    • build momentum, expectation
    • don't leave the audience to figure it out
    • show the facts
    • build a narrative
      • show data that support it
      • arrange the visuals in a logical sequence
    • engage the reader
      • ask questions that bridge the gaps
        • e.g. in knowledge, in presentation's flow
      • show the unexpected
      • confirm logical deductions
Previous Post <<||>> Next Post

27 July 2025

📊Graphical Representation: Sense-making in Data Visualizations (Part 2: Guidelines)

Graphical Representation Series
Graphical Representation Series
 

Consider the following best practices in data visualizations (work in progress):

  • avoid poor labeling and annotation practices
    • label data points
      • considering labeling at least the important number of points
        • e.g. starts, ends, local/global minima/minima
        • when labels clutter the chart or there's minimal variation
    • avoid abbreviations
      • unless they are defined clearly upfront, consistent and/or universally understood
      • can hinder understanding
        • abbreviations should help compress content without losing meaning
    • use font types, font sizes, and text orientation that are easy to read
    • avoid stylish design that makes content hard to read
    • avoid redundant information
    • text should never overshadow or distort the actual message or data
      • use neutral, precise wording
  • avoid the use of pre-attentive attributes 
    • aka visual features that our brains process almost instantly
    • color
      • has identity value: used to distinguish one thing from another
        • carries its own connotations
        • gives a visual scale of measure
        • the use of color doesn’t always help
      • hue 
        • refers to the dominant color family of a specific color, being processed by the brain based on the different wavelengths of light
          • allows to differentiate categories
        • use distinct hues to represent different categories
      • intensity (aka brightness)
        • refers to how strong or weak a color appears
      • saturation (aka chroma, intensity) 
        • refers to the purity or vividness of a color
          • as saturation decreases, the color becomes more muted or washed out
          • highly saturated colors have little or no gray in it
          • highly desaturated colors are almost gray, with none of the original colors
        • use high saturation for important elements like outliers, trends, or alerts
        • use low saturation for background elements
      • avoid pure colors that are bright and saturated
        • drive attention to the respective elements 
      • avoid colors that are too similar in tone or saturation
      • avoid colors hard to distinguish for color-blind users
        • e.g. red-green color blindness
          • brown-green, orange-red, blue-purple combinations
          • avoid red-green pairings for status indicators 
            • e.g. success/error
        • e.g. blue-yellow color blindness
          • blue-green, yellow-ping, purple-blue
        • e.g. total color blindness (aka monochromacy)
          • all colors appear as shades of gray
            • ⇒ users must rely entirely on contrast, shape, and texture
      • use icons, labels, or patterns alongside color
      • use tools to test for color issues
      • use colorblind-safe palettes 
      • for sequential or diverging data, use one hue and vary saturation or brightness to show magnitude
      • start with all-gray data elements
        • use color only when it corresponds to differences in data
          • ⇐ helps draw attention to whatever isn’t gray
      • dull and neutral colors give a sense of uniformity
      • can modify/contradict readers' intuitive response
      • choose colors to draw attention, to label, to show relationships 
    • form
      • shape
        • allows to distinguish types of data points and encode information
          • well-shaped data has functional and aesthetic character
        • complex shapes can become more difficult to be perceived
      • size
        • attribute used to encode the magnitude or extent of elements 
        • should be aligned to its probable use, importance, and amount of detail involved
          • larger elements draw more attention
        • its encoding should be meaningful
          • e.g. magnitudes of deviations from the baseline
        • overemphasis can lead to distortions
        • choose a size range that is appropriate for the data
        • avoid using size to represent nominal or categorical data where there's no inherent order to the sizes
      • orientation
        • angled or rotated items stand out.
      • length/width
        • useful in bar charts to show quantity
        • avoid stacked bar graphs
      • curvature
        • curved lines can contrast with straight ones.
      • collinearity
        • alignment can suggest grouping or flow
    • highlighting
    • spatial positioning
      • 2D position
        • placement on axes or grids conveys value 
      • 3D position in 2D space

      • grouping
        • proximity implies relationships.
        • keep columns, respectively bars close together
      • enclosure
        • borders or shaded areas signal clusters.
      • depth (stereoscopic or shading)
        • adds dimensionality
  • avoid graphical features that are purely decorative
    • aka elements that don't affect understanding, structure or usability
    • stylistic embellishments
      • borders/frames
        • ornamental lines or patterns around content
      • background images
        • images used for ambiance, not content
      • drop shadows and gradients
        • enhance depth or style but don’t add meaning.
      • icons without function
        • decorative icons that don’t represent actions or concepts
    • non-informative imagery
      • stock photos
        • generic visuals that aren’t referenced in the text.
      • illustrations
        • added for visual interest, not explanation.
      • mascots or logos
        • when repeated or not tied to specific content.
    • layout elements
      • spacers
        • transparent or blank images used to control layout
        • leave the right amount of 'white' space between chart elements
      • custom bullets or list markers
        • designed for flair, not clarity
      • visual separators
        • lines or shapes that divide sections without conveying hierarchy or meaning
  • avoid bias
    • sampling bias
      • showing data that doesn’t represent the full population
        • avoid cherry-picking data
          • aka selecting only the data that support a particular viewpoint while ignoring others that might contradict it
          • enable users to look at both sets of data and contrast them
          • enable users to navigate the data
        • avoid survivor bias
          • aka focusing only on the data that 'survived' a process and ignoring the data that didn’t
      • use representative data
        • aka the dataset includes all relevant groups
      • check for collection bias
        • avoid data that only comes from one source 
        • avoid data that excludes key demographics
    • cognitive bias
      • mental shortcut that sometimes affect interpretation
        • incl. confirmation bias, framing bias, pattern bias
      • balance visual hierarchies
        • don’t make one group look more important by overemphasizing it
      • show uncertainty
        • by including confidence intervals or error bars to reflect variability
      • separate comparisons
        • when comparing groups, use adjacent charts rather than combining them into one that implies a hierarchy
          • e.g. ethnicities, region
    • visual bias
      • design choices that unintentionally (or intentionally) distort meaning
        • respectively how viewers interpret the data
      • avoid manipulating axes 
        • by truncating y-axis
          • exaggerates differences
        • by changing scale types
          • linear vs. logarithmic
            • a log scale compresses large values and expands small ones, which can flatten exponential growth or make small changes seem more significant
          • uneven intervals
            • using inconsistent spacing between tick marks can distort trends
        • by zooming in/out
          • adjusting the axis to focus on a specific range can highlight or hide variability and eventually obscure the bigger picture
        • by using dual axes
          • if the scales differ too much, it can falsely imply correlation or exaggerate relationships 
        • by distorting the aspect ration
          • stretching or compressing the chart area can visually amplify or flatten trends
            • e.g. a steep slope might look flat if the x-axis is stretched
        • avoid inconsistent scales
        • label axes clearly
        • explain scale choices
      • avoid overemphasis 
        • avoid unnecessary repetition 
          • e.g. of the same graph, of content
        • avoid focusing on outliers, (short-term) trends
        • avoid truncating axes, exaggerating scales
        • avoid manipulating the visual hierarchy 
      • avoid color bias
        • bright colors draw attention unfairly
      • avoid overplotting 
        • too much data obscures patterns
      • avoid clutter
        • creates cognitive friction
          • users struggle to focus on what matters because their attention is pulled in too many directions
          • is about design excess
        • avoid unnecessary or distracting elements 
          • they don’t contribute to understanding the data
      • avoid overloading 
        • attempting to show too much data at once
          • is about data excess
        • overwhelms readers' processing capacity, making it hard to extract insights or spot patterns
    • algorithmic bias 
      • the use of ML or other data processing techniques can reinforce certain aspects (e.g. social inequalities, stereotypes)
      • visualize uncertainty
        • include error bars, confidence intervals, and notes on limitations
      • audit data and algorithms
        • look for bias in inputs, model assumptions and outputs
    • intergroup bias
      • charts tend to reflect or reinforce societal biases
        • e.g. racial or gender disparities
      • use thoughtful ordering, inclusive labeling
      • avoid deficit-based comparisons
  • avoid overcomplicating the visualizations 
    • e.g. by including too much data, details, other elements
  • avoid comparisons across varying dimensions 
    • e.g. (two) circles of different radius, bar charts of different height, column charts of different length, 
    • don't make users compare angles, areas, volumes

21 July 2025

📊Graphical Representation: Sense-making in Data Visualizations (Part 1: An Introduction)

Graphical Representation Series
Graphical Representation Series

Introduction

Creating simple charts or more complex data visualizations may appear trivial for many, though their authors shouldn't forget that readers have different backgrounds, degrees of literacy, many of them not being maybe able to make sense of graphical displays, at least not without some help.

Beginners start with a limited experience and build upon it, then, on the road to mastery, they get acquainted with the many possibilities, a deeper sense is achieved and the choices become a few. Independently of one's experience, there are seldom 'yes' and 'no' answers for the various choices, but everything is a matter of degree that varies with one's experience, available time, audience's expectations, and many more aspects might be considered in time.  

The following questions are intended to expand, respectively narrow down our choices when dealing with data visualizations from a data professional's perspective. The questions are based mainly on [1] though they were extended to include a broader perspective. 

General Questions

Where does the data come from? Is the source reliable, representative (for the whole population in scope)? Is the data source certified? Are yhe data actual? 

Are there better (usable) sources? What's the effort to consider them? Does the data overlap? To what degree? Are there any benefits in merging the data? How much this changes the overall picture? Are the changes (in trends) explainable? 

Was the data collected? How, from where, and using what method? [1] What methodology/approach was used?

What's the dataset about? Can one recognize the data, the (data) entities, respectively the structures behind? How big is the fact table (in terms of rows and columns)? How many dimensions are in scope?

What transformations, calculations or modifications have been applied? What was left out and what's the overall impact?

Any significant assumptions were made? [1] Were the assumptions clearly stated? Are they entitled? Is it more to them? 

Were any transformation applied? Do the transformations change any data characteristics? Were they adequately documented/explained? Do they make sense? Was it something important left out? What's the overall impact?

What criteria were used to include/exclude data from the display? [1] Are the criteria adequately explained/documented? Do they make sense?

Are similar data publicly available? Is it (freely) accessible/usable? To what degree? How much do the datasets overlap? Is there any benefit to analyze/use the respective data? Are the characteristics comparable? To what degree?

Dataviz Questions

What's the title/subtitle of the chart? Is it meaningful for the readers? Does the title reflect the data, respectively the findings adequately? Can it be better formulated? Is it an eye-catcher? Does it meet the expectations? 

What data is shown? Of what type? At what level is the data aggregated? 

What chart (type) is being used? [1] Are the readers familiar with the chart type? Does it needs further introduction/clarifications? Are there better means to represent the data? Does the chart offer the appropriate perspective? Does it make sense to offer different (complementary) perspective(s)? To what degree other perspectives help?

What items of data do the marks represent? What value associations do the attributes represent? [1] Are the marks visible? Are the marks adequately presented (e.g. due to missing data)? 

What range of values are displayed? [1] What approximation the values support? To what degree can the values be rounded without losing meaning?

Is the data categorical, ordinal or continuous? 

Are the axes property chosen/displayed/labeled? Is the scale properly chosen (linear, semilogarithmic, logarithmic), respectively displayed? Do they emphasize, diminish, distort, simplify, or clutter the information? 

What features (shapes, patterns, differences or connections) are observable, interesting or vital for understanding the chart? [1] 

Where are the largest, mid-sized and smallest values? (aka ‘stepped magnitude’ judgements). [1] 

Where lie the most/least values? Where is the average or normal? (aka ‘global comparison’ judgements)” [1] How are the values distributed? Are there any outliers present? Are they explainable? 

What features are expected or unexpected? [1] To what degree are they unexpected?  

What features are important given the subject? [1] 

What shapes and patterns strike readers as being semantically aligned with the subject? [1] 

What is the overall feeling when looking at the final result? Is the chart overcrowded? Can anything be left out/included? 

What colors were used? [1] Are the colors adequately chosen, respectively meaningful? Do they follow the general recommendations?  

What colors, patterns, forms do readers see first? What impressions come next, respectively last longer?  

Are the various elements adequately/intuitively positioned/distinguishable? What's the degree of overlapping/proximity? Do the elements respect an intuitive hierarchy? Do they match readers' expectations, respectively the best practices in scope? Are the deviations entitled? 

Is the space properly used? To what degree? Are there major gaps? 

Know Your Audience

What audience targets the visualization? Which are its characteristics (level of experience with data visualizations; authors, experts or casual attendees)? Are there any accidental attendees? How likely is the audience to pay attention? 

What is audience’s relationship with the subject matter? What knowledge do they have or, conversely, lack about the subject? What assistance might they need to interpret the meaning of the subject? Do they have the capacity to comprehend what it means to them? [1]

Why do the audience wants/needs to understand the topic? Are they familiar, respectively actively interested or more passive? Is it able to grasp the intended meaning? [1] To what degree? What kind of challenges might be involved, of what nature?

What is their motivation? Do they have a direct, expressed need or are they more passive and indifferent? Is it needed a way to persuade them or even seduce them to engage? [1] Can this be done without distorting the data and its meaning(s)?

What are their visualization literacy skill set? Do they require assistance perceiving the chart(s)? Are they sufficiently comfortable with operating features of interactivity? Do they have any visual accessibility issues (e.g. red–green color blindness)? Do they need to be (re)factored into the design? [1]

Reflections

What has been learnt? Has it reinforced or challenged existing knowledge? [1] Was new knowledge gained? How valuable is this knowledge? Can it be reused? In which contexts? 

Do the findings meet one's expectations? To what degree? Were the expectations entitled? On what basis? What's missing? What's gaps' relevance? 

What feelings have been stirred? Has the experience had an impact emotionally? [1] To what degree? Is the impact positive/negative? Is the reaction entitled/explainable? Are there any factors that distorted the reactions? Are they explainable? Do they make sense? 

What does one do with this understanding? Is it just knowledge acquired or something to inspire action (e.g. making a decision or motivating a change in behavior)? [1] How relevant/valuable is the information for us? Can it be used/misused? To what degree? 

Are the data and its representation trustworthy? [1] To what degree?

Previous Post <<||>> Next Post

References:
[1] Andy Kirk, "Data Visualisation: A Handbook for Data Driven Design" 2nd Ed., 2019

03 May 2025

📊Graphical Representation: Graphics We Live By (Part XI: Comparisons Between Data Series)

Graphical Representation Series
Graphical Representation Series

Over the past 10-20 years it became so easy to create data visualizations just by dropping some of the data available into a tool like Excel and providing a visual depiction of it with just a few clicks. In many cases, the first draft, typically provided by default in the tool used, doesn't even need further work as the objective was reached, while in others the creator must have a minimum skillset for making the visualization useful, appealing, or whatever quality is a final requirement for the work in scope. However, the audience might judge the visualization(s) from different perspectives, and there can be a broad audience with different skills in reading, evaluating and understanding the work.

There are many depictions on the web resembling the one below, taken from a LinkedIn post:

Example Chart - Boing vs. Airbus

Even if the visualization is not perfect, it does a fair job in representing the data. Improvements can be made in the areas of labels, the title and positioning of elements, and the color palette used. At least these were the improvements made in the original post. It must be differentiated also between the environment in which the charts are made available, the print format having different characteristics than the ones in business setups. Unfortunately, the requirements of the two are widely confused, probably also because of the overlapping of the mediums used. 

Probably, it's a good idea to always start with the row data (or summaries of it) when the result consists of only a few data points that can be easily displayed in a table like the one below (the feature to round the decimals for integer values should be available soon in Power BI):

Summary Table

Of course, one can calculate more meaningful values like percentages from the total, standard deviations and other values that offer more perspectives into the data. Even if the values adequately reflect the reality, the reader can but wonder about the local and global minimal/maximal values, without talking much about the meaning of data points, which is easily identifiable in a chart. At least in the case of small data sets, using a table in combination with a chart can provide a more complete perspective and different ways of analyzing the data, especially when the navigation is interactive. 

Column and bar charts do a fair job in comparing values over time, though they do use a lot of ink in the process (see D). While they make it easy to compare neighboring values, the rectangles used tend to occupy a lot of space when they are made too wide or too high to cover the empty space within the display (e.g. when just a few values are displayed, space being wasted in the process). As the main downside, it takes a lot of scanning until the reader identifies the overall trends, and the further away the bars are from each other, the more difficult it becomes to do comparisons. 

In theory, line charts are more efficient in representing the above data points, because the marks are usually small and the line thin enough to provide a better data-ink ratio, while one can see a lot at a glance. In Power BI the creator can use different types of interpolation: linear (A), step (B) or smooth (C). In many cases, it might be a good idea to use a linear interpolation, though when there are no or minimal overlapping, it might be worthwhile to explore the other types if interpolation too (and further request feedback from the users):

Linear, Step and Smooth Line Charts

The nearness of values from different series can raise difficulties in identifying adequately the points, respectively delimiting the lines (see B).When the density of values allows it, it makes sense also to include the averages for each data series to reflect the distance between the two data sets. Unfortunately, the chart can get crowded if further data series or summaries are added to the cart(s). 

If the column chart (E) is close to the redesigned chart provided in the original redesign, the other alternatives can provide upon case more value. Stacked column charts (D) allow also to compare the overall quantity by month, area charts (F) tend to use even more color than needed, while water charts (G) allow to compare the difference between data points per time unit. Tornado charts (H) are a variation of bar charts, allowing easier comparing of the size of the bars, while ribbon charts (I) show well the stacking values. 

Alternatives to Line Charts

One should consider changing the subtitle(s) slightly to reflect the chart type when the patterns shown imply a shift in attention or meaning. Upon case, more that one of the above charts can be used within the same report when two or more perspectives are important. Using a complementary perspective can facilitate data's understanding or of identifying certain patterns that aren't easily identifiable otherwise. 

In general, the graphics creators try to use various representational means of facilitating a data set's understanding, though seldom only two series or a small subset of dimensions provide a complete description. The value of data comes when multiple perspectives are combined. Frankly, the same can be said about the above data series. Yes, there are important differences between the two series, though how do the numbers compare when one looks at the bigger picture, especially when broken down on element types (e.g. airplane size). How about plan vs. actual values, how long does it take more for production or other processes? It's one of a visualization's goals to improve the questions posed, but how efficient are visualizations that barely scratch the surface?

In what concerns the code, the following scripts can be used to prepare the data:

-- Power Query script (Boeing vs Airbus)
= let
    Source = let
    Source = #table({"Sorting", "Month Name", "Serial Date", "Boeing Deliveries", "Airbus Deliveries"},
    {
        {1, "Oct", #date(2023, 10, 31), 30, 50},
        {2, "Nov", #date(2023, 11, 30), 40, 40},
        {3, "Dec", #date(2023, 12, 31), 40, 110},
        {4, "Jan", #date(2024, 1, 31), 20, 30},
        {5, "Feb", #date(2024, 2, 29), 30, 40},  // Leap year adjustment
        {6, "Mar", #date(2024, 3, 31), 30, 60},
        {7, "Apr", #date(2024, 4, 30), 40, 60},
        {8, "May", #date(2024, 5, 31), 40, 50},
        {9, "Jun", #date(2024, 6, 30), 50, 80},
        {10, "Jul", #date(2024, 7, 31), 40, 90},
        {11, "Aug", #date(2024, 8, 31), 40, 50},
        {12, "Sep", #date(2024, 9, 30), 30, 50}
    }
    ),
    #"Changed Types" = Table.TransformColumnTypes(Source, {{"Sorting", Int64.Type}, {"Serial Date", type date}, {"Boeing Deliveries", Int64.Type}, {"Airbus Deliveries", Int64.Type}})
in
    #"Changed Types"
in
    Source

It can be useful to create the labels for the charts dynamically:

-- DAX code for labels
MaxDate = Format(Max('Boeing vs Airbus'[Serial Date]),"MMM-YYYY")
MinDate = FORMAT (Min('Boeing vs Airbus'[Serial Date]),"MMM-YYYY")
MinMaxDate = [MinDate] & " to " & [MaxDate]
Title Boing Airbus = "Boing and Airbus Deliveries " & [MinMaxDate]

Happy coding!

Previous Post <<||>> Next Post

04 August 2024

📊Graphical Representation: Graphics We Live By (Part X: Pie and Donut Charts in Power BI and Excel)

Graphical Representation Series
Graphical Representation Series

Pie charts are loved and hated by many altogether, and there are many entitled reasons to use them and avoid them, though the most important criteria to evaluate them is whether they do the intended job in an acceptable manner, especially when compared to other representational means. The most important aspect they depict is the part to whole ratio, which even if can be depicted by other graphical tools, few tools are efficient in representing it. 

The pie chart works well as a visualization tool when it has only 3-5 values that are easily recognizable in the visualization, however as soon the size or the number of pieces vary considerably, the more difficult it is to visualize and interpret them, in case their representation has more negative than positive effects. There are many topics that form something like a long tail - the portion of the distribution having many occurrences far from the head or beginning. Displaying the items from the long tail together with the other components together can totally obscure the distribution of the items from the long tail as they become unrecognizable in the diagram. 

One approach to handle this is to group all the items from the long tail together under a piece (e.g. Other) and use a second form of representation to display them separately. For example,  Microsoft Excel offers a way to zoom in the section of a pie chart with small percentages by displaying them in a second pie chart (pie of pie) or bar chart (bar of pie), something like a "zoom in" perspective (see image below). Unfortunately, the feature seems to limit itself only to small percentages, and thus can't be used currently to offer a broader perspective. Ideally, it would be useful to zoom in on any piece of the pie, especially when the items are categorized as a hierarchy with two or even more levels. 


Unfortunately, even modern visualization tools offer limited features in displaying this kind of perspective into a flexible unitary visualization, and thus users are forced to use their creativity in providing proper solutions. In the below example the "Renewables" piece of pie is further broken down into several components of a full pie, an ensemble supposed to function as a single form of representation. With a bit of effort, the reader probably will understand the meaning behind the two pie charts, however the encoding of colors and other elements used are suboptimal in the decoding process. 

Pie Charts - Original Solution

In the above example, the arrow may suggest that in between the two donut charts exists a relationship, reflected also in the description provided, however the readers may still have difficulties in correctly interpreting the diagrams, especially when there's some kind of overlapping or other type of implied or unimplied resemblance. If the colors overlap or have other similarities, are they intentional? If the circles have the same size, does this observed resemblance have a meaning? The reader shouldn't bother himself with this type of questions, but see the resemblance and the meaning of the various elements with a minimum of effort while decoding a chart's elements. Of course, when the meaning is not clear, some guidance should be ideally provided!

Unfortunately, Power BI doesn't seem to have a similar visual like the one from Excel yet, however with a bit of effort one can obtain similar results, even if there are other minor or important limitations. For example, the lines between the two pie charts can't be drawn, so one is forced to use other encodings to show that there's a connection between the Renewable slice and the small pie chart. Moreover, the ensemble thus created isn't treated unitary and handled accordingly. Frankly, the maturity of a graphical representation environment can and should be judged also from this perspective!

The below representation built in Power BI uses a few tricks to display two pie charts together. The smaller pie chart representing the breakdown and pieces' colors are variations of parent's color, attempting to show that there's a relationship between the slice from the first chart and the pie chart with the details. Unfortunately, it wasn't possible to use similar lines like in Excel to show the relation between the two sections. 

Pie of Pie in Power BI

Instead of a pie chart, one can use a donut, like in the original representation. Even if the donut uses a smaller area for representation, in theory the pie chart offers a better basis for comparisons, at least in theory. Stacked column charts can be used as well (see C), however one loses the certainty that the pieces must add up to 100%. Further limitations can appear when one wants to achieve more with the visualizations.

Custom charts can be used as well. The pie chart coming from xViz (see D) allows to increase the size of a pie piece by using another radius, technique which could be used to highlight the piece represented in the second chart. Frankly, sunburst diagrams (see E) are better at representing the parent to child proportions, where the same color encoding has been used. Unfortunately, the more information is shown, the more loaded the visualization seems to be.

Pie of Pie Alternatives in Power BI I

A treemap can prove to be a better representation alternative because it encodes proportions in a unitary way, much like pie charts do, though it takes more space if one wants to make the labels visible. Radial charts (see G) and Aster plots (see I) can be occasionally better choices, especially because they use less space as they display only the main categories. A second diagram chart can be used to display the subcategories, much like in A and B. Sankey charts (see H) can be used as well, even if they don't allow representing any quantitative values unless one encodes them directly in the labels. 

Pie of Pie Alternatives in Power BI II

When one dives into the world of diagrams and goes behind the still limited representational choices provided by the standard tools, one can be surprised by the additional representational choices. However, their appropriateness should be considered against readers' skillset to read and interpret them! Frankly, the alternatives considered above could be a better choice when they will reach a representational maturity. 

Many thanks to Christopher Chin, who in his weekly post on data visualization blunders, suggested the examples used as basis for this post (see [1])!

Previous Post <<||>> Next Post

References:
[1] LinkedIn (2024) Christopher Chin's post (link)

Related Posts Plugin for WordPress, Blogger...

About Me

My photo
Koeln, NRW, Germany
IT Professional with more than 25 years experience in IT in the area of full life-cycle of Web/Desktop/Database Applications Development, Software Engineering, Consultancy, Data Management, Data Quality, Data Migrations, Reporting, ERP implementations & support, Team/Project/IT Management, etc.