Showing posts with label data visualization. Show all posts

02 November 2025

📉Graphical Representation: Clearness (Just the Quotes)

 "The essential quality of graphic representations is clarity. If the diagram fails to give a clearer impression than the tables of figures it replaces, it is useless. To this end, we will avoid complicating the diagram by including too much data." (Armand Julin, "Summary for a Course of Statistics, General and Applied", 1910)

"A warning seems justifiable that the background of a chart should not be made any more prominent than actually necessary. Many charts have such heavy coordinate ruling and such relatively narrow lines for curves or other data that the real facts the chart is intended to portray do not stand out clearly from the background. No more coordinate lines should be used than are absolutely necessary to guide the eye of the reader and to permit an easy reading of the curves." (Willard C Brinton, "Graphic Methods for Presenting Facts", 1919)

"It is not possible to lay down any hard and fast rules for determining what chart is the best for any given problem. Ordinarily that one is the best which will produce the quickest and clearest results. but unfortunately it is not always possible to construct the clearest one in the least time. Experience is the best guide. Generally speaking, a rectilinear chart is best adapted for equations of the first degree, logarithmic for those other than the first degree and not containing over two variables, and alignment charts where there are three or more variables. However, nearly every person becomes more or less familiar with one type of chart and prefers to adhere to the use of that type because he does not care to take the time and trouble to find out how to use the others. It is best to know what the possibilities of all types are and to be governed accordingly when selecting one or the other for presenting or working out certain data." (Allan C Haskell, "How to Make and Use Graphic Charts", 1919)

"Sometimes the scales of these accompanying charts are so large that the reader is puzzled to get clearly in his mind what the whole chart is driving at. There is a possibility of making a simple chart on such a large scale that the mere size of the chart adds to its complexity by causing the reader to glance from one side of the chart to the other in trying to get a condensed visualization of the chart." (Willard C Brinton, "Graphic Methods for Presenting Facts", 1919)

"The title for any chart presenting data in the graphic form should be so clear and so complete that the chart and its title could be removed from the context and yet give all the information necessary for a complete interpretation of the data. Charts which present new or especially interesting facts are very frequently copied by many magazines. A chart with its title should be considered a unit, so that anyone wishing to make an abstract of the article in which the chart appears could safely transfer the chart and its title for use elsewhere." (Willard C Brinton, "Graphic Methods for Presenting Facts", 1919)

"Statistics are numerical statements of facts in any department of inquiry, placed in relation to each other; statistical methods are devices for abbreviating and classifying the statements and making clear the relations." (Arthur L Bowley, "An Elementary Manual of Statistics", 1934)

"An important rule in the drafting of curve charts is that the amount scale should begin at zero. In comparisons of size the omission of the zero base, unless clearly indicated, is likely to give a misleading impression of the relative values and trend." (Rufus R Lutz, "Graphic Presentation Simplified", 1949)

"The use of two or more amount scales for comparisons of series in which the units are unlike and, therefore, not comparable [...] generally results in an ineffective and confusing presentation which is difficult to understand and to interpret. Comparisons of this nature can be much more clearly shown by reducing the components to a comparable basis as percentages or index numbers." (Rufus R Lutz, "Graphic Presentation Simplified", 1949)

"Good design looks right. It is simple (clear and uncomplicated). Good design is also elegant, and does not look contrived. A map should be aesthetically pleasing, thought provoking, and communicative." (Arthur H Robinson, "Elements of Cartography", 1953)

"Conflicting with the idea of integrating evidence regardless of its these guidelines provoke several issues: First, labels are data. even intriguing data. [...] Second, when labels abandon the data points, then a code is often needed to relink names to numbers. Such codes, keys, and legends are impediments to learning, causing the reader's brow to furrow. Third, segregating nouns from data-dots breaks up evidence on the basis of mode" (verbal vs. nonverbal), a distinction lacking substantive relevance. Such separation is uncartographic; contradicting the methods of map design often causes trouble for any type of graphical display. Fourth, design strategies that reduce data-resolution take evidence displays in the wrong direction. Fifth, what clutter? Even this supposedly cluttered graph clearly shows the main ideas: brain and body mass are roughly linear in logarithms, and as both variables increase, this linearity becomes less tight." (Edward R Tufte, "Beautiful Evidence", 2006) [argumentation against Cleveland's recommendation of not using words on data plots]

"A statistical table is the logical listing of related quantitative data in vertical columns and horizontal rows of numbers with sufficient explanatory and qualifying words, phrases and statements in the form of titles, headings and notes to make clear the full meaning of data and their origin." (Alva M Tuttle, "Elementary Business and Economic Statistics", 1957)

"Although flow charts are not used to portray or interpret statistical data, they possess definite utility for certain kinds of research and administrative problems. With a well-designed flow chart it is possible to present a large number of facts and relationships simply, clearly, and accurately, without resorting to extensive or involved verbal description." (Anna C Rogers, "Graphic Charts Handbook", 1961)

"Charts and graphs represent an extremely useful and flexible medium for explaining, interpreting, and analyzing numerical facts largely by means of points, lines, areas, and other geometric forms and symbols. They make possible the presentation of quantitative data in a simple, clear, and effective manner and facilitate comparison of values, trends, and relationships. Moreover, charts and graphs possess certain qualities and values lacking in textual and tabular forms of presentation." (Calvin F Schmid, "Handbook of Graphic Presentation", 1954)

"Simplicity, accuracy, appropriate size, proper proportion, correct emphasis, and skilled execution - these are the factors that produce the effective chart. To achieve simplicity your chart must be designed with a definite audience in mind, show only essential information. Technical terms should be absent as far as possible. And in case of doubt it is wiser to oversimplify than to make matters unduly complex. Be careful to avoid distortion or misrepresentation. Accuracy in graphics is more a matter of portraying a clear reliable picture than reiterating exact values. Selecting the right scales and employing authoritative titles and legends are as important as precision plotting. The right size of a chart depends on its probable use, its importance, and the amount of detail involved." (Anna C Rogers, "Graphic Charts Handbook", 1961)

"Simplicity, accuracy. appropriate size, proper proportion, correct emphasis, and skilled execution - these are the factors that produce the effective chart. To achieve simplicity your chart must be designed with a definite audience in mind, show only essential information. Technical terms should be absent as far as possible. And in case of doubt it is wiser to oversimplify than to make matters unduly complex. Be careful to avoid distortion or misrepresentation. Accuracy in graphics is more a matter of portraying a clear reliable picture than reiterating exact values. Selecting the right scales and employing authoritative titles and legends are as important as precision plotting. The right size of a chart depends on its probable use, its importance, and the amount of detail involved." (Anna C Rogers, "Graphic Charts Handbook", 1961)

"Charts and graphs are a method of organizing information for a unique purpose. The purpose may be to inform, to persuade, to obtain a clear understanding of certain facts, or to focus information and attention on a particular problem. The information contained in charts and graphs must, obviously, be relevant to the purpose. For decision-making purposes. information must be focused clearly on the issue or issues requiring attention. The need is not simply for 'information', but for structured information, clearly presented and narrowed to fit a distinctive decision-making context. An advantage of having a 'formula' or 'model' appropriate to a given situation is that the formula indicates what kind of information is needed to obtain a solution or answer to a specific problem." (Cecil H Meyers, "Handbook of Basic Graphs: A modern approach", 1970)

"If two or more data paths ate to appear on the graph. it is essential that these lines be labeled clearly, or at least a reference should be provided for the reader to make the necessary identifications. While clarity seems to be a most obvious goal. graphs with inadequate or confusing labeling do appear in publications, The user should not find identification of data paths troublesome or subject to misunderstanding. The designer normally should place no more than three data paths on the graph to prevent confusion - particularly if the data paths intersect at one or more points on the Cartesian plane." (Cecil H Meyers, "Handbook of Basic Graphs: A modern approach", 1970)

"It is almost impossible to define 'time-sequence chart' in a clear and unambiguous manner because of the many forms and adaptations open to this type of chart. However. it might be said that, in essence, time-sequence chart portrays a chain of activities through time, indicates the type of activity in each link of the chain, shows clearly the position of the link in the total sequence chain, and indicates the duration of each activity. The time sequence chart may also contain verbal elements explaining when to begin an activity, how long to continue the activity, and a description of the activity. The chart may also indicate when to blend a given activity with another and the point at which a given activity is completed. The basic time-sequence chart may also be accompanied by verbal explanations and by secondary or contributory charts." (Cecil H Meyers, "Handbook of Basic Graphs: A modern approach", 1970)

"A statistical table is a systematic arrangement of numerical data in columns and rows. Its purpose is to show quantitative facts clearly, concisely, and effectively. It should facilitate an understanding of the logical relationships among the numbers presented. Tables are used in the compilation of raw data, in the summarizing and analytic processes, and in the presentation of statistics in final form. A good table is the product of careful thinking and hard work. It is not just a package of figures put into neat compartments and ruled to make it look more attractive. It contains carefully selected data put together with thought and ingenuity to serve a specific purpose." (Peter H Selby, "Interpreting Graphs and Tables", 1976)

"A graphic is an illustration that, like a painting or drawing, depicts certain images on a flat surface. The graphic depends on the use of lines and shapes or symbols to represent numbers and ideas and show comparisons, trends, and relationships. The success of the graphic depends on the extent to which this representation is transmitted in a clear and interesting manner." (Robert Lefferts, "Elements of Graphics: How to prepare charts and graphs for effective reports", 1981)

"For most line charts the maximum number of plotted lines should not exceed five; three or fewer is the ideal number. When multiple plotted lines are shown each line should be differentiated by using" (a) a different type of line and/or" (b) different plotting marks, if shown, and" (c) clearly differentiated labeling." (Robert Lefferts, "Elements of Graphics: How to prepare charts and graphs for effective reports", 1981)

"Understanding is accomplished through:" (a) the use of relative size of the shapes used in the graphic;" (b) the positioning of the graphic-line forms;" (c) shading;" (d) the use of scales of measurement; and" (e) the use of words to label the forms in the graphic. In addition. in order for a person to attach meaning to a graphic it must also be simple, clear, and appropriate." (Robert Lefferts, "Elements of Graphics: How to prepare charts and graphs for effective reports", 1981)

"Clear, detailed, and thorough labeling should be used to defeat graphical distortion and ambiguity. Write out explanations of the data on the graphic itself. Label important events in the data." (Edward R Tufte, "The Visual Display of Quantitative Information", 1983)

"There are two kinds of misrepresentation. In one. the numerical data do not agree with the data in the graph, or certain relevant data are omitted. This kind of misleading presentation. while perhaps hard to determine, clearly is wrong and can be avoided. In the second kind of misrepresentation, the meaning of the data is different to the preparer and to the user." (Anker V Andersen, "Graphing Financial Information: How accountants can use graphs to communicate", 1983)

"Understandability implies that the graph will mean something to the audience. If the presentation has little meaning to the audience, it has little value. Understandability is the difference between data and information. Data are facts. Information is facts that mean something and make a difference to whoever receives them. Graphic presentation enhances understanding in a number of ways. Many people find that the visual comparison and contrast of information permit relationships to be grasped more easily. Relationships that had been obscure become clear and provide new insights." (Anker V Andersen, "Graphing Financial Information: How accountants can use graphs to communicate", 1983)

"In order to be easily understood, a display of information must have a logical structure which is appropriate for the user's knowledge and needs, and this structure must be clearly represented visually. In order to indicate structure, it is necessary to be able to eemphasiz, divide and relate items of information. Visual emphasis can be used to indicate a hierarchical relationship between items of information, as in the case of systems of headings and subheadings for example. Visual separation of items can be used to indicate that they are different in kind or are unrelated functionally, and similarly a visual relationship between items will imply that they are of a similar kind or bear some functional relation to one another. This kind of visual 'coding' helps the reader to appreciate the extent and nature of the relationship between items of information, and to adopt an appropriate scanning strategy." (Linda Reynolds & Doig Simmonds, "Presentation of Data in Science" 4th Ed, 1984)

"The effective communication of information in visual form, whether it be text, tables, graphs, charts or diagrams, requires an understanding of those factors which determine the 'legibility', 'readability' and 'comprehensibility', of the information being presented. By legibility we mean: can the data be clearly seen and easily read? By readability we mean: is the information set out in a logical way so that its structure is clear and it can be easily scanned? By comprehensibility we mean: does the data make sense to the audience for whom it is intended? Is the presentation appropriate for their previous knowledge, their present information needs and their information processing capacities?" (Linda Reynolds & Doig Simmonds, "Presentation of Data in Science" 4th Ed, 1984)

"The space between columns, on the other hand, should be just sufficient to separate them clearly, but no more. The columns should not, under any circumstances, be spread out merely to fill the width of the type area. […] Sometimes, however, it is difficult to avoid undesirably large gaps between columns, particularly where the data within any given column vary considerably in length. This problem can sometimes be solved by reversing the order of the columns […]. In other instances the insertion of additional space after every fifth entry or row can be helpful, […] but care must be taken not to imply that the grouping has any special meaning." (Linda Reynolds & Doig Simmonds, "Presentation of Data in Science" 4th Ed, 1984)

"Clear vision is a vital aspect of graphs. The viewer must be able to visually disentangle the many different items that appear on a graph." (William S Cleveland, "The Elements of Graphing Data", 1985)

"Iteration and experimentation are important for all of data analysis, including graphical data display. In many cases when we make a graph it is immediately clear that some aspect is inadequate and we regraph the data. In many other cases we make a graph, and all is well, but we get an idea for studying the data in a different way with a different graph; one successful graph often suggests another." (William S Cleveland, "The Elements of Graphing Data", 1985)

"Good graphics can be spoiled by bad annotation. Labels must always be subservient to the information to be conveyed, and legibility should never be sacrificed for style. All the information on the sheet should be easy to read, and more important, easy to interpret. The priorities of the information should be clearly expressed by the use of differing sizes, weights and character of letters." (Bruce Robertson, "How to Draw Charts & Diagrams", 1988)

"A range-frame does not require any viewing or decoding instructions; it is not a graphical puzzle and most viewers can easily tell what is going on. Since it is more informative about the data in a clear and precise manner, the range-frame should replace the non-data bearing frame inmany graphical applications." (Edward R Tufte, "Data-Ink Maximization and Graphical Design", Oikos Vol. 58 (2), 1990) 

"A graph is a system of connections expressed by means of commonly accepted symbols. As such, the symbols and symbolic forms used in making graphs are significant. To communicate clearly this symbolism must be acknowledged." (Mary H Briscoe, "Preparing Scientific Illustrations: A guide to better posters, presentations, and publications" 2nd ed., 1995)

"Information needs representation. The idea that it is possible to communicate information in a 'pure' form is fiction. Successful risk communication requires intuitively clear representations. Playing with representations can help us not only to understand numbers" (describe phenomena) but also to draw conclusions from numbers" (make inferences). There is no single best representation, because what is needed always depends on the minds that are doing the communicating." (Gerd Gigerenzer, "Calculated Risks: How to know when numbers deceive you", 2002)

"Most dashboards fail to communicate efficiently and effectively, not because of inadequate technology (at least not primarily), but because of poorly designed implementations. No matter how great the technology, a dashboard's success as a medium of communication is a product of design, a result of a display that speaks clearly and immediately. Dashboards can tap into the tremendous power of visual perception to communicate, but only if those who implement them understand visual perception and apply that understanding through design principles and practices that are aligned with the way people see and think." (Stephen Few, "Information Dashboard Design", 2006)

"Clearly principles and guidelines for good presentation graphics have a role to play in exploratory graphics, but personal taste and individual working style also play important roles. The same data may be presented in many alternative ways, and taste and customs differ as to what is regarded as a good presentation graphic. Nevertheless, there are principles that should be respected and guidelines that are generally worth following. No one should expect a perfect consensus where graphics are concerned." (Antony Unwin, Good Graphics?"[in "Handbook of Data Visualization"], 2008)

"Perception requires imagination because the data people encounter in their lives are never complete and always equivocal. [...] We also use our imagination and take shortcuts to fill gaps in patterns of nonvisual data. As with visual input, we draw conclusions and make judgments based on uncertain and incomplete information, and we conclude, when we are done analyzing the patterns, that out picture is clear and accurate. But is it?" (Leonard Mlodinow, "The Drunkard’s Walk: How Randomness Rules Our Lives", 2008)

"The main goal of data visualization is its ability to visualize data, communicating information clearly and effectively. It doesn’t mean that data visualization needs to look boring to be functional or extremely sophisticated to look beautiful. To convey ideas effectively, both aesthetic form and functionality need to go hand in hand, providing insights into a rather sparse and complex dataset by communicating its key aspects in a more intuitive way. Yet designers often tend to discard the balance between design and function, creating gorgeous data visualizations which fail to serve its main purpose - communicate information." (Vitaly Friedman, "Data Visualization and Infographics", Smashing Magazine, 2008)

"When displaying information visually, there are three questions one will find useful to ask as a starting point. Firstly and most importantly, it is vital to have a clear idea about what is to be displayed; for example, is it important to demonstrate that two sets of data have different distributions or that they have different mean values? Having decided what the main message is, the next step is to examine the methods available and to select an appropriate one. Finally, once the chart or table has been constructed, it is worth reflecting upon whether what has been produced truly reflects the intended message. If not, then refine the display until satisfied; for example if a chart has been used would a table have been better or vice versa?" (Jenny Freeman et al, "How to Display Data", 2008)

"Dealing with a circular visualization and trying to compare its radial portions is always problematic. When designing with data, the story should always be told as clearly as possible. To do so, it is often best to avoid round charts and graphs." (Brian Suda, "A Practical Guide to Designing with Data", 2010)

"When a chart is presented properly, information just lows to the viewer in the clearest and most efficient way. There are no extra layers of colors, no enhancements to distract us from the clarity of the information." (Dona Wong, "The Wall Street Journal guide to information graphics: The dos and don’ts of presenting data, facts, and figures", 2010)

"The final step in creating your graphic is to refine it. Step back and look at it with fresh eyes. Is there anything that could be removed? Or anything that should be removed because it is distracting? Consider each element in your figure and question whether it contributes enough to your overall goal to justify its contribution. Also consider whether there is anything that could be represented more clearly. Perhaps you have been so effective at simplifying your graphic that you could now include another point in the same figure. Another method of refinement is to check the placement and alignment of your labels. They should be unobtrusive and clearly indicate which object they refer to. Consistency in fonts and alignment of labels can make the difference between something that is easy and pleasant to read, and something that is cluttered and frustrating." (Felice C Frankel & Angela H DePace, "Visual Strategies", 2012)

"Context (information that lends to better understanding the who, what, when, where, and why of your data) can make the data clearer for readers and point them in the right direction. At the least, it can remind you what a graph is about when you come back to it a few months later. […] Context helps readers relate to and understand the data in a visualization better. It provides a sense of scale and strengthens the connection between abstract geometry and colors to the real world." (Nathan Yau, "Data Points: Visualization That Means Something", 2013)

"[...] communicating with data is less often about telling a specific story and more like starting a guided conversation. It is a dialogue with the audience rather than a monologue. While some data presentations may share the linear approach of a traditional story, other data products" (analytical tools, in particular) give audiences the flexibility for exploration. In our experience, the best data products combine a little of both: a clear sense of direction defined by the author with the ability for audiences to focus on the information that is most relevant to them. The attributes of the traditional story approach combined with the self-exploration approach leads to the guided safari analogy." (Zach Gemignani et al, "Data Fluency", 2014)

"Commonly, data do not make a clear and unambiguous statement about our world, often requiring tools and methods to provide such clarity. These methods, called statistical data analysis, involve collecting, manipulating, analyzing, interpreting, and presenting data in a form that can be used, understood, and communicated to others." (Forrest W Young et al, "Visual Statistics: Seeing data with dynamic interactive graphics", 2016)

"A well-designed graph clearly shows you the relevant end points of a continuum. This is especially important if you’re documenting some actual or projected change in a quantity, and you want your readers to draw the right conclusions. […]" (Daniel J Levitin, "Weaponized Lies", 2017)

"Numbers are ideal vehicles for promulgating bullshit. They feel objective, but are easily manipulated to tell whatever story one desires. Words are clearly constructs of human minds, but numbers? Numbers seem to come directly from Nature herself. We know words are subjective. We know they are used to bend and blur the truth. Words suggest intuition, feeling, and expressivity. But not numbers. Numbers suggest precision and imply a scientific approach. Numbers appear to have an existence separate from the humans reporting them." (Carl T Bergstrom & Jevin D West, "Calling Bullshit: The Art of Skepticism in a Data-Driven World", 2020)

"Before even thinking about charts, it should be recognised that the table on its own is extremely useful. Its clear structure, with destination regions organised in columns and origins in rows, allows the reader to quickly look up any value - including totals - quickly and precisely. That’s what tables are good for. The deficiency of the table, however, is in identifying patterns within the data. Trying to understand the relationships between the numbers is difficult because, to compare the numbers with each other, the reader needs to store a lot of information in working memory, creating what psychologists refer to as a high 'cognitive load'." (Alan Smith, "How Charts Work: Understand and explain data with confidence", 2022)

"We see first what stands out. Our eyes go right to change and difference - peaks, valleys, intersections, dominant colors, outliers. Many successful charts - often the ones that please us the most and are shared and talked about - exploit this inclination by showing a single salient point so clearly that we feel we understand the chart’s meaning without even trying." (Scott Berinato, "Good Charts : the HBR guide to making smarter, more persuasive data visualizations", 2023)


09 August 2025

🧭Business Intelligence: Perspectives (Part 33: Data Lifecycle for Analytics)

Business Intelligence Series

In the context of BI, Analytics and other data-related topics, the various parties usually talk about data ingestion, preparation, storage, analysis and visualization, often ignoring processes like data generation, collection, and interpretation. A broader discussion may shift the attention unnecessarily, though it’s important to increase people’s awareness of data’s full lifecycle. Otherwise, many data solutions become castles in the air or houses of cards waiting for the next gust of wind to blow them away.

Data is generated continuously by organizations, their customers, vendors, and third parties, as part of a complex network of processes, systems and integrations that extend beyond their intended boundaries. Independently of their type, scope and various other characteristics, all processes consume and generate data at a rapid pace that steadily exceeds organizations’ capabilities to make good use of it.

There are also scenarios in which the data must be collected via surveys, interviews, forms, measurements, direct observations, or whatever other processes are used to elicit some aspect of importance. The volume and other characteristics of the data generated in this way may depend on the goals and objectives in scope, as well as on the methods, procedures and even the methodologies used.

Data ingestion is the process of importing data from the various sources into a central or intermediary repository for storage, processing, analysis and visualization. The repository can be a data mart, warehouse, lakehouse, data lake or any other intermediary or final destination of the data. Moreover, data can have different levels of quality with respect to its intended usage.
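
As a minimal sketch of the ingestion step, assume a small CSV extract as the source and an in-memory SQLite database standing in for the repository. The table name `staging_orders`, the columns and the sample rows are all made up for illustration; a real pipeline would read from files, APIs or operational databases and would need validation and error handling.

```python
import csv
import io
import sqlite3

# Hypothetical source extract; in practice this would be a file, an API or a database.
SOURCE_CSV = """order_id,region,amount
1,North,120.50
2,South,80.00
3,North,42.25
"""

def ingest(csv_text: str, conn: sqlite3.Connection) -> int:
    """Load raw CSV rows into a staging table and return the number of rows loaded."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS staging_orders "
        "(order_id INTEGER, region TEXT, amount REAL)"
    )
    # Parse each CSV row and cast the fields to the types of the staging table.
    rows = [
        (int(r["order_id"]), r["region"], float(r["amount"]))
        for r in csv.DictReader(io.StringIO(csv_text))
    ]
    conn.executemany("INSERT INTO staging_orders VALUES (?, ?, ?)", rows)
    conn.commit()
    return len(rows)

# An in-memory database stands in for the central or intermediary repository.
conn = sqlite3.connect(":memory:")
loaded = ingest(SOURCE_CSV, conn)
total = conn.execute("SELECT SUM(amount) FROM staging_orders").fetchone()[0]
```

Loading into a staging table first keeps the raw data separate from whatever transformations follow, which makes quality checks easier before the data moves further.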

Data storage refers to the systems and approaches used to securely retain, organize, and access data throughout its journey within the various layers of the infrastructure. It focuses on where and how data is stored, independently of whether that’s done on-premises, in the cloud or across hybrid environments.

Data preparation is the process of transforming the data into a form close to what is intended for analysis and visualization. It may involve data aggregation, enrichment, transposition and other operations that facilitate further steps. It’s probably the most important step in a data project, given that its outcome can have an important impact on data analysis and visualization, facilitating or impeding the respective processes.
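
The three operations named above can be sketched in plain Python. This is only an illustration under stated assumptions: the `sales` records, the `region_manager` lookup and the function names are hypothetical, and a real project would more likely rely on SQL or a dataframe library.

```python
from collections import defaultdict

# Hypothetical raw records, as they might look after ingestion.
sales = [
    {"region": "North", "quarter": "Q1", "amount": 100.0},
    {"region": "North", "quarter": "Q2", "amount": 150.0},
    {"region": "South", "quarter": "Q1", "amount": 80.0},
    {"region": "South", "quarter": "Q1", "amount": 20.0},
]
# Hypothetical lookup table used for enrichment.
region_manager = {"North": "Ana", "South": "Ben"}

def aggregate(records):
    """Aggregation: sum the amounts per (region, quarter)."""
    totals = defaultdict(float)
    for r in records:
        totals[(r["region"], r["quarter"])] += r["amount"]
    return dict(totals)

def transpose(totals):
    """Transposition: pivot quarters from rows into columns, one row per region."""
    quarters = sorted({q for _, q in totals})
    regions = sorted({reg for reg, _ in totals})
    # Missing (region, quarter) combinations are filled with 0.0 (an assumption).
    return [
        {"region": reg, **{q: totals.get((reg, q), 0.0) for q in quarters}}
        for reg in regions
    ]

def enrich(rows, lookup):
    """Enrichment: attach the manager's name from a lookup table."""
    return [{**row, "manager": lookup.get(row["region"], "unknown")} for row in rows]

prepared = enrich(transpose(aggregate(sales)), region_manager)
```

Filling the missing combinations with 0.0 is a design choice that should be made consciously, because "no sales" and "no data" can mean different things in the later analysis and visualization.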

Data analysis consists of a multitude of processes that attempt to harness value from data in its various forms of aggregation. The ultimate purpose is to infer meaningful information and knowledge from the data in the form of insights. The road from raw data to these targeted outcomes is a tedious one, where recipes can both help and impede. Expecting value from any pile of data can easily become a costly illusion when the data, the processes and their usage are poorly understood and harnessed.

Data visualization is the means of presenting data and its characteristics in the form of figures, diagrams and other forms of representation that facilitate data’s navigation, perception and understanding for various purposes. Usually, the final purpose is fact-checking, decision-making, problem-solving, etc., though there is a multitude of steps in between. Especially in these areas, good and poor practices are mixed together.

Data interpretation is the attempt of drawing meaningful conclusions from the data, information and knowledge gained mainly from data analysis and visualization. It is often a subjective interpretation as it’s usually regarded from people’s understanding of the various facts as they are considered. The inferences made in the process can be a matter of gut feeling, respectively of mature analysis. It’s about sense-making, contextualization, critical thinking, pattern recognition, internalization and externalization, and other similar cognitive processes.


30 July 2025

📊Graphical Representation: Sense-making in Data Visualizations (Part 3: Heuristics)

Graphical Representation Series
 

Consider the following general heuristics in data visualizations (work in progress):

  • plan design
    • plan page composition
      • text
        • title, subtitles
        • dates 
          • refresh, filters applied
        • parameters applied
        • guidelines/tooltips
        • annotation 
      • navigation
        • main page(s)
        • additional views
        • drill-through
        • zoom in/out
        • next/previous page
        • landing page
      • slicers/selections
        • date-related
          • date range
          • date granularity
        • functional
          • metric
          • comparisons
        • categorical
          • structural relations
      • icons/images
        • company logo
        • button icons
        • background
    • pick a theme
      • choose a layout and color scheme
        • use a color palette generator
        • use a focused color scheme or restricted palette
        • use a consistent and limited color scheme
        • use suggestive icons
          • use one source (with similar design)
        • use formatting standards
    • create a visual hierarchy 
      • use placement, size and color for emphasis
      • organize content around eye movement pattern
      • minimize formatting changes
      • 1 font, 2 weights, 4 sizes
    • plan the design
      • build/use predictable and consistent templates
        • e.g. using Figma
      • use layered design
      • aim for design unity
      • define & use formatting standards
      • check changes
    • GRACEFUL
      • group visuals with white space 
      • right chart type
      • avoid clutter
      • consistent & limited color scheme
      • enhanced readability 
      • formatting standard
      • unity of design
      • layered design
  • keep it simple 
    • be predictable and consistent 
    • focus on the message
      • identify the core insights and design around them
      • pick suggestive titles/subtitles
        • use dynamic subtitles
      • align content with the message
    • avoid unnecessary complexity
      • minimize visual clutter
      • remove the unnecessary elements
      • round numbers
    • limit colors and fonts
      • use a restrained color palette (<5 colors)
      • stick to 1-2 fonts 
      • ensure text is legible without zooming
    • aggregate values
      • group similar data points to reduce noise
      • use statistical methods
        • averages, medians, min/max
      • use categories when detailed granularity isn’t necessary
    • highlight what matters 
      • e.g. actionable items
      • guide attention to key areas
        • via annotations, arrows, contrasting colors 
        • use conditional formatting
      • do not show only the metrics
        • give context 
      • show trends
        • via sparklines and similar visuals
    • use familiar visuals
      • avoid questionable visuals 
        • e.g. pie charts, gauges
    • avoid distortions
      • preserve proportions
        • scale accurately to reflect data values
        • avoid exaggerated visuals
          • don’t zoom in on axes to dramatize small differences
      • use consistent axes
        • compare data using the same scale and units across charts
        • don't use dual axes or shifting baselines that can mislead viewers
      • avoid manipulative scaling
        • use zero-baseline on bar charts 
        • use logarithmic scales sparingly
    • design for usability
      • intuitive interaction
      • at-a-glance perception
      • use contrast for clarity
      • use familiar patterns
        • use consistent formats the audience already knows
    • design with the audience in mind
      • analytical vs managerial perspectives (e.g. dashboards)
    • use different levels of data aggregation
      •  in-depth data exploration 
    • encourage scrutiny
      • give users enough context to assess accuracy
        • provide raw values or links to the source
      • explain anomalies, outliers or notable trends
        • via annotations
    • group related items together
      • helps identify and focus on patterns and other relationships
    • diversify 
      • don't use only one chart type
      • pick the chart that best reflects the data in the context considered
    • show variance 
      • absolute vs relative variance
      • compare data series
      • show contribution to variance
    • use familiar encodings
      • leverage (known) design patterns
    • use intuitive navigation
      • synchronize slicers
    • use tooltips
      • be concise
      • use hover effects
    • use information buttons
      • enhances user interaction and understanding 
        • by providing additional context, asking questions
    • use the full available surface
      • 1080x1920 usually works better
    • keep standards in mind 
      • e.g. IBCS
  • state the assumptions
    • be explicit
      • clearly state each assumption 
        • instead of leaving it implied
    • contextualize assumptions
      • explain the assumption
        • use evidence, standard practices, or constraints
    • state scope and limitations
      • mention what the assumption includes and excludes
    • tie assumptions to goals & objectives
      • helps to clarify what underlying beliefs are shaping the analysis
      • helps identify whether the visualization achieves its intended purpose 
  • show the data
    • be honest (aka preserve integrity)
      • avoid distortion, bias, or trickery
    • support interpretation
      • provide labels, axes, legends
    • emphasize what's meaningful
      • patterns, trends, outliers, correlations, local/global maxima/minima
  • show what's important 
    • e.g. facts, relationships, flow, similarities, differences, outliers, unknown
    • prioritize and structure the content
      • e.g. show first an overview, what's important
    • make the invisible visible
      • think about what we do not see
    • know your (extended) users/audience
      • who'll use the content, at what level, for what
  • test for readability
    • get (early) feedback
      • have the content reviewed first
        • via peer review, dry run presentation
  • tell the story
    • know the audience and its needs
    • build momentum, expectation
    • don't leave the audience to figure it out
    • show the facts
    • build a narrative
      • show data that support it
      • arrange the visuals in a logical sequence
    • engage the reader
      • ask questions that bridge the gaps
        • e.g. in knowledge, in presentation's flow
      • show the unexpected
      • confirm logical deductions
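
The "avoid distortions" heuristics above (zero baseline on bar charts, no zooming in on axes to dramatize small differences) can be made concrete with a little arithmetic. The sketch below is only an illustration under stated assumptions: the helper names `visual_ratio` and `exaggeration` and the sample values are hypothetical, and positive values are assumed.

```python
def visual_ratio(a, b, baseline=0.0):
    """Ratio of the drawn bar lengths for a and b when the axis starts at `baseline`."""
    if baseline >= min(a, b):
        raise ValueError("baseline must lie below both values")
    return (a - baseline) / (b - baseline)

def exaggeration(a, b, baseline):
    """How many times larger the drawn ratio is than the true ratio a / b."""
    return visual_ratio(a, b, baseline) / (a / b)

# Hypothetical values: 104 vs 100 is a 4% difference.
true_ratio = visual_ratio(104, 100)              # axis starts at zero
truncated = visual_ratio(104, 100, baseline=96)  # axis starts at 96

print(f"zero baseline: bar A is {true_ratio:.2f}x bar B")
print(f"axis from 96:  bar A is {truncated:.2f}x bar B")
print(f"exaggeration factor: {exaggeration(104, 100, 96):.2f}")
```

With the axis starting at 96, a 4% difference is drawn as one bar being twice as long as the other, which is the kind of distortion a zero baseline avoids.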

27 July 2025

📊Graphical Representation: Sense-making in Data Visualizations (Part 2: Guidelines)

Graphical Representation Series
 

Consider the following best practices in data visualizations (work in progress):

  • avoid poor labeling and annotation practices
    • label data points
      • consider labeling at least the important points
        • e.g. starts, ends, local/global minima/maxima
        • e.g. when labeling every point would clutter the chart or there's minimal variation
    • avoid abbreviations
      • unless they are defined clearly upfront, consistent and/or universally understood
      • can hinder understanding
        • abbreviations should help compress content without losing meaning
    • use font types, font sizes, and text orientation that are easy to read
    • avoid stylish design that makes content hard to read
    • avoid redundant information
    • text should never overshadow or distort the actual message or data
      • use neutral, precise wording
  • avoid the misuse of pre-attentive attributes
    • aka visual features that our brains process almost instantly
    • color
      • has identity value: used to distinguish one thing from another
        • carries its own connotations
        • gives a visual scale of measure
        • the use of color doesn’t always help
      • hue 
        • refers to the dominant color family of a specific color, being processed by the brain based on the different wavelengths of light
          • allows differentiating categories
        • use distinct hues to represent different categories
      • intensity (aka brightness)
        • refers to how strong or weak a color appears
      • saturation (aka chroma, intensity) 
        • refers to the purity or vividness of a color
          • as saturation decreases, the color becomes more muted or washed out
          • highly saturated colors have little or no gray in them
          • highly desaturated colors are almost gray, with almost none of the original color
        • use high saturation for important elements like outliers, trends, or alerts
        • use low saturation for background elements
      • avoid pure colors that are bright and saturated
        • they draw attention to the respective elements
      • avoid colors that are too similar in tone or saturation
      • avoid colors hard to distinguish for color-blind users
        • e.g. red-green color blindness
          • brown-green, orange-red, blue-purple combinations
          • avoid red-green pairings for status indicators 
            • e.g. success/error
        • e.g. blue-yellow color blindness
          • blue-green, yellow-pink, purple-blue
        • e.g. total color blindness (aka monochromacy)
          • all colors appear as shades of gray
            • ⇒ users must rely entirely on contrast, shape, and texture
      • use icons, labels, or patterns alongside color
      • use tools to test for color issues
      • use colorblind-safe palettes 
      • for sequential or diverging data, use one hue and vary saturation or brightness to show magnitude
      • start with all-gray data elements
        • use color only when it corresponds to differences in data
          • ⇐ helps draw attention to whatever isn’t gray
      • dull and neutral colors give a sense of uniformity
      • can modify/contradict readers' intuitive response
      • choose colors to draw attention, to label, to show relationships 
    • form
      • shape
        • allows distinguishing types of data points and encoding information
          • well-shaped data has functional and aesthetic character
        • complex shapes can be more difficult to perceive
      • size
        • attribute used to encode the magnitude or extent of elements 
        • should be aligned to its probable use, importance, and amount of detail involved
          • larger elements draw more attention
        • its encoding should be meaningful
          • e.g. magnitudes of deviations from the baseline
        • overemphasis can lead to distortions
        • choose a size range that is appropriate for the data
        • avoid using size to represent nominal or categorical data where there's no inherent order to the sizes
      • orientation
        • angled or rotated items stand out.
      • length/width
        • useful in bar charts to show quantity
        • avoid stacked bar graphs
      • curvature
        • curved lines can contrast with straight ones.
      • collinearity
        • alignment can suggest grouping or flow
    • highlighting
    • spatial positioning
      • 2D position
        • placement on axes or grids conveys value 
      • 3D position in 2D space

      • grouping
        • proximity implies relationships.
        • keep columns, respectively bars close together
      • enclosure
        • borders or shaded areas signal clusters.
      • depth (stereoscopic or shading)
        • adds dimensionality
  • avoid graphical features that are purely decorative
    • aka elements that don't affect understanding, structure or usability
    • stylistic embellishments
      • borders/frames
        • ornamental lines or patterns around content
      • background images
        • images used for ambiance, not content
      • drop shadows and gradients
        • enhance depth or style but don’t add meaning.
      • icons without function
        • decorative icons that don’t represent actions or concepts
    • non-informative imagery
      • stock photos
        • generic visuals that aren’t referenced in the text.
      • illustrations
        • added for visual interest, not explanation.
      • mascots or logos
        • when repeated or not tied to specific content.
    • layout elements
      • spacers
        • transparent or blank images used to control layout
        • leave the right amount of 'white' space between chart elements
      • custom bullets or list markers
        • designed for flair, not clarity
      • visual separators
        • lines or shapes that divide sections without conveying hierarchy or meaning
  • avoid bias
    • sampling bias
      • showing data that doesn’t represent the full population
        • avoid cherry-picking data
          • aka selecting only the data that support a particular viewpoint while ignoring others that might contradict it
          • enable users to look at both sets of data and contrast them
          • enable users to navigate the data
        • avoid survivor bias
          • aka focusing only on the data that 'survived' a process and ignoring the data that didn’t
      • use representative data
        • aka the dataset includes all relevant groups
      • check for collection bias
        • avoid data that only comes from one source 
        • avoid data that excludes key demographics
    • cognitive bias
      • mental shortcuts that sometimes affect interpretation
        • incl. confirmation bias, framing bias, pattern bias
      • balance visual hierarchies
        • don’t make one group look more important by overemphasizing it
      • show uncertainty
        • by including confidence intervals or error bars to reflect variability
      • separate comparisons
        • when comparing groups, use adjacent charts rather than combining them into one that implies a hierarchy
          • e.g. ethnicities, region
    • visual bias
      • design choices that unintentionally (or intentionally) distort meaning
        • respectively how viewers interpret the data
      • avoid manipulating axes 
        • by truncating y-axis
          • exaggerates differences
        • by changing scale types
          • linear vs. logarithmic
            • a log scale compresses large values and expands small ones, which can flatten exponential growth or make small changes seem more significant
          • uneven intervals
            • using inconsistent spacing between tick marks can distort trends
        • by zooming in/out
          • adjusting the axis to focus on a specific range can highlight or hide variability and eventually obscure the bigger picture
        • by using dual axes
          • if the scales differ too much, it can falsely imply correlation or exaggerate relationships 
        • by distorting the aspect ratio
          • stretching or compressing the chart area can visually amplify or flatten trends
            • e.g. a steep slope might look flat if the x-axis is stretched
        • avoid inconsistent scales
        • label axes clearly
        • explain scale choices
      • avoid overemphasis 
        • avoid unnecessary repetition 
          • e.g. of the same graph, of content
        • avoid focusing on outliers, (short-term) trends
        • avoid truncating axes, exaggerating scales
        • avoid manipulating the visual hierarchy 
      • avoid color bias
        • bright colors draw attention unfairly
      • avoid overplotting 
        • too much data obscures patterns
      • avoid clutter
        • creates cognitive friction
          • users struggle to focus on what matters because their attention is pulled in too many directions
          • is about design excess
        • avoid unnecessary or distracting elements 
          • they don’t contribute to understanding the data
      • avoid overloading 
        • attempting to show too much data at once
          • is about data excess
        • overwhelms readers' processing capacity, making it hard to extract insights or spot patterns
    • algorithmic bias 
      • the use of ML or other data processing techniques can reinforce certain aspects (e.g. social inequalities, stereotypes)
      • visualize uncertainty
        • include error bars, confidence intervals, and notes on limitations
      • audit data and algorithms
        • look for bias in inputs, model assumptions and outputs
    • intergroup bias
      • charts tend to reflect or reinforce societal biases
        • e.g. racial or gender disparities
      • use thoughtful ordering, inclusive labeling
      • avoid deficit-based comparisons
  • avoid overcomplicating the visualizations 
    • e.g. by including too much data, details, other elements
  • avoid comparisons across varying dimensions 
    • e.g. (two) circles of different radius, bar charts of different length, column charts of different height
    • don't make users compare angles, areas, volumes
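
The "show uncertainty" guideline above (confidence intervals or error bars) needs the interval to be computed before it can be drawn. A minimal sketch using only Python's standard library follows. It assumes a normal approximation, which is a simplification: for small samples a t-distribution is usually more appropriate. The function name `mean_with_ci` and the sample values are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def mean_with_ci(values, confidence=0.95):
    """Return (mean, lower, upper) for the mean, using a normal approximation."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    m = mean(values)
    standard_error = stdev(values) / sqrt(len(values))
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return m, m - z * standard_error, m + z * standard_error

samples = [12.1, 11.8, 12.4, 12.0, 11.9, 12.3]  # hypothetical measurements
m, lo, hi = mean_with_ci(samples)
print(f"mean={m:.2f}, 95% CI=({lo:.2f}, {hi:.2f})")
```

The distances from the mean to the lower and upper bounds are what a charting tool would draw as the error bar for this data point.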

21 July 2025

📊Graphical Representation: Sense-making in Data Visualizations (Part 1: An Introduction)

Graphical Representation Series

Introduction

Creating simple charts or more complex data visualizations may appear trivial to many, though their authors shouldn't forget that readers have different backgrounds and degrees of literacy, and that many of them may not be able to make sense of graphical displays, at least not without some help.

Beginners start with limited experience and build upon it; then, on the road to mastery, they get acquainted with the many possibilities, a deeper sense is achieved and the choices narrow to a few. Regardless of one's experience, there are seldom 'yes' or 'no' answers for the various choices; everything is a matter of degree that varies with one's experience, available time, the audience's expectations, and many more aspects that might be considered in time.

The following questions are intended to expand or narrow down our choices when dealing with data visualizations from a data professional's perspective. The questions are based mainly on [1], though they were extended to include a broader perspective.

General Questions

Where does the data come from? Is the source reliable, representative (for the whole population in scope)? Is the data source certified? Are the data current?

Are there better (usable) sources? What's the effort to consider them? Does the data overlap? To what degree? Are there any benefits in merging the data? How much does this change the overall picture? Are the changes (in trends) explainable?

How was the data collected: from where, and using what method? [1] What methodology/approach was used?

What's the dataset about? Can one recognize the data, the (data) entities, and the structures behind them? How big is the fact table (in terms of rows and columns)? How many dimensions are in scope?

What transformations, calculations or modifications have been applied? What was left out and what's the overall impact?

Were any significant assumptions made? [1] Were the assumptions clearly stated? Are they justified? Is there more to them?

Were any transformations applied? Do the transformations change any data characteristics? Were they adequately documented/explained? Do they make sense? Was anything important left out? What's the overall impact?

What criteria were used to include/exclude data from the display? [1] Are the criteria adequately explained/documented? Do they make sense?

Are similar data publicly available? Are they (freely) accessible/usable? To what degree? How much do the datasets overlap? Is there any benefit in analyzing/using the respective data? Are the characteristics comparable? To what degree?

Dataviz Questions

What's the title/subtitle of the chart? Is it meaningful for the readers? Does the title adequately reflect the data and the findings? Can it be better formulated? Is it an eye-catcher? Does it meet the expectations?

What data is shown? Of what type? At what level is the data aggregated? 

What chart (type) is being used? [1] Are the readers familiar with the chart type? Does it need further introduction/clarifications? Are there better means to represent the data? Does the chart offer the appropriate perspective? Does it make sense to offer different (complementary) perspective(s)? To what degree do other perspectives help?

What items of data do the marks represent? What value associations do the attributes represent? [1] Are the marks visible? Are the marks adequately presented (e.g. due to missing data)? 

What range of values is displayed? [1] What approximation do the values support? To what degree can the values be rounded without losing meaning?

Is the data categorical, ordinal or continuous? 

Are the axes properly chosen/displayed/labeled? Is the scale properly chosen (linear, semilogarithmic, logarithmic) and displayed? Do they emphasize, diminish, distort, simplify, or clutter the information?

What features (shapes, patterns, differences or connections) are observable, interesting or vital for understanding the chart? [1] 

Where are the largest, mid-sized and smallest values? (aka ‘stepped magnitude’ judgements). [1] 

Where do most/least values lie? Where is the average or normal? (aka ‘global comparison’ judgements) [1] How are the values distributed? Are there any outliers present? Are they explainable?
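
The questions about where values lie, how they are distributed and whether outliers are present can be given a first, rough answer before any chart is drawn. The sketch below uses Python's standard library and the common 1.5 x IQR rule of thumb for flagging outliers, which is one convention among several and not a method taken from [1]. The function name `describe` and the data are hypothetical.

```python
from statistics import quantiles

def describe(values):
    """Summarize a series and flag values outside the 1.5 * IQR fences."""
    q1, q2, q3 = quantiles(values, n=4)  # quartile cut points
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return {
        "min": min(values),
        "median": q2,
        "max": max(values),
        "outliers": [v for v in values if v < lower or v > upper],
    }

data = [10, 12, 11, 13, 12, 11, 95]  # hypothetical series with one extreme value
summary = describe(data)
print(summary)
```

Whether a flagged value is an error, an anomaly worth annotating or simply part of a skewed distribution still has to be decided from the context.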

What features are expected or unexpected? [1] To what degree are they unexpected?  

What features are important given the subject? [1] 

What shapes and patterns strike readers as being semantically aligned with the subject? [1] 

What is the overall feeling when looking at the final result? Is the chart overcrowded? Can anything be left out/included? 

What colors were used? [1] Are the colors adequately chosen and meaningful? Do they follow the general recommendations?

What colors, patterns, forms do readers see first? What impressions come next, or last longer?

Are the various elements adequately/intuitively positioned/distinguishable? What's the degree of overlapping/proximity? Do the elements respect an intuitive hierarchy? Do they match readers' expectations and the best practices in scope? Are the deviations justified?

Is the space properly used? To what degree? Are there major gaps? 

Know Your Audience

What audience does the visualization target? What are its characteristics (level of experience with data visualizations; authors, experts or casual attendees)? Are there any accidental attendees? How likely is the audience to pay attention?

What is the audience’s relationship with the subject matter? What knowledge do they have or, conversely, lack about the subject? What assistance might they need to interpret the meaning of the subject? Do they have the capacity to comprehend what it means to them? [1]

Why does the audience want/need to understand the topic? Are they familiar with it, and actively interested or more passive? Are they able to grasp the intended meaning? [1] To what degree? What kind of challenges might be involved, of what nature?

What is their motivation? Do they have a direct, expressed need or are they more passive and indifferent? Is a way needed to persuade them or even seduce them to engage? [1] Can this be done without distorting the data and its meaning(s)?

What is their visualization literacy skill set? Do they require assistance perceiving the chart(s)? Are they sufficiently comfortable with operating features of interactivity? Do they have any visual accessibility issues (e.g. red–green color blindness)? Do they need to be (re)factored into the design? [1]

Reflections

What has been learnt? Has it reinforced or challenged existing knowledge? [1] Was new knowledge gained? How valuable is this knowledge? Can it be reused? In which contexts? 

Do the findings meet one's expectations? To what degree? Were the expectations justified? On what basis? What's missing? What's the gaps' relevance?

What feelings have been stirred? Has the experience had an impact emotionally? [1] To what degree? Is the impact positive/negative? Is the reaction justified/explainable? Are there any factors that distorted the reactions? Are they explainable? Do they make sense?

What does one do with this understanding? Is it just knowledge acquired or something to inspire action (e.g. making a decision or motivating a change in behavior)? [1] How relevant/valuable is the information for us? Can it be used/misused? To what degree? 

Are the data and its representation trustworthy? [1] To what degree?


References:
[1] Andy Kirk, "Data Visualisation: A Handbook for Data Driven Design" 2nd Ed., 2019

06 July 2025

🧭Business Intelligence: Perspectives (Part 32: Data Storytelling in Visualizations)

Business Intelligence Series

From data-related professionals to book authors on data visualization topics, there are many voices that require any visualization to tell a story, or to conform to storytelling principles and best practices, independently of the environment or context in which the respective artifacts are considered. The need for data visualizations to tell a story may be justified, though in business setups the data, its focus and context change continuously with the communication means and objectives, and, at least from this perspective, one can question whether storytelling is a hard requirement.

Data storytelling can be defined as "a structured approach for communicating data insights using narrative elements and explanatory visuals" [1]. Usually, this supposes the establishment of a context, or a foundation on which further facts, suppositions, findings, arguments, (conceptual) models, visualizations and other elements can be based. Stories help to focus the audience on the intended messages, they connect and eventually resonate with the audience, facilitate the retention of information and the understanding of the chain of implications that the decisions in scope have, and persuade and influence when needed.

Conversely, besides the fact that it takes time and effort to prepare stories and the related content (presentations, manually created visualizations, documentation), expecting each meeting to be a storytelling session can rapidly become a nuisance for the audience as well as for the presenters. As in any value-generating process, one should ask where the value of storytelling based on data visualizations lies relative to the effort involved, or whether the effort can be better invested in other areas.

In many scenarios, requiring a dashboard to tell a story is justified given that many dashboards look like a random combination of visuals and data whose relationship and meaning can be difficult to grasp and put into a plausible narrative, even if they are based on the same set of data. Data visualizations of any type should have an intentional, well-structured design that facilitates the navigation of visual elements, the understanding and retention of facts, and that resonates with the audience.

It’s questionable whether such practices can be implemented in a consistent and meaningful manner, especially when rich navigation features across multiple visuals are available for users to look at data from different perspectives. In such scenarios the identification of cases that require attention and the associations existing between well-established factors help in the discovery process.

Often, it feels like visuals were arranged randomly on the page or that there’s no apparent connection between them, which makes the navigation and understanding more challenging. For depicting a story, there must be a logical sequencing of the various visualizations displayed in the dashboards or reports, especially when visuals’ arrangement doesn’t reflect the typical navigation of the visuals or when the facts need a certain sequencing that facilitates understanding. Moreover, the sequencing doesn’t need to be linear, but it must have a clear start and end that encompass everything in between.

Storytelling works well in setups in which something is presented as the basis for one-time or limited-scope sessions like decision-making, fact-checking, awareness raising and other similar types of communication. However, when building solutions for business monitoring and data exploration, there can be multiple stories or no story worth telling, at least not for the predefined scope. Even if one can zoom in or out, or rearrange the visuals and add others to highlight the stories encompassed, the value added by taking the information out of the dashboards and performing such actions is often negligible, to the degree that it doesn’t pay off. A certain consistency, discipline and acumen are then needed for focusing on the important aspects and ignoring the nonessential.

References:
[1] Brent Dykes, "Effective Data Storytelling: How to Drive Change with Data, Narrative and Visuals", 2019

03 May 2025

🧭Business Intelligence: Perspectives (Part 31: More on Data Visualization)

Business Intelligence Series

There are many reasons why the data visualizations available in the different mediums can be considerate as having poor quality and unfortunately there is often more than one issue that can be corroborated with this - the complexity of the data or of the models behind them, the lack of identifying the right data, respectively aspects that should be visualized, poor data visualization software or the lack of skills to use its capabilities, improper choice of visual displays, misleading choice of scales, axes and other elements, the lack of clear outlines for telling a story respectively of pushing a story too far, not adapting visualizations to changing requirements or different perspectives, to name just the most important causes.

The complexity of the data increases with the dimensions associated typically with what we call currently big data - velocity, volume, value, variety, veracity, variability and whatever V might be in scope. If it's relatively easy to work with a small dataset, understanding its shapes and challenges, our understanding power decreases with the Vs added into the picture. Of course, we can always treat the data alike, though the broader the timeframe, the higher the chances are for the data to have important changing characteristics that can impact the outcomes. It can be simple definition changes or more importantly, the model itself. Data, processes and perspectives change fluidly with the many requirements, and quite often the further implications for reporting, visualizations and other aspects are not considered.

Quite often there's a gap between what one wants to achieve with a data visualization and the data or knowledge available. It might be a matter of missing values or whole attributes that would help to delimit clearly the different perspectives or of modelling adequately the processes behind. It can be the intrinsic data quality issues that can be challenging to correct after the fact. It can also be our understanding about the processes themselves as reflected in the data, or more important, on what's missing to provide better perspectives. Therefore, many are forced to work with what they have or what they know.

Many data visualizations inadvertently reflect their creators' understanding of the data, procedures, processes, and any other related aspects. Unfortunately, business users and other participants also have only limited views, and thus their knowledge must be elicited accordingly. Even then, there might be pieces of data that are not reflected in any available knowledge.

If one tortures the data enough, one or more stories worth telling can probably be identified. However, much of the data is dull to the degree that some creators feel forced to add elements. Earlier, one could have blamed the software for it, though modern software provides nice graphics and plenty of features that can help graphics creators in the process. Even high-quality data can reveal challenges that are difficult to overcome. One needs to compromise, and compromises can accumulate in so many places that one can but wonder whether the end result still reflects reality. Unfortunately, it's difficult to evaluate the impact of such gaps, though progress can be made by continuously evaluating them and finding appropriate methods to address them.

Not all stories must have complex visualizations in which multiple variables are used to provide the many perspectives. Some simple visualizations can be enough to establish common ground on which something more complex (or simpler) can be built. Data visualization is a continuous process of exploration, extrapolation, evaluation, and testing of assumptions and ideas, where one's experience can be a useful mediator between the various forces.


📊Graphical Representation: Graphics We Live By (Part XI: Comparisons Between Data Series)

Graphical Representation Series

Over the past 10-20 years it has become easy to create data visualizations just by dropping some of the available data into a tool like Excel and producing a visual depiction of it with a few clicks. In many cases, the first draft, typically the tool's default output, doesn't even need further work because the objective was reached, while in others the creator needs a minimum skillset to make the visualization useful, appealing, or whatever other quality is required of the work in scope. However, the audience might judge the visualization(s) from different perspectives, and there can be a broad audience with different skills in reading, evaluating and understanding the work.

There are many depictions on the web resembling the one below, taken from a LinkedIn post:

Example Chart - Boeing vs. Airbus

Even if the visualization is not perfect, it does a fair job of representing the data. Improvements can be made to the labels, the title, the positioning of elements, and the color palette used. At least these were the improvements made in the original post. One must also differentiate between the environments in which charts are made available, as the print format has different characteristics from the formats used in business setups. Unfortunately, the requirements of the two are widely confused, probably also because the mediums used overlap.

It's probably a good idea to always start with the raw data (or summaries of it) when the result consists of only a few data points that can be easily displayed in a table like the one below (the feature to round the decimals for integer values should be available soon in Power BI):

Summary Table

Of course, one can calculate more meaningful values like percentages of the total, standard deviations and other values that offer more perspectives on the data. Even if the values adequately reflect reality, the reader is left to wonder about the local and global minima and maxima, not to mention the meaning of the data points, which is more easily identified in a chart. At least in the case of small data sets, using a table in combination with a chart can provide a more complete perspective and different ways of analyzing the data, especially when the navigation is interactive.
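As a rough illustration of the summary values mentioned above, the following minimal sketch computes the total, mean, standard deviation, extremes and each month's share of the total for both series. It uses Python rather than Power BI, which is an assumption made only so the numbers can be checked outside the tool. The values are copied from the Power Query script in this post, while the function name `summarize` and the choice of the population standard deviation are illustrative.

```python
from statistics import mean, pstdev

# Monthly deliveries copied from the Power Query script in this post (Oct 2023 to Sep 2024).
MONTHS = ["Oct", "Nov", "Dec", "Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep"]
BOEING = [30, 40, 40, 20, 30, 30, 40, 40, 50, 40, 40, 30]
AIRBUS = [50, 40, 110, 30, 40, 60, 60, 50, 80, 90, 50, 50]


def summarize(values):
    """Return total, mean, population std. deviation, extremes and monthly shares (in %)."""
    total = sum(values)
    return {
        "total": total,
        "mean": mean(values),
        "std_dev": pstdev(values),  # assumption: population, not sample, deviation
        "min": min(values),
        "max": max(values),
        "pct_of_total": [100 * v / total for v in values],
    }


for name, series in (("Boeing", BOEING), ("Airbus", AIRBUS)):
    summary = summarize(series)
    print(f"{name}: total={summary['total']} mean={summary['mean']:.1f} "
          f"std={summary['std_dev']:.1f} min={summary['min']} max={summary['max']}")
    for month, pct in zip(MONTHS, summary["pct_of_total"]):
        print(f"  {month}: {pct:.1f}% of total")
```

The same figures could be added as extra columns to the summary table, so the reader sees shares and spread next to the raw counts.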

Column and bar charts do a fair job in comparing values over time, though they do use a lot of ink in the process (see D). While they make it easy to compare neighboring values, the rectangles used tend to occupy a lot of space when they are made too wide or too high to cover the empty space within the display (e.g. when just a few values are displayed, space being wasted in the process). As the main downside, it takes a lot of scanning until the reader identifies the overall trends, and the further away the bars are from each other, the more difficult it becomes to do comparisons. 

In theory, line charts are more efficient in representing the above data points, because the marks are usually small and the line thin enough to provide a better data-ink ratio, while one can see a lot at a glance. In Power BI the creator can use different types of interpolation: linear (A), step (B) or smooth (C). In many cases, it might be a good idea to use linear interpolation, though when there is little or no overlap, it might be worthwhile to explore the other types of interpolation too (and to request further feedback from the users):

Linear, Step and Smooth Line Charts
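To make the difference between the linear and step variants concrete, here is a small Python sketch of how a value between two data points is obtained under each. The function names are hypothetical, and the "hold the previous value" behavior for the step variant is an assumption, since the post does not say how Power BI draws its steps. The smooth variant is left out for the same reason.

```python
def linear_interpolation(x0, y0, x1, y1, x):
    """Value on the straight line between (x0, y0) and (x1, y1) at position x."""
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def step_interpolation(x0, y0, x1, y1, x):
    """Hold the left value until the right point is reached (assumed 'step-after' convention)."""
    return y0 if x < x1 else y1


# Airbus deliveries for Nov (40) and Dec (110), with the two months placed at x=0 and x=1.
midpoint_linear = linear_interpolation(0, 40, 1, 110, 0.5)
midpoint_step = step_interpolation(0, 40, 1, 110, 0.5)

print("linear at midpoint:", midpoint_linear)
print("step at midpoint:", midpoint_step)
```

Neither in-between value is an actual observation. The interpolation only changes how the eye connects the monthly points, which is why the choice is best validated with the users.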

The nearness of values from different series can make it difficult to identify the points adequately and to tell the lines apart (see B). When the density of values allows it, it also makes sense to include the averages for each data series to reflect the distance between the two data sets. Unfortunately, the chart can get crowded if further data series or summaries are added to the chart(s).
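The averages and the months in which the two series come close can be worked out with a few lines of Python. This is only a sketch: the values are copied from the Power Query script in this post, and the threshold of 10 units for calling two points "near" is a hypothetical choice rather than anything prescribed by the chart.

```python
from statistics import mean

# Monthly deliveries copied from the Power Query script in this post (Oct 2023 to Sep 2024).
MONTHS = ["Oct", "Nov", "Dec", "Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep"]
BOEING = [30, 40, 40, 20, 30, 30, 40, 40, 50, 40, 40, 30]
AIRBUS = [50, 40, 110, 30, 40, 60, 60, 50, 80, 90, 50, 50]

# Hypothetical threshold: points closer than this are likely to overlap on the chart.
NEAR_THRESHOLD = 10

avg_boeing = mean(BOEING)
avg_airbus = mean(AIRBUS)
avg_gap = avg_airbus - avg_boeing  # distance between the two average lines

# Months in which the two series are within the threshold of each other.
near_months = [
    month
    for month, boeing, airbus in zip(MONTHS, BOEING, AIRBUS)
    if abs(airbus - boeing) <= NEAR_THRESHOLD
]

print(f"Average Boeing: {avg_boeing:.1f}, average Airbus: {avg_airbus:.1f}, gap: {avg_gap:.1f}")
print("Months with near values:", ", ".join(near_months))
```

A list like `near_months` can help decide where data labels need to be repositioned or dropped to keep the points readable.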

While the column chart (E) is close to the redesigned chart provided in the original post, the other alternatives can provide more value depending on the case. Stacked column charts (D) also allow comparing the overall quantity by month, area charts (F) tend to use even more color than needed, while waterfall charts (G) allow comparing the difference between data points per time unit. Tornado charts (H) are a variation of bar charts that makes it easier to compare the size of the bars, while ribbon charts (I) show the stacked values well.

Alternatives to Line Charts
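The quantities behind two of these alternatives are simple to derive. The Python sketch below computes the combined monthly totals a stacked column chart makes visible and the month-over-month changes a waterfall chart could display. Reading "difference between data points per time unit" as the change from one month to the next is an assumption, and the variable names are illustrative.

```python
# Monthly deliveries copied from the Power Query script in this post (Oct 2023 to Sep 2024).
MONTHS = ["Oct", "Nov", "Dec", "Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep"]
BOEING = [30, 40, 40, 20, 30, 30, 40, 40, 50, 40, 40, 30]
AIRBUS = [50, 40, 110, 30, 40, 60, 60, 50, 80, 90, 50, 50]

# Height of each stacked column: both manufacturers combined per month.
stacked_totals = [boeing + airbus for boeing, airbus in zip(BOEING, AIRBUS)]

# Waterfall-style steps: change of the combined total from one month to the next.
changes = [current - previous for previous, current in zip(stacked_totals, stacked_totals[1:])]

for month, total in zip(MONTHS, stacked_totals):
    print(f"{month}: combined total {total}")
for month, change in zip(MONTHS[1:], changes):
    print(f"{month}: change vs. previous month {change:+d}")
```

Starting from the first month's total and adding all the changes leads to the last month's total, which is the property a waterfall chart relies on.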

One should consider changing the subtitle(s) slightly to reflect the chart type when the patterns shown imply a shift in attention or meaning. Depending on the case, more than one of the above charts can be used within the same report when two or more perspectives are important. Using a complementary perspective can facilitate the understanding of the data or the identification of patterns that aren't easily seen otherwise.

In general, graphics creators try to use various representational means to facilitate the understanding of a data set, though seldom do only two series or a small subset of dimensions provide a complete description. The value of data comes when multiple perspectives are combined. Frankly, the same can be said about the above data series. Yes, there are important differences between the two series, but how do the numbers compare when one looks at the bigger picture, especially when broken down by element type (e.g. airplane size)? How about plan vs. actual values, or how much longer production or other processes take? It's one of a visualization's goals to improve the questions posed, but how efficient are visualizations that barely scratch the surface?

As for the code, the following scripts can be used to prepare the data:

// Power Query script (Boeing vs Airbus)
let
    // Inline table with one row per month
    Source = #table({"Sorting", "Month Name", "Serial Date", "Boeing Deliveries", "Airbus Deliveries"},
    {
        {1, "Oct", #date(2023, 10, 31), 30, 50},
        {2, "Nov", #date(2023, 11, 30), 40, 40},
        {3, "Dec", #date(2023, 12, 31), 40, 110},
        {4, "Jan", #date(2024, 1, 31), 20, 30},
        {5, "Feb", #date(2024, 2, 29), 30, 40},  // Leap year adjustment
        {6, "Mar", #date(2024, 3, 31), 30, 60},
        {7, "Apr", #date(2024, 4, 30), 40, 60},
        {8, "May", #date(2024, 5, 31), 40, 50},
        {9, "Jun", #date(2024, 6, 30), 50, 80},
        {10, "Jul", #date(2024, 7, 31), 40, 90},
        {11, "Aug", #date(2024, 8, 31), 40, 50},
        {12, "Sep", #date(2024, 9, 30), 30, 50}
    }
    ),
    // Set explicit column types so sorting and date formatting behave as expected
    #"Changed Types" = Table.TransformColumnTypes(Source, {{"Sorting", Int64.Type}, {"Month Name", type text}, {"Serial Date", type date}, {"Boeing Deliveries", Int64.Type}, {"Airbus Deliveries", Int64.Type}})
in
    #"Changed Types"

It can be useful to create the labels for the charts dynamically:

-- DAX code for labels
MaxDate = FORMAT(MAX('Boeing vs Airbus'[Serial Date]), "MMM-YYYY")
MinDate = FORMAT(MIN('Boeing vs Airbus'[Serial Date]), "MMM-YYYY")
MinMaxDate = [MinDate] & " to " & [MaxDate]
Title Boeing Airbus = "Boeing and Airbus Deliveries " & [MinMaxDate]
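To check what the dynamic label should read for the data above, the same logic can be mirrored outside Power BI. The following Python sketch is an illustration, not part of the report. The helper `format_month_year` is hypothetical and imitates the "MMM-YYYY" format string with a fixed English month list, so the result does not depend on the machine's locale.

```python
from datetime import date

# English month abbreviations, used instead of locale-dependent formatting.
MONTH_ABBR = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
              "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

# Serial dates copied from the Power Query script in this post.
SERIAL_DATES = [
    date(2023, 10, 31), date(2023, 11, 30), date(2023, 12, 31),
    date(2024, 1, 31), date(2024, 2, 29), date(2024, 3, 31),
    date(2024, 4, 30), date(2024, 5, 31), date(2024, 6, 30),
    date(2024, 7, 31), date(2024, 8, 31), date(2024, 9, 30),
]


def format_month_year(value):
    """Imitate the 'MMM-YYYY' format, e.g. date(2023, 10, 31) becomes 'Oct-2023'."""
    return f"{MONTH_ABBR[value.month - 1]}-{value.year}"


min_date = format_month_year(min(SERIAL_DATES))
max_date = format_month_year(max(SERIAL_DATES))
min_max_date = f"{min_date} to {max_date}"
title = f"Boeing and Airbus Deliveries {min_max_date}"

print(title)
```

Because the label is derived from the minimum and maximum dates, it follows the data automatically when months are added or filtered out.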

Happy coding!
