27 July 2025

📊Graphical Representation: Sense-making in Data Visualizations (Part 2: Guidelines)

Graphical Representation Series

Consider the following best practices in data visualizations (work in progress):

  • avoid poor labeling and annotation practices
    • label data points
      • consider labeling at least the important points
        • e.g. starts, ends, local/global minima/maxima
        • label only a few points when labels clutter the chart or there's minimal variation
    • avoid abbreviations
      • unless they are clearly defined upfront, used consistently, and/or universally understood
      • undefined abbreviations can hinder understanding
        • abbreviations should help compress content without losing meaning
    • use font types, font sizes, and text orientation that are easy to read
    • avoid stylish design that makes content hard to read
    • avoid redundant information
    • text should never overshadow or distort the actual message or data
      • use neutral, precise wording
  • avoid the misuse of pre-attentive attributes
    • aka visual features that our brains process almost instantly
    • color
      • has identity value: used to distinguish one thing from another
        • carries its own connotations
        • gives a visual scale of measure
        • the use of color doesn’t always help
      • hue 
        • refers to the dominant color family of a specific color, which the brain processes based on the different wavelengths of light
          • allows differentiating categories
        • use distinct hues to represent different categories
      • intensity (aka brightness)
        • refers to how strong or weak a color appears
      • saturation (aka chroma)
        • refers to the purity or vividness of a color
          • as saturation decreases, the color becomes more muted or washed out
          • highly saturated colors have little or no gray in them
          • highly desaturated colors are almost gray, with almost none of the original hue
        • use high saturation for important elements like outliers, trends, or alerts
        • use low saturation for background elements
      • avoid pure colors that are bright and saturated
        • they pull too much attention toward the respective elements
      • avoid colors that are too similar in tone or saturation
      • avoid colors hard to distinguish for color-blind users
        • e.g. red-green color blindness
          • brown-green, orange-red, blue-purple combinations
          • avoid red-green pairings for status indicators 
            • e.g. success/error
        • e.g. blue-yellow color blindness
          • blue-green, yellow-pink, purple-blue
        • e.g. total color blindness (aka monochromacy)
          • all colors appear as shades of gray
            • ⇒ users must rely entirely on contrast, shape, and texture
      • use icons, labels, or patterns alongside color
      • use tools to test for color issues
      • use colorblind-safe palettes 
      • for sequential data, use one hue and vary saturation or brightness to show magnitude
        • for diverging data, use two hues that meet at a neutral midpoint
      • start with all-gray data elements
        • use color only when it corresponds to differences in data
          • ⇐ helps draw attention to whatever isn’t gray
      • dull and neutral colors give a sense of uniformity
      • color can modify or contradict readers' intuitive response
      • choose colors to draw attention, to label, to show relationships 
    • form
      • shape
        • allows distinguishing types of data points and encoding information
          • well-shaped data has functional and aesthetic character
        • complex shapes can be more difficult to perceive
      • size
        • attribute used to encode the magnitude or extent of elements 
        • should be aligned to its probable use, importance, and amount of detail involved
          • larger elements draw more attention
        • its encoding should be meaningful
          • e.g. magnitudes of deviations from the baseline
        • overemphasis can lead to distortions
        • choose a size range that is appropriate for the data
        • avoid using size to represent nominal or categorical data where there's no inherent order to the sizes
      • orientation
        • angled or rotated items stand out
      • length/width
        • useful in bar charts to show quantity
        • avoid stacked bar graphs
      • curvature
        • curved lines can contrast with straight ones
      • collinearity
        • alignment can suggest grouping or flow
    • highlighting
    • spatial positioning
      • 2D position
        • placement on axes or grids conveys value 
      • 3D position in 2D space
      • grouping
        • proximity implies relationships
        • keep columns or bars close together
      • enclosure
        • borders or shaded areas signal clusters
      • depth (stereoscopic or shading)
        • adds dimensionality
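The saturation guidance above (high saturation for important elements, low saturation for background elements, gray as the starting point) can be sketched with Python's standard `colorsys` module. This is only a minimal illustration: the helper names `with_saturation` and `to_hex`, the base color, and the saturation levels are assumptions made for the example, not values from the original notes.

```python
import colorsys


def with_saturation(rgb, saturation):
    """Return an RGB triple (floats in 0..1) with its HSV saturation replaced.

    Hue and brightness (HSV value) are kept, so only the vividness changes.
    """
    hue, _, value = colorsys.rgb_to_hsv(*rgb)
    return colorsys.hsv_to_rgb(hue, saturation, value)


def to_hex(rgb):
    """Format an RGB triple (floats in 0..1) as a #rrggbb string."""
    return "#{:02x}{:02x}{:02x}".format(*(round(c * 255) for c in rgb))


# Illustrative base color (a blue); any hue could be used instead.
base = (0.0, 0.4, 0.8)

highlight = with_saturation(base, 1.0)    # vivid: outliers, alerts
background = with_saturation(base, 0.15)  # muted: context elements
gray = with_saturation(base, 0.0)         # fully desaturated starting point

for name, color in [("highlight", highlight),
                    ("background", background),
                    ("gray", gray)]:
    print(name, to_hex(color))
```

Because only the saturation channel changes, the three variants keep the same hue family and brightness, which is one way to keep emphasis separate from category identity.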
  • avoid graphical features that are purely decorative
    • aka elements that don't affect understanding, structure or usability
    • stylistic embellishments
      • borders/frames
        • ornamental lines or patterns around content
      • background images
        • images used for ambiance, not content
      • drop shadows and gradients
        • enhance depth or style but don’t add meaning
      • icons without function
        • decorative icons that don’t represent actions or concepts
    • non-informative imagery
      • stock photos
        • generic visuals that aren’t referenced in the text
      • illustrations
        • added for visual interest, not explanation
      • mascots or logos
        • when repeated or not tied to specific content
    • layout elements
      • spacers
        • transparent or blank images used to control layout
        • leave the right amount of 'white' space between chart elements
      • custom bullets or list markers
        • designed for flair, not clarity
      • visual separators
        • lines or shapes that divide sections without conveying hierarchy or meaning
  • avoid bias
    • sampling bias
      • showing data that doesn’t represent the full population
        • avoid cherry-picking data
          • aka selecting only the data that support a particular viewpoint while ignoring others that might contradict it
          • enable users to look at both sets of data and contrast them
          • enable users to navigate the data
        • avoid survivor bias
          • aka focusing only on the data that 'survived' a process and ignoring the data that didn’t
      • use representative data
        • aka the dataset includes all relevant groups
      • check for collection bias
        • avoid data that only comes from one source 
        • avoid data that excludes key demographics
    • cognitive bias
      • mental shortcuts that sometimes affect interpretation
        • incl. confirmation bias, framing bias, pattern bias
      • balance visual hierarchies
        • don’t make one group look more important by overemphasizing it
      • show uncertainty
        • by including confidence intervals or error bars to reflect variability
      • separate comparisons
        • when comparing groups, use adjacent charts rather than combining them into one that implies a hierarchy
          • e.g. ethnicities, region
    • visual bias
      • design choices that unintentionally (or intentionally) distort meaning
        • or how viewers interpret the data
      • avoid manipulating axes 
        • by truncating y-axis
          • exaggerates differences
        • by changing scale types
          • linear vs. logarithmic
            • a log scale compresses large values and expands small ones, which can flatten exponential growth or make small changes seem more significant
          • uneven intervals
            • using inconsistent spacing between tick marks can distort trends
        • by zooming in/out
          • adjusting the axis to focus on a specific range can highlight or hide variability and possibly obscure the bigger picture
        • by using dual axes
          • if the scales differ too much, it can falsely imply correlation or exaggerate relationships 
        • by distorting the aspect ratio
          • stretching or compressing the chart area can visually amplify or flatten trends
            • e.g. a steep slope might look flat if the x-axis is stretched
        • avoid inconsistent scales
        • label axes clearly
        • explain scale choices
      • avoid overemphasis 
        • avoid unnecessary repetition 
          • e.g. of the same graph, of content
        • avoid focusing on outliers or (short-term) trends
        • avoid truncating axes, exaggerating scales
        • avoid manipulating the visual hierarchy 
      • avoid color bias
        • bright colors draw attention unfairly
      • avoid overplotting 
        • too much data obscures patterns
      • avoid clutter
        • creates cognitive friction
          • users struggle to focus on what matters because their attention is pulled in too many directions
          • is about design excess
        • avoid unnecessary or distracting elements 
          • they don’t contribute to understanding the data
      • avoid overloading 
        • attempting to show too much data at once
          • is about data excess
        • overwhelms readers' processing capacity, making it hard to extract insights or spot patterns
    • algorithmic bias 
      • the use of ML or other data processing techniques can reinforce certain aspects (e.g. social inequalities, stereotypes)
      • visualize uncertainty
        • include error bars, confidence intervals, and notes on limitations
      • audit data and algorithms
        • look for bias in inputs, model assumptions and outputs
    • intergroup bias
      • charts tend to reflect or reinforce societal biases
        • e.g. racial or gender disparities
      • use thoughtful ordering, inclusive labeling
      • avoid deficit-based comparisons
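The point under visual bias that a truncated y-axis exaggerates differences can be checked with simple arithmetic. The sketch below compares the ratio of two bar heights as they would be drawn from a zero baseline and from a truncated one. The function name `apparent_ratio` and the sample values are hypothetical, chosen only for illustration.

```python
def apparent_ratio(smaller, larger, axis_min=0.0):
    """Ratio of the drawn bar heights when the y-axis starts at axis_min."""
    if axis_min >= smaller:
        raise ValueError("axis_min must lie below both values")
    return (larger - axis_min) / (smaller - axis_min)


# Hypothetical values, chosen only for illustration.
a, b = 50.0, 55.0

true_ratio = apparent_ratio(a, b, axis_min=0.0)        # bars drawn from zero
truncated_ratio = apparent_ratio(a, b, axis_min=45.0)  # y-axis starts at 45

print(f"from zero: {true_ratio:.2f}x, truncated: {truncated_ratio:.2f}x")
```

With these numbers a 10% difference in the data is drawn as a bar twice as tall, which is why a truncated axis should be labeled clearly and the scale choice explained.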
  • avoid overcomplicating the visualizations 
    • e.g. by including too much data, details, other elements
  • avoid comparisons across varying dimensions 
    • e.g. (two) circles of different radius, bar charts of different length, column charts of different height
    • don't make users compare angles, areas, volumes
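One reason comparing circles is error-prone is that area grows with the square of the radius. The sketch below works through that arithmetic, assuming a value is encoded either by radius or by area. Both function names are illustrative and do not come from any charting library.

```python
import math


def area_ratio_if_radius_encodes_value(v1, v2):
    """Ratio of circle areas when each radius is proportional to its value."""
    return (v2 / v1) ** 2


def radius_for_area_encoding(value):
    """Radius that makes the circle's area equal to the value."""
    return math.sqrt(value / math.pi)


# A value that doubles (1 -> 2) quadruples the drawn area under radius encoding.
print(area_ratio_if_radius_encodes_value(1.0, 2.0))

# Under area encoding the radius grows only by sqrt(2), roughly 1.41x.
r1 = radius_for_area_encoding(1.0)
r2 = radius_for_area_encoding(2.0)
print(round(r2 / r1, 2))
```

Either way, the reader has to judge a quantity (area or radius) that does not map linearly to what the eye picks up, so a length-based encoding on a common baseline is usually easier to compare.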