
27 July 2025

📊Graphical Representation: Sense-making in Data Visualizations (Part 2: Guidelines)

Graphical Representation Series

Consider the following best practices in data visualizations (work in progress):

  • avoid poor labeling and annotation practices
    • label data points
      • consider labeling at least the most important points
        • e.g. starts, ends, local/global minima/maxima
        • especially when labeling every point would clutter the chart or there's minimal variation
    • avoid abbreviations
      • unless they are clearly defined upfront, used consistently, and/or universally understood
      • undefined abbreviations can hinder understanding
        • abbreviations should help compress content without losing meaning
    • use font types, font sizes, and text orientation that are easy to read
    • avoid stylish design that makes content hard to read
    • avoid redundant information
    • text should never overshadow or distort the actual message or data
      • use neutral, precise wording
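
As a rough illustration of the labeling advice above, the following sketch picks the points worth labeling in a series: start, end, global minimum/maximum, and strict local extrema. The function name `key_label_indices` and the sample series are made up for this example, and plateaus are deliberately ignored to keep it short.

```python
def key_label_indices(values):
    """Return sorted indices worth labeling in a series.

    Picks the start, the end, the first global minimum and maximum,
    and every strict local peak or trough.
    """
    n = len(values)
    if n == 0:
        return []
    picked = {0, n - 1}  # always label the start and the end
    picked.add(values.index(min(values)))  # first global minimum
    picked.add(values.index(max(values)))  # first global maximum
    for i in range(1, n - 1):
        prev, cur, nxt = values[i - 1], values[i], values[i + 1]
        is_peak = cur > prev and cur > nxt
        is_trough = cur < prev and cur < nxt
        if is_peak or is_trough:
            picked.add(i)
    return sorted(picked)


series = [3, 5, 4, 4, 2, 6, 9, 7]  # illustrative values only
print(key_label_indices(series))  # [0, 1, 4, 6, 7]
```

The remaining points can stay unlabeled, which keeps the chart readable when there are many data points.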
  • avoid the misuse of pre-attentive attributes
    • aka visual features that our brains process almost instantly
    • color
      • has identity value: used to distinguish one thing from another
        • carries its own connotations
        • gives a visual scale of measure
        • the use of color doesn’t always help
      • hue 
        • refers to the dominant color family of a specific color, which the brain processes based on the different wavelengths of light
          • allows differentiating categories
        • use distinct hues to represent different categories
      • intensity (aka brightness)
        • refers to how light or dark a color appears
      • saturation (aka chroma)
        • refers to the purity or vividness of a color
          • as saturation decreases, the color becomes more muted or washed out
          • highly saturated colors have little or no gray in them
          • highly desaturated colors are almost gray, with little of the original hue left
        • use high saturation for important elements like outliers, trends, or alerts
        • use low saturation for background elements
      • avoid pure colors that are bright and saturated
        • they draw attention to the respective elements regardless of their importance
      • avoid colors that are too similar in tone or saturation
      • avoid colors hard to distinguish for color-blind users
        • e.g. red-green color blindness
          • brown-green, orange-red, blue-purple combinations
          • avoid red-green pairings for status indicators 
            • e.g. success/error
        • e.g. blue-yellow color blindness
          • blue-green, yellow-pink, purple-blue
        • e.g. total color blindness (aka monochromacy)
          • all colors appear as shades of gray
            • ⇒ users must rely entirely on contrast, shape, and texture
      • use icons, labels, or patterns alongside color
      • use tools to test for color issues
      • use colorblind-safe palettes 
      • for sequential data, use one hue and vary saturation or brightness to show magnitude
        • for diverging data, use two hues that meet at a neutral midpoint
      • start with all-gray data elements
        • use color only when it corresponds to differences in data
          • ⇐ helps draw attention to whatever isn’t gray
      • dull and neutral colors give a sense of uniformity
      • color can modify or contradict readers' intuitive response
      • choose colors to draw attention, to label, to show relationships 
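
One way to check whether two colors are "too similar in tone" is to compare their luminance, which is roughly what remains for readers with monochromacy. The sketch below uses the relative luminance and contrast ratio formulas from WCAG 2.x; the helper names and the two example colors are assumptions for illustration, not part of any charting library.

```python
def relative_luminance(hex_color):
    """WCAG 2.x relative luminance of an sRGB color given as '#rrggbb'."""
    hex_color = hex_color.lstrip("#")
    channels = [int(hex_color[i:i + 2], 16) / 255 for i in (0, 2, 4)]
    # Undo the sRGB gamma encoding before weighting the channels.
    linear = [
        c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
        for c in channels
    ]
    r, g, b = linear
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(color_a, color_b):
    """WCAG contrast ratio, from 1 (same luminance) to 21 (black vs. white)."""
    la, lb = relative_luminance(color_a), relative_luminance(color_b)
    lighter, darker = max(la, lb), min(la, lb)
    return (lighter + 0.05) / (darker + 0.05)


# A red/green pair that differs clearly in hue but barely in luminance.
print(round(contrast_ratio("#ff0000", "#008000"), 2))
```

A ratio close to 1 means the two colors will look almost identical once hue is taken away, so such a pair needs an additional cue like a label, icon, or pattern.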
    • form
      • shape
        • allows distinguishing types of data points and encoding information
          • well-shaped data has functional and aesthetic character
        • complex shapes are harder to perceive
      • size
        • attribute used to encode the magnitude or extent of elements 
        • should be aligned to its probable use, importance, and amount of detail involved
          • larger elements draw more attention
        • its encoding should be meaningful
          • e.g. magnitudes of deviations from the baseline
        • overemphasis can lead to distortions
        • choose a size range that is appropriate for the data
        • avoid using size to represent nominal or categorical data where there's no inherent order to the sizes
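
A common source of distortion with size is scaling a circle's radius linearly with the value: the area then grows with the square of the value and exaggerates differences. A minimal sketch of area-proportional scaling follows; the function name `radius_for_value` and the default `max_radius` are assumptions chosen for this example.

```python
import math


def radius_for_value(value, max_value, max_radius=20.0):
    """Radius that makes the circle's AREA proportional to the value."""
    if value < 0 or max_value <= 0:
        raise ValueError("value must be >= 0 and max_value must be > 0")
    # Area ~ radius squared, so the radius scales with the square root.
    return max_radius * math.sqrt(value / max_value)


print(radius_for_value(100, 100))  # 20.0
print(radius_for_value(25, 100))   # 10.0
```

A value four times smaller gets half the radius, so its area is a quarter of the largest circle, which matches the data.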
      • orientation
        • angled or rotated items stand out
      • length/width
        • useful in bar charts to show quantity
        • avoid stacked bar graphs
      • curvature
        • curved lines can contrast with straight ones
      • collinearity
        • alignment can suggest grouping or flow
    • highlighting
    • spatial positioning
      • 2D position
        • placement on axes or grids conveys value 
      • 3D position in 2D space
      • grouping
        • proximity implies relationships
        • keep columns or bars close together
      • enclosure
        • borders or shaded areas signal clusters
      • depth (stereoscopic or shading)
        • adds dimensionality
  • avoid graphical features that are purely decorative
    • aka elements that don't affect understanding, structure or usability
    • stylistic embellishments
      • borders/frames
        • ornamental lines or patterns around content
      • background images
        • images used for ambiance, not content
      • drop shadows and gradients
        • enhance depth or style but don’t add meaning
      • icons without function
        • decorative icons that don’t represent actions or concepts
    • non-informative imagery
      • stock photos
        • generic visuals that aren’t referenced in the text
      • illustrations
        • added for visual interest, not explanation
      • mascots or logos
        • when repeated or not tied to specific content
    • layout elements
      • spacers
        • transparent or blank images used to control layout
        • leave the right amount of 'white' space between chart elements
      • custom bullets or list markers
        • designed for flair, not clarity
      • visual separators
        • lines or shapes that divide sections without conveying hierarchy or meaning
  • avoid bias
    • sampling bias
      • showing data that doesn’t represent the full population
        • avoid cherry-picking data
          • aka selecting only the data that support a particular viewpoint while ignoring others that might contradict it
          • enable users to look at both sets of data and contrast them
          • enable users to navigate the data
        • avoid survivor bias
          • aka focusing only on the data that 'survived' a process and ignoring the data that didn’t
      • use representative data
        • aka the dataset includes all relevant groups
      • check for collection bias
        • avoid data that only comes from one source 
        • avoid data that excludes key demographics
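
A quick sanity check for representativeness is to compare each group's share in the sample with its known share in the population. The sketch below assumes such population shares are available; the function name `share_gaps`, the group names, and the numbers are invented for illustration.

```python
from collections import Counter


def share_gaps(sample_groups, population_shares):
    """Sample share minus population share per group.

    Positive values mean the group is over-represented in the sample,
    negative values mean it is under-represented (or missing entirely).
    """
    total = len(sample_groups)
    if total == 0:
        raise ValueError("sample must not be empty")
    counts = Counter(sample_groups)
    return {
        group: counts.get(group, 0) / total - share
        for group, share in population_shares.items()
    }


# Made-up example: group C is missing from the sample altogether.
sample = ["A"] * 8 + ["B"] * 2
population = {"A": 0.5, "B": 0.3, "C": 0.2}
for group, gap in share_gaps(sample, population).items():
    print(group, round(gap, 2))
```

Large gaps are a hint that the chart may describe the sample rather than the population it claims to represent.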
    • cognitive bias
      • mental shortcuts that sometimes affect interpretation
        • incl. confirmation bias, framing bias, pattern bias
      • balance visual hierarchies
        • don’t make one group look more important by overemphasizing it
      • show uncertainty
        • by including confidence intervals or error bars to reflect variability
      • separate comparisons
        • when comparing groups, use adjacent charts rather than combining them into one that implies a hierarchy
          • e.g. ethnicities, regions
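
For the "show uncertainty" point above, the bounds of an error bar can be computed before any plotting. The following is a minimal sketch that assumes a normal approximation (z = 1.96 for roughly 95%); for small samples a t-distribution would give wider intervals. The function name `mean_with_ci` and the sample values are made up.

```python
import statistics


def mean_with_ci(sample, z=1.96):
    """Return (mean, lower, upper) using a normal-approximation interval."""
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    mean = statistics.fmean(sample)
    # Standard error of the mean: sample standard deviation / sqrt(n).
    sem = statistics.stdev(sample) / n ** 0.5
    return mean, mean - z * sem, mean + z * sem


mean, low, high = mean_with_ci([10, 12, 11, 13, 9])  # illustrative data
print(f"mean={mean:.2f}, interval=[{low:.2f}, {high:.2f}]")
```

The `low` and `high` values are what an error bar or a shaded band would show around the mean.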
    • visual bias
      • design choices that unintentionally (or intentionally) distort meaning
        • or how viewers interpret the data
      • avoid manipulating axes 
        • by truncating y-axis
          • exaggerates differences
        • by changing scale types
          • linear vs. logarithmic
            • a log scale compresses large values and expands small ones, which can flatten exponential growth or make small changes seem more significant
          • uneven intervals
            • using inconsistent spacing between tick marks can distort trends
        • by zooming in/out
          • adjusting the axis to focus on a specific range can highlight or hide variability and possibly obscure the bigger picture
        • by using dual axes
          • if the scales differ too much, it can falsely imply correlation or exaggerate relationships 
        • by distorting the aspect ratio
          • stretching or compressing the chart area can visually amplify or flatten trends
            • e.g. a steep slope might look flat if the x-axis is stretched
        • avoid inconsistent scales
        • label axes clearly
        • explain scale choices
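
The effect of a truncated y-axis can be put into numbers by comparing the drawn bar lengths with the actual values. A small sketch follows, with the hypothetical helper `apparent_ratio` and made-up values:

```python
def apparent_ratio(a, b, baseline=0.0):
    """Ratio of the drawn bar lengths for values a and b.

    The bars are assumed to start at `baseline`, i.e. where the y-axis begins.
    """
    if min(a, b) <= baseline:
        raise ValueError("both values must lie above the baseline")
    return (a - baseline) / (b - baseline)


print(apparent_ratio(105, 100))               # 1.05
print(apparent_ratio(105, 100, baseline=95))  # 2.0
```

With the axis starting at 0 the bars differ by 5%, as the data does; with the axis starting at 95 one bar is drawn twice as long as the other, although the values are the same.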
      • avoid overemphasis 
        • avoid unnecessary repetition 
          • e.g. of the same graph, of content
        • avoid focusing on outliers or (short-term) trends
        • avoid truncating axes, exaggerating scales
        • avoid manipulating the visual hierarchy 
      • avoid color bias
        • bright colors draw attention unfairly
      • avoid overplotting 
        • too much data obscures patterns
      • avoid clutter
        • creates cognitive friction
          • users struggle to focus on what matters because their attention is pulled in too many directions
          • clutter is about design excess
        • avoid unnecessary or distracting elements 
          • they don’t contribute to understanding the data
      • avoid overloading 
        • attempting to show too much data at once
          • overloading is about data excess
        • overwhelms readers' processing capacity, making it hard to extract insights or spot patterns
    • algorithmic bias 
      • the use of ML or other data processing techniques can reinforce certain aspects (e.g. social inequalities, stereotypes)
      • visualize uncertainty
        • include error bars, confidence intervals, and notes on limitations
      • audit data and algorithms
        • look for bias in inputs, model assumptions and outputs
    • intergroup bias
      • charts tend to reflect or reinforce societal biases
        • e.g. racial or gender disparities
      • use thoughtful ordering, inclusive labeling
      • avoid deficit-based comparisons
  • avoid overcomplicating the visualizations 
    • e.g. by including too much data, too many details or other elements
  • avoid comparisons across varying dimensions 
    • e.g. (two) circles of different radius, bar charts of different length, column charts of different height
    • don't make users compare angles, areas, volumes
