Graphical Representation Series
Consider the following best practices in data visualization (work in progress):
- avoid poor labeling and annotation practices
- label data points
- consider labeling at least the most important points
- e.g. starts, ends, local/global minima/maxima
- especially when labeling every point clutters the chart or there's minimal variation
- avoid abbreviations
- unless they are clearly defined upfront, used consistently, and/or universally understood
- undefined abbreviations can hinder understanding
- abbreviations should help compress content without losing meaning
- use font types, font sizes, and text orientation that are easy to read
- avoid stylish design that makes content hard to read
- avoid redundant information
- text should never overshadow or distort the actual message or data
- use neutral, precise wording
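The labeling advice above (label at least the starts, ends, and extremes) can be sketched in a few lines. This is a minimal illustration, not a prescribed method: the function name `points_to_label` and the sample series are hypothetical, and only the global minimum and maximum are picked, not the local ones.

```python
from typing import List, Sequence, Tuple


def points_to_label(values: Sequence[float]) -> List[Tuple[int, float]]:
    """Return (index, value) pairs for the start, end, global min and global max."""
    if not values:
        return []
    last = len(values) - 1
    # Index of the first occurrence of the smallest and largest value.
    idx_min = min(range(len(values)), key=lambda i: values[i])
    idx_max = max(range(len(values)), key=lambda i: values[i])
    # A set removes duplicates, e.g. when the series starts at its minimum.
    indices = sorted({0, last, idx_min, idx_max})
    return [(i, values[i]) for i in indices]


monthly_sales = [12, 15, 9, 14, 21, 18, 17]  # made-up sample data
print(points_to_label(monthly_sales))  # [(0, 12), (2, 9), (4, 21), (6, 17)]
```

The returned indices can then be passed to whatever annotation call the charting tool provides, leaving all other points unlabeled.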
- avoid the misuse of pre-attentive attributes
- aka visual features that our brains process almost instantly
- color
- has identity value: used to distinguish one thing from another
- carries its own connotations
- gives a visual scale of measure
- the use of color doesn’t always help
- hue
- refers to the dominant color family of a specific color, which the brain processes based on the different wavelengths of light
- helps differentiate categories
- use distinct hues to represent different categories
- intensity (aka brightness)
- refers to how strong or weak a color appears
- saturation (aka chroma)
- refers to the purity or vividness of a color
- as saturation decreases, the color becomes more muted or washed out
- highly saturated colors have little or no gray in them
- highly desaturated colors are almost gray, with almost none of the original hue
- use high saturation for important elements like outliers, trends, or alerts
- use low saturation for background elements
- avoid pure colors that are bright and saturated
- drive attention to the respective elements
- avoid colors that are too similar in tone or saturation
- avoid colors that are hard for color-blind users to distinguish
- e.g. red-green color blindness
- brown-green, orange-red, blue-purple combinations
- avoid red-green pairings for status indicators
- e.g. success/error
- e.g. blue-yellow color blindness
- blue-green, yellow-pink, purple-blue
- e.g. total color blindness (aka monochromacy)
- all colors appear as shades of gray
- ⇒ users must rely entirely on contrast, shape, and texture
- use icons, labels, or patterns alongside color
- use tools to test for color issues
- use colorblind-safe palettes
- e.g. ColorBrewer or Viridis
- for sequential data, use one hue and vary saturation or brightness to show magnitude; for diverging data, use two hues that meet at a neutral midpoint
- start with all-gray data elements
- use color only when it corresponds to differences in data
- ⇐ helps draw attention to whatever isn’t gray
- dull and neutral colors give a sense of uniformity
- can modify/contradict readers' intuitive response
- choose colors to draw attention, to label, to show relationships
- form
- shape
- helps distinguish types of data points and encode information
- well-shaped data has functional and aesthetic character
- complex shapes are harder to perceive
- size
- attribute used to encode the magnitude or extent of elements
- should be aligned to its probable use, importance, and amount of detail involved
- larger elements draw more attention
- its encoding should be meaningful
- e.g. magnitudes of deviations from the baseline
- overemphasis can lead to distortions
- choose a size range that is appropriate for the data
- avoid using size to represent nominal or categorical data where there's no inherent order to the sizes
- orientation
- angled or rotated items stand out
- length/width
- useful in bar charts to show quantity
- avoid stacked bar graphs
- curvature
- curved lines can contrast with straight ones
- collinearity
- alignment can suggest grouping or flow
- highlighting
- spatial positioning
- 2D position
- placement on axes or grids conveys value
- 3D position in 2D space
- grouping
- proximity implies relationships
- keep columns or bars close together
- enclosure
- borders or shaded areas signal clusters
- depth (stereoscopic or shading)
- adds dimensionality
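The color notes above say that under monochromacy users must rely entirely on contrast, and that tools should be used to test for color issues. One way to approximate such a check is the WCAG 2.x contrast ratio between two sRGB colors. This is a sketch under that assumption: the notes do not name WCAG, and the helper names and the two sample colors are illustrative only.

```python
from typing import Tuple

RGB = Tuple[int, int, int]


def _linearize(channel: int) -> float:
    """Convert an 8-bit sRGB channel to linear light (WCAG 2.x definition)."""
    c = channel / 255.0
    return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(rgb: RGB) -> float:
    """Relative luminance in [0, 1]: 0 is black, 1 is white."""
    r, g, b = (_linearize(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b


def contrast_ratio(a: RGB, b: RGB) -> float:
    """Contrast ratio between two colors, from 1 (identical) to 21 (black/white)."""
    la, lb = relative_luminance(a), relative_luminance(b)
    lighter, darker = max(la, lb), min(la, lb)
    return (lighter + 0.05) / (darker + 0.05)


red, green = (255, 0, 0), (0, 128, 0)  # illustrative success/error colors
print(round(contrast_ratio(red, green), 2))
print(round(contrast_ratio((0, 0, 0), (255, 255, 255)), 2))
```

A ratio close to 1 means the two colors are nearly indistinguishable once hue is removed, which is a hint to add icons, labels, or patterns alongside color.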
- avoid graphical features that are purely decorative
- aka elements that don't affect understanding, structure or usability
- stylistic embellishments
- borders/frames
- ornamental lines or patterns around content
- background images
- images used for ambiance, not content
- drop shadows and gradients
- enhance depth or style but don’t add meaning
- icons without function
- decorative icons that don’t represent actions or concepts
- non-informative imagery
- stock photos
- generic visuals that aren’t referenced in the text
- illustrations
- added for visual interest, not explanation
- mascots or logos
- when repeated or not tied to specific content
- layout elements
- spacers
- transparent or blank images used to control layout
- leave the right amount of 'white' space between chart elements
- custom bullets or list markers
- designed for flair, not clarity
- visual separators
- lines or shapes that divide sections without conveying hierarchy or meaning
- avoid bias
- sampling bias
- showing data that doesn’t represent the full population
- avoid cherry-picking data
- aka selecting only the data that support a particular viewpoint while ignoring others that might contradict it
- enable users to look at both sets of data and contrast them
- enable users to navigate the data
- avoid survivor bias
- aka focusing only on the data that 'survived' a process and ignoring the data that didn’t
- use representative data
- aka the dataset includes all relevant groups
- check for collection bias
- avoid data that only comes from one source
- avoid data that excludes key demographics
- cognitive bias
- mental shortcuts that sometimes affect interpretation
- incl. confirmation bias, framing bias, pattern bias
- balance visual hierarchies
- don’t make one group look more important by overemphasizing it
- show uncertainty
- by including confidence intervals or error bars to reflect variability
- separate comparisons
- when comparing groups, use adjacent charts rather than combining them into one that implies a hierarchy
- e.g. ethnicities, regions
- visual bias
- design choices that unintentionally (or intentionally) distort meaning
- or how viewers interpret the data
- avoid manipulating axes
- by truncating the y-axis
- exaggerates differences
- by changing scale types
- linear vs. logarithmic
- a log scale compresses large values and expands small ones, which can flatten exponential growth or make small changes seem more significant
- uneven intervals
- using inconsistent spacing between tick marks can distort trends
- by zooming in/out
- adjusting the axis to focus on a specific range can highlight or hide variability and possibly obscure the bigger picture
- by using dual axes
- if the scales differ too much, it can falsely imply correlation or exaggerate relationships
- by distorting the aspect ratio
- stretching or compressing the chart area can visually amplify or flatten trends
- e.g. a steep slope might look flat if the x-axis is stretched
- avoid inconsistent scales
- label axes clearly
- explain scale choices
- avoid overemphasis
- avoid unnecessary repetition
- e.g. of the same graph, of content
- avoid focusing on outliers, (short-term) trends
- avoid truncating axes, exaggerating scales
- avoid manipulating the visual hierarchy
- avoid color bias
- bright colors draw attention unfairly
- avoid overplotting
- too much data obscures patterns
- avoid clutter
- creates cognitive friction
- users struggle to focus on what matters because their attention is pulled in too many directions
- is about design excess
- avoid unnecessary or distracting elements
- they don’t contribute to understanding the data
- avoid overloading
- attempting to show too much data at once
- is about data excess
- overwhelms readers' processing capacity, making it hard to extract insights or spot patterns
- algorithmic bias
- the use of ML or other data processing techniques can reinforce certain aspects (e.g. social inequalities, stereotypes)
- visualize uncertainty
- include error bars, confidence intervals, and notes on limitations
- audit data and algorithms
- look for bias in inputs, model assumptions and outputs
- intergroup bias
- charts tend to reflect or reinforce societal biases
- e.g. racial or gender disparities
- use thoughtful ordering, inclusive labeling
- avoid deficit-based comparisons
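The note on truncated y-axes under visual bias can be made concrete with a little arithmetic: when the axis starts above zero, the drawn bar heights no longer have the same ratio as the values. The sketch below assumes positive values and a simple bar chart. The function names `apparent_ratio` and `exaggeration` are hypothetical, not an established metric.

```python
def apparent_ratio(smaller: float, larger: float, baseline: float = 0.0) -> float:
    """Ratio of the drawn bar heights when the y-axis starts at `baseline`."""
    if not (0 < smaller <= larger and baseline < smaller):
        raise ValueError("expected 0 < smaller <= larger and baseline < smaller")
    return (larger - baseline) / (smaller - baseline)


def exaggeration(smaller: float, larger: float, baseline: float) -> float:
    """How many times the drawn difference overstates the true ratio."""
    true_ratio = larger / smaller
    return apparent_ratio(smaller, larger, baseline) / true_ratio


# Two values that differ by 10%.
print(apparent_ratio(50, 55))               # axis starts at 0: bars differ by about 10%
print(apparent_ratio(50, 55, baseline=45))  # 2.0: one bar is drawn twice as tall
print(round(exaggeration(50, 55, baseline=45), 2))
```

An exaggeration of 1 means the chart is faithful. The further it moves above 1, the more the truncation overstates the difference, which is a reason to explain the scale choice when a non-zero baseline is used.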
- avoid overcomplicating the visualizations
- e.g. by including too much data, details, other elements
- avoid comparisons across varying dimensions
- e.g. (two) circles of different radii, bar charts of different lengths, column charts of different heights
- don't make users compare angles, areas, volumes
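Part of why area comparisons are hard is that area grows with the square of the radius, so circles scaled by radius overstate differences. The sketch below shows the effect. The second helper, which scales radius by the square root of the value, is one common mitigation and an assumption on my part rather than advice from the notes above. Both function names are hypothetical.

```python
import math


def area_ratio_when_scaling_radius(small_value: float, large_value: float) -> float:
    """Ratio of circle areas when the radius is made proportional to the value."""
    return (large_value / small_value) ** 2


def radius_for_area_encoding(value: float, max_value: float, max_radius: float) -> float:
    """Radius that makes the circle's area, not its radius, proportional to the value."""
    return max_radius * math.sqrt(value / max_value)


# A value twice as large is drawn with four times the area.
print(area_ratio_when_scaling_radius(1, 2))  # 4.0

# With area encoding, a quarter of the maximum value gets half the maximum radius.
print(radius_for_area_encoding(25, 100, max_radius=10))  # 5.0
```

Even with area encoding, readers tend to judge areas less accurately than lengths, so a bar or column chart on a common baseline usually remains the safer choice.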