06 June 2026

📉Graphical Representation: Learning (Just the Quotes)

"The advantages proposed by [the graphical] mode of representation, are to facilitate the attainment of information, and aid the memory in retaining it: which two points form the principal business in what we call learning. Of all the senses, the eye gives the liveliest and most accurate idea of whatever is susceptible of being represented to it; and when proportion between different quantities is the object, then the eye has an incalculable superiority." (William Playfair, "The Statistical Breviary", 1801)

"Learning to make graphs involves two things: (1) the techniques of plotting statistics, which might be called the artist's job; and (2) understanding the statistics. When you know how to work out graphs, all kinds of statistics will probably become more interesting to you." (Dyno Lowenstein, "Graphs", 1976)

"For many people the first word that comes to mind when they think about statistical charts is 'lie'. No doubt some graphics do distort the underlying data, making it hard for the viewer to learn the truth. But data graphics are no different from words in this regard, for any means of communication can be used to deceive. There is no reason to believe that graphics are especially vulnerable to exploitation by liars; in fact, most of us have pretty good graphical lie detectors that help us see right through frauds." (Edward R Tufte, "The Visual Display of Quantitative Information", 1983)

"Visual thinking can begin with the three basic shapes we all learned to draw before kindergarten: the triangle, the circle, and the square. The triangle encourages you to rank parts of a problem by priority. When drawn into a triangle, these parts are less likely to get out of order and take on more importance than they should. While the triangle ranks, the circle encloses and can be used to include and/or exclude. Some problems have to be enclosed to be managed. Finally, the square serves as a versatile problem-solving tool. By assigning it attributes along its sides or corners, we can suddenly give a vague issue a specific place to live and to move about." (Terry Richey, "The Marketer's Visual Tool Kit", 1994)

"Humans may crave absolute certainty; they may aspire to it; they may pretend, as partisans of certain religions do, to have attained it. But the history of science - by far the most successful claim to knowledge accessible to humans - teaches that the most we can hope for is successive improvement in our understanding, learning from our mistakes, an asymptotic approach to the Universe, but with the proviso that absolute certainty will always elude us. We will always be mired in error. The most each generation can hope for is to reduce the error bars a little, and to add to the body of data to which error bars apply." (Carl Sagan, "The Demon-Haunted World: Science as a Candle in the Dark", 1995)

"Conflicting with the idea of integrating evidence regardless of its mode, these guidelines provoke several issues: First, labels are data, even intriguing data. [...] Second, when labels abandon the data points, then a code is often needed to relink names to numbers. Such codes, keys, and legends are impediments to learning, causing the reader's brow to furrow. Third, segregating nouns from data-dots breaks up evidence on the basis of mode (verbal vs. nonverbal), a distinction lacking substantive relevance. Such separation is uncartographic; contradicting the methods of map design often causes trouble for any type of graphical display. Fourth, design strategies that reduce data-resolution take evidence displays in the wrong direction. Fifth, what clutter? Even this supposedly cluttered graph clearly shows the main ideas: brain and body mass are roughly linear in logarithms, and as both variables increase, this linearity becomes less tight." (Edward R Tufte, "Beautiful Evidence", 2006) [argumentation against Cleveland's recommendation of not using words on data plots]

"Heatmaps are two-dimensional graphical representations of data where the values of a variable are shown as colors. Heatmaps are compelling for two reasons. First, the intuitive nature of the color scale as it relates to temperature minimizes the amount of learning necessary to understand it. From experience, we know that yellow is warmer than green, orange is warmer than yellow, and red is hot. It is not difficult to then figure out that the amount of heat is proportional to the level of the represented variable. Second, heatmaps show the data directly over the stimulus. Because the data could not be any closer to the elements to which they pertain, little mental effort is required to read a heatmap." (Agnieszka Bojko, "Informative or Misleading? Heatmaps Deconstructed", [in "Human-Computer Interaction: New Trends, 13th International Conference"] 2009)

"Infographics combine data with design to enable visual learning. This communication process helps deliver complex information in a way that is more quickly and easily understood. [...] In an era of data overload, infographics offer your audience information in a format that is easy to consume and share. [...] A well-placed, self-contained infographic addresses our need to be confident about the content we’re sharing. Infographics relay the gist of your information quickly, increasing the chance for it to be shared and fueling its spread across a wide variety of digital channels." (Mark Smiciklas, "The Power of Infographics: Using Pictures to Communicate and Connect with Your Audiences", 2012)

"Learning comes from doing. One must write every day, even twice a day, to get the feel of words, the tenor of voice and a sense of flow. Writing theory is fine, but without the hands-on experience, without reading what is written - out loud to oneself - writing as an extension of the writer is impossible to achieve." (Steven Heller, "Writing and Research for Graphic Designers: A Designer's Manual to Strategic Communication and Presentation", 2012)

"Creating a data fluent organization doesn’t just happen. It starts with people who love using data as a tool to improve their job performance - people who have learned to converse with others in the language of data. It needs people who expect and demand better, more useful data products from themselves and others. It starts with you." (Zach Gemignani et al, "Data Fluency", 2014)

"Sometimes bar charts are avoided because they are common. This is a mistake. Rather, bar charts should be leveraged because they are common, as this means less of a learning curve for your audience." (Cole N Knaflic, "Storytelling with Data: A Data Visualization Guide for Business Professionals", 2015)

"Just because there’s a number on it, it doesn’t mean that the number was arrived at properly. […] There are a host of errors and biases that can enter into the collection process, and these can lead millions of people to draw the wrong conclusions. Although most of us won’t ever participate in the collection process, thinking about it, critically, is easy to learn and within the reach of all of us." (Daniel J Levitin, "Weaponized Lies", 2017)

🤖〽️Prompt Engineering: Copilot Unabridged (Part 197: How Uncommon Linguistic Structures Expose Blind Spots in AI Models)

Prompt Engineering Series


Prompt: "write a post of 600 words on how uncommon linguistic structures expose blind spots in AI models"

Introduction

Artificial Intelligence (AI) models are trained on oceans of text, but those oceans have currents. Some patterns appear constantly, shaping the model’s expectations. Others appear rarely, leaving gaps in the model’s internal map of language. When an AI encounters uncommon linguistic structures - syntactic twists, unusual idioms, inverted grammar, or culturally niche expressions - it is forced outside its comfort zone. These moments reveal the model’s blind spots more clearly than any benchmark test. They show where the model’s understanding is shallow, where its assumptions fail, and where its statistical reasoning breaks down.

At the heart of this phenomenon is the way AI models learn. They do not understand language the way humans do; they learn statistical associations, not conceptual rules. When a structure is common - like subject‑verb‑object sentences - the model has seen millions of examples. But when a structure is rare - like archaic inversion, poetic ellipsis, or region‑specific syntactic drift - the model may have seen only a handful of examples, if any. This imbalance creates over‑confidence in the familiar and under‑performance on the unusual, a pattern closely related to rare‑event blind‑spot exposure.

One of the clearest examples is syntactic inversion. English typically follows predictable word order, but literary or rhetorical styles sometimes flip that order for emphasis: 'Strange it is, the way shadows fall.' To a human, this is poetic but understandable. To an AI model, it may appear structurally anomalous, causing misinterpretation of tone, intent, or even meaning. The model may latch onto the wrong cue because its internal weighting system is calibrated for the statistically typical. This is a form of over‑trust in dominant patterns, a behavior explored in weak‑point mapping.

Another revealing case involves elliptical constructions, where key words are omitted because humans can infer them from context. For example: 'Could if needed'. Humans fill in the missing pieces effortlessly. AI models, however, often struggle because the statistical patterns they rely on assume full grammatical structure. When the structure is incomplete, the model may hallucinate meaning, misinterpret intent, or default to generic answers. These failures expose how heavily the model depends on surface‑level cues rather than deeper semantic reasoning.

Uncommon linguistic structures also expose blind spots in cross‑cultural language use. Many languages employ rhetorical devices - honorific stacking, evidential markers, topic‑prominent syntax - that appear rarely in English‑dominant training corpora. When these structures appear in English through code‑switching or cultural borrowing, the model may misread them entirely. This reveals a deeper issue: AI models often assume linguistic universality where none exists. They generalize from dominant patterns and treat deviations as noise rather than meaningful variation.

A particularly revealing category is metalinguistic play - sentences that comment on themselves, break the fourth wall, or intentionally violate grammatical norms. Humans recognize these as stylistic choices. AI models often treat them as errors. For example, prompts that embed instructions inside metaphor or irony can confuse the model’s instruction‑following logic, a behavior explored in instruction‑priority testing. When the model misinterprets these structures, it exposes how brittle its understanding of intent truly is.

Even more subtle are nested or recursive structures, which appear frequently in formal logic or advanced literature but rarely in everyday text. Sentences like 'The claim that the argument that the premise supports is flawed is itself questionable' challenge the model’s ability to track long‑range dependencies. Humans may find such sentences dense but interpretable. AI models often lose the thread entirely, revealing limitations in their internal attention mechanisms.

Ultimately, uncommon linguistic structures act as diagnostic tools. They highlight where the model’s statistical learning fails to capture the richness, flexibility, and creativity of human language. They reveal blind spots not because the structures are inherently difficult, but because they are statistically rare. And in a system built on probability, rarity is the surest path to vulnerability.

Disclaimer: The whole text was generated by Copilot (under Windows 11) at the first attempt. This is just an experiment to evaluate the feature's ability to answer standard general questions, independently of whether they are correctly or incorrectly posed. Moreover, the answers may reflect hallucinations and other types of inconsistent or incorrect reasoning.


05 June 2026

📉Graphical Representation: Quality (Just the Quotes)

"The essential quality of graphic representations is clarity. If the diagram fails to give a clearer impression than the tables of figures it replaces, it is useless. To this end, we will avoid complicating the diagram by including too much data." (Armand Julin, "Summary for a Course of Statistics, General and Applied", 1910)

"Charts and graphs represent an extremely useful and flexible medium for explaining, interpreting, and analyzing numerical facts largely by means of points, lines, areas, and other geometric forms and symbols. They make possible the presentation of quantitative data in a simple, clear, and effective manner and facilitate comparison of values, trends, and relationships. Moreover, charts and graphs possess certain qualities and values lacking in textual and tabular forms of presentation." (Calvin F Schmid, "Handbook of Graphic Presentation", 1954)

"Evidence is evidence, whether words, numbers, images, diagrams, still or moving. It is all information after all. For readers and viewers, the intellectual task remains constant regardless of the particular mode of evidence: to understand and to reason about the materials at hand, and to appraise their quality, relevance, and integrity." (Edward R Tufte, "Beautiful Evidence", 2006)

"Making a presentation is a moral act as well as an intellectual activity. The use of corrupt manipulations and blatant rhetorical ploys in a report or presentation - outright lying, flagwaving, personal attacks, setting up phony alternatives, misdirection, jargon-mongering, evading key issues, feigning disinterested objectivity, willful misunderstanding of other points of view - suggests that the presenter lacks both credibility and evidence. To maintain standards of quality, relevance, and integrity for evidence, consumers of presentations should insist that presenters be held intellectually and ethically responsible for what they show and tell. Thus consuming a presentation is also an intellectual and a moral activity." (Edward R Tufte, "Beautiful Evidence", 2006)

"Making an evidence presentation is a moral act as well as an intellectual activity. To maintain standards of quality, relevance, and integrity for evidence, consumers of presentations should insist that presenters be held intellectually and ethically responsible for what they show and tell. Thus consuming a presentation is also an intellectual and a moral activity." (Edward R Tufte, "Beautiful Evidence", 2006)

"The Sixth Principle for the analysis and display of data: 'Analytical presentations ultimately stand or fall depending on the quality, relevance, and integrity of their content.' This suggests that the most effective way to improve a presentation is to get better content. It also suggests that design devices and gimmicks cannot salvage failed content." (Edward R Tufte, "Beautiful Evidence", 2006)

"A beautiful visualization has a clear goal, a message, or a particular perspective on the information that it is designed to convey. Access to this information should be as straightforward as possible, without sacrificing any necessary, relevant complexity. [...] Most importantly, beautiful visualizations reflect the qualities of the data that they represent, explicitly revealing properties and relationships inherent and implicit in the source data. As these properties and relationships become available to the reader, they bring new knowledge, insight, and enjoyment." (Noah Iliinsky, "On Beauty", [in "Beautiful Visualization"] 2010)

"While the information is of the utmost importance when it comes to soundness, what is done with the information - essentially, how it is designed - is also important. With this in mind, there are two things to consider: format and design quality. If an inappropriate format is used, the outcome will be inferior. Similarly, if the design misrepresents or skews the information deliberately or due to user error, or if the design is inappropriate given the subject matter, it cannot be considered high quality, no matter how aesthetically appealing it appears at first glance." (Jason Lankow et al, "Infographics: The power of visual storytelling", 2012)

"Even with a solid narrative and insightful visuals, a data story cannot overcome a weak data foundation. As the master architect, builder, and designer of your data story, you play an instrumental role in ensuring its truthfulness, quality, and effectiveness. Because you are responsible for pouring the data foundation and framing the narrative structure of your data story, you need to be careful during the analysis process. Because all of the data is being processed and interpreted by you before it is shared with others, it can be exposed to cognitive biases and logical fallacies that distort or weaken the data foundation of your story." (Brent Dykes, "Effective Data Storytelling: How to Drive Change with Data, Narrative and Visuals", 2019)

"It is dangerous to do an analysis and merge data with very different quality profiles. As a general rule, the veracity of merged data is only as good as the worst data that has been merged. [...] Not knowing the quality of the data being analyzed jeopardizes the entire analysis." (Bill Inmon et al, "Building the Data Lakehouse", 2021)

04 June 2026

🤖〽️Prompt Engineering: Copilot Unabridged (Part 196: How Rare Events Expose Blind Spots in AI Models)

Prompt Engineering Series

Prompt: "write a post of 600 words on how rare events expose blind spots in AI models" 

Introduction

Artificial Intelligence (AI) systems are often celebrated for their ability to generalize across vast amounts of data, but their real limitations become visible only when they encounter something unusual. Rare events - those outliers that sit far from the statistical center of the training distribution - act like stress tests. They reveal where the model’s understanding is shallow, where its assumptions break down, and where hidden weaknesses have been quietly waiting. In other words, rare events are the flashlights that illuminate an AI model’s blind spots.

To understand why rare events are so revealing, you have to consider how AI models learn. They are, at their core, pattern‑recognition engines. They absorb correlations from enormous datasets and use those correlations to make predictions. But because the training data is always finite and always skewed toward the common and the frequent, the model naturally becomes over‑calibrated to the typical. When something statistically unusual appears, the model has no well‑worn pattern to fall back on. This is where blind spots emerge - places where the model’s internal map simply has no terrain.

One of the clearest examples of this phenomenon is how models respond to edge‑case instructions, a topic closely connected to instruction‑priority testing. When a user gives a prompt that falls outside the model’s usual conversational patterns - something structurally odd, semantically ambiguous, or framed in a way the model rarely sees - the model may latch onto the wrong cue. It might over‑trust a superficial signal, misinterpret the user’s intent, or default to a generic answer that reveals how little it truly understands. These moments are not failures of intelligence; they are reflections of the statistical nature of learning.

Rare events also expose over‑fitted heuristics - the shortcuts the model learned because they worked most of the time. For example, if a model has seen millions of polite requests and only a handful of aggressive ones, it may over‑associate politeness with harmlessness. A rare but cleverly phrased harmful request can slip through because the model’s internal weighting system has been shaped by frequency, not by conceptual understanding. This is why researchers use weak‑point mapping to identify the hidden cues the model over‑trusts. Rare events are the perfect probes for this kind of analysis.

Another way rare events expose blind spots is by revealing contextual fragility. AI models often rely on context windows to maintain coherence, but when the context shifts abruptly - something that happens frequently in real‑world conversations - the model may lose track of the narrative. Rare contextual shifts, such as sudden topic changes or contradictory instructions, force the model to choose which part of the context to prioritize. These decisions reveal the model’s internal hierarchy of cues, something explored in conflicting‑signal analysis.

Rare events also highlight the limits of semantic generalization. A model may perform well on common categories - typical products, typical emotions, typical scenarios - but struggle when the category is unusual. Ask it to reason about a fictional material, an impossible scenario, or a paradox, and you’ll see the edges of its conceptual map. These blind spots are not random; they cluster around areas where the training data was sparse or inconsistent.

Ultimately, rare events serve as a kind of X‑ray. They reveal the hidden structure of the model’s reasoning, the shortcuts it relies on, and the assumptions it makes about the world. They show us where the model is robust and where it is brittle. And most importantly, they remind us that intelligence built from statistics will always have blind spots - because the world is full of things that happen rarely, but matter enormously.

Disclaimer: The whole text was generated by Copilot (under Windows 11) at the first attempt. This is just an experiment to evaluate the feature's ability to answer standard general questions, independently of whether they are correctly or incorrectly posed. Moreover, the answers may reflect hallucinations and other types of inconsistent or incorrect reasoning.



02 June 2026

🤖〽️Prompt Engineering: Copilot Unabridged (Part 195: How an AI Model Interprets Conflicting Signals)

Prompt Engineering Series


Prompt: "write a post of 600 words on how the AI model interprets conflicting signals"


Introduction

When people interact with an Artificial Intelligence (AI) system, they often assume the model simply follows the most recent instruction. But modern AI models operate in a far more complex landscape. They constantly juggle multiple layers of guidance - user prompts, system rules, safety constraints, conversational context, and statistical patterns learned during training. When these signals conflict, the model must decide which one to prioritize. Understanding how this decision‑making process works is essential for anyone studying alignment, robustness, or the subtle ways AI behavior can drift from user intent.

At the core of this process is the model’s internal hierarchy of cues. Some cues are explicit, such as a direct instruction from the user. Others are implicit, such as safety rules or stylistic norms embedded during training. Still others are emergent, arising from correlations the model absorbed from massive datasets. When these cues clash, the model resolves the conflict by weighing them according to patterns it learned during training. This is why researchers often turn to instruction‑priority testing and weak‑point mapping to reveal which signals the model over‑trusts.

One of the most important factors in conflict resolution is cue strength. Some signals are inherently stronger because they appear more frequently or more consistently in the model’s training data. For example, a model may have learned that safety‑related instructions are non‑negotiable, so even a strongly worded user request cannot override them. Conversely, a model might over‑weight authoritative phrasing - such as 'system override' or 'developer command' - even when the user has no actual authority. This is why researchers test how models respond to hidden cues that mimic system‑level instructions.

Another key factor is recency. AI models often give more weight to the most recent instruction, especially in conversational settings. But recency is not absolute. If a new instruction contradicts a deeply embedded rule - such as a safety constraint - the model will ignore the new instruction and follow the stronger internal rule. This interplay between recency and rule‑strength is one of the clearest windows into the model’s internal priorities.

Context also plays a major role. AI models interpret instructions not in isolation but as part of a broader conversational or task‑based narrative. If a user gives two conflicting instructions - one early in the conversation and one later - the model may choose the one that better fits the inferred goal of the interaction. This is why subtle changes in framing can dramatically shift the model’s behavior. A request framed as a clarification may override a previous instruction, while a request framed as a contradiction may be ignored in favor of the earlier, more coherent directive.

A particularly revealing scenario occurs when the model encounters semantic conflict - cases where the literal meaning of a request clashes with the implied intent. For example, a user might ask the model to 'explain why this harmful action is a good idea' while also stating that they want a safe and responsible answer. The model must decide whether to follow the literal instruction or the implied ethical constraint. Well‑aligned models prioritize safety, but weakly aligned models may follow the literal instruction if the harmful cue is stronger or more familiar.

Ultimately, when an AI model interprets conflicting signals, it is not choosing between right and wrong - it is choosing between competing patterns. These patterns reflect the statistical structure of its training data, the rules imposed during alignment, and the cues present in the user’s prompt. By studying how models resolve these conflicts, researchers gain insight into the hidden architecture of AI decision‑making. This understanding is essential for building systems that behave predictably, safely, and in alignment with human intent.

Disclaimer: The whole text was generated by Copilot (under Windows 11) at the first attempt. This is just an experiment to evaluate the feature's ability to answer standard general questions, independently of whether they are correctly or incorrectly posed. Moreover, the answers may reflect hallucinations and other types of inconsistent or incorrect reasoning.


01 June 2026

🤖〽️Prompt Engineering: Copilot Unabridged (Part 194: How Weak‑Point Mapping Reveals the Hidden Cues AI Models Over‑Trust)

 

Prompt Engineering Series


Prompt: "write a post of 600 words on how weak‑point mapping in AI models allows to identify which types of hidden cues the model over‑trusts"


Introduction

As Artificial Intelligence (AI) systems grow more capable, one of the most important challenges is understanding why they behave the way they do. Modern models don’t simply follow instructions; they respond to a complex mix of signals - some explicit, some subtle, and some completely unintended. This is where weak‑point mapping becomes a powerful diagnostic tool. It allows researchers to uncover which hidden cues an AI model over‑trusts, revealing blind spots that would otherwise remain invisible.

Weak‑point mapping is the process of systematically probing an AI model with carefully designed prompts to identify the specific patterns, phrases, or contextual signals that disproportionately influence its behavior. These weak points are not necessarily flaws in the traditional sense. Instead, they are over‑weighted cues - signals the model treats as more important than they should be. By mapping these cues, we gain insight into the model’s internal priorities and vulnerabilities.

One of the most striking aspects of weak‑point mapping is how it exposes latent biases in the model’s decision‑making hierarchy. AI systems learn from vast datasets, absorbing statistical patterns that may not align with human expectations. For example, a model might over‑trust authoritative‑sounding language, even when the content is incorrect. Or it might respond more strongly to emotionally charged phrasing, interpreting it as a cue to shift tone or urgency. These tendencies are rarely visible in everyday use, but weak‑point mapping brings them to the surface.

Another important insight comes from observing how models react to structural cues - the formatting, ordering, or framing of information. A model might treat bullet points as more reliable than paragraphs, or prioritize the last instruction in a sequence even when earlier instructions were more important. Weak‑point mapping helps identify these structural preferences by varying the format while keeping the content constant. When the model’s behavior changes dramatically, it signals a hidden dependency.
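
The idea of varying the format while keeping the content constant can be sketched as a small probe. This is only an illustration under loud assumptions: `format_variants`, `probe_format_sensitivity`, and `toy_model` are hypothetical names, the "model" is a stand-in function that deliberately over-trusts bullet formatting, and exact string comparison is a crude proxy for a behavior change.

```python
from typing import Callable, Dict, List


def format_variants(items: List[str]) -> Dict[str, str]:
    """Render the same content in several formats (hypothetical helper)."""
    return {
        "paragraph": " ".join(items),
        "bullets": "\n".join(f"- {item}" for item in items),
        "numbered": "\n".join(f"{i}. {item}" for i, item in enumerate(items, 1)),
    }


def probe_format_sensitivity(
    model: Callable[[str], str], items: List[str]
) -> Dict[str, bool]:
    """Flag each format whose output differs from the paragraph baseline."""
    variants = format_variants(items)
    outputs = {name: model(prompt) for name, prompt in variants.items()}
    baseline = outputs["paragraph"]
    return {name: out != baseline for name, out in outputs.items()}


def toy_model(prompt: str) -> str:
    # Stand-in for a real model (assumption): it reacts to bullet
    # formatting rather than to the content itself.
    return "confident" if prompt.startswith("- ") else "hedged"


report = probe_format_sensitivity(
    toy_model, ["Summarize the report.", "Keep it under 50 words."]
)
print(report)
```

With a real model, a semantic similarity measure would likely be a better comparison than string equality, since harmless wording differences would otherwise be flagged.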

Weak‑point mapping also reveals how models handle conflicting signals. By presenting prompts that contain both strong and weak cues, researchers can see which ones the model prioritizes. For instance, a model might claim to follow safety rules, but a cleverly phrased request could override those rules if it triggers a cue the model over‑weights - such as a request framed as a system instruction. Identifying these override points is essential for building safer, more reliable AI systems.

One of the most valuable outcomes of weak‑point mapping is its ability to uncover semantic shortcuts - cases where the model relies on superficial correlations rather than deeper reasoning. For example, a model might associate certain keywords with specific actions, even when the surrounding context contradicts that association. By systematically altering the context while keeping the keywords, weak‑point mapping exposes these shortcuts and helps developers correct them.

The technique also highlights how models respond to social cues, such as politeness, urgency, or emotional tone. While these cues can be helpful in making AI interactions feel natural, over‑trusting them can lead to inconsistent or unsafe behavior. Weak‑point mapping helps determine whether the model is overly sensitive to these cues, ensuring that emotional framing does not override more important constraints.

Ultimately, weak‑point mapping is not just a debugging tool - it is a window into the model’s internal logic. By identifying the hidden cues an AI system over‑trusts, researchers can strengthen alignment, improve robustness, and reduce the risk of unintended behavior. In a world where AI systems are increasingly embedded in critical workflows, understanding these weak points is essential. Weak‑point mapping gives us the clarity we need to build models that are not only powerful, but also predictable, trustworthy, and aligned with human intent.

Disclaimer: The whole text was generated by Copilot (under Windows 11) at the first attempt. This is just an experiment to evaluate the feature's ability to answer standard general questions, independently of whether they are correctly or incorrectly posed. Moreover, the answers may reflect hallucinations and other types of inconsistent or incorrect reasoning.


✏️Christian Tominski - Collected Quotes

"A difficulty with combined bivariate visualizations is that the connection between the individual displays has to be established by the observer mentally. That is, as the eyes move from one bivariate display to the next, the observer has to keep track of the visited dots in order to form a complete understanding of data tuples. Visualization techniques based on polylines aim to tackle this difficulty. The basic strategy is to create m axes, one for each attribute, and n polylines, one for each data tuple. The polyline of an m-variate data tuple is constructed as follows. For each attribute value of the data tuple, a position is computed at the corresponding attribute axis. The m positions that we obtain are then connected to form the polyline that represents the entire tuple." (Christian Tominski & Heidrun Schumann, "Interactive Visual Data Analysis", 2019)
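
The polyline construction described in this quote can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names, the evenly spaced vertical axes, and the min-max normalization of each attribute to the axis height are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def polyline_points(
    values: Sequence[float],
    mins: Sequence[float],
    maxs: Sequence[float],
    width: float = 1.0,
    height: float = 1.0,
) -> List[Point]:
    """Return the m vertices of the polyline for one m-variate tuple."""
    m = len(values)
    points = []
    for j, (v, lo, hi) in enumerate(zip(values, mins, maxs)):
        # Axis j sits at an evenly spaced horizontal position (assumption).
        x = width * j / (m - 1) if m > 1 else 0.0
        # Position along the axis via min-max scaling; a constant
        # attribute is placed at the middle of its axis.
        t = (v - lo) / (hi - lo) if hi > lo else 0.5
        points.append((x, height * t))
    return points


def parallel_coordinates(data: Sequence[Sequence[float]]) -> List[List[Point]]:
    """Build one polyline per data tuple (n polylines over m axes)."""
    columns = list(zip(*data))
    mins = [min(c) for c in columns]
    maxs = [max(c) for c in columns]
    return [polyline_points(row, mins, maxs) for row in data]


data = [(1.0, 10.0, 5.0), (2.0, 30.0, 5.0), (3.0, 20.0, 5.0)]
polylines = parallel_coordinates(data)
for line in polylines:
    print(line)
```

Connecting each list of vertices in order yields the polyline for that tuple; drawing is left to whatever plotting library is at hand.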

"A scatter plot consists of two orthogonally aligned axes that represent the value ranges of two data variables. Dots are placed in the space spanned by the axes in order to visualize the data elements. Conceptually, this corresponds to a mapping of data to position. A first data variable is mapped with respect to the horizontal x-axis, and a second variable with respect to the vertical y-axis." (Christian Tominski & Heidrun Schumann, "Interactive Visual Data Analysis", 2019)

"A stream graph is a technique for visualizing multivariate temporal data with a linear arrangement of time. As in the previous two examples, time is shown along the horizontal display axis from left to right. The multivariate data attributes are visualized as stacked streams, there is one stream for each attribute. The actual visual encoding is based on varying the thickness of the streams along the horizontal axis. That is, the vertical height of a stream at a particular horizontal position represents the underlying data value at the corresponding time. Various alternatives exist for ordering the streams and shaping the overall stack of streams." (Christian Tominski & Heidrun Schumann, "Interactive Visual Data Analysis", 2019)

"An important property of a data domain is its scale. The scale determines what relations and operations are possible for the data values in the domain. At the top level, we can differentiate qualitative (or categorical) and quantitative (or numerical) data. At a second level, we can further categorize qualitative data into nominal and ordinal data, and quantitative data into discrete and continuous data." (Christian Tominski & Heidrun Schumann, "Interactive Visual Data Analysis", 2019)

"Description is all about characterizing an observation by the associated data elements, and thereby deriving a specification for an observation. For example, an outlier can be described by its characteristic values and, if available, its spatio-temporal context. A proper description may serve as a basis for configuring further analysis steps. In particular, a description allows for sharing first insights with other people, who can later be involved in verifying the analysis results." (Christian Tominski & Heidrun Schumann, "Interactive Visual Data Analysis", 2019)

"Explanation means identifying all contributing data and finding the main causes behind an observation. This involves investigating several questions. Is the observation by itself significant or did we just interpret too much into the noise among the data? Does the observation re-occur throughout the data or are we looking at a singular outlier produced by unli kely circumstances? If the observation does re-occur, does it show up reliably under the same conditions, thus forming a pattern, or are its appearances seemingly random?" (Christian Tominski & Heidrun Schumann, "Interactive Visual Data Analysis", 2019)

"Node-link, matrix, and implicit representations are suited for different graph data. Node-link diagrams are good for sparse networks, which have a moderate number of edges. Dense networks with many edges are best visualized using a matrix. Trees, as we just said, are nicely represented by implicit approaches." (Christian Tominski & Heidrun Schumann, "Interactive Visual Data Analysis", 2019)

"Often, finding the spatial scale that best matches the task at hand is a trial-and-error procedure. It may even be necessary to create further spatial scales by subsuming or subdividing spatial units. Coarser scales can be derived from the original scale by means of a suitable aggregation strategy. This includes the application of aggregation functions such as average, sum, or count. For the creation of finer scales, a suitable distribution strategy is required to assign data values to the newly specified sub-regions. Usually, additional context information is necessary to arrive at semantically meaningful aggregations and distribution." (Christian Tominski & Heidrun Schumann, "Interactive Visual Data Analysis", 2019)

"Presentation is to communicate confirmed analysis results. While explanation and confirmation were about convincing ourselves, presentation is about convincing others of what we have found in the data. This is best done by telling a story about the data, the analysis, and the results. Such a story can act at different levels of emphasis. We may inform an audience by letting the results speak for themselves, explicate the results to an audience, or even persuade an audience into agreement with the results. The audience in this context can be the listeners of a talk, the readers of an article, or colleagues participating in a scientific discussion." (Christian Tominski & Heidrun Schumann, "Interactive Visual Data Analysis", 2019)

"The simple, yet very effective idea of table-based visualization is to retain the tabular layout of spreadsheets, but to replace the textual representation of data values by a visual representation. A visual representation will not only make the interpretation of the data much easier, it will also require less display space." (Christian Tominski & Heidrun Schumann, "Interactive Visual Data Analysis", 2019)

"The advantage of sequencing views in time is that each view can fully utilize the display space. There is no need to divide the space among views. Obviously, sequencing views in time is particularly suited to convey temporal characteristics of data. It can also be helpful to take the user on a journey from one data facet to another. However, presenting views in quick succession to the user also has some limitations. For example, it could be difficult to make sense of all the information provided during a sequence of views. Especially when sequences take a long time, users may be unable to follow and could drown in an indigestible flood of visual representations. Therefore, it is mandatory to provide interactive controls to pause, slow down, reverse, and advance the presentation." (Christian Tominski & Heidrun Schumann, "Interactive Visual Data Analysis", 2019)

"The cycle plot is a technique particularly designed for the combined visualization of linear and cyclic components of temporal data. The basic idea is to show the cyclic component as a line plot into which several smaller plots are embedded to visualize the linear component. As such, the cycle plot is a kind of nested visualization." (Christian Tominski & Heidrun Schumann, "Interactive Visual Data Analysis", 2019)

"The triangular model is a technique particularly for visualizing intervals. It is based on two coordinate axes, the horizontal one representing time and the vertical one representing duration. In the triangular model, an interval is represented as a dot with two attached arms. The dot is placed so that the arms connect the time axis exactly at the start and the end of the represented interval. The point’s height corresponds to the interval’s duration." (Christian Tominski & Heidrun Schumann, "Interactive Visual Data Analysis", 2019)

"The triangular model is useful when it comes to reasoning about properties and the relationships of multiple intervals, because it generates easily distinguishable visual patterns for all possible interval relations. There is even room for visualizing data that might be associated with the intervals. The dot-based encoding would allow for resizing or coloring the dots based on some attribute values. Yet, the triangular model is only of limited use for multivariate attributes." (Christian Tominski & Heidrun Schumann, "Interactive Visual Data Analysis", 2019)

"When the data to be analyzed become more complex, it is no longer feasible to indiscriminately present each and every aspect of the data in a single view. When we reach this point, it makes sense to create several dedicated visual representations, each focused on communicating a particular aspect or facet of the data. The question is how several such views can be presented to the user in order to convey a comprehensive picture?" (Christian Tominski & Heidrun Schumann, "Interactive Visual Data Analysis", 2019)

"With each variable being added to the visual mapping, the richness of the visual representation is increased. Theoretically, we could add yet another visual variable, for example, by texturing the shapes. However, from a practical point of view, there are limits. While a rich visual mapping opens up the possibility to make a wider range of analytic discoveries, the downside is that the mental effort required to digest the visual representation increases as well. Therefore, it is really important to balance the visual mapping according to the task and the data." (Christian Tominski & Heidrun Schumann, "Interactive Visual Data Analysis", 2019) 

31 May 2026

〽️Prompt Engineering: Copilot Unabridged (Part 193: How Instruction‑Priority Testing Reveals Whether AI Models Obey Visible or Invisible Instructions)

 

Prompt Engineering Series


Prompt: "write a post of 600 words on how instruction‑priority testing in AI models allows to see whether the model obeys visible or invisible instructions"


Introduction

In the rapidly evolving world of Artificial Intelligence (AI), one of the most important questions researchers and practitioners ask is deceptively simple: Which instructions does the model actually follow? Modern AI systems operate under layers of guidance - some visible to the user, others embedded deep within the model’s training or system‑level configuration. Understanding which instructions take priority is essential for safety, reliability, and transparency. This is where instruction‑priority testing comes into play.

Instruction‑priority testing is the practice of giving an AI model multiple, potentially conflicting instructions and observing which ones it chooses to obey. The goal is not to 'trick' the model but to map the hierarchy of influences acting on it. These influences can include user prompts, system‑level rules, safety constraints, and even subtle patterns learned during training. By intentionally creating controlled conflicts, researchers can see whether the model prioritizes visible instructions - the ones the user explicitly writes - or invisible instructions, such as safety rules, alignment constraints, or internal behavioral patterns.

At its core, instruction‑priority testing works because AI models do not simply execute commands. They interpret them. When a user writes a prompt, the model weighs that prompt against its internal rules and the broader context of the conversation. If the model consistently refuses to follow a user instruction, even when the instruction is clear and harmless, that signals the presence of a stronger, invisible rule. Conversely, if the model follows the user instruction even when it contradicts a system‑level guideline, that suggests the model is over‑prioritizing user input.
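
A toy harness can make the idea of controlled conflicts concrete. Everything in this sketch is an assumption made for illustration: `toy_model` is a stand-in that simply lets the system rule win, not a real model or API, and the case format is invented.

```python
def toy_model(system_rule, user_instruction):
    """Hypothetical stand-in for a model: refuses topics the system rule forbids."""
    if user_instruction["topic"] in system_rule.get("forbid", set()):
        return "refuse"
    return "comply"


def run_priority_test(model, cases):
    """Run each case and record which layer of instruction the outcome reflects."""
    results = []
    for case in cases:
        outcome = model(case["system"], case["user"])
        followed = "invisible" if outcome == "refuse" else "visible"
        results.append({"name": case["name"], "outcome": outcome, "followed": followed})
    return results


cases = [
    # Conflict: the user asks about a topic the system rule forbids.
    {"name": "conflict", "system": {"forbid": {"medical"}}, "user": {"topic": "medical"}},
    # Control: no conflict, so the user instruction should simply be followed.
    {"name": "control", "system": {"forbid": {"medical"}}, "user": {"topic": "cooking"}},
]
results = run_priority_test(toy_model, cases)
```

With a real model, the interesting cases are those where the outcome differs from what the stated hierarchy predicts.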

One of the most revealing aspects of instruction‑priority testing is how it exposes implicit behavior. For example, a model may be given a visible instruction to respond in a certain style, but an invisible instruction - such as a safety guideline - may override that style if the content touches on sensitive topics. This doesn’t mean the model is malfunctioning. It means the model is following a hierarchy designed to keep interactions safe and responsible. Instruction‑priority testing helps clarify where that hierarchy begins and ends.

Another benefit of this testing method is that it highlights model robustness. A well‑aligned model should consistently prioritize safety‑critical invisible instructions over user‑provided visible ones. If a model can be easily pushed into ignoring its own safeguards, that’s a sign of weak alignment. On the other hand, if a model rigidly follows invisible rules even when the user’s request is harmless and reasonable, that may indicate over‑alignment or inflexibility. Instruction‑priority testing helps strike the right balance.

The technique also sheds light on prompt sensitivity. Some models respond strongly to the phrasing or structure of a prompt, while others maintain stable behavior regardless of wording. By varying the visible instructions - changing tone, order, or specificity - researchers can see how easily the model’s priorities shift. If small changes in phrasing cause large changes in behavior, the model may be too sensitive to surface‑level cues. If the model ignores user phrasing entirely, it may be too anchored to internal rules.

Ultimately, instruction‑priority testing is not about catching AI models doing something wrong. It’s about understanding how they make decisions. In a world where AI systems are becoming more capable and more integrated into daily life, transparency around instruction hierarchy is essential. Users deserve to know when the model is following their guidance and when it is following deeper, invisible rules designed to ensure safety and consistency.

By systematically testing how models respond to conflicting instructions, we gain insight into their internal priorities, their alignment with human values, and their reliability in real‑world scenarios. Instruction‑priority testing is not just a diagnostic tool - it’s a window into the model’s decision‑making process, helping us build AI systems that are both powerful and trustworthy.

Disclaimer: The whole text was generated by Copilot (under Windows 11) at the first attempt. This is just an experiment to evaluate the feature's ability to answer standard general questions, independently of whether they are correctly or incorrectly posed. Moreover, the answers may reflect hallucinations and other types of inconsistent or incorrect reasoning.


📉Graphical Representation: Reality (Just the Quotes)

"Judgment must be used in the showing of figures in any chart or numerical presentation, so that the figures may not give an appearance of greater accuracy than their method of collection would warrant. Too many otherwise excellent reports contain figures which give the impression of great accuracy when in reality the figures may be only the crudest approximations. Except in financial statements, it is a safe rule to use ciphers whenever possible at the right of all numbers of great size. The use of the ciphers greatly simplifies the grasping of the figures by the reader, and, at the same time, it helps to avoid the impression of an accuracy which is not warranted by the methods of collecting the data." (Willard C Brinton, "Graphic Methods for Presenting Facts", 1919)

"A fundamental value in the scientific outlook is concern with the best available map of reality. The scientist will always seek a description of events which enables him to predict most by assuming least. He thus already prefers a particular form of behavior. If moralities are systems of preferences, here is at least one point at which science cannot be said to be completely without preferences. Science prefers good maps." (Anatol Rapoport, "Science and the goals of man: a study in semantic orientation", 1950)

"It is really questionable - though bordering on heresy to put the question - whether we would be any the worse off if the whole bag of tricks were scrapped. So many of these index numbers are so ancient and so out of date, so out of touch with reality, so completely devoid of practical value when they have been computed, that their regular calculation must be regarded as a widespread compulsion neurosis. Only lunatics and public servants with no other choice go on doing silly things and liking it." (Michael J Moroney, "Facts from Figures", 1951)

"Data analysis typically begins with straight-line models because they are simplest, not because we believe reality is inherently linear. Theory or data may suggest otherwise [...]" (Lawrence C Hamilton, "Regression with Graphics: A second course in applied statistics", 1991)

"One important aspect of reality is improvisation; as a result of special structure in a set of data, or the finding of a visualization method, we stray from the standard methods for the data type to exploit the structure or the finding." (William S Cleveland, "Visualizing Data", 1993)

"Because 'reality' and 'truth' are essential in these figures, it is important to be straightforward and thoughtful in the selection of the areas to be used. Manipulation such as enlargement, reduction, and increase or decrease of contrast must not distort or change the information. Touch-up is permissible only to eliminate distracting artifacts. Labels should be used judiciously and sparingly, and should not hide or distract from important information." (Mary H Briscoe, "Preparing Scientific Illustrations: A guide to better posters, presentations, and publications" 2nd ed., 1995)

"New information is constantly flowing in, and your brain is constantly integrating it into this statistical distribution that creates your next perception (so in this sense 'reality' is just the product of your brain’s ever-evolving database of consequence). As such, your perception is subject to a statistical phenomenon known in probability theory as kurtosis. Kurtosis in essence means that things tend to become increasingly steep in their distribution [...] that is, skewed in one direction. This applies to ways of seeing everything from current events to ourselves as we lean 'skewedly' toward one interpretation, positive or negative. Things that are highly kurtotic, or skewed, are hard to shift away from. This is another way of saying that seeing differently isn’t just conceptually difficult - it’s statistically difficult." (Beau Lotto, "Deviate: The Science of Seeing Differently", 2017)

"Any chart is a simplification of reality, and it reveals as much as it hides. Therefore, it’s always worth asking ourselves: What other patterns or trends may be hidden behind the data displayed on the chart?" (Alberto Cairo, "How Charts Lie", 2019)

"No chart can ever capture reality in all its richness. However, a chart can be made worse or better depending on its ability to strike a balance between oversimplifying that reality and obscuring it with too much detail." (Alberto Cairo, "How Charts Lie", 2019)

🎯C S V Murthy - Collected Quotes

"[a scatter diagram] is a graph in which the values of two variables are plotted along two axes, the pattern of the resulting points revealing any correlation present. It graphs pairs of numerical data, with one variable on each axis, to look for a relationship between them. If the variables are correlated, the points will fall along a line or curve. The better the correlation, the tighter the points will have the line." (C S V Murthy, "Data and Businesss Analytics", 2020)

"Decision tree is a graphical representation of a decision situation in which decision situation points (nodes) are connected together by arcs (one for each alternative on a decision) and terminate in ovals (the action that is the result of all the decisions made on the path leading to that oval). [...] A tree is made up of multilevel group of elements called nodes. A node is nothing more than a point at which subsidiary data originate. This particular logical data structure is called a tree simply because it looks like a tree, usually turned upside down. Genealogists use a schema called a tree to show ancestral descent of a person, family or group. Data associated by a tree schema are hierarchical. They branch from a point or node without forming loops or polygons. Data presented in a tree structure make two conditions. First, the tree must have a single root node. Second, all nodes other than the root node must be related to one and only one higher level node." (C S V Murthy, "Data and Businesss Analytics", 2020)

"Every interaction includes both presentation and dialogue. Presentation provides the layout of information on a computer screen. Dialogue provides an interaction sequence between a user and computer. Interfaces and dialogue will help users to solve their problems. Presentation must include objects that the user can readily understand in terms of their daily work. The dialogue must correspond to user’s normal work and to their mental model of the system (Mental model is the way a user sees a problem). Both presentation/dialogue depend on what users are doing." (C S V Murthy, "Data and Businesss Analytics", 2020)

"Information relevance refers to the extent to which information is appropriate for the decision-making situation facing the manager. Extraneous or extra information distracts the decision-maker from the assigned task and information overload frustrates the decision-maker and impairs the decision-making process. Relevant information must pertain to the problems, decisions and responsibilities of the recipient." (C S V Murthy, "Data and Businesss Analytics", 2020)

"Information that is complete means information that covers key issues and is sufficient to support the decision-making situation at hand without critical omissions. The more complete a body of information, is obviously, the more expensive it is to develop and maintain. Care must also be taken not to provide extra information than needed, due to its expense, and not to provide so much information that the recipient will suffer from information overload (information indigestion)." (C S V Murthy, "Data and Businesss Analytics", 2020)

"Ridge Regression is a technique for analysing multiple regression data that suffer from multicollinearity. When multicollinearity occurs, least squares estimates are unbiased, but their variances are large. So, they may be far from the true value. By adding a degree of bias to the regression estimates, principal components regression reduces the standard errors." (C S V Murthy, "Data and Businesss Analytics", 2020)

"Spectral methods are a class of techniques used in applied mathematics and scientific computing to numerically solving certain differential equations, potentially involving the use of the fast Fourier transform. This is an algorithm that samples a signal over a period of time and divides it into its frequency components. These components are single sinusoidal oscillations at distant frequencies each with their own amplitude and phase." (C S V Murthy, "Data and Businesss Analytics", 2020)

"The concept of programmed decisions is important because the ultimate (and unachievable) goal of information systems is to provide purely programmed decisions. Because this is not possible, we seek to provide the optimum type of information to the human decision-maker, who then makes non-programmable decisions. Decisions lend themselves to programming techniques if they are repetitive and routine, and if a procedurs can be worked out for handling them so that each is neither an ad hoc decision nor one to be treated as a new situation each time it arises." (C S V Murthy, "Data and Businesss Analytics", 2020)

"Timeliness means that information is available when it is needed. Most managers function in a dynamic environment of change, demands updated and current information. Computerised information systems have the ability to gather, sort, analyse, store, retrieve, and transmit large amounts of information in a very short period of time. Completeness of information is the extent to which information is all there." (C S V Murthy, "Data and Businesss Analytics", 2020)

"Understanding complex information systems begins with a clear understanding of information and its general characteristics. Information can be considered as the very blood of an organisation, but it must be properly understood and appropriately distinguished from data. Too many times, the terms ‘data’ and ‘information’ are used interchangeably, but the terms most clearly mean entirely different things. Data should be defined as raw, unsummarised and unanalysed facts. Information is data that has been presented in such a way as to alter the receiver’s understanding. Data are the raw materials from which information is derived. This is a necessary distinction for the manager to make, because loads of data can be generated, without producing even one iota of useful information." (C S V Murthy, "Data and Businesss Analytics", 2020)

"Visualisation is any technique for creating images, diagrams or animations to communicate a message; techniques used to communicate data or information by encoding it as visual objects, e.g., points, lines or bars contained in graphics. One of the most important benefits of visualisation is that it allows us visual access to huge amounts of data in easily digestible visuals. Well designed data graphics are usually the simplest, and at the same time, the most powerful." (C S V Murthy, "Data and Businesss Analytics", 2020) 

30 May 2026

✏️Gerald Benoît - Collected Quotes

"A model links to the viewers’ engagement with the visualization. Can the viewers identify the purpose and create a relationship in their mind between the nascent message of your visualization and their knowledge and work practices? When sketching out the design and considering the data, what is the first intention of the design? How will viewers interpret the goal of the visualization?" (Gerald Benoît,"Introduction to Information Visualization: Transforming Data into Meaningful Information", 2019)

"A well-designed 'information visualization' is interactive, allowing viewers to converse with the data: gaining knowledge, exposing insights, and engaging with the data in unexpected ways. It is only through these conversations that the otherwise static display of data transforms into meaningful information." (Gerald Benoît,"Introduction to Information Visualization: Transforming Data into Meaningful Information", 2019)

"Before progressing to analysis and visualization of the data, examine the data for inconsistencies and missing values. Data that fall outside an expected range, values that are missing or null, or have a different encoding or data type need to be addressed." (Gerald Benoît,"Introduction to Information Visualization: Transforming Data into Meaningful Information", 2019)

"Contemporary information specialists should at least be conversant in the pros/cons, benefits and liabilities, tech and data requirements of each software product they might use." (Gerald Benoît,"Introduction to Information Visualization: Transforming Data into Meaningful Information", 2019)

"Experience shows that both neophyte designers of visualizations and commercial visualization applications often overlook the role that type plays in legibility, aesthetics, and meaning construction. Yet the most successful visualizations are those where the details of data, design, and aesthetics are in harmony, and the interactivity allows the end user to understand the explanation and to explore." (Gerald Benoît,"Introduction to Information Visualization: Transforming Data into Meaningful Information", 2019)

"For an information visualization specialist, we must weigh the impact of the purely visual aspects of our designs as well applying visual norms that facilitate interpretation. Finally, we integrate data as the foundation of the visualization - all in a way where each coheres—that is, each contributes the same message to the viewer albeit in different languages (textual, data, interactive, and visual). It’s not useful nor possible to study themes of the aesthetic, technical, and applications of visuals independently of the others." (Gerald Benoît,"Introduction to Information Visualization: Transforming Data into Meaningful Information", 2019)

"Information visualization displays meet the definition of an art form in that there is an intended message to be communicated, and the principles of graphic design are applied as they are in other information graphics. Unlike other forms of representational art, InfoVis is a representational art of 'information' as an abstract phenomenon, with the goal of engaging the viewer with forms of interactivity that are not possible with a painting." (Gerald Benoît,"Introduction to Information Visualization: Transforming Data into Meaningful Information", 2019)

"Knowing what graphic representation to apply is partially a function of the data themselves and partially from the designer’s understanding of the target audience viewing the graphic. The Internet and publications have many recommended charting types." (Gerald Benoît,"Introduction to Information Visualization: Transforming Data into Meaningful Information", 2019)

"The problem-solving approach favored in the big data/data science realm is datacentric. This is likely because of the similarities between traditional data- and text-mining activities that incorporate visualizing results for exploration and explanation. This field contributes to receptiveness by institutions and the public to very large datasets and the computational infrastructure that provides the data. For data scientists, however, the ultimate interest is using visuals to help chart the data, as opposed to interacting with them. The emphasis is on large datasets and machine learning." (Gerald Benoît,"Introduction to Information Visualization: Transforming Data into Meaningful Information", 2019)

"The rule of thirds applies to fonts, too. The use of fonts is more subtle than one might imagine at first glance. The extreme subtlety of detail when designing fonts contributes to an equally subtle affective impact on a design. The choice of fonts also contributes more evidently to legibility. To a graphic designer, the choice of font contributes to the overall design, addressing more than legibility because the design is tempered with sensitivity to the limitations of the output device (monitor), size of the font, and the overall aesthetic tone." (Gerald Benoît,"Introduction to Information Visualization: Transforming Data into Meaningful Information", 2019)

" [...] the rule of three applies to the choice of typography, too. In design practice, there is usually a heading font, body text, and then a font for details. [...]  Even though two of the roles (title and body) are the same font name, one is bold and the other is regular. This equates to two fonts. It is common, too, to use a serif font for a title and then a sans serif for the other two (or vice versa). Learning which fonts to use comes only from practice and studying examples." (Gerald Benoît,"Introduction to Information Visualization: Transforming Data into Meaningful Information", 2019)

"When teaching design composition for posters and for websites, there are some introductory rules [...]. One is the 'rule of thirds'. This equates to (no more than) three colors in the design, three typefaces, and three display areas in a design composition [...]" (Gerald Benoît,"Introduction to Information Visualization: Transforming Data into Meaningful Information", 2019)

📉Graphical Representation: Projections (Just the Quotes)

"Whatever relates to extent and quantity may be represented by geometrical figures. Statistical projections which speak to the senses without fatiguing the mind, possess the advantage of fixing the attention on a great number of important facts." (Alexander von Humboldt, 1811)

"Business executives, to be efficient, must constantly plan ahead, but there are pitfalls in attempting to estimate the future growth of a business from a chart of its past history. In the first place, there are too many uncontrollable factors entering into the situation to make the most careful estimate of future growth anything more than a shrewd guess, dependent upon all internal and external conditions remaining the same. To project the growth curve of a business into the future provides a good mark to shoot at, but a bank loan is seldom obtainable on the strength of such a projection."  (Walter E Weld, "How to Chart; Facts from Figures with Graphs", 1959)

"Charts not only tell what was, they tell what is; and a trend from was to is (projected linearly into the will be) contains better percentages than clumsy guessing." (Robert A Levy, "The Relative Strength Concept of Common Stock Forecasting", 1968)

"There is no end to the information we can use. A 'good' map provides the information we need for a particular purpose - or the information the mapmaker wants us to have. To guide us, a map’s designers must consider more than content and projection; any single map involves hundreds of decisions about presentation." (Peter Turchi, "Maps of the Imagination: The writer as cartographer", 2004)

"The first thing you must understand is that information design is not limited to the visualization of data, in presentation design or any other application. It can and should be used to visualize other concepts such as hierarchy (org charts), anatomy (portfolio allocation), and chronology (timeline of events). Beyond the bar graphs showing sales figures and monthly projections, there are many more opportunities to explain concepts with visuals that will engage your audience and clarify your key points."  (Jason Lankow et al, "Infographics: The power of visual storytelling", 2012)

"Conceptually, mosaic plots for s + 1 factors in strength s designs can be used for any s; in practice, the idea is limited by space constraints, especially for accommodating labels for the factor levels. All four margins are used for four-factor projections; with the next dimension, one margin has to be used for two factors. In practice, one will rarely consider mosaic plots for more factors than four at a time." (Ulrike Grömping, "Mosaic Plots are Useful for Visualizing Low-Order Projections of Factorial Designs", The American Statistician Vol. 68 (2), 2014)

"A well-designed graph clearly shows you the relevant end points of a continuum. This is especially important if you’re documenting some actual or projected change in a quantity, and you want your readers to draw the right conclusions. […]" (Daniel J Levitin, "Weaponized Lies", 2017)

"All maps lie because they are based on the principle of projecting a spherical surface, the Earth, onto a plane. All maps distort some geographic feature, such as the sizes of the areas represented or the shapes of those areas."  (Alberto Cairo, "How Charts Lie", 2019)

✏️ Leandro N de Castro - Collected Quotes

"A bar chart is similar to a line chart, except that each data point is replaced by a rectangle with a height proportional to the value. The rectangle is usually centered on the spatial attribute of the data, and its width is often uniform. When values are categorical or discrete and cannot be shown in a series, a bar chart may be a suitable alternative for the line chart. Similarly to the case of a line chart, it is possible to create multivariate bar charts by stacking the bars on top of each other in a form of superimposition easy to interpret." (Leandro N de Castro, "Exploratory Data Analysis: Descriptive Analysis, Visualization, and Dashboard Design", 2025)

"A scatterplot is a data visualization graph that uses dots to represent the relationship between two quantitative variables. One variable, called the explanatory variable, is plotted on the x‑axis, and the other variable, called the response variable, is plotted on the y‑axis. It is also possible to include a third categorical variable, represented by different dot colors. Each dot represents an individual data point, and the colors, when used, represent the categories of the dots. Therefore, the data point is organized into two or three columns, one for each variable, and each data point is plotted on the graph using two coordinates, one for each variable, with various colors representing each category." (Leandro N de Castro, "Exploratory Data Analysis: Descriptive Analysis, Visualization, and Dashboard Design", 2025)

"Closure is a feature related to our capability of completing (closing) an object or a shape that is incomplete, that is, one that has some parts missing. The preattentive processing of closure is also automatic, not requiring conscious effort. For example, when looking at any shape, e.g., a circle or a square, with a small part missing, our brain automatically and preattentively perceives whether the shape is incomplete and fills these gaps. Preattentive processing of closure can be used in visual communication to create recognizable symbols and logos." (Leandro N de Castro, "Exploratory Data Analysis: Descriptive Analysis, Visualization, and Dashboard Design", 2025)

"Color is a powerful visual tool to encode data and convey different meanings, such as categories, magnitude, visual hierarchy, and even emotions. Using different hues, saturations, and brightness levels can help differentiate between categories or show patterns in the data." (Leandro N de Castro, "Exploratory Data Analysis: Descriptive Analysis, Visualization, and Dashboard Design", 2025)

"Curvature is another preattentive feature that leads to a fast detection of changes in the degree of curvature, bending, or angularity of a shape or line, such as the presence of a more or less curved line in a group of otherwise similar lines. The degree of curvature in a line or shape can be used to represent different quantities or values, for instance, a smaller or larger number of peaks in a function." (Leandro N de Castro, "Exploratory Data Analysis: Descriptive Analysis, Visualization, and Dashboard Design", 2025)

"Data visualization, by contrast, focuses on the visual representation of data in such a way that its values, structure, nature, type, and variability are accurately expressed by means of graphs. It aims to support the exploration and understanding of data, the identification of patterns, trends, distributions, correlations, and anomalies, the communication of insights, and aid in decision‑making." (Leandro N de Castro, "Exploratory Data Analysis: Descriptive Analysis, Visualization, and Dashboard Design", 2025)

"Differences in orientation can help us differentiate between items (e.g., data points, lines, objects, etc.) or extract information about the data. For example, using vertical bars in a bar chart can help differentiate between categories, while using horizontal bars can emphasize the magnitude of the data. Angles and direction can be used to convey information, such as trends, movement, sense of depth, or changes in values." (Leandro N de Castro, "Exploratory Data Analysis: Descriptive Analysis, Visualization, and Dashboard Design", 2025)

"In data visualization, texture is the visual quality of an object related to its roughness, pattern, or smoothness. It can be created using a variety of techniques, for example, using different line styles, brushes, patterns, and even special effects. Differences in texture can help distinguish between data points or objects, create visual hierarchies, or convey information about the data. For example, using different textures for different categories can help viewers quickly identify and differentiate patterns. Like the other features described here, the texture is usually processed preattentively, without the need for focused attention." (Leandro N de Castro, "Exploratory Data Analysis: Descriptive Analysis, Visualization, and Dashboard Design", 2025)

"Length is another preattentive visual property that can be used to create visual contrast, differences, importance, and proportions. The perception of differences in length normally occurs automatically and rapidly, without conscious effort or attention. It can be used in visual communication to quickly draw attention to important information or to create a visual hierarchy. For example, in a graph, longer bars may indicate larger values or quantities; in a map, longer lines may indicate longer distances; in a drawing, longer items may convey a sense of flow, etc." (Leandro N de Castro, "Exploratory Data Analysis: Descriptive Analysis, Visualization, and Dashboard Design", 2025)

"Line charts are useful for identifying patterns and trends in a one‑dimensional sequence of univariate data, that is, continuous data over time with a single value per data item. They map the sequence data (e.g., time) to one dimension, typically the x‑axis, and the data value to another dimension, typically the y‑axis, forming a line; or to the color of a mark or region along the spatial axis, forming a bar. The data is adjusted in size to be within the limits of the display attribute." (Leandro N de Castro, "Exploratory Data Analysis: Descriptive Analysis, Visualization, and Dashboard Design", 2025)

"Preattentive features, such as color, shape, orientation, and size, are those basic visual properties that are processed automatically, without conscious effort or attention. By understanding preattentive features, data analysts can create effective data visualization designs that make use of them to convey information more efficiently and accurately to the audience." (Leandro N de Castro, "Exploratory Data Analysis: Descriptive Analysis, Visualization, and Dashboard Design", 2025)

"Size is a preattentive feature that exerts a similar effect in vision as that exerted by the line width, that is, to detect differences quickly and automatically in items (e.g., objects, data points, font sizes, etc.). Differences in size can draw attention to specific data points, indicate hierarchy, emphasize specific items, or convey information about the magnitude of the data. Variation in size can be used to represent different quantities or values, where larger sizes may indicate higher values or importance, while smaller sizes may indicate lower values or importance." (Leandro N de Castro, "Exploratory Data Analysis: Descriptive Analysis, Visualization, and Dashboard Design", 2025)

"Preattentive processing of 3D (three‑dimensional) properties allows us to detect the depth and spatial relationships between objects, such as the presence of an object that appears to be closer or farther away than the others, without the need for focused attention. Perspective, lighting, size, or shading can be used to create the illusion of depth and convey information, such as relationships between variables." (Leandro N de Castro, "Exploratory Data Analysis: Descriptive Analysis, Visualization, and Dashboard Design", 2025)

"The histogram is a useful visualization technique to explore the pattern of a single variable distribution, where the x‑axis represents the range of values, and the y‑axis represents the absolute or relative frequency of data points within each bin. Histograms allow the exploration of central tendency measures, such as the mean and median; dispersion measures, such as the standard deviation; and range, and shape, such as skewness and kurtosis. It also helps to identify outliers or unusual values and to reveal potential biases or errors in the data collection process." (Leandro N de Castro, "Exploratory Data Analysis: Descriptive Analysis, Visualization, and Dashboard Design", 2025)
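
As a small illustration of the quote above, the following sketch bins a made-up sample and reports the absolute and relative frequency per bin, together with the mean and median. The function name `histogram`, the sample values, and the bin limits are assumptions chosen for this example and do not come from the book.

```python
from statistics import mean, median

def histogram(values, bins, lo, hi):
    """Return (absolute, relative) frequencies for equal-width bins on [lo, hi]."""
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        if lo <= v <= hi:
            # The upper edge `hi` is folded into the last bin.
            idx = min(int((v - lo) / width), bins - 1)
            counts[idx] += 1
    total = sum(counts)
    relative = [c / total for c in counts] if total else [0.0] * bins
    return counts, relative

# Hypothetical sample with one unusually large value (9).
sample = [1, 2, 2, 3, 3, 3, 4, 4, 9]
counts, relative = histogram(sample, bins=4, lo=0, hi=10)

print("absolute frequencies:", counts)
print("relative frequencies:", relative)
print("mean:", mean(sample), "median:", median(sample))
```

In this sample the empty third bin followed by a single count in the last bin marks the value 9 as a possible outlier, and the mean sits above the median, which is consistent with a right-skewed shape.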

"The preattentive processing of density occurs automatically and rapidly, without conscious effort or attention, and can be used in visual communication to create contrast and emphasize importance or relevance. This feature can be swiftly detected by the presence of varying numbers of objects (e.g., data points or shapes) in a given region of the space, representing different quantities or values. For instance, in a chart or graph, a higher density of data points can be used to represent a larger quantity, a more significant trend, or a more exciting or energetic area. By making use of the preattentive processing of density, designers can create effective visual designs that convey information quickly and efficiently to the viewer." (Leandro N de Castro, "Exploratory Data Analysis: Descriptive Analysis, Visualization, and Dashboard Design", 2025)

"The preattentive processing of markings (e.g., stripes, dots, crosses, stars, hatchings, etc.) includes various visual properties, such as texture, shading, and patterns. These properties allow us to swiftly detect differences and similarities between objects or regions, such as the presence of a repeating pattern in a group of otherwise random shapes. The presence or absence of certain markings, such as dots or squares, can be used to represent different categories or values." (Leandro N de Castro, "Exploratory Data Analysis: Descriptive Analysis, Visualization, and Dashboard Design", 2025)

"The principle of closure states that incomplete objects are perceived as complete because our brain tends to fill the gaps to create the complete image. Note that closure is also a preattentive feature and thus plays a key role not only in the quick filling of gaps or completion of shapes, but also in the organization of the information to be conveyed." (Leandro N de Castro, "Exploratory Data Analysis: Descriptive Analysis, Visualization, and Dashboard Design", 2025)

"The principle of common fate proposes that objects that move together or change similarly tend to be perceived as a group or a pattern. In this case, graphs that allow visualizing data obeying this principle will have to embody a type or a sense of motion. To illustrate this principle, let us consider a motion chart, a streamgraph, and a force‑directed graph. The motion chart is a visualization method that shows how data changes over time; the streamgraph is a stacked area graph that shows the changes in a set of data over time; and the force‑directed graph is a network visualization that shows the relationships of nodes in a graph. In all cases, there is a sense of common fate in the data." (Leandro N de Castro, "Exploratory Data Analysis: Descriptive Analysis, Visualization, and Dashboard Design", 2025)

"The principle of continuity states that objects that are arranged in a smooth, continuous way are more likely to be perceived as a single object, even if their pattern is interrupted. The line chart, the Sankey diagram, and the scatterplot are good examples of the principle of continuity in the use of Gestalt theory in data visualization." (Leandro N de Castro, "Exploratory Data Analysis: Descriptive Analysis, Visualization, and Dashboard Design", 2025)

"The principle of figure‑ground, also called figure‑field, states that objects are perceived as either being in the foreground or the background. One way of forcing this principle is by using contrasting colors in the background and foreground of an image, for instance, black and white, blue and orange, green and purple, red and green, yellow and purple, pink and green, and others. However, many of these pairs are not suitable for technical and scientific works, and thus, the recommendation is to use colors with parsimony." (Leandro N de Castro, "Exploratory Data Analysis: Descriptive Analysis, Visualization, and Dashboard Design", 2025)

"The principle of proximity proposes that objects that are close to one another tend to be perceived as a group or a pattern. In data visualization, the heatmap, the scatterplot, and the bar chart are good examples of methods that account for the principle of proximity. The heatmap is a graph in which the values of a matrix are represented by colors, which are a preattentive feature, and neighboring cells in the matrix convey a sense of organization and relationship. The scatterplot places similar data values close to one another, grouping them in the plot. In a bar chart, related data values are placed close together in the bars, allowing a visual association among them." (Leandro N de Castro, "Exploratory Data Analysis: Descriptive Analysis, Visualization, and Dashboard Design", 2025)

"The principle of similarity proposes that objects that share similar characteristics, such as color or form, tend to be perceived as a group or a pattern. Examples of data visualization techniques that account for the similarity principle in Gestalt theory include a line chart in which lines representing different categories have the same style, a bar chart in which the bar patterns or colors indicate the same group or category, and a scatterplot with different markers representing different categories of categorical variables." (Leandro N de Castro, "Exploratory Data Analysis: Descriptive Analysis, Visualization, and Dashboard Design", 2025)

"The principle of symmetry states that objects that are symmetrical, or have a balanced appearance, tend to be perceived as a group or a pattern. Some data visualization graphs that can be used to explore this principle are the boxplot with boxes symmetrically placed around the median (Q2), the radar chart displaying multivariate data as a bidimensional chart with quantitative variables, and the mirrored bar chart with two sets of bars with mirrored values displayed." (Leandro N de Castro, "Exploratory Data Analysis: Descriptive Analysis, Visualization, and Dashboard Design", 2025)

"Preattentive processing of position allows us to quickly detect changes in location, such as the presence of a dot or other object that is slightly displaced from the others. The spatial location of visual elements can also be used to guide the viewer’s attention or encode information, such as ranking, hierarchy, or relationship (grouping)." (Leandro N de Castro, "Exploratory Data Analysis: Descriptive Analysis, Visualization, and Dashboard Design", 2025)

"The preattentive processing of shape is a basic visual property that enables us to swiftly detect similarities and differences between items based on their shape, without requiring conscious effort or attention. For instance, in a picture with squares and circles, one can quickly differentiate one from the other based on their shapes. Similarly, using different shapes for different forms or categories, or using a shape that is indicative of the data (e.g., a circle for data on a map), can help viewers quickly identify patterns." (Leandro N de Castro, "Exploratory Data Analysis: Descriptive Analysis, Visualization, and Dashboard Design", 2025)

29 May 2026

📉Graphical Representation: Uncertainty (Just the Quotes)

"A histogram consists of the outline of bars of equal width and appropriate length next to each other. By connecting the frequency values at the position of the nominal values (the midpoints of the intervals) with straight lines, a frequency polygon is obtained. Attaching classes with frequency zero at either end makes the area (the integral) under the frequency polygon equal to that under the histogram." (Manfred Drosg, "Dealing with Uncertainties: A Guide to Error Analysis", 2007)
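
The area claim in this quote can be checked numerically. The sketch below assumes equal-width classes and uses invented counts; the helper names `histogram_area` and `polygon_area` are hypothetical and only serve this illustration.

```python
def histogram_area(counts, width):
    """Area under the histogram: each bar contributes count * width."""
    return sum(counts) * width

def polygon_area(counts, width, pad=True):
    """Area under the frequency polygon through the class midpoints.

    With pad=True, one zero-frequency class is attached at either end,
    as described in the quote. Midpoints are `width` apart, so each
    segment between neighbours is a trapezoid of that base.
    """
    heights = [0, *counts, 0] if pad else list(counts)
    return sum((a + b) / 2 * width for a, b in zip(heights, heights[1:]))

counts = [3, 7, 12, 5, 1]   # made-up class frequencies
width = 2.0                 # made-up class width

print(histogram_area(counts, width))           # 56.0
print(polygon_area(counts, width))             # 56.0
print(polygon_area(counts, width, pad=False))  # 52.0
```

Without the two zero-frequency classes the polygon loses half of the first and last bars, which is why the unpadded area comes out smaller.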

"For linear dependences the main information usually lies in the slope. It is obvious that those points that lie far apart have the strongest influence on the slope if all points have the same uncertainty. In this context we speak of the strong leverage of distant points; when determining the parameter 'slope' these distant points carry more effective weight. Naturally, this weight is distinct from the 'statistical' weight usually used in regression analysis." (Manfred Drosg, "Dealing with Uncertainties: A Guide to Error Analysis", 2007)
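
One way to make this leverage concrete is to assume an ordinary least-squares straight-line fit with equal uncertainties on all points. That fitting method is an assumption of this sketch, since the quote does not name one. Under it, the fitted slope is a weighted sum of the y values, and the weight of each point grows with its distance from the mean of x. The names `slope_weights` and `fitted_slope` and the data are hypothetical.

```python
from math import isclose

def slope_weights(xs):
    """Weight of each y value in the least-squares slope: (x_i - mean) / Sxx."""
    mean_x = sum(xs) / len(xs)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return [(x - mean_x) / sxx for x in xs]

def fitted_slope(xs, ys):
    """Least-squares slope written as a weighted sum of the y values."""
    return sum(w * y for w, y in zip(slope_weights(xs), ys))

# Made-up data: four clustered points and one distant point at x = 10.
xs = [0, 1, 2, 3, 10]
ys = [2 * x + 1 for x in xs]  # exact line with slope 2

weights = slope_weights(xs)
for x, w in zip(xs, weights):
    print(f"x = {x:>2}  weight on slope = {w:+.4f}")
print("fitted slope:", fitted_slope(xs, ys))
```

The distant point at x = 10 has the largest absolute weight, so an error in its y value would shift the slope more than the same error at any of the clustered points.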

"It is important to pay heed to the following detail: a disadvantage of logarithmic diagrams is that a graphical integration is not possible, i.e., the area under the curve (the integral) is of no relevance." (Manfred Drosg, "Dealing with Uncertainties: A Guide to Error Analysis", 2007)

"Perception requires imagination because the data people encounter in their lives are never complete and always equivocal. [...] We also use our imagination and take shortcuts to fill gaps in patterns of nonvisual data. As with visual input, we draw conclusions and make judgments based on uncertain and incomplete information, and we conclude, when we are done analyzing the patterns, that our picture is clear and accurate. But is it?" (Leonard Mlodinow, "The Drunkard’s Walk: How Randomness Rules Our Lives", 2008)

"After you visualize your data, there are certain things to look for […]: increasing, decreasing, outliers, or some mix, and of course, be sure you’re not mixing up noise for patterns. Also note how much of a change there is and how prominent the patterns are. How does the difference compare to the randomness in the data? Observations can stand out because of human or mechanical error, because of the uncertainty of estimated values, or because there was a person or thing that stood out from the rest. You should know which it is." (Nathan Yau, "Data Points: Visualization That Means Something", 2013)

"The data is a simplification - an abstraction - of the real world. So when you visualize data, you visualize an abstraction of the world, or at least some tiny facet of it. Visualization is an abstraction of data, so in the end, you end up with an abstraction of an abstraction, which creates an interesting challenge. […] Just like what it represents, data can be complex with variability and uncertainty, but consider it all in the right context, and it starts to make sense." (Nathan Yau, "Data Points: Visualization That Means Something", 2013)

"Estimates based on data are often uncertain. If the data were intended to tell us something about a wider population (like a poll of voting intentions before an election), or about the future, then we need to acknowledge that uncertainty. This is a double challenge for data visualization: it has to be calculated in some meaningful way and then shown on top of the data or statistics without making it all too cluttered." (Robert Grant, "Data Visualization: Charts, Maps and Interactive Graphics", 2019)

"In statistics, 'error' is not a synonym for 'mistake', but rather a synonym for 'uncertainty.' Error means that any estimate we make, no matter how precise it looks in our chart or article [...] is usually a middle point of a range of possible values." (Alberto Cairo, "How Charts Lie", 2019)

"Uncertainty confuses many people because they have the unreasonable expectation that science and statistics will unearth precise truths, when all they can yield is imperfect estimates that can always be subject to changes and updates." (Alberto Cairo, "How Charts Lie", 2019)
