Showing posts with label noise. Show all posts
Showing posts with label noise. Show all posts

16 October 2024

🧭💹Business Intelligence: Perspectives (Part XVIII: There’s More to Noise)

Business Intelligence Series
Business Intelligence Series

Visualizations should be built with an audience's characteristics in mind! Upon case, it might be sufficient to show only values or labels of importance (minima, maxima, inflexion points, exceptions, trends), while other times it might be needed to show all or most of the values to provide an accurate extended perspective. It even might be useful to allow users switching between the different perspectives to reduce the clutter when navigating the data or look at the patterns revealed by the clutter. 

In data-based storytelling are typically shown the points, labels and further elements that support the story, the aspects the readers should focus on, though this approach limits the navigability and users’ overall experience. The audience should be able to compare magnitudes and make inferences based on what is shown, and the accurate decoding shouldn’t be taken as given, especially when the audience can associate different meanings to what’s available and what’s missing. 

In decision-making, selecting only some well-chosen values or perspectives to show might increase the chances for a decision to be made, though is this equitable? Cherry-picking may be justified by the purpose, though is in general not a recommended practice! What is not shown can be as important as what is shown, and people should be aware of the implications!

One person’s noise can be another person’s signal. Patterns in the noise can provide more insight compared with the trends revealed in the "unnoisy" data shown! Probably such scenarios are rare, though it’s worth investigating what hides behind the noise. The choice of scale, the use of special types of visualizations or the building of models can reveal more. If it’s not possible to identify automatically such scenarios using the standard software, the users should have the possibility of changing the scale and perspective as seems fit. 

Identifying patterns in what seems random can prove to be a challenge no matter the context and the experience in the field. Occasionally, one might need to go beyond the general methods available and statistical packages can help when used intelligently. However, a presenter’s challenge is to find a plausible narrative around the findings and communicate it further adequately. Additional capabilities must be available to confirm the hypotheses framed and other aspects related to this approach.

It's ideal to build data models and a set of visualizations around them. Most probable some noise may be removed in the process, while other noise will be further investigated. However, this should be done through adjustable visual filters because what is removed can be important as well. Rare events do occur, probably more often than we are aware and they may remain hidden until we find the right perspective that takes them into consideration. 

Probably, some of the noise can be explained by special events that don’t need to be that rare. The challenge is to identify those parameters, associations, models and perspectives that reveal such insights. One’s gut feeling and experience can help in this direction, though novel scenarios can surprise us as well.

Not in every set of data one can find patterns, respectively a story trying to come out. Whether we can identify something worth revealing depends also on the data available at our disposal, respectively on whether the chosen data allow identifying significant patterns. Occasionally, the focus might be too narrow, too wide or too shallow. It’s important to look behind the obvious, to look at data from different perspectives, even if the data seems dull. It’s ideal to have the tools and knowledge needed to explore such cases and here the exposure to other real-life similar scenarios is probably critical!

10 December 2011

📉Graphical Representation: Noise (Just the Quotes)

"While all data contain noise, some data contain signals. Before you can detect a signal, you must filter out the noise." (Donald J Wheeler, "Understanding Variation: The Key to Managing Chaos" 2nd Ed., 2000)

"Noise is a signal we don't like. Noise has two parts. The first has to do with the head and the second with the heart. The first part is the scientific or objective part: Noise is a signal. [...] The second part of noise is the subjective part: It deals with values. It deals with how we draw the fuzzy line between good signals and bad signals. Noise signals are the bad signals. They are the unwanted signals that mask or corrupt our preferred signals. They not only interfere but they tend to interfere at random." (Bart Kosko, "Noise", 2006)

"One person’s signal is another person’s noise and vice versa. We call this relative role reversal the noise-signal duality." (Bart Kosko, "Noise", 2006)

"A signal is a useful message that resides in data. Data that isn’t useful is noise. […] When data is expressed visually, noise can exist not only as data that doesn’t inform but also as meaningless non-data elements of the display (e.g. irrelevant attributes, such as a third dimension of depth in bars, color variation that has no significance, and artificial light and shadow effects)." (Stephen Few, "Signal: Understanding What Matters in a World of Noise", 2015)

"Data contain descriptions. Some are true, some are not. Some are useful, most are not. Skillful use of data requires that we learn to pick out the pieces that are true and useful. [...] To find signals in data, we must learn to reduce the noise - not just the noise that resides in the data, but also the noise that resides in us. It is nearly impossible for noisy minds to perceive anything but noise in data." (Stephen Few, "Signal: Understanding What Matters in a World of Noise", 2015)

"There are two kinds of mistakes that an inappropriate inductive bias can lead to: underfitting and overfitting. Underfitting occurs when the prediction model selected by the algorithm is too simplistic to represent the underlying relationship in the dataset between the descriptive features and the target feature. Overfitting, by contrast, occurs when the prediction model selected by the algorithm is so complex that the model fits to the dataset too closely and becomes sensitive to noise in the data." (John D Kelleher et al, "Fundamentals of Machine Learning for Predictive Data Analytics: Algorithms, Worked Examples, and Case Studies", 2015)

"When we find data quality issues due to valid data during data exploration, we should note these issues in a data quality plan for potential handling later in the project. The most common issues in this regard are missing values and outliers, which are both examples of noise in the data." (John D Kelleher et al, "Fundamentals of Machine Learning for Predictive Data Analytics: Algorithms, worked examples, and case studies", 2015)

"Form simplification means simplifying relationships among the components of the whole, emphasizing the whole and reducing the relevance of individual components by standardizing and generalizing relationships. This results in an increased weight of useful information (signal) against useless information (noise)." (Jorge Camões, "Data at Work: Best practices for creating effective charts and information graphics in Microsoft Excel", 2016)

"In addition to managing how the data is visualized to reduce noise, you can also decrease the visual interference by minimizing the extraneous cognitive load. In these cases, the nonrelevant information and design elements surrounding the data can cause extraneous noise. Poor design or display decisions by the data storyteller can inadvertently interfere with the communication of the intended signal. This form of noise can occur at both a macro and micro level." (Brent Dykes, "Effective Data Storytelling: How to Drive Change with Data, Narrative and Visuals", 2019)

"Visualizations can remove the background noise from enormous sets of data so that only the most important points stand out to the intended audience. This is particularly important in the era of big data. The more data there is, the more chance for noise and outliers to interfere with the core concepts of the data set." (Kate Strachnyi, "ColorWise: A Data Storyteller’s Guide to the Intentional Use of Color", 2023)

Related Posts Plugin for WordPress, Blogger...

About Me

My photo
Koeln, NRW, Germany
IT Professional with more than 24 years experience in IT in the area of full life-cycle of Web/Desktop/Database Applications Development, Software Engineering, Consultancy, Data Management, Data Quality, Data Migrations, Reporting, ERP implementations & support, Team/Project/IT Management, etc.