SQL Troubles: exploration

Showing posts with label exploration. Show all posts

06 July 2025

🧭Business Intelligence: Perspectives (Part 32: Data Storytelling in Visualizations)

Business Intelligence Series

From data-related professionals to book authors on data visualization topics, there are many voices that require from any visualization to tell a story, respectively to conform to storytelling principles and best practices, and this independently of the environment or context in which the respective artifacts are considered. The need for data visualizations to tell a story may be entitled, though in business setups the data, its focus and context change continuously with the communication means, objectives, and, at least from this perspective, one can question storytelling’s hard requirement.

Data storytelling can be defined as "a structured approach for communicating data insights using narrative elements and explanatory visuals" [1]. Usually, this supposes the establishment of a context, respectively a fundament on which further facts, suppositions, findings, arguments, (conceptual) models, visualizations and other elements can be based upon. Stories help to focus the audience on the intended messages, they connect and eventually resonate with the audience, facilitate the retaining of information and understanding the chain of implications the decisions in scope have, respectively persuade and influence, when needed.

Conversely, besides the fact that it takes time and effort to prepare stories and the afferent content (presentations, manually created visualizations, documentation), expecting each meeting to be a storytelling session can rapidly become a nuisance for the auditorium as well for the presenters. Like in any value-generating process, one should ask where the value in storytelling is based on data visualizations and the effort involved, or whether the effort can be better invested in other areas.

In many scenarios, requesting from a dashboard to tell a story is an entitled requirement given that many dashboards look like a random combination of visuals and data whose relationship and meaning can be difficult to grasp and put into a plausible narrative, even if they are based on the same set of data. Data visualizations of any type should have an intentional well-structured design that facilitates visual elements’ navigation, understanding facts’ retention, respectively resonate with the auditorium.

It’s questionable whether such practices can be implemented in a consistent and meaningful manner, especially when rich navigation features across multiple visuals are available for users to look at data from different perspectives. In such scenarios the identification of cases that require attention and the associations existing between well-established factors help in the discovery process.

Often, it feels like visuals were arranged aleatorily in the page or that there’s no apparent connection between them, which makes the navigation and understanding more challenging. For depicting a story, there must be a logical sequencing of the various visualizations displayed in the dashboards or reports, especially when visuals’ arrangement doesn’t reflect the typical navigation of the visuals or when the facts need a certain sequencing that facilitates understanding. Moreover, the sequencing doesn’t need to be linear but have a clear start and end that encompasses everything in between.

Storytelling works well in setups in which something is presented as the basis for one-time or limited in scope sessions like decision-making, fact-checking, awareness raising and other types of similar communication. However, when building solutions for business monitoring and data exploration, there can be multiple stories or no story worth telling, at least not for the predefined scope. Even if one can zoom in or out, respectively rearrange the visuals and add others to highlight the stories encompassed, the value added by taking the information out of the dashboards and performing such actions can be often neglected to the degree that it doesn’t pay off. A certain consistency, discipline and acumen is needed then for focusing on the important aspects and ignoring thus the nonessential.

Previous Post <<||>> Next Post

References:
[1] Brent Dykes, "Effective Data Storytelling: How to Drive Change with Data, Narrative and Visuals", 2019 [quotes]

14 December 2024

🧭💹Business Intelligence: Perspectives (Part 21: Data Visualization Revised)

Data Visualization Series

Creating data visualizations nowadays became so easy that anybody can do it with a minimum of effort and knowledge, which on one side is great for the creators but can be easily become a nightmare for the readers, respectively users. Just dumping data in visuals can be barely called data visualization, even if the result is considered as such. The problems of visualization are multiple – the lack of data culture, the lack of understanding processes, data and their characteristics, the lack of being able to define and model problems, the lack of educating the users, the lack of managing the expectations, etc.

There are many books on data visualization though they seem an expensive commodity for the ones who want rapid enlightenment, and often the illusion of knowing proves maybe to be a barrier. It's also true that many sets of data are so dull, that the lack of information and meaning is compensated by adding elements that give a kitsch look-and-feel (aka chartjunk), shifting the attention from the valuable elements to decorations. So, how do we overcome the various challenges?

Probably, the most important step when visualizing data is to define the primary purpose of the end product. Is it to inform, to summarize or to navigate the data, to provide different perspectives at macro and micro level, to help discovery, to explore, to sharpen the questions, to make people think, respectively understand, to carry a message, to be artistic or represent truthfully the reality, or maybe is just a filler or point of attraction in a textual content?

Clarifying the initial purpose is important because it makes upfront the motives and expectations explicit, allowing to determine the further requirements, characteristics, and set maybe some limits in what concern the time spent and the qualitative and/or qualitative criteria upon which the end result should be eventually evaluated. Narrowing down such aspects helps in planning and the further steps performed.

Many of the steps are repetitive and past experience can help reduce the overall effort. Therefore, professionals in the field, driven by intuition and experience probably don't always need to go through the full extent of the process. Conversely, what is learned and done poorly, has high chances of delivering poor quality.

A visualization can be considered as effective when it serves the intended purpose(s), when it reveals with minimal effort the patterns, issues or facts hidden in the data, when it allows people to explore the data, ask questions and find answers altogether. One can talk also about efficiency, especially when readers can see at a glance the many aspects encoded in the visualization. However, the more the discovery process is dependent on data navigation via filters or other techniques, the more difficult it becomes to talk about efficiency.

Better criteria to judge visualizations is whether they are meaningful and useful for the readers, whether the readers understood the authors' intent, the further intrinsic implication, though multiple characteristics can be associated with these criteria: clarity, specificity, correctedness, truthfulness, appropriateness, simplicity, etc. All these are important in lower or higher degree depending on the broader context of the visualization.

All these must be weighted in the bigger picture when creating visualizations, though there are probably also exceptions, especially on the artistic side, where artists can cut corners for creating an artistic effect, though also in here the authors need to be truthful to the data and make sure that their work don't distort excessively the facts. Failing to do so might not have an important impact on the short term considerably, though in time the effects can ripple with unexpected effects.

Previous Post <<||>> Next Post

13 December 2024

🧭💹Business Intelligence: Perspectives (Part 20: From BI to AI)

Business Intelligence Series

No matter how good data visualizations, reports or other forms of BI artifacts are, they only serve a set of purposes for a limited amount of time, limited audience or any other factors that influence their lifespan. Sooner or later the artifacts become thus obsolete, being eventually disabled, archived and/or removed from the infrastructure.

Many artifacts require a considerable number of resources for their creation and maintenance over time. Sometimes the costs can be considerably higher than the benefits brought, especially when the data or the infrastructure are used for a narrow scope, though there can be other components that need to be considered in the bigger picture. Having a report or visualization one can use when needed can have an important impact on the business in correcting issues, sizing opportunities or filling the knowledge gaps.

Even if it’s challenging to quantify the costs associated with the loss of opportunities rooted in the lack of data, respectively information, the amounts can be considerable high, greater even than building a whole BI infrastructure. Organization’s agility in addressing the important gaps can make a considerable difference, at least in theory. Having the resources that can be pulled on demand can give organizations the needed competitive boost. Internal or external resources can be used altogether, though, pragmatically speaking, there will be always a gap between demand and supply of knowledgeable resources.

The gap in BI artefacts can be addressed nowadays by AI-driven tools, which have the theoretical potential of shortening the gap between needs and the availability of solutions, respectively a set of answers that can be used in the process. Of course, the processes of sense-making and discovery are not that simple as we’d like, though it’s a considerable step forward.

Having the possibility of asking questions in natural language and guiding the exploration process to create visualizations and other artifacts using prompt engineering and other AI-enabled methods offers new possibilities and opportunities that at least some organizations started exploring already. This however presumes the existence of an infrastructure on which the needed foundation can be built upon, the knowledge required to bridge the gap, respectively the resources required in the process.

It must be stressed out that the exploration processes may bring no sensible benefits, at least no immediately, and the whole process depends on organizations’ capabilities of identifying and sizing the respective opportunities. Therefore, even if there are recipes for success, each organization must identify what matters and how to use technologies and the available infrastructure to bridge the gap.

Ideally to make progress organizations need besides the financial resources the required skillset, a set of projects that support learning and value creation, respectively the design and execution of a business strategy that addresses the steps ahead. Each of these aspects implies risks and opportunities altogether. It will be a test of maturity for many organizations. It will be interesting to see how many organizations can handle the challenge, respectively how much past successes or failures will weigh in the balance.

AI offers a set of capabilities and opportunities, however the chance of exploring and failing fast is of great importance. AI is an enabler and not a magic wand, no matter what is preached in technical workshops! Even if progress follows an exponential trajectory, it took us more than half of century from the first steps until now and probably many challenges must be still overcome.

The future looks interesting enough to be pursued, though are organizations capable to size the opportunities, respectively to overcome the challenges ahead? Are organizations capable of supporting the effort without neglecting the other priorities?

Previous Post <<||>> Next Post

08 April 2024

🧭Business Intelligence: Why Data Projects Fail to Deliver Real-Life Impact (Part III: Failure through the Looking Glass)

Business Intelligence Series

There’s a huge volume of material available on project failure – resources that document why individual projects failed, why in general projects fail, why project members, managers and/or executives think projects fail, and there seems to be no other more rewarding activity at the end of a project than to theorize why a project failed, the topic culminating occasionally with the blaming game. Success may generate applause, though it's failure that attracts and stirs the most waves (irony, disapproval, and other similar behavior) and everybody seems to be an expert after the consumed endeavor.

The mere definition of a project failure – not fulfilling project’s objectives within the set budget and timeframe - is a misnomer because budgets and timelines are estimated based on the information available at the beginning of the project, the amount of uncertainty for many projects being considerable, and data projects are no exceptions from it. The higher the uncertainty the less probable are the two estimates. Even simple projects can reveal uncertainty especially when the broader context of the projects is considered.

Even if it’s not a common practice, one way to cope with uncertainty is to add a tolerance for the estimates, though even this practice probably will not always accommodate the full extent of the unknown as the tolerances are usually small. The general expectation is to have an accurate and precise landing, which for big or exploratory projects is seldom possible!

Moreover, the assumptions under which the estimates hold are easily invalidated in praxis – resources’ availability, first time right, executive’s support to set priorities, requirements’ quality, technologies’ maturity, etc. If one looks beyond the reasons why projects fail in general, quite often the issues are more organizational than technological, the lack of knowledge and experience being some of the factors.

Conversely, many projects will not get approved if the estimates don’t look positive, and therefore people are pressured in one way or another to make the numbers fit the expectations. Some projects, given their importance, need to be done even if the numbers don’t look good or can’t be quantified correctly. Other projects represent people’s subsistence on the job, respectively people's self-occupation to create motion, though they can occasionally have also a positive impact for the organizations. These kinds of aspects almost never make it in statistics or surveys. Neither do the big issues people are afraid to talk about. Where to consider that in the light of politics and office’s grapevine the facts get distorted!

Data projects reflect all the symptoms of failure projects have in general, though when words like AI, Statistics or Machine Learning are used, the chances for failure are even higher given that the respective fields require a higher level of expertise, the appropriate use of technologies and adherence to the scientific process for the results to be valid. If projects can benefit from general recipes, respectively established procedures and methods, their range of applicability decreases when the mentioned areas are involved.

Many data projects have an exploratory nature – seeing what’s possible - and therefore a considerable percentage will not reach production. Moreover, even those that reach that far might arrive to be stopped or discarded sooner or later if they don’t deliver the expected value, and probably many of the models created in the process are biased, irrelevant, or incorrectly apply the theory. Where to add that the mere use of tools and algorithms is not Data Science or Data Analysis.

The challenge for many data projects is to identify which Project Management (PM) best practices to consider. Following all or no practices at all just increases the risks of failure!

Previous Post <<||>> Next Post

SQL Troubles

Pages

06 July 2025

🧭Business Intelligence: Perspectives (Part 32: Data Storytelling in Visualizations)

14 December 2024

🧭💹Business Intelligence: Perspectives (Part 21: Data Visualization Revised)

13 December 2024

🧭💹Business Intelligence: Perspectives (Part 20: From BI to AI)

08 April 2024

🧭Business Intelligence: Why Data Projects Fail to Deliver Real-Life Impact (Part III: Failure through the Looking Glass)

About Me