Showing posts with label data. Show all posts
Showing posts with label data. Show all posts

30 July 2025

📊Graphical Representation: Sense-making in Data Visualizations (Part 3: Heuristics)

Graphical Representation Series
Graphical Representation Series
 

Consider the following general heuristics in data visualizations (work in progress):

  • plan design
    • plan page composition
      • text
        • title, subtitles
        • dates 
          • refresh, filters applied
        • parameters applied
        • guidelines/tooltips
        • annotation 
      • navigation
        • main page(s)
        • additional views
        • drill-through
        • zoom in/out
        • next/previous page
        • landing page
      • slicers/selections
        • date-related
          • date range
          • date granularity
        • functional
          • metric
          • comparisons
        • categorical
          • structural relations
      • icons/images
        • company logo
        • button icons
        • background
    • pick a theme
      • choose a layout and color schema
        • use a color palette generator
        • use a focused color schema or restricted palette
        • use consistent and limited color scheme
        • use suggestive icons
          • use one source (with similar design)
        • use formatting standards
    • create a visual hierarchy 
      • use placement, size and color for emphasis
      • organize content around eye movement pattern
      • minimize formatting changes
      • 1 font, 2 weights, 4 sizes
    • plan the design
      • build/use predictable and consistent templates
        • e.g. using Figma
      • use layered design
      • aim for design unity
      • define & use formatting standards
      • check changes
    • GRACEFUL
      • group visuals with white space 
      • right chart type
      • avoid clutter
      • consistent & limited color schema
      • enhanced readability 
      • formatting standard
      • unity of design
      • layered design
  • keep it simple 
    • be predictable and consistent 
    • focus on the message
      • identify the core insights and design around them
      • pick suggestive titles/subtitles
        • use dynamics subtitles
      • align content with the message
    • avoid unnecessary complexity
      • minimize visual clutter
      • remove the unnecessary elements
      • round numbers
    • limit colors and fonts
      • use a restrained color palette (<5 colors)
      • stick to 1-2 fonts 
      • ensure text is legible without zooming
    • aggregate values
      • group similar data points to reduce noise
      • use statistical methods
        • averages, medians, min/max
      • categories when detailed granularity isn’t necessary
    • highlight what matters 
      • e.g. actionable items
      • guide attention to key areas
        • via annotations, arrows, contrasting colors 
        • use conditional formatting
      • do not show only the metrics
        • give context 
      • show trends
        • via sparklines and similar visuals
    • use familiar visuals
      • avoid questionable visuals 
        • e.g. pie charts, gauges
    • avoid distortions
      • preserve proportions
        • scale accurately to reflect data values
        • avoid exaggerated visuals
          • don’t zoom in on axes to dramatize small differences
      • use consistent axes
        • compare data using the same scale and units across charts
        • don't use dual axes or shifting baselines that can mislead viewers
      • avoid manipulative scaling
        • use zero-baseline on bar charts 
        • use logarithmic scales sparingly
    • design for usability
      • intuitive interaction
      • at-a-glance perception
      • use contrast for clarity
      • use familiar patterns
        • use consistent formats the audience already knows
    • design with the audience in mind
      • analytical vs managerial perspectives (e.g. dashboards)
    • use different level of data aggregations
      •  in-depth data exploration 
    • encourage scrutiny
      • give users enough context to assess accuracy
        • provide raw values or links to the source
      • explain anomalies, outliers or notable trends
        • via annotations
    • group related items together
      • helps identify and focus on patterns and other relationships
    • diversify 
      • don't use only one chart type
      • pick the chart that reflects the best the data in the conrext considered
    • show variance 
      • absolute vs relative variance
      • compare data series
      • show contribution to variance
    • use familiar encodings
      • leverage (known) design patterns
    • use intuitive navigation
      • synchronize slicers
    • use tooltips
      • be concise
      • use hover effects
    • use information buttons
      • enhances user interaction and understanding 
        • by providing additional context, asking questions
    • use the full available surface
      • 1080x1920 works usually better 
    • keep standards in mind 
      • e.g. IBCS
  • state the assumptions
    • be explicit
      • clearly state each assumption 
        • instead of leaving it implied
    • contextualize assumptions
      • explain the assumption
        • use evidence, standard practices, or constraints
    • state scope and limitations
      • mention what the assumption includes and excludes
    • tie assumptions to goals & objectives
      • helps to clarify what underlying beliefs are shaping the analysis
      • helps identify whether the visualization achieves its intended purpose 
  • show the data
    • be honest (aka preserve integrity)
      • avoid distortion, bias, or trickery
    • support interpretation
      • provide labels, axes, legends
    • emphasize what's meaningful
      • patterns, trends, outliers, correlations, local/global maxima/minima
  • show what's important 
    • e.g. facts, relationships, flow, similarities, differences, outliers, unknown
    • prioritize and structure the content
      • e.g. show first an overview, what's important
    • make the invisible visible
      • think about what we do not see
    • know your (extended) users/audience
      • who'll use the content, at what level, for that
  • test for readability
    • get (early) feedback
      • have the content reviewed first
        • via peer review, dry run presentation
  • tell the story
    • know the audience and its needs
    • build momentum, expectation
    • don't leave the audience to figure it out
    • show the facts
    • build a narrative
      • show data that support it
      • arrange the visuals in a logical sequence
    • engage the reader
      • ask questions that bridge the gaps
        • e.g. in knowledge, in presentation's flow
      • show the unexpected
      • confirm logical deductions
Previous Post <<||>> Next Post

🤖〽️Prompt Engineering: Copilot Unabridged (Part 67: AI and the Illusion of Consciousness: Can Machines Truly Think?)

Prompt Engineering Series
Prompt Engineering Series

Prompt: "write a blogpost of 600 words on how AI is to mimic consciousness"

Introduction

Artificial Intelligence (AI) has made astonishing strides in recent years, prompting debates about its potential to mimic human consciousness. While AI can exhibit complex behavior, generate original content, and even hold conversations that feel eerily natural, does it possess consciousness? Or is it merely executing an advanced illusion of intelligence?

Consciousness - the awareness of self, emotions, and existence - is a distinctly human trait shaped by biological and psychological processes. AI, despite its advancements, does not experience thoughts, emotions, or awareness in the way humans do. Instead, it mimics consciousness by analyzing vast amounts of data and predicting patterns in human responses.

The Mechanics of AI Mimicry: Pattern Processing vs. Genuine Awareness

AI’s ability to simulate consciousness stems from deep learning, neural networks, and large-scale data processing. These technologies allow AI to recognize patterns, adjust responses, and make seemingly intelligent decisions.

For instance, language models can generate lifelike conversations by statistically predicting responses based on prior dialogues. AI-powered chatbots appear thoughtful, empathetic, and even humorous—but their responses stem from computational probabilities, not actual emotions or understanding.

Neural networks mimic the brain’s structure, but they do not replicate human thought. Unlike the human brain, which adapts dynamically through emotions, intuition, and social experiences, AI operates on mathematical functions and predefined algorithms.

The Question of Self-Awareness

Consciousness entails self-awareness - the ability to recognize oneself as a thinking entity. Humans experience emotions, form personal identities, and contemplate existence. AI, on the other hand, does not possess a self or subjective experience. It does not contemplate its own state or possess intrinsic motivation.

Even AI-driven personal assistants and conversational models - while capable of simulating engagement - do not understand their own responses. They recognize linguistic structures and user intent but remain fundamentally unaware of their own existence.

Creativity: A Byproduct of Data, Not Thought

Can AI be creative? While it generates music, poetry, and art, this creativity is driven by pattern recognition rather than intrinsic motivation or inspiration. AI analyzes existing works and predicts stylistic variations rather than originating genuine creative intent.

Human creativity is deeply tied to emotions, personal experiences, and imagination. AI lacks the ability to reflect, to feel inspiration, or to create from an emotional state. Instead, it mimics artistic processes through algorithmic optimization, appearing creative without experiencing the depth of human creativity.

Limitations of AI’s Illusion of Consciousness

Despite its ability to mimic intelligent behaviors, AI remains constrained by its inherent limitations. Without genuine self-awareness, emotions, or moral reasoning, it lacks the essence of true consciousness. AI does not ponder existence, reflect on personal growth, or experience subjective thoughts.

Additionally, AI depends entirely on data quality. If trained on biased or flawed datasets, its decisions and responses inherit those biases - unlike humans, who adapt through ethical, emotional, and experiential reasoning.

Conclusion

AI does not possess consciousness - it mimics it. Through advanced algorithms, neural networks, and data-driven models, AI simulates elements of awareness, creativity, and intelligence. Yet, it lacks self-reflection, emotions, and the personal identity that define true consciousness.

While AI continues to evolve, intelligence and consciousness are more than data processing. They involve intuition, existential awareness, and emotional depth - qualities that, for now, remain exclusive to living beings.

Disclaimer: The whole text was generated by Copilot at the first attempt. This is just an experiment to evaluate feature's ability to answer standard general questions, independently on whether they are correctly or incorrectly posed. Moreover, the answers may reflect hallucinations and other types of inconsistent or incorrect reasoning.

Previous Post <<||>> Next Post

29 July 2025

🤖〽️Prompt Engineering: Copilot Unabridged (Part 66: AI and the Illusion of Knowledge: Learning Without Understanding)

Prompt Engineering Series
Prompt Engineering Series

Prompt: "write a blogpost of 600 words on how AI is to mimic knowledge"

Introduction

Artificial Intelligence has transformed industries by providing rapid access to information and problem-solving capabilities that rival human expertise. But how does AI mimic knowledge? Does it truly know things, or is it merely processing data in complex ways?

Knowledge, in a human sense, is more than just recalling facts - it involves comprehension, reasoning, and contextual awareness. AI, however, does not understand information as humans do. Instead, it simulates knowledge through pattern recognition, data aggregation, and probabilistic predictions.

How AI Processes and Mimics Knowledge

At its core, AI operates through machine learning and natural language processing (NLP), analyzing vast amounts of text and extracting patterns that enable it to respond intelligently. When an AI model answers a question, it is not 'recalling' the way a human does. Instead, it generates the most statistically likely response based on trained data.

For example, AI-powered assistants can provide accurate medical insights, legal interpretations, and even academic analysis. However, they do not understand these topics - they predict and structure responses based on patterns found in the dataset they were trained on.

This mimicry enables AI to appear knowledgeable, but its responses lack subjective reflection or independent critical thinking.

Knowledge vs. Pattern Recognition

Human knowledge stems from experiences, emotional intelligence, and rational deduction. AI, on the other hand, depends on stored datasets and probabilistic modeling. It does not learn in the traditional human sense - it analyzes information but does not gain wisdom or insight from lived experience.

Consider search engines or AI-powered chatbots: They retrieve relevant information efficiently, yet they do not know the significance of that information. Unlike humans, who develop perspectives and interpretations over time, AI delivers responses mechanically, without personal reflection.

Can AI Be an Expert?

AI models can outperform humans in tasks like diagnosing diseases, optimizing logistics, or identifying trends in financial markets. They simulate expertise by processing vast amounts of information rapidly.

However, true expertise involves adapting to uncertainty, ethical considerations, and human intuition - elements that AI lacks. AI can predict outcomes, but it cannot independently determine why something matters beyond the scope of its training.

For example, an AI model trained on legal texts may provide detailed case law analysis. But it does not understand fairness, justice, or morality - it simply processes textual data.

Limitations of AI Knowledge Mimicry

Despite its ability to generate coherent responses, AI has limitations in mimicking deep knowledge:

  • Lack of Understanding: AI does not comprehend the meaning behind concepts; it merely predicts likely associations.
  • Bias in Training Data: AI relies on human-made datasets, which can contain errors, biases, or outdated information.
  • Inability to Innovate Independently: AI does not create new knowledge - it processes existing information and reorganizes it.
  • No Critical Thinking: AI does not challenge ideas, form personal opinions, or develop independent insights.

These limitations highlight the gap between true knowledge and AI-driven pattern recognition.

Conclusion

AI mimics knowledge but does not possess it. Through advanced algorithms and vast datasets, it appears informed, yet lacks genuine understanding, intuition, and the ability to critically analyze beyond its programming.

While AI can enhance research, automate decision-making, and assist professionals, human wisdom remains irreplaceable. Knowledge is more than data - it is comprehension, context, and deep intellectual reflection, qualities that AI has yet to replicate.

Disclaimer: The whole text was generated by Copilot (under Windows 10) at the first attempt. This is just an experiment to evaluate feature's ability to answer standard general questions, independently on whether they are correctly or incorrectly posed. Moreover, the answers may reflect hallucinations and other types of inconsistent or incorrect reasoning.

Previous Post <<||>> Next Post

28 July 2025

🤖〽️Prompt Engineering: Copilot Unabridged (Part 65: AI: A Reflection of Intelligence, Not a Replica)

Prompt Engineering Series
Prompt Engineering Series

Prompt: "write a blogpost of 600 words on how AI is to mimic intelligence"

Introduction

Artificial intelligence has long fascinated both technologists and philosophers alike. As machines evolve, the question arises: How does AI mimic human intelligence, and can it ever truly replicate the intricacies of human thought?

The reality is that AI does not think as humans do. Instead, it mimics intelligence through patterns, logic, and predictive algorithms that allow it to process information, respond dynamically, and even generate creativity - though within computational boundaries.

The Foundation of AI Mimicry: Learning from Data

AI functions by identifying patterns and learning from vast amounts of data - a process known as machine learning. Unlike humans, who build knowledge through experience, emotions, and reasoning, AI systems rely on structured inputs. Models such as neural networks attempt to simulate the way neurons interact in the human brain, but instead of cognition, they operate through mathematical functions.

For example, large language models (LLMs) predict what comes next in a sentence based on probabilities derived from billions of words. AI-generated art is created by analyzing artistic elements across different styles and assembling outputs that appear creative. These forms of intelligence mimic human processes rather than authentically experience them.

Reasoning vs. Pattern Recognition

Human intelligence thrives on reasoning - the ability to connect concepts, intuit emotions, and act based on context beyond raw data. AI, on the other hand, excels at pattern recognition.

Consider chatbots and virtual assistants. They may respond appropriately to questions by analyzing previous human interactions and predicting relevant replies. However, their understanding remains surface-level rather than intuitive. AI does not possess self-awareness, emotions, or independent thought; it follows structured logic rather than engaging in free-form introspection.

Creativity: Genuine or Simulated?

One of the most intriguing debates in AI is whether it can truly be creative. While AI can generate poetry, music, and art, it does so based on prior inputs and existing patterns. Human creativity is deeply tied to experience, emotion, and a sense of self, whereas AI creativity stems from mathematical optimization.

For example, an AI-powered writing assistant can produce eloquent text based on learned styles, but it does not possess the intrinsic motivations that drive human expression. It mimics artistry rather than experiencing the inspiration behind it.

Limitations of AI Intelligence

While AI has transformed industries - from healthcare diagnostics to autonomous driving - it remains bound by its limitations. Without emotions, intuition, or genuine comprehension, AI lacks the depth of human intelligence. It cannot independently redefine ideas, nor can it develop consciousness.

Additionally, AI depends on data quality; biases in datasets result in flawed decision-making. Human intelligence, by contrast, adapts through emotional and social learning, allowing for ethical reasoning and subjective reflection. This is why, despite AI’s advancements, human oversight remains crucial.

Conclusion

AI is an extraordinary achievement in technology, yet its intelligence is not a direct replica of human cognition. Rather, AI mimics intelligence by recognizing patterns, predicting outcomes, and responding dynamically - all without genuine understanding.

Its ability to learn and evolve is remarkable, but its limitations remind us that intelligence is more than processing data - it is about emotion, intuition, and consciousness, qualities that machines have yet to grasp.

Disclaimer: The whole text was generated by Copilot (under Windows 10) at the first attempt. This is just an experiment to evaluate feature's ability to answer standard general questions, independently on whether they are correctly or incorrectly posed. Moreover, the answers may reflect hallucinations and other types of inconsistent or incorrect reasoning.

Previous Post <<||>> Next Post

06 July 2025

🧭Business Intelligence: Perspectives (Part 32: Data Storytelling in Visualizations)

Business Intelligence Series
Business Intelligence Series

From data-related professionals to book authors on data visualization topics, there are many voices that require from any visualization to tell a story, respectively to conform to storytelling principles and best practices, and this independently of the environment or context in which the respective artifacts are considered. The need for data visualizations to tell a story may be entitled, though in business setups the data, its focus and context change continuously with the communication means, objectives, and, at least from this perspective, one can question storytelling’s hard requirement.

Data storytelling can be defined as "a structured approach for communicating data insights using narrative elements and explanatory visuals" [1]. Usually, this supposes the establishment of a context, respectively a fundament on which further facts, suppositions, findings, arguments, (conceptual) models, visualizations and other elements can be based upon. Stories help to focus the audience on the intended messages, they connect and eventually resonate with the audience, facilitate the retaining of information and understanding the chain of implications the decisions in scope have, respectively persuade and influence, when needed.

Conversely, besides the fact that it takes time and effort to prepare stories and the afferent content (presentations, manually created visualizations, documentation), expecting each meeting to be a storytelling session can rapidly become a nuisance for the auditorium as well for the presenters. Like in any value-generating process, one should ask where the value in storytelling is based on data visualizations and the effort involved, or whether the effort can be better invested in other areas.

In many scenarios, requesting from a dashboard to tell a story is an entitled requirement given that many dashboards look like a random combination of visuals and data whose relationship and meaning can be difficult to grasp and put into a plausible narrative, even if they are based on the same set of data. Data visualizations of any type should have an intentional well-structured design that facilitates visual elements’ navigation, understanding facts’ retention, respectively resonate with the auditorium.

It’s questionable whether such practices can be implemented in a consistent and meaningful manner, especially when rich navigation features across multiple visuals are available for users to look at data from different perspectives. In such scenarios the identification of cases that require attention and the associations existing between well-established factors help in the discovery process.

Often, it feels like visuals were arranged aleatorily in the page or that there’s no apparent connection between them, which makes the navigation and understanding more challenging. For depicting a story, there must be a logical sequencing of the various visualizations displayed in the dashboards or reports, especially when visuals’ arrangement doesn’t reflect the typical navigation of the visuals or when the facts need a certain sequencing that facilitates understanding. Moreover, the sequencing doesn’t need to be linear but have a clear start and end that encompasses everything in between.

Storytelling works well in setups in which something is presented as the basis for one-time or limited in scope sessions like decision-making, fact-checking, awareness raising and other types of similar communication. However, when building solutions for business monitoring and data exploration, there can be multiple stories or no story worth telling, at least not for the predefined scope. Even if one can zoom in or out, respectively rearrange the visuals and add others to highlight the stories encompassed, the value added by taking the information out of the dashboards and performing such actions can be often neglected to the degree that it doesn’t pay off. A certain consistency, discipline and acumen is needed then for focusing on the important aspects and ignoring thus the nonessential. 

Previous Post <<||>> Next Post

References:
[1] Brent Dykes, "Effective Data Storytelling: How to Drive Change with Data, Narrative and Visuals", 2019 [quotes]

19 June 2025

🤖〽️Prompt Engineering: Copilot Unabridged (Part 26: How Humans Can Outsmart AI: The Power of Human Ingenuity)

Prompt Engineering Series

Prompt: "write a blogpost of 600 words on how human can outsmart AI"

Introduction

Artificial Intelligence (AI) has made incredible advancements, surpassing human capabilities in data processing, automation, and strategic decision-making. However, despite its speed and efficiency, AI still lacks intuition, creativity, and emotional intelligence - qualities that give humans a unique advantage. While AI can outperform humans in structured environments, there are several ways in which human ingenuity can outsmart AI.

1. Leveraging Creativity and Abstract Thinking

AI excels at pattern recognition and logical reasoning, but it struggles with creativity and abstract thought. Humans can:

  • Think outside the box, generating innovative solutions AI cannot predict.
  • Create art, music, and literature that reflect emotions and cultural depth.
  • Solve problems intuitively, without relying solely on data-driven patterns.

While AI can generate content, it lacks the ability to truly understand human creativity, making human ingenuity a powerful advantage.

2. Using Emotional Intelligence and Social Skills

AI lacks empathy, intuition, and emotional intelligence, which are essential for human relationships, leadership, and negotiation. Humans can:

  • Read emotions and body language, adapting communication accordingly.
  • Build trust and rapport, essential for teamwork and collaboration.
  • Make ethical decisions, considering moral implications beyond logic.

AI may analyze sentiment in text, but it cannot genuinely understand human emotions, giving humans an edge in social interactions and leadership.

3. Adapting to Unpredictable Situations

AI relies on structured data and predefined algorithms, but humans excel in adapting to uncertainty. Humans can:

  • Make quick decisions in unpredictable environments, such as crisis management.
  • Learn from minimal examples, while AI requires vast datasets.
  • Navigate complex social dynamics, where AI struggles with nuance.

AI performs well in controlled settings, but humans thrive in real-world unpredictability, making adaptability a key advantage.

4. Outsmarting AI in Ethical and Moral Reasoning

AI lacks moral judgment and ethical reasoning, making humans essential for guiding AI development responsibly. Humans can:

  • Recognize ethical dilemmas that AI may overlook.
  • Ensure fairness and inclusivity in AI-driven decisions.
  • Prevent AI from reinforcing biases, ensuring responsible AI use.

AI may optimize efficiency, but humans prioritize ethical considerations, ensuring technology aligns with societal values.

5. Controlling AI’s Development and Purpose

Ultimately, humans design, regulate, and oversee AI, ensuring it serves human interests rather than operating autonomously. Humans can:

  • Set boundaries for AI, preventing unintended consequences.
  • Regulate AI applications, ensuring responsible use.
  • Guide AI’s evolution, ensuring it complements human intelligence rather than replacing it.

While AI is powerful, humans remain in control, shaping its development to enhance society rather than disrupt it.

Conclusion: Human Ingenuity Will Always Matter

AI may outperform humans in speed, efficiency, and automation, but it cannot replicate creativity, emotional intelligence, adaptability, ethical reasoning, or human oversight. By leveraging these strengths, humans can outsmart AI, ensuring technology remains a tool for progress rather than an unchecked force.

Disclaimer: The whole text was generated by Copilot at the first attempt. This is just an experiment to evaluate feature's ability to answer standard general questions, independently on whether they are correctly or incorrectly posed. Moreover, the answers may reflect hallucinations and other types of inconsistent or incorrect reasoning.

Previous Post <<||>> Next Post

18 April 2025

🧮ERP: Implementations (Part XV: An Ecosystemic View)

ERP Implementation Series
ERP Implementations Series

In organizations, the network of interconnected systems, technologies, applications, users and all what resides around them, physically or logically, through their direct or indirect interactions form an ecosystem governed by common goals, principles, values, policies, processes and resources. An ERP system becomes thus part of the ecosystem and, given its importance, probably becomes the centerpiece of the respective structure, having the force to shift towards it all the flows existing in organizations, respectively nourishing its satellite systems, enabling them to create more value.

Ecology’s components should allow for growth, regeneration, adaptation, flow, interaction and diversity, respectively creating value for the organizations. The more the flow of information, data, products, and money is unrestricted and sustained by the overall architecture, the more beneficial they become. At least, this is the theoretical basis on which such architectures are built upon. Of course, all this doesn’t happen by itself, but must be sustained and nourished adequately!

The mixture of components can enable system interoperability, scalability, reliability, automation, extensibility or maintainability with impact on collaboration, innovation, sustainability, cost-efficiency and effectiveness. All of these are concepts with weight and further implications for ecology. Some aspects are available by design, while others need to be introduced and enforced in a systematic, respectively systemic manner. It’s an exploration of what’s achievable, of what works, of what creates value. Organizations should review the best practices in the field and explore.

Ideally, the ERP system should nourish the ecology it belongs to, enabling it to grow and create more value for the organization. Conversely, the ERP system can become a cannibal for the ecology, pulling towards it all the resources, but more importantly, limiting the other systems in terms of functionality, value, usage, etc. In other terms, an ERP system has the power to cannibalize its environment, with all the implications deriving from this. This may seem like a pessimistic view, though it’s met in organizations more often than people think. Just think about the rate of failure in ERP implementations, the ERP cannibalizing thus all the financial and human resources available at their disposal.

To create more value, the ERPs should be integrated with the other critical and non-critical information systems – email and other communication systems, customer and vendor relationship systems, inventory management systems, regulatory reporting platforms and probably there are many other systems designed to address the various needs. Of course, not every organization needs all these types of information systems, though the functions exist to some degree in many organizations. Usually, components grow in importance with the shift of needs and attention.

There are also small organizations that only use small subparts of the ERP system (e.g. finance, HR), however an ERP investment becomes more attractive and cost-effective, the further the organization can leverage all its (important) features. It’s also true that an ERP system is not always the solution for everybody. It’s a matter of scale, functionality, and business model. Above all, the final solution must be cost-effective and an enabler for the organization!

Beyond the ecosystemic view, there are design principles and rules that can be used to describe, map and plan the road ahead, though nothing should be considered as fixed. One needs to balance between opportunities and risks, value and costs, respectively between the different priorities. In some scenarios there’s a place for experimentation, while in other scenarios organizations must stick to what they know. Even if there are maybe recipes for success, there’s no guarantee behind them. Each organization must evaluate what works and what doesn’t. It’s a never-ending story in which such topics need to be approached gradually and iteratively.

Previous Post <<||>> Next Post

04 February 2025

🧭Business Intelligence: Perspectives (Part 26: Monitoring - A Cockpit View)

Business Intelligence Series
Business Intelligence Series

The monitoring of business imperatives is sometimes compared metaphorically with piloting an airplane, where pilots look at the cockpit instruments to verify whether everything is under control and the flight ensues according to the expectations. The use of a cockpit is supported by the fact that an airplane is an almost "closed" system in which the components were developed under strict requirements and tested thoroughly under specific technical conditions. Many instruments were engineered and evolved over decades to operate as such. The processes are standardized, inputs and outputs are under strict control, otherwise the whole edifice would crumble under its own complexity. 

In organizational setups, a similar approach is attempted for monitoring the most important aspects of a business. A few dashboards and reports are thus built to monitor and control what’s happening in the areas which were identified as critical for the organization. The various gauges and other visuals were designed to provide similar perspectives as the ones provided by an airplane’s cockpit. At first sight the cockpit metaphor makes sense, though at careful analysis, there are major differences. 

Probably, the main difference is that businesses don’t necessarily have standardized processes that were brought under control (and thus have variation). Secondly, the data used doesn’t necessarily have the needed quality and occasionally isn’t fit for use in the business processes, including supporting processes like reporting or decision making. Thirdly, are high the chances that the monitoring within the BI infrastructures doesn’t address the critical aspects of the business, at least not at the needed level of focus, detail or frequency. The interplay between these three main aspects can lead to complex issues and a muddy ground for a business to build a stable edifice upon. 

The comparison with airplanes’ cockpit was chosen because the number of instruments available for monitoring is somewhat comparable with the number of visuals existing in an organization. In contrast, autos have a smaller number of controls simple enough to help the one(s) sitting in the cockpit. A car’s monitoring capabilities can probably reflect the needs of single departments or teams, though each unit needs its own gauges with specific business focus. The parallel is however limited because the areas of focus in organizations can change and shift in other directions, some topics may have a periodic character while others can regain momentum after a long time. 

There are further important aspects. At high level, the expectation is for software products and processes, including the ones related to BI topics, to have the same stability and quality as the mass production of automobiles, airplanes or other artifacts that have similar complexity and manufacturing characteristics. Even if the design process of software and manufacturing may share many characteristics, the similar aspects diverge as soon as the production processes start, respectively progress, and these are the areas where the most differences lie. Starting from the requirements and ending with the overall goals, everything resembles the characteristics of quick shifting sands on which is challenging to build any stabile edifice.

At micro level in manufacturing each piece was carefully designed and produced according to a set of characteristics that were proved to work. Everything must fit perfectly in the grand design and there are many tests and steps to make sure that happens. To some degree the same is attempted when building software products, though the processes break along the way with the many changes attempted, with the many cost, time and quality constraints. At some point the overall complexity kicks back; it might be still manageable though the overall effort is higher than what organizations bargained for. 

26 January 2025

🧭Business Intelligence: Perspectives (Part 25: Grounding the Roots)

Business Intelligence Series
Business Intelligence Series

When building something that is supposed to last, one needs a solid foundation on which the artifact can be built upon. That’s valid for castles, houses, IT architectures, and probably most important, for BI infrastructures. There are so many tools out there that allow building a dashboard, report or other types of BI artifacts with a few drag-and-drops, moving things around, adding formatting and shiny things. In many cases all these steps are followed to create a prototype for a set of ideas or more formalized requirements keeping the overall process to a minimum. 

Rapid prototyping, the process of building a proof-of-concept by focusing at high level on the most important design and functional aspects, is helpful and sometimes a mandatory step in eliciting and addressing the requirements properly. It provides a fast road from an idea to the actual concept, however the prototype, still in its early stages, can rapidly become the actual solution that unfortunately continues to haunt the dreams of its creator(s). 

Especially in the BI area, there are many solutions that started as a prototype and gained mass until they start to disturb many things around them with implications for security, performance, data quality, and many other aspects. Moreover, the mass becomes in time critical, to the degree that it pulled more attention and effort than intended, with positive and negative impact altogether. It’s like building an artificial sun that suddenly becomes a danger for the nearby planet(s) and other celestial bodies. 

When building such artifacts, it’s important to define what goals the end-result must or would be nice to have, differentiating clearly between them, respectively when is the time to stop and properly address the aspects mandatory in transitioning from the prototype to an actual solution that addresses the best practices in scope. It’s also the point when one should decide upon solution’s feasibility, needed quality acceptance criteria, and broader aspects like supporting processes, human resources, data, and the various aspects that have impact. Unfortunately, many solutions gain inertia without the proper foundation and in extremis succumb under the various forces.

Developing software artifacts of any type is a balancing act between all these aspects, often under suboptimal circumstances. Therefore, one must be able to set priorities right, react and change direction (and gear) according to the changing context. Many wish all this to be a straight sequential road, when in reality it looks more like mountain climbing, with many peaks, valleys and change of scenery. The more exploration is needed, the slower the progress.

All these aspects require additional time, effort, resources and planning, which can easily increase the overall complexity of projects to the degree that it leads to (exponential) effort and more important - waste. Moreover, the complexity pushes back, leading to more effort, and with it to higher costs. On top of this one has the iteration character of BI topics, multiple iterations being needed from the initial concept to the final solution(s), sometimes many steps being discarded in the process, corners are cut, with all the further implications following from this. 

Somewhere in the middle, between minimum and the broad overextending complexity, is the sweet spot that drives the most impact with a minimum of effort. For some organizations, respectively professionals, reaching and remaining in the zone will be quite a challenge, though that’s not impossible. It’s important to be aware of all the aspects that drive and sustain the quality of artefacts, data and processes. There’s a lot to learn from successful as well from failed endeavors, and the various aspects should be reflected in the lessons learned. 

11 October 2024

🧭Business Intelligence: Perspectives (Part 17: Creating Value for Organizations)

Business Intelligence Series
Business Intelligence Series

How does one create value for an organization in BI area? This should be one of the questions the BI professional should ask himself and eventually his/her colleagues on a periodic basis because the mere act of providing reports and good-looking visualizations doesn’t provide value per se. Therefore, it’s important to identify the critical to success and value drivers within each area!

One can start with the data, BI or IT strategies, when organizations invest the time in their direction, respectively with the considered KPIs and/or OKRs defined, and hopefully the organizations already have something similar in place! However, these are just topics that can be used to get a bird view over the overall landscape and challenges. It’s advisable to dig deeper, especially when the strategic, tactical and operational plans aren’t in sync, and let’s be realistic, this happens probably in many organizations, more often than one wants to admit!

Ideally, the BI professional should be able to talk with the colleagues who could benefit from having a set of reports or dashboards that offer a deeper perspective into their challenges. Talking with each of them can be time consuming and not necessarily value driven. However, giving each team or department the chance to speak their mind, and brainstorm what can be done, could in theory bring more value. Even if their issues and challenges should be reflected in the strategy, there’s always an important gap between the actual business needs and those reflected in formal documents, especially when the latter are not revised periodically. Ideally, such issues should be tracked back to a business goal, though it’s questionable how much such an alignment is possible in practice. Exceptions will always exist, no matter how well structured and thought a strategy is!

Unfortunately, this approach also involves some risks. Despite their local importance, the topics raised might not be aligned with what the organization wants, and there can be a strong case against and even a set of negative aspects related to this. However, talking about the costs involved by losing an opportunity can hopefully change the balance favorably. In general, transposing the perspective of issues into the area of their associated cost for the organization has (hopefully) the power to change people’s minds.

Organizations tend to bring forward the major issues, addressing the minor ones only after that, this having the effect that occasionally some of the small issues increase in impact when not addressed. It makes sense to prioritize with the risks, costs and quick wins in mind while looking at the broader perspective! Quick wins are usually addressed at strategic level, but apparently seldom at tactical and operational level, and at these levels one can create the most important impact, paving the way for other strategic measures and activities.

The question from the title is not limited only to BI professionals - it should be in each manager and every employee’s mind. The user is the closest to the problems and opportunities, while the manager is the one who has a broader view and the authority to push the topic up the waiting list. Unfortunately, the waiting lists in some organizations are quite big, while not having a good set of requests on the list might pinpoint that issues might exist in other areas!  

BI professionals and organizations probably know the theory well but prove to have difficulties in combining it with praxis. It’s challenging to obtain the needed impact (eventually the maximum effect) with a minimum of effort while addressing the different topics. Sooner or later the complexity of the topic kicks in, messing things around!

11 September 2024

🗄️Data Management: Data Culture (Part IV: Quo vadis? [Where are you going?])

Data Management Series

The people working for many years in the fields of BI/Data Analytics, Data and Process Management probably met many reactions that at the first sight seem funny, though they reflect bigger issues existing in organizations: people don’t always understand the data they work with, how data are brought together as part of the processes they support, respectively how data can be used to manage and optimize the respective processes. Moreover, occasionally people torture the data until it confesses something that doesn’t necessarily reflect the reality. It’s even more deplorable when the conclusions are used for decision-making, managing or optimizing the process. In extremis, the result is an iterative process that creates more and bigger issues than whose it was supposed to solve!

Behind each blunder there are probably bigger understanding issues that need to be addressed. Many of the issues revolve around understanding how data are created, how are brought together, how the processes work and what data they need, use and generate. Moreover, few business and IT people look at the full lifecycle of data and try to optimize it, or they optimize it in the wrong direction. Data Management is supposed to help, and it does this occasionally, though a methodology, its processes and practices are as good as people’s understanding about data and its use! No matter how good a data methodology is, it’s as weak as the weakest link in its use, and typically the issues revolving around data and data understanding are the weakest link. 

Besides technical people, few businesspeople understand the full extent of managing data and its lifecycle. Unfortunately, even if some of the topics are treated in the books, they are too dry, need hands on experience and some thought in corroborating practices with theories. Without this, people will do things mechanically, processes being as good as the people using them, their value becoming suboptimal and hinder the business. That’s why training on Data Management is not enough without some hands-on experience!

The most important impact is however in BI/Data Analytics areas - how the various artifacts are created and used as support in decision-making, process optimization and other activities rooted in data. Ideally, some KPIs and other metrics should be enough for managing and directing a business, however just basing the decisions on a set of KPIs without understanding the bigger picture, without having a feeling of the data and their quality, the whole architecture, no matter how splendid, can breakdown as sandcastle on a shore meeting the first powerful wave!

Sometimes it feels like organizations do things from inertia, driven by the forces of the moment, initiatives and business issues for which temporary and later permanent solutions are needed. The best chance for solving many of the issues would have been a long time ago, when the issues were still small to create any powerful waves within the organizations. Therefore, a lot of effort is sometimes spent in solving the consequences of decisions not made at the right time, and that can be painful and costly!

For building a good business one needs also a solid foundation. In the past it was enough to have a good set of products that are profitable. However, during the past decade(s) the rules of the game changed driven by the acerb competition across geographies, inefficiencies, especially in the data and process areas, costing organizations on the short and long term. Data Management in general and Data Quality in particular, even if they’re challenging to quantify, have the power to address by design many of the issues existing in organizations, if given the right chance!

Previous Post <<||>> Next Post

21 August 2024

🧭Business Intelligence: Perspectives (Part 14: From Data to Storytelling II)

Business Intelligence Series

Being snapshots in people and organizations’ lives, data arrive to tell a story, even if the story might not be worth telling or might be important only in certain contexts. In fact each record in a dataset has the potential of bringing a story to life, though business people are more interested in the hidden patterns and “stories” the data reveal through more or less complex techniques. Therefore, data are usually tortured until they confess something, and unfortunately people stop analyzing the data with the first confession(s). 

Even if it looks like torture, data need to be processed to reveal certain characteristics, trends or patterns that could help us in sense-making, decision-making or similar specific business purposes. Unfortunately, the volume of data increases with an incredible velocity to which further characteristics like variety, veracity, volume, velocity, value, veracity and variability may add up. 

The data in a dashboard, presentation or even a report should ideally tell a story otherwise the data might not be worthy looking at, at least from some people’s perspective. Probably, that’s one of the reason why man dashboards remain unused shortly after they were made available, even if considerable time and money were invested in them. Seeing the same dull numbers gives the illusion that nothing changed, that nothing is worth reviewing, revealing or considering, which might be occasionally true, though one can’t take this as a rule! Lot of important facts could remain hidden or not considered. 

One can suppose that there are businesses in which something important seldom happens and an alert can do a better job than reviewing a dashboard or a report frequently. Probably an alert is a better choice than reporting metrics nobody looks at! 

Organizations usually define a set of KPIs (key performance indicators) and other types of metrics they (intend to) review periodically. Ideally, the numbers collected should define and reflect the critical points (aka pain points) of an organization, if they can be known in advance. Unfortunately, in dynamic businesses the focus can change considerably from one day to another. Moreover, in systemic contexts critical points can remain undiscovered in time if the set of metrics defined doesn’t consider them adequately. 

Typically only one’s experience and current or past issues can tell what one should consider or ignore, which are the critical/pain points or important areas that must be monitored. Ideally, one should implement alerts for the critical points that require a immediate response and use KPIs for the recurring topics (though the two approaches may overlap). 

Following the flow of goods, money and other resources one can look at the processes and identify the areas that must be monitored, prioritize them and identify the metrics that are worth tracking, respectively that reflect strengths, weaknesses, opportunities, threats and the risks associated with them. 

One can start with what changed by how much, what caused the change(s) and what further impact is expected directly or indirectly, by what magnitude, respectively why nothing changed in the considered time unit. Causality diagrams can help in the process even if the representations can become quite complex. 

The deeper one dives and the more questions one attempts to answer, the higher the chances to find a story. However, can we find a story that’s worth telling in any set of data? At least this is the point some adepts of storytelling try to make. Conversely, the data can be dull, especially when one doesn’t track or consider the right data. There are many aspects of a business that may look boring, and many metrics seem to track the boring but probably important aspects. 

18 August 2024

🧭Business Intelligence: Mea Culpa (Part III: Problem Solving)

Business Intelligence Series
Business Intelligence Series

I've been working for more than 20 years in BI and Data Analytics area, in combination with Software Engineering, ERP implementations, Project Management, IT services and several other areas, which allowed me to look at many recurring problems from different perspectives. One of the things I learnt is that problems are more complex and more dynamic than they seem, respectively that they may require tailored dynamic solutions. Unfortunately, people usually focus on one or two immediate perspectives, ignoring the dynamics and the multilayered character of the problems!

Sometimes, a quick fix and limited perspective is what we need to get started and fix the symptoms, and problem-solvers usually stop there. When left unsupervised, the problems tend to kick back, build up momentum and appear under more complex forms in various places. Moreover, the symptoms can remain hidden until is too late. To this also adds the political agendas and the further limitations existing in organizations (people, money, know-how, etc.).

It seems much easier to involve external people (individual experts, consultancy companies) to solve the problem(s), though unless they get a deep understanding of the business and the issues existing in it, the chances are high that they solve the wrong problems and/or implement the wrong solutions. Therefore, it's more advisable to have internal experts, when feasible, and that's the point where business people with technical expertise and/or IT people with business expertise can help. Ideally, one should have a good mix and the so called competency centers can do a great job in handling the challenges of organizations. 

Between business and IT people there's a gap that can be higher or lower depending on resources know-how or the effort made by organizations to reduce it. To this adds the nature of the issues existing in organizations, which can vary considerable across departments, organizations or any other form of establishment. Conversely, the specific skillset can be transmuted where needed, which might happen naturally, though upon case also considerable effort needs to be involved in the process.

Being involved in similar tasks, one may get the impression that one can do whatever the others can do. This can happen in IT as well on the business side. There can be activities that can be done by parties from the other group, though there are also many exceptions in both directions, especially when one considers that one can’t generalize the applicability and/or transmutation of skillset. 

A more concrete example is the know-how needed by a businessperson to use the BI infrastructure for answering business questions, and ideally for doing all or at least most of the activities a BI professional can do. Ideally, as part of the learning path, it would be helpful to have a pursuable path in between the two points. The mastery of tools helps in the process though there are different mindsets involved.

Unfortunately, the data-related fields are full of overconfident people who get the problem-solving process wrong. Data-based problem-solving resumes in gathering the right facts and data, building the right conceptual model, identifying the right questions to ask, collecting more data, refining methods and solutions, etc. There’s aways an easy wrong way to solve a problem!

The mastery of tools doesn’t imply the mastery of business domains! What people from the business side can bring is deeper insight in the business problems, though getting from there to implementing solutions can prove a long way, especially when problems require different approaches, different levels of approximations, etc. No tool alone can bridge such gaps yet! Frankly, this is the most difficult to learn and unfortunately many data professionals seem to get this wrong!

Previous Post <<||>> Next Post

16 August 2024

🧭Business Intelligence: Perspectives (Part 13: From Data to Storytelling I)

Business Intelligence Series
Business Intelligence Series

Data is an amalgam of signs, words, numbers and other visual or auditory elements used together to memorize, interpret, communicate and do whatever operation may seem appropriate with them. However, the data we use is usually part of one or multiple stories - how something came into being, what it represents, how is used in the various mental and non-mental processes - respectively, the facts, concepts, ideas, contexts places or other physical and nonphysical elements that are brought in connection with.

When we are the active creators of a story, we can in theory easily look at how the story came into being, the data used and its role in the bigger picture, respective the transformative elements considered or left out, etc. However, as soon we deal with a set of data, facts, or any other elements of a story we are not familiar with, we need to extrapolate the hypothetical elements that seem to be connected to the story. We need to make sense of these elements and consider all that seems meaningful, what we considered or left out shaping the story differently. 

As children and maybe even later, all of us dealt with stories in one way or another, we all got fascinated by metaphors' wisdom and felt the energy that kept us awake, focused and even transformed by the words coming from narrator's voice, probably without thinking too much at the whole picture, but letting the words do their magic. Growing up, the stories grew in complexity, probably became richer in meaning and contexts, as we were able to decipher the metaphors and other elements, as we included more knowledge about the world around, about stories and storytelling.

In the professional context, storytelling became associated with our profession - data, information, knowledge and wisdom being created, assimilated and exchanged in more complex processes. From, this perspective, data storytelling is about putting data into a (business) context to seed cultural ground, to promote decision making and better understanding by building a narrative around the data, problems, challenges, opportunities, and further organizational context.

Further on, from a BI's perspective, all these cognitive processes impact on how data, information and knowledge are created, (pre)processed, used and communicated in organizations especially when considering data visualizations and their constituent elements (e.g. data, text, labels, metaphors, visual cues), the narratives that seem compelling and resonate with the auditorium. 

There's no wonder that data storytelling has become something not to neglect in many business contexts. Storytelling has proved that words, images and metaphors can transmit ideas and knowledge, be transformative, make people think, or even act without much thinking. Stories have the power to seed memes, ideas, or more complex constructs into our minds, they can be used (for noble purposes) or misused. 

A story's author usually takes compelling images, metaphors, and further elements, manipulates them to the degree they become interesting to himself/herself, to the auditorium, to the degree they are transformative and become an element of the business vocabulary, respectively culture, without the need to reiterate them when needed to bring more complex concepts, ideas or metaphors into being.  

A story can be seen as a replication of the constituting elements, while storytelling is a set of functions that operate on them and change the initial structure and content into something that might look or not like the initial story. Through retelling and reprocessing in any form, the story changes independently of its initial form and content. Sometimes, the auditorium makes connections not recognized or intended by the storyteller. Other times, the use and manipulation of language makes the story change as seems fit. 

Previous Post <<||>> Next Post

06 April 2024

🧭Business Intelligence: Why Data Projects Fail to Deliver Real-Life Impact (Part I: First Thoughts)

Business Intelligence
Business Intelligence Series

A data project has a set of assumptions and requirements that must be met, otherwise the project has a high chance of failing. It starts with a clear idea of the goals and objectives, and they need to be achievable and feasible, with the involvement of key stakeholders and the executive without which it’s impossible to change the organization’s data culture. Ideally, there should also be a business strategy, respectively a data strategy available to understand the driving forces and the broader requirements. 

An organization’s readiness is important not only in what concerns the data but also the things revolving around the data - processes, systems, decision-making, requirements management, project management, etc. One of the challenges is that the systems and processes available can’t be used as they are for answering important business questions, and many of such questions are quite basic, though unavailability or poor quality of data makes this challenging if not impossible. 

Thus, when starting a data project an organization must be ready to change some of its processes to address a project’s needs, and thus the project can become more expensive as changes need to be made to the systems. For many organizations the best time to have done this was when they implemented the system, respectively the integration(s) between systems. Any changes made after that come in theory with higher costs derived from systems and processes’ redesign.

Many projects start big and data projects are no exception to this. Some of them build a costly infrastructure without first analyzing the feasibility of the investment, or at least whether the data can form a basis for answering the targeted questions. On one side one can torture any dataset and some knowledge will be obtained from it (aka data will confess), though few datasets can produce valuable insights, and this is where probably many data projects oversell their potential. Conversely, some initiatives are worth pursuing even only for the sake of the exposure and experience the employees get. However, trying to build something big only through the perspective of one project can easily become a disaster. 

When building a data infrastructure, the project needs to be an initiative given the transformative potential such an endeavor can have for the organization, and the different aspects must be managed accordingly. It starts with the management of stakeholders’ expectations, with building a data strategy, respectively with addressing the opportunities and risks associated with the broader context.

Organizations recognize that they aren’t capable of planning and executing such a project or initiative, and they search for a partner to lead the way. Becoming overnight such a partner is more than a challenge as a good understanding of the industry and the business is needed. Some service providers have such knowledge, at least in theory, though the leap from knowledge to results can prove to be a challenge even for experienced service providers. 

Many projects follow the pattern: the service provider comes, analyzes the requirements, builds something wonderful, the solution is used for some time and then the business realizes that the result is not what was intended. The causes are multiple and usually form a complex network of causality, though probably the most important aspect is that customers don’t have the in-house technical resources to evaluate the feasibility of requirements, solutions, respectively of the results. Even if organizations involve the best key users, are needed also good data professionals or similar resources who can become the bond between the business and the services provider. Without such an intermediary the disconnect between the business and the service provider can grow with all the implications. 

Previous Post <<||>> Next Post

16 March 2024

🧭Business Intelligence: A Software Engineer's Perspective (Part VII: Think for Yourself!)

Business Intelligence
Business Intelligence Series

After almost a quarter-century of professional experience the best advice I could give to younger professionals is to "gather information and think for themselves", and with this the reader can close the page and move forward! Anyway, everybody seems to be looking for sudden enlightenment with minimal effort, as if the effort has no meaning in the process!

In whatever endeavor you are caught, it makes sense to do upfront a bit of thinking for yourself - what's the task, or more general the problem, which are the main aspects and interpretations, which are the goals, respectively the objectives, how a solution might look like, respectively how can it be solved, how long it could take, etc. This exercise is important for familiarizing yourself with the problem and creating a skeleton on which you can build further. It can be just vague ideas or something more complex, though no matter the overall depth is important to do some thinking for yourself!

Then, you should do some research to identify how others approached and maybe solved the problem, what were the justifications, assumptions, heuristics, strategies, and other tools used in sense-making and problem solving. When doing research, one should not stop with the first answer and go with it. It makes sense to allocate a fair amount of time for information gathering, structuring the findings in a reusable way (e.g. tables, mind maps or other tools used for knowledge mapping), and looking at the problem from the multiple perspectives derived from them. It's important to gather several perspectives, otherwise the decisions have a high chance of being biased. Just because others preferred a certain approach, it doesn't mean one should follow it, at least not blindly!

The purpose of research is multifold. First, one should try not to reinvent the wheel. I know, it can be fun, and a lot can be learned in the process, though when time is an important commodity, it's important to be pragmatic! Secondly, new information can provide new perspectives - one can learn a lot from other people’s thinking. The pragmatism of problem solvers should be combined, when possible, with the idealism of theories. Thus, one can make connections between ideas that aren't connected at first sight.

Once a good share of facts was gathered, you can review the new information in respect to the previous ones and devise from there several approaches worthy of attack. Once the facts are reviewed, there are probably strong arguments made by others to follow one approach over the others. However, one can show that has reached a maturity when is able to evaluate the information and take a decision based on the respective information, even if the decision is not by far perfect.

One should try to develop a feeling for decision making, even if this seems to be more of a gut-feeling and stressful at times. When possible, one should attempt to collect and/or use data, though collecting data is often a luxury that tends to postpone the decision making, respectively be misused by people just to confirm their biases. Conversely, if there's any important benefit associated with it, one can collect data to validate in time one's decision, though that's a more of a scientist’s approach.

I know that's easier to go with the general opinion and do what others advise, especially when some ideas are popular and/or come from experts, though then would mean to also follow others' mistakes and biases. Occasionally, that can be acceptable, especially when the impact is neglectable, however each decision we are confronted with is an opportunity to learn something, to make a difference! 

Previous Post <<||>> Next Post

14 February 2024

🧭Business Intelligence: A One-Man Show (Part VI: The Lakehouse Perspective)

Business Intelligence Suite
Business Intelligence Suite

Continuing the ideas on Christopher Laubenthal's article "Why one person can't do everything in the data space" [1] and why his analogy between a college's functional structure and the core data roles is poorly chosen. In the last post I mentioned as a first argument that the two constructions have different foundations.

Secondly, it's a matter of construction, namely the steps used to arrive from one state to another. Indeed, there's somebody who builds the data warehouse (DWH), somebody who builds the ETL/ELT pipelines for moving the data from the sources to the DWH, somebody who builds the sematic data model that includes business related logic, respectively people who tap into the data for reporting, data visualizations, data science projects, and whatever is still needed in the organization. On top of this, there should be somebody who manages the DWH. I haven't associated any role to them because one of the core roles can be responsible for more than one step. 

In the case of a lakehouse, it is the data engineer who moves the data from the various data sources to the data lake if that doesn't happen already by design or configuration. As per my understanding the data engineers are the ones who design and build the new lakehouse, move transform and manage the data as required. The Data Analysts, Data Scientist and maybe some Information Designers can tap then into the data. However, the DWH and the lakehouse(s) are technologies that facilitate their work. They can still do their work also if the same data are available by other means.

In what concerns the dorm analogy, the verbs were chosen to match the way data warehouses (DWH) or lakehouses are built, though the congruence of the steps is questionable. One could have compared the number of students with the numbers of data entities, but not with the data themselves. Usually, students move by themselves and occupy the places. The story tellers, the assistants and researchers are independent on whether the students are hosted in the dorm or not. Therefore, the analogy seems to be a bit forced. 

Frankly, I covered all the steps except the ones related to Data Science by myself for both described scenarios. It helped that I knew the data from the data sources and the transformations rules I had to apply, respectively the techniques needed for moving and transforming the data, and the volume of data entities was manageable somehow. Conversely, 1-2 more resources in the area of data analysis and visualizations could have helped to bring more value to the business. 

This opens the challenge of scale and it has do to with systems engineering and how the number of components and the interactions between them increase systems' complexity and the demand for managing the respective components. In the simplest linear models, for each multiplier of a certain number of components of the same type from the organization, the number of resources managing the respective layer matches to some degree the multiplier. E.g. if a data engineer can handle x data entities in a unit of time, then for hand n*x components are more likely at least n data engineers required. However, the output of n components is only a fraction of the n*x given the dependencies existing between components and other constraints.

An optimization problem resumes in finding out what data roles to chose to cover an organization's needs. A one man show can be the best solution for small organizations, though unless there's a good division of labor, bringing a second person will make the throughput slower until will become faster.

Previous Post <<|||>> Next Post

Resources:
[1] Christopher Laubenthal (2024) "Why One Person Can’t Do Everything In Data" (link)

Related Posts Plugin for WordPress, Blogger...

About Me

My photo
Koeln, NRW, Germany
IT Professional with more than 25 years experience in IT in the area of full life-cycle of Web/Desktop/Database Applications Development, Software Engineering, Consultancy, Data Management, Data Quality, Data Migrations, Reporting, ERP implementations & support, Team/Project/IT Management, etc.