Showing posts with label Business intelligence. Show all posts
Showing posts with label Business intelligence. Show all posts

09 August 2025

🧭Business Intelligence: Perspectives (Part 33: Data Lifecycle for Analytics)

Business Intelligence Series
Business Intelligence Series

In the context of BI, Analytics and other data-related topics, the various parties usually talk about data ingestion, preparation, storage, analysis and visualization, often ignoring processes like data generation, collection, and interpretation. It’s also true that a broader discussion may shift the attention unnecessarily, though it’s important to increase people’s awareness in respect to data’s full lifecycle. Otherwise, many of the data solutions become a mix of castles built into the air, respectively structures of cards waiting for the next flurry to be blown away. 

Data is generated continuously by organizations, their customers, vendors, and third parties, as part of a complex network of processes, systems and integrations that extend beyond their intended boundaries. Independently of their type, scope and various other characteristics, all processes consume and generate data at a rapid pace that steadily exceeds organizations’ capabilities to make good use of it.

There are also scenarios in which the data must be collected via surveys, interviews, forms, measurements or direct observations, and whatever processes are used to elicit some aspect of importance. The volume and other characteristics of data generated in this way may depend on the goals and objectives in scope, respectively the methods, procedures and even the methodologies used. 

Data ingestion is the process of importing data from the various sources into a central or intermediary repository for storage, processing, analysis and visualization. The repository can be a data mart, warehouse, lakehouse, data lake or any other destination intended for the intermediary or the final intended destination of data. Moreover, data can have different levels of quality in respect to its intended usage.

Data storage refers to the systems and approaches used to securely retain, organize, and access data throughout its journey within the various layers of the infrastructure. It focuses on where and how data is stored, independently on whether that’s done on-premises, in the cloud or across hybrid environments.

Data preparation is the process of transforming the data into a form close to what is intended for analysis and visualization. It may involve data aggregation, enrichment, transposition and other operations that facilitate further steps. It’s probably the most important step in a data project given that the final outcome can have an important impact on data analysis and visualization, facilitating or impeding the respective processes. 

Data analysis consists of a multitude of processes that attempt to harness value from data in its various forms of aggregation. The ultimate purpose is to infer meaningful information, respectively knowledge from the data augmented as insights. The road from raw data to these targeted outcomes is a tedious one, where recipes can help and imped altogether. Expecting value from any pile of data can easily become a costly illusion when data, processes and their usage is poorly understood and harnessed. 

Data visualization is the means of presenting data and its characteristics in the form of figures, diagrams and other forms of representation that facilitate data’s navigation, perception and understanding for various purposes. Usually, the final purpose is fact-checking, decision-making, problem-solving, etc., though there is a multitude of steps in between. Especially in these areas there are mixed good and poor practices altogether.  

Data interpretation is the attempt of drawing meaningful conclusions from the data, information and knowledge gained mainly from data analysis and visualization. It is often a subjective interpretation as it’s usually regarded from people’s understanding of the various facts as they are considered. The inferences made in the process can be a matter of gut feeling, respectively of mature analysis. It’s about sense-making, contextualization, critical thinking, pattern recognition, internalization and externalization, and other similar cognitive processes.

Previous Post <<||>> Next Post

06 July 2025

🧭Business Intelligence: Perspectives (Part 32: Data Storytelling in Visualizations)

Business Intelligence Series
Business Intelligence Series

From data-related professionals to book authors on data visualization topics, there are many voices that require from any visualization to tell a story, respectively to conform to storytelling principles and best practices, and this independently of the environment or context in which the respective artifacts are considered. The need for data visualizations to tell a story may be entitled, though in business setups the data, its focus and context change continuously with the communication means, objectives, and, at least from this perspective, one can question storytelling’s hard requirement.

Data storytelling can be defined as "a structured approach for communicating data insights using narrative elements and explanatory visuals" [1]. Usually, this supposes the establishment of a context, respectively a fundament on which further facts, suppositions, findings, arguments, (conceptual) models, visualizations and other elements can be based upon. Stories help to focus the audience on the intended messages, they connect and eventually resonate with the audience, facilitate the retaining of information and understanding the chain of implications the decisions in scope have, respectively persuade and influence, when needed.

Conversely, besides the fact that it takes time and effort to prepare stories and the afferent content (presentations, manually created visualizations, documentation), expecting each meeting to be a storytelling session can rapidly become a nuisance for the auditorium as well for the presenters. Like in any value-generating process, one should ask where the value in storytelling is based on data visualizations and the effort involved, or whether the effort can be better invested in other areas.

In many scenarios, requesting from a dashboard to tell a story is an entitled requirement given that many dashboards look like a random combination of visuals and data whose relationship and meaning can be difficult to grasp and put into a plausible narrative, even if they are based on the same set of data. Data visualizations of any type should have an intentional well-structured design that facilitates visual elements’ navigation, understanding facts’ retention, respectively resonate with the auditorium.

It’s questionable whether such practices can be implemented in a consistent and meaningful manner, especially when rich navigation features across multiple visuals are available for users to look at data from different perspectives. In such scenarios the identification of cases that require attention and the associations existing between well-established factors help in the discovery process.

Often, it feels like visuals were arranged aleatorily in the page or that there’s no apparent connection between them, which makes the navigation and understanding more challenging. For depicting a story, there must be a logical sequencing of the various visualizations displayed in the dashboards or reports, especially when visuals’ arrangement doesn’t reflect the typical navigation of the visuals or when the facts need a certain sequencing that facilitates understanding. Moreover, the sequencing doesn’t need to be linear but have a clear start and end that encompasses everything in between.

Storytelling works well in setups in which something is presented as the basis for one-time or limited in scope sessions like decision-making, fact-checking, awareness raising and other types of similar communication. However, when building solutions for business monitoring and data exploration, there can be multiple stories or no story worth telling, at least not for the predefined scope. Even if one can zoom in or out, respectively rearrange the visuals and add others to highlight the stories encompassed, the value added by taking the information out of the dashboards and performing such actions can be often neglected to the degree that it doesn’t pay off. A certain consistency, discipline and acumen is needed then for focusing on the important aspects and ignoring thus the nonessential. 

References:
[1] Brent Dykes, "Effective Data Storytelling: How to Drive Change with Data, Narrative and Visuals", 2019 [quotes]

03 May 2025

🧭Business Intelligence: Perspectives (Part 31: More on Data Visualization)

Business Intelligence Series
Business Intelligence Series

There are many reasons why the data visualizations available in the different mediums can be considerate as having poor quality and unfortunately there is often more than one issue that can be corroborated with this - the complexity of the data or of the models behind them, the lack of identifying the right data, respectively aspects that should be visualized, poor data visualization software or the lack of skills to use its capabilities, improper choice of visual displays, misleading choice of scales, axes and other elements, the lack of clear outlines for telling a story respectively of pushing a story too far, not adapting visualizations to changing requirements or different perspectives, to name just the most important causes.

The complexity of the data increases with the dimensions associated typically with what we call currently big data - velocity, volume, value, variety, veracity, variability and whatever V might be in scope. If it's relatively easy to work with a small dataset, understanding its shapes and challenges, our understanding power decreases with the Vs added into the picture. Of course, we can always treat the data alike, though the broader the timeframe, the higher the chances are for the data to have important changing characteristics that can impact the outcomes. It can be simple definition changes or more importantly, the model itself. Data, processes and perspectives change fluidly with the many requirements, and quite often the further implications for reporting, visualizations and other aspects are not considered.

Quite often there's a gap between what one wants to achieve with a data visualization and the data or knowledge available. It might be a matter of missing values or whole attributes that would help to delimit clearly the different perspectives or of modelling adequately the processes behind. It can be the intrinsic data quality issues that can be challenging to correct after the fact. It can also be our understanding about the processes themselves as reflected in the data, or more important, on what's missing to provide better perspectives. Therefore, many are forced to work with what they have or what they know.

Many of the data visualizations inadvertently reflect their creators' understanding about the data, procedures, processes, and any other aspects related to them. Unfortunately, also business users or other participants have only limited views and thus their knowledge must be elicited accordingly. Even then, it might be pieces of data that are not reflected in any knowledge available.

If one tortures enough data, one or more stories worthy of telling can probably be identified. However, much of the data is dull to the degree that some creators feel forced to add elements. Earlier, one could have blamed the software for it, though modern software provides nice graphics and plenty of features that can help graphics creators in the process. Even data with high quality can reveal some challenges difficult to overcome. One needs to compromise and there can be compromises in many places to the degree that one can but wonder whether the end result still reflects reality. Unfortunately, it's difficult to evaluate the impact of such gaps, however progress can be made occasionally by continuously evaluating the gaps and finding the appropriate methods to address them.

Not all stories must have complex visualizations in which multiple variables are used to provide the many perspectives. Some simple visualizations can be enough for establishing common ground on which something more complex (or simple) can be built upon. Data visualization is a continuous process of exploration, extrapolation, evaluation, testing assumptions and ideas, where one's experience can be a useful mediator between the various forces. 

Previous Post <<||>> Next Post

24 April 2025

🧭Business Intelligence: Perspectives (Part 30: The Data Science Connection)

Business Intelligence Series
Business Intelligence Series

Data Science is a collection of quantitative and qualitative methods, respectively techniques, algorithms, principles, processes and technologies used to analyze, and process amounts of raw and aggregated data to extract information or knowledge it contains. Its theoretical basis is rooted within mathematics, mainly statistics, computer science and domain expertise, though it can include further aspects related to communication, management, sociology, ecology, cybernetics, and probably many other fields, as there’s enough space for experimentation and translation of knowledge from one field to another.  

The aim of Data Science is to extract valuable insights from data to support decision-making, problem-solving, drive innovation and probably it can achieve more in time. Reading in between the lines, Data Science sounds like a superhero that can solve all the problems existing out there, which frankly is too beautiful to be true! In theory everything is possible, when in practice there are many hard limitations! Given any amount of data, the knowledge that can be obtained from it can be limited by many factors - the degree to which the data, processes and models built reflect reality, and there can be many levels of approximation, respectively the degree to which such data can be collected consistently. 

Moreover, even if the theoretical basis seems sound, the data, information or knowledge which is not available can be the important missing link in making any sensible progress toward the goals set in Data Science projects. In some cases, one might be aware of what's missing, though for the data scientist not having the required domain knowledge, this can be a hard limit! This gap can be probably bridged with sensemaking, exploration and experimentation approaches, especially by applying models from other domains, though there are no guarantees ahead!

AI can help in this direction by utilizing its capacity to explore fast ideas or models. However, it's questionable how much the models built with AI can be further used if one can't build mechanistical mental models of the processes reflected in the data. It's like devising an algorithm for winning at lottery small amounts, though investing more money in the algorithm doesn't automatically imply greater wins. Even if occasionally the performance is improved, it's questionable how much it can be leveraged for each utilization. Statistics has its utility when one studies data in aggregation and can predict average behavior. It can’t be used to predict the occurrence of events with a high precision. Think how hard the prediction of earthquakes or extreme weather is by just looking at a pile of data reflecting what’s happening only in a certain zone!

In theory, the more data one has from different geographical areas or organizations, the more robust the models can become. However, no two geographies, respectively no two organizations are alike: business models, the people, the events and other aspects make global models less applicable to local context. Frankly, one has more chances of progress if a model is obtained by having a local scope and then attempting to leverage the respective model for a broader scope. Even then, there can be differences between the behavior or phenomena at micro, respectively at macro level (see the law of physics). 

This doesn’t mean that Data Science or AI related knowledge is useless. The knowledge accumulated by applying various techniques, models and programming languages in problem-solving can be more valuable than the results obtained! Experimentation is a must for organizations to innovate, to extend their knowledge base. It’s also questionable how much of the respective knowledge can be retained and put to good use. In the end, each organization must determine this by itself!

27 March 2025

🧭Business Intelligence: Perspectives (Part 29: Navigating into the Unknown)

Business Intelligence Series
Business Intelligence Series

One of the important challenges in Business Intelligence and the other related knowledge domains is that people try to oversell ideas, overstretching, shifting, mixing and bending the definition of concepts and their use to suit the sales pitch or other related purposes. Even if there are several methodologies built around data that attempt to provide a solid foundation on which organizations can build upon, terms like actionable, value, insight, quality or importance continue to be a matter of perception, interpretation, and quite often be misused. 

It's often challenging to define precisely such businesses concepts especially there are degrees of fuzziness that may apply to the different contexts that are associated with them. What makes a piece of signal, data, information or knowledge valuable, respectively actionable? What is the value, respectively values we associate with a piece or aggregation of information, insight or degree of quality? When do values, changes, variations and other aspects become important, respectively can be ignored? How much can one generalize or particularize certain aspects? And, many more such questions can be added to this line of inquiry. 

Just because an important value changed, no matter in what direction, it might mean nothing as long as the value moves in certain ranges, respectively other direct or indirect conditions are met or not. Sometimes, there are simple rules and models that can be used to identify the various areas that should trigger different responses, respectively actions, though even small variations can increase the overall complexity multifold. There seems to be certain comfort in numbers, even if the same numbers can mean different things to different people, at different points in time, respectively contexts.

In the pursuit to bridge the multitude of gaps and challenges, organization attempt to arrive at common definitions and understanding in what concerns the business terms, goals, objectives, metrics, rules, procedures, processes and other points of focus associated with the respective terms. Unfortunately, many such foundations barely support the edifices built as long as there’s no common mental models established!

Even if the use of shared models is not new, few organizations try to make the knowledge associated with them explicit, respectively agree on and evolve a set of mental models that reflect how the business works, what is important, respectively can be ignored, which are the dependent and independent aspects, etc. This effort can prove to be a challenge for many organizations, especially when several leaps of faith must be made in the process.

Independently on whether organizations use shared mental models, some kind of common ground must be achieved. It starts with open dialog, identifying the gaps, respectively the minimum volume of knowledge required for making progress in the right direction(s). The broader the gaps and the misalignment, the more iterations are needed to make progress! And, of course, one must know which are the destinations, what paths to follow, what to ignore, etc. 

It's important how we look at the business, and people tend to use different filters (aka glasses or hats) for this purpose. Simple relationships between the various facts are ideal, though uncommon. There’s a chain of causality that may trigger a certain change, though more likely one deals with a networked structure of cause-effect relationships. The world is more complex than we (can} imagine. We try to focus on the aspects we are aware of, respectively consider as important. However, in a complex world also small variations in certain areas can shift the overall weight to aspects outside of our focus, influence or area of responsibility. Quite often, what we don’t know is more important than what we know!

10 March 2025

🧭Business Intelligence: Perspectives (Part 28: Cutting through Complexity)

Business Intelligence Series
Business Intelligence Series

Independently of the complexity of the problems, one should start by framing the problem(s) correctly and this might take several steps and iterations until a foundation is achieved, upon which further steps can be based. Ideally, the framed problem should reflect reality and should provide a basis on which one can build something stable, durable and sustainable. Conversely, people want quick low-cost fixes and probably the easiest way to achieve this is by focusing on appearances, which are often confused with more.

In many data-related contexts, there’s the tendency to start with the "solution" in mind, typically one or more reports or visualizations which should serve as basis for further exploration. Often, the information exchange between the parties involved (requestor(s), respectively developer(s)) is kept to a minimum, though the formalized requirements barely qualify for the minimum required. The whole process starts with a gap that can create further changes as the development process progresses, with all the consequences deriving from this: the report doesn’t satisfy the needs, more iterations are needed - requirements’ reevaluation, redesign, redevelopment, retesting, etc.

The poor results are understandable, all parties start with a perspective based on facts which provide suboptimal views when compared with the minimum basis for making the right steps in the right direction. That’s not only valid for reports’ development but also for more complex endeavors – data models, data marts and warehouses, and other software products. Data professionals attempt to bridge the gaps by formalizing and validating the requirements, building mock-ups and prototypes, testing, though that’s more than many organizations can handle!

There are simple reports or data visualizations for which not having prior knowledge of the needed data sources, processes and the business rules has a minimal impact on the further steps of the processes involved in building the final product(s). However, "all generalizations are false" to some degree, and there’s a critical point after which minimal practices tend to create more waste than companies can afford. Consequently, applying the full extent of the processes can lead to waste when the steps aren’t imperative for the final product.

Even if one is aware of all the implications, one’s experience and the application of best practices doesn’t guarantee the quality of the results as long as some kind of novelty, unknown, fuzziness or complexity is involved. Novelty can appear in different ways – process, business rules, data or problem formulations, particularities that aren’t easily perceived or correctly understood. Each important minor piece of information can have an exponential impact under the wrong circumstances.

The unknown can encompass novelty, though can be also associated with the multitude of facts not explicitly and/or directly considered. "The devil is in details" and it’s so easy for important or minor facts to remain hidden under the veil of suppositions, expectations, respectively under the complex and fuzzy texture of logical aspects. Many processes operate under strict rules, though there are various consequences, facts or unnecessary information that tend to increase the overall complexity and fuzziness.

Predefined processes, procedures and practices can help cut and illuminate through this complex structure associated with the various requirements and aspects of problems. Plunging headfirst can be occasionally associated with the need to evaluate what is known and unknown from facts and data’s perspective, to identify the gaps and various factors that can weigh in the final solution. Unfortunately, too often it’s nothing of this!  

Besides the multitude of good/best practices and problem-solving approaches, all one has is his experience and intuition to cut through the overall complexity. False steps are inevitable for finding the approachable path(s) from the requirements to the solution.

21 February 2025

🧩IT: Idioms, Sayings, Proverbs and Other Words of Wisdom

In IT setups one can hear many idioms, sayings and other type of words of wisdom that make the audience smile, even if some words seem to rub salt in the wounds. These are some of the idioms met in IT meetings or literature. Frankly, it's worth to write more about each of them, and this it the purpose of the "project". 

"A bad excuse is better than none"

"A bird in the hand is worth two in the bush": a working solution is worth more than hypothetically better solutions. 

"A drowning man will clutch at a straw": a drowning organization will clutch to the latest hope

"A friend in need (is a friend indeed)": 

"A journey of a thousand miles begins with a single step"

"A little learning is a dangerous thing"

"A nail keeps a shoe, a shoe a horse, a horse a man, a man a castle" (cca 1610): A nail keeps the shoe

"A picture is worth a thousand words"

"A stitch in time (saves nine)"

"Actions speak louder than words"

"All good things must come to an end"

"All generalizations are false" [attributed to Mark Twain, Alexandre Dumas (Père)]: Cutting though Complexity

"All the world's a stage, And all [...] merely players": A look forward

"All roads lead to Rome"

"All is well that ends well"

"An ounce of prevention is worth a pound of cure"

"Another day, another dollar"

"As you sow so shall you reap"

"Beauty is in the eye of the beholder"

"Better late than never": SQL Server and Excel Data

"Better safe than sorry": Deleting obsolete companies

"Big fish eat little fish"

"Better the Devil you know (than the Devil you do not)": 

"Calm seas never made a good sailor"

"Count your blessings"

"Dead men tell no tales"

"Do not bite the hand that feeds you"

"Do not change horses in midstream"

"Do not count your chickens before they are hatched"

"Do not cross the bridge till you come to it"

"Do not judge a book by its cover"

"Do not meet troubles half-way"

"Do not put all your eggs in one basket"

"Do not put the cart before the horse"

"Do not try to rush things; ignore matters of minor advantage" (Confucius): A tale of two cities II

"Do not try to walk before you can crawl"

"Doubt is the beginning, not the end, of wisdom"

"Easier said than done"

"Every cloud has a silver lining"

"Every little bit helps"

"Every picture tells a story"

"Failing to plan is planning to fail"Planning correctly misunderstood...

"Faith will move mountains"

"Fake it till you make it"

"Fight fire with fire"

"First impressions are the most lasting"

"First things first": Ways of looking at data

"Fish always rots from the head downwards"

"Fools rush in (where angels fear to tread)" (Alexander Pope, "An Essay on Criticism", cca. 1711): A tale of two cities II

"Half a loaf is better than no bread"

"Haste makes waste"

"History repeats itself"

"Hope for the best, and prepare for the worst"

"If anything can go wrong, it will" (Murphy's law)

"If it ain't broke, don't fix it.": Approaching a query

"If you play with fire, you will get burned"

"If you want a thing done well, do it yourself"

"Ignorance is bliss"

"Imitation is the sincerest form of flattery"

"It ain't over till/until it's over"

"It is a small world"

"It is better to light a candle than curse the darkness"

"It is never too late": A look backAll-knowing developers are back...

"It's a bad plan that admits of no modification." (Publilius Syrus)Planning Correctly Misunderstood I

"It’s not an adventure until something goes wrong." (Yvon Chouinard)Documentation - Lessons learned

"It is not enough to learn how to ride, you must also learn how to fall"

"It takes a whole village to raise a child"

"It will come back and haunt you"

"Judge not, that ye be not judged"

"Kill two birds with one stone"

"Knowledge is power, guard it well"

"Learn a language, and you will avoid a war" (Arab proverb)

"Less is more"

"Life is what you make it"

"Many hands make light work"

"Moderation in all things"

"Money talks"

"More haste, less speed"

"Necessity is the mother of invention"

"Never judge a book by its cover"

"Never say never"

"Never too old to learn"

"No man can serve two masters"

"No pain, no gain"

"No plan ever survived contact with the enemy.' (Carl von Clausewitz)Planning Correctly Misunderstood I

"Oil and water do not mix"

"One-man show": series

"One man's trash is another man's treasure"

"One swallow does not make a summer"

"Only time will tell": The Software Quality Perspective and AI, Microsoft FabricIt’s all about Partnership IIAccess vs. LightSwitch

"Patience is a virtue"

"Poke the bear": Mea Culpa - A Look Forward

"Practice makes perfect"

"Practice what you preach"

"Prevention is better than cure"

"Rules were made to be broken"

"Seek and ye shall find"

"Some are more equal than others" (George Orwell, "Animal Farm")

"Spoken words fly away, written words remain." ["Verba volant, scripta manent"]: Documentation - Lessons learned

"Strike while the iron is hot"

"Technology is dead": Dashboards Are Dead & Other Crapprogramming is dead

"The best defense is a good offense"

"The bets are off":  A look forward

"The bigger they are, the harder they fall"

"The devil is in the detail": Copilot Stories Part IV, Cutting through ComplexityMore on SQL DatabasesThe Analytics MarathonThe Choice of Tools in PM, Who Messed with My Data?

"The die is cast"

"The exception which proves the rule"

"The longest journey starts with a single step"

"The pursuit of perfection is a fool's errand"

"There are two sides to every question"

"There is no smoke without fire"

"There's more than one way to skin a cat" (cca. 1600s)

"There is no I in team"

"There is safety in numbers"

"Those who do not learn from history are doomed to repeat it" (George Santayana)

"Time is money"

"To learn a language is to have one more window from which to look at the world" (Chinese proverb)[5

"Too little, too late"

"Too much of a good thing"

"Truth is stranger than fiction"

"Two birds with one stone": Deleting sequential data...

"Two heads are better than one": Pair programming

"Two wrongs (do not) make a right"

"United we stand, divided we fall"

"Use it or lose it"

"Unity is strength"

"Variety is the spice of life." (William Cowper)

"Virtue is its own reward"

"Well begun is half done"

"What does not kill me makes me stronger"

"Well done is better than well said"

"What cannot be cured must be endured"

"What goes around, comes around"

"When life gives you lemons, make lemonade"

"When the cat is away, the mice will play"

"When the going gets tough, the tough get going"

"Where there is a will there is a way"

"With great power comes great responsibility"

"Work expands so as to fill the time available"

"You are never too old to learn": All-Knowing Developers are Back in Demand?

"You can lead a horse to water, but you cannot make it drink"

"You cannot make an omelet without breaking eggs"

"(You cannot) teach an old dog new tricks"

"You must believe and not doubt at all": Believe and not doubt

"Zeal without knowledge is fire without light"

Previous Post <<||>> Next Post

References:
[1] Wikipedia (2024) List of proverbial phrases [link]

18 December 2024

🧭🏭Business Intelligence: Microsoft Fabric (Part VII: Data Stores Comparison)

Business Intelligence Series
Business Intelligence Series

Microsoft made available a reference guide for the data stores supported for Microsoft Fabric workloads [1], including the new Fabric SQL database (see previous post). Here's the consolidated table followed by a few aspects to consider: 

Area Lakehouse Warehouse Eventhouse Fabric SQL database Power BI Datamart
Data volume Unlimited Unlimited Unlimited 4 TB Up to 100 GB
Type of data Unstructured, semi-structured, structured Structured, semi-structured (JSON) Unstructured, semi-structured, structured Structured, semi-structured, unstructured Structured
Primary developer persona Data engineer, data scientist Data warehouse developer, data architect, data engineer, database developer App developer, data scientist, data engineer AI developer, App developer, database developer, DB admin Data scientist, data analyst
Primary dev skill Spark (Scala, PySpark, Spark SQL, R) SQL No code, KQL, SQL SQL No code, SQL
Data organized by Folders and files, databases, and tables Databases, schemas, and tables Databases, schemas, and tables Databases, schemas, tables Database, tables, queries
Read operations Spark, T-SQL T-SQL, Spark* KQL, T-SQL, Spark T-SQL Spark, T-SQL
Write operations Spark (Scala, PySpark, Spark SQL, R) T-SQL KQL, Spark, connector ecosystem T-SQL Dataflows, T-SQL
Multi-table transactions No Yes Yes, for multi-table ingestion Yes, full ACID compliance No
Primary development interface Spark notebooks, Spark job definitions SQL scripts KQL Queryset, KQL Database SQL scripts Power BI
Security RLS, CLS**, table level (T-SQL), none for Spark Object level, RLS, CLS, DDL/DML, dynamic data masking RLS Object level, RLS, CLS, DDL/DML, dynamic data masking Built-in RLS editor
Access data via shortcuts Yes Yes Yes Yes No
Can be a source for shortcuts Yes (files and tables) Yes (tables) Yes Yes (tables) No
Query across items Yes Yes Yes Yes No
Advanced analytics Interface for large-scale data processing, built-in data parallelism, and fault tolerance Interface for large-scale data processing, built-in data parallelism, and fault tolerance Time Series native elements, full geo-spatial and query capabilities T-SQL analytical capabilities, data replicated to delta parquet in OneLake for analytics Interface for data processing with automated performance tuning
Advanced formatting support Tables defined using PARQUET, CSV, AVRO, JSON, and any Apache Hive compatible file format Tables defined using PARQUET, CSV, AVRO, JSON, and any Apache Hive compatible file format Full indexing for free text and semi-structured data like JSON Table support for OLTP, JSON, vector, graph, XML, spatial, key-value Tables defined using PARQUET, CSV, AVRO, JSON, and any Apache Hive compatible file format
Ingestion latency Available instantly for querying Available instantly for querying Queued ingestion, streaming ingestion has a couple of seconds latency Available instantly for querying Available instantly for querying

It can be used as a map for what is needed to know for using each feature, respectively to identify how one can use the previous experience, and here I'm referring to the many SQL developers. One must consider also the capabilities and limitations of each storage repository.

However, what I'm missing is some references regarding the performance for data access, especially compared with on-premise workloads. Moreover, the devil hides in details, therefore one must test thoroughly before committing to any of the above choices. For the newest overview please check the referenced documentation!

For lakehouses, the hardest limitation is the lack of multi-table transactions, though that's understandable given its scope. However, probably the most important aspect is whether it can scale with the volume of reads/writes as currently the SQL endpoint seems to lag. 

The warehouse seems to be more versatile, though careful attention needs to be given to its design. 

The Eventhouse opens the door to a wide range of time-based scenarios, though it will be interesting how developers cope with its lack of functionality in some areas. 

Fabric SQL databases are a new addition, and hopefully they'll allow considering a wide range of OLTP scenarios. Starting with 28th of March 2025, SQL databases will be ON by default and tenant admins must manually turn them OFF before the respective date [3].

Power BI datamarts have been in preview for a couple of years.


References:
[1] Microsoft Fabric (2024) Microsoft Fabric decision guide: choose a data store [link]
[2] Reitse's blog (2024) Testing Microsoft Fabric Capacity: Data Warehouse vs Lakehouse Performance [link]
[3] Microsoft Fabric Update Blog (2025) Extending flexibility: default checkbox changes on tenant settings for SQL database in Fabric [link]
[4] Microsoft Fabric Update Blog (2025) Enhancing SQL database in Fabric: share your feedback and shape the future [link]
[5] Microsoft Fabric Update Blog (2025) Why SQL database in Fabric is the best choice for low-code/no-code Developers [link

14 October 2023

🧭Business Intelligence: Perspectives (Part 8: Insights - The Complexity Perspective)

Business Intelligence Series
Business Intelligence Series

Scientists attempt to discover laws and principles, and for this they conduct experiments, build theories and models rooted in the data they collect. In the business setup, data professionals analyze the data for identifying patterns, trends, outliers or anything else that can lead to new information or knowledge. On one side scientists choose the boundaries of the systems they study, while for data professionals even if the systems are usually given, they can make similar choices. 

In theory, scientists are more flexible in what data they collect, though they might have constraints imposed by the boundaries of their experiments and the tools they use. For data professionals most of the data they need is already there, in the systems the business uses, though the constraints reside in the intrinsic and extrinsic quality of the data, whether the data are fit for the purpose. Both parties need to work around limitations, or attempt to improve the experiments, respectively the systems. 

Even if the data might have different characteristics, this doesn't mean that the methods applied by data professionals can't be used by scientists and vice-versa. The closer data professionals move from Data Analytics to Data Science, the higher the overlap between the business and scientific setup. 

Conversely, the problems data professionals meet have different characteristics. Scientists outlook is directed mainly at the phenomena and processes occurring in nature and society, where randomness, emergence and chaos seem to feel at home. Business processes deal more with predefined controlled structures, cyclicity, higher dependency between processes, feedback and delays. Even if the problems may seem to be different, they can be modeled with systems dynamics. 

Returning to data visualization and the problem of insight, there are multiple questions. Can we use simple designs or characterizations to find the answer to complex problems? Which must be the characteristics of a piece of information or knowledge to generate insight? How can a simple visualization generate an insight moment? 

Appealing to complexity theory, there are several general approaches in handling complexity. One approach resides in answering complexity with complexity. This means building complex data visualizations that attempt to model problem's complexity. For example, this could be done by building a complex model that reflects the problem studied, and build a set of complex visualizations that reflect the different important facets. Many data professionals advise against this approach as it goes against the simplicity principle. On the other hand, starting with something complex and removing the nonessential can prove to be an approachable strategy, even if it involves more effort. 

Another approach resides in reducing the complexity of the problem either by relaxing the constraints, or by breaking the problem into simple problems and addressing each one of them with visualizations. Relaxing the constraints allow studying upon case a more general problem or a linearization of the initial problem. Breaking down the problem into problems that can be easier solved, can help to better understand the general problem though we might lose the sight of emergence and other behavior that characterize complex systems.

Providing simple visualizations to complex problems implies a good understanding of the problem, its solution(s) and the overall context, which frankly is harder to achieve the more complex a problem is. For its understanding a problem requires a minimum of knowledge that needs to be reflected in the visualization(s). Even if some important aspects are assumed as known, they still need to be confirmed by the visualizations, otherwise any deviation from assumptions can lead to a new problem. Therefore, its questionable that simple visualizations can address the complexity of the problems in a general manner. 

Previous Post <<||>> Next Post 


06 November 2020

🧭Business Intelligence: Perspectives (Part 6: Data Soup - Reports vs. Data Visualizations)

Business Intelligence Series
Business Intelligence Series

Considering visualizations, John Tukey remarked that ‘the greatest value of a picture is when it forces us to notice what we never expected to see’, which is not always the case for many of the graphics and visualizations available in organizations, typically in the form of simple charts and dashboards, quite often with no esthetics or meaning behind.

In general, reports are needed as source for operational activities, in which the details in form of raw or aggregate data are important. As one moves further to the tactical or strategic aspects of a business, visualizations gain in importance especially when they allow encoding data and information, respectively variations, trends or relations in smaller places with minimal loss of information.

There are also different aspects of visualizations that need to be considered. Modern tools allow rapid visualization and interactive navigation of data across different variables which is great as long one knows what is searching for, which is not always the case.

There are junk charts in which the data drowns in graphical elements that bring no value to the reader, in extremis even distorting the message/meaning.

There are graphics/visualizations that attempt bringing together and encoding multiple variables in respect to a theme, and for which a ‘project’ is typically needed as data is not ad-hoc available, don’t have the desired quality or need further transformations to be ready for consumption. Good quality graphics/visualizations require time and a good understanding of the business, which are not necessarily available into the BI/Analytics teams, and unfortunately few organizations do something in that direction, ignoring typically such needs. In this type of environments is stressed the rapid availability of data for decision-making or action-relevant insight, which depends typically on the consumer.

The story-telling capabilities of graphics/visualizations are often exaggerated. Yes, they can tell a story though stories need to be framed into a context/problem, some background and further references need to be provided, while without detailed data the graphics/visualizations are just nice representations in which each consumer understands what he can.

In an ideal world the consumer and the ‘designer’ would work together to identify the important data for the theme considered, to find the appropriate level of detail, respectively the best form of encoding. Such attempts can stop at table-based representations (aka reports), respectively basic or richer forms of graphical representations. One can consider reports as an early stage of the visualization process, with the potential to derive move value when the data allow meaningful graphical representations. Unfortunately, the time, data and knowledge available seldom make this achievable.

In addition, a well-designed report can be used as basis for multiple purposes, while a graphic/visualization can enforce more limitations. Ideal would be when multiple forms of representation (including reports) are combined to harness the value of data. Navigations from visualizations to detailed data can be useful to understand what happens; learning and understanding the various aspects being an iterative process.

It’s also difficult to demonstrate the value of insight derived from visualizations, especially when graphical literacy goes behind the numeracy and statistical literacy - many consumers lacking the skills needed to evaluate numbers and statistics adequately. If for a good artistic movie you need an assistance to enjoy the show and understand the message(s) behind it, the same can be said also about good graphics/visualizations. Moreover, this requires creativity, abstraction-based thinking, and other capabilities to harness the value of representations.

Given the considerable volume of requirements related to the need of basis data, reports will continue to be on high demand in organizations. In exchange visualizations can complement them by providing insights otherwise not available.

Initially published on Medium as answer to a post on Reporting and Visualizations.

Previous Post <<||>> Next Post

02 March 2016

🧭Business Intelligence: Perspectives (Part 3: Self-Service BI)

Business Intelligence

Introduction


According to Gartner, the world's leading information technology research and advisory company, Self-Service BI (aka self-service analytics, ad-hoc analysis, personal analytics), for short SSBI, is a “form of business intelligence (BI) in which line-of-business professionals are enabled and encouraged to perform queries and generate reports on their own, with nominal IT support” [1].

Reading between the lines, SSBI presumes the existence of an infrastructure made of tools to support it (aka self-service BI tools), direct or indirect access to row data and/or data models for the users, and the skillset needed in order to work with data and answer to business problems/questions.

A Little History

The concept of self-service is not new, it just got “rebranded” and transformed into a business opportunity. The need for business users to perform ad-hoc analyses was always there in organizations, especially in the ones not having the right infrastructure for harnessing their data. Even since the 90s with the appearance of products like MS Excel or MS Access in many organizations users were forced by the state of art to learn how to use such products in order to get the answers they needed from the data. Users started building personal solutions, many of them temporary, intended to fill the reporting gaps organizations had. With a little effort and relatively small investment users had the possibility of playing with the data, understanding the data, identifying and solving problems in the business. They acquired thus a certain level of business expertise and data awareness becoming valuable resources in the organization.

With time such solutions grew in scope and data volume, gained broader visibility and reached deeper in organizations, some of them becoming team, departmental or cross-departmental solutions. What grows uncontrolled with time starts to have negative impact on the environment. First tools’ management became a problem because the solutions needed to be backed-up and maintained regularly, then other problems started to surface: security of data, inefficient data processing as increasing volumes of data were processed on local computers and transferred over the network, data and effort were duplicated, different versions of reality existed as different numbers were reported, numbers that were reflecting different definitions, knowledge about the business or data-analysis skillsets. The management needed a more consolidated and standardized effort in order to address these problems. Organizations were forced or embraced the idea of investing money in modern BI solutions, in more powerful servers capable of handling a larger amount of requests, in flexible data models that facilitate data consumption, in data quality initiatives. Thus through various projects a considerable number of such solutions were converted into more standardized and performant BI solutions, the IT department being in control of the changes and new requests.

Back to Present

With IT in control of the reporting requirements the business is forced to rely on the rapidity with which IT is able to address new requirements. Some organizations acquired internal resources in order to build reports and afferent infrastructure in-house, others created partnerships with vendors, or approached a combination of the two. As the volume of requirements isn’t uniform over time, the business has to wait several days between the time a requirement was addressed to IT and a solution was provided. In business terms a few of days of waiting for data can equate with the loss of an opportunity, a decision taken too late, decision that could have broader impact.

A few years ago things started to change when the ad-hoc analysis concept was rebranded as self-service and surfaced as trend. This time vendors like Qlik, Tableau, MicroStrategy or Microsoft, some of the main SSBI vendors, are offering easy to use and rich functionality tools for data integration, visualization and discovery, tools that reflect the advances made in graphics, data storage and processing technologies (e.g. in-memory databases, parallel processing). With just a few drag-and-drops users are able to display details, aggregate data, identify trends and correlations between data. In addition the tools can make use of the existing data models available in data warehouses, data marts and other types of data repositories, including the rich set of open data available on the web.

Looking at the Future

Like its predecessors, SSBI seems to address primarily data analysts and data-aware business users (aka data citizens), however in time is expected to be adopted by more organizations and become more mature where already adopted. Of course, some of the problems from the early days more likely will resurface though through governance, better architectures and tools, integration with other BI capabilities, trainings and awareness most of the problems will be overcome. More likely there will be also organizations in which SSBI will fail. In the end each organization will need to find by itself the value of SSBI.

Previous Post <<||>> Next Post

Resources:
[1] Gartner (2016) Self-Service Analytics [Online] Available from: http://www.gartner.com/it-glossary/self-service-analytics
[2
] Gartner (2016) Magic Quadrant for Business Intelligence and Analytics Platforms, by Josh Parenteau, Rita L. Sallam, Cindi Howson, Joao Tapadinhas, Kurt Schlegel, Thomas W. Oestreich [Online] Available from: https://www.gartner.com/doc/reprints?id=1-2XXET8P&ct=160204&st=sb

27 February 2016

🧭Business Intelligence: Perspectives (Part 2: The Complexity Myth)

Business Intelligence

Introduction

While looking over “Business Intelligence Concepts and Platform Capabilities” Coursera MOOC resources for Module 2 I run into two similar articles from Solutions Review, respectively Information Age. What caught my attention was the easiness with which the complexity of BI “myth” is approached in both columns.

According to the two sources the capabilities of nowadays BI tools “enabled business users to easily identify and present trends in an impactful way” [1], and “do not require an expert at the helm” [2]. It became thus simpler for users to independently query data and create interactive reports and presentations [2]. In both columns one can read between the lines that the simplicity of using BI tools is equivalent with negating the complexity of BI, which from my point of view is false. In fact here are regarded especially the self-service BI tools, in trend nowadays, that allow users to easily perform ad-hoc analysis with a minimal involvement from IT. Self-service BI is only a subset of what BI for organizations means, and just a capability from the many BI capabilities an organization needs in theory, even if some organizations might use it extensively.

Beyond the Surface

A BI tool is not a BI solution per se, even if many generic BI solutions for different systems are available out of the box. This is one of the biggest confusion managers, users and unfortunately also BI professionals make. A BI tool offers the technological basis for creating a BI infrastructure, though it comes with no guarantees. It takes a well-defined IT and business strategy, one or more successful projects, skillful developers and users in order to harness the BI investment.

On the other side it’s also true that organizations can obtain results also from less, though BI doesn’t equates with any ad-hoc analysis performed by users, even if they use BI tools for this purpose. BI is not only about tools, reporting and revealing trends in the data. BI often implies a holistic knowledge about the business and certain data awareness, without which users will start aggregating and comparing apples with pears and wonder why they taste and look different.

If everything were so simple then why so many BI projects fail to deliver what’s expected? Why so many managers complain that they don’t have the data they need, when they need them? Sure maybe the problem lies in over-complexifying the whole BI landscape and treating everything from a high-level, though that’s more likely not it.

It’s a Teamwork Knowledge Game

BI is or needs to be monitoring and problem solving oriented. This requires a deep understanding about processes and business. There are business users and also BI professionals who don’t have the knowledge one needs in order to approach a business problem. One can see that from the premises they have, the questions they raise, the data they consider, the models they build, and the results.

From a BI professional’s perspective, even if one has a broad knowledge about various businesses, one often lacks the insight in a given business. BI professionals can seldom provide adequate BI solutions without input and feedback from the business. Some BI professionals rely too much on their knowledge, same as the business sometimes expects a maximum output from BI professionals by providing a minimum of input.

Considering the business users, quite often their focus and knowledge cover only the data boundaries of their department, while many problems extend over those boundaries. They know facts that are not necessarily reflected in the data. Even if they are closer to the data than other parties, they still lack some data-awareness (including statistical awareness) in order to approach problems.

Somebody was saying ironically when talking about users’ data and problem solving skills - “not everybody is a Bill Gates or Steve Jobs”. Continuing the idea, one can’t expect users to act as such. For sure there are many business users who are better problem solvers than BI consultants, though on the other side one can’t expect that the average business user will have the same skillset as an experienced BI consultant. This is in fact one of the problems of self-service BI. Probably with time and effort organization will develop such resources, though some help from BI professionals will be still needed. Without a good cooperation between the business and BI professionals an organization might not have the hoped results when investing in BI

More on Complexity

The complexity arises when one tries to make more with the data, especially the data found in raw form. Usually the complexity of raw data can be addressed by building a logical or physical model that allows easier consumption of data. Here is the point where the users find themselves overwhelmed, because for this is required a good knowledge of the physical data model and its semantics, the technical knowledge to build models and the skills to reengineer the logic available in the source systems. These are the themes BI professionals are supposed to excel in. Talking about models, they are the most difficult to build because they reflect various segments of the business, they reflect a breakdown of the complexity. It’s also the point where many BI projects fail as the built models don’t reflect the reality or aren’t capable to answer to business questions.

Coming back to the two columns, I have to point out that the complexity of a subject or domain can’t be judged based on how easy is to approach basic tasks. The complexity lies typically when one goes beyond the basics, when one dives into details. In case of BI its complexity starts when one attempts mixing various technologies and knowledge domains to model and solve daily business problems in an integrated, holistic, aligned, consistent and cost-effective manner. The more the technologies, the knowledge domains and constraints one has to consider, the more complex the BI landscape and solutions become.

On the other side this doesn’t mean that the BI infrastructure can’t be simplified, that BI can’t rely heavily or exclusively on self-service BI solutions. However for each strategy there are advantages and disadvantages and one more likely has to consider both sides of the coin in the process. And self-service BI has its own trade-offs, weaknesses that can be transformed in strengths with time.

Conclusion

When one considers nowadays BI tools capabilities, ad-hoc analyses are relatively easy to perform and can lead to results, though such analyses don’t equate with BI and the simplicity with which they are performed don’t necessarily imply that BI is simple as a whole. When one considers the complexity of nowadays businesses, the more one dives in various problems a business has, the more complex the BI landscape seems. In the end it’s in each organization powers to simplify and harmonize its BI infrastructure to a degree that its business goals aren’t affected negatively.

Previous Post <<||>> Next Post

Resources
[1] Information Age (2015) 5 Myths about Intelligence, by Ben Rossi, [Online] Available from: http://www.information-age.com/technology/information-management/123460271/5-myths-about-business-intelligence 
[2] SolutionsReview (2015) Top 5 Business Intelligence Myths Revealed, by Timothy King, [Online] Available from: http://solutionsreview.com/business-intelligence/top-5-business-intelligence-myths-revealed
[3] Gartner (2016) Magic Quadrant for Business Intelligence and Analytics Platforms, by Josh Parenteau, Rita L. Sallam, Cindi Howson, Joao Tapadinhas, Kurt Schlegel, Thomas W. Oestreich [Online] Available from: https://www.gartner.com/doc/reprints?id=1-2XXET8P&ct=160204&st=sb 
[4] Coursera (2016) Business Intelligence Concepts, Tools, and Applications MOOC, led by Jahangir Karimi, University of Colorado, [Online] Available from: https://www.coursera.org/learn/business-intelligence-tools

29 December 2015

🪙Business Intelligence: Storytelling (Just the Quotes)

"Storytelling reveals meaning without committing the error of defining it." (Hannah Arendt, "Men in Dark Times", 1968)

"Scientific practice may be considered a kind of storytelling practice [...]" (Donna Haraway, "Primate Visions", 1989)

"Storytelling is the art of unfolding knowledge in a way that makes each piece contribute to a larger truth." (Philip Gerard, "Writing a Book That Makes a Difference", 2000)

"The human mind is a wanton storyteller and even more, a profligate seeker after pattern. We see faces in clouds and tortillas, fortunes in tea leaves and planetary movements. It is quite difficult to prove a real pattern as distinct from a superficial illusion." (Richard Dawkins, "A Devil's Chaplain", 2003)

"The world of a story is not merely the sum of all the words we put on a page, or on many pages. When we talk about entering the world of a story as a reader we refer to things we picture, or imagine, and responses we form - to characters, events - all of which are prompted by, but not entirely encompassed by, the words on the page." (Peter Turchi, "Maps of the Imagination: The writer as cartographer", 2004)

"We have, as human beings, a storytelling problem. We're a bit too quick to come up with explanations for things we don't really have an explanation for." (Malcolm Gladwell, "Blink: The Power of Thinking Without Thinking", 2005)

"There is an extraordinary power in storytelling that stirs the imagination and makes an indelible impression on the mind." (Brennan Manning, "The Ragamuffin Gospel: Good News for the Bedraggled, Beat-Up, and Burnt Out", 2008)

"Mostly we rely on stories to put our ideas into context and give them meaning. It should be no surprise, then, that the human capacity for storytelling plays an important role in the intrinsically human-centered approach to problem solving, design thinking." (Tim Brown, "Change by Design: How Design Thinking Transforms Organizations and Inspires Innovation", 2009)

"The purpose of a storyteller is not to tell you how to think, but to give you questions to think upon." (Brandon Sanderson, "The Way of Kings", 2010)

"Visualizations act as a campfire around which we gather to tell stories." (Al Shalloway, 2011)

"The storytelling mind is allergic to uncertainty, randomness, and coincidence. It is addicted to meaning. If the storytelling mind cannot find meaningful patterns in the world, it will try to impose them. In short, the storytelling mind is a factory that churns out true stories when it can, but will manufacture lies when it can't." (Jonathan Gottschall, "The Storytelling Animal: How Stories Make Us Human", 2012)

"We are, as a species, addicted to story. Even when the body goes to sleep, the mind stays up all night, telling itself stories." (Jonathan Gottschall, "The Storytelling Animal", 2012)

"The fact of storytelling hints at a fundamental human unease, hints at human imperfection. Where there is perfection there is no story to tell." (Ben Okri, "A Way of Being Free", 2014)

"There is no such thing as a fact. There is only how you saw the fact, in a given moment. How you reported the fact. How your brain processed that fact. There is no extrication of the storyteller from the story." (Jodi Picoult, "Small Great Things", 2016)

"Stories can begin with a question or line of inquiry." (Kristen Sosulski, "Data Visualization Made Simple: Insights into Becoming Visual", 2018)

"Data storytelling provides a bridge between the worlds of logic and emotion. A data story offers a safe passage for your insights to travel around emotional pitfalls and through analytical resistance that typically impede facts." (Brent Dykes, "Effective Data Storytelling: How to Drive Change with Data, Narrative and Visuals", 2019)

27 December 2015

🪙Business Intelligence: Insight (Just the Quotes)

"Knowledge workers and BI experts must continually evaluate the reports, dashboards, alerts, and other mechanisms for disseminating factual information to ensure the design facilitates insight." (Cindi Howson, "Successful Business Intelligence: Secrets to making BI a killer App", 2008)

"In fact, the analogy to storytelling is limited when applied to communicating with data. Data visualization has fundamental characteristics missing from traditional storytelling. For example, interactive data visualizations let audiences explore information to find insights that resonate with them. Visualizations take shape based to a large extent on the underlying data. And as this data changes, the emphasis and message of the visualization is likely to change." (Zach Gemignani et al, "Data Fluency", 2014)

"[…] the better insights are communicated, the more likely it is that data leads to positive action (in this case, better business decisions)." (Bernard Marr, ​​​​​​​"Data Strategy", 2017)

"Data Lake induces accessibility and catalyzes availability. It warrants data discovery platforms to soak the data trends at a horizontal scale and produce visual insights. It largely cuts down the time that goes into data preparation and exhaustive data analysis." (Saurabh Gupta et al, "Practical Enterprise Data Lake Insights", 2018)

"Bad data are expensive: my best estimate is that it costs a typical company 20% of revenue. Worse, they dilute trust - who would trust an exciting new insight if it is based on poor data! And worse still, sometimes bad data are simply dangerous; look at the damage brought on by the financial crisis, which had its roots in bad data." (Rupa Mahanti, "Data Quality: Dimensions, Measurement, Strategy, Management, and Governance", 2019)

"Data storytelling provides a bridge between the worlds of logic and emotion. A data story offers a safe passage for your insights to travel around emotional pitfalls and through analytical resistance that typically impede facts." (Brent Dykes, "Effective Data Storytelling: How to Drive Change with Data, Narrative and Visuals", 2019)

"First, from an ethos perspective, the success of your data story will be shaped by your own credibility and the trustworthiness of your data. Second, because your data story is based on facts and figures, the logos appeal will be integral to your message. Third, as you weave the data into a convincing narrative, the pathos or emotional appeal makes your message more engaging. Fourth, having a visualized insight at the core of your message adds the telos appeal, as it sharpens the focus and purpose of your communication. Fifth, when you share a relevant data story with the right audience at the right time (kairos), your message can be a powerful catalyst for change." (Brent Dykes, "Effective Data Storytelling: How to Drive Change with Data, Narrative and Visuals", 2019)

"Ensure you build into your data literacy strategy learning on data quality. If the individuals who are using and working with data do not understand the purpose and need for data quality, we are not sitting in a strong position for great and powerful insight. What good will the insight be, if the data has no quality within the model?" (Jordan Morrow, "Be Data Literate: The data literacy skills everyone needs to succeed", 2021)

"In the same vein, data strategy is often a misnomer for a much wider scope of coverage, but the lack of coherence in how we use the language has led to data strategy being perceived to cover data management activities all the way through to exploitation of data in the broadest sense. The occasional use of information strategy, intelligence strategy or even data exploitation strategy may differentiate, but the lack of a common definition on what we mean tends to lead to data strategy being used as a catch-all for the more widespread coverage such a document would typically include. Much of this is due to the generic use of the term ‘data’ to cover everything from its capture, management, governance through to reporting, analytics and insight." (Ian Wallis, "Data Strategy: From definition to execution", 2021)

"Current decision-making in business suffers from insight gaps. Organizations invest in data and analytics, hoping that will provide them with insights that they can use to make decisions, but in reality, there are many challenges and obstacles that get in the way of that process. One of the biggest challenges is that these organizations tend to focus on technology and hard skills only. They are definitely important, but you will not automatically get insights and better decisions with hard skills alone. Using data to make better data-informed decisions requires not only hard skills but also soft skills as well as mindsets." (Angelika Klidas & Kevin Hanegan, "Data Literacy in Practice", 2022)

"Decision-makers are constantly provided data in the form of numbers or insights, or similar. The challenge is that we tend to believe every number or piece of data we hear, especially when it comes from a trusted source. However, even if the source is trusted and the data is correct, insights from the data are created when we put it in context and apply meaning to it. This means that we may have put incorrect meaning to the data and then made decisions based on that, which is not ideal. This is why anyone involved in the process needs to have the skills to think critically about the data, to try to understand the context, and to understand the complexity of the situation where the answer is not limited to just one specific thing. Critical thinking allows individuals to assess limitations of what was presented, as well as mitigate any cognitive bias that they may have." (Angelika Klidas & Kevin Hanegan, "Data Literacy in Practice", 2022)

"Gaining more insight into data, simplifying data access, enabling shopping-for-data, augmenting traditional data governance, generating active metadata, and accelerating development of products and services are enabled by infusing AI into the Data Fabric architecture. An AI-infused Data Fabric is not only leveraging AI but also likewise an architecture to manage and deal with AI artefacts, including AI models, pipelines, etc." (Eberhard Hechler et al, "Data Fabric and Data Mesh Approaches with AI", 2023)

"Establishing a comprehensive observability architecture necessitates a systematic approach that spans the entirety of the data pipeline, from initial telemetry collection to actionable insights accessible by diverse stakeholders. The core objective is to unify distributed data sources - metrics, logs, traces, and quality signals - into a coherent framework that enables rapid diagnosis, continuous monitoring, and strategic decision-making." (William Smith, "Soda Core for Modern Data Quality and Observability: The Complete Guide for Developers and Engineers", 2025)

"The lakehouse combines the best elements of data lakes and data warehouses for OLAP workloads. It merges the scalability and flexibility of data lakes with the management features and performance optimization of data warehouses. [...] A lakehouse eliminates the need for disjointed systems and provides a single, coherent platform for all forms of data analysis. Lakehouses enhance the performance of data queries and simplify data management, making it easier for organizations to derive insights from their data." (Denny Lee et al, "Delta Lake: The Definitive Guide", 2025)

"Viewing the dendrograms in high dimensions provides insight into how the algorithm has joined points to clusters. For example, single linkage often has edges leading to a single focal point, which might not yield a useful clustering but might help to 
identify outliers. If the edges point to multiple focal points, with long edges bridging gaps in the data, the result is more likely yielding a useful clustering." (Dianne Cook & Ursula Laa, "Interactively Exploring High-Dimensional Data and Models in R", 2026)
Related Posts Plugin for WordPress, Blogger...

About Me

My photo
Koeln, NRW, Germany
IT Professional with more than 25 years experience in IT in the area of full life-cycle of Web/Desktop/Database Applications Development, Software Engineering, Consultancy, Data Management, Data Quality, Data Migrations, Reporting, ERP implementations & support, Team/Project/IT Management, etc.