Showing posts with label BI. Show all posts
Showing posts with label BI. Show all posts

27 March 2025

🧭Business Intelligence: Perspectives (Part XXIX: Navigating into the Unknown)

Business Intelligence Series
Business Intelligence Series

One of the important challenges in Business Intelligence and the other related knowledge domains is that people try to oversell ideas, overstretching, shifting, mixing and bending the definition of concepts and their use to suit the sales pitch or other related purposes. Even if there are several methodologies built around data that attempt to provide a solid foundation on which organizations can build upon, terms like actionable, value, insight, quality or importance continue to be a matter of perception, interpretation, and quite often be misused. 

It's often challenging to define precisely such businesses concepts especially there are degrees of fuzziness that may apply to the different contexts that are associated with them. What makes a piece of signal, data, information or knowledge valuable, respectively actionable? What is the value, respectively values we associate with a piece or aggregation of information, insight or degree of quality? When do values, changes, variations and other aspects become important, respectively can be ignored? How much can one generalize or particularize certain aspects? And, many more such questions can be added to this line of inquiry. 

Just because an important value changed, no matter in what direction, it might mean nothing as long as the value moves in certain ranges, respectively other direct or indirect conditions are met or not. Sometimes, there are simple rules and models that can be used to identify the various areas that should trigger different responses, respectively actions, though even small variations can increase the overall complexity multifold. There seems to be certain comfort in numbers, even if the same numbers can mean different things to different people, at different points in time, respectively contexts.

In the pursuit to bridge the multitude of gaps and challenges, organization attempt to arrive at common definitions and understanding in what concerns the business terms, goals, objectives, metrics, rules, procedures, processes and other points of focus associated with the respective terms. Unfortunately, many such foundations barely support the edifices built as long as there’s no common mental models established!

Even if the use of shared models is not new, few organizations try to make the knowledge associated with them explicit, respectively agree on and evolve a set of mental models that reflect how the business works, what is important, respectively can be ignored, which are the dependent and independent aspects, etc. This effort can prove to be a challenge for many organizations, especially when several leaps of faith must be made in the process.

Independently on whether organizations use shared mental models, some kind of common ground must be achieved. It starts with open dialog, identifying the gaps, respectively the minimum volume of knowledge required for making progress in the right direction(s). The broader the gaps and the misalignment, the more iterations are needed to make progress! And, of course, one must know which are the destinations, what paths to follow, what to ignore, etc. 

It's important how we look at the business, and people tend to use different filters (aka glasses or hats) for this purpose. Simple relationships between the various facts are ideal, though uncommon. There’s a chain of causality that may trigger a certain change, though more likely one deals with a networked structure of cause-effect relationships. The world is more complex than we (can} imagine. We try to focus on the aspects we are aware of, respectively consider as important. However, in a complex world also small variations in certain areas can shift the overall weight to aspects outside of our focus, influence or area of responsibility. Quite often, what we don’t know is more important than what we know!

Previous Post <<||>> Next Post

10 March 2025

🧭Business Intelligence: Perspectives (Part XXVIII: Cutting through Complexity)

Business Intelligence Series
Business Intelligence Series

Independently of the complexity of the problems, one should start by framing the problem(s) correctly and this might take several steps and iterations until a foundation is achieved, upon which further steps can be based. Ideally, the framed problem should reflect reality and should provide a basis on which one can build something stable, durable and sustainable. Conversely, people want quick low-cost fixes and probably the easiest way to achieve this is by focusing on appearances, which are often confused with more.

In many data-related contexts, there’s the tendency to start with the "solution" in mind, typically one or more reports or visualizations which should serve as basis for further exploration. Often, the information exchange between the parties involved (requestor(s), respectively developer(s)) is kept to a minimum, though the formalized requirements barely qualify for the minimum required. The whole process starts with a gap that can create further changes as the development process progresses, with all the consequences deriving from this: the report doesn’t satisfy the needs, more iterations are needed - requirements’ reevaluation, redesign, redevelopment, retesting, etc.

The poor results are understandable, all parties start with a perspective based on facts which provide suboptimal views when compared with the minimum basis for making the right steps in the right direction. That’s not only valid for reports’ development but also for more complex endeavors – data models, data marts and warehouses, and other software products. Data professionals attempt to bridge the gaps by formalizing and validating the requirements, building mock-ups and prototypes, testing, though that’s more than many organizations can handle!

There are simple reports or data visualizations for which not having prior knowledge of the needed data sources, processes and the business rules has a minimal impact on the further steps of the processes involved in building the final product(s). However, "all generalizations are false" to some degree, and there’s a critical point after which minimal practices tend to create more waste than companies can afford. Consequently, applying the full extent of the processes can lead to waste when the steps aren’t imperative for the final product.

Even if one is aware of all the implications, one’s experience and the application of best practices doesn’t guarantee the quality of the results as long as some kind of novelty, unknown, fuzziness or complexity is involved. Novelty can appear in different ways – process, business rules, data or problem formulations, particularities that aren’t easily perceived or correctly understood. Each important minor piece of information can have an exponential impact under the wrong circumstances.

The unknown can encompass novelty, though can be also associated with the multitude of facts not explicitly and/or directly considered. "The devil is in details" and it’s so easy for important or minor facts to remain hidden under the veil of suppositions, expectations, respectively under the complex and fuzzy texture of logical aspects. Many processes operate under strict rules, though there are various consequences, facts or unnecessary information that tend to increase the overall complexity and fuzziness.

Predefined processes, procedures and practices can help cut and illuminate through this complex structure associated with the various requirements and aspects of problems. Plunging headfirst can be occasionally associated with the need to evaluate what is known and unknown from facts and data’s perspective, to identify the gaps and various factors that can weigh in the final solution. Unfortunately, too often it’s nothing of this!  

Besides the multitude of good/best practices and problem-solving approaches, all one has is his experience and intuition to cut through the overall complexity. False steps are inevitable for finding the approachable path(s) from the requirements to the solution.

08 March 2025

#️⃣Software Engineering: Programming (Part XVI: The Software Quality Perspective and AI)

Software Engineering Series
Software Engineering Series

Organizations tend to complain about poor software quality developed in-house, by consultancy companies or third parties, without doing much in this direction. Unfortunately, this agrees with the bigger picture reflected by the quality standards adopted by organizations - people talk and complain about them, though they aren’t that eager to include them in the various strategies, or even if they are considered, they are seldom enforced adequately!

Moreover, even if quality standards are adopted, and a lot of effort may be spent in this direction (as everybody has strong opinions and there are many exceptions), as projects progress, all the good intentions come to an end, the rules fading on the way either because are too strict, too general, aren’t adequately prioritized or communicated, or there’s no time to implement (all of) them. This applies in general to programming and to the domains that revolve around data – Business Intelligence, Data Analytics or Data Science.

The volume of good quality code and deliverables is not only a reflection of an organization’s maturity in dealing with best practices but also of its maturity in handling technical debt, Project Management, software and data quality challenges. All these aspects are strongly related to each other and therefore require a systemic approach rather than focusing on the issues locally. The systemic approach allows organizations to bridge the gaps between business areas, teams, projects and any other areas of focus.

There are many questionable studies on the effect of methodologies on software quality and data issues, proclaiming that one methodology is better than the other in addressing the multifold aspects of software quality. Besides methodologies, some studies attempt to correlate quality with organizations’ size, management or programmers’ experience, the size of software, or whatever characteristic might seem to affect quality.

Bad code is written independently of companies’ size or programmer's experience, management or organization’s maturity. Bad code doesn’t necessarily happen all at once, but it can depend on circumstances, repetitive team, requirements and code changes. There are decisions and actions that sooner or later can affect the overall outcome negatively.

Rewriting the code from scratch might look like an approachable measure though it’s seldom the cost-effective solution. Allocating resources for refactoring is usually a better approach, though this tends to increase considerably the cost of projects, and organizations might be tempted to face the risks, whatever they might be. Independently of the approaches used, sooner or later the complexity of projects, requirements or code tends to kick back.

There are many voices arguing that AI will help in addressing the problems of software development, quality assurance and probably other areas. It’s questionable how much AI will help to address the gaps, non-concordances and other mistakes in requirements, and how it will develop quality code when it has basic "understanding" issues. Even if step by step all current issues revolving around AI will be fixed, it will take time and multiple iterations until meaningful progress will be made.

At least for now, AI tools like Copilot or ChatGPT can be used for learning a programming language or framework through predefined or ad-hoc prompts. Probably, it can be used also to identify deviations from best practices or other norms in scope. This doesn’t mean that AI will replace for now code reviews, testing and other practices used in assuring the quality of software, but it can be used as an additional method to check for what was eventually missed in the other methods.

AI may also have hidden gems that when discovered, polished and sized, may have a qualitative impact on software development and software. Only time will tell what’s possible and achievable.

21 February 2025

🧩IT: Idioms, Sayings, Proverbs and Other Words of Wisdom

In IT setups one can hear many idioms, sayings and other type of words of wisdom that make the audience smile, even if some words seem to rub salt in the wounds. These are some of the idioms met in IT meetings or literature. Frankly, it's worth to write more about each of them, and this it the purpose of the "project". 

"A bad excuse is better than none"

"A bird in the hand is worth two in the bush": a working solution is worth more than hypothetically better solutions. 

"A drowning man will clutch at a straw": a drowning organization will clutch to the latest hope

"A friend in need (is a friend indeed)": 

"A journey of a thousand miles begins with a single step"

"A little learning is a dangerous thing"

"A nail keeps a shoe, a shoe a horse, a horse a man, a man a castle" (cca 1610): A nail keeps the shoe

"A picture is worth a thousand words"

"A stitch in time (saves nine)"

"Actions speak louder than words"

"All good things must come to an end"

"All generalizations are false" [attributed to Mark Twain, Alexandre Dumas (Père)]: Cutting though Complexity

"All roads lead to Rome"

"All is well that ends well"

"An ounce of prevention is worth a pound of cure"

"Another day, another dollar"

"As you sow so shall you reap"

"Beauty is in the eye of the beholder"

"Better late than never": SQL Server and Excel Data

"Better safe than sorry": Deleting obsolete companies

"Big fish eat little fish"

"Better the Devil you know (than the Devil you do not)": 

"Calm seas never made a good sailor"

"Count your blessings"

"Dead men tell no tales"

"Do not bite the hand that feeds you"

"Do not change horses in midstream"

"Do not count your chickens before they are hatched"

"Do not cross the bridge till you come to it"

"Do not judge a book by its cover"

"Do not meet troubles half-way"

"Do not put all your eggs in one basket"

"Do not put the cart before the horse"

"Do not try to rush things; ignore matters of minor advantage" (Confucius): A tale of two cities II

"Do not try to walk before you can crawl"

"Doubt is the beginning, not the end, of wisdom"

"Easier said than done"

"Every cloud has a silver lining"

"Every little bit helps"

"Every picture tells a story"

"Failing to plan is planning to fail"Planning correctly misunderstood...

"Faith will move mountains"

"Fake it till you make it"

"Fight fire with fire"

"First impressions are the most lasting"

"First things first": Ways of looking at data

"Fish always rots from the head downwards"

"Fools rush in (where angels fear to tread)" (Alexander Pope, "An Essay on Criticism", cca. 1711): A tale of two cities II

"Half a loaf is better than no bread"

"Haste makes waste"

"History repeats itself"

"Hope for the best, and prepare for the worst"

"If anything can go wrong, it will" (Murphy's law)

"If it ain't broke, don't fix it.": Approaching a query

"If you play with fire, you will get burned"

"If you want a thing done well, do it yourself"

"Ignorance is bliss"

"Imitation is the sincerest form of flattery"

"It ain't over till/until it's over"

"It is a small world"

"It is better to light a candle than curse the darkness"

"It is never too late": A look backAll-knowing developers are back...

"It's a bad plan that admits of no modification." (Publilius Syrus)Planning Correctly Misunderstood I

"It’s not an adventure until something goes wrong." (Yvon Chouinard)Documentation - Lessons learned

"It is not enough to learn how to ride, you must also learn how to fall"

"It takes a whole village to raise a child"

"It will come back and haunt you"

"Judge not, that ye be not judged"

"Kill two birds with one stone"

"Knowledge is power, guard it well"

"Learn a language, and you will avoid a war" (Arab proverb)

"Less is more"

"Life is what you make it"

"Many hands make light work"

"Moderation in all things"

"Money talks"

"More haste, less speed"

"Necessity is the mother of invention"

"Never judge a book by its cover"

"Never say never"

"Never too old to learn"

"No man can serve two masters"

"No pain, no gain"

"No plan ever survived contact with the enemy.' (Carl von Clausewitz)Planning Correctly Misunderstood I

"Oil and water do not mix"

"One man's trash is another man's treasure"

"One swallow does not make a summer"

"Only time will tell": The Software Quality Perspective and AI, Microsoft FabricIt’s all about Partnership IIAccess vs. LightSwitch

"Patience is a virtue"

"Practice makes perfect"

"Practice what you preach"

"Prevention is better than cure"

"Rules were made to be broken"

"Seek and ye shall find"

"Some are more equal than others" (George Orwell, "Animal Farm")

"Spoken words fly away, written words remain." ["Verba volant, scripta manent"]: Documentation - Lessons learned

"Strike while the iron is hot"

"Technology is dead": Dashboards Are Dead & Other Crapprogramming is dead

"The best defense is a good offense"

"The bigger they are, the harder they fall"

"The devil is in the detail": Copilot Stories Part IV, Cutting through ComplexityMore on SQL DatabasesThe Analytics MarathonThe Choice of Tools in PM, Who Messed with My Data?

"The die is cast"

"The exception which proves the rule"

"The longest journey starts with a single step"

"The pursuit of perfection is a fool's errand"

"There are two sides to every question"

"There is no smoke without fire"

"There's more than one way to skin a cat" (cca. 1600s)

"There is no I in team"

"There is safety in numbers"

"Those who do not learn from history are doomed to repeat it" (George Santayana)

"Time is money"

"To learn a language is to have one more window from which to look at the world" (Chinese proverb)[5

"Too little, too late"

"Too much of a good thing"

"Truth is stranger than fiction"

"Two birds with one stone": Deleting sequential data...

"Two heads are better than one": Pair programming

"Two wrongs (do not) make a right"

"United we stand, divided we fall"

"Use it or lose it"

"Unity is strength"

"Variety is the spice of life." (William Cowper)

"Virtue is its own reward"

"Well begun is half done"

"What does not kill me makes me stronger"

"Well done is better than well said"

"What cannot be cured must be endured"

"What goes around, comes around"

"When life gives you lemons, make lemonade"

"When the cat is away, the mice will play"

"When the going gets tough, the tough get going"

"Where there is a will there is a way"

"With great power comes great responsibility"

"Work expands so as to fill the time available"

"You are never too old to learn": All-Knowing Developers are Back in Demand?

"You can lead a horse to water, but you cannot make it drink"

"You cannot make an omelet without breaking eggs"

"(You cannot) teach an old dog new tricks"

"You must believe and not doubt at all": Believe and not doubt

"Zeal without knowledge is fire without light"

Previous Post <<||>> Next Post

References:
[1] Wikipedia (2024) List of proverbial phrases [link]

15 February 2025

🧭Business Intelligence: Perspectives (Part XXVII: A Tale of Two Cities II)

Business Intelligence Series
Business Intelligence Series
There’s a saying that applies to many contexts ranging from software engineering to data analysis and visualization related solutions: "fools rush in where angels fear to tread" [1]. Much earlier, an adage attributed to Confucius provides a similar perspective: "do not try to rush things; ignore matters of minor advantage". Ignoring these advices, there's the drive in rapid prototyping to jump in with both feet forward without checking first how solid the ground is, often even without having adequate experience in the field. That’s understandable to some degree – people want to see progress and value fast, without building a foundation or getting an understanding of what’s happening, respectively possible, often ignoring the full extent of the problems.

A prototype helps to bring the requirements closer to what’s intended to achieve, though, as the practice often shows, the gap between the initial steps and the final solutions require many iterations, sometimes even too many for making a solution cost-effective. There’s almost always a tradeoff between costs and quality, respectively time and scope. Sooner or later, one must compromise somewhere in between even if the solution is not optimal. The fuzzier the requirements and what’s achievable with a set of data, the harder it gets to find the sweet spot.

Even if people understand the steps, constraints and further aspects of a process relatively easily, making sense of the data generated by it, respectively using the respective data to optimize the process can take a considerable effort. There’s a chain of tradeoffs and constraints that apply to a certain situation in each context, that makes it challenging to always find optimal solutions. Moreover, optimal local solutions don’t necessarily provide the optimum effect when one looks at the broader context of the problems. Further on, even if one brought a process under control, it doesn’t necessarily mean that the process works efficiently.

This is the broader context in which data analysis and visualization topics need to be placed to build useful solutions, to make a sensible difference in one’s job. Especially when the data and processes look numb, one needs to find the perspectives that lead to useful information, respectively knowledge. It’s not realistic to expect to find new insight in any set of data. As experience often proves, insight is rarer than finding gold nuggets. Probably, the most important aspect in gold mining is to know where to look, though it also requires luck, research, the proper use of tools, effort, and probably much more.

One of the problems in working with data is that usually data is analyzed and visualized in aggregates at different levels, often without identifying and depicting the factors that determine why data take certain shapes. Even if a well-suited set of dimensions is defined for data analysis, data are usually still considered in aggregate. Having the possibility to change between aggregates and details is quintessential for data’s understanding, or at least for getting an understanding of what's happening in the various processes. 

There is one aspect of data modeling, respectively analysis and visualization that’s typically ignored in BI initiatives – process-wise there is usually data which is not available and approximating the respective values to some degree is often far from the optimal solution. Of course, there’s often a tradeoff between effort and value, though the actual value can be quantified only when gathering enough data for a thorough first analysis. It may also happen that the only benefit is getting a deeper understanding of certain aspects of the processes, respectively business. Occasionally, this price may look high, though searching for cost-effective solutions is part of the job!

Previous Post  <<||>>  Next Post

References:
[1] Alexander Pope (cca. 1711) An Essay on Criticism

04 February 2025

🧭Business Intelligence: Perspectives (Part XXVI: Monitoring - A Cockpit View)

Business Intelligence Series
Business Intelligence Series

The monitoring of business imperatives is sometimes compared metaphorically with piloting an airplane, where pilots look at the cockpit instruments to verify whether everything is under control and the flight ensues according to the expectations. The use of a cockpit is supported by the fact that an airplane is an almost "closed" system in which the components were developed under strict requirements and tested thoroughly under specific technical conditions. Many instruments were engineered and evolved over decades to operate as such. The processes are standardized, inputs and outputs are under strict control, otherwise the whole edifice would crumble under its own complexity. 

In organizational setups, a similar approach is attempted for monitoring the most important aspects of a business. A few dashboards and reports are thus built to monitor and control what’s happening in the areas which were identified as critical for the organization. The various gauges and other visuals were designed to provide similar perspectives as the ones provided by an airplane’s cockpit. At first sight the cockpit metaphor makes sense, though at careful analysis, there are major differences. 

Probably, the main difference is that businesses don’t necessarily have standardized processes that were brought under control (and thus have variation). Secondly, the data used doesn’t necessarily have the needed quality and occasionally isn’t fit for use in the business processes, including supporting processes like reporting or decision making. Thirdly, are high the chances that the monitoring within the BI infrastructures doesn’t address the critical aspects of the business, at least not at the needed level of focus, detail or frequency. The interplay between these three main aspects can lead to complex issues and a muddy ground for a business to build a stable edifice upon. 

The comparison with airplanes’ cockpit was chosen because the number of instruments available for monitoring is somewhat comparable with the number of visuals existing in an organization. In contrast, autos have a smaller number of controls simple enough to help the one(s) sitting in the cockpit. A car’s monitoring capabilities can probably reflect the needs of single departments or teams, though each unit needs its own gauges with specific business focus. The parallel is however limited because the areas of focus in organizations can change and shift in other directions, some topics may have a periodic character while others can regain momentum after a long time. 

There are further important aspects. At high level, the expectation is for software products and processes, including the ones related to BI topics, to have the same stability and quality as the mass production of automobiles, airplanes or other artifacts that have similar complexity and manufacturing characteristics. Even if the design process of software and manufacturing may share many characteristics, the similar aspects diverge as soon as the production processes start, respectively progress, and these are the areas where the most differences lie. Starting from the requirements and ending with the overall goals, everything resembles the characteristics of quick shifting sands on which is challenging to build any stabile edifice.

At micro level in manufacturing each piece was carefully designed and produced according to a set of characteristics that were proved to work. Everything must fit perfectly in the grand design and there are many tests and steps to make sure that happens. To some degree the same is attempted when building software products, though the processes break along the way with the many changes attempted, with the many cost, time and quality constraints. At some point the overall complexity kicks back; it might be still manageable though the overall effort is higher than what organizations bargained for. 

26 January 2025

🧭Business Intelligence: Perspectives (Part XXV: Grounding the Roots)

Business Intelligence Series
Business Intelligence Series

When building something that is supposed to last, one needs a solid foundation on which the artifact can be built upon. That’s valid for castles, houses, IT architectures, and probably most important, for BI infrastructures. There are so many tools out there that allow building a dashboard, report or other types of BI artifacts with a few drag-and-drops, moving things around, adding formatting and shiny things. In many cases all these steps are followed to create a prototype for a set of ideas or more formalized requirements keeping the overall process to a minimum. 

Rapid prototyping, the process of building a proof-of-concept by focusing at high level on the most important design and functional aspects, is helpful and sometimes a mandatory step in eliciting and addressing the requirements properly. It provides a fast road from an idea to the actual concept, however the prototype, still in its early stages, can rapidly become the actual solution that unfortunately continues to haunt the dreams of its creator(s). 

Especially in the BI area, there are many solutions that started as a prototype and gained mass until they start to disturb many things around them with implications for security, performance, data quality, and many other aspects. Moreover, the mass becomes in time critical, to the degree that it pulled more attention and effort than intended, with positive and negative impact altogether. It’s like building an artificial sun that suddenly becomes a danger for the nearby planet(s) and other celestial bodies. 

When building such artifacts, it’s important to define what goals the end-result must or would be nice to have, differentiating clearly between them, respectively when is the time to stop and properly address the aspects mandatory in transitioning from the prototype to an actual solution that addresses the best practices in scope. It’s also the point when one should decide upon solution’s feasibility, needed quality acceptance criteria, and broader aspects like supporting processes, human resources, data, and the various aspects that have impact. Unfortunately, many solutions gain inertia without the proper foundation and in extremis succumb under the various forces.

Developing software artifacts of any type is a balancing act between all these aspects, often under suboptimal circumstances. Therefore, one must be able to set priorities right, react and change direction (and gear) according to the changing context. Many wish all this to be a straight sequential road, when in reality it looks more like mountain climbing, with many peaks, valleys and change of scenery. The more exploration is needed, the slower the progress.

All these aspects require additional time, effort, resources and planning, which can easily increase the overall complexity of projects to the degree that it leads to (exponential) effort and more important - waste. Moreover, the complexity pushes back, leading to more effort, and with it to higher costs. On top of this one has the iteration character of BI topics, multiple iterations being needed from the initial concept to the final solution(s), sometimes many steps being discarded in the process, corners are cut, with all the further implications following from this. 

Somewhere in the middle, between minimum and the broad overextending complexity, is the sweet spot that drives the most impact with a minimum of effort. For some organizations, respectively professionals, reaching and remaining in the zone will be quite a challenge, though that’s not impossible. It’s important to be aware of all the aspects that drive and sustain the quality of artefacts, data and processes. There’s a lot to learn from successful as well from failed endeavors, and the various aspects should be reflected in the lessons learned. 

24 January 2025

🧭Business Intelligence: Perspectives (Part XXIV: Building Castles in the Air)

Business Intelligence Series
Business Intelligence Series

Business users have mainly three means of visualizing data – reports, dashboards and more recently notebooks, the latter being a mix between reports and dashboards. Given that all three types of display can be a mix of tabular representations and visuals/visualizations, the difference between them is often neglectable to the degree that the terms are used interchangeably. 

For example, in Power BI a report is a "multi-perspective view into a single semantic model, with visualizations that represent different findings and insights from that semantic model" [1], while a dashboard is "a single page, often called a canvas, that uses visualizations to tell a story" [1], a dashboards’ visuals coming from one or more reports [2]. Despite this clear delimitation, the two concepts continue to be mixed and misused in conversations even by data-related professionals. This happens also because in other tools the vendors designate as dashboard what is called report in Power BI. 

Given the limited terminology, it’s easy to generalize that dashboards are useless, poorly designed, bad for business users, and so on. As Stephen Few recognized almost two decades ago, "most dashboards fail to communicate efficiently and effectively, not because of inadequate technology (at least not primarily), but because of poorly designed implementations" [3]. Therefore, when people say that "dashboards are bad" refer to the result of poorly implementations, of what some of them were part of, which frankly is a different topic! Unfortunately, BI implementations reflect probably more than any other areas how easy is to fail!

Frankly, here it is not necessarily the poor implementation of a project management methodology at fault, which quite often happens, but the way requirements are defined, understood, documented and implemented. Even if these last aspects are part of the methodologies, they are merely a reflection of how people understand the business. The outcomes of BI implementations are rooted in other areas, and it starts with how the strategic goals and objectives are defined, how the elements that need oversight are considered in the broader perspectives. The dashboards become thus the end-result of a chain of failures, failing to build the business-related fundament on which the reporting infrastructure should be based upon. It’s so easy to shift the blame on what’s perceptible than on what’s missing!

Many dashboards are built because people need a sense of what’s happening in the business. It starts with some ideas based on the problems identified in organizations, one or more dashboards are built, and sometimes a lot of time is invested in the process. Then, some important progress is made, and all comes to a stale if the numbers don’t reveal something new, important, or whatever users’ perception is. Some might regard this as failure, though as long as the initial objectives were met, something was learned in the process and a difference was made, one can’t equate this with failure!

It’s more important to recognize the temporary character of dashboards, respectively of the requirements that lead to them and build around them. Of course, this requires occasionally a different approach to the whole topic. It starts with how KPIs and other business are defined and organized, respectively on how data repositories are built, and it ends with how data are visualized and reported.

As the practice often revealed, it’s possible to build castles in the air, without a solid foundation, though the expectation for such edifices to sustain the weight of businesses is unrealistic. Such edifices break with the first strong storm and unfortunately it's easier to blame a set of tools, some people or a whole department instead at looking critically at the whole organization!


References:
[1] Microsoft Learn (2024) Power BI: Glossary [link]
[2] Microsoft Learn (2024) Power BI: Dashboards for business users of the Power BI service [link
[3] Stephen Few, "Information Dashboard Design", 2006

15 January 2025

🧭Business Intelligence: Perspectives (Part XXIII: In between the Many Destinations)

Business Intelligence Series
Business Intelligence Series

In too many cases the development of queries, respectively reports or data visualizations (aka artifacts) becomes a succession of drag & drops, formatting, (re)ordering things around, a bit of makeup, configuring a set of parameters, and the desired product is good to go! There seems nothing wrong with this approach as long as the outcomes meet users’ requirements, though it also gives the impression that’s all what the process is about. 

Given a set of data entities, usually there are at least as many perspectives into the data as entities’ number. Further perspectives can be found in exceptions and gaps in data, process variations, and the further aspects that can influence an artifact’s logic. All these aspects increase the overall complexity of the artifact, respectively of the development process. One guideline in handling all this is to keep the process in focus, and this starts with requirements’ elicitation and ends with the quality assurance and actual use.

Sometimes, the two words, the processes and their projection into the data and (data) models don’t reflect the reality adequately and one needs to compromise, at least until the gaps are better addressed. Process redesign, data harmonization and further steps need to be upon case considered in multiple iterations that should converge to optimal solutions, at least in theory. 

Therefore, in the development process there should be a continuous exploration of the various aspects until an optimum solution is reached. Often, there can be a couple of competing forces that can pull the solution in two or more directions  and then compromising is necessary. Especially as part of continuous improvement initiatives there’s the tendency of optimizing locally processes in the detriment of the overall process, with all the consequences resulting from this. 

Unfortunately, many of the problems existing in organizations are ill-posed and misunderstood to the degree that in extremis more effort is wasted than the actual benefits. Optimization is a process of putting in balance all the important aspects, respectively of answering with agility to the changing nature of the business and environment. Ignoring the changing nature of the problems and their contexts is a recipe for failure on the long term. 

This implies that people in particular and organizations in general need to become and  remain aware of the micro and macro changes occurring in organizations. Continuous learning is the key to cope with change. Organizations must learn to compromise and focus on what’s important, achievable and/or probable. Identifying, defining and following the value should be in an organization’s ADN. It also requires pragmatism (as opposed to idealism). Upon case, it may even require to say “no”, at least until the changes in the landscape offer a reevaluation of the various aspects.

One requires a lot from organizations when addressing optimization topics, especially when misalignment or important constraints or challenges may exist. Unfortunately, process related problems don’t always admit linear solutions. The nonlinear aspects are reflected especially when changing the scale, perspective or translating the issues or solutions from one are area to another.

There are probably answers available in the afferent literature or in the approaches followed by other organizations. Reinventing the wheel is part of the game, though invention may require explorations outside of the optimal paths. Conversely, an organization that knows itself has more chances to cope with the challenges and opportunities altogether. 

A lot from what organizations do in a consistent manner looks occasionally like inertia, self-occupation, suboptimal or random behavior, in opposition to being self-driven, self-aware, or in self-control. It’s also true that these are ideal qualities or aspects of what organizations should become in time. 

14 January 2025

🧭Business Intelligence: Perspectives (Part XXII: Breaking Queries' Complexity)

Business Intelligence Series
Business Intelligence Series

Independently whether standalone or encapsulated in database objects, the queries written can become complex in time, respectively difficult to comprehend and maintain. One can reduce the cognitive load by identifying the aspects that enable one’s intuition - order, similarity and origin being probably the most important aspects that help coping with the inherent complexity. 

One should start with the table having the lowest level of detail, usually a transaction table that forms the backbone of a certain feature. For example, for Purchase Orders this could be upon case the distribution or line level. If Invoices are added to the logic, and there could be multiple invoice line for a record from the former logic, then this becomes the new level of detail. Further on, if General Ledger lines are added, more likely this becomes the lowest level of detail, and so on.

This approach allows to keep a consistent way of building the queries while enabling to validate the record count, making sure that no duplicates are added to the logic. Conversely, one can start also from the table with the lowest level of details, and add tables successively based on their level of detail, though the database engine may generate upon case a suboptimal plan. In addition, checking the cardinality may involve writing a second query. 

One should try to keep the tables belonging to the same entity together, when possible (e.g. Purchase Order vs. Vendor information). This approach allows to reduce the volume of work required to manage, review, test and understand the query later. If the blocks are too big, then occasionally it makes sense to bring pieces of logic into CTEs (Common Table Expressions), or much better into views that allow to better encapsulate and partition the logic.

CTEs allow to split the logic into logical steps, allowing occasionally to troubleshoot the logic on pieces though one should keep a balance between maintainability and detail. In extremis, people may create unnecessarily an additional CTE for each join. The longer and the more fragmented a query, the more difficult it becomes to troubleshoot and even understand. Readability can be better achieved though indentation, by keeping things together that belong together, respectively partitioning the logic in logical blocks that derive from the model. 

Keeping things together should be applied also to the other elements. For example, the join constraints should be limited only to the fields participating in the join (and, if possible, all the other constraints should be brought in the WHERE clause). Bringing the join constraints in the WHERE clause, an approach used in the past, decreases query readability no matter how well the WHERE clause is structured, and occasionally can lead to constraints’ duplication or worse, to missing pieces of logic. 

The order of the attributes should follow a logic that makes sense. One approach is to start from the tables with lowest cardinality that identify entities uniquely and move to the attributes that have a higher level of detail. However, not all attributes are equally important, and thus one might have to compromise and create multiple groups of data coming from different levels. 

One should keep in mind that the more random the order of the attributes is, the more difficult it becomes to validate the data as one needs to jump multiple times either in the query or in the mask. Ideally one should find a balance between the two perspectives. Having an intuitive logic of how the attributes are listed increases queries’ readability, maintainability and troubleshooting. The more random attributes’ listing, the higher the effort for the same. 

Previous Post  <<||>>   Next Post

18 December 2024

🧭🏭Business Intelligence: Microsoft Fabric (Part VII: Data Stores Comparison)

Business Intelligence Series
Business Intelligence Series

Microsoft made available a reference guide for the data stores supported for Microsoft Fabric workloads [1], including the new Fabric SQL database (see previous post). Here's the consolidated table followed by a few aspects to consider: 

Area Lakehouse Warehouse Eventhouse Fabric SQL database Power BI Datamart
Data volume Unlimited Unlimited Unlimited 4 TB Up to 100 GB
Type of data Unstructured, semi-structured, structured Structured, semi-structured (JSON) Unstructured, semi-structured, structured Structured, semi-structured, unstructured Structured
Primary developer persona Data engineer, data scientist Data warehouse developer, data architect, data engineer, database developer App developer, data scientist, data engineer AI developer, App developer, database developer, DB admin Data scientist, data analyst
Primary dev skill Spark (Scala, PySpark, Spark SQL, R) SQL No code, KQL, SQL SQL No code, SQL
Data organized by Folders and files, databases, and tables Databases, schemas, and tables Databases, schemas, and tables Databases, schemas, tables Database, tables, queries
Read operations Spark, T-SQL T-SQL, Spark* KQL, T-SQL, Spark T-SQL Spark, T-SQL
Write operations Spark (Scala, PySpark, Spark SQL, R) T-SQL KQL, Spark, connector ecosystem T-SQL Dataflows, T-SQL
Multi-table transactions No Yes Yes, for multi-table ingestion Yes, full ACID compliance No
Primary development interface Spark notebooks, Spark job definitions SQL scripts KQL Queryset, KQL Database SQL scripts Power BI
Security RLS, CLS**, table level (T-SQL), none for Spark Object level, RLS, CLS, DDL/DML, dynamic data masking RLS Object level, RLS, CLS, DDL/DML, dynamic data masking Built-in RLS editor
Access data via shortcuts Yes Yes Yes Yes No
Can be a source for shortcuts Yes (files and tables) Yes (tables) Yes Yes (tables) No
Query across items Yes Yes Yes Yes No
Advanced analytics Interface for large-scale data processing, built-in data parallelism, and fault tolerance Interface for large-scale data processing, built-in data parallelism, and fault tolerance Time Series native elements, full geo-spatial and query capabilities T-SQL analytical capabilities, data replicated to delta parquet in OneLake for analytics Interface for data processing with automated performance tuning
Advanced formatting support Tables defined using PARQUET, CSV, AVRO, JSON, and any Apache Hive compatible file format Tables defined using PARQUET, CSV, AVRO, JSON, and any Apache Hive compatible file format Full indexing for free text and semi-structured data like JSON Table support for OLTP, JSON, vector, graph, XML, spatial, key-value Tables defined using PARQUET, CSV, AVRO, JSON, and any Apache Hive compatible file format
Ingestion latency Available instantly for querying Available instantly for querying Queued ingestion, streaming ingestion has a couple of seconds latency Available instantly for querying Available instantly for querying

It can be used as a map for what is needed to know for using each feature, respectively to identify how one can use the previous experience, and here I'm referring to the many SQL developers. One must consider also the capabilities and limitations of each storage repository.

However, what I'm missing is some references regarding the performance for data access, especially compared with on-premise workloads. Moreover, the devil hides in details, therefore one must test thoroughly before committing to any of the above choices. For the newest overview please check the referenced documentation!

For lakehouses, the hardest limitation is the lack of multi-table transactions, though that's understandable given its scope. However, probably the most important aspect is whether it can scale with the volume of reads/writes as currently the SQL endpoint seems to lag. 

The warehouse seems to be more versatile, though careful attention needs to be given to its design. 

The Eventhouse opens the door to a wide range of time-based scenarios, though it will be interesting how developers cope with its lack of functionality in some areas. 

Fabric SQL databases are a new addition, and hopefully they'll allow considering a wide range of OLTP scenarios. Starting with 28th of March 2025, SQL databases will be ON by default and tenant admins must manually turn them OFF before the respective date [3].

Power BI datamarts have been in preview for a couple of years.


References:
[1] Microsoft Fabric (2024) Microsoft Fabric decision guide: choose a data store [link]
[2] Reitse's blog (2024) Testing Microsoft Fabric Capacity: Data Warehouse vs Lakehouse Performance [link]
[3] Microsoft Fabric Update Blog (2025) Extending flexibility: default checkbox changes on tenant settings for SQL database in Fabric [link]
[4] Microsoft Fabric Update Blog (2025) Enhancing SQL database in Fabric: share your feedback and shape the future [link]
[5] Microsoft Fabric Update Blog (2025) Why SQL database in Fabric is the best choice for low-code/no-code Developers [link

14 December 2024

🧭💹Business Intelligence: Perspectives (Part XXI: Data Visualization Revised)

Data Visualization Series
Data Visualization Series

Creating data visualizations nowadays became so easy that anybody can do it with a minimum of effort and knowledge, which on one side is great for the creators but can be easily become a nightmare for the readers, respectively users. Just dumping data in visuals can be barely called data visualization, even if the result is considered as such. The problems of visualization are multiple – the lack of data culture, the lack of understanding processes, data and their characteristics, the lack of being able to define and model problems, the lack of educating the users, the lack of managing the expectations, etc.

There are many books on data visualization though they seem an expensive commodity for the ones who want rapid enlightenment, and often the illusion of knowing proves maybe to be a barrier. It's also true that many sets of data are so dull, that the lack of information and meaning is compensated by adding elements that give a kitsch look-and-feel (aka chartjunk), shifting the attention from the valuable elements to decorations. So, how do we overcome the various challenges? 

Probably, the most important step when visualizing data is to define the primary purpose of the end product. Is it to inform, to summarize or to navigate the data, to provide different perspectives at macro and micro level, to help discovery, to explore, to sharpen the questions, to make people think, respectively understand, to carry a message, to be artistic or represent truthfully the reality, or maybe is just a filler or point of attraction in a textual content?

Clarifying the initial purpose is important because it makes upfront the motives and expectations explicit, allowing to determine the further requirements, characteristics, and set maybe some limits in what concern the time spent and the qualitative and/or qualitative criteria upon which the end result should be eventually evaluated. Narrowing down such aspects helps in planning and the further steps performed. 

Many of the steps are repetitive and past experience can help reduce the overall effort. Therefore, professionals in the field, driven by intuition and experience probably don't always need to go through the full extent of the process. Conversely, what is learned and done poorly, has high chances of delivering poor quality. 

A visualization can be considered as effective when it serves the intended purpose(s), when it reveals with minimal effort the patterns, issues or facts hidden in the data, when it allows people to explore the data, ask questions and find answers altogether. One can talk also about efficiency, especially when readers can see at a glance the many aspects encoded in the visualization. However, the more the discovery process is dependent on data navigation via filters or other techniques, the more difficult it becomes to talk about efficiency.

Better criteria to judge visualizations is whether they are meaningful and useful for the readers, whether the readers understood the authors' intent, the further intrinsic implication, though multiple characteristics can be associated with these criteria: clarity, specificity, correctedness, truthfulness, appropriateness, simplicity, etc. All these are important in lower or higher degree depending on the broader context of the visualization.

All these must be weighted in the bigger picture when creating visualizations, though there are probably also exceptions, especially on the artistic side, where artists can cut corners for creating an artistic effect, though also in here the authors need to be truthful to the data and make sure that their work don't distort excessively the facts. Failing to do so might not have an important impact on the short term considerably, though in time the effects can ripple with unexpected effects.


13 December 2024

🧭💹Business Intelligence: Perspectives (Part XX: From BI to AI)

Business Intelligence Series

No matter how good data visualizations, reports or other forms of BI artifacts are, they only serve a set of purposes for a limited amount of time, limited audience or any other factors that influence their lifespan. Sooner or later the artifacts become thus obsolete, being eventually disabled, archived and/or removed from the infrastructure. 

Many artifacts require a considerable number of resources for their creation and maintenance over time. Sometimes the costs can be considerably higher than the benefits brought, especially when the data or the infrastructure are used for a narrow scope, though there can be other components that need to be considered in the bigger picture. Having a report or visualization one can use when needed can have an important impact on the business in correcting issues, sizing opportunities or filling the knowledge gaps. 

Even if it’s challenging to quantify the costs associated with the loss of opportunities rooted in the lack of data, respectively information, the amounts can be considerable high, greater even than building a whole BI infrastructure. Organization’s agility in addressing the important gaps can make a considerable difference, at least in theory. Having the resources that can be pulled on demand can give organizations the needed competitive boost. Internal or external resources can be used altogether, though, pragmatically speaking, there will be always a gap between demand and supply of knowledgeable resources.

The gap in BI artefacts can be addressed nowadays by AI-driven tools, which have the theoretical potential of shortening the gap between needs and the availability of solutions, respectively a set of answers that can be used in the process. Of course, the processes of sense-making and discovery are not that simple as we’d like, though it’s a considerable step forward. 

Having the possibility of asking questions in natural language and guiding the exploration process to create visualizations and other artifacts using prompt engineering and other AI-enabled methods offers new possibilities and opportunities that at least some organizations started exploring already. This however presumes the existence of an infrastructure on which the needed foundation can be built upon, the knowledge required to bridge the gap, respectively the resources required in the process. 

It must be stressed out that the exploration processes may bring no sensible benefits, at least no immediately, and the whole process depends on organizations’ capabilities of identifying and sizing the respective opportunities. Therefore, even if there are recipes for success, each organization must identify what matters and how to use technologies and the available infrastructure to bridge the gap.

Ideally to make progress organizations need besides the financial resources the required skillset, a set of projects that support learning and value creation, respectively the design and execution of a business strategy that addresses the steps ahead. Each of these aspects implies risks and opportunities altogether. It will be a test of maturity for many organizations. It will be interesting to see how many organizations can handle the challenge, respectively how much past successes or failures will weigh in the balance. 

AI offers a set of capabilities and opportunities, however the chance of exploring and failing fast is of great importance. AI is an enabler and not a magic wand, no matter what is preached in technical workshops! Even if progress follows an exponential trajectory, it took us more than half of century from the first steps until now and probably many challenges must be still overcome. 

The future looks interesting enough to be pursued, though are organizations capable to size the opportunities, respectively to overcome the challenges ahead? Are organizations capable of supporting the effort without neglecting the other priorities? 

12 December 2024

🧭💹Business Intelligence: Perspectives (Part XIX: Data Visualization between Art, Pragmatism and Kitsch)

Business Intelligence Series

The data visualizations (aka dataviz) presented in the media, especially the ones coming from graphical artists, have the power to help us develop what is called graphical intelligence, graphical culture, graphical sense, etc., though without a tutor-like experience the process is suboptimal because it depends on our ability of identifying what is important and which are the steps needed for decoding and interpreting such work, respectively for integrating their messages in our overall understanding about the world.

When such skillset is lacking, without explicit annotations or other form of support, the reader might misinterpret or fail to observe important visual cues even for simple visualizations, with all the implications deriving from this – a false understanding, and further aspects deriving from it, this being probably the most important aspect to consider. Unfortunately, even the most elaborate work can fail if the reader doesn’t have a basic understanding of all that’s implied in the process.

The books of Willard Brinton, Ana Rogers, Jacques Bertin, William Cleveland, Leland Wilkinson, Stephen Few, Albert Cairo, Soctt Berinato and many others can help the readers build a general understanding of the dataviz process and how data visualizations or simple graphics can be used/misused effectively, though each reader must follow his/her own journey. It’s also true that the basics can be easily learned, though the deeper one dives, the more interesting and nontrivial the journey becomes. Fortunately, the average reader can stick to the basics and many visualizations are simple enough to be understood.

To grasp the full extent of the implications, one can make comparisons with the domain of poetry where the author uses basic constructs like metaphor, comparisons, rhythm and epithets to create, communicate and imprint in reader’s mind old and new meanings, images and feelings altogether. Artistic data visualizations tend to offer similar charge as poetry does, even if the impact might not appeal so much to our artistic sensibility. Though dataviz from this perspective is or at least resembles an art form.

Many people can write verses, though only a fraction can write good meaningful poetry, from which a smaller fraction get poems, respectively even fewer get books published. Conversely, not everything can be expressed in verses unless one finds good metaphors and other aspects that can be leveraged in the process. Same can be said about good dataviz.

One can argue that in dataviz the author can explore and learn especially by failing fast (seeing what works and what doesn’t). One can also innovate, though the creator has probably a limited set of tools and rules for communication. Enabling readers to see the obvious or the hidden in complex visualizations or contexts requires skill and some kind of mastery of the visual form.

Therefore, dataviz must be more pragmatic and show the facts. In art one has the freedom to distort or move things around to create new meanings, while in dataviz it’s important for the meaning to be rooted in 'truth', at least by definition. The more the creator of a dataviz innovates, the higher the chances of being misunderstood. Moreover, readers need to be educated in interpreting the new meanings and get used to their continuous use.

Kitsch is a term applied to art and design that is perceived as naïve imitation to the degree that it becomes a waste of resources even if somebody pays the tag price. There’s a trend in dataviz to add elements to visualizations that don’t bring any intrinsic value – images, colors and other elements can be misused to the degree that the result resembles kitsch, and the overall value of the visualization is diminished considerably.

16 October 2024

🧭💹Business Intelligence: Perspectives (Part XVIII: There’s More to Noise)

Business Intelligence Series
Business Intelligence Series

Visualizations should be built with an audience's characteristics in mind! Upon case, it might be sufficient to show only values or labels of importance (minima, maxima, inflexion points, exceptions, trends), while other times it might be needed to show all or most of the values to provide an accurate extended perspective. It even might be useful to allow users switching between the different perspectives to reduce the clutter when navigating the data or look at the patterns revealed by the clutter. 

In data-based storytelling are typically shown the points, labels and further elements that support the story, the aspects the readers should focus on, though this approach limits the navigability and users’ overall experience. The audience should be able to compare magnitudes and make inferences based on what is shown, and the accurate decoding shouldn’t be taken as given, especially when the audience can associate different meanings to what’s available and what’s missing. 

In decision-making, selecting only some well-chosen values or perspectives to show might increase the chances for a decision to be made, though is this equitable? Cherry-picking may be justified by the purpose, though is in general not a recommended practice! What is not shown can be as important as what is shown, and people should be aware of the implications!

One person’s noise can be another person’s signal. Patterns in the noise can provide more insight compared with the trends revealed in the "unnoisy" data shown! Probably such scenarios are rare, though it’s worth investigating what hides behind the noise. The choice of scale, the use of special types of visualizations or the building of models can reveal more. If it’s not possible to identify automatically such scenarios using the standard software, the users should have the possibility of changing the scale and perspective as seems fit. 

Identifying patterns in what seems random can prove to be a challenge no matter the context and the experience in the field. Occasionally, one might need to go beyond the general methods available and statistical packages can help when used intelligently. However, a presenter’s challenge is to find a plausible narrative around the findings and communicate it further adequately. Additional capabilities must be available to confirm the hypotheses framed and other aspects related to this approach.

It's ideal to build data models and a set of visualizations around them. Most probable some noise may be removed in the process, while other noise will be further investigated. However, this should be done through adjustable visual filters because what is removed can be important as well. Rare events do occur, probably more often than we are aware and they may remain hidden until we find the right perspective that takes them into consideration. 

Probably, some of the noise can be explained by special events that don’t need to be that rare. The challenge is to identify those parameters, associations, models and perspectives that reveal such insights. One’s gut feeling and experience can help in this direction, though novel scenarios can surprise us as well.

Not in every set of data one can find patterns, respectively a story trying to come out. Whether we can identify something worth revealing depends also on the data available at our disposal, respectively on whether the chosen data allow identifying significant patterns. Occasionally, the focus might be too narrow, too wide or too shallow. It’s important to look behind the obvious, to look at data from different perspectives, even if the data seems dull. It’s ideal to have the tools and knowledge needed to explore such cases and here the exposure to other real-life similar scenarios is probably critical!

Related Posts Plugin for WordPress, Blogger...

About Me

My photo
Koeln, NRW, Germany
IT Professional with more than 25 years experience in IT in the area of full life-cycle of Web/Desktop/Database Applications Development, Software Engineering, Consultancy, Data Management, Data Quality, Data Migrations, Reporting, ERP implementations & support, Team/Project/IT Management, etc.