
20 November 2024

♟️Strategic Management: Analysis (Just the Quotes)

"Business executives cannot afford to ignore the merits of graphical representation which have for so long been accepted by the engineer and man of science. They must look behind the graphical method and study the conditions leading to the picture along with the picture itself. No business is too small to profit by an examination which shall analyze and scrutinize nor too large to ignore its possibilities. Each business must adjust the graphical methods to its own peculiarities and each diagram must be adjusted to the individual for whom it is prepared or the individual must be educated up to the use and importance of these methods of analysis." (William C Marshall, "Graphical methods for schools, colleges, statisticians, engineers and executives", 1921)

"The pattern of personal characteristics of the leader must bear some relevant relationship to the characteristics, activities, and goals of the followers. [...] It becomes clear that an adequate analysis of leadership involves not only a study of leadership but also of situations." (R M Stodgill, "Journal of Psychology", 1948)

"Planning is essentially the analysis and measurement of materials and processes in advance of the event and the perfection of records so that we may know exactly where we are at any given moment. In short it is attempting to steer each operation and department by chart and compass and chronometer - not by guess and by God." (Lyndall Urwick, "The Pattern of Management", 1956)

"Another approach to management theory, undertaken by a growing and scholarly group, might be referred to as the decision theory school. This group concentrates on rational approach to decision-the selection from among possible alternatives of a course of action or of an idea. The approach of this school may be to deal with the decision itself, or to the persons or organizational group making the decision, or to an analysis of the decision process. Some limit themselves fairly much to the economic rationale of the decision, while others regard anything which happens in an enterprise the subject of their analysis, and still others expand decision theory to cover the psychological and sociological aspect and environment of decisions and decision-makers." (Harold Koontz, "The Management Theory Jungle," 1961)

"We have overwhelming evidence that available information plus analysis does not lead to knowledge. The management science team can properly analyse a situation and present recommendations to the manager, but no change occurs. The situation is so familiar to those of us who try to practice management science that I hardly need to describe the cases." (C West Churchman, "Managerial acceptance of scientific recommendations", California Management Review Vol 7, 1964)

"Analysis is not a scientific procedure for reaching decisions which avoid intuitive elements, but rather a mechanism for sharpening the intuition of the decision maker." (James R Schlesinger, "Memorandum to Senate Committee on Government Operations", 1968)

"The concept of organizational goals, like the concepts of power, authority, or leadership, has been unusually resistant to precise, unambiguous definition. Yet a definition of goals is necessary and unavoidable in organizational analysis. Organizations are established to do something; they perform work directed toward some end." (Charles Perrow, "Organizational Analysis: A Sociological View", 1970)

"Strategic planning is not the 'application of scientific methods to business decision' […] . It is the application of thought, analysis, imagination, and judgment. It is responsibility, rather than technique. […] Strategy planning is not forecasting. […] Strategic planning is necessary precisely because we cannot forecast. […] Strategic planning does nor deal with future decisions. It deals with the futurity of present decisions. […] Strategic planning is not an attempt to eliminate risk. It is not even an attempt to minimize risk." (Peter F Drucker, "Management: Tasks, Responsibilities, Practices", 1973)

"Perhaps the fault [for the poor implementation record for models] lies in the origins of managerial model-making - the translation of methods and principles of the physical sciences into wartime operations research. [...] If hypothesis, data, and analysis lead to proof and new knowledge in science, shouldn’t similar processes lead to change in organizations? The answer is obvious-NO! Organizational changes (or decisions or policies) do not instantly pow from evidence, deductive logic, and mathematical optimization." (Edward B Roberts, "Interface", 1977)

"Managers are not confronted with problems that are independent of each other, but with dynamic situations that consist of complex systems of changing problems that interact with each other. I call such situations messes. Problems are extracted from messes by analysis. Managers do not solve problems, they manage messes." (Russell L Ackoff, "The future of operational research is past", 1979)

"Analysis is the critical starting point of strategic thinking. Faced with problems, trends, events, or situations that appear to constitute a harmonious whole or come packaged as a whole by common sense of the day, the strategic thinker dissects them into their constituent parts. Then, having discovered the significance of these constituents, he reassembles them in a way calculated to maximize his advantage." (Kenichi Ohmae, "The Mind Of The Strategist", 1982) 

"In business as on the battlefield, the object of strategy is to bring about the conditions most favorable to one's own side, judging precisely the right moment to attack or withdraw and always assessing the limits of compromise correctly. Besides the habit of analysis, what marks the mind of the strategist is an intellectual elasticity or flexibility that enables him to come up with realistic responses to changing situations, not simply to discriminate with great precision among different shades of gray." (Kenichi Ohmae, "The Mind Of The Strategist", 1982)

"No matter how difficult or unprecedented the problem, a breakthrough to the best possible solution can come only from a combination of rational analysis, based on the real nature of things, and imaginative reintegration of all the different items into a new pattern, using nonlinear brainpower. This is always the most effective approach to devising strategies for dealing successfully with challenges and opportunities, in the market arena as on the battlefield." (Kenichi Ohmae, "The Mind Of The Strategist", 1982)

"View thinking as a strategy. Thinking is the best way to resolve difficulties. Maintain faith in your ability to think your way out of problems. Recognize the difference between worrying and thinking. The former is repeated, needless problem analysis while the latter is solution generation." (Timothy W Firnstahl, Harvard Business Review, 1986)

"[…] the most successful strategies are visions, not plans. Strategic planning isn’t strategic thinking. One is analysis, and the other is synthesis." (Henry Mintzberg, "The Fall and Rise of Strategic Planning", Harvard Business Review, 1994)

"Enterprise Engineering is defined as that body of knowledge, principles, and practices having to do with the analysis, design, implementation and operation of an enterprise. In a continually changing and unpredictable competitive environment, the Enterprise Engineer addresses a fundamental question: 'how to design and improve all elements associated with the total enterprise through the use of engineering and analysis methods and tools to more effectively achieve its goals and objectives' [...]" (Donald H Liles, "The Enterprise Engineering Discipline", 1996)

"Without meaningful data there can be no meaningful analysis. The interpretation of any data set must be based upon the context of those data. Unfortunately, much of the data reported to executives today are aggregated and summed over so many different operating units and processes that they cannot be said to have any context except a historical one - they were all collected during the same time period. While this may be rational with monetary figures, it can be devastating to other types of data." (Donald J Wheeler, "Understanding Variation: The Key to Managing Chaos" 2nd Ed., 2000)

"[...] strategy is about determining the problems and opportunities in front of you, defining them properly, and shaping a course of action that will give your business the greatest advantage. Balancing problem solving with creating and exploiting new opportunities through imagination and analysis is the cornerstone of a great strategy." (Eben Hewitt, "Technology Strategy Patterns: Architecture as strategy" 2nd Ed., 2019)

"If you do not conduct sufficient analysis and if you do not have firm technical knowledge, you cannot carry out improvement or standardization, nor can you perform good control or prepare control charts useful for effective control." (Kaoru Ishikawa)

10 November 2024

🏭🗒️Microsoft Fabric: Data Mesh (Notes)

Disclaimer: This is work in progress intended to consolidate information from various sources and may deviate from them. Please consult the sources for the exact content!
Last updated: 23-May-2024

Data Mesh [Notes]
  • {definition} a type of decentralized data architecture that organizes data based on different business domains [2]
    •   a centrally managed network of decentralized data products
  • {concept} landing zone
    • typically a subscription that needs to be governed by a common policy [7]
      • {downside} creating one landing zone for every project can lead to too many landing zones to manage
        • {alternative} landing zones based on a business domain [7] 
    •  resources must be managed efficiently in a way that each team is given access to only their resources [7]
      •   shared resources might be needed, with separate management and common access for all [7]
    • need to be linked together into a mesh
      • via peer-to-peer networks
  • {concept} connectivity hub
  • {feature} resource group
    • {definition} a container that holds related resources for an Azure solution 
    • can be associated with a data product
      • when the data product becomes obsolete, the resource group can be deleted [7] (see the sketch after these notes)
  • {feature} subscription
    • {definition} a logical unit of Azure services that are linked to an Azure account
    • can be associated as a landing zone governed by a policy [7]
  • {feature} tenant (aka Microsoft Fabric tenant, MF tenant)
    • a single instance of Fabric for an organization that is aligned with a Microsoft Entra ID
    • can contain any number of workspaces
  • {feature} workspaces
    • {definition} a collection of items that brings together different functionality in a single environment designed for collaboration
    • associated with a domain [3]
  • {feature} domains
    • {definition} a way of logically grouping together data in an organization that is relevant to a particular area or field [1]
    • some tenant-level settings for managing and governing data can be delegated to the domain level [2]
  • {feature} subdomains
    • a way of fine-tuning the logical grouping of data under a domain [1]
    • subdivisions of a domain
  • {concept} deployment template
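
A minimal sketch of the resource-group-per-data-product idea from the notes above, using the azure-identity and azure-mgmt-resource Python packages; the subscription ID, names and tags are hypothetical placeholders, not prescriptions:

```python
# One resource group per data product, tagged with its domain, so that the
# group (and everything in it) can be removed when the product is retired.
from azure.identity import DefaultAzureCredential
from azure.mgmt.resource import ResourceManagementClient

SUBSCRIPTION_ID = "00000000-0000-0000-0000-000000000000"  # landing-zone subscription

client = ResourceManagementClient(DefaultAzureCredential(), SUBSCRIPTION_ID)

# Create (or update) the resource group that holds the data product's resources.
client.resource_groups.create_or_update(
    "rg-sales-orders",  # one group per data product
    {
        "location": "westeurope",
        "tags": {"domain": "sales", "dataProduct": "orders"},
    },
)

# Retiring the data product then reduces to deleting its resource group:
# client.resource_groups.begin_delete("rg-sales-orders")
```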

References
[1] Microsoft Learn: Fabric (2023) Fabric domains (link)
[2] Establishing Data Mesh architectural pattern with Domains and OneLake on Microsoft Fabric, by Maheswaran Arunachalam (link)
[3] Data mesh: A perspective on using Azure Synapse Analytics to build data products, by Amanjeet Singh (link)
[4] Zhamak Dehghani (2021) Data Mesh: Delivering Data-Driven Value at Scale
[5] Marthe Mengen (2024) How do you set up your Data Mesh in Microsoft Fabric? (link)
[6] Administering Microsoft Fabric - Considering Data Products vs Domains vs Workspaces, by Paul Andrew (link)
[7] Aniruddha Deswandikar (2024) Engineering Data Mesh in Azure Cloud

16 October 2024

𖣯Strategic Management: Strategic Perspectives (Part II: The Elephant in the Room)

Strategic Management Perspectives

There’s an ancient parable about several blind people who touch a shape they have never encountered before, an elephant, and try to identify what it is. The elephant is big, more than each person can sense through direct experience, and the individual experiences differ to the degree that the people don’t trust each other, the situation occasionally escalating. The moral of the parable is that we tend to claim (absolute) truths based on limited, subjective experience [1], and this can easily happen in business scenarios in which each of us has a limited view of the challenges we face individually and as a collective. 

The situation from the parable arises in business scenarios when we try to make sense of the challenges we face and get only a limited perspective of the whole picture. Only open dialogue and working together can get us closer to the solution! Even then, an accurate depiction might not be in sight, and we need to extrapolate further into the unknown.  

A third-party consultant with experience might be the right answer, at least in theory, though experience and solutions are relative. The consultant might lead us in a direction, though from there to finding the answer can be a long way that requires experimentation, a mix of tactics and strategies that change over time, more sense-making, and more challenges lying ahead. 

We would like a clear answer and a set of steps that lead us to the solution, though the answer is, as usual: it depends! It depends on the various forces/drivers that have the biggest impact on the organization, on the context, on the organization’s goals, on the resources available directly or indirectly, on people’s capabilities, on the occurrence of external factors, etc. 

In many situations the smartest thing to do is to gather information, respectively perspectives, from all the parties. Tools like brainstorming, SWOT/PESTLE analysis or scenario planning can help in sense-making, in identifying the overall picture and where the center of gravity lies. For some organizations the solution will probably be a new ERP system, the redesign of some processes, or the introduction of additional systems to track quality, the flow of material, etc. 

A new ERP system will not necessarily solve all the issues (even if that’s the expectation), and some organizations just try to recreate the old processes in a new context. Depending on the case, process redesign in some areas can be a better approach, at least as a primary measure. Otherwise, general initiatives focused on quality, data/information management, customer/vendor management, integrations, and so on can provide the binder/vehicle an organization needs to overcome the current challenges.

Conversely, if the ERP or other strategic systems are 10-20 years old, then there’s indeed an elephant in the room! Moreover, the elephant might be bigger than we can handle, and other challenges might lurk in its shadow(s). Everything is a matter of perspective, with no apparent unique answer. Thus, an acceptable solution might lie in the broader perspective, in the cumulated knowledge of the people experiencing the issues, respectively in some external guidance. Unfortunately, the guides can be as blind as we are, making limited or no real impact. 

Sometimes, all that’s needed is a leap of faith combined with a set of tactics or strategies kept continuously in check and redirected as seems fit, based on the knowledge accumulated and the challenges ahead. It helps to be aware of how others approached the same issues. Unfortunately, there’s no answer that works for all! In this lies the challenge: in identifying what works and makes sense for us!


Resources:
[1] Wikipedia (2024) Blind men and an elephant [link]


15 October 2024

🗄️Data Management: Data Governance (Part III: Taming the Complexity)

Data Management Series

The Chief Data Officer (CDO) or the “Head of the Data Team” is one of the most challenging jobs because it is more of a "political" than a technical role. It requires the ideal candidate to be able to throw and catch curveballs almost all the time, and to play ball with all the parties having an interest in data (aka stakeholders). It’s a full-time job that requires a combination of management and technical skillsets, and both are important! The focus will shift occasionally in one direction more than in the other, with important fluctuations. 

Moreover, even if one masters the technical and managerial aspects, the combination of the two gives birth to situations that require further expertise – applied systems thinking being probably the most important. This is also because there are so many points of failure that it's challenging to address all the important causes. Therefore, it’s critical to be a systems thinker, to have an experienced team and to make adequate use of its experience! 

We live in a complex world, in which even the smallest constraint or opportunity can have an important impact, especially when it appears in the early stages of the processes taking place in organizations. Success relies on the manager’s and team’s skillsets, their inspiration, the way the business reacts to the tasks involved and probably many other aspects that make things work. It takes considerable effort until the whole mechanism works, and even more time to make things work efficiently. The best metaphor is probably the one of a small combat team in which everybody has their place and skillset in the mechanism, independently of whether one talks about strategy, tactics or operations. 

Unfortunately, building such teams takes time, and the more people are involved, the more complex this endeavor becomes. The manager and the team must meet somewhere in the middle in what concerns the philosophy, the execution of the various endeavors, the way of working together to achieve the same goals. There are multiple forces pulling in all directions and it takes time until one can align the goals, respectively the effort. 

The most challenging forces are the ones between the business and the data team, respectively between the business and data requirements, forces that don’t necessarily converge. In small organizations, the two parties have in theory better chances of overcoming the challenges, and a team’s experience can weigh a lot in the process, though as soon as the scale changes, the number of challenges to be overcome grows exponentially (there are, however, different exponential functions, in which the base and exponent determine how rapid the growth is). 

In big organizations, other parties can appear that pull with the same force in one direction or another. Thus, the political aspects become more complex, to the degree that the technologies must follow the political decisions, with all the positive and negative implications deriving from this. As a comparison, think about the jump from two to three or more bodies orbiting each other, which results in a chaotic dynamical system for most initial conditions. 

Of course, a business’ context doesn’t have to create such complexity, though when things go unchecked, when delays in decision-making and other typical events occur, when there’s no structure, strategy, coordinated effort, or any other important component, the chances of chaotic behavior are quite high with the passage of time. This is just a model to explain real-life situations that seem similar on the surface but prove to be quite complex when diving deeper. That’s probably why a CDO’s role as tamer of complexity is important and challenging!


17 September 2024

#️⃣Software Engineering: Mea Culpa (Part V: All-Knowing Developers are Back in Demand?)

Software Engineering Series

I’ve been reading many job descriptions lately related to my experience and, curiously or not, I noticed that many organizations look for developers with Microsoft Dynamics experience in the CRM, respectively Finance and Operations (F&O) and Business Central (BC) areas. It’s a good sign that the adoption of Microsoft solutions for CRM and ERP is increasing, especially when one considers the progress made in the BI and AI areas with the introduction of Microsoft Fabric, which gives Microsoft a considerable boost. Conversely, it seems that the "developers are good for everything" syntagma is back, at least judging from what one reads in job descriptions. 

Of course, it’s useful to have an inhouse developer who can address all the aspects of an implementation, though that’s a lot to ask considering the different non-programming areas that need to be addressed. It’s true that a developer with experience can handle Requirements, Data and Process Management, respectively Data Migrations and Business Intelligence topics, though each of these topics can easily become a full-time job before, during and after project implementations. I’ve been there and I (hopefully) know what the jobs imply. Even if an experienced programmer can easily handle the different aspects, there will also be times when all the topics combined will be too much for one person!

It's not a novelty that job descriptions are treated like Christmas lists, but it’s difficult to differentiate between essential and nonessential skillsets. I read many job descriptions lately in which, among a huge list of demands, one of the requirements is to program in the F&O framework, a sign that D365 programmers are in high demand. I worked for many years as a programmer and Software Engineer, respectively in the BI area, where SQL and non-SQL code is needed. Even if I can understand the code in F&O, does it make sense to learn now to program in X++ and the whole framework? 

It's never too late to learn new tricks, respectively another programming language and/or framework. It even helps to provide better solutions in other areas, though frankly I would invest my time in other areas, and AI-related topics like AI prompting or Data Science seem to be more interesting in the long term, especially when they are already in demand!

There seems to be a tendency for Data Science professionals to do everything, building their own solutions, ignoring the experience accumulated, respectively the data models built, in the BI and Data Analytics areas, as if the topics and data models were unrelated! It’s also true that AI modeling comes with its own requirements in what concerns data modeling (e.g. translating non-numeric to numeric values), though I believe that common ground can be found!

Similarly, notebook-based programming seems to replicate logic in each solution, which occasionally makes sense, though personally I wouldn’t recommend it as a practice! The other day, I was looking at code developed in Python to mimic the joining of tables, when a view with the same logic could be more easily (re)used, maintained and read, and would probably be more efficient, even if different engines are used. It will be interesting to see how the mix of spaghetti solutions will evolve over time. There are developers already complaining about the number of objects created in the process by building logic for each layer of the medallion architecture! Even if it makes sense from architectural considerations, it will become a nightmare in time.
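
To illustrate the point, here is a hedged sketch contrasting the two approaches; file, table and column names are hypothetical. The pandas merge re-implements the join in every notebook, while the view defines the logic once, where every consumer can reuse it:

```python
import pandas as pd

# Notebook-style approach: each solution re-implements the same join.
orders = pd.read_parquet("orders.parquet")        # hypothetical files
customers = pd.read_parquet("customers.parquet")
enriched = orders.merge(customers, on="CustomerId", how="left")

# View-based approach: the join is defined once in the database and reused
# by every consumer, independently of the engine running the notebook.
VIEW_DDL = """
CREATE VIEW dbo.vOrdersEnriched AS
SELECT o.OrderId, o.OrderDate, c.CustomerName
FROM dbo.Orders o
LEFT JOIN dbo.Customers c ON c.CustomerId = o.CustomerId;
"""
```

Even when a notebook is genuinely required, referencing such a view keeps the join logic in one maintainable place.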

One can also wonder about the nomenclature used – "Data Engineer" for the simple manipulation of data between structures in data transformations, respectively "Prompt Engineer" for structuring the prompts for AI. I believe that engineering involves more than this, no matter the context! 


16 September 2024

🧭Business Intelligence: Mea Culpa (Part IV: Generalist or Specialist in an AI Era?)

Business Intelligence Series

Except for the early professional years, when I did mainly programming for web or desktop applications in the context of n-tier architectures, over the past 20 years my professional life was a mix of BI, Data Analytics, Data Warehousing, Data Migrations and other topics (ERP implementations and support, Project Management, IT Service Management, IT, Data and Applications Management), though the BI topics covered on average probably at least 60% of my time, either as internal or external consultant. 

I can thus consider myself a generalist who had the chance to cover most of the important aspects of a business from an IT perspective, and it has been a great experience, at least until now! It’s a great opportunity to have the chance to look at problems, solutions, processes and the various challenges and opportunities from different perspectives. Technical people should have this opportunity directly in their jobs through the communication occurring in projects or IT services, though that’s more of a wish! Unfortunately, the dialogue between IT and business occurs almost only over tickets and documents, which might be transparent but isn’t necessarily effective or efficient! 

Does working only part time in an area make a person less experienced or knowledgeable than other people? In theory, a full-time employee should get more exposure in depth and/or breadth, but that’s relative! It depends on the challenges one faces, the variation of the tasks, the implemented solutions, their depth and other technical and nontechnical factors like training, one’s experience in working with the various tools, the variety of the problems faced, professionalism, etc. A richer exposure can, but does not necessarily, involve more technical and nontechnical knowledge, and this shouldn’t be taken as a given! There’s no right or wrong answer, even if people tend to take sides and argue over details.

Independently of a job's effective time, one is forced to use one's own time to keep current with technologies or extend one’s horizon. In IT, a professional can seldom rely only on what is learned on the job. Fortunately, nowadays one has more and more ways of learning, while the challenge shifts toward what to ignore, respectively toward better management of one’s time while learning. The topics increase in complexity, and with this blogging becomes even more difficult, especially when one competes with AI content!

Talking about IT, it will be interesting to see how much AI can help with or replace some of the professions or professionals. Anyway, some jobs will become obsolete or shift their focus to prompt engineering and technical reviews. AI still needs explicit descriptions of how to address tasks, at least until it learns to create and use better recipes for problem definition and solving. The bottom line: AI and its use can’t be ignored, and it can and should be used also for learning new things. It’s amazing what one can do nowadays with prompt engineering! 

Another aspect with which AI can help is tailoring the content to one’s needs. A high percentage of the learning process is spent fishing in a sea of information for content that is worth knowing, respectively for a solution to one’s needs. AI must be able to address some of the context as well, without prompters being forced to give the information explicitly!

AI opens many doors but can close many others. How much of one’s experience will remain relevant over the next years? Will AI have more success in addressing some of the challenges existing in people’s understanding, or will people just trust AI blindly? Anyway, somebody must be smarter than AI, and here people’s collective intelligence can probably prove to be a real match. 

14 September 2024

🗄️Data Management: Data Governance (Part II: Heroes Die Young)

Data Management Series

In the call for action, some organizations tend to idealize and overload the main actors' purpose and image when talking about data governance, by calling them heroes. Heroes are those people who fight for a goal they believe in with all their being, and occasionally they pay the supreme tribute. Of course, the image of heroes is idealized and many other aspects are ignored, though such images sell ideas and ideals. Organizations might need heroes and heroic deeds to change the status quo, but the heroism doesn't necessarily pay off for the "heroes"! 

Sometimes, organizations need a considerable effort to change the status quo. It can be people's resistance to the new, to the demands, to the ideas propagated, especially when these are not clearly explained and executed. It can be the incommensurable distance between the "AS IS" and the "TO BE" perspectives, especially when clear paths aren't in sight. It can be the lack of resources (e.g., time, money, people, tools), knowledge, understanding or skillset that makes the effort difficult. 

Unfortunately, such initiatives favor action over adequate strategies, planning and understanding of the overall context. The call to do something creates waves of actions and reactions, which in the organizational context can lead to storms and even extreme behavior that ranges from resistance to the new to heroic deeds. Finding a few messages that support the call for action can help, though they can't replace the various critical success factors.

Leading organizations on a new path requires a well-defined realistic strategy, respectively adequate tactical and operational planning that reflects the organization's specific needs, knowledge and capabilities. Just demanding that people do their best is not enough, and heroism has good chances of appearing especially in this context. Unfortunately, the whole weight falls on the shoulders of the people chosen as actors in the fight. Ideally, it should be possible to spread the whole weight over a broader basis, which should be considered the foundation for the new. 

The "heroes" metaphor is idealized and the negative outcome probably exaggerated, though extreme situations do occur in organizations when decisions, planning, execution and expectations are far from ideal. Ideal situations are met only in books and less in practice!

The management demands and the people execute, much like in the army, though by contrast people need to understand the reasoning behind what they are doing. Proper execution requires skillset, understanding, training, support, tools and the right resources for the right job. Just relying on people's professionalism and effort is not enough and is suboptimal, but this is what many organizations seem to do!

Organizations tend to respond to the various barriers or challenges with more resources or pressure instead of analyzing and depicting the situation adequately, and eventually changing the strategy, tactics or operations accordingly. It's also difficult to do this as long as an organization doesn't have the capabilities and practices of self-checking, self-introspection, self-reflection, etc. Even if it sounds a bit exaggerated, an organization must know itself to overcome the various challenges. Regular meetings, KPIs and other metrics give the illusion of control when self-control is needed. 

Things don't have to be that complex, even if managing data governance is a complex endeavor. Small or midsized organizations are in theory more capable of handling complexity because they can be more agile and have a more robust structure, while the flow of information and knowledge faces fewer barriers, respectively a shorter distance to cover, at least in theory. One can probably appeal to the laws and characteristics of networks to understand more about the deeper implications, and about how solutions can be implemented in more complex setups.

🗄️Data Management: Data Culture (Part V: Quid nunc? [What now?])

Data Management Series

Despite the detailed planning, the concentrated and well-directed effort with which the various aspects of data culture are addressed, things don't necessarily turn into what we want them to be. There's seldom only one cause but a mix of various factors that create a network of cause-and-effect relationships that tend to diminish or increase the effect of certain events or decisions, and it can be just a butterfly's flutter that stirs a set of chain reactions. The butterfly effect is usually an exaggeration, until the proper conditions for chaotic behavior appear!

The butterfly effect is made possible by the exponential divergence of two paths. Conversely, success probably needs multiple trajectories to converge toward a final point, or toward intermediary points or areas from which things move on the "right" path. Success doesn't necessarily mean reaching a point but reaching a favorable zone from which future behavior can follow a positive trend. For example, a sink or a cone-like structure allows water to accumulate and flow toward an area. A similar structure is needed for success to converge, and the structure results from what is built in the process. 
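
For the mathematically inclined, the contrast between divergence and convergence can be summarized in one standard formula (a generic sketch, not specific to any organizational model): two trajectories initially a distance δ₀ apart evolve roughly as

```latex
\[
  \lvert \delta(t) \rvert \;\approx\; \lvert \delta_0 \rvert \, e^{\lambda t},
  \qquad
  \lambda > 0 \ \text{(butterfly-effect divergence)}, \quad
  \lambda < 0 \ \text{(convergence toward the favorable zone)},
\]
```

where λ plays the role of a Lyapunov exponent.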

Data culture needs a similar structure for the various points of interest to converge. Things don't happen by themselves unless the force of the overall structure is so strong that it allows things to move toward the intended path(s). Even then the paths can be far from optimal, but they can be favorable. Probably, that's what the general effort must do - bring the various aspects into the zone that allows things to unfold. It might still be a long road, though the basis is there!

A consequence of this metaphor is that one must identify the important aspects, respectively factors that influence an organization's culture and drive them in the right direction(s) – the paths that converge toward the defined goal(s). (Depending on the area of focus one can consider that there are successions of more refined goals.)

The structure that allows things to converge is based on the alignment of the various paths and implicitly forces. Misalignment can make a force move in another direction, with all the consequences deriving from this behavior. If the force is weak, it will probably not have an impact on the overall structure, though that's relative and can change in time. 

One may ask what this whole construct is needed for, even if it doesn’t reflect reality. Sometimes, even a not entirely correct model can allow us to navigate the unknown. The model’s intent is to depict what’s needed for an initiative to be successful. Moreover, success doesn’t mean hitting the bull’s eye but getting into the zone first, until one’s skillset enables performance.

Conversely, it's important to understand that things don't happen by themselves. At least this seems to be the feeling some initiatives leave. One needs to build and pull the whole structure in the right direction, and the alignment of the various forces can reduce the overall effort and increase the chances for success. Attempting to build something just because it’s written in documentation, without understanding the whole picture (or something close to it), can easily lead to failure.

This doesn’t mean that all attempts that don’t follow a set of patterns are doomed to failure, but that the road will be more challenging and will probably take longer. Conversely, maybe these deviations from the optimal paths are what an organization needs to grow, to solidify the foundation on which something else can be built. The whole path is an exploration that doesn’t necessarily match what is written in books, respectively the expectations!


11 September 2024

🗄️Data Management: Data Culture (Part IV: Quo vadis? [Where are you going?])

Data Management Series

People working for many years in the fields of BI/Data Analytics, Data and Process Management have probably met many reactions that at first sight seem funny, though they reflect bigger issues existing in organizations: people don’t always understand the data they work with, how data are brought together as part of the processes they support, respectively how data can be used to manage and optimize the respective processes. Moreover, occasionally people torture the data until it confesses something that doesn’t necessarily reflect reality. It’s even more deplorable when the conclusions are used for decision-making, managing or optimizing the process. In extremis, the result is an iterative process that creates more and bigger issues than those it was supposed to solve!

Behind each blunder there are probably bigger understanding issues that need to be addressed. Many of the issues revolve around understanding how data are created, how they are brought together, how the processes work and what data they need, use and generate. Moreover, few business and IT people look at the full lifecycle of data and try to optimize it, or they optimize it in the wrong direction. Data Management is supposed to help, and it does this occasionally, though a methodology, its processes and practices are only as good as people’s understanding about data and its use! No matter how good a data methodology is, it’s as weak as the weakest link in its use, and typically the issues revolving around data and data understanding are the weakest link. 

Besides technical people, few businesspeople understand the full extent of managing data and its lifecycle. Unfortunately, even if some of the topics are treated in books, they are too dry, needing hands-on experience and some thought in corroborating practices with theories. Without this, people will do things mechanically, processes being only as good as the people using them, their value becoming suboptimal and hindering the business. That’s why training on Data Management is not enough without some hands-on experience!

The most important impact is however in the BI/Data Analytics areas - how the various artifacts are created and used as support in decision-making, process optimization and other activities rooted in data. Ideally, some KPIs and other metrics should be enough for managing and directing a business; however, just basing the decisions on a set of KPIs without understanding the bigger picture, without having a feeling for the data and their quality, the whole architecture, no matter how splendid, can break down like a sandcastle on a shore meeting the first powerful wave!

Sometimes it feels like organizations do things from inertia, driven by the forces of the moment, initiatives and business issues for which temporary and later permanent solutions are needed. The best chance for solving many of the issues would have been a long time ago, when the issues were still too small to create any powerful waves within the organizations. Therefore, a lot of effort is sometimes spent in solving the consequences of decisions not made at the right time, and that can be painful and costly!

Building a good business also needs a solid foundation. In the past it was enough to have a good set of products that were profitable. However, during the past decade(s) the rules of the game changed, driven by fierce competition across geographies, with inefficiencies, especially in the data and process areas, costing organizations in the short and long term. Data Management in general and Data Quality in particular, even if they’re challenging to quantify, have the power to address by design many of the issues existing in organizations, if given the right chance!


02 September 2024

🗄️Data Management: Data Culture (Part III: A Tale of Two Cities)


One of the curious things is that, as part of their change of culture, organizations try to adopt a new language, to give new names to things, to make a distinction between the "AS IS" and "TO BE" states, insisting on how the new image will replace the previous one. Occasionally, they even stress how bad things were in the past and how great things will be in the future, trying to depict the future in vivid images. 

Even if this might work occasionally, it tends to confuse people, and this not necessarily because of the language and the metaphors used, or the fact that the same people remain in the same positions, but because of the lack of belief or conviction, respectively the half-hearted enthusiasm displayed by the parties. To "convert" people to new philosophies one needs to believe in them, or at least mimic that convincingly. The lack of conviction can easily have a false effect that spreads within the organization. 

Dissociation from the past, from what an organization was, tends to increase the resistance against the new, because two different images are involved. On one side there’s the attachment to the past, and even if mistakes were made, or things didn’t go optimally, the experiences and decisions made are part of the organization, of the people who made them. People as individuals and as an organization should embrace their mistakes and good deeds altogether, learn from them, improve what is to be improved and move forward. Conversely, there’s the resistance to the new, to the change, to words people don’t believe in yet, while the bigger picture is still fuzzy in their minds, and there can be many other reasons that don’t agree with one’s understanding. 

There are images, memories, views, decisions and objectives of the past, and people need to recognize the road from what was to what should be. One can hypothesize that embracing one’s mistakes and understanding the chain of reasoning, from then and from now, will help an organization transition towards the new. Awareness of one’s situation will most probably help in the transition process. Unfortunately, leaders and technology gurus tend to depict the past as negative, creating thus more negative emotions, respectively reactions, in the process. The past is still part of the people, of the organization, and will continue to be.

Conversely, the disassociation from the past can create more resistance to the new, and probably more unnecessary barriers. It would probably be easier for the gurus to build the new if the past weren’t there! Forgetting the past would be an error because there are many lessons that can still be useful. All the experience needs to be redirected in new directions. It’s more important to help people see the vision of the future, understand their missions, the paths to be followed and the challenges ahead. 

It sounds more like rambling from a psychology course, though organizations do have an image they want to change, to bring forth to cope with the various challenges, an image they want to reflect when needed. There are also organizations that want to change but keep their image intact, which leads to deeper conflicts. Unfortunately, changes of image involve conflicts that can become complex through what they bring forth.  

A data culture should increase people’s awareness of the present, respectively of the future, of what it takes to bridge the gap, the challenges ahead, how to embrace change, how to keep a realistic perspective, how to do a reality check, etc. Methodologies can increase people’s awareness and provide the theoretical basis, though walking the path will be a different story for everyone. 

01 September 2024

🗄️Data Management: Data Governance (Part I: No Guild of Heroes)

Data Management Series

Data governance appeared as a topic around the 1980s, though it gained popularity in the early 2000s [1]. Twenty years later, organizations still miss the mark, respectively fail to understand and implement it in a consistent manner. As usual, the reasons for failure are multiple, and they vary from misunderstanding what governance is all about to poor implementation of methodologies and inadequate management or leadership. 

Moreover, methodologies tend to idealize the various aspects, and idealism is not what organizations need, but pragmatism. For example, data governance is not about heroes and heroism [2], which can give the impression that heroic actions are involved, and that is not the case! Actions for the sake of action don’t necessarily lead to change by themselves. Organizations, and big organizations in particular, are in general good at generating meaningless action without results, especially when people busy themselves with, miss or ignore the mark. 

People do talk to each other, though they try to solve their own problems and optimize their own areas without necessarily thinking about the bigger picture. The problem is not necessarily communication or the lack of insight into business issues; people do communicate and know the issues even without a business impact assessment. The challenge is usually in convincing the upper management that the effort needs to be consolidated and supported, respectively that the needed resources be made available. 

Probably, one of the issues with data governance is the attempt at creating another structure in the organization focused on quality, which has good chances of failing, and unfortunately often does fail. Many issues appear when the structure gains weight and becomes a separate entity instead of being the backbone of the organization. 

As soon as organizations separate data governance from the key users, management and the other important decision-makers in the organization, it takes on a life of its own and has good chances of diverging from the initial construct. Then, organizations need "alignment" and probably other big words to coordinate the effort. Such constructs can work, but they are suboptimal because the forces will always pull in different directions.

Making each manager and the upper management responsible for governance is probably the way to go, though they’ll need the time for it. In theory, this can be achieved when many of the issues are solved at the lower level, when automation and further aspects allow them to supervise things, rather than hiding behind every issue. 

When too much micromanagement is involved, people tend to busy themselves with topics rather than solve the issues they are confronted with. The actual actors need to be empowered to take decisions and optimize their work when needed. Kaizen, the philosophy of continuous improvement, has proven that it works when applied correctly. They’ll need the knowledge, skills, time and support to do it, though. One of the dangers is however that this becomes a full-time responsibility, which tends to create a separate entity again.

The challenge for organizations lies probably in the friction between where they are and what they must do to move forward toward the various objectives. Moving in small rapid steps is probably the way to go, though each person must be aware when something doesn’t work as expected and react. That’s probably the most important aspect. 

So, the more functions are created that diverge from the actual organization, the higher the chances for failure. Unfortunately, failure becomes visible only in the later phases, and thus self-awareness, self-control and other similar “qualities” are needed, like small actors that keep the system in check and react whenever needed. Ideally, the employees are the best resources to react whenever something doesn’t work as designed. 


Resources:
[1] Wikipedia (2023) Data Management [link]
[2] Tiankai Feng (2023) How to Turn Your Data Team Into Governance Heroes [link]


18 August 2024

🧭Business Intelligence: Mea Culpa (Part III: Problem Solving)

Business Intelligence Series

I've been working for more than 20 years in the BI and Data Analytics area, in combination with Software Engineering, ERP implementations, Project Management, IT services and several other areas, which allowed me to look at many recurring problems from different perspectives. One of the things I learnt is that problems are more complex and more dynamic than they seem, respectively that they may require tailored dynamic solutions. Unfortunately, people usually focus on one or two immediate perspectives, ignoring the dynamics and the multilayered character of the problems!

Sometimes, a quick fix and a limited perspective are what we need to get started and fix the symptoms, and problem-solvers usually stop there. When left unsupervised, the problems tend to kick back, build up momentum and reappear in more complex forms in various places. Moreover, the symptoms can remain hidden until it is too late. To this add the political agendas and the further limitations existing in organizations (people, money, know-how, etc.).

It seems much easier to involve external people (individual experts, consultancy companies) to solve the problem(s), though unless they get a deep understanding of the business and the issues existing in it, the chances are high that they solve the wrong problems and/or implement the wrong solutions. Therefore, it's more advisable to have internal experts, when feasible, and that's the point where business people with technical expertise and/or IT people with business expertise can help. Ideally, one should have a good mix, and the so-called competency centers can do a great job in handling the challenges of organizations. 

Between business and IT people there's a gap that can be bigger or smaller depending on resources, know-how, or the effort made by organizations to reduce it. To this adds the nature of the issues existing in organizations, which can vary considerably across departments, organizations or any other form of establishment. Conversely, the specific skillset can be transmuted where needed, which might happen naturally, though in some cases considerable effort needs to be invested in the process.

Being involved in similar tasks, one may get the impression that one can do whatever the others can do. This can happen in IT as well as on the business side. There can be activities that can be done by parties from the other group, though there are also many exceptions in both directions, especially when one considers that one can’t generalize the applicability and/or transmutation of skillsets. 

A more concrete example is the know-how needed by a businessperson to use the BI infrastructure for answering business questions, and ideally for doing all or at least most of the activities a BI professional can do. Ideally, as part of the learning path, it would be helpful to have a pursuable path in between the two points. The mastery of tools helps in the process though there are different mindsets involved.

Unfortunately, the data-related fields are full of overconfident people who get the problem-solving process wrong. Data-based problem-solving comes down to gathering the right facts and data, building the right conceptual model, identifying the right questions to ask, collecting more data, refining methods and solutions, etc. There’s always an easy wrong way to solve a problem!

The mastery of tools doesn’t imply the mastery of business domains! What people from the business side can bring is deeper insight in the business problems, though getting from there to implementing solutions can prove a long way, especially when problems require different approaches, different levels of approximations, etc. No tool alone can bridge such gaps yet! Frankly, this is the most difficult to learn and unfortunately many data professionals seem to get this wrong!


13 June 2024

🧭Business Intelligence: Microsoft Fabric (Part III: One Person Can’t Learn or Do Everything)

Business Intelligence Series

Today’s Explicit Measures webcast [1] considered an article written by Kurt Buhler (The Data Goblins): [Microsoft] "Fabric is a Team Sport: One Person Can’t Learn or Do Everything" [2]. It’s a well-written article that deserves some thought as there are several important points made. I can’t say I agree with the full extent of some statements, even if some disagreements are probably just a matter of semantics.

My main disagreement starts with the title “One Person Can’t Learn or Do Everything”. As clarified in the webcast's chat, the author defines “everything" as an umbrella for “all the capabilities and experiences that comprise Fabric including both technical (like Power BI) or non-technical (like adoption, data literacy) and everything in between” [1].

For me “everything” is relative and refers to a domain's core set of knowledge, while "expertise" (≠ "mastery") refers to the degree to which a person can use the respective knowledge to build back-to-back solutions for a given area. I’d say that it becomes more and more challenging for beginners or average data professionals to cover the core features. Moreover, I’d separate the non-technical skills, because then one will also need to consider topics like Data, Project, Information or Knowledge Management.

There are different levels of expertise, and they can vary in depth (specialization) or breadth (covering multiple areas), respectively depend on previous experience (whether one worked with similar technologies). Usually, there’s a minimum of requirements that need to be covered for being considered as expert (e.g. certification, building a solution from beginning to the end, troubleshooting, performance optimization, etc.). It’s also challenging to roughly define when one’s expertise starts (or ends), as there are different perspectives on the topics. 

Conversely, the term expert is in general misused extensively, sometimes even with mischievous intent. An “expert” is usually considered to be an external consultant or a person who got certified in an area, even if the person may not be able to build solutions that address a customer’s needs. 

Even data professionals with many years of experience can be overwhelmed by the volume of knowledge, especially when one considers the different experiences available in MF, respectively the volume of new features released monthly. Conversely, expertise can be considered with respect to only one or more MF experiences, or to one area within a certain layer. A lot of the knowledge can be transferred from other areas – writing SQL and complex database objects, modelling (enterprise) semantic layers, programming in Python, R or Power Query, building data pipelines, managing SQL databases, etc. 

Besides the standard documentation, training sessions, and some reference architectures, Microsoft also made available some labs and other material, which helps in discovering the available features, though it doesn’t teach people how to build complete solutions. More important than explicitly declaring the role-based audience, I find, is the creation of learning paths for the various roles.

During the past 6-7 months I've spent on average 2 days per week learning MF topics. My problem is not the documentation but the lack of maturity of some features, the gaps in functionality, identifying the respective gaps, and knowing what new features will be made available and when. The fact that features are made available or changed while learning makes the process more challenging. 

My goal is to be able to provide back-to-back solutions and I believe that’s possible, even if I might not consider all the experiences available. During the past 22 years, at least until MF, I could build complete BI solutions starting from requirements elicitation, data extraction, modeling and processing for data consumption, respectively data consumption for the various purposes. At least this was the journey of a Software Engineer into the world of data. 

References:
[1] Explicit Measures (2024) Power BI tips Ep.328: Microsoft Fabric is a Team Sport (link)
[2] Data Goblins (2024) Fabric is a Team Sport: One Person Can’t Learn or Do Everything (link)

23 May 2024

🏭🗒️Microsoft Fabric: Domains (Notes)

Disclaimer: This is work in progress intended to consolidate information from various sources and may deviate from them. Please consult the sources for the exact content!
Last updated: 29-May-2024

Domains & Entities

Domains

  • {definition} a way of logically grouping together data in an organization that is relevant to a particular area or field [1]
    • associated with workspaces 
      • {benefit} allows to group data into business domains [1]
      • all the items in the workspace are then associated with the domain, and they receive a domain attribute as part of their metadata [1]
      • {benefit} enables a better consumption experience [1]
        • simplify discovery and consumption 
  • provide a management boundary between tenant and workspace enabling domain admins to have more granular control over multiple workspaces [6]
    • some tenant-level settings for managing and governing data can be delegated to the domain level [2]
  • [security] domain roles
    • Fabric admins (or higher)
      • can create and edit domains (see the sketch after this list)
      • can specify domain admins and contributors
      • can associate workspaces with domains [4]
      • can see, edit, and delete all domains in the admin portal [4]
    • domain admins
      • business owners or experts of a domain
      • can update the domain description
      • can define contributors
      • can associate workspaces with the domain [4]
      • can define and update the domain image
      • can override tenant settings for any specific settings the tenant admin has delegated to the domain level [4]
      • can't delete the domain, change the domain name, or add/delete other domain admins
      • can only see and edit the domains they're admins of.
    • domain contributors
      • ⇐ must be a workspace admin
      • can associate their workspaces with a domain or change the current domain association
      • don’t have access to the Domains page in the admin portal
    • domain users
      • can share a lakehouse with other domain users without giving access to the workspace and other artifacts [4]
  • {concept} default domain
    • a domain that has been specified as the default domain for specific users and/or security groups [3]
      • ⇒ when these users/security groups create/update a new/unassigned workspace, that workspace will automatically be assigned to that domain [3]
      • ⇒ these users/security groups generally automatically become domain contributors of the workspaces that are assigned in this manner [3]
  • {feature} subdomains
    • a way of fine-tuning the logical grouping of data under a domain [1]
      • subdivisions of a domain
      • only one level is supported in the hierarchy
    • visible as part of the domains filter and as part of the item location path
    • no setup available
      • [planned] some domain settings will be added to subdomains as well
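
Domain management happens primarily in the admin portal, though an Admin REST API is also available. Below is a minimal Python sketch of creating a domain and assigning workspaces to it; the endpoint paths, payload field names, and authentication details are assumptions based on Fabric REST API conventions and should be verified against the official documentation.

```python
# Hedged sketch: create a domain and assign workspaces via the Fabric Admin
# REST API. Endpoints and payload fields are assumptions - verify against
# the official documentation. Requires a Fabric admin's Entra ID token.
import requests

API = "https://api.fabric.microsoft.com/v1/admin"
headers = {"Authorization": "Bearer <access-token>",
           "Content-Type": "application/json"}

# create a domain for a business area
resp = requests.post(f"{API}/domains", headers=headers,
                     json={"displayName": "Finance",
                           "description": "Data relevant to the Finance area"})
resp.raise_for_status()
domain_id = resp.json()["id"]

# associate existing workspaces with the domain (GUIDs are placeholders)
resp = requests.post(f"{API}/domains/{domain_id}/assignWorkspaces",
                     headers=headers,
                     json={"workspacesIds": ["<workspace-guid-1>",
                                             "<workspace-guid-2>"]})
resp.raise_for_status()
```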

Acronyms:
MF - Microsoft Fabric

Resources:
[1] Microsoft Learn (2023) Administer Microsoft Fabric (link)
[2] Microsoft Learn - Fabric (2024) Governance overview and guidance (link)
[3] Microsoft Learn: Fabric (2023) Fabric domains (link)
[4] Establishing Data Mesh architectural pattern with Domains and OneLake on Microsoft Fabric, by Maheswaran Arunachalam (link)
[5] Microsoft Fabric Updates Blog (2024) Easily implement data mesh architecture with domains in Fabric, by Naama Tsafrir (link)
[6] Microsoft (2024) Microsoft Fabric Domains – Data Mesh [with Naama Tsafrir & Assaf Shemesh]


07 May 2024

🏭🗒️Microsoft Fabric: The Metrics Layer (Notes) [new feature]

Disclaimer: This is work in progress intended to consolidate information from various sources.
Last updated: 07-May-2024

The Metrics Layer in Microsoft Fabric (adapted diagram)

[new feature] Metrics Layer (Metrics Store)

  • {definition} an abstraction layer available between the data store(s) and end users which allows organizations to create standardized business metrics that are rooted in measures and are discoverable and intended for reuse
    • ⇐ {important} feature still in private preview 
  • {goal} extend existing infrastructure 
    • {benefit} leverages and extends existing features
  • {goal} provide consistent definitions and descriptions [1]
    • consistent definitions that include besides business logic additional dimensions and filters [1]
    • ⇒ {benefit} allows standardizing the metrics across the organization
    • ⇒ {benefit} allows enforcing an SSoT
  • {goal} easy management 
    • via management views 
    • [feature] lineage 
    • [feature] source control
    • [feature] duplicate identification
    • [feature] push updates to downstream uses of the metrics 
  • {goal} searchable and discoverable metrics 
    • {feature} integration
      • based on the Sempy fabric package
        • ⇐ a dataframe for storage and propagation of Power BI metadata which is part of the Python-based semantic link in Fabric
  • {goal} trust
    • [feature] trust indicators
    • {benefit} facilitates reports' adoption
  • {feature} metric set 
    • {definition} a Fabric item that groups together a set of metrics into a mini-model
    • {benefit} allows reducing the overall complexity of semantic models, while being easy to evolve and consume
    • associated with a single domain
      • ⇒ supports the data mesh architecture
    • shareable 
      • can be shared with other users
    • {action} create metric set
      • creates the actual artifact, to which metrics can be added 
  • {feature} metric
    • {definition} a way to elevate the measures from the various semantic models existing in the organization
    • tied to the original semantic model
      • ⇒ {benefit} allows to see how a metric is used across the solutions 
    • reusable
      • can be reused in other Fabric artifacts
        • new reports on the Power BI service
        • notebooks 
          • by copying the code (see the sketch after this list)
      • can be reused in Power BI
        • via OneLake data hub menu element
      • can be chained 
        • changes are propagated downstream 
    • materializable 
      • its output can be persisted to OneLake by saving it as a delta table in a lakehouse
      • {misuse} data is persisted unnecessarily
    • {action} elevate metric
      • copies measure's definition and description
      • ⇒ can imply restructuring, refactoring, moving, and testing a lot of code in the process
      • {misuse} data professionals build everything as metrics
    • {action} update metric
    • {action} add filters to metric 
    • {action} add dimensions to metric
    • {action} materialize metric 
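
Since a metric points back to a measure in an existing semantic model, reusing it in a notebook essentially means evaluating that measure via semantic link. Below is a minimal sketch using the Sempy fabric package; the dataset, measure, and column names are hypothetical, and the code actually generated by the feature (still in private preview) may differ.

```python
# Hedged sketch: evaluating a measure from a semantic model in a Fabric
# notebook via the Sempy fabric package (semantic link).
# Dataset, measure, and column names are hypothetical.
import sempy.fabric as fabric

df = fabric.evaluate_measure(
    dataset="Sales Semantic Model",             # source semantic model
    measure="Total Sales",                      # measure elevated to a metric
    groupby_columns=["'Date'[Year]", "'Product'[Category]"],  # added dimensions
    filters={"'Product'[Category]": ["Bikes"]}  # added filter
)
df.head()
```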

Acronyms:
SSoT - single source of truth

References:
[1] Power BI Tips (2024) Explicit Measures Ep. 236: Metrics Hub, Hot New Feature with Carly Newsome (link)
[2] Power BI Tips (2024) Introducing Fabric Metrics Layer / Power Metrics Hub [with Carly Newsome] (link)

06 May 2024

🏭🧭Microsoft Fabric: The Metrics Layer [new feature]

Introduction

One of the announcements at this year's first Microsoft Fabric Community Conference was the introduction of a metrics layer in Fabric which "allows organizations to create standardized business metrics, that are rooted in measures and are discoverable and intended for reuse" [1]. As it seems, the information provided at the conference was kept to a minimum given that the feature is still in private preview, though several webcasts have started to catch up on the topic (see [2], [4]). Moreover, as part of their show, the Explicit Measures (@PowerBITips) hosts invited Carly Newsome, the manager of the project, who unveiled more details about the project and the feature, details which became the main source for the information below. 

The idea of a metrics layer or metrics store is not new; data professionals have occasionally referred to their structure(s) of metrics as such. The terms gained weight in their modern conception relatively recently, in 2021-2022 (see [5], [6], [7], [8], [10]). Within the modern data stack, a metrics layer or metrics store is an abstraction layer available between the data store(s) and end users. It allows organizations to centrally define, store, and manage business metrics, and thus to standardize them, enforce a single source of truth (SSoT), and solve several issues existing in data stacks. As Benn Stancil remarked early on, the metrics layer is one of the missing pieces of the modern data stack (see [10]).

Microsoft's Solution

Microsoft's business case for the metrics layer's implementation is based on three main ideas: (1) duplicate measures contribute to poor data quality, (2) complex data models hinder self-service, (3) data silos proliferate in Power BI. In Microsoft's conception the metrics layer provides several benefits: consistent definitions and descriptions, easy management via management views, searchable and discoverable metrics, and trust assured through indicators. 

For this feature's implementation Microsoft introduces a new Fabric item called a metric set, which groups several (business) metrics together as part of a mini-model that can be tailored to the needs of a subset of end users and accessed by them via the standard tools already available. Such mini-models allow breaking down and reducing the overall complexity of semantic models, while being easy to evolve and consume. The challenge then becomes how to break down existing and future semantic models into nonoverlapping mini-models, in extremis creating a partition (see the Lego metaphor for data products). The idea of mini-models is not new; [12] advocates the use of a Master Model, a technique for creating derivative tabular models based on a single tabular solution.

A (business) metric is a way to elevate measures from the various semantic models existing in the organization into the mini-model defined by the metric set. A metric can be reused in other Fabric artifacts - currently in new reports on the Power BI service, and in notebooks by copying the code. Metrics can also be chained: reusing a metric within another metric means that the changes made are propagated downstream. 

The Metrics Layer in Microsoft Fabric (adapted diagram)

Every metric is tied to the original semantic model, which allows tracking how a metric is used across solutions and, looking forward to Purview, identifying data lineage. A measure is related to a "table", the source from which the measure came.

Users' Perspective

The Metrics Layer feature is available in the Microsoft Fabric service for Power BI within the Metrics menu element, next to Scorecards. One starts by creating a metric set in an existing workspace, an operation which creates the actual artifact, to which the individual metrics are then added. To create a metric, a user with build permissions can navigate through the semantic models across the different workspaces he/she has access to, pick a measure from one of them, and elevate it to a metric, copying in the process the measure's definition and description. In this way the metric will always point back to the measure in the semantic model, while the metrics thus created are treated as a related collection and can be shared accordingly. 

Once a metric is added to the metric set, one can, in edit mode, add dimensions to it (e.g. Date, Category, Product Id, etc.). One can then further explore a metric's output and add filters (e.g. concentrate on only one product or category), a point from which one can slice-and-dice the data as needed.

There is a panel where one can see where the metric has been used (e.g. in reports, scorecards, and other integrations), when it was last refreshed, and how many times it was used. Thus, one has the most important information in one place, which is great for developers as well as for users. Probably, other metadata will be added, such as whether an increase in the metric would be favorable or unfavorable (like in Tableau Pulse, see [13]), levels of criticality, a unit of measure, or maybe its type - simple metric, performance indicator (PI), result indicator (RI), KPI, KRI, etc.

Metrics can be persisted to OneLake by saving their output as a delta table in a lakehouse. As demonstrated in the presentation(s), with just a copy-paste and a small piece of code one can materialize the data into a lakehouse delta table, from where the data can be reused as needed. Hopefully, the process will be further automated. 
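
As a rough idea of what such code could look like in a Fabric notebook, the sketch below evaluates a measure via the Sempy fabric package and persists the result to a lakehouse delta table. The names are hypothetical and the code generated by the feature may differ; the notebook's built-in Spark session and an attached lakehouse are assumed.

```python
# Hedged sketch: materialize a metric's output as a delta table in a
# lakehouse. Names are hypothetical; the generated code may differ.
import sempy.fabric as fabric

df = fabric.evaluate_measure(
    dataset="Sales Semantic Model",
    measure="Total Sales",
    groupby_columns=["'Date'[Year]", "'Product'[Category]"],
    fully_qualified_columns=False
)
df.columns = ["Year", "Category", "TotalSales"]  # delta-friendly column names

# persist to the attached lakehouse via the notebook's Spark session
spark.createDataFrame(df) \
    .write.format("delta") \
    .mode("overwrite") \
    .saveAsTable("total_sales_by_year_category")
```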

One can consume metrics and metrics sets also in Power BI Desktop, where a new menu element called Metric sets was added under the OneLake data hub, which can be used to connect to a metric set from a Semantic model and select the metrics needed for the project. 

Tapping into the available Power BI solutions is done via an integration feature based on the Sempy fabric package, a dataframe for storage and propagation of Power BI metadata which is part of the Python-based semantic link in Fabric [11].
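
For a taste of what this metadata integration enables, the sketch below uses two documented Sempy functions to enumerate the semantic models in a workspace and the measures they expose - the raw material from which metrics are elevated. The workspace and dataset names are placeholders, and the exact result columns may vary by Sempy version.

```python
# Hedged sketch: discovering candidate measures via semantic link.
# Workspace and dataset names are placeholders.
import sempy.fabric as fabric

# semantic models (datasets) visible in the workspace
datasets = fabric.list_datasets(workspace="Sales Workspace")
print(datasets.head())

# measures exposed by one of the models - candidates for elevation to metrics
measures = fabric.list_measures(dataset="Sales Semantic Model")
print(measures[["Measure Name", "Measure Expression"]].head())
```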

Further Thoughts

When dealing with a new feature, a natural question comes to mind: what challenges does the feature involve, and how can it be misused? Given that the metrics layer can be built within a workspace and that it can tap into the existing measures, one can build on the existing infrastructure. However, this can imply restructuring, refactoring, moving, and testing a lot of code in the process, hopefully with minimal implications for the solutions already available. Whether the process is as simple as imagined is another story. As for misuse, in extremis data professionals might start building everything as metrics, though the real danger comes when data is persisted unnecessarily. 

From a data mesh perspective, a metric set is associated with a domain, though there will be metrics and data common to multiple domains. Moreover, a mini-model has the potential of becoming a data product. Distributing the logic across multiple workspaces and domains can add further challenges, especially concerning the synchronization and implementation of requirements in a way that doesn't lead to bottlenecks. But this is a general challenge for the development team(s). 

The feature will probably undergo further changes until it is released in public preview (probably by September or the end of the year). I subscribe to other data professionals' opinion that the feature was long needed and that it can have an important impact on the solutions built. 


Resources:
[1] Microsoft Fabric Blog (2024) Announcements from the Microsoft Fabric Community Conference (link)
[2] Power BI Tips (2024) Explicit Measures Ep. 236: Metrics Hub, Hot New Feature with Carly Newsome (link)
[3] Power BI Tips (2024) Introducing Fabric Metrics Layer / Power Metrics Hub [with Carly Newsome] (link)
[4] KratosBI (2024) Fabric Fridays: Metrics Layer Conspiracy Theories #40 (link)
[5] Chris Webb's BI Blog (2022) Is Power BI A Semantic Layer? (link)
[6] The Data Stack Show (2022) TDSS 95: How the Metrics Layer Bridges the Gap Between Data & Business with Nick Handel of Transform (link)
[7] Sundeep Teki (2022) The Metric Layer & how it fits into the Modern Data Stack (link)
[8] Nick Handel (2021) A brief history of the metrics store (link)
[9] Aurimas (2022) The Jungle of Metrics Layers and its Invisible Elephant (link)
[10] Benn Stancil (2021) The missing piece of the modern data stack (link)
[11] Microsoft Learn (2024) Sempy fabric Package (link)
[12] Michael Kovalsky (2019) Master Model: Creating Derivative Tabular Models (link)
[13] Christina Obry (2023) The Power of a Metrics Layer - and How Your Organization Can Benefit From It (link)

18 April 2024

🏭🗒️Microsoft Fabric: Mirroring (Notes)

Disclaimer: This is work in progress intended to consolidate information from various sources.
Last updated: 18-Apr-2024

Mirroring in Microsoft Fabric (adapted after [2])

Mirroring

  • low-cost, low-latency, fully managed service that allows replicating data from various systems into OneLake [1]
    • supported for Azure SQL Database, Azure Cosmos DB, and Snowflake [1]
    • ⇒ replaces ETL pipelines
    • ⇐ manages the replication of data into OneLake and conversion to Parquet [1]
    • changes have a near real-time latency [3]
    • uses the source database’s CDC feature [4]
  • {benefit} allows users to use a highly integrated, end-to-end, and easy-to-use service [1]
    • the mirrored data can be used in Power BI
      • ⇐ since tables are all v-ordered delta tables [3]
  • {benefit} the delta tables can then be used in every Fabric experience, allowing users to accelerate their journey into Fabric [1]
    • {restriction} replication of views is currently not supported [3]
  • {benefit} allows writing cross-database queries against the mirrored databases, warehouses, and the SQL analytics endpoints of Lakehouses in a single T-SQL query [1] (see the sketch after this list)
  • landing zone in OneLake 
    • stores both the snapshot and change data [3]
      • ⇐ improves performance when converting files into delta verti-parquet [3]
  • {operation} enable the mirroring 
    • by creating a secure connection to the operational data source
    • an entire database or individual tables can be replicated 
    • creates three items in the targeted workspace:
      • the mirrored database
      • a SQL analytics endpoint
      • a Default semantic model
    • the same source database can be mirrored multiple times
      • ⇐ though usually not needed
        • scenario: replicating data to different types of environments
  • {operation} stopping the mirroring
    • stops the replication in the source database, but a copy of the tables is kept in OneLake [3]
  • {operation} restarting the mirroring 
    • results in all data being replicated from the start [3]
  • {operation} remove a table from mirroring
    • the table is no longer replicated and its data is deleted from OneLake [3]
  • {operation} sharing
    • users grant other users or groups of users access to a mirrored database without giving access to the workspace and the rest of its items [1]
      • access is also granted to the SQL analytics endpoint and associated default semantic model [1]
      • shared mirrored databases can be found through either
        • Data Hub
        • Shared with Me section in Microsoft Fabric
    • triggers an initial replication
  • {requirement} [licensing] requires Power BI Premium, Fabric Capacity, or Trial Capacity
  • {feature} monitoring
    • {benefit} allows to gain insights into mirroring operations and when the replica in Fabric OneLake was last refreshed [4]
  • [Azure SQL Database] 
    • {restriction} doesn't support Azure SQL Database logical servers behind an AVN or private networking [2]
      • {requirement} update the Azure SQL logical server firewall rules to Allow public network access [2]
      • {requirement} enable the Allow Azure services option to connect to your Azure SQL Database logical server [2]
    • {restriction} access through the Power BI Gateway or behind a firewall is unsupported [3]
    • authentication
      • SQL authentication with user name and password
      • Microsoft Entra ID
      • Service Principal
    • {troubleshooting} check if the changes flow properly (see the sketch after this list)
      • via sys.dm_change_feed_log_scan_sessions DMV
    • {troubleshooting} check if there are any problems reported
      • via sys.dm_change_feed_errors DMV
    • {troubleshooting} check if the mirroring was properly enabled
      • via sp_help_change_feed stored procedure 
    • {troubleshooting} disable mirroring
      • via sp_change_feed_disable_db stored procedure
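
The cross-database querying and the troubleshooting objects named above can be exercised with a few lines of code. Below is a minimal Python sketch, assuming pyodbc and placeholder server, database, and table names: the first part runs a cross-database T-SQL query against the SQL analytics endpoint in Fabric, the second runs the documented change-feed checks against the source Azure SQL database.

```python
# Hedged sketch: query a mirrored database cross-database and check the
# change feed on the source. All names and connection details are placeholders.
import pyodbc

# 1) cross-database T-SQL query via the SQL analytics endpoint,
#    joining a mirrored table with a lakehouse table (three-part naming)
fabric_conn = pyodbc.connect(
    "Driver={ODBC Driver 18 for SQL Server};"
    "Server=<endpoint>.datawarehouse.fabric.microsoft.com;"
    "Database=MirroredSalesDb;"
    "Authentication=ActiveDirectoryInteractive;")
rows = fabric_conn.execute("""
    SELECT TOP 10 o.OrderId, c.CustomerName
    FROM MirroredSalesDb.dbo.Orders AS o
    JOIN SalesLakehouse.dbo.Customers AS c ON c.CustomerId = o.CustomerId;
""").fetchall()

# 2) troubleshooting on the *source* Azure SQL database
src_conn = pyodbc.connect(
    "Driver={ODBC Driver 18 for SQL Server};"
    "Server=<logical-server>.database.windows.net;"
    "Database=SalesDb;UID=<user>;PWD=<password>;")
cur = src_conn.cursor()
sessions = cur.execute("SELECT * FROM sys.dm_change_feed_log_scan_sessions;").fetchall()  # do changes flow?
errors = cur.execute("SELECT * FROM sys.dm_change_feed_errors;").fetchall()  # any errors reported?
cur.execute("EXEC sp_help_change_feed;")  # was mirroring properly enabled?
```
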
Acronyms:
AVN - Azure Virtual Network
CDC - Change Data Capture
ETL - Extract, Transform, Load
DMV - Dynamic Management View

References:
[1] Microsoft Learn - Microsoft Fabric (2024) What is Mirroring in Fabric? (link)
[2] Microsoft Learn - Microsoft Fabric (2024) Mirroring Azure SQL Database [Preview] (link)
[3] Microsoft Learn - Microsoft Fabric (2024) Frequently asked questions for Mirroring Azure SQL Database in Microsoft Fabric [Preview] (link)
[4] Microsoft Fabric Updates Blog (2023) (link)

Resources:
[1] Tutorial: Configure Microsoft Fabric mirrored databases from Azure SQL Database [Preview] (link)
[2] Microsoft Fabric Updates Blog (2024) Announcing the Public Preview of Mirroring in Microsoft Fabric, by Charles Webb (link)

