Showing posts with label integration.

16 April 2025

🧮ERP: Implementations (Part XIV: A Never-Ending Story)

ERP Implementations Series

An ERP implementation is occasionally considered a one-time endeavor after which an organization will live happily ever after. In an ideal world that would be true, though the work never stops – things that were carved out from the implementation, optimizations, new features, new regulations, new requirements, integration with other systems, etc. An implementation is thus just the beginning of what comes next, and it's essential to get the foundation right – that's the purpose of the ERP implementation – to provide a foundation on which something bigger and solid can be erected. 

No matter how well an ERP implementation is managed and executed, respectively how well people work towards the same goals, there's always something forgotten or carved out from the initial project. The usual suspects are the integrations with other systems, though there can also be minor or even major features that are planned to be addressed later, if the implementation hasn't already consumed all the financial resources available, as is usually the case. Some of the topics can be addressed as Change Requests or consolidated into projects of their own. 

Even simple integrations can become complex when the processes are poorly designed, and that happens more often than people think. It's not necessarily about a lack of skills or about the technologies used, but about the degree to which the processes can work in a loosely coupled, interconnected manner. Even unidirectional integrations can raise challenges, though everything increases in complexity when the flow of data is bidirectional. Moreover, the complexity increases with each system added to the overall architecture. 

Like the manual creation of a sculpture, the processes in an ERP implementation form a skeleton that needs chiseling and smoothing until the form reaches the desired optimized shape. However, optimization is not a one-time attempt but a continuous work of exploring what is achievable, what works, what is optimal. Sometimes optimization is an exact science, while other times it's about (scientific) experimentation in which theory, ideas and investments are put to good use. However, experimentation tends to be expensive, at least in terms of time and effort, and these are probably the main reasons why some organizations don't even attempt it – or maybe it's just laziness, pure indifference or self-preservation. In fact, why change something that already works?

Typically, software vendors make new releases available on a periodic basis as part of their planning for growth and for attracting more business. Each release that touches used functionality typically needs proper evaluation, testing and whatever else organizations consider important as part of the release management process. Ideally, everything should go smoothly, though life never ceases to surprise, and even a minor release can have an important impact when critical functionality that worked earlier stops working. Test automation and other practices can make an important difference for organizations, though these require additional effort and investments that usually pay off when done right. 

Regulations and other similar requirements must be addressed as they can involve penalties or other risks that are usually worth avoiding. Ideally such requirements should be supported by design, though even then a certain volume of work is involved. Moreover, the business context can change unexpectedly, and further requirements need to be considered eventually. 

The work on an ERP system and the infrastructure built around it is a never-ending story. Therefore, organizations must have not only the resources for the initial project, but also for what comes after it. Of course, some work can be performed manually, some requirements can be delayed, some risks can be assumed, though the value of an ERP system increases with its extended usage, at least in theory. 

15 April 2025

🧮ERP: Implementations (Part XII: The Process Perspective)

ERP Implementations Series

Technology can have a tremendous impact on organizations, helping them achieve their strategic goals and objectives; however, it takes more than the implementation of one or more technologies to leverage that potential! This applies to ERP as well as to other technology implementations, though the role of technology is more important in the former given its transformative role. ERP implementations can be the foundation on which the whole future of the organization is built, and it's ideal to have a broader strategy that looks at all the facets of an organization before, during and after the implementation. 

One of the most important assets an organization has is its processes, and an organization's success depends on the degree to which those processes are used to leverage the various strategies. Many customers want their business processes to be implemented as-is on the new platform, and that's the point where many projects go in the wrong direction! There are probably areas where this approach makes sense, though organizations also need to look at the alternatives available in the new ecosystem, respectively identify and prioritize the missing features accordingly. There will also be extreme cases in which one system or a mix of systems will be considered not feasible, and this is an alternative that should be considered during such evaluations! 

An ERP system allows organizations to implement their key value-creation processes by providing a technological skeleton with a set of configurations and features that can be used to address a wide set of requirements. Such a framework is an enabler - it makes things possible - though the potential is not reached automatically, and this is one of the many false assumptions associated with such projects. Customers choose such a system and expect magic to happen! Many of the false perceptions are strengthened by implementers or the other parties involved in the projects. As in other IT areas, there are many misconceptions that pervade. 

An ERP thus provides a basis on which an organization can implement its processes. Doing an ERP implementation without process redesign is seldom possible, even if many organizations want to avoid it at all costs. Even if an organization's processes are highly standardized, expecting a system to model them by design is utopian, given that ERP systems tend to target the most important aspects identified across industries. And thus, customizations come into play, some of them done without looking for alternatives already existing in the intrinsic or extended range of solutions available in an ERP's ecosystem. 

One of the most important dangers is when an organization's processes are so complex that their replication in the new environment creates more issues than the implementation can solve. At least in the first phases of the implementation, organizations must learn to compromise and focus on the critical aspects without which the organization can't do its business. Moreover, the costs of implementations tend to increase exponentially when multiple complex requirements are added to address the gaps. Organizations should always look at alternatives – integrations with third-party systems tend to be more cost-effective than rebuilding the respective functionality from scratch! 

It's also true that some processes are too complex to be implemented, though the solution usually resides somewhere in the middle. Each customization adds another level of complexity, and a whole range of risks that many customers take. Conversely, there's no blueprint that works for everybody. Organizations must thus compromise, and that's probably one of the most important aspects they should be aware of! However, compromises must also be made in the right places, while evaluating the alternatives and the possible outcomes. It's important to be aware of the full extent of the implications of such decisions. 

13 April 2025

🏭🗒️Microsoft Fabric: Continuous Integration & Continuous Deployment [CI/CD] [Notes]

Disclaimer: This is work in progress intended to consolidate information from various sources for learning purposes. For the latest information please consult the documentation (see the links below)! 

Last updated: 13-Apr-2025

[Microsoft Fabric] Continuous Integration & Continuous Deployment [CI/CD] 
  • {def} development processes, tools, and best practices used to automate the integration, testing, and deployment of code changes to ensure efficient and reliable development
    • can be used in combination with a client tool
      • e.g. VS Code, Power BI Desktop
      • don’t necessarily need a workspace
        • developers can create branches and commit changes to that branch locally, push those to the remote repo and create a pull request to the main branch, all without a workspace
        • workspace is needed only as a testing environment [1]
          • to check that everything works in a real-life scenario [1]
    • addresses a few pain points [2]
      • manual integration issues
        • manual changes can lead to conflicts and errors
          • slow down development [2]
      • development delays
        • manual deployments are time-consuming and prone to errors
          • lead to delays in delivering new features and updates [2]
      • inconsistent environments
        • inconsistencies between environment cause issues that are hard to debug [2]
      • lack of visibility
        • can be challenging to
          • track changes through their lifetime [2]
          • understand the state of the codebase [2]
    • {process} continuous integration (CI)
    • {process} continuous deployment (CD) (a minimal automation sketch is provided after these notes)
    • architecture
      • {layer} development database 
        • {recommendation} should be relatively small [1]
      • {layer} test database 
        • {recommendation} should be as similar as possible to the production database [1]
      • {layer} production database

      • data items
        • items that store data
        • items' definition in Git defines how the data is stored [1]
    • {stage} development 
      • {best practice} back up work to a Git repository
        • back up the work by committing it into Git [1]
        • {prerequisite} the work environment must be isolated [1]
          • so others don’t override the work before it gets committed [1]
          • commit to a branch no other developer is using [1]
          • commit together changes that must be deployed together [1]
            • helps later when 
              • deploying to other stages
              • creating pull requests
              • reverting changes
      • {warning} big commits might hit the max commit size limit [1]
        • {bad practice} store large-size items in source control systems, even if it works [1]
        • {recommendation} consider ways to reduce items’ size if they have lots of static resources, like images [1]
      • {action} revert to a previous version
        • {operation} undo
          • revert the immediate changes made, as long as they aren't committed yet [1]
          • each item can be reverted separately [1]
        • {operation} revert
          • reverting to older commits
            • {recommendation} promote an older commit to be the HEAD 
              • via git revert or git reset [1]
              • shows that there’s an update in the source control pane [1]
              • the workspace can be updated with that new commit [1]
          • {warning} reverting a data item to an older version might break the existing data and could possibly require dropping the data or the operation might fail [1]
          • {recommendation} check dependencies in advance before reverting changes back [1]
      • {concept} private workspace
        • a workspace that provides an isolated environment [1]
        • allows to work in isolation in a separate workspace [1]
        • {prerequisite} the workspace is assigned to a Fabric capacity [1]
        • {prerequisite} access to data to work in the workspace [1]
        • {step} create a new branch from the main branch [1]
          • allows to have most up-to-date version of the content [1]
          • can be used for any future branch created by the user [1]
            • when a sprint is over, the changes are merged and one can start a fresh new task [1]
              • switch the connection to a new branch on the same workspace
            • the approach can also be used when a bug needs to be fixed in the middle of a sprint [1]
          • {validation} connect to the correct folder in the branch to pull the right content into the workspace [1]
      • {best practice} make small incremental changes that are easy to merge and less likely to get into conflicts [1]
        • update the branch to resolve the conflicts first [1]
      • {best practice} change workspace’s configurations to enable productivity [1]
        • connection between items, or to different data sources or changes to parameters on a given item [1]
      • {recommendation} make sure you're working with the supported structure of the item you're authoring [1]
        • if you’re not sure, first clone a repo with content already synced to a workspace, then start authoring from there, where the structure is already in place [1]
      • {constraint} a workspace can only be connected to a single branch at a time [1]
        • {recommendation} treat this as a 1:1 mapping [1]
    • {stage} test
      • {best practice} simulate a real production environment for testing purposes [1]
        • {alternative} simulate this by connecting Git to another workspace [1]
      • factors to consider for the test environment
        • data volume
        • usage volume
        • production environment’s capacity
          • stage and production should have the same (minimal) capacity [1]
            • using the same capacity can make production unstable during load testing [1]
              • {recommendation} test using a different capacity similar in resources to the production capacity [1]
              • {recommendation} use a capacity that allows to pay only for the testing time [1]
                • allows to avoid unnecessary costs [1]
      • {best practice} use deployment rules with a real-life data source
        • {recommendation} use data source rules to switch data sources in the test stage or parameterize the connection if not working through deployment pipelines [1]
        • {recommendation} separate the development and test data sources [1]
        • {recommendation} check related items
          • the changes made can also affect the dependent items [1]
        • {recommendation} verify that the changes don’t affect or break the performance of dependent items [1]
          • via impact analysis.
      • {operation} update data items in the workspace
        • imports items’ definition into the workspace and applies it on the existing data [1]
        • the operation is the same for Git and deployment pipelines [1]
        • {recommendation} know in advance what the changes are and what impact they have on the existing data [1]
        • {recommendation} use commit messages to describe the changes made [1]
        • {recommendation} upload the changes first to a dev or test environment [1]
          • {benefit} allows to see how that item handles the change with test data [1]
        • {recommendation} check the changes on a staging environment, with real-life data (or as close to it as possible) [1]
          • {benefit} allows to minimize the unexpected behavior in production [1]
        • {recommendation} consider the best timing when updating the Prod environment [1]
          • {benefit} minimize the impact errors might cause on the business [1]
        • {recommendation} perform post-deployment tests in Prod to verify that everything works as expected [1]
        • {recommendation} have a deployment plan, respectively a recovery plan [1]
          • {benefit} allows to minimize the effort, respectively the downtime [1]
    • {stage} production
      • {best practice} let only specific people manage sensitive operations [1]
      • {best practice} use workspace permissions to manage access [1]
        • applies to all BI creators for a specific workspace who need access to the pipeline
      • {best practice} limit access to the repo or pipeline by enabling permissions only for users who are part of the content creation process [1]
      • {best practice} set deployment rules to ensure production stage availability [1]
        • {goal} ensure the data in production is always connected and available to users [1]
        • {benefit} allows deployments to run while minimizing downtime
        • applies to data sources and parameters defined in the semantic model [1]
      • deployment into production using Git branches
        • {recommendation} use release branches [1]
          • requires changing the workspace connection to the new release branch before every deployment [1]
          • if the build or release pipeline requires to change the source code, or run scripts in a build environment before deployment, then connecting the workspace to Git won't help [1]
      • {recommendation} after deploying to each stage, make sure to change all the configuration specific to that stage [1]
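
To make the CD process more tangible, below is a minimal sketch of how a stage deployment could be automated against the Fabric deployment pipelines REST API from a Python script running in a build/release pipeline. It is only an illustration under stated assumptions: the tenant, client, pipeline and stage identifiers are placeholders, and the token scope, endpoint and payload follow the documented pattern but should be verified against the current Fabric REST API reference.

# Minimal sketch (assumptions flagged): trigger a Fabric deployment pipeline
# deployment with a service principal token. All identifiers are placeholders;
# verify the endpoint and payload against the current Fabric REST API reference.
import requests
from azure.identity import ClientSecretCredential

TENANT_ID = "<tenant-id>"                    # placeholder
CLIENT_ID = "<app-client-id>"                # placeholder
CLIENT_SECRET = "<app-client-secret>"        # placeholder; read from a secure store
PIPELINE_ID = "<deployment-pipeline-id>"     # placeholder

# Acquire a Microsoft Entra token for the Fabric API (assumed scope)
credential = ClientSecretCredential(TENANT_ID, CLIENT_ID, CLIENT_SECRET)
token = credential.get_token("https://api.fabric.microsoft.com/.default").token
headers = {"Authorization": f"Bearer {token}"}

# Deploy the content of the test stage to the production stage (assumed endpoint/payload)
url = f"https://api.fabric.microsoft.com/v1/deploymentPipelines/{PIPELINE_ID}/deploy"
payload = {
    "sourceStageId": "<test-stage-id>",      # placeholder
    "targetStageId": "<prod-stage-id>",      # placeholder
    "note": "Automated deployment triggered by the CI/CD pipeline",
}
response = requests.post(url, headers=headers, json=payload)
response.raise_for_status()
print("Deployment request submitted:", response.status_code)

In a real pipeline the secret would come from a variable group or a key vault rather than being hard-coded, and the script would typically wait for the deployment to complete before running the post-deployment tests mentioned above.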

    References:
    [1] Microsoft Learn (2025) Fabric: Best practices for lifecycle management in Fabric [link]
    [2] Microsoft Learn (2025) Fabric: CI/CD for pipelines in Data Factory in Microsoft Fabric [link]
    [3] Microsoft Learn (2025) Fabric: Choose the best Fabric CI/CD workflow option for you [link]

    Acronyms:
    API - Application Programming Interface
    BI - Business Intelligence
    CI/CD - Continuous Integration and Continuous Deployment
    VS - Visual Studio

    20 January 2025

    🏭🗒️Microsoft Fabric: [Azure] Service Principals (SPN) [Notes]

    Disclaimer: This is work in progress intended to consolidate information from various sources for learning purposes. For the latest information please consult the documentation (see the links below)! 

    Last updated: 20-Jan-2025

    [Azure] Service Principal (SPN)  

    • {def} a non-human, application-based security identity used by applications or automation tools to access specific Azure resources [1]
      • can be assigned precise permissions, making them perfect for automated processes or background services
        • allows to minimize the risks of human error and identity-based vulnerabilities
        • supported in datasets, Gen1/Gen2 dataflows, datamarts [2]
        • authentication type 
          • supported only by [2]
            • Azure Data Lake Storage
            • Azure Data Lake Storage Gen2
            • Azure Blob Storage
            • Azure Synapse Analytics
            • Azure SQL Database
            • Dataverse
            • SharePoint online
          • doesn’t support
            • SQL data source with Direct Query in datasets [2]
    • when registering a new application in Microsoft Entra ID, a SPN is automatically created for the app registration [4]
      • the access to resources is restricted by the roles assigned to the SPN
        • ⇒ gives control over which resources can be accessed and at which level [4]
      • {recommendation} use SPN with automated tools [4]
        • rather than allowing them to sign in with a user identity  [4]
      • {prerequisite} an active Microsoft Entra user account with sufficient permissions to 
        • register an application with the tenant [4]
        • assign to the application a role in the Azure subscription [4]
        •  requires Application.ReadWrite.All permission [4]
    • extended to support Fabric Data Warehouses [1]
      • {benefit} automation-friendly API Access
        • allows to create, update, read, and delete Warehouse items via Fabric REST APIs using service principals [1] (see the sketch after these notes)
        • enables to automate repetitive tasks without relying on user credentials [1]
          • e.g. provisioning or managing warehouses
          • increases security by limiting human error
        • the warehouses thus created will be displayed in the Workspace list view in Fabric UI, with the Owner name of the SPN [1]
        • applicable to users with administrator, member, or contributor workspace role [3]
        • minimizes risk
          • the warehouses created with delegated account or fixed identity (owner’s identity) will stop working when the owner leaves the organization [1]
            • Fabric requires the user to login every 30 days to ensure a valid token is provided for security reasons [1]
      • {benefit} seamless integration with Client Tools: 
        • tools like SSMS can connect to the Fabric DWH using SPN [1]
        • SPN provides secure access for developers to 
          • run COPY INTO
            • with and without firewall enabled storage [1]
          • run any T-SQL query programmatically on a schedule with ADF pipelines [1]
      • {benefit} granular access control
        • Warehouses can be shared with an SPN through the Fabric portal [1]
          • once shared, administrators can use T-SQL commands to assign specific permissions to SPN [1]
            • allows to control precisely which data and operations an SPN has access to  [1]
              • GRANT SELECT ON <table name> TO <Service principal name>  
        • warehouses' ownership can be changed from an SPN to user, and vice-versa [3]
      • {benefit} improved DevOps and CI/CD Integration
        • SPN can be used to automate the deployment and management of DWH resources [1]
          •  ensures faster, more reliable deployment processes while maintaining strong security postures [1]
      • {limitation} default semantic models are not supported for SPN created warehouses [3]
        • ⇒ features such as listing tables in dataset view, creating report from the default dataset don’t work [3]
      • {limitation} SPN for SQL analytics endpoints is not currently supported
      • {limitation} SPNs are currently not supported for COPY INTO error files [3]
        • ⇐ Entra ID credentials are not supported as well [3]
      • {limitation} SPNs are not supported for GIT APIs. SPN support exists only for Deployment pipeline APIs [3]
      • monitoring tools
        • [DMV] sys.dm_exec_sessions.login_name column [3] 
        • [Query Insights] queryinsights.exec_requests_history.login_name [3]
        • Query activity
          • submitter column in Fabric query activity [3]
        • Capacity metrics app: 
          • compute usage for warehouse operations performed by SPN appears as the Client ID under the User column in Background operations drill through table [3]
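
    To make the automation benefit concrete, below is a minimal sketch of creating a Warehouse item with a service principal via the Fabric REST API (see also [5]). It is an illustration under stated assumptions: the tenant, client and workspace identifiers are placeholders, and the token scope, endpoint and payload follow the documented pattern for Fabric item APIs but should be double-checked against the current REST API reference.

# Minimal sketch (assumptions flagged): create a Fabric Warehouse with an SPN.
# All identifiers are placeholders; the SPN needs an appropriate role on the
# target workspace, and the tenant settings must allow SPNs to call Fabric APIs.
import requests
from azure.identity import ClientSecretCredential

TENANT_ID = "<tenant-id>"               # placeholder
CLIENT_ID = "<spn-client-id>"           # placeholder
CLIENT_SECRET = "<spn-client-secret>"   # placeholder; read from a secure store
WORKSPACE_ID = "<workspace-id>"         # placeholder

# Token for the Fabric API (assumed scope)
credential = ClientSecretCredential(TENANT_ID, CLIENT_ID, CLIENT_SECRET)
token = credential.get_token("https://api.fabric.microsoft.com/.default").token
headers = {"Authorization": f"Bearer {token}"}

# Create the warehouse in the workspace (assumed endpoint and payload)
url = f"https://api.fabric.microsoft.com/v1/workspaces/{WORKSPACE_ID}/warehouses"
response = requests.post(url, headers=headers, json={"displayName": "DWH_Automated"})
response.raise_for_status()

# List the warehouses to confirm that the item was created with the SPN as owner
print(requests.get(url, headers=headers).json())

    Once a warehouse is shared with an SPN, the GRANT statement shown above can be used to restrict it to exactly the permissions it needs.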

    References:
    [1] Microsoft Fabric Updates Blog (2024) Service principal support for Fabric Data Warehouse [link]
    [2] Microsoft Fabric Learn (2024) Service principal support in Data Factory [link]
    [3] Microsoft Fabric Learn (2024) Service principal in Fabric Data Warehouse [link]
    [4] Microsoft Fabric Learn (2024) Register a Microsoft Entra app and create a service principal [link]
    [5] Microsoft Fabric Updates Blog (2024) Announcing Service Principal support for Fabric APIs [link]

    Acronyms:
    ADF - Azure Data Factory
    API - Application Programming Interface
    CI/CD - Continuous Integration/Continuous Deployment
    DMV - Dynamic Management View
    DWH - Data Warehouse
    SPN - service principal
    SSMS - SQL Server Management Studio

    07 December 2024

    🏭 💠Data Warehousing: Microsoft Fabric (Part VI: SQL Databases for OLTP scenarios) [new feature]

    Data Warehousing Series

    One of the interesting announcements at Ignite is the availability in public preview of SQL databases in Microsoft Fabric, "a versatile and developer-friendly transactional database built on the foundation of Azure SQL database". With this, Fabric can address besides OLAP also OLTP scenarios, evolving thus from an analytics platform to a data platform [1]. According to the announcement, besides the AI-optimized architectural aspects, the feature makes Azure SQL simple, autonomous and secure by design [1], and these latter aspects are considered in this post. 

    Simplicity revolves around the deployment and configuration of databases, the creation of a new database requiring only a name, and the database is created in seconds [1]. It's a considerable improvement compared with the relatively complex setup needed for on-premises configurations, though sometimes more flexibility in configuration is needed upfront or over a database's lifetime. To get a database ready for testing, one can import a sample database or get specific data via dataflows and/or pipelines [1]. As development tools one can use Visual Studio Code or SSMS [1], and probably more tools will be available in time.
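
    Beyond the graphical tools, such a database can also be queried programmatically. The snippet below is a minimal sketch (not taken from the announcement) of connecting from Python via pyodbc with Microsoft Entra authentication; the server and database names are placeholders to be copied from the database's connection settings in the Fabric portal, and the Microsoft ODBC Driver 18 for SQL Server is assumed to be installed.

# Minimal sketch (assumptions flagged): query a SQL database in Fabric from Python.
# Server and database names are placeholders taken from the Fabric portal;
# requires pyodbc and the Microsoft ODBC Driver 18 for SQL Server.
import pyodbc

conn_str = (
    "Driver={ODBC Driver 18 for SQL Server};"
    "Server=<sql-connection-string-from-the-fabric-portal>;"  # placeholder
    "Database=<database-name>;"                               # placeholder
    "Authentication=ActiveDirectoryInteractive;"              # Microsoft Entra sign-in
    "Encrypt=yes;"
)

with pyodbc.connect(conn_str) as conn:
    cursor = conn.cursor()
    cursor.execute("SELECT TOP (10) name, create_date FROM sys.tables ORDER BY create_date DESC;")
    for row in cursor.fetchall():
        print(row.name, row.create_date)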

    The integration with both GitHub and Azure DevOps allows to configure each database under source control, which is needed for many scenarios, especially when multiple resources make changes to the database objects [1]. Frankly, that's mainly important during the development phase, respectively in scenarios in which multiple people make changes to the logic in parallel. It will be interesting to see how much overhead or how many challenges the feature adds to development and how smoothly everything works together!

    The most important aspect for many solutions is the replication of data in near-real time to the (open-source) delta parquet format in OneLake, thus making the data available for analytics almost immediately [1]. Probably, many cloud-based applications can benefit from this aspect, even if the performance might not be as good as in other well-established architectures. However, there are many other scenarios in which one needs to maintain and use data for OLTP/OLAP purposes. This invites adequate testing and a good weighing of the advantages and disadvantages involved. 

    A SQL database is a native item in Fabric, and therefore it utilizes Fabric capacity units like other Fabric workloads [1]. One can use the Fabric SKU estimator (still in private preview) to estimate the costs [2], though it will be interesting to see how cost-effective the solutions are. Probably, especially when the infrastructure is already available outside of Fabric, it will be easier and more cost-effective to use the mirroring functionality. One should test and have a better estimate before moving blindly from the existing infrastructure to Fabric. 

    SQL databases in Fabric are autonomous by design, while allowing to get the best performance and availability by default [1]. High availability is reached through zone redundancy, while performance is achieved by scaling automatically the storage and compute to accommodate the workloads [1]. The auto-optimization capability is achieved with the help of the latest Intelligent Query Processing (IQP) enhancements, respectively the creation of missing indexes to improve query performance [1]. It will be interesting to see how the whole process works, given that the maintenance of indexes usually involves some challenges (e.g. identifying covering indexes, indexes needed only for temporary workloads, duplicated indexes).

    In addition, all data is replicated to OneLake by default [1]. Finally, the database always receives the latest security updates with auto-patching, while automatic backups help in disaster recovery scenarios [1], which can be of real help for database administrators. 

    References:
    [1] Microsoft Fabric Updates Blog (2024) Announcing SQL database in Microsoft Fabric Public Preview [link]
    [2] Microsoft Fabric Updates Blog (2024) Announcing New Recruitment for the Private Preview of Microsoft Fabric SKU Estimator [link]

    06 May 2024

    🧭🏭Business Intelligence: Microsoft Fabric (Part III: The Metrics Layer) 🆕

    Introduction

    One of the announcements at this year's first Microsoft Fabric Community conference was the introduction of a metrics layer in Fabric which "allows organizations to create standardized business metrics, that are rooted in measures and are discoverable and intended for reuse" [1]. As it seems, the information content provided at the conference was kept to a minimum given that the feature is still in private preview, though several webcasts have started to catch up on the topic (see [2], [4]). Moreover, as part of their show, the Explicit Measures (@PowerBITips) hosts had as an invitee Carly Newsome, the manager of the project, who unveiled more details about the project and the feature, details which became the main source for the information below. 

    The idea of a metric layer or metric store is not new, data professionals occasionally refer to their structure(s) of metrics as such. The terms gained weight in their modern conception relatively recently in 2021-2022 (see [5], [6], [7], [8], [10]). Within the modern data stack, a metrics layer or metric store is an abstraction layer available between the data store(s) and end users. It allows to centrally define, store, and manage business metrics. Thus, it allows us to standardize and enforce a single source of truth (SSoT), respectively solve several issues existing in the data stacks. As Benn Stancil earlier remarked, the metrics layer is one of the missing pieces from the modern data stack (see [10]).

    Microsoft's Solution

    Microsoft's business case for the metrics layer's implementation is based on three main ideas: (1) duplicate measures contribute to poor data quality, (2) complex data models hinder self-service, (3) data silos in Power BI need to be reduced. In Microsoft's conception the metrics layer provides several benefits: consistent definitions and descriptions, easy management via management views, searchable and discoverable metrics, respectively trust assured through indicators. 

    For this feature's implementation Microsoft introduces a new Fabric item called a metric set that allows to group several (business) metrics together as part of a mini-model that can be tailored to the needs of a subset of end-users and accessed by them via the standard tools already available. The metric set becomes thus a mini-model. Such mini-models allow to break down and reduce the overall complexity of semantic models, while being easy to evolve and consume. The challenge will then become how to break down existing and future semantic models into nonoverlapping mini-models, creating in extremis a partition (see the Lego metaphor for data products). The idea of mini-models is not new, [12] advocating the idea of using a Master Model, a technique for creating derivative tabular models based on a single tabular solution.

    A (business) metric is a way to elevate the measures from the various semantic models existing in the organization within the mini-model defined by the metric set. A metric can be reused in other fabric artifacts - currently in new reports on the Power BI service, respectively in notebooks by copying the code. Reusing metrics in other measures can mean that one can chain metrics and the changes made will be further propagated downstream. 

    The Metrics Layer in Microsoft Fabric (adapted diagram)

    Every metric is tied to the original semantic model, which allows to track how a metric is used across the solutions and, looking forward to Purview, to identify the data's lineage. A measure is related to a "table", the source from which the measure came.

    Users' Perspective

    The Metrics Layer feature is available in Microsoft Fabric service for Power BI within the Metrics menu element next to Scorecards. One starts by creating a metric set in an existing workspace, an operation which creates the actual artifact, to which the individual metrics are added. To create a metric, a user with build permissions can navigate through the semantic models across different workspaces he/she has access to, pick a measure from one of them and elevate it to a metric, copying in the process its measure's definition and description. In this way the metric will always point back to the measure from the semantic model, while the metrics thus created are considered as a related collection and can be shared around accordingly. 

    Once a metric is added to the metric set, one can add dimensions to it in edit mode (e.g. Date, Category, Product Id, etc.). One can then further explore a metric's output and add filters (e.g. concentrate on only one product or category), a point from which one can slice-and-dice the data as needed.

    There is a panel where one can see where the metric has been used (e.g. in reports, scorecards, and other integrations), when it was last refreshed, respectively how many times it was used. Thus, one has the most important information in one place, which is great for developers as well as for the users. Probably, other metadata will be added, such as whether an increase in the metric would be favorable or unfavorable (like in Tableau Pulse, see [13]), or maybe levels of criticality, a unit of measure, or maybe its type - simple metric, performance indicator (PI), result indicator (RI), KPI, KRI, etc.

    Metrics can be persisted to the OneLake by saving their output to a delta table into the lakehouse. As demonstrated in the presentation(s), with just a copy-paste and a small piece of code one can materialize the data into a lakehouse delta table, from where the data can be reused as needed. Hopefully, the process will be further automated. 
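
    As a rough sketch of that step (not the exact code shown in the presentations), materializing a metric's output from a Fabric notebook with PySpark could look as follows; the dataframe below merely stands in for the output returned by the code copied from the metric set, and the table name is hypothetical.

# Rough sketch (Fabric notebook with an attached lakehouse): persist a metric's
# output as a delta table. 'spark' is the session predefined in Fabric notebooks;
# the dataframe is a stand-in for the metric's output, the table name is hypothetical.
metric_df = spark.createDataFrame(
    [("2024-04", "Bikes", 125000.0), ("2024-04", "Accessories", 23000.0)],
    ["Month", "Category", "SalesAmount"],
)

(metric_df.write
    .format("delta")
    .mode("overwrite")                    # or "append" for periodic snapshots
    .saveAsTable("metric_sales_amount"))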

    One can consume metrics and metrics sets also in Power BI Desktop, where a new menu element called Metric sets was added under the OneLake data hub, which can be used to connect to a metric set from a Semantic model and select the metrics needed for the project. 

    Tapping into the available Power BI solutions is done via an integration feature based on the Sempy fabric package, a dataframe for the storage and propagation of Power BI metadata which is part of the Python-based semantic link in Fabric [11].
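
    As an illustration of what semantic link offers in a notebook, a small sketch is given below; the dataset, measure and column names are hypothetical, and the exact function signatures should be checked against the sempy documentation [11].

# Small sketch (assumptions flagged): list and evaluate Power BI measures via
# semantic link in a Fabric notebook. Dataset, measure and column names are hypothetical.
import sempy.fabric as fabric

# List the measures defined in a semantic model (returns a pandas DataFrame)
measures = fabric.list_measures(dataset="Sales Semantic Model")
print(measures.head())

# Evaluate a measure grouped by a dimension column
result = fabric.evaluate_measure(
    dataset="Sales Semantic Model",
    measure="Total Sales",
    groupby_columns=["Product[Category]"],
)
print(result.head())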

    Further Thoughts

    When dealing with a new feature, a natural question comes to mind: what challenges does the feature involve, respectively how can it be misused? Given that the metrics layer can be built within a workspace and that it can tap into the existing measures, one can build on the existing infrastructure. However, this can imply restructuring, refactoring, moving, and testing a lot of code in the process, hopefully with minimal implications for the solutions already available. Whether the process is as simple as imagined is another story. As for misusage, in extremis, data professionals might start building everything as metrics, though the danger might come when the data is persisted unnecessarily. 

    From a data mesh's perspective, a metric set is associated with a domain, though there will be metrics and data common to multiple domains. Moreover, a mini-model has the potential of becoming a data product. Distributing the logic across multiple workspaces and domains can add further challenges, especially in what concerns the synchronization and implementation of requirements in a way that doesn't lead to bottlenecks. But this is a general challenge for the development team(s). 

    The feature will probably suffer further changes until it is released in public preview (probably by September or the end of the year). I subscribe to other data professionals' opinion that the feature was long needed and that it can have an important impact on the solutions built. 


    Resources:
    [1] Microsoft Fabric Blog (2024) Announcements from the Microsoft Fabric Community Conference (link)
    [2] Power BI Tips (2024) Explicit Measures Ep. 236: Metrics Hub, Hot New Feature with Carly Newsome (link)
    [3] Power BI Tips (2024) Introducing Fabric Metrics Layer / Power Metrics Hub [with Carly Newsome] (link)
    [4] KratosBI (2024) Fabric Fridays: Metrics Layer Conspiracy Theories #40 (link)
    [5] Chris Webb's BI Blog (2022) Is Power BI A Semantic Layer? (link)
    [6] The Data Stack Show (2022) TDSS 95: How the Metrics Layer Bridges the Gap Between Data & Business with Nick Handel of Transform (link)
    [7] Sundeep Teki (2022) The Metric Layer & how it fits into the Modern Data Stack (link)
    [8] Nick Handel (2021) A brief history of the metrics store (link)
    [9] Aurimas (2022) The Jungle of Metrics Layers and its Invisible Elephant (link)
    [10] Benn Stancil (2021) The missing piece of the modern data stack (link)
    [11] Microsoft Learn (2024) Sempy fabric Package (link)
    [12] Michael Kovalsky (2019) Master Model: Creating Derivative Tabular Models (link)
    [13] Christina Obry (2023) The Power of a Metrics Layer - and How Your Organization Can Benefit From It (link)
    [14] KratosBI (2024) Introducing the Metrics Layer in #MicrosoftFabric with Carly Newsome (link)


    02 November 2016

    ♟️Strategic Management: Integration (Just the Quotes)

    "By integration we mean the process of achieving unity of effort among the various subsystems in the accomplishment of the organization's tasks." (Paul R Lawrence, "Organization and environment: Managing differentiation and integration", 1967)

    "No matter how difficult or unprecedented the problem, a breakthrough to the best possible solution can come only from a combination of rational analysis, based on the real nature of things, and imaginative reintegration of all the different items into a new pattern, using nonlinear brainpower. This is always the most effective approach to devising strategies for dealing successfully with challenges and opportunities, in the market arena as on the battlefield." (Kenichi Ohmae, "The Mind Of The Strategist", 1982)

    "Culture [is] a pattern of basic assumptions invented, discovered, or developed by a given group as it learns to cope with its problems of external adaptation and internal integration that has worked well enough to be considered valid and, therefore, to be taught to new members as the correct way to perceive, think, and feel in relation to those problems." (Edgar H Schein, "Organizational Culture and Leadership", 1985)

    "To keep the business from disintegrating, the concept of information systems architecture is becoming less of an option and more of a necessity." (John Zachman, "A Framework for Information Systems Architecture", 1987)

    "Conventional process structures are fragmented and piecemeal, and they lack the integration necessary to maintain quality and service. They are breeding grounds for tunnel vision, as people tend to substitute the narrow goals of their particular department for the larger goals of the process as a whole. When work is handed off from person to person and unit to unit, delays and errors are inevitable. Accountability blurs, and critical issues fall between the cracks." (Michael M Hammer, "Reengineering Work: Don't Automate, Obliterate", Magazine, 1990) [source]

    "But the net effect of increasing scale, centralization of capital, vertical integration and diversification within the corporate form of enterprise has been to replace the 'invisible hand' of the market by the 'visible hand' of the managers." (David Harvey, "The Limits To Capital", 2006)

    02 November 2007

    🏗️Software Engineering: Integration (Just the Quotes)

    "With increasing size and complexity of the implementations of information systems, it is necessary to use some logical construct (or architecture) for defining and controlling the interfaces and the integration of all of the components of the system." (John Zachman, "A Framework for Information Systems Architecture", 1987)

    "The longer we wait between integrations and acceptance tests, the worse things get. Wait twice as long and we'll have four or more times the hassle. The reason is that one bug written just yesterday is pretty easy to find, while ten or a hundred written weeks ago can become almost impossible." (Ron Jeffries, "Extreme Programming Installed", 2001)

    "The main activity of programming is not the origination of new independent programs, but in the integration, modification, and explanation of existing ones." (Terry Winograd, "Beyond Programming Languages", 1991)

    "As the size of software systems increases, the algorithms and data structures of the computation no longer constitute the major design problems. When systems are constructed from many components, the organization of the overall system - the software architecture - presents a new set of design problems. This level of design has been addressed in a number of ways including informal diagrams and descriptive terms, module interconnection languages, templates and frameworks for systems that serve the needs of specific domains, and formal models of component integration mechanisms." (David Garlan & Mary Shaw, "An introduction to software architecture", Advances in software engineering and knowledge engineering Vol 1, 1993)

    "Enterprise architecture is the organizing logic for business processes and IT infrastructure reflecting the integration and standardization requirements of a company's operation model. […] The key to effective enterprise architecture is to identify the processes, data, technology, and customer interfaces that take the operating model from vision to reality." (Jeanne W Ross et al, "Enterprise architecture as strategy: creating a foundation for business", 2006)

    "Enterprise-architecture is the integration of everything the enterprise is and does. Even the term ‘architecture’ is perhaps a little misleading. It’s on a much larger scale, the scale of the whole rather than of single subsystems: more akin to city-planning than to the architecture of a single building. In something this large, there are no simple states of ‘as-is’ versus ‘to-be’, because its world is dynamic, not static. And it has to find some way to manage the messy confusion of what is, rather than the ideal that we might like it to be." (Tom Graves, "Real Enterprise-Architecture : Beyond IT to the whole enterprise", 2007)

    "Acceptance testing relies on the ability to execute automated tests in a productionlike environment. However, a vital property of such a test environment is that it is able to successfully support automated testing. Automated acceptance testing is not the same as user acceptance testing. One of the differences is that automated acceptance tests should not run in an environment that includes integration to all external systems. Instead, your acceptance testing should be focused on providing a controllable environment in which the system under test can be run. 'Controllable' in this context means that you are able to create the correct initial state for our tests. Integrating with real external systems removes our ability to do this." (David Farley & Jez Humble, "Continuous Delivery: Reliable Software Releases through Build, Test, and Deployment Automation", 2010)

    "Interim solutions, however, acquire inertia (or momentum, depending on your point of view). Because they are there, ultimately useful and widely accepted, there is no immediate need to do anything else. Whenever a stakeholder has to decide what action adds the most value, there will be many that are ranked higher than proper integration of an interim solution. Why? Because it is there, it works, and it is accepted. The only perceived downside is that it does not follow the chosen standards and guidelines - except for a few niche markets, this is not considered to be a significant force." (Klaus Marquardt, [in Kevlin Henney’s "97 Things Every Programmer Should Know", 2010])

    "Many processes in software development are repetitive and easily automated. The DRY principle applies in these contexts, as well as in the source code of the application. Manual testing is slow, error-prone, and difficult to repeat, so automated test suites should be used where possible. Integrating software can be time consuming and error-prone if done manually, so a build process should be run as frequently as possible, ideally with every check-in. Wherever painful manual processes exist that can be automated, they should be automated and standardized. The goal is to ensure that there is only one way of accomplishing the task, and it is as painless as possible." (Steve Smith, [in Kevlin Henney’s "97 Things Every Programmer Should Know", 2010])

    "In many applications, integration or functional tests are used by default as the standard way to test almost all aspects of the system. However integration and functional tests are not the best way to detect and identify bugs. Because of the large number of components involved in a typical end-to-end test, it can be very hard to know where something has gone wrong. In addition, with so many moving parts, it is extremely difficult, if not completely unfeasible, to cover all of the possible paths through the application." (John F Smart, "Jenkins: The Definitive Guide", 2011)

