13 April 2025

🏭🗒️Microsoft Fabric: Continuous Integration & Continuous Deployment [CI/CD] [Notes]

Disclaimer: This is work in progress intended to consolidate information from various sources for learning purposes. For the latest information please consult the documentation (see the links below)! 

Last updated: 13-Apr-2025

[Microsoft Fabric] Continuous Integration & Continuous Deployment [CI/CD] 
  • {def} development processes, tools, and best practices used to automate the integration, testing, and deployment of code changes to ensure efficient and reliable development
    • can be used in combination with a client tool
      • e.g. VS Code, Power BI Desktop
      • don’t necessarily need a workspace
        • developers can create branches and commit changes to that branch locally, push those to the remote repo and create a pull request to the main branch, all without a workspace
        • workspace is needed only as a testing environment [1]
          • to check that everything works in a real-life scenario [1]
    • addresses a few pain points [2]
      • manual integration issues
        • manual changes can lead to conflicts and errors
          • slow down development [2]
      • development delays
        • manual deployments are time-consuming and prone to errors
          • lead to delays in delivering new features and updates [2]
      • inconsistent environments
        • inconsistencies between environments cause issues that are hard to debug [2]
      • lack of visibility
        • can be challenging to
          • track changes through their lifetime [2]
          • understand the state of the codebase [2]
    • {process} continuous integration (CI)
    • {process} continuous deployment (CD)
    • architecture
      • {layer} development database 
        • {recommendation} should be relatively small [1]
      • {layer} test database 
        • {recommendation} should be as similar as possible to the production database [1]
      • {layer} production database

      • data items
        • items that store data
        • items' definition in Git defines how the data is stored [1]
    • {stage} development 
      • {best practice} back up work to a Git repository
        • back up the work by committing it into Git [1]
        • {prerequisite} the work environment must be isolated [1]
          • so others don’t override the work before it gets committed [1]
          • commit to a branch no other developer is using [1]
          • commit together changes that must be deployed together [1]
            • helps later when 
              • deploying to other stages
              • creating pull requests
              • reverting changes
      • {warning} big commits might hit the max commit size limit [1]
        • {bad practice} storing large items in source control systems, even if it works [1]
        • {recommendation} consider ways to reduce items’ size if they have lots of static resources, like images [1]
      • {action} revert to a previous version
        • {operation} undo
          • revert the immediate changes made, as long as they aren't committed yet [1]
          • each item can be reverted separately [1]
        • {operation} revert
          • reverting to older commits
            • {recommendation} promote an older commit to be the HEAD 
              • via git revert or git reset [1]
              • shows that there’s an update in the source control pane [1]
              • the workspace can be updated with that new commit [1]
          • {warning} reverting a data item to an older version might break the existing data; the operation might fail or require dropping the data [1]
          • {recommendation} check dependencies in advance before reverting changes back [1]
      • {concept} private workspace
        • a workspace that provides an isolated environment [1]
        • allows working in isolation by using a separate workspace [1]
        • {prerequisite} the workspace is assigned to a Fabric capacity [1]
        • {prerequisite} access to data to work in the workspace [1]
        • {step} create a new branch from the main branch [1]
          • ensures having the most up-to-date version of the content [1]
          • the workspace can be reused for any future branch created by the user [1]
            • when a sprint is over, the changes are merged and one can start a fresh new task [1]
              • switch the connection to a new branch on the same workspace
            • the same approach can be used when a bug needs to be fixed in the middle of a sprint [1]
          • {validation} connect to the correct folder in the branch to pull the right content into the workspace [1]
      • {best practice} make small incremental changes that are easy to merge and less likely to get into conflicts [1]
        • if the branch has conflicts, update it to resolve them first [1]
      • {best practice} change workspace’s configurations to enable productivity [1]
        • e.g. connections between items or to different data sources, or changes to parameters on a given item [1]
      • {recommendation} make sure you're working with the supported structure of the item you're authoring [1]
        • if you’re not sure, first clone a repo with content already synced to a workspace, then start authoring from there, where the structure is already in place [1]
      • {constraint} a workspace can only be connected to a single branch at a time [1]
        • {recommendation} treat this as a 1:1 mapping [1]
    • {stage} test
      • {best practice} use the test stage to simulate a real production environment [1]
        • {alternative} simulate this by connecting Git to another workspace [1]
      • factors to consider for the test environment
        • data volume
        • usage volume
        • production environment’s capacity
          • the test stage should have a capacity similar to production’s [1]
            • using the same capacity can make production unstable during load testing [1]
              • {recommendation} test using a different capacity similar in resources to the production capacity [1]
              • {recommendation} use a capacity that allows paying only for the testing time [1]
                • allows to avoid unnecessary costs [1]
      • {best practice} use deployment rules with a real-life data source
        • {recommendation} use data source rules to switch data sources in the test stage or parameterize the connection if not working through deployment pipelines [1]
        • {recommendation} separate the development and test data sources [1]
        • {recommendation} check related items
          • the changes made can also affect the dependent items [1]
        • {recommendation} verify that the changes don’t affect or break the performance of dependent items [1]
          • via impact analysis
      • {operation} update data items in the workspace
        • imports items’ definition into the workspace and applies it on the existing data [1]
        • the operation is the same for Git and deployment pipelines [1]
        • {recommendation} know in advance what the changes are and what impact they have on the existing data [1]
        • {recommendation} use commit messages to describe the changes made [1]
        • {recommendation} upload the changes first to a dev or test environment [1]
          • {benefit} shows how the item handles the change with test data [1]
        • {recommendation} check the changes on a staging environment, with real-life data (or as close to it as possible) [1]
          • {benefit} minimizes unexpected behavior in production [1]
        • {recommendation} consider the best timing when updating the Prod environment [1]
          • {benefit} minimizes the impact errors might have on the business [1]
        • {recommendation} perform post-deployment tests in Prod to verify that everything works as expected [1]
        • {recommendation} have a deployment plan and a recovery plan [1]
          • {benefit} minimizes the effort and the downtime, respectively [1]
    • {stage} production
      • {best practice} let only specific people manage sensitive operations [1]
      • {best practice} use workspace permissions to manage access [1]
        • applies to all BI creators for a specific workspace who need access to the pipeline
      • {best practice} limit access to the repo or pipeline by granting permissions only to users who are part of the content creation process [1]
      • {best practice} set deployment rules to ensure production stage availability [1]
        • {goal} ensure the data in production is always connected and available to users [1]
        • {benefit} allows deployments to run while minimizing downtime
        • applies to data sources and parameters defined in the semantic model [1]
      • deployment into production using Git branches
        • {recommendation} use release branches [1]
          • requires changing the workspace’s connection to the new release branch before every deployment [1]
          • if the build or release pipeline needs to change the source code, or run scripts in a build environment before deployment, then connecting the workspace to Git won't help [1]
      • {recommendation} after deploying to each stage, make sure to change all the configuration specific to that stage [1]
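
The notes above recommend parameterizing connections, keeping development and test data sources separate, and adjusting stage-specific configuration after each deployment. A minimal sketch of that idea in plain Python follows. It does not use any Fabric API: the stage names, server names, and the helpers resolve_config and shared_sources are all hypothetical and serve only to illustrate the principle of resolving settings per stage.

```python
from dataclasses import dataclass

# Hypothetical stage names and connection values, for illustration only;
# none of these identifiers come from Fabric or its APIs.
STAGE_OVERRIDES = {
    "dev": {"sql_server": "dev-sql.example.local", "database": "SalesDev"},
    "test": {"sql_server": "test-sql.example.local", "database": "SalesTest"},
    "prod": {"sql_server": "prod-sql.example.local", "database": "Sales"},
}

REQUIRED_KEYS = {"sql_server", "database"}


@dataclass(frozen=True)
class StageConfig:
    stage: str
    sql_server: str
    database: str


def resolve_config(stage: str) -> StageConfig:
    """Return the connection settings for one stage, failing loudly on gaps."""
    if stage not in STAGE_OVERRIDES:
        raise ValueError(f"unknown stage: {stage!r}")
    values = STAGE_OVERRIDES[stage]
    missing = REQUIRED_KEYS - set(values)
    if missing:
        raise ValueError(f"stage {stage!r} is missing: {sorted(missing)}")
    return StageConfig(stage=stage, **values)


def shared_sources(a: str, b: str) -> bool:
    """True if two stages point at the same server and database."""
    ca, cb = resolve_config(a), resolve_config(b)
    return (ca.sql_server, ca.database) == (cb.sql_server, cb.database)
```

A check such as shared_sources("dev", "test") could run before a deployment to confirm that the development and test stages do not read from the same source. In an actual Fabric setup, deployment rules or item parameters would play this role.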

    References:
    [1] Microsoft Learn (2025) Fabric: Best practices for lifecycle management in Fabric [link]
    [2] Microsoft Learn (2025) Fabric: CI/CD for pipelines in Data Factory in Microsoft Fabric [link]
    [3] Microsoft Learn (2025) Fabric: Choose the best Fabric CI/CD workflow option for you [link]

    Acronyms:
    API - Application Programming Interface
    BI - Business Intelligence
    CI/CD - Continuous Integration and Continuous Deployment
    VS - Visual Studio


    About Me

    My photo
    Koeln, NRW, Germany
    IT Professional with more than 25 years experience in IT in the area of full life-cycle of Web/Desktop/Database Applications Development, Software Engineering, Consultancy, Data Management, Data Quality, Data Migrations, Reporting, ERP implementations & support, Team/Project/IT Management, etc.