Showing posts with label workspaces. Show all posts
Showing posts with label workspaces. Show all posts

05 July 2025

🏭🗒️Microsoft Fabric: Git Repository [Notes]

Disclaimer: This is work in progress intended to consolidate information from various sources for learning purposes. For the latest information please consult the documentation (see the links below)! 

Last updated: 4-Jul-2025

[Microsoft Fabric] Git Repository

  • {def} set of features that enable developers to integrate their development processes, tools, and best practices straight into the Fabric platform [2]

  • {goal} the repo serves as single-source-of-truth
  • {feature} backup and version control [2]
  • {feature} revert to previous stages [2]
  • {feature} collaborate with others or work alone using Git branches [2]
  • {feature} source control 
    • provides tools to manage Fabric items [2]
    • supported for Azure DevOps and GitHub [3]
  • {configuration} tenant switches 
    • ⇐ must be enabled from the Admin portal 
      • by the tenant admin, capacity admin, or workspace admin
        • dependent on organization's settings [3]
    • users can create Fabric items
    • users can synchronize workspace items with their Git repositories
    • create workspaces
      • only if is needed to branch out to a new workspace [3]
    • users can synchronize workspace items with GitHub repositories
      • for GitHub users only [3]
  • {concept} release process 
    • begins once new updates complete a Pull Request process and merge into the team’s shared branch [3]
  • {concept} branch
    • {operation} switch branches
      • the workspace syncs with the new branch and all items in the workspace are overridden [3]
        • if there are different versions of the same item in each branch, the item is replaced [3]
        • if an item is in the old branch, but not the new one, it gets deleted [3]
      • one can't switch branches if there are any uncommitted changes in the workspace [3]
    • {action} branch out to another workspace 
      • creates a new workspace, or switches to an existing workspace based on the last commit to the current workspace, and then connects to the target workspace and branch [4]
      • {permission} contributor and above
    • {action} checkout new branch )
      • creates a new branch based on the last synced commit in the workspace [4]
      • changes the Git connection in the current workspace [4]
      • doesn't change the workspace content [4]
      • {permission} workspace admin
    • {action} switch branch
      • syncs the workspace with another new or existing branch and overrides all items in the workspace with the content of the selected branch [4]
      • {permission} workspace admin
    • {limitation} maximum length of branch name: 244 characters.
    • {limitation} maximum length of full path for file names: 250 characters
    • {limitation} maximum file size: 25 MB
  • {operation} connect a workspace to a Git Repos 
    • can be done only by a workspace admin [4]
      • once connected, anyone with permissions can work in the workspace [4]
    • synchronizes the content between the two (aka initial sync)
      • {scenario} either of the two is empty while the other has content
        • the content is copied from the nonempty location to the empty on [4]
      • {scenario}both have content
        • one must decide which direction the sync should go [4]
          • overwrite the content from the destination [4]
      • includes folder structures [4]
        • workspace items in folders are exported to folders with the same name in the Git repo [4]
        • items in Git folders are imported to folders with the same name in the workspace [4]
        • if the workspace has folders and the connected Git folder doesn't yet have subfolders, they're considered to be different [4]
          • leads to uncommitted changes status in the source control panel [4]
            • one must to commit the changes to Git before updating the workspace [4]
              • update first, the Git folder structure overwrites the workspace folder structure [4]
        • {limitation} empty folders aren't copied to Git
          • when creating or moving items to a folder, the folder is created in Git [4]
        • {limitation} empty folders in Git are deleted automatically [4]
        • {limitation} empty folders in the workspace aren't deleted automatically even if all items are moved to different folders [4]
        • {limitation} folder structure is retained up to 10 levels deep [4]
        • {limitation} the folder structure is maintained up to 10 levels deep
    •  Git status
      • synced 
        • the item is the same in the workspace and Git branch [4]
      •  conflict 
        • the item was changed in both the workspace and Git branch [4]
      •  unsupported item
      •  uncommitted changes in the workspace
      •  update required from Git [4]
      •  item is identical in both places but needs to be updated to the last commit [4]
  • source control panel
    • shows the number of items that are different in the workspace and Git branch
      • when changes are made, the number is updated
      • when the workspace is synced with the Git branch, the Source control icon displays a 0
  • commit and update panel 
    • {section} changes 
      • shows the number of items that were changed in the workspace and need to be committed to Git [4]
      • changed workspace items are listed in the Changes section
        • when there's more than one changed item, one can select which items to commit to the Git branch [4]
      • if there were updates made to the Git branch, commits are disabled until you update your workspace [4]
    • {section} updates 
      • shows the number of items that were modified in the Git branch and need to be updated to the workspace [4]
      • the Update command always updates the entire branch and syncs to the most recent commit [4]
        • {limitation} one can’t select specific items to update [4]
        • if changes were made in the workspace and in the Git branch on the same item, updates are disabled until the conflict is resolved [4]
    • in each section, the changed items are listed with an icon indicating the status
      •  new
      •  modified
      •  deleted
      •  conflict
      •  same-changes
  • {concept} related workspace
    • workspace with the same connection properties as the current branch [4]
      • e.g.  the same organization, project, repository, and git folder [4] 
Previous Post <<||>> Next Post 

References:
[2] Microsoft Learn (2025) Fabric: What is Microsoft Fabric Git integration? [link
What is lifecycle management in Microsoft Fabric? [link]
[3] Microsoft Fabric Updates Blog (2025) Fabric: Introducing New Branching Capabilities in Fabric Git Integration [link
[4] Microsoft Learn (2025) Fabric: Basic concepts in Git integration [link]
[5]  [link]

Resources:
[R1] Microsoft Learn (2025) Fabric: 

Acronyms:
CI/CD - Continuous Integration and Continuous Deployment

13 March 2025

🏭🗒️Microsoft Fabric: Workspaces [Notes]

Disclaimer: This is work in progress intended to consolidate information from various sources for learning purposes. For the latest information please consult the documentation (see the links below)! 

Last updated: 25-Mar-2025

[Microsoft Fabric] Workspace

  • {def} a collection of items that brings together different functionality in a single environment designed for collaboration
  • {default} created in organization's shared capacity
    • workspaces can be assigned to other capacities
      • includes My Workspaces
      • via Workspace settings >> Premium
    • shared environment that holds live items
      • any changes made directly within the workspace immediately take effect and impact all users
  • components
    • header
      • contains
        • name 
        • brief description of the workspace
        • links to other functionality
    • toolbar 
      • contains 
        • controls for managing items to the workspace 
        • controls for managing files
    • view area
      • enables selecting a view
      • {type} list view
        • {subitem} task flow
          • area in which users can create or view a graphical representation of the data project [3]
            • ⇐ shows the logical flow of the project [3]
              • ⇐ it doesn't show the flow of data [3]
          • can be hided via Show/Hide arrows
        • {subitem} items list
          • area in which the users can see the items and folders in the workspace [3]
          • one can filter the items list by selecting the tasks, if any defined [3]
        • {subitem} resize bar
          • elements that allow to resize the task flow and items list by dragging the resize bar up or down [3]
      • {type} lineage view
        • shows the flow of data between the items in the workspace [3]
  • {feature} workspace settings 
    • allows to manage and update the workspace [3]
  • {feature} contact list 
    • allows to specify which users receive notification about issues occurring in the workspace [3] 
    • {default} contains workspace's creator [3]
  • {feature} SharePoint integration 
    • allows to configure a M365 Group whose SharePoint document library is available to workspace users [3]
      • ⇐ the group is created outside of MF first [3]
      • restrictions may apply to the environment
    • {best practice} give access to the workspace to the same M365 Group whose file storage is configured [3]
      • MF doesn't synchronize permissions between users or groups with workspace access, and users or groups with M365 Group membership [3]
  • {feature} workspace identity
    • an automatically managed service principal that can be associated with a Fabric workspace [6]
      • workspaces with a workspace identity can securely read or write to firewall-enabled ADSL Gen2 accounts through trusted workspace access for OneLake shortcuts [6]
      • Fabric creates a service principal in Microsoft Entra ID to represent the identity [6]
        • ⇐ an accompanying app registration is also created [6]
        • Fabric automatically manages the credentials associated with workspace identities [6]
          • ⇒ prevents credential leaks and downtime due to improper credential handling [6]
    • used to obtain Microsoft Entra tokens without the customer having to manage any credentials [6]
      • Fabric items can use the identity when connecting to resources that support Microsoft Entra authentication [6]
    • can be created in the workspace settings of any workspace except My workspaces
    • automatically assigned to the workspace contributor role and has access to workspace items [6]
  • {feature} workspace roles
    • allows to manage who can do what in a workspace [4]
    • sit on top of OneLake and divide the data lake into separate containers that can be secured independently [4]
    • extend the Power BI workspace roles by associating new MF capabilities 
      • e.g. data integration, data exploration
    • can be assigned to 
      • individual users
      • security groups
      • Microsoft 365 groups
      • distribution lists
    • {role} Admin
    • {role} Member
    • {role} Contributor
    • {role} Viewer
    • user groups
      • members get the role(s) assigned
      • users existing in several group get the highest level of permission that's provided by the roles that they're assigned [4]
      • {concept} [nested group]
  • {concept} current workspace
    • the active open workspace
  • {action} create new workspace
  • {action} pin workspace
  • {action} delete workspace
    • everything contained within the workspace is deleted for all group members [3]
      • the associated app is also removed from AppSource [3]
    • {warning} if the workspace has a workspace identity, that workspace identity will be irretrievably lost [3]
      • this may cause Fabric items relying on the workspace identity for trusted workspace access or authentication to break [3]
    • only admins can perform the operation
  • {action} manage workspace
  • {action} take ownership of Fabric items
    • Fabric items may stop working correctly [5]
      • {scenario} the owner leaves the organization [5]
      • {scenario}the owner don't sign in for more than 90 days [5]
      • in such cases, anyone with read and write permissions on an item can take ownership of the item [5]
        • become the owner of any child items the item might have
        • {limitation} one can't take over ownership of child items directly [5]
          • ⇐ one can take ownership only through the parent item [5]
  • {limitation} workspace retention
    • personal workspace
      • {default} 30 days [8]
    • collaborative workspace
      • {default} 7 days [8]
      • configurable between 7 to 90 days [8]
        • via Define workspace retention period setting [8]
    •  during the retention period, Fabric administrators can restore, respectively delete permanently the workspace [8]
      • after that, the workspace is deleted permanently and it and its contents are irretrievably lost [8]
  • {limitation} can contain a maximum of 1000 items [8]
    • includes both parent and child items [8]
      • refers to Fabric and Power BI items [8]
    • workspaces with more items have 180-day extension period applied automatically as of 10-Apr-2025 [8]
  • {limitation} certain special characters aren't supported in workspace names when using an XMLA endpoint [3]
  • {limitation} a user or a service principal can be a member of up to 1000 workspaces [3]
  • {feature} auditing
    • several activities are audited for workspaces [3]
      • CreateFolder
      • DeleteFolder
      • UpdateFolder
      • UpdateFolderAccess
  • {feature} workspace monitoring 
    • Eventhouse secure read-only database that collects and organizes logs and metrics from a range of Fabric items in the workspace [1]
      • accessible only to workspace users with at least a contributor role [1]
      • users can access and analyze logs and metrics [1]
      • the data is aggregated or detailed [1]
      • can be queried via KQL or SQL [1]
      • supports both historical log analysis and real-time data streaming [1]
      • accessible from the workspace [1]
        • one can build and save query sets and dashboards to simplify data exploration [1]
      • use the workspace settings to delete the database [1]
        •  wait about 15 minutes before recreating a deleted database [1]
    • {action} share the database
      • users need workspace member or admin role [1]
    • {limitation} one can enable either 
      • workspace monitoring 
      • log analytics
        • if enabled, the log analytics configuration must be deleted first before enabling workspace monitoring [1]
          • one should wait for a few hours before enabling workspace monitoring [1]
    • {limitation} retention period for monitoring data: 30 days [1]
    • {limitation}the ingestion can't be configured to filter for specific log type or category [1]
      • e.g. error or workload type.
    • {limitation} user data operation logs aren't available even though the table is available in the monitoring database [1]
    • {prerequisite} Power BI Premium or Fabric capacity [1]
    • {prerequisite} workspace admins can turn on monitoring for their workspaces tenant setting is enabled [1]
      • enabling the setting, requires Fabric administrator rights [1]
    • {prerequisite} admin role in the workspace [1]
  • workspace permissions
    •  the first security boundary for data within OneLake [7]
      • each workspace represents a single domain or project area where teams can collaborate on data [7]
      • security is managed through Fabric workspace roles [7]
    • items can have permissions configured separately from the workspace roles [7]
      • permissions can be configured either by [7]
        • sharing an item 
        • managing the permissions of an item

References:
[1] Microsoft Learn (2024) Fabric: What is workspace monitoring (preview)? [link]
[2] Microsoft Fabric Update Blog (2024) Announcing preview of Workspace Monitoring? [link]
[3] Microsoft Learn (2024) Fabric: Workspaces in Microsoft Fabric and Power BI [link]
[4] Microsoft Learn (2024) Fabric: Roles in workspaces in Microsoft Fabric [link]
[5] Microsoft Learn (2024) Fabric: Take ownership of Fabric items [link]
[6] Microsoft Learn (2024) Fabric: Workspace identity [link]
[7] Microsoft Learn (2024) Fabric: Role-based access control (RBAC) [link]
[8] Microsoft Learn (2024) Fabric: Manage workspaces [link]

Resources:
[R1] Microsoft Learn (2025) Fabric: What's new in Microsoft Fabric? [link]

Acronyms:
ADSL Gen2 - Azure Data Lake Storage Gen2
KQL - Kusto Query Language
M365 - Microsoft 365
MF - Microsoft Fabric
SQL - Structured Query Language

22 January 2025

🏭🗒️Microsoft Fabric: Folders [Notes]

Disclaimer: This is work in progress intended to consolidate information from various sources for learning purposes. For the latest information please consult the documentation (see the links below)! 

Last updated: 22-Jan-2025

[Microsoft Fabric] Folders

  • {def} organizational units inside a workspace that enable users to efficiently organize and manage artifacts in the workspace [1]
  • identifiable by its name
    • {constraint} must be unique in a folder or at the root level of the workspace
    • {constraint} can’t include certain special characters [1]
      • C0 and C1 control codes [1]
      • leading or trailing spaces [1]
      • characters: ~"#.&*:<>?/{|} [1]
    • {constraint} can’t have system-reserved names
      • e.g. $recycle.bin, recycled, recycler.
    • {constraint} its length can't exceed 255 characters
  • {operation} create folder
    • can be created in
      • an existing folder (aka nested subfolder) [1]
        • {restriction} a maximum of 10 levels of nested subfolders can be created [1]
        • up to 10 folders can be created in the root folder [1]
        • {benefit} provide a hierarchical structure for organizing and managing items [1]
      • the root
  • {operation} move folder
  • {operation} rename folder
    • same rules applies as for folders’ creation [1]
  • {operation} delete folder
    • {restriction} currently can be deleted only empty folders [1]
      • {recommendation} make sure the folder is empty [1]
  •  {operation} create item in folder
    • {restriction} certain items can’t be created in a folder
      • dataflows gen2
      • streaming semantic models
      • streaming dataflows
    • ⇐ items created from the home page or the Create hub, are created at the root level of the workspace [1]
  • {operation} move file(s) between folders [1]
  • {operation} publish to folder [1]
    •   Power BI reports can be published to specific folders
      • {restriction} folders' name must be unique throughout an entire workspace, regardless of their location [1]
        • when publishing a report to a workspace that has another report with the same name in a different folder, the report will publish to the location of the already existing report [1]
  • {limitation}may not be supported by certain features
    •   e.g. Git
  • {recommendation} use folders to organize workspaces [1]
  • {permissions}
    • inherit the permissions of the workspace where they're located [1] [2]
    • workspace admins, members, and contributors can create, modify, and delete folders in the workspace [1]
    • viewers can only view folder hierarchy and navigate in the workspace [1]
  • [deployment pipelines] deploying items in folders to a different stage, the folder hierarchy is automatically applied [2]

Previous Post  <<||>>  Next Post

References:
[1] Microsoft Fabric (2024) Create folders in workspaces [link]
[2] Microsoft Fabric (2024) The deployment pipelines process [link]
[3] Microsoft Fabric Updates Blog (2025) Define security on folders within a shortcut using OneLake data access roles [link]
[4] Microsoft Fabric Updates Blog (2025) Announcing the General Availability of Folder in Workspace [link]
[5] Microsoft Fabric Updates Blog (2025) Announcing Folder in Workspace in Public Preview [link]
[6] Microsoft Fabric Updates Blog (2025) Getting the size of OneLake data items or folders [link]

Resources:
[R1] Microsoft Learn (2025) Fabric: What's new in Microsoft Fabric? [link]

10 March 2024

🏭📑Microsoft Fabric: Medallion Architecture [Notes]

Disclaimer: This is work in progress intended to consolidate information from various sources for learning purposes. For the latest information please consult the documentation (see the links below)! 

Last updated: 10-Mar-2024

Medallion Architecture in Microsoft Fabric [1]


Medallion architecture
  • a recommended data design pattern used to organize data in a lakehouse logically [2]
    • compatible with the concept of data mesh
  • {goal} incrementally and progressively improve the structure and quality of data as it progresses through each stage [1]
    • brings structure and efficiency to a lakehouse environment [2]
    • ensures that data is reliable and consistent as it goes through various checks and changes [2]
    •  complements other data organization methods, rather than replacing them [2]
  • consists of three distinct layers (or zones)
    • {layer} bronze (aka raw zone
      • stores source data in its original format [1]
      • the data in this layer is typically append-only and immutable [1]
      • {recommendation} store the data in its original format, or use Parquet or Delta Lake [1]
      • {recommendation} create a shortcut in the bronze zone instead of copying the data across [1]
        • works with OneLake, ADLS Gen2, Amazon S3, Google
      • {operation} ingest data
        • {characteristic} maintains the raw state of the data source [3]
        • {characteristic} is appended incrementally and grows over time [3]
        • {characteristic} can be any combination of streaming and batch transactions [3]
        • ⇒ retaining the full, unprocessed history
          • ⇒ provides the ability to recreate any state of a given data system [3]
        • additional metadata may be added to data on ingest
            • e.g. source file names, recording the time data was processed
          • {goal} enhanced discoverability [3]
          • {goal} description of the state of the source dataset [3]
          • {goal} optimized performance in downstream applications [3]
    • {layer} silver (aka enriched zone
      • stores data sourced from the bronze layer
      • the raw data has been 
        • cleansed
        • standardized
        • structured as tables (rows and columns)
        • integrated with other data to provide an enterprise view of all business entities
      • {recommendation} use Delta tables 
        • provide extra capabilities and performance enhancements [1]
          • {default} every engine in Fabric writes data in the delta format and use V-Order write-time optimization to the Parquet file format [1]
      • {operation} validate and deduplicate data
      • for any data pipeline, the silver layer may contain more than one table [3]
    • {layer} gold (aka curated zone)
      • stores data sourced from the silver layer [1]
      • the data is refined to meet specific downstream business and analytics requirements [1]
      • tables typically conform to star schema design
        • supports the development of data models that are optimized for performance and usability [1]
      • use lakehouses (one for each zone), a data warehouse, or combination of both
        • the decision should be based on team's preference and expertise of your team. 
        • different analytic engines can be used [1]
    • ⇐ schemas and tables within each layer can take on a variety of forms and degrees of normalization [3]
      • depends on the frequency and nature of data updates and the downstream use cases for the data [3]
  • {pattern} create each zone as a lakehouse
    • business users access data by using the SQL analytics endpoint [1]
  • {pattern} create the bronze and silver zones as lakehouses, and the gold zone as data warehouse
    • business users access data by using the data warehouse endpoint [1]
  • {pattern} create all lakehouses in a single Fabric workspace
    • {recommendation} create each lakehouse in its own workspace [1]
    • provides more control and better governance at the zone level [1]
  • {concept} data transformation 
    • involves altering the structure or content of data to meet specific requirements [2] 
      • via Dataflows (Gen2), notebooks
  • {concept} data orchestration 
    • refers to the coordination and management of multiple data-related processes, ensuring they work together to achieve a desired outcome [2]
      • via data pipelines

Previous Post  <<||>>  Next Post

References:
[1] Microsoft Learn: Fabric (2023) Implement medallion lakehouse architecture in Microsoft Fabric (link)
[2] Microsoft Learn: Fabric (2023) Organize a Fabric lakehouse using medallion architecture design (link)
[3] Microsoft Learn: Azure (2023) What is the medallion lakehouse architecture? (link)

Resources:
[R1] Serverless.SQL (2023) Data Loading Options With Fabric Workspaces, by Andy Cutler (link)
[R2] Microsoft Learn: Fabric (2023) Lakehouse end-to-end scenario: overview and architecture (link)
[R3] Microsoft Learn (2025) Fabric: What's new in Microsoft Fabric? [link]

Acronyms:
ADLS - Azure Data Lake Store Gen2

Related Posts Plugin for WordPress, Blogger...

About Me

My photo
Koeln, NRW, Germany
IT Professional with more than 25 years experience in IT in the area of full life-cycle of Web/Desktop/Database Applications Development, Software Engineering, Consultancy, Data Management, Data Quality, Data Migrations, Reporting, ERP implementations & support, Team/Project/IT Management, etc.