Disclaimer: This is work in progress intended to consolidate information from various sources for learning purposes. For the latest information please consult the documentation (see the links below)!
Last updated: 18-Apr-2024
|
Mirroring in Microsoft Fabric (adapted after [2]) |
[Microsoft Fabric] Mirroring
- low-cost, low-latency fully managed service that allows to replicate data from various systems together into OneLake [1]
- supported in Azure SQL Database, Azure Cosmos DB, Snowflake [1]
- ⇒ replaces ETL pipelines
- ⇐ manages the replication of data into OneLake and conversion to Parquet [1]
- changes have a near real-time latency [3]
- uses the source database’s CDC feature [4]
- zero-ETL, fully managed, and high-performance data movement mechanism [8]
- {benefit} allows users to use a highly integrated, end-to-end, and easy-to-use service [1]
- the mirrored data can be used in Power BI
- ⇐ since tables are all v-ordered delta tables [3]
- {benefit} allows users to accelerate their journey into Fabric [1]
- the delta tables can be used in every Fabric experience
- {restriction} replication of views is currently not supported [3]
- {benefit} allows to write cross-database queries against the mirrored databases, warehouses, and the SQL analytics endpoints of Lakehouses in a single T-SQL query [1]
- {benefit} free Mirroring compute and storage for replicas in OneLake [5]
- storage for replicas is free up to a limit
- based on the capacity size [5]
- mirroring offers a free TB of mirroring storage for every CU purchased [5]
- compute is free and not consume capacity
- ⇐ used to replicate data into OneLake [5]
- landing zone in OneLake
- stores both the snapshot and change data [3]
- ⇐ improves performance when converting files into delta verti-parquet [3]
- {operation} enable the mirroring
- by creating a secure connection to the operational data source
- an entire database or individual tables can be replicated
- creates three items in the targeted workspace
- the mirrored database
- a SQL analytics endpoint
- a Default semantic model
- the same source database can be mirrored multiple times
- ⇐ though usually not needed
- scenario: replicating data to different types of environments
- {operation} stopping the mirroring
- stops the replication in the source database, but a copy of the tables is kept in OneLake [3]
- {operation} restarting the mirroring
- results in all data being replicated from the start [3]
- {operation} remove a table from mirroring
- the table is no longer replicated and its data is deleted from OneLake [3]
- {operation} sharing
- users grant other users or groups of users access to a mirrored database without giving access to the workspace and the rest of its items [1]
- access is also granted to the SQL analytics endpoint and associated default semantic model [1]
- shared mirrored databases can be found through either
- Data Hub
- Shared with Me section in Microsoft Fabric
- triggers an initial replication
- {requirement} [licensing] requires Power BI Premium, Fabric Capacity, or Trial Capacity
- {feature} monitoring
- {benefit} allows to gain insights into mirroring operations and when the replica in Fabric OneLake was last refreshed [4]
- [Azure SQL Database]
- {restriction} doesn't support Azure SQL Database logical servers behind an AVN or private networking [2]
- {requirement} update the Azure SQL logical server firewall rules to Allow public network access [2]
- {requirement} enable the Allow Azure services option to connect to your Azure SQL Database logical server [2]
- {restriction}
- access through the Power BI Gateway or behind a firewall is unsupported [3]
- authentication
- SQL authentication with user name and password
- Microsoft Entra ID
- Service Principal
- {troubleshooting} check if the changes properly flow
- via sys.dm_change_feed_log_scan_sessions DMV
- {troubleshooting} check if there are any problems reported
- via sys.dm_change_feed_errors DMV
- {troubleshooting} check if the mirroring was properly enabled
- via sp_help_change_feed stored procedure
- {troubleshooting} disable mirroring
- via sp_change_feed_disable_db stored procedure
- [SQL Server] [preview]
- creates an initial snapshot in Fabric OneLake after which data is kept in sync in near-real time [7]
- [SQL Server 2016-2022]
- relies on CDC to capture an initial snapshot of all the tables selected for mirroring and there after replicate the changes [7]
- the mirroring services connects to read the initial snapshot as well as the changes and pulls the data into OneLake and converts into an analytics-ready format in Fabric [7]
- via on-premises data gateway (OPDG)
- must be installed in the SQL Server environment [7]
- applies to VMs and on-premises altogether [5]
- [SQL Server 2025] supports the latest change feed technology for replicating changes to ensure that there is minimal impact to operational workload [5]
- keeps track and replicates the initial snapshot and changes to the landing zone in OneLake which is then converted to an analytics-ready format by the mirroring engine in Fabric [7]
- {requirement} on-premises data gateway
- primarily used as a control plane to connect and authenticate the on-premises environment to Fabric [7]
- {requirement} Arc Agent
- outbounds authentication from SQL Server to Fabric [7]
- [Azure SQL MI]
- {feature|preview} firewall support
- via data gateway
- ensures secure connections to the source databases via private endpoint [5]
- removes the necessity of opening public access [5]
- {feature|preview} mirror tables without PKs
- {benefit} increased flexibility
- {feature|preview} expanded DDL
- allows to truncate tables while mirroring is actively replicating on data [5]
- {feature} open mirroring
- extension of Mirroring capabilities in Fabric [5]
- creates a copy of the data in OneLake and keeps it up to date [6]
- provides APIs to replicate data from anywhere
- data is converted to parquet or CSV format and the API or UI is used to load the data into the OneLake landing zone along with any additional changes [6]
- the Fabric replication technology converts everything to a Delta format making it optimized and ready for AI and BI workloads [6]
- {benefit} empowers vendors to build their own Mirroring solutions
- leverages the same Mirroring engine to help manage inserts/updates/deletes efficiently [5]
- cheaper for read-heavy data sources [8]
- hreat for unique data requirements, regulated environments [8]
- {scenario} analyze operational data in near real-time [8]
- {scenario} avoid managing ETL pipelines [8]
- {scenario} single source of truth across both operational and analytical systems [8]
- {scenario} building a Lakehouse architecture and want real-time updates [8]
AVN - Azure Virtual Network
CDC - Change Data Capture
MI - Managed Instance
PK - Primary Key
References:
[1] Microsoft Learn - Microsoft Fabric
(2024) What is Mirroring in Fabric? [link]
[2] Microsoft Learn - Microsoft Fabric (2024) Mirroring Azure SQL
Database [Preview] [link]
[3] Microsoft Learn - Microsoft Fabric (2024) Frequently
asked questions for Mirroring Azure SQL Database in Microsoft Fabric [Preview]
[link]
[4] Microsoft Fabric Updates Blog (2023) Introducing Mirroring in
Microsoft Fabric [link]
[5] Microsoft Fabric Updates Blog (2025) What’s new with
Mirroring in Fabric at Microsoft Build 2025 [link]
[6] Microsoft Fabric Updates Blog (2025) Announcing the General
Availability of Open Mirroring [link]
[7] Microsoft Fabric Updates Blog (2025) Mirroring for SQL Server in
Microsoft Fabric (Preview) [link]
[8] Microsoft Fabric Updates Blog (2025) Mirroring in Microsoft Fabric explained: benefits, use cases, and pricing demystified [link]
Resources:
[R1] Tutorial: Configure Microsoft Fabric
mirrored databases from Azure SQL Database [Preview] (link)
[R2] Microsoft Fabric Updates Blog (2024) Announcing the Public
Preview of Mirroring in Microsoft Fabric, by Charles Webb (link)
[R3] Microsoft Learn (2025) Fabric: What's new in Microsoft
Fabric? [link]
[R4] Microsoft (2025) Fabric Roadmap [link]
No comments:
Post a Comment