SQL Troubles: 🏭🗒️Microsoft Fabric: Warehouse [Notes]

10 November 2024

🏭🗒️Microsoft Fabric: Warehouse [Notes]

Disclaimer: This is work in progress intended to consolidate information from various sources for learning purposes. For the latest information please consult the documentation (see the links below)!

Last updated: 11-Mar-2024

Warehouse vs SQL analytics endpoint in Microsoft Fabric [3]

[Microsoft Fabric] Warehouse

{def} highly available relational data warehouse that can be used to store and query data in the Lakehouse

supports the full transactional T-SQL capabilities
modernized version of the traditional data warehouse
unifies capabilities from Synapse Dedicated and Serverless SQL Pools

resources are managed elastically to provide the best possible performance

⇒ no need to think about indexing or distribution
a new parser gives enhanced CSV file ingestion time
metadata is cached in addition to data
improved assignment of compute resources to milliseconds
multi-TB result sets are streamed to the client

leverages a distributed query processing engine

provides with workloads that have a natural isolation boundary [3]

true isolation is achieved by separating workloads with different characteristics, ensuring that ETL jobs never interfere with their ad hoc analytics and reporting workloads [3]

{operation} data ingestion

involves moving data from source systems into the data warehouse [2]

the data becomes available for analysis [1]

via Pipelines, Dataflows, cross-database querying, COPY INTO command
no need to copy data from the lakehouse to the data warehouse [1]

one can query data in the lakehouse directly from the data warehouse using cross-database querying [1]

{operation} data storage

involves storing the data in a format that is optimized for analytics [2]

{operation} data processing

involves transforming the data into a format that is ready for consumption by analytical tools [1]

{operation} data analysis and delivery

involves analyzing the data to gain insights and delivering those insights to the business [1]

{operation} designing a warehouse (aka warehouse design)

standard warehouse design can be used

{operation} sharing a warehouse (aka warehouse sharing)

a way to provide users read access to the warehouse for downstream consumption

via SQL, Spark, or Power BI

the level of permissions can be customized to provide the appropriate level of access

{feature} mirroring

provides a modern way of accessing and ingesting data continuously and seamlessly from any database or data warehouse into the Data Warehousing experience in Fabric

any database can be accessed and managed centrally from within Fabric without having to switch database clients
data is replicated in a reliable way in real-time and lands as Delta tables for consumption in any Fabric workload

{feature} v-order

write time optimization to the parquet file format that enables lightning-fast reads under the MF compute engine [5]

{feature} caching

stores frequently accessed data and metadata in a faster storage layer [6]

{feature} snapshots

{def} read-only representation of a warehouse at a specific point in time [7]

{feature} automatic purging

routinely and systematically eliminating expired data periodically [8]
{benefit} proactively helps maintain an efficient and cost-effective data infrastructure [8]

via garbage collection

background process, that periodically identifies and cleans [8]

all the data and log files of dropped tables
aborted transactions
temporary tables
expired files

executes every 24 hours, when the warehouse is active [8]
ensures the data warehouse remains optimized and efficient [8]

{goal} storage cost optimization
{goal} minimize maintenance overhead
{goal} adhering to data retention regulations

{concept} SQL analytics endpoint

a warehouse that is automatically generated from a Lakehouse in Microsoft Fabric [3]

{concept} virtual warehouse

can containing data from virtually any source by using shortcuts [3]

{concept} cross database querying

enables to quickly and seamlessly leverage multiple data sources for fast insights and with zero data duplication [3]

Previous Post <<||>> Next Post

References:

[1] Microsoft Learn (2023) Fabric: Get started with data warehouses in Microsoft Fabric (link)

[2] Microsoft Learn (2023) Fabric: Microsoft Fabric decision guide: choose a data store (link)
[3] Microsoft Learn (2024) Fabric: What is data warehousing in Microsoft Fabric? (link)

[4] Microsoft Learn (2023) Fabric: Better together: the lakehouse and warehouse (link)

[5] Microsoft Learn (2024) Fabric: Understand V-Order for Microsoft Fabric Warehouse [link]

[6] Microsoft Learn (2024) Fabric: Caching in Fabric data warehousing [link]

[7] Microsoft Learn (2024) Fabric: Warehouse Snapshots in Microsoft Fabric (Preview) [link]

[8] Microsoft Fabric Updates Blog (2025) Intelligent Data Cleanup: Smart Purging for Smarter Data Warehouses [link]

Resources:
[R1] Microsoft Learn (2023) Fabric: Data warehousing documentation in Microsoft Fabric (link)

[R2] Microsoft Learn (2025) Fabric: What's new in Microsoft Fabric? [link]
[R3] Microsoft Learn (2025) Fabric: Share your data and manage permissions [link]
[R4] Microsoft Learn (2025) Microsoft Fabric decision guide: Choose between Warehouse and Lakehouse [link]

Acronyms:

ETL - Extract, Transfer, Load

MF - Microsoft Fabric

SQL Troubles

Pages

10 November 2024

🏭🗒️Microsoft Fabric: Warehouse [Notes]

No comments:

About Me