Disclaimer: This is work in progress intended to consolidate information from various sources for learning purposes. For the latest information please consult the documentation (see the links below)!
Last updated: 11-Mar-2024
Warehouse vs SQL analytics endpoint in Microsoft Fabric [3] |
[Microsoft Fabric] Data Warehouse
- highly available relational data warehouse that can be used to store and query data in the Lakehouse
- supports the full transactional T-SQL capabilities
- modernized version of the traditional data warehouse
- unifies capabilities from Synapse Dedicated and Serverless SQL Pools
- modernized with key improvements
- resources are managed elastically to provide the best possible performance
- ⇒ no need to think about indexing or distribution
- a new parser gives enhanced CSV file ingestion time
- metadata is now cached in addition to data
- improved assignment of compute resources to milliseconds
- multi-TB result sets are streamed to the client
- leverages a distributed query processing engine
- provides with workloads that have a natural isolation boundary [3]
- true isolation is achieved by separating workloads with different characteristics, ensuring that ETL jobs never interfere with their ad hoc analytics and reporting workloads [3]
- {operation} data ingestion
- involves moving data from source systems into the data warehouse [2]
- the data becomes available for analysis [1]
- via Pipelines, Dataflows, cross-database querying, COPY INTO command
- no need to copy data from the lakehouse to the data warehouse [1]
- one can query data in the lakehouse directly from the data warehouse using cross-database querying [1]
- {operation} data storage
- involves storing the data in a format that is optimized for analytics [2]
- {operation} data processing
- involves transforming the data into a format that is ready for consumption by analytical tools [1]
- {operation} data analysis and delivery
- involves analyzing the data to gain insights and delivering those insights to the business [1]
- {operation} designing a warehouse (aka warehouse design)
- standard warehouse design can be used
- {operation} sharing a warehouse (aka warehouse sharing)
- a way to provide users read access to the warehouse for downstream consumption
- via SQL, Spark, or Power BI
- the level of permissions can be customized to provide the appropriate level of access
- {feature} mirroring
- provides a modern way of accessing and ingesting data continuously and seamlessly from any database or data warehouse into the Data Warehousing experience in Fabric
- any database can be accessed and managed centrally from within Fabric without having to switch database clients
- data is replicated in a reliable way in real-time and lands as Delta tables for consumption in any Fabric workload
- {concept}SQL analytics endpoint
- a warehouse that is automatically generated from a Lakehouse in Microsoft Fabric [3]
- {concept}virtual warehouse
- can containing data from virtually any source by using shortcuts [3]
- {concept} cross database querying
- enables to quickly and seamlessly leverage multiple data sources for fast insights and with zero data duplication [3]
[2] Microsoft Learn: Fabric (2023) Microsoft Fabric decision guide: choose a data store (link)
[3] Microsoft Learn: Fabric (2024) What is data warehousing in Microsoft Fabric? (link)
[4] Microsoft Learn: Fabric (2023) Better together: the lakehouse and warehouse (link)
Resources:
[1] Microsoft Learn: Fabric (2023) Data warehousing documentation in Microsoft Fabric (link)