Disclaimer: This is work in progress intended to consolidate information from various sources for learning purposes. For the latest information please consult the documentation (see the links below)!
Last updated: 12-Mar-2025
         
[Microsoft Fabric] Query Acceleration
- {def}
 - indexes and caches, on the fly, the data landing in OneLake [2]
 - {benefit} allows users to
 - analyze real-time streams coming directly into Eventhouse and combine them with data landing in OneLake
 - ⇐ either coming from mirrored databases, Warehouses, Lakehouses or Spark [2]
 - ⇒ accelerate data landing in OneLake
 - ⇐ including existing data and any new updates, and expect similar performance [1]
 - eliminates the need to
 - manage ingestion pipelines [1]
 - maintain duplicate copies of data [1]
 - ensures that data remains in sync without additional effort [4]
 - the duration of the initial acceleration process depends on the size of the external table [4]
 - ⇐ provides significant performance gains, comparable to ingesting the data into Eventhouse [1]
 - in some cases 50x faster or more [2]
 - ⇐ supported in Eventhouse over delta tables from OneLake shortcuts, etc. [4]
 - when creating a shortcut from an Eventhouse to a OneLake delta table, users can choose if they want to accelerate the shortcut [2]
 - accelerating the shortcut is equivalent to ingesting the data into the Eventhouse
 - ⇐ optimizations that deliver the same level of performance for accelerated shortcuts as native Eventhouse tables [2]
 - e.g. indexing, caching, etc.
 - all data management is done by the data writer and in the Eventhouse the accelerated table shortcut [2]
 - accelerated shortcuts behave like external tables, with the same limitations and capabilities [4]
 - {limitation} materialized views aren't supported [1]
 - {limitation} update policies aren't supported [1]
 - allows specifying a policy on top of external delta tables that defines the number of days to cache data for high-performance queries [1]
 - ⇐ queries run over OneLake shortcuts can be less performant than on data that is ingested directly to Eventhouses [1]
 - ⇐ due to network calls to fetch data from storage, the absence of indexes, etc. [1]
 - {costs} charged under the OneLake Premium cache meter [2]
 - ⇐ similar to native Eventhouse tables [2]
 - one can control the amount of data to accelerate by configuring the number of days to cache [2]
 - indexing activity may also count towards CU consumption [2]
 - {limitation} the number of columns in the external table can't exceed 900 [1]
 - {limitation} query performance over accelerated external delta tables which have partitions may not be optimal during preview [1]
 - {limitation} the feature assumes delta tables with static advanced features
 - e.g. column mapping doesn't change, partitions don't change, etc
 - {recommendation} to change advanced features, first disable the policy, and once the change is made, re-enable the policy [1]
 - {limitation} schema changes on the delta table must also be followed by the respective .alter external delta table schema command [1]
 - might result in acceleration starting from scratch if there was a breaking schema change [1]
 - {limitation} index-based pruning isn't supported for partitions [1]
 - {limitation} parquet files with a compressed size higher than 6 GB won't be cached [1]
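
The caching policy from the notes above (the number of days to cache) is set with a management command on the external table. Below is a minimal Python sketch that only builds the command text; it doesn't connect to an Eventhouse. The command shape (`.alter external table ... policy query_acceleration`) and the `IsEnabled`/`Hot` properties reflect my reading of the Kusto documentation and should be checked against [3]. The helper names and the table name `MyShortcut` are hypothetical.

```python
import json
from datetime import timedelta


def to_kusto_timespan(span: timedelta) -> str:
    """Format a timedelta as a d.hh:mm:ss timespan string."""
    total = int(span.total_seconds())
    days, rem = divmod(total, 86400)
    hours, rem = divmod(rem, 3600)
    minutes, seconds = divmod(rem, 60)
    return f"{days}.{hours:02d}:{minutes:02d}:{seconds:02d}"


def acceleration_policy_command(table: str, hot_days: int, enabled: bool = True) -> str:
    """Build the text of a query acceleration policy command.

    Assumptions: the policy JSON has the properties IsEnabled and Hot,
    and the table name needs no special quoting (not handled here).
    """
    if hot_days < 1:
        raise ValueError("hot_days must be at least 1")
    policy = {
        "IsEnabled": enabled,
        "Hot": to_kusto_timespan(timedelta(days=hot_days)),
    }
    return (
        f".alter external table {table} "
        f"policy query_acceleration '{json.dumps(policy)}'"
    )


if __name__ == "__main__":
    # Cache the last 7 days of data for the (hypothetical) shortcut.
    print(acceleration_policy_command("MyShortcut", 7))
    # Per the recommendation above: disable the policy before changing
    # advanced delta features, then re-enable it afterwards.
    print(acceleration_policy_command("MyShortcut", 7, enabled=False))
    print(acceleration_policy_command("MyShortcut", 7, enabled=True))
```

The generated text can be pasted into a KQL queryset. The number of days is kept as an explicit parameter because it controls how much data is accelerated, and therefore the cost.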
 
    References:
[1] Microsoft Learn (2024) Fabric: Query acceleration for OneLake shortcuts - overview (preview) [link]
[2] Microsoft Fabric Updates Blog (2024) Announcing Eventhouse Query Acceleration for OneLake Shortcuts (Preview) [link]
[3] Microsoft Learn (2024) Fabric: Query acceleration over OneLake shortcuts (preview) [link]
[4] Microsoft Fabric Updates Blog (2025) Eventhouse Accelerated OneLake Table Shortcuts – Generally Available [link] 
Resources:
[R1] Microsoft Learn (2025) Fabric: What's new in Microsoft Fabric? [link]

