Disclaimer: This is work in progress intended to consolidate information from various sources for learning purposes. For the latest information please consult the documentation (see the links below)!
Last updated: 25-Apr-2025
[Microsoft Fabric] Incremental Refresh in Dataflows Gen2
- {feature} enables users to incrementally extract data from data sources, apply Power Query transformations, and load the results into various output destinations [5]
 - designed to reduce the amount of data that needs to be processed and retrieved from the source system [8]
 - configurable directly in the dataflow editor [8]
 - doesn't require specifying the historical data range [8]
 - ⇐ the dataflow doesn't remove any data from the destination that's outside the bucket range [8]
 - doesn't require specifying the parameters for the incremental refresh [8]
 - the filters and parameters are automatically added as the last step in the query [8]
 - {prerequisite} the data source
 - supports folding [8]
 - needs to contain a Date/DateTime column that can be used to filter the data [8]
 - {prerequisite} the data destination supports incremental refresh [8]
 - available destinations
 - Fabric Warehouse
 - Azure SQL Database
 - Azure Synapse Analytics
 - Fabric Lakehouse [preview]
 - other destinations can be used in combination with incremental refresh by using a second query that references the staged data to update the data destination [8]
 - allows using incremental refresh to reduce the amount of data that needs to be processed and retrieved from the source system [8]
 - a full refresh from the staged data to the data destination is still needed [8]
 - works by dividing the data into buckets based on a DateTime column [8]
 - each bucket contains the data that changed since the last refresh [8]
 - the dataflow knows what changed by checking the maximum value in the specified column
 - if the maximum value changed for that bucket, the dataflow retrieves the whole bucket and replaces the data in the destination [8]
 - if the maximum value didn't change, the dataflow doesn't retrieve any data [8]
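The bucketing and change-detection logic described above can be sketched as follows. This is a minimal illustrative model, not the actual Dataflow Gen2 implementation; the function names, the `modified` column, and the in-memory state dictionary are all assumptions made for the sake of the example:

```python
from datetime import datetime, timedelta

def make_buckets(start, end, bucket_size):
    """Divide the refresh window into fixed-size time buckets."""
    buckets = []
    cursor = start
    while cursor < end:
        buckets.append((cursor, min(cursor + bucket_size, end)))
        cursor += bucket_size
    return buckets

def refresh(rows, buckets, last_max):
    """Sketch of the change detection: for each bucket, compare the
    maximum value of the DateTime column against the value recorded at
    the last refresh; re-extract the whole bucket only if it changed."""
    to_reload = []
    for lo, hi in buckets:
        in_bucket = [r["modified"] for r in rows if lo <= r["modified"] < hi]
        current_max = max(in_bucket, default=None)
        if current_max != last_max.get((lo, hi)):
            to_reload.append((lo, hi))  # replace this bucket in the destination
            last_max[(lo, hi)] = current_max
    return to_reload
```

On the first run every bucket is new, so all buckets are reloaded; on a subsequent run with unchanged data, nothing is retrieved; adding a row with a later `modified` value triggers a reload of only the affected bucket.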
 - {limitation}
 - the data destination must be set to a fixed schema [8]
 - ⇒ the table's schema in the data destination must be fixed and can't change [8]
 - ⇒ dynamic schema must be changed to fixed schema before configuring incremental refresh [8]
 - {limitation} the only supported update method in the data destination is replace [8]
 - ⇒ the dataflow replaces the data for each bucket in the data destination with the new data [8]
 - data that is outside the bucket range isn't affected [8]
 - {limitation} maximum number of buckets
 - single query: 50
 - {workaround} increase the bucket size or reduce the bucket range to lower the number of buckets [8]
 - whole dataflow: 150
 - {workaround} reduce the number of incremental refresh queries or increase the bucket size [8]
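A quick worked example of the bucket arithmetic behind the workaround. The 100-day range is a hypothetical value chosen for illustration; the 50-bucket-per-query limit is from the notes above:

```python
from math import ceil

range_days = 100                         # hypothetical refresh range
buckets_1d = ceil(range_days / 1)        # 1-day buckets -> 100 buckets, over the 50 limit
buckets_3d = ceil(range_days / 3)        # 3-day buckets -> 34 buckets, within the limit
print(buckets_1d, buckets_3d)
```

Increasing the bucket size from one day to three days brings the same date range under the per-query limit at the cost of reloading more data whenever a bucket changes.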
 - {downside} the dataflow may take longer to refresh after enabling incremental refresh [8]
 - because the additional overhead of checking if data changed and processing the buckets is higher than the time saved by processing less data [8]
 - {recommendation} review the settings for incremental refresh and adjust them to better fit the scenario
 - {option} increase the bucket size to reduce the number of buckets and the overhead of processing them [8]
 - {option} disable incremental refresh [8]
 - {recommendation} don't use the same column for both detecting changes and filtering [8]
 - because this can lead to unexpected results [8]
 - {setting} limit the number of concurrent evaluations
 - setting the value to a lower number reduces the number of requests sent to the source system [8]
 - via global settings >> Scale tab >> maximum number of parallel query evaluations
 - {recommendation} don't enable this limit unless there are issues with the source system [8]
 
  References:
  [5] Microsoft Learn (2023) Fabric: Save a draft of your dataflow [link]
[8] Microsoft Learn (2025) Fabric: Incremental refresh in Dataflow Gen2 [link]
