SQL Troubles: cache

Showing posts with label cache. Show all posts

21 June 2025

🏭🗒️Microsoft Fabric: Result Set Caching in SQL Analytics Endpoints [Notes] 🆕

Disclaimer: This is work in progress intended to consolidate information from various sources for learning purposes. For the latest information please consult the documentation (see the links below)!

Last updated: 21-Jun-2025

[Microsoft Fabric] Result Set Caching in SQL Analytics Endpoints

{def} built-in performance optimization for Warehouse and Lakehouse that improves read latency [1]

fully transparent to the user [3]
persists the final result sets for applicable SELECT T-SQL queries

caches all the data accessed by a query [3]
subsequent runs that "hit" cache will process just the final result set

can bypass complex compilation and data processing of the original query[1]

⇐ returns subsequent queries faster [1]

the cache creation and reuse is applied opportunistically for queries

works on

warehouse tables
shortcuts to OneLake sources
shortcuts to non-Azure sources

the management of cache is handled automatically [1]

regularly evicts cache as needed

as data changes, result consistency is ensured by invalidating cache created earlier [1]

{operation} enable setting

via ALTER DATABASE <database_name> SET RESULT_SET_CACHING ON

{operation} validate setting

via SELECT name, is_result_set_caching_on FROM sys.databases

{operation} configure setting

configurable at item level

once enabled, it can then be disabled

at the item level
for individual queries

e.g. debugging or A/B testing a query

via OPTION ( USE HINT ('DISABLE_RESULT_SET_CACHE')

{default} during the preview, result set caching is off for all items [1]

[monitoring]

via Message Output

applicable to Fabric Query editor, SSMS
the statement "Result set cache was used" is displayed after query execution if the query was able to use an existing result set cache

via queryinsights.exec_requests_history system view

result_cache_hit displays indicates result set cache usage for each query execution [1]

{value} 2: the query used result set cache (cache hit)
{value} 1: the query created result set cache
{value} 0: the query wasn't applicable for result set cache creation or usage [1]

{reason} the cache no longer exists
{reason} the cache was invalidated by a data change, disqualifying it for reuse [1]
{reason} query isn't deterministic

isn't eligible for cache creation [1]

{reason} query isn't a SELECT statement

[warehousing]

{scenario} analytical queries that process large amounts of data to produce a relatively small result [1]
{scenario} workloads that trigger the same analytical queries repeatedly [1]

the same heavy computation can be triggered multiple times, even though the final result remains the same [1]

Previous Post <<||>> Next Post

References:

[1] Microsoft Learn (2025) Result set caching (preview) [link]

[2] Microsoft Fabric Update Blog (2025) Result Set Caching for Microsoft Fabric Data Warehouse (Preview) [link|aka]

[3] Microsoft Learn (2025) In-memory and disk caching [link]

[4] Microsoft Learn (2025) Performance guidelines in Fabric Data Warehouse [link]

Resources:
[R1] Microsoft Fabric (2025) Fabric Update - June 2025 [link]

Acronyms:

MF - Microsoft Fabric

SSMS - SQL Server Management Studio

17 March 2025

🏭🗒️Microsoft Fabric: Caching in Warehouse [Notes]

Disclaimer: This is work in progress intended to consolidate information from various sources for learning purposes. For the latest information please consult the documentation (see the links below)!

Last updated: 17-Mar-2024

[Microsoft Fabric] Caching

{def} technique that improves the performance of data processing by storing frequently accessed data and metadata in a faster storage layer [1]

e.g. local memory, local SSD disk
⇐ subsequent requests can be served faster, directly from the cache [1]

if a set of data has been previously accessed by a query, any subsequent queries will retrieve that data directly from the in-memory cache [1]
local memory operations are notably faster compared to fetching data from remote storage [1]
⇐ significantly diminishes IO latency [1]

fully transparent to the user
consistently active and operates seamlessly in the background [1]
orchestrated and upheld by MF

it doesn't offer users the capability to manually clear the cache [1]

provides transactional consistency

ensures that any modifications to the data in storage after it has been initially loaded into the in-memory cache, will result in consistent data [1]

when the cache reaches its capacity threshold and fresh data is being read for the first time, objects that have remained unused for the longest duration will be removed from the cache [1]

process is enacted to create space for the influx of new data and maintain an optimal cache utilization strategy [1]

{type} in-memory cache

data in cache is organized in a compressed columnar format (aka columnar storage) [1]

⇐ optimized for analytical queries
each column of data is stored separately [1]

{benefit} allows for better compression [1]

since similar data values are stored together [1]

{benefit} reduces the memory footprint

when queries need to perform operations on a specific column, the engine can work more efficiently speeding up queries' execution [1]

it doesn't have to process unnecessary data from other columns
can perform operations on multiple columns simultaneously [1]

taking advantage of modern multi-core processors [1]

when retrieves data from storage, the data is transformed from its original file-based format into highly optimized structures in in-memory cache [1]
{scenario} analytical workloads where queries involve scanning large amounts of data to perform aggregations, filtering, and other data manipulations [1]

{type} disk cache

complementary extension to the in-memory cache
any data loaded into the in-memory cache is also serialized to the SSD cache [1]

data removed from the in-memory cache remains within the SSD cache for an extended period
when subsequent query requests the data, it is retrieved from the SSD cache into the in-memory cache quicker [1]

{issue} cold run (aka cold cache) performance

the first 1-3 executions of a query perform noticeably slower than subsequent executions [2]

if the first run's performance is crucial, try manually creating statistics (aka pre-warming the cache) [2]
otherwise, one can rely on the automatic statistics generated in the first query run and leveraged in subsequent runs [2]
as long as underlying data does not change significantly [2]

differentiated from

[Kusto] caching policy [link]
[Apache Spark] intelligent cache [link]
[Power BI] query caching [link]
[Azure] caching [link]

Previous Post <<||>> Next Post

References:

[1] Microsoft Learn (2024) Fabric: Caching in Fabric data warehousing [link]

[2] Microsoft Learn (2024) Fabric Data Warehouse performance guidelines [link]

[3] Sandeep Pawar (2023) Pre-Warming The Direct Lake Dataset For Warm Cache Import-Like Performance [link]

Resources:

[R1] Microsoft Learn (2025) Fabric: What's new in Microsoft Fabric? [link]

Acronyms:
IO - Input/Output
MF - Microsoft Fabric
SSD - Solid State Drive

13 June 2011

🔹SQL Server: Procedure Cache (Definitions)

"An area of SQL Server memory used to hold trees and query plans that have recently been used or are currently in use by SQL Server." (Owen Williams, "MCSE TestPrep: SQL Server 6.5 Design and Implementation", 1998)

"A temporary storage location for the current, executing version of a specific stored procedure." (Microsoft Corporation, "SQL Server 7.0 System Administration Training Kit", 1999)

"The part of the SQL Server memory pool that is used to store execution plans for Transact-SQL batches, stored procedures, and triggers." (Anthony Sequeira & Brian Alderman, "The SQL Server 2000 Book", 2003)

"A shared, reserved memory area that stores the most recently executed SQL statements or PL/SQL procedures (including triggers and functions). Also called SQL cache." (Carlos Coronel et al, "Database Systems: Design, Implementation, and Management" 9th Ed., 2011)

"The part of the SQL Server memory pool that is used to store execution plans for Transact-SQL batches, stored procedures, and triggers." (Microsoft, "SQL Server 2012 Glossary", 2012)

SQL Troubles

Pages