Disclaimer: This is work in progress intended to consolidate information from various sources (some unfortunately not available anymore).
Query plan (aka query execution plan, execution plan)
- read-only reentrant structure that is shared by multiple users [1]
- optimized version of the query tree [7]
- defines
- the sequence in which the source tables are accessed [7]
- the methods used to extract data from each table [7]
- whether it uses an index or a table scan [7]
- the methods used to compute calculations [7]
- the methods used to filter, aggregate, and sort data from each table [7]
- built by the Query Optimizer
- it chooses a query plan using a set of heuristics to balance compilation time and plan optimality in order to find a good query plan [7]
- stored in the plan cache
- remain in the plan cache as long as there is enough memory to store them
- under memory pressure exists, a cost-based approach is used to determine which execution plans to remove from the plan cache [7]
- execution plans are examined multiple times until enough have been removed to satisfy memory requirements [7]
- ⇒ an execution plan may have its cost increased and decreased more than once [7]
- the database engine uses the resource monitor and user worker threads to free memory from the plan cache in response to memory pressure [7]
- examine plans run concurrently to decrease the current cost for each unused execution plan [7]
- the resource monitor removes execution plans from the plan cache when global memory pressure exists [7]
- it frees memory to enforce policies for system memory, process memory, resource pool memory, and maximum size for all caches [7]
- the user worker threads remove execution plans from the plan cache when single cache memory pressure exists [7]
- it enforce policies for maximum single cache size and maximum single cache entries [7]
- when memory pressure no longer exists, the decreasing of current cost of unused execution plans is stopped [7]
- all execution plans remain in the plan cache, even if their cost is zero [7]
- one or all plans can be removed from the plan cache [7]
- via DBCC FREEPROCCACHE
- via DBCC FREESYSTEMCACHE can also be used to clear any cache, including plan cache
- [SQL Server 2016] via ALTER DATABASE SCOPED CONFIGURATION CLEAR PROCEDURE_CACHE
- clears the procedure (plan) cache for the database in scope [7]
- changes in some configuration settings via sp_configure and reconfigure cause plans to be removed from plan cache [7]
- agnostic to both parameter values and user-specific information
- no user context is stored in the query plan [5]
- the text of each newly submitted query is hashed and cached in memory, along with the original query text and a link to the execution context and the execution plan that the original query resulted in [2]
- allows avoid redundantly parsing and optimizing queries [2]
- must first go through a conversion to be passed to the LPE process to be carried out
- ⇐ regardless of the stage that actually produces the execution plan
- this is handled by the conversion stage of the optimization process
- each optimized query plan is processed by this stage before being returned for execution [3]
- [SQL Server 2005] is put in the plan cache when a cacheable batch is submitted for execution and compiled [1]
- a portion from the buffer pool is used for caching query plans (aka plan cache) [3]
- [SQL Server 2005] there are at most two instances of a query plan at any time in plan cache [1]
- one for all of the serial executions
- one for all of the parallel executions
- common for all parallel executions, regardless of their degrees of parallelism [5]
- a necessary condition to compile a parallel query plan is that the minimum of the number of processors must be more than one after having survived:
- the processor affinity mask
- the 'max degree of parallelism' server-wide option (possibly set using the "sp_configure" stored procedure)
- [SQL Server 2008] the MAXDOP limits imposed on the resource pool that the query is executing within, by the Resource Governor
- any MAXDOP hint on the query
- if two identical queries posed by the same user using two different sessions with the same session options arrive simultaneously, two query plans exists while they execute [3]
- at the end of their executions, the plan for only one of them is retained in the plan cache [3]
- can be cached (aka query plan caching)
- query plans are cashed for many types of batches [3]
- an exact text match for two ad-hoc queries before reuse can occur [3]
- the text match is both case- and space-sensitive, even on case-insensitive servers [3]
- if a query plan is not cached, its reuse opportunity is zero [3]
- such a plan will be compiled every time it is executed, potentially resulting in poor performance [3]
- in some cases, non-caching is a desirable outcome [3]
- can be reused (aka query plan reuse)
- when the plan cache is searched for plan reuse opportunities, the comparisons are against query plans, not against execution contexts [3]
- once a reusable query plan is found, an available execution context is found (causing execution context reuse) or freshly generated [3]
- does not necessarily imply execution context reuse [3]
- query plans should be reused whenever possible and optimal to avoid unnecessary query compilation costs [3]
- factors that affect plan-reuse:
- [stored procedures and triggers] the execution plan is executed separately from the execution plan for the batch calling the stored procedure or firing the trigger [7]
- allows for greater reuse of their execution plans [7]
- a query plan for a stored procedure is not reused when executed across different databases
- doesn’t apply to ad-hoc queries, prepared queries, or dynamic SQL
- trigger execution
- the number of rows affected by that execution (1 versus n) is a distinguishing factor in determining a plan cache hit
- ⇐ specific to triggers, and does not apply to stored procedures.
- [SQL Server 2005] INSTEAD OF triggers
- the "1-plan" is shared by executions that affect both 0 and 1 row
- [SQL Server 2005] non-INSTEAD OF ("after") triggers
- "1-plan" is only used by executions that affect 1 row and "n-plan" is used by executions that affect both 0 and n rows (n > 1)
- triggers associated with bulk inserts are cached.
- bulk insert statements are never cached
- a batch that contains any one literal longer than 8 KB is not cached
- a literal's length is measured after constant folding is applied
- batches flagged with the "replication flag" associated with a replication user are not matched with batches without that flag.
- batch called from CLR is not matched with the same batch submitted from outside of CLR
- ⇐ applies to CLR triggers and non-CLR triggers
- two CLR-submitted batches can reuse the same plan.
- notification queries and non-notification queries
- query plans for queries submitted via sp_resyncquery are not cached
- if the query is resubmitted (via sp_resyncquery or otherwise), it needs to be compiled again
- batches within a cursor with non-cursor batches
- certain SET options (see [3])
- {best practice} to avoid SET option-related recompilations, establish SET options at connection time, and ensure that they do not change for the duration of the connection [3]
- batches with unqualified object names result in non-reuse of query plans.
- stored procedure created with "CREATE PROCEDURE ...WITH RECOMPILE
- stored procedure executed using "EXEC ... WITH RECOMPILE"
- trace flag 8675
- determines when the optimizer times out while generating a query plan [2]
- tells when the optimizer reaches the memory limit to which SQL Server restricts it [2]
- ⇐ about 80% of the BPool
- {concept} trivial plan
- results when the query optimizer determines that given the tables referenced in the query and the indexes existing on them, only one plan is possible [3]
- usually a single table is referenced
- doesn't require any cost estimation or comparisons between plans [2]
- in XML plan will appear statementOptmLevel="TRIVIAL"
- a query that has generated a trivial plan may not always generate a trivial plan
- e.g. new indexes might be created on the underlying tables, and so multiple access paths become available to the query optimizer
- may be generated for certain types of covered queries, joins between tables with constraint relationships, DML statements [2]
- a query that contains a subquery is not eligible for a trivial plan [2]
- using a variable in the query doesn't impair the optimizer's ability to select a trivial plan [12]
- regardless of variable's value at runtime, a simple equality operator is being used to compare the variable with a uniquely indexed column, and the optimizer intrinsically knows that a trivial plan is the best option [2]
- 8759 trace flag
- allows detecting when a trivial plan has been chosen [2]
- {enabled} the optimizer writes the first portion of a query for which a trivial plan is created to the error log [2]
- {concept} execution context
- data structure that holds parameter values and user-specific information specific to the executed query
- derived from a query plan 'on the fly' [3]
- before a batch execution begins, a skeleton execution context is generated
- as execution proceeds, the necessary execution context pieces are generated and put into the skeleton
- ⇒ two execution contexts need not be identical even after user-specific information and query parameters are deleted from them
- because structures of execution contexts derived from the same query plan can differ from one another, the execution context used for a particular execution has slight impact on performance [3]
- the impact of the differences diminishes over time as the plan cache gets 'hot' and as a steady state is reached [3]
- holds the values that are needed for a specific execution of a query plan [3]
- multiple execution contexts may exist for a query plan
- ⇒ it cannot exist in the plan cache without an associated query plan [3]
- ⇒ whenever a query plan is removed from plan cache, all of the associated execution contexts are also removed along with it [3]
- cached (aka execution context caching)
- can be reused (aka execution context reuse)
- each user concurrently executing a batch will have an execution context
- is not reentrant
- ⇐ at any point of time, an execution context can execute only one batch submitted by a session
- ⇐ while the execution is happening, the context is not given to any other session or user [3]
- if a batch execution generates an error of severity 11 or higher, the execution context is destroyed [3]
- ⇒ even in the absence of memory pressure the number of execution contexts (for a given query plan) cached in plan cache can go up and down [3]
- if a parallel query plan is compiled, a serial execution context may be derived from the parallel plan at execution time by disabling parallelism-specific plan operators [3]
- execution contexts that are associated with parallel branches of the parallel plan are not cached, however serial branches of the parallel plan are cached [3]
- {concept} plan cost
- associated with query plans and execution contexts
- partially controls how long the plan or context will live in the plan cache
- the cost of an ad-hoc query is zero
- [SQL Server 2000] for a query plan, the cost is a measure of the server resources (CPU time and I/O) it takes the query optimizer to optimize the batch
- for ad-hoc queries, the cost is zero
- [SQL Server 2000] for an execution context, the cost is a measure of the server resources (CPU time and I/O) it takes for the server to initialize the execution context so that the individual statements become ready for execution
- doesn't include costs (CPU and I/O) incurred during batch executions
- typically, execution context costs are lower than query plan costs
- [SQL Server 2005] the cost of a query plan is a measure of the amount of resources required to produce it
- the cost is calculated in 'number of ticks' with a maximum value of 31
- the individual parts of the cost are calculated as follows [3]
- two I/Os cost 1 tick, with a maximum of 19 ticks
- two context switches cost 1 tick, with a maximum of 8 ticks
- sixteen memory pages (128 KB) cost 1 tick, with a maximum of 4 ticks
- as soon as the size of the plan cache reaches 50% of the buffer pool size, the next plan cache access decrements the ticks of all of the plans by 1 each [3]
- because this decrement is piggybacked on a thread that accesses the plan cache for plan lookup purpose, the decrement can be considered to occur in a lazy fashion [3]
- if the sum of the sizes of all of the caches reaches or exceeds 75% of the buffer pool size, a dedicated resource monitor thread gets activated, and it decrements tick counts of all of the objects in all of the caches
- this thread's behavior approximates that of the lazy-writer thread in SQL Server 2000
- cost = I/O cost + context switch cost (a measure of CPU cost) + memory cost
- the plan cache is distinct from the data cache
- there are other functionality-specific caches
- the lazy-writer process does not decrement costs
- query plan reuse causes the query plan cost to be reset to its initial value [3]
- {concept} query optimization
- the multi-phased process of selecting a 'good-enough' query execution plan from potentially many possible plans [7]
- involves the overhead of selecting the query and selecting a plan
- the overhead is typically saved several-fold when the query optimizer picks an efficient execution plan [7]
- {concept} [SQL Server 2022] optimized plan forcing
- part of intelligent query processing family of features
- reduces compilation overhead for repeating forced queries [7]
- requires the Query Store to be enabled and in 'read write' mode
- once the query execution plan is generated, specific compilation steps are stored for reuse as an optimization replay script [7]
- stored as part of the compressed showplan XML in Query Store, in a hidden OptimizationReplay attribute [7]
- {concept} estimated execution plan
- the compiled plan, as produced by the query optimizer [7]
- {concept} actual execution plan
- the compiled plan plus its execution context [7]
- includes runtime information available after the execution completes [7]
- e.g. execution warnings, or in newer versions of the Database Engine, the elapsed and CPU time used during execution [7]
- {concept} Live Query Statistics
- is the same as the compiled plan plus its execution context
- includes runtime information during execution progress [7]
- e.g. the actual number of rows flowing through the operators
- updated every second. Runtime information includes for example [7]
- {concept} query plan hash
- a binary hash value calculated on the execution plan for a given query [7]
- used to uniquely identify similar execution plans [7]
- {concept} query hash
- a binary hash value calculated on the Transact-SQL text of a query [7]
- used to uniquely identify queries [7]
- {concept} plan cache
- part of the memory pool that is used to store execution plans [7]
- has two stores for all compiled plans
- Object Plans cache store (OBJCP)
- used for plans related to persisted objects (stored procedures, functions, and triggers) [7]
- SQL Plans cache store (SQLCP)
- used for plans related to auto-parameterized, dynamic, or prepared queries [7]
- has two additional stores that don't store compiled plans
- Bound Trees cache store (PHDR)
- used for data structures (aka bound trees, algebraizer trees) used during plan compilation for views, constraints, and defaults [7]
- Extended Stored Procedures cache store (XPROC)
- used for predefined system procedures that are defined using a DLL, not using Transact-SQL statements [7]
- e.g. sp_executeSql or xp_cmdshell
- the cached structure contains only the function name and the DLL name in which the procedure is implemented [7]
- {concept} plan handle
- transient identifier that allows to retrieve a compiled plan from the plan cache [7]
- remains constant only while the plan remains in the cache [7]
- hash value derived from the compiled plan of the entire batch [7]
- remains the same even if one or more statements in the batch get recompiled [7]
- if a plan was compiled for a batch instead of a single statement, the plan for individual statements in the batch can be retrieved using the plan handle and statement offsets [7]
- {concept} recompile execution plans
- certain changes in a database can cause an execution plan to be either inefficient or invalid, based on the new state of the database [7]
- the changes that invalidate an execution plan are detected and the plan are marked as not valid [7]
- ⇒ a new plan must then be recompiled for the next connection that executes the query [7]
- conditions that invalidate a plan include the following:
- changes made to a table or view referenced by the query [7]
- via ALTER TABLE or ALTER VIEW
- changes made to a single procedure, which would drop all plans for that procedure from the cache [7]
- via ALTER PROCEDURE
- changes to any indexes used by the execution plan [7]
- updates on statistics used by the execution plan, generated either explicitly from a statement or generated automatically.
- dropping an index used by the execution plan.
- an explicit call to sp_recompile.
- large numbers of changes to keys
- generated by INSERT or DELETE statements from other users that modify a table referenced by the query [7]
- for tables with triggers, if the number of rows in the inserted or deleted tables grows significantly [7]
- executing a stored procedure using the WITH RECOMPILE option [7]
- most recompilations are required either for statement correctness or to obtain potentially faster query execution plans [7]
Acronyms
MAXDOP - Maximum Degree of Parallelism
Resources:
[1] Scalability Experts, "Microsoft® SQL Server 2005: Changing the Paradigm", 2005
[2] Ken Henderson (2004) The Guru's Guide to SQL Server Architecture and Internals 55/12
[3] Greg Low, Plan Caching and Recompilation in SQL Server 2012, 2013 [3]
[4] MCITP Self-Paced Training Kit Exam 70-442
[5] Microsoft Learn (2012) Execution Plan Caching and Reuse (link)
[6] Microsoft Learn (2023) Execution plan overview (link)
[7] Microsoft Learn (2023) Optimized plan forcing with Query Store (link)
[8] Microsoft Learn (2023) Query processing architecture guide (link)
No comments:
Post a Comment