Disclaimer: This is work in progress intended to consolidate information from various sources (some unfortunately not available anymore).
Query plan (aka query execution plan, execution plan)
- read-only reentrant structure that is shared by multiple users [1]
 - optimized version of the query tree [7]
 - defines
 - the sequence in which the source tables are accessed [7]
 - the methods used to extract data from each table [7]
 - whether it uses an index or a table scan [7]
 - the methods used to compute calculations [7]
 - the methods used to filter, aggregate, and sort data from each table [7]
 - built by the Query Optimizer
 - it chooses a query plan using a set of heuristics to balance compilation time and plan optimality in order to find a good query plan [7]
 - stored in the plan cache
 - remain in the plan cache as long as there is enough memory to store them
 - under memory pressure exists, a cost-based approach is used to determine which execution plans to remove from the plan cache [7]
 - execution plans are examined multiple times until enough have been removed to satisfy memory requirements [7]
 - ⇒ an execution plan may have its cost increased and decreased more than once [7]
 - the database engine uses the resource monitor and user worker threads to free memory from the plan cache in response to memory pressure [7]
 - examine plans run concurrently to decrease the current cost for each unused execution plan [7]
 - the resource monitor removes execution plans from the plan cache when global memory pressure exists [7]
 - it frees memory to enforce policies for system memory, process memory, resource pool memory, and maximum size for all caches [7]
 - the user worker threads remove execution plans from the plan cache when single cache memory pressure exists [7]
 - it enforce policies for maximum single cache size and maximum single cache entries [7]
 - when memory pressure no longer exists, the decreasing of current cost of unused execution plans is stopped [7]
 - all execution plans remain in the plan cache, even if their cost is zero [7]
 - one or all plans can be removed from the plan cache [7]
 - via DBCC FREEPROCCACHE
 - via DBCC FREESYSTEMCACHE can also be used to clear any cache, including plan cache
 - [SQL Server 2016] via ALTER DATABASE SCOPED CONFIGURATION CLEAR PROCEDURE_CACHE
 - clears the procedure (plan) cache for the database in scope [7]
 - changes in some configuration settings via sp_configure and reconfigure cause plans to be removed from plan cache [7]
 - agnostic to both parameter values and user-specific information
 - no user context is stored in the query plan [5]
 - the text of each newly submitted query is hashed and cached in memory, along with the original query text and a link to the execution context and the execution plan that the original query resulted in [2]
 - allows avoid redundantly parsing and optimizing queries [2]
 - must first go through a conversion to be passed to the LPE process to be carried out
 - ⇐ regardless of the stage that actually produces the execution plan
 - this is handled by the conversion stage of the optimization process
 - each optimized query plan is processed by this stage before being returned for execution [3]
 - [SQL Server 2005] is put in the plan cache when a cacheable batch is submitted for execution and compiled [1]
 - a portion from the buffer pool is used for caching query plans (aka plan cache) [3]
 - [SQL Server 2005] there are at most two instances of a query plan at any time in plan cache [1]
 - one for all of the serial executions
 - one for all of the parallel executions
 - common for all parallel executions, regardless of their degrees of parallelism [5]
 - a necessary condition to compile a parallel query plan is that the minimum of the number of processors must be more than one after having survived:
 - the processor affinity mask
 - the 'max degree of parallelism' server-wide option (possibly set using the "sp_configure" stored procedure)
 - [SQL Server 2008] the MAXDOP limits imposed on the resource pool that the query is executing within, by the Resource Governor
 - any MAXDOP hint on the query
 - if two identical queries posed by the same user using two different sessions with the same session options arrive simultaneously, two query plans exists while they execute [3]
 - at the end of their executions, the plan for only one of them is retained in the plan cache [3]
 - can be cached (aka query plan caching)
 - query plans are cashed for many types of batches [3]
 - an exact text match for two ad-hoc queries before reuse can occur [3]
 - the text match is both case- and space-sensitive, even on case-insensitive servers [3]
 - if a query plan is not cached, its reuse opportunity is zero [3]
 - such a plan will be compiled every time it is executed, potentially resulting in poor performance [3]
 - in some cases, non-caching is a desirable outcome [3]
 - can be reused (aka query plan reuse)
 - when the plan cache is searched for plan reuse opportunities, the comparisons are against query plans, not against execution contexts [3]
 - once a reusable query plan is found, an available execution context is found (causing execution context reuse) or freshly generated [3]
 - does not necessarily imply execution context reuse [3]
 - query plans should be reused whenever possible and optimal to avoid unnecessary query compilation costs [3]
 - factors that affect plan-reuse:
 - [stored procedures and triggers] the execution plan is executed separately from the execution plan for the batch calling the stored procedure or firing the trigger [7]
 - allows for greater reuse of their execution plans [7]
 - a query plan for a stored procedure is not reused when executed across different databases
 - doesn’t apply to ad-hoc queries, prepared queries, or dynamic SQL
 - trigger execution
 - the number of rows affected by that execution (1 versus n) is a distinguishing factor in determining a plan cache hit
 - ⇐ specific to triggers, and does not apply to stored procedures.
 - [SQL Server 2005] INSTEAD OF triggers
 - the "1-plan" is shared by executions that affect both 0 and 1 row
 - [SQL Server 2005] non-INSTEAD OF ("after") triggers
 - "1-plan" is only used by executions that affect 1 row and "n-plan" is used by executions that affect both 0 and n rows (n > 1)
 - triggers associated with bulk inserts are cached.
 - bulk insert statements are never cached
 - a batch that contains any one literal longer than 8 KB is not cached
 - a literal's length is measured after constant folding is applied
 - batches flagged with the "replication flag" associated with a replication user are not matched with batches without that flag.
 - batch called from CLR is not matched with the same batch submitted from outside of CLR
 - ⇐ applies to CLR triggers and non-CLR triggers
 - two CLR-submitted batches can reuse the same plan.
 - notification queries and non-notification queries
 - query plans for queries submitted via sp_resyncquery are not cached
 - if the query is resubmitted (via sp_resyncquery or otherwise), it needs to be compiled again
 - batches within a cursor with non-cursor batches
 - certain SET options (see [3])
 - {best practice} to avoid SET option-related recompilations, establish SET options at connection time, and ensure that they do not change for the duration of the connection [3]
 - batches with unqualified object names result in non-reuse of query plans.
 - stored procedure created with "CREATE PROCEDURE ...WITH RECOMPILE
 - stored procedure executed using "EXEC ... WITH RECOMPILE"
 - trace flag 8675
 - determines when the optimizer times out while generating a query plan [2]
 - tells when the optimizer reaches the memory limit to which SQL Server restricts it [2]
 - ⇐ about 80% of the BPool
 - {concept} trivial plan
 - results when the query optimizer determines that given the tables referenced in the query and the indexes existing on them, only one plan is possible [3]
 - usually a single table is referenced
 - doesn't require any cost estimation or comparisons between plans [2]
 - in XML plan will appear statementOptmLevel="TRIVIAL"
 - a query that has generated a trivial plan may not always generate a trivial plan
 - e.g. new indexes might be created on the underlying tables, and so multiple access paths become available to the query optimizer
 - may be generated for certain types of covered queries, joins between tables with constraint relationships, DML statements [2]
 - a query that contains a subquery is not eligible for a trivial plan [2]
 - using a variable in the query doesn't impair the optimizer's ability to select a trivial plan [12]
 - regardless of variable's value at runtime, a simple equality operator is being used to compare the variable with a uniquely indexed column, and the optimizer intrinsically knows that a trivial plan is the best option [2]
 - 8759 trace flag
 - allows detecting when a trivial plan has been chosen [2]
 - {enabled} the optimizer writes the first portion of a query for which a trivial plan is created to the error log [2]
 - {concept} execution context
 - data structure that holds parameter values and user-specific information specific to the executed query
 - derived from a query plan 'on the fly' [3]
 - before a batch execution begins, a skeleton execution context is generated
 - as execution proceeds, the necessary execution context pieces are generated and put into the skeleton
 - ⇒ two execution contexts need not be identical even after user-specific information and query parameters are deleted from them
 - because structures of execution contexts derived from the same query plan can differ from one another, the execution context used for a particular execution has slight impact on performance [3]
 - the impact of the differences diminishes over time as the plan cache gets 'hot' and as a steady state is reached [3]
 - holds the values that are needed for a specific execution of a query plan [3]
 - multiple execution contexts may exist for a query plan
 - ⇒ it cannot exist in the plan cache without an associated query plan [3]
 - ⇒ whenever a query plan is removed from plan cache, all of the associated execution contexts are also removed along with it [3]
 - cached (aka execution context caching)
 - can be reused (aka execution context reuse)
 - each user concurrently executing a batch will have an execution context
 - is not reentrant
 - ⇐ at any point of time, an execution context can execute only one batch submitted by a session
 - ⇐ while the execution is happening, the context is not given to any other session or user [3]
 - if a batch execution generates an error of severity 11 or higher, the execution context is destroyed [3]
 - ⇒ even in the absence of memory pressure the number of execution contexts (for a given query plan) cached in plan cache can go up and down [3]
 - if a parallel query plan is compiled, a serial execution context may be derived from the parallel plan at execution time by disabling parallelism-specific plan operators [3]
 - execution contexts that are associated with parallel branches of the parallel plan are not cached, however serial branches of the parallel plan are cached [3]
 - {concept} plan cost
 - associated with query plans and execution contexts
 - partially controls how long the plan or context will live in the plan cache
 - the cost of an ad-hoc query is zero
 - [SQL Server 2000] for a query plan, the cost is a measure of the server resources (CPU time and I/O) it takes the query optimizer to optimize the batch
 - for ad-hoc queries, the cost is zero
 - [SQL Server 2000] for an execution context, the cost is a measure of the server resources (CPU time and I/O) it takes for the server to initialize the execution context so that the individual statements become ready for execution
 - doesn't include costs (CPU and I/O) incurred during batch executions
 - typically, execution context costs are lower than query plan costs
 - [SQL Server 2005] the cost of a query plan is a measure of the amount of resources required to produce it
 - the cost is calculated in 'number of ticks' with a maximum value of 31
 - the individual parts of the cost are calculated as follows [3]
 - two I/Os cost 1 tick, with a maximum of 19 ticks
 - two context switches cost 1 tick, with a maximum of 8 ticks
 - sixteen memory pages (128 KB) cost 1 tick, with a maximum of 4 ticks
 - as soon as the size of the plan cache reaches 50% of the buffer pool size, the next plan cache access decrements the ticks of all of the plans by 1 each [3]
 - because this decrement is piggybacked on a thread that accesses the plan cache for plan lookup purpose, the decrement can be considered to occur in a lazy fashion [3]
 - if the sum of the sizes of all of the caches reaches or exceeds 75% of the buffer pool size, a dedicated resource monitor thread gets activated, and it decrements tick counts of all of the objects in all of the caches
 - this thread's behavior approximates that of the lazy-writer thread in SQL Server 2000
 - cost = I/O cost + context switch cost (a measure of CPU cost) + memory cost
 - the plan cache is distinct from the data cache
 - there are other functionality-specific caches
 - the lazy-writer process does not decrement costs
 - query plan reuse causes the query plan cost to be reset to its initial value [3]
 - {concept} query optimization
 - the multi-phased process of selecting a 'good-enough' query execution plan from potentially many possible plans [7]
 - involves the overhead of selecting the query and selecting a plan
 - the overhead is typically saved several-fold when the query optimizer picks an efficient execution plan [7]
 - {concept} [SQL Server 2022] optimized plan forcing
 - part of intelligent query processing family of features
 - reduces compilation overhead for repeating forced queries [7]
 - requires the Query Store to be enabled and in 'read write' mode
 - once the query execution plan is generated, specific compilation steps are stored for reuse as an optimization replay script [7]
 - stored as part of the compressed showplan XML in Query Store, in a hidden OptimizationReplay attribute [7]
 - {concept} estimated execution plan
 - the compiled plan, as produced by the query optimizer [7]
 - {concept} actual execution plan
 - the compiled plan plus its execution context [7]
 - includes runtime information available after the execution completes [7]
 - e.g. execution warnings, or in newer versions of the Database Engine, the elapsed and CPU time used during execution [7]
 - {concept} Live Query Statistics
 - is the same as the compiled plan plus its execution context
 - includes runtime information during execution progress [7]
 - e.g. the actual number of rows flowing through the operators
 - updated every second. Runtime information includes for example [7]
 - {concept} query plan hash
 - a binary hash value calculated on the execution plan for a given query [7]
 - used to uniquely identify similar execution plans [7]
 - {concept} query hash
 - a binary hash value calculated on the Transact-SQL text of a query [7]
 - used to uniquely identify queries [7]
 - {concept} plan cache
 - part of the memory pool that is used to store execution plans [7]
 - has two stores for all compiled plans
 - Object Plans cache store (OBJCP)
 - used for plans related to persisted objects (stored procedures, functions, and triggers) [7]
 - SQL Plans cache store (SQLCP)
 - used for plans related to auto-parameterized, dynamic, or prepared queries [7]
 - has two additional stores that don't store compiled plans
 - Bound Trees cache store (PHDR)
 - used for data structures (aka bound trees, algebraizer trees) used during plan compilation for views, constraints, and defaults [7]
 - Extended Stored Procedures cache store (XPROC)
 - used for predefined system procedures that are defined using a DLL, not using Transact-SQL statements [7]
 - e.g. sp_executeSql or xp_cmdshell
 - the cached structure contains only the function name and the DLL name in which the procedure is implemented [7]
 - {concept} plan handle
 - transient identifier that allows to retrieve a compiled plan from the plan cache [7]
 - remains constant only while the plan remains in the cache [7]
 - hash value derived from the compiled plan of the entire batch [7]
 - remains the same even if one or more statements in the batch get recompiled [7]
 - if a plan was compiled for a batch instead of a single statement, the plan for individual statements in the batch can be retrieved using the plan handle and statement offsets [7]
 - 
    {concept} recompile execution plans
    
- certain changes in a database can cause an execution plan to be either inefficient or invalid, based on the new state of the database [7]
 - the changes that invalidate an execution plan are detected and the plan are marked as not valid [7]
 - ⇒ a new plan must then be recompiled for the next connection that executes the query [7]
 - conditions that invalidate a plan include the following:
 - changes made to a table or view referenced by the query [7]
 - via ALTER TABLE or ALTER VIEW
 - changes made to a single procedure, which would drop all plans for that procedure from the cache [7]
 - via ALTER PROCEDURE
 - changes to any indexes used by the execution plan [7]
 - updates on statistics used by the execution plan, generated either explicitly from a statement or generated automatically.
 - dropping an index used by the execution plan.
 - an explicit call to sp_recompile.
 - large numbers of changes to keys
 - generated by INSERT or DELETE statements from other users that modify a table referenced by the query [7]
 - for tables with triggers, if the number of rows in the inserted or deleted tables grows significantly [7]
 - executing a stored procedure using the WITH RECOMPILE option [7]
 - most recompilations are required either for statement correctness or to obtain potentially faster query execution plans [7]
 
 
  Resources:
[1] Scalability Experts, "Microsoft® SQL Server 2005: Changing the
  Paradigm", 2005
[2] Ken Henderson (2004) The Guru's Guide to SQL Server
  Architecture and Internals 55/12
[3] Greg Low, Plan Caching and
  Recompilation in SQL Server 2012, 2013 [3]
[4] MCITP Self-Paced Training
  Kit Exam 70-442
[5] Microsoft Learn (2012) Execution Plan Caching and
  Reuse (link)
[6] Microsoft Learn (2023) Execution plan overview (link)
[7] Microsoft Learn (2023) Optimized plan forcing with Query Store (link)
[8] Microsoft Learn (2023) Query processing architecture guide (link)
Acronyms
MAXDOP - Maximum Degree of Parallelism

