Showing posts with label execution plan. Show all posts

15 January 2023

💎🏭SQL Reloaded: Monitoring the Synapse serverless SQL pool with Dynamic Management Views I

I sometimes feel like I'm flying blind when I build or troubleshoot SQL queries without the query plan and/or further statistics to understand how the database engine works, why some queries take longer than expected, etc. Unfortunately, the Synapse serverless SQL pool doesn't seem to support showing execution plans in SQL Server Management Studio as of now (SHOWPLAN_XML is not supported for SET). I looked at my old queries based on the sys.dm_exec_requests and sys.dm_exec_query_stats DMVs, however the results didn't prove to be what I was searching for.

This weekend, I found Sidney Cirqueira's post on monitoring Synapse serverless SQL pools, where he describes how to do that via the Monitoring hub, DMVs, the QPI library and Log Analytics. (You should regularly check the Azure Synapse Analytics Blog as it's full of goodies!)

Thus, I found out that there's a new DMV called sys.dm_exec_requests_history which provides at least the duration and the volume of data processed by each statement run on the service:

-- Azure Serverless SQL pool: requests' history 
SELECT top 100 ERH.status
, ERH.transaction_Id
, ERH.distributed_statement_Id 
, ERH.query_hash 
, ERH.login_name 
, ERH.start_time
, ERH.end_time 
, ERH.command 
, ERH.query_text 
--, ERH.total_elapsed_time_ms
, ERH.total_elapsed_time_ms/1000.0 total_elapsed_time_sec
--, ERH.data_processed_mb
, ERH.data_processed_mb/1024.0 data_processed_gb
, ERH.error
, ERH.error_code 
FROM sys.dm_exec_requests_history ERH
ORDER BY ERH.data_processed_mb DESC

It isn't much information compared with the columns returned by sys.dm_exec_requests, but it's something to start with. At least it allows focusing on the queries with the longest duration (sort the above query by total_elapsed_time_ms descending) or the highest volume of data processed:

-- Azure Serverless SQL pool: queries with most data processed
SELECT TOP 50 ERH.query_text  
, COUNT(*) no_runs
, SUM(ERH.total_elapsed_time_ms) total_elapsed_time_ms
, SUM(ERH.data_processed_mb) data_processed_mb
, SUM(ERH.data_processed_mb/1024.0) data_processed_gb
, MIN(ERH.start_time) first_run_date
, MAX(ERH.start_time) last_run_date
FROM sys.dm_exec_requests_history ERH
GROUP BY ERH.query_text
HAVING COUNT(*)>1
ORDER BY data_processed_mb DESC
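
Sorting the same aggregate by duration instead highlights the statements that take the longest overall. This is only a minimal sketch based on the columns used above; the avg_elapsed_time_sec column is an alias introduced here for illustration, and the HAVING filter was dropped so that statements run only once are included as well:

```sql
-- Azure Serverless SQL pool: queries with the longest total duration (sketch)
SELECT TOP 50 ERH.query_text
, COUNT(*) no_runs
, SUM(ERH.total_elapsed_time_ms)/1000.0 total_elapsed_time_sec
, AVG(ERH.total_elapsed_time_ms/1000.0) avg_elapsed_time_sec -- illustrative alias, not a DMV column
, SUM(ERH.data_processed_mb) data_processed_mb
FROM sys.dm_exec_requests_history ERH
GROUP BY ERH.query_text
ORDER BY total_elapsed_time_sec DESC
```

A statement with a high total but a low average duration is probably just run often, while a high average points to a query worth reviewing.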

The same query can be slightly changed to retrieve the volume of data processed by month:
 
-- Azure Serverless SQL pool: data processed by month
SELECT Convert(nvarchar(7), ERH.start_time, 23) [period]
, COUNT(*) no_runs
, SUM(ERH.total_elapsed_time_ms) total_elapsed_time_ms
, SUM(ERH.data_processed_mb) data_processed_mb
, SUM(ERH.data_processed_mb/1024.0) data_processed_gb
, MIN(ERH.start_time) first_run_date
, MAX(ERH.start_time) last_run_date
FROM sys.dm_exec_requests_history ERH
GROUP BY Convert(nvarchar(7), ERH.start_time, 23)
HAVING COUNT(*)>1
ORDER BY data_processed_mb DESC

One can also add the login name to the grouping to break down the analysis by the login that issued the query. The organization's domain can be used to differentiate between system and organization-based queries.
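
A breakdown by login could look like the sketch below. It assumes that organization logins end with the organization's domain; '@example.com' is a hypothetical placeholder to be replaced with the actual domain, and login_type is an alias introduced here for illustration:

```sql
-- Azure Serverless SQL pool: data processed by month and login (sketch)
SELECT Convert(nvarchar(7), ERH.start_time, 23) [period]
, ERH.login_name
, CASE WHEN ERH.login_name LIKE N'%@example.com' THEN 'organization' ELSE 'other' END login_type -- hypothetical domain
, COUNT(*) no_runs
, SUM(ERH.total_elapsed_time_ms) total_elapsed_time_ms
, SUM(ERH.data_processed_mb) data_processed_mb
FROM sys.dm_exec_requests_history ERH
GROUP BY Convert(nvarchar(7), ERH.start_time, 23)
, ERH.login_name
ORDER BY [period] DESC
, data_processed_mb DESC
```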

The volume of data processed is also stored in the sys.dm_external_data_processed DMV, aggregated for the current day, week and month, as part of the cost control feature:
 
-- Azure Serverless SQL pool: volume of data processed
SELECT type 
, data_processed_mb 
, data_processed_mb/1024.0 data_processed_gb
FROM sys.dm_external_data_processed

And here's what the output looks like:
 
type     data_processed_mb  data_processed_gb
daily                  230           0.224609
weekly                 377           0.368164
monthly             223522         218.283203

Notes:
1) I still need to play with the DMVs to understand their scope and limitations.
2) The view also appears in the list of DMVs I identified as being supported by the Synapse serverless SQL pool. As I discovered later, 3 more DMVs are available with useful statistics.
3) The queries based on the sys.dm_exec_requests and sys.dm_exec_query_stats DMVs seem to return only the currently running query that reads from them. (Actually, the DMVs seem to work.)
4) The view is also available in SQL Server 2022, though it doesn't seem to be used.
5) According to the above-mentioned source, the view is provided for ticketing purposes, to help customers better troubleshoot SQL requests. Use the distributed_statement_id in the tickets raised with Microsoft to troubleshoot any issues with Synapse.
6) Unfortunately, the useful Query Store feature is also not yet supported, even if the DMVs related to it seem to be available. Attempting to enable it results in the error:
Msg 15869, Level 16, State 9, Line 1
QUERY_STORE is not supported for ALTER DATABASE
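
For note 5, the details of a single request can be collected before raising a ticket. The following is only a sketch: the value assigned to @statement_id is a placeholder to be replaced with the actual identifier, and the cast assumes that the identifier can be compared as text:

```sql
-- Azure Serverless SQL pool: details for a single request (sketch)
DECLARE @statement_id nvarchar(100) = N'<distributed_statement_id>'; -- placeholder, replace with the actual value

SELECT ERH.distributed_statement_Id
, ERH.status
, ERH.login_name
, ERH.start_time
, ERH.end_time
, ERH.total_elapsed_time_ms
, ERH.data_processed_mb
, ERH.error
, ERH.error_code
, ERH.query_text
FROM sys.dm_exec_requests_history ERH
WHERE CAST(ERH.distributed_statement_Id AS nvarchar(100)) = @statement_id -- assumes a text comparison works for this column
```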


Happy coding!

17 August 2009

🛢DBMS: Query Optimizer (Definitions)

"SQL Server code that analyzes queries and database objects and selects the appropriate query plan. The SQL Server optimizer is a cost-based optimizer. It estimates the cost of each permutation of table accesses in terms of CPU cost and I/O cost." (Karen Paulsell et al, "Sybase SQL Server: Performance and Tuning Guide", 1996)

"A SQL server tool that formulates an optimum execution plan for a query." (Owen Williams, "MCSE TestPrep: SQL Server 6.5 Design and Implementation", 1998)

"The SQL Server component responsible for generating the optimum execution plan for a query." (Microsoft Corporation, "SQL Server 7.0 System Administration Training Kit", 1999)

"The SQL Server database engine component responsible for generating efficient execution plans for SQL statements." (Anthony Sequeira & Brian Alderman, "The SQL Server 2000 Book", 2003)

"A term applied to a process, within a database engine, that attempts to find the fastest method of executing a SQL command against a database." (Gavin Powell, "Beginning Database Design", 2006)

"This is the component in SQL Server that analyzes your queries, compares them with available indexes, and decides which index will return a result set the fastest." (Joseph L Jorden & Dandy Weyn, "MCTS Microsoft SQL Server 2005: Implementation and Maintenance Study Guide - Exam 70-431", 2006)

"An optimization process running within SQL Server. Any queries submitted to SQL Server are first processed by the query optimizer. It determines the best way to run the query, including what indexes to use and what types of joins to use. The output is a query execution plan, sometimes called a query plan or just a plan." (Darril Gibson, "MCITP SQL Server 2005 Database Developer All-in-One Exam Guide", 2008)

"A process that generates query plans. For each query, the optimizer generates a plan that matches the query to the index that will return results as efficiently as possible. The optimizer reuses the query plan each time the query runs. If a collection changes significantly, the optimizer creates a new query plan." (MongoDb, "Glossary", 2008)

"The Optimizer is an internal technology that is responsible for selecting the most efficient means to accessing or altering information. It uses detailed statistics about the database to make the right decision." (Robert D Schneider & Darril Gibson, "Microsoft SQL Server 2008 All-in-One Desk Reference For Dummies", 2008)

"A part of a DBMS that examines a nonprocedural data manipulation request and makes a determination of the most efficient way to process that request." (Jan L Harrington, "SQL Clearly Explained" 3rd Ed., 2010)

"The component of a relational database system responsible for analyzing SQL queries and producing optimal access paths for retrieving data from the database." (Craig S Mullins, "Database Administration", 2012)

"A component of the SQL and XQuery compiler that chooses an access plan for a data manipulation language statement by modeling the execution cost of many alternative access plans and choosing the one with the minimal estimated cost." (Sybase, "Open Server Server-Library/C Reference Manual", 2019)

"Built-in database software that determines the most efficient way to execute a SQL statement by considering factors related to the objects referenced and the conditions specified in the statement." (Oracle)

"The MySQL component that determines the best indexes and join order to use for a query, based on characteristics and data distribution of the relevant tables." (MySQL)

31 January 2009

🛢DBMS: Static SQL (Definitions)

[static query:] "A previously saved query that is optimized for a particular execution plan. The opposite of an ad hoc query." (Microsoft Corporation, "Microsoft SQL Server 7.0 Data Warehouse Training Kit", 2000)

"SQL that has been compiled prior to execution." (Ajay Gupta et al, "Informix Dynamic Server 11", 2007)

"SQL statements in an application that do not change at runtime and, therefore, can be hard-coded into the application." (John Goodson & Robert A Steward, "The Data Access Handbook", 2009)

[Static embedded SQL:] "Embedded SQL in which the entire SQL statement can be specified when the program is written, allowing the statement to be precompiled before the program is executed." (Jan L Harrington, "SQL Clearly Explained" 3rd Ed., 2010)

[static query:] "A stored, precompiled SQL query, optimized for access against a particular database design." (DAMA International, "The DAMA Dictionary of Data Management", 2011)

"A style of embedded SQL in which the SQL statements do not change while the application is running." (Carlos Coronel et al, "Database Systems: Design, Implementation, and Management" 9th Ed., 2011)

"SQL statements, embedded within a program, that are prepared during the program preparation process (before the program is executed). After being prepared, the SQL statement does not change (although values of host variables specified by the statement might change)." (Kirsten A Larsen et al, "Extremely pureXML in DB2 10 for z/OS", 2011)

"SQL queries that are preprocessed and whose access paths are determined during the bind procedure, prior to execution." (Craig S Mullins, "Database Administration", 2012)

"SQL statements that are embedded within a program and are bound before the program is executed. After being bound, a static SQL statement does not change, although values of host variables specified by the statement can change." (Sybase, "Open Server Server-Library/C Reference Manual", 2019)

"A program where access to the database has been predetermined by the programmer and any input from the user will not change this access pattern. Put another way, all SQL statements are already part of the program when it is executed." (Microfocus)

"A type of embedded SQL in which SQL statements are hard-coded and compiled when the rest of the program is compiled." (Microsoft) 

"It is called static SQL because the SQL statements in the program are static; that is, they do not change each time the program is run." (Microsoft)


