I feel sometimes flying blind when I build or troubleshoot SQL queries and I don't have the query plan and/or further statistics to understand how the database engine works, why some queries take longer than expected, etc. Unfortunately, Synapse serverless SQL pool doesn't seem to support showing exection plans in SQL Server Management Studio as per now (SHOWPLAN_XML is not supported for SET). I looked at my old queries based on the the sys.dm_exec_requests and sys.dm_exec_query_stats DMVs, however the results didn't proved to be what I was searching for. (
This weekend, I found Sidney Cirqueira's post on monitoring Synapse serverless SQL pools where he describes how to do that via the Monitoring hub, DMVs, QPI library, respectively Log Analytics. (You should check regularly the Azure Synapse Analytics Blog as it's full of goodies!)
Thus, I found out that there's a new DMV called sys.dm_exec_requests_history which provides at least the duration and the volume of data processes by each statement run on the service:
-- Azure Serverless SQL pool: requests' history SELECT top 100 ERH.status , ERH.transaction_Id , ERH.distributed_statement_Id , ERH.query_hash , ERH.login_name , ERH.start_time , ERH.end_time , ERH.command , ERH.query_text --, ERH.total_elapsed_time_ms , ERH.total_elapsed_time_ms/1000.0 total_elapsed_time_sec
--, ERH.data_processed_mb , ERH.data_processed_mb/1028.0 data_processed_gb , ERH.error , ERH.error_code FROM sys.dm_exec_requests_history ERH ORDER BY ERH.data_processed_mb DESC
It isn't much information, compared with the columns returned by sys.dm_exec_requests, but it's something to start with. At least it allows focusing on the queries with the longest duration (use the above query sorting the records based on the total_elapsed_time_ms descending) or highest volume of data processed:
-- Azure Serverless SQL pool: queries with most data processed SELECT TOP 50 ERH.query_text , COUNT(*) no_runs , SUM(ERH.total_elapsed_time_ms) total_elapsed_time_ms , SUM(ERH.data_processed_mb) data_processed_mb , SUM(ERH.data_processed_mb/1028.0) data_processed_gb , MIN(ERH.start_time) first_run_date , MAX(ERH.start_time) last_run_date FROM sys.dm_exec_requests_history ERH GROUP BY ERH.query_text HAVING COUNT(*)>1 ORDER BY data_processed_mb DESC
-- Azure Serverless SQL pool: data processed by month SELECT Convert(nvarchar(7), ERH.start_time, 23) [period] , COUNT(*) no_runs , SUM(ERH.total_elapsed_time_ms) total_elapsed_time_ms , SUM(ERH.data_processed_mb) data_processed_mb , SUM(ERH.data_processed_mb/1028.0) data_processed_gb , MIN(ERH.start_time) first_run_date , MAX(ERH.start_time) last_run_date FROM sys.dm_exec_requests_history ERH GROUP BY Convert(nvarchar(7), ERH.start_time, 23) HAVING COUNT(*)>1 ORDER BY data_processed_mb DESC
The volume of data processed is stored also in the sys.dm_external_data_processed DMV aggregated for the current day, week, respectively month as part of the cost control related feature:
-- Azure Serverless SQL pool: volume of data processed SELECT type , data_processed_mb , data_processed_mb/1028.0 data_processed_gb FROM sys.dm_external_data_processed
type | data_processed_mb | data_processed_gb |
daily | 230 | 0.223735 |
weekly | 377 | 0.366731 |
monthly | 223522 | 217.433852 |