20 January 2023

SQL Reloaded: Monitoring the Synapse serverless SQL pool with Dynamics Management Views II

Identifying the SQL Server DMVs which are accessible for the Serverless SQL pool (see previous post), allowed me to identify besides sys.dm_exec_requests_history three more DMVs with statistics on the statements executed on the server: sys.dm_request_phases, sys.dm_request_phases_task_group_stats and sys.dm_request_phases_exec_task_stats. Untofurtunately, there seems to be no documentation available on these DMVs, and, at the time the post was written, there were also no further hits on google.com or bing.com found on the same.

sys.dm_request_phases

sys.dm_request_phases  provides insights in the phases an execution statement goes through, and seems to summarize the other two views:

-- Azure Serverless SQL pool: request phases
SELECT TOP (100) dist_statement_id
, RPH.dist_request_id
, TRY_CAST(RPH.id as bigint) id
, TRY_CAST(RPH.parent_ids as bigint) parent_ids
, RPH.start_time
, RPH.end_time
--, RPH.total_elapsed_time_ms
--, RPH.total_elapsed_time_ms/1000.0 total_elapsed_time_sec
--, RPH.min_time_ms
--, RPH.min_time_ms/1000.0 min_time_sec
--, RPH.max_time_ms
--, RPH.max_time_ms/1000.0 max_time_sec
--, RPH.avg_time_ms
, RPH.avg_time_ms/1000.0 avg_time_sec
--, RPH.stdev_time_ms
--, RPH.stdev_time_ms/1000.0 stdev_time_sec -- it has no values
--, RPH.min_rows
--, RPH.max_rows
--, RPH.avg_rows
--, RPH.stdev_rows -- it has no values
, RPH.total_rows
--, RPH.total_bytes_processed
, RPH.total_bytes_processed/1028.0 total_kb_processed
, RPH.state_desc
, RPH.operation_type
, RPH.input_dop
, RPH.output_dop
, RPH.task_retries
, RPH.error_id
FROM sys.dm_request_phases RPH
ORDER BY Id
dist_statement_iddist_request_ididparent_idsstart_timeend_timeavg_time_sectotal_rowstotal_kb_processedstate_descoperation_typeinput_dopoutput_doptask_retrieserror_id
8C4386DC...820E9FC6...12...09:58:34.213...09:58:36.3371.03120302343.310311succeededShuffle1100
8C4386DC...820E9FC6...20...09:58:36.447...09:58:39.7131.89197145.193579succeededReturn1100
C9524971...680DCB55...34...10:05:46.747...10:05:47.0570.20320302343.310311succeededShuffle1100
C9524971...680DCB55...40...10:05:47.057...10:05:48.4801.40606630.101167succeededReturn1100
FD2D17AD...C9453EF2...56...11:58:54.060...11:58:55.2970.547101534.098249succeededComputeToControlNode1100
FD2D17AD...C9453EF2...60...11:58:55.297...11:58:55.4200.125104.074902succeededReturn1100
9FB0A268...CAA533DE...78...11:59:16.483...11:59:16.7000.20320302343.310311succeededShuffle1100
9FB0A268...CAA533DE...80...11:59:16.700...11:59:18.6401.92267143.673151succeededReturn1100
1732AB0D...AC1A4F10...910...11:59:25.950...11:59:26.1400.17220302343.310311succeededShuffle1100
1732AB0D...AC1A4F10...100...11:59:26.140...11:59:27.4501.29796635.185797succeededReturn1100

Notes:
1) The foreign keys and dates (in the above and below queries) were truncated to accomodate all the important attributes in the snapshot of the values returned.
2) Based on the exisitng queries, there are two records for each executed statement, a Shuffle or ComputeToControlNode followed by a Return (see operation_type). In more complex scenario there are several Shuffles and Broadcasts and a Return. According to the Microsoft team, even if for serverless SQL pools there's no Data Movement Service (DMS), there's a similar algorithm responsible for moving the data between the nodes.
3) Because in serverless SQL pool each query has its own distribution statement id, the min, max, avg and total values will have the sames values across the columns. Therefore, the columns with redundant values were commented.
4) The Id of the request phase seems to have numeric values despite being defined as alphanumeric. I tried to cast the values to bigint for sorting purposes.

sys.dm_request_phases_task_group_stats

sys.dm_request_phases_task_group_stats stores metadata about the requests breakdown at task group:

-- Azure Serverless SQL pool: request phases breakdown at task group
SELECT TOP (100) RPT.dist_request_id
, TRY_CAST(RPT.id as bigint) id
, TRY_CAST(RPT.parent_ids as bigint) parent_ids
, RPT.dist_statement_id
, RPT.state_desc
, RPT.start_time
, RPT.end_time
, RPT.input_dop
, RPT.output_dop
, RPT.operation_type
, RPT.task_retries
FROM sys.dm_request_phases_task_group_stats RPT
ORDER BY id
dist_request_ididparent_idsdist_statement_idstate_descstart_timeend_timeinput_dopoutput_dopoperation_typetask_retries
820E9FC6...128C4386DC...succeeded63809805514213255163809805516338269311Shuffle0
820E9FC6...208C4386DC...succeeded63809805516447616363809805519713300111Return0
680DCB55...34C9524971...succeeded63809805946745002163809805947057495311Shuffle0
680DCB55...40C9524971...succeeded63809805947057495363809805948479368211Return0
C9453EF2...56FD2D17AD...succeeded63809812734060711263809812735295106711ComputeToControlNode0
C9453EF2...60FD2D17AD...succeeded63809812735295106763809812735420297011Return0
CAA533DE...789FB0A268...succeeded63809812756482608463809812756701350411Shuffle0
CAA533DE...809FB0A268...succeeded63809812756701350463809812758638854911Return0
AC1A4F10...9101732AB0D...succeeded63809812765951362063809812766138851411Shuffle0
AC1A4F10...1001732AB0D...succeeded63809812766138851463809812767451360111Return0

Notes:
1) The DVM seems to return the same number of records as sys.dm_request_phases.
2) Observe the format of the start_time and end_time, probably the timestamps come from the Spark cluster and were not translated into an SQL Server data type.

sys.dm_request_phases_exec_task_stats

sys.dm_request_phases_exec_task_stats stores metadata about the requests breakdown at task level:

-- Azure Serverless SQL pool: request phases breakdown at task
SELECT TOP (100) RPE.dist_request_id
, TRY_CAST(RPE.id as bigint) id
--, RPE.min_time_ms
--, RPE.max_time_ms
, RPE.avg_time_ms/1000.0 avg_time_sec
--, RPE.stdev_time_ms
, RPE.total_bytes_processed
--, RPE.min_rows
--, RPE.max_rows
--, RPE.avg_rows
--, RPE.stdev_rows
, RPE.total_rows
, RPE.error_id
FROM sys.dm_request_phases_exec_task_stats RPE
ORDER BY id
dist_request_ididavg_time_sectotal_kb_processedtotal_rowserror_id
820E9FC6...11.0312343.31031120300
820E9FC6...21.8917145.19357990
680DCB55...30.2032343.31031120300
680DCB55...41.4066630.10116700
C9453EF2...50.5471534.098249100
C9453EF2...60.1254.074902100
CAA533DE...70.2032343.31031120300
CAA533DE...81.9227143.67315160
AC1A4F10...90.1722343.31031120300
AC1A4F10...101.2976635.18579790

What does all this mean?

The lack of documentation makes it challenging to interpret the values of the views besides the data and metadata they offer. In a paper on POLARIS, the code given to the serveless SQL pool engine, a taks is defined as "a careful packaging of data and query processing into units [...] that can be readily moved across compute nodes and re-started at the task level" [1]. Therefore, one can assume that this is the level targetted by the sys.dm_request_phases_exec_task_stats DMV. Further on, the tasks are grouped at phase level according to the sys.dm_request_phases_task_group_stats, the metadata from the two DMVs being further combined into sys.dm_request_phases DMV. 

If the meaning is kept from dedicated SQL pools, a shuffle operation indicates that data is moved between the frontend and backend nodes to satisfy a request, while a Result represents the operation of returning the result selt to client. The "ComputeToControlNode" operation involves a simple select (e.g. SELECT top 10) from a CETA and therefore no "Shuffle" is needed.

Requests' history

Further on, one can use the "Distributed statement id" to join the execution request phases with the request history, however matches will be found only for a small subset of the records (probably the executions since the pool started):
 
-- Azure Serverless SQL pool: requests history with request phase info
SELECT top 100 ERH.status
, ERH.transaction_Id
, ERH.distributed_statement_Id 
, ERH.query_hash 
--, ERH.login_name 
, ERH.start_time
, ERH.end_time 
, ERH.command 
, ERH.query_text 
--, ERH.total_elapsed_time_ms
, ERH.total_elapsed_time_ms/1000.0 total_elapsed_time_sec
--, ERH.data_processed_mb
, ERH.data_processed_mb
, RPH.avg_time_ms/1000.0 avg_time_sec
, RPH.total_rows
, RPH.total_bytes_processed/1028.0/1028.0 total_mb_processed
, RPH.state_desc
, RPH.operation_type
, RPH.input_dop
, RPH.output_dop
, RPH.task_retries
, RPH.error_id
, ERH.error
, ERH.error_code 
FROM sys.dm_exec_requests_history ERH
     JOIN sys.dm_request_phases RPH
	   ON ERH.distributed_statement_Id = RPH.dist_statement_id
	  --AND RPH.parent_ids = 0 -- only the parent
ORDER BY RPH.Id DESC

Here's a subset of the result set focusing only on the statistical values:
 
distr_statement_Idstart_timeend_timetotal_elapsed_time_secdata_processed_mbavg_time_sectotal_rowstotal_mb_processedoperation_typeidparent_ids
{8C4386D......8:24.4300000...8:39.826666615.396101.03120302.279484738326Shuffle12
{8C4386D......8:24.4300000...8:39.826666615.396101.89196.950577411478Return20
{C952497......5:45.2100000...5:48.49333333.283100.20320302.279484738326Shuffle34
{C952497......5:45.2100000...5:48.49333333.283101.40606.449514753891Return40
{FD2D17A......8:52.1400000...8:55.41666663.276100.547101.492313471789ComputeToControlNode56
{FD2D17A......8:52.1400000...8:55.41666663.276100.125100.003963912451Return60
{9FB0A26......9:15.1300000...9:18.63666663.506100.20320302.279484738326Shuffle78
{9FB0A26......9:15.1300000...9:18.63666663.506101.92266.949098395914Return80
{1732AB0......9:24.6900000...9:27.45000002.76100.17220302.279484738326Shuffle910
{1732AB0......9:24.6900000...9:27.45000002.76101.29796.454460892023Return100

Notes:
As can be seen, the volume of data processed and the elapsed time values don't match between the two tables, though they are close. The differences probably result from further steps occuring in the process. 

Happy coding!

References:
[1] Josep Aguilar-Saborit, Raghu Ramakrishnan et al, "POLARIS: The Distributed SQL Engine in Azure Synapse", VLDB Conferences. PVLDB, 13(12): 3204 – 3216, 2020, DOI: https://doi.org/10.14778/3415478.3415545

No comments:

Related Posts Plugin for WordPress, Blogger...

About Me

My photo
IT Professional with more than 24 years experience in IT in the area of full life-cycle of Web/Desktop/Database Applications Development, Software Engineering, Consultancy, Data Management, Data Quality, Data Migrations, Reporting, ERP implementations & support, Team/Project/IT Management, etc.