Identifying the SQL Server DMVs which are accessible for the Serverless SQL pool (see previous post), allowed me to identify besides sys.dm_exec_requests_history three more DMVs with statistics on the statements executed on the server: sys.dm_request_phases, sys.dm_request_phases_task_group_stats and sys.dm_request_phases_exec_task_stats. Untofurtunately, there seems to be no documentation available on these DMVs, and, at the time the post was written, there were also no further hits on google.com or bing.com found on the same.
sys.dm_request_phases
sys.dm_request_phases provides insights in the phases an execution statement goes through, and seems to summarize the other two views:
-- Azure Serverless SQL pool: request phases SELECT TOP (100) dist_statement_id , RPH.dist_request_id , TRY_CAST(RPH.id as bigint) id , TRY_CAST(RPH.parent_ids as bigint) parent_ids , RPH.start_time , RPH.end_time --, RPH.total_elapsed_time_ms --, RPH.total_elapsed_time_ms/1000.0 total_elapsed_time_sec --, RPH.min_time_ms --, RPH.min_time_ms/1000.0 min_time_sec --, RPH.max_time_ms --, RPH.max_time_ms/1000.0 max_time_sec --, RPH.avg_time_ms , RPH.avg_time_ms/1000.0 avg_time_sec --, RPH.stdev_time_ms --, RPH.stdev_time_ms/1000.0 stdev_time_sec -- it has no values --, RPH.min_rows --, RPH.max_rows --, RPH.avg_rows --, RPH.stdev_rows -- it has no values , RPH.total_rows --, RPH.total_bytes_processed , RPH.total_bytes_processed/1028.0 total_kb_processed , RPH.state_desc , RPH.operation_type , RPH.input_dop , RPH.output_dop , RPH.task_retries , RPH.error_id FROM sys.dm_request_phases RPH ORDER BY Id
dist_statement_id | dist_request_id | id | parent_ids | start_time | end_time | avg_time_sec | total_rows | total_kb_processed | state_desc | operation_type | input_dop | output_dop | task_retries | error_id |
8C4386DC... | 820E9FC6... | 1 | 2 | ...09:58:34.213 | ...09:58:36.337 | 1.031 | 2030 | 2343.310311 | succeeded | Shuffle | 1 | 1 | 0 | 0 |
8C4386DC... | 820E9FC6... | 2 | 0 | ...09:58:36.447 | ...09:58:39.713 | 1.891 | 9 | 7145.193579 | succeeded | Return | 1 | 1 | 0 | 0 |
C9524971... | 680DCB55... | 3 | 4 | ...10:05:46.747 | ...10:05:47.057 | 0.203 | 2030 | 2343.310311 | succeeded | Shuffle | 1 | 1 | 0 | 0 |
C9524971... | 680DCB55... | 4 | 0 | ...10:05:47.057 | ...10:05:48.480 | 1.406 | 0 | 6630.101167 | succeeded | Return | 1 | 1 | 0 | 0 |
FD2D17AD... | C9453EF2... | 5 | 6 | ...11:58:54.060 | ...11:58:55.297 | 0.547 | 10 | 1534.098249 | succeeded | ComputeToControlNode | 1 | 1 | 0 | 0 |
FD2D17AD... | C9453EF2... | 6 | 0 | ...11:58:55.297 | ...11:58:55.420 | 0.125 | 10 | 4.074902 | succeeded | Return | 1 | 1 | 0 | 0 |
9FB0A268... | CAA533DE... | 7 | 8 | ...11:59:16.483 | ...11:59:16.700 | 0.203 | 2030 | 2343.310311 | succeeded | Shuffle | 1 | 1 | 0 | 0 |
9FB0A268... | CAA533DE... | 8 | 0 | ...11:59:16.700 | ...11:59:18.640 | 1.922 | 6 | 7143.673151 | succeeded | Return | 1 | 1 | 0 | 0 |
1732AB0D... | AC1A4F10... | 9 | 10 | ...11:59:25.950 | ...11:59:26.140 | 0.172 | 2030 | 2343.310311 | succeeded | Shuffle | 1 | 1 | 0 | 0 |
1732AB0D... | AC1A4F10... | 10 | 0 | ...11:59:26.140 | ...11:59:27.450 | 1.297 | 9 | 6635.185797 | succeeded | Return | 1 | 1 | 0 | 0 |
Notes:
1) The foreign keys and dates (in the above and below queries) were truncated to accomodate all the important attributes in the snapshot of the values returned.
2) Based on the exisitng queries, there are two records for each executed statement, a Shuffle or ComputeToControlNode followed by a Return (see operation_type). In more complex scenario there are several Shuffles and Broadcasts and a Return. According to the Microsoft team, even if for serverless SQL pools there's no Data Movement Service (DMS), there's a similar algorithm responsible for moving the data between the nodes.
3) Because in serverless SQL pool each query has its own distribution statement id, the min, max, avg and total values will have the sames values across the columns. Therefore, the columns with redundant values were commented.
4) The Id of the request phase seems to have numeric values despite being defined as alphanumeric. I tried to cast the values to bigint for sorting purposes.
sys.dm_request_phases_task_group_stats
sys.dm_request_phases_task_group_stats stores metadata about the requests breakdown at task group:
-- Azure Serverless SQL pool: request phases breakdown at task group SELECT TOP (100) RPT.dist_request_id , TRY_CAST(RPT.id as bigint) id , TRY_CAST(RPT.parent_ids as bigint) parent_ids , RPT.dist_statement_id , RPT.state_desc , RPT.start_time , RPT.end_time , RPT.input_dop , RPT.output_dop , RPT.operation_type , RPT.task_retries FROM sys.dm_request_phases_task_group_stats RPT ORDER BY id
dist_request_id | id | parent_ids | dist_statement_id | state_desc | start_time | end_time | input_dop | output_dop | operation_type | task_retries |
820E9FC6... | 1 | 2 | 8C4386DC... | succeeded | 638098055142132551 | 638098055163382693 | 1 | 1 | Shuffle | 0 |
820E9FC6... | 2 | 0 | 8C4386DC... | succeeded | 638098055164476163 | 638098055197133001 | 1 | 1 | Return | 0 |
680DCB55... | 3 | 4 | C9524971... | succeeded | 638098059467450021 | 638098059470574953 | 1 | 1 | Shuffle | 0 |
680DCB55... | 4 | 0 | C9524971... | succeeded | 638098059470574953 | 638098059484793682 | 1 | 1 | Return | 0 |
C9453EF2... | 5 | 6 | FD2D17AD... | succeeded | 638098127340607112 | 638098127352951067 | 1 | 1 | ComputeToControlNode | 0 |
C9453EF2... | 6 | 0 | FD2D17AD... | succeeded | 638098127352951067 | 638098127354202970 | 1 | 1 | Return | 0 |
CAA533DE... | 7 | 8 | 9FB0A268... | succeeded | 638098127564826084 | 638098127567013504 | 1 | 1 | Shuffle | 0 |
CAA533DE... | 8 | 0 | 9FB0A268... | succeeded | 638098127567013504 | 638098127586388549 | 1 | 1 | Return | 0 |
AC1A4F10... | 9 | 10 | 1732AB0D... | succeeded | 638098127659513620 | 638098127661388514 | 1 | 1 | Shuffle | 0 |
AC1A4F10... | 10 | 0 | 1732AB0D... | succeeded | 638098127661388514 | 638098127674513601 | 1 | 1 | Return | 0 |
Notes:
1) The DVM seems to return the same number of records as sys.dm_request_phases.
2) Observe the format of the start_time and end_time, probably the timestamps come from the Spark cluster and were not translated into an SQL Server data type.
sys.dm_request_phases_exec_task_stats
sys.dm_request_phases_exec_task_stats stores metadata about the requests breakdown at task level:
-- Azure Serverless SQL pool: request phases breakdown at task SELECT TOP (100) RPE.dist_request_id , TRY_CAST(RPE.id as bigint) id --, RPE.min_time_ms --, RPE.max_time_ms , RPE.avg_time_ms/1000.0 avg_time_sec --, RPE.stdev_time_ms , RPE.total_bytes_processed --, RPE.min_rows --, RPE.max_rows --, RPE.avg_rows --, RPE.stdev_rows , RPE.total_rows , RPE.error_id FROM sys.dm_request_phases_exec_task_stats RPE ORDER BY id
dist_request_id | id | avg_time_sec | total_kb_processed | total_rows | error_id |
820E9FC6... | 1 | 1.031 | 2343.310311 | 2030 | 0 |
820E9FC6... | 2 | 1.891 | 7145.193579 | 9 | 0 |
680DCB55... | 3 | 0.203 | 2343.310311 | 2030 | 0 |
680DCB55... | 4 | 1.406 | 6630.101167 | 0 | 0 |
C9453EF2... | 5 | 0.547 | 1534.098249 | 10 | 0 |
C9453EF2... | 6 | 0.125 | 4.074902 | 10 | 0 |
CAA533DE... | 7 | 0.203 | 2343.310311 | 2030 | 0 |
CAA533DE... | 8 | 1.922 | 7143.673151 | 6 | 0 |
AC1A4F10... | 9 | 0.172 | 2343.310311 | 2030 | 0 |
AC1A4F10... | 10 | 1.297 | 6635.185797 | 9 | 0 |
What does all this mean?
The lack of documentation makes it challenging to interpret the values of the views besides the data and metadata they offer. In a paper on POLARIS, the code given to the serveless SQL pool engine, a taks is defined as "a careful packaging of data and query processing into units [...] that can be readily moved across compute nodes and re-started at the task level" [1]. Therefore, one can assume that this is the level targetted by the sys.dm_request_phases_exec_task_stats DMV. Further on, the tasks are grouped at phase level according to the sys.dm_request_phases_task_group_stats, the metadata from the two DMVs being further combined into sys.dm_request_phases DMV.
-- Azure Serverless SQL pool: requests history with request phase info SELECT top 100 ERH.status , ERH.transaction_Id , ERH.distributed_statement_Id , ERH.query_hash --, ERH.login_name , ERH.start_time , ERH.end_time , ERH.command , ERH.query_text --, ERH.total_elapsed_time_ms , ERH.total_elapsed_time_ms/1000.0 total_elapsed_time_sec --, ERH.data_processed_mb , ERH.data_processed_mb , RPH.avg_time_ms/1000.0 avg_time_sec , RPH.total_rows , RPH.total_bytes_processed/1028.0/1028.0 total_mb_processed , RPH.state_desc , RPH.operation_type , RPH.input_dop , RPH.output_dop , RPH.task_retries , RPH.error_id , ERH.error , ERH.error_code FROM sys.dm_exec_requests_history ERH JOIN sys.dm_request_phases RPH ON ERH.distributed_statement_Id = RPH.dist_statement_id --AND RPH.parent_ids = 0 -- only the parent ORDER BY RPH.Id DESC
distr_statement_Id | start_time | end_time | total_elapsed_time_sec | data_processed_mb | avg_time_sec | total_rows | total_mb_processed | operation_type | id | parent_ids |
{8C4386D... | ...8:24.4300000 | ...8:39.8266666 | 15.396 | 10 | 1.031 | 2030 | 2.279484738326 | Shuffle | 1 | 2 |
{8C4386D... | ...8:24.4300000 | ...8:39.8266666 | 15.396 | 10 | 1.891 | 9 | 6.950577411478 | Return | 2 | 0 |
{C952497... | ...5:45.2100000 | ...5:48.4933333 | 3.283 | 10 | 0.203 | 2030 | 2.279484738326 | Shuffle | 3 | 4 |
{C952497... | ...5:45.2100000 | ...5:48.4933333 | 3.283 | 10 | 1.406 | 0 | 6.449514753891 | Return | 4 | 0 |
{FD2D17A... | ...8:52.1400000 | ...8:55.4166666 | 3.276 | 10 | 0.547 | 10 | 1.492313471789 | ComputeToControlNode | 5 | 6 |
{FD2D17A... | ...8:52.1400000 | ...8:55.4166666 | 3.276 | 10 | 0.125 | 10 | 0.003963912451 | Return | 6 | 0 |
{9FB0A26... | ...9:15.1300000 | ...9:18.6366666 | 3.506 | 10 | 0.203 | 2030 | 2.279484738326 | Shuffle | 7 | 8 |
{9FB0A26... | ...9:15.1300000 | ...9:18.6366666 | 3.506 | 10 | 1.922 | 6 | 6.949098395914 | Return | 8 | 0 |
{1732AB0... | ...9:24.6900000 | ...9:27.4500000 | 2.76 | 10 | 0.172 | 2030 | 2.279484738326 | Shuffle | 9 | 10 |
{1732AB0... | ...9:24.6900000 | ...9:27.4500000 | 2.76 | 10 | 1.297 | 9 | 6.454460892023 | Return | 10 | 0 |
Notes:
As can be seen, the volume of data processed and the elapsed time values don't match between the two tables, though they are close. The differences probably result from further steps occuring in the process.
Happy coding!
Previous Post <<||>> Next Post
References:
[1] Josep Aguilar-Saborit, Raghu Ramakrishnan et al, "POLARIS: The Distributed SQL Engine in Azure Synapse", VLDB Conferences. PVLDB, 13(12): 3204 – 3216, 2020, DOI: https://doi.org/10.14778/3415478.3415545
No comments:
Post a Comment