06 July 2020

💠🛠️🪄SQL Server: Undocumented (Part III: SQL Server CPU Utilization via the Ring Buffer)

Introduction

If no proper monitoring solution of the SQL Server and the hosting server is in place to review the CPU utilization, one can use the Scheduler Monitor buffer provided by the undocumented sys.dm_os_ring_buffers data management view (DMV). Introduced with SQL Server 2005, the DMV provides significant amount of diagnostic memory information in XML form via several buffers: Resource Monitor, Out-of-Memory, Memory Broker, Buffer Pool, respectively Scheduler Monitor buffer [2]. A ring buffer is a recorded response to a notification [1].

The view changed between the various versions of SQL Server, while with the introduction of Always On availability groups in SQL Server 2017 further buffer rings were made available (see [5]).

Warning:
According to Microsoft (see [4] the sys.dm_os_ring_buffers is provided only for information purposes, the future compatibility post SQL Server 2019 being not guaranteed!

Querying the Scheduler Monitor Buffer

Within the Scheduler Monitor buffer, the DMV stores a history of 4 hours uptime with minute by minute data points (in total 256 entries) with the CPU utilization for the SQL Server, other processes, respectively the system idle time as percentages. It allows thus to identify the peaks in CPU utilization and thus to determine the intervals of focus for further troubleshooting. As the data are stored within an XML structure, the values can be queried via the XQuery syntax as follows: 

-- cpu utilization for SQL Server and other applications
DECLARE @ts_now bigint = (SELECT cpu_ticks/(cpu_ticks/ms_ticks)
        FROM sys.dm_os_sys_info); 

SELECT DAT.record_id
, DAT.EventTime
, DAT.SQLProcessUtilization 
, DAT.SystemIdle 
, 100 - (DAT.SystemIdle + DAT.SQLProcessUtilization) OtherUtilization
FROM ( 
	SELECT record.value('(./Record/@id)[1]', 'int') record_id
	, record.value('(./Record/SchedulerMonitorEvent/SystemHealth/SystemIdle)[1]', 'int') SystemIdle 
	, record.value('(./Record/SchedulerMonitorEvent/SystemHealth/ProcessUtilization)[1]', 'int') SQLProcessUtilization
	, EventTime 
	FROM ( 
		SELECT DATEADD(ms, -1 * (@ts_now - [timestamp]), GETDATE()) EventTime
		, [timestamp]
		, CONVERT(xml, record) AS [record] 
		FROM sys.dm_os_ring_buffers 
		WHERE ring_buffer_type = N'RING_BUFFER_SCHEDULER_MONITOR' 
		  AND record LIKE N'%<SystemHealth>%') AS x 
	) AS DAT
ORDER BY DAT.record_id DESC;

If the SQL Server is not busy as all, the SQL Server utilization time may tend to 0%, while the system idle time to 90%. (It's the case of my SQL Server lab.)

CPU Utilization for my home lab
CPU Utilization for my home SQL Server lab

Notes:
If the server was restarted within the last 4 hours, then the points will have a gap between two readings corresponding to the downtime interval.
The query is supposed to run also on Linux machines, though the SystemIdle time will be 0. One can thus consider the SQL and non-SQL CPU utilization.

Storing the History

The above query can be run on a regular basis (e.g. every 3-4 hours) via a SSIS package and push the data into a table for historical purposes. Because is needed to have a continuous history of the readings, it's better if the gap between runs is smaller than the 4 hours. No matter of the approach used is better to check for overlappings when storing the data:

-- dropping the table
-- DROP TABLE IF EXISTS dbo.T_RingBufferReadings 

-- reinitilizing the history
-- TRUNCATE TABLE dbo.T_RingBufferReadings

-- creating the table
CREATE TABLE dbo.T_RingBufferReadings (
  Id bigint IDENTITY (1,1) NOT NULL
, RecordId bigint 
, EventTime datetime2(3) NOT NULL
, SQLProcessUtilization int NOT NULL
, SystemIdle int NOT NULL
, OtherUtilization int NOT NULL
)


-- reviewing the data
SELECT *
FROM dbo.T_RingBufferReadings 
ORDER BY EventTime DESC

If there are many records, to improve the performance, one can create also an index, which can include the reading points as well:

-- creating a unique index with an include 
CREATE UNIQUE NONCLUSTERED INDEX [UI_T_RingBufferReadings_EventTime] ON dbo.T_RingBufferReadings
(
	EventTime ASC,
    RecordId ASC
) INCLUDE (SQLProcessUtilization, SystemIdle, OtherUtilization)
GO

The above query based on the DMV becomes:

-- cpu utilization by SQL Server and other applications
DECLARE @ts_now bigint = (SELECT cpu_ticks/(cpu_ticks/ms_ticks)
        FROM sys.dm_os_sys_info); 

INSERT INTO dbo.T_RingBufferReadings
SELECT record_id
, DAT.EventTime
, DAT.SQLProcessUtilization 
, DAT.SystemIdle 
, 100 - (DAT.SystemIdle + DAT.SQLProcessUtilization) OtherUtilization
FROM ( 
	SELECT record.value('(./Record/@id)[1]', 'int') record_id
	, record.value('(./Record/SchedulerMonitorEvent/SystemHealth/SystemIdle)[1]', 'int') SystemIdle 
	, record.value('(./Record/SchedulerMonitorEvent/SystemHealth/ProcessUtilization)[1]', 'int') SQLProcessUtilization
	, EventTime 
	FROM ( 
		SELECT DATEADD(ms, -1 * (@ts_now - [timestamp]), GETDATE()) EventTime
		, [timestamp]
		, CONVERT(xml, record) AS [record] 
		FROM sys.dm_os_ring_buffers 
		WHERE ring_buffer_type = N'RING_BUFFER_SCHEDULER_MONITOR' 
		  AND record LIKE N'%<SystemHealth>%') AS x 
	) AS DAT
	LEFT JOIN dbo.T_RingBufferReadings RBR
	  ON DAT.record_id = RBR.Recordid 
WHERE RBR.Recordid IS NULL
ORDER BY DAT.record_id DESC;

Note:
A ServerName column can be added to the table if is needed to store the values for different SQL Servers. Then the LEFT JOIN has to consider the new added column. 
Either of the two queries can be used to display the data points within a chart via SSRS, Power BI or any reporting tool available. 

Happy coding!

References:
[1] Grant Fritchey (2014) SQL Server Query Performance Tuning: Troubleshoot and Optimize Query Performance in SQL Server 2014, 4th Ed.
[2] Sunil Agarwal et al (2005), Troubleshooting Performance Problems in SQL Server 2005, Source: TShootPerfProbs.docx
[3] Sunil Agarwal et al (2008), Troubleshooting Performance Problems in SQL Server 2008, Source: TShootPerfProbs2008.docx
[4] Microsoft SQL Docs (2018) Related Dynamic Management Views, Source
[5] Microsoft SQL Docs (2017) Use ring buffers to obtain health information about Always On availability groups, Source

No comments:

Related Posts Plugin for WordPress, Blogger...

About Me

My photo
Koeln, NRW, Germany
IT Professional with more than 24 years experience in IT in the area of full life-cycle of Web/Desktop/Database Applications Development, Software Engineering, Consultancy, Data Management, Data Quality, Data Migrations, Reporting, ERP implementations & support, Team/Project/IT Management, etc.