06 July 2020

🪄SSRS (& Paginated Reports): Matrix Report Display (Fifth Magic Class)

Introduction

SQL Server Reporting Services (SSRS) allows displaying data in a matrix format, grouped by one or more attributes. Using the Report Wizard reduces considerably the volume of work involved.

The example below is based on the AdventureWorks2014 database and uses the Product Inventory as basis for building the report.

Preparing the Data

Usually it's useful to encapsulate the logic of a report in one or more views, which can then be reused in multiple reports. The current report needs two views: Production.vProducts for the Products, respectively Production.vProductInventory for the inventory.

-- dropping the vProducts view (cleaning after)
--DROP VIEW IF EXISTS [Production].[vProducts]

-- creating the vProducts view
CREATE VIEW [Production].[vProducts]
AS 
SELECT p.[ProductID] 
, p.ProductNumber
, p.[Name] 
, IsNull(p.Size, '') + IsNull(' ' + p.SizeUnitMeasureCode, '') Size
, p.Color
, P.Style
, p.ProductModelID
, pm.[Name] AS [ProductModel] 
, p.StandardCost 
, P.ListPrice
, P.SafetyStockLevel
, P.ReorderPoint
, p.SellStartDate 
, p.SellEndDate
, p.ProductSubcategoryID
, PPS.Name Subcategory
, PPS.ProductCategoryID
, PPC.Name Category
, P.MakeFlag
, P.FinishedGoodsFlag
FROM [Production].[Product] p 
     LEFT JOIN [Production].[ProductModel] pm 
       ON p.[ProductModelID] = pm.[ProductModelID] 
	 LEFT JOIN Production.ProductSubcategory PPS
	   ON P.ProductSubcategoryID = PPS.ProductSubcategoryID 
	      LEFT JOIN Production.ProductCategory PPC
		    ON PPS.ProductCategoryID = PPC.ProductCategoryID 
GO

-- reviewing the data 
SELECT *
FROM [Production].[vProducts]

-- checking the view for internal data consistency
SELECT count(*)
FROM [Production].[Product] p 
     LEFT JOIN [Production].[ProductModel] pm 
       ON p.[ProductModelID] = pm.[ProductModelID] 
	 LEFT JOIN Production.ProductSubcategory PPS
	   ON P.ProductSubcategoryID = PPS.ProductSubcategoryID 
	      LEFT JOIN Production.ProductCategory PPC
		    ON PPS.ProductCategoryID = PPC.ProductCategoryID 


-- dropping the vProductInventory view (cleaning after)
--DROP VIEW IF EXISTS Production.vProductInventory

-- creating the view 
CREATE VIEW Production.vProductInventory
AS
SELECT PPI.ProductId 
, PPD.ProductNumber
, PPD.Name ProductName 
, PPD.ProductModel
, PPD.Size
, PPD.Category
, PPD.Subcategory
, PPD.Style
, PPD.StandardCost
, PPD.ListPrice 
, PPD.StandardCost * PPI.Quantity InventoryValue 
, PPD.ListPrice * PPI.Quantity SalesValue
, PPD.MakeFlag
, PPI.Locationid 
, PPL.Name Location 
, PPI.Shelf 
, PPI.Bin 
, PPI.Quantity 
FROM [Production].[ProductInventory] PPI
     JOIN [Production].[vProducts] PPD
	   ON PPI.ProductID = PPD.ProductID
	 JOIN [Production].[Location] PPL
	   ON PPI.LocationID = PPL.LocationID
GO

-- reviewing the data
SELECT *
FROM Production.vProductInventory

-- checking the view for internal data consistency
SELECT count(*)
FROM [Production].[ProductInventory] PPI
     JOIN [Production].[vProducts] PPD
	   ON PPI.ProductID = PPD.ProductID
	 JOIN [Production].[Location] PPL
	   ON PPI.LocationID = PPL.LocationID

Note:
It's important to check the internal consistency of the views or queries used, namely whether the logic removes or duplicates records. For this one can run the query starting with the uppermost table and add one join at a time, checking after each run whether the number of records remains the same. One can shortcut the validation by comparing only the number of records from the base table with the one returned by the whole query, and fall back to the previous approach only if the numbers differ. (This is how I observed that the Production.vProductDetails view is not usable for this report, because it considers only the Products having a valid Model.)
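For example, the shortcut validation for the vProducts view comes down to comparing two record counts (a minimal check based on the objects created above):

-- comparing the number of records between the base table and the view
SELECT (
	SELECT count(*)
	FROM [Production].[Product]) NoBaseRecords
, (
	SELECT count(*)
	FROM [Production].[vProducts]) NoViewRecords

If the two numbers differ, one goes back and adds the joins one by one to identify the culprit.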

Creating the Report 

We can now use the Production.vProductInventory view to create the Product Inventory by Location report based on the following query:

-- Product Inventory by Location
SELECT PPI.Category
, PPI.Subcategory
, PPI.ProductNumber
, PPI.ProductName 
, PPI.ProductModel
, PPI.Size
, PPI.Style
, PPI.StandardCost
, PPI.ListPrice
, PPI.Location 
, PPI.Quantity 
, PPI.InventoryValue 
, PPI.SalesValue 
FROM Production.vProductInventory PPI
ORDER BY PPI.Category
, PPI.Subcategory
, PPI.ProductNumber

Note:
The attributes can be provided in the order in which they should be displayed in the report, which reduces the overhead in the further steps.

Start the Report Wizard via "Add New Report", select the data source in the first step, then provide the above query in the next step:

Design the Query

In the next step select the "Matrix" Report Type:

Select the Report Type

Within the "Design the Matrix" section assign the fields as follows (all the fields except the ones considered as Columns and Details will be considered as Rows):

Design the Matrix

In the last step give the report a meaningful name (e.g. Product Inventory by Location):

Completing the Wizard

In theory the report is ready to run, though before doing that adjust the formatting by centering the headers and, if needed, changing their font size from 11 to 10, respectively rename the dataset (e.g. Inventory). To see the same information about the grouping as below, switch to "Advanced Mode".

Design View

And here's the output (I had to scroll through the pages to find meaningful values, therefore part of the Details header is not shown):

Preview

Restructuring the Grouping

As can be seen in the Design Mode, the wizard created a grouping for each attribute provided in the Details (see matrix1_Category, matrix1_Subcategory, etc.). Therefore, the values will not be repeated for each row, which is typically desirable for visualizations, but not when exporting the data to Excel for further processing. I prefer the latter behavior, and to obtain it one can delete all the groupings via "Delete group only" except matrix1_Category:

Deleting the groups

This action unfortunately deleted all the Detail headers except Category. To bring them back into the grouping, double click on the remaining group and add each field into the Group expressions as below:

Group Properties

As a final change before running the report one can add header names for the Detail attributes. After these changes the report's design looks as follows:

Report Design with one grouping

And here's the final report with the values repeating for each row:

Preview Report without formatting

Note:
To avoid removing the groupings, I prefer to include only one Detail field in the query, typically the field which makes the row unique in the output (e.g. Product Number), and add the further fields after the Wizard created the report (actually by replacing the query below with the one above). One still needs to add the columns manually into the report, so in the end the effort is similar.

-- Product Inventory by Location
SELECT PPI.ProductNumber
, PPI.Location 
, PPI.Quantity 
, PPI.InventoryValue 
, PPI.SalesValue 
FROM Production.vProductInventory PPI
ORDER BY PPI.Category
, PPI.Subcategory
, PPI.ProductNumber

Changing the Design

The report's design can be slightly improved by adding various formatting to the cells or values. One can use formatting similar to the one considered in the previous post. The only thing difficult to implement is a ranking function (see Ranking Rows in Reports). After the design changes here's the report:
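For example, the BackgroundColor property of the detail cells can be based on an expression like the one below to obtain alternating row colors (only an illustration, not necessarily the formatting used in the report):

-- expression for the BackgroundColor property
= IIf(RowNumber(Nothing) Mod 2 = 0, "WhiteSmoke", "White")

Similarly, the numeric cells (e.g. InventoryValue, SalesValue) can get a Format property such as "#,##0.00".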

Final Report

Note:
Of course, besides parameters one can also add totals after each Category or Subcategory if needed, though totals are more appropriate for display purposes than for further data consumption.

Happy coding!

💠🛠️🪄SQL Server: Undocumented (Part III: SQL Server CPU Utilization via the Ring Buffer)

Introduction

If no proper monitoring solution for the SQL Server and the hosting server is in place, one can review the CPU utilization via the Scheduler Monitor buffer provided by the undocumented sys.dm_os_ring_buffers dynamic management view (DMV). Introduced with SQL Server 2005, the DMV provides a significant amount of diagnostic memory information in XML form via several buffers: Resource Monitor, Out-of-Memory, Memory Broker, Buffer Pool, respectively the Scheduler Monitor buffer [2]. A ring buffer is a recorded response to a notification [1].
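For a quick overview of the ring buffers available on an instance, one can group the records by type (a simple check, the list of types varying with the version used):

-- reviewing the available ring buffer types
SELECT ring_buffer_type
, count(*) NoRecords
FROM sys.dm_os_ring_buffers
GROUP BY ring_buffer_type
ORDER BY ring_buffer_type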

The view changed between the various versions of SQL Server, while further ring buffers related to Always On availability groups were made available as well (see [5]).

Warning:
According to Microsoft (see [4]), sys.dm_os_ring_buffers is provided for information purposes only, its compatibility with versions after SQL Server 2019 being not guaranteed!

Querying the Scheduler Monitor Buffer

Within the Scheduler Monitor buffer the DMV stores a history of about 4 hours of uptime with minute-by-minute data points (in total 256 entries), containing the CPU utilization for the SQL Server process, the other processes, respectively the system idle time, as percentages. It thus allows identifying the peaks in CPU utilization and therewith the intervals of focus for further troubleshooting. As the data are stored within an XML structure, the values can be queried via the XQuery syntax as follows:

-- cpu utilization for SQL Server and other applications
DECLARE @ts_now bigint = (SELECT cpu_ticks/(cpu_ticks/ms_ticks)
        FROM sys.dm_os_sys_info); 

SELECT DAT.record_id
, DAT.EventTime
, DAT.SQLProcessUtilization 
, DAT.SystemIdle 
, 100 - (DAT.SystemIdle + DAT.SQLProcessUtilization) OtherUtilization
FROM ( 
	SELECT record.value('(./Record/@id)[1]', 'int') record_id
	, record.value('(./Record/SchedulerMonitorEvent/SystemHealth/SystemIdle)[1]', 'int') SystemIdle 
	, record.value('(./Record/SchedulerMonitorEvent/SystemHealth/ProcessUtilization)[1]', 'int') SQLProcessUtilization
	, EventTime 
	FROM ( 
		SELECT DATEADD(ms, -1 * (@ts_now - [timestamp]), GETDATE()) EventTime
		, [timestamp]
		, CONVERT(xml, record) AS [record] 
		FROM sys.dm_os_ring_buffers 
		WHERE ring_buffer_type = N'RING_BUFFER_SCHEDULER_MONITOR' 
		  AND record LIKE N'%<SystemHealth>%') AS x 
	) AS DAT
ORDER BY DAT.record_id DESC;

If the SQL Server is not busy at all, the SQL Server utilization time may tend to 0%, while the system idle time to 90%. (It's the case of my SQL Server lab.)

CPU Utilization for my home SQL Server lab

Notes:
If the server was restarted within the last 4 hours, the readings will show a gap corresponding to the downtime interval.
The query is supposed to run also on Linux machines, though the SystemIdle value will be 0. One can thus consider only the SQL and non-SQL CPU utilization.

Storing the History

The above query can be run on a regular basis (e.g. every 3-4 hours) via a SSIS package which pushes the data into a table for historical purposes. Because a continuous history of the readings is needed, it's better if the gap between runs is smaller than the 4 hours. No matter the approach used, it's better to check for overlaps when storing the data:

-- dropping the table
-- DROP TABLE IF EXISTS dbo.T_RingBufferReadings 

-- reinitializing the history
-- TRUNCATE TABLE dbo.T_RingBufferReadings

-- creating the table
CREATE TABLE dbo.T_RingBufferReadings (
  Id bigint IDENTITY (1,1) NOT NULL
, RecordId bigint 
, EventTime datetime2(3) NOT NULL
, SQLProcessUtilization int NOT NULL
, SystemIdle int NOT NULL
, OtherUtilization int NOT NULL
)


-- reviewing the data
SELECT *
FROM dbo.T_RingBufferReadings 
ORDER BY EventTime DESC

If there are many records, one can also create an index to improve performance, which can include the reading values as well:

-- creating a unique index with an include 
CREATE UNIQUE NONCLUSTERED INDEX [UI_T_RingBufferReadings_EventTime] ON dbo.T_RingBufferReadings
(
	EventTime ASC,
    RecordId ASC
) INCLUDE (SQLProcessUtilization, SystemIdle, OtherUtilization)
GO

The above query based on the DMV becomes:

-- cpu utilization by SQL Server and other applications
DECLARE @ts_now bigint = (SELECT cpu_ticks/(cpu_ticks/ms_ticks)
        FROM sys.dm_os_sys_info); 

INSERT INTO dbo.T_RingBufferReadings (RecordId, EventTime, SQLProcessUtilization, SystemIdle, OtherUtilization)
SELECT record_id
, DAT.EventTime
, DAT.SQLProcessUtilization 
, DAT.SystemIdle 
, 100 - (DAT.SystemIdle + DAT.SQLProcessUtilization) OtherUtilization
FROM ( 
	SELECT record.value('(./Record/@id)[1]', 'int') record_id
	, record.value('(./Record/SchedulerMonitorEvent/SystemHealth/SystemIdle)[1]', 'int') SystemIdle 
	, record.value('(./Record/SchedulerMonitorEvent/SystemHealth/ProcessUtilization)[1]', 'int') SQLProcessUtilization
	, EventTime 
	FROM ( 
		SELECT DATEADD(ms, -1 * (@ts_now - [timestamp]), GETDATE()) EventTime
		, [timestamp]
		, CONVERT(xml, record) AS [record] 
		FROM sys.dm_os_ring_buffers 
		WHERE ring_buffer_type = N'RING_BUFFER_SCHEDULER_MONITOR' 
		  AND record LIKE N'%<SystemHealth>%') AS x 
	) AS DAT
	LEFT JOIN dbo.T_RingBufferReadings RBR
	  ON DAT.record_id = RBR.Recordid 
WHERE RBR.Recordid IS NULL
ORDER BY DAT.record_id DESC;

Note:
A ServerName column can be added to the table if the values need to be stored for several SQL Servers; the LEFT JOIN then has to consider the newly added column as well (see the sketch below).
Either of the two queries can be used to display the data points within a chart via SSRS, Power BI or any reporting tool available. 
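A minimal sketch for the ServerName extension could look as follows (the column name and default used are only assumptions):

-- adding a ServerName column to the history table (illustration only)
ALTER TABLE dbo.T_RingBufferReadings
ADD ServerName nvarchar(128) NOT NULL DEFAULT (CONVERT(nvarchar(128), SERVERPROPERTY('ServerName')))

-- the LEFT JOIN from the INSERT then considers the server as well
-- LEFT JOIN dbo.T_RingBufferReadings RBR
--   ON DAT.record_id = RBR.RecordId
--  AND RBR.ServerName = CONVERT(nvarchar(128), SERVERPROPERTY('ServerName'))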

Happy coding!

References:
[1] Grant Fritchey (2014) SQL Server Query Performance Tuning: Troubleshoot and Optimize Query Performance in SQL Server 2014, 4th Ed.
[2] Sunil Agarwal et al (2005), Troubleshooting Performance Problems in SQL Server 2005, Source: TShootPerfProbs.docx
[3] Sunil Agarwal et al (2008), Troubleshooting Performance Problems in SQL Server 2008, Source: TShootPerfProbs2008.docx
[4] Microsoft SQL Docs (2018) Related Dynamic Management Views, Source
[5] Microsoft SQL Docs (2017) Use ring buffers to obtain health information about Always On availability groups, Source

🪄SSRS (& Paginated Reports): Parameter Dependencies in Dropdowns (Fourth Magic Class)

Introduction

Quite often there are dependencies between the values of two distinct dropdowns, typically when the values represent different levels of a direct or implied hierarchy. This means that the choice at the upper level determines the values shown at the lower level, the corresponding query needing to handle the upper value as a parameter. There are several ways to handle this, the choice depending also on the data provider considered for the report.

Because the AdventureWorks database is hosted on a SQL Server instance, the available choices are MSOLEDBSQL, SQLOLEDB, ODBC or SQL Server Native Client. It should be noted that the SQLOLEDB and SQL Server Native Client providers have been deprecated and shouldn't be used for new developments! In the project the MSOLEDBSQL provider was used.

For exemplification the reports built previously will be used. The hierarchy will be built based on Territory, Country and State Province: a Country belongs to a Territory, while a State Province to a Country. For the beginning we will build the dependency between the Country and the State Province.

First Level of Dependency 

The dataset for the StateProvince parameter will be defined similarly to the one considered for the Country, and will have as source the following query:

-- State Provinces
SELECT '' Id
, '(All)' Name
UNION ALL
SELECT StateProvinceCode
, StateProvinceCode + ' ' + Name
FROM [Person].[StateProvince]
WHERE CountryRegionCode = ISNULL(NullIf(?, ''), CountryRegionCode)
ORDER BY Id

The Country parameter will be handled in the WHERE constraint - it's the role of the question mark. When an empty string is selected (the equivalent of ignoring the parameter), the formula resolves to an identity, which however makes sense only if there's always a value for the Country Code in the Person.StateProvince table. Otherwise the same line needs to be written as follows:

WHERE IsNull(CountryRegionCode, '') = COALESCE(NullIf(?, ''), CountryRegionCode, '')

This might look strange, though it does the trick of handling the NULLs. According to the rules of defensive programming, one should treat the eventuality of NULLs, however this depends also on the application's design. If one develops reports with such a supposition, then the supposition should be made explicit in the documentation!

Once the query is provided, SSRS recognizes that a parameter is used and requires mapping the question mark to the respective report parameter. Within the dataset's properties click on the "Parameters" tab and select @Country as Parameter Value:

Parameters in Dataset Properties

A question mark needs to be provided in the query for each parameter, the parameters being listed in the Dataset Properties in the order in which they appear in the query.

Once the dataset is defined, one can go on and define the StateProvince parameter similarly to the Country parameter, considering the StateProvinces dataset as source. The StateProvince parameter needs to be added also in the report's header:
= "State Province: " & Parameters!StateProvince.Label

Within the Addresses main dataset the declaration for the StateProvince parameter needs to be added, while in the WHERE clause the corresponding constraint:

-- SQL Parameter
& " DECLARE @StateProvince as nvarchar(3) = '" & Parameters!StateProvince.Value & "'" & vbcrlf 

-- constraint in WHERE clause
& IIf(Parameters!StateProvince.Value<> "", "     AND SIC.StateProvinceCode = @StateProvince", "") & vbcrlf W

With these changes the report should be ready to run.

Notes:
The above query for the StateProvince dataset can be used inside a formula with a few small changes; this should work independently of the provider used. Just replace the query in the formula and delete the parameters previously created via the Dataset Properties:

= " -- State Provinces " & vbCrLf
 & " DECLARE @Country as nvarchar(3) = '" & Parameters!Country.Value & "'" & vbcrlf 
 & " SELECT '' Id " & vbCrLf
 & " , '(All)' Name " & vbCrLf
 & " UNION ALL " & vbCrLf
 & " SELECT StateProvinceCode " & vbCrLf
 & " , StateProvinceCode + ' ' + Name " & vbCrLf
 & " FROM [Person].[StateProvince] " & vbCrLf
 & IIf(Parameters!Country.Value<>"", "     WHERE CountryRegionCode = @Country", "") & vbcrlf 
 & " ORDER BY Id " & vbCrLf

Second Level of Dependency

The second dependency is based on the Territory Group formed of North America, Pacific and Europe. There is however no Territory Group table, the relations between Countries and Territory Groups being stored in the Sales.SalesTerritory table. Therefore, the query for the corresponding dropdown can be written via a DISTINCT:

-- Territories
SELECT '' Id
, '(All)' Name
UNION ALL
SELECT DISTINCT [Group] Id
, [Group] Name
FROM Sales.SalesTerritory
ORDER BY Id

When the user selects a Territory, the Country and State Province dropdowns are supposed to show only the values corresponding to the selection. We thus need to modify both queries via formulas.
The query for the Country dataset includes the Territory parameter and the corresponding constraint via an EXISTS:

= "  -- Countries  " & vbCrLf 
& " DECLARE @Territory as nvarchar(50) = '" & Parameters!Territory.Value & "'" & vbcrlf 
& " SELECT '' Id  " & vbCrLf 
& " , '(All)' Name  " & vbCrLf 
& " UNION ALL  " & vbCrLf 
& " SELECT SSP.CountryRegionCode Id  " & vbCrLf 
& " , SSP.Name   " & vbCrLf 
& " FROM [Person].[CountryRegion] SSP " & vbCrLf 
& IIf(Parameters!Territory.Value<> "", "     WHERE EXISTS (SELECT * FROM [Sales].[SalesTerritory] SST WHERE SSP.CountryRegionCode = SST.CountryRegionCode AND SST.[Group] = @Territory)", "") & vbcrlf 
& " ORDER BY Id " & vbCrLf

In turn, the StateProvince dataset is slightly more complex:

= " -- State Provinces " & vbCrLf
 & " DECLARE @Country as nvarchar(3) = '" & Parameters!Country.Value & "'" & vbcrlf 
 & " DECLARE @Territory as nvarchar(50) = '" & Parameters!Territory.Value & "'" & vbcrlf 
 & " SELECT '' Id " & vbCrLf
 & " , '(All)' Name " & vbCrLf
 & " UNION ALL " & vbCrLf
 & " SELECT PSP.StateProvinceCode " & vbCrLf
 & " , PSP.StateProvinceCode + ' ' + PSP.Name " & vbCrLf
 & " FROM [Person].[StateProvince] PSP" & vbCrLf
 & "      JOIN [Sales].[SalesTerritory] PST" & vbCrLf
 & "       ON PSP.TerritoryId = PST.TerritoryId " & vbCrLf
 & " WHERE 0=0 " & vbCrLf
 & IIf(Parameters!Country.Value<> "", "     AND PSP.CountryRegionCode = @Country", "") & vbcrlf 
 & IIf(Parameters!Territory.Value<> "", "     AND PST.[Group] = @Territory", "") & vbcrlf 
 & " ORDER BY Id " & vbCrLf

Once the datasets are changed, one still needs to modify the order of the parameters to reflect the dependencies between them, respectively Territory, Country, State Province. In the end the report looks like this:


Notes:
(1) If the order of the parameters doesn't reflect the order in which they are called, you'll get a generic error similar to the following one (the error appears when there are problems with the query):
An error occurred during local report processing. The definition of the report '/Customer Addresses' is invalid. The CommandText expression for the query 'Countries' contains an error: [BC30277] Type character '8e does not match declared data type 'String'.
(2) When designing parameters one should also consider the best practices for avoiding poor parameter design.

28 June 2020

𖣯Strategic Management: Strategy Design (Part II: A System's View)

Strategic Management

Each time one discusses in IT how software and hardware components interact with each other, one talks about a composite referred to as a system. Even if the term Information System (IS) is related to it, a system is defined as a set of interrelated and interconnected components that can be considered together for specific purposes or for simple convenience.

A component can be a piece of software or hardware, as well as persons or groups if we extend the definition. The consideration of people becomes relevant especially in the context of ecologies, in which systems are placed in a broader context that considers people’s interaction with them, as this gives rise to important behavior that impacts a system’s functioning.

Within a system each part has a role or function determined in respect to the whole as well as to the other parts. The role or function of a component is typically fixed, predefined, though there are also exceptions, especially when the scope of a component is enlarged, respectively reduced to the degree that the component can be removed or ignored. What one considers or not as part of a system defines the system’s boundaries; it’s what distinguishes it from other systems within the environment(s) considered.

The interaction between the components comes down to the exchange, transmission and processing of data found in different aggregations, ranging from signals to complex data structures. If in non-IT-based systems the changes are determined by the inflow, respectively outflow of energy, in IT the flow is considered in terms of data in its various aggregations (information, knowledge). The data flow (also information flow) represents the ‘fluid’ that nourishes a system’s ‘organism’.

One can grasp the complexity the moment one attempts to describe a system in terms of components, respectively of the dependencies existing between them in terms of data and processes. If in nature the processes are extrapolated, in IT they are predefined (even if the knowledge about them is not available). In addition, the less knowledge one has about the infrastructure, the higher the apparent complexity. Even if the system is not necessarily complex, the lack of knowledge and certainty about it makes it appear so. The more one needs to dig for information and knowledge to reach an acceptable level of understanding and logical depth, the more time is needed for designing a solution.

Saint Exupéry’s definition of simplicity applies from a system’s functional point of view, though it doesn’t address the relative knowledge about the system, which often is implicit (in people’s heads). People have only fragmented knowledge about the system, which makes it difficult to create the whole picture. It’s typically the role of system or process operational manuals, respectively of data descriptions, to make that knowledge explicit, also establishing a foundation for common knowledge and further communication and understanding.

Between the apparent (perceived) and real complexity of a system there’s an important gap that needs to be addressed if one wants to manage the systems adequately, respectively to simplify the systems. Often simplification happens when components or whole systems are replaced, consolidated, or migrated, a mix between these approaches existing as well. Simplifications at data level (aka data harmonization) or process level (aka process optimization and redesign) can have an important impact, being inherent to the good (optimal) functioning of systems.

Whether these changes occur as a big bang or in gradual iterations is a question of available resources, organizational capabilities (including the ability to handle such projects), respectively of the impact, opportunities and risks associated with such endeavors. Beyond this, it’s important to regard the problems from a systemic and systematic point of view, in which ecology’s role is important.

Previous Post <<||>> Next Post

Written: Jun-2020, Last Reviewed: Mar-2024

27 June 2020

💠🛠️SQL Server: Undocumented (Part II: Execute Command for Each Table)

One neat undocumented feature useful in the administration of SQL Server is the sp_MSforeachtable stored procedure, which allows executing a set of commands against each table available in the current database or, with a few changes, against a set of predefined tables. It allows running up to 3 commands as part of the main processing, as well as a pre-command, typically used to initialize variables or the environment, respectively a post-command, typically used for control and cleanup processes. It replaces thus the need to run the commands within a cursor, even if the latter can occasionally provide more flexibility (see example).
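For comparison, a cursor-based equivalent for counting the records of each table could look as follows (a minimal sketch based on sys.tables; the dynamic SQL and the output are kept simple):

-- counting the records for each table via a cursor (for comparison)
DECLARE @TableName nvarchar(261), @Sql nvarchar(max)

DECLARE TableCursor CURSOR FAST_FORWARD FOR
SELECT QUOTENAME(SCHEMA_NAME(schema_id)) + '.' + QUOTENAME(name)
FROM sys.tables

OPEN TableCursor
FETCH NEXT FROM TableCursor INTO @TableName

WHILE @@FETCH_STATUS = 0
BEGIN
	-- building and running the statement for the current table
	SET @Sql = N'SELECT ''' + @TableName + ''' [Table], COUNT(*) NoRecords FROM ' + @TableName
	EXEC sp_executesql @Sql

	FETCH NEXT FROM TableCursor INTO @TableName
END

CLOSE TableCursor
DEALLOCATE TableCursor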

The stored procedure provides the following parameters:
@command1, @command2, @command3: the main commands to be executed (at least the first command must be provided)
@precommand: a command to be executed before the @command1,
@postcommand: a command to be executed after all the commands were executed successfully,
@replacechar: the character used to represent the table within the commands, by default a question mark (?) character
@whereand: used to limit the scope only to a set of tables 

Before the introduction of Dynamic Management Views with SQL Server 2005, the stored procedure was used to get the number of records for all tables:

-- getting the number of records for each table
EXEC sp_MSForEachTable @command1='SELECT ''?'' [Table], COUNT(*) NoRecords FROM ?'

To restrict running the command(s) to a single table, the object_id function can be used within the @whereand parameter:

-- getting the number of records for a table
EXEC sp_MSforeachtable @command1='SELECT ''?'' [Table], COUNT(*) NoRecords FROM ?' --
      , @whereand = ' And Object_id = Object_id(''[Person].[Address]'')'

In AX 2009 and even the later versions, including Dynamics 365 (as long as direct access to the database is possible), the number of records can be restricted to a given company (aka business unit):

-- getting the number of records for a given company
EXEC sp_MSforEachTable @command1 = 'SELECT ''?'' [Table], COUNT(*) NoRecords FROM ? WHERE DataAreaId IN (''DAT'')'

The procedure can be used to perform CRUD operations on each table. During a data migration it was possible to clean out the business units not needed with just a simple script:

-- deleting the data for multiple data areas
EXEC sp_MSforEachTable @command1 = 'DELETE FROM ? WHERE DataAreaId IN (''m01'')'

Administration tasks can be performed as well, e.g. updating statistics, performing consistency checks for each table, etc.:

-- for each table 
EXEC sp_MSforeachtable 'EXEC sp_spaceused ''?''' -- displaying the disk space reserved/used
EXEC sp_MSforeachtable 'UPDATE statistics ? WITH ALL' -- updating statistics
EXEC sp_MSforeachtable 'DBCC CHECKTABLE (''?'') WITH ALL_ERRORMSGS, NO_INFOMSGS' -- checking table consistency 

A set of tasks can be performed for a single table by providing the constraint within the @whereand parameter:

-- executing multiple commands for a table
EXEC sp_MSforeachtable @command1='UPDATE statistics ? WITH ALL' -- updating statistics
   , @command2= 'DBCC CHECKTABLE (''?'') WITH ALL_ERRORMSGS, NO_INFOMSGS' -- checking table consistency 
   , @whereand = ' And Object_id = Object_id(''[Person].[Address]'')'

Executing the commands against a subset of tables involves using a persisted or temporary table to store the names of the tables which should be considered, eventually with further metadata to allow filtering:

-- dropping the table
-- DROP TABLE IF EXISTS #Tables

-- create a temporary table
SELECT TableName
INTO #Tables 
FROM (VALUES ('[Person].[Address]')
, ('[Person].[AddressType]')
, ('[Person].[BusinessEntity]')) DAT(TableName)

-- getting the number of records for the list of tables
EXEC sp_MSForEachTable @command1='SELECT ''?'' [Table], COUNT(*) NoRecords FROM ?'
, @whereand = ' And Object_id In (Select Object_id(TableName) FROM #Tables)'

Warnings:
Do not forget to drop the temporary table when finished!
The code is provided only for exemplification purposes. You can use the above code at your own risk!
Undocumented features can be deprecated in future versions of SQL Server, therefore they should be used with caution in long-term solutions.

Happy coding!

25 June 2020

💠🛠️SQL Server: Undocumented (Part I: Reading the SQL Server Log)

Starting with SQL Server 2005 Microsoft made available the sys.xp_readerrorlog system stored procedure, which allows reading the SQL Server as well as the SQL Server Agent logs one by one. Since I learned about it, I prefer using it to search for errors in the log instead of the Management Studio, because it allows more flexibility and often seems to be much faster than the Log File Viewer.

The stored procedure exposes the following parameters:
p1: numeric value identifying the index of the log file: 0 = current, 1 = archive #1, 2 = archive #2, ...
p2: log file type: 1 or NULL = SQL Server log, 2 = SQL Agent log
p3: the string to search for (e.g. Error)
p4: a second string to search for (allows refining the results)
p5: start time of the events to be considered
p6: end time of the events to be considered
p7: sorting order for results: N'asc' = ascending, N'desc' = descending

The only required parameter is the first, thus the stored procedure can be called as follows:

-- reading the current SQL Server log (to be run individually)
EXEC xp_readerrorlog 0
EXEC xp_readerrorlog 0, 1
EXEC xp_readerrorlog 0, 1, N'Login failed' -- filters for failed logins 
EXEC xp_readerrorlog 0, 1, N'DBCC CHECKDB' -- filters for database consistency checks 
EXEC xp_readerrorlog 0, 1, N'Error' -- filters the Errors
EXEC xp_readerrorlog 0, 1, N'Error', N'Severity: 14' -- filters the Errors with severity 14
EXEC xp_readerrorlog 0, 1, N'', N'', '20200601', NULL, NULL -- start date
EXEC xp_readerrorlog 0, 1, N'', N'', '20200601', '20200607', NULL -- start date & end date 
EXEC xp_readerrorlog 0, 1, N'', N'', NULL, NULL, N'asc' -- ascending sorting

-- reading the last SQL Server log
EXEC xp_readerrorlog 1, 1


-- reading the current SQL Agent log
EXEC xp_readerrorlog 0, 2

Especially when filtering for certain information, the output can be easily displayed in an SSRS report. Besides errors, typically I'm looking for failed logins or the results of consistency checks.

The number of log files available depends on how the SQL Server Logs were configured.
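To check which log files are available one can use the similarly undocumented xp_enumerrorlogs extended stored procedure (shown here only as a hint; being undocumented, its behavior may differ between versions):

-- listing the available SQL Server logs (undocumented)
EXEC sys.xp_enumerrorlogs

-- listing the available SQL Server Agent logs (undocumented)
EXEC sys.xp_enumerrorlogs 2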

Even if the stored procedure provides the needed information fast, the error and the corresponding description are shown in separate records sharing the same timestamp. To process the records one can use a temporary table matching the output's definition. To facilitate the selection of consecutive records an identity-based id column can be added as well. One can dump into the table some or all of the log files and perform various searches or analyses.

-- dropping the table
--DROP TABLE IF EXISTS #tbl 

-- creating a temporary table
CREATE TABLE #tbl (
  id int IDENTITY(1,1) NOT NULL
, LogDate datetime2(3)
, ProcessInfo nvarchar(50)
, LogText varchar(max))    
  
-- reading the current log into it
INSERT INTO #tbl        
EXEC sys.xp_readerrorlog 0, 1

-- reading the last archive log into it
INSERT INTO #tbl        
EXEC sys.xp_readerrorlog 1, 1

-- retrieving the errors together with their descriptions 
SELECT A.LogDate
, A.ProcessInfo
, A.LogText
, B.LogText
FROM #tbl A
     JOIN #tbl B
    ON A.Id = B.Id - 1
WHERE A.LogText LIKE '%Error:%'

Here's the output of the last query (it will look different on your server):


The temporary table can be used for further analyses (e.g. displaying the first, respectively last occurrence of the error together with a count):

-- first/last occurrence of an error 
SELECT Min(A.LogDate) FirstOccurence
, Max(A.LogDate) LastOccurence
, count(*) NoRecords 
, A.LogText
, B.LogText
FROM #tbl A
     JOIN #tbl B
    ON A.Id = B.Id - 1
WHERE A.LogText LIKE '%Error:%'
GROUP BY A.LogText
, B.LogText


Warnings:
Do not forget to drop the temporary table when finished!
The code is provided only for exemplification purposes. You can use the above code at your own risk!
Be careful when the number of records in the logs is too big, because reading all the records can create temporary performance issues!
Undocumented features can be deprecated in future versions of SQL Server, therefore they should be used with caution in long-term solutions.

Happy coding!

24 June 2020

𖣯Strategic Management: Strategy Design (Part I: Simple, but not that Simple)

Strategic Management
Strategic Management Series

Simplicity of design has been for centuries the holy grail of architects, while software designers seem to situate themselves in opposition to the trend, as they aim at using a mix of technologies that usually increases the architecture’s complexity (sometimes the more, the newer and fancier, the better). Unfortunately, despite the implied but not necessarily reachable potential, each component added to an information system or infrastructure has the potential of increasing the overall complexity by a factor proportional to the degree of interactions it creates, respectively by the number of issues it creates or allows to propagate through these interactions.

Conversely, one often talks about simplicity in IT without stating what is intended by it, and it can mean many things. Quite often the aim is packed within the ‘keep it simple, stupid’ (aka KISS) mantra, a modern and pejorative alternative of Occam’s razor. KISS became a principle in software architecture design, and it can mean that a simple solution works better than a complex one, or that pursuing something in the simplest manner possible is usually better. The nuances are wide enough to cover a broad spectrum of solutions, arriving at statements that the simplest choice is the most appropriate one to make, a claim that’s not necessarily true in IT, where complexity finds itself at home.

Starting with the important number of technologies coexisting in integrations and ending with the exceptions existing in processes or the quality of data, things are almost never as simple as one may wish. An IT infrastructure’s complexity depends on the number of existing components, on whether they come from different generations or from different vendors, on whether they are deployed on different operating systems or are supported by different service providers, on the number of customizations made, on the degree of overlapping of the data and of the integrations needed to keep the data in sync, respectively on the differences existing in data models, quality and use. In general, the more variance, randomness, and challenges one has, the higher the overall complexity.

Paraphrasing Saint Exupéry, in IT simplicity is reached when there is no longer anything to add or to take away, or, in Hans Hofmann’s words, ‘the ability to simplify means to eliminate the unnecessary so that the necessary may speak’. This refers to the features, what a piece of software can do, respectively the functionality, how a certain outcome is reached, which come to be packed in various logical aggregations (function point, functional requirement, story, epic, model, product, etc.) or physical aggregations (classes, components, packages, services, models, etc.). These are the levels at which one needs to address simplicity adequately.

To make something simple one must be able either to design a solution up to the detail that there’s nothing left to add or remove, or to start with something and remove or rework things until simplicity is reached. Both approaches involve considerable effort, time, and multiple iterations, however the first approach can easily become utopian, as some architectures are so complex that sooner or later the second approach comes into play. Therefore, one needs in general to focus on what seems an optimal solution and optimize it continuously in further iterations. Aiming for perfection from the beginning, or even later in the improvement process, is a foolhardy wish.

Even if simplicity is hard to achieve, one can still talk about the elegance of a solution, scenarios in which the various components fit together like the pieces of a puzzle, or about robustness, reliability, correctness, maintainability, (re)usability, or learnability. These latter characteristics are known in Software Engineering as (software) quality attributes.

