
06 July 2020

🪄SSRS (& Paginated Reports): Matrix Report Display (Fifth Magic Class)

Introduction

SQL Server Reporting Services (SSRS) allows grouping data into a matrix format based on one or more groups. Using the Report Wizard can considerably reduce the volume of work involved.

The example below is based on the AdventureWorks2014 database and uses the Product Inventory as basis for building the report. 

Preparing the Data

Usually it's useful to incorporate the logic for a report into one or more views, which allows reusing them in multiple reports. The current report needs two views: Production.vProducts for the products and Production.vProductInventory for the inventory. 

-- dropping the vProducts view (cleaning after)
--DROP VIEW IF EXISTS [Production].[vProducts]

-- creating the vProducts view
CREATE VIEW [Production].[vProducts]
AS 
SELECT P.ProductID 
, P.ProductNumber
, P.Name 
, IsNull(P.Size, '') + IsNull(' ' + P.SizeUnitMeasureCode, '') Size
, P.Color
, P.Style
, P.ProductModelID
, PPM.Name ProductModel 
, P.StandardCost 
, P.ListPrice
, P.SafetyStockLevel
, P.ReorderPoint
, P.SellStartDate 
, P.SellEndDate
, P.ProductSubcategoryID
, PPS.Name Subcategory
, PPS.ProductCategoryID
, PPC.Name Category
, P.MakeFlag
, P.FinishedGoodsFlag
FROM [Production].[Product] P
     LEFT JOIN [Production].[ProductModel] PPM
       ON P.ProductModelID = PPM.ProductModelID 
     LEFT JOIN Production.ProductSubcategory PPS
       ON P.ProductSubcategoryID = PPS.ProductSubcategoryID 
     LEFT JOIN Production.ProductCategory PPC
       ON PPS.ProductCategoryID = PPC.ProductCategoryID 
GO

-- reviewing the data 
SELECT *
FROM [Production].[vProducts]

-- checking the view for internal data consistency
SELECT count(*)
FROM [Production].[Product] P
     LEFT JOIN [Production].[ProductModel] PPM
       ON P.ProductModelID = PPM.ProductModelID 
     LEFT JOIN Production.ProductSubcategory PPS
       ON P.ProductSubcategoryID = PPS.ProductSubcategoryID 
     LEFT JOIN Production.ProductCategory PPC
       ON PPS.ProductCategoryID = PPC.ProductCategoryID 


-- dropping the vProductInventory view (cleaning after)
--DROP VIEW IF EXISTS Production.vProductInventory

-- creating the view 
CREATE VIEW Production.vProductInventory
AS
SELECT PPI.ProductID 
, PPD.ProductNumber
, PPD.Name ProductName 
, PPD.ProductModel
, PPD.Size
, PPD.Category
, PPD.Subcategory
, PPD.Style
, PPD.StandardCost
, PPD.ListPrice 
, PPD.StandardCost * PPI.Quantity InventoryValue 
, PPD.ListPrice * PPI.Quantity SalesValue
, PPD.MakeFlag
, PPI.LocationID 
, PPL.Name Location 
, PPI.Shelf 
, PPI.Bin 
, PPI.Quantity 
FROM [Production].[ProductInventory] PPI
     JOIN [Production].[vProducts] PPD
       ON PPI.ProductID = PPD.ProductID
     JOIN [Production].[Location] PPL
       ON PPI.LocationID = PPL.LocationID
GO

-- reviewing the data
SELECT *
FROM Production.vProductInventory

-- checking the view for internal data consistency
SELECT count(*)
FROM [Production].[ProductInventory] PPI
     JOIN [Production].[vProducts] PPD
       ON PPI.ProductID = PPD.ProductID
     JOIN [Production].[Location] PPL
       ON PPI.LocationID = PPL.LocationID

Note:
It's important to check the views or queries used for internal consistency, i.e. whether the logic removes or duplicates data. For this one can run the query only for the uppermost table, then repeatedly add one more join with each run and check whether the number of records remains the same. One can shortcut the validation by comparing only the number of records from the base table with the one from the whole query, and only if there are differences use the previously mentioned approach. (This is how I observed that the Production.vProductDetails view is not usable, because it considers only the Products having a valid Model.)
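For example, for the vProducts view the shortcut resumes to comparing the record counts of the base table and of the view (the two numbers below should match):

-- shortcut consistency check: base table vs. view
SELECT (SELECT count(*) FROM [Production].[Product]) NoRecordsTable
, (SELECT count(*) FROM [Production].[vProducts]) NoRecordsView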

Creating the Report 

We can now use the Production.vProductInventory view to create the Product Inventory by Location report, based on the following query:

-- Product Inventory by Location
SELECT PPI.Category
, PPI.Subcategory
, PPI.ProductNumber
, PPI.ProductName 
, PPI.ProductModel
, PPI.Size
, PPI.Style
, PPI.StandardCost
, PPI.ListPrice
, PPI.Location 
, PPI.Quantity 
, PPI.InventoryValue 
, PPI.SalesValue 
FROM Production.vProductInventory PPI
ORDER BY PPI.Category
, PPI.Subcategory
, PPI.ProductNumber

Note:
The attributes can be provided in the order in which they should be displayed in the report, thus reducing the overhead in the next steps. 

Start the Report Wizard via Add New Report, select the data source in the first step, and provide the above query in the next one:

Design the Query

In the next step select the "Matrix" Report Type:

Select the Report Type

Within the "Design the Matrix" section assign the fields as follows (all the fields except the ones considered as Columns and Details will be considered as Rows):

Design the Matrix

In the last step give the report a meaningful name (e.g. Product Inventory by Location):

Completing the Wizard

In theory the report is ready to run, however before doing that center the headers, reduce their font size from 11 to 10 points if needed, and rename the dataset (e.g. Inventory). To see the same information about the grouping as below, switch to "Advanced Mode".

Design View

And here's the output (I had to scroll through the pages to find meaningful values, therefore part of the Details header is not shown):

Preview

Restructuring the Grouping

As can be seen in the Design Mode, the wizard created a grouping for each attribute provided in the Details (see matrix1_Category, matrix1_Subcategory, etc.). Therefore, the values will not be repeated for each row, which is typically desirable for visualizations but not when exporting the data to Excel for further processing. I prefer the latter scenario, so to obtain this behavior one can delete all the groupings except matrix1_Category via "Delete group only":

Deleting the groups

Unfortunately, this action also deleted all the Detail headers except Category. To bring the fields back into the grouping, double click on the remaining group and add each field to the Group expressions as below:

Group Properties

As a final change before running the report, one can add header names for the Detail attributes. After these changes the report's design looks as follows:

Report Design with one grouping

And here's the final report with the values repeating for each row:

Preview Report without formatting

Note:
To avoid having to remove the groupings, I prefer to include only one Detail field in the query, typically the field which makes the row unique in the output (e.g. Product Number), and add the further fields after the Wizard created the report (actually replacing the below query with the one above). One still needs to add the columns manually to the report, so in the end the effort is similar. 

-- Product Inventory by Location
SELECT PPI.ProductNumber
, PPI.Location 
, PPI.Quantity 
, PPI.InventoryValue 
, PPI.SalesValue 
FROM Production.vProductInventory PPI
ORDER BY PPI.Category
, PPI.Subcategory
, PPI.ProductNumber

Changing the Design

The report's design can be slightly improved by formatting the cells or values, using formatting similar to the one considered in the previous post. The only thing difficult to implement will be a ranking function (see Ranking Rows in Reports). After the design changes, here's the report:

Final Report

Note:
Of course, together with parameters one can also add totals after each Category or Subcategory if needed, though the latter are more appropriate for display purposes than for further data consumption. 

Happy coding!

27 June 2020

💠🛠️SQL Server: Undocumented (Part II: Execute Command for Each Table)

One neat undocumented feature useful in the administration of SQL Server is the sp_MSforeachtable stored procedure, which allows executing a set of commands against each table available in the current database, or, with a few changes, against a set of predefined tables. It allows running up to 3 commands as part of the main processing, as well as a pre-command, typically used to set variable environments or perform initializations, and a post-command, typically used for control and cleanup processes. It thus replaces the need to run the commands within a cursor, even if the latter can occasionally provide more flexibility (see example).

The stored procedure provides the following parameters:
@command1, @command2, @command3: the main commands to be executed (at least the first command must be provided)
@precommand: a command to be executed before @command1,
@postcommand: a command to be executed after all the commands were executed successfully (see the sketch below),
@replacechar: the character used to represent the table within the commands, by default a question mark (?) character
@whereand: used to limit the scope only to a set of tables 
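For example, the pre- and post-commands can be used to collect the output into a single result set via a global temporary table (a minimal sketch; the ##TableCounts name is only illustrative):

-- collecting the number of records per table via pre- and post-commands
EXEC sp_MSforeachtable @precommand = 'CREATE TABLE ##TableCounts (TableName nvarchar(300), NoRecords int)'
, @command1 = 'INSERT INTO ##TableCounts SELECT ''?'', COUNT(*) FROM ?'
, @postcommand = 'SELECT * FROM ##TableCounts ORDER BY NoRecords DESC; DROP TABLE ##TableCounts'

A global temporary table is needed because a local one created within the pre-command would go out of scope before the main commands run.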

Before the introduction of Dynamic Management Views with SQL Server 2005, the stored procedure was used to get the number of records for all tables:

-- getting the number of records for each table
EXEC sp_MSForEachTable @command1='SELECT ''?'' [Table], COUNT(*) NoRecords FROM ?'
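Nowadays the same information can be retrieved via the Dynamic Management Views, without scanning each table (a sketch based on sys.dm_db_partition_stats; the counts can be approximate for tables with ongoing activity):

-- getting the number of records for each table via DMVs
SELECT object_schema_name(PS.object_id) + '.' + object_name(PS.object_id) [Table]
, SUM(PS.row_count) NoRecords
FROM sys.dm_db_partition_stats PS
     JOIN sys.tables T
       ON PS.object_id = T.object_id
WHERE PS.index_id IN (0, 1) -- heap, respectively clustered index
GROUP BY PS.object_id
ORDER BY NoRecords DESC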

To restrict running the command(s) to a single table, the object_id function can be used within the @whereand parameter:

-- getting the number of records for a table
EXEC sp_MSforeachtable @command1='SELECT ''?'' [Table], COUNT(*) NoRecords FROM ?'
      , @whereand = ' And Object_id = Object_id(''[Person].[Address]'')'

In AX 2009 and even the later versions, including Dynamics 365 (as long as direct access to the database is possible), the number of records can be restricted to a given company (aka business unit):

-- getting the number of records for a given company (mandant)
EXEC sp_MSforEachTable @command1 = 'SELECT ''?'' [Table], COUNT(*) NoRecords FROM ? WHERE DataAreaId IN (''DAT'')'

The procedure can also be used to perform CRUD operations on each table. During a data migration it was possible to clean out the business units not needed with just a simple script:

--delete the data for multiple data areas
 sp_MSforEachTable @command1 = 'DELETE FROM ? WHERE DataAreaId IN (''m01'')'
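Before running such a destructive command it's prudent to review the affected records first, e.g. by reusing the counting command with the same constraint (this assumes, as above, that each table has a DataAreaId column):

-- reviewing the records affected by the deletion
EXEC sp_MSforEachTable @command1 = 'SELECT ''?'' [Table], COUNT(*) NoRecords FROM ? WHERE DataAreaId IN (''m01'')'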

Administration tasks can be performed as well, e.g. updating statistics, performing consistency checks for each table, etc.:

-- for each table 
EXEC sp_MSforeachtable 'EXEC sp_spaceused ''?''' -- displaying the disk space reserved/used
EXEC sp_MSforeachtable 'UPDATE STATISTICS ? WITH ALL' -- updating statistics
EXEC sp_MSforeachtable 'DBCC CHECKTABLE (''?'') WITH ALL_ERRORMSGS, NO_INFOMSGS' -- checking table consistency 

A set of tasks can be performed for a single table by providing the constraint within the @whereand parameter:

-- executing multiple commands for a table
EXEC sp_MSforeachtable @command1='UPDATE STATISTICS ? WITH ALL' -- updating statistics
   , @command2= 'DBCC CHECKTABLE (''?'') WITH ALL_ERRORMSGS, NO_INFOMSGS' -- checking table consistency 
   , @whereand = ' And Object_id = Object_id(''[Person].[Address]'')'

Executing the commands against a subset of tables involves using a persisted or temporary table to store the names of the tables which should be considered, optionally with further metadata to allow filtering:

-- dropping the table
-- DROP TABLE IF EXISTS #Tables

-- create a temporary table
SELECT TableName
INTO #Tables 
FROM (VALUES ('[Person].[Address]')
, ('[Person].[AddressType]')
, ('[Person].[BusinessEntity]')) DAT(TableName)

-- getting the number of records for the list of tables
EXEC sp_MSForEachTable @command1='SELECT ''?'' [Table], COUNT(*) NoRecords FROM ?'
, @whereand = ' And Object_id In (Select Object_id(TableName) FROM #Tables)'
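For comparison, here's a minimal sketch of the cursor-based alternative mentioned above, which offers more flexibility (e.g. custom handling per table) at the cost of more code:

-- getting the number of records for the list of tables via a cursor
DECLARE @TableName nvarchar(300)
DECLARE @SQL nvarchar(max)

DECLARE TableCursor CURSOR FAST_FORWARD FOR
SELECT TableName FROM #Tables

OPEN TableCursor
FETCH NEXT FROM TableCursor INTO @TableName
WHILE @@FETCH_STATUS = 0
BEGIN
    SET @SQL = N'SELECT ''' + @TableName + ''' [Table], COUNT(*) NoRecords FROM ' + @TableName
    EXEC sp_executesql @SQL
    FETCH NEXT FROM TableCursor INTO @TableName
END
CLOSE TableCursor
DEALLOCATE TableCursor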

Warnings:
Do not forget to drop the temporary table when finished!
The code is provided only for exemplification purposes. Use the above code at your own risk!
Undocumented features can be deprecated in future versions of SQL Server, therefore they should be used with caution in long-term solutions.

Happy coding!

29 October 2018

💠🛠️SQL Server: Administration (Searching the Error Log)

Searching for a needle in a haystack is an achievable task, though it may turn out to be daunting. The same can be said about searching for a piece of information in the SQL error log. Fortunately, there is xp_readerrorlog, an undocumented (extended) stored procedure, which helps in the process. The stored procedure makes available the content of the error log and provides basic search capabilities via a small set of parameters. For example, it can be used to search for errors, warnings, failed backups, consistency checks, failed logins, database instant file initializations, and so on. It helps identify whether an event occurred and the time at which the event occurred.

The following parameters are available with the stored procedure:

1. FileToRead (int): 0 = current, 1, 2, 3, … n = archive number
2. LogType (int): 1 = SQL Server error log, 2 = SQL Agent log
3. String1 (varchar(255)): the string to match the log entries on
4. String2 (varchar(255)): a second string to match in combination with String1 (AND)
5. StartDate (datetime): beginning date to look from
6. EndDate (datetime): ending date to look up to
7. ResultsOrder: ASC or DESC sorting


Note:
If the SQL Server Agent hasn’t been active, then there will be no Agent log and the call to the stored procedure will return an error.
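If needed, the call can be wrapped in a TRY/CATCH block to handle the error gracefully (a minimal sketch):

-- guarding against a missing Agent log
BEGIN TRY
    EXEC xp_readerrorlog 0, 2 -- current SQL Server Agent log
END TRY
BEGIN CATCH
    SELECT ERROR_MESSAGE() [ErrorMessage]
END CATCH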

Here are a few examples of using the stored procedure:

-- listing the content of the current SQL Server error log
EXEC xp_readerrorlog 0, 1

-- listing the content of the first archived SQL Server error log
EXEC xp_readerrorlog 1, 1

-- listing the content of the current SQL Server Agent log
EXEC xp_readerrorlog 0, 2

-- searching for errors 
EXEC xp_readerrorlog 0, 1, N'error'

-- searching for errors that have to do with consistency checks
EXEC xp_readerrorlog 0, 1, N'error', N'CHECKDB'

-- searching for failed backups
EXEC xp_readerrorlog 0, 1, N'failed', N'backups'

-- searching for warnings 
EXEC xp_readerrorlog 0, 1, N'warning'

-- searching who killed a session
EXEC xp_readerrorlog 0, 1, N'kill'

-- searching for I/O information
EXEC xp_readerrorlog 0, 1, N'I/O'

-- searching for consistency checks 
EXEC xp_readerrorlog 0, 1, N'CHECKDB'

-- searching for consistency checks performed via DBCC
EXEC xp_readerrorlog 0, 1, N'DBCC CHECKDB'

-- searching for failed logins  
EXEC xp_readerrorlog 0, 1, N'Login failed'

-- searching for entries marked as [INFO]
EXEC xp_readerrorlog 0, 1, N'[INFO]'

-- searching for shutdowns 
EXEC xp_readerrorlog 0, 1, N'shutdown'

-- searching for a database instant file initialization event  
EXEC xp_readerrorlog 0, 1, N'database instant file initialization'

If the error log is too big, it's important to narrow the search to a given time interval:

-- searching for errors starting with a given date 
DECLARE @StartDate as Date = DateAdd(d, -1, GetDate())
EXEC xp_readerrorlog 0, 1, N'error', N'', @StartDate

-- searching for errors within a time interval 
DECLARE @StartDate as Date = DateAdd(d, -14, GetDate())
DECLARE @EndDate as Date = DateAdd(d, -7, GetDate())
EXEC xp_readerrorlog 0, 1, N'', N'', @StartDate, @EndDate, N'desc' 

The output can be dumped into a table, especially when a detailed analysis of the error log is needed. It might be interesting to check how often an error message occurred, like in the below example. One can thus take advantage of more complex pattern searching.

-- creating the error log table 
CREATE TABLE dbo.ErrorLogMessages (
    LogDate datetime2(0) 
  , ProcessInfo nvarchar(255)
  , [Text] nvarchar(max))

-- loading the errors 
INSERT INTO dbo.ErrorLogMessages
EXEC xp_readerrorlog 0, 1

-- checking the results 
SELECT *
FROM dbo.ErrorLogMessages

-- checking messages frequency 
SELECT [Text]
, count(*) NoOccurrences
, Min(LogDate) FirstOccurrence
FROM dbo.ErrorLogMessages
GROUP BY [Text]
HAVING count(*)>1
ORDER BY NoOccurrences DESC

-- getting the errors and the message logged immediately after them 
SELECT *
FROM (
 SELECT *
 , Lead([Text], 1) OVER (PARTITION BY LogDate, ProcessInfo ORDER BY LogDate) NextMessage -- the following message
 FROM dbo.ErrorLogMessages
 ) DAT
WHERE [Text] LIKE '%error:%[0-9]%'
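Building on this, one could for example extract the error number from the message via pattern matching (a sketch; it assumes messages of the form 'Error: 18456, Severity: 14, …'):

-- extracting the error number from the message
SELECT [Text]
, SubString([Text], PatIndex('%[0-9]%', [Text]), PatIndex('%[0-9][^0-9]%', [Text]) - PatIndex('%[0-9]%', [Text]) + 1) ErrorNumber
FROM dbo.ErrorLogMessages
WHERE [Text] LIKE '%Error:%[0-9]%'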

-- cleaning up 
--DROP TABLE IF EXISTS dbo.ErrorLogMessages 

For those who don't have admin permissions it is necessary to explicitly grant execute permissions on the xp_readerrorlog stored procedure:

-- giving explicit permissions to account
GRANT EXECUTE ON xp_readerrorlog TO [<account_name>]

Personally, I've been using the stored procedure mainly to check whether error messages were logged for a given time interval and whether the consistency checks ran without problems. Occasionally, I used it to check for failed logins or session terminations (aka kills).

Notes:
Microsoft warns that undocumented objects might change in future releases. Fortunately, xp_readerrorlog made it from SQL Server 2005 through SQL Server 2017, so it might make it further…
The above code was tested also on SQL Server 2017.

Happy coding!
