SQL Troubles: cursors

Showing posts with label cursors. Show all posts

23 January 2025

💎🏭SQL Reloaded: Number of Records V (via Cursors, a Solution for Warehouses in Microsoft Fabric)

After deploying the sample warehouse available in Microsoft Fabric, I tried to check the number of records available in the deployed tables under the dbo schema. Surprisingly, the sys.partitions.count column has 0 values for all the tables associated with the respective schema (see post).

There are only a few tables available, and taking a record count for each table should be enough, which is relatively simple with the undocumented sp_MSForEachTable. Unfortunately, this approach doesn't work neither, so one needs to revert to the use of old-fashioned cursors (as I used to do in SQL Server 2000):

-- number of records via cursor
DECLARE @table_name nvarchar(150)
DECLARE @sql nvarchar(250)
DECLARE @number_records bigint 
DECLARE @number_tables int, @iterator int

DROP TABLE IF EXISTS dbo.#tables;

CREATE TABLE dbo.#tables (
  ranking int NOT NULL
, table_name nvarchar(150) NOT NULL
, number_records bigint
)

INSERT INTO #tables
SELECT row_number() OVER(ORDER BY object_id) ranking
, concat(schema_name(schema_id),'.', name) table_name
, NULL number_records
FROM sys.tables obj
WHERE obj.schema_id = schema_id('dbo')
ORDER BY table_name

SET @iterator = 1
SET @number_tables = IsNull((SELECT count(*) FROM #tables), 0)

WHILE (@iterator <= @number_tables)
BEGIN 
    SET @table_name = (SELECT table_name FROM #tables WHERE ranking = @iterator)
    SET @sql = CONCAT(N'SELECT @NumberRecords = count(*) FROM ', @table_name)

	BEGIN TRY
		--get the number of records
		EXEC sp_executesql @Query = @sql
		, @params = N'@NumberRecords bigint OUTPUT'
		, @NumberRecords = @number_records OUTPUT

		IF IsNull(@number_records, 0)> 0  
		BEGIN
                SET @sql = 'UPDATE #tables' 
             + ' SET number_records = ' + Str(@number_records)
             + ' WHERE table_name = ''' + @table_name + '''';

		 EXEC(@sql)
		END 
	END TRY
	BEGIN CATCH  
	 -- no action needed in case of error
        END CATCH;

	SET @iterator = @iterator + 1
END

SELECT *
FROM dbo.#tables;

--DROP TABLE IF EXISTS dbo.#tables;

Results:

ranking	table_name	number_records
1	dbo.Date	5844
2	dbo.Geography	305179
3	dbo.HackneyLicense	42958
4	dbo.Time	86400
5	dbo.Weather	526330
6	dbo.Trip	2838927
7	dbo.Medallion	13668

Comments:
1) It's a lot of code for a simple task, though the code can be easily duplicated and adapted for similar requirements. Unfortunately, it can lead in time also to many instances of the same code. When possible, one should consider maybe encapsulating the logic in a stored procedure.
2) It's usually a good idea to check how many records are available in the tables used for testing, as this can impact queries' performance and tables' appropriateness for the tests performed. Moreover, it's a good idea to understand the volume of data when taking over or working with a database.
3) If one removes the row_number function, the code should run also in SQL Server 2000. Similar solutions were used then for retrieving the record count.
4) Microsoft recommends not to drop the temporary tables explicitly, but let SQL Server handle this cleanup automatically and take thus advantage of the Optimistic Latching Algorithm, which helps prevent contention on TempDB [1]..
5) There are others who stumbled over this issue (see [1]).
6) The solution has been tested successfully also in SQL databases.
7) The whole code must be run together because the temporary table seems to have only a transitory scope! An attempt to rerun the last SELECT from #tables raises the error: "Invalid object name '#tables'"

Happy coding!

Previous Post <<||>> Next Post

References:
[1] Koen Verbeeck (2024) Get row counts of all tables in a Microsoft Fabric warehouse [link]
[2] Haripriya SB (2024) Do NOT drop #temp tables (link)

15 January 2023

💎🏭SQL Reloaded: Data Management Views for the Synapse serverless SQL pool (& Microsoft Fabric Warehouse)

Unfortunately, the Dynamic Management Views (DMVs) for serverless SQL Server pools don't seem to be documented (or at least I haven't found them in the standard SQL Server documentation). I was thinking some weeks back how I could retrieve them easily as cursors aren't supported in serverless. In the end the old-fashioned loop got the job done (even if might not be the best way to do it):

-- retrieving the data management views in use with the number of records they held
DECLARE @view_name nvarchar(150)
DECLARE @sql nvarchar(250)
DECLARE @number_records bigint 
DECLARE @number_views int, @iterator int

DROP TABLE IF EXISTS dbo.#views;

CREATE TABLE dbo.#views (
  ranking int NOT NULL
, view_name nvarchar(150) NOT NULL
)

INSERT INTO #views
SELECT row_number() OVER(ORDER BY object_id) ranking
, concat(schema_name(schema_id),'.', name) view_name
FROM sys.all_views obj
WHERE obj.Type = 'V'
  AND obj.is_ms_shipped = 1
  --AND obj.name LIKE 'dm_exec_requests%'
ORDER BY view_name
SET @iterator = 1
SET @number_views = IsNull((SELECT count(*) FROM #views), 0)

WHILE (@iterator <= @number_views)
BEGIN 
    SET @view_name = (SELECT view_name FROM #views WHERE ranking = @iterator)
    SET @sql = CONCAT(N'SELECT @NumberRecords = count(*) FROM ', @view_name)

	BEGIN TRY
		--get the number of records
		EXEC sp_executesql @Query = @sql
		, @params = N'@NumberRecords bigint OUTPUT'
		, @NumberRecords = @number_records OUTPUT

		IF IsNull(@number_records, 0)> 0  
		BEGIN
		  SELECT @view_name, @number_records
		END 
	END TRY
	BEGIN CATCH  
	 -- no action needed in case of error
        END CATCH;

	SET @iterator = @iterator + 1
END

DROP TABLE IF EXISTS dbo.#views;

As can be seen the code above retrieves the system views and dumps them in a temporary table, then loops through each record and for each record retrieves the number of records available with the sp_executesql. The call to the stored procedure is included in a TRY/CATCH block to surpress the error messages, considering that many standard SQL Server DMVs are not supported. The error messages follow the same pattern:

"Msg 15871, Level 16, State 9, Line 187
DMV (Dynamic Management View) 'dm_resource_governor_resource_pool_volumes' is not supported."

On the instance I tested the code, from a total of 729 DMVs only 171 records were returned, though maybe there are some views not shown because the feature related to them was not yet configured:

View name	Description
INFORMATION_SCHEMA.COLUMNS	Returns one row for each column (*)
INFORMATION_SCHEMA.PARAMETERS	Returns one row for each parameter of a user-defined function or stored procedure (*)
INFORMATION_SCHEMA.ROUTINE_COLUMNS	Returns one row for each column returned by the table-valued functions (*)
INFORMATION_SCHEMA.ROUTINES	Returns one row for each stored procedure and function (*)
INFORMATION_SCHEMA.SCHEMATA	Returns one row for each schema in the current database
INFORMATION_SCHEMA.TABLES	Returns one row for each table or view in the current database (*)
INFORMATION_SCHEMA.VIEW_COLUMN_USAGE	Returns one row for each column in the current database that is used in a view definition
INFORMATION_SCHEMA.VIEW_TABLE_USAGE	Returns one row for each table in the current database that is used in a view
INFORMATION_SCHEMA.VIEWS	Returns one row for each view that can be accessed by the current user in the current database
sys.all_columns
sys.all_objects
sys.all_parameters
sys.all_sql_modules
sys.all_views
sys.allocation_units
sys.assemblies
sys.assembly_files
sys.assembly_types
sys.columns
sys.configurations
sys.credentials
sys.data_spaces
sys.database_automatic_tuning_options
sys.database_automatic_tuning_options_internal
sys.database_credentials
sys.database_files
sys.database_filestream_options
sys.database_mirroring
sys.database_mirroring_endpoints
sys.database_permissions
sys.database_principals
sys.database_query_store_internal_state
sys.database_query_store_options
sys.database_recovery_status
sys.database_resource_governor_workload_groups
sys.database_role_members
sys.database_scoped_configurations
sys.database_scoped_credentials
sys.databases
sys.dm_exec_connections
sys.dm_exec_query_stats
sys.dm_exec_requests	Returns information about each request that is executing in SQL Server.
sys.dm_exec_requests_history	Returns information about each request that executed in SQL Server; provided by Microsoft for troubleshooting.
sys.dm_exec_sessions
sys.dm_external_data_processed
sys.dm_os_host_info
sys.dm_request_phases	Returns information about each request phase performed in request's execution.
sys.dm_request_phases_exec_task_stats	Returns information about each task performed in request's execution.
sys.dm_request_phases_task_group_stats	Returns information aggregated at task group level about each task performed in request's execution.
sys.endpoints
sys.event_notification_event_types
sys.extended_properties
sys.external_data_sources
sys.external_file_formats
sys.external_language_files
sys.external_languages
sys.external_table_columns
sys.external_tables
sys.filegroups
sys.fulltext_document_types
sys.fulltext_languages
sys.fulltext_system_stopwords
sys.identity_columns
sys.index_columns
sys.indexes
sys.internal_tables
sys.key_encryptions
sys.linked_logins
sys.login_token
sys.master_files
sys.messages
sys.objects
sys.parameters
sys.partitions
sys.procedures
sys.query_store_databases_health
sys.query_store_global_health
sys.resource_governor_configuration
sys.resource_governor_external_resource_pools
sys.resource_governor_resource_pools
sys.resource_governor_workload_groups
sys.routes
sys.schemas
sys.securable_classes
sys.server_audit_specification_details
sys.server_audit_specifications
sys.server_audits
sys.server_event_session_actions
sys.server_event_session_events
sys.server_event_session_fields
sys.server_event_session_targets
sys.server_event_sessions
sys.server_memory_optimized_hybrid_buffer_pool_configuration
sys.server_permissions
sys.server_principals
sys.server_role_members
sys.servers
sys.service_contract_message_usages
sys.service_contract_usages
sys.service_contracts
sys.service_message_types
sys.service_queue_usages
sys.service_queues
sys.services
sys.spatial_reference_systems
sys.sql_dependencies
sys.sql_expression_dependencies
sys.sql_logins
sys.sql_modules
sys.stats
sys.stats_columns
sys.symmetric_keys
sys.sysaltfiles
sys.syscacheobjects
sys.syscharsets
sys.syscolumns
sys.syscomments
sys.sysconfigures
sys.syscurconfigs
sys.sysdatabases
sys.sysdepends
sys.sysfilegroups
sys.sysfiles
sys.sysindexes
sys.sysindexkeys
sys.syslanguages
sys.syslockinfo
sys.syslogins
sys.sysmembers
sys.sysmessages
sys.sysobjects
sys.sysoledbusers
sys.sysperfinfo
sys.syspermissions
sys.sysprocesses
sys.sysprotects
sys.sysservers
sys.system_columns
sys.system_components_surface_area_configuration
sys.system_internals_allocation_units
sys.system_internals_partition_columns
sys.system_internals_partitions
sys.system_objects
sys.system_parameters
sys.system_sql_modules
sys.system_views
sys.systypes
sys.sysusers
sys.tables
sys.tcp_endpoints
sys.time_zone_info
sys.trace_categories
sys.trace_columns
sys.trace_event_bindings
sys.trace_events
sys.trace_subclass_values
sys.trigger_event_types
sys.type_assembly_usages
sys.types
sys.user_token
sys.via_endpoints
sys.views
sys.xml_schema_attributes
sys.xml_schema_collections
sys.xml_schema_component_placements
sys.xml_schema_components
sys.xml_schema_facets
sys.xml_schema_model_groups
sys.xml_schema_namespaces
sys.xml_schema_types
sys.xml_schema_wildcards

Notes:
1) As can be seen, also the INFORMATION_SCHEMA views don't seem to be fully supprted.
2) "(*)" in description marks the views that can be accessed by the current user in the current database.

3) I removed the number of records as they are instance specific.
4) The code should work also on a dedicated SQL Server pool.
5) I hope to come back and showcase the usage of some of the most important views.

6) The script can be used for the Microsoft Fabric Warehouse, however each record will be shown in a different panel! One can use an additional temporary table to save the results or extend the views table and update the table with the result, like in the following script:

-- retrieving the data management views in use with the number of records they held
DECLARE @view_name nvarchar(150)
DECLARE @sql nvarchar(250)
DECLARE @number_records bigint 
DECLARE @number_views int, @iterator int

DROP TABLE IF EXISTS dbo.#views;

CREATE TABLE dbo.#views (
  ranking int NOT NULL
, view_name nvarchar(150) NOT NULL
, record_count bigint NULL
)

INSERT INTO #views
SELECT row_number() OVER(ORDER BY object_id) ranking
, concat(schema_name(schema_id),'.', name) view_name
, NULL record_count
FROM sys.all_views obj
WHERE obj.Type = 'V'
  AND obj.is_ms_shipped = 1
  --AND obj.name LIKE 'dm_exec_requests%'
ORDER BY view_name

SET @iterator = 1
SET @number_views = IsNull((SELECT count(*) FROM #views), 0)

WHILE (@iterator <= @number_views)
BEGIN 
    SET @view_name = (SELECT view_name FROM #views WHERE ranking = @iterator)
    SET @sql = CONCAT(N'SELECT @NumberRecords = count(*) FROM ', @view_name)

	BEGIN TRY
		--get the number of records
		EXEC sp_executesql @Query = @sql
		, @params = N'@NumberRecords bigint OUTPUT'
		, @NumberRecords = @number_records OUTPUT

		IF IsNull(@number_records, 0)>= 0  
		BEGIN
		  UPDATE #views
                  SET record_count = @number_records
                  WHERE view_name = @view_name
		END 
	END TRY
	BEGIN CATCH  
	 -- no action needed in case of error
    END CATCH;

	SET @iterator = @iterator + 1
END

SELECT *
FROM dbo.#views;

DROP TABLE IF EXISTS dbo.#views;

Happy coding!

Previous Post <<||>> Next Post

27 June 2020

💠🛠️SQL Server: Undocumented (Part II: Execute Command for Each Table)

One neat undocumented feature useful in the administration of the SQL Server is the sp_MSforeachtable stored procedure which allows executing a set of several commands against each table available into the current database, or, with a few changes, against a set of predefined tables. It allows running up to 3 commands as part of the main processing, as well a pre-command, typically used to set variable environments or perform initializations, respectively a post-command, typically used for control and cleanup processes. It replaces thus the need to run the commands within a cursor, even if the latter can occasionally provide more flexibility (see example).

The stored procedures provides the following parameters:

@command1, @command2, @command3: the main commands to be executed (at least the first command must be executed)

@precommand: a command to be executed before the @command1,

@postcommand: a command to be executed after all the commands were executed successfully,

@replacechar: the character used to represent the table within the commands, by default a question mark (?) character

@whereand: used to limit the scope only to a set of tables

Before the introduction of Data Management Views with SQL Server 2005, the stored procedure was used to get the number of records for all tables:

-- getting the number of records for each table
EXEC sp_MSForEachTable @command1='SELECT ''?'' [Table], COUNT(*) NoRecords FROM ?'

To restrict running the command(s) only for a table, the object_id function can be used within the @whereand parameter:

-- getting the number of records for a table
EXEC sp_MSforeachtable @command1='SELECT ''?'' [Table], COUNT(*) NoRecords FROM ?' --
      , @whereand = ' And Object_id = Object_id(''[Person].[Address]'')'

In AX 2009 and even the later versions including Dynamics 365 (as long the direct access to the database is possible) the number of records could be restricted to a given company (aka business unit):

-- getting the number of records from a mandant
sp_MSforEachTable @command1 = 'SELECT ''?'' [Table], COUNT(*) NoRecords FROM WHERE DataAreaId IN (''DAT'')'

The procedude can be used to perform CRUD operayions on each table. During a data migration it was possible to clean out the business units not needed with just a simple script:

--delete the data for multiple data areas
 sp_MSforEachTable @command1 = 'DELETE FROM ? WHERE DataAreaId IN (''m01'')'

Administrations tasks can be performed as well, e.g. updating statistics, performing consistency checks for each table, etc.:

-- for each table 
EXEC sp_MSforeachtable 'EXEC sp_spaceused ''?''' -- displaying the disk space researved/used
EXEC sp_MSforeachtable 'UPDATE statistics ? WITH ALL' -- updating statistics
EXEC sp_MSforeachtable 'DBCC CHECKTABLE (''?'') WITH ALL_ERRORMSGS, NO_INFOMSGS' -- check table considerncy

A set of tasks can be performed for a single table by providing the constraint within the @whereand parameter:

-- executing multiple commands for a table
EXEC sp_MSforeachtable @command1='UPDATE statistics ? WITH ALL' -- updating statistics
   , @command2= 'DBCC CHECKTABLE (''?'') WITH ALL_ERRORMSGS, NO_INFOMSGS' -- check table considerncy 
   , @whereand = ' And Object_Name(Object_id) = ''[Person].[Address]'''

Executing the commands against a subset of tables involves using a persisted or temporary table to store the name of the tables which shoul be considered, eventually with further metadata to allow filtering:

-- dropping the table
-- DROP TABLE IF EXISTS #Tables

-- create a temporary table
SELECT TableName
INTO #Tables 
FROM (VALUES ('[Person].[Address]')
, ('[Person].[AddressType]')
, ('[Person].[BusinessEntity]')) DAT(TableName)

-- getting the number of records for the list of tables
EXEC sp_MSForEachTable @command1='SELECT ''?'' [Table], COUNT(*) NoRecords FROM ?'
, @whereand = ' And Object_id In (Select Object_id(TableName) FROM #Tables)'

Warnings:
1) Do not forget to dump the temporary table when finished!
2) The code is provided only for exemplification purposes. You can use the above code on your own risk!
3) Undocumented features can be deprecated in future versions of SQL Server, therefore they should be used with precaution in long-term solutions. It doesn't run for example in Microsoft Fabric warehouses.

Happy coding!

25 October 2018

💎SQL Reloaded: Cursor and Linked Server for Data Import

There are times when is needed to pull some data (repeatedly) from one or more databases for analysis and SSIS is not available or there’s not much time to create individual packages via data imports. In such scenarios is needed to rely on the use of SQL Server Engine’s built-in support. In this case the data can be easily imported via a linked server into ad-hoc created tables in a local database. In fact, the process can be partially automated with the use of a cursor that iterates through a predefined set of tables.For exemplification I will use a SELECT instead of an EXEC just to render the results:

-- cleaning up
-- DROP TABLE dbo.LoadTables 

-- defining the scope
SELECT *
INTO dbo.LoadTables
FROM (VALUES ('dbo.InventTable')
    , ('dbo.InventDimCombination')
    , ('dbo.InventDim')
    , ('dbo.InventItemLocation')) DAT ([Table])


-- creating the stored procedure 
CREATE PROCEDURE dbo.pLoadData(
    @Table as nvarchar(50))
AS
/* loads the set of tables defiend in dbo.LoadTables */
BEGIN
   DECLARE @cTable varchar(50)

   -- creating the cursor
   DECLARE TableList CURSOR FOR
   SELECT [Table]
   FROM dbo.LoadTables
   WHERE [Table] = IsNull(@Table, [Table])
   ORDER BY [Table]

   -- opening the cursor
   OPEN TableList 

   -- fetching next record 
   FETCH NEXT FROM TableList
   INTO @cTable

   -- looping through each record 
   WHILE @@FETCH_STATUS = 0 
   BEGIN
 --- preparing the DROP TABLE statement 
        SELECT(' DROP TABLE IF EXISTS ' + @cTable + '')

        -- preparing the SELECT INTO STATEMENT
        SELECT( ' SELECT *' +
         ' INTO ' + @cTable +
                ' FROM [server].[database].[' + @cTable + ']')

 -- fetching next record 
 FETCH NEXT FROM TableList
 INTO @cTable
   END

   --closing the cursor
   CLOSE TableList 
   -- deallocating the cursor
   DEALLOCATE TableList 
END

Running the stored procedure for all the tables:

 -- Testing the procedure 
 EXEC dbo.pLoadData NULL -- loading all tables 

-- output 
 DROP TABLE IF EXISTS dbo.InventDim
 SELECT * INTO dbo.InventDim FROM [server].[database].[dbo.InventDim]

 DROP TABLE IF EXISTS dbo.InventDimCombination
 SELECT * INTO dbo.InventDimCombination FROM [server].[database].[dbo.InventDimCombination]

 DROP TABLE IF EXISTS dbo.InventItemLocation
 SELECT * INTO dbo.InventItemLocation FROM [server].[database].[dbo.InventItemLocation]

 DROP TABLE IF EXISTS dbo.InventTable
 SELECT * INTO dbo.InventTable FROM [server].[database].[dbo.InventTable]

Running the stored procedure for a specific table:

-- Testing the procedure

EXEC dbo.pLoadData 'dbo.InventTable' -- loading a specific table

-- output 
DROP TABLE IF EXISTS dbo.InventTable
SELECT * INTO dbo.InventTable FROM [server].[database].[dbo.InventTable]

Notes:
1) Having an old example of using a cursor (see Cursor and Lists) the whole mechanism for loading the data was available in 30 Minutes or so.
2) Tables can be added or removed after need, and the loading can be made more flexible by adding other parameters to the logic.
3) The solution is really easy to use and the performance is as well acceptable in comparison to SSIS packages.
4) Probably you already observed the use of DROP TABLE IF EXSISTS introduced with SQL Server 2016 (see also post)

Advantages:
The stored procedure can be extended to any database for which can be created a linked server.
Structural changes of the source tables are reflected in each load.
Tables can be quickly updated when needed just by executing the stored procedure.

Disadvantages:
Such solutions are more for personal use and their use should be avoided in a production environment.
The metadata will be temporarily unavailable during the time the procedure is run. Indexes need to be created after each load.

Happy Coding!

20 June 2018

💫ERP Systems: Dynamics AX 2009 (Part I: Deleting Obsolete Companies)

Introduction

During implementations, migrations and other projects are created in Dynamics AX temporary companies (aka legal entities, data areas) that aren’t needed anymore once they fulfilled their purpose. Excepting the fact that obsolete companies occupy space in the data center, under certain circumstances they can lead to performance problems. The logical thing to do would be to delete the obsolete companies as long there’s no further demand from the business.

In what follows we will look at several methods for deleting obsolete companies. The scripts were tested in Dynamics AX 2009, and more likely they’ll work in coming versions as long the data model behind was kept.

Warning:
Please note that the scripts are provided “AS IS” only to exemplify a technique and they come without any warranty! Before attempting any of the methods described here, review the comments from “Further Considerations” section!

Method 1: Using DynamcsAX Built-In Functionality

Dynamics AX 2009 provides built-in functionality for deleting a company, however when the volume of data in the system goes above a certain limit the functionality starts to perform poorly, even when run directly on an AOS. (It is recommended to run long-running administration jobs directly on the AOS rather than clients.) For example, it was attempted to use this method to delete several companies in Dynamics AX Test environment. By the first company the deletion job needed a few hours, while by the second company the job hasn’t finished after two days, being thus forced to stop it. After two further failed attempts it came the time to look for another solution.

Warning:
It seems that this solution can lead to orphaned data (see [1]). So, even if you are using this method, you might need to consider one of the following methods as well.

Method 2: Using sp_MSforEachTable

In almost all tables in AX the company is stored in a DataAreaId attribute. Over this attribute the records belonging to a company are logically partitioned. This allows writing a script via the undocumented sp_MSforEachTable stored procedure:

--delete the data for one data area
sp_MSforEachTable @command1 = 'DELETE FROM ? WHERE DataAreaId = ''m01'''

An error with be thrown for the tables that don’t contain the DataAreaId attribute:

"Msg 207, Level 16, State 1, Line 1"

Invalid column name 'DataAreaId'.The script can be extended to delete in the same step two or more companies:

--delete the data for multiple data areas
 sp_MSforEachTable @command1 = 'DELETE FROM ? WHERE DataAreaId IN (''m01'', ''m02'')'

During the first test the script needed half of hour to run, however a few tables in which the company is stored in other attributes remained untouched. One can either search for such tables manually, via a script, or run the built-in AX functionality. We opted for running the built-in functionality, which managed to delete the remaining data relatively fast.

Warning:
Microsoft doesn’t support this method and can be used when the volume of obsolete data is relatively small! What does it mean relatively small? The most important limitation of this method is the transaction log, considering that the deleted data are logged. One can either change log’s size to accommodate the volume of data to be deleted or run the deletion only for a subset of the tables. (Changing the recovery model to “simple” or “bulk-logged” won’t make a difference.)

The second important limitation is the available memory, once the available memory is reached SQL Server having to paginate the data, fact that could lead to further disk space consumed. Other limitations have more with the performance to do, e.g. each deletion is reflected also in the indexes. One might consider for example dropping the indexes before deletion and recreating them afterwards.

Method 3: Using a Cursor

Instead of using the undocumented sp_MSforEachTable stored procedure, the loop can be performed via a cursor (see [1]). This method is advantageous when the deletion needs to be performed only for a subset of tables one could use a cursor. The deletion can be grouped together with other activities and run together.

Method 4: Using „Shadow“ Tables

When the volume of data available is huge, and the volume of data that remain in the table is small compared with the overall data, it might be useful to consider using “shadow” tables. One can take advantage of the fact that a truncate command performs incomparable better than a delete command. To use a truncate on a table, the records that need to be kept could be saved temporarily to a copy (aka “shadow”) of the table, the truncate then applied, and the copied records could be moved back. The following scripts exemplify the logic needed to delete the records from InventDim (inventory dimensions) table:

-- (optional) prove the number of records
SELECT count(*) 
FROM dbo.InventDim 
WHERE DataAreaId = 'm01'

-- create the “shadow” table
CREATE TABLE [dbo].[INVENTDIM_Dump](
[INVENTDIMID] [nvarchar](30) NOT NULL,
[INVENTBATCHID] [nvarchar](21) NOT NULL,
[WMSLOCATIONID] [nvarchar](12) NOT NULL,
[INVENTSERIALID] [nvarchar](21) NOT NULL,
[INVENTLOCATIONID] [nvarchar](10) NOT NULL,
[CONFIGID] [nvarchar](10) NOT NULL,
[INVENTSIZEID] [nvarchar](10) NOT NULL,
[INVENTCOLORID] [nvarchar](10) NOT NULL,
[INVENTSITEID] [nvarchar](10) NOT NULL,
[DATAAREAID] [nvarchar](4) NOT NULL,
[RECVERSION] [int] NOT NULL,
[RECID] [bigint] NOT NULL,
[WMSPALLETID] [nvarchar](18) NOT NULL,
[INVENTSTYLEID] [nvarchar](10) NOT NULL
) ON [PRIMARY]

-- copy the data into the “shadow” table
INSERT INTO [dbo].[InventDim_Dump] WITH (TABLOCK)
SELECT *
FROM [dbo].[InventDim] 
WHERE DataAreaId = 'm01'

-- truncate the data frome the main table 
--TRUNCATE TABLE [dbo].[InventDim]

-- copy the data back
INSERT INTO [dbo].[InventDim] WITH (TABLOCK)
SELECT *
FROM [dbo].[InventDim_Dump]

-- (optional) prove whether the IDs were correctly copied 
SELECT count(*)
FROM [dbo].[InventDim] A
JOIN [dbo].[InventDim_Dump] B
ON A.recid = B.RECID 
AND A. DATAAREAID = B.DATAAREAID 
WHERE A.DataAreaId = 'm01'

-- drop the „shadow“ table 
--DROP TABLE[dbo].[InventDim_Dump]

As can be seen the “shadow” tables are simplified versions of the original tables, without constraints or indexes. They can be eventually created in another schema or even other database.

Except the script for table’s creation in the other scripts table’s name can be easily replaced in the editor via the search and replace functionality, trick that reduces considerably the time needed for development. I needed on average 5 minutes for each table, plus 3-4 hours for further tests.

The optional steps are more for exemplification and can be eventually removed.

The Tablock hint used in inserts provides better performance and minimizes the volume of data logged.

I used this method only for the tables having more than 3 million records, around 50 tables in total. Between them there were a few tables having 20-200 GB worth of records. I started with these big tables and figured out that also smaller tables could benefit from this method. A few minutes gained for each small table resulted in the end in a gain of a couple of hours.

The remained records were 0-25% of the initial tables.

In theory, these steps could be performed within a cursor in which the creation of the “shadow” tables could be automated via table metadata as well. This approach will pay-off especially when the schema is not fixed, or the procedure needs to be repeated on different schemas.

Method 5: Delete Records in Batches

There will be a point beyond which the performance provided by the fourth method will deprecate considerably. This point is based on the volume of records available in the table, and the records needed to be inserted back and forth. Without further tests, I suppose that this point lies in the 50-75% interval. Beyond this point for big tables in range of 10x or 100x GB it might be useful to delete the data in batches. A push in this direction might be constrained by the need to shrink the transaction log in between the deletes. The query could be written as follow:

-- deleting top x records 
DELETE top 10000
FROM dbo.InventDim WITH (TABLOCK)
WHERE DataAreaId = ‘m01’

The query can be included in a loop or run manually until no records are returned. It can be tested with different batch sizes to determine the best solution. In between is recommended to check also the growth of the log file and truncate it accordingly when needed.

Method 6: Using X++ code

For those having some basic knowledge of X++ and Dynamics AX classes, a solution based on deleting data via AX code could prove to be a better solution as standard functionality can be leveraged, functionality that eventually considers also the business logic implemented. The downside is the code that need to be written for this purpose, however there are already some examples available on the web (see [4]).

Hint:
In AX 2012 built-in support for batch deletes was added via the delete_from statement (see [3]).

Further Aspects

Before attempting a deletion, it might be useful to analyze how many records will be deleted from each table, and eventually devise different scenarios for specific table categories. To get the number of records one can use either the built-in functionality from AX or use the sp_MSforEachTable stored procedure and export the results to text, following to overwork the data further in Excel:

-- listing the number of records per company 
sp_MSforEachTable @command1 = 'SELECT dataareaid, ''?'' table_name, count(*) no_records FROM ? WHERE DataAreaId IN (''m01'', ''m02'') GROUP BY dataareaid'

The results can be used also to approximate the space occupied by the data.

Independently of the method used it is recommended to restrict users‘ access to the system and to deactivate the scheduled AX or SQL Server jobs. This will ensure that no blockings will occur in the system during the respective time.

As data are synchronized between the AOS’s and the database, it is recommended to shut down the not needed AOS services before the deletions are performed, and restart them once all activities were performed.

To minimize the risks associated with the loss of data it’s recommended to perform a backup of the database(s) before performing any changes.

By deleting the data directly on the database, the business logic from AX (including customizations) is skipped. In theory this can lead to logical inconsistencies, however considering that all the data for a company are deleted, the risks are very small, unless intercompanies are involved.

After the data are deleted it is recommended to recreate the indexes and update the statistics on the tables.

Check whether the transaction log can accommodate the volume of records to be deleted! In extreme cases your SQL Server might crash! From this consideration it might be advantageous to delete only a company at a time.

Based on the volume of data available in the transaction log it might be needed to truncate the log(s) between the steps, as well at the end.

After the principle “better safe than sorry”, it might be a good idea to check the physical and logical consistency of the data before letting the users in.

To minimize the impact on the business, it is recommended to perform the deletion outside the working hours, otherwise the action can lead to blocking and even deadlocks in the system. Always attempt to use standard functionality and resort to other methods only when there’s no way around it.

It is recommended to always test the scripts thoroughly in the test environment before attempting their productive usage!

Previous Post <<||>> Next Post

References:
[1] Microsoft Dynamics AX Technical Support Blog (2010) How to delete orphaned data remained from deleted company?, by Martin Falta [link]
[2] Art of Creation (2010) Delete an AX company on SQL [link]
[3] MSDN (2012) delete_from Statement [link]
4] Kevin’s blog (2017) Dynamics Ax 2012 History cleanup, by Kevin Roos [link]

SQL Troubles

Pages

23 January 2025

💎🏭SQL Reloaded: Number of Records V (via Cursors, a Solution for Warehouses in Microsoft Fabric)

15 January 2023

💎🏭SQL Reloaded: Data Management Views for the Synapse serverless SQL pool (& Microsoft Fabric Warehouse)

27 June 2020

💠🛠️SQL Server: Undocumented (Part II: Execute Command for Each Table)

25 October 2018

💎SQL Reloaded: Cursor and Linked Server for Data Import

20 June 2018

💫ERP Systems: Dynamics AX 2009 (Part I: Deleting Obsolete Companies)

About Me