Showing posts with label Azure SQL Database. Show all posts
Showing posts with label Azure SQL Database. Show all posts

16 March 2025

💎🏭SQL Reloaded: Microsoft Fabric's SQL Databases (Part XI: Database and Server Properties)

When taking over a SQL Server, respectively database, one of the first checks I do focuses on the overall configuration, going through the UI available for admins to see if I can find anything that requires further investigation. If no documentation is available on the same, I run a few scripts and export their output as baseline. 

Especially when documenting the configuration, it's useful to export the database options and properties defined at database level. Besides the collation and probably the recovery mode, typically the rest of the configuration is similar, though in exceptional cases one should expect also surprises that require further investigation! 

The following query retrieves in a consolidated way all the options and properties of a SQL database in Microsoft Fabric. 

-- database settings/properties 
SELECT DATABASEPROPERTYEX(DB_NAME(), 'Collation') Collation
--, DATABASEPROPERTYEX(DB_NAME(), 'ComparisonStyle')  ComparisonStyle
, DATABASEPROPERTYEX(DB_NAME(), 'Edition') Edition
--, DATABASEPROPERTYEX(DB_NAME(), 'IsAnsiNullDefault') IsAnsiNullDefault
--, DATABASEPROPERTYEX(DB_NAME(), 'IsAnsiNullsEnabled') IsAnsiNullsEnabled
--, DATABASEPROPERTYEX(DB_NAME(), 'IsAnsiPaddingEnabled') IsAnsiPaddingEnabled
--, DATABASEPROPERTYEX(DB_NAME(), 'IsAnsiWarningsEnabled') IsAnsiWarningsEnabled
--, DATABASEPROPERTYEX(DB_NAME(), 'IsArithmeticAbortEnabled') IsArithmeticAbortEnabled
--, DATABASEPROPERTYEX(DB_NAME(), 'IsAutoClose') IsAutoClose
, DATABASEPROPERTYEX(DB_NAME(), 'IsAutoCreateStatistics') IsAutoCreateStatistics
--, DATABASEPROPERTYEX(DB_NAME(), 'IsAutoCreateStatisticsIncremental') IsAutoCreateStatisticsIncremental
--, DATABASEPROPERTYEX(DB_NAME(), 'IsAutoShrink') IsAutoShrink
, DATABASEPROPERTYEX(DB_NAME(), 'IsAutoUpdateStatistics') IsAutoUpdateStatistics
--, DATABASEPROPERTYEX(DB_NAME(), 'IsClone') IsClone
--, DATABASEPROPERTYEX(DB_NAME(), 'IsCloseCursorsOnCommitEnabled') IsCloseCursorsOnCommitEnabled
--, DATABASEPROPERTYEX(DB_NAME(), 'IsDatabaseSuspendedForSnapshotBackup') IsDatabaseSuspendedForSnapshotBackup
, DATABASEPROPERTYEX(DB_NAME(), 'IsFulltextEnabled') IsFulltextEnabled
--, DATABASEPROPERTYEX(DB_NAME(), 'IsInStandBy') IsInStandBy
--, DATABASEPROPERTYEX(DB_NAME(), 'IsLocalCursorsDefault') IsLocalCursorsDefault
--, DATABASEPROPERTYEX(DB_NAME(), 'IsMemoryOptimizedElevateToSnapshotEnabled') IsMemoryOptimizedElevateToSnapshotEnabled
--, DATABASEPROPERTYEX(DB_NAME(), 'IsMergePublished') IsMergePublished
--, DATABASEPROPERTYEX(DB_NAME(), 'IsNullConcat') IsNullConcat
--, DATABASEPROPERTYEX(DB_NAME(), 'IsNumericRoundAbortEnabled') IsNumericRoundAbortEnabled
--, DATABASEPROPERTYEX(DB_NAME(), 'IsParameterizationForced') IsParameterizationForced
--, DATABASEPROPERTYEX(DB_NAME(), 'IsQuotedIdentifiersEnabled') IsQuotedIdentifiersEnabled
--, DATABASEPROPERTYEX(DB_NAME(), 'IsPublished') IsPublished
--, DATABASEPROPERTYEX(DB_NAME(), 'IsRecursiveTriggersEnabled') IsRecursiveTriggersEnabled
--, DATABASEPROPERTYEX(DB_NAME(), 'IsSubscribed') IsSubscribed
--, DATABASEPROPERTYEX(DB_NAME(), 'IsSyncWithBackup') IsSyncWithBackup
--, DATABASEPROPERTYEX(DB_NAME(), 'IsTornPageDetectionEnabled') IsTornPageDetectionEnabled
--, DATABASEPROPERTYEX(DB_NAME(), 'IsVerifiedClone') IsVerifiedClone
--, DATABASEPROPERTYEX(DB_NAME(), 'IsXTPSupported') IsXTPSupported
, DATABASEPROPERTYEX(DB_NAME(), 'LastGoodCheckDbTime') LastGoodCheckDbTime
, DATABASEPROPERTYEX(DB_NAME(), 'LCID') LCID
--, DATABASEPROPERTYEX(DB_NAME(), 'MaxSizeInBytes') MaxSizeInBytes
, DATABASEPROPERTYEX(DB_NAME(), 'Recovery') Recovery
--, DATABASEPROPERTYEX(DB_NAME(), 'ServiceObjective') ServiceObjective
--, DATABASEPROPERTYEX(DB_NAME(), 'ServiceObjectiveId') ServiceObjectiveId
, DATABASEPROPERTYEX(DB_NAME(), 'SQLSortOrder') SQLSortOrder
, DATABASEPROPERTYEX(DB_NAME(), 'Status') Status
, DATABASEPROPERTYEX(DB_NAME(), 'Updateability') Updateability
, DATABASEPROPERTYEX(DB_NAME(), 'UserAccess') UserAccess
, DATABASEPROPERTYEX(DB_NAME(), 'Version') Version
--, DATABASEPROPERTYEX(DB_NAME(), 'ReplicaID') ReplicaID

Output:

Collation Edition IsAutoCreateStatistics IsAutoUpdateStatistics IsFulltextEnabled LastGoodCheckDbTime LCID Recovery SQLSortOrder Status Updateability UserAccess Version
SQL_Latin1_General_CP1_CI_AS FabricSQLDB 1 1 1 12/31/1899 1033 FULL 52 ONLINE READ_WRITE MULTI_USER 981

The query can be run also against the SQL analytics endpoints available for warehouses in Microsoft Fabric.

Output:

Collation Edition IsAutoCreateStatistics IsAutoUpdateStatistics IsFulltextEnabled LastGoodCheckDbTime LCID Recovery SQLSortOrder Status Updateability UserAccess Version
Latin1_General_100_BIN2_UTF8 DataWarehouse 1 1 1 12/31/1899 1033 SIMPLE 0 ONLINE READ_WRITE MULTI_USER 987

Respectively, for lakehouses:

Collation Edition IsAutoCreateStatistics IsAutoUpdateStatistics IsFulltextEnabled LastGoodCheckDbTime LCID Recovery SQLSortOrder Status Updateability UserAccess Version
Latin1_General_100_BIN2_UTF8 LakeWarehouse 1 1 1 12/31/1899 1033 SIMPLE 0 ONLINE READ_WRITE MULTI_USER 987

A similar output is obtained if one runs the query against SQL database's SQL analytics endpoint:

Output:

Collation Edition IsAutoCreateStatistics IsAutoUpdateStatistics IsFulltextEnabled LastGoodCheckDbTime LCID Recovery SQLSortOrder Status Updateability UserAccess Version
Latin1_General_100_BIN2_UTF8 LakeWarehouse 1 1 1 12/31/1899 1033 SIMPLE 0 ONLINE READ_WRITE MULTI_USER 987

SQL databases seem to inherit the collation from the earlier versions of SQL Server.

Another meaningful value for SQL databases is MaxSizeInBytes, which in my environment had a value of 3298534883328 bytes ÷ 1,073,741,824 = 3,072 GB.

There are however also server properties. Here's the consolidated overview:

-- server properties
SELECT --SERVERPROPERTY('BuildClrVersion') BuildClrVersion
 SERVERPROPERTY('Collation') Collation
--, SERVERPROPERTY('CollationID') CollationID
, SERVERPROPERTY('ComparisonStyle') ComparisonStyle
--, SERVERPROPERTY('ComputerNamePhysicalNetBIOS') ComputerNamePhysicalNetBIOS
, SERVERPROPERTY('Edition') Edition
--, SERVERPROPERTY('EditionID') EditionID
, SERVERPROPERTY('EngineEdition') EngineEdition
--, SERVERPROPERTY('FilestreamConfiguredLevel') FilestreamConfiguredLevel
--, SERVERPROPERTY('FilestreamEffectiveLevel') FilestreamEffectiveLevel
--, SERVERPROPERTY('FilestreamShareName') FilestreamShareName
--, SERVERPROPERTY('HadrManagerStatus') HadrManagerStatus
--, SERVERPROPERTY('InstanceDefaultBackupPath') InstanceDefaultBackupPath
, SERVERPROPERTY('InstanceDefaultDataPath') InstanceDefaultDataPath
--, SERVERPROPERTY('InstanceDefaultLogPath') InstanceDefaultLogPath
--, SERVERPROPERTY('InstanceName') InstanceName
, SERVERPROPERTY('IsAdvancedAnalyticsInstalled') IsAdvancedAnalyticsInstalled
--, SERVERPROPERTY('IsBigDataCluster') IsBigDataCluster
--, SERVERPROPERTY('IsClustered') IsClustered
, SERVERPROPERTY('IsExternalAuthenticationOnly') IsExternalAuthenticationOnly
, SERVERPROPERTY('IsExternalGovernanceEnabled') IsExternalGovernanceEnabled
, SERVERPROPERTY('IsFullTextInstalled') IsFullTextInstalled
--, SERVERPROPERTY('IsHadrEnabled') IsHadrEnabled
--, SERVERPROPERTY('IsIntegratedSecurityOnly') IsIntegratedSecurityOnly
--, SERVERPROPERTY('IsLocalDB') IsLocalDB
--, SERVERPROPERTY('IsPolyBaseInstalled') IsPolyBaseInstalled
--, SERVERPROPERTY('IsServerSuspendedForSnapshotBackup') IsServerSuspendedForSnapshotBackup
--, SERVERPROPERTY('IsSingleUser') IsSingleUser
--, SERVERPROPERTY('IsTempDbMetadataMemoryOptimized') IsTempDbMetadataMemoryOptimized
, SERVERPROPERTY('IsXTPSupported') IsXTPSupported
, SERVERPROPERTY('LCID') LCID
, SERVERPROPERTY('LicenseType') LicenseType
, SERVERPROPERTY('MachineName') MachineName
, SERVERPROPERTY('NumLicenses') NumLicenses
, SERVERPROPERTY('PathSeparator') PathSeparator
--, SERVERPROPERTY('ProcessID') ProcessID
, SERVERPROPERTY('ProductBuild') ProductBuild
--, SERVERPROPERTY('ProductBuildType') ProductBuildType
--, SERVERPROPERTY('ProductLevel') ProductLevel
--, SERVERPROPERTY('ProductMajorVersion') ProductMajorVersion
--, SERVERPROPERTY('ProductMinorVersion') ProductMinorVersion
--, SERVERPROPERTY('ProductUpdateLevel') ProductUpdateLevel
--, SERVERPROPERTY('ProductUpdateReference') ProductUpdateReference
--, SERVERPROPERTY('ProductUpdateType') ProductUpdateType
, SERVERPROPERTY('ProductVersion') ProductVersion
, SERVERPROPERTY('ResourceLastUpdateDateTime') ResourceLastUpdateDateTime
, SERVERPROPERTY('ResourceVersion') ResourceVersion
, SERVERPROPERTY('ServerName') ServerName
, SERVERPROPERTY('SqlCharSet') SqlCharSet
, SERVERPROPERTY('SqlCharSetName') SqlCharSetName
, SERVERPROPERTY('SqlSortOrder') SqlSortOrder
, SERVERPROPERTY('SqlSortOrderName') SqlSortOrderName
, SERVERPROPERTY('SuspendedDatabaseCount') SuspendedDatabaseCount

Output (consolidated):

Property SQL database Warehouse Lakehouse
Collation SQL_Latin1_General_CP1_CI_AS SQL_Latin1_General_CP1_CI_AS SQL_Latin1_General_CP1_CI_AS
ComparisonStyle 196609 196609 196609
Edition SQL Azure SQL Azure SQL Azure
EngineEdition 12 11 11
InstanceDefaultDataPath NULL NULL NULL
IsAdvancedAnalyticsInstalled 1 1 1
IsExternalAuthenticationOnly 1 0 0
IsExternalGovernanceEnabled 1 1 1
IsFullTextInstalled 1 0 0
IsXTPSupported 1 1 1
LCID 1033 1033 1033
LicenseType DISABLED DISABLED DISABLED
MachineName NULL NULL NULL
NumLicenses NULL NULL NULL
PathSeparator \ \ \
ProductBuild 2000 502 502
ProductVersion 12.0.2000.8 12.0.2000.8 12.0.2000.8
ResourceLastUpdateDateTime 11/6/2024 3:41:27 PM 3/5/2025 12:05:50 PM 3/5/2025 12:05:50 PM
ResourceVersion 16.00.5751 17.00.502 17.00.502
ServerName ... .... ....
SqlCharSet 1 1 1
SqlCharSetName iso_1 iso_1 iso_1
SqlSortOrder 52 52 52
SqlSortOrderName nocase_iso nocase_iso nocase_iso
SuspendedDatabaseCount NULL 0 0

It's interesting that all three instances have the same general collation, while the Engine Edition of SQL databases is not compatible with the others [2]. The Server Names has been removed manually from the output from obvious reasons. The warehouse and lakehouse are in the same environment (SQL Azure instance, see sys.databases), and therefore the same values are shown (though this might happen independently of the environments used).

The queries were run in a trial Microsoft Fabric environment. Other environments can have upon case different properties. Just replace the "--" from the commented code to get a complete overview.

The queries should run also in the other editions of SQL Server. If DATABASEPROPERTYEX is not supported, one should try DATABASEPROPERTY instead.

Happy coding!

Previous Post <<||>> Next Post

References:
[1] Microsoft Learn (2024) SQL Server 2022: DATABASEPROPERTYEX (Transact-SQL) [link]
[2] Microsoft Learn (2024) SQL Server 2022: SERVERPROPERTY (Transact-SQL) [link]

06 January 2025

💎🏭SQL Reloaded: Microsoft Fabric's SQL Databases (Part VI: Index Usage Analysis) [new feature]

There are several system dynamic management views (DMV) available in SQL Server, Azure SQL Server and now in SQL databases that allow to gather more information about indexes' fragmentation and usage. Let's look at the most important information available based on the indexes create in the previous posts. As the data were probably purged from the views, it's needed to run first the select queries based on the SalesLT.Product from the previous post. This step is important, otherwise the DMVs might return no records!

One starting point is to use the sys.dm_db_index_physical_stats DMV to look at the indexes' size and fragmentation information for a given table (or view). The table is used usually as starting point for analyzing indexes' fragmentation and then defragment the indexes with high fragmentation.

-- sys metadata - index & data size and fragmentation information for the data and indexes of the specified table or view
SELECT --db_name() db_name
--, object_name(IND.object_id) table_name
 IND.name index_name
, IND.type_desc
, IPS.page_count
, IPS.record_count
, IPS.index_level
, Cast(IPS.avg_fragmentation_in_percent as decimal(10,2)) avg_fragmentation_perc
, Cast(IPS.avg_page_space_used_in_percent as decimal(10,2)) space_used_perc
--, IPS.*
FROM sys.indexes IND
     CROSS APPLY sys.dm_db_index_physical_stats(DB_ID(), IND.object_id, IND.index_id, NULL, 'DETAILED') IPS
WHERE IND.object_id = OBJECT_ID(N'SalesLT.Product');
Output:
index_name type_desc page_count record_count index_level avg_fragmentation_perc space_used_perc
PK_Product_ProductID CLUSTERED 101 295 0 0.99 87.90
PK_Product_ProductID CLUSTERED 1 101 1 0.00 16.20
AK_Product_rowguid NONCLUSTERED 2 295 0 50.00 74.69
AK_Product_rowguid NONCLUSTERED 1 2 1 0.00 0.59
AK_Product_ProductNumber NONCLUSTERED 2 295 0 50.00 85.79
AK_Product_ProductNumber NONCLUSTERED 1 2 1 0.00 0.49
AK_Product_Name NONCLUSTERED 3 295 0 33.33 87.32
AK_Product_Name NONCLUSTERED 1 3 1 0.00 1.67
IX_SalesLT_Product_Color NONCLUSTERED 1 295 0 0.00 79.24
IX_SalesLT_Product_Color_Size NONCLUSTERED 1 295 0 0.00 94.12
IX_SalesLT_Product_ListPrice_IC NONCLUSTERED 4 295 0 0.00 86.60
IX_SalesLT_Product_ListPrice_IC NONCLUSTERED 1 4 1 0.00 1.01

In a second step one can look at the sys.dm_db_index_usage_stats DMV which provides the counts of the different types of index operations and the time each type of operation was last performed:

-- sys metadata - counts of different types of index operations and the time each type of operation was last performed.
SELECT -- db_name() db_name
--, object_name(IND.object_id) table_name
 IND.name
, IND.type_desc
, IUS.user_seeks 
, IUS.user_scans
, IUS.user_lookups 
, IUS.user_updates
, IUS.last_user_seek
, IUS.last_user_scan 
, IUS.last_user_lookup
, IUS.last_user_update
FROM sys.dm_db_index_usage_stats IUS
     JOIN sys.indexes IND
       ON IUS.index_id = IND.index_id
WHERE IND.object_id = OBJECT_ID(N'SalesLT.Product');
Output:
name type_desc user_seeks user_scans user_lookups user_updates last_user_seek last_user_scan last_user_lookup
PK_Product_ProductID CLUSTERED 0 10 15 0 2025-01-06T14:23:54 2025-01-06T14:23:54
IX_SalesLT_Product_Color_Size NONCLUSTERED 11 0 0 0 2025-01-06T14:23:54
IX_SalesLT_Product_ListPrice_IC NONCLUSTERED 8 0 0 0 2025-01-06T13:38:03

Finally, it might be useful to look also at the sys.dm_db_index_operational_stats DMV which returns the current lower-level I/O, locking, latching, and access method activity for each partition of a table or index in the database (see the documentation for the full list of attrbutes):

-- sys metadata - index operations stats
SELECT -- db_name() db_name
--, object_name(IND.object_id) table_name
 IND.name index_name
, IND.type_desc
, IOS.range_scan_count
, IOS.singleton_lookup_count
, IOS.leaf_insert_count
, IOS.leaf_delete_count
, IOS.leaf_update_count
, IOS.nonleaf_insert_count
, IOS.nonleaf_delete_count
, IOS.nonleaf_update_count
FROM sys.indexes IND
     CROSS APPLY sys.dm_db_index_operational_stats(DB_ID(), IND.object_id, IND.index_id, NULL) IOS
WHERE IND.object_id = OBJECT_ID(N'SalesLT.Product')
 AND IOS.range_scan_count<>0
ORDER BY IND.name;
Output:
index_name type_desc range_scan_count singleton_lookup_count leaf_insert_count leaf_delete_count leaf_update_count nonleaf_insert_count nonleaf_delete_count nonleaf_update_count
IX_SalesLT_Product_Color_Size NONCLUSTERED 11 0 0 0 0 0 0 0
IX_SalesLT_Product_ListPrice_IC NONCLUSTERED 8 0 0 0 0 0 0 0
PK_Product_ProductID CLUSTERED 10 64 0 0 0 0 0 0

For more information on these DMVs check the documentation.

Happy coding!

Previous Post <<||>> Next Post

References:
[1] Microsoft Learn (2024) SQL Server: sys.dm_db_index_physical_stats [link]
[2] Microsoft Learn (2024) SQL Server: sys.dm_db_index_usage_stats [link]
[3] Microsoft Learn (2024) SQL Server: sys.dm_db_index_operational_stats [link]

04 January 2025

💎🏭SQL Reloaded: Microsoft Fabric's SQL Databases (Part V: Manual Index Maintenance) [new feature]

Indexes' maintenance in Microsoft Fabric's SQL databases is supposed to happen automatically in the background via automatic tuning options feature, though the whole functionality is still in its early phases, and therefore many questions regarding the whole process may arise. Probably the most important question is whether indexes can still be created, respectively maintained manually. That's useful for temporary or even periodic workloads, where maybe organizations might still want to maintain indexes manually. 

The tests made below are based on the SalesLT.Product from AdventureWorkds database available in Microsoft Fabric. The target was to create several indexes that could be used for the various testing purposes. Each set of the below scripts was run 5-10 times until records appeared in the sys.dm_db_missing_index_details table for each test case (see further below):

-- batch 1: filter on single column (to be run 5-10 times)
SELECT *
FROM SalesLT.Product 
WHERE Color = 'Red'

SELECT *
FROM SalesLT.Product 
WHERE Color = 'Black'

SELECT *
FROM SalesLT.Product 
WHERE Color = 'White'

-- batch 2: filter on two columns (to be run 5-10 times)
SELECT *
FROM SalesLT.Product 
WHERE Color = 'Red'
  AND Size = '58'

SELECT *
FROM SalesLT.Product 
WHERE Color = 'Black'
  AND Size = '58'

SELECT *
FROM SalesLT.Product 
WHERE Color = 'White'
     AND Size = '58'

-- batch 3: filter with column selection (to be run 5-10 times)
SELECT ProductNumber, Name, Color, ListPrice
FROM SalesLT.Product 
WHERE ListPrice BETWEEN 50 AND 55

SELECT ProductNumber, Name, Color, ListPrice
FROM SalesLT.Product 
WHERE ListPrice BETWEEN 100 and 105

Once the scripts run, one can look at the records created in the above considered dynamic management view:

-- sys metadata -  missing indexes
SELECT MID.statement AS table_name
, MID.equality_columns
, MID.inequality_columns
, MID.included_columns
--, MIG.index_group_handle
--, MIG.index_handle
FROM sys.dm_db_missing_index_details MID 
    JOIN sys.dm_db_missing_index_groups MIG 
     ON MID.index_handle =  MIG.index_handle
ORDER BY MIG.index_group_handle
, MIG.index_handle
Output:
table_name equality_columns inequality_columns included_columns
[AdventureWorks01-...].[SalesLT].[Product] [Color]
[AdventureWorks01-...].[SalesLT].[Product] [Color], [Size]
[AdventureWorks01-...].[SalesLT].[Product] [ListPrice] [Name], [ProductNumber], [Color]

The next step is to create one of the indexes (please note that database's name must be replaced accordingly or used only the 2-part naming convention - schema & table name ):

-- create index on Color
CREATE INDEX IX_SalesLT_Product_Color 
ON [AdventureWorks01-...].[SalesLT].[Product] (Color);

Once the script was run, all the records related to the SalesLT.Product disappeared from the dynamic management view. Therefore, it might be a good idea to take a snapshot with view's data before creating any indexes manually. Probably the same behavior should be expected when the indexes are created by the system.

-- create index on Color & Size
CREATE INDEX IX_SalesLT_Product_Color_Size
ON [SalesLT].[Product] (Color, Size);

-- create index on ListPrice with included columns
CREATE INDEX IX_SalesLT_Product_ListPrice_IC
ON [SalesLT].[Product] (ListPrice) INCLUDE(ProductNumber, Name, Color);

One can use the following query based on the meta.vIndexes (created in a previous post) to look at the indexes created:

-- sys metadata - index columns
SELECt IND.db_name
, IND.schema_name
, IND.table_name
, IND.index_name
, IND.index_type
, IND.principal_type
, IND.auto_created
FROM meta.vIndexes IND
WHERE IND.schema_name = 'SalesLT'
  AND IND.table_name = 'Product'
  AND IND.index_name IN ('IX_SalesLT_Product_Color ','IX_SalesLT_Product_Color_Size'
,'IX_SalesLT_Product_ListPrice_IC')
ORDER BY IND.table_name
, IND.index_name
Output:
db_name schema_name table_name index_name index_type principal_type auto_created
AdventureWorks01-... SalesLT Product IX_SalesLT_Product_Color NONCLUSTERED S False
AdventureWorks01-... SalesLT Product IX_SalesLT_Product_Color_Size NONCLUSTERED S False
AdventureWorks01-... SalesLT Product IX_SalesLT_Product_ListPrice_IC NONCLUSTERED S False

After this model can be created further indexes as needed. It's always a good idea to take a "copy" of the indexes created (or keep a history of the scripts run for indexes' maintenance). This best practice is now more important, when the system can drop indexes as it considers fit. 

Don't forget to clean up the changes made if the indexes aren't needed anymore:

-- cleaning after
DROP INDEX IF EXISTS SalesLT.IX_SalesLT_Product_Color;
DROP INDEX IF EXISTS SalesLT.IX_SalesLT_Product_Color_Size;
DROP INDEX IF EXISTS SalesLT.IX_SalesLT_Product_ListPrice_IC;

So, after these tests, the standard syntax for index's maintenance seems to work also on SQL databases, with all the implications deriving from this (e.g. porting of scripts, database objects, etc.)

Happy coding!

Previous Post <<||>> Next Post

01 January 2025

💎🏭SQL Reloaded: SQL Server Metadata (Part I: Indexes Overview)

There are scenarios in which it's useful to gather information about the available indexes and their definition as a primary step for troubleshooting or index maintenance. Moreover, it's useful to take a baseline of the defined indexes and update it accordingly when indexes change. A minimum of information can be gathered in Excel files or similar repositories, though for the same can be used also metadata tools, at least when they're easy to use and the associated costs aren't neglectable. 

Usually, there are two levels at which the information is needed - at index, respectively at column level. Sometimes it's useful to have an independent query for each level of detail, though in data warehouses and similar use cases it's useful to provide a model on top of the metadata, at least to improve query's maintainability.

The first view encapsulates the logic needed to export the data and it's based on the sys.indexes, sys.objects, sys.schemas and sys.database_principals system objects. 

-- create schema for metadata
CREATE SCHEMA meta;

-- clean after
DROP VIEW IF EXISTS meta.vIndexes

-- create views
CREATE OR ALTER VIEW meta.vIndexes
AS
-- sys metadata - indexes
SELECT OBJ.schema_id
, IND.object_id 
, IND.index_id 
, SCH.name schema_name
, OBJ.name table_name
, IND.name index_name
, IND.type_desc index_type
, IND.is_primary_key
, IND.is_unique_constraint
, IND.fill_factor
, IND.has_filter
, IND.auto_created
, IND.is_unique is_unique_index
, DBP.type principal_type
, DBP.type_desc principal_type_desc
--, INC.*
FROM sys.indexes IND WITH (NOLOCK)
     JOIN sys.objects OBJ WITH (NOLOCK)
       ON IND.object_id = OBJ.object_id
          JOIN sys.schemas SCH WITH (NOLOCK)
            ON OBJ.schema_id = SCH.schema_id
               JOIN sys.database_principals DBP
                 ON SCH.principal_id = DBP.principal_id 
WHERE DBP.type IN ('S', 'E')

Upon case, the needed information can be exported via a query like the one below:

-- sys metadata - index columns
SELECt IND.db_name
, IND.schema_name
, IND.table_name
, IND.index_name
, IND.index_type
, IND.principal_type
FROM meta.vIndexes IND
WHERE IND.schema_name = 'SalesLT'
  AND IND.table_name IN ('Address', 'Customer')
ORDER BY IND.table_name
, IND.index_name

Output:

db_name schema_name table_name index_name index_type principal_type
AdventureWorks01... SalesLT Address AK_Address_rowguid NONCLUSTERED S
AdventureWorks01... SalesLT Address IX_Address_AddressLine1_AddressLine2_City_StateProvince_PostalCode_CountryRegion NONCLUSTERED S
AdventureWorks01... SalesLT Address IX_Address_StateProvince NONCLUSTERED S
AdventureWorks01... SalesLT Address PK_Address_AddressID CLUSTERED S
AdventureWorks01... SalesLT Customer AK_Customer_rowguid NONCLUSTERED S
AdventureWorks01... SalesLT Customer IX_Customer_EmailAddress NONCLUSTERED S
AdventureWorks01... SalesLT Customer PK_Customer_CustomerID CLUSTERED S

Similarly, on top of the above view can be built a similar object that provides also the column-related information by adding the sys.index_columns and sys.columns system object to the logic:

-- clean after
DROP VIEW IF EXISTS meta.vIndexColumns

-- create 
CREATE OR ALTER VIEW meta.vIndexColumns
AS
-- sys metadata - index columns
SELECT IND.db_id
, IND.schema_id
, INC.object_id 
, INC.index_id 
, INC.index_column_id
, INC.column_id
, IND.db_name
, IND.schema_name
, IND.table_name
, IND.index_name
, COL.name column_name
, INC.key_ordinal
, INC.partition_ordinal
, IND.index_type
, IND.is_primary_key
, IND.is_unique_constraint
, IND.fill_factor
, IND.has_filter
, IND.auto_created
, IND.is_unique_index
, INC.is_descending_key
, INC.is_included_column
, IND.principal_type
, IND.principal_type_desc
FROM sys.index_columns INC
     JOIN sys.columns COL WITH (NOLOCK)
       ON INC.object_id = COL.object_id 
      AND INC.column_id = COL.column_id 
     JOIN meta.vIndexes IND WITH (NOLOCK)
       ON INC.object_id = IND.object_id
      AND INC.index_id = IND.index_id

And, there's an example of the query based on this view:

-- sys metadata - index columns
SELECt INC.schema_name
, INC.table_name
, INC.index_name
, INC.column_name
, INC.key_ordinal
, INC.index_type
, INC.principal_type
FROM meta.vIndexColumns INC
WHERE INC.schema_name = 'SalesLT'
  AND INC.table_name IN ('Address', 'Customer')
ORDER BY INC.table_name
, INC.index_name
, INC.key_ordinal

Output:
schema_name table_name index_name column_name key_ordinal index_type principal_type
SalesLT Address AK_Address_rowguid rowguid 1 NONCLUSTERED S
SalesLT Address IX_Address_AddressLine1_AddressLine2_City_StateProvince_PostalCode_CountryRegion AddressLine1 1 NONCLUSTERED S
SalesLT Address IX_Address_AddressLine1_AddressLine2_City_StateProvince_PostalCode_CountryRegion AddressLine2 2 NONCLUSTERED S
SalesLT Address IX_Address_AddressLine1_AddressLine2_City_StateProvince_PostalCode_CountryRegion City 3 NONCLUSTERED S
SalesLT Address IX_Address_AddressLine1_AddressLine2_City_StateProvince_PostalCode_CountryRegion StateProvince 4 NONCLUSTERED S
SalesLT Address IX_Address_AddressLine1_AddressLine2_City_StateProvince_PostalCode_CountryRegion PostalCode 5 NONCLUSTERED S
SalesLT Address IX_Address_AddressLine1_AddressLine2_City_StateProvince_PostalCode_CountryRegion CountryRegion 6 NONCLUSTERED S
SalesLT Address IX_Address_StateProvince StateProvince 1 NONCLUSTERED S
SalesLT Address PK_Address_AddressID AddressID 1 CLUSTERED S
SalesLT Customer AK_Customer_rowguid rowguid 1 NONCLUSTERED S
SalesLT Customer IX_Customer_EmailAddress EmailAddress 1 NONCLUSTERED S
SalesLT Customer PK_Customer_CustomerID CustomerID 1 CLUSTERED S

Notes:
1) As a DBA it's useful to take a baseline of the indexes defined and reevaluate their usefulness over time. These are one of the checks that should be done when one becomes responsible for the administration of a database server, independently of the vendor.
2) The definitions of the views can be extended as needed, though one should try to keep the overall complexity to a minimum. 
3) There are voices against the use of NOLOCK. Feel free to change the objects accordingly!
4) It's useful to work in a dedicated schema (e.g. meta) and have a different naming convention that deviates slightly from one defined by Microsoft. This should make sure that no confusion with the system objects exists.
5) Azure SQL takes over many of the responsibilities for index maintenance. Even if indexes are managed automatically by the system, a baseline is still needed, at least to evaluate functionality's performance, respectively the changes that occurred in the environment. 
6) Except attributes which were added for specific functionality over time, the queries should work at least starting with SQL Server 2005 forward.
7) See also the notes on clustered vs nonclustered indexes.

Happy coding!

Previous Post <<||>> Next Post

07 December 2024

🏭 💠Data Warehousing: Microsoft Fabric (Part VI: SQL Databases for OLTP scenarios) [new feature]

Data Warehousing Series
Data Warehousing Series

One interesting announcements at Ignite is the availability in public preview of SQL databases in Microsoft Fabric, "a versatile and developer-friendly transactional database built on the foundation of Azure SQL database". With this Fabric can address besides OLAP also OLTP scenarios, evolving thus from analytics to a data platform [1]. According to the announcement, besides the AI-optimized architectural aspects, the feature makes the SQL Azure simple, autonomous and secure by design [1], and these latest aspects are considered in this post. 

Simplicity revolves around the deployment and configuration of databases, the creation of a new database requiring giving a name and the database is created in seconds [1]. It’s a considerable improvement compared with the relatively complex setup needed for on-premise configurations, though sometimes more flexibility in configuration is needed upfront or over database’s lifetime. To get a database ready for testing one can import a sample database or get specific data via data flows and/or pipelines [1]. As development tools one can use Visual Studio Code or SSMS [1], and probably more tools will be available in time.

The integration with both GitHub and Azure DevOps allows to configure each database under source control, which is needed for many scenarios especially when multiple resources make changes to the database objects [1]. Frankly, that’s mainly important during the development phase, respectively in scenarios in which multiple people make in parallel changes to the logic. It will be interesting to see how much overhead or challenges the feature adds to development and how smoothly everything works together!

The most important aspect for many solutions is the replication of data in near-real time to the (open-source) delta parquet format in OneLake and thus making the data available for analytics almost immediately [1]. Probably, from this aspect many cloud-based applications can benefit, even if the performance might not be as good as in other well-established architectures. However, there are many other scenarios in which one needs to maintain and use data for OLTP/OLAP purposes. This invites adequate testing and a good weighting of the advantages and disadvantages involved. 

A SQL database is a native item in Fabric, and therefore it utilizes Fabric capacity units like other Fabric workloads [1]. One can use the Fabric SKU estimator (still in private preview) to estimate the costs [2], though it will be interesting to see how cost-effective the solutions are. Probably, especially when the infrastructure is already available outside of Fabric, it will be easier and cost-effective to use the mirroring functionality. One should test and have a better estimator before moving blindly from the existing infrastructure to Fabric. 

SQL databases in Fabric are autonomous by design, while allowing to get the best performance and availability by default [1]. High availability is reached through zone redundancy, while performance is achieved by scaling automatically the storage and compute to accommodate the workloads [1]. The auto-optimization capability is achieved with the help of the latest Intelligent Query Processing (IQP) enhancements, respectively the creation of missing indexes to improve query performance [1]. It will be interesting to see how the whole process works, given that the maintenance of indexes usually involves some challenges (e.g. identifying covering indexes, indexes needed only for temporary workloads, duplicated indexes).

SQL databases in Fabric are automatically configured for high availability with zone redundancy, while storage and compute scale automatically to accommodate the user workload [1]. The database is auto-optimized through the latest IQP enhancements while the system creates any missing indexes to improve query performance. All data is replicated to OneLake by default [1]. Finally, the database always receives the latest security updates with auto-patching, while automatic backups help in disaster recovery scenarios  [1], which can be of real help for database administrators. 

References:
[1] Microsoft Fabric Updates Blog (2024) Announcing SQL database in Microsoft Fabric Public Preview [link]
[2] Microsoft Fabric Updates Blog (2024) Announcing New Recruitment for the Private Preview of Microsoft Fabric SKU Estimator [link]

Related Posts Plugin for WordPress, Blogger...

About Me

My photo
Koeln, NRW, Germany
IT Professional with more than 25 years experience in IT in the area of full life-cycle of Web/Desktop/Database Applications Development, Software Engineering, Consultancy, Data Management, Data Quality, Data Migrations, Reporting, ERP implementations & support, Team/Project/IT Management, etc.