Showing posts with label INFORMATION_SCHEMA. Show all posts
Showing posts with label INFORMATION_SCHEMA. Show all posts

06 December 2025

💎🤖💫SQL Reloaded: Schema Differences between Database Versions - Part I: INFORMATION_SCHEMA version

During data migrations and other similar activities it's important to check what changed in the database at the various levels. Usually, it's useful to check when schemas, object names or table definitions changed, even if the changes are thoroughly documented. One can write a script to point out all the differences in one output, though it's recommended to check the differences at each level of detail

For this purpose one can use the INFORMATION_SCHEMA available for many of the RDBMS implementing it. This allows to easily port the scripts between platforms. The below queries were run on SQL Server 2025 in combination with Dynamics 365 schemas, though they should run on the earlier versions, incl. (Azure) SQL Databases. 

Such comparisons must be done from the both sides, this implying a FULL OUTER JOIN when writing a single SELECT statement, however the results can become easily hard to read and even interpret when the number of columns in output increases. Therefore, it's recommended to keep the number of columns at a minimum while addressing the scope, respectively break the FULL OUTER JOIN in two LEFT JOINs.

The simplest check is at schema level, and this can be easily done from both sides (note that database names needed to be replaced accordingly):

-- difference schemas (objects not available in the new schema)
SELECT *
FROM ( -- comparison
	SELECT DB1.CATALOG_NAME
	, DB1.SCHEMA_NAME
	, DB1.SCHEMA_OWNER
	, DB1.DEFAULT_CHARACTER_SET_NAME
	, DB2.SCHEMA_OWNER NEW_SCHEMA_OWNER
	, DB2.DEFAULT_CHARACTER_SET_NAME NEW_DEFAULT_CHARACTER_SET_NAME
	, CASE 
		WHEN DB2.SCHEMA_NAME IS NULL THEN 'schema only in old db'
		WHEN DB1.SCHEMA_OWNER <> IsNull(DB2.SCHEMA_OWNER, '') THEN 'different table type'
	  END Comment
        , CASE WHEN DB1.DEFAULT_CHARACTER_SET_NAME <> DB2.DEFAULT_CHARACTER_SET_NAME THEN 'different character sets' END Character_sets
	FROM [old database_name].INFORMATION_SCHEMA.SCHEMATA DB1
	     LEFT JOIN [new database name].INFORMATION_SCHEMA.SCHEMATA DB2
	       ON DB1.SCHEMA_NAME = DB2.SCHEMA_NAME
 ) DAT
WHERE DAT.Comment IS NOT NULL
ORDER BY DAT.CATALOG_NAME
, DAT.SCHEMA_NAME


-- difference schemas (new objects)
SELECT *
FROM ( -- comparison
	SELECT DB1.CATALOG_NAME
	, DB1.SCHEMA_NAME
	, DB1.SCHEMA_OWNER
	, DB1.DEFAULT_CHARACTER_SET_NAME
	, DB2.SCHEMA_OWNER OLD_SCHEMA_OWNER
	, DB2.DEFAULT_CHARACTER_SET_NAME OLD_DEFAULT_CHARACTER_SET_NAME
	, CASE 
		WHEN DB2.SCHEMA_NAME IS NULL THEN 'schema only in old db'
		WHEN DB1.SCHEMA_OWNER <> IsNull(DB2.SCHEMA_OWNER, '') THEN 'different table type'
	  END Comment
        , CASE WHEN DB1.DEFAULT_CHARACTER_SET_NAME <> DB2.DEFAULT_CHARACTER_SET_NAME THEN 'different character sets' END Character_sets
	FROM [new database name].INFORMATION_SCHEMA.SCHEMATA DB1
	     LEFT JOIN [old database name].INFORMATION_SCHEMA.SCHEMATA DB2
	       ON DB1.SCHEMA_NAME = DB2.SCHEMA_NAME
 ) DAT
WHERE DAT.Comment IS NOT NULL
ORDER BY DAT.CATALOG_NAME
, DAT.SCHEMA_NAME

Comments:
1) The two queries can be easily combined via a UNION ALL, though it might be a good idea then to add a column to indicate the direction of the comparison. 

The next step would be to check which objects has been changed:

-- table-based objects only in the old schema (tables & views)
SELECT *
FROM ( -- comparison
	SELECT DB1.TABLE_CATALOG
	, DB1.TABLE_SCHEMA
	, DB1.TABLE_NAME
	, DB1.TABLE_TYPE
	, DB2.TABLE_CATALOG NEW_TABLE_CATALOG
	, DB2.TABLE_TYPE NEW_TABLE_TYPE
	, CASE 
		WHEN DB2.TABLE_NAME IS NULL THEN 'objects only in old db'
		WHEN DB1.TABLE_TYPE <> IsNull(DB2.TABLE_TYPE, '') THEN 'different table type'
		--WHEN DB1.TABLE_CATALOG <> IsNull(DB2.TABLE_CATALOG, '') THEN 'different table catalog'
	  END Comment
	FROM [old database name].INFORMATION_SCHEMA.TABLES DB1
	    LEFT JOIN [new database name].INFORMATION_SCHEMA.TABLES DB2
	      ON DB1.TABLE_SCHEMA = DB2.TABLE_SCHEMA
	     AND DB1.TABLE_NAME = DB2.TABLE_NAME
 ) DAT
WHERE DAT.Comment IS NOT NULL
ORDER BY DAT.TABLE_SCHEMA
, DAT.TABLE_NAME

Comments:
1) If the database was imported under another name, then the TABLE_CATALOG will have different values as well.

At column level, the query increases in complexity, given the many aspects that must be considered:

-- difference columns (columns not available in the new scheam, respectively changes in definitions)
SELECT *
FROM ( -- comparison
	SELECT DB1.TABLE_CATALOG
	, DB1.TABLE_SCHEMA
	, DB1.TABLE_NAME
	, DB1.COLUMN_NAME 
	, DB2.TABLE_CATALOG NEW_TABLE_CATALOG
	, CASE WHEN DB2.TABLE_NAME IS NULL THEN 'column only in old db' END Comment
	, DB1.DATA_TYPE
	, DB2.DATA_TYPE NEW_DATA_TYPE
	, CASE WHEN DB2.TABLE_NAME IS NOT NULL AND IsNull(DB1.DATA_TYPE, '') <> IsNull(DB2.DATA_TYPE, '') THEN 'Yes' END Different_data_type
	, DB1.CHARACTER_MAXIMUM_LENGTH
	, DB2.CHARACTER_MAXIMUM_LENGTH NEW_CHARACTER_MAXIMUM_LENGTH
	, CASE WHEN DB2.TABLE_NAME IS NOT NULL AND IsNull(DB1.CHARACTER_MAXIMUM_LENGTH, '') <> IsNull(DB2.CHARACTER_MAXIMUM_LENGTH, '') THEN 'Yes' END Different_maximum_length
	, DB1.NUMERIC_PRECISION
	, DB2.NUMERIC_PRECISION NEW_NUMERIC_PRECISION
	, CASE WHEN DB2.TABLE_NAME IS NOT NULL AND IsNull(DB1.NUMERIC_PRECISION, '') <> IsNull(DB2.NUMERIC_PRECISION, '') THEN 'Yes' END Different_numeric_precision
	, DB1.NUMERIC_SCALE
	, DB2.NUMERIC_SCALE NEW_NUMERIC_SCALE
	, CASE WHEN DB2.TABLE_NAME IS NOT NULL AND IsNull(DB1.NUMERIC_SCALE, '') <> IsNull(DB2.NUMERIC_SCALE,'') THEN 'Yes' END Different_numeric_scale
	, DB1.CHARACTER_SET_NAME
	, DB2.CHARACTER_SET_NAME NEW_CHARACTER_SET_NAME
	, CASE WHEN DB2.TABLE_NAME IS NOT NULL AND IsNull(DB1.CHARACTER_SET_NAME, '') <> IsNull(DB2.CHARACTER_SET_NAME, '') THEN 'Yes' END Different_character_set_name 
	, DB1.COLLATION_NAME
	, DB2.COLLATION_NAME NEW_COLLATION_NAME
	, CASE WHEN DB2.TABLE_NAME IS NOT NULL AND IsNull(DB1.COLLATION_NAME, '') <> IsNull(DB2.COLLATION_NAME, '') THEN 'Yes' END Different_collation_name
	, DB1.ORDINAL_POSITION
	, DB2.ORDINAL_POSITION NEW_ORDINAL_POSITION
	, DB1.COLUMN_DEFAULT
	, DB2.COLUMN_DEFAULT NEW_COLUMN_DEFAULT
	, DB1.IS_NULLABLE
	, DB2.IS_NULLABLE NEW_IS_NULLABLE
	FROM [old database name].INFORMATION_SCHEMA.COLUMNS DB1
	    LEFT JOIN [new database name].INFORMATION_SCHEMA.COLUMNS DB2
	      ON DB1.TABLE_SCHEMA = DB2.TABLE_SCHEMA
	     AND DB1.TABLE_NAME = DB2.TABLE_NAME
	     AND DB1.COLUMN_NAME = DB2.COLUMN_NAME
 ) DAT
WHERE DAT.Comment IS NOT NULL
  OR IsNull(DAT.Different_data_type,'') = 'Yes'
  OR IsNull(DAT.Different_maximum_length,'') = 'Yes'
  OR IsNull(DAT.Different_numeric_precision,'') = 'Yes'
  OR IsNull(DAT.Different_numeric_scale,'') = 'Yes'
  OR IsNull(DAT.Different_character_set_name,'') = 'Yes'
  OR IsNull(DAT.Different_collation_name,'') = 'Yes'
ORDER BY DAT.TABLE_SCHEMA
, DAT.TABLE_NAME
, DAT.COLLATION_NAME

Comments:
1) The query targets only the most common scenarios, therefore must be changed to handle further cases (e.g. different column defaults, different attributes like nullable, etc.)!
2) The other perspective can be obtained by inverting the table names (without aliases) and changing the name of the columns from "NEW_' to "OLD_" (see the queries for schemas).
3) One can move the column-based conditions for the differences in the main query, though then is needed to duplicate the logic, which will make the code more challenging to change and debug. 

Happy coding!

01 March 2025

💎🤖💫SQL Reloaded: Copilot Stories (Part VII: Using the sp_MSForEachTable Undocumented Stored Procedures in SQL Server)

Probably few know that SQL Server makes available a set of undocumented stored procedures that can be useful in certain scenarios that don't involve code that will deployed into production. One of them is sp_MSForEachTable on which I wrote several posts until now. The stored procedure executes one or more scripts based on a set of constraints against the subset of available tables. The first surprising fact is that Copilot does know the respective stored procedure and seems to correctly generate code for it!

Disclaimer: The below code was generated by Copilot via prompts used for demonstration purposes!

Please note that the following code is provided as it is and you can use it on your own risk! Don't run the code in a productive environment and make sure that you have a usable backup! Please consider all caution when you use a feature marked as undocumented!

Let's start with a simple script and increase the complexity of the queries as it goes!

Prompt:
Write a query that uses the sp_MSForEachTable stored procedure in SQL Server

Output:

-- query that uses this stored procedure to update a specific column in all tables:

EXEC sp_MSForEachTable 'UPDATE ? SET YourColumn = ''NewValue'' WHERE YourCondition'

/*
? is a placeholder for the table name.
YourColumn is the column you want to update.
NewValue is the new value you want to set.
YourCondition is the condition that must be met for the update to occur.*/

As the description says, "?" acts as placeholder for table's name while the other values need to be provided explicitly. The code can be used to overwrite the value of a column that exists in some of the tables. For example, when the data come over an interface from various system, it may happen that there is a mix of upper and lower case values that designate the same value. This might not be an issue for Dynamics AX but can create issues further on. 

Here's the actual code for performing the update:

-- update DataAreaId for DMEMF company
EXEC sp_MSForEachTable 'UPDATE ? SET DataAreaId = Upper(DataAreaId) WHERE DataAreaId = ''demf'''

Executing the code blindly can create serious damage, so always test first the code in a test environment and have a backup available! One needs to be careful if there are fields which have same name(s) but different meaning! It's important to perform a thorough analysis in this direction.

Prompt:
Write a query that uses the sp_MSForEachTable stored procedure to select records from a set of tables.

Output:

-- return all columns
EXEC sp_MSForEachTable 'SELECT * FROM ?'

-- return specific columns based on condition
EXEC sp_MSForEachTable 'SELECT Column1, Column2 FROM ? WHERE YourCondition'

The second query will result in errors if the columns don't exist, though the execution runs further! The errors can be filtered out in Excel by sorting them accordingly.
The query works in general for combinations of columns available in many tables. It's the case of DataAreaId and RecId available in Dynamics 365 for Finance and Operations for a high percentage of the tables. RecId should be available for all the tables.

Prompt:
Write a query that uses the sp_MSForEachTable stored procedure to count all the records from a table

Output:

-- retrieving the number of records for all tables
EXEC sp_MSForEachTable 'SELECT ''?'' AS TableName, COUNT(*) AS RecordCount FROM ?'

Prompt:
Write a query that uses the sp_MSForEachTable stored procedure to select all the records from tables containing the word invent.

Output:

DECLARE @sql NVARCHAR(MAX)

SET @sql = ''
EXEC sp_MSForEachTable 
    @command1 = 'IF ''?'' LIKE ''%invent%'' BEGIN SET @sql = @sql + ''SELECT * FROM '' + ''?'' + ''; '' END',
    @replacechar = '?',
    @command2 = 'EXEC sp_executesql @sql'

PRINT @sql

Here's the working version of the query:

-- retrieve the records for tables having 'invent' intheir name
EXEC sp_MSForEachTable 
    @command1 = 'DECLARE  @sql nvarchar(max) = ''''
		IF ''?'' LIKE ''%invent%'' BEGIN SET @sql = @sql + ''SELECT * FROM '' + ''?'' + ''; '' END;
                    EXEC sp_executesql @sql;'

A few days later, using the same prompt in Edge I got the following answer, simpler and correct:

-- second version run on Edge Copilot
EXEC sp_MSForEachTable 
    @command1 = 'IF ''?'' LIKE ''%invent%'' 
                BEGIN 
                    PRINT ''?'' 
                    EXEC(''SELECT * FROM ?'') 
                END'

Prompt:
Write a query that uses the sp_MSForEachTable stored procedure when the DataAreaId field exists in the table.

Output:

DECLARE @sql NVARCHAR(MAX)

SET @sql = ''
EXEC sp_MSForEachTable 
    @command1 = 'IF EXISTS (SELECT 1 FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = PARSENAME(''?'', 1) AND COLUMN_NAME = ''DataAreaId'') 
                 BEGIN 
                     SET @sql = @sql + ''SELECT * FROM '' + ''?'' + ''; '' 
                 END',
    @replacechar = '?',
    @command2 = 'EXEC sp_executesql @sql'

PRINT @sql

The query uses the metadata available to retrieve the tables in scope.

EXEC sp_MSForEachTable 
    @command1 = ' DECLARE  @sql nvarchar(max) = ''''
	IF EXISTS (SELECT 1 FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = PARSENAME(''?'', 1) AND COLUMN_NAME = ''DataAreaId'') 
        BEGIN 
            SET @sql = @sql + ''SELECT * FROM '' + ''?'' + ''; '' 
			EXEC sp_executesql @sql
        END'

Here's the answer in Edge:

EXEC sp_MSForEachTable 
    @command1 = 'IF EXISTS (SELECT * FROM sys.columns WHERE object_id = OBJECT_ID(''?'' ) AND name = ''DataAreaId'') 
                BEGIN 
                    PRINT ''?'' 
                    EXEC(''SELECT * FROM ?'') 
                END'

Prompt:
Write a query that uses the sp_MSForEachTable stored procedure to select records from a set of tables while it checks whether the fields exist in the table.

Output:

DECLARE @sql NVARCHAR(MAX)

SET @sql = ''
EXEC sp_MSForEachTable 
    @command1 = 'DECLARE @table NVARCHAR(128)
                 SET @table = PARSENAME(''?'', 1)
                 IF EXISTS (SELECT 1 FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = @table AND COLUMN_NAME = ''Field1'')
                 AND EXISTS (SELECT 1 FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = @table AND COLUMN_NAME = ''Field2'')
                 BEGIN
                     SET @sql = @sql + ''SELECT Field1, Field2 FROM '' + ''?'' + ''; ''
                 END',
    @replacechar = '?',
    @command2 = 'EXEC sp_executesql @sql'

PRINT @sql

The query used the metadata available in the INFORMATION_SCHHEMA.COLUMNS. Unfortunately, the code doesn't work as expected. The second command is not necessary for performing the task, so the code can be written as follows, and now it works:

EXEC sp_MSForEachTable 
    @command1 = 'DECLARE @table NVARCHAR(128), @sql nvarchar(max) = ''''
                 SET @table = PARSENAME(''?'', 1)
                 IF EXISTS (SELECT 1 FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = @table AND COLUMN_NAME = ''DataAreaId'')
                 AND EXISTS (SELECT 1 FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = @table AND COLUMN_NAME = ''ItemId'')
                 BEGIN
                     SET @sql = @sql + ''SELECT DataAreaId, ItemId FROM '' + ''?'' + ''; ''
                 END
		 EXEC sp_executesql @sql'

The query is easy to modify to include more columns after the same pattern, if such tables exist after all (e.g. in Dynamics 365 one can search for the combination ItemId, DataAreaId and one or more Inventory dimension like InventSiteId, InventLocationId, etc.). 

Prompt:
Write a query that uses the sp_MSForEachTable stored procedure to count all the records for a subset of tables.

Output:

DECLARE @sql NVARCHAR(MAX)

SET @sql = ''
EXEC sp_MSForEachTable 
    @command1 = 'IF ''?'' LIKE ''%subset%'' 
                 BEGIN 
                     SET @sql = @sql + ''SELECT ''''?'''' AS TableName, COUNT(*) AS RecordCount FROM '' + ''?'' + ''; '' 
                 END',
    @replacechar = '?',
    @command2 = 'EXEC sp_executesql @sql'

PRINT @sql

Unfortunately, that's not what I was expecting. So, let's give it a second try:

Prompt:
Write a query that uses the sp_MSForEachTable stored procedure to count all the records for a subset of the tables available in the SYS.TABLES.

Output:

EXEC sp_MSForEachTable 
    @command1 = 'IF OBJECT_ID(''?'' , ''U'') IN (SELECT OBJECT_ID FROM SYS.TABLES WHERE NAME IN (''YourTable1'', ''YourTable2'', ''YourTable3'')) 
                BEGIN 
                    PRINT ''?'' 
                    EXEC(''SELECT COUNT(*) FROM ?'') 
                END'

Replacing the tables names accordingly, the query becomes

EXEC sp_MSForEachTable 
    @command1 = 'IF OBJECT_ID(''?'' , ''U'') IN (SELECT OBJECT_ID FROM SYS.TABLES WHERE NAME IN (''InventTable'', ''InventDim'', ''InventDimCombination'')) 
                BEGIN 
                    PRINT ''?'' 
                    EXEC(''SELECT COUNT(*) FROM ?'') 
                END'

Comments:
1) Edge Copilot returned the expected solution directly and I wonder what the reason could be? It can be the fact that I run the tests on different machines at different points in time, though more likely the reason is that I used two different Copilots - from Edge, respectively Microsoft 365. The expectation was to get better answers in M365 Copilot, which wasn't the case in the above tests!

The two Copilots have different architectures and were trained with different sets of data, respectively have different sources. One important difference is that Edge Copilot primarily relies on web content and users' interactions within the browser, while M365 Copilot relies on data from the D365 services. Both use LLM and NLP, while M365 Copilot uses Microsoft Graph and Syntex.

One thing I remarked during the tests is that sometimes I get different answers to the same question also within the same Copilot. However, that's expected given that there are multiple ways of addressing the same problem! It would still be interesting to understand how a solution is chosen.

2) I seldom used sp_MSForEachTable with more than one parameter, preferring to prepare the needed statement beforehand, as in the working solutions above.

Happy coding!

Refer4ences:
[1] SQLShack (2017) An introduction to sp_MSforeachtable; run commands iteratively through all tables in a database [link]

Acronyms:
LLM - large language model
M365 - Microsoft 365
NLP - natural language processing

Previous Post <<||>> Next Post

06 February 2024

💎🏭SQL Reloaded: Microsoft Fabric's Delta Tables in Action - Table Metadata I (General Information)

In a previous post I've created a delta table called Assets. For troubleshooting and maintenance tasks it would be useful to retrieve a table's metadata - properties, definition, etc. There are different ways to extract the metadata, depending on the layer and method used.

INFORMATION_SCHEMA in SQL Endpoint

Listing all the delta tables available in a Lakehouse can be done using the INFORMATION_SCHEMA via the SQL Endpoint:

-- retrieve the list of tables
SELECT * 
FROM INFORMATION_SCHEMA.TABLES
WHERE TABLE_TYPE = 'BASE TABLE'
ORDER BY TABLE_SCHEMA  

Output:
TABLE_CATALOG TABLE_SCHEMA TABLE_NAME TABLE_TYPE
Testing dbo city BASE TABLE
Testing dbo assets BASE TABLE

The same schema can be used to list columns' definition:
 
-- retrieve column metadata
SELECT TABLE_CATALOG
, TABLE_SCHEMA
, TABLE_NAME
, COLUMN_NAME
, ORDINAL_POSITION
, DATA_TYPE
, CHARACTER_MAXIMUM_LENGTH
, NUMERIC_PRECISION
, NUMERIC_SCALE
, DATETIME_PRECISION
, CHARACTER_SET_NAME
FROM INFORMATION_SCHEMA.COLUMNS
WHERE TABLE_NAME = 'assets'
ORDER BY ORDINAL_POSITION

Note:
Given that the INFORMATION_SCHEMA is an ANSI-standard schema for providing metadata about a database's objects (including views, schemata), probably this is the best way to retrieve general metadata as the above (see [3]). More information over the SQL Endpoint can be obtained by querying directly the SQL Server metadata.

Table's Properties

The Delta Lake documentation reveals that a table's properties can be retrieved via the DESCRIBE command, which can be used in a notebook's cell (see [1] for attributes' definition):

-- describe table's details
DESCRIBE DETAIL Assets;

Output (transposed):
Attribute Value
format delta
id 83e87b3c-28f4-417f-b4f5-842f6ba6f26d
name spark_catalog.testing.assets
description NULL
location abfss://[…]@onelake.dfs.fabric.microsoft.com/[…]/Tables/assets
createdAt 2024-02-05T16:18:54Z
lastModified 2024-02-05T16:29:52Z
partitionColumns
numFiles 1
sizeInBytes 3879
properties [object Object]
minReaderVersion 1
minWriterVersion 2
tableFeatures appendOnly,invariants

One can export the table metadata also from the sys.tables via the SQL Endpoint, though will be considered also SQL Server based metadata:

-- table metadata
SELECT t.object_id
, schema_name(t.schema_id) schema_name
, t.name table_name
, t.type_desc
, t.create_date 
, t.modify_date
, t.durability_desc
, t.temporal_type_desc
, t.data_retention_period_unit_desc
FROM sys.tables t
WHERE name = 'assets'

Output (transposed):
Attribute Value
object_id 1264723558
schema_name dbo
table_name assets
type_desc USER_TABLE
create_date 2024-02-05 16:19:04.810
modify_date 2024-02-05 16:19:04.810
durability_desc SCHEMA_AND_DATA
temporal_type_desc NON_TEMPORAL_TABLE
data_retention_period_unit_desc INFINITE

Note:
1) There seem to be thus two different repositories for storing the metadata, thing reflected also in the different timestamps.

Table's Definition 

A table's definition can be easily exported via the SQL Endpoint in SQL Server Management Studio. Multiple tables' definition can be exported as well via the Object explorer details within the same IDE. 

In Spark SQL you can use the DESCRIBE TABLE command:

-- show table's definition
DESCRIBE TABLE Assets;

Output:
col_name data_type comment
Id int NULL
CreationDate timestamp NULL
Vendor string NULL
Asset string NULL
Model string NULL
Owner string NULL
Tag string NULL
Quantity decimal(13,2) NULL

Alternatively, you can use the EXTENDED keyword with the previous command to show further table information:
 
-- show table's definition (extended)
DESCRIBE TABLE EXTENDED Assets;

Output (only the records that appear under '# Detailed Table Information'):
Label Value
Name spark_catalog.testing.assets
Type MANAGED
Location abfss://[...]@onelake.dfs.fabric.microsoft.com/[...]/Tables/assets
Provider delta
Owner trusted-service-user
Table Properties [delta.minReaderVersion=1,delta.minWriterVersion=2]

Note that the table properties can be listed individually via the SHOW TBLPROPERTIES command:
 
--show table properties
SHOW TBLPROPERTIES Assets;

A column's definition can be retrieved via a DESCRIBE command following the syntax:

-- retrieve column metadata
DESCRIBE Assets Assets.CreationDate;

Output:
info_name info_value
col_name CreationDate
data_type timestamp
comment NULL

In PySpark it's trickier to retrieve a table's definition (code adapted after Dennes Torres' presentation at Toboggan Winter edition 2024):

%%pyspark
import pyarrow.dataset as pq
import os
import re

def show_metadata(file_path, table_name):
    #returns a delta table's metadata 

    print(f"\{table_name}:")
    schema_properties = pq.dataset(file_path).schema.metadata
    if schema_properties:
        for key, value in schema_properties.items():
            print(f"{key.decode('utf-8')}:{value.decode('utf-8')}")
    else:
        print("No properties available!")

#main code
path = "/lakehouse/default/Tables"
tables = [f for f in os.listdir(path) if re.match('asset',f)]

for table in tables:
    show_metadata(f"/{path}/"+ table, table)

Output:
\assets:
org.apache.spark.version:3.4.1
org.apache.spark.sql.parquet.row.metadata:{"type":"struct","fields":[
{"name":"Id","type":"integer","nullable":true,"metadata":{}},
{"name":"CreationDate","type":"timestamp","nullable":true,"metadata":{}},
{"name":"Vendor","type":"string","nullable":true,"metadata":{}},
{"name":"Asset","type":"string","nullable":true,"metadata":{}},
{"name":"Model","type":"string","nullable":true,"metadata":{}},
{"name":"Owner","type":"string","nullable":true,"metadata":{}},
{"name":"Tag","type":"string","nullable":true,"metadata":{}},
{"name":"Quantity","type":"decimal(13,2)","nullable":true,"metadata":{}}]}
com.microsoft.parquet.vorder.enabled:true
com.microsoft.parquet.vorder.level:9

Note:
One can list the metadata for all tables by removing the filter on the 'asset' table:

tables = os.listdir(path)

Notes:
The above views from the INFORMATION_SCHEMA, respectively sys schemas are available in SQL databases in Microsoft Fabric as well.

Happy coding!

References:
[1] Delta Lake (2023) Table utility commands (link)
[2] Databricks (2023) DESCRIBE TABLE (link)
[3] Microsoft Learn (2023) System Information Schema Views (link)

Related Posts Plugin for WordPress, Blogger...

About Me

My photo
Koeln, NRW, Germany
IT Professional with more than 25 years experience in IT in the area of full life-cycle of Web/Desktop/Database Applications Development, Software Engineering, Consultancy, Data Management, Data Quality, Data Migrations, Reporting, ERP implementations & support, Team/Project/IT Management, etc.