30 October 2022

SQL Reloaded: Querying Azure Synapse Metadata for the D365 CRM & FO Tables Exported via Azure Synapse Link for Dataverse

Enabling Azure Synapse Link for Dataverse for Dynamics 365 CRM (D365 CRM) or Dynamics 365 Finance & Operations (D365 FO) makes all the tables needed for reporting available in a CDM structure. However, the csv files with the data don't have headers, which makes it challenging to use the files directly without the corresponding metadata. For example, defining external tables would be of little use without proper column names and data types. Fortunately, the attributes' names and data types are available in JSON files and can be queried.

D365 CRM 

For CRM, the metadata files start with the table's name and have the -EntityMetadata.json suffix. They are stored in the Microsoft.Athena.TrickleFeedService folder, which sits next to the folders containing the data. Given that all the files are in the same folder, the following query can be used to export the metadata for all or only some of the tables. Just replace "(container name)" and "(data source)" with the corresponding values for your environment.

 
-- export definition for D365 CRM's CDM (all attributes; remove the WHERE clause to return all tables)
SELECT DAT.EntityName
, DAT.AttributeName
, DAT.Timestamp
, DAT.AttributeType
, DAT.MetadataId
, DAT.Precision
, DAT.MaxLength
FROM openrowset(
        bulk '/(container name)/Microsoft.Athena.TrickleFeedService/*-EntityMetadata.json',
        data_source = '(data source)',
        format = 'csv',
        fieldterminator ='0x0b',
        fieldquote = '0x0b',
        rowterminator = '0x0b' --> You need to override rowterminator to read classic JSON
    ) with (doc nvarchar(max)) as rows
       CROSS APPLY OPENJSON(doc, '$.AttributeMetadata')
       with (
         EntityName nvarchar(255) '$.EntityName'
        , AttributeName nvarchar(255) '$.AttributeName'
        , Timestamp nvarchar(50) '$.Timestamp'
        , AttributeType nvarchar(50) '$.AttributeType'
        , MetadataId nvarchar(100) '$.MetadataId'
        , Precision int '$.Precision'
        , MaxLength int '$.MaxLength'
     ) as DAT
WHERE DAT.EntityName IN ('lead', 'opportunity', 'product')
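
To see how the WITH clause maps the JSON properties to columns without touching the data lake, the same OPENJSON call can be run against an inline document. This is only a minimal sketch: the property names are the ones used in the query above, while the sample values and the reduced shape of the document are made up for illustration and will differ from a real EntityMetadata.json file.

```sql
-- minimal sketch: parsing an inline, made-up metadata document
DECLARE @doc nvarchar(max) = N'{
  "AttributeMetadata": [
    {"EntityName": "lead", "AttributeName": "subject", "AttributeType": "String", "Precision": null, "MaxLength": 300},
    {"EntityName": "lead", "AttributeName": "revenue", "AttributeType": "Money", "Precision": 2, "MaxLength": null}
  ]
}';

SELECT DAT.EntityName
, DAT.AttributeName
, DAT.AttributeType
, DAT.[Precision]
, DAT.MaxLength
FROM OPENJSON(@doc, '$.AttributeMetadata')
     WITH (
       EntityName nvarchar(255) '$.EntityName'
     , AttributeName nvarchar(255) '$.AttributeName'
     , AttributeType nvarchar(50) '$.AttributeType'
     , [Precision] int '$.Precision'
     , MaxLength int '$.MaxLength'
     ) AS DAT
WHERE DAT.EntityName = 'lead';
```

Each element of the AttributeMetadata array becomes one row, and properties that are missing or null in the JSON are returned as NULL.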

D365 FO
 

For Finance & Operations, the folder structure is more complex: a whole hierarchy of folders is built based on a set of predefined categories, and one would need to traverse the hierarchy to find the files. Thus, it might be easier to generate the metadata for each table separately. The files' names start with the table's name and have the .cdm.json suffix. The query below generates the metadata for the InventTable table. Just replace "(container name)", "(instance name)" and "(data source)" with the values for your environment.
 
-- export definition for D365 FO's CDM (all attributes, specific table)
SELECT DAT.name
, DAT.dataFormat
FROM openrowset(
        bulk '/(container name)/(instance name)/Tables/SupplyChain/ProductInformationManagement/Main/InventTable.cdm.json',
        data_source = '(data source)',
        format = 'csv',
        fieldterminator ='0x0b',
        fieldquote = '0x0b',
        rowterminator = '0x0b' --> You need to override rowterminator to read classic JSON
    ) with (doc nvarchar(max)) as rows
       CROSS APPLY OPENJSON(doc, '$.definitions[0].hasAttributes')
       with (
          name nvarchar(255) '$.name'
        , dataFormat nvarchar(50) '$.dataFormat'
     ) as DAT
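
Because the csv files have no headers, the position of each attribute matters as much as its name. Calling OPENJSON without a WITH clause returns the array index in the [key] column, which can serve as the ordinal position. The sketch below runs against an inline document; the attribute names and data formats are made up, and it assumes that the order of the attributes in the JSON file matches the order of the columns in the csv files.

```sql
-- minimal sketch: attribute positions from an inline, made-up document
DECLARE @doc nvarchar(max) = N'{
  "definitions": [
    {"hasAttributes": [
      {"name": "ITEMID", "dataFormat": "String"},
      {"name": "ITEMTYPE", "dataFormat": "Int32"},
      {"name": "NETWEIGHT", "dataFormat": "Decimal"}
    ]}
  ]
}';

SELECT CAST(ATR.[key] AS int) + 1 AS OrdinalPosition -- the array index is zero-based
, JSON_VALUE(ATR.[value], '$.name') AS name
, JSON_VALUE(ATR.[value], '$.dataFormat') AS dataFormat
FROM OPENJSON(@doc, '$.definitions[0].hasAttributes') AS ATR
ORDER BY OrdinalPosition;
```

The same SELECT list should work in the query above if the WITH clause after OPENJSON is dropped and the columns are read via JSON_VALUE.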

Further Steps

One can thus obtain the needed metadata. However, at a first inspection, the data types are too general compared with the ones defined for the attributes in the source system. One can probably work with these data types as well, though Microsoft recommends minimizing the row length by using the smallest possible column sizes, which leads to better query performance. Exporting the metadata from the source system and matching the two datasets on table and attribute names would allow addressing this recommendation, even if this implies checking from time to time whether the definitions have changed. The trick is to keep the same sort order as in the Synapse files. Unfortunately, not all SQL Server data types are supported (e.g. text, ntext, sql_variant) or ideal to work with (e.g. money), though there are alternative data types that can be used.
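
Once names, positions and tighter data types are known, they can be applied when reading the headerless files. The following is a hedged sketch rather than a tested definition: the path, the column names, their positions and their lengths are placeholders that need to be replaced with the values resulting from the matched metadata.

```sql
-- minimal sketch: reading headerless csv files with explicit names, types and positions
-- (path, column names, positions and lengths are placeholders)
SELECT TOP 10 DAT.ItemId
, DAT.ItemType
, DAT.NetWeight
FROM OPENROWSET(
        BULK '/(container name)/(path to the data files)/*.csv',
        DATA_SOURCE = '(data source)',
        FORMAT = 'csv',
        PARSER_VERSION = '2.0'
    ) WITH (
          ItemId nvarchar(20) 1       -- smallest length that fits the source definition
        , ItemType int 2
        , NetWeight decimal(32, 6) 3  -- decimal used instead of money
     ) AS DAT
```

The number after each data type is the column's ordinal position in the file, which is why the sort order of the attributes has to be preserved.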

