SQL Troubles: enumerations

Showing posts with label enumerations. Show all posts

10 October 2023

💫Base Enums Metadata in Dynamics 365 for Finance and Operations

The list of values that don't have their own tables are managed within the application as (Base) Enums and, naturally, only the numeric values being saved to the database. Even if this is practical for the application, it's a nightmare for the people using the data exported from database as is needed to convert the codes to meaningful values.

Fortunately, there's a way to go around this limitation. First, there's the old obsolete AX 2012 documentation for Base Enums, which documents the Names and Descriptions for the Enum Values (e.g., search for CustVendNegInstStatus). As Microsoft stopped maintaining the documentation, the changes made in D365 are not reflected.

Secondly, the mapping between Values and Names is stored in two of the tables available in D365, the following query allowing to retrieve the mapping:

-- Metadata - Base Enum Values 
SELECT EIT.Id 
, EIT.Name EnumName
, EVT.EnumValue EnumValueId
, EVT.Name EnumValueName 
, ' WHEN ' + Cast(EVT.EnumValue as nvarchar(10)) + ' THEN ''' + EVT.Name + '''' [CASE Statement]
FROM dbo.EnumValueTable EVT
     JOIN dbo.EnumIdTable EIT
       ON EVT.EnumId = EIT.Id
WHERE EIT.Name LIKE 'CustVendNegInstStatus%'
--WHERE EVT.Name LIKE '%Redrawn%'
ORDER BY EIT.Name 
, EVT.EnumValue

The [Case Statement] column generates already the statement for the CASE. One can search for the Base Enum, respectively for the value itself.

Unfortunately, the metadata are not complete and, as there's no timestamp, it's not possible to identify what and when changed. Therefore, one needs to regularly check the changes manually, or store a copy of the respective data as baseline, thus being able to compare the latest metadata against the baseline. The solution is not ideal, but it does the trick.

To identify the Base Enum corresponding to a field one can export the script behind a form when the functionality is available. Otherwise, one can use the Table Browser and generate the script from the corresponding source table.

Occasionally, the value stored in the above tables may deviate slightly from the expectation. It's useful then to search only for a substring, use both sources and even search on the internet for more information.

When building reports or a semantic layer on top of D365 data model, for almost all important tables is needed to retrieve such mappings. It's a good idea to encapsulate the mappings in single "base views" and reuse it when needed, avoiding thus to replicate the mappings for each use of the source table (see the logic for TDM.vEcoResProduct).

The data generated by the above script could be dumped to a table and thus the table joined in the queries when the values for a Base Enum are needed, though I don't recommend such an approach because it would increase the complexity of the queries, already complex in certain scenarios. Also the misuse of views for creating pseudoconstants is not recommended.

Happy coding!

Previous Post <<||>> Next Post

💫Data Warehousing and Dynamics 365 for Finance and Operation - A Few Issues to Consider I

Data Warehousing Series

Introduction

Besides the fact that data professionals don't have direct access to D365 F&O production environments (direct access is available only to sandboxes), which was from the beginning an important constraint imposed by the architecture, there are a few more challenges that need to be addressed when working with the data.

Case Sensitiveness

SQL Server is not case sensitive, therefore, depending on the channel though which the data came, values appear either in upper or lower case, respectively a mixture of both. Even if this isn't an issue in D365, it can become an issue when the data leave the environment. E.g., PowerQuery is case sensitive (while DAX is case insensitive), thus, if a field containing a mix of values participate in a join or aggregation, this will result in unexpected behavior (e.g., duplicates, records ignored). It's primarily the case of the Company (aka DataAreaId) field available in most of the important tables.

The ideal solution would be to make sure that the values are correct by design, however this can't be always enforced. Otherwise, when using the data outside of D365 F&O the solution would be to transform all the values in upper case (or lower case). However, also this step might occur too late. E.g., when the data are exported to the Azure Data Lake in parquet file format.

Unique Keys

A unique record in D365 F&O was in earlier versions usually identified by the RecId and DataAreaId, while later the Partition field was added. This means that most of the joins will need to consider all 3 columns, which adds some overhead. In some environments there's only a Partition defined (and thus the field can be ignored), however this is not a warranty.

As long developers use SQL there's no issue of using multiple fields in JOINs, though in PowerQuery there must be created a unique key based on the respective records so the JOINs are possible. Actually, also SQL-based JOINs would benefit if each record would be identified by one field.

Audit Metadata

Not all tables have fields that designate the date when a record was created or last modified, respectively the user who performed the respective action. The fields can be added manually when setting up the system, however that's seldom done. This makes it difficult to audit the records and sometimes it's a challenge also for reporting, respectively for troubleshooting the differences between DWH and source system. Fortunately, the Export to Data Lake adds a timestamp reflecting the time when the record was synchronized, though it can be used then only for the records synchronized after the first load.

Tables vs. Entities

Data are modified in D365 F&O via a collection of entities, which are nothing but views that encapsulate the business logic, being based on the base tables or other views, respectively a combination of both. The Export to Data Lake (*) is based on the tables, while Link to Data Lake is based on data entities.

Using the base tables means that the developer must reengineer the logic from the views. For some cases it might work to create the entities as views in the DWH environment though some features might not be supported. It's the case of serverless and dedicated SQL pools, that support only a subset from the features available under standard Azure SQL Server.

The developer can try to replicate the logic from entities, considering only the logic needed by the business, especially when only a subset from the functionality available in the entity was used. The newly created views can become thus more readable and maintainable. On the other side, if the logic in entity changed, the changes need to be reflected also in the DWH views.

Using the entity-based data makes sure that the data are consistent between environments. Unfortunately, Microsoft found out that isn't so easy to synchronize the data at entity level. Moreover, there are multiple entities based on the same table that reflect only a subset of the columns or rows. Thus, to cover all the fields from a base table, one might be forced to synchronize multiple views, leading thus to data duplication.

In theory, both mechanisms can be used within the same environment, even if this approach is against the unique source of truth principle, when data are duplicated.

Data Validation in the Data Lake

One scenario in which both sources are useful is when validating whether the synchronization mechanism worked as expected. Thus, one can compare the number of records and check whether there are differences that can't be mitigated. However, does it make sense to "duplicate" database objects only for this purpose?

Ideally, to validate whether a record was synchronized should be done in the source environment (e.g. via a timestamp). That's difficult to achieve, especially when there's no direct access to the source database (as is the case for Production databases). Fortunately, Dataverse provides this functionality, even if might not be bullet proof.

In extremis, the most reliable approach is to copy the production environment on a sandbox and do a count of records for each table, using as baseline for comparison the time when the refresh occurred.

Base Enum Values

The list of values that don't have their own tables are managed within the application as Base Enums and, naturally, only the numeric values being saved to the database. Even if this is practical for the application, it's a nightmare for the people using the data exported from database as is needed to convert the codes to meaningful values. Some of the mappings between the codes and values are documented in two system tables, and even in old language-based documentation, though both sources are far from complete. As alternative, one can try to discover the values in the system.

Unfortunately, the mappings need to be repeated when the Enum-based attributed is used in multiple places. One can reduce mapping's duplication by encapsulating the logic into a view (aka "base view") and reused accordingly (see the logic for TDM.vEcoResProduct).

Even if the values for many of the Enums are stored into the EnumValueTable table, Enum's name being available in EnumIdTable table, it's not a good idea to retrieve the values via a JOIN in the business logic. This would complicate the business logic unnecessarily. A CASE is more efficient even if occasionally more difficult to maintain. Unfortunately, there's no timestamp to identify which values were added lately.

Note:
(*) Existing customers have until 1-Nov-2024 to transition from Export to Data lake to Synapse link. Microsoft advises new customers to use Synapse Link.

Previous Post <<||>> Next Post

13 January 2021

💎💫SQL Reloaded: Useful Queries for Dynamics 365 F&O and not only (Part I: Dependencies)

Microsoft Dynamics 365 for Finance and Operations (D365 FO or simply D365) has currently more than 12000 tables, therefore attempting to understand the data model behind might seem at first sight to be a foolhardy effort. Fortunately, only a small percentage from all the tables contain meaningful master or transactional data, fact that makes easier the task.

Moreover, the tables have meaningful prefixes (e.g. 'Invent' for Inventory, 'Purch' for Purchae Orders, 'Sales' for Sales Orders, etc.) and postfixes (e.g. 'Table for header data, 'Line' for position data, 'Trans' for transaction data) while their names are in many cases self-explanatory if one knows the corresponding business terms.

For those coming from earlier versions of D365 (e.g. Microsoft Dynamics AX 2009), with a few exceptions in the inventory transactions, product master data, addresses and finance dimensions, the data structures remained almost the same, if one considers the main functional area of the ERP system. However as new modules and/or functionality were added, the number of new tables increased considerably between the versions.

On the other side in D365 the data are exposed through data entities, abstractions from the physical implementation of the database tables [1], implemented as views typically having the 'Entity' postfix. In many scenarios the data entities more likely will prove to be the best way of accessing the data, as they include an important part of the logic one will need anyway to reengineer, however there are maybe also exceptions in which is needed to work with the base tables. In the end it will be needed to know the main data entities as well the important tables related to them, being needed to check which views reference a given table, respectively which tables are references by a view. This can be easily achieved via the sys.sql_expression_dependencies system catalog view.

The following queries can be run in non-production D365 environments:

-- retrieving the views where an object was used 
SELECT SCHEMA_NAME(obj.SCHEMA_ID) + '.'+ obj.name AS ReferencingObject
, obj.type_desc
, SCHEMA_NAME(obj.SCHEMA_ID) + '.'+ sed.referenced_entity_name AS ReferencedObject
, obr.type_desc ReferencedObjectType
FROM sys.all_objects obj
     JOIN sys.sql_expression_dependencies sed 
       ON obj.OBJECT_ID = sed.REFERENCING_ID 
	      JOIN sys.all_objects obr
		    ON sed.referenced_id = obr.object_id
WHERE sed.referenced_entity_name = 'PurchTable'
  AND obj.Type = 'V'
  AND obj.name LIKE 'Purch%'
ORDER BY ReferencingObject

Output (12 records):

ReferencingObject	type_desc	ReferencedObject	ReferencedObjectType
dbo.PURCHASEORDERCUBE	VIEW	dbo.PURCHTABLE	USER_TABLE
dbo.PURCHASEORDERRESPONSEHEADERSUMMARY	VIEW	dbo.PURCHTABLE	USER_TABLE
dbo.PURCHLINEBACKORDER	VIEW	dbo.PURCHTABLE	USER_TABLE
dbo.PURCHLINEBACKORDERWORKSPACE	VIEW	dbo.PURCHTABLE	USER_TABLE
dbo.PURCHLINEEXPANDED	VIEW	dbo.PURCHTABLE	USER_TABLE
dbo.PURCHORDERAPPROVEDVIEW	VIEW	dbo.PURCHTABLE	USER_TABLE
dbo.PURCHPREPAYOPENBASE	VIEW	dbo.PURCHTABLE	USER_TABLE
dbo.PURCHPURCHASEORDERHEADERCHARGEENTITY	VIEW	dbo.PURCHTABLE	USER_TABLE
dbo.PURCHPURCHASEORDERHEADERENTITY	VIEW	dbo.PURCHTABLE	USER_TABLE
dbo.PURCHPURCHASEORDERHEADERV2ENTITY	VIEW	dbo.PURCHTABLE	USER_TABLE
dbo.PURCHPURCHASEORDERLINECDSENTITY	VIEW	dbo.PURCHTABLE	USER_TABLE
dbo.PURCHTABLENOTARCHIVEDVERSIONS	VIEW	dbo.PURCHTABLE	USER_TABLE

The inverse search:

-- retrieving the objects referenced in a view
SELECT SCHEMA_NAME(obj.SCHEMA_ID) + '.'+ obj.name AS ReferencingObject
, obj.type_desc ReferncingObjectType
, SCHEMA_NAME(obj.SCHEMA_ID) + '.'+ sed.referenced_entity_name AS ReferencedObject
, obr.type_desc ReferencedObjectType
FROM sys.all_objects obj
     JOIN sys.sql_expression_dependencies sed 
       ON obj.OBJECT_ID = sed.referencing_id 
	      JOIN sys.all_objects obr
		    ON sed.referenced_id = obr.object_id
WHERE obj.name = 'PURCHPURCHASEORDERHEADERV2ENTITY'
ORDER BY ReferencedObject

Output:

ReferencingObject	ReferncingObjectType	ReferencedObject	ReferencedObjectType
dbo.PURCHPURCHASEORDERHEADERV2ENTITY	VIEW	dbo.ACCOUNTINGDISTRIBUTIONTEMPLATE	USER_TABLE
dbo.PURCHPURCHASEORDERHEADERV2ENTITY	VIEW	dbo.CONFIRMINGPO	USER_TABLE
dbo.PURCHPURCHASEORDERHEADERV2ENTITY	VIEW	dbo.DIMENSIONSETENTITY	VIEW
dbo.PURCHPURCHASEORDERHEADERV2ENTITY	VIEW	dbo.GetValidFromInContextInfo	SQL_SCALAR_FUNCTION
dbo.PURCHPURCHASEORDERHEADERV2ENTITY	VIEW	dbo.GetValidToInContextInfo	SQL_SCALAR_FUNCTION
dbo.PURCHPURCHASEORDERHEADERV2ENTITY	VIEW	dbo.HCMWORKER	USER_TABLE
dbo.PURCHPURCHASEORDERHEADERV2ENTITY	VIEW	dbo.LOGISTICSPOSTALADDRESS	USER_TABLE
dbo.PURCHPURCHASEORDERHEADERV2ENTITY	VIEW	dbo.LOGISTICSPOSTALADDRESSBASEENTITY	VIEW
dbo.PURCHPURCHASEORDERHEADERV2ENTITY	VIEW	dbo.PURCHTABLE	USER_TABLE
dbo.PURCHPURCHASEORDERHEADERV2ENTITY	VIEW	dbo.PURCHTABLE_W	USER_TABLE
dbo.PURCHPURCHASEORDERHEADERV2ENTITY	VIEW	dbo.REASONTABLEREF	USER_TABLE
dbo.PURCHPURCHASEORDERHEADERV2ENTITY	VIEW	dbo.TMSPURCHTABLE	USER_TABLE
dbo.PURCHPURCHASEORDERHEADERV2ENTITY	VIEW	dbo.TRANSPORTATIONDOCUMENT	USER_TABLE
dbo.PURCHPURCHASEORDERHEADERV2ENTITY	VIEW	dbo.VENDINVOICEDECLARATION_IS	USER_TABLE

Notes:

1. There will be cases in which is needed to check views' definition. To understand the logic it will be needed to format the queries in a more readable form. This can take time, given that the joins and the further constraints are not always straightforward. Therefore, it might be a good idea to export a formatted copy of the view for later use.

2. Unfortunately, the data model for D365 is not publicly available. One can use however the data model for AX 2012 (see [2] [3]) as basis and fill the gaps when needed.

3. The important changes between the various data models versions are partially documented in several documents available via the Microsoft Dynamics CustomerSource.

4. For non-production environments the data from views, tables or other database objects can be exported directly from the database. Unfortunately, Microsoft blocked the direct access to production environments. The data from tables can be accessed and exported via the table browser, while the data from entities can be exported to Excel typically from the display forms or via the Data Management Framework.

5. The queries can be used to explore any SQL Server database. The logic can be easily encapsulated in a view to simplify the calls.

Previous Post <<||>> Next Post

References:
[1] Microsoft Dynamics 365 (2020) Data entities overview [source]

[2] Microsoft Docs (2014) Application tables [source]

[3] Microsoft Docs (2014) Base Enums AX 2012 [source]

24 February 2018

💎SQL Reloaded: Misusing Views and Pseudo-Constants

Views as virtual tables can be misused to replace tables in certain circumstances, either by storing values within one or multiple rows, like in the below examples:

-- parameters for a BI solution
CREATE VIEW dbo.vLoV_Parameters
AS
SELECT Cast('ABC' as nvarchar(20)) AS DataAreaId
 , Cast(GetDate() as Date) AS CurrentDate 
 , Cast(100 as int) AS BatchCount 

GO

SELECT *
FROM dbo.vLoV_Parameters

GO

-- values for a dropdown 
CREATE VIEW dbo.vLoV_DataAreas
AS
SELECT Cast('ABC' as nvarchar(20)) AS DataAreaId
, Cast('Company ABC' as nvarchar(50)) AS Description 
UNION ALL
SELECT 'XYZ' DataAreaId 
, 'Company XYZ'

GO

SELECT *
FROM dbo.vLoV_DataAreas

GO

These solutions aren’t elegant, and typically not recommended because they go against one of the principles of good database design, namely “data belong in tables”, though they do the trick when needed. Personally, I used them only in a handful of cases, e.g. when it wasn’t allowed to create tables, when it was needed testing something for a short period of time, or when there was some overhead of creating a table for 2-3 values. Because of their scarce use, I haven’t given them too much thought, not until I discovered Jared Ko’s blog posting on pseudo-constants. He considers the values from the first view as pseudo-constants, and advocates for their use especially for easier dependency tracking, easier code refactoring, avoiding implicit data conversion and easier maintenance of values.

All these are good reasons to consider them, therefore I tried to take further the idea to see if it survives a reality check. For this I took Dynamics AX as testing environment, as it makes extensive use of enumerations (aka base enums) to store list of values needed allover through the application. Behind each table there are one or more enumerations, the tables storing master data abounding of them. For exemplification let’s consider InventTrans, table that stores the inventory transactions, the logic that governs the receipt and issued transactions are governed by three enumerations: StatusIssue, StatusReceipt and Direction.

-- Status Issue Enumeration 
 CREATE VIEW dbo.vLoV_StatusIssue
 AS
 SELECT cast(0 as int) AS None
 , cast(1 as int) AS Sold
 , cast(2 as int) AS Deducted
 , cast(3 as int) AS Picked
 , cast(4 as int) AS ReservPhysical
 , cast(5 as int) AS ReservOrdered
 , cast(6 as int) AS OnOrder
 , cast(7 as int) AS QuotationIssue

GO

-- Status Receipt Enumeration 
 CREATE VIEW dbo.vLoV_StatusReceipt
 AS
SELECT cast(0 as int) AS None
 , cast(1 as int) AS Purchased
 , cast(2 as int) AS Received
 , cast(3 as int) AS Registered
 , cast(4 as int) AS Arrived
 , cast(5 as int) AS Ordered
 , cast(6 as int) AS QuotationReceipt

GO

-- Inventory Direction Enumeration 
 CREATE VIEW dbo.vLoV_InventDirection
 AS
 SELECT cast(0 as int) AS None
 , cast(1 as int) AS Receipt
 , cast(2 as int) AS Issue

To see these views at work let’s construct the InventTrans table on the fly:

-- creating an ad-hoc table  
SELECT *
INTO  dbo.InventTrans
FROM (VALUES (1, 1, 0, 2, -1, 'A0001')
, (2, 1, 0, 2, -10, 'A0002')
, (3, 2, 0, 2, -6, 'A0001')
, (4, 2, 0, 2, -3, 'A0002')
, (5, 3, 0, 2, -2, 'A0001')
, (6, 1, 0, 1, 1, 'A0001')
, (7, 0, 1, 1, 50, 'A0001')
, (8, 0, 2, 1, 100, 'A0002')
, (9, 0, 3, 1, 30, 'A0003')
, (10, 0, 3, 1, 20, 'A0004')
, (11, 0, 1, 2, 10, 'A0001')
) A(TransId, StatusIssue, StatusReceipt, Direction, Qty, ItemId)

Here are two sets of examples using literals vs. pseudo-constants:

--example issued with literals 
SELECT top 100 ITR.*
FROM dbo.InventTrans ITR
WHERE ITR.StatusIssue = 1 
  AND ITR.Direction = 2

GO
--example issued with pseudo-constants
SELECT top 100 ITR.*
FROM dbo.InventTrans ITR
      JOIN dbo.vLoV_StatusIssue SI
        ON ITR.StatusIssue = SI.Sold
      JOIN dbo.vLoV_InventDirection ID
        ON ITR.Direction = ID.Issue

GO

--example receipt with literals 
SELECT top 100 ITR.*
FROM dbo.InventTrans ITR
WHERE ITR.StatusReceipt= 1
   AND ITR.Direction = 1

GO

--example receipt with pseudo-constants
SELECT top 100 ITR.*
FROM dbo.InventTrans ITR
      JOIN dbo.vLoV_StatusReceipt SR
        ON ITR.StatusReceipt= SR.Purchased
      JOIN dbo.vLoV_InventDirection ID
        ON ITR.Direction = ID.Receipt

As can be seen the queries using pseudo-constants make the code somehow readable, though the gain is only relative, each enumeration implying an additional join. In addition, when further business tables are added to the logic (e.g. items, purchases or sales orders) it complicates the logic, making it more difficult to separate the essential from nonessential. Imagine a translation of the following query:

-- complex query 
SELECT top 100 ITR.*
FROM dbo.InventTrans ITR
              <several tables here>
WHERE ((ITR.StatusReceipt<=3 AND ITR.Direction = 1)
  OR (ITR.StatusIssue<=3 AND ITR.Direction = 2))
  AND (<more constraints here>)

The more difficult the constraints in the WHERE clause, the more improbable is a translation of the literals into pseudo-constraints. Considering that an average query contains 5-10 tables, each of them with 1-3 enumerations, the queries would become impracticable by using pseudo-constants and quite difficult to troubleshoot their execution plans.

The more I’m thinking about, an enumeration data type as global variable in SQL Server (like the ones available in VB) would be more than welcome, especially because values are used over and over again through the queries. Imagine, for example, the possibility of writing code as follows:

-- hypothetical query
SELECT top 100 ITR.*
FROM dbo.InventTrans ITR
WHERE ITR.StatusReceipt = @@StatusReceipt .Purchased
  AND ITR.Direction = @@InventDirection.Receipt

From my point of view this would make the code more readable and easier to maintain. Instead, in order to make the code more readable, one’s usually forced to add some comments in the code. This works as well, though the code can become full of comments.

-- query with commented literals
SELECT top 100 ITR.*
FROM dbo.InventTrans ITR
WHERE ITR.StatusReceipt <=3 -- Purchased, Received, Registered 
   AND ITR.Direction = 1-- Receip

In conclusion, pseudo-constants’ usefulness is only limited, and their usage is against developers’ common sense, however a data type in SQL Server with similar functionality would make code more readable and easier to maintain.

Notes:
1) It is possible to simulate an enumeration data type in tables’ definition by using a CHECK constraint.
2) The queries work also in SQL databases in Microsoft Fabric (see file in GitHub repository). You might want to use another schema (e.g. Test), not to interfere with the existing code.

Happy coding!

SQL Troubles

Pages

10 October 2023

💫Base Enums Metadata in Dynamics 365 for Finance and Operations

💫Data Warehousing and Dynamics 365 for Finance and Operation - A Few Issues to Consider I

13 January 2021

💎💫SQL Reloaded: Useful Queries for Dynamics 365 F&O and not only (Part I: Dependencies)

24 February 2018

💎SQL Reloaded: Misusing Views and Pseudo-Constants

About Me