19 December 2023

💎SQL Reloaded: Blocked Products in Dynamics 365 F&O

Besides listing the products released by Legal entity (see previous post), it's useful to know whether they were blocked for Inventory, Sales or Procurement. This is quite easy when doing it over the data entity:

-- check status via the data entity
SELECT PSO.ItemNumber 
, PSO.DataAreaId 
, PSO.IsSalesProcessingStopped
, PSO.IsInventoryProcessingStopped 
, PSO.IsProcurementProcessingStopped
FROM dbo.InventProductSpecificOrderSettingsV3Entity PSO
ORDER BY PSO.ItemNumber 
, PSO.DataAreaId

However, when the data entity is not available, the logic gets a bit more complex because the data are stored in 3 different tables: InventItemInventSetup, InventItemPurchSetup and InventItemSalesSetup. Here's the piece of logic used to get the various statuses in AX 2009 (of course, without the JOINs on Partition), which works also in Dynamics 365 F&O:

/* Product Specific Order Settings via JOINs */
SELECT ITM.DataAreaId 
, ITM.ItemId 
, IPS.Stopped IsProcurementProcessingStopped
, ILS.Stopped IsInventoryProcessingStopped
, ISS.Stopped IsSalesProcessingStopped
FROM dbo.InventTable ITM
     LEFT JOIN dbo.InventItemPurchSetup IPS
       ON ITM.ItemID = IPS.ItemId
      AND ITM.DataAreaId = IPS.DataAreaId 
      AND ITM.Partition = IPS.Partition
      AND IPS.InventDimId = 'AllBlank'
     LEFT JOIN dbo.InventItemSalesSetup ISS
       ON ITM.ItemID = ISS.ItemId
      AND ITM.DataAreaId = ISS.DataAreaId 
      AND ITM.Partition = ISS.Partition
      AND ISS.InventDimId = 'AllBlank'
     LEFT JOIN dbo.InventItemInventSetup ILS
       ON ITM.ItemID = ILS.ItemId
      AND ITM.DataAreaId = ILS.DataAreaId 
      AND ITM.Partition = ILS.Partition
      AND ILS.InventDimId = 'AllBlank'
ORDER BY ITM.ItemId
,  ITM.DataAreaId 

The constraint on InventDimId is necessary because there can be multiple records for the Product and Legal entity combination. 
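
To see for which Product and Legal entity combinations several order settings records exist, a quick check like the one below can be run (shown here only for the Purchase setup; the same applies to the other two tables):

-- order settings records with duplicates by Product and Legal entity
SELECT IPS.DataAreaId 
, IPS.ItemId 
, count(*) NoRecords
FROM dbo.InventItemPurchSetup IPS
GROUP BY IPS.DataAreaId 
, IPS.ItemId 
HAVING count(*) > 1
ORDER BY IPS.ItemId 
, IPS.DataAreaId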

Alternatively, one can use UNIONs instead of JOINs and include the logic in a view to simplify the final query (the query was written to test the behavior in a distributed environment like the serverless SQL pool):

-- create the view
CREATE OR ALTER VIEW TDM.vProductSpecificOrderSettings
AS
/* Product Specific Order Settings via UNIONs */
SELECT IST.DataAreaId 
, IST.ItemId 
, IST.Partition 
, max(IST.IsInventoryProcessingStopped) IsInventoryProcessingStopped
, max(IST.IsSalesProcessingStopped) IsSalesProcessingStopped
, max(IST.IsProcurementProcessingStopped) IsProcurementProcessingStopped
FROM (
	-- inventory
	SELECT IIS.DataAreaId 
	, IIS.ItemId 
	, IIS.Partition
	, IIS.Stopped IsInventoryProcessingStopped
	, 0 IsSalesProcessingStopped
	, 0 IsProcurementProcessingStopped
	FROM dbo.InventItemInventSetup IIS
	WHERE IIS.InventDimId = 'AllBlank'
	UNION ALL
	-- purchasing
	SELECT IPS.DataAreaId 
	, IPS.ItemId 
	, IPS.Partition 
	, 0
	, 0
	, IPS.Stopped
	FROM dbo.InventItemPurchSetup IPS
	WHERE IPS.InventDimId = 'AllBlank'
	UNION ALL
	-- sales
	SELECT ISS.DataAreaId 
	, ISS.ItemId 
	, ISS.Partition 
	, 0
	, ISS.Stopped
	, 0
	FROM dbo.InventItemSalesSetup ISS
	WHERE ISS.InventDimId = 'AllBlank'
 ) IST
 GROUP BY IST.DataAreaId 
, IST.ItemId 
, IST.Partition 

/* Product Specific Order Settings via the UNION-based view */
SELECT ITM.DataAreaId 
, ITM.ItemId 
, PSO.IsInventoryProcessingStopped
, PSO.IsSalesProcessingStopped
, PSO.IsProcurementProcessingStopped
FROM dbo.InventTable ITM
     LEFT JOIN TDM.vProductSpecificOrderSettings PSO
	   ON ITM.ItemId = PSO.ItemId 
	  AND ITM.DataAreaId = PSO.DataAreaId 
	  AND ITM.Partition = PSO.Partition
ORDER BY ITM.ItemId 
, ITM.DataAreaId 

At least for the database used for testing on a SQL Server 2022 instance, the last query has a slightly lower estimated subtree cost and memory grant than the previous one. It's also interesting that the data entity-based query outperforms the other two queries.
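
The estimated subtree cost and memory grant can be read from the queries' execution plans, while the actual figures can be compared, for example, over the plan cache. Below is a small sketch for SQL Server; the filter relies on the column aliases used in the above queries:

-- comparing the recent executions of the above queries
SELECT TOP 10 qs.execution_count
, qs.total_worker_time/1000 total_cpu_ms
, qs.total_elapsed_time/1000 total_elapsed_ms
, qs.total_logical_reads
, Substring(st.text, 1, 100) query_start
FROM sys.dm_exec_query_stats qs
     CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) st
WHERE st.text LIKE '%ProcessingStopped%'
ORDER BY qs.last_execution_time DESC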

And here's the combined query from this and previous post:

-- Legal Entities vs Products incl. product order settings
SELECT DAT.Id DataAreaId 
, PRD.DisplayProductNumber ItemId 
, CASE WHEN ITM.DataAreaId IS NOT NULL THEN 1 ELSE 0 END IsReleased
, CASE WHEN ITM.CreatedDatetime <> ITM.ModifiedDateTime THEN 1 ELSE 0 END IsModified
, PSO.IsInventoryProcessingStopped
, PSO.IsSalesProcessingStopped
, PSO.IsProcurementProcessingStopped
FROM dbo.DataArea DAT
     CROSS JOIN dbo.EcoResProduct PRD
	 LEFT JOIN dbo.InventTable ITM
	   ON DAT.Id = ITM.DataAreaId 
	  AND PRD.DisplayProductNumber = ITM.ItemId
	  AND PRD.Partition = ITM.Partition
          LEFT JOIN TDM.vProductSpecificOrderSettings PSO
	        ON ITM.ItemId = PSO.ItemId 
	       AND ITM.DataAreaId = PSO.DataAreaId 
	       AND ITM.Partition = PSO.Partition
WHERE DAT.Id <> 'DAT'
   AND PRD.DisplayProductNumber = 'D0001'
ORDER BY PRD.DisplayProductNumber
, DAT.Id

Happy coding!

18 December 2023

💫Process Management: "Invoice Capture" Process Diagram in Dynamics 365 for Finance & Operations

Introduction

Invoice Capture is a Power Apps-based application available in Dynamics 365 CRM and deeply integrated with Dynamics 365 for Finance and Operations (D365 F&O, or simply F&O). The App is designed to provide end-to-end invoice automation for Vendor invoices (see the previous post for an overview). The below diagram attempts to depict the corresponding process, typically performed by the AP Clerk role; however, there are also steps that are "automated" and thus performed by the System, or that can be performed by both actors. In addition, there are optimization and setup activities that are performed by somebody with the Administrator role. 

Invoice Capture process diagram

Disclaimer:
(1) The process diagram focuses on the main flow of data and the main exceptions. Existing and new features in Invoice Capture might change the flow of data, making the whole diagram or parts of it obsolete. Please refer to the standard documentation [1].
(2) The diagram attempts to accommodate all Invoice types, however I haven't tested the Purchase Order based scenarios to the full extent.
(3) Once invoices for the same Vendor, Address and lines have been processed several times, the system should in time become capable of correctly processing and transferring such Invoices to F&O without the AP Clerk's involvement. 

Import/Upload Invoice

The Power App listens on the configured channels (e.g. Outlook, SharePoint, OneDrive), and the incoming documents (including signature images, receipt confirmations or reminders, and other attachments) are imported into the App and made available on the 'Received files' page. 

Alternatively, the invoices received on paper can be scanned and either imported manually into Invoice Capture or routed over the configured channels. 

Troubleshoot Invoice import

The Power Automate import flows might fail to import a file if any of the documented limitations applies, for example:

  • only JPG, JPEG, TIF, TIFF, PDF with a maximum of 20 MB are accepted;
  • in Outlook, the email landed in a folder other than the Inbox.

To troubleshoot this, one can try to resend the email or import the file manually into Invoice Capture.
Check the documentation for a complete list of the limitations!

Capture Invoice

Once imported, the files are processed by the OCR service and moved to the ‘Captured invoices’ page if their processing is successful. The extracted metadata is used to create the Invoice record (header, lines, charges, etc.).

Void document

Files that don’t qualify as Invoices (e.g. signature images, attachments) hang on the 'Received files' page and need to be voided by the AP Clerk on a regular basis. The voided files, together with duplicates of files already processed, are available in the ‘Captured invoices (Voided)’ view, from where they can be deleted permanently.

Troubleshoot Invoice processing

It can happen that valid files hang in ‘Received files’ with one of the following statuses:

  • 'Cancelled': Use 'Retry' to restart their processing.
  • 'Processing': Export/download the file manually and reprocess it, while voiding the previous entry.

PDF files created via non-standard tools can lead to incorrect metadata extraction. In other cases, special formatting might impede the metadata's extraction. Where applicable, one can donate the file to Microsoft and have the AI model trained by them. 

Create Master Data (D365 F&O)

When the Vendor is not available, it needs to be created in F&O, from where it is synchronized in almost real time to the App. At a minimum, the Vendor Number and Name need to be provided, though the Vendor's creation should follow the general policies available in the Organization. 

The same applies to new Products; however, the setup must be complete at least for the Legal entity in scope, otherwise the Invoice's processing in F&O might stop. See the "Create Product" process diagram. 

The setup for new Expense types (aka Procurement categories in F&O) must follow the standard setup. One can misuse an existing Expense type and do the correction in F&O, however this might lead to further issues if the correction isn't made.

Where applicable, other entities might be involved (e.g. Unit of Measure), however they should be considered as part of the general setup for Invoice Capture.

Classify Invoice 

Classifying the invoice involves identifying the Legal entity (LE), assigning the Vendor, respectively providing the Invoice type. If a similar Invoice for the same Vendor was previously processed, the values will be taken as default, though they can be modified accordingly.

A channel can be linked to a LE, and in this case the respective LE is provided as default value. Secondly, a mapping rule can be defined for the LE when a unique keyword is expected to be available in the Address. Otherwise, the AP Clerk can assign the LE manually. 
Please note that the way the permissions were set up in CRM can affect what the members of the AP Clerk role can see. E.g. users with restricted roles in CRM will see only the records associated with the LE they were assigned to. In such cases, somebody else must assign the LE to the records without one. 

Similarly, a mapping rule can be defined also for the Vendors when a keyword in Vendor's name is unique.

Currently, there are 3 types of Invoices: 'PO invoice', 'Header-only invoice', respectively 'Cost invoice'. The first two can be used only in combination with a PO, while a Cost invoice is used for expenses for which no PO exists. 

Start Review

Once the Invoice was correctly classified, the AP Clerk can start reviewing the Invoice record generated from the extracted metadata. 

Correct Invoice

The AP Clerk must check the recognized metadata from the Invoice header, lines and Charges against the invoice, correct or confirm the values, and complete the missing data, e.g. the financial dimensions, the Currency, the Item, the Expense type (aka Procurement category in F&O) and the Unit of Measure. 

Unfortunately, there are complex Invoice layouts that make it challenging for OCR to correctly extract the line-based information. It might be needed to add or remove Lines and Charges individually or completely.

Complete Review

Once the errors and warnings are handled, the AP Clerk can complete the review, however further errors may appear as part of the validation routine. 

Troubleshoot Invoice issues 

Besides the mandatory fields, several constraints apply:

  • The line total must match the header amount.
  • The Quantity and Net amount must have the same sign. Thus, if the amount must be negative, the Quantity must be also negative.
  • If no Invoice number is available, use a standard way of defining such values.
Several fields are already marked as mandatory. Consider extending the list with the attributes needed by the business. 

Transfer Invoice 

The Invoice can be transferred to F&O when no errors exist. F&O must be online for this step to be successful. 

Monitor Invoice Capture

Monitoring typically regards the whole process, though three points need special attention: (1) the import of Invoices into Invoice Capture; (2) the Invoices hanging in the 'Received files' area, respectively (3) the Invoices not transferred to F&O.

Ideally, there should be a set of reports and metrics on the CRM side to help in the process. 

Optimize Invoice Capture setup

There are several setup areas that can improve the user experience with Invoice Capture. The initial setup reflects one's knowledge about Invoice Capture at that time, however in time optimizations can be identified, especially concerning the mapping rules. 

Previous post <<||>> Next post

References:
[1] Microsoft Learn (2023) Invoice capture overview (link)

14 December 2023

💫ERP Systems: Microsoft Dynamics 365's Invoice Capture (Some Thoughts to Start with)

Enterprise Resource Planning

Introduction

It's almost the year's end and it's time to review what went well and not so well during the year. On the "successful projects" list I can put the Invoice Capture implementation. In a previous post I wrote a short review of what the feature is about.

I had the chance of configuring Invoice Capture for Cost invoices (invoices without Purchase orders) while it was still in public preview, and we went live soon after the feature became generally available. The implementation had its challenges, though in the end it was a positive experience, and I learned a lot from my colleagues, from Microsoft, and from other consultants and business users who embarked on the same journey. 

Where to Start?

Usually, it's a good idea to start with the documentation and the standard training material, which provides a good overview of what Invoice Capture is about, the steps needed for configuration, the processes involved, permissions, etc. 

You should also check the "Invoice Capture for Dynamics 365" group on Yammer (aka Viva Engage), because besides the latest version of the Implementation Guide, the Release notes and the associated training videos are published there, to which other users provide (a lot of) feedback and questions. Some information is available first in the group and only much later in the documentation. If you're facing an error or a challenge, most likely there's already a conversation in there with the answer you're looking for. Otherwise, you can start a thread and the others will try to help. At least until now, Microsoft has been quite active in helping.

Via FastTrack, Microsoft provided several sessions in Dec-2022 (preview) and Sep-2023 (GA) that can be used to get a good overview of the feature and its implementation. Frankly, I would start with the last session and then explore the other resources. In the process I found several other resources useful, mainly YouTube content - see Dan's Corner (link), DAFTD365 - and LinkedIn - see Hendrik M Larsen's posts.

You might want to also check the Release planner for Finance, to see what features are in the pipeline, respectively the Ideas for Dynamics to get an idea what kind of improvements others wish for. 

In parallel, one can start sketching the "AS IS" and "TO BE" processes, and eventually put together a business case for using Invoice Capture to digitize the processing of Vendor invoices. This isn't a simple Change request, therefore it makes sense to start a project, though its scope is relatively small. 

Bridging the Gap

One can look at the "TO BE" process based on the functionality provided, respectively planned by Microsoft for Invoice Capture, or look at the broader picture and sketch how an ideal digitized process should look. If the gap between the two pictures is big, then it might be a good idea to look at alternatives, which anyway should be done as part of the business case. There might be third-party tools out there (e.g. ExFlow) which provide similar functionality, however in the long term it makes sense to go with Microsoft, even if the full extent of the functionality might not be available. 

A review of other tools might be good - to understand how the ISVs approached the integration, what kind of features they provide, respectively whether the ideal digitized process makes sense. However, this will imply more effort.

The current version of Invoice Capture provides a good basis to build upon. One can use Power Apps or Power Automate to address some of the gaps, some gaps can be discussed with Microsoft, stressing their importance, while other gaps are maybe not that important and can be dismissed. One way or another, one must be ready to compromise, as long as this doesn't have an important impact on the business. 

The Project

The scope of the project might be relatively small, though one should follow the best practices of Project Management and make sure that all important stakeholders are involved, that the right resources are available when needed, manage the requirements adequately, assure that the changes are adequately tested, that the users are trained, the process documented, etc.

It's important to understand that the simple configuration of Invoice Capture will not be the end of the effort. As Microsoft releases further features directly and indirectly related to Invoice Capture, additional effort might be needed after the implementation goes live to address the gaps and opportunities, as well as the risks. Moreover, Invoice Capture involves a learning curve; addressing the lessons learned might require further changes in the system's setup as well as in the data's management. Therefore, further effort must be planned accordingly. 

Whether we talk about a full implementation or the implementation of a single feature, the overall success tends to depend more on how the implementation is approached than on the technology involved. 

Closing Thoughts

Some of the points made here can be applied to similar feature implementations. Overall, it's important to gather enough information to start the project and, in time, to reach the level of depth required by it. Don't expect things to be perfect; start small and evolve, prioritize, cover the gaps, optimize!

Previous post <<||>> Next post

12 December 2023

💎💫SQL Reloaded: Released Products by Legal Entity in Dynamics 365 F&O

To check in which Legal Entity (LE) a Product was released using the standard UI, the user needs to go through each LE and search for the respective Product, an action which can be time-consuming and error-prone if done for multiple Products. An alternative would be to use the Excel add-in or the Table Browser and export the data for each LE, then aggregate the data in one file and build a power pivot on top of it. If this kind of check needs to be done on a regular basis, then a better solution is to build a simple report.

When a Product is released for a Legal Entity (LE), a record is created in the InventTable table, which is accessible over the dbo.EcoResReleasedProductV2Entity data entity as well. One can build the logic as a GROUP BY with an entry for each LE in scope. 

-- Released Products by Legal Entity
SELECT ItemId
, Max(CASE WHEN DataAreaId ='cnmf' THEN 1 ELSE 0 END) [cnmf]
, Max(CASE WHEN DataAreaId ='demf' THEN 1 ELSE 0 END) [demf]
, Max(CASE WHEN DataAreaId ='inmf' THEN 1 ELSE 0 END) [inmf]
, Max(CASE WHEN DataAreaId ='jpmf' THEN 1 ELSE 0 END) [jpmf]
, Max(CASE WHEN DataAreaId ='mymf' THEN 1 ELSE 0 END) [mymf]
, Max(CASE WHEN DataAreaId ='rumf' THEN 1 ELSE 0 END) [rumf]
, Max(CASE WHEN DataAreaId ='samf' THEN 1 ELSE 0 END) [samf]
, Max(CASE WHEN DataAreaId ='thmf' THEN 1 ELSE 0 END) [thmf]
, Max(CASE WHEN DataAreaId ='usmf' THEN 1 ELSE 0 END) [usmf]
, count(*) NoRecords
FROM dbo.InventTable
WHERE DataAreaId <> 'DAT'
-- AND ItemId = 'D0001'
GROUP BY ItemId
ORDER BY NoRecords DESC
-- HAVING count(*)>1 -- 

Unfortunately, the query is static and needs to be adjusted each time an LE is added or removed. Unless the query is built and executed dynamically, there is apparently no way in SQL to include/exclude the LEs dynamically. However, one can use a cartesian product between LEs and Products, and retrieve the corresponding entry from the Released products table:

-- Legal Entities vs Products 
SELECT DAT.Id DataAreaId 
, PRD.DisplayProductNumber ItemId 
, CASE WHEN ITM.DataAreaId IS NOT NULL THEN 1 ELSE 0 END IsReleased
, CASE WHEN ITM.CreatedDatetime <> ITM.ModifiedDateTime THEN 1 ELSE 0 END IsModified
FROM dbo.DataArea DAT
     CROSS JOIN dbo.EcoResProduct PRD
	 LEFT JOIN dbo.InventTable ITM
	   ON DAT.Id = ITM.DataAreaId 
	  AND PRD.DisplayProductNumber = ITM.ItemId
	  AND PRD.Partition = ITM.Partition
WHERE DAT.Id <> 'DAT'
   AND PRD.DisplayProductNumber = 'D0001'
ORDER BY PRD.DisplayProductNumber
, DAT.Id 

Further on, the data can be aggregated in a matrix display in a paginated report, given that the respective functionality can dynamically display the LEs in scope. 
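
Coming back to the first query, the list of MAX(CASE ...) columns can also be generated dynamically over the available LEs. Below is a minimal, untested sketch of this alternative (it assumes SQL Server 2017+ for STRING_AGG):

-- Released Products by Legal Entity (dynamic version)
DECLARE @columns nvarchar(max), @sql nvarchar(max);

-- build one column per Legal entity
SELECT @columns = STRING_AGG(Cast(', Max(CASE WHEN DataAreaId =''' + Id + ''' THEN 1 ELSE 0 END) [' + Id + ']' as nvarchar(max)), '')
FROM dbo.DataArea
WHERE Id <> 'DAT';

-- build and execute the final statement
SET @sql = N'SELECT ItemId' + @columns + N'
, count(*) NoRecords
FROM dbo.InventTable
WHERE DataAreaId <> ''DAT''
GROUP BY ItemId
ORDER BY NoRecords DESC';

EXEC sp_executesql @sql;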

Notes:
1) The first query will run also in AX 2009, while the second will run only in D365 F&O.
2) Besides checking whether a Product was released (see IsReleased column) it's useful to check whether the Product was modified (see IsModified). If the Created and Modified timestamps are the same, then the Product was not touched after its release. 
3) The two queries can be extended to retrieve further Product-specific data (e.g. Unit of Measure, Base Prices, Default vendor, etc.).
4) It's safer to use the corresponding data entities instead of the tables (see the sketch after these notes).
5) It's also useful to check whether the Products are blocked for Sales, Purchasing or Inventory (see next post). 
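
For example, the second query could be based on the released products data entity instead of InventTable. A rough sketch follows; the ItemNumber column name is an assumption and should be checked against the actual entity:

-- Legal Entities vs Products via the data entity
SELECT DAT.Id DataAreaId 
, PRD.DisplayProductNumber ItemId 
, CASE WHEN REL.DataAreaId IS NOT NULL THEN 1 ELSE 0 END IsReleased
FROM dbo.DataArea DAT
     CROSS JOIN dbo.EcoResProduct PRD
     LEFT JOIN dbo.EcoResReleasedProductV2Entity REL
       ON DAT.Id = REL.DataAreaId 
      AND PRD.DisplayProductNumber = REL.ItemNumber
WHERE DAT.Id <> 'DAT'
   AND PRD.DisplayProductNumber = 'D0001'
ORDER BY PRD.DisplayProductNumber
, DAT.Id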

Happy coding!

10 December 2023

💫🧮☯ERP: Microsoft Dynamics 365's Invoice Capture (The Good, the Bad and the Ugly)

Enterprise Resource Planning
ERP Systems

At the last meeting of the Microsoft Dynamics Meetup Germany group, about 20 Microsoft experts were invited to present in two minutes their favorite new feature from the Microsoft ecosystem. None of them mentioned Invoice Capture though, which I think deserves its place on the list. 

Invoice Capture is a Power Apps-based application deeply integrated with Dynamics 365 which allows the semi-automatic processing of Vendor invoices received over the various channels (Outlook, SharePoint, OneDrive) or via manual upload. The Power App listens on the configured channels, imports the documents as they arrive and uses optical character recognition capabilities to extract the standard textual information needed to create a Vendor invoice record with header, lines and further information. In a first phase the AP Clerk classifies, reviews, corrects and transfers the Invoice to Dynamics 365, from where the Invoice follows the standard process, being enriched, posted to the Subledger and further booked to the General Ledger. Of course, several changes were made in Dynamics 365 as well, especially concerning the parametrization and the automatic processing of Invoices.

The Good: Invoice Capture thus attempts to provide end-to-end invoice automation and, with further changes to cover at least the most common scenarios, it will probably be able to do so. Since it was released as a minimal viable product (MVP), besides bug fixing several features were added - the search of Vendors and their automatic synchronization, the entry of Cost Center and Department Code financial dimensions upfront in the Power App, the support for multiple tax codes and for custom fields, just to mention the most important ones. However, more changes are needed to give customers more flexibility in automating the process and in handling other complex scenarios. 

Through automation and further features like the continuous learning from manual input and previous value retention, Invoice Capture decreases the volume of manual work and shortens the financial cycle time, making the overall process more efficient. Moreover, the invoices are available in Dynamics 365 almost as soon as they come in, allowing a better overview and thus better spend control. Features like Invoice approval via workflows and further automation capabilities can offer additional opportunities for improvement. Last but not least, Invoice Capture allows achieving a paperless AP, helping organizations' efforts on their road to digital transformation. 

The Bad: It's natural in Software Development to start with an MVP and build upon it, however the gap between the MVP and what customers need involves certain challenges when evaluating and implementing the feature(s). Some hiccups are inherent, as a piece of software needs time to stabilize and mature, however with better transparency and communication about the roadmap the respective process would have been a better experience. On the other side, Microsoft was quite helpful in the process, welcoming the feedback and integrating it in the plan, and in time even provided more transparency. However, there seem to be still many unknowns, especially concerning the integration with old and new features from the roadmap (e.g. e-Invoicing, recurring Vendor invoices).

The truth is that customers have different needs, their processes have degrees of complexity that may go way beyond the features provided by the MVP and subsequent versions. Some customers were happy with the MVP, some had to compromise while others maybe went for alternatives. 

The Ugly: It's time-consuming to evaluate and implement a new feature, to fill the gaps and find alternatives, especially when the organizational setup is not optimal. However, all these are normal challenges in the life of an ERP consultant.

Despite the current and maybe future challenges, Invoice Capture can become in time an important product on the Microsoft roadmap.

Previous Post <<||>> Next Post

08 December 2023

💎🏭SQL Reloaded: Microsoft Fabric (Part I: Monitoring the Warehouse)

This week I was exploring the Microsoft Fabric Warehouse and observed that there are three views available under the queryinsights schema: exec_requests_history, frequently_run_queries and long_running_queries. According to their definitions, they are based on two database objects, fabric_query_starting and fabric_query_completed, that cannot be called directly.

Announced in the Nov-2023 update, the Query Insights (QI) feature is a "scalable, sustainable, and extendable solution to enhance the SQL analytics experience" (see Microsoft's documentation).

Strangely, the three views appear in the Model view together with the objects defined in the dbo or other user-defined schemas. One can hide the queryinsights objects, however doing this in each model is impractical, especially as the number of objects defined in the respective schema will increase in time. On the other side, the respective objects might be useful in building a report for visualizing query performance. (Probably, a multi-model solution and/or further settings would allow more flexibility.)

Secondly, the objects the views are based on don't seem to be available in the sys.objects catalog view:

SELECT top 10 *
FROM sys.objects 
WHERE name LIKE 'fabric%'

The exec_requests_history view (comparable to the sys.dm_exec_requests DMV from standard SQL Server) exposes identifiers like session_id and connection_id that point to several DMVs, and it would be useful to retrieve the corresponding information within the same query:

 -- fabric warehouse
 SELECT erh.distributed_statement_id
, erh.start_time
, erh.end_time
, erh.total_elapsed_time_ms
--, erh.login_name
, erh.row_count
, erh.status
, erh.session_id
, erh.connection_id
, erh.program_name
, erh.batch_id
, erh.root_batch_id
, erh.query_hash
, erh.command 
FROM queryinsights.exec_requests_history erh
WHERE status = 'Succeeded'--'Failed'

However, attempting to retrieve session information via the sys.dm_exec_sessions DMV leads to the below error message:

SELECT *
FROM queryinsights.exec_requests_history erh
     LEFT JOIN sys.dm_exec_sessions ses
       ON erh.session_id = ses.session_id
WHERE erh.status = 'Succeeded'--'Failed'

The query references an object that is not supported in distributed processing mode.
Msg 15816, Level 16, State 7, Code line 11

Using the standard SQL Server system functions seems to work, as long as the view from the queryinsights schema is not considered.
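
For example, a couple of such functions can be called on their own, while combining them with the queryinsights view presumably raises the same error. Below is a small sketch, assuming DB_NAME() and USER_NAME() are among the supported functions:

-- standalone, standard system functions seem to work
SELECT DB_NAME() database_name
, USER_NAME() user_name;

-- combined with the queryinsights view, the same distributed processing error is presumably raised
SELECT erh.distributed_statement_id
, USER_NAME() user_name
FROM queryinsights.exec_requests_history erh;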

According to the documentation, "Some objects, like system views, and functions can't be used while you query data stored in Azure Data Lake or Azure Cosmos DB analytical storage. Avoid using the queries that join external data with system views, load external data in a temp table, or use some security or metadata functions to filter external data."

One can presume thus that fabric_query_starting and fabric_query_completed are stored in the Data Lake and behave like standard user-defined tables. Unfortunately, no documentation seems to be available on this.

I tried using a temporary table, as advised above:

-- dropping the temp table
--DROP TABLE IF EXISTS dbo.#requests_history;

-- create the temp table
CREATE TABLE dbo.#requests_history (
  distributed_statement_id uniqueidentifier
, start_time datetime2(6)
, end_time datetime2(6)
, total_elapsed_time_ms bigint
, login_name varchar(255)
, row_count bigint
, status varchar(50)
, session_id bigint
, connection_id uniqueidentifier
, program_name varchar(255)
, batch_id uniqueidentifier
, root_batch_id uniqueidentifier
, query_hash uniqueidentifier
, command varchar(max)
)

-- inserting a few records
INSERT INTO dbo.#requests_history
SELECT erh.distributed_statement_id
, erh.start_time
, erh.end_time
, erh.total_elapsed_time_ms
, erh.login_name
, erh.row_count
, erh.status
, erh.session_id
, erh.connection_id
, erh.program_name
, erh.batch_id
, erh.root_batch_id
, erh.query_hash
, erh.command 
FROM queryinsights.exec_requests_history erh;

-- retrieve the inserted records
SELECT *
FROM dbo.#requests_history;

Unfortunately, the attempt led to the same error message. Further investigating the issue, I arrived at a known issue: "Temp table usage in Data Warehouse and SQL analytics endpoint". Hopefully, the fix will address this scenario as well. Otherwise, it might be easier to import the data into another solution (e.g. Power BI) and do the analysis there.

Using temporary tables with DMVs seems to work (see post).

Please note that the values are case sensitive and only a subset of the standard data types are supported (see documentation).

You might want to check also the queries from the SQL Server System Catalog.

Happy coding!

21 October 2023

📊Graphical Representation: Overreaching in Data Visualizations

Graphical Representation
Graphical Representation Series 

One of the most important aspects to stress in the context of graphical design is the purpose of graphical representations and the medium in which they are communicated. For example, one needs to differentiate between the graphics propagated over the various media channels targeting public consumption and potential customers (books, newspapers, articles in paper or paperless form, respectively blog posts and similar content) and graphics made for organizational use (reports, dashboards or presentations).

If the former graphics are supposed to back up a story, the reader being led in one direction or another, the author having the freedom of choosing the direction and the message, in the latter, unless the content is supposed to support, persuade or force a decision, the facts and data need to be presented in an equidistant manner, in a form that supports insights, decision making or further inference. This applies to data professionals as well as to the business users preparing the data.

Data visualization authors tend to use the title and subtitle to highlight in reports and dashboards the most important findings as per their perception, sometimes even stating the obvious. One of the issues with this approach is that the audience might just pick up the respective information without further looking at the chart, maybe missing more important facts. Just highlighting an element in the graphic or providing explanatory headlines is not storytelling, even if it helps in the process. Ideally, the data itself, as depicted by the visuals, should tell the story! Further information with storytelling character should be provided in the presentation of the data and tailored accordingly for the audience!

With a few exceptions, the information and decisions shouldn't be forced on the audience. There are so many such examples on the various social networks in which data analysts or other types of data professionals seem to imply this in the content they share and this is so wrong on many levels!

No matter how deeply a data professional is involved in the business and no matter how extensive his/her knowledge about the systems, data and processes is, the business user and the manager are the closest to the business context and needs, while data professionals might not be aware of the full extent. This lack of context makes it challenging to interpret the trends depicted by the data, respectively to associate the changes observed in trends with the decisions made or the issues the business dealt with. When such knowledge is not available, the data professional tends to extrapolate instead of identifying the chain of causality together with the business (and here annotation capabilities would help considerably). 

Moreover, it falls on management's shoulders to decide which facts, data, metrics, KPIs and information are important for the organization. A data professional can make recommendations, can play with the data and communicate certain insights, gaps or courses of action, though the management decides what's important and how the respective information should be communicated! Overstepping the boundaries can easily lead to unnecessary conflict in which the data professional can easily lose, even if the facts are in his favor. It's enough to deal with missing or incorrect information for the whole story to fall apart. 

It's true that some of the books on graphical design use various highlighting techniques in the explanation process, but they are intended for the readers to understand what the authors want to say. Unfortunately, there are also examples used improperly, or the authors' opinions diverge from common sense. Independently of this, the data professional should develop their own visual critical thinking and validate the techniques used against their own judgement! 

Previous Post <<||>> Next Post

🧊💫Data Warehousing: Architecture (Part VI: Building a Data Lakehouse for Dynamics 365 Environments with Serverless SQL Pool)

Data Warehousing
Data Warehousing Series

One of the major limitations of Microsoft Dynamics 365 is the lack of direct access to the production databases for reporting purposes via standard reporting or ETL/ELT tools. Of course, one can attempt to use OData-based solutions though they don't scale with the data volume and imply further challenges. 

At the beginning, Microsoft attempted to address this limitation by allowing the export of data entities into the customer's Azure SQL database, a feature known as bring your own database (BYOD). Highly dependent on batch jobs, the feature doesn't support real-time synchronization or composite entities, and it is limited by the capacity of the BYOD database, the scalability becoming a bottleneck after a certain point.

Then Microsoft started to work on two solutions for synchronizing the Dynamics 365 data in near-real time (ca. 10-30 minutes) to the Data Lake: the Export to Data Lake add-in (*), respectively the Azure Synapse Link for Dataverse with Azure Data Lake. The former allows the synchronization of the tables from Finance & Operations (it doesn't work for CRM) to files that reflect the database model from the source. In exchange, the latter allows the synchronization of data entities to similar structures, and will probably support tables as well. Because the service works via Dataverse, it supports the synchronization of CRM data as well. 

The below diagram depicts the flow of data from the D365 environments to the Data Lake, the arrows indicating the direction of the flow. One arrow could also be drawn between Dynamics 365 Finance & Operations and the Azure Synapse Link for Dataverse service, though one may choose to use only the Export to Data Lake add-in, given that a data model based on the tables offers more flexibility (at the expense of more effort, though). Data from other systems can be exported via pipelines to the Data Lake to provide an integrated and complete view of the business. 

Serverless Data Lakehouse

Once the data is available in the Data Lake, it can be consumed directly by standard and Power BI paginated reports, however building the data model will involve considerable effort and logic duplication between reports. Therefore, it makes sense to prepare the data upfront in the Data Lake, when possible, and here the serverless SQL pool can help in building an enterprise data model. The former approach can still be used for rapid prototyping or data discovery. 

The serverless SQL pool is a stateless, SQL-based, distributed data processing query service over the Azure Data Lake for large-scale data and computational functions. Even if it doesn't support standard tables, it allows making the data from the Data Lake files available for processing via external tables, a mechanism that maps the files' structure to an entity that can be queried like a normal view (though it supports only read operations). 
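
As an illustration, here's a rough, untested sketch of how such an external table could be defined in the serverless SQL pool; the storage URL, the folder path and the (reduced) column list are hypothetical and must be adapted to the actual export:

-- external objects over the Data Lake files (names, URL and columns are hypothetical)
CREATE EXTERNAL DATA SOURCE D365DataLake
WITH (LOCATION = 'https://contosolake.dfs.core.windows.net/dynamics365');

CREATE EXTERNAL FILE FORMAT CsvFileFormat
WITH (FORMAT_TYPE = DELIMITEDTEXT
    , FORMAT_OPTIONS (FIELD_TERMINATOR = ',', STRING_DELIMITER = '"'));

CREATE EXTERNAL TABLE dbo.ExtInventTable (
  ItemId nvarchar(20)
, DataAreaId nvarchar(4)
, Partition bigint
)
WITH (LOCATION = '/Tables/InventTable/'
    , DATA_SOURCE = D365DataLake
    , FILE_FORMAT = CsvFileFormat);

-- the external table can then be queried like a view
SELECT TOP 10 *
FROM dbo.ExtInventTable;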

Further on, the enterprise data model can be built like in a normal Data Warehouse via the supported objects (views, stored procedures and table-valued functions). These objects can be called from standard and Power BI paginated reports, the queries being processed anew at runtime, which might occasionally result in poor performance. However, the architecture is supposed to scale automatically as needed.

If further performance is needed, parts of the logic or the end result can be exported to the Data Lake, and here the Medallion Architecture should be considered when appropriate. Where applicable, further processing might be needed to handle the limitations of the serverless SQL pool (e.g. flattening hierarchies, handling data quality issues).

One can work around the lack of standard table support, needed especially for value mappings, by storing the respective data as files, by occasionally misusing views, respectively by generating Spark tables via the Spark pool. 
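
For example, a small set of value mappings could be hardcoded in a view (a minimal sketch; the codes and descriptions below are made up):

-- misusing a view to store value mappings
CREATE VIEW dbo.vPaymentTermMapping
AS
SELECT Cast('N030' as nvarchar(10)) SourceCode, Cast('Net 30 days' as nvarchar(50)) TargetDescription
UNION ALL
SELECT 'N060', 'Net 60 days'
UNION ALL
SELECT 'N090', 'Net 90 days';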

Note:
(*) Existing customers have until 1-Nov-2024 to transition from Export to Data lake to Synapse link. Microsoft advises new customers to use Synapse Link.

Previous Post <<||>> Next Post

🧊Data Warehousing: Architecture V (Dynamics 365, the Data Lakehouse and the Medallion Architecture)

Data Warehousing
Data Warehousing Series

An IT architecture is built and functions under a set of constraints that derive from architecture’s components. Usually, if we want flexibility or to change something in one area, this might have an impact in another area. This rule applies to the usage of the medallion architecture as well! 

In Data Warehousing the medallion architecture considers a multilayered approach in building a single source of truth, each layer denoting the quality of the data stored in the lakehouse [1]. For the moment, 3 layers are defined - bronze for raw data, silver for validated data, and gold for enriched data. The concept seems sound considering that a Data Lake contains all types of raw data of different quality that need to be validated and prepared for reporting or other purposes.

On the other side there are systems like Dynamics 365 that synchronize the data in near-real-time to the Data Lake through various mechanisms at table and/or data entity level (think of data entities as views on top of other tables or views). The databases behind are relational and in theory the data should be of proper quality as needed by business.

The greatest benefit of the serverless SQL pool is that it can be used to build near-real-time data analytics solutions on top of the files existing in the Data Lake, and the mechanism is quite simple. On top of such files, external tables are built in the serverless SQL pool, tables that reflect the data model from the source systems. The external tables can be referenced like any other tables from the various database objects (views, stored procedures and table-valued functions). Thus, an enterprise data model with dimensions, fact-like and mart-like entities can be built on top of the synchronized files from the Data Lake. The Data Lakehouse (= Data Warehouse + Data Lake) thus created can be used for (enterprise) reporting and other purposes.

As long as there are no special requirements for data processing (e.g. flattening hierarchies, complex data processing, high-performance, data cleaning) this approach allows to report the data from the data sources in near-real time (10-30 minutes), which can prove to be useful for operational and tactical reporting. Tapping into this model via standard Power BI and paginated reports is quite easy. 

Now, if one is to use the medallion approach and rely on pipelines to process the data, unless the data can be processed in near-real time or something comparable to it, a considerable delay will be introduced, a delay that can span from a couple of hours to one day. It's also true that having the data prepared as needed by the reports can increase the performance considerably compared to processing the logic at runtime. There are advantages and disadvantages to both approaches. 

Probably, the most important scenario that needs to be handled is that of integrating the data from different sources. If unique mappings between values exist, unique references are available in one system to the records from the other system, respectively when a unique logic can be identified, the data integration can be handled in serverless SQL pool.

Unfortunately, when compared to on-premises or Azure SQL functionality, the serverless SQL pool has important constraints - it's not possible to use scalar UDFs, tables, recursive CTEs, etc. So, one needs to work around these limitations and in some cases use the Spark pool or pipelines. So, at least for exceptions and maybe for strategic reporting, a medallion architecture can make sense and be used in parallel. However, imposing it on all the data can reduce flexibility!

Bottom line: consider the architecture against your requirements!

Previous Post <<||>> Next Post

[1] What is the medallion lakehouse architecture?
https://learn.microsoft.com/en-us/azure/databricks/lakehouse/medallion

20 October 2023

💎SQL Reloaded: Extended LTrim/RTrim in SQL Server 2022 (Before and After)

In SQL Server 2022, the behavior of the LTrim (left trimming) and RTrim (right trimming) functions was extended with one more string parameter. When provided, the engine checks whether the first parameter starts (for LTrim), respectively ends (for RTrim) with the respective value and removes it, just as the space character char(32) was removed before:

-- prior behavior of LTrim/RTrim
DECLARE @text as nvarchar(50) = '  123  '
SELECT '(' + LTrim(@text) + ')' LeftTrimming
, '(' + RTrim(@text) + ')' RightTrimming
, '(' + Ltrim(RTrim(@text)) + ')' Trimming1 -- prior SQL Server 2017
, '(' + Trim(@text) + ')' Trimming2 -- starting with SQL 2017
LeftTrimming RightTrimming Trimming1 Trimming2
(123  )      (  123)       (123)     (123)

Here's the new behavior:

-- extended behavior of LTrim/LTrim (SQL Server 2022+)
DECLARE @text as nvarchar(50) = '123abc123abc'
SELECT LTrim(@text , '123') LeftTrimming
, RTrim(@text , 'abc') RightTrimming;
LeftTrimming RightTrimming
abc123abc 123abc123

Previously, to obtain the same result one could write something like:

-- prior solution via Left/Right for the same (SQL Server 2000+)
DECLARE @text as nvarchar(50) = '123abc123abc'
SELECT CASE WHEN Left(@text, 3) = '123' THEN Right(@text,Len(@text)-3) ELSE @text END LeftTrimming
, CASE WHEN Right(@text, 3) = 'abc' THEN Left(@text,Len(@text)-3) ELSE @text END  RightTrimming

-- prior solution via "LIKE" for the same (SQL Server 2000+)
DECLARE @text as nvarchar(50) = '123abc123abc'
SELECT CASE WHEN @text LIKE '123%' THEN Right(@text,Len(@text)-3) ELSE @text END LeftTrimming
, CASE WHEN @text LIKE '%abc' THEN Left(@text,Len(@text)-3) ELSE @text END  RightTrimming

As can be seen, the syntax is considerably simplified. However, there are only a few situations when it's needed. In the past I had to write code to remove parentheses, quotes or similar characters:

-- removing parentheses
DECLARE @text as nvarchar(50) = '(testing)'
SELECT LTrim(@text , '(') LeftTrimming
, RTrim(@text , ')') RightTrimming
, RTrim(LTrim(Trim(@text), '('), ')') Trimming 

-- removing double quotes
DECLARE @text as nvarchar(50) = '"testing"'
SELECT LTrim(@text , '"') LeftTrimming
, RTrim(@text , '"') RightTrimming
, RTrim(LTrim(Trim(@text), '"'), '"') Trimming 

The Trim for the 3rd value in both queries was used to remove any leading/trailing spaces before the characters to be trimmed:

-- removing parentheses with leading/trailing spaces
SELECT RTrim(LTrim(Trim('   (testing)   '), '('), ')');

Then I thought, maybe I could use the same to remove the tags from an XML element. I tried the following code and unfortunately it doesn't seem to work:

-- attempting to remove the start/end tags from xml elements
DECLARE @text as nvarchar(50) = '<string>testing</string>'
SELECT LTrim(@text , '<string>') LeftTrimming
, RTrim(@text , '</string>') RightTrimming
, RTrim(LTrim(Trim(@text), '<string>'), '</string>') Trimming
LeftTrimming RightTrimming Trimming
esting</string> <string>te e

That's quite an unpleasant surprise! The reason is that the second parameter is interpreted as a set of characters to be removed and not as a whole substring, so each character from '<string>', respectively '</string>', is trimmed individually. Instead, the value can be declared as XML and the following code used to obtain the needed result:
-- extracting the value from a tag element
DECLARE @text XML = '<string>testing</string>'
SELECT @text.query('data(/string)') as value
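
Alternatively, when the tags are known in advance, a Replace-based approach works as well (a simple sketch; it assumes the tag text doesn't also appear inside the value):

-- removing known tags via Replace
DECLARE @text as nvarchar(50) = '<string>testing</string>'
SELECT Replace(Replace(@text, '<string>', ''), '</string>', '') value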

Happy coding!
