SQL Troubles: October 2022

30 October 2022

💎SQL Reloaded: The WINDOW Clause in SQL Server 2022 (Part III: Ranking) 🆕

In two previous posts I shown how to use the newly introduced WINDOW clause in SQL Server 2022 for simple aggregations, respectively running totals, by providing some historical context concerning what it took to do the same simple aggregations as SUM or AVG within previous versions of SQL Server. Let's look at another scenario based on the previously created Sales.vSalesOrders view - ranking records within a partition.

There are 4 ranking functions that work across partitions: Row_Number, Rank, Dense_Rank and NTile. However, in SQL Server 2000 only Row_Number could be easily implemented, and this only if there is a unique identifier (or one needed to create one on the fly):

-- ranking based on correlated subquery (SQL Server 2000+)
SELECT SOL.SalesOrderId 
, SOL.ProductId
, SOL.OrderDate
, SOL.[Year]
, SOL.[Month]
, SOL.OrderQty
, (-- correlated subquery
  SELECT count(SRT.SalesOrderId)
  FROM Sales.vSalesOrders SRT
  WHERE SRT.ProductId = SOL.ProductId 
    AND SRT.[Year] = SOL.[Year]
	AND SRT.[Month] = SOL.[Month]
    AND SRT.SalesOrderId <= SOL.SalesOrderId
   ) RowNumberByDate
FROM Sales.vSalesOrders SOL
WHERE SOL.ProductId IN (745)
  AND SOL.[Year] = 2012
  AND SOL.[Month] BETWEEN 1 AND 3
ORDER BY SOL.[Year]
, SOL.[Month]
, SOL.OrderDate ASC

As alternative for implementing the other ranking functions, one could use procedural language for looping, though this approach was not recommendable given the performance concerns.

SQL Server 2005 introduced all 4 ranking functions, as they are in use also today:

-- ranking functions (SQL Server 2005+)
SELECT SOL.SalesOrderId 
, SOL.ProductId
, SOL.OrderDate
, SOL.[Year]
, SOL.[Month]
, SOL.OrderQty
-- rankings
, Row_Number() OVER (PARTITION BY SOL.ProductId, SOL.[Year], SOL.[Month] ORDER BY SOL.OrderQty DESC) RowNumberQty
, Rank() OVER (PARTITION BY SOL.ProductId, SOL.[Year], SOL.[Month] ORDER BY SOL.OrderQty DESC) AS RankQty
, Dense_Rank() OVER (PARTITION BY SOL.ProductId, SOL.[Year], SOL.[Month] ORDER BY SOL.OrderQty DESC) AS DenseRankQty
, NTile(4) OVER (PARTITION BY SOL.ProductId, SOL.[Year], SOL.[Month] ORDER BY SOL.OrderQty DESC) AS NTileQty
FROM Sales.vSalesOrders SOL
WHERE SOL.ProductId IN (745)
  AND SOL.[Year] = 2012
  AND SOL.[Month] BETWEEN 1 AND 3
ORDER BY SOL.[Year]
, SOL.[Month]
, SOL.OrderQty DESC

Now, in SQL Server 2022 the WINDOW clause allows simplifying the query as follows by defining the partition only once:

-- ranking functions (SQL Server 2022+)
SELECT SOL.SalesOrderId 
, SOL.ProductId
, SOL.OrderDate
, SOL.[Year]
, SOL.[Month]
, SOL.OrderQty
-- rankings

, Row_Number() OVER SalesByMonth AS RowNumberQty
, Rank() OVER SalesByMonth AS RankQty
, Dense_Rank() OVER SalesByMonth AS DenseRankQty
, NTile(4) OVER SalesByMonth AS NTileQty
FROM Sales.vSalesOrders SOL
WHERE SOL.ProductId IN (745)
  AND SOL.[Year] = 2012
  AND SOL.[Month] BETWEEN 1 AND 3
WINDOW SalesByMonth AS (PARTITION BY SOL.ProductId, SOL.[Year], SOL.[Month] ORDER BY SOL.OrderQty DESC)
ORDER BY SOL.[Year]
, SOL.[Month]
, SOL.OrderQty DESC

Forward (and backward) referencing of one window into the other can be used with ranking functions as well:

-- ranking functions with ascending/descending sorting (SQL Server 2022+)
SELECT SOL.SalesOrderId 
, SOL.ProductId
, SOL.OrderDate
, SOL.[Year]
, SOL.[Month]
, SOL.OrderQty
-- rankings (descending)
, Row_Number() OVER SalesByMonthSortedDESC AS DescRowNumberQty
, Rank() OVER SalesByMonthSortedDESC AS DescRankQty
, Dense_Rank() OVER SalesByMonthSortedDESC AS DescDenseRankQty
, NTile(4) OVER SalesByMonthSortedDESC AS DescNTileQty
-- rankings (ascending)
, Row_Number() OVER SalesByMonthSortedASC AS AscRowNumberQty
, Rank() OVER SalesByMonthSortedASC AS AscRankQty
, Dense_Rank() OVER SalesByMonthSortedASC AS AscDenseRankQty
, NTile(4) OVER SalesByMonthSortedASC AS AscNTileQty
FROM Sales.vSalesOrders SOL
WHERE SOL.ProductId IN (745)
  AND SOL.[Year] = 2012
  AND SOL.[Month] BETWEEN 1 AND 3
WINDOW SalesByMonth AS (PARTITION BY SOL.ProductId, SOL.[Year], SOL.[Month])
, SalesByMonthSortedDESC AS (SalesByMonth ORDER BY SOL.OrderQty DESC)
, SalesByMonthSortedASC AS (SalesByMonth ORDER BY SOL.OrderQty ASC)
ORDER BY SOL.[Year]
, SOL.[Month]
, SOL.OrderQty DESC

Happy coding!

💎SQL Reloaded: The WINDOW Clause in SQL Server 2022 (Part II: Running Totals) 🆕

In the previous post I shown how to use the newly introduced WINDOW clause in SQL Server 2022 by providing some historical context concerning what it took to do the same simple aggregations as SUM or AVG within previous versions of SQL Server. Let's look at another scenario based on the previously created Sales.vSalesOrders view - using the same two functions in providing running totals.

A running total is a summation of a sequence of values (e.g. OrderQty) ordered by a key, typically one or more columns that uniquely define the sequence, something like a unique record identifier (e.g. SalesOrderId). Moreover, the running total can be defined within a partition (e.g. Year/Month).

In SQL Server 2000, to calculate the running total and average for a value within a partition would resume implementing the logic for each attribute in correlated subqueries. One needed thus one subquery for each attribute, resulting in multiple processing of the base tables. Even if the dataset was already in the memory, it still involves a major overhead.

-- running total/average based on correlated subquery (SQL Server 2000+)
SELECT SOL.SalesOrderId 
, SOL.ProductId
, SOL.OrderDate
, SOL.[Year]
, SOL.[Month]
, SOL.OrderQty
, (-- correlated subquery
  SELECT SUM(SRT.OrderQty)
  FROM Sales.vSalesOrders SRT
  WHERE SRT.ProductId = SOL.ProductId 
    AND SRT.[Year] = SOL.[Year]
	AND SRT.[Month] = SOL.[Month]
	AND SRT.SalesOrderId <= SOL.SalesOrderId
   ) RunningTotalQty
, (-- correlated subquery
  SELECT AVG(SRT.OrderQty)
  FROM Sales.vSalesOrders SRT
  WHERE SRT.ProductId = SOL.ProductId 
    AND SRT.[Year] = SOL.[Year]
	AND SRT.[Month] = SOL.[Month]
	AND SRT.SalesOrderId <= SOL.SalesOrderId
   ) RunningAvgQty
FROM Sales.vSalesOrders SOL
WHERE SOL.ProductId IN (745)
  AND SOL.[Year] = 2012
  AND SOL.[Month] BETWEEN 1 AND 3
ORDER BY SOL.SalesOrderId

Then SQL Server 2005 allowed consolidating the different correlated subqueries into one by using the CROSS APPLY join:

-- running total/average based on correlated subquery with CROSS APPLY (SQL Server 2005+)
SELECT SOL.SalesOrderId 
, SOL.ProductId
, SOL.OrderDate
, SOL.[Year]
, SOL.[Month]
, SOL.OrderQty
, SRT.RunningTotalQty
, SRT.RunningAvgQty
FROM Sales.vSalesOrders SOL
     CROSS APPLY (-- correlated subquery
	  SELECT SUM(SRT.OrderQty) RunningTotalQty
	  , AVG(SRT.OrderQty) RunningAvgQty
	  FROM Sales.vSalesOrders SRT
	  WHERE SRT.ProductId = SOL.ProductId 
		AND SRT.[Year] = SOL.[Year]
		AND SRT.[Month] = SOL.[Month]
		AND SRT.SalesOrderId <= SOL.SalesOrderId
   ) SRT
WHERE SOL.ProductId IN (745)
  AND SOL.[Year] = 2012
  AND SOL.[Month] BETWEEN 1 AND 3
ORDER BY SOL.SalesOrderId

Even if SQL Server 2005 introduced window functions that work on a partition, an ORDER BY clause was supported only with SQL Server 2012:

-- running total/average based on window functions (SQL Server 2012+)
SELECT SOL.SalesOrderId 
, SOL.ProductId
, SOL.OrderDate
, SOL.[Year]
, SOL.[Month]
, SOL.OrderQty
, SUM(SOL.OrderQty) OVER (PARTITION BY SOL.ProductId, SOL.[Year], SOL.[Month] ORDER BY SOL.SalesOrderId) RunningTotalQty
, AVG(SOL.OrderQty) OVER (PARTITION BY SOL.ProductId, SOL.[Year], SOL.[Month] ORDER BY SOL.SalesOrderId) RunningAvgQty
FROM Sales.vSalesOrders SOL
WHERE SOL.ProductId IN (745)
  AND SOL.[Year] = 2012
  AND SOL.[Month] BETWEEN 1 AND 3
ORDER BY SOL.SalesOrderId

Now, in SQL Server 2022 the WINDOW clause allows simplifying the query as follows by defining the partition only once:

-- running total/average based on window functions and clause (SQL Server 2022+)
SELECT SOL.SalesOrderId 
, SOL.ProductId
, SOL.OrderDate
, SOL.[Year]
, SOL.[Month]
, SOL.OrderQty
, SUM(SOL.OrderQty) OVER SalesByMonth AS RunningTotalQty

, AVG(SOL.OrderQty) OVER SalesByMonth AS RunningAvgQty

FROM Sales.vSalesOrders SOL
WHERE SOL.ProductId IN (745)
  AND SOL.[Year] = 2012
  AND SOL.[Month] BETWEEN 1 AND 3
WINDOW SalesByMonth AS (PARTITION BY SOL.ProductId, SOL.[Year], SOL.[Month] ORDER BY SOL.SalesOrderId)
ORDER BY SOL.SalesOrderId

Moreover, the WINDOW clause allows forward (and backward) referencing of one window into the other:

-- running total/average based on window functions and clause (SQL Server 2022+)
SELECT SOL.SalesOrderId 
, SOL.ProductId
, SOL.OrderDate
, SOL.[Year]
, SOL.[Month]
, SOL.OrderQty
-- simple aggregations
, SUM(SOL.OrderQty) OVER SalesByMonth AS TotalQty

, AVG(SOL.OrderQty) OVER SalesByMonth AS AvgQty

-- running totals
, SUM(SOL.OrderQty) OVER SalesByMonthSorted AS RunningTotalQty

, AVG(SOL.OrderQty) OVER SalesByMonthSorted AS RunningAvgQty

FROM Sales.vSalesOrders SOL
WHERE SOL.ProductId IN (745)
  AND SOL.[Year] = 2012
  AND SOL.[Month] BETWEEN 1 AND 3
WINDOW SalesByMonth AS (PARTITION BY SOL.ProductId, SOL.[Year], SOL.[Month])
, SalesByMonthSorted AS (SalesByMonth ORDER BY SOL.SalesOrderId)
ORDER BY SOL.SalesOrderId

Happy coding!

💎💫SQL Reloaded: Querying Azure Synapse Metadata for the D365 CRM & FO Tables Exported via Azure Synapse Link for Dataverse

Enabling Azure Synapse Link for Dataverse for Dynamics 365 CRM (D365 CRM) or for Dynamics 365 Finance & Operations (D365 FO) allows to have all the needed tables for reporting in a CDM structure, however the csv files with data don’t have any headers, which makes it challenging to use directly the files without the corresponding metadata. For example, attempting to define external tables would be useless without proper headers and data types. Fortunately, attribute’s name and data types are available in JSON files and can be queried.

D365 CRM

For CRM the files start with tables’ names and have the EntityMetadata.json postfix, the files being stored in the Microsoft.Athena.TrickleFeedService folder which lies with the other folders containing the data. Given that all files are in the same folder, the following query can be used to export the metadata for all or some of the tables. Just replace "(container name)" and "(data source)" with the corresponding values for your environment.

-- export definition for D365 CRM' CDM (all attributes, all tables)
SELECT DAT.EntityName
, DAT.AttributeName
, DAT.Timestamp
, DAT.AttributeType
, DAT.MetadataId
, DAT.Precision
, DAT.MaxLength
FROM openrowset(
        bulk '/(container name)/Microsoft.Athena.TrickleFeedService/*-EntityMetadata.json',
        data_source = '(data source)',
        format = 'csv',
        fieldterminator ='0x0b',
        fieldquote = '0x0b',
        rowterminator = '0x0b' --> You need to override rowterminator to read classic JSON
    ) with (doc nvarchar(max)) as rows
       CROSS APPLY OPENJSON(doc, '$.AttributeMetadata')--definitions[0].hasAttributes
       with (
         EntityName nvarchar(255) '$.EntityName'
        , AttributeName nvarchar(255) '$.AttributeName'
        , Timestamp nvarchar(50) '$.Timestamp'
        , AttributeType nvarchar(50) '$.AttributeType'
        , MetadataId nvarchar(100) '$.MetadataId'
        , Precision int '$.Precision'
        , MaxLength int '$.MaxLength'
     ) as DAT
WHERE DAT.EntityName IN ('lead', 'opportunity', 'product')

D365 FO

For Finance & Operations folders’ structure is more complex, a whole hierarchy of folders being built based on a set of predefined categories. One would need to traverse the hierarchy structure to find the files. Thus, it might be easier to generate the metadata for each table. Files’ names start with tables’ names and have the .cdm.json postfix. The below query generates the metadata for the InventTable table. Just replace the "(container name)", "(instance name)" and "(data source)" with the values for your environment.

-- export definition for D365 FO's CDM (all attributes, specific table)

SELECT DAT.name
, DAT.dataFormat
FROM openrowset(
        bulk '/(container name)/(instance name)/Tables/SupplyChain/ProductInformationManagement/Main/InventTable.cdm.json',
        data_source = '(data source)',
        format = 'csv',
        fieldterminator ='0x0b',
        fieldquote = '0x0b',
        rowterminator = '0x0b' --> You need to override rowterminator to read classic JSON
    ) with (doc nvarchar(max)) as rows
       CROSS APPLY OPENJSON(doc, '$.definitions[0].hasAttributes')
       with (
          name nvarchar(255) '$.name'
        , dataFormat nvarchar(50) '$.dataFormat'
     ) as DAT

Further Steps

One can obtain thus the needed metadata, however after a first inspection, the data types are too general compared with the ones considered for the data source attributes. One can work probably with these data types as well, though there's a Microsoft recommendation to minimize the row length by using the smallest possible column size, which leads to better query performance. Exporting the metadata from the source system and matching the two datasets based on tables and attributes’ names would allow addressing this recommendation, even if this implies checking from time to time whether their definition changed. The trick is to keep the same sorting order like in the Synapse files. Unfortunately, not all SQL Server data types are supported (e.g. text, ntext, sql_variant, etc.) or are ideal to work with (e.g. money), however there are alternative data types that can be used.

Happy coding!

Previous Post <<||>> Next Post

💎🏭SQL Reloaded: The WINDOW Clause in SQL Server 2022 (Part I: Simple Aggregations)

Between the many new features introduced in SQL Server 2022, Microsoft brings the WINDOW clause for defining the partitioning and ordering of a dataset which uses window functions. But before unveiling the change, let's take a walk down the memory lane.

In the early ages of SQL Server, many of descriptive statistics techniques were resuming in using the averages over a dataset that contained more or less complex logic. For example, to get the total and average of sales per Month, the query based on AdventureWorks database would look something like:

-- aggregated sales orders - detailed (SQL Server 2000+)
SELECT SOL.ProductID
, Year(SOH.OrderDate) [Year]
, Month(SOH.OrderDate) [Month]
, SUM(SOL.OrderQty) TotalQty
, AVG(SOL.OrderQty) AverageQty
FROM Sales.SalesOrderDetail SOL
     JOIN Sales.SalesOrderHeader SOH
	   ON SOL.SalesOrderID = SOH.SalesOrderID
WHERE SOL.ProductId IN (745)
  AND Year(SOH.OrderDate) = 2012
  AND Month(SOH.OrderDate) BETWEEN 1 AND 3
GROUP BY SOL.ProductID
, Year(SOH.OrderDate) 
, Month(SOH.OrderDate)

When possible, the base logic was encapsulated within a view, hiding query's complexity from the users, allowing also reuse and better maintenance, just to mention a few of the benefits of this approach:

-- encapsulating the logic into a view (SQL Server 2000+)
CREATE OR ALTER VIEW Sales.vSalesOrders
AS
-- sales orders details
SELECT SOL.SalesOrderID
, SOL.ProductID
, Cast(SOH.OrderDate as Date) OrderDate
, Year(SOH.OrderDate) [Year]
, Month(SOH.OrderDate) [Month]
, Cast(SOL.OrderQty as decimal(18,2)) OrderQty
-- many more columns 
FROM Sales.SalesOrderDetail SOL
     JOIN Sales.SalesOrderHeader SOH
	   ON SOL.SalesOrderID = SOH.SalesOrderID
/* -- some constraints can be brought directly into the view
WHERE SOL.ProductID IN (745)
  AND Year(SOH.OrderDate) = 2012
  AND Month(SOH.OrderDate) BETWEEN 1 AND 3
*/

The above query now becomes:

-- aggregated sales orders - simplified (SQL Server 2000+)
SELECT SOL.ProductID
, SOL.[Year]
, SOL.[Month]
, SUM(SOL.OrderQty) TotalQty
, AVG(SOL.OrderQty) AverageQty
FROM Sales.vSalesOrders SOL
WHERE SOL.ProductID IN (745)
  AND SOL.[Year] = 2012
  AND SOL.[Month] BETWEEN 1 AND 3
GROUP BY SOL.ProductID
, SOL.[Year]
, SOL.[Month]

In many cases, performing this kind of aggregations was all the users wanted. However, quite often it's useful to look and the individual sales and consider them as part of the broader picture. For example, it would be interesting to know, what's the percentage of an individual value from the total (see OrderQtyPct) or any similar aggregate information. In SQL Server 2000 addressing such requirements would involve a join between the same view - one query with the detailed records, while the other contains the aggregated data:

-- sales orders from the totals (SQL Server 2000+)
SELECT SOL.*
, ASO.TotalQty
, Cast(CASE WHEN ASO.TotalQty <> 0 THEN 100*SOL.OrderQty/ASO.TotalQty ELSE 0 END as decimal(18,2)) OrderQtyPct
, ASO.AverageQty
FROM Sales.vSalesOrders SOL
     JOIN ( -- aggregated sales orders
		SELECT SOL.ProductID
		, SOL.[Year]
		, SOL.[Month]
		, SUM(SOL.OrderQty) TotalQty
		, AVG(SOL.OrderQty) AverageQty
		FROM Sales.vSalesOrders SOL
		WHERE SOL.ProductID IN (745)
		  AND SOL.[Year] = 2012
		  AND SOL.[Month] BETWEEN 1 AND 3
		GROUP BY SOL.ProductId
		, SOL.[Year]
		, SOL.[Month]
	) ASO
	ON SOL.ProductID = ASO.ProductID 
   AND SOL.[Year] = ASO.[Year]
   AND SOL.[Month] = ASO.[Month]
ORDER BY SOL.ProductID
, SOL.OrderDate

Now, just imagine that you can't create a view and having to duplicate the logic each time the view is used! Aggregating the data at different levels would require similar joins to the same view or piece of logic.

Fortunately, SQL Server 2005 came with two long-awaited features, common table expressions (CTEs) and window functions, which made a big difference. First, CTEs allowed defining inline views that can be referenced multiple times. Secondly, the window functions allowed aggregations within a partition.

-- sales orders with CTE SQL Server 2005+ query 
WITH CTE
AS (--sales orders in scope
SELECT SOL.SalesOrderID 
, SOL.ProductID
, Cast(SOH.OrderDate as Date) OrderDate
, Year(SOH.OrderDate) [Year]
, Month(SOH.OrderDate) [Month]
, SOL.OrderQty
FROM Sales.SalesOrderDetail SOL
     JOIN Sales.SalesOrderHeader SOH
	   ON SOL.SalesOrderID = SOH.SalesOrderID
WHERE SOL.ProductID IN (745)
  AND Year(SOH.OrderDate) = 2012
  AND Month(SOH.OrderDate) BETWEEN 1 AND 3
)
SELECT SOL.SalesOrderID 
, SOL.ProductID
, SOL.OrderDate
, SOL.[Year]
, SOL.[Month]
, SOL.OrderQty
, SUM(SOL.OrderQty) OVER(PARTITION BY SOL.[Year], SOL.[Month]) TotalQty
, AVG(SOL.OrderQty)  OVER(PARTITION BY SOL.[Year], SOL.[Month]) AverageQty
FROM CTE SOL
WHERE SOL.ProductID IN (745)
  AND SOL.[Year] = 2012
  AND SOL.[Month] BETWEEN 1 AND 3

This kind of structuring the queries allows to separate the base logic for better maintainability, readability and fast prototyping. Once the logic from CTE becomes stable, it can be moved within a view, following to replace the CTE reference with view's name in the final query. On the other side, the window functions allow writing more flexible and complex code, even if the statements can become occasionally complex (see OrderQtyPct):

-- aggregated sales orders (SQL Server 2005+)
SELECT SOL.SalesOrderID 
, SOL.ProductID
, SOL.OrderDate
, SOL.[Year]
, SOL.[Month]
, SOL.OrderQty
, SUM(SOL.OrderQty) OVER(PARTITION BY SOL.[Year], SOL.[Month]) TotalQty
, AVG(SOL.OrderQty)  OVER(PARTITION BY SOL.[Year], SOL.[Month]) AverageQty
, Cast(CASE WHEN SUM(SOL.OrderQty) OVER(PARTITION BY SOL.[Year], SOL.[Month]) <> 0 THEN 100*SOL.OrderQty/SUM(SOL.OrderQty) OVER(PARTITION BY SOL.[Year], SOL.[Month]) ELSE 0 END as decimal(18,2)) OrderQtyPct
FROM Sales.vSalesOrders SOL
WHERE SOL.ProductID IN (745)
  AND SOL.[Year] = 2012
  AND SOL.[Month] BETWEEN 1 AND 3

Now, SQL Server 2022 allows to further simplify the logic by allowing to define with a name the window at the end of the statement and reference the name in several window functions (see last line of the query):

-- Aggregations per month (SQL Server 2022+)
SELECT SOL.SalesOrderID 
, SOL.ProductID
, SOL.OrderDate
, SOL.[Year]
, SOL.[Month]
, SOL.OrderQty
, SUM(SOL.OrderQty) OVER SalesByMonth AS TotalQty
, AVG(SOL.OrderQty) OVER SalesByMonth AS AverageQty
, Cast(CASE WHEN SUM(SOL.OrderQty) OVER SalesByMonth <> 0 THEN 100*SOL.OrderQty/SUM(SOL.OrderQty) OVER SalesByMonth ELSE 0 END as decimal(18,2)) OrderQtyPct
FROM Sales.vSalesOrders SOL
WHERE SOL.ProductID IN (745)
  AND SOL.[Year] = 2012
  AND SOL.[Month] BETWEEN 1 AND 3
WINDOW SalesByMonth AS (PARTITION BY SOL.[Year], SOL.[Month])

Moreover, several windows can be defined. For example, aggregating the data also per year:

-- Aggregations per month and year (SQL Server 2022+) 
SELECT SOL.SalesOrderID 
, SOL.ProductID
, SOL.OrderDate
, SOL.[Year]
, SOL.[Month]
, SOL.OrderQty
, SUM(SOL.OrderQty) OVER SalesByMonth AS TotalQtyByMonth
, AVG(SOL.OrderQty) OVER SalesByMonth AS AverageQtyByMonth
, Cast(CASE WHEN SUM(SOL.OrderQty) OVER SalesByMonth <> 0 THEN 100*SOL.OrderQty/SUM(SOL.OrderQty) OVER SalesByMonth ELSE 0 END as decimal(18,2)) OrderQtyPctByMonth
, SUM(SOL.OrderQty) OVER SalesByYear AS TotalQtyByYear
, AVG(SOL.OrderQty) OVER SalesByYear AS AverageQtyByYear
, Cast(CASE WHEN SUM(SOL.OrderQty) OVER SalesByYear <> 0 THEN 100*SOL.OrderQty/SUM(SOL.OrderQty) OVER SalesByYear ELSE 0 END as decimal(18,2)) OrderQtyPctByYear
FROM Sales.vSalesOrders SOL
WHERE SOL.ProductID IN (745)
  AND SOL.[Year] = 2012
  AND SOL.[Month] BETWEEN 1 AND 3
WINDOW SalesByMonth AS (PARTITION BY SOL.[Year], SOL.[Month])
, SalesByYear AS  (PARTITION BY SOL.[Year])

Isn't that cool? It will take probably some time to get used in writing and reading this kind of queries, though using descriptive names for the windows should facilitate the process!

As a side note, even if there's no product function like in Excel, there's a mathematical trick to transform a product into a sum of elements by applying the Exp (exponential) and Log (logarithm) functions (see example).

Notes:
The queries work also in a SQL databases in Microsoft Fabric. Just replace the Sales with SalesLT schema (see post, respectively GitHub repository with the changed code).

Happy coding!

SQL Troubles

Pages

30 October 2022

💎SQL Reloaded: The WINDOW Clause in SQL Server 2022 (Part III: Ranking) 🆕

💎SQL Reloaded: The WINDOW Clause in SQL Server 2022 (Part II: Running Totals) 🆕

💎💫SQL Reloaded: Querying Azure Synapse Metadata for the D365 CRM & FO Tables Exported via Azure Synapse Link for Dataverse

💎🏭SQL Reloaded: The WINDOW Clause in SQL Server 2022 (Part I: Simple Aggregations)

About Me