06 March 2010

SQL Reloaded: Oracle vs. SQL Server (Running Totals)

    One of the SQL Server 2005+ functionalities I kind of ignored in the past years is the CROSS APPLY operator that allows returning rows from a table-valued function of a correlated query, allowing for example to calculate running totals, cumulated amounts from the previous records and the current record in a data set. Before the introduction of CROSS APPLY operator such a query could be written as follows:
-- SQL Server/Oracle: Running Totals (simple join)
SELECT POH.PurchaseOrderID
,
POH.VendorID

,
POH.OrderDate

,
POH.SubTotal

,
POH.TaxAmt

,
POH.Freight

,
SUM(POT.SubTotal) RunningSubTotal

,
SUM(POT.TaxAmt) RunningTaxAmt

,
SUM(POT.Freight) RunningFreight

FROM
Purchasing.PurchaseOrderHeader POH
    LEFT JOIN Purchasing.PurchaseOrderHeader POT
        ON POH.PurchaseOrderID >= POT.PurchaseOrderID

GROUP
BY POH.PurchaseOrderID

,
POH.VendorID

,
POH.OrderDate

,
POH.SubTotal

,
POH.TaxAmt

,
POH.Freight

ORDER
BY POH.PurchaseOrderID


   Each of the attributes not aggregated needs to be added in the GROUP BY clause, fact that could be quite a hit from a performance standpoint. If only one aggregated value is needed then the query could be rewritten, the cumulated value being calculated in a correlated query:
-- SQL Server/Oracle: Running Totals (correlated query)
SELECT POH.PurchaseOrderID
,
POH.VendorID

,
POH.OrderDate

,
POH.SubTotal

,
POH.TaxAmt

,
POH.Freight

,
( -- calculating aggregated running total

    SELECT SUM(POT.SubTotal)
    FROM Purchasing.PurchaseOrderHeader POT
    WHERE POH.PurchaseOrderID >= POT.PurchaseOrderID
) RunningSubTotal

FROM
Purchasing.PurchaseOrderHeader POH


   In theory could be added a correlated query for each aggregated value, though this means an additional table scan for each attribute, not the best approach for sure. The CROSS APPLY operator simplifies such queries allowing to do multiple calculations within the same correlated query:


-- SQL Server 2005: Running Totals (CROSS APPLY)
SELECT POH.PurchaseOrderID
,
POH.VendorID

,
POH.OrderDate

,
POH.SubTotal

,
POH.TaxAmt

,
POH.Freight

,
DAT.RunningSubTotal

,
DAT.RunningTaxAmt

,
DAT.RunningFreight

FROM
Purchasing.PurchaseOrderHeader POH
    CROSS APPLY (-- calculating aggregated running total

        SELECT SUM(POT.SubTotal) RunningSubTotal
        , SUM(POT.TaxAmt) RunningTaxAmt
        , SUM(POT.Freight) RunningFreight
        FROM Purchasing.PurchaseOrderHeader POT
        WHERE POH.PurchaseOrderID >= POT.PurchaseOrderID) DAT

ORDER
BY POH.PurchaseOrderID


     The advantage of  the first two approaches is that they work also in Oracle, though also Oracle comes with an alternative using analytic aggregate functions:
-- Oracle: Running Totals (analytic aggregate function)
SELECT POH.PurchaseOrderID
,
POH.VendorID

,
POH.OrderDate

,
POH.SubTotal

,
POH.TaxAmt

,
POH.Freight

,
SUM(POH.SubTotal) OVER (ORDER BY POH.PurchaseOrderID ) RunningSubTotal

,
SUM(POH.TaxAmt) OVER (ORDER BY POH.PurchaseOrderID) RunningTaxAmt

,
SUM(POH.Freight) OVER (ORDER BY POH.PurchaseOrderID) RunningFreight

FROM
PurchaseOrderHeader POH

ORDER
BY POH.PurchaseOrderID


Notes:
1.  The calculation of running totals requires the existence of a unique identifier for each record that can be sorted, in this case being used the PurchaseOrderID attribute, though typically could be used the creation date (timestamp) when the record was modified, alone or in combination with other attributes; when used in combination then maybe it makes sense to add a running number in order to apply the above techniques.  In Oracle the use of analytic aggregate functions offers less headaches from this perspective, being enough to specify the unique combination of attributes in the ORDER BY clause in the body of the function.
2.  SQL Server 2005’s window aggregate functions, even if similar with Oracle’s analytic aggregate functions, unlike the window ranking functions they don’t allow the ORDER BY clause in the functions, therefore running totals can’t be modeled in SQL Server with the help of window aggregate functions.
3. In case the running totals are based on other table, then might be needed to use the OUTER APPLY operator, returning thus also the records for which not match is retrieved in the outer dataset.

   Sometimes might be requested to create running totals within a given partition, for example by Vendor, in this case all is needed to do is to add the VendorID in the WHERE or JOIN constraint, while in Oracle’s analytic functions case in the PARTITION BY clause. Here are the three queries modified:
-- SQL Server/Oracle: Running Totals per Vendor (simple join)
SELECT POH.PurchaseOrderID
,
POH.VendorID

,
POH.OrderDate

,
POH.SubTotal

,
POH.TaxAmt

,
POH.Freight

,
SUM(POT.SubTotal) RunningSubTotal

,
SUM(POT.TaxAmt) RunningTaxAmt

,
SUM(POT.Freight) RunningFreight

FROM
Purchasing.PurchaseOrderHeader POH
    LEFT JOIN Purchasing.PurchaseOrderHeader POT
        ON POH.VendorID = POT.VendorID
     AND POH.PurchaseOrderID >= POT.PurchaseOrderID

GROUP
BY POH.PurchaseOrderID

,
POH.VendorID

,
POH.OrderDate

,
POH.SubTotal

,
POH.TaxAmt

,
POH.Freight

ORDER
BY POH.VendorID

,
POH.PurchaseOrderID


-- SQL Server 2005: Running Totals per Vendor (CROSS APPLY)
SELECT POH.PurchaseOrderID
,
POH.VendorID

,
POH.OrderDate

,
POH.SubTotal

,
POH.TaxAmt

,
POH.Freight

,
DAT.RunningSubTotal

,
DAT.RunningTaxAmt

,
DAT.RunningFreight

FROM
Purchasing.PurchaseOrderHeader POH
     CROSS APPLY ( -- calculating aggregated running total

        SELECT SUM(POT.SubTotal) RunningSubTotal
        , SUM(POT.TaxAmt) RunningTaxAmt
        , SUM(POT.Freight) RunningFreight
        FROM Purchasing.PurchaseOrderHeader POT
        WHERE POH.VendorID = POT.VendorID
          AND POH.PurchaseOrderID >= POT.PurchaseOrderID) DAT

ORDER
BY POH.VendorID

,
POH.PurchaseOrderID


-- Oracle: Running Totals per Vendor (analytic aggregate function)
SELECT POH.PurchaseOrderID
,
POH.VendorID

,
POH.OrderDate

,
POH.SubTotal

,
POH.TaxAmt

,
POH.Freight

,
SUM(POH.SubTotal) OVER (PARTITION BY POH.VendorID ORDER BY POH.PurchaseOrderID ) RunningSubTotal

,
SUM(POH.TaxAmt) OVER (PARTITION BY POH.VendorID ORDER BY POH.PurchaseOrderID) RunningTaxAmt

,
SUM(POH.Freight) OVER (PARTITION BY POH.VendorID ORDER BY POH.PurchaseOrderID) RunningFreight

FROM
PurchaseOrderHeader POH

ORDER
BY POH.VendorID

,
POH.PurchaseOrderID

No comments:

Related Posts Plugin for WordPress, Blogger...

About Me

My photo
IT Professional with more than 24 years experience in IT in the area of full life-cycle of Web/Desktop/Database Applications Development, Software Engineering, Consultancy, Data Management, Data Quality, Data Migrations, Reporting, ERP implementations & support, Team/Project/IT Management, etc.