18 February 2010

💎SQL Reloaded: Just in CASE V (Dynamic Queries)

    Stored procedures not only encapsulate logic, improve query performance by caching query plans and reduce the risk of SQL injection through the use of parameters; they also allow altering the structure of a query at runtime, for example based on the provided parameters. Such behavior is ideal when the logic of a query depends on a list of filters (filter controls) in which some of the attributes are not mandatory. A dynamically created query can be executed using the EXEC statement or the sp_executesql stored procedure. The latter allows parameter substitution, which preserves the parameters’ data types, and its execution plans are more likely to be reused by SQL Server, which is why Microsoft recommends sp_executesql over EXEC [1]. The choice also depends on how much the queries vary between requests and on how easy it is to handle missing parameters or parameter substitution, sometimes quite a difficult topic for a beginner. Therefore, to exemplify the technique I will use EXEC; here is the example:
 
-- Retrieving Products 
CREATE PROCEDURE dbo.pSearchProducts( 
@ProductID int , 
@ProductSubcategoryID int , 
@ProductModelIDs varchar(500) , 
@ProductName nvarchar(50) , 
@ProductNumber nvarchar(25) , 
@MakeFlag bit , 
@FinishedGoodsFlag bit , 
@StartStandardCost money , 
@EndStandardCost money , 
@StartSellStartDate varchar(10) , 
@EndSellStartDate varchar(10) , 
@OrderBy nvarchar(500) ) 
AS 
BEGIN 
DECLARE @Query varchar(2000) 
-- translating Order By clause 
IF LEN(@OrderBy)>0 
BEGIN 
    SET @OrderBy = ',' + @OrderBy  
    SET @OrderBy = Replace(@OrderBy, ',0', ', ProductNumber') 
    SET @OrderBy = Replace(@OrderBy, ',1', ', ProductName') 
    SET @OrderBy = Replace(@OrderBy, ',2', ', ProductModel') 
    SET @OrderBy = Replace(@OrderBy, ',3', ', SellStartDate') 
    SET @OrderBy = Replace(@OrderBy, ',4', ', StandardCost') 
    SET @OrderBy = Right(@OrderBy, Len(@OrderBy)-1) 
END 
-- building the query 
SET @Query = 'SELECT ProductID, ProductName, ProductNumber, ProductModel, CatalogDescription, ProductSubcategory' 
+ ', MakeFlag, FinishedGoodsFlag, Color, SafetyStockLevel, ReorderPoint, StandardCost, ListPrice' 
+ ', Size, SizeUnitMeasureCode, SizeUnitMeasure, WeightUnitMeasureCode, WeightUnitMeasure, Weight' 
+ ', DaysToManufacture, ProductLine, Class, Style, dbo.GetDateAsString(SellStartDate) SellStartDate' 
+ ', dbo.GetDateAsString(SellEndDate) SellEndDate, dbo.GetDateAsString(ModifiedDate) ModifiedDate' 
+ ' FROM Production.vProducts' 
+ ' WHERE 0=0' 
+ CASE IsNull(@ProductID, 0) 
     WHEN 0 THEN '' 
    ELSE ' AND ProductID = ' + cast(@ProductID as varchar(10)) 
END 
+ CASE IsNull(@ProductSubcategoryID, 0) 
     WHEN 0 THEN '' 
     ELSE ' AND ProductSubcategoryID = ' + cast(@ProductSubcategoryID as varchar(10)) 
END 
+ CASE IsNull(@ProductModelIDs, '') 
    WHEN '' THEN '' 
    ELSE ' AND ProductModelID IN (' + @ProductModelIDs + ')' 
END  
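+ CASE IsNull(@ProductName, '') 
    WHEN '' THEN '' 
    ELSE ' AND ProductName LIKE ''' + Replace(@ProductName, '''', '''''') + '''' -- single quotes are doubled so they can't break the string 
END  
+ CASE IsNull(@ProductNumber, '') 
    WHEN '' THEN '' 
    ELSE ' AND ProductNumber LIKE ''' + Replace(@ProductNumber, '''', '''''') + '''' -- single quotes are doubled so they can't break the string 
END  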
+ CASE IsNull(@MakeFlag, 0) 
    WHEN 0 THEN '' 
    ELSE ' AND MakeFlag = 1' 
END  
+ CASE IsNull(@FinishedGoodsFlag, 0) 
    WHEN 0 THEN '' 
    ELSE ' AND FinishedGoodsFlag = 1' 
END  
+ CASE IsNull(@StartStandardCost, 0) 
    WHEN 0 THEN '' 
    ELSE ' AND StandardCost >= ' + cast(@StartStandardCost as varchar(20)) 
END  
+ CASE IsNull(@EndStandardCost, 0) 
    WHEN 0 THEN '' 
    ELSE ' AND StandardCost <= ' + cast(@EndStandardCost as varchar(20)) 
END 
+ CASE IsNull(@StartSellStartDate, '') 
    WHEN '' THEN '' 
    ELSE ' AND DateDiff(d, SellStartDate, dbo.GetStringDate(''' + @StartSellStartDate + '''))<=0' 
END  
+ CASE IsNull(@EndSellStartDate, '') 
     WHEN '' THEN '' 
     ELSE ' AND DateDiff(d, SellStartDate, dbo.GetStringDate(''' + @EndSellStartDate + '''))>=0' 
END  
+ ' ORDER BY ' 
+ CASE IsNull(@OrderBy, '') -- a NULL @OrderBy would otherwise turn the whole @Query into NULL 
    WHEN '' THEN ' ProductNumber' -- default sorting 
    ELSE @OrderBy -- chosen sorting 
END 
-- executing the query 
EXEC (@Query)  
END 

    The above stored procedure can be split into five logical sections:
1. The definition, holding the stored procedure’s name and parameters
    The selection of parameters depends on the requirements. Usually they include the natural and database key of the main entity or of master data (e.g. Customer, Vendor, Item Number), the LOV-based attributes used most often as selection criteria (e.g. Statuses, Characteristics), the flags that have an important meaning for the business (e.g. Make Flag, Finished Goods Flag) and the dates used to partition the result set (e.g. Creation Date, Document Date, GL Date), for which in general both the lower and the upper bound of the interval are needed. For special types of analysis it’s useful to also include a set of controls for numeric intervals, as is the case here for Standard Cost.
   In general the parameters have the same data type as the attribute they represent, with a few exceptions:
dates – because of the different date formats used in the various environments (e.g. client vs. web server vs. database server) I prefer to send the dates as strings, thus ensuring that the date always arrives in the expected format.
multi-choice attributes – when multi-choice selections need to be handled, the easiest way is to provide the list of values concatenated as a string, as in the case of @ProductModelIDs.
flags – sometimes I prefer to use characters instead of boolean (0,1) values because they offer more flexibility in handling triple or multi-state situations.
2. The declaration of internal variables, here @Query, which holds the dynamically built query. The variable needs to be big enough to hold the whole concatenated string with the maximum list of parameters and ORDER BY attributes.
3. The translation of the ORDER BY attributes, here stored in the @OrderBy input parameter, which contains a comma-separated list of single-digit codes (0 to 4 in this example) that dictate the attributes used in the ORDER BY clause and their order. For example ‘3,1’ is sent to sort the result set by SellStartDate and ProductName.
4. The actual concatenation of the dynamic query. NULL values are handled with a set of CASE expressions, one for each parameter, including @OrderBy.
5. The query’s execution, here done using EXEC.

Notes:  
    Usually I include a 6th section with metadata about the current object (e.g. who created it and when, when it was last modified, the objects that call it, its purpose, etc.).
     Such a procedure needs to be tested for all parameters, either by providing all parameters at once as in the first example below, or by creating a test case for each parameter. The latter approach is much safer, because when multiple parameters are provided one of them could hide an existing issue.

EXEC dbo.pSearchProducts 973, 2, '25,26,27', 'Road%', 'BK%', 1, 1, 1000, 1200, '01/01/2003', '31/12/2003', '' -- all parameters 
EXEC dbo.pSearchProducts 0, 0, '', '', '', 0, 0, 0, 0, '', '', '' -- all records 
EXEC dbo.pSearchProducts 1, 0, '', '', '', 0, 0, 0, 0, '', '', '' -- ProductID test 
EXEC dbo.pSearchProducts 0, 2, '', '', '', 0, 0, 0, 0, '', '', '' -- ProductSubcategoryID test 
EXEC dbo.pSearchProducts 0, 0, '3, 4, 5', '', '', 0, 0, 0, 0, '', '', '2,1' -- ProductModelIDs test 
EXEC dbo.pSearchProducts 0, 0, '', 'Mountain%', '', 0, 0, 0, 0, '', '', '1' -- ProductName test 
EXEC dbo.pSearchProducts 0, 0, '', '', 'BK-T18U%', 0, 0, 0, 0, '', '', '2' -- ProductNumber test 
EXEC dbo.pSearchProducts 0, 0, '', '', '', 1, 1, 0, 0, '', '', '' -- MakeFlag & FinishedGoodsFlag test 
EXEC dbo.pSearchProducts 0, 0, '', '', '', 0, 0, 50, 110, '', '', '' -- StandardCost test 
EXEC dbo.pSearchProducts 0, 0, '', '', '', 0, 0, 0, 0, '01/01/2001', '31/12/2002', '3,1' -- SellStartDate test 
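
    For comparison, here is a minimal sketch of how the same idea could look with sp_executesql, reduced to two of the filters. Production.vProducts and the column names come from the procedure above; the shortened column list and the local variables standing in for the procedure’s parameters are assumptions made only for illustration. The filter values are passed as parameters instead of being concatenated into the string, so their data types are preserved and quotes in the values can’t alter the query:

```sql
-- Sketch only (assumes SQL Server 2008+ for DECLARE with initialization)
-- the variables stand in for the parameters of dbo.pSearchProducts
DECLARE @ProductName nvarchar(50) = N'Road%'
DECLARE @StartStandardCost money = 1000
DECLARE @Query nvarchar(2000) -- sp_executesql expects a Unicode string

-- building the query: the filters reference placeholders, not values
SET @Query = N'SELECT ProductID, ProductName, ProductNumber, StandardCost'
+ N' FROM Production.vProducts'
+ N' WHERE 0=0'
+ CASE IsNull(@ProductName, N'')
    WHEN N'' THEN N''
    ELSE N' AND ProductName LIKE @ProductName'
END
+ CASE IsNull(@StartStandardCost, 0)
    WHEN 0 THEN N''
    ELSE N' AND StandardCost >= @StartStandardCost'
END
+ N' ORDER BY ProductNumber'

-- executing the query: all parameters are declared and passed,
-- those not referenced in the generated text are simply not used
EXEC sp_executesql @Query
, N'@ProductName nvarchar(50), @StartStandardCost money'
, @ProductName = @ProductName
, @StartStandardCost = @StartStandardCost
```

    The ORDER BY list can’t be parameterized in this way, so it would still have to be translated and concatenated as in the original procedure.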

    Is it really necessary to create a dynamic query in order to handle the various filtering scenarios? Not necessarily: there are several techniques that can be used instead, each addressing a particular type of parameter:
 
1. Single values (excepting dates)

    In such cases the CASE can be replaced with an IsNull test on the parameter: if the parameter is null, the value of the attribute itself is taken, so the constraint is satisfied by every non-null value. For example the following CASE:
 
CASE IsNull(@ProductID, 0) 
    WHEN 0 THEN '' 
     ELSE ' AND ProductID = ' + cast(@ProductID as varchar(10)) 
END 
 
could be replaced with the following constraint:
 
ProductID = IsNull(@ProductID, ProductID) 
 
    A similar technique can be used for interval-based attributes like the StandardCost from our example; the two CASEs for StandardCost can be replaced with the following constraint:
 
StandardCost BETWEEN ISNULL(@StartStandardCost, StandardCost) AND ISNULL(@EndStandardCost, StandardCost) 
 
 Note:
    If the attributes also contain NULL values, then the NULL values need to be replaced with a special value that is typically not found among the values the respective attribute can take, for example -1 for foreign keys, a very small number for numeric data types, or a single or empty space for string data types.
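
    The effect described in the note can be seen in a small self-contained sketch. The table variable and its two rows are made up for illustration, and -1 is assumed not to be a valid ProductModelID:

```sql
-- Sketch only: made-up data held in a table variable
DECLARE @Products TABLE (ProductID int, ProductModelID int NULL)
INSERT INTO @Products VALUES (1, 10)
INSERT INTO @Products VALUES (2, NULL)

DECLARE @ProductModelID int -- left NULL, meaning the filter was not provided

-- without special value: NULL = NULL is not true, so product 2 is lost
SELECT ProductID
FROM @Products
WHERE ProductModelID = IsNull(@ProductModelID, ProductModelID)

-- with special value: both sides fall back to -1, so both products are returned
SELECT ProductID
FROM @Products
WHERE IsNull(ProductModelID, -1) = COALESCE(@ProductModelID, ProductModelID, -1)
```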

2. List of values
 
   Lists of values are a little more difficult to handle, but not impossible; for this scenario the dbo.SplitList function introduced in a previous post can be used. For example the CASE for @ProductModelIDs could be rewritten as follows:
 
IsNull (ProductModelID, -1) IN (SELECT value FROM dbo.SplitList(IsNull(NullIf(@ProductModelIDs, ''), CAST(IsNull(ProductModelID, -1) as varchar(20)) ), ',')) 

    The IsNull(ProductModelID, -1) expression was introduced in order to handle NULL values, while the expression for the first parameter of dbo.SplitList was written in this way in order to handle the cases in which @ProductModelIDs was not provided or contains no values.
   Unfortunately this technique, even if it solves the problem, is not very efficient, because the dbo.SplitList function is called for each record, which impacts the query’s performance considerably. On the other hand, if @ProductModelIDs were mandatory, then dbo.SplitList could be joined with the main query, which improves the query’s performance. Attention must be paid to duplicates though, because each duplicate in the list provided to dbo.SplitList could lead to duplicates in the final result!
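
    The duplicates issue can be illustrated with a small sketch. To keep it self-contained, a table variable stands in for the output of dbo.SplitList (only the column name value is taken from the query above; its data type and the sample rows are assumptions):

```sql
-- Sketch only: made-up data held in table variables
DECLARE @Products TABLE (ProductID int, ProductModelID int)
INSERT INTO @Products VALUES (1, 25)
INSERT INTO @Products VALUES (2, 26)
INSERT INTO @Products VALUES (3, 30)

-- stand-in for SELECT value FROM dbo.SplitList('25,26,25', ',')
DECLARE @List TABLE (value varchar(20))
INSERT INTO @List VALUES ('25')
INSERT INTO @List VALUES ('26')
INSERT INTO @List VALUES ('25') -- duplicate entry

-- plain join: product 1 is returned twice because 25 appears twice in the list
SELECT P.ProductID
FROM @Products P
JOIN @List L
ON P.ProductModelID = L.value

-- join against the de-duplicated list: each matching product is returned once
SELECT P.ProductID
FROM @Products P
JOIN (SELECT DISTINCT value FROM @List) L
ON P.ProductModelID = L.value
```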

3. Dates

  Because date comparisons with the BETWEEN operator are error-prone in SQL Server, for example when the values carry a time component, the DateDiff function is preferred for such cases. The handling of date intervals can then be achieved by falling back to the lower/upper bound of the date type used: ‘0001-01-01’ and ‘9999-12-31’ for the date data type, ‘1900-01-01’ and ‘2079-06-06’ for smalldatetime, or ‘1753-01-01’ and ‘9999-12-31’ for datetime. Because the dbo.GetStringDate function returns a datetime, I will use the lower/upper bounds of that data type when the date parameters are null; thus the two CASEs for SellStartDate could be rewritten as:

DateDiff (d, SellStartDate, dbo.GetStringDate(IsNull(@StartSellStartDate, '01-01-1753')))<=0 
AND DateDiff(d, SellStartDate, dbo.GetStringDate(IsNull(@EndSellStartDate, '31-12-9999')))>=0     

   If the records with a NULL date also need to be returned, then SellStartDate can be replaced with either of the two bounds of the data type used; the above constraint thus becomes:

DateDiff (d, IsNull(SellStartDate, '1753-01-01'), dbo.GetStringDate(IsNull(@StartSellStartDate, '01-01-1753')))<=0 
AND DateDiff(d, IsNull(SellStartDate, '1753-01-01'), dbo.GetStringDate(IsNull(@EndSellStartDate, '31-12-9999')))>=0 

   Unfortunately I haven’t found any easy way to handle the different attribute combinations that could appear in the ORDER BY clause. On the other hand there are controls that handle the sorting on the client, so a default ORDER BY can be provided, and users who need another sorting can sort the data themselves. Here’s an alternative version of the above stored procedure that doesn’t use a dynamic query. This doesn’t mean that it performs better than the first stored procedure: a constraint is evaluated for each parameter, and the unnecessary constraints for lists of values and dates come with a considerable decrease in performance.

-- Retrieving Products w/o dynamic query 
CREATE PROCEDURE dbo.pSearchProductsND( 
@ProductID int , 
@ProductSubcategoryID int , 
@ProductModelIDs varchar(500) , 
@ProductName nvarchar(50) , 
@ProductNumber nvarchar(25) , 
@MakeFlag bit , 
@FinishedGoodsFlag bit , 
@StartStandardCost money , 
@EndStandardCost money , 
@StartSellStartDate varchar(10) , 
@EndSellStartDate varchar(10)) 
AS 
BEGIN 
-- the query 
SELECT ProductID, ProductName, ProductNumber, ProductModel, CatalogDescription, ProductSubcategory 
,MakeFlag, FinishedGoodsFlag, Color, SafetyStockLevel, ReorderPoint, StandardCost, ListPrice 
,Size, SizeUnitMeasureCode, SizeUnitMeasure, WeightUnitMeasureCode, WeightUnitMeasure, Weight 
,DaysToManufacture, ProductLine, Class, Style, dbo.GetDateAsString(SellStartDate) SellStartDate 
,dbo.GetDateAsString(SellEndDate) SellEndDate, dbo.GetDateAsString(ModifiedDate) ModifiedDate 
FROM Production.vProducts 
WHERE IsNull(ProductID, -1) = COALESCE(NullIf(@ProductID, 0), ProductID, -1) 
AND IsNull(ProductSubcategoryID, -1) = COALESCE(NullIf(@ProductSubcategoryID, 0), ProductSubcategoryID, -1) 
AND IsNull(ProductModelID, -1) IN (SELECT value FROM dbo.SplitList(IsNull(NullIf(@ProductModelIDs, ''), CAST(IsNull(ProductModelID, -1) as varchar(20)) ), ','))  
AND IsNull(ProductName, ' ') LIKE IsNull(NullIf(@ProductName, ''), IsNull(ProductName, ' '))  
AND IsNull(ProductNumber, ' ') LIKE IsNull(NullIf(@ProductNumber, ''), IsNull(ProductNumber, ' '))  
AND IsNull(MakeFlag, 0) = COALESCE(NullIf(@MakeFlag, 0), MakeFlag, 0)  
AND IsNull(FinishedGoodsFlag, 0) = COALESCE(NullIf(@FinishedGoodsFlag, 0), FinishedGoodsFlag, 0) 
AND StandardCost BETWEEN ISNULL(NullIf(@StartStandardCost, 0), StandardCost) AND ISNULL(NullIf(@EndStandardCost, 0), StandardCost) 
AND DateDiff(d, IsNull(SellStartDate, '1753-01-01'), dbo.GetStringDate(IsNull(NullIf(@StartSellStartDate, ''), '01-01-1753')))<=0 
AND DateDiff(d, IsNull(SellStartDate, '1753-01-01'), dbo.GetStringDate(IsNull(NullIf(@EndSellStartDate, ''), '31-12-9999')))>=0 
ORDER BY ProductNumber 
END 
      
   Because the parameters are sent as 0 for numeric values and as empty strings for string values rather than as NULLs, I had to transform them to NULL whenever they take the value 0 or empty string, this being the only difference between the constraints used in the query and the example constraints from above. In this way the two stored procedures should return the same number of records for the same test cases. Here are the test cases for the second stored procedure:


EXEC dbo.pSearchProductsND 973, 2, '25,26,27', 'Road%', 'BK%', 1, 1, 1000, 1200, '01/01/2003', '31/12/2003' -- all parameters 
EXEC dbo.pSearchProductsND 0, 0, '', '', '', 0, 0, 0, 0, '', '' -- all records 
EXEC dbo.pSearchProductsND 1, 0, '', '', '', 0, 0, 0, 0, '', '' -- ProductID test 
EXEC dbo.pSearchProductsND 0, 2, '', '', '', 0, 0, 0, 0, '', '' -- ProductSubcategoryID test 
EXEC dbo.pSearchProductsND 0, 0, '3, 4, 5', '', '', 0, 0, 0, 0, '', '' -- ProductModelIDs test 
EXEC dbo.pSearchProductsND 0, 0, '', 'Mountain%', '', 0, 0, 0, 0, '', '' -- ProductName test 
EXEC dbo.pSearchProductsND 0, 0, '', '', 'BK-T18U%', 0, 0, 0, 0, '', '' -- ProductNumber test 
EXEC dbo.pSearchProductsND 0, 0, '', '', '', 1, 1, 0, 0, '', '' -- MakeFlag & FinishedGoodsFlag test 
EXEC dbo.pSearchProductsND 0, 0, '', '', '', 0, 0, 50, 110, '', '' -- StandardCost test 
EXEC dbo.pSearchProductsND 0, 0, '', '', '', 0, 0, 0, 0, '01/01/2001', '31/12/2002' -- SellStartDate test 
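
    As for the sorting, when only a single sort attribute has to be supported, one possible workaround without dynamic SQL is a set of CASE expressions in the ORDER BY clause, one per attribute so that data types are not mixed. This is only a sketch on made-up data, with the @OrderBy codes borrowed from dbo.pSearchProducts; it doesn’t cover combinations of attributes, and the CASE expressions may prevent the use of indexes for sorting:

```sql
-- Sketch only: made-up data held in a table variable
DECLARE @Products TABLE (ProductNumber nvarchar(25), ProductName nvarchar(50), StandardCost money)
INSERT INTO @Products VALUES (N'BK-1', N'Road Bike', 1200)
INSERT INTO @Products VALUES (N'BK-2', N'Mountain Bike', 900)
INSERT INTO @Products VALUES (N'BK-3', N'Touring Bike', 1000)

DECLARE @OrderBy int = 4 -- 1 = ProductName, 4 = StandardCost, anything else = ProductNumber

SELECT ProductNumber, ProductName, StandardCost
FROM @Products
ORDER BY CASE @OrderBy WHEN 1 THEN ProductName END -- NULL for all rows unless @OrderBy = 1
, CASE @OrderBy WHEN 4 THEN StandardCost END -- NULL for all rows unless @OrderBy = 4
, ProductNumber -- default sorting and tie-breaker
-- with @OrderBy = 4 the rows come back as BK-2, BK-3, BK-1
```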

  Which method to use depends on the situation. Frankly, I like the dynamic query approach because it offers more flexibility in dealing with parameters and in handling other types of queries. In addition, if you use the second approach and the requirements change, you might need to redesign your stored procedure using the first approach anyway.

References:
[1] MSDN (2009) Using sp_executesql. [Online] Available from: http://msdn.microsoft.com/en-us/library/ms175170.aspx (Accessed: 15 February 2010)


15 February 2010

💎SQL Reloaded: Oracle vs. SQL Server (Handling Missing Dates)

    What do we do with missing values? How do they influence the data analysis? These are two questions each developer should ask users when creating or modifying a report. For numbers things are in general pretty simple: just replace the NULL values with 0, which usually impacts the report minimally. With dates it is a little more complicated, because it may be necessary to remove the records in which a certain date is null, or to replace the null with another date representing a certain event, or with the current date.

    As the NULLIF and COALESCE functions and the CASE expression are available in both systems, they can be used to handle missing dates or values of other data types, especially when the code is intended to be portable between the two platforms. In addition SQL Server provides the IsNull function, whose Oracle equivalent is NVL, extended by NVL2. Oracle also provides an equivalent for the simple CASE, namely the DECODE function. Here are some examples based on the Product table from the AdventureWorks database:
  
-- SQL Server NULL handling 
SELECT ProductID, SellStartDate, SellEndDate, DiscontinuedDate, GetDate() CurrentDate 
, IsNull(SellEndDate, GetDate()) Example1 
, COALESCE(SellEndDate, GetDate()) Example3 
, COALESCE(DiscontinuedDate, SellEndDate, GetDate()) Example4 
, CASE 
     WHEN SellEndDate IS NULL THEN GetDate() 
     ELSE SellEndDate 
END Example6 
, NullIf(SellEndDate, SellStartDate) Example7 
FROM Production.Product 

-- Oracle NULL handling 
SELECT PRODUCTID, SELLSTARTDATE, SELLENDDATE, DISCONTINUEDDATE, SYSDATE 
, NVL(SELLENDDATE, SYSDATE) EXAMPLE1 
, NVL2(DISCONTINUEDDATE, SELLENDDATE, SYSDATE) EXAMPLE2 
, COALESCE(SELLENDDATE, SYSDATE) EXAMPLE3 
, COALESCE(DISCONTINUEDDATE, SELLENDDATE, SYSDATE) EXAMPLE4 
, DECODE(SELLENDDATE, NULL, SYSDATE, SELLENDDATE) EXAMPLE5 
, CASE 
    WHEN SELLENDDATE IS NULL THEN SYSDATE 
    ELSE SELLENDDATE 
END EXAMPLE6 
, NULLIF(SELLENDDATE, SELLSTARTDATE) EXAMPLE7 
FROM PRODUCT 

      The simple CASE can’t be used to check for NULL values unless the NULL case is implied by the ELSE branch, because checking whether a value is missing requires the IS NULL predicate. An exception to this rule is the DECODE function, which treats two NULLs as equivalent, as can be seen in the 5th example of the Oracle-based query.
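
    The difference between the two CASE forms can be checked with a short SQL Server sketch. The variable is a stand-in for a column such as SellEndDate, and the default ANSI_NULLS ON setting is assumed:

```sql
-- Sketch only (assumes SQL Server 2008+ and the default ANSI_NULLS ON)
DECLARE @SellEndDate datetime = NULL -- stand-in for a missing date

SELECT CASE @SellEndDate
     WHEN NULL THEN 'missing' -- never matches: the comparison with NULL is not true
     ELSE 'present'
END SimpleCase -- returns 'present'
, CASE
     WHEN @SellEndDate IS NULL THEN 'missing'
     ELSE 'present'
END SearchedCase -- returns 'missing'
```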

     Aggregate functions ignore NULL values in both systems, therefore any of the above techniques can be used to handle the missing values when a different behavior is expected. Which date to use when the target date is missing depends on the requirements: usually the current date is used when the reports focus on today’s status, while for the calculation of lead/cycle times the date closest to the target date is preferred.

-- SQL Server: handling nulls in aggregations 
SELECT AVG(DateDiff(d, SELLSTARTDATE, IsNull(SELLENDDATE, GETDATE()))) -- handling null values 
, AVG(DateDiff(d,SELLSTARTDATE, SELLENDDATE)) -- ignoring null values 
FROM Production.Product 

-- Oracle: handling nulls in aggregations 
SELECT AVG(TRUNC(NVL(SELLENDDATE, SYSDATE)) - TRUNC(SELLSTARTDATE)) -- handling null values 
, AVG(TRUNC(SELLENDDATE) - TRUNC(SELLSTARTDATE)) -- ignoring null values 
FROM PRODUCT 
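
    How much the chosen replacement influences an aggregate can be seen on a handful of made-up values; the sketch below (SQL Server syntax, hypothetical table variable) contrasts ignoring the NULL with replacing it by 0:

```sql
-- Sketch only: made-up data held in a table variable
DECLARE @Values TABLE (Amount int NULL)
INSERT INTO @Values VALUES (10)
INSERT INTO @Values VALUES (20)
INSERT INTO @Values VALUES (NULL)

SELECT AVG(Amount) AvgIgnoringNulls -- (10 + 20) / 2 = 15
, AVG(IsNull(Amount, 0)) AvgNullsAsZero -- (10 + 20 + 0) / 3 = 10
, COUNT(Amount) NonNullValues -- 2
, COUNT(*) AllRows -- 3
FROM @Values
```

    The same reasoning applies to dates: replacing a missing end date with the current date changes the average duration, so the replacement should be agreed with the users.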

    Another scenario in which missing date values need to be handled is in joins, where a missing value often denotes that the record is still active, as in the example below:

-- SQL Server: getting the Start Date for current List Price 
SELECT ITM.ProductID 
, ITM.ProductNumber 
, ITM.ListPrice 
, PPH.StartDate 
FROM Production.Product ITM 
JOIN Production.ProductListPriceHistory PPH 
ON ITM.ProductID = PPH.ProductID 
AND PPH.EndDate IS NULL 
   
   The above query can also be rewritten in the following form, which is useful for getting the record in use during a certain event:

-- SQL Server: getting the List Price when the product was last time modified 
SELECT ITM.ProductID 
, ITM.ProductNumber 
, PPH.ListPrice 
, PPH.StartDate 
, PPH.EndDate 
, ITM.ModifiedDate 
FROM Production.Product ITM 
JOIN Production.ProductListPriceHistory PPH 
ON ITM.ProductID = PPH.ProductID 
AND DateDiff(d, ITM.ModifiedDate, IsNull(PPH.EndDate, GETDATE()))>=0 
AND DateDiff(d, PPH.StartDate, ITM.ModifiedDate)>=0 

    In Oracle special attention must be given to the LEAST and GREATEST functions, which return the smallest/greatest value from a list of attributes: if one of the values is NULL then the function’s result is NULL too. Therefore, if one of the attributes can take NULL values, it should be replaced with the value of a NOT NULL attribute already used in the function, as in the 3rd and 4th examples below; this change has no impact on the functions’ output.

-- Oracle Greatest/Least example 
SELECT PRODUCTID, SELLSTARTDATE, SELLENDDATE, SYSDATE 
, GREATEST(SELLSTARTDATE, SELLENDDATE, SYSDATE) EXAMPLE1 
, LEAST(SELLSTARTDATE, SELLENDDATE, SYSDATE) EXAMPLE2 
, GREATEST(SELLSTARTDATE, NVL(SELLENDDATE, SYSDATE), SYSDATE) EXAMPLE3 
, LEAST(SELLSTARTDATE, NVL(SELLENDDATE, SYSDATE), SYSDATE) EXAMPLE4 
FROM PRODUCT

💎SQL Reloaded: Oracle vs. SQL Server (Date Conversion)

    During data conversions, data migrations or even simple data processing it is often required to format dates to a given format, extract a given time unit or convert a string to a date data type. Even if Oracle and SQL Server provide several functions for this purpose, there are a few small techniques that can make things easier.

    In SQL Server the DatePart and DateName functions can be used to extract the various time units. The first function always returns an integer, while the second returns a character string, which allows getting the name of the current month or of the current day of the week; otherwise the output is quite similar.
 
-- SQL Server DatePart 
SELECT GETDATE() CurrentDate 
, DatePart(ss, GETDATE()) SecondPart 
, DatePart(mi, GETDATE()) MinutePart 
, DatePart(hh, GETDATE()) HourPart 
, DatePart(d, GETDATE()) DayPart 
, DatePart(wk, GETDATE()) WeekPart 
, DatePart(mm, GETDATE()) MonthPart 
, DatePart(q, GETDATE()) QuarterPart 
, DatePart(yyyy, GETDATE()) YearPart  

-- SQL Server DateName 
SELECT GETDATE() CurrentDate 
, DateName(ss, GETDATE()) SecondPart 
, DateName(mi, GETDATE()) MinutePart 
, DateName(hh, GETDATE()) HourPart 
, DateName(d, GETDATE()) DayPart 
, DateName(wk, GETDATE()) WeekPart 
, DateName(mm, GETDATE()) MonthPart 
, DateName(q, GETDATE()) QuarterPart 
, DateName(yyyy, GETDATE()) YearPart 
, DateName(mm, GETDATE()) MonthName 
, DateName(dw, GETDATE()) DayName -- dw returns the name of the weekday, dd only the day number 

    SQL Server provides three quite useful functions for getting the Day, Month or Year of a given date: 
 
-- SQL Server alternative functions 
SELECT DAY(GetDate()) DayPart 
, MONTH(GetDate()) MonthPart 
, YEAR(GetDate()) YearPart 

    Oracle provides a more flexible alternative to the DateName function, namely the TO_CHAR function, which allows not only the extraction of the different time units but also the conversion of a date to a specified format.
  
-- Oracle Date parts 
SELECT to_char(SYSDATE, 'dd-mon-yyyy hh24:mi:ss') CurrentDate 
, to_char(SYSDATE, 'SS') SecondPart 
, to_char(SYSDATE, 'MI') MinutePart 
, to_char(SYSDATE, 'HH24') HourPart -- HH alone returns the hour on a 12-hour clock 
, to_char(SYSDATE, 'DD') DayPart 
, to_char(SYSDATE, 'IW') WeekPart 
, to_char(SYSDATE, 'MM') MonthPart 
, to_char(SYSDATE, 'Q') QuarterPart 
, to_char(SYSDATE, 'YYYY') YearPart 
, to_char(SYSDATE, 'MONTH') MonthName 
, to_char(SYSDATE, 'DAY') DayName 
FROM DUAL 

-- Oracle Date formatting (aliases follow the names of the SQL Server CONVERT styles) 
SELECT to_char(SYSDATE, 'yyyy-mm-dd hh24:mi:ss') CurrentDate 
, to_char(SYSDATE, 'Mon dd yyyy hh24:mi') DefaultDateFormat 
, to_char(SYSDATE, 'mm/dd/yyyy') USDateFormat 
, to_char(SYSDATE, 'yyyy.mm.dd') ANSIDateFormat 
, to_char(SYSDATE, 'dd/mm/yyyy') BritishDateFormat 
, to_char(SYSDATE, 'dd.mm.yyyy') GermanDateFormat 
, to_char(SYSDATE, 'dd-mm-yyyy') ItalianDateFormat 
, to_char(SYSDATE, 'yyyy/mm/dd') JapanDateFormat 
, to_char(SYSDATE, 'yyyymmdd') ISODateFormat 
, to_char(SYSDATE, 'dd Mon yyyy hh24:mi:ss') EuropeDateFormat 
, to_char(SYSDATE, 'yyyy-mm-dd hh24:mi:ss') ODBCDateFormat 
, Replace(to_char(SYSDATE, 'yyyy-mm-dd hh24:mi:ss'), ' ', 'T') ISO8601DateFormat 
FROM DUAL 


    Even if more combinations are possible, the above examples can be used as a starting point; they were chosen to match the similar functionality provided by SQL Server through the CONVERT function and its styles.
-- SQL Server date formatting (aliases follow the names of the CONVERT styles) 
SELECT GETDATE() CurrentDate 
, CONVERT(varchar(20), GETDATE(), 100) DefaultDateFormat 
, CONVERT(varchar(20), GETDATE(), 101) USDateFormat 
, CONVERT(varchar(20), GETDATE(), 102) ANSIDateFormat 
, CONVERT(varchar(20), GETDATE(), 103) BritishDateFormat 
, CONVERT(varchar(20), GETDATE(), 104) GermanDateFormat 
, CONVERT(varchar(20), GETDATE(), 105) ItalianDateFormat 
, CONVERT(varchar(20), GETDATE(), 111) JapanDateFormat 
, CONVERT(varchar(20), GETDATE(), 112) ISODateFormat 
, CONVERT(varchar(20), GETDATE(), 113) EuropeDateFormat 
, CONVERT(varchar(20), GETDATE(), 120) ODBCDateFormat 
, CONVERT(varchar(19), GETDATE(), 126) ISO8601DateFormat -- 19 characters cut the value right after the seconds 
 
    The use of the CONVERT function with styles is not the best approach, though it saves the day. When the same formatting is used in multiple objects it makes sense to encapsulate the date conversions in a function, making them easier to use and to maintain in case the formatting changes.
 
CREATE FUNCTION dbo.GetDateAsString( @date datetime) 
RETURNS varchar(10) 
AS 
BEGIN 
    RETURN CONVERT(varchar(10), @date, 103) 
END 
 
   The inverse problem is the conversion of a string to a date. Oracle provides the TO_DATE, CAST, TO_TIMESTAMP and TO_TIMESTAMP_TZ functions for this purpose, the first two being the most used.
--Oracle String to Date Conversion 
SELECT TO_DATE('25-03-2009', 'DD-MM-YYYY') 
, TO_DATE('25-03-2009 18:30:23', 'DD-MM-YYYY HH24:MI:SS') 
, Cast('25-MAR-2009' as Date) 
FROM DUAL    

Besides the CONVERT function mentioned above, SQL Server also provides the CAST function, both allowing the conversion of strings to dates.
 
SELECT CAST('2009-03-25' as date) 
, CONVERT(date, '2009-03-25') 


  When saving dates into text attributes in SQL Server, the ISO format (yyyymmdd) should be targeted, because it is independent of the format set by DATEFORMAT. Otherwise, if the format of the stored date is known, the string can be translated to the ISO format as in the function below:
 
--SQL Server: DD/MM/YYYY String to Date function 
CREATE FUNCTION dbo.GetStringDate( 
@date varchar(10)) 
RETURNS datetime 
AS 
BEGIN 
     -- builds the yyyymmdd ISO string; a yyyy/mm/dd string would still depend on DATEFORMAT for datetime 
     RETURN Cast(Right(@date, 4) + Substring(@date, 4,2) + Left(@date, 2) as datetime) 
END 
GO 

SELECT dbo.GetStringDate('25/09/2009') 
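
    The dependency on DATEFORMAT can be checked with a short sketch that needs no tables. The literals are arbitrary examples, and the failing conversion is left commented out so the script runs through:

```sql
-- Sketch only: the same literals under two DATEFORMAT settings
SET DATEFORMAT dmy
SELECT CAST('20090925' as datetime) IsoBasic -- 25 September 2009
, CAST('2009-09-25T00:00:00' as datetime) Iso8601 -- 25 September 2009
, CAST('25/09/2009' as datetime) DmyString -- 25 September 2009, but only because of dmy

SET DATEFORMAT mdy
SELECT CAST('20090925' as datetime) IsoBasic -- still 25 September 2009
, CAST('2009-09-25T00:00:00' as datetime) Iso8601 -- still 25 September 2009
-- CAST('25/09/2009' as datetime) fails under mdy with an out-of-range error, as 25 is taken as the month
```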


    Another approach I found quite useful in several cases is based on the VBScript DateSerial function, which allows the creation of a date from its constituents:

-- SQL Server: DateSerial 
CREATE FUNCTION dbo.DateSerial( 
@year int 
, @month smallint 
, @day smallint) 
RETURNS date 
AS 
BEGIN 
RETURN (Cast(@year as varchar(4)) + '-' + Cast(@month as varchar(2)) + '-' + Cast(@day as varchar(2))) 
END 
GO 

SELECT dbo.DateSerial(2009,10,24) 
GO 

-- SQL Server: DateTimeSerial 
CREATE FUNCTION dbo.DateTimeSerial( 
@year int 
, @month smallint 
, @day smallint 
, @hour smallint 
, @minute smallint 
, @second smallint) 
RETURNS datetime 
AS 
BEGIN 
-- note: for datetime a 'yyyy-m-d' string is interpreted according to DATEFORMAT (it works with the default mdy setting) 
RETURN (Cast(@year as varchar(4)) + '-' + Cast(@month as varchar(2)) + '-' + Cast(@day as varchar(2)) 
+ ' ' + Cast(@hour as varchar(2)) + ':' + Cast(@minute as varchar(2)) + ':' + Cast(@second as varchar(2))) 
END 

    Given the flexibility of the TO_DATE function, none of the three functions above (GetStringDate, DateSerial and DateTimeSerial) is really needed in Oracle.

🕋Data Warehousing: Aggregation (Definitions)

"A table or structure containing pre-calculated data for a cube. Aggregations support rapid and efficient querying of a multidimensional database. " (Microsoft Corporation, "SQL Server 7.0 System Administration Training Kit", 1999)

"Data that’s the result of applying a process to combine data elements. This is data that’s taken collectively or in summary form. Usually, this is a sum, count, or average of underlying detail data." (Sharon Allen & Evan Terry, "Beginning Relational Data Modeling" 2nd Ed., 2005)

"A process where the data is presented in a summary form, such as average." (Glenn J Myatt, "Making Sense of Data: A Practical Guide to Exploratory Data Analysis and Data Mining", 2006)

"Summarized values of a measure." (Reed Jacobsen & Stacia Misner, "Microsoft SQL Server 2005 Analysis Services Step by Step", 2006)

"Any of a variety of operations that reduces and summarizes large sets of data." (MongoDb, "Glossary", 2008)

"A special type of abstraction relationship that defines a higher-level entity that is an aggregate of several lower-level entities; a 'part-of' type relationship. For example, a bicycle entity would be an aggregate of wheel, handlebar, and seat entities." (Toby J Teorey, "Database Modeling and Design" 4th Ed., 2010)

"1.Generally, the process of gathering into a whole from parts. 2.In data management, a process that transforms atomic data into aggregate-level information by using an aggregation function such as count, sum, average, standard deviation, etc." (DAMA International, "The DAMA Dictionary of Data Management", 2011)

"A table or structure containing pre-calculated data for an online analytical processing (OLAP) cube. Aggregations support the rapid and efficient querying of a multidimensional database." (Microsoft, "SQL Server 2012 Glossary", 2012)

"A process by which variables are summed based on a classification or temporal hierarchy. Common examples include totaling all sales for a given time period or geographic region." (Evan Stubbs, "Delivering Business Analytics: Practical Guidelines for Best Practice", 2013)

"The sum or collection of data. Often 'aggregation' is used to summarize sales at various levels of the merchandise, location, and time hierarchy." (Brittany Bullard, "Style and Statistics", 2016)

"The act of combining information from separate sources of a lower classification level that results in the creation of information of a higher classification level, which the subject does not have the necessary rights to access." (Shon Harris & Fernando Maymi, "CISSP All-in-One Exam Guide" 8th Ed., 2018)

🕋Data Warehousing: Control Flow (Definitions)

"A component of a SQL Server Integration Services package that controls the flow of tasks within the package. The high-level workflow of a package. Often contains one or more data flow tasks." (Victor Isakov et al, "MCITP Administrator: Microsoft SQL Server 2005 Optimization and Maintenance (70-444) Study Guide", 2007)

"Concept to define causal dependency between process steps to enforce a specific execution order." (Christoph Bussler, "B2B and EAI with Business Process Management", 2009)

"The flow of control defines a partial order relationship between the activities of a business process model, specifying in which temporal order they will be executed." (Cesare Pautasso, "Compiling Business Process Models into Executable Code", 2009)

"A group of connected control flow elements that perform tasks." (Microsoft, "SQL Server 2012 Glossary", 2012)

"An abstract representation of all possible sequences of events (paths) during execution of a component or system. Often represented in graphical form, see control flow graph." (Tilo Linz et al, "Software Testing Foundations" 4th Ed., 2014)

"The control-flow perspective focuses on the control-flow, i.e., the ordering of activities. The goal of mining this perspective is to find a good characterization of all possible paths. Other popular perspectives are the organizational perspective and the case perspective." (Pavlos Delias et al, "Applying Process Mining to the Emergency Department", Encyclopedia of Business Analytics and Optimization, 2014)

"In DB2 data warehousing, a graphical model that sequences data flows and mining flows, integrates external commands, programs, and stored procedures, and provides conditional processing logic for a data warehouse application." (Sybase, "Open Server Server-Library/C Reference Manual", 2019)

"A sequence of events (paths) in the execution through a component or system." (ISTQB)

"The ordered workflow in an Integration Services package that performs tasks." (Microsoft Technet)
