15 February 2010

💎SQL Reloaded: Oracle vs. SQL Server (Date Conversion)

    During data conversions, data migrations or also during simple processing of data is requested to format dates to a given format, extract a given time unit or convert a string to a date data type. Even if Oracle and SQL Server provides several functions for this purpose, there are small techniques that could help make things easier.

    In SQL Server the DatePart and DateName functions can be used to extract the various type of time units, the first function returning always an integer, while the second returns a character string, allowing thus to get the name of the current month or of the current day of the week, otherwise the output being quite similar. 
 
-- SQL Server DatePart 
SELECT GETDATE() CurrentDate 
, DatePart(ss, GETDATE()) SecondPart 
, DatePart(mi, GETDATE()) MinutePart 
, DatePart(hh, GETDATE()) MinutePart 
, DatePart(d, GETDATE()) DayPart 
, DatePart(wk, GETDATE()) WeekPart 
, DatePart(mm, GETDATE()) MonthPart 
, DatePart(q, GETDATE()) QuaterPart 
, DatePart(yyyy, GETDATE()) YearPart  
-- SQL Server DateName 
SELECT GETDATE() CurrentDate 
, DateName(ss, GETDATE()) SecondPart 
, DateName(mi, GETDATE()) MinutePart 
, DateName(hh, GETDATE()) MinutePart 
, DateName(d, GETDATE()) DayPart 
, DateName(wk, GETDATE()) WeekPart 
, DateName(mm, GETDATE()) MonthPart 
, DateName(q, GETDATE()) QuaterPart 
, DateName(yyyy, GETDATE()) YearPart , DateName(mm, GETDATE()) MonthName 
, DateName(dd, GETDATE()) DayName 

    SQL Server provides three quite useful functions for getting the Day, Month or Year of a given date: 
 
-- SQL Server alternative functions 
SELECT DAY(GetDate()) DayPart 
, MONTH(GetDate()) MonthPart , YEAR(GetDate()) YearPar 

    Oracle provides a more flexible alternative of DateName function, respectively the TO_CHAR function, that allow not only the extraction of the different time units, but also the conversion of a date to a specified format. 
  
-- Oracle Date parts 
SELECT to_char(SYSDATE, 'dd-mon-yyyy hh24:mi:ss') CurrentDate 
, to_char(SYSDATE, 'SS') SecondPart 
, to_char(SYSDATE, 'MI') MinutePart 
, to_char(SYSDATE, 'HH') HourPart 
, to_char(SYSDATE, 'DD') DayPart 
, to_char(SYSDATE, 'IW') WeekPart 
, to_char(SYSDATE, 'MM') MonthPart 
, to_char(SYSDATE, 'QQ') QuarterPart 
, to_char(SYSDATE, 'YYYY') YearPart 
, to_char(SYSDATE, 'MONTH') MonthName 
, to_char(SYSDATE, 'DAY') DayName 
FROM DUAL 
-- Oracle Date formatting 
SELECT to_char(SYSDATE, 'yyyy-mm-dd hh24:mi:ss') CurrentDate 
, to_char(SYSDATE, 'Mon dd yyyy hh24:mi') USDateFormat 
, to_char(SYSDATE, 'mm/dd/yyyy') ANSIDateFormat 
, to_char(SYSDATE, 'yyyy.mm.dd') BritishDateFormat 
, to_char(SYSDATE, 'dd/mm/yyyy') GermanDateFormat 
, to_char(SYSDATE, 'dd-mm-yyyy') ItalianDateFormat 
, to_char(SYSDATE, 'yyyy/mm/dd') JapanDateFormat 
, to_char(SYSDATE, 'yyyymmdd') ISODateFormat 
, to_char(SYSDATE, 'dd Mon yyyy hh24:mi:ss') EuropeDateFormat 
, to_char(SYSDATE, 'yyyy-mm-dd hh24:mi:ss') ODBCDateFormat 
, Replace(to_char(SYSDATE, 'yyyy-mm-dd hh24:mi:ss'), ' ', 'T') ISO8601DateFormat 
FROM DUAL 7


Date Oracle fomatting
    Even if there are more plausible combinations, the above examples could be used as a starting point, they being chosen to match the similar functionality provided by SQL Server using the CONVERT function and styles. 
-- SQL Server date formatting 
SELECT GETDATE() CurrentDate 
, CONVERT(varchar(20), GETDATE(), 100) USDateFormat 
, CONVERT(varchar(20), GETDATE(), 101) ANSIDateFormat 
, CONVERT(varchar(20), GETDATE(), 102) BritishDateFormat 
, CONVERT(varchar(20), GETDATE(), 103) GermanDateFormat 
, CONVERT(varchar(20), GETDATE(), 105) ItalianDateFormat 
, CONVERT(varchar(20), GETDATE(), 111) JapanDateFormat 
, CONVERT(varchar(20), GETDATE(), 112) ISODateFormat 
, CONVERT(varchar(20), GETDATE(), 113) EuropeDateFormat 
, CONVERT(varchar(20), GETDATE(), 120) ODBCDateFormat 
, CONVERT(varchar(20), GETDATE(), 126) ISO8601DateFormat 
 
Date SQL Server formatting
    The use of CONVERT function with styles is not the best approach though it saves the day. When the same formatting is used in multiple objects it makes sense to encapsulate the used date conversions in a function, making thus easier their use and their maintenance in case of changes of formatting. 
 
CREATE FUNCTION dbo.GetDateAsString( @date datetime) 
RETURNS varchar(10) 
AS 
BEGIN 
    RETURN CONVERT(varchar(10), @date, 103) 
END 
 
   The inverse problem is the conversion of a string to a date, Oracle providing the TO_DATE, CAST, TO_TIMESTAMP and  TO_TIMESTAMP_TZ functions for this purpose, the first two functions being the most used. 
--Oracle String to Date Conversion 
SELECT TO_DATE('25-03-2009', 'DD-MM-YYYY') 
, TO_DATE('25-03-2009 18:30:23', 'DD-MM-YYYY HH24:MI:SS') 
, Cast('25-MAR-2009' as Date) 
FROM DUAL    

Excepting the CONVERT function mentioned above, SQL Server provides a CAST function too, both allowing the conversion of strings to date. 
 
SELECT CAST('2009-03-25' as date) 
, CONVERT(date, '2009-03-25') 


  When saving dates into text attributes in SQL Server it should be targeted to use the ISO format which is independent of the format set by DATEFORMAT, otherwise, in case the format of the date stored is known, the string could be translated to the ISO format like in the below function: 
 
--SQL Server: DD/MM/YYYY  String to Date function 
CREATE 
FUNCTION dbo.GetStringDate( 
@date varchar(10)) 
RETURNS datetime 
AS 
BEGIN 
     RETURN Cast(Right(@date, 4) + '/' + Substring(@date, 4,2) + '/' + Left(@date, 2) as datetime) 
END 
SELECT 
dbo.GetStringDate('25/09/2009') 


    Other approach I found quite useful in several cases is based on the VBScript DateSerial function that allows the creation of a date from its constituents:

-- SQL Server: DateSerial 
CREATE FUNCTION dbo.DateSerial( 
@year int 
, @month smallint , 
@day smallint) 
RETURNS 
date 
AS 
BEGIN 
RETURN (Cast(@year as varchar(4)) + '-' + Cast(@month as varchar(2)) + '-' + Cast(@day as varchar(2))) 
END 
SELECT 
dbo.DateSerial(2009,10,24) 
-- SQL Server: DateTimeSerial 
CREATE FUNCTION dbo.DateTimeSerial( 
@year int 
, @month smallint 
, @day smallint 
, @hour smallint 
, @minute smallint 
, @second smallint) 
RETURNS 
datetime AS 
BEGIN 
RETURN (Cast(@year as varchar(4)) + '-' + Cast(@month as varchar(2)) + '-' + Cast(@day as varchar(2)) 
+ ' ' + Cast(@hour as varchar(2)) + ':' + Cast(@minute as varchar(2)) + ':' + Cast(@second as varchar(2))) 
END 

    Given TO_DATE function’s flexibility none of the three above functions - GetStringDate, DateSerial and DateTimeSerial, are really needed in Oracle.

🕋Data Warehousing: Aggregation (Definitions)

"A table or structure containing pre-calculated data for a cube. Aggregations support rapid and efficient querying of a multidimensional database. " (Microsoft Corporation, "SQL Server 7.0 System Administration Training Kit", 1999)

"Data that’s the result of applying a process to combine data elements. This is data that’s taken collectively or in summary form. Usually, this is a sum, count, or average of underlying detail data." (Sharon Allen & Evan Terry, "Beginning Relational Data Modeling" 2nd Ed., 2005)

"A process where the data is presented in a summary form, such as average." (Glenn J Myatt, "Making Sense of Data: A Practical Guide to Exploratory Data Analysis and Data Mining", 2006)

"Summarized values of a measure." (Reed Jacobsen & Stacia Misner, "Microsoft SQL Server 2005 Analysis Services Step by Step", 2006)

"A process where the data is presented in a summary form, such as average." (Glenn J Myatt, "Making Sense of Data: A Practical Guide to Exploratory Data Analysis and Data Mining", 2007)

"Any of a variety of operations that reduces and summarizes large sets of data." (MongoDb, "Glossary", 2008)

"A special type of abstraction relationship that defines a higher-level entity that is an aggregate of several lower-level entities; a 'part-of' type relationship. For example, a bicycle entity would be an aggregate of wheel, handlebar, and seat entities." (Toby J Teorey, ", Database Modeling and Design 4th Ed", 2010)

"1.Generally, the process of gathering into a whole from parts. 2.In data management, a process that transforms atomic data into aggregate-level information by using an aggregation function such as count, sum, average, standard deviation, etc." (DAMA International, "The DAMA Dictionary of Data Management", 2011)

"A table or structure containing pre-calculated data for an online analytical processing (OLAP) cube. Aggregations support the rapid and efficient querying of a multidimensional database." (Microsoft, "SQL Server 2012 Glossary", 2012)

"A process by which variables are summed based on a classification or temporal hierarchy. Common examples include totaling all sales for a given time period or geographic region." (Evan Stubbs, "Delivering Business Analytics: Practical Guidelines for Best Practice", 2013)

"The sum or collection of data. Often 'aggregation' is used to summarize sales at various levels of the merchandise, location, and time hierarchy." (Brittany Bullard, "Style and Statistics", 2016)

"The act of combining information from separate sources of a lower classification level that results in the creation of information of a higher classification level, which the subject does not have the necessary rights to access." (Shon Harris & Fernando Maymi, "CISSP All-in-One Exam Guide" 8th Ed., 2018)

🕋Data Warehousing: Control Flow (Definitions)

"A component of a SQL Server Integration Services package that controls the flow of tasks within the package. The high-level workflow of a package. Often contains one or more data flow tasks." (Victor Isakov et al, "MCITP Administrator: Microsoft SQL Server 2005 Optimization and Maintenance (70-444) Study Guide", 2007)

"Concept to define causal dependency between process steps to enforce a specific execution order." (Christoph Bussler, "B2B and EAI with Business Process Management", 2009)

"The flow of control defines a partial order relationship between the activities of a business process model, specifying in which temporal order they will be executed." (Cesare Pautasso, "Compiling Business Process Models into Executable Code", 2009)

"A group of connected control flow elements that perform tasks." (Microsoft, "SQL Server 2012 Glossary", 2012)

"An abstract representation of all possible sequences of events (paths) during execution of a component or system. Often represented in graphical form, see control flow graph." (Tilo Linz et al, "Software Testing Foundations" 4th Ed., 2014)

"The control-flow perspective focuses on the control-flow, i.e., the ordering of activities. The goal of mining this perspective is to find a good characterization of all possible paths. Other popular perspectives are the organizational perspective and the case perspective." (Pavlos Delias et al, "Applying Process Mining to the Emergency Department", Encyclopedia of Business Analytics and Optimization, 2014)

"In DB2 data warehousing, a graphical model that sequences data flows and mining flows, integrates external commands, programs, and stored procedures, and provides conditional processing logic for a data warehouse application." (Sybase, "Open Server Server-Library/C Reference Manual", 2019)

"A sequence of events (paths) in the execution through a component or system." (ISTQB)

"The ordered workflow in an Integration Services package that performs tasks." (Microsoft Technet)

14 February 2010

💎SQL Reloaded: Oracle vs. SQL Server (Date Validation)

When loading data into a table or even when performing date operations with text attributes that store date values, it’s mandatory to check whether the values represent a valid date, otherwise an error will be thrown, fact that could have negative impact on your logic. SQL Server provides the IsDate function for this purpose, the function returning a 1 if the value is a date, and 0 otherwise, though, as always, there is a catch or two. First of all the valid date refers only to the date format expected/supported by the database, for example SQL Server supports the ISO 8601 formats (YYYYMMDD, YYYY-MM-DD, YYYY-MM or YYYY) together with (independent of) the date format set using the DATEFORMAT or SET LANGUAGE, thus based on the existing settings ‘25-10-2009’ could be a valid date while ‘10-25-2009’ is not even if the interpretation of the respective value raises no doubt. In addition to shorter formats like YYYY-MM or YYYY, the date formats could include leading spaces (e.g. ‘ 2009-10- 25’) fact that could create problems when using the data outside SQL Server, therefore it’s a good idea to check whether the date is strictly in the requested format and eventually convert the various date types to a single supported format. 
 
Here are several examples with the IsDate function: 
 
-- SQL Server: IsDate function examples 
SELECT IsDate('20091025') Example1 --(ISO 8601) 
, IsDate('2009-10-25') Example2 --(ISO 8601) 
, IsDate('2009-10') Example3 --(ISO 8601) 
, IsDate('2009') Example4 -- (ISO 8601) 
, IsDate('25-10-2009') Example5 
, IsDate('10.25-2009') Example6 
, IsDate('10-25-2009') Example7 
, IsDate(' 10-25- 2009') Example8 
, IsDate('25-OCT-2009') Example9 
, IsDate(NULL) Example10 

Date IsDate SQL Server
  
There are two interesting points: the IsDate doesn’t recognize the ‘2009-10’ value as date even if it’s a ISO format, and it recognizes ‘10.25-2009’ as date. A NULL value is not recognized as a valid date, therefore if other behavior is expected the NULL case needs to be handled. 
 
Oracle doesn’t provide a similar functionality like the IsDate function existing in Oracle, though something could be done in this direction – create a function that attempts the conversion to a date and return a 0 in case an error is caught, a 1 otherwise. Because Oracle doesn’t support a boolean or bit data type a number was preferred in order to simulate the same behavior as the SQL Server built-in function. Unfortunately if it’s attempted to create a function like the below one, function’s behavior might not be the expected one. 
  
--Oracle: IsDate implementation (limited use) 
CREATE OR REPLACE FUNCTION IsDate ( 
sDate varchar2 ) RETURN 
number IS result_value date; 
BEGIN 
result_value:= Cast(sDate as date);
       RETURN CASE WHEN sDate IS NULL THEN 0 ELSE 1 END; 
       EXCEPTION 
       WHEN OTHERS THEN       

RETURN 0; 
END ; 

 Here are the above examples but this time in Oracle: 

-- Oracle: IsDate function examples 
SELECT IsDate('20091025') Example1 --(ISO 8601) 
, IsDate('2009-10-25') Example2 --(ISO 8601) 
, IsDate('2009-10') Example3 --(ISO 8601) 
, IsDate('2009') Example4 -- (ISO 8601) 
, IsDate('25-10-2009') Example5 
, IsDate('10.25-2009') Example6 
, IsDate('10-25-2009') Example7 
, IsDate(' 10-25- 2009') Example8 
, IsDate('25-OCT-2009') Example9 
, IsDate(NULL) Example10 
FROM DUAL 

Date IsDate Oracle


The above function could be modified to use the TO_DATE function in order to convert explicitly the string value to a date by hard-coding the expected format. On the other side the expected format could be provided as parameter like in the below implementation, gaining thus flexibility.

--Oracle: IsDateFormat implementation 
CREATE OR REPLACE FUNCTION IsDateFormat ( 
   sDate varchar2 
, sFormat varchar2) RETURN 
number IS result_value date; 
BEGIN 
result_value:= to_date(sDate, sFormat);
      RETURN CASE WHEN sDate IS NULL THEN 0 ELSE 1 END; 
      EXCEPTION 
      WHEN OTHERS THEN 
      RETURN 0; 
END ; 

 Here are the same examples as above, this time with the expected format provided explicitly: 
 
-- Oracle: IsDateFormat function examples 
SELECT IsDateFormat('20091025', 'YYYYMMDD') Example1 --(ISO 8601) 
, IsDateFormat('2009-10-25', 'YYYY-MM-DD') Example2 --(ISO 8601) 
, IsDateFormat('2009-10', 'YYYY-MM') Example3 --(ISO 8601) 
, IsDateFormat('2009', 'YYYY') Example4 -- (ISO 8601) 
, IsDateFormat('25-10-2009', 'DD-MM-YYYY') Example5 
, IsDateFormat('10.25-2009', 'MM-DD-YYYY') Example6 
, IsDateFormat('10-25-2009', 'MM-DD-YYYY') Example7 
, IsDateFormat(' 10-25- 2009', 'MM-DD-YYYY') Example8 
, IsDateFormat('25-OCT-2009', 'DD-MON-YYYY') Example9 
, IsDateFormat(NULL, 'YYYY-MM-DD') Example10 
FROM DUAL
 

Date IsDateFormat Oracle  

💎SQL Reloaded: Oracle vs. SQL Server (Date Operations)

Difference between two dates 
 
    Often is needed to calculate the difference in days (or in any other time unit) between two dates, for example in order to determine the cycle/lead time, further analysis being done based on such data. In SQL Server the difference between two dates can be obtained using the DateDiff function which allows specifying the time unit used. Here’s the use for most important time units: 

-- SQL Server Date Difference 
SELECT ProductID, SellStartDate, SellEndDate, GetDate() CurrentDate , 
DATEDIFF(ss, SellStartDate, IsNull(SellEndDate, GetDate())) NumberSeconds , 
DATEDIFF(mi, SellStartDate, IsNull(SellEndDate, GetDate())) NumberMinutes , 
DATEDIFF(hh, SellStartDate, IsNull(SellEndDate, GetDate())) NumberHours , 
DATEDIFF(d, SellStartDate, IsNull(SellEndDate, GetDate())) NumberDays , 
DATEDIFF(wk, SellStartDate, IsNull(SellEndDate, GetDate())) NumberWeeks , 
DATEDIFF(mm, SellStartDate, IsNull(SellEndDate, GetDate())) NumberMonths , 
DATEDIFF(qq, SellStartDate, IsNull(SellEndDate, GetDate())) NumberQuarters , 
DATEDIFF(yyyy, SellStartDate, IsNull(SellEndDate, GetDate())) NumberYears  
FROM Production.Product   

    In Oracle the difference in days could be obtained using the subtraction operator “-”, therefore in order to get the difference in smaller time units the result should be multiplied 24 for hours, 24*60 for minutes, respectively 24*60*60 for seconds, and divided by 7, 30, 3*30 or 365 in order to approximate the difference in weeks, months, quarters, respectively years. 

-- Oracle Date Difference 
SELECT ProductID, SellStartDate, SellEndDate, SYSDATE , 
(NVL(SellEndDate, SYSDATE)- SellStartDate) * 24 * 60 * 60 NumberSeconds , 
(NVL(SellEndDate, SYSDATE)- SellStartDate) * 24 * 60 NumberMinutes , 
(NVL(SellEndDate, SYSDATE)- SellStartDate) * 24 NumberHours , 
NVL(SellEndDate, SYSDATE)- SellStartDate NumberDays , 
(NVL(SellEndDate, SYSDATE)- SellStartDate)/7 NumberWeeks , 
(NVL(SellEndDate, SYSDATE)- SellStartDate)/30 NumberMonths , 
(NVL(SellEndDate, SYSDATE)- SellStartDate)/(30*3) NumberQuarters , 
(NVL(SellEndDate, SYSDATE)- SellStartDate)/365 NumberYears  
FROM SQLServer.PRODUCT     

    Unfortunately the above approach won’t work if you want to use the result with numeric-based functions like Floor, because the difference is an interval and not a numeric value. If you attempt to use directly the Floor function you’ll get the “ORA-00932: inconsistent datatypes: expected NUMBER got INTERVAL DAY TO SECOND”. In order to overcome this issue the dates need to be truncated, the lowest time unit for this being the minutes. Here’s the modified query: 
-- Oracle Date Difference using truncated values 
SELECT ProductID, SellStartDate, SellEndDate, SYSDATE , 
(TRUNC(NVL(SellEndDate, SYSDATE), 'MI')- TRUNC(SellStartDate, 'MI')) * 24 * 60 * 60 NumberSeconds , 
(TRUNC(NVL(SellEndDate, SYSDATE), 'MI')- TRUNC(SellStartDate, 'MI')) * 24 * 60 NumberMinutes , 
TRUNC((TRUNC(NVL(SellEndDate, SYSDATE), 'HH')- TRUNC(SellStartDate, 'HH')) * 24) NumberHours , 
TRUNC(NVL(SellEndDate, SYSDATE))- TRUNC(SellStartDate) NumberDays , 
FLOOR((TRUNC(NVL(SellEndDate, SYSDATE), 'IW')- TRUNC(SellStartDate, 'IW'))/7) NumberWeeks , 
FLOOR((TRUNC(NVL(SellEndDate, SYSDATE), 'MM')- TRUNC(SellStartDate, 'MM'))/30) NumberMonths , 
FLOOR((TRUNC(NVL(SellEndDate, SYSDATE), 'Q')- TRUNC(SellStartDate, 'Q'))/(30*3)) NumberQuarters , 
FLOOR((TRUNC(NVL(SellEndDate, SYSDATE), 'YYYY')- TRUNC(SellStartDate, 'YYYY'))/365) NumberYears FROM  SQLServer.PRODUCT   

    Up until weeks the differences between the two queries are minimal, the important differences resulting for months, quarters and years. The logic used in the Oracle query could be encapsulated in a date function like the below one:
CREATE OR REPLACE FUNCTION DateDiff ( datepart varchar2 , 
startdate date , 
enddate date) RETURN 
NUMBER IS datepart1 varchar2 (4); 
BEGIN 
      datepart1:= CASE WHEN datepart = 'SS' THEN 'MI' ELSE datepart END;
RETURN FLOOR((TRUNC(enddate, datepart1)- TRUNC(startdate, datepart1)) 

* CASE datepart

    WHEN 'SS' THEN 24 * 60 * 60 
   WHEN 'MI' THEN 24 * 60 
   WHEN 'HH' THEN 24 
   WHEN 'DD' THEN 1 
   WHEN 'IW' THEN 1/7 
   WHEN 'MM' THEN 1/30 
   WHEN 'Q' THEN 1/(30*3)

   WHEN 'YYYY' THEN 1/365 
   ELSE 1 
END); END;  
  
    Here’s the above Oracle query rewritten using the Oracle DateDiff function: 
-- Oracle Date Difference 
SELECT  ProductID, SellStartDate, SellEndDate, SYSDATE , 
DateDiff('SS', SellStartDate, NVL(SellEndDate, SYSDATE)) NumberSeconds , 
DateDiff('MI', SellStartDate, NVL(SellEndDate, SYSDATE)) NumberMinutes , 
DateDiff('HH', SellStartDate, NVL(SellEndDate, SYSDATE)) NumberHours , 
DateDiff('DD', SellStartDate, NVL(SellEndDate, SYSDATE)) NumberDays , 
DateDiff('IW', SellStartDate, NVL(SellEndDate, SYSDATE)) NumberWeeks , 
DateDiff('MM', SellStartDate, NVL(SellEndDate, SYSDATE)) NumberMonths , 
DateDiff('Q', SellStartDate, NVL(SellEndDate, SYSDATE)) NumberQuarters , 
DateDiff('YYYY', SellStartDate, NVL(SellEndDate, SYSDATE)) NumberYears 
FROM SQLServer.PRODUCT 

Adding Time Units
  
  Oracle is using the add/subtract operators also in order to add/subtract time units from a Date, while SQL Server provides the DateAdd built-in function for the same type of operation. 

-- SQL Server adding time unit 
SELECT GetDate() CurrentDate , 
DateAdd(ss, 1, GetDate()) AddingSecond , 
DateAdd(mi, 1, GetDate()) AddingMinute , 
DateAdd(hh, 1, GetDate()) AddingHour , 
DateAdd(dd, 1, GetDate()) AddingDay , 
DateAdd(wk, 1, GetDate()) AddingWeek , 
DateAdd(mm, 1, GetDate()) AddingMonth , 
DateAdd(q, 1, GetDate()) AddingQuarter , 
DateAdd(yyyy, 1, GetDate()) AddingYear 

-- Oracle adding time unit 
SELECT to_char(SYSDATE, 'dd-mon-yyyy hh24:mi:ss') CurrentDate , 
to_char(SYSDATE + 1/(24*60*60), 'dd-mon-yyyy hh24:mi:ss') AddingSecond , 
to_char(SYSDATE + 1/(24*60), 'dd-mon-yyyy hh24:mi:ss') AddingMinute , 
to_char(SYSDATE + 1/24, 'dd-mon-yyyy hh24:mi:ss') AddingHour , 
to_char(SYSDATE + 1, 'dd-mon-yyyy hh24:mi:ss') AddingDay , 
to_char(SYSDATE + 7, 'dd-mon-yyyy hh24:mi:ss') AddingWeek , 
to_char(add_months(SYSDATE, 1), 'dd-mon-yyyy hh24:mi:ss') AddingMonth , 
to_char(add_months(SYSDATE, 3), 'dd-mon-yyyy hh24:mi:ss') AddingQuarter , 
to_char(add_months(SYSDATE, 12), 'dd-mon-yyyy hh24:mi:ss') AddingYear 
FROM DUAL    

    In order to provide similar functionality with SQL Server, could be created a DateAdd function also in Oracle: 
--Oracle DateAdd implementation 
CREATE OR REPLACE FUNCTION DateAdd ( 
 datepart varchar2 , 
unit number , 
targetdate date) RETURN 
date IS     resultdate  date; 
BEGIN 
     IF datepart IN ('MM', 'Q', 'YYYY') THEN resultdate  := CASE datepart  
           WHEN 'MM' THEN add_months(targetdate, unit)
         WHEN 'Q' THEN add_months(targetdate, unit*3) 

         ELSE add_months(targetdate, unit*12) 
   END; 
  ELSE resultdate  := targetdate + unit * CASE datepart 

     WHEN 'SS' THEN 1/ (24 * 60 * 60) 
     WHEN 'MI' THEN 1/ (24 * 60) 
     WHEN 'HH' THEN 1 / 24 
     WHEN 'DD' THEN 1 
     WHEN 'IW' THEN 7 
      ELSE 1 
END; 
END IF; 
RETURN resultdate; 
END; 

     Here’s the above Oracle query rewritten using the Oracle DateAdd function: 
-- Oracle using DateAdd function 
SELECT to_char(SYSDATE, 'dd-mon-yyyy hh24:mi:ss') CurrentDate , 
to_char(DateAdd('SS', 1, SYSDATE), 'dd-mon-yyyy hh24:mi:ss') AddingSecond , 
to_char(DateAdd('MI', 1, SYSDATE), 'dd-mon-yyyy hh24:mi:ss') AddingMinute , 
to_char(DateAdd('HH', 1, SYSDATE), 'dd-mon-yyyy hh24:mi:ss') AddingHour , 
to_char(DateAdd('DD', 1, SYSDATE), 'dd-mon-yyyy hh24:mi:ss') AddingDay , 
to_char(DateAdd('IW', 1, SYSDATE), 'dd-mon-yyyy hh24:mi:ss') AddingWeek , 
to_char(DateAdd('MM', 1, SYSDATE), 'dd-mon-yyyy hh24:mi:ss') AddingMonth , 
to_char(DateAdd('Q', 1, SYSDATE), 'dd-mon-yyyy hh24:mi:ss') AddingQuarter , 
to_char(DateAdd('YYYY', 1, SYSDATE), 'dd-mon-yyyy hh24:mi:ss') AddingYear 
FROM DUAL      

    Of course, in order to subtract a time unit then a negative value needs to be provided for the second parameter.

12 February 2010

🕋Data Warehousing: Operational Data Store (Definitions)

"The operational data store is subject-oriented and contains current, integrated, consistent data that reflects the current state of its subject. Operational data stores are similar to enterprise data warehouses in that they may include data from different systems that has been made consistent. Operational data stores are different from enterprise data warehouses in that they are updated frequently to reflect the current state of the operational systems." (Microsoft Corporation, "Microsoft SQL Server 7.0 Data Warehouse Training Kit", 2000)

"A physical set of tables sitting between the operational systems and the data warehouse or a specially administered hot partition of the data warehouse itself. The main reason for an ODS is to provide immediate reporting of operational results if neither the operational system nor the regular data warehouse can provide satisfactory access. Because an ODS is necessarily an extract of the operational data, it also may play the role of source for the data warehouse." (Ralph Kimball & Margy Ross, "The Data Warehouse Toolkit" 2nd Ed., 2002)

"The operational data store is a subject-oriented, integrated, current, volatile collection of data used to support the operational and tactical decision-making process for the enterprise. It is the central point of data integration for business management, delivering a common view of enterprise data." (Claudia Imhoff et al, "Mastering Data Warehouse Design", 2003)

"A hybrid structure designed to support both operational transaction processing and analytical processing." (William H Inmon, "Building the Data Warehouse", 2005)

"A collection of data from operational systems, most often integrated together, that is used for some operational purpose. The most critical characteristic here is that this is used for some operational function. This operational dependency takes precedence and the ODS should not be considered a central component of the data warehousing environment. An ODS can be a clean, integrated source of data to be pulled into the data warehousing environment." (Laura Reeves, "A Manager's Guide to Data Warehousing", 2009)

"A database designed to integrate data from multiple sources to facilitate operations. This is as opposed to a data warehouse, which integrates data from multiple sources to facilitate reporting and analysis." (David C Hay, "Data Model Patterns: A Metadata Map", 2010)

"A database that is subject-oriented, read-only to end users, current (non-historical), volatile, and integrated; is separate from and derived from one or more systems of record; and supports day-today business operations and real-time decision making." (David Lyle & John G Schmidt, "Lean Integration", 2010)

"A DB system designed to integrate data from multiple sources to allow operational access to the data for operational reporting." (Martin Oberhofer et al, "The Art of Enterprise Information Architecture", 2010)

"Database for transaction processing systems that uses data warehouse concepts to provide clean data." (Linda Volonino & Efraim Turban, "Information Technology for Management" 8th Ed., 2011)

"A database designed to integrate data from multiple sources for additional operations on the data." (Craig S Mullins, "Database Administration", 2012)

"A data store that provides data from the original source in near real time." (Brenda L Dietrich et al, "Analytics Across the Enterprise", 2014)

"ODS is the decision support database that integrated operational data from multiple source systems used to capture operational data and is used primarily for near real time operational reporting and analytics. ODSs are used to measure the operations processes efficiencies. The integration pattern is at the lowest levels of granularity and can happen from near real-time to multiple times in a day." (Saumya Chaki, "Enterprise Information Management in Practice", 2015)

"A data store that integrates data from a range of sources, which is subsequently merged and cleaned to serve as the foundation for enterprise operational reporting. It is an important piece of an Enterprise Data Warehouse (EDW) used for enterprise analytical reporting." (David K Pham, "From Business Strategy to Information Technology Roadmap", 2016)

"An ODS system integrates operational or transactional data from multiple systems to support operational reporting." (John D Kelleher & Brendan Tierney, "Data science", 2018)

"An operational data store (ODS) is an alternative to having operational decision support system (DSS) applications access data directly from the database that supports transaction processing (TP). While both require a significant amount of planning, the ODS tends to focus on the operational requirements of a particular business process (for example, customer service), and on the need to allow updates and propagate those updates back to the source operational system from which the data elements were obtained. The data warehouse, on the other hand, provides an architecture for decision makers to access data to perform strategic analysis, which often involves historical and cross-functional data and the need to support many applications." (Gartner)

💎SQL Reloaded: Ways of Looking at Data V (Hierarchies)


While commenting on a post on User Defined Functions from SQL Server Programmers Blog I realized that I could use a multi-statement table-valued function in order to simulate recursive processing for ‘Identifying the top-most parent’ problem presented in Reports - Ways of Looking at Data – Part IV: Hierarchies. Some years back, I needed to create the BOM structure for a report based on IFS-iV ERP system, the data being stored in a SQL Server 2000-based data warehouse. Then, given the fact that common table expressions were introduced only on SQL Server 2005, I used a similar technique as the below one though based on temporary tables, if I’m remembering correctly. Then, like now, I was just iterating through the various levels of the BOM inserting data in the temporary table, this approach having better performance than creating a query self-referencing the same logic multiple times, the logic scaling quite well on multiple millions of records.

 -- BOM Top Most Items 
CREATE FUNCTION dbo.BOMTopMostItems(  
   @ProductID  int , 
   @Date datetime)  
RETURNS  
@BOM TABLE( 
    BillOfMaterialsID int NULL,  
   ProductAssemblyID int NULL,  
   ComponentID int NOT NULL, 
   ModifiedDate datetime NOT NULL, 
   Level smallint NOT NULL)  
AS 
BEGIN 
DECLARE @NoRecords int -- stored the number of Products part of an Assembly  

DECLARE @Level smallint  

SET @Date = ISNull(@Date, GetDate())  

SET @Level = 0 -- initializing the routine 
INSERT @BOM(BillOfMaterialsID, ProductAssemblyID, ComponentID, ModifiedDate, Level)  
SELECT BillOfMaterialsID, ProductAssemblyID, ComponentID, ModifiedDate, @Level  
FROM Production.BillOfMaterials  
WHERE ComponentID = @ProductID  
AND DATEDIFF(d, IsNull(EndDate, GetDate()), @Date)<=0  

SELECT @NoRecords = count(1) 
FROM @BOM  
WHERE ProductAssemblyID IS NOT NULL  

 
WHILE  (IsNull(@NoRecords, 0)>1)  
BEGIN  

    -- inserting next level 
    INSERT @BOM(BillOfMaterialsID, ProductAssemblyID, ComponentID, ModifiedDate, Level) 
    SELECT BOM.BillOfMaterialsID, BOM.ProductAssemblyID, BOM.ComponentID, BOM.ModifiedDate, @Level+1 
    FROM @BOM B  
         JOIN Production.BillOfMaterials BOM 
           ON B.ProductAssemblyID = BOM.ComponentID 
          AND DATEDIFF(d, IsNull(BOM.EndDate, GetDate()), @Date)<=0 
    WHERE Level = @Level 

    SET @Level = @Level+1  

    SELECT @NoRecords = count(1) 
    FROM @BOM 
    WHERE ProductAssemblyID IS NOT NULL 
      AND Level = @Level 
 
END -- deleting the intermediate levels 
 
DELETE   
FROM @BOM  
WHERE ProductAssemblyID IS NOT NULL 

RETURN 
END 
 
As can be seen I’m first inserting the BOM information of the Product searched, then based on these data I’m taking and inserting the next level, stopping when the top-most items where identified, here ProductAssemblyID IS NOT NULL, in the end deleting the intermediate levels. The same approach could be used to generate the whole hierarchy, in this case starting from the top.

Here is an example based on the above function, I used NULL in the second parameter because the @date parameter is set to the current date if not provided.
    

SELECT * 
FROM dbo.BOMTopMostItems(486, NULL)  

 When checking the data returned I observed that for the given ProductID the same BillOfMaterialsID appears more than once, this because, as can be seen by generating the whole BOM for one of the respective duplicated ComponentID, the same ProductID could appear in a BOM at different levels or at the same level but belonging to different assemblies. Therefore in order to get only the list of top-most items you could use a DISTINCT:

SELECT DISTINCT ComponentID  
FROM dbo.BOMTopMostItems(486, NULL) 

On the other side if we want to remove the duplicates resulted from Products appearing at the same level of the same BOM but for different assemblies, could be used a DISTINCT when inserting into the @BOM table or, even if not recommended, use a DISTINCT when calling the function:
 
SELECT DISTINCT BillOfMaterialsID
, ProductAssemblyID
, ComponentID
, ModifiedDate
, Level  
FROM dbo.BOMTopMostItems(486, NULL)

Happy coding!

Previous Post <<||>> Next Post

11 February 2010

🕋Data Warehousing: Staging Area (Definitions)

"The staging area is where data from the operational systems is first brought together. It is an informally designed and maintained grouping of data that may or may not have persistence beyond the load process." (Claudia Imhoff et al, "Mastering Data Warehouse Design", 2003)

"A database design used to preprocess data before loading it into a different structure." (Sharon Allen & Evan Terry, "Beginning Relational Data Modeling" 2nd Ed., 2005)

"A place where data in transit is placed, usually coming from the legacy environment prior to entering the ETL layer of processing." (William H Inmon, "Building the Data Warehouse", 2005)

"Place where data is stored while it is being prepared for use, typically where data used by ETL processes is stored. This may encompass everything from where the data is extracted from its original source until it is loaded into presentation servers for end user access. It may also be where data is stored to prepare it for loading into a normalized data warehouse." (Laura Reeves, "A Manager's Guide to Data Warehousing", 2009)

[initial staging area:] "The area where the copy of the data from sources persists as a result of the extract/data movement process. (Data from real-time sources that is intended for real-time targets only is not passed through extract/data movement and does not land in the initial staging area.) The major purpose for the initial staging area is to persist source data in nonvolatile storage to achieve the “pull it once from source” goal." (Anthony D Giordano, "Data Integration Blueprint and Modeling", 2010)

"A location where data that is to be transformed is held in abeyance waiting for other events to occur" (Daniel Linstedt & W H Inmon, "Data Architecture: A Primer for the Data Scientist", 2014)

"An area for the provision of data, where data is stored temporarily for validation and correction before it will be imported into the target database. A staging area thereby provides technical support for the first time right principle of data management." (Boris Otto & Hubert Österle, "Corporate Data Quality", 2015)

Related Posts Plugin for WordPress, Blogger...

About Me

My photo
Koeln, NRW, Germany
IT Professional with more than 24 years experience in IT in the area of full life-cycle of Web/Desktop/Database Applications Development, Software Engineering, Consultancy, Data Management, Data Quality, Data Migrations, Reporting, ERP implementations & support, Team/Project/IT Management, etc.