In yesterday’s post on the same topic I was showing how to create a sequence of dates on the fly with the help of common table expressions and multi-statement table-valued functions. I was mentioning that similar functionality could be obtained using a numeric sequence (interval), following to create the dates when calling the function. Here’s the function that provides the respective functionality, the difference from fGetDatesInInterval, excepting the different data types, is that it has a third parameter for the increment value (default 1), being thus possible to generate a list of odd/even numbers or any other type of arithmetic progressions.
The numeric values returned by the function could be used to form a date sequence by using the DateAdd function and a Cast to the date data type. The downside with this approach is that if the dates are used in more expressions within the same query, the expression used to calculate the dates increases the visual complexity of the query, a function like fGetDatesInInterval could be easier to use instead.
The fGetInterval has a second particularity, the Value attribute from the table used to retrieve the data was declared as PRIMARY KEY, fact that allows to improve queries’ performance when searches are performed on table’s values. The DateSequence from the table returned by fGetDatesInInterval could be defined as PRIMARY KEY too.
The common table expression and its use resumes in writing a single statement, and considering that the logic from fGetInterval function could be written as a single statement, in theory we could write an inline table-valued function instead. This technique works as long the OPTION clause is not used within the inline function, when this is attempted is returned the following error: Incorrect syntax near the keyword 'OPTION'. As it seems the OPTION clause can’t be used with single-statement objects like views or inline table-valued functions. Therefore common table expressions can be used in single-statement objects, the functionality being thus restrained to the 100 default limit number of recursion.
Missing Values in a Data Set
-- SQL Server 2005+: Numeric Interval CREATE FUNCTION dbo.fGetInterval( @Start int = 0 , @End int = 0 , @Step int=1) RETURNS @Interval TABLE( Value int PRIMARY KEY) AS BEGIN DECLARE @Sign smallint SET @Sign = IsNull(NullIf(Sign(@End-@Start), 0), 1); WITH CTE(Sequence) AS( SELECT COALESCE(@Start, @End, 0) Sequence UNION ALL SELECT CTE.Sequence + @Step * @Sign Sequence FROM CTE WHERE (@End-CTE.Sequence-@Step)*@Sign>=0 ) INSERT @Interval SELECT Sequence FROM CTE OPTION(MAXRECURSION 0) RETURN END -- numeric interval SELECT * FROM dbo.fGetInterval(0, 1000, 1) -- numeric interval (3rd parameter NULL) SELECT * FROM dbo.fGetInterval(0, 999, NULL) -- even numbers SELECT * FROM dbo.fGetInterval(0, 1000, 2) -- odd numbers SELECT * FROM dbo.fGetInterval(1, 1000, 2) -- single value SELECT * FROM dbo.fGetInterval(1, 1, 2)
The numeric values returned by the function could be used to form a date sequence by using the DateAdd function and a Cast to the date data type. The downside with this approach is that if the dates are used in more expressions within the same query, the expression used to calculate the dates increases the visual complexity of the query, a function like fGetDatesInInterval could be easier to use instead.
-- creating a date sequence SELECT Value , Cast(DATEADD(d, Value, GETDATE()) as Date) DateSequence FROM dbo.fGetInterval(0, 1000, 1)
The fGetInterval has a second particularity, the Value attribute from the table used to retrieve the data was declared as PRIMARY KEY, fact that allows to improve queries’ performance when searches are performed on table’s values. The DateSequence from the table returned by fGetDatesInInterval could be defined as PRIMARY KEY too.
The common table expression and its use resumes in writing a single statement, and considering that the logic from fGetInterval function could be written as a single statement, in theory we could write an inline table-valued function instead. This technique works as long the OPTION clause is not used within the inline function, when this is attempted is returned the following error: Incorrect syntax near the keyword 'OPTION'. As it seems the OPTION clause can’t be used with single-statement objects like views or inline table-valued functions. Therefore common table expressions can be used in single-statement objects, the functionality being thus restrained to the 100 default limit number of recursion.
Missing Values in a Data Set
While searching for something on the Web, today I found Aaron Akin’s post on Recursive Common Table Expressions, he’s using them in order to determine the missing numbers/dates in a sequence. This can be easily achieved using for example the fGetDatesInInterval or fGetInterval functions with a left join to the tested data set. Using the AdventureWorks database and fGetDatesInInterval function, supposing that I would like to see the working dates within a time interval with no POs, the query could be written as follows:
-- working days with no PO SELECT DI.DateSequence FROM dbo.fGetDatesInInterval('2003-01-01', '2003-12-31') DI LEFT JOIN ( -- list of PO dates SELECT OrderDate FROM Purchasing.PurchaseOrderHeader GROUP BY OrderDate ) POH ON DATEDIFF(d, DI.DateSequence, POH.OrderDate)=0 WHERE DATEPART(dw, DI.DateSequence) BETWEEN 2 AND 6 --working days AND POH.OrderDate IS NULL
A similar approach could be used also in case of numeric data sets, here’s an example based on fGetInterval and SplitList table-valued functions:
SELECT DI.Value FROM dbo.fGetInterval(1, 10, 1) DI LEFT JOIN dbo.SplitList('1,3,6,7,10', ',') SL ON DI.Value = SL.Value WHERE SL.Value IS NULL
No comments:
Post a Comment