18 February 2010

Just in CASE – Part V: Dynamic Queries

    Stored procedures allows not only to encapsulate logic, enhance queries’ performance by cashing query plans or reduce the risks of injection by using parameters, they allow also altering the structure of a query at runtime, modifying for example the query based on the provided parameters. Such behavior is ideal when needed to base the logic of a query on a list of filters (filter controls) in which some of the attributes are not mandatory. A dynamic created query can be executed using the EXEC function or the sp_executesql stored procedure, the later allowing parameters’ substitution, ensuring thus parameters’ data types, another benefit residing in the fact that the execution plans are more likely to be reused by SQL Server, Microsoft recommending thus the use of sp_executesql over Exec [1]. Now it depends also on how much the queries vary between requests and how easy is to handle the missing parameters or parameter’s substitution – sometimes quite a difficult topic for a beginner. Therefore for exemplifying the techniques I will use Exec, and here is the example:
-- Retrieving Products
CREATE PROCEDURE dbo.pSearchProducts(
@ProductID int
,
@ProductSubcategoryID int
,
@ProductModelIDs varchar(500) ,
@ProductName nvarchar(50) ,
@ProductNumber nvarchar(25) ,
@MakeFlag bit
,
@FinishedGoodsFlag bit
,
@StartStandardCost money
,
@EndStandardCost money
,
@StartSellStartDate varchar(10) ,
@EndSellStartDate varchar(10) ,
@OrderBy nvarchar(500) )
AS
BEGIN
DECLARE
@Query varchar(2000)

-- translating Order By clause
IF LEN(@OrderBy)>0
BEGIN
    SET @OrderBy = ',' + @OrderBy
    SET @OrderBy = Replace(@OrderBy, ',0', ', ProductNumber')

    SET @OrderBy = Replace(@OrderBy, ',1', ', ProductName')

    SET @OrderBy = Replace(@OrderBy, ',2', ', ProductModel')

    SET @OrderBy = Replace(@OrderBy, ',3', ', SellStartDate')

    SET @OrderBy = Replace(@OrderBy, ',4', ', StandardCost')

    SET @OrderBy = Right(@OrderBy, Len(@OrderBy)-1)

END
-- building the query
SET @Query = 'SELECT ProductID, ProductName, ProductNumber, ProductModel, CatalogDescription, ProductSubcategory'
+ ', MakeFlag, FinishedGoodsFlag, Color, SafetyStockLevel, ReorderPoint, StandardCost, ListPrice'

+ ', Size, SizeUnitMeasureCode, SizeUnitMeasure, WeightUnitMeasureCode, WeightUnitMeasure, Weight'

+ ', DaysToManufacture, ProductLine, Class, Style, dbo.GetDateAsString(SellStartDate) SellStartDate'

+ ', dbo.GetDateAsString(SellEndDate) SellEndDate, dbo.GetDateAsString(ModifiedDate) ModifiedDate'

+ ' FROM Production.vProducts'

+ ' WHERE 0=0'

+ CASE IsNull(@ProductID, 0)

     WHEN 0 THEN ''

    ELSE ' AND ProductID = ' + cast(@ProductID as varchar(10))

END

+ CASE IsNull(@ProductSubcategoryID, 0)

     WHEN 0 THEN ''

     ELSE ' AND ProductSubcategoryID = ' + cast(@ProductSubcategoryID as varchar(10))

END

+ CASE IsNull(@ProductModelIDs, '')

    WHEN '' THEN ''

    ELSE ' AND ProductModelID IN (' + @ProductModelIDs + ')'

END
+ CASE IsNull(@ProductName, '')

    WHEN '' THEN ''

    ELSE ' AND ProductName LIKE ''' + @ProductName + ''''

END
+ CASE IsNull(@ProductNumber, '')

    WHEN '' THEN ''

    ELSE ' AND ProductNumber LIKE ''' + @ProductNumber + ''''

END
+ CASE IsNull(@MakeFlag, 0)

    WHEN 0 THEN ''

    ELSE ' AND MakeFlag = 1'

END
+ CASE IsNull(@FinishedGoodsFlag, 0)

    WHEN 0 THEN ''

    ELSE ' AND FinishedGoodsFlag = 1'

END
+ CASE IsNull(@StartStandardCost, 0)

    WHEN 0 THEN ''

    ELSE ' AND StandardCost >= ' + cast(@StartStandardCost as varchar(20))

END
+ CASE IsNull(@EndStandardCost, 0)

    WHEN 0 THEN ''

    ELSE ' AND StandardCost <= ' + cast(@EndStandardCost as varchar(20))

END

+ CASE IsNull(@StartSellStartDate, '')

    WHEN '' THEN ''

    ELSE ' AND DateDiff(d, SellStartDate, dbo.GetStringDate(''' + @StartSellStartDate + '''))<=0'

END
+ CASE IsNull(@EndSellStartDate, '')

     WHEN '' THEN ''

     ELSE 'AND DateDiff(d, SellStartDate, dbo.GetStringDate(''' + @EndSellStartDate + '''))>=0'

END
+ ' ORDER BY '

+ CASE LEN(@OrderBy)

    WHEN 0 THEN ' ProductNumber' -- default sorting

     ELSE @OrderBy -- chosen sorting

END

-- executing the query
EXEC (@Query)
END
    The above stored procedure could split in 5 logical section:
1. The definition holding stored procedure’s name and parameters
    Parameters’ selection depends on the requirements, usually the natural and database key for the main entity or for master data (e.g. Customer, Vendor, Item Number, etc.), the LOV-based attributes used most often as selection criteria (e.g. Statuses, Characteristics), flags (e.g. Make Flag, Finished Goods Flag) that have important meaning for the business, dates used to partition the result set (e.g. Creation Date, Document Date, GL Date) being in general needed to provide the lower and upper bound of the interval. For special type of analysis it’s useful to include also a set of controls for specifying the numeric intervals, like in this case for Standard Cost.
   In general the parameters have the same date type as the attribute they represent, with a few exceptions:
dates – because the different date formats used in the various environments (e.g. client vs. web server vs. database server) I prefer to send the dates as string, thus assuring that the date will be always in the expected format.
multi-choice attributes – in case is needed to handle multi-choice selections then the easiest way to handle this is to provide a list of concatenated values as a string, as in @ProductModelIDs’ case.
flags – sometimes I prefer to use characters instead of boolean (0,1) values because it offers more flexibility in handling triple or multi-state situations.
2. The declaration of internal variables, here the @Query holding the dynamic built query, the variable needs to be big enough to hold the whole concatenated string with maximum list of parameters and ORDER BY attributes.
3. The translation of ORDER BY attributes here stored in @OrderBy input parameter, which contains a list with numbers from 0 to 9 that will dictate the attributes used in ORDER BY clause and their order. For example will be sent ‘3,1’ to sort the result set based on SellStartDate and ProductName.
4. The actual concatenation of the dynamic query, the handling for NULL values being handled with a set of CASE functions, existing one CASE for each parameter, including @OrderBy.
5. Query’s execution, here done using Exec.

Notes:  
    Usually I’m including a 6th section which includes several metadata about the current object (e.g. who created it and when, when was last time modified, objects that call current object, the purpose of the object, etc.).
     It’s needed to test such a procedure for all parameters – either by providing all parameters as in the first below example, or creating a test case for each parameter, the later approach being much safer because when multiple parameters are provided, one of them could hide an existing issue.

EXEC dbo.pSearchProducts 973, 2, '25,26,27', 'Road%', 'BK%', 1, 1, 1000, 1200, '01/01/2003', '31/12/2003', '' -- all parameters
EXEC
dbo.pSearchProducts 0, 0, '', '', '', 0, 0, 0, 0, '', '', '' -- all records

EXEC
dbo.pSearchProducts 1, 0, '', '', '', 0, 0, 0, 0, '', '', '' -- ProductID test

EXEC
dbo.pSearchProducts 0, 2, '', '', '', 0, 0, 0, 0, '', '', '' -- ProductSubcategoryID test

EXEC
dbo.pSearchProducts 0, 0, '3, 4, 5', '', '', 0, 0, 0, 0, '', '', '2,1' -- ProductModelIDs test

EXEC
dbo.pSearchProducts 0, 0, '', 'Mountain%', '', 0, 0, 0, 0, '', '', '1' -- ProductName test

EXEC
dbo.pSearchProducts 0, 0, '', '', 'BK-T18U%', 0, 0, 0, 0, '', '', '2' -- ProductNumber test

EXEC
dbo.pSearchProducts 0, 0, '', '', '', 1, 1, 0, 0, '', '', '' -- MakeFlag & FinishedGoodsFlag test

EXEC
dbo.pSearchProducts 0, 0, '', '', '', 0, 0, 50, 110, '', '', '' -- StandardCost test

EXEC
dbo.pSearchProducts 0, 0, '', '', '', 0, 0, 0, 0, '01/01/2001', '31/12/2002', '3,1' -- SellStartDate test


    Is it really needed to create a dynamic query in order to handle the various filtering scenarios? In theory yes, there are several techniques that could be used to address a special type of parameters:
1. Single values excepting  and dates
    In such cases the CASE could be replaced with the IsNull function test on parameter, if the parameter is null then take the value of the attribute against which the test is performed. For example the following CASE:
CASE IsNull(@ProductID, 0)
    WHEN 0 THEN ''

     ELSE ' AND ProductID = ' + cast(@ProductID as varchar(10))

END

could be replaced with the following constraint:
ProductID = IsNull(@ProductID, ProductID)
    A similar technique could be used also for interval-based attributes like the StandardCost from our example, the two above cases could be replaced with the following constraint:
StandardCost BETWEEN ISNULL(@StartStandardCost, StandardCost) AND ISNULL(@EndStandardCost, StandardCost)
 Note:
    In case the attributes contain also NULL values then the Null value need to be replaced with a special value, typically not found in the list of values the respective attribute could take, for example –1 for foreign keys, an incredibly small number for numeric data types, single or empty space for string data types.

2. List of values
   The list of values are a little more difficult to handle but not impossible, for this scenario could be used the dbo.SplitList function introduced in a previous post. For example the CASE for @ProductModelIDs could be rewritten as follows:
IsNull (ProductModelID, -1) IN (SELECT value FROM dbo.SplitList(IsNull(NullIf(@ProductModelIDs, ''), CAST(IsNull(ProductModelID, -1) as varchar(20)) ), ','))

    The IsNull(ProductModelID, -1) expression has been introduced in order to handle the Null values, while the expression for the first parameters of dbo.SplitList was written in this way in order to handle the cases in which the @ProductModelIDs was not provided or it contains no values.
   Unfortunately this technique even if it solves the problem, is not quite efficient because the dbo.SplitList function is called for each record, fact that impacts query’s performance considerably. On the other side if the @ProductModelIDs would be mandatory than the dbo.SplitList could be joined with the main query, fact that increases query’s performance, though it must be paid attention to the duplicates, because each duplicate found in the list provided to dbo.SplitList function could lead to duplicates in the final result!

3. Dates
   Because the Date functions can’t be used errorless with the BETWEEN operator in SQL Server, the DateDiff function being preferred for such case, the handing of Date intervals could be achieved by providing the lower/upper bound of the interval date types used, for example ‘0001-01-01’, respectively ‘9999-12-31’ for date data type, or ‘1900-01-01’, respectively ‘2079-06-06’ for smalldatetime data type. or ‘1753-01-01’, respectively ‘9999-12-31’ for datetime data type.  Because the dbo.GetDateAsString function returns a datetime I will use the lower/uppers bounds of the respective data type in case the date values are null, thus the two CASEs for SellStartDate could be rewritten as:
DateDiff (d, SellStartDate, dbo.GetStringDate(IsNull(@StartSellStartDate, '01-01-1753')))<=0
AND DateDiff(d, SellStartDate, dbo.GetStringDate(IsNull(@EndSellStartDate, '31-12-2009')))>=0
    
   In case is needed to return also the dates having Null value, then the SellStatDate could be replaced with any of the two bounds of the data type used, thus the above constraint becomes:
DateDiff (d, IsNull(SellStartDate, '1753-01-01'), dbo.GetStringDate(IsNull(@StartSellStartDate, '01-01-1753')))<=0
AND DateDiff(d, IsNull(SellStartDate, '1753-01-01'), dbo.GetStringDate(IsNull(@EndSellStartDate, '31-12-2009')))>=0


   Unfortunately I haven’t found any easy way to handle the different attribute combinations that could appear in the ORDER BY clause. On the other side there are controls that handle the sorting on the client, therefore a default ORDER BY could be provided, and if the Users need other sorting, then they could sort the data by themselves. Here’s an alterative writing for the above stored procedure without using a dynamic query, though this doesn’t mean that the performance of this approach is better than the one of the first stored procedure given the fact that a constraint is evaluated for each parameter, the unnecessary constraints for list of values and dates coming with a considerable decrease in performance.
-- Retrieving Products w/o dynamic query
CREATE PROCEDURE dbo.pSearchProductsND(
@ProductID int
,
@ProductSubcategoryID int
,
@ProductModelIDs varchar(500) ,
@ProductName nvarchar(50) ,
@ProductNumber nvarchar(25) ,
@MakeFlag bit
,
@FinishedGoodsFlag bit
,
@StartStandardCost money
,
@EndStandardCost money
,
@StartSellStartDate varchar(10) ,
@EndSellStartDate varchar(10))

AS
BEGIN
-- the query
SELECT
ProductID, ProductName, ProductNumber, ProductModel, CatalogDescription, ProductSubcategory

,
MakeFlag, FinishedGoodsFlag, Color, SafetyStockLevel, ReorderPoint, StandardCost, ListPrice

,
Size, SizeUnitMeasureCode, SizeUnitMeasure, WeightUnitMeasureCode, WeightUnitMeasure, Weight

,
DaysToManufacture, ProductLine, Class, Style, dbo.GetDateAsString(SellStartDate) SellStartDate

,
dbo.GetDateAsString(SellEndDate) SellEndDate, dbo.GetDateAsString(ModifiedDate) ModifiedDate

FROM
Production.vProducts

WHERE
IsNull(ProductID, -1) = COALESCE(NullIf(@ProductID, 0), ProductID, -1)

AND IsNull(ProductSubcategoryID, -1) = COALESCE(NullIf(@ProductSubcategoryID, 0), ProductSubcategoryID, -1)

AND IsNull(ProductModelID, -1) IN (SELECT value FROM dbo.SplitList(IsNull(NullIf(@ProductModelIDs, ''), CAST(IsNull(ProductModelID, -1) as varchar(20)) ), ','))
AND IsNull(ProductName, ' ') LIKE IsNull(NullIf(@ProductName, ''), IsNull(ProductName, ' '))
AND IsNull(ProductNumber, ' ') LIKE IsNull(NullIf(@ProductNumber, ''), IsNull(ProductNumber, ' '))
AND IsNull(MakeFlag, 0) = COALESCE(NullIf(@MakeFlag, 0), MakeFlag, 0)
AND IsNull(FinishedGoodsFlag, 0) = COALESCE(NullIf(@FinishedGoodsFlag, 0), FinishedGoodsFlag, 0)

AND StandardCost BETWEEN ISNULL(NullIf(@StartStandardCost, 0), StandardCost) AND ISNULL(NullIf(@EndStandardCost, 0), StandardCost)

AND DateDiff(d, IsNull(SellStartDate, '1753-01-01'), dbo.GetStringDate(IsNull(NullIf(@StartSellStartDate, ''), '01-01-1753')))<=0
AND DateDiff(d, IsNull(SellStartDate, '1753-01-01'), dbo.GetStringDate(IsNull(NullIf(@EndSellStartDate, ''), '31-12-2009')))>=0
ORDER BY ProductNumber
END

   Given the format in which parameters are sent (0 for numeric values, empty string for string values) instead of using Nulls, I had to transform the parameters to Null in case they take the value 0 or empty string, this being the only difference between the constraints used in the query and the example constraints from above. In this way the stored procedures should return the same number of records for the same test cases, here are the test cases for the second stored procedure:
EXECdbo.pSearchProductsND 973, 2, '25,26,27', 'Road%', 'BK%', 1, 1, 1000, 1200, '01/01/2003', '31/12/2003' -- all parameters
EXEC dbo.pSearchProductsND 0, 0, '', '', '', 0, 0, 0, 0, '', '' -- all records

EXEC
dbo.pSearchProductsND 1, 0, '', '', '', 0, 0, 0, 0, '', '' -- ProductID test

EXEC
dbo.pSearchProductsND 0, 2, '', '', '', 0, 0, 0, 0, '', '' -- ProductSubcategoryID test

EXEC
dbo.pSearchProductsND 0, 0, '3, 4, 5', '', '', 0, 0, 0, 0, '', '' -- ProductModelIDs test

EXEC
dbo.pSearchProductsND 0, 0, '', 'Mountain%', '', 0, 0, 0, 0, '', '' -- ProductName test

EXEC
dbo.pSearchProductsND 0, 0, '', '', 'BK-T18U%', 0, 0, 0, 0, '', '' -- ProductNumber test

EXEC
dbo.pSearchProductsND 0, 0, '', '', '', 1, 1, 0, 0, '', '' -- MakeFlag & FinishedGoodsFlag test

EXEC
dbo.pSearchProductsND 0, 0, '', '', '', 0, 0, 50, 110, '', '' -- StandardCost test

EXEC
dbo.pSearchProductsND 0, 0, '', '', '', 0, 0, 0, 0, '01/01/2001', '31/12/2002' -- SellStartDate test

 
  Now it depends which method you will want to use, frankly I like the dynamic query approach because it offers more flexibility in dealing with parameters and in handling other types of queries. In addition if it happens that you use the second approach, the request could change and you might need to redesign your stored procedure using the first approach.

References:
[1] MSDN. 2009. Using sp_executesql. [Online] Available from: http://msdn.microsoft.com/en-us/library/ms175170.aspx (Accessed: 15 February 2010)

No comments: