02 February 2010

Just In CASE - Part I: Introduction

    One of the functions I often use in the my daily work is CASE, a powerful combination between “Iif” function and “Case” conditional statement available in VB/VBA. It evaluates a list of conditions reducible to false/true and returns a value on the branch for which a match was found, allowing thus to create conditional-based code. There are two syntax forms in use: the Simple CASE that can be used to evaluate the values of an attribute, and the Searched CASE used to evaluate expressions.
SQL Server Books Online     A CASE could have one or more WHEN branches and only one or no ELSE branch, the WHEN branches being evaluated in the order they were given, while the ELSE branch is always last evaluated. Once an expression on WHEN branch is evaluated to true, the function returns the value evaluated from result_expression and stops the searching. If no match was found then the function evaluates the result_expression from ELSE branch, and if the ELSE branch is missing a value of NULL is return. Actually it’s simpler than is sounds.

    CASE combined with other operators and functions enable to create powerful transformations, many of the transformations done during ETL processing could be solved with a CASE. The following SQL script based on Production.Product table from AdventureWorks database relies heavily on the use of CASE function in order to transform the data to a form intelligible for the Users or any other consumers.
    The simplest use of CASE function is the translation of bit values to more meaningful values, typically ‘Y’/’N’ or ‘Yes’/’No’. It’s the case of the code chunks marked with Example 1 & 2 in the above query, the first example showing the use of Simple CASE syntax, while the second one of a Searched CASE.
    The third example shows the Simple CASE syntax for multiple values of the same attribute, being provided a branch for each of the values taken by Class attribute (‘H’, ‘M’, ‘L’) and translating them to more meaningful values (‘High’, ‘Medium’, ‘Low’), the ‘n/a’ (not available) value being returned for the other cases.
    The alternative syntax could be used too, though for such simple evaluations the previous syntax form is easier to use.
    For the ones knowing the IsNull or COALESCE function the 4th example doesn’t make sense, a simpler writing for the same would be IsNull(Color, ‘n/a’). What the example tries to demonstrate is the use of comparison against NULL values using the IS NULL clause.
    In the 5th and 6th example the matching expressions make use of IN and BETWEEN operators, any other operators could be used in this context. In the 5th example the BETWEEN operator is working with character data type therefore is used the lexical comparison – for such cases it has to be taken into account that if Size would have taken also the value ‘334’ it would be evaluated on the second branch and return the value ‘S’!
    More general, the CASE function could be used to split a set of alphanumeric values or an interval of numeric values in buckets or evaluate specific expressions on each branch, like functions that have different formulas on each interval. If the previous examples work with sets of alphanumeric values, the 6th example uses the StandardCost to derive several Cost Classes (buckets split), the intervals considered are (-infinite, 0], (0, 50], (50, 250], (250, 1000], [1000, 10000] and (10000, infinite).
    Somebody would remark: “ok, but the values, 50, 250, and 1000 are considered on more than one branch!”. Correct, but should not be forgotten that the CASE function stops after the first expression evaluated to true, so if the StandardCost is 50 the second branch will be the last branch processed, so the respective piece of code is correct. In order to avoid any confusion we could have written the following lengthier and equivalent code:
    The 7th example is another example working with intervals based on the difference in days between the SellEndDate and the current date.

No comments: