13 September 2007

Useful Functions in Data Conversion

    When working with data, even if it's about reporting, data conversion or data analysis, there are a few useful functions which makes life easier. They deal with three types of basic variables and their varations: strings (e.g. char, varchar), numeric (e.g. int, decimal) and date (e.g. smalldatetime, datetime). 1. CASE input_expression WHEN when_expression THEN result_expression [ ...n ] [ ELSE else_result_expression ] END     Judging from experience, CASE is the most used function in data conversions, when the output value is based on on one or multiple constraints over the already existing values. CASE WHEN column1 = 'yes' OR column2 = 'other' THEN 'yes/other' END     Just be carefull, when the ELSE is not provided, a null will be returned! Another important thing, the values returned must have the same base type, dates, strings or numeric values can't be mixed without convertion! --example 1 CASE WHEN DateDiff(d, DateValue, GetDate())>0 THEN 'past' WHEN DateDiff(d, DateValue, GetDate())=0 THEN 'today' ELSE 'future' END --example 2 CASE WHEN column1 BETWEEN 1 AND 100 THEN '1st range' WHEN column1 BETWEEN 101 AND 1000 THEN '2nd range' ELSE '3rd range' END --example 3 CASE WHEN column1 IN ('A', 'B', 'C') THEN '1st set' WHEN column1 IN ('C', 'D', 'E') THEN '2nd set' ELSE '3rd set' END     There is a saying for DBAs and not only, "Know your data!". As each statement is checked until comparison operation evaluated True, it is better to arange the WHEN expression in the descending order of highest occurence. 2. IsNull(expression)     It's always a good idea to check for NULL values, they could occur anytime when a default value is not provided for a value. One of the cases I often found was when joining two tables when one of the values from JOIN was NULL. SELECT * FROM table1 A LEFT JOIN table2 B ON A.column1 = B.column1 AND IsNull(A.column2, ' ') = IsNull(B.column2, ' ') WHERE B.column1 IS NULL for string values or SELECT * FROM table1 A LEFT JOIN table2 B ON A.column1 = B.column1 AND IsNull(A.column2, 0) = IsNull(B.column2, 0) WHERE B.column1 IS NULL for numeric, this when ' ' and 0 aren't found in the list of possible values or there is no impact on the result. Another used value for IsNull function is 'N/A', to point out the missing value. 3. COALESCE ( expression [ ,...n ] )     There are cases when is needed to take the first not null value from a list of columns, could be dates or any type of other value. Instead of writting something like IsNull(Date1, IsNull(Date2, IsNull(Date3, GetDate()) it's much easier to write COALESCE(Date1, Date2, Date3, GetDate()) SELECT * FROM table1 WHERE COALESCE(Column1, Column2, Column3) IS NOT NULL or SELECT * FROM table1 WHERE Column1 IS NOT NULL OR Column2 IS NOT NULL OR Column3 IS NOT NULL 4. NullIf ( expression , expression )     NullIf function is handy especially when is not needed a certain value in output, for example 0. SELECT column1, NullIf(column2, 0) FROM table_name     When multiple values are not needed, it's easier maybe to write SELECT column1 , CASE WHEN column2 IN ('A', 'B', 'V') THEN NULL ELSE column1 END FROM table_name instead of writing SELECT column1, NullIf(NullIf(NullIf(column2, 'A'), 'B'), 'V') FROM table_name 5. CONVERT ( data_type [ ( length ) ] , expression [ , style ] )     I used to use CAST instead CONVERT, however CONVERT is more powerfull, especially when it comes to dates. SQL Server comes with a few dates formats obtain by providing a value for style parameter: SELECT CONVERT(varchar(10), GetDate(),104) -- SELECT CONVERT(varchar(12), GetDate(),105) -- SELECT CONVERT(varchar(10), GetDate(),103) -- SELECT CONVERT(varchar(10), GetDate(),102) -- SELECT CONVERT(varchar(10), GetDate(),101) -- SELECT CONVERT(varchar(12), GetDate(),106) -- or SELECT Upper(CONVERT(varchar(12), GetDate(),106))     The conversion of a date to a string is mainly required by the need to have the date in a special format or not allow Excel to format the date to its default. 6. IsNumeric ( expression )     This function shouldn't miss when casting string values to numeric values and doing operations with them. SELECT CAST(column1 as decimal(12,2)) + 101 FROM table_name WHERE IsNumeric(column1)=1     Please note that when multiple constraints are used, the IsNumeric function should appear first after the WHERE clause, a query like SELECT CAST(column1 as decimal(12,2)) + 101 FROM table_name WHERE column2 = 'value' AND IsNumeric(column1)=1 might raise an error if one of the column1 values is not numeric. So better write: SELECT CAST(column1 as decimal(12,2)) + 101 FROM table_name WHERE IsNumeric(column1)=1 AND column2 = 'value' 7. IsDate( expression )     It works the same way as IsNumeric, but on date. Thus if the provided values is a date, it will return 1, otherwise 0. SELECT IsDate(GetDate()) SELECT IsDate('11/31/2007') --november has only 30 days or with more readable values: SELECT CASE IsDate('11/31/2007') WHEN 1 THEN 'Yes' ELSE 'No' END Note:     For more information on the function is always a good idea to consult Microsoft Technical documentation.

No comments: