14 September 2007

💎SQL Reloaded: Useful Functions in Data Conversion

    When working with data, even if it's about reporting, data conversion or data analysis, there are a few useful functions which makes life easier. They deal with three types of basic variables and their variations: strings (e.g. char, varchar), numeric (e.g. int, decimal) and date (e.g. smalldatetime, datetime).  

1. Case Function

CASE input_expression 
        WHEN when_expression 
        THEN result_expression [ ...n ] 
       [ELSE else_result_expression] 
END   
        
Judging from experience, CASE is the most used function in data conversions, when the output value is based on on one or multiple constraints over the already existing values. 
  
CASE WHEN column1 = 'yes' OR column2 = 'other' THEN 'yes/other' END    

 Just be careful, when the ELSE is not provided, a null will be returned! 

Another important thing, the values returned must have the same base type, dates, strings or numeric values can't be mixed without conversion! 

--example 1 
CASE 
    WHEN DateDiff(d, DateValue, GetDate())>0 THEN 'past' WHEN DateDiff(d, DateValue, GetDate())=0 THEN 'today' 
    ELSE 'future' 
END 

--example 2 
CASE 
     WHEN column1 BETWEEN 1 AND 100 THEN '1st range' 
     WHEN column1 BETWEEN 101 AND 1000 THEN '2nd range' 
     ELSE '3rd range' 
END 

--example 3 
CASE 
    WHEN column1 IN ('A', 'B', 'C') THEN '1st set' 
    WHEN column1 IN ('C', 'D', 'E') THEN '2nd set' 
    ELSE '3rd set' 
END      

 There is a saying for DBAs and not only, "Know your data!". As each statement is checked until comparison operation evaluated True, it is better to arrange the WHEN expression in the descending order of highest occurrence. 

 2. IsNull Function

IsNull(expression)   
  
 It's always a good idea to check for NULL values, they could occur anytime when a default value is not provided for a value. One of the cases I often found was when joining two tables when one of the values from JOIN was NULL. 

-- example for string values 
SELECT * 
FROM table1 A 
         LEFT JOIN table2 B 
             ON A.column1 = B.column1 
           AND IsNull(A.column2, ' ') = IsNull(B.column2, ' ') 
WHERE B.column1 IS NULL  

-- example for numeric values 
SELECT * 
FROM table1 A 
      LEFT JOIN table2 B 
          ON A.column1 = B.column1  
        AND IsNull(A.column2, 0) = IsNull(B.column2, 0) 
WHERE B.column1 IS NULL 

The ' ' and 0 values should be used when they aren't found in the list of possible values or there is no impact on the result. Another used value for IsNull function is 'N/A', to point out the missing value. 

 3. Coalesce Function

COALESCE ( expression [ ,...n ] )  
       
There are cases when is needed to take the first not null value from a list of columns, could be dates or any type of other value. Instead of writting something like IsNull(Date1, IsNull(Date2, IsNull(Date3, GetDate()) it's much easier to write COALESCE(Date1, Date2, Date3, GetDate()) 

-- example coalesce 
SELECT * 
FROM table1 
WHERE COALESCE(Column1, Column2, Column3) IS NOT NULL 

-- example IsNull alternative 
SELECT * 
FROM table1 
WHERE Column1 IS NOT NULL 
        OR Column2 IS NOT NULL 
        OR Column3 IS NOT NULL 

 4. NullIf Function
  
NullIf ( expression , expression )    

NullIf function is handy especially when is not needed a certain value in output, for example 0 or the .  

-- example numeric values 
SELECT column1
, NullIf(column2, 0) column2
FROM table_name     

-- example string or date values 
SELECT column1
, NullIf(column2, '') column2
FROM table_name     

When multiple values are not needed, it's easier maybe to write 

-- example multiple checks 
 SELECT column1 
, CASE 
     WHEN column2 IN ('A', 'B', 'V') THEN NULL 
     ELSE column1 END 
FROM table_name 

-- alternative 
SELECT column1, NullIf(NullIf(NullIf(column2, 'A'), 'B'), 'V') 
FROM table_name

 5. Convert Function

CONVERT ( data_type [ ( length ) ] , expression [ , style ] )       

 I used to use CAST instead CONVERT, however CONVERT is more powerful, especially when it comes to dates. SQL Server comes with a few dates formats obtain by providing a value for style parameter: 

SELECT CONVERT(varchar(10), GetDate(),101) -- US format: mm/dd/yyyy
SELECT CONVERT(varchar(10), GetDate(),102) -- ANSI format: yyyy.mm.dd
SELECT CONVERT(varchar(10), GetDate(),103) -- British/French dd/mm/yyyy
SELECT CONVERT(varchar(10), GetDate(),104) -- German format: dd.mm.yyyy
SELECT CONVERT(varchar(12), GetDate(),105) -- Italian format: dd-mm-yyyy
SELECT CONVERT(varchar(12), GetDate(),106) -- dd mon yyyy
SELECT Upper(CONVERT(varchar(12), GetDate(),106))   -- upper case of: dd mon yyyy
SELECT CONVERT(varchar(8), GetDate(),112) -- ISO format (preferred): yyyymmdd
SELECT CONVERT(varchar(10), GetDate(), 23) -- yyyy-mm-dd
SELECT CONVERT(varchar(23), GetDate(),126) -- yyyy-mm-ddThh:mi:ss.mmm

The conversion of a date to a string is mainly required by the need to have the date in a special format or not allow Excel to format the date to its default. 

 6. IsNumeric Function

IsNumeric ( expression )     

This function shouldn't miss when casting string values to numeric values and doing operations with them. 

SELECT CAST(column1 as decimal(12,2)) + 101 
FROM table_name 
WHERE IsNumeric(column1)=1     

Please note that when multiple constraints are used, the IsNumeric function should appear first after the WHERE clause, a query like the next one might raise an error if one of the column1 values is not numeric. 

-- example 
SELECT CAST(column1 as decimal(12,2)) + 101 
FROM table_name 
WHERE column2 = 'value' 
AND IsNumeric(column1)=1

-- alternative writing (recommended)
SELECT CAST(column1 as decimal(12,2)) + 101 
FROM table_name WHERE IsNumeric(column1)=1 
   AND column2 = 'value' 

 7. IsDate Function

IsDate( expression )          

 It works the same way as IsNumeric, but on date data types. Thus if the provided values is a date, it will return 1, otherwise 0. 

  SELECT IsDate(GetDate()) 
  SELECT IsDate('11/31/2007') --november has only 30 days 
  SELECT CASE IsDate('11/31/2007') WHEN 1 THEN 'Yes' ELSE 'No' END 

 Note:     
For more information on the function is always a good idea to consult Microsoft Technical documentation.

13 September 2007

💎SQL Reloaded: XML on SQL Server 2005

That's the work I did for a homework at school (still studying, hoping to finish my Masters degree). Sorry, I have only short comments for it, hopefully will be helpful for somebody! /*creating a table for demonstration purposes ID - identity field Books - XML document where the tables are stored */ CREATE TABLE XMLBooks( ID int IDENTITY(1,1) PRIMARY KEY , Books XML) /*dropping the table */ --DROP TABLE XMLBooks /* inserting a record - an XML document containing one book*/ INSERT INTO XMLBooks VALUES ('<books> <book id="1" language="EN"> <title>Inside SQL Server 2005 Tools</title> <authors> <author>Michael Raheem</author> <author>Dima Sonkin</author> <author>Thierry D''Hers</author> <author>Kami LeMonds</author> </authors> <year>2006</year> <publisher>Addison Wesley</publisher> <price></price> <pages>816</pages> <isbn10>0-321-39796-7</isbn10> <isbn13>978-0-321-39796-6</isbn13> </book> </books>') /* inserting a record - an XML document containing three books*/ INSERT INTO XMLBooks VALUES ('<books> <book id="2" language="EN"> <title>Core C# and .NET</title> <authors> <author>Stephen C. Perry</author> </authors> <year>2005</year> <publisher>Prentice Hall</publisher> <price></price> <pages>1008</pages> <isbn10>0-13-147227-5</isbn10> <isbn13></isbn13> </book> <book id="3" language="EN"> <title> Microsoft SQL Server 2005</title> <authors> <author>Andrew J. Brust</author> <author>Stephen Forte</author> </authors> <year>2006</year> <publisher>Microsoft Press</publisher> <price></price> <pages>600</pages> <isbn10>0-7356-1923-9</isbn10> <isbn13>978-0-7356-1923-4</isbn13> </book> <book id="4" language="EN"> <title>MCAD/MCSD: Visual Basic .NET XML Web Services and Server Components Study Guide</title> <authors> <author>Pamela Fanstill</author> <author>Brian Reisman </author> <author>Mitch Ruebush</author> </authors> <year>2003</year> <publisher>Sybex</publisher> <price></price> <pages>598</pages> <isbn10>0-7821-4193-5</isbn10> <isbn13></isbn13> </book> </books>') /* a classical SELECT */ SELECT * FROM XMLBooks /* a first XML falvoured query*/ SELECT Books.query('/.') FROM XMLBooks /* selecting an element's content*/ SELECT Books.query('//book[@id=1]/title/text()') FROM XMLBooks /* selecting an element's content*/ SELECT Books.value('data(//book[@id=1]/title/)','varchar(250)') FROM XMLBooks 

  Msg 2256, Level 16, State 1, Line 2 XQuery [XMLBooks.Books.value()]: Syntax error near ')', expected a "node test". 

  /* selecting an element's content*/ SELECT Books.value('data((//book[@id=1]/title)[1])','varchar(250)') FROM XMLBooks /* selecting only the occurence*/ SELECT Books.value('data((//book[@id=1]/title)[1])','varchar(250)') FROM XMLBooks WHERE Books.exist('//book[@id=1]')=1 /* updating a node by inserting a new child after an existing one */ UPDATE XMLBooks SET Books.modify(' insert Jack Daniels after (//book[@id=1]//author)[1]') /* updating a node */ UPDATE XMLBooks SET Books.modify(' insert Jack Daniels before (//book[@id=1]//author)[1]') /* deleting a child */ UPDATE XMLBooks SET Books.modify(' delete //book[@id=1]//author[1]') /* modifying an attribute */ UPDATE XMLBooks SET Books.modify(' replace value of (//book/@id)[1] with "100"') /* deleting a record */ DELETE FROM XMLBooks WHERE Books.exist('//book[@id=1]')=1 SELECT Motocross.Team.query(‘.’) AS RESULT FROM @xmlvar.nodes(‘/Motocross/Team’) Motocross(Team) SELECT Book.Athors.query('/.') AS AuthorsList FROM XMLBooks.nodes('/books/book/authors') Book(Authors)

💎SQL Reloaded: Null-ifying the World

    Do you remember the table used in previous post? In the respective case we provided all the values, what happens when a value is missing? How do we deal with such cases? For such cases we have the value NULL. Let's try to insert a NULL in one of the columns:
 
CREATE TABLE Table1(
ID int
, [Year] int
, Product varchar(30)
, Price decimal(13,2)
, Quantity decimal(13,3))
 
   Then we insterted in the table test data, let's try to insert also a NULL for one of the values:
 
-- inserting the test data 
INSERT Table1(ID, [Year], Product, Price, Quantity)
VALUES (1, 2002, NULL, 199.23, 100)

--checking table's content
SELECT *
FROM Table1

    We can check for NULL values using the IS NULL clasuse:
 
-- checking for NULLs
SELECT *
FROM Table1
WHERE Product IS NULL

If you don't like NULL values, you can replace them with other values, with the help of IsNull function. 

-- handling NULL values with a text 
SELECT ID, [Year], IsNull (Product, 'missing value'), Price, Quantity
FROM Table1

-- handling NULL values with a text (different result)
<span style="color: #000099;">SELECT</span> ID, [Year], IsNull(Product, 'The Prouct value is missing! Please correct it!') Product, Price, Quantity
<span style="color: #000099;">FROM</span> Table1

   As you can see, because the maximum length of Product column is 30, the text is trimmed after 30 characters. In order to solve this issue, we can reduce the error message to maximum 30 characters or convert the Product type from varchar(30) to varchar(50), enough to hold the whole string:

SELECT ID, [Year], IsNull(Cast(Product as varchar(50)), 'The Prouct value is missing! Please correct it!') Product, Price, Quantity
FROM Table1

-- inserting more NULL values.
INSERT Table1(ID, [Year], Product, Price, Quantity)
VALUES (1, 2002, 'Product 6', NULL, 100)
INSERT Table1(ID, [Year], Product, Price, Quantity)
VALUES (1, 2002, 'Product 7', 45.08, NULL)
INSERT Table1(ID, [Year], Product, Price, Quantity)
VALUES (1, 2002, 'Product 8', NULL, NULL)
INSERTTable1(ID, [Year], Product, Price, Quantity)
VALUES (1, NULL, 'Product 9', NULL, NULL)
      
 When we need to test for NULL values in multiple columns we could write something like this:
 
-- individual checks 
SELECT *
FROM Table1
WHERE Price IS NULL
OR Quantity IS NULL

-- simpler alternative 
SELECT *
FROM Table1
WHERE IsNull(Price1, Price2) IS NULL

These checks are useful when you need to check for two values which shouldn't be simultaneously NULL. For more columns with possible NULL values, when the values have the same type, we can rewrite the statements as follows:
 
-- checking for multiple values
SELECT *
FROM Table1
WHERE COALESCE(Price1, Price2, Price3) IS NULL

-- replacing NULL values 
SELECT ID
, [Year]
, IsNull(Product, 'The Prouct value is missing!') Product
, IsNull(Price, 0)
, IsNull(Quantity, 0)
FROM Table1

When inserting NULL values in a column with a NOT NULL constraint, the following error message will be shown:
Msg 515, Level 16, State 2, Line 1 Cannot insert the value NULL into column 'Product', table 'Learning.dbo.Table1'; column does not allow nulls. INSERT fails.

--droping the table
DROP TABLE Table1

-- creating the table anew
CREATE TABLE Table1(
ID NOT NULL
, [Year] int NOT NULL
, Product varchar(30) NOT NULL
, Price decimal(13,2) NOT NULL
, Quantity decimal(13,3) NOT NULL)

-- inserting NULL values
INSERT Table1(ID, [Year], Product, Price, Quantity)
VALUES (1, 2002, NULL, 199.23, 100)7
 
   Another way is to use defaults, thus if a value was not provided, it will be replaced with the DEFAULT values:
 
-- creating the table anew
CREATE TABLE Table1(
  ID int NOT NULL
, [Year] int NOT NULL
, Product varchar(30) DEFAULT 'Unknown' 
, Price decimal(13,2) NOT NULL
, Quantity decimal(13,3) NOT NULL)

-- Inserting some values
INSERT Table1(ID, [Year], Price, Quantity)
VALUES (1, 2002, 199.23, 100)
INSERT Table1(ID, [Year], Product, Price, Quantity)
VALUES (1, 2002, NULL, 199.23, 100)

--testing the output
SELECT *
FROM Table1

💎SQL Reloaded: Preparing Test Data

There are cases in which we need to create test data in order to check a certain functionality. In this post I'll show how can be created data with the help of loops. For this let's create a simple table containing 5 fields: 

-- creating the test table 
CREATE TABLE Table1( 
ID int NOT NULL 
, [Year] int NOT NULL 
, Product varchar(30) NOT NULL 
, Price decimal(13,2) NOT NULL 
, Quantity decimal(13,3) NOT NULL) 

-- dropping the test table 
--DROP TABLE Table1 

-- deleting the data 
--TRUNCATE TABLE Table1          
        
Usually I'm commenting DROP and TRUNCATES so I can avoid the modification of data by mistake.

 And now the loop, with the help of RAND and modulo the values can be varied:

-- inserting the data
DECLARE @index int 
SET @index = 1 
WHILE @index < 99
BEGIN 
      INSERT Table1(ID, [Year], Product, Price, Quantity) 
      VALUES (@index, 2000 + (@index % 4), 'Product ' + Cast(@index%10 as char(3)), 10 * RAND(), 100 * RAND()) SET @index = @index + 1 
END 


--testing the output 
SELECT * 
FROM Table1     

-- using the data
SELECT [Year], Product, SUM(Price * Quantity) Amount 
FROM Table1 
GROUP BY [Year], Product
ORDER BY [Year], Product 

💎SQL Reloaded: Lists as Parameters in Stored Procedures

In the previous post I shown how a delimited list can be transformed in a table with the help of table-valued functions. In this posting I'll show how to use lists in stored procedures without the need to create the query as a string. Let's create a table and populate it from a list:

-- creating the table
CREATE TABLE dbo.Testing( 
 value varchar(50))  

--populating it with records
INSERT dbo.Testing 
SELECT value
FROM dbo.SplitList('1,5,7,56,67,34,33,54,76', ',') 

-- reviewing the output
SELECT * 
FROM Testing 

 Normally in a stored procedure, if you want to limit your criteria on a dynamic range or values provided as parameters then you would have to concatenate your script dynamically like below: 

-- creating the stored procedure
CREATE PROCEDURE dbo.TestingListsAsParameters( 
 @listOfValues varchar(500)) 
AS 
DECLARE @SQL varchar(1000) 
SET @SQL = 'SELECT * FROM dbo.Testing WHERE Value IN (' + @listOfValues + ')' 
EXEC (@SQL) 

-- testing the stored procedure
EXEC TestingListsAsParameters '5,7,56, 67' 

Same result can be obtained by using the following method: 

-- creating the stored procedure (join-based logic)
CREATE PROCEDURE dbo.TestingListsAsParameters1( 
@listOfValues varchar(500)) 
AS 
SELECT A.* 
FROM dbo.Testing A 
     JOIN dbo.SplitList(@listOfValues, ',') B 
       ON A.Value = B.Value

-- testing the stored procedure 
EXEC TestingListsAsParameters1 '5,7,56,67' 

Sometimes we want to return all the records if the list provided as parameter contains no values. The above stored procedure can be written as follows: 

-- creating the stored procedure 
CREATE PROCEDURE dbo.TestingListsAsParameters2( 
@listOfValues varchar(500)='') 
AS 
SELECT A.* 
FROM Testing A 
WHERE (Value IN (SELECT value FROM dbo.SplitList(@listOfValues, ',')) 
     AND Len(@listOfValues)>0) 
        OR Len(@listOfValues)=0 

--testing the stored procedure 
EXEC TestingListsAsParameters2 '5,7,56,67' --returns only the matches from the list 
EXEC TestingListsAsParameters2 --returns all the values 

The method can be extended to support multiple lists as parameters.

Lists, Sets and a Little Math

In the previous post I've shown how a delimited list can be transformed in a table with the help of table-valued functions, and I gave as example a JOIN which provided the common elements between two lists:

-- intersection of two sets
SELECT A.* 
FROM dbo.SplitList('23 34 50 71', ' ') A 
     JOIN dbo.SplitList('23,34,25,45', ',') B 
    ON A.Value = B.Value 

A list is a collection of elements which can repeat in the list. When the elements of a list don't repeat we can talk about a set. Sets considered in combination with reunion, intersection, (set) difference, cartesian product and power set forms the basis of topology, however the most interesting part is when we start taking elements from the set or combining the elements of sets.

Arrangements with repetition, called also n-tuples of m-sets, given the fact that the arrangements are created by taking n elements from m sets. The number of elements is given by the formula: m^n. Thus, the arrangements of 2, 3, 4 elements with repetition will give 16, 64, respectively 256 combinations:

 -- arrangements of 2 elements with repetition
 SELECT * 
 FROM dbo.SplitList('1,2,3,4', ',') A 
    , dbo.SplitList('1,2,3,4', ',') B 
 ORDER BY A.Value, B.Value 
 
 -- arrangements of 3 elements with repetition
 SELECT * 
 FROM dbo.SplitList('1,2,3,4', ',') A 
    , dbo.SplitList('1,2,3,4', ',') B
    , dbo.SplitList('1,2,3,4', ',') C 
 ORDER BY A.Value, B.Value, C.Value

 -- arrangements of 4 elements with repetition
 SELECT * 
 FROM dbo.SplitList('1,2,3,4', ',') A 
    , dbo.SplitList('1,2,3,4', ',') B 
    , dbo.SplitList('1,2,3,4', ',') C 
    , dbo.SplitList('1,2,3,4', ',') D 
ORDER BY A.Value, B.Value, C.Value, D.Value 

Combinations are selections of items, such that the order of selection does not matter, though the elements don't repeat inside of the n-tuple. Combinations of 4 elements taken as 2, 3, respectively 4:

 -- combinations of 4 elements taken as 2
 SELECT * 
 FROM dbo.SplitList('1,2,3,4', ',') A 
      JOIN  dbo.SplitList('1,2,3,4', ',') B  
        ON A.Value < B.Value 
 ORDER BY A.Value, B.Value
 
 -- combinations of 4 elements taken as 3
 SELECT * 
 FROM dbo.SplitList('1,2,3,4', ',') A 
       JOIN dbo.SplitList('1,2,3,4', ',') B
  ON A.Value < B.Value 
            JOIN  dbo.SplitList('1,2,3,4', ',') C 
       ON B.Value < C.Value 
 ORDER BY A.Value, B.Value, C.Value

 -- combinations of 4 elements taken as 4
 SELECT * 
 FROM dbo.SplitList('1,2,3,4', ',') A 
      JOIN dbo.SplitList('1,2,3,4', ',') B 
 ON A.Value < B.Value 
           JOIN dbo.SplitList('1,2,3,4', ',') C 
      ON B.Value < C.Value 
                JOIN dbo.SplitList('1,2,3,4', ',') D 
    ON C.Value < D.Value 
 ORDER BY A.Value, B.Value, C.Value, D.Value 

Permutations, in exchange, involve arranging the elements of a set by interchanging their position. The permutations of 4 elements are given by following code:

 -- permutations of 4 elements 
 SELECT * 
 FROM dbo.SplitList('1,2,3,4', ',') A 
      JOIN dbo.SplitList('1,2,3,4', ',') B 
 ON A.Value <> B.Value 
           JOIN dbo.SplitList('1,2,3,4', ',') C 
             ON B.Value <> C.Value 
     AND A.Value <> C.Value 
         JOIN dbo.SplitList('1,2,3,4', ',') D 
    ON C.Value <> D.Value 
          AND B.Value <> D.Value 
   AND A.Value <> D.Value 
 ORDER BY A.Value, B.Value, C.Value, D.Value 

In praxis there are specific algorithms for generating the arrangements, combinations or permutations of a set, however I think that the above examples are simpler to understand as they reflect also the mathematical heuristic. The downside is that the complexity of the queries increases with the number of elements in the set.

In the end here's a fun example on how simple propositions can be created:

 -- creating propositions
 SELECT 'The ' + A.Value + ' ' + B.Value + ' ' + C.Value + '!' 
 FROM dbo.SplitList('cat,dog,bird,fish', ',') A 
    , dbo.SplitList('drinks,smells,eats,jumps', ',') B 
    , dbo.SplitList('loudly,nicely,grumpy,alone', ',') C 

Happy Coding!
Related Posts Plugin for WordPress, Blogger...

About Me

My photo
Koeln, NRW, Germany
IT Professional with more than 24 years experience in IT in the area of full life-cycle of Web/Desktop/Database Applications Development, Software Engineering, Consultancy, Data Management, Data Quality, Data Migrations, Reporting, ERP implementations & support, Team/Project/IT Management, etc.