12 January 2008

SQL Server and Excel Data

Looking after information about SQL Server 2008, I stumble over Bob Beauchemin’s blog, the first posting I read Another use for SQL Server 2008 row constructors demonstrating a new use of VALUES clause that allows to insert multiple lines in a table by using only one query, or to group a set of values in a table and use them as source for a JOIN. I was waiting for this feature to be available under SQL Server 2005, though better later than never!

The feature is useful when you need to limit the output of a query based on matrix (tabular) values coming from an Excel or text file. For exemplification I’ll use HumanResources.vEmployee view from AdventureWorks database that co/mes with SQL Server 2005, you can download it from Code Plex in case you don’t have it for SQL Server 2008.
Let’s suppose that you have an Excel file with Employees for which you need contact information from a table available on SQL Server. You have the FirstName, MiddeName and LastName, and you need the EmailAddress and Phone. In SQL Server 2008 you can do that by creating a temporary table-like structure on the fly using VALUES clause, and use it then in JOIN or INSERT statements.

Query 1

The heart of the query is the below structure, where B(FirstName, MiddleName, LastName) is the new table, each row in its definition being specified by comma delimited triples of form ('FirstName ', 'MiddleName ', ' LastName'):



The construct it’s time consuming to build manually, especially when the number of lines is considerable big, though you can get the construct in Excel with the help of an easy formula.



The formula from column D is = ", ('" & A2 & "','" & B2 & "','" & C2 & "')" and it can be applied to the other lines too. You just need to copy now the data from Column D to SQL Server and use them in Query with a few small changes. Of course, you can create also a custom function (macro) in Excel to obtain the whole structure is a singe cell.

You can do something alike under older versions of SQL Server (or other databases) using a simple trick – concatenating the values from each column by row by using a delimiter like “/”, “~”, “|” or any other delimiter, though you have to be sure that the delimiter isn’t found in your data sources (Excel and table). Using “/” the formula is = ", '" & A2 & "/" & B2 & "/" & C2 & "'".




Then you have to use the same trick and concatenate the columns from the table, the query becoming:

Query 2

This technique involves small difficulties when:
• The data used for searching have other type than string derived data types, however that can be overcome by casting the values to string before concatenation.
• The string values contain spaces at extremities, so it’s better to trim the values using LTrim and RTrim functions.
• The values from the two sources are slightly different, for example diacritics vs. Latin standard characters equivalents, for this being necessary a transformation of the values to the same format.

No comments: