29 June 2010

Number of Records – The CLR Version

    While scanning rapidly the daily posts from MSDN blogs, I stopped on Chris Skorlinski’s “Example code executing TSQL for each Table in a database” post, wondering then on whether he’s using other techniques than I used in the past. I run into this type of need several times, especially in cases in which I wanted to report in an automated manner the number of records for a given set of tables, typically during ETL tasks. The problem could be easily solved with the help of a cursor, though it’s not the most elegant technique, even if it solves the problem fairly easy. I remember how I tried to find a way to execute a dynamic statement using a dynamic script inside of a  UDF (user-defined function), though that’s not possible because a stored procedure can’t be executed inside of a function. An alternative would be to use the OPENROWSET function and execute a statement using a linked server, though the OPENROWSET function doesn’t work with dynamically created statements.

    Five years ago, soon after SQL Server 2005 RTM appeared on the market, I tried to create a UDF programmatically using an example from Bob Beauchemin & co.’s “A First Look at SQL Server 2005 for Developers” book. The example worked then, but not without some inherent headaches, having to spent some time on troubleshooting the received errors, and if I remember correctly they were caused by a change in the way the connection was created from the SQL Server SQLClient. Anyway, that’s already history, though since then I was wondering on whether is possible to use a programmatically created UDF to return the number of records for a given table sent as parameter. Since then I haven’t had the chance to test that, even if it seemed logically possible, but seeing Chris Skorlinski’s post and as last week I installed Visual Studio 2010 on my home computer, I said to myself that that’s a nice way to see the tool at work. Thought and done, but again not without problems, running in several small but time consuming issues.

    So armed with some patience, I did a quick search on Microsoft site on how to use CLR scalar-valued functions under SQL Server 2008, running into this MSDN resource, which actually returns the record count using a static query. So I opened Visual Studio 2010 and created a new Database/SQL Server/Visual Basic SQL CLR Database Project named SQLServerProject1 (I forgot to give the project a meaningful name) and added a database reference to the AdventureWorks database using Windows Authentication.

CLR UDF - Database Reference

    Then in the new created project I’ve added a UDF to the project (Add/New Item/User Defined Function), naming the new created class as CLRLibrary, I replaced the existing function with the one from the MSDN article, added a new String input parameter called TableName, modified the query and changed UDF’s name to NumberRecords as per below piece of code:
Imports System 
Imports System.Data 
Imports System.Data.SqlClient 
Imports System.Data.SqlTypes 
Imports Microsoft.SqlServer.Server 
Imports System.Runtime.InteropServices 'added manually 

Partial Public Class UserDefinedFunctions 
SqlFunction(DataAccess:=DataAccessKind.Read)> _ 
Public Shared Function NumberRecords(ByVal TableName As String) As Integer 
Using conn As New SqlConnection("context connection=true") 
Dim cmd As New SqlCommand("SELECT COUNT(1) AS [NumberRecords] FROM " & TableName, conn) 
Return CType(cmd.ExecuteScalar(), Integer) 
End Using 
End Function 
End Class 

    In addition, I had to add also a reference to System.Runtime.InteropServices, and now the function was ready for testing so I built the solution; then I copied CLRLibrary’s Full Path in SQL Server Management Studio and attempted to follow the steps described in the MSDN article. Now it’s time for some troubleshooting fun… First I enabled the CLR integration configuration option by running the below script:

--enable CLR integration use master 
go sp_configure 'clr enabled', 1  

    Using the master database I attempted to create a reference to the assembly, using the Full Path and the dll name.
CREATE ASSEMBLY SqlServerProject1 
FROM 'D:\\SqlServerProject1.dll' 

   The next step was to declare the function:

CREATE FUNCTION dbo.NumberRecords(@table_name nvarchar(100)) 
EXTERNAL NAME SqlServerProject1.[SqlServerProject1.UserDefinedFunctions].NumberRecords 

    Actually is the final form of the DDL script, because initially I used only a combination of the Project, class and function name (e.g. SqlServerProject1.UserDefinedFunctions.NumberRecords), receiving the following error message:
Msg 6505, Level 16, State 2, Procedure NumberRecords, Line 1
Could not find Type 'UserDefinedFunctions' in assembly 'SqlServerProject1'.

    After a quick search on Google I found out from Eggheadcafe forum that the Namespace was missing from the declaration, and once I made this change the above statement worked without problems. The next step was naturally to test on whether the function is working:

SELECT dbo.NumberRecords('AdventureWorks.HumanResources.Department')  

And it worked! So it’s time to test it against a set of tables from AdventureWorks database:
SELECT s.name [schema_name]
   , T.name table_Name 
, dbo.NumberRecords('AdventureWorks.' + s.name + '.' + t.name) NumberRecords 
FROM AdventureWorks.sys.tables t 
      JOIN AdventureWorks.sys.schemas s 
        ON t.schema_id = S.schema_id  
ORDER BY s.name  
, T.name 

  Actually I missed some important thing from the above steps, somewhere while attempting to test the function or to register the assembly, I received the following error message:
A .NET Framework error occurred during execution of user-defined routine or aggregate "NumberRecords": System.InvalidOperationException: Data access is not allowed in this context.  Either the context is a function or method not marked with DataAccessKind.Read or SystemDataAccessKind.Read, is a callback to obtain data from FillRow method of a Table Valued Function, or is a UDT validation method.

   A post on MSDN SQL Server forum remembered me that I forgot to check on whether I’m using the 3.5 or 4.0 .Net Framework. So in the SQLServerProject1 project, from main menu/Project/SQLServerProject1 Properties/Compile/Advice Compile Options/Target Framework, I changed the value to .Net Framework 3.5. This change requires the project to be closed and reopened.

    In between the above steps I had also to check on whether the assembly was registered using the below script:

--assembly file details 
SELECT A.name [Assembly] 
, AF.name File_Path 
FROM sys.assemblies A 
     JOIN sys.assembly_files AF 
        ON A.assembly_id = AF.assembly_id 
WHERE A.is_user_defined=1 
--AND A.name='SqlServerProject1' 

    Also, if you’d like to remove the assembly from the database, you’ll have first to drop the function, and only then drop the assembly:

-- cleaning up
DROP FUNCTION NumberRecords 
DROP ASSEMBLY SqlServerProject1 

    It can be further discussed on whether this approach is acceptable as performance or not, on whether it can be misused, etc. Actually a nice misuse is the following statement which passes a WHERE constraint with the table name.

SELECT dbo.NumberRecords('AdventureWorks.HumanResources.Department WHERE Name LIKE ''P%''') 

    It could be created a second parameter for passing the WHERE constraint or, why not, pass the whole query as parameter, as long only a scalar is returned. I hope I haven’t forgot any step, I’m kind of mixing the facts because I attempted also to create the same function using a new created login, running into other type of issues that maybe deserve a second post.

No comments:

Related Posts Plugin for WordPress, Blogger...