11 July 2010

Evaluating Textual Expressions with CLR UDFs

    The developers having experience with scripting languages like VBScript or JavaScript are most probably used with Eval (VBScript/JavaScript) or Execute (VBScript) functions, functions that parse a string evaluating its content to an expression or evaluating a given set of instructions, allowing thus to create and execute code at runtime. In several occasions I needed similar functionality also in T-SQL, having to evaluate a formula stored in a text field or text variable. The latest situation of this type I could recall was when importing a set of BOM (Bill of Materials) data from Oracle into SQL Server. I had two problems then, one related to Oracle and the second to SQL Server. As each level in the (Oracle) BOM has a Quantity which needs to be multiplied with the Quantities from the child levels in order to get the total Required Quantity for each level, I needed a method to aggregate the respective values. Unlike in SQL Server, in Oracle (as far I know) there is no built-in functionality for that, however SYS_CONNECT_BY_PATH function (see example) allows to built a string with the total quantities available in the BOM structure and is even possible to evaluate the resulted string using a function, though as I was not allowed to create a function in the production environment then, I had to solve the issue in SQL Server (2000) itself. The concatenated string of quantities at each BOM level was looking something like ‘1*1*2*4*1’ (here for the 5thBOM  level), so I needed a way to evaluate easily the respective product.

    In SQL Server the Cast and Convert functions allow to cast a string holding a numeric value to a numeric data type but their utility stops there.
-- Cast/Convert (working) example
SELECT Cast('1' as int) * 25
Convert(int, '1') * 10

    Many years ago, I remember I “tried my luck” and attempted to run a similar script like the below one:
-- Cast/Convert (not working) example
SELECT Cast('1*2' as int) * 25
Convert(int, '1*2') * 10

    No wonder that I got an error:
Msg 245, Level 16, State 1, Line 1
Conversion failed when converting the varchar value '1*2' to data type int.

    I attempted to find, again without luck, a built-in solution for my problem. For dynamic created scripts could be used, in general, the sp_executesql stored procedure:
-- expression evaluation using sp_executesql (output: 6)
DECLARE @retval int;
@param int;

@param = 3

sp_executesql N'SELECT @retval = 1*2*@param', N'@param int, @retval int output', @param=@param, @retval = @retval output;


    Unfortunately the sp_executesql can’t be used from UDFs, therefore the functionality can’t be used in queries involving a select from a given object. However, given its simplicity, the string could be regarded as a list of numbers delimited by “*” (multiplication) sign, a loop or starting with SQL Server 2005, a common table expression could solve the problem but it will work only for this type of multiplications, unless the expression is parsed using specific mathematical algorithms. Then I wrote a function similar with the below one:
CREATE FUNCTION dbo.EvalProduct(


Purpose: evaluates the product of multiple numbers written as a text expression
Parameters: @ListValues varchar(max) - List of values to be multiplied
Sample call: SELECT dbo.EvalProduct('1*2*3*4')
SELECT dbo.EvalProduct('1.2*2.3*3.4')
SELECT dbo.EvalProduct('1.2')
SELECT dbo.EvalProduct('NULL')
@Index int
@Product decimal(18,2)

@Index = CharIndex('*', @ListValues)

@Product = CASE

     WHEN @Index>1 OR (@Index = 0 AND IsNumeric(@ListValues) = 1) THEN 1


WHILE @Index > 0

    SET @Product = @Product * Cast(Left(@ListValues, @Index-1) as decimal(18,2))

    SET @ListValues = Right(@ListValues, Len(@ListValues)-@Index)

    SET @Index = CharIndex('*', @ListValues)


IF IsNumeric(@ListValues)=1
SET @Product = @Product * Cast(@ListValues as decimal(18,2))

RETURN @Product

    The function, even if handy, it has its limitations, first of all it works only with products of positive numeric values, and secondly there is no chance to use it for more complex expressions. Fortunately, starting with SQL Server 2005 there is an alternative.  In last post, Data Profiling Using CLR Functions I introduced ExecuteScalarToInt and ExecuteScalarToDecimal CLR functions which could be used to evaluate a text expression to an integer, respectively a decimal value, and thus they could be used also to solve the above problem:
-- CLR functions approach (Output: 120, respectively 236.48)
SELECT dbo.ExecuteScalarToInt('SELECT 1*2*3*4*5') IntExample
dbo.ExecuteScalarToDecimal('SELECT 1.2*2.3*3.4*4.5*5.6') DecimalExample

    The functions could be actually used with any type of expression that returns a numeric value, the expressions could include also built-in or user-defined functions:
-- CLR with built-in mathematical functions (Output: 21.76)
SELECT dbo.ExecuteScalarToDecimal('SELECT Pi()*SQRT(47.98)')
--CLR with UDF (Output: 9.38)
SELECT dbo.ExecuteScalarToDecimal('SELECT dbo.EvalProduct(''1.2*2.3*3.4'')')

    This means that the two functions could be used for example in queries in which the formulas following to be executed are stored in tables or built dynamically. Let’s look at an example based on the values stored in a table:
-- creating a simple table
CREATE TABLE dbo.TestTextExpressions (
Expression varchar(max))

--inserting the values
INSERT INTO dbo.TestTextExpressions

-- running the formulas
SELECT Expression
dbo.ExecuteScalarToDecimal('SELECT ' + Expression) [Output]


CLR UDF - expression evaluation 
1.    Instead of adding the SELECT in the parameter string, it could be created a CLR function (e.g. EvalToInt) that includes it already. Not sure how much benefit could be obtained from this, however I prefer to have generalized functionality rather then providing specific functions for each scenario.
2.    Most probably there are also exceptions in which this approach can’t be used. As the two CLR functions don’t perform any validation, the usefulness of such functions depends also on the validity of the input. If the input is invalid, then an error will be thrown; on the other side, even if it’s possible to catch the error in the CLR functions itself, it could prove to be difficult to alter function’s behavior (for example returning a NULL value) without impacting the result.
3.    In order to run UDFs from CLR UDFs, the SystemDataAccess:=SystemDataAccessKind.Read must be added in CLR functions definitions, otherwise “This statement has attempted to access data whose access is restricted by the assembly” error message will be thrown.

PS:  It would be nice to receive some feedback on how the above functions helped you to solve a similar problem!

1 comment:

ZZZ Projects said...

Just to let you know that starting from SQL Server 2012, it's possible to evaluate arithmetic expression in T-SQL with Eval SQL.NET library

Formula VARCHAR(50) ,

VALUES ( 'x*y+z', 2, 4, 6 ),
( 'x+y*z', 2, 4, 6 ),
( '(x+y)*z', 2, 4, 6 )

DECLARE @sqlnet SQLNET = SQLNET::New('')

-- 14
-- 26
-- 36
SELECT @sqlnet.Code(Formula)
.Val('x', X)
.Val('y', Y)
.Val('z', Z).Eval()
FROM @table

Disclaimer: I'm the owner of the project Eval SQL.NET on GitHub