Showing posts with label Oracle. Show all posts
Showing posts with label Oracle. Show all posts

20 October 2018

ERP Systems: AX 2009 vs. D365 FO - Product Prices with two Different Queries

   In Dynamics AX 2009 as well in Dynamics 365 for Finance and Operations (D365 FO) the Inventory, Purchase and Sales Prices are stored in InventTableModule table on separate rows, independently of the Products, stored in InventTable. Thus for each Product there are three rows that need to be joined. To get all the Prices one needs to write a query like this one:

SELECT ITM.DataAreaId 
, ITM.ItemId 
, ITM.ItemName 
, ILP.UnitId InventUnitId
, IPP.UnitId PurchUnitId
, ISP.UnitId SalesUnitId
, ILP.Price InventPrice
, IPP.Price PurchPrice
, ISP.Price SalesPrice
FROM dbo.InventTable ITM
     JOIN dbo.InventTableModule ILP
       ON ITM.ItemId = ILP.ItemId
      AND ITM.DATAAREAID = ILP.DATAAREAID 
      AND ILP.ModuleType = 0 -- Inventory
     JOIN dbo.InventTableModule IPP
       ON ITM.ItemId = IPP.ItemId
      AND ITM.DATAAREAID = IPP.DATAAREAID 
      AND IPP.ModuleType = 1 -- Purchasing
     JOIN dbo.InventTableModule ISP
       ON ITM.ItemId = ISP.ItemId
      AND ITM.DATAAREAID = ISP.DATAAREAID 
      AND ISP.ModuleType = 2 -- Sales 
WHERE ITM.DataAreaId = 'abc'

Looking at the query plan one can see that SQL Server uses three Merge Joins together with a Clustered Index Seek, and recommends using a Nonclustered Index to increase the performance:

image

   To avoid needing to perform three joins, one can rewrite the query with the help of a GROUP BY, thus reducing the number of Merge Joins from three to one:

SELECT ITD.DataAreaID 
, ITD.ItemId 
, ITM.ItemName  
, ITD.InventPrice
, ITD.InventPriceUnit
, ITD.InventUnitId
, ITD.PurchPrice
, ITD.PurchPriceUnit
, ITD.PurchUnitId
, ITD.SalesPrice
, ITD.SalesPriceUnit
, ITD.SalesUnitId
FROM dbo.InventTable ITM
     LEFT JOIN (—price details
     SELECT ITD.ITEMID
     , ITD.DATAAREAID 
     , Max(CASE ITD.ModuleType WHEN 0 THEN ITD.Price END) InventPrice
     , Max(CASE ITD.ModuleType WHEN 0 THEN ITD.PriceUnit END) InventPriceUnit
     , Max(CASE ITD.ModuleType WHEN 0 THEN ITD.UnitId END) InventUnitId
     , Max(CASE ITD.ModuleType WHEN 1 THEN ITD.Price END) PurchPrice
     , Max(CASE ITD.ModuleType WHEN 1 THEN ITD.PriceUnit END) PurchPriceUnit
     , Max(CASE ITD.ModuleType WHEN 1 THEN ITD.UnitId END) PurchUnitId
     , Max(CASE ITD.ModuleType WHEN 2 THEN ITD.Price END) SalesPrice
     , Max(CASE ITD.ModuleType WHEN 2 THEN ITD.PriceUnit END) SalesPriceUnit
     , Max(CASE ITD.ModuleType WHEN 2 THEN ITD.UnitId END) SalesUnitId
     FROM dbo.InventTableModule ITD
     GROUP BY ITD.ITEMID
     , ITD.DATAAREAID 
    ) ITD
       ON ITD.ITEMID = ITM.ITEMID
      AND ITD.DATAAREAID = ITM.DATAAREAID
WHERE ITD.DataAreaID = 'abc'
ORDER BY ITD.ItemId

And here’s the plan for it:

image

It seems that despite the overhead introduced by the GROUP BY, the second query performs better, at least when all or almost all records are performed.

In Dynamics AX there were seldom the occasions when I needed to write similar queries. Probably, most of the cases are related to the use of window functions (e.g. the first n Vendors or Vendor Prices). On the other side, a bug in previous versions of Oracle, which was limiting the number of JOINs that could be used, made me consider such queries also when was maybe not the case.  

What about you? Did you had the chance to write similar queries? What made you consider the second type of query?

Happy coding!

06 November 2016

SQL Server Administration: System.OutOfMemoryException in SQL Server Management Studio and other 32-bit Drawbacks

    I was playing this week with a few datasets downloaded from the web on various topics, trying to torture the data until they’ll confess something. A few of the datasets were prepared for load into a MySQL database as individual INSERT INTO statements. They were containing between 100000 and a few millions of records. While looking at the big but slim datasets in SSMS (SQL Server Management Studio) and reconciling the differences between MySQL and SQL Server I got several times the System.OutOfMemoryException exception, SSMS crashing one or two times. That should be ok, given the number of records, though I was surprised that I got the same error message while executing the INSERT INTO statements for one of the smallest datasets which had about 300000 records:

    „An error occurred while executing batch. Error message is: Exception of type 'System.OutOfMemoryException' was thrown”

    Kb 2874903 brings some light into the topic – SSMS is still a 32-bit process and thus limited to 2GB of memory. The Kb offers three methods to avoid this issue. The first two, outputting the query results to text or file didn’t worked. The third method based on using sqlcmd utility worked smoothly with a syntax like the one below:
sqlcmd -i “<file_name.sql>” -d “<database name>”

    So it doesn’t matter that you’re having a supercomputer and that working with big datasets becomes a necessity nowadays, this limitation can make data loading just a little bit more complicated. On one side, it’s true that when dealing with such datasets is probably recommended to use directly sqlcmd to execute the scripts. On the other side, independently from this type of problem, even if understandable from the need of keeping backwards compatibility with 32-bit platforms/solutions, it’s hard to digest the fact that Microsoft keeps some of its products 32-bit based when SQL Server is targeting 64-bit platforms. One has same problem when using BIDS (Business Intelligence Development Studio), developing SSRS, SSIS or SSAS solutions under 32-bit and having maybe to deploy the code as 64-bit (e.g. SQL Server Agent). From my point of view most of the issues I had were when dealing with proprietary drivers like the ones for Oracle or even for MS Office. In addition in SSIS there could be features that are only available in 32-bit versions, or have limitations on 64-bit computers (see [5]). As it seems also the SQL Server Data Tools (SSDT) will have similar drawbacks…

   Anyway, sqlcmd utility saved the day with a minimum of overhead. Unfortunately it’s not always that easy to solvethe compatibility issues between 32-bit and 64-bit software and platforms.

Update 20.06.2017:
   One can synchronize the runtime version between BIDS and SQL Server Agent pretty easy. In BIDS under "Configuration Properties/Debug Option" at Project level, there is the “Run64BitRuntime” Property. Set to false it will run your package on 32-bit version. In a SQL Server Agent Package, there is the “Use 32 bit runtime” Checkbox under “Execution options” at step level. Checking this checkbox will run your package on 32-bit version.

   A hint that the two values might be out of synch is the following error message raised when running the package:
"Attempt to load Oracle client libraries threw BadImageFormatException. This problem will occur when running in 64 bit mode with the 32 bit Oracle client components installed."
Resources:
[1] Microsoft Support (2013) Kb 2874903: "System.OutOfMemoryException" exception when you execute a query in SQL Server Management Studio https://support.microsoft.com/en-us/kb/2874903
[2] MSDN (2016) SQL Server 2016: sqlcmd Utility https://msdn.microsoft.com/en-us/library/ms162773.aspx
[3] MSDN (2016) SQL Server 2016: Use the sqlcmd Utility https://msdn.microsoft.com/en-us/library/ms180944.aspx
[4] MSDN (2012) Introducing Business Intelligence Development Studio  https://msdn.microsoft.com/en-us/library/ms173767.aspx
[5] SQL Server 2008 R2: 64 bit Considerations for Integration Services https://technet.microsoft.com/en-us/library/ms141766(v=sql.105).aspx





08 December 2010

SQL Reloaded: Pulling the Strings of SQL Server IX (Special Characters)

    Under special characters denomination are categorized typically the characters that don’t belong to the alphabet of a given language, typically English (a-z, A-Z), or the numeric digits (0-9). Characters like umlauts (e.g. ä, ë, ö, ü, ÿ), accents (e.g. é, í, ó, ú) and other type of characters used in specific languages like German, French, Turkish, Hungarian, Romanian, Dutch or Swedish, together with punctuation signs or graphical characters are falling under the designation of special characters. It’s true that SQL Server, like many other databases, supports Unicode, a standard designed to encode such characters, though not all database designs are taking Unicodes into account. Preponderantly the old database and other software solutions use the non-unicode string data types (char, varchar, text) fact that makes the special characters to be displayed inadequately, sometimes undecipherable. In order to avoid this behavior could be decided to port the data types to unicode or use only the standard characters of English language, both solutions with their positive and negative aspects. In such cases, especially during migration project or ETL tasks, eventually as part of a Data Quality initiative, it’s preferred to identify and replace the special characters in a manual, automatic or semi-automatic fashion. In addition, there are also cases in which, from various reasons, some attributes are not allowed or should not include special characters, and also this aspect could be included in a Data Quality initiative.

    During assessment of Data Quality, in an organized or less organized manner, a first step resides in understanding the quality of the data. In the current case this resumes primarily in identifying how many records contain special characters, how could be the data cleaned, and the “cost” for this activity. Actually, before going this far, must be defined the character sets included in special characters, the definition could vary from case to case. For example in some cases could be considered also the space or the important punctuation signs as valid characters, while in others they may not be allowed. There could be identified thus multiple scenarios, though I found out that the range of characters a-z, A-Z, 0-9 and the space are considered as valid character in most of the cases. For this purpose could be built a simple function that iterates through all the characters of a string and identifies if there is any character not belonging to the before mentioned range of valid characters. In order to address this, a few years back I built a function similar with the below one:

-- checks if a string has special characters 
CREATE FUNCTION dbo.HasSpecialCharacters( 
@string nvarchar(1000)) 
RETURNS int 
AS  
    BEGIN 
         DECLARE @retval int 
         DECLARE @index int 
         DECLARE @char nchar(1)  

         SET @retval = 0 
         SET @index = 1  

         WHILE (@index <= IsNull(len(@string), 0) AND @retval=0) 
        BEGIN  
           SET @char = Substring(@string, @index, @index+1) 
           IF NOT (ASCII(@char) BETWEEN 48 AND 57 -- numeric value 
             OR ASCII(@char) BETWEEN 65 AND 90 -- capital letters 
            OR ASCII(@char) BETWEEN 97 AND 122 -- small letters 
            OR ASCII(@char) = 32) --space 
           BEGIN 
                SET @retval = @index 
           END 
           ELSE 
               SET @index = @index + 1  
     END  

    RETURN (@retval) 
END 

    Function’s logic is based on the observation that the ASCII of numeric values could be found in the integer interval between 48 and 57, the capital letters between 65 and 90, while the small letters between 97 and 122. By adding the ASCII for space and eventually several other characters, the check on whether an character is valid resuming thus to only 4 constraints. Here’s the function at work:  

-- testing HasSpecialCharacters function 
SELECT dbo.HasSpecialCharacters('kj324h5kjkj3245k2j3hkj342jj4') Example1 
, dbo.HasSpecialCharacters('Qualität') Example2 
, dbo.HasSpecialCharacters('Änderung') Example3 
, dbo.HasSpecialCharacters('') Example4 
, dbo.HasSpecialCharacters(NULL) Example5 
, dbo.HasSpecialCharacters('Ä') Example6 
, dbo.HasSpecialCharacters('ä') Example7 
, dbo.HasSpecialCharacters('a') Example8 
 
special characters 1

    As can be seen, the function returns the position where a special character is found, fact that enables users to identify the character that causes the problem. A similar function could be built also in order to count the number of special characters found in a string, the change residing in performing a counter rather then returning the position at the first occurrence of a special character.

    The function might not be perfect though it solves the problem. There are also other alternatives, for example of storing the special characters in a table and performing a simple join against the target table. Another solution could be based on the use RegEx functionality, either by using OLE automation or directly CLR functionality. There could be done variations on the above solution too by limiting to check on whether the characters of a string are falling in the range projected by the ASCII function. That’s what the following function does:  

-- checks if a string has special characters falling in an interval  
CREATE FUNCTION dbo.HasCharNotInASCIIRange( 
@string nvarchar(1000) 
, @start int 
, @end int) 
RETURNS int 
AS  
BEGIN      
    DECLARE @retval int 
    DECLARE @index int 
    DECLARE @char nchar(1) 
 
    SET @retval = 0 
    SET @index = 1 
    WHILE (@index <= IsNull(len(@string), 0) AND @retval=0) 
    BEGIN  
         SET @char = Substring(@string, @index, @index+1) 
         IF NOT (ASCII(@char) BETWEEN @start AND @end)  
        BEGIN 
              SET @retval = @index 
        END 
        ELSE 
             SET @index = @index + 1  
     END 
     RETURN (@retval) 
END 

      
   With this function are necessary 4 calls in order to identify if a string contains special characters, though we loose the flexibility of identifying the first character that is invalid. We could still identify the first occurrence by taking the minimum value returned by the 4 calls, however, unlike Oracle (see Least function), SQL Server doesn’t have such a function, so we’ll have eventually to built it. Anyway, here’s the above function at work:  

-- testing HasCharNotInASCIIRange function 
SELECT dbo.HasCharNotInASCIIRange('k12345', 48, 57) Example1 
, dbo.HasCharNotInASCIIRange('12k345', 48, 57) Example2 
, dbo.HasCharNotInASCIIRange('12345', 48, 57) Example3 
, dbo.HasCharNotInASCIIRange(' 12345', 48, 57) Example4 
, dbo.HasCharNotInASCIIRange('12345 ', 48, 57) Example5 
, dbo.HasCharNotInASCIIRange('', 48, 57) Example6 
, dbo.HasCharNotInASCIIRange(NULL, 48, 57) Example7 
, dbo.HasCharNotInASCIIRange('', 48, 57) Example8 
, dbo.HasCharNotInASCIIRange('a', 48, 57) Example9 
, dbo.HasCharNotInASCIIRange('Ä', 48, 57) Example10 
, dbo.HasCharNotInASCIIRange('ä', 32, 32) Example11 

special characters 2

15 August 2010

Data Security: SQL Injection I - Introduction

Introduction

  If you are working in IT, most probably you’ve heard already about SQL Injection, if not then might be it’s a good idea to ask your colleagues and eventually your IT manager if your company has any policies related to it. If you are working for a software vendor or a consultancy company then SQL Injection countermeasure techniques might be quite well positioned in the list of best practices in what concerns the development of Web/Desktop Applications, Web Services or database-related logic adopted by your company. If you are working for a company, other than the two mentioned above, and have various software projects on the role or already in house, then most probably you’ll have to ask if the software vendors you are working with have took into consideration the SQL Injection threats and proved their solutions against them. On contrary, if you have nothing to do with IT at all, it might still be a good idea to ask your IT department if they have anything in place related to SQL Injection – Security Policy, security best practices, etc.

Definition

  Wikipedia defines SQL Injection as “a code injection technique that exploits a security vulnerability occurring in the database layer of an application” [3], the code injection being defined as “the exploitation of a computer bug that is caused by processing invalid data” [4]. For a programmer the definition is acceptable, though for other type of professionals it might not be so clear what’s about, especially when they are not familiar with IT terminology. I find more clear the definition provided by J. Clarke et. al, who in his book SQL Injection Attacks and Defense, defines SQL Injection as the vulnerability that results when you give an attacker the ability to influence the Structured Query Language (SQL) queries that an application passes to a back-end database” [2]. I will slightly modify the last definition and say that the SQL injection is a security vulnerability residing in the possibility to alter the intended behavior of the SQL Queries passed to the database.

Some Background

    At the beginning of our century, with the increase importance of Web Applications whose availability over WAN/Internet (networks) brought new security issues, the SQL Injection became a really hot topic given the damages such techniques could do to an application, with just a few tricks the “hacker” having the possibility to enter in the application and even in the machine hosting the database used, entering thus in the possession of sensitive information, and above all having the possibility of damaging the database. J. Clarke et. al remarks that the first connection between web applications and SQL injection is widely accredited to Rain Forest Puppy, who in an article titled “NT Web Technology Vulnerabilities” (see “ODBC and MS SQL server 6.5” section) written in 1998 for Phrack, an e-zine written by and for hackers [2], was describing the behavior specific to SQL Injection in relation to MS SQL Server 6.5. 
 
    I remember when my boss break us the news that we have to protect urgently our applications against SQL Injection, having to redesign some of the database objects and components in order to protect our applications against such techniques. I was then in my first or second year of professional experience, so the topic was new and quite intriguing not only for myself but also for my colleagues, some of them having a few more years of professional programming experience that I did and, I hope I’m not mistaking, none (or few) of them actually have heard about it. It was interesting to check how simple techniques could do so much damage. At those times there were few articles on SQL Injection and specific countermeasure techniques, not to mention best practices, so we were kind of groping in the dark in finding a countermeasure to the problem.

State of Art

    Since then, the number of search engines hits on the topic is quite impressive, many professionals approaching the problem in their way, Vendors started to design their solutions and make aware programmers on best practices in order to minimize this type of security threat, books were written on this topic, the awareness increased between developers and other type of IT professionals. Even if considerable effort has been made into this direction, and the topic appears often on the blogs, there are still many web sites not designed to address SQL Injection concerns. In 2007, The WhiteHat Security, placed SQL Injection on 5th position in top of vulnerabilities, estimating that 1 out of 5 web sites is vulnerable to SQL Injection [1]. In 8th Web Security Report based on 2009 data provided by WhiteHat Security[5], and as it seems also in 9th report [6], SQL Injection remains on the same position, what’s interesting to remark is the split per scripting technology provided in [6]: 
 
SQL Injection - Statistics WhiteHat

   In Web Hacking Incident Database maintained by Web Application Security Consortium, SQL Injection is considered as 17.97 % out of the total 512 reported top attack methods. Even if the number of reported attacks is insignificant in report to the number of sites available on the web, the percentage of cases seems to be in agreement with the number provided in WhiteHat Security reports.

Resources

   If the topic made you curious, you could find out more with just a simple search on the Web. There are many professionals who wrote on this topic, however it’s a good idea to start directly with the resources provided by the RDBMS vendors, for example Microsoft through its Security Research & Defense blog, in SQL Injection Attacks post has an interesting list of resources on this topic. A nice document on ‘How to write SQL injection proof PL/SQL’ comes from Oracle, an interesting presentation on ‘SQL Injection Myths and Fallacies’ was made at MySQL Conference & Expo, etc.

References:
[1] WhiteHat Security. [2007]. Website Security Statistics Report. [Online] Available from: http://www.whitehatsec.com/home/assets/WPStatsreport_100107.pdf (Accessed: 15 August 2010)
[2] J.Clarke et. al (2009). SQL Injection Attacks and Defense. Elsevier. ISBN: 978-1-59749-424-3
[3] Wikipedia. (2010). SQL Injection. [Online] Available from: http://en.wikipedia.org/wiki/Sql_injection (Accessed: 15 August 2010)
[4] Wikipedia. (2010). Code Injection. [Online] Available from: http://en.wikipedia.org/wiki/Code_injection (Accessed: 15 August 2010)
[5] WhiteHat Security. [2009]. Website Security Statistic Report, 8th Ed. [Online] Available from: http://www.whitehatsec.com/home/assets/WPstats_fall09_8th.pdf (Accessed: 15 August 2010)
[6] WhiteHat Security. [2010]. Website Security Statistic Report, 9th Ed. [Online] Available from: http://www.slideshare.net/jeremiahgrossman/whitehat-security-9th-website-security-statistics-report-3995771 (Accessed: 15 August 2010)

11 July 2010

SQL Reloaded: Evaluating Textual Expressions with CLR UDFs

The developers having experience with scripting languages like VBScript or JavaScript are most probably used with Eval (VBScript/JavaScript) or Execute (VBScript) functions, functions that parse a string evaluating its content to an expression or evaluating a given set of instructions, allowing thus to create and execute code at runtime. In several occasions I needed similar functionality also in T-SQL, having to evaluate a formula stored in a text field or text variable. The latest situation of this type I could recall was when importing a set of BOM (Bill of Materials) data from Oracle into SQL Server. I had two problems then, one related to Oracle and the second to SQL Server. 

As each level in the (Oracle) BOM has a Quantity which needs to be multiplied with the Quantities from the child levels in order to get the total Required Quantity for each level, I needed a method to aggregate the respective values. Unlike in SQL Server, in Oracle (as far I know) there is no built-in functionality for that, however SYS_CONNECT_BY_PATH function (see example) allows to built a string with the total quantities available in the BOM structure and is even possible to evaluate the resulted string using a function, though as I was not allowed to create a function in the production environment then, I had to solve the issue in SQL Server (2000) itself. The concatenated string of quantities at each BOM level was looking something like ‘1*1*2*4*1’ (here for the 5thBOM  level), so I needed a way to evaluate easily the respective product.

    In SQL Server the Cast and Convert functions allow to cast a string holding a numeric value to a numeric data type but their utility stops there.

-- Cast/Convert (working) example 
SELECT Cast('1' as int) * 25 
, Convert(int, '1') * 10 
 
    Many years ago, I remember I “tried my luck” and attempted to run a similar script like the below one:

-- Cast/Convert (not working) example 
SELECT Cast('1*2' as int) * 25 
, Convert(int, '1*2') * 10 
 
    No wonder that I got an error:
Msg 245, Level 16, State 1, Line 1
Conversion failed when converting the varchar value '1*2' to data type int.

   I attempted to find, again without luck, a built-in solution for my problem. For dynamic created scripts could be used, in general, the sp_executesql stored procedure:

-- expression evaluation using sp_executesql (output: 6) DECLARE @retval int; 
DECLARE @param int; 
SET @param = 3 
EXECUTE sp_executesql N'SELECT @retval = 1*2*@param', N'@param int, @retval int output', @param=@param, @retval = @retval output;  
SELECT @retval 

    Unfortunately the sp_executesql can’t be used from UDFs, therefore the functionality can’t be used in queries involving a select from a given object. However, given its simplicity, the string could be regarded as a list of numbers delimited by “*” (multiplication) sign, a loop or starting with SQL Server 2005, a common table expression could solve the problem but it will work only for this type of multiplications, unless the expression is parsed using specific mathematical algorithms. Then I wrote a function similar with the below one:

CREATE FUNCTION dbo.EvalProduct( 
@ListValues varchar(max)) 
RETURNS decimal(18,2) 
/* 
Purpose: evaluates the product of multiple numbers written as a text expression 
Parameters: @ListValues varchar(max) - List of values to be multiplied 
Sample call: SELECT dbo.EvalProduct('1*2*3*4') 
SELECT dbo.EvalProduct('1.2*2.3*3.4') 
SELECT dbo.EvalProduct('1.2') 
SELECT dbo.EvalProduct('NULL') 
*/ AS 
BEGIN 
DECLARE @Index int 
DECLARE @Product decimal(18,2) 
SET @Index = CharIndex('*', @ListValues) 
SET @Product = CASE 
     WHEN @Index&gt;1 OR (@Index = 0 AND IsNumeric(@ListValues) = 1) THEN 1 
     ELSE NULL 
END  

WHILE @Index &gt; 0  
BEGIN 
    SET @Product = @Product * Cast(Left(@ListValues, @Index-1) as decimal(18,2)) 

    SET @ListValues = Right(@ListValues, Len(@ListValues)-@Index) 
    SET @Index = CharIndex('*', @ListValues) 
END  

IF IsNumeric(@ListValues)=1 
SET @Product = @Product * Cast(@ListValues as decimal(18,2))  

RETURN @Product 
END 

      
    The function, even if handy, it has its limitations, first of all it works only with products of positive numeric values, and secondly there is no chance to use it for more complex expressions. Fortunately, starting with SQL Server 2005 there is an alternative.  In last post, Data Profiling Using CLR Functions I introduced ExecuteScalarToInt and ExecuteScalarToDecimal CLR functions which could be used to evaluate a text expression to an integer, respectively a decimal value, and thus they could be used also to solve the above problem:  

-- CLR functions approach (Output: 120, respectively 236.48) 
SELECT dbo.ExecuteScalarToInt('SELECT 1*2*3*4*5') IntExample 
, dbo.ExecuteScalarToDecimal('SELECT 1.2*2.3*3.4*4.5*5.6') DecimalExample  

    The functions could be actually used with any type of expression that returns a numeric value, the expressions could include also built-in or user-defined functions:
 
-- CLR with built-in mathematical functions (Output: 21.76)
SELECT dbo.ExecuteScalarToDecimal('SELECT Pi()*SQRT(47.98)') 

--CLR with UDF (Output: 9.38) 
SELECT dbo.ExecuteScalarToDecimal('SELECT dbo.EvalProduct(''1.2*2.3*3.4'')') 

    This means that the two functions could be used for example in queries in which the formulas following to be executed are stored in tables or built dynamically. Let’s look at an example based on the values stored in a table:

-- creating a simple table 
 CREATE TABLE dbo.TestTextExpressions ( 
Expression varchar(max)) 

--inserting the values 
INSERT INTO dbo.TestTextExpressions 
VALUES ('1*2*3*4') 
, ('Pi()*SQRT(47.98)') 
, ('Cos(10)') 
, ('Year(GetDate())') 
, ('Square(10.45)') 

-- running the formulas 
SELECT Expression 
, dbo.ExecuteScalarToDecimal('SELECT ' + Expression) [Output]      
FROM dbo.TestTextExpressions   


CLR UDF - expression evaluation 
   
Notes:
1.    Instead of adding the SELECT in the parameter string, it could be created a CLR function (e.g. EvalToInt) that includes it already. Not sure how much benefit could be obtained from this, however I prefer to have generalized functionality rather then providing specific functions for each scenario.
2.    Most probably there are also exceptions in which this approach can’t be used. As the two CLR functions don’t perform any validation, the usefulness of such functions depends also on the validity of the input. If the input is invalid, then an error will be thrown; on the other side, even if it’s possible to catch the error in the CLR functions itself, it could prove to be difficult to alter function’s behavior (for example returning a NULL value) without impacting the result.
3.    In order to run UDFs from CLR UDFs, the SystemDataAccess:=SystemDataAccessKind.Read must be added in CLR functions definitions, otherwise “This statement has attempted to access data whose access is restricted by the assembly” error message will be thrown.

PS:  It would be nice to receive some feedback on how the above functions helped you to solve a similar problem!

20 June 2010

SSIS: Using Oracle as Data Source

    Making SSIS work with Oracle doesn’t seem to be a complicated task especially after several years of experience in doing that, though, as usual, something new appears in the landscape – new software versions, new requirements, an environment with its own particularities, etc. In general, when an application needs a connection to an Oracle server, is needed to install on the client computer several components that come with Oracle Client, add/configure the TNS (Transparent Network Substrate) name, eventually set-up some global variables in case the components were not installed in the default location, and depending on the chosen provider might be needed to configure also a DSN (Data Source Name) pointing to the Oracle server. SSIS makes actually no exception from these steps, once the steps performed you should be in theory ready to develop, test, deploy, schedule and run packages using a connection to an Oracle server.

    Unfortunately, from my experience, every 2-3 installations, there is a problem with the Oracle Client and its configuration, most of the times the solution being quite simple – removing some declarations from the set-up files, correcting the global variables or the TNS name. As it seems the Oracle Client is quite sensible to the changes in the default installation path, therefore is indicated to install the Client using the default path unless you’d like to gain more experience in troubleshooting such installation issues. In addition, as any respectable company, some of the products come with their own defects, the patching of such issues being not so easy, therefore it’s advisable to check beforehand the known issues coming with the Client version you’d need to install, preferably in case you need to take advantage of the latest Oracle features, it makes sense to install the latest stable Client.  

    Because the Oracle Client downloadable package has a few hundred MB, there is a thinner Oracle alternative to Oracle Client, namely Instant Client package, the components could be downloaded from Oracle site (here) and installed individually, as a minimum being necessary to install Instant Client Package – Basic or Basic Lite versions, the ODBC libraries and the SQL *Plus libraries in case you want to test the connection. Of course, after case could be installed also the JDBC, SDK or any other packages Oracle made available.

    On 64 bit platforms might be needed to install in parallel the 32 and 64 bit Oracle/Instant Clients (see SSIS, Oracle and X64 post from Business Vision DEV Team), while in order to troubleshoot the various issues could be a good idea to check the differences between 32 bit and 64 bit registry (here). 

    There are several drivers that allows you to connect to an Oracle database using SSIS, the most popular ones:
- Microsoft OLE BD Provider for Oracle
- Oracle Provider for OLE DB
- .Net Framework Data Provider for Oracle
- .Net Framework Data Provider for Odbc
- Oracle Data Provider for .NET

    The Microsoft drivers come with MDAC (Microsoft Data Access Components), its latest version being MDAC 2.8. Starting with Windows Vista and Windows Server 2008, Microsoft changed MDAC into WDAC (Windows Data Access Components), including it as part of the operating system, removing thus the need to redistribute the components. See Data Access Technologies Road Map, FAQ and Troubleshooting MDAC/WDAC for MDAC/WDAC architecture, components and releases, respectively troubleshooting. Given the various issues that exist with a particular MDAC/WDAC library, see in KB301202 how you could check the current version by using the registers, while for  Windows version up to Windows Server 2003 could be used the Component Checker MDAC Utility.

    If .Net Framework Data Provider for Oracle, respectively for Odbc come with the .Net framework, in exchange, Oracle Data Provider for .NET, see also FAQ is Oracle’s implementation for ADO.NET data access,  supposed to take advantage of advanced Oracle database functionality, it comes with ODAC (Oracle Data Access Components). Please note that you might need in install the respective components in addition to the Oracle Client.

    There are also third party drivers for Oracle or even particularly for SSIS, for example Microsoft Connectors Version 1.1 for Oracle and Teradata for use with SSIS from Attunity; see also a few tips from SQL Server Performance blog.

    Before taking any decision on which driver to use, it might be a good idea to look also at drivers’ limitations and advantages. In the past I often used the Microsoft Oracle ODBC Driver until I run into in an important limitation, namely its inexistent support for unicodes, this residing, according to KB244661, in the fact that “from Microsoft Data Access Components (MDAC) version 2.5 and later versions, both the Microsoft ODBC Driver and OLE DB Provider support ONLY Oracle 7 and Oracle 8i”. Also Oracle Provider for OLE DB seems to have its own limitations.

    SQL Server provides linked servers, the powerful functionality of executing commands against OLE DB data sources on remote servers, including Oracle. As linked servers offer the ability to create cross-vendor distributed queries, in certain scenarios they could prove to be a powerful alternative of querying Oracle databases, in such cases not being needed to create in SSIS package an additional connection to Oracle. KB280106 and an article on Oracle Provider for OLE DB describe how to set up and troubleshoot a linked server to an Oracle database in SQL Server.

    You might want to check also the SQL Server Integration Services with Oracle Database 10 White Paper coming from Microsoft. “Connectivity and SQL Server 2005 Integration Services”  MSDN article written by Bob Beauchemin, Scott Barrett’s blog or this SSIS 64 bit – Using Oracle Provider post.

17 April 2010

Software Engineering: SQL Reloaded (Part I: Learning SQL)

Software Engineering Series


Introduction

One of the topics that appears in many database related groups or forums is from where and how new comers should start to learn SQL (Structured Query Language) in general, and specific vendor SQL features in particular. Two years back, in Learning something new post from my programming-grasps blog attempt, I wrote down my thoughts on how to approach learning new subjects in general, many of the respective observations could be applied to SQL learning too, and in addition could be added also domain-specific recommendations. 

I would say that when discussing a topic like learning it makes sense to approach the subject from users level of knowledge and skill-set, different recommendations could apply for beginners, intermediate and expert level. No, also when you are an expert the learning process doesn’t stops there as SQL and RDBMS (Relational Database Management System) evolve with time, new features and techniques being added, same as new contexts of applicability could emerge (e.g. very large databases, geographic information systems, Web-related SQL, etc.).

The Beginner Level

Approaching a topic from beginner’s perspective is never an easy task given the fact that many new comers don’t know from where to start, what to learn first, what RDBMS to target. I was and I’m still at that point as new subjects appear on my learning schedule. One important decision is what RDBMS solution to choose, and for this you might need to consider several aspects: market share, goals/objectives, functional area (e.g. database administration, database development, data warehousing, business intelligence, ETL, data quality), personal preferences, salaries and jobs postings, UI, features, tools, learning curve, performance, programmability, extensibility, integration and compatibilities with other technologies, availability of documentation, books, tutorials and other learning material, etc.

I hope I’m not annoying anybody when I’m saying that Oracle, Microsoft and IBM are the leaders in RDBMS and related data technologies, this being also the ranking in what concerns the market share and job postings (see for example B. Nicolich’s SQL Server Job Trends article). If you aren’t having any preferences I would recommend you to focus on at least one of the RDBMS of the mentioned vendors: Oracle, SQL Server and MySQL. Unfortunately, given researches’ subjectivity, I can’t delimitate which of the three has better performance or is better to use, being important the context in which the studies are made, the overall architecture and their particularities, etc. 

Having a Microsoft technologies background I like SQL Server’s easy to use UI and integrated solution approach it provides, on the other hand Oracle seems to provide more SQL-related features (e.g. functions), while MySQL is still trying to catch the other two vendors and become more mature. All three vendors implement SQL ANSI 92 standard, so learning SQL on one of the platforms would allow you to write queries also on the other platforms, the difference residing in the functions used, DML (Data Manipulation Language) or DDL (Data Definition Language) syntax and specific functionality, architecture and tools.

   For SQL Server and Oracle you could download the trial versions, or opt for the Express versions that could be used free but with certain constraints, while for MySQL you could download one of the free community editions, quite stable solutions. If SQL Server comes with a nice Management Studio that could be used to write queries and administrate your databases, for Oracle and MySQL you might have to consider tools like Toad for OracleOracle or SQL Developer, respectively Toad for MySQL, that allow to be more productive on the respective platforms. If you are intending to use a RDBMS more for personal use (e.g. data analysis), what I call a personal database, then MS Access and MS Excel could prove to be enough, especially when is needed to develop personal or small scale solutions in an organization with a limited knowledge on SQL, though a more evolved RDBMS (actually MS Excel is not a RDBMS but it could be used as a data source or data analysis tool) is recommended when providing solution for multiple users.

All three vendors provide rich documentation, with a plus for SQL Server and Oracle, through the online MSDN or Technet, respectively Oracle Database Documentation Library, and MySQL Reference Manual, the knowledge provided by the documentation being complemented by the many published books, magazines, tutorials, blogs, webcasts, podcasts, studies, best practices and other type of technical literature, many of them easily retrievable with just a simple search using your favorite search engine, important being to know what you are looking for. You have to consider that the technical documentation focuses more on the features, syntax and technical details, while for specific purposes like development, architecture integration or problem solving you might have to consider the other sources of knowledge with a plus for books, tutorials and blogs, the later two being actually more up-to-date than the books, given their shorter “time to market” and informal character.  

If your interest in SQL and databases derives from pure educational purposes then you could also consider the free online course materials (presentations, video lectures) posted within initiatives like the ones of YouTube EDU or MIT Open CourseWare, many other lecture notes being available online from various well-known universities. Other important source of knowledge that gains field in the past years is the one of social and professional networks centered around the important technologies, so joining such a network could bring some benefit but you don’t expect miracles!

In what concerns the books maybe it makes sense to list at least my preferences given the huge volume of books on this topic. I’m a fan of “for Dummies” series, therefore I find Allen Taylor’s SQL for Dummies and its more extensive version SQL All-In-One Desk Reference for Dummies or Database Development for Dummies as a good starting point. If you are targeting SQL Server then you might need to check also Microsoft SQL Server 2008 For Dummies, Microsoft SQL Server 2008 All-in-One Desk Reference For Dummies, Microsoft SQL Server 2005 Programming For Dummies or Microsoft SQL Server 2005 Express Edition For Dummies. In exchange the list of choices for Oracle or MySQL is quite smaller, as far I know there are only two books on the market on Oracle - Oracle PL/SQL For Dummies and Oracle 11g For Dummies, while for MySQL the books from “for Dummies” cycle target the use of MySQL in combination with other scripting programming languages: Apache, MySQL, and PHP Web Development All-in-One Desk Reference, PHP & MySQL For Dummies or PHP & MySQL Everyday Apps For Dummies. Another popular series are the “for Beginners” or “Beginning” series, having a more narrow focus, for example Beginning T-SQL with Microsoft SQL Server 2005 and 2008 or Beginning PL/SQL: From Novice to Professional could be two of the books you could start with.

Choosing the right book is highly dependent on what you want to achieve, your learning style and author preferences, books’ availability and costs, in general each professional having his/her own list of preferences. Years back I was quite stressed because many of the examples written in books were not working, nowadays hopefully that changed positively, in addition many publishers making the code available through books’ supporting sites or publishers websites together with a few sample chapters from the respective books. If you don’t want to buy the whole book or you’d like to read it online, you could become member of online libraries like Safari Books Online, Skillsoft’s Books 24x7, eBrary or SpringerLink. Many books are starting to become available for purchasing in PDF directly from publishers or in proprietary format on digital book readers like Amazon’s Kindle or Sony’s Reader, the digital books being in theory with 10-30% cheaper than the paperback products.

The Intermediate Level

The intermediate level is a relative point in the evolution of a professional, in theory this level being reached after several projects or prolonged use of SQL in various scenarios. During this stage many developers are starting to approach topics like performance or best practices, going thus above the basics, or interest themselves in taking a certification, attempting thus to approach the whole range of topics related to databases. Some of the books listed above qualify as well to the intermediate or expert levels, it depends also on books’ level of detail and approach on the given topics. The books’ series closer to this level are the ones for Professionals, for example Pro SQL Server 2008 Relational Database Design and Implementation, Pro T-SQL 2008 Programmer’s Guide, Professional Oracle Programming, Pro MySQL or Professional MySQL.

For those targeting the MCTS certification, it makes sense to consider the Self-Paced Training Kit Exams: (
Exam 70-432) Microsoft SQL Server 2008-Implementation and Maintenance, (Exam 70-433) Microsoft SQL Server 2008-Database Development, while for those approaching the MCITP certification then MCITP SQL Server 2005 Database Developer All-in-One Exam Guide is quite useful. As I haven’t approached the Oracle or MySQL certifications, I think it’s more indicated to do some research by yourself. 

The Expert Level

At this level developers are starting to go deeper into the database internals, learning how things are handled in the background, differentiate between multiple solutions and choose the optimum solution. Books like Microsoft SQL Server 2008 Internals, SQL Server 2008 Query Performance Tuning Distilled or Professional SQL Server 2008 Internals and Troubleshooting shouldn’t miss from a developer’s library! I would expect there are similar books also for Oracle and MySQL, but as I’m not aware of them so I will stop here.

At this level a developer could also approach the database topics from a higher level, making an incursion in the mathematical/conceptual theory of relational databases, books like Information Modeling and Relational Databases or Applied Mathematics for Database Professionals could become really handy in understanding such topics.

Note:

Most probably there are even better books out there though, as I highlighted above, these are my recommendations based on the books that have fallen in my hand along the years. In many cases, instead of buying a book, I’m looking on whether I could find on the web the information I need, though the quality of the material varies, sometimes same as the quality of the books. There are even cases when the information are contradictory, however in most of the cases I would go with the experts that have the information from the source. 
Related Posts Plugin for WordPress, Blogger...

About Me

My photo
IT Professional with more than 24 years experience in IT in the area of full life-cycle of Web/Desktop/Database Applications Development, Software Engineering, Consultancy, Data Management, Data Quality, Data Migrations, Reporting, ERP implementations & support, Team/Project/IT Management, etc.