19 November 2005

💠🛠️SQL Server: Administration (Part I: Troubleshooting Problems in SQL Server 2000 DTS Packages)

    There are cases in which a problem can not be solved with a query, additional data processing being requested. In this category of cases can be useful to use a temporary table or the table data type. Once the logic built you want to export the data using a simple DTS project having as Source the built query.

    While playing with SQL Server 2005 (Beta 2) I observed that the DTS Packages have problem in handling temporary tables as well table data types.

1. Temporary table Sample:

CREATE PROCEDURE dbo.pTemporaryTable
AS
CREATE TABLE #Temp(ID int, Country varchar(10)) -- create a temporary table

-- insert a few records
INSERT #Temp VALUES (1, 'US')
INSERT #Temp VALUES (2, 'UK')
INSERT #Temp VALUES (3, 'Germany')
INSERT #Temp VALUES (4, 'France')   

SELECT * FROM #Temp -- select records

DROP TABLE #Temp  drop the temporary table
 
-- testing the stored procedure
EXEC dbo.pTemporaryTable
Output:
ID Country
    ----------- ----------
1 US
2 UK
3 Germany
4 France
(4 row(s) affected)

Trying to use the stored procedure as Source in a DTS package brought the following error message: Error Source: Microsoft OLE DB Provider for SQL Server Error Description: Invalid object name ‘#Temp’   

 I hoped this has been fixed in SQL Server 2005 (Beta 3) version, but when I tried to use the stored procedure as a Source I got the error:
Error at Data Flow Task [OLE DB Source [1]]: An OLE DB error has occurred. Error code: 0x80004005 An OLE DB record is available. Source: "Microsoft OLE DB Provider for SQL Server" Hresult: 0x80004005 Description: "Invalid object name '#Temp'."
Error at Data Flow Task [OLE DB Source [1]]: Unable to retrieve column information from the data source. Make sure your target table in the database is available. ADDITIONAL INFORMATION: Exception from HRESULT: 0xC020204A (Microsoft.SqlServer.DTSPipelineWrap)

2. Table data type sample:

CREATE PROCEDURE dbo.pTableDataType
AS
DECLARE @Temp TABLE (ID int, Country varchar(10)) -- create table

-- insert a few records
INSERT @Temp VALUES (1, 'US')
INSERT @Temp VALUES (2, 'UK')
INSERT @Temp VALUES (3, 'Germany')
INSERT @Temp VALUES (4, 'France')   

SELECT * FROM @Temp -- select records

-- testing the stored procedure
EXEC dbo.pTableDataType

Output:
ID Country
   ----------- ----------
1 US
2 UK
3 Germany
4 France
(4 row(s) affected)    

This time the error didn't appear when the Source was provided, but when the package was run: Error message: Invalid Pointer    

In Server 2005 (Beta 3) it raised a similar error:
SSIS package "Package6.dtsx" starting. Information: 0x4004300A at Data Flow Task, DTS.Pipeline: Validation phase is beginning Information: 0x4004300A at Data Flow Task, DTS.Pipeline: Validation phase is beginning Information: 0x40043006 at Data Flow Task, DTS.Pipeline: Prepare for Execute phase is beginning Information: 0x40043007 at Data Flow Task, DTS.Pipeline: Pre-Execute phase is beginning SSIS package "Package6.dtsx" finished: Canceled.    

 I saw there will be problems because in Flat File Connection Manager’s Preview no data were returned. I hope these problems were solved in the last SQL Server Release, if not we are in troubles!

Possible Solutions:    
If the Stored Procedures provided above are run in SQL Query Analyzer or using ADO, they will work without problems. This behavior is frustrating, especially when the logic is really complicated and you put lot of effort in it; however there are two possible solutions, both involving affordable drawbacks:
1. Create a physical table manually or at runtime, save the data in it and later remove the table. This will not work for concurrent use (it works maybe if the name of the table will be time stamped), but for ad-hoc reports might be acceptable.
2. Built an ADO based component which exports the data to a file in the format you want. The component is not difficult to implement, but additional coding might be requested for each destination type. The component can be run from a UI or from a DTS package.
3. In SQL Server Yukon is possible to use the CLR and implement the logic in a managed stored procedure or table valued function.

Happy Coding!

18 November 2005

💎SQL Reloaded: Prime Numbers

I was just thinking about prime numbers and why people had such a fascination about them. In mysticism every number has a powerful meaning, a power coming from inside, and therewith attracts certain powers from exterior, is strange, hard to understand all this, especially how all numbers can be reduced to a number between 1 and 9.  Every philosophy has a little truth in it. 
 
I was trying to find a nice article on number reduction, for example the one from sourceryforge could be useful. Last week I implemented a function which returns the reduction of any sequence, could be date or number. I was intending to apply it on prime numbers, but even if can be observed certain similarities between different chains, is difficult to believe this will lead anywhere. 

 CREATE FUNCTION dbo.NumberReduction( @text varchar(50)) RETURNS smallint /* Purpose: reduces to number a sequence of digits Parameters: @text varchar(50) - digits sequence 

 Notes: the function works with any sequnce of alphanumeric, not numeric values being ignored Sample: 

SELECT dbo.NumberReduction('11/17/2005') 
SELECT dbo.NumberReduction('1234') 
SELECT dbo.NumberReduction('1234 8993 90003 3456') */ 
 BEGIN    
DECLARE @index smallint    
DECLARE @total int    
SET @index = 1    
SET @total = 0    --for each char in alphanumeric sequence    
WHILE (@index<=len(@text))   
 BEGIN          
IF IsNumeric(Substring(@text, @index, 1)) = 1         
SET @total = @total + Cast(Substring(@text, @index, 1) as smallint)       
SET @index = @index+1    
END    
--if the number is not in 1..9 range, it calls further the NumberReduction for the new total    
IF len(@total)>1       
SET @total = dbo.NumberReduction(@total)    
RETURN @total END 

I tried to generate the prime numbers between 1 and 1000000 to analyze how many prime numbers are for each thousand. 

SELECT PrimeNumber/1000 DataSet , count(*) NumberPrimeNumbers 
FROM PrimeNUmbers 
GROUP BY PrimeNumber/1000 ORDER BY DataSet 

As it seems the maximum number of prime numbers is reached for the first thousand and decreases. DataSet*1000 

 Number Prime Numbers -------------------- ------------------ 0 (2..1000) 168 1 (1001..2000) 135 2 (2001..3000) 127 3 ........... 120 4 ........... 119 5 ........... 114 6 ........... 117 7 ........... 107 8 ........... 110 9 ........... 112 10 .......... 106 11 .......... 103 12 .......... 109 13 .......... 105 14 .......... 102 15 .......... 108 16 .......... 98 17 .......... 104 18 .......... 94 19 .......... 104 20 .......... 98 ........................... 980 ......... 67 981 ......... 75 982 ......... 70 983 ......... 70 984 ......... 70 985 ......... 76 986 ......... 76 987 ......... 63 988 ......... 71 989 ......... 72 990 ......... 71 991 ......... 79 992 ......... 65 993 ......... 68 994 ......... 78 995 ......... 69 996 ......... 69 997 ......... 83 998 ......... 74 999 ......... 65 


💎SQL Reloaded: To be or not to be a Prime

   According to Wolfram a prime number is "any positive integer p>1 that has no positive integer divisors other than 1 and p itself. In other words there is no positive integer numbers p>m>1 that p = m*n, n>1. So, to check whether p is prime all we have to do is to check if any number m between 2 and p-1 is not a divisor for p.    

   If m is not a divisor for p then we could write: p/m = n + r, r> 0, this is in mathematics, but in programming even if is SQL, C++ or any programming language, when working with integers, by p/m is understood n while by p mod m is r.

   Respecting the order of operations (p/m)*m = p only if m is a divisor for p. Now, is it really necessary to check all the numbers between 2 and p-1 if they are or not divisors for p? Probably not, when m increases, p/m decreases, and also the range of possible divisors, which is between m+1 and p/m-1 until m becomes greater or equal with p/m. m=p/m only when m is the square from p, so is necessary and sufficient to check only the values between 2 and √p.    

    To summarize, if all the numbers m betwen 2 and √p are not divisors of p then p is prime. This is the test approach used in IsPrime function:

CREATE FUNCTION dbo.IsPrime(
 @number bigint)
RETURNS smallint
/*
   Purpose: checks if a positive integer is prime number
   Parameters: @number bigint - the number to be checked if prime
   Notes: 
   Sample: SELECT dbo.IsPrime(1000)
  SELECT dbo.IsPrime(947)
  SELECT dbo.IsPrime(111913)
  SELECT dbo.IsPrime(113083)
*/
BEGIN
    DECLARE @index bigint
 DECLARE @result bit

 SET @index = 2
    SET @result = 1

 WHILE (@index<=sqrt(@number))
 BEGIN
  IF (@number = (@number/@index)*@index)
   SET @result = 0 

  SET @index = @index+1
 END

 RETURN @result
END

The pListPrimeNumbers will be used to test the function:

CREATE PROCEDURE dbo.pListPrimeNumbers(
  @startInterval bigint
, @endInterval bigint)
AS
/*
   Purpose: lists the prime numbers from a the [@startInterval, @endInterval] interval 
   Parameters: @startInterval bigint - the left side of the interval in which will be searched the prime numbers
      @endInterval bigint - the right side of the interval in which will be searched the prime numbers
   Notes: 
   Sample: EXEC dbo.pListPrimeNumbers 900, 1000
           EXEC dbo.pListPrimeNumbers 111900, 112900
*/
DECLARE @index bigint
SET @index = @startInterval

--for each integer in interval
WHILE (@index<=@endInterval)
BEGIN  
    -- list a number if prime 
 IF (dbo.IsPrime(@index)=1)
  SELECT @index [Prime Numbers]

 SET @index = @index+1 
END
 
   One can use a simple table to store the results, which will allow further analyssis of the data:

-- creating the table 
CREATE TABLE PrimeNumbers (PrimeNumber bigint)

-- inserting the data 
INSERT PrimeNumbers
EXEC dbo.pListPrimeNumbers 1, 10000

Notes:    
The utility of IsPrime function is relative, primarily because there are not many the cases when such a function is needed in SQL, and secondarily because the performance gets poorer with the number tested. Programming languages like VB.Net or C# are more appropriate for such tasks. Initially I wanted to test the performance of implementing IsPrime in SQL vs implementing it as a managed function, but an error stopped me from the attempt:
Msg 6505, Level 16, State 1, Procedure IsPrimeNumber, Line 1 Could not find Type 'UserDefinedFunctions' in assembly 'MyLibrary'.    

3 months ago the same piece of code worked without problems, but since then I had to reinstall the SQL Server because I couln't use DTS packages after I installed LINQ preview. Welcome to the "real" world of Microsoft :(.

Happy Coding!

14 November 2005

💎SQL Reloaded: Out of Space

Some time ago I found a post in which somebody was annoyed of writing code like this:

DECLARE @zipcode char(5)

SET @zipcode = '90'

SELECT CASE len(@zipcode)
 WHEN 1 THEN '0000' + @zipcode
 WHEN 2 THEN '000' + @zipcode
 WHEN 3 THEN '00' + @zipcode
 WHEN 4 THEN '0' + @zipcode
 WHEN 5 THEN @zipcode
END

I remembered I met the same situation, but instead of writing a line of code for each case, I prefered to write a simple function, which used the Space and Replace built-in functions to add first a number of spaces and then replace them with a defined character. Maybe somebody will argue about the performance of such a function, but I prefer in many cases the reusability against performance, when the difference is not that big. Additionally, this enhances code’s readability.    

So, here is the function:

CREATE FUNCTION dbo.FillBefore(
   @target varchar(50)
 , @length int
 , @filler char(1))
/*
   Purpose: enlarges a string to a predefiend length by puting in front of it the same character
   Parameters: @target varchar(50) - the string to be transformed
  , @length int - the length of output string
  , @filler char(1) - the character which will be used as filler 
   Notes: the function works if the length of @length is greater or equal with the one of the @target, otherwise will return a NULL
*/
RETURNS varchar(50)
AS
BEGIN
   RETURN Replace(Space(@length-len(IsNull(@target, ''))), ' ', @filler) 
       + IsNull(@target, '')
END

-- testing the function
SELECT dbo.FillBefore('1234', 5, '0')
SELECT dbo.FillBefore('1234', 10, ' ')
SELECT dbo.FillBefore(1234, 10, ' ')
SELECT dbo.FillBefore(Cast(1234 as varchar(10)), 10, '0')

Another scenario in which the Space function was useful was when I had to save the result of a query to a text file in fixed width format. It resumes to the same logic used in FillBefore function, except the spaces must be added at the end of the string. For this I just needed to modify the order of concatenation and create a new function:

CREATE FUNCTION dbo.FillAfter(
   @target varchar(50)
 , @length int
 , @filler char(1))
/*
   Purpose: enlarges a string to a predefiend length by adding at the end the same character
   Parameters: @target varchar(50) - the string to be transformed
  , @length int - the length of output string
  , @filler char(1) - the character which will be used as filler 
   Notes: the function works if the length of @length is greater or equal with the one of the @target, otherwise will return a NULL

*/
RETURNS varchar(50)
AS
BEGIN
   RETURN IsNull(@target, '') 
        + Replace(Space(@length-len(IsNull(@target,''))), ' ', @filler)
END

  -- testing the function 
  SELECT dbo.FillAfter('1234', 5, '0')
  SELECT dbo.FillAfter('1234', 10, ' ')
  SELECT dbo.FillAfter(1234, 10, '0')
  SELECT dbo.FillAfter(Cast(1234 as varchar(10)), 10, '0')
  SELECT dbo.FillAfter(NULL, 10, '0')

Notes:
1.    In SQL Server 2005, the output of a query can be saved using a DTS Package with fix format directly to a text file using ‘fixed width’ (the columns are defined by fix widths) or ‘ragged right’ (the columns are defined by fix widths, except the last one which is delimited by the new line character) format for the destination file.
2.    To document or not to document. I know that many times we don't have the time to document the database objects we created, but I found that a simple comment saved more time later, even if was about my or other developer's time.
3.    If is really needed to use the CASE function, is good do a analysis of the data and put as first evaluation the case with the highest probability to occur, as second evaluation the case with the second probability to occur, and so on. The reason for this is that each option is evaluated until the comparision operation evaluates to TRUE. For example, if the zip codes are made in most of the cases of 5 characters, then it makes sense to put it in the first WHEN position.

08 November 2005

Metablogging: Introduction

A warm welcome to my blog! 

I started this blog with the idea of documenting some of the issues I found when working with SQL Server, Oracle and other DBMS, respectively to share some of the techniques used in manipulating data via SQL or other means (including .Net, MS Access or Excel). Therefore, it was intended to be a blog on SQL, data, databases, data processing and related topics, though with time I arrived to share my ideas on Business Intelligence, Data Warehousing, Data Management, ERP Systems, MS OfficeProject Management, Strategic Management, Performance Management, Programming, Data Science, Software Engineering, Project Management, Systems Engineering, and other topics directly or indirectly related to the work of a data professional. 

Even if I was planning to post periodically, life took over, the professional as well the personal life made it almost impossible do to that. The unavailability of an easy-to-use editor made blogging more complex than it should be, spending more time in formatting than in writing the code. In the end I found a mix of solutions I could work with.

In time, feeling the lack of meaningful and workable definitions, I felt the need to look also in how people define the various concepts related to the above mentioned topics. I started then to collect definitions and quotes that would help me better understand the respective concepts, sharing and updating the respective content as well. 

More recently, I started to share my notes I made over the years - a compilation of text copied from different topics, simplified for learning and review. Unfortunately, some of the resources disappeared from the web, some information got deprecated, respectively are missing in other areas. Also the posting such content is not ideal, taking a long time to write.

I have more than 24 years of professional experience in IT in the above mentioned areas. I started to program in the 9th grade on Spectrum consoles (that had no external hard drives) and continued to do it during the university years, even if the topics were more or less related to mathematical topics. Even if that can't be considered as professional experience, it proved to be a useful experience. I still do some programming, though it resumes mainly to data processing, building back-to-back solutions when needed. 

In the 3rd professional year, I started to be deeper involved in the business side, which offered me a better perspective into the challenges organizations deal with. It was an interesting journey reflected also in some of my posts.

Happy and errorless coding! 

Best regards,
                        Adrian
Related Posts Plugin for WordPress, Blogger...

About Me

My photo
Koeln, NRW, Germany
IT Professional with more than 24 years experience in IT in the area of full life-cycle of Web/Desktop/Database Applications Development, Software Engineering, Consultancy, Data Management, Data Quality, Data Migrations, Reporting, ERP implementations & support, Team/Project/IT Management, etc.