20 November 2005

Database Management: Trivia (Part I: Who's gonna rule the world?)


This morning I started to read an article in InformationWeek about the future of database administrators: it seems they will rule the world, or at least the world of data. It's not a joke, their role will become more and more important over the coming years. Why is that?

1. Databases become bigger and more complex
The size and complexity of databases increase from year to year, and so does their importance in making faster decisions based on current or historical data. The bigger the size and complexity, the bigger also the need for ad-hoc or one-mouse-click reports, and who is the person with the knowledge and resources to provide them?

2. Data mining      
Historical data is not limited to simple reporting: data mining is becoming more popular, having as purpose the discovery of trends in the business and facilitating, to some degree, predictions about its future.

Let's not forget that the big players on the database market have started to offer data mining solutions, incorporated or not in their database platforms; the data mining features are a start in providing new views over the data.

Big companies will probably opt for data analysts, but in medium-sized and small companies the role of a data analyst could easily be taken over by a database administrator.

3. The knowledge that matters      
Even if databases are the backend for complex UIs, in time, if not immediately, the database administrator becomes aware of the overall underlying structure of the database and, why not, driven by curiosity or business requirements, of the content of the data as well. Even if it is a paradox, the database administrator sometimes has more knowledge of the data than a manager, plus the skills to do with the data the kind of magic that matters to managers.

4. Data Cleansing      
More and more companies are becoming aware of the outliers in their data, so data cleansing is becoming a priority, and again the database administrator has an important role in this process.

5. Database Security      
Database platforms are growing; unfortunately, so are their security holes and the number of attacks. A skilled database administrator must have the knowledge to overcome these problems and resolve issues as soon as possible.

6. Training      
Even if it can be considered a consequence of the previously enumerated facts, it deserves to be considered separately. Companies are starting to realize how important it is to have a skilled database administrator, so they will probably provide the necessary training or help the DBA obtain it.

 7. On site/offshore compromise      
If in past years the concept of offshoring people, including database administrators, has been popular, companies will probably soon become aware that reducing direct maintenance costs doesn't mean a reduction of overall costs, which may also contain indirect costs. The delayed response times generated by communication issues and the availability of resources play an important role in the increase of indirect costs, so most of the big companies will be forced to opt for a compromise between on-site and offshore.

In this case the direct benefit for a database administrator is not so obvious; besides the financial benefit there is also an image advantage. Even if it is often disregarded, the image, and the fact that a person is no longer isolated, is important.

Reviewing the post almost 20 years later with the knowledge of today, I can just smile. Besides data analysts, the DBAs were at that time the closest to the data. Probably many DBAs walked the new paths created by the various opportunities. Trying to uncover the unknown through the known is a foolish attempt, but it's the best we can do to prepare ourselves for what comes next.

Created: Nov-2005, Last Reviewed: Mar-2024

19 November 2005

💎SQL Reloaded: Cursors and Lists

I found cursors really useful when the set-based logic provided by a query makes it difficult to solve special types of problems (e.g. concatenating into a list the values returned by a select, or multiple updates/inserts/deletions based on the success or failure of previous logic). Cursors are relatively easy to write, though a lot of the code is redundant from one solution to another. Therefore, I prefer to keep the simplest version of the code and then modify it according to new requirements.

Here's a simple example of a function that returns a list of emails:

--creating the sample table
CREATE TABLE EmailAddresses(ID int, Email varchar(50))

--insert test records
INSERT EmailAddresses
VALUES (1, 'John.Travolta@star.com')
INSERT EmailAddresses
VALUES (2, 'Robert.DeNiro@star.com')
INSERT EmailAddresses
VALUES (3, 'MegRyan@star.com')
INSERT EmailAddresses
VALUES (4, 'Helen.Hunt@star.com')
INSERT EmailAddresses
VALUES (5, 'Jodie.Foster@star.com')

-- creating the function 
CREATE FUNCTION dbo.GetEmails()
RETURNS varchar(1000)
/*
Purpose: returns a list of Emails
Parameters:
Notes:
Sample: SELECT dbo.GetEmails() AS ListEmails
*/
AS
BEGIN
   DECLARE @Email varchar(50)
   DECLARE @Emails varchar(1000)

   SET @Emails = ''

   -- Create Emails Cursor
   DECLARE Emails CURSOR FOR
   SELECT Email
   FROM EmailAddresses
   ORDER BY Email

   OPEN Emails -- Open Emails Cursor

   --fetch first set of records from Emails Cursor
   FETCH NEXT FROM Emails
   INTO @Email

   WHILE @@FETCH_STATUS = 0 --if the fetch was successful
   BEGIN
      SET @Emails = @Emails + @Email + ','

      --fetch next set of records from Emails Cursor
      FETCH NEXT FROM Emails
      INTO @Email
   END

   CLOSE Emails -- close Emails cursor
   DEALLOCATE Emails --deallocate Emails cursor

   --remove the extra comma
   IF Len(@Emails)>0
      SET @Emails = Left(@Emails, Len(@Emails)-1)

   RETURN @Emails
END
 
--testing the function
SELECT dbo.GetEmails()

Notes:
The code was tested on SQL Server 2000 through 2017.
The logic from the dbo.GetEmails function can use any other table as source, as long as the length of the target column doesn't exceed 50 characters, since @Email was defined as varchar(50).

Exploring the above idea, what if the content of the query is not known until runtime and all that is known is that only one column of data is returned? This can be achieved with the help of a table data type:

--creating the source table
CREATE TABLE Countries(ID int, Country varchar(50))

--insert test records
INSERT Countries
VALUES (1, 'US')
INSERT Countries
VALUES (2, 'UK')
INSERT Countries
VALUES (3, 'Germany')
INSERT Countries
VALUES (4, 'Spain')
INSERT Countries
VALUES (5, 'France')

 
-- creating the function 
CREATE FUNCTION dbo.GetList(
@Query varchar(250))
RETURNS varchar(1000)
/*
Purpose: returns as a comma-delimited list the values returned by the query received as parameter
Parameters:
Notes:
Sample: SELECT dbo.GetList('SELECT Country FROM Countries ORDER BY Country') AS List
*/

AS
BEGIN
   DECLARE @Result varchar(50)
   DECLARE @List varchar(1000)
   DECLARE @Temp TABLE (Val varchar(50))

   SET @List = ''

   --insert in a table data type the results returned by query
   INSERT @Temp
   EXEC (@Query)

   -- Create List Cursor
   DECLARE List CURSOR FOR
   SELECT Val
   FROM @Temp

   OPEN List -- Open List Cursor

   --fetch first set of records from List Cursor
   FETCH NEXT FROM List
   INTO @Result

   WHILE @@FETCH_STATUS = 0 --if the fetch was successful
   BEGIN
      SET @List = @List + @Result + ','

      --fetch next set of records from List Cursor
      FETCH NEXT FROM List
      INTO @Result
   END

   CLOSE List -- close List cursor
   DEALLOCATE List --deallocate List cursor

   --remove the extra comma
   IF Len(@List)>0
      SET @List = Left(@List, Len(@List)-1)

   RETURN @List
END

--testing the function 
SELECT dbo.GetList('SELECT Country FROM Countries ORDER BY Country') AS List

Result:
List
--------------------------
France,Germany,Spain,UK,US

(1 row(s) affected)

Instead of using a query one can use a stored procedure as well. Here's the stored procedure and the example that calls it:

--creating the stored procedure
CREATE PROCEDURE dbo.pListCountries
AS
SELECT Country
FROM Countries
ORDER BY Country
 
--testing the function
SELECT dbo.GetList('dbo.pListCountries') AS List

Result:
List
--------------------------
France,Germany,Spain,UK,US

(1 row(s) affected)

Notes:
1) As it seems, the above code stopped working in later editions of SQL Server.
2) Trying to create the dbo.GetList function in a SQL database in Microsoft Fabric leads to the following error message:
"Msg 443, Level 16, State 14, Line 21, Invalid use of a side-effecting operator 'INSERT EXEC' within a function."
Happy coding!

💠🛠️SQL Server: Administration (Part I: Troubleshooting Problems in SQL Server 2000 DTS Packages)

There are cases in which a problem cannot be solved with a single query, additional data processing being required. In this category of cases it can be useful to use a temporary table or the table data type. Once the logic is built, you may want to export the data using a simple DTS project having the built query as Source.

While playing with SQL Server 2005 (Beta 2) I observed that DTS Packages have problems in handling temporary tables as well as table data types.

1. Temporary table Sample:

CREATE PROCEDURE dbo.pTemporaryTable
AS
CREATE TABLE #Temp(ID int, Country varchar(10)) -- create a temporary table

-- insert a few records
INSERT #Temp VALUES (1, 'US')
INSERT #Temp VALUES (2, 'UK')
INSERT #Temp VALUES (3, 'Germany')
INSERT #Temp VALUES (4, 'France')   

SELECT * FROM #Temp -- select records

DROP TABLE #Temp -- drop the temporary table
 
-- testing the stored procedure
EXEC dbo.pTemporaryTable
Output:
ID          Country
----------- ----------
1           US
2           UK
3           Germany
4           France

(4 row(s) affected)

Trying to use the stored procedure as Source in a DTS package brought the following error message:
Error Source: Microsoft OLE DB Provider for SQL Server
Error Description: Invalid object name '#Temp'

I hoped this had been fixed in the SQL Server 2005 (Beta 3) version, but when I tried to use the stored procedure as a Source I got the error:
Error at Data Flow Task [OLE DB Source [1]]: An OLE DB error has occurred. Error code: 0x80004005 An OLE DB record is available. Source: "Microsoft OLE DB Provider for SQL Server" Hresult: 0x80004005 Description: "Invalid object name '#Temp'."
Error at Data Flow Task [OLE DB Source [1]]: Unable to retrieve column information from the data source. Make sure your target table in the database is available. ADDITIONAL INFORMATION: Exception from HRESULT: 0xC020204A (Microsoft.SqlServer.DTSPipelineWrap)

2. Table data type sample:

CREATE PROCEDURE dbo.pTableDataType
AS
DECLARE @Temp TABLE (ID int, Country varchar(10)) -- create table

-- insert a few records
INSERT @Temp VALUES (1, 'US')
INSERT @Temp VALUES (2, 'UK')
INSERT @Temp VALUES (3, 'Germany')
INSERT @Temp VALUES (4, 'France')   

SELECT * FROM @Temp -- select records

-- testing the stored procedure
EXEC dbo.pTableDataType

Output:
ID          Country
----------- ----------
1           US
2           UK
3           Germany
4           France

(4 row(s) affected)

This time the error didn't appear when the Source was provided, but when the package was run:
Error message: Invalid Pointer

In SQL Server 2005 (Beta 3) it raised a similar error:
SSIS package "Package6.dtsx" starting. Information: 0x4004300A at Data Flow Task, DTS.Pipeline: Validation phase is beginning Information: 0x4004300A at Data Flow Task, DTS.Pipeline: Validation phase is beginning Information: 0x40043006 at Data Flow Task, DTS.Pipeline: Prepare for Execute phase is beginning Information: 0x40043007 at Data Flow Task, DTS.Pipeline: Pre-Execute phase is beginning SSIS package "Package6.dtsx" finished: Canceled.    

I could see there would be problems because no data was returned in the Flat File Connection Manager's Preview. I hope these problems were solved in the latest SQL Server release; if not, we are in trouble!

Possible Solutions:
If the stored procedures provided above are run in SQL Query Analyzer or using ADO, they work without problems. This behavior is frustrating, especially when the logic is really complicated and you put a lot of effort into it; however, there are a few possible solutions, all involving affordable drawbacks:
1. Create a physical table, manually or at runtime, save the data in it and remove the table later. This will not work for concurrent use (it may work if the name of the table is time-stamped), but for ad-hoc reports it might be acceptable (see the sketch below).
2. Build an ADO-based component which exports the data to a file in the format you want. The component is not difficult to implement, but additional coding might be required for each destination type. The component can be run from a UI or from a DTS package.
3. In SQL Server Yukon it is possible to use the CLR and implement the logic in a managed stored procedure or table-valued function.
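Here's a minimal sketch of the first solution; the staging table name is only an assumption, and the data comes from the Countries table used in the previous post:

-- solution 1 (sketch): materialize the processed data into a physical table the DTS Source can read
CREATE TABLE dbo.StagingCountries(ID int, Country varchar(10))

-- the processing logic writes its output here instead of into #Temp or a table variable
INSERT dbo.StagingCountries (ID, Country)
SELECT ID, Country
FROM Countries

-- the DTS package then uses "SELECT ID, Country FROM dbo.StagingCountries" as Source

-- remove the staging table once the export has run
DROP TABLE dbo.StagingCountries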

Happy Coding!

18 November 2005

💎SQL Reloaded: Prime Numbers

I was just thinking about prime numbers and why people have had such a fascination with them. In mysticism every number has a powerful meaning, a power coming from inside, and thereby attracts certain powers from the exterior. It's strange and probably hard to understand the reduction of numbers to the 1-9 range. Probably each philosophy has a little truth in it.
 
I was trying to find a nice article on number reduction; for example, the one from sourceryforge could be useful. Last week I implemented a function which returns the reduction of any sequence, be it a date or a number. I was intending to apply it to prime numbers, but even if certain similarities between different chains can be observed, it's difficult to believe this will lead anywhere.

CREATE FUNCTION dbo.NumberReduction( 
   @text varchar(50)) 
RETURNS smallint 
/* 
   Purpose: reduces a sequence of digits to a single-digit number 
   Parameters: @text varchar(50) - digits sequence 
   Notes: the function works with any alphanumeric sequence, non-numeric characters being ignored 
   Sample: SELECT dbo.NumberReduction('11/17/2005') 
*/ 
BEGIN    
	DECLARE @index smallint = 1 
	DECLARE @total int = 0 

	WHILE (@index<=len(@text))   
	BEGIN          
		IF IsNumeric(Substring(@text, @index, 1)) = 1         
			SET @total = @total + Cast(Substring(@text, @index, 1) as smallint)  
		
		SET @index = @index+1    
	END    

	--if the number is not in 1..9 range, it calls further the NumberReduction for the new total    
	IF len(@total)>1       
		SET @total = dbo.NumberReduction(@total) 
	
	RETURN @total 
END 

-- testing the function
SELECT dbo.NumberReduction('11/17/2005') 
SELECT dbo.NumberReduction('1234') 
SELECT dbo.NumberReduction('1234 8993 90003 3456') 

Now, let's look at how the 9,592 primes found in the first 100,000 numbers are distributed:

-- number of primes in each thousand 
SELECT PrimeNumber/1000+1 DataSet 
, count(*) NumberPrimeNumbers 
FROM PrimeNumbers 
GROUP BY PrimeNumber/1000+1
ORDER BY DataSet

As it seems, the maximum number of primes is reached in the first thousand and the count decreases until it somewhat stabilizes. It's more interesting what happens when the numbers are really big; however, special techniques are used for testing such big numbers. The record is currently held by 2^136,279,841-1, with 41,024,320 digits (see Wikipedia). There are also interesting visualizations of prime numbers: see Ulam's spiral.

Here's a chart with the number of prime numbers and the table generated from the above script:

[chart: number of primes per thousand]

Thousand  Number of Primes
1 169
2 135
3 127
4 120
5 119
6 114
7 117
8 107
9 110
10 112
11 106
12 103
13 109
14 105
15 102
16 108
17 98
18 104
19 94
20 104
21 98
22 104
23 100
24 104
25 94
26 98
27 101
28 94
29 98
30 92
31 95
32 92
33 106
34 100
35 94
36 92
37 99
38 94
39 90
40 96
41 88
42 101
43 102
44 85
45 96
46 86
47 90
48 95
49 89
50 98
51 89
52 97
53 89
54 92
55 90
56 93
57 99
58 91
59 90
60 94
61 88
62 87
63 88
64 93
65 80
66 98
67 84
68 99
69 80
70 81
71 98
72 95
73 90
74 83
75 92
76 91
77 83
78 95
79 84
80 91
81 88
82 92
83 89
84 84
85 87
86 85
87 88
88 93
89 76
90 94
91 89
92 85
93 97
94 86
95 87
96 95
97 84
98 82
99 87
100 87


Happy coding!

💎SQL Reloaded: To be or not to be a Prime

According to Wolfram, a prime number is "any positive integer p>1 that has no positive integer divisors other than 1 and p itself". In other words, there are no positive integers m and n with p>m>1 and n>1 such that p = m*n. So, to check whether p is prime, all we have to do is check that no number m between 2 and p-1 is a divisor of p.

If m is not a divisor of p then we can write p = m*n + r, with 0 < r < m. That's the mathematical view; in programming, be it SQL, C++ or any other language, when working with integers p/m returns n (the integer quotient) while p mod m returns r (the remainder).

Respecting the order of operations, (p/m)*m = p only if m is a divisor of p. Now, is it really necessary to check all the numbers between 2 and p-1 to see whether they are divisors of p? Probably not: when m increases, p/m decreases, and so does the range of possible divisors, which lies between m+1 and p/m-1, until m becomes greater than or equal to p/m. m = p/m only when m is the square root of p, so it is necessary and sufficient to check only the values between 2 and √p.
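To make the integer arithmetic concrete, here's a quick check for p = 17 and m = 5:

-- integer division vs. modulo for p = 17, m = 5
SELECT 17/5 AS Quotient      -- 3
     , 17 % 5 AS Remainder   -- 2
     , (17/5)*5 AS Product   -- 15, different from 17, so 5 is not a divisor of 17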

To summarize, if none of the numbers m between 2 and √p is a divisor of p, then p is prime. This is the test approach used in the IsPrime function:

CREATE FUNCTION dbo.IsPrime(
   @number bigint)
RETURNS smallint
/*
   Purpose: checks if a positive integer is a prime number
   Parameters: @number bigint - the number to be checked
   Notes: 
   Sample: SELECT dbo.IsPrime(1000)
           SELECT dbo.IsPrime(947)
           SELECT dbo.IsPrime(111913)
           SELECT dbo.IsPrime(113083)
*/
BEGIN
   DECLARE @index bigint
   DECLARE @result bit

   SET @index = 2
   SET @result = 1

   -- numbers smaller than 2 are not prime by definition
   IF @number < 2
      RETURN 0

   -- it is enough to test the divisors up to the square root of @number
   WHILE (@index <= sqrt(@number))
   BEGIN
      -- with integer arithmetic, (@number/@index)*@index = @number only if @index divides @number
      IF (@number = (@number/@index)*@index)
         SET @result = 0

      SET @index = @index + 1
   END

   RETURN @result
END

The pListPrimeNumbers stored procedure will be used to test the function:

CREATE PROCEDURE dbo.pListPrimeNumbers(
  @startInterval bigint
, @endInterval bigint)
AS
/*
   Purpose: lists the prime numbers from the [@startInterval, @endInterval] interval 
   Parameters: @startInterval bigint - the left side of the interval in which the prime numbers are searched
               @endInterval bigint - the right side of the interval in which the prime numbers are searched
   Notes: 
   Sample: EXEC dbo.pListPrimeNumbers 900, 1000
           EXEC dbo.pListPrimeNumbers 111900, 112900
*/
DECLARE @index bigint
SET @index = @startInterval

-- for each integer in the interval
WHILE (@index <= @endInterval)
BEGIN  
   -- list the number if it is prime 
   IF (dbo.IsPrime(@index) = 1)
      SELECT @index [Prime Numbers]

   SET @index = @index + 1 
END
 
One can use a simple table to store the results, which will allow further analysis of the data:

-- creating the table 
CREATE TABLE PrimeNumbers (PrimeNumber bigint)

-- inserting the data 
INSERT PrimeNumbers
EXEC dbo.pListPrimeNumbers 1, 10000

Notes:    
1) The utility of the IsPrime function is relative, primarily because there are not many cases in which such a function is needed in SQL, and secondly because the performance gets poorer as the tested number grows. Programming languages like VB.Net or C# are more appropriate for such tasks. Initially I wanted to compare the performance of implementing IsPrime in SQL vs. implementing it as a managed function (see the timing sketch below), but an error stopped me from the attempt:
"Msg 6505, Level 16, State 1, Procedure IsPrimeNumber, Line 1 Could not find Type 'UserDefinedFunctions' in assembly 'MyLibrary'."  
About 3 months ago the same piece of code worked without problems, but since then I had to reinstall SQL Server because I couldn't use DTS packages anymore after I installed the LINQ preview. Welcome to the "real" world of Microsoft :(.
2) The code has been tested successfully also on a SQL database in Microsoft Fabric.
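For the T-SQL side of such a comparison, here's a minimal timing sketch, assuming the objects created above are in place:

-- rough timing of the T-SQL implementation using datetime differences
DECLARE @start datetime
SET @start = GETDATE()

EXEC dbo.pListPrimeNumbers 1, 10000

SELECT DATEDIFF(ms, @start, GETDATE()) AS DurationMs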

Happy Coding!

16 November 2005

🧩IT: World Wide Web (Just the Quotes)

"The actual observed working structure of the organisation is a multiply connected 'web' whose interconnections evolve with time." (Tim Berners-Lee, "Information Management: A Proposal", 1989)

"This is why a 'web' of notes with links (like references) between them is far more useful than a fixed hierarchical system. When describing a complex system, many people resort to diagrams with circles and arrows. Circles and arrows leave one free to describe the interrelationships between things in a way that tables, for example, do not. The system we need is like a diagram of circles and arrows, where circles and arrows can stand for anything." (Tim Berners-Lee, "Information Management: A Proposal", 1989)

"Developments on the Internet over the next several years will set the course of our industry for a long time to come." (Bill Gates, "Internet Tidal Wave", [Microsoft internal memo], 1995)

"For me, the most exciting thing in the software area is the Internet, and part of the reason for that is no one owns it. It’s a free for all, it’s much like the early days of the personal computer." (Steve Jobs, Wall $treet Week, 1995)

"The Web is not going to change the world, certainly not in the next 10 years. It’s going to augment the world." (Steve Jobs, Wired, 1996)

"The internet model has many lessons for the new economy but perhaps the most important is its embrace of dumb swarm power. The aim of swarm power is superior performance in a turbulent environment. When things happen fast and furious, they tend to route around central control. By interlinking many simple parts into a loose confederation, control devolves from the center to the lowest or outermost points, which collectively keep things on course. A successful system, though, requires more than simply relinquishing control completely to the networked mob." (Kevin Kelly, "New Rules for the New Economy: 10 radical strategies for a connected world", 1998)

"I have a dream for the Web [...] and it has two parts. In the first part, the Web becomes a much more powerful means for collaboration between people. I have always imagine  the information space as something to which everyone has immediate and intuitive access, and not just to browse, but to create. Furthermore, the dream of people-to-people communication through shared knowledge must be possible for groups of all sizes, interacting electronically with as much ease as they do now in person." (Tim Berners-Lee, "Weaving the Web", 1999)

"It [the Internet] is inherently destructive of memory. You think you’re getting lots more [information] until you’ve found out you’ve made a bargain with the Devil. You’ve slowly mutated, and have become an extension of the machine." (James Billington, [interview] 1999)

"The first form of semantic data on the Web was metadata information about information. (There happens to be a company called Metadata, but I use the term here as a generic noun, as it has been used for many years.) Metadata consist of a set of properties of a document. By definition, metadata are data, as well as data about data. They describe catalogue information about who wrote Web pages and what they are about; information about how Web pages fit together and relate to each other as versions; translations, and reformattings; and social information such as distribution rights and privacy codes." (Tim Berners-Lee, "Weaving the Web", 1999)

"The web is more a social creation than a technical one. I designed it for a social effect - to help people work together - and not as a technical toy. The ultimate goal of the Web is to support and improve our web-like existence in the world. We clump into families, associations, and companies. We develop trust across the miles and distrust around the corner." (Tim Berners-Lee, "Weaving the Web", 1999)

"What we believe, endorse, agree with, and depend on is representable and, increasingly, represented on the Web. We all have to ensure that the society we build with the Web is the sort we intend." (Tim Berners-Lee, "Weaving the Web", 1999)

"The problem with the Internet startup craze isn’t that too many people are starting companies; it’s that too many people aren’t sticking with it. That’s somewhat understandable, because there are many moments that are filled with despair and agony, when you have to fire people and cancel things and deal with very difficult situations. That’s when you find out who you are and what your values are." (Steve Jobs, Fortune, 2000)

"The Web does not just connect machines, it connects people." (Tim Berners-Lee, [speech] 2008)

"Never before have we had so many tools to learn and to communicate. Yet the art of talking, listening, and ascertaining the truth seems more elusive than ever in this Internet and cable age, lost in a bitter stream of blather and misinformation." (Maureen Dowd, "Toilet-Paper Barricades", The New York Times, 2009)

"Big data is based on the feedback economy where the Internet of Things places sensors on more and more equipment. More and more data is being generated as medical records are digitized, more stores have loyalty cards to track consumer purchases, and people are wearing health-tracking devices. Generally, big data is more about looking at behavior, rather than monitoring transactions, which is the domain of traditional relational databases. As the cost of storage is dropping, companies track more and more data to look for patterns and build predictive models." (Neil Dunlop, "Big Data", 2015)

"If Web 2.0 for you is blogs and wikis, then that is people to people. But that was what the Web was supposed to be all along." (Tim Berners-Lee, [interview])

"One of the powerful things about networking technology like the Internet or the Web or the Semantic Web [...] is that the things we've just done with them far surpass the imagination of the people who invented them." (Tim Berners-Lee)

"The first step is putting data on the Web in a form that machines can naturally understand, or converting it to that form. This creates what I call a Semantic Web-a web of data that can be processed directly or indirectly by machines." (Tim Berners-Lee)

"The power of the Web is in its universality. Access by everyone regardless of disability is an essential aspect." (Tim Berners-Lee)

14 November 2005

💎SQL Reloaded: Out of Space

Some time ago I found a post in which somebody was annoyed by having to write code like this:

DECLARE @zipcode char(5)

SET @zipcode = '90'

SELECT CASE len(@zipcode)
 WHEN 1 THEN '0000' + @zipcode
 WHEN 2 THEN '000' + @zipcode
 WHEN 3 THEN '00' + @zipcode
 WHEN 4 THEN '0' + @zipcode
 WHEN 5 THEN @zipcode
END

I remembered facing the same situation, but instead of writing a line of code for each case, I preferred to write a simple function, which uses the Space and Replace built-in functions to first add a number of spaces and then replace them with a given character. Maybe somebody will argue about the performance of such a function, but in many cases I prefer reusability over performance when the difference is not that big. Additionally, this enhances the code's readability.

So, here is the function:

CREATE FUNCTION dbo.FillBefore(
   @target varchar(50)
 , @length int
 , @filler char(1))
/*
   Purpose: enlarges a string to a predefined length by putting the same character in front of it
   Parameters: @target varchar(50) - the string to be transformed
  , @length int - the length of the output string
  , @filler char(1) - the character which will be used as filler 
   Notes: the function works only if @length is greater than or equal to the length of @target; otherwise it returns NULL
*/
RETURNS varchar(50)
AS
BEGIN
   RETURN Replace(Space(@length-len(IsNull(@target, ''))), ' ', @filler) 
       + IsNull(@target, '')
END

-- testing the function
SELECT dbo.FillBefore('1234', 5, '0')
SELECT dbo.FillBefore('1234', 10, ' ')
SELECT dbo.FillBefore(1234, 10, ' ')
SELECT dbo.FillBefore(Cast(1234 as varchar(10)), 10, '0')

Another scenario in which the Space function was useful was when I had to save the result of a query to a text file in fixed-width format. It comes down to the same logic used in the FillBefore function, except that the spaces must be added at the end of the string. For this I just needed to change the order of concatenation and create a new function:

CREATE FUNCTION dbo.FillAfter(
   @target varchar(50)
 , @length int
 , @filler char(1))
/*
   Purpose: enlarges a string to a predefined length by adding the same character at the end
   Parameters: @target varchar(50) - the string to be transformed
  , @length int - the length of the output string
  , @filler char(1) - the character which will be used as filler 
   Notes: the function works only if @length is greater than or equal to the length of @target; otherwise it returns NULL

*/
RETURNS varchar(50)
AS
BEGIN
   RETURN IsNull(@target, '') 
        + Replace(Space(@length-len(IsNull(@target,''))), ' ', @filler)
END

-- testing the function 
SELECT dbo.FillAfter('1234', 5, '0')
SELECT dbo.FillAfter('1234', 10, ' ')
SELECT dbo.FillAfter(1234, 10, '0')
SELECT dbo.FillAfter(Cast(1234 as varchar(10)), 10, '0')
SELECT dbo.FillAfter(NULL, 10, '0')

Notes:
1. In SQL Server 2005, the output of a query can be saved directly to a text file using a DTS package with a 'fixed width' (the columns are defined by fixed widths) or 'ragged right' (the columns are defined by fixed widths, except the last one, which is delimited by the new line character) format for the destination file.
2. To document or not to document? I know that many times we don't have the time to document the database objects we create, but I found that a simple comment saved more time later, whether it was my time or another developer's.
3. If it is really needed to use the CASE function, it is good to do an analysis of the data and put as the first evaluation the case with the highest probability to occur, as the second evaluation the case with the second highest probability, and so on. The reason is that the options are evaluated in order until one comparison evaluates to TRUE. For example, if most of the zip codes have 5 characters, it makes sense to put that case in the first WHEN position (see the sketch below).
4. The code has been tested successfully also on a SQL database in Microsoft Fabric.
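A minimal sketch of the reordering from note 3, assuming that 5-character zip codes are the most frequent:

DECLARE @zipcode char(5)
SET @zipcode = '90210'

-- evaluate the most likely case first (5 characters), then the less frequent ones
SELECT CASE len(@zipcode)
          WHEN 5 THEN @zipcode
          WHEN 4 THEN '0' + @zipcode
          WHEN 3 THEN '00' + @zipcode
          WHEN 2 THEN '000' + @zipcode
          WHEN 1 THEN '0000' + @zipcode
       END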

Happy coding!

08 November 2005

Meta-Blogging: Introduction


A warm welcome to my blog! 

I started this blog with the idea of documenting some of the issues I found when working with SQL Server, Oracle and other DBMSs, respectively to share some of the techniques used in manipulating data via SQL or other means (including .Net, MS Access or Excel). Therefore, it was intended to be a blog on SQL, data, databases, data processing and related topics, though with time I arrived to share my ideas on Business Intelligence, Data Warehousing, Data Management, ERP Systems, MS Office, Project Management, Strategic Management, Performance Management, Programming, Data Science, Software Engineering, Systems Engineering, and other topics directly or indirectly related to the work of a data professional.

Even if I was planning to post periodically, life took over; the professional as well as the personal life made it almost impossible to do that. The unavailability of an easy-to-use editor made blogging more complex than it should be, as I spent more time on formatting than on writing the code, which made me stop blogging for a while. In the end I found a mix of solutions I could work with, building also some macros to format the output.

In time, feeling the lack of meaningful and workable definitions, I felt the need to look also into how people define the various concepts related to the above-mentioned topics. I then started to collect definitions and quotes that would help me better understand the respective concepts, sharing and updating the respective content as well.

More recently, I started to share the notes I made over the years - a compilation of text copied from various sources, simplified for learning and review. Unfortunately, some of the resources have disappeared from the web, some information got deprecated, respectively is missing in other areas. Also, posting such content is not ideal, as it takes a long time to write.

I have more than 25 years of professional experience in IT in the above-mentioned areas. I started to program in the 9th grade on Spectrum computers (which had no external hard drives) and continued to do it during the university years, even if the work was more or less related to mathematical topics. Even if that can't be considered professional experience, it proved to be a useful one. I still do some programming, though it comes down mainly to data processing, building back-to-back solutions when needed.

Over the years I got in contact with several programming languages: Basic, Pascal, Fortran, C++, Java, VB/VBA, T-SQL, JavaScript, HTML/DHTML, VBScript, ColdFusion, ActionScript, C#, VB.Net, PL/SQL, and more recently R, Power Query M, DAX, Python, KQL and who knows what comes my way!

In my 3rd professional year I started to be more deeply involved in the business side, which offered me a better perspective on the challenges organizations deal with. It was an interesting journey, reflected also in many of my posts.

Happy and errorless coding! 

Best regards,
                        Adrian
