Showing posts with label Reverse. Show all posts
Showing posts with label Reverse. Show all posts

21 May 2020

SQL Reloaded: Functions Useful in Data Migrations

Left and Right-Padding

Oracle has the LPAD and RPAD functions which return and expression, left-padded, respectively right-padded with the specified characters. The functions are useful in formatting unique identifiers to fit a certain format (eg. 0000012345 instead of 12345). Here’re similar implementations of the respective functions:

-- left padding function 
CREATE FUNCTION dbo.LeftPadding( 
  @str varchar(50) 
, @length int  
, @padchar varchar(1))
RETURNS varchar(50)
AS 
BEGIN 
  RETURN CASE  
    WHEN LEN(@str)<@length THEN Replicate(@padchar, @length-LEN(@str)) + @str       
    ELSE @str  
    END 
END  

-- example left padding 
SELECT dbo.LeftPadding('12345', '10', '0')

-- right padding function 
CREATE FUNCTION dbo.RightPadding( 
  @str varchar(50) 
, @length int  
, @padchar varchar(1))
RETURNS varchar(50)
AS 
BEGIN 
  RETURN CASE  
    WHEN LEN(@str)<@length THEN @str + Replicate(@padchar, @length-LEN(@str))      
	ELSE @str  
  END
END  

-- example right padding 
SELECT dbo.RightPadding('12345', '10', '0')

Left and Right Side

When multiple pieces of data are stored within same same attribute, it’s helpful to get the left, respectively the right part of the string based on a given delimiter, where the reverse flag tells the directions in which the left is applied:

-- left part function 
CREATE FUNCTION dbo.CutLeft( 
  @str varchar(max) 
, @delimiter varchar(1)
, @reverse bit = 0)
RETURNS varchar(max)
AS 
BEGIN 
  RETURN CASE  
     WHEN CharIndex(@delimiter, @str)>0 THEN 
       CASE @reverse 
         WHEN 0 THEN Left(@str, CharIndex(@delimiter, @str)-1)    
         ELSE Left(@str, Len(@str)-CharIndex(@delimiter, Reverse(@str))) 
       END
	 ELSE @str  
  END
END  

-- example left part 
SELECT dbo.CutLeft('12345,045,000', ',', 0)
, dbo.CutLeft('12345,045,000', ',', 1)



-- right part function 
CREATE FUNCTION dbo.CutRight( 
  @str varchar(max) 
, @delimiter varchar(1)
, @reverse bit = 0)
RETURNS varchar(max)
AS 
BEGIN 
  RETURN CASE  
    WHEN CharIndex(@delimiter, @str)>0 THEN 
      CASE @reverse 
        WHEN 0 THEN Right(@str, CharIndex(@delimiter, Reverse(@str))-1) 
        ELSE Right(@str, Len(@str)-CharIndex(@delimiter, @str)) 
      END
   ELSE @str  
  END
END  

-- example right part 
SELECT dbo.CutRight('12345,045,000', ',', 0)
, dbo.CutRight('12345,045,000', ',', 1)

Replacing Special Characters

Special characters can prove to be undesirable in certain scenarios (e.g. matching values, searching). Until a “Replace from” function will be made available, the solution is to include the replacements in a user-defined function similar with the one below:

DROP FUNCTION IF EXISTS [dbo].[ReplaceSpecialChars]

CREATE FUNCTION [dbo].[ReplaceSpecialChars](
@string nvarchar(max)
, @replacer as nvarchar(1) = '-'
) RETURNS nvarchar(max)
-- replaces special characters with a given character (e.g. an empty string, space)
AS
BEGIN   
  IF CharIndex('*', @string) > 0  
     SET @string = replace(@string, '*', @replacer)	
    
  IF CharIndex('#', @string) > 0  
     SET @string = replace(@string, '#', @replacer)	
    
  IF CharIndex('$', @string) > 0  
     SET @string = replace(@string, '$', @replacer)	
    
  IF CharIndex('%', @string) > 0  
     SET @string = replace(@string, '%', @replacer)	
    
  IF CharIndex('&', @string) > 0  
     SET @string = replace(@string, '&', @replacer)	
    
  IF CharIndex(';', @string) > 0  
     SET @string = replace(@string, ';', @replacer)	
    
  IF CharIndex('/', @string) > 0  
     SET @string = replace(@string, '/', @replacer)	
    
  IF CharIndex('?', @string) > 0  
     SET @string = replace(@string, '?', @replacer)	
    
  IF CharIndex('\', @string) > 0  
     SET @string = replace(@string, '\', @replacer)	
    
  IF CharIndex('(', @string) > 0  
     SET @string = replace(@string, '(', @replacer)	
    
  IF CharIndex(')', @string) > 0  
     SET @string = replace(@string, ')', @replacer)	
    
  IF CharIndex('|', @string) > 0  
     SET @string = replace(@string, '|', @replacer)	
    
  IF CharIndex('{', @string) > 0  
     SET @string = replace(@string, '{', @replacer)	
    
  IF CharIndex('}', @string) > 0  
     SET @string = replace(@string, '}', @replacer)	
    
  IF CharIndex('[', @string) > 0  
     SET @string = replace(@string, '[', @replacer)	
    
  IF CharIndex(']', @string) > 0  
     SET @string = replace(@string, ']', @replacer)	
                                
  RETURN (LTrim(RTrim(@string)))
END

SELECT [dbo].[ReplaceSpecialChars]('1*2#3$4%5&6;7/8?9\10(11)12|13{14}15[16]', '')
SELECT [dbo].[ReplaceSpecialChars]('1*2#3$4%5&6;7/8?9\10(11)12|13{14}15[16]', ' ')

Other type of special characters are the umlauts (e.g. ä, ß, ö, ü from German language):

DROP FUNCTION IF EXISTS [dbo].[ReplaceUmlauts]

CREATE FUNCTION [dbo].[ReplaceUmlauts](
@string nvarchar(max)
) RETURNS nvarchar(max)
-- replaces umlauts with their equivalent
AS
BEGIN   
  IF CharIndex('ä', @string) > 0  
     SET @string = replace(@string, 'ä', 'ae')	
    
  IF CharIndex('ö', @string) > 0  
     SET @string = replace(@string, 'ö', 'oe')	
    
  IF CharIndex('ß', @string) > 0  
     SET @string = replace(@string, 'ß', 'ss')	
    
  IF CharIndex('%', @string) > 0  
     SET @string = replace(@string, 'ü', 'ue')	
    
                                
  RETURN Trim(@string)
END

SELECT [dbo].[ReplaceUmlauts]('Hr Schrötter trinkt ein heißes Getränk')

Handling Umlauts

Another type of specials characters are the letter specific to certain languages that deviate from the Latin characters (e.g. umlauts in German, accents in French), the usage of such characters introducing further challenges in handling the characters, especially when converting the data between characters sets. A common scenario is the one in which umlauts in ISO-8891-1 are encoded using two character sets. Probably the easiest way to handle such characters is to write a function as follows:

-- create the function 
CREATE FUNCTION [dbo].[ReplaceCodes2Umlauts](
  @string nvarchar(max)
) RETURNS nvarchar(max)
-- replaces ISO 8859-1 characters with the corresponding encoding 
AS
BEGIN   
  IF CharIndex('ß', @string) > 0  
     SET @string = replace(@string, 'ß', 'ß')	

  IF CharIndex('ö', @string) > 0  
     SET @string = replace(@string, 'ö', 'ö')	

  IF CharIndex('Ö', @string) > 0  
     SET @string = replace(@string, 'Ö', 'Ö')	
    
  IF CharIndex('ü', @string) > 0  
     SET @string = replace(@string, 'ü', 'ü')	

  IF CharIndex('Ü', @string) > 0  
     SET @string = replace(@string, 'Ãœ', 'Ü')	

  IF CharIndex('ä', @string) > 0  
     SET @string = replace(@string, 'ä', 'ä')

  IF CharIndex('Ä', @string) > 0  
     SET @string = replace(@string, 'Ä', 'Ä')
                                
  RETURN (LTrim(RTrim(@string)))
END


--test the function 
SELECT [dbo].[ReplaceCodes2Umlauts]('Falsches Ãœben von Xylophonmusik quält jeden größeren Zwerg')

In the inverse scenario, at least for the German language is possible to replace the umlauts with a set the corresponding transliterations:

-- drop the function 
DROP FUNCTION IF EXISTS [dbo].[ReplaceUmlauts]

-- create the function 
CREATE FUNCTION [dbo].[ReplaceUmlauts](
  @string nvarchar(max)
) RETURNS nvarchar(max)
-- replaces umlauts with corresponding transliterations 
AS
BEGIN   
  IF CharIndex('ß', @string) > 0  
     SET @string = replace(@string, 'ß', 'ss')	

  IF CharIndex('ü', @string) > 0  
     SET @string = replace(@string, 'ü', 'ue')	

  IF CharIndex('ä', @string) > 0  
     SET @string = replace(@string, 'ä', 'ae')	
    
  IF CharIndex('ö', @string) > 0  
     SET @string = replace(@string, 'ö', 'oe')	
                                
  RETURN (LTrim(RTrim(@string)))
END

--test the function 
SELECT [dbo].[ReplaceUmlauts]('Falsches üben von Xylophonmusik quält jeden größeren Zwerg')

Happy coding!

03 January 2011

SQL Reloaded: Pulling the Strings of SQL Server V (Character Indexes)

A string is in fact a concatenation or chain of characters, each character in the string having a position referred as character’s index. In programming it’s useful to identify the first occurrence of a character or string in forward or backwards order, with a few simple tricks being possible to identify all the occurrences of the characters or whole strings in a string. Such functions are appearing in some of the well known programming languages under names like InStr, IndexOf for forward processing, respectively, InStrRev, LastIndexOf. SQL Server has only one function for this purpose, namely CharIndex, used to retrieve the position where a character is found starting from a given position. If no occurrence was found, the function returns 0, while if no start position if given, the default is 0. There are several other remarks that could be found in MSDN, and it’s actually always a good idea to consult the documentation from time to time, not only when meeting a new function. But let’s get back to the CharIndex function with a few examples based on searching a character within a given string:

-- searching a character into a string 
DECLARE @string varchar(50)  
SET @string = 'this is a string'  
SELECT CharIndex(' ', @string) Example1  
, CharIndex(NULL, @string) Example2 
, CharIndex(NULL, @string, NULL) Example3 
, CharIndex(NULL, NULL, NULL) Example4 
Output:
Example1 Example2 Example3 Example4
5 NULL NULL NULL

If in the first example the function is used to retrieve the first occurrence of a space in the given string, in the last three examples is highlighted the behavior of the function in case one of the parameters is NULL, in such cases the function returning a NULL – this means that when other behavior is expected, it falls in your attributions to handle the NULL values accordingly.

Here’s a second set of examples based on the search of a string within another string. As can be seen there isn’t a difference at all in how the function is applied:

-- searching a string into a string
DECLARE @string varchar(50)  
DECLARE @search varchar(50)  
SET @string = 'this_is_another_string'  
SET @search = 'is'  
SELECT @string Example1 -- simple string  
, CharIndex(@search, @string) Example2 -- occurring position  
, CharIndex(@search, @string, 5) Example3 -- second occurring position 
Output:
Example1 Example2 Example3
this_is_another_string 3 6

From the three examples, the third deserves some special attention because it attempts to retrieve the second occurrence of the string by using a hardcoded starting value. In fact the search could start a position further from the first occurrence, something like in the below example:

-- searching a string into a string in foward/reverse order 
DECLARE @string varchar(50)  
DECLARE @search varchar(50)  
SET @string = 'this_was_a_string'  
SET @search = '_'  
SELECT @string Example1 -- simple string  
, CharIndex(@search, @string) Example2 -- 1st occuring position (forward)  
, CharIndex(@search, @string, CharIndex(@search, @string)+1) Example3 -- 2nd occuring position (forward)  
, CharIndex(@search, @string, CharIndex(@search, @string, CharIndex(@search, @string)+1)+1) Example4 -- 3rd 
occuring position (forward)
Output:
Example1 Example2 Example3 Example4
this_was_a_string 5 9 11

The logic could be repeated over and over again in order to retrieve the n-th position, list all the positions of occurrence or the total number of occurrences.

Even if SQL Server doesn’t provide a function for retrieving the first occurrence in backward order, in other words starting from the end to the start, there is actually a simple trick that might do it. As SQL Server provides a Reverse function, which reverses the value of the string provided as parameter.

-- 1st occurring position backwards 
DECLARE @string varchar(50)  
DECLARE @search varchar(50)  
SET @string = 'this_was_a_string'  
SET @search = '_'  
SELECT Len(@string) - CharIndex(@search, Reverse(@string))+1 Example 

Unfortunately the logic needed to retrieve the second and following occurrences starting from a given position it’s not so easy readable. Therefore, for the sake of “simplicity”, here’s the logic for reverse processing incorporated in CharIndexRev function:

-- reverse CharIndex 
CREATE FUNCTION dbo.CharIndexRev( 
@search varchar(50) 
, @string varchar(500) 
, @start int = 0) 
RETURNS int 
AS BEGIN 
RETURN Len(@string) - CharIndex(@search, Reverse(@string), CASE WHEN IsNull(@start, 0) != 0 THEN Len(@string) - @start+1 ELSE 0 END)+1 
END 

And here’s the function at work:
 
-- searching a string into a string in reverse order 
DECLARE @string varchar(50) 
DECLARE @search varchar(50) 
SET @string = 'this_was_a_string' 
SET @search = '_' 
SELECT dbo.CharIndexRev(NULL, @string, 0) Example1 
, dbo.CharIndexRev(@search, NULL, 0) Example2 
, dbo.CharIndexRev(NULL, NULL, 0) Example3 
, dbo.CharIndexRev(@search, @string, 0) Example4 
, dbo.CharIndexRev(@search, @string, dbo.CharIndexRev(@search, @string, 0)-1) Example5 
, dbo.CharIndexRev(@search, @string, dbo.CharIndexRev(@search, @string, dbo.CharIndexRev(@search, @string, 0)-1)-1) Example6 
Output:
Example1 Example2 Example3 Example4 Example5 Example6
NULL NULL NULL 11 9 5

As can be seen, the output of the last three examples matches the reversed output of the previous set (11, 9, 5) vs. (5, 9, 11), while the first three examples exemplify the behavior of the function when one of the parameters is Null.

05 December 2010

SQL Reloaded: A Look into Reverse

Some time back I was mentioning Stuff as one of the functions rarely used in SQL Server. Here’s another one: Reverse. I remember I used such a function in other programming languages and maybe 1-2 times in SQL Server, and I can’t say I seen it in action in other posts. As it name states and the MSDN documentation confirms, Reverse returns the reverse of a string value. The function takes as parameter any string value or value that could be implicitly conversed to a string, otherwise you’ll have to do an explicit conversion. As can be seen from the below first example the reverse of the “string” string is “gnirts”, and, as per the second example, the reverse of the revers of a string is the initial string itself. The third and fourth example use in exchange a numeric value.
 
-- reversing a string using Reverse function 
SELECT Reverse('string') Example1 
, Reverse(Reverse('string')) Example2 
, Reverse(1234567890) Example3 
, Reverse(Reverse(1234567890)) Example4 
Example1 Example2 Example3 Example4
gnirts string 987654321 1234567890

It seems there is not much to be said about Reverse function, and in the few situations that you need it, it’s great to have it, otherwise you would have to built it yourself. Let’s try to imagine there is no Reverse function in SQL Server, how would we provide such functionality by using existing functionality? In general such an exercise might look like lost time, though it could be useful in order to present several basic techniques. In this case the Reverse functionality could be provided by using a simple loop that iterates through string’s characters changing their order. The below piece of code should return the same output as above:

-- reversing a string using looping 
DECLARE @String varchar(50) 
DECLARE @Retval varchar(50) 
DECLARE @Index int 
SET @String = 'string' 
SET @Index = 1 
SET @Retval = '' 
WHILE @Index <= Len(@String) 
BEGIN  
   SET @Retval = Substring(@String, @Index, 1) + @Retval   
   SET @Index = @Index + 1  
END 
SELECT @Retval 

Some type of problems whose solution involve the sequential processing of values within loops could be solved also using recursive techniques (recursion) that involve the sequential processing of values in which the output of a given intermediary step is based on the previous step. Recursion imply a slightly different view of the same problem, it could prove itself to be maybe less efficient in some cases and quite a useful technique in the others. SQL Server provides two types of recursion, one at function level involving functions, stored procedures are triggers, respectively at query level implemented in common table expressions. Recursive functions, functions that call themselves, have been for quite a while in SQL Server though few of the database books I read really talk about them. Probably I will detail the topic in another post, though for the moment I will show only the technique at work. In this case the recursive approach is quite simple:

-- reversing a string using a recursive function 
CREATE FUNCTION dbo.ReverseString( 
@string varchar(50)) 
RETURNS varchar(50) 
AS 
BEGIN           
     RETURN (CASE WHEN Len(@string)>1 THEN dbo.ReverseString( Right(@string, len(@string)-1)) + Left(@string, 1) ELSE @string END) 
END 
 
-- testing 
SELECT dbo.ReverseString('string') Example1 
, dbo.ReverseString(dbo.ReverseString('string')) Example2 

Common table expressions (CTE), since their introduction with SQL Server 2005, became quite an important technique in processing hierarchical structures like BOMs, which involve the traversal of the whole tree. A sequence is actually a tree having the width of 1 node, so it seems like a CTE’s job:
  
-- reversing a string using common table expression 
DECLARE @String varchar(50) 
SET @String = 'string' 
;WITH CTE(String, [Index]) 
AS (     SELECT Cast('' as varchar(50)) String 
    , 1 [Index] 
    UNION ALL 
    SELECT Cast(Substring(@String, [Index], 1) + String as varchar(50)) String 
    , [Index] + 1 [Index] 
    FROM CTE 
    WHERE [Index]<=Len(@String) 
) 
SELECT @String Reversed 
FROM CTE 
WHERE [Index]-1=Len(@String) 

Unlike the previous two solutions, the CTE approach involves more work, work that maybe isn’t required just for the sake of using the technique. When multiple alternative approaches exist, I prefer using the (simplest) solution that offers the highest degree of flexibility. In this case is the loop, though in others it could be a recursive function or a CTE.
Related Posts Plugin for WordPress, Blogger...

About Me

My photo
IT Professional with more than 24 years experience in IT in the area of full life-cycle of Web/Desktop/Database Applications Development, Software Engineering, Consultancy, Data Management, Data Quality, Data Migrations, Reporting, ERP implementations & support, Team/Project/IT Management, etc.