SQL Troubles: special character

Showing posts with label special character. Show all posts

20 October 2023

💎🏭SQL Reloaded: Extended LTrim/RTrim in SQL Server 2022 (Before and After)

In SQL Server 2022, the behavior of LTrim (left trimming) and RTrim (right trimming) functions was extended with one more string parameter. When provided, the engine checks whether the first parameter starts (for LTrim), respectively ends (for RTrim) with the respective value and removes it, the same as the space character char(32) was removed previously:

-- prior behavior of LTrim/RTrim
DECLARE @text as nvarchar(50) = '  123  '
SELECT '(' + LTrim(@text) + ')' LeftTrimming
, '(' + RTrim(@text) + ')' RightTrimming
, '(' + Ltrim(RTrim(@text)) + ')' Trimming1 -- prior SQL Server 2017
, '(' + Trim(@text) + ')' Trimming2 -- starting with SQL 2017

LeftTrimming	RightTrimming	Trimming1	Trimming2
(123 )	( 123)	(123)	(123)

Here's the new behavior:

-- extended behavior of LTrim/LTrim (SQL Server 2022+)
DECLARE @text as nvarchar(50) = '123abc123abc'
SELECT LTrim(@text , '123') LeftTrimming
, RTrim(@text , 'abc') RightTrimming;

LeftTrimming	RightTrimming
abc123abc	123abc123

Previously, to obtain the same result one could write something like:

-- prior solution via Left/Right for the same (SQL Server 2000+)
DECLARE @text as nvarchar(50) = '123abc123abc'
SELECT CASE WHEN Left(@text, 3) = '123' THEN Right(@text,Len(@text)-3) ELSE @text END LeftTrimming
, CASE WHEN Right(@text, 3) = 'abc' THEN Left(@text,Len(@text)-3) ELSE @text END  RightTrimming

-- prior solution via "LIKE" for the same (SQL Server 2000+)
DECLARE @text as nvarchar(50) = '123abc123abc'
SELECT CASE WHEN @text LIKE '123%' THEN Right(@text,Len(@text)-3) ELSE @text END LeftTrimming
, CASE WHEN @text LIKE '%abc' THEN Left(@text,Len(@text)-3) ELSE @text END  RightTrimming

As can be seen, the syntax is considerable simplified. However, there are few the situations when is needed. In the past I had to write code to remove parenthesis, quotes or similar characters:

-- removing parantheses
DECLARE @text as nvarchar(50) = '(testing)'
SELECT LTrim(@text , '(') LeftTrimming
, RTrim(@text , ')') RightTrimming
, RTrim(LTrim(Trim(@text), '('), ')') Trimming 

-- removing double quotes
DECLARE @text as nvarchar(50) = '"testing"'
SELECT LTrim(@text , '"') LeftTrimming
, RTrim(@text , '"') RightTrimming
, RTrim(LTrim(Trim(@text), '"'), '"') Trimming

The Trim for the 3rd value in both queries was used to remove the eventual spaces before the character to be replaced:

-- removing paranteses with lead/end spaces
SELECT RTrim(LTrim(Trim('   (testing)   '), '('), ')');

Then I thought, maybe I could use the same to remove the tags from an XML element. I tried the following code and unfortunately it doesn't seem to work:

-- attempting to remove the start/end tags from xml elements
DECLARE @text as nvarchar(50) = '<string>testing</string>'
SELECT LTrim(@text , '<string>') LeftTrimming
, RTrim(@text , '</string>') RightTrimming
, RTrim(LTrim(Trim(@text), '<string>'), '</string>') Trimming

LeftTrimming	RightTrimming	Trimming
esting</string>	<string>te	e

That's quite an unpleasant surprise! In exchange, the value type can be defined as XML and use the following code to obtain the needed result:

-- extracting the value from a tag element
DECLARE @text XML = '<string>testing</string>'
SELECT @text.query('data(/string)') as value

Notes:

The queries work also in SQL databases in Microsoft Fabric.

Happy coding!

01 April 2021

💎SQL Reloaded: Processing JSON Files with Complex Structure in SQL Server 2016+

Unfortunately (or fortunately, for the challenge-searchers), not all JSON data files have a simple (matrix) structure, while the data might not even have a proper (readable) definition. It's the case of the unemployment data provided by the Cologne municipality (source). However with a language page translator and some small effort one can identify the proximate data definition:

Source Field	Target Field	Data Type
AM_ALO_INSG_AA	ALO_Total	int
AM_ALO_SGB2_AA	ALO_SGB2	int
AM_ALO_UNTER25_AA	ALO_Under25	int
AM_ALO_INSG_AP	ALO_Total_Perc	float
AM_ALO_SGB2_AP	ALO_SGB2_Perc	float
AM_ALO_UNTER25_AP	ALO_Under25_Perc	float
AM_ALO_INSG_HA	ALO_Total_Hist	int
AM_ALO_SGB2_HA	ALO_SGB2_Hist	int
AM_ALO_UNTER25_HA	ALO_Under25_Hist	int
AM_ALO_INSG_HP	ALO_Total_HistPerc	float
AM_ALO_SGB2_HP	ALO_SGB2_HistPerc	float
AM_ALO_UNTER25_HP	ALO_Under25_HistPerc	float
AM_SVB_INSG_AA	SVB_Total	int
AM_SVB_MANN_AA	SVB_Men	int
AM_SVB_FRAU_AA	SVB_Women	int
AM_SVB_DEUTSCH_AA	SVB_German	int
AM_SVB_AUSLAND_AA	SVB_AUSLAND	int
AM_SVB_U25J_AA	SVB_Under25Yo	int
AM_SVB_UEBER55J_AA	SVB_Over55Yo	int
AM_SVB_INSG_AP	SVB_Total_Perc	float
AM_SVB_MANN_AP	SVB_Men_Perc	float
AM_SVB_FRAU_AP	SVB_Women_Perc	float
AM_SVB_DEUTSCH_AP	SVB_German_Perc	float
AM_SVB_AUSLAND_AP	SVB_AUSLAND_Perc	float
AM_SVB_U25J_AP	SVB_Under25Yo_Perc	float
AM_SVB_UEBER55J_AP	SVB_Over55Yo_Perc	float
AM_SVB_INSG_HA	SVB_Total_Hist	int
AM_SVB_MANN_HA	SVB_Men_Hist	int
AM_SVB_FRAU_HA	SVB_Women_Hist	int
AM_SVB_DEUTSCH_HA	SVB_German_Hist	int
AM_SVB_AUSLAND_HA	SVB_AUSLAND_Hist	int
AM_SVB_U25J_HA	SVB_Under25Yo_Hist	int
AM_SVB_UEBER55J_HA	SVB_Over55Yo_Hist	int
AM_SVB_INSG_HP	SVB_Total_HistPerc	float
AM_SVB_MANN_HP	SVB_Men_HistPerc	float
AM_SVB_FRAU_HP	SVB_Women_HistPerc	float
AM_SVB_DEUTSCH_HP	SVB_German_HistPerc	float
AM_SVB_AUSLAND_HP	SVB_AUSLAND_HistPerc	float
AM_SVB_U25J_HP	SVB_Under25Yo_HistPerc	float
AM_SVB_UEBER55J_HP	SVB_Over55Yo_HistPerc	float
SHAPE.AREA	SHAPE.AREA	int
SHAPE.LEN	SHAPE.LEN	int

As previously stated (see post), it makes sense to build the logic over several iterations, making first sure that the references to file's columns were used correctly (observe the way the various elements were referenced in the queries):

SELECT DAT.ObjectId 
, DAT.Nummer
, DAT.Name
, DAT.ALO_Total
, DAT.ALO_SGB2
FROM OPENROWSET (BULK 'D:\data\Arbeitsmarkt Statistik Koeln Stadtteil.json',CODEPAGE='65001', SINGLE_CLOB)  as jsonfile 
     CROSS APPLY OPENJSON(BulkColumn,'$.features')
 WITH( 
	  ObjectId int '$.properties.OBJECTID'
	, Nummer int '$.properties.NUMMER'
	, Name nvarchar(max) '$.properties.NAME'
	, ALO_Total int '$.properties.AM_ALO_INSG_AA'
	, ALO_SGB2 int '$.properties.AM_ALO_SGB2_AA'
) AS DAT;

Output (first 13 records):

ObjectId	Nummer	Name	ALO_Total	ALO_SGB2
1	211	Godorf	121	92
2	308	LÃ¶venich	156	83
3	307	Weiden	552	383
4	306	Junkersdorf	305	164
5	309	Widdersdorf	200	93
6	404	Vogelsang	310	208
7	505	Weidenpesch	527	351
8	502	Mauenheim	190	127
12	207	Hahnwald	15	3
13	213	Meschenich	644	554

The logic seems to work, however the German umlauts aren't displayed as expected ('LÃ¶venich', when it should have been 'Lövenich'). This is caused by the differences in character sets. An easy way to address this is to use a functions which does the conversion (see the dbo.ReplaceCodes2Umlauts UDF from an older post).

By applying the function on the Names, adding the further columns and an INSERT clause, the query becomes:

-- importing the JSON file
SELECT DAT.ObjectId 
, DAT.Nummer
, dbo.ReplaceCodes2Umlauts(DAT.Name) Name
, DAT.ALO_Total
, DAT.ALO_SGB2
, DAT.ALO_Under25
, DAT.ALO_Total_Perc
, DAT.ALO_SGB2_Perc
, DAT.ALO_Under25_Perc
, DAT.ALO_Total_Hist
, DAT.ALO_SGB2_Hist
, DAT.ALO_Under25_Hist
, DAT.ALO_Total_HistPerc
, DAT.ALO_SGB2_HistPerc
, DAT.ALO_Under25_HistPerc
, DAT.SVB_Total
, DAT.SVB_Men
, DAT.SVB_Women
, DAT.SVB_German
, DAT.SVB_AUSLAND
, DAT.SVB_Under25Yo
, DAT.SVB_Over55Yo
, DAT.SVB_Total_Perc
, DAT.SVB_Men_Perc
, DAT.SVB_Women_Perc
, DAT.SVB_German_Perc
, DAT.SVB_AUSLAND_Perc
, DAT.SVB_Under25Yo_Perc
, DAT.SVB_Over55Yo_Perc
, DAT.SVB_Total_Hist
, DAT.SVB_Men_Hist
, DAT.SVB_Women_Hist
, DAT.SVB_German_Hist
, DAT.SVB_AUSLAND_Hist
, DAT.SVB_Under25Yo_Hist
, DAT.SVB_Over55Yo_Hist
, DAT.SVB_Total_HistPerc
, DAT.SVB_Men_HistPerc
, DAT.SVB_Women_HistPerc
, DAT.SVB_German_HistPerc
, DAT.SVB_AUSLAND_HistPerc
, DAT.SVB_Under25Yo_HistPerc
, DAT.SVB_Over55Yo_HistPerc
, DAT.Shape_Area 
, DAT.Shape_Len 
INTO dbo.Unemployment_Cologne
FROM OPENROWSET (BULK 'D:\data\Arbeitsmarkt Statistik Koeln Stadtteil.json',CODEPAGE='65001', SINGLE_CLOB)  as jsonfile 
     CROSS APPLY OPENJSON(BulkColumn,'$.features')
 WITH( 
	  ObjectId int '$.properties.OBJECTID'
	, Nummer int '$.properties.NUMMER'
	, Name nvarchar(max) '$.properties.NAME'
	, ALO_Total int '$.properties.AM_ALO_INSG_AA'
	, ALO_SGB2 int '$.properties.AM_ALO_SGB2_AA'
	, ALO_Under25 int '$.properties.AM_ALO_UNTER25_AA'
	, ALO_Total_Perc float '$.properties.AM_ALO_INSG_AP'
	, ALO_SGB2_Perc float '$.properties.AM_ALO_SGB2_AP'
	, ALO_Under25_Perc float '$.properties.AM_ALO_UNTER25_AP'
	, ALO_Total_Hist int '$.properties.AM_ALO_INSG_HA'
	, ALO_SGB2_Hist int '$.properties.AM_ALO_SGB2_HA'
	, ALO_Under25_Hist int '$.properties.AM_ALO_UNTER25_HA'
	, ALO_Total_HistPerc float '$.properties.AM_ALO_INSG_HP'
	, ALO_SGB2_HistPerc float '$.properties.AM_ALO_SGB2_HP'
	, ALO_Under25_HistPerc float '$.properties.AM_ALO_UNTER25_HP'
	, SVB_Total int '$.properties.AM_SVB_INSG_AA'
	, SVB_Men int '$.properties.AM_SVB_MANN_AA'
	, SVB_Women int '$.properties.AM_SVB_FRAU_AA'
	, SVB_German int '$.properties.AM_SVB_DEUTSCH_AA'
	, SVB_AUSLAND int '$.properties.AM_SVB_AUSLAND_AA'
	, SVB_Under25Yo int '$.properties.AM_SVB_U25J_AA'
	, SVB_Over55Yo int '$.properties.AM_SVB_UEBER55J_AA'
	, SVB_Total_Perc float '$.properties.AM_SVB_INSG_AP'
	, SVB_Men_Perc float '$.properties.AM_SVB_MANN_AP'
	, SVB_Women_Perc float '$.properties.AM_SVB_FRAU_AP'
	, SVB_German_Perc float '$.properties.AM_SVB_DEUTSCH_AP'
	, SVB_AUSLAND_Perc float '$.properties.AM_SVB_AUSLAND_AP'
	, SVB_Under25Yo_Perc float '$.properties.AM_SVB_U25J_AP'
	, SVB_Over55Yo_Perc float '$.properties.AM_SVB_UEBER55J_AP'
	, SVB_Total_Hist int '$.properties.AM_SVB_INSG_HA'
	, SVB_Men_Hist int '$.properties.AM_SVB_MANN_HA'
	, SVB_Women_Hist int '$.properties.AM_SVB_FRAU_HA'
	, SVB_German_Hist int '$.properties.AM_SVB_DEUTSCH_HA'
	, SVB_AUSLAND_Hist int '$.properties.AM_SVB_AUSLAND_HA'
	, SVB_Under25Yo_Hist int '$.properties.AM_SVB_U25J_HA'
	, SVB_Over55Yo_Hist int '$.properties.AM_SVB_UEBER55J_HA'
	, SVB_Total_HistPerc float '$.properties.AM_SVB_INSG_HP'
	, SVB_Men_HistPerc float '$.properties.AM_SVB_MANN_HP'
	, SVB_Women_HistPerc float '$.properties.AM_SVB_FRAU_HP'
	, SVB_German_HistPerc float '$.properties.AM_SVB_DEUTSCH_HP'
	, SVB_AUSLAND_HistPerc float '$.properties.AM_SVB_AUSLAND_HP'
	, SVB_Under25Yo_HistPerc float '$.properties.AM_SVB_U25J_HP'
	, SVB_Over55Yo_HistPerc float '$.properties.AM_SVB_UEBER55J_HP'
	, Shape_Area float '$.properties."SHAPE.AREA"'
	, Shape_Len float '$.properties."SHAPE.LEN"'
) AS DAT;

Once the data made available, one can go on and discover the data and the relationships existing between the various columns.

Happy coding!

21 May 2020

💎🏭SQL Reloaded: Functions Useful in Data Migrations

Left and Right-Padding

Oracle has the LPAD and RPAD functions which return and expression, left-padded, respectively right-padded with the specified characters. The functions are useful in formatting unique identifiers to fit a certain format (e.g. 0000012345 instead of 12345). Here’re similar implementations of the respective functions:

-- left padding function 
CREATE FUNCTION dbo.LeftPadding( 
  @str varchar(50) 
, @length int  
, @padchar varchar(1))
RETURNS varchar(50)
AS 
BEGIN 
  RETURN CASE  
    WHEN LEN(@str)<@length THEN Replicate(@padchar, @length-LEN(@str)) + @str       
    ELSE @str  
    END 
END  

-- example left padding 
SELECT dbo.LeftPadding('12345', '10', '0')

-- right padding function 
CREATE FUNCTION dbo.RightPadding( 
  @str varchar(50) 
, @length int  
, @padchar varchar(1))
RETURNS varchar(50)
AS 
BEGIN 
  RETURN CASE  
    WHEN LEN(@str)<@length THEN @str + Replicate(@padchar, @length-LEN(@str))      
	ELSE @str  
  END
END  

-- example right padding 
SELECT dbo.RightPadding('12345', '10', '0')

Left and Right Side

When multiple pieces of data are stored within same same attribute, it’s helpful to get the left, respectively the right part of the string based on a given delimiter, where the reverse flag tells the directions in which the left is applied:

-- left part function 
CREATE FUNCTION dbo.CutLeft( 
  @str varchar(max) 
, @delimiter varchar(1)
, @reverse bit = 0)
RETURNS varchar(max)
AS 
BEGIN 
  RETURN CASE  
     WHEN CharIndex(@delimiter, @str)>0 THEN 
       CASE @reverse 
         WHEN 0 THEN Left(@str, CharIndex(@delimiter, @str)-1)    
         ELSE Left(@str, Len(@str)-CharIndex(@delimiter, Reverse(@str))) 
       END
	 ELSE @str  
  END
END  

-- example left part 
SELECT dbo.CutLeft('12345,045,000', ',', 0)
, dbo.CutLeft('12345,045,000', ',', 1)



-- right part function 
CREATE FUNCTION dbo.CutRight( 
  @str varchar(max) 
, @delimiter varchar(1)
, @reverse bit = 0)
RETURNS varchar(max)
AS 
BEGIN 
  RETURN CASE  
    WHEN CharIndex(@delimiter, @str)>0 THEN 
      CASE @reverse 
        WHEN 0 THEN Right(@str, CharIndex(@delimiter, Reverse(@str))-1) 
        ELSE Right(@str, Len(@str)-CharIndex(@delimiter, @str)) 
      END
   ELSE @str  
  END
END  

-- example right part 
SELECT dbo.CutRight('12345,045,000', ',', 0)
, dbo.CutRight('12345,045,000', ',', 1)

Replacing Special Characters

Special characters can prove to be undesirable in certain scenarios (e.g. matching values, searching). Until a “Replace from” function will be made available, the solution is to include the replacements in a user-defined function similar with the one below:

DROP FUNCTION IF EXISTS [dbo].[ReplaceSpecialChars]

CREATE FUNCTION [dbo].[ReplaceSpecialChars](
@string nvarchar(max)
, @replacer as nvarchar(1) = '-'
) RETURNS nvarchar(max)
-- replaces special characters with a given character (e.g. an empty string, space)
AS
BEGIN   
  IF CharIndex('*', @string) > 0  
     SET @string = replace(@string, '*', @replacer)	
    
  IF CharIndex('#', @string) > 0  
     SET @string = replace(@string, '#', @replacer)	
    
  IF CharIndex('$', @string) > 0  
     SET @string = replace(@string, '$', @replacer)	
    
  IF CharIndex('%', @string) > 0  
     SET @string = replace(@string, '%', @replacer)	
    
  IF CharIndex('&', @string) > 0  
     SET @string = replace(@string, '&', @replacer)	
    
  IF CharIndex(';', @string) > 0  
     SET @string = replace(@string, ';', @replacer)	
    
  IF CharIndex('/', @string) > 0  
     SET @string = replace(@string, '/', @replacer)	
    
  IF CharIndex('?', @string) > 0  
     SET @string = replace(@string, '?', @replacer)	
    
  IF CharIndex('\', @string) > 0  
     SET @string = replace(@string, '\', @replacer)	
    
  IF CharIndex('(', @string) > 0  
     SET @string = replace(@string, '(', @replacer)	
    
  IF CharIndex(')', @string) > 0  
     SET @string = replace(@string, ')', @replacer)	
    
  IF CharIndex('|', @string) > 0  
     SET @string = replace(@string, '|', @replacer)	
    
  IF CharIndex('{', @string) > 0  
     SET @string = replace(@string, '{', @replacer)	
    
  IF CharIndex('}', @string) > 0  
     SET @string = replace(@string, '}', @replacer)	
    
  IF CharIndex('[', @string) > 0  
     SET @string = replace(@string, '[', @replacer)	
    
  IF CharIndex(']', @string) > 0  
     SET @string = replace(@string, ']', @replacer)	
                                
  RETURN (LTrim(RTrim(@string)))
END

SELECT [dbo].[ReplaceSpecialChars]('1*2#3$4%5&6;7/8?9\10(11)12|13{14}15[16]', '')
SELECT [dbo].[ReplaceSpecialChars]('1*2#3$4%5&6;7/8?9\10(11)12|13{14}15[16]', ' ')

Other type of special characters are the umlauts (e.g. ä, ß, ö, ü from German language):

DROP FUNCTION IF EXISTS [dbo].[ReplaceUmlauts]

CREATE FUNCTION [dbo].[ReplaceUmlauts](
@string nvarchar(max)
) RETURNS nvarchar(max)
-- replaces umlauts with their equivalent
AS
BEGIN   
  IF CharIndex('ä', @string) > 0  
     SET @string = replace(@string, 'ä', 'ae')	
    
  IF CharIndex('ö', @string) > 0  
     SET @string = replace(@string, 'ö', 'oe')	
    
  IF CharIndex('ß', @string) > 0  
     SET @string = replace(@string, 'ß', 'ss')	
    
  IF CharIndex('%', @string) > 0  
     SET @string = replace(@string, 'ü', 'ue')	
    
                                
  RETURN Trim(@string)
END

SELECT [dbo].[ReplaceUmlauts]('Hr Schrötter trinkt ein heißes Getränk')

Handling Umlauts

Another type of specials characters are the letter specific to certain languages that deviate from the Latin characters (e.g. umlauts in German, accents in French), the usage of such characters introducing further challenges in handling the characters, especially when converting the data between characters sets. A common scenario is the one in which umlauts in ISO-8891-1 are encoded using two character sets. Probably the easiest way to handle such characters is to write a function as follows:

-- create the function 
CREATE FUNCTION [dbo].[ReplaceCodes2Umlauts](
  @string nvarchar(max)
) RETURNS nvarchar(max)
-- replaces ISO 8859-1 characters with the corresponding encoding 
AS
BEGIN   
  IF CharIndex('ÃŸ', @string) > 0  
     SET @string = replace(@string, 'ÃŸ', 'ß')	

  IF CharIndex('Ã¶', @string) > 0  
     SET @string = replace(@string, 'Ã¶', 'ö')	

  IF CharIndex('Ã–', @string) > 0  
     SET @string = replace(@string, 'Ã–', 'Ö')	
    
  IF CharIndex('Ã¼', @string) > 0  
     SET @string = replace(@string, 'Ã¼', 'ü')	

  IF CharIndex('Ãœ', @string) > 0  
     SET @string = replace(@string, 'Ãœ', 'Ü')	

  IF CharIndex('Ã¤', @string) > 0  
     SET @string = replace(@string, 'Ã¤', 'ä')

  IF CharIndex('Ã„', @string) > 0  
     SET @string = replace(@string, 'Ã„', 'Ä')
                                
  RETURN (LTrim(RTrim(@string)))
END


--test the function 
SELECT [dbo].[ReplaceCodes2Umlauts]('Falsches Ãœben von Xylophonmusik quÃ¤lt jeden grÃ¶ßeren Zwerg')

In the inverse scenario, at least for the German language is possible to replace the umlauts with a set the corresponding transliterations:

-- drop the function 
DROP FUNCTION IF EXISTS [dbo].[ReplaceUmlauts]

-- create the function 
CREATE FUNCTION [dbo].[ReplaceUmlauts](
  @string nvarchar(max)
) RETURNS nvarchar(max)
-- replaces umlauts with corresponding transliterations 
AS
BEGIN   
  IF CharIndex('ß', @string) > 0  
     SET @string = replace(@string, 'ß', 'ss')	

  IF CharIndex('ü', @string) > 0  
     SET @string = replace(@string, 'ü', 'ue')	

  IF CharIndex('ä', @string) > 0  
     SET @string = replace(@string, 'ä', 'ae')	
    
  IF CharIndex('ö', @string) > 0  
     SET @string = replace(@string, 'ö', 'oe')	
                                
  RETURN (LTrim(RTrim(@string)))
END

--test the function 
SELECT [dbo].[ReplaceUmlauts]('Falsches üben von Xylophonmusik quält jeden größeren Zwerg')

Notes:
The queries work also in SQL databases in Microsoft Fabric.

Happy coding!

27 October 2018

💎SQL Reloaded: Wish List (Part I: Replace From)

With SQL Server 2017 Microsoft introduced the Trim function, which not only replaces the combined use of LTrim and RTrim functions, but also replaces other specified characters from the start or end of a string (see my previous post):

-- Trim special characters 
SELECT Trim ('# ' FROM '# 843984 #') Example1
, Trim ('[]' FROM '[843984]') Example2

Output:
Example1 Example2
---------- --------
843984 843984

Similarly, I wish I had a function that replaces special characters from a whole string (not only the trails), for example:

-- Replace special characters 
SELECT Replace ('# ' FROM '# 84#3984 #', '') Example1
, Replace ('[]' FROM '[84][39][84]', '') Example2

Unfortunately, as far I know, there is no such simple function. Therefore, in order to replace the “]”, “[“ and “#” special characters from a string one is forced either to write verbose expressions like in the first example or to include the logic into a user-defined function like in the second:

-- a chain of replacements 
SELECT Replace(Replace(Replace('[#84][#39][#84]', '[' , ''), ']', ''), '#', '') Example1

-- encapsulated replacements
CREATE FUNCTION [dbo].[ReplaceSpecialChars](
  @string nvarchar(max)
, @replacer as nvarchar(1) 
) RETURNS nvarchar(max)
-- replaces the special characters from a string with a given replacer
AS
BEGIN   
  IF CharIndex('#', @string) > 0  
     SET @string = replace(@string, '#', @replacer) 
        
  IF CharIndex('[', @string) > 0  
     SET @string = replace(@string, '[', @replacer) 
    
  IF CharIndex(']', @string) > 0  
     SET @string = replace(@string, ']', @replacer) 
                                
  RETURN Trim(@string)
END

-- testing the function 
SELECT [dbo].[ReplaceSpecialChars]('[#84][#39][#84]', '') Example2

In data cleaning the list of characters to replace can get considerable big (somewhere between 10 and 30 characters). In addition, one can deal with different scenarios in which the strings to be replaced differ and thus one is forced to write multiple such functions.

To the list of special characters often needs to be considered also language specific characters like ß, ü, ä, ö that are replaced with ss, ue, ae, respectively oe (see also the post).

Personally, I would find such a replace function more than useful. What about you?

Happy coding!

SQL Troubles

Pages

20 October 2023

💎🏭SQL Reloaded: Extended LTrim/RTrim in SQL Server 2022 (Before and After)

01 April 2021

💎SQL Reloaded: Processing JSON Files with Complex Structure in SQL Server 2016+

21 May 2020

💎🏭SQL Reloaded: Functions Useful in Data Migrations

27 October 2018

💎SQL Reloaded: Wish List (Part I: Replace From)

About Me