-- Trim special characters SELECT Trim ('# ' FROM '# 843984 #') Example1 , Trim ('[]' FROM '[843984]') Example2
Example1 Example2
---------- --------
843984 843984
Similarly, I wish I had a function that replaces special characters from a whole string (not only the trails), for example:
-- Replace special characters SELECT Replace ('# ' FROM '# 84#3984 #', '') Example1 , Replace ('[]' FROM '[84][39][84]', '') Example2
Unfortunately, as far I know, there is no such simple function. Therefore, in order to replace the “]”, “[“ and “#” special characters from a string one is forced either to write verbose expressions like in the first example or to include the logic into a user-defined function like in the second:
-- a chain of replacements SELECT Replace(Replace(Replace('[#84][#39][#84]', '[' , ''), ']', ''), '#', '') Example1 -- encapsulated replacements CREATE FUNCTION [dbo].[ReplaceSpecialChars]( @string nvarchar(max) , @replacer as nvarchar(1) ) RETURNS nvarchar(max) -- replaces the special characters from a string with a given replacer AS BEGIN IF CharIndex('#', @string) > 0 SET @string = replace(@string, '#', @replacer) IF CharIndex('[', @string) > 0 SET @string = replace(@string, '[', @replacer) IF CharIndex(']', @string) > 0 SET @string = replace(@string, ']', @replacer) RETURN Trim(@string) END -- testing the function SELECT [dbo].[ReplaceSpecialChars]('[#84][#39][#84]', '') Example2
In data cleaning the list of characters to replace can get considerable big (somewhere between 10 and 30 characters). In addition, one can deal with different scenarios in which the strings to be replaced differ and thus one is forced to write multiple such functions.
To the list of special characters often needs to be considered also language specific characters like ß, ü, ä, ö that are replaced with ss, ue, ae, respectively oe (see also the post).
Personally, I would find such a replace function more than useful. What about you?
Happy coding!
No comments:
Post a Comment