SQL Troubles: trimming


06 March 2025

💎🤖SQL Reloaded: Copilot Stories (Part IX: Working with the Stuff Function)

One of the underused string functions in SQL Server is Stuff, which deletes a specified number of characters from a string, starting at a given position, and inserts another string at that position. This post evaluates the answers provided by Microsoft 365 Copilot on the use of the function.

Disclaimer: The below code was generated by Copilot via prompts and is used for demonstration purposes!

Prompt:
Give me 10 scenarios with examples in which the Stuff function can be used in SQL Server.

Output:

/* Example 1: Inserting a Substring */

-- Insert a substring into a specific position within a string
SELECT STUFF('Hello World', 6, 0, 'SQL ') AS Result; -- Result: 'Hello SQL World'

/* Example 2: Replacing a Substring */

-- Replace a part of a string with another substring.
SELECT STUFF('Hello World', 7, 5, 'SQL Server') AS Result; -- Result: 'Hello SQL Server'

/* Example 3: Removing Characters */

-- Remove a specific number of characters from a string
SELECT STUFF('Hello World', 6, 5, '') AS Result; -- Result: 'Hello'

/* Example 4: Masking Sensitive Data */

-- Mask a credit card number
SELECT STUFF('1234-5678-9012-3456', 6, 9, '****-****') AS MaskedCard; -- Result: '1234-****-****-3456'

/* Example 5: Formatting Phone Numbers */

-- insert formatting characters into a phone number.
SELECT STUFF(STUFF('1234567890', 4, 0, '-'), 8, 0, '-') AS FormattedPhone; -- Result: '123-456-7890'

/* Example 6: Creating Initials */

-- extract initials from a full name.
SELECT STUFF(STUFF('John Doe', 2, 3, '.'), 5, 3, '.') AS Initials; -- Result: 'J.D.'

/* Example 7: Dynamic SQL Generation */

-- modify parts of a dynamic SQL query.
DECLARE @sql NVARCHAR(MAX) = 'SELECT * FROM Table WHERE Column = Value';
SELECT STUFF(@sql, CHARINDEX('Value', @sql), LEN('Value'), 'NewValue') AS ModifiedSQL; -- Result: 'SELECT * FROM Table WHERE Column = NewValue'

/* Example 8: URL Manipulation */

-- Insert or replace parts of a URL.
SELECT STUFF('http://example.com/page', 8, 0, 'www.') AS ModifiedURL; -- Result: 'http://www.example.com/page'

/* Example 9: Handling CSV Data */

-- Modify parts of a CSV string
SELECT STUFF('1,2,3,4,5', 3, 1, '10') AS ModifiedCSV; -- Result: '1,10,3,4,5'

/* Example 10: Custom String Formatting */ 

-- Insert custom formatting into a string
SELECT STUFF('20250305', 5, 0, '-') AS FormattedDate; -- Result: '2025-0305'

The first 6 examples are typical usages of the Stuff function, while the others might raise the eyebrows of some developers.

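Example 1: The example is not entirely correct. Counting the characters starts at 1, so position 6 is the space, and the text is inserted before it rather than after it (the query returns 'HelloSQL  World'; the start position should be 7).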

Example 2: In exchange, this example is correct.

Example 3: The example is unfortunately incorrect (same issue as in the first example): the query returns 'Hellod' because only 5 of the 6 remaining characters are removed. The technique can be useful when a single contiguous region needs to be replaced once. Otherwise, one should use the Replace function.

Example 4: Starting with SQL Server 2016 one can use dynamic masking for similar purposes, which should provide more flexibility in many cases.

Example 5: Formatting phone numbers can be more complex than this because there can be different formats, country codes or other special cases.

Example 6: Names can be quite complex, each part having a different size. In addition, the query returns 'J. D.' rather than 'J.D.', because the space between the names is not removed.

Example 7: The example might work in certain cases, though it's enough for a field to contain the keyword "value" for the whole logic to generate errors.

Example 8: The example is valid, though web addresses can also use other transfer protocols (ftp, ftps, sftp, scp, etc.).

Example 9: The example works only when the position and the length of the value are known. A more reliable method is to split the string into parts and replace the value at the nth position.

Example 10: Even if the technique is valid, the example isn't entirely valid, even if it might be met in practice. See the 20th example below for a better approach.
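
The position issues noted for Examples 1, 3 and 6 are easy to verify. The following is a minimal sketch in which the start positions and lengths were adjusted by hand for these specific literals; it is not a general solution, as the values change with every input:

```sql
-- adjusted positions for Examples 1, 3 and 6 (same literals as in the Copilot output)
SELECT STUFF('Hello World', 7, 0, 'SQL ') AS Example1          -- 'Hello SQL World'
, STUFF('Hello World', 6, 6, '') AS Example3                   -- 'Hello'
, STUFF(STUFF('John Doe', 2, 4, '.'), 4, 2, '.') AS Example6;  -- 'J.D.'
```

For Example 6 the inner call removes 'ohn ' (including the space), which is why the second call starts at position 4 of the intermediate value 'J.Doe'.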

Prompt:
Give me 10 more scenarios with examples in which the Stuff function can be used in SQL Server.

Output:

/* Example 11: Reversing a string */

-- Reverse a string by replacing characters in a specific order.
SELECT STUFF(STUFF(STUFF('abcde', 1, 1, 'e'), 2, 1, 'd'), 3, 1, 'c') AS ReversedString; -- Result: 'edcba'

/* Example 12: Generating unique identifiers */

-- Modify parts of a GUID to create a custom identifier.
SELECT STUFF('123e4567-e89b-12d3-a456-426614174000', 1, 8, 'abcdef12') AS CustomGUID;-- Result: 'abcdef12-e89b-12d3-a456-426614174000'

/* Example 13: Combining columns */

-- Combine multiple columns into a single formatted string.
SELECT STUFF(FirstName + ' ' + LastName, LEN(FirstName) + 1, 0, ', ') AS FullName
FROM Employees;-- Result: 'John, Doe'

/* Example 14: Trimming strings */

-- Remove leading or trailing characters from a string.
SELECT STUFF('   Hello World   ', 1, 3, '') AS TrimmedString;-- Result: 'Hello World   '

/* Example 15: Creating abbreviations */

-- Generate abbreviations from a full string.
SELECT STUFF('North Atlantic Treaty Organization', 6, 22, 'ATO') AS Abbreviation;-- Result: 'NATO'

/* Example 16: Modifying JSON Data */

-- Insert or replace parts of a JSON string 
SELECT STUFF('{"name":"John","age":30}', 9, 4, 'Doe') AS ModifiedJSON;-- Result: '{"name":"Doe","age":30}'

/* Example 17: Handling XML Data */

-- Modify parts of an XML string.
SELECT STUFF('<name>John</name>', 7, 4, 'Doe') AS ModifiedXML;-- Result: '<name>Doe</name>'

/* Example 18: Generating random strings */

-- Insert random characters into a string.
SELECT STUFF('abcdef', 3, 0, CHAR(65 + ABS(CHECKSUM(NEWID())) % 26)) AS RandomString;-- Result: 'abXcdef' (X is a random character)

/* Example 19: Creating custom error messages  */

-- Insert dynamic values into error messages.
DECLARE @ErrorMessage NVARCHAR(100) = 'Error: Value {0} is not valid.';
SELECT STUFF(@ErrorMessage, CHARINDEX('{0}', @ErrorMessage), 3, '123') AS CustomErrorMessage;-- Result: 'Error: Value 123 is not valid.'

/* Example 20: Formatting dates */

-- Insert separators into a date string.
SELECT STUFF(STUFF('20250305', 5, 0, '-'), 8, 0, '-') AS FormattedDate; -- Result: '2025-03-05'

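Example 11: Copilot got the use partially wrong, given that each character except the pivoting (middle) character needs to be replaced accordingly. So for the 5 characters of the example one needs to apply the Stuff function 4 times, while the query applies it only 3 times and returns 'edcde'. The built-in Reverse function does the same in one call.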

Example 12: Implementing a custom GUID is a process more complex than this, as one needs to take care of not generating duplicates.

Example 13: This is an example of how to handle changes dynamically, though the separator is inserted before the existing space, so the result is 'John,  Doe' (with two spaces after the comma).

Example 14: One should use the Trim function whenever possible, respectively the combination of LTrim and RTrim if Trim is not available (it was introduced in SQL Server 2017).

Example 15: The example is incorrect (the query returns 'NorthATOization'). One should consider the length of the string in the formula.

Example 16: One must know the values in advance, otherwise the example doesn't hold. Moreover, the same issue as in the first example occurs.

Example 17: The example is not dynamic.

Example 18: It's an interesting technique for generating "random" characters given that unique values are generated across a dataset.

Example 19: One can write error messages with multiple placeholders, though the Replace function is simpler to use.

Example 20: It's easier to cast the value as datetime and apply the required formatting accordingly. Not testing whether the value is a date can lead to curious results.
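
The remarks on Examples 15, 16 and 20 can be sketched as follows. The variable names are mine, and the queries assume that the searched value ('John') occurs only once in the string and that TRY_CONVERT is available (SQL Server 2012+); they are illustrations, not what Copilot generated:

```sql
-- Example 15: derive the length from the string instead of hard-coding it
DECLARE @name varchar(50) = 'North Atlantic Treaty Organization';
SELECT STUFF(@name, 2, LEN(@name) - 1, 'ATO') AS Abbreviation; -- 'NATO'

-- Example 16: locate the value instead of counting the characters by hand
DECLARE @json varchar(50) = '{"name":"John","age":30}';
SELECT STUFF(@json, CHARINDEX('John', @json), LEN('John'), 'Doe') AS ModifiedJSON; -- '{"name":"Doe","age":30}'

-- Example 20: validate the value as a date before formatting it
SELECT CONVERT(char(10), TRY_CONVERT(date, '20250305', 112), 23) AS ValidDate -- '2025-03-05'
, CONVERT(char(10), TRY_CONVERT(date, '20251305', 112), 23) AS InvalidDate;   -- NULL (there is no 13th month)
```

TRY_CONVERT returns NULL instead of raising an error, which makes the invalid values easy to filter out or report.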

I met some of the usages exemplified above, though I used the Stuff function seldom (see a previous post), when no other functions were available. Frankly, Copilot could prove to be a useful tool for learning SQL or other programming languages in similar ways.

Happy coding!


20 October 2023

💎🏭SQL Reloaded: Extended LTrim/RTrim in SQL Server 2022 (Before and After)

In SQL Server 2022, the behavior of the LTrim (left trimming) and RTrim (right trimming) functions was extended with one more string parameter. When provided, the engine removes from the start (for LTrim), respectively from the end (for RTrim) of the first parameter all the characters found in the second parameter, the same as the space character char(32) was removed previously:

-- prior behavior of LTrim/RTrim
DECLARE @text as nvarchar(50) = '  123  '
SELECT '(' + LTrim(@text) + ')' LeftTrimming
, '(' + RTrim(@text) + ')' RightTrimming
, '(' + LTrim(RTrim(@text)) + ')' Trimming1 -- prior to SQL Server 2017
, '(' + Trim(@text) + ')' Trimming2 -- starting with SQL Server 2017

LeftTrimming	RightTrimming	Trimming1	Trimming2
(123  )	(  123)	(123)	(123)

Here's the new behavior:

-- extended behavior of LTrim/RTrim (SQL Server 2022+)
DECLARE @text as nvarchar(50) = '123abc123abc'
SELECT LTrim(@text , '123') LeftTrimming
, RTrim(@text , 'abc') RightTrimming;

LeftTrimming	RightTrimming
abc123abc	123abc123

Previously, to obtain the same result one could write something like:

-- prior solution via Left/Right for the same (SQL Server 2000+)
DECLARE @text as nvarchar(50) = '123abc123abc'
SELECT CASE WHEN Left(@text, 3) = '123' THEN Right(@text,Len(@text)-3) ELSE @text END LeftTrimming
, CASE WHEN Right(@text, 3) = 'abc' THEN Left(@text,Len(@text)-3) ELSE @text END  RightTrimming

-- prior solution via "LIKE" for the same (SQL Server 2000+)
DECLARE @text as nvarchar(50) = '123abc123abc'
SELECT CASE WHEN @text LIKE '123%' THEN Right(@text,Len(@text)-3) ELSE @text END LeftTrimming
, CASE WHEN @text LIKE '%abc' THEN Left(@text,Len(@text)-3) ELSE @text END  RightTrimming

As can be seen, the syntax is considerably simplified. However, there are only a few situations in which it is needed. In the past I had to write code to remove parentheses, quotes or similar characters:

-- removing parentheses
DECLARE @text as nvarchar(50) = '(testing)'
SELECT LTrim(@text , '(') LeftTrimming
, RTrim(@text , ')') RightTrimming
, RTrim(LTrim(Trim(@text), '('), ')') Trimming 

-- removing double quotes
DECLARE @text as nvarchar(50) = '"testing"'
SELECT LTrim(@text , '"') LeftTrimming
, RTrim(@text , '"') RightTrimming
, RTrim(LTrim(Trim(@text), '"'), '"') Trimming

The Trim for the 3rd value in both queries was used to remove any spaces before the character to be removed:

-- removing parentheses with leading/trailing spaces
SELECT RTrim(LTrim(Trim('   (testing)   '), '('), ')');

Then I thought, maybe I could use the same to remove the tags from an XML element. I tried the following code and unfortunately it doesn't seem to work:

-- attempting to remove the start/end tags from xml elements
DECLARE @text as nvarchar(50) = '<string>testing</string>'
SELECT LTrim(@text , '<string>') LeftTrimming
, RTrim(@text , '</string>') RightTrimming
, RTrim(LTrim(Trim(@text), '<string>'), '</string>') Trimming

LeftTrimming	RightTrimming	Trimming
esting</string>	<string>te	e

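That's quite an unpleasant surprise! The second parameter is treated as a set of characters rather than as a substring, so every leading, respectively trailing character found in the set is removed (here including the "t", respectively the "sting" from "testing"). In exchange, the value can be declared as XML and the following code can be used to obtain the needed result: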

-- extracting the value from a tag element
DECLARE @text XML = '<string>testing</string>'
SELECT @text.query('data(/string)') as value
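
If the value must stay a string, the Left/Right approach from the "prior solution" above can be generalized to remove a known prefix and suffix as whole substrings. This is only a sketch: the @prefix and @suffix variables are introduced here for illustration, and it assumes that the value has no trailing spaces (Len ignores them):

```sql
-- removing a known prefix and suffix as whole substrings (not as character sets)
DECLARE @text nvarchar(50) = N'<string>testing</string>';
DECLARE @prefix nvarchar(20) = N'<string>';
DECLARE @suffix nvarchar(20) = N'</string>';

SELECT CASE 
    WHEN Len(@text) >= Len(@prefix) + Len(@suffix)
     AND Left(@text, Len(@prefix)) = @prefix 
     AND Right(@text, Len(@suffix)) = @suffix
    THEN Substring(@text, Len(@prefix) + 1, Len(@text) - Len(@prefix) - Len(@suffix))
    ELSE @text 
  END AS Trimming; -- 'testing'
```

The value is returned unchanged when it doesn't both start with the prefix and end with the suffix.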

Notes:

The queries work also in SQL databases in Microsoft Fabric.

Happy coding!

27 October 2018

💎SQL Reloaded: Wish List (Part I: Replace From)

With SQL Server 2017 Microsoft introduced the Trim function, which not only replaces the combined use of the LTrim and RTrim functions, but also removes other specified characters from the start or end of a string (see my previous post):

-- Trim special characters 
SELECT Trim ('# ' FROM '# 843984 #') Example1
, Trim ('[]' FROM '[843984]') Example2

Output:
Example1 Example2
---------- --------
843984 843984

Similarly, I wish I had a function that replaces special characters from a whole string (not only from its ends), for example:

-- Replace special characters 
SELECT Replace ('# ' FROM '# 84#3984 #', '') Example1
, Replace ('[]' FROM '[84][39][84]', '') Example2

Unfortunately, as far as I know, there is no such simple function. Therefore, in order to replace the “]”, “[” and “#” special characters from a string, one is forced either to write verbose expressions like in the first example or to include the logic into a user-defined function like in the second:

-- a chain of replacements 
SELECT Replace(Replace(Replace('[#84][#39][#84]', '[' , ''), ']', ''), '#', '') Example1

-- encapsulated replacements
CREATE FUNCTION [dbo].[ReplaceSpecialChars](
  @string nvarchar(max)
, @replacer as nvarchar(1) 
) RETURNS nvarchar(max)
-- replaces the special characters from a string with a given replacer
AS
BEGIN   
  IF CharIndex('#', @string) > 0  
     SET @string = replace(@string, '#', @replacer) 
        
  IF CharIndex('[', @string) > 0  
     SET @string = replace(@string, '[', @replacer) 
    
  IF CharIndex(']', @string) > 0  
     SET @string = replace(@string, ']', @replacer) 
                                
  RETURN Trim(@string)
END

-- testing the function 
SELECT [dbo].[ReplaceSpecialChars]('[#84][#39][#84]', '') Example2

In data cleaning the list of characters to replace can get considerably long (somewhere between 10 and 30 characters). In addition, one can deal with different scenarios in which the strings to be replaced differ, and thus one is forced to write multiple such functions.
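
While it is not the single function wished for above, the Translate function (also introduced in SQL Server 2017) can shorten the chain of replacements. A minimal sketch: all unwanted characters are first mapped to one marker character, which is then removed with a single Replace. It assumes that the marker ('#' here) is itself one of the characters to be removed:

```sql
-- map every unwanted character to a single marker, then remove the marker (SQL Server 2017+)
SELECT Replace(Translate('[#84][#39][#84]', '[]#', '###'), '#', '') Example3; -- '843984'
```

Translate requires the second and third arguments to have the same length, and it replaces character by character, so it doesn't help when a character must be replaced with a longer string.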

The list of special characters often needs to include also language-specific characters like ß, ü, ä, ö that are replaced with ss, ue, ae, respectively oe (see also the post).

Personally, I would find such a replace function more than useful. What about you?

Happy coding!

26 October 2018

💎🏭SQL Reloaded: Trimming Strings (Before and After)

One of the annoying things when writing queries is the repetitive lengthy expressions that in general obfuscate the queries, making them more difficult to read, understand and troubleshoot, and sometimes such expressions come with a performance penalty as well. Loading data from Excel, text files and other sources involving poorly formatted data often requires trimming (all) the text values. In the early versions of SQL Server, the equivalent of a Trim function was obtained by combining the LTrim and RTrim functions. This resulted in writing code like this (based on the AdventureWorks 2014 database):

-- trimming via LTrim, RTrim 
SELECT LTrim(RTrim(AddressLine1)) AddressLine1
, LTrim(RTrim(AddressLine2)) AddressLine2
, LTrim(RTrim(City)) City
, LTrim(RTrim(PostalCode)) PostalCode
FROM Person.Address

This might not look like much, though imagine you have to deal with 30-50 text attributes, that the code is not written in a readable format (e.g. the way it is stored in the database), and that some attributes require further processing (e.g. removal of special characters, splitting, concatenating).

Often developers preferred encapsulating the call to the two functions within a user-defined function:

-- Trim user-defined function
CREATE FUNCTION dbo.Trim(
@string nvarchar(max))
RETURNS nvarchar(max)
BEGIN
    RETURN LTrim(RTrim(@string))
END

With it the code is somewhat simplified, but not by much, and it includes the cost of calling a user-defined function:

-- trimming via dbo.Trim
SELECT dbo.Trim(AddressLine1) AddressLine1
, dbo.Trim(AddressLine2) AddressLine2
, dbo.Trim(City) City
, dbo.Trim(PostalCode) PostalCode
FROM Person.Address

SQL Server 2017 introduced the Trim function, which not only replaces the combined use of the LTrim and RTrim functions, but also allows removing other specified characters (including CR, LF, Tab) from the start or end of a string.

By default the function removes the spaces from both sides of a string:

-- trimming via Trim
SELECT Trim(AddressLine1) AddressLine1
, Trim(AddressLine2) AddressLine2
, Trim(City) City
, Trim(PostalCode) PostalCode
FROM Person.Address

When a set of characters is provided the function removes the specified characters:

SELECT Trim ('#' FROM '# 843984') Example1
, Trim ('[]' FROM '[843984]') Example2
, Trim ('+' FROM '+49127298000') Example3
, Trim ('+-' FROM '+ 49-12729-8000 ') + ';' Example4
, Trim ('+ ' FROM '+ 49-12729-8000 ') + ';' Example5
, ASCII(Left(Trim (char(13) FROM char(13) + '49127298000'), 1)) Example6

Output:
Example1   Example2   Example3      Example4           Example5         Example6
--------   --------   -----------   ----------------   --------------   --------
 843984    843984     49127298000    49-12729-8000 ;   49-12729-8000;   52

As can be seen, when other characters need to be removed together with the space, the space must be included in the list of characters.

Notes:
1) The dbo.Trim function can be created in SQL Server 2017 environments as well.
2) The collation of the database will affect the behavior of the Trim function, therefore the results might look different when a case-sensitive collation is used.

3) The queries work also in SQL databases in Microsoft Fabric.

Happy coding!

07 January 2011

💎🏭SQL Reloaded: Pulling the Strings of SQL Server IV (Spaces, Trimming, Length and Comparisons)

In the previous post on concatenation, I was talking about the importance of spaces and other delimiters in making the output of concatenations more “readable”. Besides their importance in natural language, spaces have some further implications for the way strings are stored and processed. As remarked in the introductory post on this topic, there are two types of spaces that stand out in the crowd, namely the trailing spaces, found at the right end of a string, respectively the leading spaces, found at the left end of a string.

There are few cases in which trailing spaces are of any use, therefore databases like SQL Server usually ignore them. The philosophy about leading spaces is slightly different because there are cases in which they are used in order to align the text to the right, however there are tools which cut off the leading spaces. When no such tools are available, or either of the two types of spaces is not cut off, then we’ll have to do it ourselves, and here we come to the first topic of this post, trimming.

Trimming

Trimming is the operation of removing the spaces found at the ends of a string. Unlike other programming languages which use only one function for this purpose (e.g. the Trim function in VB or Oracle), SQL Server makes use of two functions: LTrim, used to trim the spaces found at the left end of the string, respectively RTrim, used to trim the spaces found at the right end of the string.

-- trimming a string 
SELECT LTrim(' this is a string ') Example1 -- left trimming 
, RTrim(' this is a string ') Example2 -- right trimming 
, LTrim(RTrim(' this is a string ')) Example3 -- left & right trimming

As can be seen it’s not so easy to identify the differences, maybe the next function will help to see that there is actually a difference.

Notes:
1) If it looks like the two trimming functions are not working with strings having leading or trailing spaces, then maybe you are not dealing with a space but rather with other characters like CR, LF, CRLF or similar, sometimes rendered like a space.

2) SQL Server 2017 introduced the Trim function, which not only replaces the combined use of the LTrim and RTrim functions, but also allows removing other specified characters (including CR, LF, Tab) from the start or end of a string (see post).
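
The two notes can be illustrated together with a small sketch for SQL Server 2017 and later. The variable names are chosen only for illustration, and the set of characters considered whitespace (tab, LF, CR, space) is an assumption that may need to be extended for other data:

```sql
-- trimming tabs, line feeds, carriage returns and spaces in one call (SQL Server 2017+)
DECLARE @whitespace varchar(4) = Char(9) + Char(10) + Char(13) + ' ';
DECLARE @value varchar(50) = Char(9) + ' this is a string ' + Char(13) + Char(10);

SELECT '(' + LTrim(RTrim(@value)) + ')' AS SpacesOnly          -- nothing is removed: the tab and the CRLF shield the spaces
, '(' + Trim(@whitespace FROM @value) + ')' AS AllWhitespace;  -- '(this is a string)'
```

LTrim and RTrim stop at the first character that is not a space, which is why the first expression returns the value unchanged.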

Length

Before approaching other operations with strings, it’s maybe useful (actually necessary as we will see) to get a glimpse of the way we can determine the length of a string value, in other words how many characters it has, this being possible by using the Len function:

-- length of a string 
SELECT Len('this is a string') Length1 -- simple string 
, Len('this is a string ') Length2 -- ending in space 
, Len(' this is a string') Length3 -- starting with a space 
, Len(' this is a string ') Length4 -- starting & ending with a space 
, Len(LTrim(' this is a string ')) Length5 -- length & left trimming 
, Len(RTrim(' this is a string ')) Length6 -- length & right trimming
, Len(LTrim(RTrim(' this is a string '))) Length7 -- length, left & right trimming

In order to understand the above results, one observation is necessary: if a string ends with one or more spaces, the Len function ignores them, though this doesn’t happen with the leading spaces, which need to be removed explicitly if needed.
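
When the trailing spaces do matter, there are a few ways to count them as well. A minimal sketch, assuming non-Unicode (varchar) literals in a single-byte code page, for which DataLength returns one byte per character (for nvarchar values the result would need to be divided by 2):

```sql
-- counting the trailing spaces too
SELECT Len('this is a string ') LengthIgnoringTrailing        -- 16
, DataLength('this is a string ') BytesIncludingTrailing      -- 17
, Len('this is a string ' + '|') - 1 LengthIncludingTrailing  -- 17
```

The third expression appends a marker character so that the spaces are no longer trailing, and then subtracts the marker from the result.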

Comparisons

The comparison operation points out the differences or similarities existing between two values, involving at minimum two expressions that reduce at runtime to a data type and the comparison operator. This means that each member of the comparison could include any valid combination of functions as long as they are reduced to compatible data types. In what concerns the comparison of strings, things are relatively simple, the comparison being allowed independently of whether they have fixed or varying length. Relatively simple because if we’d have to go into details, then we’d need to talk about character sets (also called character encodings or character maps) and other string goodies the ANSI SQL standard(s) come with, including a set of rules that dictate the behavior of comparisons. So, let’s keep things as simple as possible. As per the above attempt at a definition, a comparison typically implies an equality, respectively a difference, based on the equal (“=”), respectively not equal (“<>” or “!=”) operators. Here are some simple examples:

-- sample comparisons 
SELECT CASE WHEN 'abc' != 'abc ' THEN 1 ELSE 0 END Example1 
, CASE WHEN ' abc' != 'abc' THEN 1 ELSE 0 END Example2 
, CASE WHEN ' ' != '' THEN 1 ELSE 0 END Example3 
-- error comparison 
, CASE WHEN 'abc' != NULL THEN 1 ELSE 0 END Example4 
, CASE WHEN 'abc' = NULL THEN 1 ELSE 0 END Example5 
-- adequate NULL comparison 
, CASE WHEN 'abc' IS NOT NULL THEN 1 ELSE 0 END Example6 
, CASE WHEN 'abc' IS NULL THEN 1 ELSE 0 END Example7

Output:

Example1	Example2	Example3	Example4	Example5	Example6	Example7
0	1	0	0	0	1	0

The first three examples demonstrate again the behavior of leading, respectively trailing spaces. The next two examples, even if they seem quite logical in terms of natural language semantics, are wrong from the point of view of SQL semantics, because a comparison in which one of the values is NULL evaluates to NULL (unknown), thus resulting in the above behavior in which both expressions from the 4th and 5th examples fall through to the ELSE branch. The last two examples show how the NULLs should be handled in comparisons with the help of the IS operator, respectively its negation, IS NOT.

Like in the case of numeric values, the comparison between two strings could be expressed by using the “less than” (“<”) and “greater than” (“>”) operators, alone or in combination with the equality operator (“<=”, “>=”) or the negation operator (“!>”, “!<”) (see comparison operators in MSDN). Typically an SQL Server database is case insensitive, so there will be no difference between the following strings: “ABC”, “abc”, “Abc”, etc. Here are some examples:

-- sample comparisons (case sensitive) 
SELECT CASE WHEN 'abc' < 'ABC' THEN 1 ELSE 0 END Example1 
, CASE WHEN 'abc' > 'abc' THEN 1 ELSE 0 END Example2 
, CASE WHEN 'abc' >= 'abc ' THEN 1 ELSE 0 END Example3 
, CASE WHEN 'abc' <> 'ABC' THEN 1 ELSE 0 END Example4 
, CASE WHEN 'abc' > '' THEN 1 ELSE 0 END Example5 
, CASE WHEN ' ' > '' THEN 1 ELSE 0 END Example6

Output:

Example1	Example2	Example3	Example4	Example5	Example6
0	0	1	0	1	0

The case sensitivity could be changed at attribute (column), database or server level. As we don’t deal with a table and we don’t want to complicate the queries too much, let’s consider changing the sensitivity at database level. So if you are using a non-production database, try the following script in order to enable, respectively to disable the case sensitivity:

--enabling case sensitivity for a database 
ALTER DATABASE <database name>  
COLLATE Latin1_General_CS_AS  

--disabling case sensitivity for a database 
ALTER DATABASE <database name> 
COLLATE Latin1_General_CI_AS

In order to test the behavior of case sensitivity, first enable the sensitivity and then rerun the previous set of examples (involving case sensitivity).
Output:

Example1	Example2	Example3	Example4	Example5	Example6
1	0	1	1	1	0

After that you could disable again the case sensitivity by running the last script. Please note that if your database has other collation, you’ll have to change the scripts accordingly in order to point to your database’s collation.
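
When changing the collation of the database is not an option, the behavior can also be tested per expression with the COLLATE clause. A minimal sketch, reusing the two collations from the scripts above:

```sql
-- testing the case sensitivity per expression, without altering the database
SELECT CASE WHEN 'abc' = 'ABC' COLLATE Latin1_General_CI_AS THEN 1 ELSE 0 END CaseInsensitive -- 1
, CASE WHEN 'abc' = 'ABC' COLLATE Latin1_General_CS_AS THEN 1 ELSE 0 END CaseSensitive        -- 0
, CASE WHEN 'abc' < 'ABC' COLLATE Latin1_General_CS_AS THEN 1 ELSE 0 END LowerBeforeUpper     -- 1
```

The last column matches the value of Example1 in the case-sensitive output above. The explicit collation applies only to the expression in which it appears, so nothing needs to be reverted afterwards.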

Notes:
The queries work also in SQL databases in Microsoft Fabric.

Happy coding!
