03 January 2011

Pulling the Strings of SQL Server–Part V: Character Indexes

    A string is in fact a concatenation or chain of characters, each character in the string having a position referred as character’s index. In programming it’s useful to identify the first occurrence of a character or string in forward or backwards order, with a few simple tricks being possible to identify all the occurrences of the characters or whole strings in a string. Such functions are appearing in some of the well known programming languages under names like InStr, IndexOf for forward processing, respectively, InStrRev, LastIndexOf. SQL Server has only one function for this purpose, namely CharIndex, used to retrieve the position where a character is found starting from a given position. If no occurrence was found, the function returns 0, while if no start position if given, the default is 0. There are several other remarks that could be found in MSDN, and it’s actually always a good idea to consult the documentation from time to time, not only when meeting a new function. But let’s get back to the CharIndex function with a few examples based on searching a character within a given string:
-- searching a character into a string
DECLARE @string varchar(50)
SET @string = 'this is a string'

SELECT
CharIndex(' ', @string) Example1

, CharIndex(NULL, @string) Example2
, CharIndex(NULL, @string, NULL) Example3
,
CharIndex(NULL, NULL, NULL) Example4

CharIndex Example Set 1
   If in the first example the function is used to retrieve the first occurrence of a space in the given string, in the last three examples is highlighted the behavior of the function in case one of the parameters is NULL, in such cases the function returning a NULL – this means that when other behavior is expected, it falls in your attributions to handle the NULL values accordingly.

   Here’s a second set of examples based on the search of a string within another string. As can be seen there isn’t a difference at all in how the function is applied:
-- searching a string into a string
DECLARE @string varchar(50)
DECLARE @search varchar(50)

SET @string = 'this_is_another_string'

SET @search = 'is'

SELECT @string Example1 -- simple string

, CharIndex(@search, @string) Example2 -- occurring position

,
CharIndex(@search, @string, 5) Example3 -- second occurring position

CharIndex Example Set 2
   From the three examples, the third deserves some special attention because it attempts to retrieve the second occurrence of the string by using a hardcoded starting value. In fact the search could start a position further from the first occurrence, something like in the below example:
-- searching a string into a string in foward/reverse order
DECLARE @string varchar(50)
DECLARE @search varchar(50)

SET @string = 'this_was_a_string'

SET @search = '_'

SELECT @string Example1 -- simple string

, CharIndex(@search, @string) Example2 -- 1st occuring position (forward)

, CharIndex(@search, @string, CharIndex(@search, @string)+1) Example3 -- 2nd occuring position (forward)

, CharIndex(@search, @string, CharIndex(@search, @string, CharIndex(@search, @string)+1)+1) Example4 -- 3rd occuring position (forward)

CharIndex Example Set 3

    The logic could be repeated over and over again in order to retrieve the n-th position, list all the positions of occurrence or the total number of occurrences.

    Even if SQL Server doesn’t provide a function for retrieving the first occurrence in backward order, in other words starting from the end to the start, there is actually a simple trick that might do it. As SQL Server provides a Reverse function, which reverses the value of the string provided as parameter.
-- 1st occuring position backwards
DECLARE @string varchar(50)
DECLARE
@search varchar(50)

SET
@string = 'this_was_a_string'

SET
@search = '_'

SELECT
Len(@string) - CharIndex(@search, Reverse(@string))+1 Example


  Unfortunately the logic needed to retrieve the second and following occurrences starting from a given position it’s not so easy readable. Therefore, for the sake of “simplicity”, here’s the logic for reverse processing incorporated in CharIndexRev function:
-- reverse CharIndex
CREATE FUNCTION dbo.CharIndexRev(
@search varchar(50)
,
@string varchar(500)
,
@start int = 0)

RETURNS
int
AS
BEGIN
RETURN Len(@string) - CharIndex(@search, Reverse(@string), CASE WHEN IsNull(@start, 0) != 0 THEN Len(@string) - @start+1 ELSE 0 END)+1

END
And here’s the function at work:
-- searching a string into a string in reverse order
DECLARE @string varchar(50)
DECLARE
@search varchar(50)

SET
@string = 'this_was_a_string'

SET
@search = '_'

SELECT
dbo.CharIndexRev(NULL, @string, 0) Example1

,
dbo.CharIndexRev(@search, NULL, 0) Example2

,
dbo.CharIndexRev(NULL, NULL, 0) Example3

,
dbo.CharIndexRev(@search, @string, 0) Example4

,
dbo.CharIndexRev(@search, @string, dbo.CharIndexRev(@search, @string, 0)-1) Example5

,
dbo.CharIndexRev(@search, @string, dbo.CharIndexRev(@search, @string, dbo.CharIndexRev(@search, @string, 0)-1)-1) Example6

CharIndex Example Set 4
    As can be seen, the output of the last three examples matches the reversed output of the previous set (11, 9, 5) vs. (5, 9, 11), while the first three examples exemplify the behavior of the function when one of the parameters is Null.

No comments: