30 March 2011

🔹SQL Server: Full-Text Catalog (Definitions)

"A catalog that stores a database's full-text index." (Microsoft Corporation, "SQL Server 7.0 System Administration Training Kit", 1999)

"The full-text catalog stores all of the full-text indexes for tables within a database." (Anthony Sequeira & Brian Alderman, "The SQL Server 2000 Book", 2003)

"A Full-Text catalog is a special storage space used to house Full-Text indexes. By default, all Full-Text indexes are housed in a single catalog." (Thomas Moore, "EXAM CRAM™ 2: Designing and Implementing Databases with SQL Server 2000 Enterprise Edition", 2005)

"A special storage space used to house Full-Text indexes. By default, all Full-Text indexes are housed in a single catalog." (Thomas Moore, "MCTS 70-431: Implementing and Maintaining Microsoft SQL Server 2005", 2006)

"A full-text catalog is a logical grouping of SQL Server full-text indexes for management purposes." (Michael Coles, "Pro T-SQL 2008 Programmer's Guide", 2008)

"A collection of full-text index components and other files that are organized in a specific directory structure and contain the data that is needed to perform queries." (Microsoft, "SQL Server 2012 Glossary", 2012)

"A full-text catalog is a logical grouping of SQL Server full-text indexes for management purposes." (Jay Natarajan et al, "Pro T-SQL 2012 Programmer's Guide" 3rd Ed., 2012)

"A logical grouping of SQL Server full-text indexes for management purposes." (Miguel Cebollero et al, "Pro T-SQL Programmer’s Guide" 4th Ed., 2015)

🔹SQL Server: Filegroup (Definitions)

"A named collection of one or more database files that forms a single unit of allocation and administration. Filegroups enable the creation of objects in a specific place - for example, placing a heavily accessed table on a very fast drive. They also provide the ability to back up specific objects." (Microsoft Corporation, "SQL Server 7.0 System Administration Training Kit", 1999)

"A named collection of one or more files that forms a single unit of allocation or for administration of a database. Filegroups allows you to engage in advanced administration techniques including the specific storage of particular objects on particular hard drives." (Anthony Sequeira & Brian Alderman, "The SQL Server 2000 Book", 2003)

"A named collection of one or more disk files that represent a single allocation on or for administration of a database. A database can have more than one filegroup." (Allan Hirt et al, "Microsoft SQL Server 2000 High Availability", 2004)

"In SQL Server, a named collection of one or more files that forms a single unit of allocation. Also for administration of a database." (Thomas Moore, "EXAM CRAM™ 2: Designing and Implementing Databases with SQL Server 2000 Enterprise Edition", 2005)

"This is a logical grouping of files used to segregate database objects for storage and performance reasons." (Joseph L Jorden & Dandy Weyn, "MCTS Microsoft SQL Server 2005: Implementation and Maintenance Study Guide - Exam 70-431", 2006)

"A logical means of controlling the placement of database objects on a file or set of files." (Marilyn Miller-White et al, "MCITP Administrator: Microsoft® SQL Server 2005 Optimization and Maintenance 70-444", 2007)

"A method of optimizing the performance of a database by controlling the placement of database files and database objects. By default, all data and database objects are placed into a single file, in a single filegroup. Unlike files, which can be viewed on the disk, a filegroup is conceptual." (Darril Gibson, "MCITP SQL Server 2005 Database Developer All-in-One Exam Guide", 2008)

"Collections of SQL Server data files. For performance and administrative reasons, you can place user objects into dedicated filegroups." (Robert D Schneider & Darril Gibson, "Microsoft SQL Server 2008 All-in-One Desk Reference For Dummies", 2008)

"A named collection of one or more data files that forms a single unit of data allocation or for administration of a database." (Microsoft, "SQL Server 2012 Glossary", 2012)

25 March 2011

🧭Business Intelligence: Troubleshooting (Part II: Approaching a Query)

Business Intelligence Series


You have received a (long) query for troubleshooting, review, conversion or a similar task, and you don’t know much about the underlying table structure or business logic. So, what do you do? Two things are intuitively clear: there is no need to panic, and understanding the query will help you in your task. “Understanding the query” sounds like a simple statement, though there is more to it. Here are some points on how to approach a query.

State your problem
“A problem well stated is a problem half solved” (Charles F. Kettering). Before performing any work, check what is requested from you and whether you have the information required for the task(s) ahead, for example documentation, valid examples, all the code, etc. If something is missing, don’t hesitate to request it. While waiting for the information, you can continue with the next steps. As we don’t live in a perfect world, there will also be cases in which you’ll have to fill the gaps yourself through additional research. When troubleshooting, it is important to understand what is wrong and, when possible, to have data against which to compare the query’s output.

Save the work

Even if you have a copy of the query somewhere on the server, save the previous version of the query and, when possible, use versioning. It might seem a redundant task, however you never know when you will need to refer to it and, as you’ll see next, it can and should be used as a baseline for validating the changes. In case you haven’t saved the query, check whether your RDBMS tracks metadata about the queries run; if the metadata were not reset in the meantime, you might be lucky enough to find a copy of your query.

I found that it is important to save the daily work: the various analyses performed in order to understand a query, the various versions and even the data used for testing. All this work can help you later to review what you did and which steps you missed, or to reuse one of the queries for further work.

Break down

When the query is too complex, it can be useful to break it into chunks that can be run and understood in isolation. Typically such chunks derive from the query’s structure (e.g. inline queries, subqueries derived from unions). I found that focusing on only one chunk of a query often helps to isolate issues.


Many programmers still write queries using the old non-ANSI join syntax, in which the join constraints appear in the WHERE clause, making a query more difficult to understand and troubleshoot. I often found myself transforming a query to ANSI SQL syntax before performing further work on it. It’s actually a good occasion to gain a first understanding of the query’s structure, though I’d prefer not to do it so often. In addition, during the restructuring it makes sense to differentiate between the join and the filter constraints, as this helps to isolate the issue(s).
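
As an illustration of the restructuring described above, here is a minimal sketch that assumes two hypothetical tables, dbo.Customers and dbo.Orders, which are not part of the original post. Both queries are meant to return the same records; only the placement of the join constraint differs.

```sql
-- hypothetical tables: dbo.Customers(CustomerId, Name), dbo.Orders(OrderId, CustomerId, Status)

-- old-style syntax: join and filter constraints are mixed in the WHERE clause
SELECT C.Name, O.OrderId
FROM dbo.Customers C, dbo.Orders O
WHERE C.CustomerId = O.CustomerId
  AND O.Status = 'Open'

-- ANSI syntax: the join constraint moves to the ON clause, the filter stays in WHERE
SELECT C.Name, O.OrderId
FROM dbo.Customers C
     JOIN dbo.Orders O
       ON C.CustomerId = O.CustomerId
WHERE O.Status = 'Open'
```

Once the constraints are separated this way, a missing join constraint is much easier to spot than in a long WHERE clause.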

Check cardinalities

Wrong join constraints lead to duplicates or to fewer records than expected, and such differences are difficult to track when the variances in the number of records are small. Even if the RDBMS helps developers by providing metadata about the join relations, the columns and predicates participating in a join are not always easy to identify. Therefore, it’s necessary to check the constraints between any two tables participating in a join. Sometimes, when the query is based on the table with the lowest level of detail, it can be enough to check how the number of records varies as each join is added.
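
A minimal sketch of such a check, again based on the hypothetical dbo.Orders and dbo.Customers tables (assumed names, not taken from the post): compare the record count before and after the join, then look for duplicated key values on the side that is supposed to be unique.

```sql
-- record count before the join
SELECT count(*) NumberRecords
FROM dbo.Orders

-- record count after the join: a higher number points to duplicates,
-- a lower number to records lost through the join
SELECT count(*) NumberRecords
FROM dbo.Orders O
     JOIN dbo.Customers C
       ON O.CustomerId = C.CustomerId

-- values of the join key that occur more than once on the "one" side
SELECT C.CustomerId, count(*) NumberRecords
FROM dbo.Customers C
GROUP BY C.CustomerId
HAVING count(*) > 1
```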

Check filter constraints

Filter constraints are maybe more difficult to identify, especially when the logic built into applications needs to be reengineered. Many of the filter constraints are logical, though when you have no documentation about the schemas, it is like wandering in the dark: you have to check real examples and identify the various values and the impact they have on the behavior of your report.
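
One simple way to look at real examples is to profile the values of a candidate filter column. The sketch below assumes a hypothetical Status column in a hypothetical dbo.Orders table; the resulting distribution shows which values exist and how many records each filter value would keep or discard.

```sql
-- distribution of the values of a candidate filter column (hypothetical names)
SELECT O.Status, count(*) NumberRecords
FROM dbo.Orders O
GROUP BY O.Status
ORDER BY NumberRecords DESC
```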

Validate changes
So, you made the changes and everything looks perfect. Is it so? Often your intuition might tell you that the logic of a query is correct, though as software is not based on magic, at least not all the time, check some of the records to make sure that the data are rendered as expected, check totals, compare the current with the previous version, identify variations, etc. You don’t need to use all the techniques you know, but to choose the smallest set of tools that allows you to validate the query.
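
For comparing the current with the previous version, one option in SQL Server 2005 and later is the EXCEPT operator. The sketch assumes the two versions of the query were saved as the hypothetical views dbo.vReportOld and dbo.vReportNew, returning the same columns.

```sql
-- records returned by the previous version but not by the current one
SELECT * FROM dbo.vReportOld
EXCEPT
SELECT * FROM dbo.vReportNew

-- records returned by the current version but not by the previous one
SELECT * FROM dbo.vReportNew
EXCEPT
SELECT * FROM dbo.vReportOld
```

Because EXCEPT works on distinct records, duplicates introduced by a wrong join would go unnoticed, so it makes sense to compare the record counts as well.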

Perform refactoring
Refactoring, the way to (continuous) code improvement, should become part of each developer’s philosophy about programming. A query, like any other piece of code, is rarely perfect, as technical and factual knowledge is relative, features get deprecated and new techniques are introduced. On the other side, there is an old saying in IT: don’t change something that’s already working. A balance should therefore be kept between the urge to improve and the actual need for change.

I hope there is no need to stress the importance of documentation. From versioning to the description of the logic, it’s good practice to document the important parts of your work, especially the parts that will facilitate later work.

🔹SQL Server: Application Role (Definitions)

"A SQL Server role created to support the security needs of an application. Such a role is activated by a password and the use of the sp_setapprole system stored procedure." (Microsoft Corporation, "SQL Server 7.0 System Administration Training Kit", 1999)

"A SQL Server role created to support the security needs of an application. Using application roles is an alternative to allowing users access to SQL Server 2000. You can create an application role and assign it to a particular application, allowing users who use the application to access SQL Server." (Anthony Sequeira & Brian Alderman, "The SQL Server 2000 Book", 2003)

"This is a special type of role that requires activation using the sp_setapprole stored procedure. This is primarily used to keep users from accessing a database with anything other than a custom application." (Joseph L Jorden & Dandy Weyn, "MCTS Microsoft SQL Server 2005: Implementation and Maintenance Study Guide - Exam 70-431", 2006)

"A SQL Server role used by the application, instead of the user, to authenticate against a database solution." (Marilyn Miller-White et al, "MCITP Administrator: Microsoft® SQL Server™ 2005 Optimization and Maintenance 70-444", 2007)

"A SQL Server role used by the application, instead of the user, to authenticate against a database solution." (Victor Isakov et al, "MCITP Administrator: Microsoft SQL Server 2005 Optimization and Maintenance (70-444) Study Guide", 2007)

"A SQL Server role created to support the security needs of an application." (Microsoft, "SQL Server 2012 Glossary", 2012)

11 March 2011

💎SQL Reloaded: Pulling the Strings of SQL Server VIII (Insertions, Deletions and Replacements)

Until now, the operations with strings were limited to concatenation and its reverse operation(s): extracting a substring or splitting a string into substrings. It was just the warm-up! There are several other important operations that involve the internal manipulation of strings, namely the insertion, deletion and replacement of a substring in a given string, operations performed using the Replace and Stuff functions.

The Replace function, as its name denotes, replaces all occurrences of a specified string value with another. The function is quite useful in several scenarios: the replacement of delimiters or special characters, the correction of misspelled words or of any other chunks of text. Here are some simple examples; the applications mentioned before will be considered in other posts:

-- examples with replace 
DECLARE @str varchar(30) 
SET @str = 'this is a test string' 
SELECT replace(@str, ' ', ',') Example1 
, replace(@str, ' ', ' ') Example2 
, replace(@str, ' ', '') Example3 
, replace(@str, 'is', 'as') Example4  

Output:
Example1 | Example2 | Example3 | Example4
this,is,a,test,string | this is a test string | thisisateststring | thas as a test string

When there is a good chance that the search string won’t appear in the target string, and especially when additional logic depends on the replacement, logic that could be included in the same expression, then it may make sense to check first whether the search string is present:

-- replacement with check 
DECLARE @str varchar(30) 
DECLARE @search varchar(30) 
DECLARE @replacement varchar(30) 
SET @str = 'this is a test string' 
SET @search = 'this string' 
SET @replacement = 'other string' 
SELECT CASE 
    WHEN CharIndex(@search, @str)>0 THEN Replace(@str, @search, @replacement) 
    ELSE @str 
END result 

Unfortunately the function doesn’t have the flexibility of the functions with the same name provided by the languages from the VB family (VBScript, VB.NET), which allow the replacement to start at a given position and/or to be limited to a given number of occurrences. This type of behavior can be obtained with a simple trick: splitting the string into two other strings, performing the replacement on the second string, and then concatenating the first string with the result of the replacement:

-- replacement starting with a given position 
DECLARE @str varchar(30) 
DECLARE @search varchar(30) 
DECLARE @replacement varchar(30) 
DECLARE @start int 
SET @str = 'this is a test string' 
SET @search = 's' 
SET @replacement = 'x' 
SET @start = 7 
SELECT Left(@str, @start-1) FirstPart 
, RIGHT(@str, Len(@str)-@start+1) SecondPart 
, CASE 
    WHEN @start <= LEN(@str) THEN Left(@str, @start-1) + Replace(RIGHT(@str, Len(@str)-@start+1), @search, @replacement) 
    ELSE @str 
END Replacement 

Output:
FirstPart | SecondPart | Replacement
this i | s a test string | this ix a text xtring

The logic can be encapsulated in a function together with additional validation logic.
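
A possible shape for such a function is sketched below. The name dbo.ReplaceFrom and its parameters are assumptions made for illustration, and the validation covers only NULL and out-of-range start positions. The remainder of the string is obtained with Stuff instead of Right, so that trailing spaces, which Len ignores, are preserved.

```sql
-- hypothetical helper: replaces @search with @replacement in @str, starting with position @start
CREATE FUNCTION dbo.ReplaceFrom(
  @str varchar(max)
, @search varchar(max)
, @replacement varchar(max)
, @start int)
RETURNS varchar(max)
AS
BEGIN
    -- validation: NULL or out-of-range start positions leave the string untouched
    -- (Len ignores trailing spaces, so positions within them count as out of range)
    IF @str IS NULL OR @start IS NULL OR @start < 1 OR @start > LEN(@str)
        RETURN @str

    -- first part unchanged + replacement performed on the remainder
    RETURN LEFT(@str, @start-1)
         + REPLACE(STUFF(@str, 1, @start-1, ''), @search, @replacement)
END
```

With the values from the previous example, SELECT dbo.ReplaceFrom('this is a test string', 's', 'x', 7) should return the same 'this ix a text xtring'.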

The Stuff function inserts a string into another string starting with a given position, after deleting a specified number of characters. Even if seldom used, the function is quite powerful: it allows inserting a string into another, removing a part of a string or, more generally, replacing a single occurrence of a string with another string, as can be seen from the examples below:
-- Stuff-based examples 
DECLARE @str varchar(30) 
SET @str = 'this is a test string' 
SELECT STUFF(@str, 6, 2, 'was ') Example1 
, STUFF(@str, 1, 0, 'and ') Example2 
, STUFF(@str, 1, 0, 'that') Example3 
, STUFF(@str, LEN(@str) + 1, 0, '!') Example4 

Output:
Example1 | Example2 | Example3 | Example4
this was a test string | and this is a test string | thatthis is a test string | NULL

While the first example replaces a text at a fixed position, the next examples attempt inserts at the beginning of the string and, in the last case, right after its end. As can be seen, the last example doesn’t work as expected, because the insert position can’t go beyond the length of the target string. Actually, if the insert needs to be done at the beginning or at the end of a string, a concatenation is much easier to use. One such example is the padding of strings with leading or trailing characters, typically in order to arrive at a given length. SQL Server doesn’t provide such a function, however the logic is quite easy to build.
-- left/right padding 
DECLARE @str varchar(30) 
DECLARE @length int  
DECLARE @padchar varchar(1) 
SET @str = '12345'  
SET @length = 10 
SET @padchar = '0' 
SELECT @str StringToPad  
, CASE  
     WHEN LEN(@str)<@length THEN Replicate(@padchar, @length-LEN(@str)) + @str       
     ELSE @str  
END LeftPadding  
, CASE  
     WHEN LEN(@str)<@length THEN @str + Replicate(@padchar, @length-LEN(@str))      
     ELSE @str  
END RightPadding 

Output:
StringToPad | LeftPadding | RightPadding
12345 | 0000012345 | 1234500000

The queries also work in SQL databases in Microsoft Fabric.

Happy Coding!

24 February 2011

💎SQL Reloaded: Pulling the Strings of SQL Server VI (Subparts of a String)

No matter how normalized a database is, there will always be the need to encode multiple semantic entities in a string, and thus to extract them later. Examples are the Street Name and Number, the Country Phone Prefix and Phone Number, the First and Last Name, etc. Another frequent scenario is the misuse of long string data types to store multiple delimited attributes or simply a whole text like a Comment or Description. The easiest scenarios to deal with are the ones in which you know the rules behind the encoding, preferably a fixed-length encoding or a single delimiter. Examples of fixed-length encodings are the IBAN, the country-specific VAT Number or any other artificially constructed/standardized encoding. According to Wikipedia, the International Bank Account Number (IBAN) consists of the ISO 3166-1 alpha-2 country code, followed by two check digits calculated using a mod-97 technique, and the Basic Bank Account Number (BBAN) with up to thirty alphanumeric characters. Taking the IBAN example provided for Switzerland, it could be stored as “CH93 0076 2011 6238 5295 7” or “CH9300762011623852957”. In either case, when the country is not provided explicitly, it would be interesting to extract it from the IBAN together with the BBAN. How would we do that?

Many of the programming languages I worked with provide a function for extracting a substring from a string: Mid (VB/VBScript), Substring in C#, Substr in Oracle, etc. SQL Server actually provides three functions for this purpose: Substring, Left and Right. The Substring function extracts a substring of a given length starting from a given position. The Left and Right functions return, as their names suggest, the left and the right part of a character string with the specified number of characters. The Left and Right functions seem redundant, as they are particular cases of Substring, however they can sometimes simplify the code, as can be seen from the example below, based on the above IBAN.

-- parsing an IBAN 
DECLARE @IBAN varchar(50) 
SET @IBAN = 'CH9300762011623852957' 
SELECT @IBAN IBAN 
, LEFT(@IBAN, 2) CountryCode1 
, SUBSTRING(@IBAN, 1, 2) CountryCode2 
, SUBSTRING(@IBAN, 3, 2) CheckDigits 
, RIGHT(@IBAN, Len(@IBAN)-4) BBAN1 
, SUBSTRING(@IBAN, 5, Len(@IBAN)-4) BBAN2 

[Image: Substrings Example 1]

Even if the IBAN has a variable-length component (the BBAN) given the fact that the other two components are fixed, this allows us to clearly extract each component. The example shows also the equivalent call of Substring function for Left (Country Code extraction), respectively Right (BBAN extraction).
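
When the IBAN is stored in its grouped form, the same extraction can be applied after removing the spaces. The following is only a sketch; the variable names @raw and @clean are chosen here for illustration.

```sql
-- parsing an IBAN stored with group separators
DECLARE @raw varchar(50)
DECLARE @clean varchar(50)
SET @raw = 'CH93 0076 2011 6238 5295 7'
SET @clean = REPLACE(@raw, ' ', '') -- removes the group separators
SELECT @clean IBAN
, LEFT(@clean, 2) CountryCode
, SUBSTRING(@clean, 3, 2) CheckDigits
, SUBSTRING(@clean, 5, LEN(@clean)-4) BBAN
```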

What happens if there is more than one variable-length component? For such scenarios it’s useful to introduce a delimiter: a comma, a dash, a space, etc. This is the case of a First and Last Name stored in the same attribute. Normally only one component qualifies as Last Name, and for simplicity let’s consider it as being stored first, with the space as delimiter. In order to separate the components, it’s enough to find the first occurrence of the delimiter by using the CharIndex function.
-- parsing a Person's Name 
DECLARE @Name varchar(50) 
SET @Name = 'Stone Sharon' 
SELECT @Name Name 
, LEFT(@Name, CHARINDEX(' ', @Name)) LastName -- includes the trailing delimiter; wrap in RTrim if it matters 
, RIGHT(@Name, LEN(@Name) - CHARINDEX(' ', @Name)) FirstName 

Output:
Name | LastName | FirstName
Stone Sharon | Stone | Sharon

The code for cutting the left or the right part of a string is pretty simple, and over the years I used it quite often, so it makes sense to encapsulate it in a function, like I did in an older post.

When a delimiter is used repeatedly in a string, normally we need to identify each component in the string. Sometimes the number of components is known, other times not. A common table expression can be used for this purpose, and here is another example in which the space is used as delimiter: extracting the words from a given sentence.

-- extracting the words from a sentence 
DECLARE @str nvarchar(100) 
SET @str = 'This is just a test' 
;WITH CTE (PrevString, Position, Word) 
AS ( 
    -- anchor member: first word and the remaining string 
    SELECT LTrim(RTrim( CASE 
         WHEN CharIndex(' ', @str)>0 THEN Right(@str, Len(@str)-CharIndex(' ', @str)) 
         ELSE '' 
     END)) PrevString 
    , 1 Position 
    , LTrim(RTrim(CASE 
         WHEN CharIndex(' ', @str)>0 THEN LEFT(@str, CharIndex(' ', @str)) 
         ELSE @str 
     END)) Word 
    UNION ALL 
    -- recursive member: next word from the remaining string 
    SELECT LTrim(RTrim( CASE 
         WHEN CharIndex(' ', PrevString)>0 THEN Right(PrevString, Len(PrevString)-CharIndex(' ', PrevString)) 
         ELSE '' 
     END)) PrevString 
    , Position + 1 Position 
    , LTrim(RTrim(CASE 
         WHEN CharIndex(' ', PrevString)>0 THEN LEFT(PrevString, CharIndex(' ', PrevString)) 
         ELSE PrevString 
     END)) Word 
    FROM CTE 
    WHERE Len(PrevString)>0 
) 
SELECT PrevString 
, Word 
, Position 
FROM CTE 
OPTION (maxrecursion 100) 

Output:
PrevString | Word | Position
is just a test | This | 1
just a test | is | 2
a test | just | 3
test | a | 4
 | test | 5

The logic works for a sentence and, if we ignore the fact that some punctuation signs appear at the end of the words, it might work as well for a whole phrase, considering that the punctuation signs can be removed from the end result. It would be useful, for example, to generalize the logic for a set of delimiters, in this case the other punctuation signs (e.g. “,”, “;”, “!”, etc.), however this would mean identifying which of the delimiters is used first, or applying the same logic for the first delimiter, then for the second and so on. In addition, if the number of encoded elements within a value remains the same, a pivot can be applied on the final result in order to have all the elements of a value within the same row.
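
A simpler alternative, when the delimiters themselves are not needed, is to map all of them to a single delimiter before splitting. The sketch below uses a sample sentence chosen for illustration and replaces three punctuation signs with spaces; this is a swapped-in shortcut, not the generalization discussed above.

```sql
-- normalizing several delimiters to a space before splitting
DECLARE @str nvarchar(100)
SET @str = 'This is, just; a test!'

-- map each additional delimiter to a space
SET @str = REPLACE(REPLACE(REPLACE(@str, ',', ' '), ';', ' '), '!', ' ')

-- collapse the double spaces left behind by the replacements
WHILE CHARINDEX('  ', @str) > 0
    SET @str = REPLACE(@str, '  ', ' ')

SELECT LTrim(RTrim(@str)) NormalizedString
```

The normalized string can then be passed to the word-extraction query.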

Happy Coding!   

05 February 2011

💎🏭SQL Reloaded: Deleting Sequential Data From a Table

Last week I ran into an interesting solution to a simple problem, one I have dealt with on several occasions too: how to delete (efficiently) sequential data or, as LuborK calls it in his post, “a problem of efficient ordered delete”. In this case he’s talking about the deletion of the first n records from a dataset of sequential data. It sounds like a simple problem, considering that we can select in a first step the first n records using the TOP clause and then delete the matched records. That’s also what LuborK does; the novelty of the proposed solution lies not in the approach but in the gain in performance. The bottom line: by encapsulating the inner query within a view, the deletion of sequential data becomes more efficient.

While reading the respective post I thought: what if the inner query is encapsulated in an inline table-valued function?! Would the performance remain the same or degrade? Does it make sense to add the overhead of creating a view for each such scenario in order to gain some performance? Then, reading the comments, one of them pointed out something quite obvious: the inner query can be encapsulated within a common table expression, on which the deletion is then performed. It is an elegant solution that I didn’t have available under SQL Server 2000 at the time I had to solve the same problem. Actually my approach was a little different: I identified and stored the greatest value of each new data set, building thus a collection of values that allowed me to select a given range. This approach was quite useful in the respective scenarios and performed acceptably, plus I was reusing the greatest value in further logic, two birds with one stone.

This being said, I thought that, besides the methods provided by LuborK, I could provide in this post several other simple techniques. Please note that I’m not focusing on performance, but on the techniques. If needed, you can display the statistics related to the disk activity and the time required to parse, compile and execute each batch of statements. So, let’s start with the creation and population of the table! In order to minimize the differences, I took the same example and added just a little make-up: comments and a small change in the way the values of the second column are inserted.

-- dropping the table 
-- DROP TABLE dbo.TestTable 

-- declaring the table 
CREATE TABLE dbo.TestTable ( 
  id int primary key 
, value varchar (50)) 

-- inserting the records in a loop 
DECLARE @index int 
SET @index=1 
WHILE (@index<100000) 
BEGIN 
    INSERT INTO dbo.TestTable 
    VALUES (@index,'test '+cast(@index as varchar(10))) 
    SET @index=@index+1 
END 
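
The statistics mentioned earlier can be switched on for the current session with the SET STATISTICS options; the output appears in the Messages tab of the query window. A minimal sketch:

```sql
-- display disk activity and parse/compile/execution times for the following statements
SET STATISTICS IO ON
SET STATISTICS TIME ON

-- run here the batch to be measured

SET STATISTICS IO OFF
SET STATISTICS TIME OFF
```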

Supposing we are interested in the first 10000 records of a sequential data set, it would be much easier if we knew the maximum value within the respective subset. Then we could use the value to perform a simple deletion:

-- deletion based on stored max value 
DECLARE @max int 
SELECT @max = MAX(ID) 
FROM ( 
    SELECT top(10000) id 
    FROM dbo.TestTable 
    ORDER BY ID 
) A 

-- deleting the data 
DELETE FROM dbo.TestTable 
WHERE id <= @max 

If we don’t need the maximum value for further logic, the two statements can be combined in one query:

-- deletion based on max value - within one query 
DELETE FROM dbo.TestTable 
WHERE id <= ( 
     SELECT MAX(ID) 
     FROM ( 
          SELECT top(10000) id 
          FROM dbo.TestTable 
          ORDER BY ID 
     ) A 
) 

When dealing with huge volumes of data, and not only then, in case of sequential processing we can store the maximum value in a given table and pull the values as required. This approach allows us to process any interval within the data set. In this case the deletion comes down to a simple DELETE statement in which @start and @end were retrieved from the respective table:
-- deleting the data within a given interval 
DELETE FROM dbo.TestTable 
WHERE id BETWEEN @start AND @end 
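
How @start and @end are stored and retrieved is not shown here; one possible arrangement is sketched below. The table dbo.ProcessedRanges and its columns are hypothetical, and the sketch assumes that dbo.TestTable is not empty when the greatest id is stored.

```sql
-- hypothetical table storing the greatest id of each loaded data set
CREATE TABLE dbo.ProcessedRanges (
  RangeId int IDENTITY(1,1) PRIMARY KEY
, MaxId int NOT NULL)

-- after loading a data set, store its greatest id
INSERT INTO dbo.ProcessedRanges (MaxId)
SELECT MAX(id)
FROM dbo.TestTable

-- retrieve the boundaries of the most recent range
DECLARE @start int
DECLARE @end int

SELECT @end = MAX(MaxId)
FROM dbo.ProcessedRanges

-- the range starts right after the previously stored value, or at 1 if there is none
SELECT @start = IsNull(MAX(MaxId), 0) + 1
FROM dbo.ProcessedRanges
WHERE MaxId < @end

-- deleting the data within the interval
DELETE FROM dbo.TestTable
WHERE id BETWEEN @start AND @end
```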

If this seems trivial, let’s spice things up! As I observed, few people know that it’s possible to delete data from a given table through an inline table-valued function. Huh, what’s that?! If it seems impossible to you, take a look at the documentation! I’m referring here to the GRANT object permissions section, in which the statement of interest is the one listing the ANSI-92 permissions applicable to the various database objects. For table-valued functions the following permissions are listed: DELETE, INSERT, REFERENCES, SELECT, UPDATE. In order to exemplify the idea, let’s create the UDF:
-- creating the UDF function 
CREATE FUNCTION dbo.TestUDF1( 
  @start int 
, @end int) 
RETURNS TABLE 
AS RETURN ( 
    SELECT id, value 
    FROM dbo.TestTable 
    WHERE id BETWEEN @start AND @end 
) 
GO 

-- deleting the data 
DELETE FROM dbo.TestUDF1(20001, 30000) 

The example is supposed to work, unless you have a problem with the permissions. Let’s consider this functionality from the perspective of our problem! What LuborK does is to include the inner query in a view and then delete the data:
-- creating the view 
CREATE VIEW dbo.vTestView 
AS 
SELECT top(10000) id 
FROM dbo.TestTable 
ORDER BY id 
GO 

-- deleting the data 
DELETE FROM dbo.vTestView 

One of the problems with this approach is that the number of records to be returned is hardcoded. Here an inline table-valued function, also referred to as a “parameterized view”, comes to help:

-- creating the UDF function 
CREATE FUNCTION dbo.TestUDF2( 
  @number_records int) 
RETURNS TABLE 
AS RETURN ( 
    SELECT top (@number_records) id, value 
    FROM dbo.TestTable 
    ORDER BY id 
) 
GO 

-- deleting the data 
DELETE FROM dbo.TestUDF2(10001) 

And, to close, the example based on the common table expression:

-- deletion using a CTE 
;WITH CTE 
AS ( 
     SELECT top(10000) id 
     FROM dbo.TestTable 
     ORDER BY id 
) 
DELETE FROM CTE 

You can review the record count after each operation:

-- review record count
SELECT count(*) 
FROM dbo.TestTable;

The queries also work in SQL databases in Microsoft Fabric (see file in GitHub repository). You might want to use another schema (e.g. Test) so as not to interfere with the existing code.

Happy coding!

30 January 2011

💠🛠️ SQL Server: Learning Resources

During the past weeks I found several interesting learning resources on SQL Server 2008 topics, so here they are:

For those interested in upgrading their knowledge to SQL Server 2008 R2, check the Microsoft SQL Server 2008 R2 Update for Developers Training Course videos on Channel9. They cover topics like Location Awareness, CLR Integration, StreamInsight, Reporting Services, Application and Multi-Server Management, improvements on BLOB, T-SQL and tools for Excel 2010 or SharePoint 2010.

The ER diagrams of SQL Server’s DMVs and DMFs are available for download as PDF at SQL Server 2008 Systems Views Map, and the same for SQL Server 2005 and SQL Server 2008 R2. Even if not all objects and possible relations are shown, they can still save you a lot of effort, as the diagrams depict the relations between the most important entities and the cardinality existing between them, plus the additional constraints participating in the joins. The entities represented: Objects, Types and Indexes, Linked Servers, CLR, Database Mirroring, Service Broker, Resource Governor, Transactions, Databases and Storage, Traces and Events, Execution Environment, Server information.

There are several free SQL Server books on SQLServerCentral.com coming from Redgate: A. Kuznetsov’s Defensive Database Programming, B. McGehee’s SQL Server Maintenance Plans, How to become an Exceptional DBA, Mastering SQL Server Profiler and Brad’s Sure Guide to SQL Server 2008, J. Magnabosco’s Protecting SQL Server Data, G. Fritchey’s SQL Server Execution Plans and SQL Server Crib Sheet Compendium (coauthor with A. Prasanna), R. Landrum’s SQL Server Tacklebox, plus several volumes of ‘Best of SQL Server Central’ authors in Vol. 1-4, Vol. 5, Vol. 6 and Vol. 7, and several other books to come.

It seems there are a few new books also on Scribd website: R. Colledge’s SQL Server Administration in Action, R. Vieira’s Professional Microsoft SQL Server 2008 Programming, M. Lee and M. Mansfield’s SQL Server 2008 Administration Instant Reference, plus many other books that can be retrieved by searching SQL Server 2008 on the respective site.

There are several papers coming from Microsoft and several professionals that are worth checking:
-   K. Lassen’s paper on Using Management Data Warehouse for Performance Monitoring, in which he provides several best practices for performance management of SQL Server 2008. Besides the introduction to the Management Data Warehouse features, he also provides an extension of the built-in functionality for indexes, plus several functions and queries.
-  D. Kiely's paper on SQL Server 2008 R2 Security Overview for Database Administrators covers some of the most important security features in SQL Server 2008. There is a similar paper targeting SQL Server 2005.
-  B. Beauchemin’s paper on SQL Server 2005 Security Best Practices - Operational and Administrative Tasks describes the best practices for setting up and maintaining security in SQL Server 2005.
-  SQL Server 2008 Compliance Guide, a paper written to help organizations and individuals understand how to use the features of the Microsoft® SQL Server® 2008 database software to address their compliance needs.
-  SQL Server 2008 Upgrade Technical Reference Guide, covering the essential phases, steps and best practices to upgrade existing instances of SQL Server 2000 and 2005 to SQL Server 2008.
-  Technet Wiki Checklists on Database Engine Security, Database Engine Connections and Data Access (probably more to come).
-   Also Microsoft maintains a list of White Papers on SQL Server 2008, respectively SQL Server 2005.

07 January 2011

💎🏭SQL Reloaded: Pulling the Strings of SQL Server IV (Spaces, Trimming, Length and Comparisons)

In the previous post on concatenation, I talked about the importance of spaces and other delimiters in making the output of concatenations more “readable”. Besides their importance in natural language, spaces have further implications for the way strings are stored and processed. As remarked in the introductory post on this topic, two types of spaces stand out from the crowd: the trailing spaces, found at the right end of a string, and the leading spaces, found at the left end of a string.

There are few cases in which trailing spaces are of any use, therefore databases like SQL Server usually ignore them. The philosophy about leading spaces is slightly different, because there are cases in which they are used to align text to the right, although some tools cut off the leading spaces as well. When no such tools are available, or when either of the two types of spaces is not cut off, we’ll have to do it ourselves, and here we come to the first topic of this post, trimming.


Trimming is the operation of removing the spaces found at the ends of a string. Unlike other programming languages, which use a single function for this purpose (e.g. the Trim function in VB or Oracle), SQL Server traditionally makes use of two functions: LTrim, which trims the spaces found at the left end of the string, and RTrim, which trims the spaces found at the right end of the string.

-- trimming a string 
SELECT LTrim(' this is a string ') Example1 --left trimming 
, RTrim(' this is a string ') Example2 --right trimming 
, LTrim(RTrim(' this is a string ')) Example3 --left & right trimming 

As can be seen, it’s not so easy to identify the differences in the output; maybe the next function will help to see that there actually is a difference.
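A simple way to make the remaining spaces visible is to wrap each result in delimiters. The query below is only a sketch of this idea; the square brackets and the column aliases are chosen here for illustration:

```sql
-- wrapping the results in brackets to make the remaining spaces visible
SELECT '[' + LTrim(' this is a string ') + ']' Example1 -- [this is a string ]
, '[' + RTrim(' this is a string ') + ']' Example2 -- [ this is a string]
, '[' + LTrim(RTrim(' this is a string ')) + ']' Example3 -- [this is a string]
```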

1) If it looks like the two trimming functions are not working with strings having leading or trailing spaces, then maybe you are not dealing with spaces but with other characters like CR, LF, CRLF or similar, which are sometimes rendered like a space.
2) SQL Server 2017 introduced the Trim function, which not only replaces the combined use of the LTrim and RTrim functions, but also allows removing other specified characters (including CR, LF, Tab) from the start or end of a string. (see post
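The two notes can be illustrated with a short sketch, assuming SQL Server 2017 or later. The variables @value and @chars are hypothetical names introduced only for this example; the string starts with a Tab and ends with CRLF, so LTrim and RTrim leave it untouched, while Trim with an explicit character list cleans both ends:

```sql
-- removing Tab, CR, LF and spaces from both ends (SQL Server 2017+)
DECLARE @value varchar(50) = CHAR(9) + ' this is a string ' + CHAR(13) + CHAR(10);
DECLARE @chars varchar(10) = CHAR(13) + CHAR(10) + CHAR(9) + ' ';

SELECT Len(@value) Length1 -- 21
, Len(LTrim(RTrim(@value))) Length2 -- 21 (the string neither starts nor ends with a space)
, '[' + Trim(@chars FROM @value) + ']' Example3 -- [this is a string]
```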


Before approaching other operations with strings, it’s useful (actually necessary, as we will see) to get a glimpse of how we can determine the length of a string value, in other words how many characters it has. This is possible by using the Len function:

-- length of a string 
SELECT Len('this is a string') Length1 --simple string 
, Len('this is a string ') Length2 --ending in space 
, Len(' this is a string') Length3 --starting with a space 
, Len(' this is a string ') Length4 --starting & ending with a space 
, Len(LTrim(' this is a string ')) Length5 --length & left trimming 
, Len(RTrim(' this is a string ')) Length6 --length & right trimming 
, Len(LTrim(RTrim(' this is a string '))) Length7 --length, left & right trimming 

In order to understand the above results, one observation is necessary: if a string ends with one or more spaces, the Len function ignores them. This doesn’t happen with leading spaces, which need to be removed explicitly if needed.
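When the trailing spaces do need to be counted, a possible approach is the DataLength function, which returns the number of bytes used rather than the number of characters. The sketch below assumes plain ASCII content, for which varchar uses one byte and nvarchar two bytes per character; appending a non-space character before calling Len is a further workaround:

```sql
-- Len ignores trailing spaces, DataLength counts the bytes used
SELECT Len('this is a string ') Length1 -- 16
, DataLength('this is a string ') Length2 -- 17 (varchar, 1 byte per character)
, DataLength(N'this is a string ') Length3 -- 34 (nvarchar, 2 bytes per character)
, Len('this is a string ' + '.') - 1 Length4 -- 17 (trailing space protected by the dot)
```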


The comparison operation highlights the differences or similarities existing between two values, involving at minimum two expressions that reduce at runtime to a data type, plus the comparison operator. This means that each member of the comparison could include any valid combination of functions, as long as they reduce to compatible data types. As far as the comparison of strings is concerned, things are relatively simple, the comparison being allowed independently of whether the strings have fixed or varying length. Relatively simple, because if we had to go into details, we’d need to talk about character sets (also called character encodings or character maps) and other string goodies the ANSI SQL standard(s) come with, including a set of rules that dictate the behavior of comparisons. So, let’s keep things as simple as possible. As per the above attempt at a definition, a comparison typically tests for equality, respectively difference, based on the equal (“=”), respectively not equal (“<>” or “!=”) operators. Here are some simple examples:

-- sample comparisons 
SELECT CASE WHEN 'abc' != 'abc ' THEN 1 ELSE 0 END Example1 
, CASE WHEN ' abc' != 'abc' THEN 1 ELSE 0 END Example2 
, CASE WHEN ' ' != '' THEN 1 ELSE 0 END Example3 
-- error comparison 
, CASE WHEN 'abc' != NULL THEN 1 ELSE 0 END Example4 
, CASE WHEN 'abc' = NULL THEN 1 ELSE 0 END Example5 
-- adequate NULL comparison 
, CASE WHEN 'abc' IS NOT NULL THEN 1 ELSE 0 END Example6 
, CASE WHEN 'abc' IS NULL THEN 1 ELSE 0 END Example7 

Example1 Example2 Example3 Example4 Example5 Example6 Example7
0 1 0 0 0 1 0

The first three examples demonstrate again the behavior of leading and trailing spaces: trailing spaces are ignored in comparisons, while leading spaces are not. The next two examples, even if they seem quite logical in terms of natural language semantics, are wrong from the point of view of SQL semantics, because a comparison in which one of the values is NULL evaluates to NULL (unknown) rather than to true or false, hence the above behavior in which the expressions from the 4th and 5th examples both fall through to the ELSE branch. The last two examples show how NULLs should be handled in comparisons with the help of the IS operator, respectively its negation, IS NOT.
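The same logic applies when both members of a comparison can be NULL. The following sketch assumes the default SET ANSI_NULLS ON setting; the variables @a and @b are hypothetical names used only for illustration. Note that the IsNull variant treats a NULL and an empty string as equal, which may or may not be what is wanted:

```sql
-- NULL-aware comparison of two variables (assumes the default SET ANSI_NULLS ON)
DECLARE @a varchar(10) = NULL;
DECLARE @b varchar(10) = NULL;

SELECT CASE WHEN @a = @b THEN 1 ELSE 0 END Example1 -- 0
, CASE WHEN @a = @b OR (@a IS NULL AND @b IS NULL) THEN 1 ELSE 0 END Example2 -- 1
, CASE WHEN IsNull(@a, '') = IsNull(@b, '') THEN 1 ELSE 0 END Example3 -- 1
```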

Like in the case of numeric values, the comparison between two strings can be expressed by using the “less than” (“<”) and “greater than” (“>”) operators, alone or in combination with the equality operator (“<=”, “>=”) or the negation operator (“!>”, “!<”) (see comparison operators in MSDN). Typically a SQL Server database is case insensitive, so there will be no difference between the following strings: “ABC”, “abc”, “Abc”, etc. Here are some examples:

-- sample comparisons (case sensitivity) 
SELECT CASE WHEN 'abc' < 'ABC' THEN 1 ELSE 0 END Example1 
, CASE WHEN 'abc' > 'abc' THEN 1 ELSE 0 END Example2 
, CASE WHEN 'abc' >= 'abc ' THEN 1 ELSE 0 END Example3 
, CASE WHEN 'abc' <> 'ABC' THEN 1 ELSE 0 END Example4 
, CASE WHEN 'abc' > '' THEN 1 ELSE 0 END Example5 
, CASE WHEN ' ' > '' THEN 1 ELSE 0 END Example6 
Example1 Example2 Example3 Example4 Example5 Example6
0 0 1 0 1 0

The case sensitivity is determined by the collation, which can be set at column, database or server level. As we don’t deal with a table and we don’t want to complicate the queries too much, let’s consider changing the sensitivity at database level. So if you are using a non-production database, try the following script in order to enable, respectively to disable the case sensitivity:

--enabling case sensitivity for a database 
ALTER DATABASE <database name>  
COLLATE Latin1_General_CS_AS  

--disabling case sensitivity for a database 
ALTER DATABASE <database name> 
COLLATE Latin1_General_CI_AS 

In order to test the behavior of case sensitivity, first enable the sensitivity and then rerun the previous set of examples (involving case sensitivity):

Example1 Example2 Example3 Example4 Example5 Example6
1 0 1 1 1 0

After that you can disable the case sensitivity again by running the last script. Please note that if your database uses another collation, you’ll have to change the scripts accordingly so that they point to your database’s collation.
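When changing the database collation is not an option, a possible alternative is to apply the COLLATE clause directly to one of the members of a comparison. This is a sketch outside the scripts above, reusing the same two Latin1_General collations; the column aliases are illustrative:

```sql
-- case sensitivity at expression level, without altering the database
SELECT CASE WHEN 'abc' = 'ABC' COLLATE Latin1_General_CI_AS THEN 1 ELSE 0 END Example1 -- 1
, CASE WHEN 'abc' = 'ABC' COLLATE Latin1_General_CS_AS THEN 1 ELSE 0 END Example2 -- 0
, CASE WHEN 'abc' < 'ABC' COLLATE Latin1_General_CS_AS THEN 1 ELSE 0 END Example3 -- 1
```

The collation specified explicitly for one member applies to the whole comparison, so the database default stays untouched.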

The queries also work in SQL databases in Microsoft Fabric.

Happy coding!
