04 August 2011

📈Graphical Representation: Histogram (Definitions)

"A graph showing a variable's discrete values or ranges of values on the x-axis and counts or percentages on the y-axis. The number of observations for each value or range is presented as a vertical rectangle whose length is proportionate to the number of observations." (Glenn J Myatt, "Making Sense of Data: A Practical Guide to Exploratory Data Analysis and Data Mining", 2006)

"A graphical display of the frequency distribution of a set of data. Histograms display the shape, dispersion, and central tendency of the distribution of a data set." (Clyde M Creveling, "Six Sigma for Technical Processes: An Overview for R Executives, Technical Leaders, and Engineering Managers", 2006)

"A chart that shows quantities of data points that occur within various numeric ranges." (DAMA International, "The DAMA Dictionary of Data Management", 2011)

"A special form of bar chart used to describe the central tendency, dispersion, and shape of a statistical distribution." (For Dummies, "PMP Certification All-in-One For Dummies, 2nd Ed.", 2013)

[height-balanced histogram:] "A histogram in which column values are divided into buckets so that each bucket contains approximately the same number of rows." (Oracle, "Database SQL Tuning Guide Glossary", 2013)

[top frequency histogram:] "A variation of a frequency histogram that ignores nonpopular values that are statistically insignificant, thus producing a better histogram for highly popular values." (Oracle, "Database SQL Tuning Guide Glossary", 2013)

[hybrid histogram:] "An enhanced height-based histogram that stores the exact frequency of each endpoint in the sample, and ensures that a value is never stored in multiple buckets." (Oracle, "Database SQL Tuning Guide Glossary", 2013)

[frequency histogram:] "A type of histogram in which each distinct column value corresponds to a single bucket. An analogy is sorting coins: all pennies go in bucket 1, all nickels go in bucket 2, and so on." (Oracle, "Database SQL Tuning Guide Glossary", 2013)

"A chart that shows a frequency distribution." (E C Nelson & Stephen L Nelson, "Excel Data Analysis For Dummies ", 2015)

"A bar chart that shows the graphical representation of numerical data." (Project Management Institute, "A Guide to the Project Management Body of Knowledge (PMBOK Guide )", 2017)

16 July 2011

🛠️SQL Server Administration: Could not load file or assembly Microsoft.MSXML…

I’m not programming anymore as I used to do, though from time to time I still fancy some .Net programming. It’s not much, small applications or CLR-based libraries for SQL Server. Quite often, when I return to programming after a long pause it happens that I run into problems, finding that something that was working previously stopped working. During my last attempt I couldn’t load anymore one of the projects I worked on, receiving the following error:

“Could not load file or assembly ‘Microsoft.MSXML, Version=8.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a’ or one of its dependencies. The system cannot find the file specified.”

Same happened when I tried to load other projects. I looked then into GAC in “C:\Windows\assembly” folder and saw no reference to Microsoft.MSXML dll. So I tried to install the msxml6.dll assembly in GAC however, never doing that, I run into another problem. In the meantime I tried to install the Visual Studio 2010 SP1, the MSXML 6.0 and even the Windows .Net SDK. All this without success. After several good hours, I returned to one of the forum posts (here) I run into in a first place. Thomas Sun was pointing that it might be a problem with the Microsoft Document Explorer 20xx. The solution was to reinstall it from  “C:\Program Files\Common Files\microsoft shared\Help 9\Microsoft Document Explorer 2008”. Once I did that everything was back to normal. At least until I’ll run into another issue.

After all this there is still one positive point: I managed to install SP1 and all the goodies it comes with, and I’m thinking here at the support for HMTL5. The downside – several good hours of lost time! I don’t want to think how much time I lost until now trying to solve things that were supposed to work in a first place – probably weeks, months…  That’s part of programmers’ life.

Disclaimer:
As Microsoft changed the whole structure of their support websites, most of the resources become unavailable. Therefore I had to remove the links pointing to the various sources.


18 June 2011

💎SQL Reloaded: Pulling the Strings of SQL Server X (Dynamic Queries)

A dynamic query is a query constructed at runtime, techniques often indispensable in many situations that require a certain flexibility in query’s creation. The creation of a dynamic query is nothing but a set of operations with strings, many of the techniques mentioned before becoming handy. SQL Server provides two functions for the execution of dynamic queries, namely EXECUTE statements (or its shortened form EXEC) and sp_executesql stored procedure. Even if the later it’s more flexible allowing passing parameters from and to the caller and allows reusing executions plans (see Using sp_esecutesql), for the following examples will be used only EXEC. But before let’s look how a static could become dynamic. For this let’s consider the following query based on AdventureWorks database:
 
-- example 1 - simple query   
SELECT *  
FROM Person.Address  
WHERE AddressID = 1 

-- example 2 - query encapsulated in a string: 
EXEC ('SELECT * FROM Person.Address WHERE AddressID = 1') 

-- example 3 - query stored into a string variable      
EXEC ('SELECT * FROM Person.Address WHERE AddressID = 1') 

Supposing that the AddressID is considered as parameter we can write:

-- example 4 - static query     
DECLARE @AddressID int 
SET @AddressID = 1 
SELECT *  
FROM Person.Address  
WHERE AddressID = @AddressID  

-- example 5 - dynamic query  
DECLARE @sql varchar(100) 
DECLARE @AddressID int 
SET @AddressID = 1 
SET @sql = 'SELECT * FROM Person.Address WHERE AddressID = ' + CAST(@AddressID as varchar (10)) 
EXEC (@sql) 

Until here there is no important conceptual difference. What if is needed to pass multiple AddressIDs? We can create a parameter for which expected values, though that’s not a reasonable solution as the number of values can vary. A more elegant solution would be to create a list of values and provided as a string parameter and then concatenate the original query and the string parameter like below. We just need to accommodate the length of the string variable to the expected size of the list of value.
 
-- example 6 (dynamic query) 
DECLARE @sql varchar(100) 
DECLARE @AddressIDs varchar(50) -- supposed parameter 
SET @AddressIDs = '1, 2, 4, 5, 6, 10'  
SET @sql = 'SELECT * FROM Person.Address WHERE AddressID IN (' + @AddressIDs + ')' 
EXEC (@sql) 

There is actually a third solution. As in the previous post on list of values has been introduced the dbo.StringToTable function, the function can be used thus to transform the list in a table:
 
-- example 7 (list of values) 
DECLARE @AddressIDs varchar(50) -- supposed parameter 
SET @AddressIDs = '1,2,4,5,6,10'  
SELECT *  
FROM Person.Address  
WHERE AddressID IN ( 
      SELECT value  
      FROM dbo.StringToTable(@AddressIDs, ',')) 

In the same post was constructed the DoubleList list of values which can be used in a dynamic query in bulk inserts or table-value constructors. The list needs to be slightly modified by replacing the single quote with two single quotes in order to accommodate value’s storage in a string. Considering that there are no integrity constraints on the targeted table, he query for bulk insert can be written as follows:
 
-- example 8 (list of values & bulk insert) 
DECLARE @sql varchar(200) 
DECLARE @AddressTypes varchar(150)  
SET @AddressTypes = '(6,''Archive''), (1,''Billing''), (2,''Home''), (3,''Main Office''), (4,''Primary''), (5,''Shipping'')' 
SET @sql = 'INSERT Person.AddressType (AddressTypeID, Name) VALUES ' + @AddressTypes  
EXEC (@sql) 

  The same technique can be used with a table-value constructor:
 
-- example 9 (list of values & table-value constructor) 
DECLARE @sql varchar(400) 
DECLARE @AddressTypes varchar(150)  
SET @AddressTypes = '(6,''Archive''), (1,''Billing''), (2,''Home''), (3,''Main Office''), (4,''Primary''), (5,''Shipping'')' 
SET @sql = 'SELECT VA.AddressID, VA.AddressTypeID, AT.NAME FROM Purchasing.VendorAddress VA JOIN ( VALUES ' + @AddressTypes + ') AS AT(AddressTypeID, Name) ON VA.AddressTypeID = AT.AddressTypeID' 
EXEC (@sql) 

The above examples are basic, in daily problems such queries can involve multiple parameters and operations. In addition, in the last examples the concatenation step was left out.

 
Happy coding!

17 June 2011

💎SQL Reloaded: Pulling the Strings of SQL Server VII (List of Values)

Introduction

    Lists are one of the basic structures in Mathematics, the term referring to an (ordered) set of elements separated by comma, space or any other delimiter (e.g. “:”, “;”). The elements of a list can be numbers, words, functions, or any other type of objects. In the world of databases, a list is typically formed out of the values of a given column or a given record, however it could span also a combination of rows and records, is such cases two delimiters being needed – one for column and one for row. From here comes probably the denomination of list of values. In a more general accept a list of values could be regarded as a delimited/concatenated subset. Such lists are formed when needed to send the data between the layers of an application or applications, this type of encoding being quite natural. In fact, also the data in a database are stored in similar tabular delimited structure, more complex though.  

    An useful example in which the list of values are quite handy is the passing of multiple values within the parameter of stored procedure or function (see example). This supposes first building the list and then use the values in a dynamic build query (like in the before mentioned example) or by building a table on the fly. We can call the two operations composition, respectively decomposition of list of values.

Composition 

Composition, whether on vertical or horizontal is nothing but a concatenation in which the values alternate with one or more delimiters. Let’s reconsider the concatenation based on the values of a Person.AddressType AdventureWorks table. As the logic for concatenating for one or more attributes is the same, the below example concatenates a list based on a single attribute, namely AddressTypeID in SingleList, respectively two attributes, AddressTypeID and Name.

-- concatenation of values across a table 
;WITH CTE (AddressTypeID, Name, Ranking) 
AS (--preparing the data       
     SELECT AddressTypeID  
     , Name 
     , ROW_NUMBER () OVER(ORDER BY Name) Ranking 
     FROM Person.AddressType 
     -- WHERE ... 
) 
, DAT (SingleList, DoubleList, Ranking) 
AS ( -- concatenating the values 
     SELECT Cast(AddressTypeID as varchar(max)) SingleList 
     , Cast('('+ Cast(AddressTypeID as varchar(10)) + ',''' + Name + ''')' as varchar(max)) DoubleList 
     , Ranking 
     FROM CTE 
     WHERE Ranking = 1 
     UNION ALL 
     SELECT DAT.SingleList + ',' + Cast(CTE.AddressTypeID as varchar(20)) SingleList 
    , Cast(DAT.DoubleList + ', ('+ Cast(CTE.AddressTypeID as varchar(10)) + ',''' + CTE.Name + ''')' as varchar(max)) DoubleList 
    , CTE.Ranking  
     FROM CTE          
       JOIN DAT           
          ON CTE.Ranking = DAT.Ranking + 1       
)  

-- the lists 
SELECT SingleList 
, DoubleList 
FROM DAT 
WHERE Ranking = (SELECT MAX(Ranking) FROM DAT) 

 List of values - concatenation
 

   The second example is based on atypical delimiters, resembling to the structure built for a batch insert or table value constructor-based statement, and as we’ll see later, ideal to be used in a dynamically-built query

Decomposition

Decomposition follows the inverse path, though it’s much easier to exemplify. In fact it’s used the same technique introduced in the last example from the previous post belonging to the same cycle, Subparts of a String, in which a space was used as delimiter. Another example is the dbo.SplitList function which decomposes a string using a loop.


-- decomposition of a string to a table using CTE 
CREATE FUNCTION dbo.StringToTable( 
 @str varchar(500) 
,@Delimiter char(1)) 
RETURNS @Temp TABLE ( 
Id int NOT NULL 
,Value varchar(50)) 
AS 
BEGIN  
     ;WITH CTE (PrevString, Position, Word)  
     AS (  
     SELECT LTrim(RTrim( CASE  
           WHEN CharIndex(@Delimiter, @str)>;0 THEN Right(@str, Len(@str)-CharIndex(@Delimiter, @str))  
           ELSE ''  
      END)) PrevString  
     , 1 Position  
     , LTrim(RTrim(CASE  
           WHEN CharIndex(@Delimiter, @str)>0 THEN LEFT(@str, CharIndex(@Delimiter, @str)-1)  
           ELSE @str  
       END)) Word  
      UNION ALL  
      SELECT LTrim(RTrim(CASE  
            WHEN CharIndex(@Delimiter, PrevString)>0 THEN Right(PrevString, Len(PrevString)-CharIndex(@Delimiter, PrevString))  
             ELSE ''  
       END)) PrevString  
      , Position + 1 Position  
      , LTrim(RTrim(CASE  
           WHEN CharIndex(@Delimiter, PrevString)>0 THEN LEFT(PrevString, CharIndex(@Delimiter, PrevString)-1)  
          ELSE PrevString  
      END)) Word      FROM CTE  
     WHERE Len(PrevString)>0  
    )  
     INSERT @Temp(Id, Value) 
     SELECT Position  
     , Word      FROM CTE  
     OPTION (maxrecursion 100)  
     RETURN 
END    

Here are two examples based on the single list created above and another one based on alphabet:


-- decomposing a list
SELECT Id 
, value 
FROM dbo.StringToTable('6,1,2,3,4,5', ',')     


-- decomposing the "alphabet" 
SELECT Id 
, value 
FROM dbo.StringToTable('a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z', ',') 

List of values - simple decomposition    

List of values - alphabet decomposition
   
Even if the function deals only with a delimiter, it could be used to decompose lists involving multiple delimiters, as long the list is adequately built:


-- decomposing double list 
SELECT Id 
, value 
, Left(value, CHARINDEX(',', value)-1) LeftValue 
, Right(value, len(value)-CHARINDEX(',', value)) RightValue 
FROM dbo.StringToTable('6,Archive;1,Billing;2,Home;3,Main Office;4,Primary;5,Shipping', ';')     

List of values - double decomposition 

 The tables built thus from list of values can be further used in queries when needed to create a table on the fly. It would be interesting maybe to show that the composition and decomposition are inverse functions, however that’s out of scope, at least for current set of posts. 

13 June 2011

SQL Server: Procedure Cache (Definitions)

"An area of SQL Server memory used to hold trees and query plans that have recently been used or are currently in use by SQL Server." (Owen Williams, "MCSE TestPrep: SQL Server 6.5 Design and Implementation", 1998)

"A temporary storage location for the current, executing version of a specific stored procedure." (Microsoft Corporation, "SQL Server 7.0 System Administration Training Kit", 1999)

"The part of the SQL Server memory pool that is used to store execution plans for Transact-SQL batches, stored procedures, and triggers." (Anthony Sequeira & Brian Alderman, "The SQL Server 2000 Book", 2003)

"A shared, reserved memory area that stores the most recently executed SQL statements or PL/SQL procedures (including triggers and functions). Also called SQL cache." (Carlos Coronel et al, "Database Systems: Design, Implementation, and Management" 9th Ed., 2011)

"The part of the SQL Server memory pool that is used to store execution plans for Transact-SQL batches, stored procedures, and triggers." (Microsoft, "SQL Server 2012 Glossary", 2012)

30 April 2011

SQL Server: Full-Text Index (Definitions)

"The portion of a full-text catalog that stores all of the full-text words and their locations for a given table." (Microsoft Corporation, "SQL Server 7.0 System Administration Training Kit", 1999)

"A special index that efficiently tracks the words you're looking for in a table. They help in enabling special searching functions that differ from those used in regular indexes." (Thomas Moore, "EXAM CRAM™ 2: Designing and Implementing Databases with SQL Server 2000 Enterprise Edition", 2005)

"This is an index that can be queried for large amounts of text. This is maintained by the full-text search service." (Joseph L Jorden & Dandy Weyn, "MCTS Microsoft SQL Server 2005: Implementation and Maintenance Study Guide - Exam 70-431", 2006)

"A full-text index enables advanced text-based searches to be performed against a database table." (Michael Coles, "Pro T-SQL 2008 Programmer's Guide", 2008)

"Index that enables advanced text-based searches to be performed against a database table." (Miguel Cebollero et al, "Pro T-SQL Programmer’s Guide" 4th Ed., 2015)

23 April 2011

🪄SSRS: First Magic Class (Wizarding a Report)

Introduction

    One year back I started a set of posts on SSIS showing how to create a package with ‘SQL Server Import and Export Wizard’ using as source a SQL Server, respectively Oracle database, in a third post showing how to use the Data Flow Task. The “wizarding” thematic was based on the fact the respective posts were showing how to make use of the wizards built within SQL Server and Visual Studio in order to create a basic export package. Therefore the posts were targeting mainly beginners, my intention at that time was to use them as a starting point in showing various objects and techniques. I also intended to start a set of similar posts on SSRS, so here I am, in this first post showing how to use the Report Wizard to create a report in BIDS (SQL Server Business Intelligence Development Studio) using a SQL Server database.

    This short tutorial uses a query based on AdventureWorks2008 sample database and considers that your Reporting Services instance is installed on the same computer with the database. For simplicity, I focused on the minimal information required in order to built a simple report, following to cover some of themes in detail in other posts. The tutorial can be used together with the information provided in MSDN, see Creating a Report Using Report Wizard and Report Layout How-to Topics sections.

Step 1: Create the Query 
 
    Before creating a report of any type or platform, it’s recommended to create and stabilize the query on which the report is based. For simplification let’s use a query based on Sales.vIndividualCustomer view:

-- Customer Addresses 
SELECT SIC.Title  
, SIC.FirstName  
, SIC.LastName  
, SIC.AddressLine1  
, SIC.City  
, SIC.PostalCode  
, SIC.CountryRegionName  
, SIC.PhoneNumber  
, SIC.EmailAddress  
FROM Sales.vIndividualCustomer SIC 

Step 2: Create the Project 
 
   Launch the BIDS from Windows Menu, create a new Project (File/New/Project) by using the “Report Server Project Wizard” Template, give the Project an appropriate Name and other related information. 
 
SSRS Tutorial New Project
   This will open the Report Wizard, and unless you have chosen previously not to show this step, it will appear a tab in which are shown the step that will be performed:
- Select a data source from which to retrieve data
- Design a query to execute against the data source
- Choose the type of report you want to create
- Specify the basic layout of the report
- Specify the formatting for the report
- Select the report type, choose tabular

Step 3: Create/select the Data Source 
 
    Creating a Data Source pointing to the local server isn’t complicated at all, all you have to do is to give your data source a meaningful name (e.g. Adventure Works) and then define connection’s properties by selecting the “Server name” and database’s name, in this case AdventureWorks2008.

SSRS Tutorial Connection Properties 
    Test the connection, just to be sure that everything works. In the end your data source might look like this:
SSRS Tutorial Select Data Source
    You can define your data source as shared by clicking the “Make this a shared data source” checkbox, allowing you thus to reuse the respective data source between several projects.

Step 4: Provide the Query 
 
   Because the query for our report was created beforehand, is enough to copy it in “Query string” textbox. We could have used the “Query Builder” to built the query, though a tool like SSMS (SQL Server Management Studio) can be considered a better choice for query design. As pointed above, it’s recommended to built the query and stabilize its logic before starting the work on the actual report. SSRS Tutorial Design the Query
Step 5: Generate the Report
   
Creating a report supposes choosing a Report Type (Tabular vs. Matrix), the level at which the fields will be displayed (page, group or details), and the Table Style. For this tutorial choose a Tabular type and, as no grouping is needed, in “Design the Table” step choose all the Available fields and drag-and-drop them in “Details”, the bottommost list from “Displayed fields” section. 
 
SSRS Tutorial Design the Table
    In Choose the Table Style go with the first option (e.g. Slate), though it’s up to you which one of the styles you prefer.

Step 6: Deploy the Report 
 
    You can deploy the report if your report server is already configured, however in order to test the report you don’t have to go that far because you can test the report directly in BIDS. So you can go with the actual settings: 
 
SSRS Tutorial Choose the Deployment Location
    In the last step provide a meaningful name for the Report (e.g. Customer Addresses) and here is the report in Design mode: 
 
SSRS Tutorial Design
   By clicking on “Preview” tab you can see how the actual report will look like:
SSRS Tutorial Preview 
   Now I would recommend you to check what objects the wizard has created and what properties are used. Just play with the layout, observe the behavior and what the documentation says. There are plenty of tutorials on the web, so don't be afraid to search further.

09 April 2011

💎SQL Reloaded: Searching a Value within a Database

Some years back, in a Management meeting related to one of the projects I was working on, one of the Managers made a quite interesting statement, something like: "it should be easy to pool in an Excel sheet from the whole database all the information I am interested in". I kind of smiled, because the information he was requiring was spanning within different modules, having different levels of detail and thus requiring several different output datasets in order to answer a simple question like "how is the manufacturing cost varying over time?". After some meetings we figured out that it’s not so easy to pull the data for the respective question, lacking some data which were not available in the respective database. It’s funny how IS (Information Systems) are getting more complex day by day but fail to capture and indirectly answer simple questions, this also because they were not designed to address such questions. Anyway, this post is not about this.

Some days back I spent a couple of hours trying to identify the tables from which the IDs stored in a log where coming from, in the log being saved no reference to the table, just the IDs. Another example of "good" design! While working diligently, I remembered that I could have easily solved the problem if I were using one of the CLR functions created in a post on Data Profiling Using CLR Functions. Unfortunately, because I was working in a production environment with strong security constraints, I wasn’t allowed registering and use any of the respective functions. Still, considering the easiness and flexibility of a the solution I thought that would be interesting to put it into practice. Who knows, maybe someday I will find the opportunity to use it!

Let’s suppose I am searching for an integer ID value within a given schema of AdventureWorks database. So, in theory, I need to do a search on each integer column of all the tables from the respective schema. The tables can be found in a matter of seconds using the following query based on the dbo.ExecuteScalarToInt CLR function introduced in the previous mention post and dbo.vTableColumns view.
 
-- searching a value in whole database 
WITH CTE 
AS (  
    SELECT [Schema_Name]  
    , Table_Name  
    , Column_Name 
    , system_type 
    , dbo.ExecuteScalarToInt('SELECT count(*) FROM ' + Quotename([Schema_Name]) + '.' + Quotename(Table_Name) + ' WHERE ' + Quotename(Column_Name) + ' = ' + Cast(1001 as varchar(20))) NumberRecords 
    FROM dbo.vTableColumns 
    WHERE [Schema_Name]= 'Sales' 
     --AND Table_Name = 'SalesOrderDetail' 
     --AND Column_Name = 'SalesOrderDetailID' 
     AND system_type = 'int' ) 
SELECT * 
FROM CTE 
WHERE NumberRecords>0 
ORDER BY [Schema_Name] 
, Table_Name  

Output:
Schema_Name Table_Name Column_Name system_type NumberRecords
Sales ContactCeditCard CreditCardId int 1
Sales CreditCard CreditCardId int 1
Sales CurrencyRate CurrencyRateId int 1
Sales CustomerAddress AddressId int 1
Sales SalesOrderDetail SalesOrderDetailId int 1
Sales SalesOrderHeader BillToAddressId int 8
Sales SalesOrderHeader CreditCardId int 1
Sales SalesOrderHeader ShipToAddressId int 8

As can be seen the dynamically built query, based on table metadata and searched value, is provided as parameter to dbo.ExecuteScalarToInt function. The query calculates the number of occurrences (see NumberRecords column) of the same value within a given column. Some attention must be given to the search column’s data type, as the data type needs to be compatible with the searched value, otherwise a cast being necessary. I haven’t tried the query in a big database and I would advice you to be cautious when running it in a production environment. In case you arrive to use the query in a big database, please let me know how it behaved!

In case of the problem I had a few days back, the above query it’s only a first step in identifying the tables in which the values occur. Because multiple tables are returned, it’s needed to investigate which of the tables has the highest probability to be the source for the ID saved in log. Of course, some DM (data mining) algorithm could be probably used for this purpose, however some programming is needed and also some amount of diligent work to prove the data. It would be nice to have a tool that performs such analysis or that creates a complex query using any set of attributes within a given database.

💎SQL Reloaded: Table Value Constructors at Work

    Paradoxically, the first new SQL Server 2008 feature I learned, soon after the first CTP was made available, was the use of table value constructors, however I managed all this time without using it. This happened because I was either working on previous SQL Server versions or because other techniques were more appropriate to the problems I had to solve. So, a few days ago, the use of table value constructors came again on the table when I was weighting what’s the best approach to a problem posed by a colleague. Actually, the problem is quite simple – given a pair of values (id, new value) is needed to update a table with the “new value” corresponding to the given “id”. Previously, in similar cases, I preferred to load the data in a staging table because this was allowing me to quickly perform several checks on the set of values (e.g. check for duplicates, identify how many records will be updated). But what do you do when you are working directly in a production or any other type of environment in which you don’t have the possibility of using a staging table?

    For such cases you can either create a dataset on the fly with a set of UNIONS and reuse it in queries or, create an UPDATE statement for each pair of values. You can even automate the creation of UNION and UPDATE statements, however their use doesn’t seem so elegant, especially when you can use table value constructors. The feature seems to be hidden in documentation, as a matter of fact I found it only in the online documentation, the local help having no reference to it. Anyway, excepting a few posts on the same subject I can’t say I have seen any code until now making use of it. What’s interesting to note, before going into details, is that the ANSI SQL 94 (the earliest version of documentation I have) mentions vaguely table value constructors (p. 221), therefore I expect they are part of ANSI standard.

   According to MSDN a table value constructor allows a set of row value expressions to be constructed into a table within a single DML statement. As can be seen from the MSDN examples the list of values can contain a combination of list of values and queries, as long the data type matches and only a single scalar value is provided as a row value expression Another limitation concerns the maximum number of rows constructed, the maximum value being 1000. It’s important to know these limitations, because they might force you use other approaches. Before sketching a solution for this post’s problem, let’s look how a table can be constructed on the fly given a list of pair-values:

-- example table value constructor 
SELECT EmployeeId 
, VacationHours 
FROM ( VALUES (101, '23') 
     , (102, '82') 
     , (103, '66') 
     , (104, '17') 
      , (105, '64') 
      , (106, '56') 
      , (107, '107') 
      , (108, '50') 
      , (109, '109') 
      , (110, '48')) AS Data(EmployeeId, VacationHours) 

    The table thus constructed can be further used in an UPDATE statement, the following example being based on HumanResources.Employee from AdventureWorks database.

-- example table value constructor update UPDATE HumanResources.Employee 
SET VacationHours = Data.VacationHours 
FROM ( VALUES (101, '23') 
      , (102, '82') 
      , (103, '66') 
      , (104, '17') 
      , (105, '64') 
      , (106, '56') 
      , (107, '107') 
      , (108, '50') 
      , (109, '109') 
      , (110, '48')) AS Data(EmployeeId, VacationHours) 
WHERE HumanResources.Employee.EmployeeId = Data.EmployeeId 

    A similar construct can be be used to perform a SELECT, UPDATE or DELETE in which multiple columns participate in the JOIN. For example let’s consider a SELECT based on a set of pairs of values from the Sales.SalesOrderHeader table:

-- example table value constructor
SELECT SOH.* 
FROM Sales.SalesOrderHeader SOH 
     JOIN ( -- set of pairs 
        VALUES(10, 5069) 
         , (10, 6070) 
         , (10, 7060) 
         , (11, 407) 
         , (11, 1603) 
         , (11, 2737)) AS Data(CustomerId, CurrencyRateID) 
      ON SOH.CustomerId = Data.CustomerID 
    AND SOH.CurrencyRateID = Data.CurrencyRateID 

    As can be seen table value constructors are pretty simple to use, especially when dealing with a small number of values. If they are the best choice, that depends on the problem you are trying to solve and its specific constraints!

04 April 2011

SQL Server: Server Role (Definitions)

"A named set of security accounts. A SQL Server role can contain Windows NT users, Windows NT groups, SQL Server users, or other SQL Server roles from the same database." (Microsoft Corporation, "SQL Server 7.0 System Administration Training Kit", 1999)

"A predefined role defined at the server level and existing outside individual databases." (Microsoft Corporation, "SQL Server 7.0 System Administration Training Kit", 1999)

"A predefined role that exists at the server level. The scope of the role is limited to the SQL Server instance in which it is defined. Fixed server roles allow you to quickly assign permissions on the server to specific individuals. Roles can be compared to groups in the Window NT/2000 operating systems." (Anthony Sequeira & Brian Alderman, "The SQL Server 2000 Book", 2003)

"A predefined role at the server level that is used to control administrative access to the SQL Server instance." (Marilyn Miller-White et al, "MCITP Administrator: Microsoft® SQL Server™ 2005 Optimization and Maintenance 70-444", 2007)

"A predefined role that exists at the server level. The scope of the role is limited to the SQL Server instance in which it is defined." (Microsoft, "SQL Server 2012 Glossary", 2012)

"These are special groups in SQL Server that are used to limit the amount of administrative access that a user has once logged on to SQL Server." (Joseph L Jorden & Dandy Weyn, "MCTS Microsoft SQL Server 2005: Implementation and Maintenance Study Guide - Exam 70-431", 2006)

SQL Server: Fixed Database Role (Definitions)

"A predefined role defined at the database level and existing in each database." (Microsoft Corporation, "SQL Server 7.0 System Administration Training Kit", 1999)

"A predefined role that exists in each database. The scope of the role is limited to the database in which it is defined. You use fixed database roles to quickly define permissions within a particular database. Roles can be compared to groups in the Window NT/2000 operating systems." (Anthony Sequeira & Brian Alderman, "The SQL Server 2000 Book", 2003)

"These are special groups in each SQL Server database that already have permissions applied and are used to limit the permissions that a user has inside the database." (Joseph L Jorden & Dandy Weyn, "MCTS Microsoft SQL Server 2005: Implementation and Maintenance Study Guide - Exam 70-431", 2006)

"A role that is created by SQL Server 2005 in each database with a set of permissions that are predetermined and cannot be altered. Users can be added to one or more fixed database roles." (Marilyn Miller-White et al, "MCITP Administrator: Microsoft® SQL Server™ 2005 Optimization and Maintenance 70-444", 2007)

"A predefined role that exists in each database. The scope of the role is limited to the database in which it is defined." (Microsoft, "SQL Server 2012 Glossary", 2012)

30 March 2011

SQL Server: Full-Text Catalog (Definitions)

"A catalog that stores a database's full-text index." (Microsoft Corporation, "SQL Server 7.0 System Administration Training Kit", 1999)

"The full-text catalog stores all of the full-text indexes for tables within a database." (Anthony Sequeira & Brian Alderman, "The SQL Server 2000 Book", 2003)

"A Full-Text catalog is a special storage space used to house Full-Text indexes. By default, all Full-Text indexes are housed in a single catalog." (Thomas Moore, "EXAM CRAM™ 2: Designing and Implementing Databases with SQL Server 2000 Enterprise Edition", 2005)

"A special storage space used to house Full-Text indexes. By default, all Full-Text indexes are housed in a single catalog." (Thomas Moore, "MCTS 70-431: Implementing and Maintaining Microsoft SQL Server 2005", 2006)

"A full-text catalog is a logical grouping of SQL Server full-text indexes for management purposes." (Michael Coles, "Pro T-SQL 2008 Programmer's Guide", 2008)

"A collection of full-text index components and other files that are organized in a specific directory structure and contain the data that is needed to perform queries." (Microsoft, "SQL Server 2012 Glossary", 2012)

"A full-text catalog is a logical grouping of SQL Server full-text indexes for management purposes." (Jay Natarajan et al, "Pro T-SQL 2012 Programmer's Guide" 3rd Ed., 2012)

"A logical grouping of SQL Server full-text indexes for management purposes." (Miguel Cebollero et al, "Pro T-SQL Programmer’s Guide" 4th Ed., 2015)

SQL Server: Filegroup (Definitions)

"A named collection of one or more database files that forms a single unit of allocation and administration. Filegroups enable the creation of objects in a specific place - for example, placing a heavily accessed table on a very fast drive. They also provide the ability to back up specific objects." (Microsoft Corporation, "SQL Server 7.0 System Administration Training Kit", 1999)

"A named collection of one or more files that forms a single unit of allocation or for administration of a database. Filegroups allows you to engage in advanced administration techniques including the specific storage of particular objects on particular hard drives." (Anthony Sequeira & Brian Alderman, "The SQL Server 2000 Book", 2003)

"A named collection of one or more disk files that represent a single allocation on or for administration of a database. A database can have more than one filegroup." (Allan Hirt et al, "Microsoft SQL Server 2000 High Availability", 2004)

"In SQL Server, a named collection of one or more files that forms a single unit of allocation. Also for administration of a database." (Thomas Moore, "EXAM CRAM™ 2: Designing and Implementing Databases with SQL Server 2000 Enterprise Edition", 2005)

"This is a logical grouping of files used to segregate database objects for storage and performance reasons." (Joseph L Jorden & Dandy Weyn, "MCTS Microsoft SQL Server 2005: Implementation and Maintenance Study Guide - Exam 70-431", 2006)

"A logical means of controlling the placement of database objects on a file or set of files." (Marilyn Miller-White et al, "MCITP Administrator: Microsoft® SQL Server 2005 Optimization and Maintenance 70-444", 2007)

"A method of optimizing the performance of a database by controlling the placement of database files and database objects. By default, all data and database objects are placed into a single file, in a single filegroup. Unlike files, which can be viewed on the disk, a filegroup is conceptual." (Darril Gibson, "MCITP SQL Server 2005 Database Developer All-in-One Exam Guide", 2008)

"Collections of SQL Server data files. For performance and administrative reasons, you can place user objects into dedicated filegroups." (Robert D Schneider & Darril Gibson, "Microsoft SQL Server 2008 All-in-One Desk Reference For Dummies", 2008)

"A named collection of one or more data files that forms a single unit of data allocation or for administration of a database." (Microsoft, "SQL Server 2012 Glossary", 2012)

25 March 2011

🧭Business Intelligence: Troubleshooting (Part II: Approaching a Query)

Business Intelligence Series
Business Intelligence Series

Introduction

You received a (long) query for troubleshooting, reviewing, conversion or any similar tasks. In addition, you don’t know much about the underlying table structure or business logic. So, what do you do then? For sure two things are intuitively clear: you don’t need to panic and, understanding the query may help you in your task. Understanding the query, it seems such a simple statement, though there is more to it. Here are some points on how to approach a query.

State your problem
 
“A problem well stated is a problem half solved” (Charles F. Kettering). Before performing any work, check what’s requested from you, whether you are having the information required for the task(s) ahead, for example documentation, valid examples, all code, etc. If something is missing, don’t hesitate to request all the information you need. While waiting for information, you can continue with next steps. As we don’t live in a perfect world, there will be also cases in which you’ll have to fill the gaps by yourself by performing additional research/work. When troubleshooting is important to understand what’s wrong and, when possible, have data against which to compare query’s output.

Save the work

Even if you are having a copy of the query somewhere on the server, save the previous version of the query and, when possible, use versioning. It might seem a redundant task, however the fact is that you never know when you need to refer to it and, as you’ll see next, it can/should be used as a baseline for validating the changes. In case you haven’t saved the query, check whether your RDBMS is tracking metadata about the queries run and, if the metadata were not reset in the meantime, you might be lucky enough to find a copy of your query.

I found that is important to save the daily work, the various analysis performed in order to understand a query, the various versions and even the data used for testing. All this work could help you letter to review what you made, the steps you missed, you can reuse one of the queries for further work, etc.

Break down

When the query is too complex, it could be useful to break the query into chunks that could be run and understood in isolation. Typically such chunks derive from query’s structure (e.g. inline queries, subqueries derived from unions). I found that often, focusing only a chunk of a query help isolating issues.

Restructure

Many programmers still write queries using the old non-ANSI joining syntax in which the join constraints appear in the WHERE clause, making the understanding and troubleshooting of a query more difficult. Often I found myself in the position of transforming first a query to ANSI SQL syntax, before performing further work on it. It’s actually a good occasion to gain a first understanding of query’s structure, but I’d prefer not to do it so often. In addition, during restructure phase it makes sense to differentiate between the join and filter constraints, this helping isolating the issue(s).

Check cardinalities

Wrong join constraints lead to duplicates or fewer records than expected, such differences being difficult to track when the variances in the numbers of records are quite small. Even if RDBMS come in developers’ help by providing metadata about the join relations, the columns and predicates participating in a join are not always so easy to identify. Therefore, in order to address this issue, it’s needed to check the constraints between any two tables between participating in a join. Sometimes, when the query is based on the table with the lowest level of detail, it can be enough to check the variations of the number of records.

Check filter constraints

Filter constraints are maybe more difficult to identify, especially when is needed to reengineer the logic built in applications. Many of the filter constraints are logical, though when you have no documentation about the schemas, is like rambling in the dark, having to check real examples and identify the various values and the impact they have on the behavior of your report.

Validate changes
 
So, you made the changes, everything looks perfect. Is it so? Often your intuition might tell you that the logic of a query is correct, though as software is not based on magic, at least not all the time, check some of the records to assure that the data are rendered as expected, check totals, compare the current with previous version, identify variations, etc. You don’t need to use all the technique you know, but to choose the best and minimal set of tools that allows you to validate the query.

Perform refactoring
    
Refactoring, the way to (continuous) code improvement, should become part of each developer’s philosophy about programming. A query, as any other piece of code, is rarely perfect as technical and factual knowledge is relative, features get deprecated and new techniques are introduced. On the other side, there is an old saying in IT – don’t change something that’s already working, so, there should be kept a balance between the two – the apparent and needed for change.

Document
    
I hope it’s not the case to stress the importance of documentation. From versioning to logic description, it’s a good practice to document the important parts of your work, especially the parts that will facilitate later work.

SQL Server: Application Role (Definitions)

"A SQL Server role created to support the security needs of an application. Such a role is activated by a password and the use of the spsetapprole system stored procedure." (Microsoft Corporation, "SQL Server 7.0 System Administration Training Kit", 1999)

"A SQL Server role created to support the security needs of an application. Using application roles is an alternative to allowing users access to SQL Server 2000. You can create an application role and assign it to a particular application, allowing users who use the application to access SQL Server." (Anthony Sequeira & Brian Alderman, "The SQL Server 2000 Book", 2003)

"This is a special type of role that requires activation using the sp_setapprole stored procedure. This is primarily used to keep users from accessing a database with anything other than a custom application." (Joseph L Jorden & Dandy Weyn, "MCTS Microsoft SQL Server 2005: Implementation and Maintenance Study Guide - Exam 70-431", 2006)

"A SQL Server role used by the application, instead of the user, to authenticate against a database solution." (Marilyn Miller-White et al, "MCITP Administrator: Microsoft® SQL Server™ 2005 Optimization and Maintenance 70-444", 2007)

"A SQL Server role used by the application, instead of the user, to authenticate against a database solution." (Victor Isakov et al, "MCITP Administrator: Microsoft SQL Server 2005 Optimization and Maintenance (70-444) Study Guide", 2007)

"A SQL Server role created to support the security needs of an application." (Microsoft, "SQL Server 2012 Glossary", 2012)

Related Posts Plugin for WordPress, Blogger...

About Me

My photo
Koeln, NRW, Germany
IT Professional with more than 24 years experience in IT in the area of full life-cycle of Web/Desktop/Database Applications Development, Software Engineering, Consultancy, Data Management, Data Quality, Data Migrations, Reporting, ERP implementations & support, Team/Project/IT Management, etc.