24 March 2010

Knowledge Management: Definitions I (The Stored Procedure Case)

I just started to read the SQL Server Programmers Blog’s post on Stored Procedures  and I’m having a problem with the first sentence: "A stored procedure is a group of Transact-Sql statements compiled into a single execution plan". Not sure if the respective statement was supposed to define what a stored procedure is or highlight its most important characteristics; if it's the first choice then, if I’m not mistaking, also a user-defined function (UDF) (referring here especially to multi-statement UDFs) is a group of compiled T-SQL Statements for which is created an execution plan, and a user-defined function is not a stored procedure! In general I’m having a simple rule I’m trying to use when defining concepts - if in doubt with a definition for a given concept stick to the documentation or to the "classics" – definitions provided by well known professionals or sources. (SQL Server 2016 introduced natively compiled UDFs)

From the few books I roughly reviewed on SQL Server-related topics I liked P. Petrovic’s approach for defining the stored procedure, he introducing first the batch defined as “a sequence of Transact-SQL statements and procedural extensions” that “can be stored as a database object, as either a stored procedure or UDF [3]. Now even if I like the approach I’m having a problem on how he introduced the routine because he haven’t gave a proper definition and just mentioned that a routine can be either a stored procedure or UDF [3]. On the other side it’s not always necessary to introduce terms part of the common shared conceptual knowledge, though I find it useful and constructive if such a definition would have been given. 

A definition should be clear, accurate, comprehensive, simple and should avoid confusion, eventually by specifying the intrinsic/extrinsic characteristics that differentiate the defined object from other objects, and the context in which is used. Unfortunately there are also definitions given by professionals that don't meet this criteria,  for example A. Watt  defines a stored procedure as "a module of code that allows you to reuse a desired piece of functionality and call that functionality by name" [5], while R. Dewson defines it as “a collection of compiled T-SQL commands that are directly accessible by SQL Server” [2], R. Rankins et. al as “one or more SQL commands stored in a database as an executable object” [4] or D. Comingore et Al “stored collections of queries or logic used to fulfill a specific requirement” [1]. All definitions are valid and in spite similarities I find them incomplete because they could be used as well for defining an UDF.

      Of course, by reading the chapter or the whole book in which the definition is given, by comparing the concept with other concepts, people arrive to a more accurate form of the definition(s), though that’s not always efficient and constructive because the reader has to “fish” for all the direct-related concepts and highlight the similarities/differences between them (e.g. stored procedures vs. UDF vs. views vs. triggers). Usually for easier assimilation and recall I like to use Knowledge Mapping structures/techniques like Mind Maps or Concept Maps that allows seeing the relations (including similarities/differences) between concepts and even identify new associations between them. In addition, when learning concepts it matters also the form/sequence in which the concepts are presented (maybe that’s why many people prefer a PowerPoint presentation than reading a whole book).

Actually a definition could be built starting from the identified characteristics/properties of concepts and the similarities/differences with other concepts  For example Bill Inmon defines the data warehouse as “a subject-oriented, integrated, time-varying, non-volatile collection of data in support of the management's decision-making process” and even if we could philosophy also on this subject and the intrinsic characteristics of a data warehouse, it reflects Bill Inmon’s conception in a clear, simplistic and direct manner. I could attempt to define a stored procedure using the following considerations:
- it encapsulates T-SQL statements and procedural extensions (input/output parameters, variables, loops, etc.);
- it can be executed as a single statement and thus not reused in other DML or DDL statements;
- it’s a database object;
- caches, optimizes and reuses query execution plans;
- allows code modularization allowing thus code reuse, easier/centralized code maintenance, move business logic (including validation) on the backend;
- minimizes the risks of SQL injection attacks;
- can return multiple recordsets and parameterized calls reducing thus network traffic;
- enforce security by providing restricted access to tables;
- allows some degree of portability and standardized access (given the fact that many RDBMS feature stored procedures).
- allows specifying execution context;
- allows using output parameters, cursors, temporary tables, nested calls (procedure in procedure), create and execute dynamic queries, access to system resources, trap and customize errors, perform transaction-based operations.

Some of the mentioned characteristics apply also to other database objects or they are not essential to be mentioned in a general definition, thus giving a relatively accurate definition for stored procedures is not an easy task. Eventually I could focus on the first three points mentioned above, thus an approximate definition would reduce to the following formulation: a stored procedure is a database object that encapsulates T-SQL statements and procedural extensions, the object being executed exclusively within a single statement using its name and the eventual parameters”. This definition might not be the best, though it’s workable and could be evolved in case new knowledge is discovered or essential new functionality changes are introduced.

Note:
Unfortunately in several books I started to read recently on data warehousing topics I found similar incomplete/vague definitions that, from my point of view, are not adequate given the complexity of the subjects exposed, letting thus lot of place for divagations. Of course, in such technical books the weight is more on the use of exposed concepts rather on concepts’ definition, though I’m expecting more from this type of books!

Disclaimer:

I’m not trying to denigrate the impressive work of other professionals, and I know that I’m making even more mistakes than other people do, I’m just trying to point out a fact I remarked and I consider as important: trying to give a proper accurate definition of the terms introduced in a book or any other form of communication.

References:
[1] Comingore D.,  Hinson D. (2006). Professional SQL Server™ 2005 CLR Programming. Wiley Publishing. ISBN: 978-0-470-05403-1.
[2] Dewson R. (2008). Beginning SQL Server 2008 for Developers: From Novice to Professional. Apress. ISBN: 978-1-4302-0584-5
[3] Petkovic D. (2008). Microsoft® SQL Server™ 2008: A Beginner’s Guide. McGraw-Hill. ISBN: 0-07-154639-1
[4] Rankins R.,  Bertucci P., Gallelli C., Silverstein A.T,  (2007) Microsoft® SQL Server 2005 Unleashed. Sams Publishing. ISBN: 0-672-32824-0
[5] Watt A. (2006). Microsoft SQL Server 2005 for Dummies. Wiley Publishing. ISBN: 978-0-7645-7755-0.
[6] Inmon W.H. (2005) Building the Data Warehouse, 4th Ed. Wiley Publishing. ISBN: 978-0-7645-9944-6

No comments:

Related Posts Plugin for WordPress, Blogger...

About Me

My photo
IT Professional with more than 24 years experience in IT in the area of full life-cycle of Web/Desktop/Database Applications Development, Software Engineering, Consultancy, Data Management, Data Quality, Data Migrations, Reporting, ERP implementations & support, Team/Project/IT Management, etc.