SQL Troubles

13 September 2009

💎SQL Reloaded: Randomized Data Selection

A few years back I was asked to provide a random set of data from a data collection. Dealing only with a small number of records and as they were in Excel, it was much easier to use the Excel’s RAND function. Even if SQL Server provides a RAND function too, it returns the same value within a data set when the same parameter is provided. In order to avoid this behavior it would be useful if could be provided a distinct parameter value for each row, for example using the data set's integer unique identifier or eventually create one, thus if ID is such a column the query would look something like that:

SELECT TOP 10 RAND(ID) PERC, ID, N1, N2

FROM

ORDER BY RAND(ID)

The fact that RAND will return always the same value for the same parameter value might be a problem because the above query will always return the same records. In exchange you can use a multiplier for the parameter provided to RAND function - for example in a first attempt you can use RAND(id), in a second RAND(id*2), in a third RAND(id*3), etc.

This approach offers unfortunately a relatively "weak" randomization which might be not sufficient for repetitive selections, therefore under SQL Server 2005+ could be used instead a CLR-based UDF (user defined function) and use the random functions available in VB, C# or any other CLR programming language, or even the .Net Random class designed to produce a sequence of numbers that meet certain statistical requirements for randomness.

12 September 2009

💎SQL Reloaded: Number of Records I (via COUNT)

Getting the number of records from a table is straightforward, a simple SELECT doing the job:

-- general syntax 
SELECT count(*) NoRecords
FROM <schema>.<table>

-- example 
SELECT count(*) NoRecords
FROM [Person].[Address]

Some prefer using a count(1) instead of a count(*), though it doesn't make a difference considering how SQL Server handles the two scripts in the background.

-- general syntax 
SELECT count(1) NoRecords
FROM <schema>.<table>

-- example 
SELECT count(1) NoRecords
FROM [Person].[Address]

Same can be achieved by using a SUM:

-- general syntax 
SELECT SUM(1) NoRecords
FROM <schema>.<table>

-- example 
SELECT SUM(1) NoRecords
FROM [Person].[Address]

However the SUM is typically used when counting only certain records within the same line (the attributes must be know however beforehand):

-- counting the occurrences for a certain values via SUM
SELECT SUM(CASE WHEN Class IS NULL THEN 1 ELSE 0 END) [Nulls]
, SUM(CASE WHEN Class = 'H' THEN 1 ELSE 0 END) [H]
, SUM(CASE WHEN Class = 'L' THEN 1 ELSE 0 END) [L]
, SUM(CASE WHEN Class = 'M' THEN 1 ELSE 0 END) [M]
, count(*) NoRecords
FROM [Production].[Product]

The sane output can be used via a GROUP BY based on the grouping attributes:

-- counting the occurrences for a certain attribute via COUNT
SELECT Class
, count(*) NoRecords
FROM [Production].[Product]
GROUP BY Class

-- counting the occurrences for a group of attributes via COUNT
SELECT Class
, Color
, count(*) NoRecords
FROM [Production].[Product]
GROUP BY Class
, Color

If you want to implement the functionality in a function or stored procedure and provide the name of the table as parameter, then things get a little more complicated as the query needs to be built dynamically on runtime. In stored procedures a query built dynamically can be usually executed using EXEC or EXECUTE stored procedures, though none of them will do as we need to reuse the actual value returned by the query. This can be achieved with the help of a third stored procedure called sp_executesql, and the work can be done relatively easily with the below stored procedure:

-- dropping the stored procedure
--DROP PROCEDURE IF EXISTS dbo.pGetNumberRecords

-- creating the stored procedure
CREATE PROCEDURE dbo.pGetNumberRecords(
  @Table varchar(100) 
, @NumberRecords int OUTPUT 
)
AS 
DECLARE @SglQuery nvarchar(250) 

IF IsNull(@table,'')<>'' 
BEGIN 
  SET @SglQuery = N'SELECT @NumberRecords = count(*) FROM ' + @table 

  --get the number of records 
  EXEC sp_executesql @Query = @SglQuery
                 , @params = N'@NumberRecords INT OUTPUT' 
                 , @NumberRecords = @NumberRecords OUTPUT 
END 

-- testing the stored procedure 
DECLARE @NumberRecords int 
EXEC dbo.pGetNumberRecords '[Production].[Product]', @NumberRecords OUTPUT 
SELECT @NumberRecords

The overhead of using a stored procedure for getting the records count is bearable though the solution is not necessarily elegant, a nicer approach being the use of functions for the same purpose:

-- dropping the function
--DROP FUNCTION IF EXISTS dbo.GetNumberRecords

-- creating the test function
CREATE FUNCTION dbo.GetNumberRecords(
@table as nvarchar(100)) 
RETURNS int 
AS 
BEGIN 
DECLARE @sql varchar(200) 
DECLARE @NumberRecords int

IF IsNull(@table,'')<>'' 
BEGIN 
  SET @sql = N'SELECT @NumberRecords = count(*) FROM ' + @table 

  --get the number of records 
  EXEC sp_executesql @Query = @sql
                 , @params = N'@NumberRecords INT OUTPUT' 
                 , @NumberRecords = @NumberRecords OUTPUT 
END 

RETURN @NumberRecords
END

-- testing the function
SELECT dbo.GetNumberRecords('[AdventureWorks2014].[Production].[Product]')

Unfortunately UDFs have the restriction that they can’t execute stored procedures, and thus the above technique can’t be used. Even if the function is created, when attempting to run it, the following error occurs:

Msg 557, Level 16, State 2, Line 185
Only functions and some extended stored procedures can be executed from within a function.

There is actually an alternative for executing pass-through queries in functions by using OPENQUERY, OPENROWSET or OPENDATASOURCE functions, allowing thus to execute DML statements against a given linked server, as is the case of first two functions, respectively using ad-hoc connection information as is the case of the third function. For exemplification I chose OPENQUERY function, this being the code created to return the number of records from a specified table (you'll need to replace with your server, see below):

--dropping the test function 
-- DROP FUNCTION IF EXISTS dbo.GetNumberRecordsTest

-- creating the test function
CREATE FUNCTION dbo.GetNumberRecordsTest() 
RETURNS int 
AS 
BEGIN 
  RETURN ( 
     SELECT NoRecords 
  FROM OPENQUERY( <server_name>, 'SET NOCOUNT ON; SELECT count(*) NoRecords FROM [AdventureWorks2014].[Production].[Product]') AS O) 
END 

-- retrieving the number of records
SELECT dbo.GetNumberRecordsTest()

For this piece of code to work is first necessary to refer the server as a linked server:

EXEC sp_serveroption <server_name>, 'Data Access', true

Withought doing this first you’ll get the error:

Msg 7411, Level 16, State 1, Procedure GetNumberRecords, Line 4 Server '>server_name<' is not configured for DATA ACCESS.

You can use sp_helpserver to get server's name:

-- getting server's name
EXEC sp_helpserver

The above function doesn’t help much. It would be helpful if we could provide the table’s name as parameter and store the query string in a variable, something like:

-- creating the test function
CREATE FUNCTION dbo.GetNumberRecords(
@TableName as nvarchar(100)) 
RETURNS int 
AS 
BEGIN 
  DECLARE @sql varchar(200) 
  SET @sql = 'SET NOCOUNT ON; SELECT count(*) NoRecords FROM ' + @TableName

  RETURN ( 
     SELECT NoRecords 
  FROM OPENQUERY( , @sql) AS O) 
END 

-- retrieving the number of records
SELECT dbo.GetNumberRecords()

Unfortunately OPENQUERY function same as OPENROWSET and OPENDATASOURCE expect a constant parameter as query and therefore the error (as per SQL Server 2005):

Msg 102, Level 15, State 1, Procedure GetNumberRecords, Line 12 Incorrect syntax near '@sql'. Expecting STRING or TEXT_LEX

I wonder why the SQL Server team hasn’t allowed the use of variables as parameters for OPENQUERY functions, if not in the previous SQL Server versions at least in SQL Server 2008! Anyway, another impediment would be raised by the deterministic nature of functions.

Notes:

1) There are further alternatives, e.g. sys.partitions DMV provides similar information (see post) ideal for scenarios in which only an approximative value is needed.
2) For other scenarios one can use the sp_MSForEachTable undocumented stored procedure (see post)

Happy coding!

Previous Post <<||>> Next Post

💎SQL Reloaded: Deterministic Functions (GetDate SQL Server 2000/2005)

In an answer on LinkedIn, one of the users made me attentive that in SQL Server 2005 and later versions the GetDate() and other non-deterministic UDFs (user-defined functions) can be used inside functions and views, thing which was not possible under SQL Server 2000. This lack of functionality involved considerable overhead when was needed to use current date in table-valued functions. The solution was to send the date as parameter to the respective objects.

For exemplification the following piece of code would run successfully on SQL Server 2005 but fail on SQL Server 2000, returning the below error message.

-- creating the test function
CREATE FUNCTION dbo.fGetCurrentDate() 
RETURNS smalldatetime AS 
BEGIN 
    RETURN GetDate()
END

-- testing the function
SELECT dbo.fGetCurrentDate()

Error message:
Server: Msg 443, Level 16, State 1, Procedure GetCurrentDate, Line 6 Invalid use of 'getdate' within a function.

A similar error message resulted when using the GetDate() function inside of a view:

-- creating the test view
CREATE VIEW dbo.vGetCurrentDate
AS
SELECT GetDate() AS CurrentDate

-- testing the view
SELECT * 
FROM dbo.vGetCurrentDate

The alternative was to use an UDF instead, which functions like a parameterized view, something like:

-- creating the test UDF
CREATE FUNCTION dbo.udfFilterByDate(
 @date datetime)
 RETURNS TABLE
 AS 
 RETURN(
 SELECT * 
 FROM [AdventureWorks2014].Production.Product
 WHERE [SellStartDate]>=@Date)

 -- testing the UDF
 SELECT *
 FROM dbo.udfFilterByDate(GetDate())

 -- testing the UDF (alternative)
 DECLARE @date as datetime
 SET @date = GetDate()
 SELECT *
 FROM dbo.udfFilterByDate(@date)

Using an UDF can still be necessary when is needed to pass dates and other types of parameters used inside of a correlated query or within the logic.

29 August 2009

🛢DBMS: Lock Granularity (Definitions)

"The size of the locked area - database, file, table, page, row, or column." (Peter Gulutzan & Trudy Pelzer, "SQL Performance Tuning", 2002)

"The size of the database element on which a lock is placed (usually a table or a row within a table)." (Jan L Harrington, "SQL Clearly Explained" 3rd Ed., 2010)

"Indicates the level of lock use. Locking can take place at the following levels: database, table, page, row, and field (attribute)." (Carlos Coronel et al, "Database Systems: Design, Implementation, and Management" 9th Ed, 2011)

"The level of data that is involved in a lock. Lock granularity can be at any level of physical data: database, block, set of tables, row, or column level, and include any amount of data contained within those objects." (DAMA International, "The DAMA Dictionary of Data Management", 2011)

🛢DBMS: Extensible Markup Language (Definitions)

"A standard for a markup language, similar to HTML, that allows tags to be defined to describe any kind of data you have, making it very popular as a format for data feeds." (Mike Moran & Bill Hunt , "Search Engine Marketing, Inc", 2005)

"Facilitates the assignment of meaningful structures and definitions of data and services for use by multiple systems. XML simplifies the ability to transmit and share data." (Jill Dyché & Evan Levy, "Customer Data Integration: Reaching a Single Version of the Truth", 2006)

"Simple and flexible text format used to represent data. XML was designed by the World Wide Web Consortium (W3C)." (Sara Morganand & Tobias Thernstrom , "MCITP Self-Paced Training Kit : Designing and Optimizing Data Access by Using Microsoft SQL Server 2005 - Exam 70-442", 2007)

"separates content from format, thus letting the browser decide how and where content gets displayed. XML is not a language, but a system for defining other languages so that they understand their vocabulary." (Craig F Smith & H Peter Alesso, "Thinking on the Web: Berners-Lee, Gödel and Turing", 2008)

"A platform-independent markup language for specifying the structure of data in a text document used for both data storage and the transfer of data." (Jan L Harrington, "Relational Database Design and Implementation" 3rd Ed., 2009)

"A way of representing data and data relationships in text files, typically for data exchange between software of different types." (Jan L Harrington, "SQL Clearly Explained" 3rd Ed. , 2010)

"A metalanguage used to represent and manipulate data elements. Unlike other markup languages, XML permits the manipulation of a document’s data elements. XML is designed to facilitate the exchange of structured documents such as orders and invoices over the Internet." (Carlos Coronel et al, "Database Systems: Design, Implementation, and Management" 9th Ed., 2011)

"A specification for creating text files that contain hierarchical data." (Rod Stephens, "Start Here!™ Fundamentals of Microsoft® .NET Programming", 2011)

"Has been created to overcome some difficulties proper to HTML (Hypertext Markup Language) that – developed as a means for instructing the Web browsers how to display a given Web page – is a ‘presentation-oriented’ markup tool. XML is called ‘extensible’ because, at the difference of HTML, is not characterized by a fixed format, but it lets the user design its own customized markup languages (using, e.g., a specific DTD, Document Type Description) for limitless different types of documents; XML is then a ‘content-oriented’ markup tool." (Gian P Zarri, "RDF and OWL for Knowledge Management", 2011)

"A set of rules for encoding documents electronically. XML was chosen as the standard message format because of its widespread use and open source development efforts." (Mike Harwood, "Internet Security: How to Defend Against Attackers on the Web" 2nd Ed., 2015)

"A standard metalanguage for defining markup languages that is based on Standard Generalized Markup Language (SGML)." (Sybase, "Open Server Server-Library/C Reference Manual", 2019)

"Extensible markup language (XML) is a simple, very flexible text format derived from SGML (standard generalized markup language). While XML was originally designed to meet the challenges of large-scale electronic publishing, it plays an increasingly significant role in the exchange of a wide variety of data on the web." (Kamalendu Pal, "Integrating Heterogeneous Enterprise Data Using Ontology in Supply Chain Management", 2019)

"A universal markup language for text and data, using nested tags to add structure and meta-information to the content." (Daniel Leuck et al, "Learning Java" 5th Ed., 2020)

"A 'best practices' subset of SGML that has been designed by the Worldwide Web Consortium (W3C) for use on the Internet." (Microfocus)

"A notation in which you describe the structure of information in a text document by enclosing information in user-defined tags that define the syntactic elements. A flexible way to create common information formats and share both the format and the data on the World Wide Web, intranets, and elsewhere. J2EE deployment descriptors are expressed in XML." (Microfocus)

22 August 2009

🛢DBMS: Concurrency Control (Definitions)

"The control of concurrent access. SQL Server uses locking to allow multiple users to access and change shared data at the same time without conflicting with one another." (Microsoft Corporation, "SQL Server 7.0 System Administration Training Kit", 1999)

"Concurrency control is the control on the database and transactions which are executed concurrently to ensure that each transaction completed healthy." (S. Sumathi & S. Esakkirajan, "Fundamentals of Relational Database Management Systems", 2007)

"Concurrency control ensures that database operations can be executed concurrently without compromising correctness. Pessimistic concurrency control, such as used in systems with locks, will block any potentially conflicting operations even if they may not turn out to actually conflict. Optimistic concurrency control, the approach used by WiredTiger, will delay checking until after a conflict may have occurred, aborting and retrying one of the operations involved in any write conflict that arises." (MongoDb, "Glossary", 2008)

"Mechanisms to ensure that a database remains consistent and accurate during concurrent use." (Jan L Harrington, "Relational Database Design and Implementation" 3rd Ed., 2009)

"A DBMS feature that is used to coordinate the simultaneous execution of transactions in a multiprocessing database system while preserving data integrity." (Carlos Coronel et al, "Database Systems: Design, Implementation, and Management" 9th Ed., 2011)

"The control of process contention for resources within multi-process systems." (DAMA International, "The DAMA Dictionary of Data Management", 2011)

20 August 2009

🛢DBMS: Database-as-a-Service (Definitions)

"Database-as-a-service (DBaaS) is a cloud computing service model that provides users with some form of access to a database without the need for setting up physical hardware, installing software or configuring for performance." (Kashif Munir, "Security model for cloud database as a service", 2015)

"A database hosted in the cloud on a pay per use basis, for example Amazon Web Services." (Analytics Insight)

"An on-demand, secure, scalable, and self-service database that automates the database provisioning and administration to support new and existing business applications and operational systems." (Forrester)

"Database as a service (DBaaS) is a cloud computing managed service offering that provides access to a database without requiring the setup of physical hardware, the installation of software or the need to configure the database. Most database administration and maintenance tasks are handled by the service provider, enabling users to quickly benefit from the database service." (TechTarget) [source]

"Database as a service (DBaaS) is a cloud database offering that provides customers with access to a database without having to deploy and manage the underlying infrastructure. DBaaS is delivered as a managed database service, which means that the provider takes care of patching, upgrading, and backing up the database." (Oracle) [source]

"DBaaS (also known as managed database service) is a cloud computing service that lets users access and use a cloud database system without purchasing and setting up their own hardware, installing their own database software, or managing the database themselves (not to mention hiring the high-priced talent required to do so). The cloud provider takes care of everything from periodic upgrades to backups to ensuring that the database system remains available and secure 24/7." (IBM) [source]

🛢DBMS: CRUD (Definitions)

"Create, Read, Update, and Delete. Refers to the major functions that need to be implemented in a database solution." (Victor Isakov et al, "MCITP Administrator: Microsoft SQL Server 2005 Optimization and Maintenance (70-444) Study Guide", 2007)

"Acronym for the four main database operations: Create, Read, Update, Delete. These operations correspond to the SQL statements INSERT, SELECT, UPDATE, and DELETE." (Rod Stephens, "Beginning Database Design Solutions", 2008)

"An acronym for the fundamental operations of a database: Create, Read, Update, and Delete. See MongoDB CRUD Operations." (MongoDb, "Glossary", 2008)

"A list of the only functions of data in persistent storage, in a convenient acronym form." (DAMA International, "The DAMA Dictionary of Data Management", 2011)

"This acronym is commonly used to describe basic database functions." (Martin Oberhofer et al, "The Art of Enterprise Information Architecture", 2010)

"A current-state analysis exercise that inventories the systems, applications, and business processes that can create, read, update, and delete critical enterprise data and attributes. It's used to identify the processes and data access points that are affecting data quality or allowing improper usage of the data without suitable validation and governance." (Forrester)

"CRUD is an acronym that comes from the world of computer programming and refers to the four functions that are considered necessary to implement a persistent storage application: create, read, update and delete." (Sumo Logic) [source]

"CRUD (acronym for Create, Read, Update, Delete) are considered to be basic functions regarding stored data. In computer programming, possible actions are often mapped to these standard CRUD functions in order to clarify the actions. For example, standard HTTP(S) actions GET and POST refer to Read and Create functions regarding stored data." (Innopay)

🛢DBMS: Change Data Capture [CDC] (Definitions)

"As changes are made to a production data source, change data capture reads the source database log. This information can be used to prepare a batch to update the data warehouse, or it can update the data warehouse on a transaction-by-transaction basis. With SQL Server 7.0, transactional replication is an example of change data capture." (Microsoft Corporation, "Microsoft SQL Server 7.0 Data Warehouse Training Kit", 2000)

"The process of capturing changes made to a production data source. Change data capture is typically performed by reading the log file of the Database Management System of the source database. Change data capture consolidates units of work, ensures data is synchronized with the original source, and reduces data volume in a Data Warehousing environment." (DAMA International, "The DAMA Dictionary of Data Management", 2011)

"The process of capturing changes made to a production data source; typically used in data warehousing environments." (Craig S Mullins, "Database Administration", 2012)

"Change data capture (CDC) is a set of software design patterns used to determine (and track) the data that has changed so that action can be taken using the changed data. CDC often uses the database transaction log to populate the deltas, although it can also query the database directly." (Piethein Strengholt, "Data Management at Scale", 2020)

"CDC is the process of capturing changes that were made in the source systems and applying these changes throughout the enterprise for both decision support systems such as information warehouse and operational data stores as well as other downstream consuming applications." (Saumya Chaki, "Enterprise Information Management in Practice", 2015)

"An automated approach for ensuring that data changes are synchronized across an enterprise by replicating data changes from a source system to other systems." (Jonathan Ferrar et al, "The Power of People", 2017)

🛢DBMS: Clustering (Definitions)

"Technology that enables you to create a hot spare. That is a server that is actually running and can take over immediately. This technology enables you to mirror an entire server to another computer." (Owen Williams, "MCSE TestPrep: SQL Server 6.5 Design and Implementation", 1998)

"The use of multiple computers to provide increased reliability, capacity, and management capabilities." (Microsoft Corporation, "SQL Server 7.0 System Administration Training Kit", 1999)

[federated cluster:] "A grouping of SQL servers used together to achieve scalability by employing a distributed partition view. A federated cluster is not used for availability, only for achieving scalability through scale out." (Allan Hirt et al, "Microsoft SQL Server 2000 High Availability", 2004)

"Any collection of distinct computers that are connected and used as a parallel computer, or to form a redundant system for higher availability. The computers in a cluster are not specialized to cluster computing and could, in principle, be used in isolation as standalone computers. In other words, the components making up the cluster, both the computers and the networks connecting them, are not custom-built for use in the cluster." (Beverly A Sanders, "Patterns for Parallel Programming", 2004)

"Connecting two or more computers in such a way that they behave like a single computer to an application or client. Clustering is used for parallel processing, load balancing, and fault tolerance." (Allan Hirt et al, "Microsoft SQL Server 2000 High Availability", 2004)

"A method of keeping database files physically close to one another on the storage media for improving performance through sequential pre-fetch operations." (Paulraj Ponniah, "Data Warehousing Fundamentals for IT Professionals", 2010)

"(1) The condition whereby data is physically ordered contiguously by a specified key (usually implemented by means of an index). (2) The use of multiple, 'independent' computing systems working together to form what appears to users as a single highly available system." (Craig S Mullins, "Database Administration: The Complete Guide to DBA Practices and Procedures" 2nd Ed, 2012)

"The tendency of elements to become unevenly distributed in the hash table, with many adjacent locations containing elements" (Nell Dale et al, "Object-Oriented Data Structures Using Java 4th Ed.", 2016)

19 August 2009

🛢DBMS: Shared Lock (Definitions)

"A lock created by non-update ('read') operations. Other users may read the data concurrently, but no transaction can acquire an exclusive lock on the data until all the shared locks have been released." (Karen Paulsell et al, "Sybase SQL Server: Performance and Tuning Guide", 1996)

"A lock used during READ operations." (Owen Williams, "MCSE TestPrep: SQL Server 6.5 Design and Implementation", 1998)

"Lock that allows concurrent transactions to read a resource." (S. Sumathi & S. Esakkirajan, "Fundamentals of Relational Database Management Systems", 2007)

"Control over a portion of the database given to one or more transactions that prevents other transactions from modifying the data while the locks are in place." (Jan L Harrington, "Relational Database Design and Implementation" 3rd Ed., 2009)

"A lock on a database element that prevents other transactions from updating the database element while the lock is held." (Jan L Harrington, "SQL Clearly Explained" 3rd Ed., 2010)

"A lock that is issued when a transaction requests permission to read data from a database and no exclusive locks are held on that data by another transaction. A shared lock allows other read-only transactions to access the database." (Carlos Coronel et al, "Database Systems: Design, Implementation, and Management" 9th Ed., 2011)

"A lock created by non-update (read) operations." (SQL Server 2012 Glossary, "Microsoft", 2012)

"The type of lock taken by the DBMS when data is read with no intent to update it." (Craig S Mullins, "Database Administration", 2012)

"A lock that limits concurrently running application processes to read-only operations on database data." (Sybase, "Open Server Server-Library/C Reference Manual", 2019)

18 August 2009

🛢DBMS: Intent Lock (Definitions)

"Indicates the intention to acquire a share or exclusive lock on a data page." (Karen Paulsell et al, "Sybase SQL Server: Performance and Tuning Guide", 1996)

"Specifies the intent to gain a shared or exclusive lock." (Owen Williams, "MCSE TestPrep: SQL Server 6.5 Design and Implementation", 1998)

"An lock that indicates that SQL Server wants to acquire a shared or exclusive lock on a more specific resource. An intent lock prevents another transaction from acquiring an exclusive lock on the resource containing that page or row." (Microsoft Corporation, "SQL Server 7.0 System Administration Training Kit", 1999)

"A lock of a larger object related to a locked small object. Implicitly, any small-grain lock implies a shared big-grain lock. If there is one or more locks on pages belonging to a table, then there will also be a lock on the table itself, as a separate lock record." (Peter Gulutzan & Trudy Pelzer, "SQL Performance Tuning", 2002)

"A lock placed on one level of a resource hierarchy to protect shared or exclusive locks on lower-level resources." (Anthony Sequeira & Brian Alderman, "The SQL Server 2000 Book", 2003)

"A lock on a resource that indicates that the holder of the lock will read (intent shared) or write (intent exclusive) the resource using concurrency control at a finer granularity than that of the resource with the intent lock. Intent locks allow concurrent readers and writers of a resource. See What type of locking does MongoDB use?." (MongoDb, "Glossary", 2008)

"A lock that is placed on one level of a resource hierarchy to protect shared or exclusive locks on lower-level resources." (SQL Server 2012 Glossary, "Microsoft", 2012)

"A type of lock placed on higher-level database objects when a user or process acquires locks on data pages or rows. An intent lock stays in place for the life of the lower-level locks." (Craig S Mullins, "Database Administration: The Complete Guide to DBA Practices and Procedures" 2nd Ed., 2012)

17 August 2009

🛢DBMS: Query Optimizer (Definitions)

"SQL Server code that analyzes queries and database objects and selects the appropriate query plan. The SQL Server optimizer is a cost-based optimizer. It estimates the cost of each permutation of table accesses in terms of CPU cost and I/O cost." (Karen Paulsell et al, "Sybase SQL Server: Performance and Tuning Guide", 1996)

"A SQL server tool that formulates an optimum execution plan for a query." (Owen Williams, "MCSE TestPrep: SQL Server 6.5 Design and Implementation", 1998)

"The SQL Server component responsible for generating the optimum execution plan for a query." (Microsoft Corporation, "SQL Server 7.0 System Administration Training Kit", 1999)

"The SQL Server database engine component responsible for generating efficient execution plans for SQL statements." (Anthony Sequeira & Brian Alderman, "The SQL Server 2000 Book", 2003)

"A term applied to a process, within a database engine, that attempts to find the fastest method of executing a SQL command against a database." (Gavin Powell, "Beginning Database Design", 2006)

"This is the component in SQL Server that analyzes your queries, compares them with available indexes, and decides which index will return a result set the fastest." (Joseph L Jorden & Dandy Weyn, "MCTS Microsoft SQL Server 2005: Implementation and Maintenance Study Guide - Exam 70-431", 2006)

"An optimization process running within SQL Server. Any queries submitted to SQL Server are first processed by the query optimizer. It determines the best way to run the query, including what indexes to use and what types of joins to use. The output is a query execution plan, sometimes called a query plan or just a plan." (Darril Gibson, "MCITP SQL Server 2005 Database Developer All-in-One Exam Guide", 2008)

"A process that generates query plans. For each query, the optimizer generates a plan that matches the query to the index that will return results as efficiently as possible. The optimizer reuses the query plan each time the query runs. If a collection changes significantly, the optimizer creates a new query plan." (MongoDb, "Glossary", 2008)

"The Optimizer is an internal technology that is responsible for selecting the most efficient means to accessing or altering information. It uses detailed statistics about the database to make the right decision." (Robert D Schneider & Darril Gibson, "Microsoft SQL Server 2008 All-in-One Desk Reference For Dummies", 2008)

"A part of a DBMS that examines a nonprocedural data manipulation request and makes a determination of the most efficient way to process that request." (Jan L Harrington, "SQL Clearly Explained" 3rd Ed., 2010)

"The component of a relational database system responsible for analyzing SQL queries and producing optimal access paths for retrieving data from the database." (Craig S Mullins, "Database Administration", 2012)

"A component of the SQL and XQuery compiler that chooses an access plan for a data manipulation language statement by modeling the execution cost of many alternative access plans and choosing the one with the minimal estimated cost." (Sybase, "Open Server Server-Library/C Reference Manual", 2019)

"Built-in database software that determines the most efficient way to execute a SQL statement by considering factors related to the objects referenced and the conditions specified in the statement." (Oracle)

"The MySQL component that determines the best indexes and join order to use for a query, based on characteristics and data distribution of the relevant tables." (MySQL)

🛢DBMS: Exclusive Locks (Definitions)

"A lock that may not coexist with any other lock on the same object." (Peter Gulutzan & Trudy Pelzer, "SQL Performance Tuning", 2002)

"A lock used during WRITE operations." (Owen Williams, "MCSE TestPrep: SQL Server 6.5 Design and Implementation", 1998)

"A lock that prevents any other transaction from acquiring a lock on a resource until the original lock on the resource is released at the end of the transaction. An exclusive lock is always applied during an update operation (INSERT, UPDATE, or DELETE)." (Microsoft Corporation, "SQL Server 7.0 System Administration Training Kit", 1999)

"This type of lock prevents any other transaction from acquiring a lock on a resource until the original lock on the resource is released at the end of the transaction." (Anthony Sequeira & Brian Alderman, "The SQL Server 2000 Book", 2003)

"A lock on a database element that prevents other transactions from updating or viewing the database element while the lock is held." (Jan L Harrington, "SQL Clearly Explained" 3rd Ed., 2010)

"A lock that prevents any other transaction from acquiring a lock on a resource until the original lock on the resource is released at the end of the transaction." (Microsoft, "SQL Server 2012 Glossary", 2012)

"The type of lock taken by the DBMS when data is modified to prohibit other processes or users from reading or modifying the same data." (Craig S Mullins, "Database Administration", 2012)

"A lock that prevents concurrently executing application processes from accessing database data. See also gross lock, shared lock." (Sybase, "Open Server Server-Library/C Reference Manual", 2019)

"Locks which prevent any other transaction from acquiring a lock until the original lock is released at the end of a transaction, always applied for update (insert, update, delete) operations." (Karen Paulsell et al, "Sybase SQL Server: Performance and Tuning Guide", 1996)

15 August 2009

🛢DBMS: Deadlocks (Definitions)

"A situation which arises when two users, each having a lock on one piece of data, attempt to acquire a lock on the other’s piece of data. The SQL Server detects deadlocks, and kills one user’s process." (Karen Paulsell et al, "Sybase SQL Server: Performance and Tuning Guide", 1996)

"A situation that arises when two users, each having a lock on one piece of data, attempt to acquire a lock on the other’s piece. Each user waits for the other to release his or her lock. SQL Server detects deadlocks and kills one user’s process, returning error code 1205." (Patrick Dalton, "Microsoft SQL Server Black Book", 1997)

"A condition that arises when two or more transactions are waiting for one another to release locks." (Peter Gulutzan & Trudy Pelzer, "SQL Performance Tuning", 2002)

"A situation in SQL Server 2000 when two users each have a lock on one piece of data and they attempt to acquire a lock on the other’s piece of data. Each user would wait indefinitely for the other to release the lock, unless one of the user processes is terminated. SQL Server detects deadlocks and terminates one user’s process." (Anthony Sequeira & Brian Alderman, "The SQL Server 2000 Book", 2003)

"A state in which two users or processes cannot continue processing because they each have a resource that the other needs." (Thomas Moore, "EXAM CRAM™ 2: Designing and Implementing Databases with SQL Server 2000 Enterprise Edition", 2005)

"Occurs when multiple connections that hold locks are waiting for locks to be released that are held by the other transactions, that is, the transactions are waiting for each other to release locks." (Sara Morganand & Tobias Thernstrom , "MCITP Self-Paced Training Kit : Designing and Optimizing Data Access by Using Microsoft SQL Server 2005 - Exam 70-442", 2007)

"A specific type of locking problem when concurrent processes are competing for locks. For example, program1 holds a lock on A and is waiting for a lock on B; program2 holds a lock on B and is waiting for a lock on A." (Craig S Mullins, "Database Administration: The Complete Guide to DBA Practices and Procedures" 2nd Ed, 2012)

"A deadlock occurs when two or more user processes each have a lock on a separate page or table and each wants to acquire a lock on the other process’s page or table. The transaction with the least accumulated CPU time is killed and all of its work is rolled back." (Sybase)

"A situation in which two or more users are waiting for data locked by each other." (Oracle)

"A situation where different transactions are unable to proceed, because each holds a lock that the other needs. Because both transactions are waiting for a resource to become available, neither one ever releases the locks it holds." (MySQL)

"Unresolved contention for the use of resources." (IBM)