SQL Troubles

25 April 2009

🛢DBMS: Balanced Tree [BT] (Definitions)

"Short for balanced tree, or binary tree. SQL Server uses B-tree indexing. All leaf pages in a B-tree are the same distance from the root page of the index. B-trees provide consistent and predictable performance, good sequential and random record retrieval, and a flat tree structure." (Karen Paulsell et al, "Sybase SQL Server: Performance and Tuning Guide", 1996)

"A data structure that resembles a tree, it is also called a balanced tree." (Owen Williams, "MCSE TestPrep: SQL Server 6.5 Design and Implementation", 1998)

"This term describes SQL Server index structures." (Microsoft Corporation, "SQL Server 7.0 System Administration Training Kit", 1999)

"A relational index that is particularly useful for high-cardinality columns. The B-tree index builds a tree of values with a list of row IDs that have the leaf value. B-tree indexes are almost worthless for low-cardinality columns because they take a lot of space and they usually cannot be combined with other indexes at the same time to increase the focus of the constraints. Contrast with Bitmap index." (Ralph Kimball & Margy Ross, "The Data Warehouse Toolkit" 2nd Ed, 2002)

"A structure for storing index keys; an ordered, hierarchical, paged assortment of index keys. Some people say the 'B' stands for 'Balanced'." (Peter Gulutzan & Trudy Pelzer, "SQL Performance Tuning", 2002)

"A type of index structure that resembles an inverted tree. The branches of a b-tree index are balanced. Traversing the tree for any index value reads the same number of blocks." (Bob Bryla, "Oracle Database Foundations", 2004)

"Short for binary-tree , this is the structure of an index in SQL Server. It’s called this because it resembles a tree when drawn. Starting with a root page, it expands into leaf pages where data is stored." (Joseph L Jorden & Dandy Weyn, "MCTS Microsoft SQL Server 2005: Implementation and Maintenance Study Guide - Exam 70-431", 2006)

"An abstract data type used to store indexes in SQL Server." (Marilyn Miller-White et al, "MCITP Administrator: Microsoft SQL Server 2005 Optimization and Maintenance 70-444", 2007)

"A self-balancing tree data structure that allows efficient searching of indexes. A b-tree stores data records in internal and leaf nodes." (Rod Stephens, "Beginning Database Design Solutions", 2008)

"A data structure commonly used by database management systems to store indexes. MongoDB uses B-trees for its indexes." (MongoDb, "Glossary", 2008)

"A keyed, treelike index structure." (Craig S Mullins, "Database Administration", 2012)

"A tree structure for storing database indexes." (Microsoft, "|SQL Server 2012 Glossary", 2012)

"An index organized like an upside-down tree. A B-tree index has two types of blocks: branch blocks for searching and leaf blocks that store values. The leaf blocks contain every indexed data value and a corresponding rowid used to locate the actual row. The 'B' stands for 'balanced' because all leaf blocks automatically stay at the same depth." (Oracle, "Database SQL Tuning Guide Glossary", 2013)

"An index that is arranged as a balanced hierarchy of pages and that minimizes access time by realigning data keys as items are inserted or deleted." (Sybase, "Open Server Server-Library/C Reference Manual", 2019)

"A hierarchical indexing technique based on an inverted tree of nodes containing ranges of indexed values. Going down the hierarchical levels, the nodes progressively contain smaller numbers of index values, so that any value may be searched for in a few trials by starting at the top." (Paulraj Ponniah, "Data Warehousing Fundamentals for IT Professionals", 2010)

24 April 2009

🛢DBMS: Bitmap Index/Indexing (Definitions)

"An indexing structure used to provide extremely efficient retrieval. Bitmap indexes are not efficient in update operations, but they work very well in a read-only environment. Microsoft OLAP Services can use bitmap indexes for cubes." (Microsoft Corporation, "Microsoft SQL Server 7.0 Data Warehouse Training Kit", 2000)

"A relational indexing technique most appropriate for columns with a limited number of potential values (low cardinality). Most optimizers can combine more than one bitmapped index in a single query." (Ralph Kimball & Margy Ross, "The Data Warehouse Toolkit 2nd Ed ", 2002)

"An index containing one or more bitmaps." (Peter Gulutzan & Trudy Pelzer, "SQL Performance Tuning", 2002)

"An index that maintains a binary string of ones and zeros for each distinct value of a column within the index." (Bob Bryla, "Oracle Database Foundations", 2004)

"A specialized form of an index indicating the existence or nonexistence of a condition for a group of blocks or records. Bitmaps are expensive to build and maintain, but provide very fast comparison and access facilities." (William H Inmon, "Building the Data Warehouse", 2005)

"An index containing binary representations for each record using 0’s and 1’s. For example, a bitmap index creates two bitmaps for two values of M for Male and F for Female. When M is encountered, the M bitmap is set to 1 and the F bitmap is set to 0." (Gavin Powell, "Beginning Database Design", 2006)

"A compact, high speed indexing method where the key values and the conditions are compressed to a small size that can be stored and searched rapidly." (S. Sumathi & S. Esakkirajan, "Fundamentals of Relational Database Management Systems", 2007)

"An index that uses a bit array (0s and 1s) to represent the existence of a value or condition." (Carlos Coronel et al, "Database Systems: Design, Implementation, and Management" 9th Ed., 2011)

"An indexing technique using a string of zeroes and ones, or bits. For each key value of the bitmap index a separate string of zeroes and ones is stored." (Craig S Mullins, "Database Administration: The Complete Guide to DBA Practices and Procedures", 2012)

"A subcomponent of a single bitmap index entry. Each indexed column value may have one or more bitmap pieces. The database uses bitmap pieces to break up an index entry that is large in relation to the size of a block." (Oracle, "Database SQL Tuning Guide Glossary", 2013)

18 April 2009

🛢DBMS: Buffer Pool [BP] (Definitions)

"A buffer pool is a group of buffers logically connected so that a file using the pool has access to any of the available buffers." (F Peter Fisher & ‎George F Swindle, "Computer Programming Systems", 1964)

"A buffer pool is an area of storage that is segmented to provide a specific number of buffers. Buffers are obtained from the pool when needed and returned when no longer needed. Once returned to the pool, the buffer is then free to be used for another purpose." (Michael P Bouros, "Getting Into VSAM: An Introduction and Technical Reference", 1987)

"The buffer pool is a group of shared memory pages in the resident memory. It is used whenever a data page is read from disk." (Ron Flannery, "The Informix Handbook", 2000)

"A fixed-size allocation of memory, used to store an in-memory copy of a bunch of pages." (Peter Gulutzan & Trudy Pelzer, "SQL Performance Tuning", 2002)

"A buffer pool […] is simply a part of the total buffer cache that is subject to different retention criteria for database objects like tables." (Sam Alapati, "Expert Oracle Database 11g Administration" , 2009)

"A buffer pool is an area of memory into which database pages are read, modified, and held during processing." (IBM, "Business Process Management Performance teams", 2010)

"A block of memory reserved for index and table data pages." (SQL Server 2012 Glossary, "Microsoft", 2012)

"An area of memory set aside and used to avoid I/O operations when actual data is being read from the database. Also referred to as a data cache." (Craig S Mullins, "Database Administration: The Complete Guide to DBA Practices and Procedures" 2nd Ed., 2012)

"The buffer pool is a caching mechanism for managing transactions and writing and reading data to or from disks […]" (Charles Bell et al, MySQL High Availability: Tools for Building Robust Data Centers, 2014)

"An area of memory into which data pages are read and in which they are modified and held during processing." (Sybase, "Open Server Server-Library/C Reference Manual", 2019)

"A buffer pool is an area of main memory that has been allocated by the database manager for the purpose of caching table and index data as it is read from disk." (IBM, DB2 documentation)

"The memory area that holds cached InnoDB data for both tables and indexes." (MySQL, "MySQL 8.0 Reference Manual Glossary")

17 April 2009

🛢DBMS: Merge Replication (Definitions)

"A type of replication that allows sites to make autonomous changes to replicated data and, at a later time, merge changes made at all sites. Merge replication does not guarantee transactional consistency." (Microsoft Corporation, "SQL Server 7.0 System Administration Training Kit", 1999)

"A type of replication that allows sites to make autonomous changes to replicated data, and at a later time, merge changes and resolve conflicts when necessary." (Anthony Sequeira & Brian Alderman, "The SQL Server 2000 Book", 2003)

"The process of transferring data from the Publisher to the Subscriber, allowing the Publisher and Subscriber to update data while connected or disconnected and then merging the updates after they both are connected. Merge replication begins with a snapshot. Thereafter, no data is replicated until the Publisher and Subscriber do a "merge." The merge can be scheduled or done via an ad-hoc request. Merge replication's main benefit is that it supports subscribers who are not on the network much of the time. Transactions, which are committed, however, may be rolled back as the result of conflict resolution." (Thomas Moore, "EXAM CRAM™ 2: Designing and Implementing Databases with SQL Server 2000 Enterprise Edition", 2005)

"A replication type that relies on DML operations being captured from both the published database and the subscriber database(s) and automatically synchronized. Typically starts with a snapshot of the publication. Uses triggers to track subsequent changes made at either the publisher or subscriber." (Victor Isakov et al, "MCITP Administrator: Microsoft SQL Server 2005 Optimization and Maintenance (70-444) Study Guide", 2007)

"A replication strategy used when multiple subscribers are also acting as publishers. In other words, the data is updated from multiple sources." (Darril Gibson, "MCITP SQL Server 2005 Database Developer All-in-One Exam Guide", 2008)

"A type of replication that allows sites to make autonomous changes to replicated data, and at a later time, merge changes and resolve conflicts when necessary." (Microsoft Technet)

"Merge replication is a method for copying and distributing data and database objects from one SQL Server database to another followed by synchronizing the databases for consistency." (Idera) [source]

16 April 2009

🛢DBMS: Snapshot Replication (Definitions)

"A type of replication that takes a snapshot of current data in a publication at a Publisher and replaces the entire replica at a Subscriber on a periodic basis, in contrast to publishing changes when they occur." (Microsoft Corporation, "SQL Server 7.0 System Administration Training Kit", 1999)

"A type of replication that distributes data exactly as it appears at a specific moment in time and does not monitor for modifications made to the data." (Anthony Sequeira & Brian Alderman, "The SQL Server 2000 Book", 2003)

"A type of replication wherein data and database objects are distributed by copying published items via the Distributor and on to the Subscriber exactly as they appear at a specific moment in time. Snapshot replication provides the distribution of both data and structure (tables, indexes, and so on) on a scheduled basis. It may be thought of as a 'whole table refresh'. No updates to the source table are replicated until the next scheduled snapshot." (Thomas Moore, "EXAM CRAM™ 2: Designing and Implementing Databases with SQL Server 2000 Enterprise Edition", 2005)

"Replication type that relies on a snapshot of the entire article (table) to be automatically sent from a published database to the subscriber database(s). Distributes data exactly as it appears at a given time." (Marilyn Miller-White et al, "MCITP Administrator: Microsoft® SQL Server™ 2005 Optimization and Maintenance 70-444", 2007)

"Replication type that relies on a snapshot of the entire article (table) to be automatically sent from a published database to the subscriber database(s). Distributes data exactly as it appears at a given time." (Victor Isakov et al, "MCITP Administrator: Microsoft SQL Server 2005 Optimization and Maintenance (70-444) Study Guide", 2007)

"Replication of data taken at a moment of time. With snapshot replication, the entire data set is replicated at the same time." (Darril Gibson, "MCITP SQL Server 2005 Database Developer All-in-One Exam Guide", 2008)

"A replication in which data is distributed exactly as it appears at a specific moment in time and does not monitor for updates to the data." (Microsoft, 2012)

"Snapshot replication distributes data exactly as it appears at a specific moment in time and does not monitor for updates to the data." (Microsoft Technet)

🛢DBMS: Fragmentation (Definitions)

"Describes the effect to a database that is spread out in various pieces across multiple different devices. Fragmentation can also occur in a database in which the pages are not in physical order." (Owen Williams, "MCSE TestPrep: SQL Server 6.5 Design and Implementation", 1998)

"A condition that occurs when data modifications are made. You can reduce fragmentation and improve read-ahead performance by dropping and re-creating a clustered index. DBCC DBREINDEX, which can rebuild all of the indexes for a table in one statement, is often used to defragment tables." (Microsoft Corporation, "SQL Server 7.0 System Administration Training Kit", 1999)

"Fragmentation often occurs in databases when frequent or large data modifications are made. SQL Server 2000 provides methods for reducing fragmentation and improving read-ahead performance by dropping and re-creating a clustered index." (Anthony Sequeira & Brian Alderman, "The SQL Server 2000 Book", 2003)

"Occurs when data modifications are made. You can reduce fragmentation and improve read-ahead performance by dropping and re-creating a clustered index." (Thomas Moore, "EXAM CRAM™ 2: Designing and Implementing Databases with SQL Server 2000 Enterprise Edition", 2005)

"When a page fills with data, SQL Server must take half of the data from the full page and move it to a new page to make room for more data. When a new page is created, then the pages inside the database are no longer contiguous. This condition is called fragmentation ." (Joseph L Jorden & Dandy Weyn, "MCTS Microsoft SQL Server 2005: Implementation and Maintenance Study Guide - Exam 70-431", 2006)

"A process that occurs when data modifications are made. It is possible to reduce fragmentation and improve read-ahead performance by dropping and re-creating a clustered index." (Thomas Moore, "MCTS 70-431: Implementing and Maintaining Microsoft SQL Server 2005", 2006)

"A process that degrades performance because of data not being contiguous as a result of data being modified." (Victor Isakov et al, "MCITP Administrator: Microsoft SQL Server 2005 Optimization and Maintenance (70-444) Study Guide", 2007)

"A doubly linked series of pages whose logical order does not equal the physical order." (Marilyn Miller-White et al, "MCITP Administrator: Microsoft® SQL Server™ 2005 Optimization and Maintenance 70-444", 2007)

"In databases, indexes can be fragmented similar to how a hard drive can be fragmented. A fragmented index results in slower performance of the database. Fragmentation can be reduced by setting a fill factor on an index so it has empty space. Fragmented indexes can be defragmented by using REORGANIZE (keeps the index online) or by using REBUILD (which defaults to offline but can be run online)." (Darril Gibson, "MCITP SQL Server 2005 Database Developer All-in-One Exam Guide", 2008)

"The separation of the index into pieces as a result of inserts and deletions in the index." (Sybase, "Open Server Server-Library/C Reference Manual", 2019)

15 April 2009

🛢DBMS: Transactional Replication (Definitions)

"A type of replication that marks selected transactions in the Publisher's database transaction log for replication and then distributes them asynchronously to Subscribers as incremental changes, while maintaining transactional consistency." (Microsoft Corporation, "SQL Server 7.0 System Administration Training Kit", 1999)

"A type of replication where an initial snapshot of data is applied at Subscribers, and then when data modifications are made at the Publisher, the individual transactions are captured and propagated to Subscribers." (Anthony Sequeira & Brian Alderman, "The SQL Server 2000 Book", 2003)

"A type of replication where data and database objects are distributed by first applying an initial snapshot at the Subscriber and then later capturing transactions made at the Publisher and propagating them to individual Subscribers. Transactional replication, as with all replication types, begins with a synchronizing snapshot. After the initial synchronization, transactions, which are committed at the Publisher, are automatically replicated to the Subscribers." (Thomas Moore, "EXAM CRAM™ 2: Designing and Implementing Databases with SQL Server 2000 Enterprise Edition", 2005)

"A replication type that relies on DML operations being captured from a published database and automatically sent to the subscriber database(s). Starts with a snapshot of the publication. Incremental changes both in data and schema at the source are replicated to the destination as they occur." (Marilyn Miller-White et al, "MCITP Administrator: Microsoft® SQL Server™ 2005 Optimization and Maintenance 70-444", 2007)

"Replication that starts with a snapshot and then keeps the Subscribers up-to-date by using the transaction log. Transactions are recorded on the Publisher, distributed to the Subscribers, and then applied to keep the Subscribers up-to-date." (Darril Gibson, "MCITP SQL Server 2005 Database Developer All-in-One Exam Guide", 2008)

"A type of replication that typically starts with a snapshot of the publication database objects and data." (SQL Server 2012 Glossary, "Microsoft", 2012)

"In SQL Replication, a type of processing in which every transaction is replicated to the target table when it is committed in the source table." (Sybase, "Open Server Server-Library/C Reference Manual", 2019)

"A type of replication that typically starts with a snapshot of the publication database objects and data." (Microsoft Technet)

🛢DBMS: Foreign Key Constraint (Definitions)

"A constraint that establishes a parent-child relationship between two tables via one or more common columns. The foreign key in the child table refers to a primary or unique key in the parent table." (Bob Bryla, "Oracle Database Foundations", 2004)

"A database object that enforces data reference integrity between (or within) tables." (Sara Morganand & Tobias Thernstrom , "MCITP Self-Paced Training Kit : Designing and Optimizing Data Access by Using Microsoft SQL Server 2005 - Exam 70-442", 2007)

"A database object (data integrity mechanism) that maintains referential integrity by establishing and enforcing a link between the data in two tables." (Marilyn Miller-White et al, "MCITP Administrator: Microsoft® SQL Server™ 2005 Optimization and Maintenance 70-444", 2007)

"A foreign key constraint is a logical coupling of two SQL tables through the values of specified columns." (Michael Coles, "Pro T-SQL 2008 Programmer's Guide", 2008)

"A logical coupling of two SQL tables through the values of specified columns." (Miguel Cebollero et al, "Pro T-SQL Programmer’s Guide" 4th Ed, 2015)

🛢DBMS: Computed Column (Definitions)

"A virtual column in a table whose value is computed at run time. The values in the column are not stored in the table, but are computed based on the expression that defines the column. An example of the definition of a computed column is: Cost as Price * Quantity." (Anthony Sequeira & Brian Alderman, "The SQL Server 2000 Book", 2003)

"A virtual column defined at the table level through a Transact-SQL expression." (Marilyn Miller-White et al, "MCITP Administrator: Microsoft® SQL Server™ 2005 Optimization and Maintenance 70-444", 2007)

"A column in a table that displays the result of an expression instead of stored data. For example, InventoryCost = QuantityOnHand * ProductCost. A calculated column could be calculated on-the-fly with the results not being stored, or the data can be persisted, where the computed data is held within the table." (Darril Gibson, "MCITP SQL Server 2005 Database Developer All-in-One Exam Guide", 2008)

"If a table represents data about something, a column is the place in a table to keep information about some aspect of that thing. Each row in the table contains one value in each column." (David C Hay, "Data Model Patterns: A Metadata Map", 2010)

"A virtual column in a table whose value is computed at run time." (SQL Server 2012 Glossary, "Microsoft", 2012)

"In SQL, a relationship between the value of one column and the value of another column." (Sybase, "Open Server Server-Library/C Reference Manual", 2019)

14 April 2009

🛢DBMS: Roll Forward (Definitions)

"To recover from disasters, such as media failure, by reading the transaction log and reapplying all readable and complete transactions. See also roll back." (Microsoft Corporation, "SQL Server 7.0 System Administration Training Kit", 1999)

"To apply all the completed transactions from a database or log backup in order to recover a database to a point in time or the point of failure." (Anthony Sequeira & Brian Alderman, "The SQL Server 2000 Book", 2003)

"To apply logged changes to data in a roll forward set to bring the data forward in time." (Thomas Moore, "MCTS 70-431: Implementing and Maintaining Microsoft SQL Server 2005", 2006)

"The process of applying committed transactions. As a part of the recovery process, committed transactions are rolled forward to ensure the database is recovered in a consistent state with the changed data in the database." (Darril Gibson, "MCITP SQL Server 2005 Database Developer All-in-One Exam Guide", 2008)

"To apply logged changes to the data in a roll forward set to bring the data forward in time." (SQL Server 2012 Glossary, "Microsoft", 2012)

"To update the data in a restored database or table space by applying changes recorded in the database log files." (Sybase, "Open Server Server-Library/C Reference Manual", 2019)

13 April 2009

🛢DBMS: Roll Back (Definitions)

"To remove partially completed transactions after a database or other system failure." (Microsoft Corporation, "SQL Server 7.0 System Administration Training Kit", 1999)

"To reverse changes made by transactions that were uncommitted at the point in time to which a database is being recovered." (Thomas Moore, "MCTS 70-431: Implementing and Maintaining Microsoft SQL Server 2005", 2006)

"To reverse or undo changes made so far." (Marilyn Miller-White et al, "MCITP Administrator: Microsoft® SQL Server™ 2005 Optimization and Maintenance 70-444", 2007)

"The process of undoing uncommitted transactions. As a part of the recovery process, uncommitted transactions are rolled back to ensure the database is recovered in a consistent state." (Darril Gibson, "MCITP SQL Server 2005 Database Developer All-in-One Exam Guide", 2008)

"Undo the changes made by a transaction, restoring the database to the state it was in before the transaction began." (Jan L Harrington, "Relational Database Design and Implementation" 3rd Ed., 2009)

"End a transaction, undoing any changes made by the transaction and restoring the database to the state it was in before the transaction began." (Jan L Harrington, "SQL Clearly Explained" 3rd Ed., 2010)

"To reverse changes." (Microsoft, "SQL Server 2012 Glossary", 2012)

"To undo the database statements performed prior to a commit of the transaction." (Craig S Mullins, "Database Administration", 2012)

"To restore data that is changed by an SQL statement to the state at its last commit point." (IBM, "Informix Servers 12.1", 2014)

"To return to a previous version, as in rolling back a device driver." (Faithe Wempen, "Computing Fundamentals: Introduction to Computers", 2015)

"To restore data that is changed by an SQL statement to the state at its last commit point." (Sybase, "Open Server Server-Library/C Reference Manual", 2019)

"To return the values changed by a transaction to their original state." (Microsoft, "ODBC Glossary")

12 April 2009

🛢DBMS: Index Selectivity (Definitions)

"The ratio of duplicate key values in an index. An index is selective when it lets the optimizer pinpoint a single row, such as a search for a unique key. An index on nonunique entries is less selective. An index on values such as 'M' or 'F' (for male or female) is extremely nonselective." (Karen Paulsell et al, "Sybase SQL Server: Performance and Tuning Guide", 1996)

"A ratio of the number of rows accessed for a query to die total number of rows in a table." (Owen Williams, "MCSE TestPrep: SQL Server 6.5 Design and Implementation", 1998)

"The number of distinct values divided by the total number of values. For example, if the values are {A,A,B,B} then the number of distinct values, that is, the number that a SELECT DISTINCT . . . statement would return, is two, while the total number of values is four. So selectivity is 2/4 in this case. Selectivity is usually expressed as a percentage ('selectivity is 50%'), but some express it as a ratio ('selectivity is 0.5') instead. " (Peter Gulutzan & Trudy Pelzer, "SQL Performance Tuning", 2002)

"This is a term used to describe the number of duplicate values in a column." (Joseph L Jorden & Dandy Weyn, "MCTS Microsoft SQL Server 2005: Implementation and Maintenance Study Guide - Exam 70-431", 2006)

"The degree of unique values for the data contained in a table column." (Victor Isakov et al, "MCITP Administrator: Microsoft SQL Server 2005 Optimization and Maintenance (70-444) Study Guide", 2007)

"A measure of how likely an index will be used in query processing." (Carlos Coronel et al, "Database Systems: Design, Implementation, and Management" 9th Ed., 2011)

"A value indicating the proportion of a row set retrieved by a predicate or combination of predicates, for example, WHERE last_name = 'Smith'. A selectivity of 0 means that no rows pass the predicate test, whereas a value of 1 means that all rows pass the test. The adjective selective means roughly "choosy." Thus, a highly selective query returns a low proportion of rows (selectivity close to 0), whereas an unselective query returns a high proportion of rows (selectivity close to 1)." (Oracle, "Database SQL Tuning Guide Glossary", 2013)

[selectivity function:] "A function that calculates the percentage of rows that will be returned by a filter function in the WHERE clause of a query. The optimizer uses selectivity information to determine the fastest way to execute an SQL query." (IBM, "Informix Servers 12.1", 2014)

"The probability that any table row will satisfy a predicate." (IBM, "Informix Servers 12.1", 2014)

"A property of data distribution, the number of distinct values in a column (its cardinality) divided by the number of records in the table. High selectivity means that the column values are relatively unique, and can retrieved efficiently through an index. If you (or the query optimizer) can predict that a test in a WHERE clause only matches a small number (or proportion) of rows in a table, the overall query tends to be efficient if it evaluates that test first, using an index." (MySQL, "MySQL 8.0 Reference Manual Glossary")

"In a query, the measure of how many rows from a row set pass a predicate test, for example, WHERE last_name = 'Smith'. A selectivity of 0.0 means no rows, whereas a value of 1.0 means all rows. A predicate becomes more selective as the value approaches 0.0 and less selective (or more unselective) as the value approaches 1.0." (Oracle, "Oracle Database Concepts", 0)

10 April 2009

🛢DBMS: Surrogate Key (Definitions)

"A unique identifier for a row within a database table. A surrogate, or candidate, key can be made up of one or more columns. By definition, every table must have at least one surrogate key (in which case it becomes the primary key for a table automatically). However, it is possible for a table to have more than one surrogate key (in which case one of them must be designated as the primary key). Any surrogate key that is not the primary key is called the alternate key." (Microsoft Corporation, "SQL Server 7.0 System Administration Training Kit", 1999)

"A primary key that is typically invisible to the end user. Normally, surrogate keys are used where end users have their own pre-existing identification schemes (such as an ISBN in a database of books), so the users can modify their existing identifiers." (Bill Pribyl & Steven Feuerstein, "Learning Oracle PL/SQL", 2001)

"Integer keys that are sequentially assigned as needed in the staging area to populate a dimension table and join to the fact table. In the dimension table, the surrogate key is the primary key. In the fact table, the surrogate key is a foreign key to a specific dimension and may be part of the fact table’s primary key, although this is not required. A surrogate key usually cannot be interpreted by itself. That is, it is not a smart key in any way. Surrogate keys are required in many data warehouse situations to handle slowly changing dimensions, as well as missing or inapplicable data. Also known as artificial keys, integer keys, meaningless keys, non-natural keys, and synthetic keys." (Ralph Kimball & Margy Ross, "The Data Warehouse Toolkit" 2nd Ed., 2002)

"A surrogate key is a substitute key that is usually an arbitrary numeric value assigned by the load process or the database system. The advantage of the surrogate key is that it can be structured so that it is always unique throughout the span of integration for the data warehouse." (Claudia Imhoff et al, "Mastering Data Warehouse Design", 2003)

"A single-part, artificially established identifier for an entity. Surrogate key assignment is a special case of derived data - one where the primary key is derived. A common way of deriving surrogate key values is to assign integer values sequentially." (Sharon Allen & Evan Terry, "Beginning Relational Data Modeling" 2nd Ed., 2005)

[artificial key:] "A system-generated, nonsignificant, surrogate identifier or globally unique identifier (GUID) used to uniquely identify a row in a table. This is also known as a surrogate key." (Sharon Allen & Evan Terry, "Beginning Relational Data Modeling" 2nd Ed., 2005)

"A redundant, unique key generated for a record in a data warehouse table to allow integration of data from multiple source systems and to support changing data over time." (Reed Jacobsen & Stacia Misner, "Microsoft SQL Server 2005 Analysis Services Step by Step", 2006)

"The primary key column of a dimension table. The surrogate key is unique to the data warehouse. Key values have no intrinsic meaning, and are assigned as part of the ETL process. By avoiding the use of a natural key, the data warehouse is able to handle changes to operational data in a different manner from transaction systems. The use of a surrogate key also eliminates the need to join fact and dimension tables via multi-part keys." (Christopher Adamson, "Mastering Data Warehouse Aggregates", 2006)

"Used as a replacement or substitute for a descriptive primary key, allowing for better control, better structure, less storage space, more efficient indexing, and absolute surety of uniqueness. Surrogate keys are usually integers, and usually automatically generated using auto counters or sequences." (Gavin Powell, "Beginning Database Design", 2006)

"An artificial key field, usually with system-assigned sequential numbers, used in the dimensional model to link a dimension table to the fact table. In a dimension table, the surrogate key is the primary key which becomes a foreign key in the fact table." (Paulraj Ponniah, "Data Warehousing Fundamentals for IT Professionals", 2010)

"A single-part, artificially established, physical identifier for a data set, usually not visible to business users, and used for database management and performance. Surrogate key assignment is a special case of derived data - one where the primary key is derived. A common way of deriving surrogate key values is to assign integer values sequentially. Sometimes referred to as a dummy key, sequential key, or auto-number field." (DAMA International, "The DAMA Dictionary of Data Management", 2011)

"A system-assigned primary key, generally numeric and auto-incremented." (Carlos Coronel et al, "Database Systems: Design, Implementation, and Management" 9th Ed., 2011)
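
As a rough illustration of the definitions above, the sketch below lets the database assign sequential integer surrogate keys to rows arriving from two source systems whose natural keys collide. It uses Python's built-in sqlite3 module; the table dim_customer, its columns, and the sample rows are hypothetical and do not come from any of the quoted sources.

```python
import sqlite3

# In-memory database; all table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE dim_customer (
        customer_key  INTEGER PRIMARY KEY,  -- surrogate key, assigned by the database
        source_system TEXT NOT NULL,        -- where the row came from
        source_id     TEXT NOT NULL,        -- natural key in that source system
        name          TEXT NOT NULL
    )
""")

# The natural key "C-100" appears in two source systems; the surrogate key
# keeps the two rows distinct without a multi-part key.
rows = [
    ("crm", "C-100", "Alice"),
    ("billing", "C-100", "Alice B."),
    ("crm", "C-200", "Bob"),
]
conn.executemany(
    "INSERT INTO dim_customer (source_system, source_id, name) VALUES (?, ?, ?)",
    rows,
)

surrogate_keys = [
    r[0]
    for r in conn.execute(
        "SELECT customer_key FROM dim_customer ORDER BY customer_key"
    )
]
print(surrogate_keys)
```

The key values carry no business meaning here; they only identify rows, which is the property most of the quoted definitions emphasize.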

🛢DBMS: Alternate Key (Definitions)

"Column or combination of columns, not the primary key columns, whose values uniquely identify a row in a table." (Sharon Allen & Evan Terry, "Beginning Relational Data Modeling" 2nd Ed., 2005)

"A candidate key that is not used as the table's primary key." (Rod Stephens, "Beginning Database Design Solutions", 2008)

"Unique identifier for an entity instance other than the primary key." (Craig S Mullins, "Database Administration", 2012)

"Other columns that could be used as a primary key but are not implemented as such." (Owen Williams, "MCSE TestPrep: SQL Server 6.5 Design and Implementation", 1998)

"In relational theory, any unique key that is not a primary key." (David C Hay, "Data Model Patterns: A Metadata Map", 2010)
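
A minimal sketch of an alternate key, assuming a hypothetical employee table in which employee_id is the primary key and email is a second candidate key enforced with a UNIQUE constraint. It uses Python's built-in sqlite3 module; none of the names come from the quoted sources.

```python
import sqlite3

# In-memory database; table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE employee (
        employee_id INTEGER PRIMARY KEY,   -- chosen primary key
        email       TEXT NOT NULL UNIQUE,  -- alternate key: unique, but not the primary key
        full_name   TEXT NOT NULL
    )
""")

insert_sql = "INSERT INTO employee (email, full_name) VALUES (?, ?)"
conn.execute(insert_sql, ("ann@example.com", "Ann Lee"))

# A second row with the same email violates the alternate key.
try:
    conn.execute(insert_sql, ("ann@example.com", "Another Ann"))
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Either key identifies the same single row.
row_by_pk = conn.execute(
    "SELECT full_name FROM employee WHERE employee_id = 1"
).fetchone()
row_by_ak = conn.execute(
    "SELECT full_name FROM employee WHERE email = ?", ("ann@example.com",)
).fetchone()
print(duplicate_rejected, row_by_pk, row_by_ak)
```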

05 April 2009

🛢DBMS: Composite Key (Definitions)

"An index key that includes two or more columns; for example, authors(au_lname, au_fname)." (Karen Paulsell et al, "Sybase SQL Server: Performance and Tuning Guide", 1996)

"A key composed of two or more columns. A drawback of composite keys is that they require more complex joins when two or more tables are joined." (Microsoft Corporation, "SQL Server 7.0 System Administration Training Kit", 1999)

"Key in a database table made up of several columns. Same as Concatenated key. The overall key in a typical fact table is a subset of the foreign keys in the fact table. In other words, it usually does not require every foreign key to guarantee uniqueness of a fact table row." (Ralph Kimball & Margy Ross, "The Data Warehouse Toolkit" 2nd Ed., 2002)

"A key composed of two or more columns." (Anthony Sequeira & Brian Alderman, "The SQL Server 2000 Book", 2003)

"A candidate key made up of more than one attribute or column." (Sharon Allen & Evan Terry, "Beginning Relational Data Modeling" 2nd Ed., 2005)

"A primary key, unique key, or foreign key consisting of more than one field." (Gavin Powell, "Beginning Database Design", 2006)

"Multiple key columns used to uniquely identify a record." (Reed Jacobsen & Stacia Misner, "Microsoft SQL Server 2005 Analysis Services Step by Step", 2006)

"A candidate key comprising more than one attribute." (S Sumathi & S Esakkirajan, "Fundamentals of Relational Database Management Systems", 2007)

"A key that includes two or more fields. Also called a compound key or concatenated key." (Rod Stephens, "Beginning Database Design Solutions", 2008)

"A key for a database table made up of more than one attribute or field." (Paulraj Ponniah, "Data Warehousing Fundamentals for IT Professionals", 2010)

"A key whose definition consists of two or more fields in a file, columns in a table, or attributes in a relation." (Microsoft, "SQL Server 2012 Glossary", 2012)

"A key with more than one attribute." (Craig S Mullins, "Database Administration", 2012)

"An ordered set of key columns or expressions where the referenced column names are from the same table. See also key." (Sybase, "Open Server Server-Library/C Reference Manual", 2019)
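
The following sketch shows one way a composite key might look in practice, together with the more involved join condition that some of the definitions above mention as a drawback. It relies on Python's built-in sqlite3 module; the order_line and shipment tables and their data are hypothetical.

```python
import sqlite3

# In-memory database; table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce foreign keys by default
conn.execute("""
    CREATE TABLE order_line (
        order_id INTEGER NOT NULL,
        line_no  INTEGER NOT NULL,
        product  TEXT NOT NULL,
        PRIMARY KEY (order_id, line_no)   -- composite key: two columns together
    )
""")
conn.execute("""
    CREATE TABLE shipment (
        order_id   INTEGER NOT NULL,
        line_no    INTEGER NOT NULL,
        shipped_on TEXT NOT NULL,
        FOREIGN KEY (order_id, line_no) REFERENCES order_line (order_id, line_no)
    )
""")

# Each column may repeat on its own; only the combination must be unique.
conn.executemany(
    "INSERT INTO order_line VALUES (?, ?, ?)",
    [(1, 1, "widget"), (1, 2, "gadget"), (2, 1, "widget")],
)
conn.execute("INSERT INTO shipment VALUES (1, 2, '2009-04-05')")

# Repeating an existing (order_id, line_no) pair is rejected.
try:
    conn.execute("INSERT INTO order_line VALUES (1, 1, 'duplicate')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Joining on a composite key requires one predicate per key column.
joined = conn.execute("""
    SELECT ol.product, s.shipped_on
    FROM order_line AS ol
    JOIN shipment AS s
      ON s.order_id = ol.order_id
     AND s.line_no  = ol.line_no
""").fetchall()
print(duplicate_rejected, joined)
```

Replacing the pair (order_id, line_no) with a single surrogate key column would reduce the join to one predicate, at the cost of an extra column and a separate uniqueness constraint on the original pair.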