12 April 2009

DBMS: Index Selectivity (Definitions)

"The ratio of duplicate key values in an index. An index is selective when it lets the optimizer pinpoint a single row, such as a search for a unique key. An index on nonunique entries is less selective. An index on values such as 'M' or 'F' (for male or female) is extremely nonselective." (Karen Paulsell et al, "Sybase SQL Server: Performance and Tuning Guide", 1996)

"A ratio of the number of rows accessed for a query to die total number of rows in a table." (Owen Williams, "MCSE TestPrep: SQL Server 6.5 Design and Implementation", 1998)

"The number of distinct values divided by the total number of values. For example, if the values are {A,A,B,B} then the number of distinct values, that is, the number that a SELECT DISTINCT . . . statement would return, is two, while the total number of values is four.  So selectivity is 2/4 in this case. Selectivity is usually expressed as a percentage ('selectivity is 50%'), but some express it as a ratio ('selectivity is 0.5') instead. " (Peter Gulutzan & Trudy Pelzer, "SQL Performance Tuning", 2002)

"This is a term used to describe the number of duplicate values in a column." (Joseph L Jorden & Dandy Weyn, "MCTS Microsoft SQL Server 2005: Implementation and Maintenance Study Guide - Exam 70-431", 2006)

"The degree of unique values for the data contained in a table column." (Victor Isakov et al, "MCITP Administrator: Microsoft SQL Server 2005 Optimization and Maintenance (70-444) Study Guide", 2007)

"A measure of how likely an index will be used in query processing." (Carlos Coronel et al, "Database Systems: Design, Implementation, and Management" 9th Ed., 2011)

"A value indicating the proportion of a row set retrieved by a predicate or combination of predicates, for example, WHERE last_name = 'Smith'. A selectivity of 0 means that no rows pass the predicate test, whereas a value of 1 means that all rows pass the test.  The adjective selective means roughly "choosy." Thus, a highly selective query returns a low proportion of rows (selectivity close to 0), whereas an unselective query returns a high proportion of rows (selectivity close to 1)." (Oracle, "Database SQL Tuning Guide Glossary", 2013)

[selectivity function:] "A function that calculates the percentage of rows that will be returned by a filter function in the WHERE clause of a query. The optimizer uses selectivity information to determine the fastest way to execute an SQL query." (IBM, "Informix Servers 12.1", 2014)

"The probability that any table row will satisfy a predicate." (IBM, "Informix Servers 12.1", 2014)

"A property of data distribution, the number of distinct values in a column (its cardinality) divided by the number of records in the table. High selectivity means that the column values are relatively unique, and can retrieved efficiently through an index. If you (or the query optimizer) can predict that a test in a WHERE clause only matches a small number (or proportion) of rows in a table, the overall query tends to be efficient if it evaluates that test first, using an index." (MySQL, "MySQL 8.0 Reference Manual Glossary")

"In a query, the measure of how many rows from a row set pass a predicate test, for example, WHERE last_name = 'Smith'. A selectivity of 0.0 means no rows, whereas a value of 1.0 means all rows. A predicate becomes more selective as the value approaches 0.0 and less selective (or more unselective) as the value approaches 1.0." (Oracle, "Oracle Database Concepts", 0)

No comments:

Related Posts Plugin for WordPress, Blogger...

About Me

My photo
IT Professional with more than 24 years experience in IT in the area of full life-cycle of Web/Desktop/Database Applications Development, Software Engineering, Consultancy, Data Management, Data Quality, Data Migrations, Reporting, ERP implementations & support, Team/Project/IT Management, etc.