Disclaimer: This is a work in progress intended to consolidate information from various sources. It considers only on-premise SQL Server; for other platforms, please refer to the documentation.
Last updated: 15-Feb-2024
[SQL Server] columnstore indexes (CI)
- {def} a technology for storing, retrieving and managing data by using a columnar data format (aka columnstore)
- store compressed data on a per-column rather than a per-row basis [5]
- {benefit} designed for analytics and data warehousing workloads
- data warehousing
- {scenario} store fact tables and large dimension tables
- ⇐ tend to require full table scans rather than table seeks
- analytics workloads
- {scenario} [SQL Server 2016 SP1] can be used for real-time analytics on operational databases
- ⇐ an updatable nonclustered columnstore index can be created on a rowstore table
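A minimal sketch of such an updatable nonclustered columnstore index on a rowstore table (table, column, and index names are illustrative; the filter is optional and keeps hot rows out of the columnstore):

```sql
-- hypothetical operational table; only analytic columns are indexed
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_Orders
ON dbo.Orders (OrderDate, CustomerID, Quantity, Amount)
WHERE OrderStatus = 'Shipped';
```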
- {benefit} performance increase
- can achieve up to 100x better performance [4]
- offers an order of magnitude better performance than a rowstore index
- {feature} uses batch mode execution
- improves query performance typically by two to four times
- have high performance gains for analytic queries that scan large amounts of data, especially on large tables (>1 million rows)
- {benefit} reduces significantly the data warehouse storage costs
- {feature} data compression
- ⇒ provides high compression rates, typically by 10 times
- ⇒ reduces total I/O from the physical media
- ⇐ queries often select only a few columns from a table
- minimizes or eliminates system I/O bottlenecks
- reduces significantly the memory footprint
- ⇒ query performance can improve
- because SQL Server can perform more query and data operations in memory
- {benefit} built in memory
- ⇒ sufficient memory must be available
- {benefit} part of the database engine
- no special hardware is needed
- {concept} columnstore
- {def} data structure logically organized as a table with rows and columns, and physically stored in a column-wise data format
- stores values from the same domain which commonly have similar values
- when a query references a column, then only that column is fetched from disk [3]
- ⇐ the columns not requested are skipped
- ⇒ they are not loaded into memory
- when a query is executed, the rows must be reconstructed
- ⇒ row reconstruction takes some time and uses some CPU and memory resources [3]
- [SQL Server 2016] columnstore index on rowstore tables
- columnstore is updated when data changes in the rowstore table
- both indexes work against the same data
- {concept}rowstore
- {def} data that's logically organized as a table with rows and columns, and physically stored in a row-wise data format
- ⇐ the traditional way to store relational table data
- refers to a table where the underlying data storage format is either
- a heap
- a clustered index
- a memory-optimized table
- {concept} rowstore index
- performs best on queries that seek into the data, when searching for a particular value, or for queries on a small range of values
- ⇒ appropriate for transactional workloads
- because they tend to require mostly table seeks instead of table scans
- {concept} rowgroup
- {def} a group of rows that are compressed into columnstore format at the same time
- {constraint} has a maximum number of rows per rowgroup: 1,048,576 (= 2^20) rows
- contains one column segment for every column in the table
- [columnstore index] can have more than one delta rowgroup; together these form the deltastore
- e.g. when multiple threads create columnstore indexes using parallel execution plans [5]
- ⇐ each thread will work with its own subset of data, creating separate rowgroups [5]
- [partitions] each table partition has its own set of row groups [5]
- ⇐ too many partitions may prevent workloads from benefiting from a CCI [11]
- ⇐ data aren’t pushed into a compressed columnstore segment until the rowgroup limit is reached
- {event} rowgroup is compressed
- marked as read-only [16]
- a compressed rowgroup is considered fragmented when either
- its row count < the rowgroup limit and the trim_reason is other than DICTIONARY_SIZE [15]
- ⇐ when the trim reason is DICTIONARY_SIZE, nothing can be done to increase the number of rows [15]
- it has a number of deleted rows that exceeds a minimum threshold [15]
- {event} all data from rowgroup deleted
- transitions from COMPRESSED into TOMBSTONE state
- later removed by the tuple-mover background process
- {event} rows in the columnstore indexes can be moved to different locations
- the row-ids in the nonclustered indexes aren’t updated
- ⇐ the mappings between old and new row locations are stored in an internal structure (aka mapping index)
- {event} rowgroup build
- all column data are combined on a per-row group basis, encoded and compressed [5]
- the rows within a row group can be rearranged if that helps to achieve a better compression rate [5]
- {feature} data compression
- the table is sliced into rowgroups, and each rowgroup is compressed in a column-wise manner
- the number of rows in the rowgroup must be
- large enough to improve compression rates
- small enough to benefit from in-memory operations
- having too many small rowgroups decreases columnstore index’s quality
- uses its own compression mechanism
- ⇒ row or page compression cannot be used on it [3]
- [SQL Server 2016] page compression has been removed
- ⇐ in some cases, page compression disallowed the creation of columnstore indexes with a very large number of columns [5]
- {feature} compression delay
- computed when a delta rowgroup is closed [7]
- keeps the ‘active’ rows in delta rowgroup and only transition these rows to compressed rowgroup after a specified delay [7]
- ⇐ reduces the overall maintenance overhead of NCCI [7]
- ⇒ leads to a larger number of delta rowgroups [7]
- {best practice} if the workload is primarily inserting data and querying it, the default COMPRESSION_DELAY of 0 is the recommended option [7]
- {best practice} [OLTP workload] if > 10% rows are marked deleted in recently compressed rowgroups, then consider a value that accommodates the behavior [7]
- via: create nonclustered columnstore index with (compression_delay= 150)
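The delay mentioned above can be specified as follows (a sketch; table and column names are illustrative, the delay is expressed in minutes):

```sql
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_Orders
ON dbo.Orders (OrderDate, Quantity, Amount)
WITH (COMPRESSION_DELAY = 150);  -- minutes
```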
- {feature} data encoding
- all values in the data are replaced with 64-bit integers using one of two encoding algorithms
- {concept} dictionary encoding
- stores distinct values from the data in a separate structure (aka dictionary)
- every value in a dictionary has a unique ID assigned [5]
- the ID is used for replacement
- {concept} global dictionary
- shared across all segments that belong to the same index partition [5]
- {concept} local dictionary
- created for individual segments using values that are not present in the global dictionary
- {concept} value-based encoding
- mainly used for numeric and integer data types that do not have enough duplicated values [5]
- dictionary encoding would be inefficient [5]
- converts integer and numeric values to a smaller range of 64-bit integers in 2 steps
- {step} [numeric data types] are converted to integers using the minimum positive exponent (aka magnitude that allows this conversion) [5]
- {goal} convert all numeric values to integers [5]
- [integer data types] the smallest negative exponent is chosen that can be applied to all values without losing their precision [5]
- {goal} reduce the interval between the minimum and maximum values stored in the segment [5]
- {step} the minimum value (aka base value) in the segment is identified and subtracted from all other values [5]
- ⇒ makes the minimum value in the segment number 0 [5]
- after encoding the data are compressed and stored as a LOB allocation unit
- {concept} column segment
- {def} a column of data from within the rowgroup
- is compressed together and stored on physical media
- SQL Server loads an entire segment to memory when it needs to access its data
- {concept} segment metadata
- store metadata about each segment
- e.g. minimum and maximum values
- ⇐ segments that do not have the required data are skipped [5]
- {concept} deltastore
- {def} all of the delta rowgroups of a columnstore index
- its operations are handled behind the scenes
- can be in either of two states
- {state} open (aka open delta store)
- accepts new rows and allow modifications and deletions of data
- {state} closed (aka closed delta store)
- a delta store is closed when it reaches its rowgroup limit
- {concept} delta rowgroup
- {def} a clustered B-tree index that's used only with columnstore indexes
- improves columnstore compression and performance by accumulating rows until their number reaches the rowgroup limit, after which they are moved into the columnstore
- {event} reaches the maximum number of rows
- it transitions from an ‘open’ to ‘closed’ state
- a closed rowgroup is compressed by the tuple-mover and stored into the columnstore as COMPRESSED rowgroup
- {event} compressed
- the existing delta rowgroup transitions into TOMBSTONE state to be removed later by the tuple-mover when there is no reference to it
- {concept} tuple-mover
- background process that checks for closed rowgroups
- if it finds a closed rowgroup, it compresses the delta rowgroup and stores it into the columnstore as a COMPRESSED rowgroup
- {concept} clustered columnstore index (CCI)
- is the primary storage for the entire table
- {characteristic} updatable
- has two structures that support data modifications
- ⇐ both use the B-Tree format to store data [5]
- ⇐ created on demand [5]
- delete bitmap
- indicates which rows were deleted from a table
- upon deletion the row continues to be stored into the rowgroup
- during query execution SQL Server checks the delete bitmap and excludes deleted rows from the processing [5]
- delta store
- includes newly inserted rows
- updating a row triggers the deletion of the existing row and the insertion of a new version of the row into a delta store
- ⇒ the update does not change the compressed row data in place
- ⇒ the old version of the row is marked as deleted in the delete bitmap
- [partitions] each partition can have a single delete bitmap and multiple delta stores
- ⇐ this makes each partition self-contained and independent from other partitions
- ⇒ allows performing a partition switch on tables that have clustered columnstore indexes defined [5]
- {feature} supports minimal logging for batch sizes >= rowgroup’s limit [12]
- [SQL Server 2017] supports non-persisted computed columns in clustered columnstore indexes [2]
- stores some data temporarily in a clustered index (aka deltastore) and a B-Tree list of IDs for deleted rows
- ⇐ {benefit} reduces fragmentation of the column segments and improves performance
- combines query results from both the columnstore and the deltastore to return the correct query results
- [partitions] too many partitions can hurt the performance of a clustered columnstore index [11]
- {concept} nonclustered columnstore index (NCCI)
- {def} a secondary index that's created on a rowstore table
- is defined as one or more columns of the table and has an optional condition that filters the rows
- designed to be used for workloads involving a mix of transactional and analytics operations
- functions the same as a clustered columnstore index
- ⇐ has same performance optimizations (incl. batchmode operators)
- {exception} doesn’t support persisted computed columns
- can’t be created on a columnstore index that has a computed column [2]
- however, it behaves differently between the various versions of SQL Server
- [SQL Server 2012|2014] {restriction} read-only
- contains a copy of part or all of the rows and columns in the underlying table
- includes a row-id, which is either the address of
- a row in a heap table
- a clustered index key value
- includes all columns from the clustered index even when not explicitly defined in the CREATE statement
- the columns not explicitly specified will not be shown in the sys.index_columns view
- [SQL Server 2016] multiple nonclustered rowstore indexes can be created on a columnstore index and perform efficient table seeks on the underlying columnstore
- ⇒ once created, makes it possible to drop one or more btree nonclustered indexes
- enables real-time operational analytics where the OLTP workload uses the underlying clustered index while analytics run concurrently on the columnstore index
- {concept} batch mode execution (aka vector-based execution, vectorized execution)
- {def} query processing method used to process multiple rows together in groups of rows, or batches, rather than one row at a time
- SQL Server can push a predicate to the columnstore index scan operator, preventing unnecessary rows from being loaded into the batch [5]
- queries can process up to 900 rows together
- enables efficient query execution (by a 3-4x factor) [4]
- ⇐ the size of the batches varies to fit into the CPU cache
- ⇒ reduces the number of times that the CPU needs to request external data from memory or other components [5]
- improves the performance of aggregations, which can be calculated on a per-batch rather than a per-row basis [5]
- tries to minimize the copy of data between operators by creating and maintaining a special bitmap that indicates if a row is still valid in the batch [5]
- ⇐ subsequent operators will ignore the non-valid rows
- every operator has a queue of work items (batches) to process [5]
- worker threads from a shared pool pick items from queues and process them while migrating from operator to operator [5]
- is closely integrated with, and optimized around, the columnstore storage format.
- columnstore indexes use batch mode execution
- ⇐ improves query performance typically by two to four times
- {concept} tuple mover
- single-threaded process that works in the background, preserving system resources
- runs every five minutes
- converts closed delta stores to row groups that store data in a column-based storage format [5]
- can be disabled via trace flag 634
- ⇐ the conversion of closed delta stores to row groups can be forced by reorganizing an index [5]
- runs in parallel using multiple threads
- decreases significantly conversion time at a cost of extra CPU load and memory usage [5]
- via: ALTER INDEX REORGANIZE command
- it doesn’t prevent other sessions from inserting new data into a table [5]
- deletions and data modifications would be blocked for the duration of the operation [5]
- {recommendation} consider forcing index reorganization manually to reduce execution, and therefore locking, time [5]
- [columnstore index] considered fragmented if it has
- multiple delta rowgroups
- deleted rows
- require maintenance like that of regular B-Tree indexes [5]
- {issue} partially populated rowgroups
- {issue} overhead of delta store and delete bitmap scans during query execution
- rebuilding the columnstore index addresses the issues
- the strategy depends on the volatility of the data and the ETL processes implemented in the system [5]
- {recommendation} rebuild indexes when a table has a considerable volume of deleted rows and/or a large number of partially populated rowgroups [5]
- {recommendation} rebuild partition(s) that still have a large number of rows in open delta stores after the ETL process has completed, especially if the ETL process does not use a bulk insert API [5]
- creating/dropping/disabling/rebuilding functions like any other index
- columnstore statistics
- a statistics object is created at the time of columnstore index creation; however, it is neither populated nor updated afterward [5]
- ⇐ SQL Server relies on segment information, B-Tree indexes (when available), and column-level statistics when deciding if a columnstore index needs to be used [5]
- it is beneficial to create missing column-level statistics on the columns that participate in a columnstore index and are used in query predicates and as join keys [5]
- ⇐ statistics rarely update automatically on very large tables [5]
- ⇒ statistics must be updated ‘manually’
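A sketch of creating and refreshing column-level statistics manually (object names are illustrative):

```sql
-- column-level statistics on a predicate/join-key column
CREATE STATISTICS ST_FactSales_CustomerID
ON dbo.FactSales (CustomerID);

-- refresh manually, e.g. after a large ETL load
UPDATE STATISTICS dbo.FactSales ST_FactSales_CustomerID WITH FULLSCAN;
```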
- [SQL Server 2019] included into the schema-only clone of a database functionality [8]
- enables performance troubleshooting without the need to manually capture the statistics information
- via: DBCC CLONEDATABASE
- [in-memory tables]
- {limitation} a columnstore index must include all the columns and can’t have a filtered condition [2]
- {limitation} queries on columnstore indexes run only in InterOP mode, and not in the in-memory native mode [2]
- {operation} designing columnstore indexes
- {best practice} understand as much as possible data’s characteristics
- {best practice} identify workload’s characteristics
- {operation} create a clustered columnstore index
- via CREATE CLUSTERED COLUMNSTORE INDEX command
- there’s no need to specify any columns in the statement
- ⇐ the index will include all table columns
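For example (table name is illustrative):

```sql
-- no column list: the index becomes the primary storage
-- for all columns of the table
CREATE CLUSTERED COLUMNSTORE INDEX CCI_FactSales
ON dbo.FactSales;
```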
- {operation} index rebuilding
- forces SQL Server to remove deleted rows physically from the index and to merge the delta stores’ and row groups’ data [5]
- all column segments are recreated with row groups fully populated [5]
- [<SQL Server 2019] offline operation
- [SQL Server 2019 Enterprise] online operation
- ⇒ higher availability
- ⇐ pausing and resuming create and rebuild operations are not supported [11]
- very resource intensive process
- holds a schema modification (Sch-M) lock on the table
- ⇒ prevents other sessions from accessing it [5]
- ⇐ the overhead can be mitigated by using table/index partitioning
- ⇒ indexes will be rebuilt on a partition basis for those partitions with volatile data [5]
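A partition-level rebuild might look like this (index, table, and partition number are illustrative):

```sql
-- rebuild only the volatile partition instead of the whole index
ALTER INDEX CCI_FactSales ON dbo.FactSales
REBUILD PARTITION = 3;
```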
- {operation} index reorganization
- [<SQL Server 2019] a reorganize operation is required to merge smaller COMPRESSED rowgroups, following an internal threshold policy that determines how to remove deleted rows and combine the compressed rowgroups
- [SQL Server 2019] a background merge task also works to merge COMPRESSED rowgroups from which a large number of rows has been deleted
- ⇐ after merging smaller rowgroups, the index quality should be improved.
- the tuple-mover is helped by a background merge task that automatically compresses smaller OPEN delta rowgroups that have existed for some time (as determined by an internal threshold), or merges COMPRESSED rowgroups from which a large number of rows has been deleted
- via: ALTER INDEX REORGANIZE command
- [SQL Server 2016] performs additional defragmentation
- removes deleted rows from row groups that have 10 or more percent of the rows logically deleted [5]
- merges closed rowgroups together, keeping the total number of rows less than or equal to the rowgroup limit [5]
- ⇐ both processes can be done together [5]
- [SQL Server 2014] the only action performed is compressing and moving the data from closed delta stores to rowgroups [5]
- ⇐ delete bitmap and open delta stores stay intact [5]
- via: ALTER INDEX REORGANIZE
- uses all available system resources while it is running [5]
- ⇒ speeds up the execution process
- reduce the time during which other sessions cannot modify or delete data in a table [5]
- close and compress all open row groups
- via: ALTER INDEX REORGANIZE WITH (COMPRESS_ALL_ROW_GROUPS = ON)
- row groups aren’t merged during this operation [5]
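The two reorganize variants side by side (index and table names are illustrative):

```sql
-- merge/cleanup per the internal policy
ALTER INDEX CCI_FactSales ON dbo.FactSales REORGANIZE;

-- additionally close and compress all open rowgroups
ALTER INDEX CCI_FactSales ON dbo.FactSales
REORGANIZE WITH (COMPRESS_ALL_ROW_GROUPS = ON);
```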
- {operation} estimate compression savings
- [SQL Server 2019] COLUMNSTORE and COLUMNSTORE_ARCHIVE added
- allows estimating the space savings if either of these indexes are used on a table [8]
- {limitation} not available in all editions
- via: sp_estimate_data_compression_savings
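A sketch of estimating the savings (object names are illustrative):

```sql
EXEC sp_estimate_data_compression_savings
    @schema_name = 'dbo',
    @object_name = 'FactSales',
    @index_id = NULL,
    @partition_number = NULL,
    @data_compression = 'COLUMNSTORE';
```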
- {operation} [bulk loads] when the number of rows is less than the bulk-load threshold (102,400 rows), all the rows go directly to the deltastore
- [large bulk load] most of the rows go directly to the columnstore without passing through the deltastore
- some rows at the end of the bulk load might be too few in number to meet the minimum size of a rowgroup
- ⇒ the final rows go to the deltastore instead of the columnstore
- bulk insert operations provide the number of rows in the batch as part of the API call [5]
- best results are achieved by choosing a batch size that is divisible by rowgroup’s limit [5]
- ⇐ guarantees that every batch produces one or several fully populated row groups [5]
- ⇒ reduce the total number of row groups in a table [5]
- ⇒ improves query performance
- ⇐ the batch size shouldn’t exceed rowgroup’s limit [5]
- rowgroups can still be created on the fly, in a manner similar to a bulk insert, when the size of the insert batch is close to or exceeds the rowgroup limit [5]
- {operation} [non-bulk operations] trickle inserts go directly to a delta store
- {feature} parallel inserts
- [SQL Server 2016] requires following conditions for parallel insert on CCI [6]
- must specify TABLOCK
- no nonclustered indexes on the clustered columnstore index
- no identity column
- database compatibility level is set to 130
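A sketch of a load that satisfies these conditions (table names are illustrative):

```sql
-- TABLOCK allows the insert into the CCI to run in parallel
INSERT INTO dbo.FactSales WITH (TABLOCK)
SELECT OrderDate, CustomerID, Quantity, Amount
FROM dbo.StagingSales;
```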
- {recommendation} minimize the use of string columns in fact tables [5]
- string data use more space
- their encoding involves additional overhead during batch mode execution [5]
- queries with predicates on string columns may have less efficient execution plans that also require significantly larger memory grants as compared to their non-string counterparts [5]
- {limitation} [SQL Server 2012|2014] string predicates are not pushed down toward the lowest operators in execution plans
- {recommendation} add another dimension table and replace the string value in the fact table with a synthetic, integer-based ID key that references the new table [5]
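A sketch of such a surrogate-key dimension (all names are illustrative):

```sql
CREATE TABLE dbo.DimStatus (
    StatusID int IDENTITY(1,1) PRIMARY KEY,
    StatusName varchar(50) NOT NULL
);
-- the fact table then stores StatusID (int) instead of the string,
-- joining to DimStatus only when the name is needed
```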
- {operation} upgrading to SQL Server 2016
- make sure that queries against the tables with columnstore indexes can utilize parallelism if the database compatibility level is less than 130 [5]
- {feature} [SQL Server 2019] automated columnstore index maintenance [8]
- {improvement} [SQL Server 2019] better columnstore metadata memory management
- {improvement} [SQL Server 2019] low-memory load path for columnstore tables
- {improvement} [SQL Server 2019] improved performance for bulk loading to columnstore indexes
- {improvement} [SQL Server 2019] server startup process has been made faster for databases that use in-memory columnstore tables for HTAP
- {feature} DMVs
- sys.column_store_segments
- returns one row for each column segment
- sys.column_store_dictionaries
- provides information about the dictionaries used by a columnstore index
- sys.dm_db_column_store_row_group_physical_stats
- rowgroup statuses
- sys.column_store_row_groups
- provides clustered columnstore index information on a per-rowgroup basis
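A sketch of a rowgroup-health query over the physical-stats DMV:

```sql
SELECT object_name(object_id) AS table_name,
       row_group_id, state_desc,
       total_rows, deleted_rows, trim_reason_desc
FROM sys.dm_db_column_store_row_group_physical_stats
ORDER BY table_name, row_group_id;
```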
[1] SQL Docs (2020) Columnstore indexes: Overview [link]
[3] Dejan Sarka et al (2012) Exam 70-463: Implementing a Data Warehouse with Microsoft SQL Server 2012 (Training Kit)
CCI - clustered columnstore index
CI - columnstore index
DBCC - Database Console Commands
DMV - Dynamic Management View
ETL - Extract, Transform, Load
HTAP - Hybrid Transactional/Analytical Processing
LOB - Large Object
NCCI - nonclustered columnstore index
OLTP - On-Line Transaction Processing
SP - Service Pack