SQL Troubles: SQL Server 2016

16 February 2025

💠🛠️🗒️SQL Server: Columnstore Indexes [Notes]

Disclaimer: This is work in progress intended to consolidate information from various sources. It considers only on-premises SQL Server; for other platforms, please refer to the documentation.

Last updated: 15-Feb-2024

[SQL Server] columnstore indexes (CI)

{def} a technology for storing, retrieving and managing data by using a columnar data format (aka columnstore)

store compressed data on a per-column rather than a per-row basis [5]

{benefit} designed for analytics and data warehousing workloads

data warehousing

{scenario} store fact tables and large dimension tables

⇐ tend to require full table scans rather than table seeks

analytics workloads

{scenario} [SQL Server 2016 SP1] can be used for real-time analytics on operational databases

⇐ an updatable nonclustered columnstore index can be created on a rowstore table

{benefit} performance increase

can achieve up to 100x better performance [4]
offers an order of magnitude better performance than a rowstore index

{feature} uses batch mode execution

improves query performance typically by two to four times

have high performance gains for analytic queries that scan large amounts of data, especially on large tables (>1 million rows)

{benefit} significantly reduces data warehouse storage costs

{feature} data compression

⇒ provides high compression rates, typically around 10 times

⇒ reduces total I/O from the physical media

⇐ queries often select only a few columns from a table
minimizes or eliminates system I/O bottlenecks

significantly reduces the memory footprint

⇒ query performance can improve

because SQL Server can perform more query and data operations in memory

{benefit} built in memory

⇒ sufficient memory must be available

{benefit} part of the database engine

no special hardware is needed

{concept} columnstore

{def} data structure logically organized as a table with rows and columns, and physically stored in a column-wise data format

stores values from the same domain which commonly have similar values

when a query references a column, then only that column is fetched from disk [3]

⇐ the columns not requested are skipped

⇒ they are not loaded into memory

when a query is executed, the rows must be reconstructed

⇒ row reconstruction takes some time and uses some CPU and memory resources [3]

[SQL Server 2016] columnstore index on rowstore tables

columnstore is updated when data changes in the rowstore table

both indexes work against the same data

{concept} rowstore

{def} data that's logically organized as a table with rows and columns, and physically stored in a row-wise data format

⇐ the traditional way to store relational table data
refers to a table where the underlying data storage format is either

a heap
a clustered index
a memory-optimized table

{concept} rowstore index

performs best on queries that seek into the data, when searching for a particular value, or for queries on a small range of values

⇒ appropriate for transactional workloads

because they tend to require mostly table seeks instead of table scans

{concept} rowgroup

{def} a group of rows that are compressed into columnstore format at the same time

{constraint} has a maximum number of rows per rowgroup, which is 1,048,576 (= 2^20) rows
contains one column segment for every column in the table
can have more than one delta rowgroup that form the deltastore

e.g. when multiple threads create columnstore indexes using parallel execution plans [5]

⇐ each thread will work with its own subset of data, creating separate rowgroups [5]

[partitions] each table partition has its own set of row groups [5]

⇐ too many partitions may prevent workloads from benefiting from a CCI [11]

⇐ data aren’t pushed into a compressed columnstore segment until the rowgroup limit is reached

{event} rowgroup is compressed

marked as read-only [16]
a compressed rowgroup is considered fragmented when either

row number < rowgroup limit and the trim_reason is other than DICTIONARY_SIZE

⇐ if the dictionary size reached the maximum, nothing can be done to increase the number of rows [15]

it has nonzero deleted rows that exceed a minimum threshold [15]

{event} all data from rowgroup deleted

transitions from COMPRESSED into TOMBSTONE state
later removed by the tuple-mover background process

{event} rows in the columnstore indexes can be moved to different locations

row-id in the nonclustered indexes aren’t updated

⇐ the mappings between old and new row locations are stored in an internal structure (aka mapping index)

{event} rowgroup build

all column data are combined on a per-row group basis, encoded and compressed [5]

the rows within a row group can be rearranged if that helps to achieve a better compression rate [5]

{feature} data compression

the table is sliced into rowgroups, and each rowgroup is compressed in a column-wise manner

the number of rows in the rowgroup must be

large enough to improve compression rates
small enough to benefit from in-memory operations

having too many small rowgroups decreases columnstore index’s quality

uses its own compression mechanism

⇒ row or page compression cannot be used on it [3]
[SQL Server 2016] page compression has been removed

⇐ in some cases, page compression disallowed the creation of columnstore indexes with a very large number of columns [5]

{feature} compression delay

computed when a delta rowgroup is closed [7]
keeps the ‘active’ rows in delta rowgroup and only transition these rows to compressed rowgroup after a specified delay [7]

⇐ reduces the overall maintenance overhead of NCCI [7]
⇒ leads to a larger number of delta rowgroups [7]

{best practice} if the workload is primarily inserting data and querying it, the default COMPRESSION_DELAY of 0 is the recommended option [7]
{best practice} [OLTP workload] if > 10% rows are marked deleted in recently compressed rowgroups, then consider a value that accommodates the behavior [7]

via: CREATE NONCLUSTERED COLUMNSTORE INDEX ... WITH (COMPRESSION_DELAY = 150)
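
A minimal sketch of how the compression delay might be set, assuming a hypothetical dbo.Orders table and index name (neither comes from the sources above). The value is expressed in minutes:

```sql
-- Hypothetical OLTP table, for illustration only.
CREATE TABLE dbo.Orders (
    OrderId     int           NOT NULL PRIMARY KEY,
    CustomerId  int           NOT NULL,
    OrderDate   date          NOT NULL,
    Amount      decimal(18,2) NOT NULL
);

-- Keep rows in the delta rowgroup for at least 150 minutes after it is closed,
-- so that recently changed ("hot") rows are not compressed too early.
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_Orders
ON dbo.Orders (OrderId, CustomerId, OrderDate, Amount)
WITH (COMPRESSION_DELAY = 150);

-- The delay can be adjusted later without recreating the index.
ALTER INDEX NCCI_Orders ON dbo.Orders
SET (COMPRESSION_DELAY = 60 MINUTES);
```

The right value depends on how long rows typically stay "hot" in the workload, so both numbers above are placeholders rather than recommendations.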

{feature} data encoding

all values in the data are replaced with 64-bit integers using one of two encoding algorithms
{concept} dictionary encoding

stores distinct values from the data in a separate structure (aka dictionary)

every value in a dictionary has a unique ID assigned [5]

the ID is used for replacement

{concept} global dictionary

shared across all segments that belong to the same index partition [5]

{concept} local dictionary

created for individual segments using values that are not present in the global dictionary

{concept} value-based encoding

mainly used for numeric and integer data types that do not have enough duplicated values [5]

dictionary encoding would be inefficient [5]

converts integer and numeric values to a smaller range of 64-bit integers in 2 steps

{step} [numeric data types] are converted to integers using the minimum positive exponent (aka magnitude) that allows this conversion [5]

{goal} convert all numeric values to integers [5]
[integer data types] the smallest negative exponent is chosen that can be applied to all values without losing their precision [5]

{goal} reduce the interval between the minimum and maximum values stored in the segment [5]

{step} the minimum value (aka base value) in the segment is identified and subtracted from all other values [5]

⇒ makes the minimum value in the segment number 0 [5]

after encoding the data are compressed and stored as a LOB allocation unit
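
The two steps of value-based encoding can be illustrated with plain arithmetic. The query below is only a sketch of the idea on three made-up values, not the engine's internal implementation: the values are scaled by 10^2 to become integers, then the smallest one is subtracted as base value.

```sql
-- Illustration only: the sample values and the exponent (2) are assumptions.
WITH Src AS (
    SELECT v FROM (VALUES (1.23), (4.50), (10.01)) AS t(v)
),
Scaled AS (
    -- step 1: multiply by 10^2 so that every value becomes an integer
    SELECT v, CAST(v * 100 AS bigint) AS scaled
    FROM Src
)
-- step 2: subtract the minimum (base value) so the smallest value becomes 0
SELECT v
     , scaled                             -- 123, 450, 1001
     , scaled - MIN(scaled) OVER () AS encoded  -- 0, 327, 878
FROM Scaled
ORDER BY v;
```

The encoded values span a much smaller interval than the originals, which is what allows them to be stored with fewer bits.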

{concept} column segment

{def} a column of data from within the rowgroup
is compressed together and stored on physical media
SQL Server loads an entire segment to memory when it needs to access its data

{concept} segment metadata

store metadata about each segment

e.g. minimum and maximum values
⇐ segments that do not have the required data are skipped [5]
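
The segment metadata can be inspected via the sys.column_store_segments catalog view. A minimal sketch, assuming a hypothetical table dbo.FactSales that has a columnstore index:

```sql
-- dbo.FactSales is a hypothetical table name used for illustration.
SELECT p.partition_number
     , s.column_id
     , s.segment_id
     , s.row_count
     , s.encoding_type
     , s.min_data_id      -- minimum value (or data id) stored in the segment
     , s.max_data_id      -- maximum value (or data id) stored in the segment
     , s.on_disk_size
FROM sys.column_store_segments AS s
     JOIN sys.partitions AS p
       ON p.partition_id = s.partition_id
WHERE p.object_id = OBJECT_ID(N'dbo.FactSales')
ORDER BY p.partition_number, s.column_id, s.segment_id;
```

Segments whose minimum/maximum range does not overlap a query's predicate are the candidates for being skipped.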

{concept} deltastore

{def} all of the delta rowgroups of a columnstore index
its operations are handled behind the scenes

can be in either of two states

{state} open (aka open delta store)

accepts new rows and allows modifications and deletions of data

{state} closed (aka closed delta store)

a delta store is closed when it reaches its rowgroup limit

{concept} delta rowgroup

{def} a clustered B-tree index that's used only with columnstore indexes
improves columnstore compression and performance by storing rows until their number reaches the rowgroup limit, at which point they are moved into the columnstore
{event} reaches the maximum number of rows

it transitions from an ‘open’ to ‘closed’ state
a closed rowgroup is compressed by the tuple-mover and stored into the columnstore as COMPRESSED rowgroup

{event} compressed

the existing delta rowgroup transitions into TOMBSTONE state to be removed later by the tuple-mover when there is no reference to it

{concept} tuple-mover

background process that checks for closed rowgroups

if it finds a closed rowgroup, it compresses the delta rowgroup and stores it into the columnstore as a COMPRESSED rowgroup

{concept} clustered columnstore index (CCI)

is the primary storage for the entire table
{characteristic} updatable

has two structures that support data modifications

⇐ both use the B-Tree format to store data [5]
⇐ created on demand [5]
delete bitmap

indicates which rows were deleted from a table
upon deletion the row continues to be stored into the rowgroup
during query execution SQL Server checks the delete bitmap and excludes deleted rows from the processing [5]

delta store

includes newly inserted rows
updating a row triggers the deletion of the existing row and insertion of a new version of a row to a delta store

⇒ the update does not change the row data in place
⇒ the old version of the row is marked in the delete bitmap

[partitions] each partition can have a single delete bitmap and multiple delta stores

⇐ this makes each partition self-contained and independent from other partitions

⇒ allows performing a partition switch on tables that have clustered columnstore indexes defined [5]

{feature} supports minimal logging for batch sizes >= rowgroup’s limit [12]
[SQL Server 2017] supports non-persisted computed columns in clustered columnstore indexes [2]
store some data temporarily into a clustered index (aka deltastore) and a btree list of IDs for deleted rows

⇐ {benefit} reduces fragmentation of the column segments and improves performance
combines query results from both the columnstore and the deltastore to return the correct query results

[partitions] too many partitions can hurt the performance of a clustered columnstore index [11]

{concept} nonclustered columnstore index (NCCI)

{def} a secondary index that's created on a rowstore table

is defined as one or more columns of the table and has an optional condition that filters the rows
designed to be used for workloads involving a mix of transactional and analytics workloads
functions the same as a clustered columnstore index

⇐ has the same performance optimizations (incl. batch mode operators)
{exception} doesn’t support persisted computed columns

can’t be created on a columnstore index that has a computed column [2]

however, it behaves differently across the various versions of SQL Server

[SQL Server 2012|2014] {restriction} read-only

contains a copy of part or all of the rows and columns in the underlying table

includes a row-id, which is either the address of

a row in a heap table
a clustered index key value

includes all columns from the clustered index even when not explicitly defined in the CREATE statement

the unspecified columns will not be available in the sys.index_columns view

[SQL Server 2016] multiple nonclustered rowstore indexes can be created on a columnstore index and perform efficient table seeks on the underlying columnstore

⇒ once created, makes it possible to drop one or more btree nonclustered indexes

enables real-time operational analytics where the OLTP workload uses the underlying clustered index while analytics run concurrently on the columnstore index
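
A minimal sketch of a filtered nonclustered columnstore index on a rowstore table (SQL Server 2016 and later). The table, the index name and the meaning of the status value are assumptions made for illustration:

```sql
-- Hypothetical OLTP table; OrderStatus = 5 is assumed to mean "shipped".
CREATE TABLE dbo.SalesOrders (
    OrderId     int           NOT NULL PRIMARY KEY CLUSTERED,
    CustomerId  int           NOT NULL,
    OrderStatus tinyint       NOT NULL,
    OrderDate   date          NOT NULL,
    Amount      decimal(18,2) NOT NULL
);

-- Only the "cold" rows (shipped orders) are maintained in the columnstore;
-- rows that are still being modified stay in the rowstore only.
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_SalesOrders_Shipped
ON dbo.SalesOrders (CustomerId, OrderStatus, OrderDate, Amount)
WHERE OrderStatus = 5;

-- An analytic query that can use the columnstore index,
-- while OLTP statements keep using the clustered rowstore index.
SELECT CustomerId
     , SUM(Amount) AS TotalAmount
FROM dbo.SalesOrders
WHERE OrderStatus = 5
GROUP BY CustomerId;
```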

{concept} batch mode execution (aka vector-based execution, vectorized execution)

{def} query processing method used to process multiple rows together in groups of rows, or batches, rather than one row at a time

SQL Server can push a predicate to the columnstore index scan operator, preventing unnecessary rows from being loaded into the batch [5]
queries can process up to 900 rows together

enables efficient query execution (by a 3-4x factor) [4]
⇐ the size of the batches varies to fit into the CPU cache
⇒ reduces the number of times that the CPU needs to request external data from memory or other components [5]

improves the performance of aggregations, which can be calculated on a per-batch rather than a per-row basis [5]
tries to minimize the copy of data between operators by creating and maintaining a special bitmap that indicates if a row is still valid in the batch [5]

⇐ subsequent operators will ignore the non-valid rows
every operator has a queue of work items (batches) to process [5]
worker threads from a shared pool pick items from queues and process them while migrating from operator to operator [5]

is closely integrated with, and optimized around, the columnstore storage format.

columnstore indexes use batch mode execution

⇐ improves query performance typically by two to four times

{concept} tuple mover

single-threaded process that works in the background, preserving system resources

runs every five minutes

converts closed delta stores to row groups that store data in a column-based storage format [5]

can be disabled via trace flag 634
⇐ the conversion of closed delta stores to row groups can be forced by reorganizing an index [5]

the reorganization runs in parallel using multiple threads

significantly decreases conversion time at the cost of extra CPU load and memory usage [5]

via: ALTER INDEX REORGANIZE command

it doesn’t prevent other sessions from inserting new data into a table [5]
deletions and data modifications would be blocked for the duration of the operation [5]

{recommendation} consider forcing index reorganization manually to reduce execution, and therefore locking, time [5]

considered fragmented if it has

multiple delta rowgroups
deleted rows

require maintenance like that of regular B-Tree indexes [5]

{issue} partially populated row groups
{issue} overhead of delta store and delete bitmap scans during query execution
rebuilding the columnstore index addresses the issues
the strategy depends on the volatility of the data and the ETL processes implemented in the system [5]

{recommendation} rebuild indexes when a table has a considerable volume of deleted rows and/or a large number of partially populated rowgroups [5]
{recommendation} rebuild partition(s) that still have a large number of rows in open delta stores after the ETL process has completed, especially if the ETL process does not use a bulk insert API [5]

creating/dropping/disabling/rebuilding functions like any other index

columnstore statistics

a statistics object is created at the time of columnstore index creation; however, it is neither populated nor updated afterward [5]

⇐ SQL Server relies on segment information, B-Tree indexes (when available), and column-level statistics when deciding if a columnstore index needs to be used [5]
it is beneficial to create missing column-level statistics on the columns that participate in a columnstore index and are used in query predicates and as join keys [5]

⇐ statistics rarely update automatically on very large tables [5]

⇒ statistics must be updated ‘manually’
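
A minimal sketch of creating and manually refreshing column-level statistics, assuming a hypothetical dbo.FactSales table with a CustomerKey column that is used as a join key:

```sql
-- Table, column and statistics names are hypothetical.
-- Create column-level statistics on a column used in predicates or joins.
CREATE STATISTICS Stat_FactSales_CustomerKey
ON dbo.FactSales (CustomerKey);

-- Refresh the statistics manually, e.g. after a large ETL load.
UPDATE STATISTICS dbo.FactSales Stat_FactSales_CustomerKey
WITH FULLSCAN;
```

FULLSCAN gives the most accurate histogram but can be expensive on very large tables; a sampled update may be an acceptable trade-off.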

[SQL Server 2019] columnstore statistics are included in the schema-only clone of a database [8]

enables performance troubleshooting without the need to manually capture the statistics information

via: DBCC CLONEDATABASE
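
A minimal sketch of creating such a clone; both database names are hypothetical:

```sql
-- Creates a schema-only copy (no user data) of the source database,
-- including statistics, for troubleshooting purposes.
DBCC CLONEDATABASE (SalesDW, SalesDW_Clone);
```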

[in-memory tables]

{limitation} a columnstore index must include all the columns and can’t have a filtered condition [2]
{limitation} queries on columnstore indexes run only in InterOP mode, and not in the in-memory native mode [2]

{operation} designing columnstore indexes

{best practice} understand the data’s characteristics as well as possible
{best practice} identify workload’s characteristics

{operation} create a clustered columnstore index

via CREATE CLUSTERED COLUMNSTORE INDEX command
no columns need to be specified in the statement

⇐ the index will include all table columns
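
A minimal sketch, assuming a hypothetical fact table (table, column and index names are made up for illustration):

```sql
-- Hypothetical fact table, created as a heap.
CREATE TABLE dbo.FactSales (
    DateKey     int           NOT NULL,
    CustomerKey int           NOT NULL,
    ProductKey  int           NOT NULL,
    Quantity    int           NOT NULL,
    Amount      decimal(18,2) NOT NULL
);

-- No column list: the clustered columnstore index covers all table columns
-- and becomes the primary storage of the table.
CREATE CLUSTERED COLUMNSTORE INDEX CCI_FactSales
ON dbo.FactSales;
```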

{operation} index rebuilding

forces SQL Server to remove deleted rows physically from the index and to merge the delta stores’ and row groups’ data [5]

all column segments are recreated with row groups fully populated [5]

[<SQL Server 2019] offline operation
[SQL Server 2019 Enterprise] online operation

⇒ higher availability
⇐ pausing and resuming create and rebuild operations are not supported [11]

very resource intensive process
holds a schema modification (Sch-M) lock on the table

⇒ prevents other sessions from accessing it [5]
⇐ the overhead can be mitigated by using table/index partitioning

⇒ indexes are rebuilt on a per-partition basis for those partitions with volatile data [5]
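
A minimal sketch of the rebuild variants, assuming a hypothetical partitioned table dbo.FactSales with a clustered columnstore index named CCI_FactSales; the partition number is a placeholder:

```sql
-- Rebuild a single partition (e.g. the one holding the most volatile data).
ALTER INDEX CCI_FactSales ON dbo.FactSales
REBUILD PARTITION = 3;

-- Rebuild the whole index (offline).
ALTER INDEX CCI_FactSales ON dbo.FactSales
REBUILD;

-- [SQL Server 2019 Enterprise] rebuild the whole index online.
ALTER INDEX CCI_FactSales ON dbo.FactSales
REBUILD WITH (ONLINE = ON);
```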

{operation} index reorganization

[<SQL Server 2019] a reorganize operation is required to merge smaller COMPRESSED rowgroups, following an internal threshold policy that determines how to remove deleted rows and combine the compressed rowgroups
[SQL Server 2019] a background merge task also works to merge COMPRESSED rowgroups from where a large number of rows has been deleted

⇐ after merging smaller rowgroups, the index quality should be improved.
the tuple-mover is helped by a background merge task that automatically compresses smaller OPEN delta rowgroups that have existed for some time as determined by an internal threshold, or merges COMPRESSED rowgroups from where a large number of rows has been deleted
via: ALTER INDEX REORGANIZE command

[SQL Server 2016] performs additional defragmentation

removes deleted rows from row groups that have 10 or more percent of the rows logically deleted [5]
merges closed row groups together, keeping the total number of rows less than or equal to the rowgroup’s limit [5]
⇐ both processes can be done together [5]

[SQL Server 2014] the only action performed is compressing and moving the data from closed delta stores to rowgroups [5]

⇐ delete bitmap and open delta stores stay intact [5]

via: ALTER INDEX REORGANIZE

uses all available system resources while it is running [5]

⇒ speeds up the execution process
reduces the time during which other sessions cannot modify or delete data in a table [5]

close and compress all open row groups

via: ALTER INDEX REORGANIZE WITH (COMPRESS_ALL_ROW_GROUPS = ON)
row groups aren’t merged during this operation [5]
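
A minimal sketch of both reorganize variants, assuming a hypothetical table dbo.FactSales with a clustered columnstore index named CCI_FactSales:

```sql
-- Compress closed delta rowgroups and (SQL Server 2016+) defragment
-- compressed rowgroups.
ALTER INDEX CCI_FactSales ON dbo.FactSales
REORGANIZE;

-- Additionally force open delta rowgroups to be closed and compressed,
-- e.g. at the end of an ETL run.
ALTER INDEX CCI_FactSales ON dbo.FactSales
REORGANIZE WITH (COMPRESS_ALL_ROW_GROUPS = ON);
```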

{operation} estimate compression savings

[SQL Server 2019] COLUMNSTORE and COLUMNSTORE_ARCHIVE added

allows estimating the space savings if either of these indexes are used on a table [8]
{limitation} not available in all editions

via: sp_estimate_data_compression_savings
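
A minimal sketch of the call, assuming a hypothetical rowstore table dbo.FactSalesRowstore whose potential savings are to be estimated:

```sql
-- Schema and table names are hypothetical; NULL means all indexes/partitions.
EXEC sys.sp_estimate_data_compression_savings
     @schema_name      = N'dbo'
   , @object_name      = N'FactSalesRowstore'
   , @index_id         = NULL
   , @partition_number = NULL
   , @data_compression = N'COLUMNSTORE';  -- or N'COLUMNSTORE_ARCHIVE'
```

The procedure works on a sample of the data, so the result is an estimate rather than a guarantee.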

{operation} [bulk loads] when the number of rows is less than deltastore’s limit, all the rows go directly to the deltastore

[large bulk load] most of the rows go directly to the columnstore without passing through the deltastore

some rows at the end of the bulk load might be too few in number to meet the minimum size of a rowgroup

⇒ the final rows go to the deltastore instead of the columnstore

bulk insert operations provide the number of rows in the batch as part of the API call [5]

best results are achieved by choosing a batch size that is divisible by rowgroup’s limit [5]

⇐ guarantees that every batch produces one or several fully populated row groups [5]

⇒ reduce the total number of row groups in a table [5]
⇒ improves query performance

⇐ the batch size shouldn’t exceed rowgroup’s limit [5]

row groups can still be created on the fly in a manner similar to a bulk insert when the size of the insert batch is close to or exceeds the rowgroup’s limit [5]
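
A minimal sketch of a bulk load whose batch size matches the rowgroup limit, assuming a hypothetical table dbo.FactSales and a hypothetical CSV file whose columns match the table:

```sql
-- File path, table name and terminators are assumptions.
BULK INSERT dbo.FactSales
FROM 'D:\data\fact_sales.csv'
WITH (
    FIELDTERMINATOR = ','
  , ROWTERMINATOR   = '\n'
  , BATCHSIZE       = 1048576   -- one fully populated rowgroup per batch
  , TABLOCK
);
```

Only the last batch is likely to be smaller than the rowgroup limit; whether its rows end up in a compressed rowgroup or in a delta store depends on how many rows remain.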

{operation} [non-bulk operations] trickle inserts go directly to a delta store
{feature} parallel inserts

[SQL Server 2016] requires following conditions for parallel insert on CCI [6]

must specify TABLOCK
no NCI on the clustered columnstore index
no identity column
database compatibility level is set to 130
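
A minimal sketch of an insert that meets these conditions, assuming a hypothetical staging table dbo.StagingSales and a target table dbo.FactSales with a clustered columnstore index, no nonclustered indexes and no identity column:

```sql
-- Table and column names are hypothetical.
-- The TABLOCK hint is what makes the insert eligible for a parallel plan.
INSERT INTO dbo.FactSales WITH (TABLOCK)
       (DateKey, CustomerKey, ProductKey, Quantity, Amount)
SELECT DateKey, CustomerKey, ProductKey, Quantity, Amount
FROM dbo.StagingSales;
```

Whether the optimizer actually chooses a parallel plan still depends on cost and server settings; the actual execution plan shows what was used.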

{recommendation} minimize the use of string columns in fact tables [5]

string data use more space
their encoding involves additional overhead during batch mode execution [5]
queries with predicates on string columns may have less efficient execution plans that also require significantly larger memory grants as compared to their non-string counterparts [5]

{limitation} [SQL Server 2012|2014] string predicates are not pushed down toward the lowest operators in execution plans
{recommendation} add another dimension table and replace the string value in the fact table with a synthetic, integer-based ID key that references the new table [5]
{operation} upgrading to SQL Server 2016

make sure that queries against the tables with columnstore indexes can utilize parallelism if the database compatibility level is less than 130 [5]

{feature} [SQL Server 2019] automated columnstore index maintenance [8]
{improvement} [SQL Server 2019] better columnstore metadata memory management
{improvement} [SQL Server 2019] low-memory load path for columnstore tables
{improvement} [SQL Server 2019] improved performance for bulk loading to columnstore indexes
{improvement} [SQL Server 2019] server startup process has been made faster for databases that use in-memory columnstore tables for HTAP
{feature} DMVs

sys.column_store_segments

returns one row for each column per segment

sys.column_store_dictionaries

provides information about the dictionaries used by a columnstore index

sys.dm_db_column_store_row_group_physical_stats

rowgroup statuses

sys.column_store_row_groups

provides clustered columnstore index information on a per-segment basis
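
A minimal sketch of how the rowgroup states and the share of deleted rows could be reviewed in the current database. The deleted_pct column is a derived value added for illustration:

```sql
-- Lists all rowgroups with their state and trim reason.
SELECT OBJECT_NAME(rg.object_id) AS table_name
     , i.name                    AS index_name
     , rg.partition_number
     , rg.row_group_id
     , rg.state_desc             -- OPEN, CLOSED, COMPRESSED, TOMBSTONE, ...
     , rg.total_rows
     , rg.deleted_rows
     , rg.trim_reason_desc
     , 100.0 * rg.deleted_rows / NULLIF(rg.total_rows, 0) AS deleted_pct
FROM sys.dm_db_column_store_row_group_physical_stats AS rg
     JOIN sys.indexes AS i
       ON i.object_id = rg.object_id
      AND i.index_id  = rg.index_id
ORDER BY table_name, index_name, rg.partition_number, rg.row_group_id;
```

Many small COMPRESSED rowgroups or a high deleted_pct are hints that a reorganize or rebuild may be worthwhile.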

References:
[1] SQL Docs (2020) Columnstore indexes: Overview
[2] Microsoft Learn (2024) SQL: What's new in columnstore indexes
[3] Dejan Sarka et al (2012) Exam 70-463: Implementing a Data Warehouse with Microsoft SQL Server 2012 (Training Kit)
[4] SQL Docs (2019) Columnstore indexes - Query performance
[5] Dmitri Korotkevitch (2016) Pro SQL Server Internals 2nd Ed.
[6] Microsoft Learn (2016) Columnstore Index: Parallel load into clustered columnstore index from staging table
[7] Microsoft Learn (2016) Columnstore Index Defragmentation using REORGANIZE Command
[8] Microsoft (2018) Microsoft SQL Server 2019: Technical white paper

Acronyms:
CCI - clustered columnstore index
CI - columnstore index
DBCC - Database Console Commands
DMV - Dynamic Management View
ETL - Extract, Transform, Load
HTAP - Hybrid Transactional/Analytical Processing
LOB - Large Object
NCCI - nonclustered columnstore index
OLTP - On-Line Transaction Processing
SP - Service Pack

01 April 2021

💎SQL Reloaded: Processing JSON Files with Complex Structure in SQL Server 2016+

Unfortunately (or fortunately, for the challenge seekers), not all JSON data files have a simple (matrix) structure, and the data might not even have a proper (readable) definition. This is the case for the unemployment data provided by the Cologne municipality (source). However, with a web page translator and some small effort one can identify the approximate data definition:

Source Field	Target Field	Data Type
AM_ALO_INSG_AA	ALO_Total	int
AM_ALO_SGB2_AA	ALO_SGB2	int
AM_ALO_UNTER25_AA	ALO_Under25	int
AM_ALO_INSG_AP	ALO_Total_Perc	float
AM_ALO_SGB2_AP	ALO_SGB2_Perc	float
AM_ALO_UNTER25_AP	ALO_Under25_Perc	float
AM_ALO_INSG_HA	ALO_Total_Hist	int
AM_ALO_SGB2_HA	ALO_SGB2_Hist	int
AM_ALO_UNTER25_HA	ALO_Under25_Hist	int
AM_ALO_INSG_HP	ALO_Total_HistPerc	float
AM_ALO_SGB2_HP	ALO_SGB2_HistPerc	float
AM_ALO_UNTER25_HP	ALO_Under25_HistPerc	float
AM_SVB_INSG_AA	SVB_Total	int
AM_SVB_MANN_AA	SVB_Men	int
AM_SVB_FRAU_AA	SVB_Women	int
AM_SVB_DEUTSCH_AA	SVB_German	int
AM_SVB_AUSLAND_AA	SVB_AUSLAND	int
AM_SVB_U25J_AA	SVB_Under25Yo	int
AM_SVB_UEBER55J_AA	SVB_Over55Yo	int
AM_SVB_INSG_AP	SVB_Total_Perc	float
AM_SVB_MANN_AP	SVB_Men_Perc	float
AM_SVB_FRAU_AP	SVB_Women_Perc	float
AM_SVB_DEUTSCH_AP	SVB_German_Perc	float
AM_SVB_AUSLAND_AP	SVB_AUSLAND_Perc	float
AM_SVB_U25J_AP	SVB_Under25Yo_Perc	float
AM_SVB_UEBER55J_AP	SVB_Over55Yo_Perc	float
AM_SVB_INSG_HA	SVB_Total_Hist	int
AM_SVB_MANN_HA	SVB_Men_Hist	int
AM_SVB_FRAU_HA	SVB_Women_Hist	int
AM_SVB_DEUTSCH_HA	SVB_German_Hist	int
AM_SVB_AUSLAND_HA	SVB_AUSLAND_Hist	int
AM_SVB_U25J_HA	SVB_Under25Yo_Hist	int
AM_SVB_UEBER55J_HA	SVB_Over55Yo_Hist	int
AM_SVB_INSG_HP	SVB_Total_HistPerc	float
AM_SVB_MANN_HP	SVB_Men_HistPerc	float
AM_SVB_FRAU_HP	SVB_Women_HistPerc	float
AM_SVB_DEUTSCH_HP	SVB_German_HistPerc	float
AM_SVB_AUSLAND_HP	SVB_AUSLAND_HistPerc	float
AM_SVB_U25J_HP	SVB_Under25Yo_HistPerc	float
AM_SVB_UEBER55J_HP	SVB_Over55Yo_HistPerc	float
SHAPE.AREA	Shape_Area	float
SHAPE.LEN	Shape_Len	float

As previously stated (see post), it makes sense to build the logic over several iterations, making first sure that the references to file's columns were used correctly (observe the way the various elements were referenced in the queries):

SELECT DAT.ObjectId 
, DAT.Nummer
, DAT.Name
, DAT.ALO_Total
, DAT.ALO_SGB2
FROM OPENROWSET (BULK 'D:\data\Arbeitsmarkt Statistik Koeln Stadtteil.json',CODEPAGE='65001', SINGLE_CLOB)  as jsonfile 
     CROSS APPLY OPENJSON(BulkColumn,'$.features')
 WITH( 
	  ObjectId int '$.properties.OBJECTID'
	, Nummer int '$.properties.NUMMER'
	, Name nvarchar(max) '$.properties.NAME'
	, ALO_Total int '$.properties.AM_ALO_INSG_AA'
	, ALO_SGB2 int '$.properties.AM_ALO_SGB2_AA'
) AS DAT;

Output (first 10 records):

ObjectId	Nummer	Name	ALO_Total	ALO_SGB2
1	211	Godorf	121	92
2	308	LÃ¶venich	156	83
3	307	Weiden	552	383
4	306	Junkersdorf	305	164
5	309	Widdersdorf	200	93
6	404	Vogelsang	310	208
7	505	Weidenpesch	527	351
8	502	Mauenheim	190	127
12	207	Hahnwald	15	3
13	213	Meschenich	644	554

The logic seems to work; however, the German umlauts aren't displayed as expected ('LÃ¶venich', when it should have been 'Lövenich'). This is caused by the differences in character sets. An easy way to address this is to use a function that does the conversion (see the dbo.ReplaceCodes2Umlauts UDF from an older post).

By applying the function on the Names, adding the further columns and an INSERT clause, the query becomes:

-- importing the JSON file
SELECT DAT.ObjectId 
, DAT.Nummer
, dbo.ReplaceCodes2Umlauts(DAT.Name) Name
, DAT.ALO_Total
, DAT.ALO_SGB2
, DAT.ALO_Under25
, DAT.ALO_Total_Perc
, DAT.ALO_SGB2_Perc
, DAT.ALO_Under25_Perc
, DAT.ALO_Total_Hist
, DAT.ALO_SGB2_Hist
, DAT.ALO_Under25_Hist
, DAT.ALO_Total_HistPerc
, DAT.ALO_SGB2_HistPerc
, DAT.ALO_Under25_HistPerc
, DAT.SVB_Total
, DAT.SVB_Men
, DAT.SVB_Women
, DAT.SVB_German
, DAT.SVB_AUSLAND
, DAT.SVB_Under25Yo
, DAT.SVB_Over55Yo
, DAT.SVB_Total_Perc
, DAT.SVB_Men_Perc
, DAT.SVB_Women_Perc
, DAT.SVB_German_Perc
, DAT.SVB_AUSLAND_Perc
, DAT.SVB_Under25Yo_Perc
, DAT.SVB_Over55Yo_Perc
, DAT.SVB_Total_Hist
, DAT.SVB_Men_Hist
, DAT.SVB_Women_Hist
, DAT.SVB_German_Hist
, DAT.SVB_AUSLAND_Hist
, DAT.SVB_Under25Yo_Hist
, DAT.SVB_Over55Yo_Hist
, DAT.SVB_Total_HistPerc
, DAT.SVB_Men_HistPerc
, DAT.SVB_Women_HistPerc
, DAT.SVB_German_HistPerc
, DAT.SVB_AUSLAND_HistPerc
, DAT.SVB_Under25Yo_HistPerc
, DAT.SVB_Over55Yo_HistPerc
, DAT.Shape_Area 
, DAT.Shape_Len 
INTO dbo.Unemployment_Cologne
FROM OPENROWSET (BULK 'D:\data\Arbeitsmarkt Statistik Koeln Stadtteil.json',CODEPAGE='65001', SINGLE_CLOB)  as jsonfile 
     CROSS APPLY OPENJSON(BulkColumn,'$.features')
 WITH( 
	  ObjectId int '$.properties.OBJECTID'
	, Nummer int '$.properties.NUMMER'
	, Name nvarchar(max) '$.properties.NAME'
	, ALO_Total int '$.properties.AM_ALO_INSG_AA'
	, ALO_SGB2 int '$.properties.AM_ALO_SGB2_AA'
	, ALO_Under25 int '$.properties.AM_ALO_UNTER25_AA'
	, ALO_Total_Perc float '$.properties.AM_ALO_INSG_AP'
	, ALO_SGB2_Perc float '$.properties.AM_ALO_SGB2_AP'
	, ALO_Under25_Perc float '$.properties.AM_ALO_UNTER25_AP'
	, ALO_Total_Hist int '$.properties.AM_ALO_INSG_HA'
	, ALO_SGB2_Hist int '$.properties.AM_ALO_SGB2_HA'
	, ALO_Under25_Hist int '$.properties.AM_ALO_UNTER25_HA'
	, ALO_Total_HistPerc float '$.properties.AM_ALO_INSG_HP'
	, ALO_SGB2_HistPerc float '$.properties.AM_ALO_SGB2_HP'
	, ALO_Under25_HistPerc float '$.properties.AM_ALO_UNTER25_HP'
	, SVB_Total int '$.properties.AM_SVB_INSG_AA'
	, SVB_Men int '$.properties.AM_SVB_MANN_AA'
	, SVB_Women int '$.properties.AM_SVB_FRAU_AA'
	, SVB_German int '$.properties.AM_SVB_DEUTSCH_AA'
	, SVB_AUSLAND int '$.properties.AM_SVB_AUSLAND_AA'
	, SVB_Under25Yo int '$.properties.AM_SVB_U25J_AA'
	, SVB_Over55Yo int '$.properties.AM_SVB_UEBER55J_AA'
	, SVB_Total_Perc float '$.properties.AM_SVB_INSG_AP'
	, SVB_Men_Perc float '$.properties.AM_SVB_MANN_AP'
	, SVB_Women_Perc float '$.properties.AM_SVB_FRAU_AP'
	, SVB_German_Perc float '$.properties.AM_SVB_DEUTSCH_AP'
	, SVB_AUSLAND_Perc float '$.properties.AM_SVB_AUSLAND_AP'
	, SVB_Under25Yo_Perc float '$.properties.AM_SVB_U25J_AP'
	, SVB_Over55Yo_Perc float '$.properties.AM_SVB_UEBER55J_AP'
	, SVB_Total_Hist int '$.properties.AM_SVB_INSG_HA'
	, SVB_Men_Hist int '$.properties.AM_SVB_MANN_HA'
	, SVB_Women_Hist int '$.properties.AM_SVB_FRAU_HA'
	, SVB_German_Hist int '$.properties.AM_SVB_DEUTSCH_HA'
	, SVB_AUSLAND_Hist int '$.properties.AM_SVB_AUSLAND_HA'
	, SVB_Under25Yo_Hist int '$.properties.AM_SVB_U25J_HA'
	, SVB_Over55Yo_Hist int '$.properties.AM_SVB_UEBER55J_HA'
	, SVB_Total_HistPerc float '$.properties.AM_SVB_INSG_HP'
	, SVB_Men_HistPerc float '$.properties.AM_SVB_MANN_HP'
	, SVB_Women_HistPerc float '$.properties.AM_SVB_FRAU_HP'
	, SVB_German_HistPerc float '$.properties.AM_SVB_DEUTSCH_HP'
	, SVB_AUSLAND_HistPerc float '$.properties.AM_SVB_AUSLAND_HP'
	, SVB_Under25Yo_HistPerc float '$.properties.AM_SVB_U25J_HP'
	, SVB_Over55Yo_HistPerc float '$.properties.AM_SVB_UEBER55J_HP'
	, Shape_Area float '$.properties."SHAPE.AREA"'
	, Shape_Len float '$.properties."SHAPE.LEN"'
) AS DAT;

Once the data are made available, one can go on and explore the data and the relationships existing between the various columns.
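
As a starting point, a simple query like the one below (a sketch that only assumes the dbo.Unemployment_Cologne table created above) lists the districts with the highest unemployment percentage:

```sql
-- top 10 districts by unemployment percentage
SELECT TOP (10) UC.Nummer
, UC.Name
, UC.ALO_Total
, UC.ALO_Total_Perc
, UC.ALO_Under25
FROM dbo.Unemployment_Cologne UC
ORDER BY UC.ALO_Total_Perc DESC;
```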

Happy coding!

💎SQL Reloaded: Processing JSON Files with Flat Matrix Structure in SQL Server 2016+

Besides the CSV format, many of the data files made available under the open data initiatives are stored in JSON format, which makes data more difficult to process, even if JSON offers a richer structure that goes beyond the tabular structure of CSV files. Fortunately, starting with SQL Server 2016, JSON became a native format, which makes the processing of JSON files relatively easy, the easiness with which one can process the data depending on how they are structured.

Let’s consider as example a JSON file with the world population per country and year that can be downloaded from DataHub (source). The structure behind resembles a tabular model (see the table on the source website), having a flat structure. Just export the data to a file with the JSON extension (e.g. ‘population-figures-by-country.json’) locally (e.g. ‘D:/Data’). The next step is to understand file’s structure. Some repositories provide good documentation in this respect, though there are also many exceptions. Having a JSON editor like Visual Studio which reveals the structure makes easier the process.

As in the case of CSV files, is needed to infer the data types. There are two alphanumeric fields (Country & Country Code), while the remaining fields are numeric. The only challenge raised by the data seems to be the difference in format between the years 2002 and 2015 in respect to the other years, as the values of the former contain a decimal after comma. All the numeric values should have been whole values.

It’s recommended to start small and build the logic iteratively. Therefore, as a first step just look at the file’s content via the OPENROWSET function:

-- looking at the JSON file 
SELECT *
FROM OPENROWSET (BULK 'D:\data\population-figures-by-country.json', SINGLE_CLOB)  as jsonfile
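Before adding the parsing logic, it can be useful to check that the file's content is well-formed JSON. The query below is a minimal sketch that reuses the file path from above and the built-in ISJSON function (available starting with SQL Server 2016); the column aliases are chosen only for illustration:

```sql
-- checking whether the file's content is valid JSON (1 = valid, 0 = invalid)
SELECT ISJSON(jsonfile.BulkColumn) IsValidJson
, Len(jsonfile.BulkColumn) ContentLength
FROM OPENROWSET (BULK 'D:\data\population-figures-by-country.json', SINGLE_CLOB) as jsonfile;
```

If the check returns 0, OPENJSON would raise an error in the next steps, so it is better to fix the file first.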

In a second step one can add the OPENJSON function by looking only at the first record:

-- querying a json file (one record)
SELECT *
FROM OPENROWSET (BULK 'D:\data\population-figures-by-country.json', SINGLE_CLOB)  as jsonfile 
     CROSS APPLY OPENJSON(BulkColumn,'$[0]')

In a third step one can add a few columns (e.g. Country & Country Code) to make sure that the select statement works correctly.

-- querying a json file (all records, a few fields)
SELECT Country 
, CountryCode 
FROM OPENROWSET (BULK 'D:\data\population-figures-by-country.json', SINGLE_CLOB)  as jsonfile 
     CROSS APPLY OPENJSON(BulkColumn,'$')
 WITH ( 
  Country nvarchar(max) '$.Country'
, CountryCode nvarchar(3) '$.Country_Code'
) AS DAT;

In the next step one can add all the columns and import the data into a table (e.g. dbo.CountryPopulation) on the fly:

-- importing a json file (all records) on the fly
SELECT DAT.Country
, DAT.CountryCode
, DAT.Y1960
, DAT.Y1961
, DAT.Y1962
, DAT.Y1963
, DAT.Y1964
, DAT.Y1965
, DAT.Y1966
, DAT.Y1967
, DAT.Y1968
, DAT.Y1969
, DAT.Y1970
, DAT.Y1971
, DAT.Y1972
, DAT.Y1973
, DAT.Y1974
, DAT.Y1975
, DAT.Y1976
, DAT.Y1977
, DAT.Y1978
, DAT.Y1979
, DAT.Y1980
, DAT.Y1981
, DAT.Y1982
, DAT.Y1983
, DAT.Y1984
, DAT.Y1985
, DAT.Y1986
, DAT.Y1987
, DAT.Y1988
, DAT.Y1989
, DAT.Y1990
, DAT.Y1991
, DAT.Y1992
, DAT.Y1993
, DAT.Y1994
, DAT.Y1995
, DAT.Y1996
, DAT.Y1997
, DAT.Y1998
, DAT.Y1999
, DAT.Y2000
, DAT.Y2001
, Cast(DAT.Y2002 as bigint) Y2002
, Cast(DAT.Y2003 as bigint) Y2003
, Cast(DAT.Y2004 as bigint) Y2004
, Cast(DAT.Y2005 as bigint) Y2005
, Cast(DAT.Y2006 as bigint) Y2006
, Cast(DAT.Y2007 as bigint) Y2007
, Cast(DAT.Y2008 as bigint) Y2008
, Cast(DAT.Y2009 as bigint) Y2009
, Cast(DAT.Y2010 as bigint) Y2010
, Cast(DAT.Y2011 as bigint) Y2011
, Cast(DAT.Y2012 as bigint) Y2012
, Cast(DAT.Y2013 as bigint) Y2013
, Cast(DAT.Y2014 as bigint) Y2014
, Cast(DAT.Y2015 as bigint) Y2015
, DAT.Y2016
INTO dbo.CountryPopulation
FROM OPENROWSET (BULK 'D:\data\population-figures-by-country.json', SINGLE_CLOB)  as jsonfile 
     CROSS APPLY OPENJSON(BulkColumn,'$')
 WITH ( 
  Country nvarchar(max) '$.Country'
, CountryCode nvarchar(3) '$.Country_Code'
, Y1960 bigint '$.Year_1960'
, Y1961 bigint '$.Year_1961'
, Y1962 bigint '$.Year_1962'
, Y1963 bigint '$.Year_1963'
, Y1964 bigint '$.Year_1964'
, Y1965 bigint '$.Year_1965'
, Y1966 bigint '$.Year_1966'
, Y1967 bigint '$.Year_1967'
, Y1968 bigint '$.Year_1968'
, Y1969 bigint '$.Year_1969'
, Y1970 bigint '$.Year_1970'
, Y1971 bigint '$.Year_1971'
, Y1972 bigint '$.Year_1972'
, Y1973 bigint '$.Year_1973'
, Y1974 bigint '$.Year_1974'
, Y1975 bigint '$.Year_1975'
, Y1976 bigint '$.Year_1976'
, Y1977 bigint '$.Year_1977'
, Y1978 bigint '$.Year_1978'
, Y1979 bigint '$.Year_1979'
, Y1980 bigint '$.Year_1980'
, Y1981 bigint '$.Year_1981'
, Y1982 bigint '$.Year_1982'
, Y1983 bigint '$.Year_1983'
, Y1984 bigint '$.Year_1984'
, Y1985 bigint '$.Year_1985'
, Y1986 bigint '$.Year_1986'
, Y1987 bigint '$.Year_1987'
, Y1988 bigint '$.Year_1988'
, Y1989 bigint '$.Year_1989'
, Y1990 bigint '$.Year_1990'
, Y1991 bigint '$.Year_1991'
, Y1992 bigint '$.Year_1992'
, Y1993 bigint '$.Year_1993'
, Y1994 bigint '$.Year_1994'
, Y1995 bigint '$.Year_1995'
, Y1996 bigint '$.Year_1996'
, Y1997 bigint '$.Year_1997'
, Y1998 bigint '$.Year_1998'
, Y1999 bigint '$.Year_1999'
, Y2000 bigint '$.Year_2000'
, Y2001 bigint '$.Year_2001'
, Y2002 decimal(19,1) '$.Year_2002'
, Y2003 decimal(19,1) '$.Year_2003'
, Y2004 decimal(19,1) '$.Year_2004'
, Y2005 decimal(19,1) '$.Year_2005'
, Y2006 decimal(19,1) '$.Year_2006'
, Y2007 decimal(19,1) '$.Year_2007'
, Y2008 decimal(19,1) '$.Year_2008'
, Y2009 decimal(19,1) '$.Year_2009'
, Y2010 decimal(19,1) '$.Year_2010'
, Y2011 decimal(19,1) '$.Year_2011'
, Y2012 decimal(19,1) '$.Year_2012'
, Y2013 decimal(19,1) '$.Year_2013'
, Y2014 decimal(19,1) '$.Year_2014'
, Y2015 decimal(19,1) '$.Year_2015'
, Y2016 bigint '$.Year_2016'
) AS DAT;

As can be seen, the decimal values were converted to bigint so that all the year columns share the same definition. Moreover, this simplifies later data processing, as no additional (implicit) conversions are necessary.
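Casting a decimal to bigint truncates the fractional part, so it may be worth checking beforehand that no value actually carries one. The following is a minimal sketch of such a check for a single year column; the JSON content is made-up sample data used only to keep the example self-contained, and the same WHERE condition could be applied to the file-based query:

```sql
-- checking for values with a fractional part before casting to bigint
-- (the JSON below is illustrative sample data, not taken from the file)
DECLARE @json nvarchar(max) = N'[
  {"Country":"A", "Country_Code":"AAA", "Year_2002":100.0}
, {"Country":"B", "Country_Code":"BBB", "Year_2002":200.5}
]';

SELECT DAT.Country
, DAT.Y2002
, Cast(DAT.Y2002 as bigint) Y2002AsBigint
FROM OPENJSON(@json, '$')
 WITH (
  Country nvarchar(max) '$.Country'
, Y2002 decimal(19,1) '$.Year_2002'
) AS DAT
WHERE DAT.Y2002 <> Floor(DAT.Y2002); -- only values that would lose information
```

With the sample data only the record for country B is returned. An empty result for the real data would indicate that the conversion is lossless.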

Also, the columns’ names were changed either for simplification/convenience or simply as a matter of taste.

Writing such a monster query can be time-consuming, though preparing the metadata in Excel can considerably decrease the effort. With copy-paste and a few tricks (e.g. replacing values, splitting columns based on a delimiter) one can easily prepare such a structure:

Source field	Target field	DataType	Value	Import Clause	Select Clause
Country	Country	nvarchar(max)	Yemen, Rep.	, Country nvarchar(max) '$.Country'	, DAT.Country
Country_Code	CountryCode	nvarchar(3)	YEM	, CountryCode nvarchar(3) '$.Country_Code'	, DAT.CountryCode
Year_1960	Y1960	bigint	5172135	, Y1960 bigint '$.Year_1960'	, DAT.Y1960
Year_1961	Y1961	bigint	5260501	, Y1961 bigint '$.Year_1961'	, DAT.Y1961
Year_1962	Y1962	bigint	5351799	, Y1962 bigint '$.Year_1962'	, DAT.Y1962
Year_1963	Y1963	bigint	5446063	, Y1963 bigint '$.Year_1963'	, DAT.Y1963
Year_1964	Y1964	bigint	5543339	, Y1964 bigint '$.Year_1964'	, DAT.Y1964
Year_1965	Y1965	bigint	5643643	, Y1965 bigint '$.Year_1965'	, DAT.Y1965
Year_1966	Y1966	bigint	5748588	, Y1966 bigint '$.Year_1966'	, DAT.Y1966
Year_1967	Y1967	bigint	5858638	, Y1967 bigint '$.Year_1967'	, DAT.Y1967
Year_1968	Y1968	bigint	5971407	, Y1968 bigint '$.Year_1968'	, DAT.Y1968
Year_1969	Y1969	bigint	6083619	, Y1969 bigint '$.Year_1969'	, DAT.Y1969
Year_1970	Y1970	bigint	6193810	, Y1970 bigint '$.Year_1970'	, DAT.Y1970
Year_1971	Y1971	bigint	6300554	, Y1971 bigint '$.Year_1971'	, DAT.Y1971
Year_1972	Y1972	bigint	6407295	, Y1972 bigint '$.Year_1972'	, DAT.Y1972
Year_1973	Y1973	bigint	6523452	, Y1973 bigint '$.Year_1973'	, DAT.Y1973
Year_1974	Y1974	bigint	6661566	, Y1974 bigint '$.Year_1974'	, DAT.Y1974
Year_1975	Y1975	bigint	6830692	, Y1975 bigint '$.Year_1975'	, DAT.Y1975
Year_1976	Y1976	bigint	7034868	, Y1976 bigint '$.Year_1976'	, DAT.Y1976
Year_1977	Y1977	bigint	7271872	, Y1977 bigint '$.Year_1977'	, DAT.Y1977
Year_1978	Y1978	bigint	7536764	, Y1978 bigint '$.Year_1978'	, DAT.Y1978
Year_1979	Y1979	bigint	7821552	, Y1979 bigint '$.Year_1979'	, DAT.Y1979
Year_1980	Y1980	bigint	8120497	, Y1980 bigint '$.Year_1980'	, DAT.Y1980
Year_1981	Y1981	bigint	8434017	, Y1981 bigint '$.Year_1981'	, DAT.Y1981
Year_1982	Y1982	bigint	8764621	, Y1982 bigint '$.Year_1982'	, DAT.Y1982
Year_1983	Y1983	bigint	9111097	, Y1983 bigint '$.Year_1983'	, DAT.Y1983
Year_1984	Y1984	bigint	9472170	, Y1984 bigint '$.Year_1984'	, DAT.Y1984
Year_1985	Y1985	bigint	9847899	, Y1985 bigint '$.Year_1985'	, DAT.Y1985
Year_1986	Y1986	bigint	10232733	, Y1986 bigint '$.Year_1986'	, DAT.Y1986
Year_1987	Y1987	bigint	10628585	, Y1987 bigint '$.Year_1987'	, DAT.Y1987
Year_1988	Y1988	bigint	11051504	, Y1988 bigint '$.Year_1988'	, DAT.Y1988
Year_1989	Y1989	bigint	11523267	, Y1989 bigint '$.Year_1989'	, DAT.Y1989
Year_1990	Y1990	bigint	12057039	, Y1990 bigint '$.Year_1990'	, DAT.Y1990
Year_1991	Y1991	bigint	12661614	, Y1991 bigint '$.Year_1991'	, DAT.Y1991
Year_1992	Y1992	bigint	13325583	, Y1992 bigint '$.Year_1992'	, DAT.Y1992
Year_1993	Y1993	bigint	14017239	, Y1993 bigint '$.Year_1993'	, DAT.Y1993
Year_1994	Y1994	bigint	14692686	, Y1994 bigint '$.Year_1994'	, DAT.Y1994
Year_1995	Y1995	bigint	15320653	, Y1995 bigint '$.Year_1995'	, DAT.Y1995
Year_1996	Y1996	bigint	15889449	, Y1996 bigint '$.Year_1996'	, DAT.Y1996
Year_1997	Y1997	bigint	16408954	, Y1997 bigint '$.Year_1997'	, DAT.Y1997
Year_1998	Y1998	bigint	16896210	, Y1998 bigint '$.Year_1998'	, DAT.Y1998
Year_1999	Y1999	bigint	17378098	, Y1999 bigint '$.Year_1999'	, DAT.Y1999
Year_2000	Y2000	bigint	17874725	, Y2000 bigint '$.Year_2000'	, DAT.Y2000
Year_2001	Y2001	bigint	18390135	, Y2001 bigint '$.Year_2001'	, DAT.Y2001
Year_2002	Y2002	decimal(19,1)	18919179.0	, Y2002 decimal(19,1) '$.Year_2002'	, Cast(DAT.Y2002 as bigint) Y2002
Year_2003	Y2003	decimal(19,1)	19462086.0	, Y2003 decimal(19,1) '$.Year_2003'	, Cast(DAT.Y2003 as bigint) Y2003
Year_2004	Y2004	decimal(19,1)	20017068.0	, Y2004 decimal(19,1) '$.Year_2004'	, Cast(DAT.Y2004 as bigint) Y2004
Year_2005	Y2005	decimal(19,1)	20582927.0	, Y2005 decimal(19,1) '$.Year_2005'	, Cast(DAT.Y2005 as bigint) Y2005
Year_2006	Y2006	decimal(19,1)	21160534.0	, Y2006 decimal(19,1) '$.Year_2006'	, Cast(DAT.Y2006 as bigint) Y2006
Year_2007	Y2007	decimal(19,1)	21751605.0	, Y2007 decimal(19,1) '$.Year_2007'	, Cast(DAT.Y2007 as bigint) Y2007
Year_2008	Y2008	decimal(19,1)	22356391.0	, Y2008 decimal(19,1) '$.Year_2008'	, Cast(DAT.Y2008 as bigint) Y2008
Year_2009	Y2009	decimal(19,1)	22974929.0	, Y2009 decimal(19,1) '$.Year_2009'	, Cast(DAT.Y2009 as bigint) Y2009
Year_2010	Y2010	decimal(19,1)	23606779.0	, Y2010 decimal(19,1) '$.Year_2010'	, Cast(DAT.Y2010 as bigint) Y2010
Year_2011	Y2011	decimal(19,1)	24252206.0	, Y2011 decimal(19,1) '$.Year_2011'	, Cast(DAT.Y2011 as bigint) Y2011
Year_2012	Y2012	decimal(19,1)	24909969.0	, Y2012 decimal(19,1) '$.Year_2012'	, Cast(DAT.Y2012 as bigint) Y2012
Year_2013	Y2013	decimal(19,1)	25576322.0	, Y2013 decimal(19,1) '$.Year_2013'	, Cast(DAT.Y2013 as bigint) Y2013
Year_2014	Y2014	decimal(19,1)	26246327.0	, Y2014 decimal(19,1) '$.Year_2014'	, Cast(DAT.Y2014 as bigint) Y2014
Year_2015	Y2015	decimal(19,1)	26916207.0	, Y2015 decimal(19,1) '$.Year_2015'	, Cast(DAT.Y2015 as bigint) Y2015
Year_2016	Y2016	bigint	27584213	, Y2016 bigint '$.Year_2016'	, DAT.Y2016

Based on this structure, one can add two further formulas in Excel to prepare the statements as above and then copy the fields (last two columns were generated using the below formulas):

=", " & TRIM(B2) & " " & C2 & " '$." & TRIM(A2) & "'" 
=", DAT." & TRIM(B2)
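As an alternative to Excel, the same clauses could be generated directly in SQL Server from the keys of the first record. The query below is a sketch under a few assumptions: the sample record is a shortened version of the file's structure, the target names are derived via simple replacements, and every numeric field is proposed as decimal(19,1), because the type column returned by OPENJSON does not distinguish between whole and decimal numbers. The proposed data types still need to be reviewed manually.

```sql
-- generating the import and select clauses from the keys of the first record
-- (shortened sample record; in practice the file's content would be used)
DECLARE @json nvarchar(max) = N'[
  {"Country":"Yemen, Rep.", "Country_Code":"YEM", "Year_1960":5172135, "Year_2002":18919179.0}
]';

SELECT MET.SourceField
, MET.TargetField
, ', ' + MET.TargetField + ' ' + MET.DataType + ' ''$.' + MET.SourceField + '''' ImportClause
, ', DAT.' + MET.TargetField SelectClause
FROM ( -- metadata derived from the first record
	SELECT FLD.[key] SourceField
	, Replace(Replace(FLD.[key], 'Year_', 'Y'), '_', '') TargetField
	, CASE FLD.[type] 
	    WHEN 2 THEN 'decimal(19,1)' -- JSON number (to be reviewed)
	    ELSE 'nvarchar(max)'        -- everything else handled as text
	  END DataType
	FROM OPENJSON(@json, '$[0]') FLD
) MET;
```

The output can then be copied into the WITH and SELECT clauses of the import query, the leading comma of the first field being removed where necessary.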

Consuming data in which the values are stored in a matrix structure can involve further challenges sometimes, even if this type of storage tends to save space. For example, adding the values for a new year would involve extending the table with one more column, while performing calculations between years would involve referencing each column in formulas. Therefore, transforming the data from a matrix to a normalized structure can have some benefit. This can be achieved by writing a query via the UNPIVOT operator:

-- unpivoting the data 
SELECT RES.Country
, RES.CountryCode
, Cast(Replace(RES.[Year], 'Y', '') as int) [Year]
, RES.Population
--INTO dbo.CountryPopulationPerYear
FROM 
( -- basis data
	SELECT Country
	, CountryCode
	, Y1960, Y1961, Y1962, Y1963, Y1964, Y1965, Y1966, Y1967, Y1968, Y1969
	, Y1970, Y1971, Y1972, Y1973, Y1974, Y1975, Y1976, Y1977, Y1978, Y1979
	, Y1980, Y1981, Y1982, Y1983, Y1984, Y1985, Y1986, Y1987, Y1988, Y1989
	, Y1990, Y1991, Y1992, Y1993, Y1994, Y1995, Y1996, Y1997, Y1998, Y1999
	, Y2000, Y2001, Y2002, Y2003, Y2004, Y2005, Y2006, Y2007, Y2008, Y2009
	, Y2010, Y2011, Y2012, Y2013, Y2014, Y2015, Y2016
	FROM dbo.CountryPopulation
) DAT
UNPIVOT  -- unpivot logic
   (Population FOR [Year] IN  (Y1960, Y1961, Y1962, Y1963, Y1964, Y1965, Y1966, Y1967, Y1968, Y1969
, Y1970, Y1971, Y1972, Y1973, Y1974, Y1975, Y1976, Y1977, Y1978, Y1979
, Y1980, Y1981, Y1982, Y1983, Y1984, Y1985, Y1986, Y1987, Y1988, Y1989
, Y1990, Y1991, Y1992, Y1993, Y1994, Y1995, Y1996, Y1997, Y1998, Y1999
, Y2000, Y2001, Y2002, Y2003, Y2004, Y2005, Y2006, Y2007, Y2008, Y2009
, Y2010, Y2011, Y2012, Y2013, Y2014, Y2015, Y2016)
) RES

This can also be performed in two steps, first preparing the query, and in a final step inserting the data into a table (e.g. dbo.CountryPopulationPerYear) on the fly (re-execute the previous query after uncommenting the INTO clause to generate the table).
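The normalized structure could also be obtained directly from the JSON content, without going through the matrix table, by applying OPENJSON a second time on each record. The query below is only a sketch of this alternative: it uses a shortened sample record instead of the file, relies on the "Year_" prefix to identify the year fields, and skips NULL values, much as UNPIVOT does.

```sql
-- reading the JSON directly into a normalized structure
-- (shortened sample record; in practice the file's content would be used)
DECLARE @json nvarchar(max) = N'[
  {"Country":"Yemen, Rep.", "Country_Code":"YEM", "Year_1960":5172135, "Year_2002":18919179.0}
]';

SELECT JSON_VALUE(REC.[value], '$.Country') Country
, JSON_VALUE(REC.[value], '$.Country_Code') CountryCode
, Try_Cast(Replace(FLD.[key], 'Year_', '') as int) [Year]
, Cast(Try_Cast(FLD.[value] as decimal(19,1)) as bigint) Population
FROM OPENJSON(@json, '$') REC                  -- one row per record
     CROSS APPLY OPENJSON(REC.[value], '$') FLD -- one row per field
WHERE FLD.[key] LIKE 'Year[_]%' -- only the year fields
  AND FLD.[type] = 2;           -- only numeric values (NULLs are skipped)
```

Try_Cast is used instead of Cast so that a non-numeric key or value results in a NULL rather than a conversion error.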

--reviewing the data 
SELECT Country
, CountryCode
, AVG(Population) AveragePopulation
, Max(Population) - Min(Population) RangePopulation
FROM dbo.CountryPopulationPerYear
WHERE [Year] BETWEEN 2010 AND 2019
GROUP BY Country
, CountryCode
ORDER BY Country

On the other hand, making comparisons between consecutive years is easier when using a matrix structure:

--reviewing the data 
SELECT Country
, CountryCode
, Y2016
, Y2010
, Y2016-Y2010 [2016-2010]
, Y2011-Y2010 [2011-2010]
, Y2012-Y2011 [2012-2011]
, Y2013-Y2012 [2013-2012]
, Y2014-Y2013 [2014-2013]
, Y2015-Y2014 [2015-2014]
, Y2016-Y2015 [2016-2015]
FROM dbo.CountryPopulation
ORDER BY Country
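For completeness, similar year-over-year differences can be calculated on the normalized structure via the LAG window function (available starting with SQL Server 2012), at the cost of getting the results as rows instead of columns. The sketch below uses a few values from the table above as inline sample data; in practice the derived table would be replaced by dbo.CountryPopulationPerYear:

```sql
-- year-over-year differences based on the normalized structure
SELECT DAT.CountryCode
, DAT.[Year]
, DAT.Population
, DAT.Population - LAG(DAT.Population) OVER (PARTITION BY DAT.CountryCode ORDER BY DAT.[Year]) DiffPreviousYear
FROM ( -- inline sample data
	VALUES ('YEM', 2014, 26246327)
	, ('YEM', 2015, 26916207)
	, ('YEM', 2016, 27584213)
) DAT (CountryCode, [Year], Population)
ORDER BY DAT.CountryCode
, DAT.[Year];
```

The first year of each country has no previous value, therefore the difference is NULL for it.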

Unless the storage space is a problem, in theory one can store the data in both formats as there can be requests which can benefit from one structure or the other.

Happy coding!
