18 March 2025

🏭🗒️Microsoft Fabric: Statistics in Warehouse [Notes]

Disclaimer: This is work in progress intended to consolidate information from various sources for learning purposes. For the latest information please consult the documentation (see the links below)! 

Last updated: 18-Mar-2025

[Microsoft Fabric] Statistics
  • {def} objects that contain relevant information about data, to allow query optimizer to estimate plans' costs [1]
    • critical for the warehouse and lakehouse SQL endpoint for executing queries quickly and efficiently [3]
      • when a query is executed, the engine tries to collect existing statistics for certain columns in the query and use that information to assist in choosing an optimal execution plan [3]
      • inaccurate statistics can lead to unoptimized query plans and execution times [5]
  • {type} user-defined statistics
    • statistics defined manually by the users via DDL statement [1]
    • users can create, update and drop statistics (see the example after these notes)
      • via CREATE|UPDATE|DROP STATISTICS
    • users can review the contents of histogram-based single-column statistics [1]
      • via DBCC SHOW_STATISTICS
        • ⇐ only a limited version of these statements is supported [1]
    • {recommendation} focus on columns heavily used in query workloads
      • e.g. GROUP BYs, ORDER BYs, filters, and JOINs
    • {recommendation} consider updating column-level statistics regularly [1]
      • e.g. after data changes that significantly change rowcount or distribution of the data [1]
  • {type} automatic statistics
    • statistics created and maintained automatically by the query engine at query time [1]
    • when a query is issued and query optimizer requires statistics for plan exploration, MF automatically creates those statistics if they don't already exist [1]
      • then the query optimizer can utilize them in estimating the plan costs of the triggering query [1]
      • if the query engine determines that existing statistics relevant to query no longer accurately reflect the data, those statistics are automatically refreshed [1]
        • these automatic operations are done synchronously [1]
          • the query duration includes this time [1]
  • {object type} histogram statistics
    • created per column needing histogram statistics at query time [1]
    • contains histogram and density information regarding the distribution of a particular column [1]
    • similar to the statistics automatically created at query-time in Azure Synapse Analytics dedicated pools [1]
    • name begins with _WA_Sys_.
    • contents can be viewed with DBCC SHOW_STATISTICS
  • {object type} average column length statistics
    • created for variable character columns (varchar) with length greater than 100 that need average column length at query-time [1]
    • contain a value representing the average row size of the varchar column at the time of statistics creation [1]
    • name begins with ACE-AverageColumnLength_
    • contents cannot be viewed and are nonactionable by users [1]
  • {object type} table-based cardinality statistics
    • created per table needing cardinality estimation at query-time [1]
    • contain an estimate of the rowcount of a table [1]
    • named ACE-Cardinality [1]
    • contents cannot be viewed and are nonactionable by users [1]
  • [lakehouse] SQL analytics endpoint
    • uses the same engine as the warehouse to serve high performance, low latency SQL queries [4]
    • {feature} automatic metadata discovery
      • a seamless process reads the delta logs and the files folder, and ensures that the SQL metadata for tables is always up to date [4]
        • e.g. statistics [4]
  • {limitation} only single-column histogram statistics can be manually created and modified [1]
  • {limitation} multi-column statistics creation is not supported [1]
  • {limitation} other statistics objects might appear in sys.stats
    • besides the statistics created manually/automatically [1]
      • ⇐ the objects are not used for query optimization [1]
  • {limitation} if a transaction inserts data into an empty table and issues a SELECT before rolling back, the automatically generated statistics can still reflect the uncommitted data, causing inaccurate statistics [5]
    • {recommendation} update statistics for the columns mentioned in the SELECT [5]
  • {recommendation} ensure all table statistics are updated after large DML transactions [2]
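
A minimal sketch of the statements mentioned above, assuming a warehouse table dbo.Sales whose CustomerId column is heavily used in JOINs and filters (the table, column and statistics names are illustrative):

-- create single-column histogram statistics on a frequently used column
CREATE STATISTICS stats_sales_customer_id
ON dbo.Sales (CustomerId) WITH FULLSCAN;

-- refresh the statistics after significant changes in rowcount or distribution
UPDATE STATISTICS dbo.Sales (stats_sales_customer_id) WITH FULLSCAN;

-- review the histogram and density information
DBCC SHOW_STATISTICS ('dbo.Sales', 'stats_sales_customer_id');

-- review the statistics objects available for the table (manual and automatic)
SELECT OBJECT_NAME(s.object_id) table_name
, s.name stats_name
, s.auto_created
, s.user_created
FROM sys.stats s
WHERE s.object_id = OBJECT_ID('dbo.Sales');

-- drop the statistics when no longer needed
DROP STATISTICS dbo.Sales.stats_sales_customer_id;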


References:
[1] Microsoft Learn (2025) Fabric: Statistics in Fabric data warehousing [link]
[2] Microsoft Learn (2025) Fabric: Troubleshoot the Warehouse [link]
[3] Microsoft Fabric Updates Blog (2023) Microsoft Fabric July 2023 Update [link]
[4] Microsoft Learn (2024) Fabric: Better together: the lakehouse and warehouse [link]
[5] Microsoft Learn (2024) Fabric: Transactions in Warehouse tables in Microsoft Fabric [link]

Resources:
[R1] Microsoft Learn (2025) Fabric: What's new in Microsoft Fabric? [link]

Acronyms:
DDL - Data Definition Language
MF - Microsoft Fabric

17 March 2025

🏭🗒️Microsoft Fabric: Z-Order [Notes]

Disclaimer: This is work in progress intended to consolidate information from various sources for learning purposes. For the latest information please consult the documentation (see the links below)! 

Last updated: 17-Mar-2025

[Microsoft Fabric] Z-Order
  • {def} technique to collocate related information in the same set of files [2]
    • ⇐ reorganizes the layout of each data file so that similar column values are strategically collocated near one another for maximum efficiency [1]
    • {benefit} efficient query performance
      • reduces the amount of data to read [2] for certain queries
        • when the data is appropriately ordered, more files can be skipped [3]
        • particularly important for the ordering of multiple columns [3]
    • {benefit} data skipping
      • automatically skips irrelevant data, further enhancing query speeds
        • via data-skipping algorithms [2]
    • {benefit} flexibility
      • can be applied to multiple columns, making it versatile for various data schemas
    • aims to produce evenly-balanced data files with respect to the number of tuples
      • ⇐ but not necessarily data size on disk [2]
        • ⇐ the two measures are most often correlated [2]
          • ⇐ but there can be situations when that is not the case, leading to skew in optimize task times [2]
    • via the ZORDER BY clause (see the example after these notes)
      • applicable to columns with high cardinality commonly used in query predicates [2]
      • multiple columns can be specified as a comma-separated list
        • {warning} the effectiveness of the locality drops with each extra column [2]
          • has tradeoffs
            • it’s important to analyze query patterns and select the right columns when Z-Ordering data [3]
        • {warning} using columns that do not have statistics collected on them is ineffective and wastes resources [2]
          • statistics collection can be configured on certain columns by reordering columns in the schema, or by increasing the number of columns to collect statistics on [2]
      • {characteristic} not idempotent
        • every time it is executed, it will try to create a new clustering of data in all files in a partition [2]
          • it includes new and existing files that were part of previous z-ordering [2]
      • compatible with V-Order
    • {concept} [Databricks] liquid clustering 
      • replaces table partitioning and ZORDER to simplify data layout decisions and optimize query performance [4] [6]
        • not compatible with the respective features [4] [6]
      • tables created with liquid clustering enabled have numerous Delta table features enabled at creation [4] [6]
      • provides flexibility to redefine clustering keys without rewriting existing data [4] [6]
        • ⇒ allows data layout to evolve alongside analytic needs over time [4] [6]
      • applies to 
        • streaming tables 
        • materialized views
      • {scenario} tables often filtered by high cardinality columns [4] [6]
      • {scenario} tables with significant skew in data distribution [4] [6]
      • {scenario} tables that grow quickly and require maintenance and tuning effort [4] [6]
      • {scenario} tables with concurrent write requirements [4] [6]
      • {scenario} tables with access patterns that change over time [4] [6]
      • {scenario} tables where a typical partition key could leave the table with too many or too few partitions [4] [6]
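
A minimal sketch of applying Z-Order via Spark SQL (e.g. from a Fabric notebook), assuming a Delta table sales whose high-cardinality columns customer_id and order_date are commonly used in query predicates (the names are illustrative):

-- rewrite the data files so that similar column values are collocated
OPTIMIZE sales
ZORDER BY (customer_id, order_date);

Because the operation is not idempotent, rerunning it re-clusters the new and the existing files in the affected partitions (see the notes above).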

    References:
    [1] Bennie Haelen & Dan Davis (2024) Delta Lake Up & Running: Modern Data Lakehouse Architectures with Delta Lake
    [2] Delta Lake (2023) Optimizations [link]
    [3] Delta Lake (2023) Delta Lake Z Order, by Matthew Powers [link]
    [4] Delta Lake (2025) Use liquid clustering for Delta tables [link]
    [5] Databricks (2025) Delta Lake table format interoperability [link]
    [6] Microsoft Learn (2025) Use liquid clustering for Delta tables [link]

    Resources:
    [R1] Azure Guru (2024) Z Order in Delta Lake - Part 1 [link]
    [R2] Microsoft Learn (2025) Fabric: What's new in Microsoft Fabric? [link]

    Acronyms:
    MF - Microsoft Fabric 

    🏭🗒️Microsoft Fabric: V-Order [Notes]

    Disclaimer: This is work in progress intended to consolidate information from various sources for learning purposes. For the latest information please consult the documentation (see the links below)! 

    Last updated: 17-Mar-2025

    [Microsoft Fabric] V-Order
    • {def} write time optimization to the parquet file format that enables fast reads under the MF compute engine [2]
      • all parquet engines can read the files as regular parquet files [2]
      • results in smaller and therefore faster files to read [5]
        • {benefit} improves read performance 
        • {benefit} decreases storage requirements
        • {benefit} optimizes resources' usage
          • reduces the compute resources required for reading data
            • e.g. network bandwidth, disk I/O, CPU usage
      • still conforms to the open-source Parquet file format [5]
        • the files can be read by non-Fabric tools [5]
      • delta tables created and loaded by Fabric items automatically apply V-Order
        • e.g. data pipelines, dataflows, notebooks [5]
      • delta tables and their features are orthogonal to V-Order [2]
        • e.g. Z-Order, compaction, vacuum, time travel
        • table properties and optimization commands can be used to control the v-order of the partitions [2]
      • compatible with Z-Order [2]
      • not all files have this optimization applied [5]
        • e.g. Parquet files uploaded to a Fabric lakehouse, or that are referenced by a shortcut 
        • the files can still be read, though the read performance likely won't be as fast as that of an equivalent Parquet file that's had V-Order applied [5]
      • required by certain features
        • [hash encoding] to assign a numeric identifier to each unique value contained in the column [5]
      • {command} OPTIMIZE 
        • optimizes a Delta table to coalesce smaller files into larger ones [5]
        • can apply V-Order to compact and rewrite the Parquet files [5]
    • [warehouse] 
      • works by applying certain operations on Parquet files
        • special sorting
        • row group distribution
        • dictionary encoding
        • compression 
      • enabled by default
      • ⇒ compute engines require less network, disk, and CPU resources to read data from storage [1]
        • provides cost efficiency and performance [1]
          • the effect of V-Order on performance can vary depending on tables' schemas, data volumes, query, and ingestion patterns [1]
        • fully compliant with the open-source parquet format [1]
          • ⇐ all parquet engines can read it as regular parquet files [1]
      • required by certain features
        • [Direct Lake mode] depends on V-Order
      • {operation} disable V-Order
        • causes any new Parquet files produced by the warehouse engine to be created without V-Order optimization [3]
        • irreversible operation
          • once disabled, it cannot be enabled again [3]
        • {scenario} write-intensive warehouses
          • warehouses dedicated to staging data as part of a data ingestion process [1]
        • {warning} consider the effect of V-Order on performance before deciding to disable it [1]
          • {recommendation} test how V-Order affects the performance of data ingestion and queries before deciding to disable it [1]
        • via ALTER DATABASE CURRENT SET VORDER = OFF; [3]
      • {operation} check current status
        • via SELECT name, is_vorder_enabled FROM sys.databases; [post] (example after these notes)
    • {feature} [lakehouse] Load to Table
      • allows loading a single file or a folder of files into a table [6]
      • tables are always loaded using the Delta Lake table format with V-Order optimization enabled [6]
    • [Direct Lake semantic model] 
      • data is prepared for fast loading into memory [5]
        • places fewer demands on capacity resources [5]
        • results in faster query performance [5]
          • because less memory needs to be scanned [5]
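
    A minimal sketch combining the statements referenced above, run against a Fabric warehouse (remember that disabling V-Order is irreversible):

    -- check whether V-Order is enabled for the databases
    SELECT name
    , is_vorder_enabled
    FROM sys.databases;

    -- disable V-Order for the current warehouse (cannot be enabled again!)
    ALTER DATABASE CURRENT
    SET VORDER = OFF;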

    References:
    [1] Microsoft Learn (2024) Fabric: Understand V-Order for Microsoft Fabric Warehouse [link]
    [2] Microsoft Learn (2024) Delta Lake table optimization and V-Order [link]
    [3] Microsoft Learn (2024) Disable V-Order on Warehouse in Microsoft Fabric [link]
    [4] Miles Cole (2024) To V-Order or Not: Making the Case for Selective Use of V-Order in Fabric Spark [link]
    [5] Microsoft Learn (2024) Understand storage for Direct Lake semantic models [link]
    [6] Microsoft Learn (2025) Fabric: Load to Delta Lake table [link]

    Resources:
    [R1] Serverless.SQL (2024) Performance Analysis of V-Ordering in Fabric Warehouse: On or Off?, by Andy Cutler [link]
    [R2] Redgate (2023) Microsoft Fabric: Checking and Fixing Tables V-Order Optimization, by Dennes Torres [link]
    [R3] Sandeep Pawar (2023) Checking If Delta Table in Fabric is V-order Optimized [link]
    [R4] Microsoft Learn (2025) Fabric: What's new in Microsoft Fabric? [link]

    Acronyms:
    MF - Microsoft Fabric

    🏭🗒️Microsoft Fabric: Caching in Warehouse [Notes]

    Disclaimer: This is work in progress intended to consolidate information from various sources for learning purposes. For the latest information please consult the documentation (see the links below)! 

    Last updated: 17-Mar-2025

    [Microsoft Fabric] Caching
    • {def} technique that improves the performance of data processing by storing frequently accessed data and metadata in a faster storage layer [1]
      • e.g. local memory, local SSD disk
      • ⇐ subsequent requests can be served faster, directly from the cache [1]
        • if a set of data has been previously accessed by a query, any subsequent queries will retrieve that data directly from the in-memory cache [1]
        • local memory operations are notably faster compared to fetching data from remote storage [1]
        • ⇐ significantly diminishes IO latency [1]
      • fully transparent to the user
      • consistently active and operates seamlessly in the background [1]
      • orchestrated and upheld by MF
        • it doesn't offer users the capability to manually clear the cache [1] 
      • provides transactional consistency
        • ensures that any modifications to the data in storage after it has been initially loaded into the in-memory cache, will result in consistent data [1]
      • when the cache reaches its capacity threshold and fresh data is being read for the first time, objects that have remained unused for the longest duration will be removed from the cache [1]
        • process is enacted to create space for the influx of new data and maintain an optimal cache utilization strategy [1] 
    • {type} in-memory cache
      • data in cache is organized in a compressed columnar format (aka columnar storage) [1]
        • ⇐ optimized for analytical queries
        • each column of data is stored separately [1]
          • {benefit} allows for better compression [1]
            • since similar data values are stored together [1]
          • {benefit} reduces the memory footprint
      • when queries need to perform operations on a specific column, the engine can work more efficiently speeding up queries' execution [1]
        • it doesn't have to process unnecessary data from other columns
        • can perform operations on multiple columns simultaneously [1]
          • taking advantage of modern multi-core processors [1]
      • when the engine retrieves data from storage, the data is transformed from its original file-based format into highly optimized structures in the in-memory cache [1]
      • {scenario} analytical workloads where queries involve scanning large amounts of data to perform aggregations, filtering, and other data manipulations [1]
    • {type} disk cache
      • complementary extension to the in-memory cache
      • any data loaded into the in-memory cache is also serialized to the SSD cache [1]
        • data removed from the in-memory cache remains within the SSD cache for an extended period
        • when subsequent query requests the data, it is retrieved from the SSD cache into the in-memory cache quicker [1]
    • {issue} cold run (aka cold cache) performance
      • the first 1-3 executions of a query perform noticeably slower than subsequent executions [2]
        • if the first run's performance is crucial, try manually creating statistics (aka pre-warming the cache), as sketched after these notes [2]
        • otherwise, one can rely on the automatic statistics generated in the first query run and leveraged in subsequent runs [2]
        • as long as underlying data does not change significantly [2]
    • differentiated from
      • [Kusto] caching policy [link]
      • [Apache Spark] intelligent cache [link]
      • [Power BI] query caching [link]
      • [Azure] caching [link]
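
    A minimal sketch of pre-warming via manually created statistics, as suggested in [2], assuming a warehouse table dbo.Sales frequently filtered on OrderDate (the names are illustrative):

    -- create statistics up front so that the first run doesn't pay the cost
    CREATE STATISTICS stats_sales_order_date
    ON dbo.Sales (OrderDate) WITH FULLSCAN;

    -- optionally, warm the cache by running a representative query once
    SELECT COUNT(*) record_count
    FROM dbo.Sales
    WHERE OrderDate >= '20250101';
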
      References:
      [1] Microsoft Learn (2024) Fabric: Caching in Fabric data warehousing [link]
      [2] Microsoft Learn (2024) Fabric Data Warehouse performance guidelines [link]
      [3] Sandeep Pawar (2023) Pre-Warming The Direct Lake Dataset For Warm Cache Import-Like Performance [link]

      Resources:
      [R1] Microsoft Learn (2025) Fabric: What's new in Microsoft Fabric? [link]

      Acronyms:
      IO - Input/Output
      MF - Microsoft Fabric
      SSD - Solid State Drive

      16 March 2025

      💎🏭SQL Reloaded: Microsoft Fabric's SQL Databases (Part XII: Databases)

      After reviewing the server and database properties, the next step is to review the database configurations using the sys.databases metadata. Such an object exists even for warehouses and lakehouses, respectively for mirrored databases. 

      Except for the last attribute in the query (is_vorder_enabled), all the other attributes are supported by all the above-mentioned repositories. 

      -- database metadata
      SELECT db.name
      , db.database_id
      --, db.source_database_id
      , db.owner_sid
      , db.create_date
      , db.compatibility_level
      , db.collation_name
      --, db.user_access
      , db.user_access_desc
      --, db.is_read_only
      --, db.is_auto_close_on
      --, db.is_auto_shrink_on
      --, db.state
      , db.state_desc
      --, db.is_in_standby
      --, db.is_cleanly_shutdown
      --, db.is_supplemental_logging_enabled
      --, db.snapshot_isolation_state
      , db.snapshot_isolation_state_desc
      --, db.is_read_committed_snapshot_on
      --, db.recovery_model
      , db.recovery_model_desc
      , db.page_verify_option
      , db.page_verify_option_desc
      , db.is_auto_create_stats_on
      --, db.is_auto_create_stats_incremental_on
      , db.is_auto_update_stats_on
      --, db.is_auto_update_stats_async_on
      --, db.is_ansi_null_default_on
      --, db.is_ansi_nulls_on
      --, db.is_ansi_padding_on
      --, db.is_ansi_warnings_on
      --, db.is_arithabort_on
      --, db.is_concat_null_yields_null_on
      --, db.is_numeric_roundabort_on
      --, db.is_quoted_identifier_on
      --, db.is_recursive_triggers_on
      --, db.is_cursor_close_on_commit_on
      --, db.is_local_cursor_default
      , db.is_fulltext_enabled
      --, db.is_trustworthy_on
      , db.is_db_chaining_on
      --, db.is_parameterization_forced
      , db.is_master_key_encrypted_by_server
      --, db.is_query_store_on
      --, db.is_published
      --, db.is_subscribed
      --, db.is_merge_published
      --, db.is_distributor
      --, db.is_sync_with_backup
      , db.service_broker_guid
      --, db.is_broker_enabled
      --, db.log_reuse_wait
      , db.log_reuse_wait_desc
      --, db.is_date_correlation_on
      --, db.is_cdc_enabled
      --, db.is_encrypted
      --, db.is_honor_broker_priority_on
      --, db.replica_id
      --, db.group_database_id
      --, db.resource_pool_id
      --, db.default_language_lcid
      --, db.default_language_name
      --, db.default_fulltext_language_lcid
      --, db.default_fulltext_language_name
      --, db.is_nested_triggers_on
      --, db.is_transform_noise_words_on
      --, db.two_digit_year_cutoff
      --, db.containment
      --, db.containment_desc
      , db.target_recovery_time_in_seconds
      --, db.delayed_durability
      --, db.delayed_durability_desc
      --, db.is_memory_optimized_elevate_to_snapshot_on
      --, db.is_federation_member
      --, db.is_remote_data_archive_enabled
      --, db.is_mixed_page_allocation_on
      , db.is_temporal_history_retention_enabled
      --, db.catalog_collation_type
      , db.catalog_collation_type_desc
      , db.physical_database_name
      --, db.is_result_set_caching_on
      , db.is_accelerated_database_recovery_on
      --, db.is_tempdb_spill_to_remote_store
      --, db.is_stale_page_detection_on
      , db.is_memory_optimized_enabled
      , db.is_data_retention_enabled
      --, db.is_ledger_on
      --, db.is_change_feed_enabled
      --, db.is_data_lake_replication_enabled
      --, db.is_change_streams_enabled
      --, db.data_lake_log_publishing
      , db.data_lake_log_publishing_desc
      , db.is_vorder_enabled
      --, db.is_optimized_locking_on
      FROM sys.databases db
      

      Output (consolidated):

      Attribute | Warehouse | Mirrored Db | Lakehouse | SQL database
      name | Warehouse Test 001 | MirroredDatabase_1 | Lakehouse_DWH | SQL DB Test...
      database_id | 5 | 6 | 7 | 28
      owner_sid | 0x01010000000000051... | 0x01010000000000051... | x01010000000000051... | AAAAAAWQAAAAAA...
      create_date | 2025-02-22T18:56:28.700 | 2025-03-16T14:57:56.600 | 2025-03-16T15:07:59.563 | 2025-02-22T03:01:53.7130000
      compatibility_level | 160 | 160 | 160 | 160
      collation_name | Latin1_General_100_BIN2_UTF8 | Latin1_General_100_BIN2_UTF8 | Latin1_General_100_BIN2_UTF8 | SQL_Latin1_General_CP1_CI_AS
      user_access_desc | MULTI_USER | MULTI_USER | MULTI_USER | MULTI_USER
      state_desc | ONLINE | ONLINE | ONLINE | ONLINE
      snapshot_isolation_state_desc | ON | ON | ON | ON
      recovery_model_desc | SIMPLE | SIMPLE | SIMPLE | FULL
      page_verify_option | 0 | 0 | 0 | 2
      page_verify_option_desc | NONE | NONE | NONE | CHECKSUM
      is_auto_create_stats_on | 1 | 1 | 1 | 1
      is_auto_update_stats_on | 1 | 1 | 1 | 1
      is_fulltext_enabled | 1 | 1 | 1 | 1
      is_db_chaining_on | 0 | 0 | 0 | 0
      is_master_key_encrypted_by_server | 0 | 0 | 0 | 0
      service_broker_guid | 1F2261FC-5031-... | 7D882362-567E-... | 0D8938AB-BA79-... | 2b74bed3-4405-...
      log_reuse_wait_desc | NOTHING | NOTHING | NOTHING | NOTHING
      target_recovery_time_in_seconds | 60 | 60 | 60 | 60
      is_temporal_history_retention_enabled | 1 | 1 | 1 | 1
      catalog_collation_type_desc | DATABASE_DEFAULT | DATABASE_DEFAULT | DATABASE_DEFAULT | SQL_Latin1_General_CP1_CI_AS
      physical_database_name | Warehouse Test 001 | MirroredDatabase_1 | Lakehouse_DWH | 3f4a3e79-e53e-...
      is_accelerated_database_recovery_on | 1 | 1 | 1 | 1
      is_memory_optimized_enabled | 1 | 1 | 1 | 1
      is_data_retention_enabled | 1 | 1 | 1 | 1
      data_lake_log_publishing_desc | AUTO | UNSUPPORTED | UNSUPPORTED | UNSUPPORTED
      is_vorder_enabled | 1 | 1 | 1 | 0

      The output for the SQL database was formatted slightly differently, and is_vorder_enabled is not available. Otherwise, the above query can be used for all environments. 

      All attributes except the last two are known from the earlier versions of SQL Server. is_vorder_enabled reflects the current status of V-Order [2], while data_lake_log_publishing_desc reflects the current state of Delta Lake log publishing [3].

      Consolidating the output from different sources helps identify the differences and one can easily use Excel formulas for this.

      Previous Post <<||>> Next Post

      References:
      [1] Microsoft Learn (2025) SQL Server 2022: sys.databases (Transact-SQL) [link]
      [2] Microsoft Learn (2025) Fabric: Disable V-Order on Warehouse in Microsoft Fabric [link]
      [3] Microsoft Learn (2025) Fabric: Delta Lake logs in Warehouse in Microsoft Fabric [link]

      💎🏭SQL Reloaded: Microsoft Fabric's SQL Databases (Part XI: Database and Server Properties)

      When taking over a SQL Server, respectively a database, one of the first checks I do focuses on the overall configuration, going through the UI available for admins to see whether I can find anything that requires further investigation. If no documentation is available, I run a few scripts and export their output as a baseline. 

      Especially when documenting the configuration, it's useful to export the database options and properties defined at the database level. Besides the collation and probably the recovery model, the rest of the configuration is typically similar, though in exceptional cases one should also expect surprises that require further investigation! 

      The following query retrieves in a consolidated way all the options and properties of a SQL database in Microsoft Fabric. 

      -- database settings/properties 
      SELECT DATABASEPROPERTYEX(DB_NAME(), 'Collation') Collation
      --, DATABASEPROPERTYEX(DB_NAME(), 'ComparisonStyle')  ComparisonStyle
      , DATABASEPROPERTYEX(DB_NAME(), 'Edition') Edition
      --, DATABASEPROPERTYEX(DB_NAME(), 'IsAnsiNullDefault') IsAnsiNullDefault
      --, DATABASEPROPERTYEX(DB_NAME(), 'IsAnsiNullsEnabled') IsAnsiNullsEnabled
      --, DATABASEPROPERTYEX(DB_NAME(), 'IsAnsiPaddingEnabled') IsAnsiPaddingEnabled
      --, DATABASEPROPERTYEX(DB_NAME(), 'IsAnsiWarningsEnabled') IsAnsiWarningsEnabled
      --, DATABASEPROPERTYEX(DB_NAME(), 'IsArithmeticAbortEnabled') IsArithmeticAbortEnabled
      --, DATABASEPROPERTYEX(DB_NAME(), 'IsAutoClose') IsAutoClose
      , DATABASEPROPERTYEX(DB_NAME(), 'IsAutoCreateStatistics') IsAutoCreateStatistics
      --, DATABASEPROPERTYEX(DB_NAME(), 'IsAutoCreateStatisticsIncremental') IsAutoCreateStatisticsIncremental
      --, DATABASEPROPERTYEX(DB_NAME(), 'IsAutoShrink') IsAutoShrink
      , DATABASEPROPERTYEX(DB_NAME(), 'IsAutoUpdateStatistics') IsAutoUpdateStatistics
      --, DATABASEPROPERTYEX(DB_NAME(), 'IsClone') IsClone
      --, DATABASEPROPERTYEX(DB_NAME(), 'IsCloseCursorsOnCommitEnabled') IsCloseCursorsOnCommitEnabled
      --, DATABASEPROPERTYEX(DB_NAME(), 'IsDatabaseSuspendedForSnapshotBackup') IsDatabaseSuspendedForSnapshotBackup
      , DATABASEPROPERTYEX(DB_NAME(), 'IsFulltextEnabled') IsFulltextEnabled
      --, DATABASEPROPERTYEX(DB_NAME(), 'IsInStandBy') IsInStandBy
      --, DATABASEPROPERTYEX(DB_NAME(), 'IsLocalCursorsDefault') IsLocalCursorsDefault
      --, DATABASEPROPERTYEX(DB_NAME(), 'IsMemoryOptimizedElevateToSnapshotEnabled') IsMemoryOptimizedElevateToSnapshotEnabled
      --, DATABASEPROPERTYEX(DB_NAME(), 'IsMergePublished') IsMergePublished
      --, DATABASEPROPERTYEX(DB_NAME(), 'IsNullConcat') IsNullConcat
      --, DATABASEPROPERTYEX(DB_NAME(), 'IsNumericRoundAbortEnabled') IsNumericRoundAbortEnabled
      --, DATABASEPROPERTYEX(DB_NAME(), 'IsParameterizationForced') IsParameterizationForced
      --, DATABASEPROPERTYEX(DB_NAME(), 'IsQuotedIdentifiersEnabled') IsQuotedIdentifiersEnabled
      --, DATABASEPROPERTYEX(DB_NAME(), 'IsPublished') IsPublished
      --, DATABASEPROPERTYEX(DB_NAME(), 'IsRecursiveTriggersEnabled') IsRecursiveTriggersEnabled
      --, DATABASEPROPERTYEX(DB_NAME(), 'IsSubscribed') IsSubscribed
      --, DATABASEPROPERTYEX(DB_NAME(), 'IsSyncWithBackup') IsSyncWithBackup
      --, DATABASEPROPERTYEX(DB_NAME(), 'IsTornPageDetectionEnabled') IsTornPageDetectionEnabled
      --, DATABASEPROPERTYEX(DB_NAME(), 'IsVerifiedClone') IsVerifiedClone
      --, DATABASEPROPERTYEX(DB_NAME(), 'IsXTPSupported') IsXTPSupported
      , DATABASEPROPERTYEX(DB_NAME(), 'LastGoodCheckDbTime') LastGoodCheckDbTime
      , DATABASEPROPERTYEX(DB_NAME(), 'LCID') LCID
      --, DATABASEPROPERTYEX(DB_NAME(), 'MaxSizeInBytes') MaxSizeInBytes
      , DATABASEPROPERTYEX(DB_NAME(), 'Recovery') Recovery
      --, DATABASEPROPERTYEX(DB_NAME(), 'ServiceObjective') ServiceObjective
      --, DATABASEPROPERTYEX(DB_NAME(), 'ServiceObjectiveId') ServiceObjectiveId
      , DATABASEPROPERTYEX(DB_NAME(), 'SQLSortOrder') SQLSortOrder
      , DATABASEPROPERTYEX(DB_NAME(), 'Status') Status
      , DATABASEPROPERTYEX(DB_NAME(), 'Updateability') Updateability
      , DATABASEPROPERTYEX(DB_NAME(), 'UserAccess') UserAccess
      , DATABASEPROPERTYEX(DB_NAME(), 'Version') Version
      --, DATABASEPROPERTYEX(DB_NAME(), 'ReplicaID') ReplicaID
      

      Output:

      Collation | Edition | IsAutoCreateStatistics | IsAutoUpdateStatistics | IsFulltextEnabled | LastGoodCheckDbTime | LCID | Recovery | SQLSortOrder | Status | Updateability | UserAccess | Version
      SQL_Latin1_General_CP1_CI_AS | FabricSQLDB | 1 | 1 | 1 | 12/31/1899 | 1033 | FULL | 52 | ONLINE | READ_WRITE | MULTI_USER | 981

      The query can also be run against the SQL analytics endpoints available for warehouses in Microsoft Fabric.

      Output:

      Collation | Edition | IsAutoCreateStatistics | IsAutoUpdateStatistics | IsFulltextEnabled | LastGoodCheckDbTime | LCID | Recovery | SQLSortOrder | Status | Updateability | UserAccess | Version
      Latin1_General_100_BIN2_UTF8 | DataWarehouse | 1 | 1 | 1 | 12/31/1899 | 1033 | SIMPLE | 0 | ONLINE | READ_WRITE | MULTI_USER | 987

      Respectively, for lakehouses:

      Collation | Edition | IsAutoCreateStatistics | IsAutoUpdateStatistics | IsFulltextEnabled | LastGoodCheckDbTime | LCID | Recovery | SQLSortOrder | Status | Updateability | UserAccess | Version
      Latin1_General_100_BIN2_UTF8 | LakeWarehouse | 1 | 1 | 1 | 12/31/1899 | 1033 | SIMPLE | 0 | ONLINE | READ_WRITE | MULTI_USER | 987

      A similar output is obtained if one runs the query against SQL database's SQL analytics endpoint:

      Output:

      Collation | Edition | IsAutoCreateStatistics | IsAutoUpdateStatistics | IsFulltextEnabled | LastGoodCheckDbTime | LCID | Recovery | SQLSortOrder | Status | Updateability | UserAccess | Version
      Latin1_General_100_BIN2_UTF8 | LakeWarehouse | 1 | 1 | 1 | 12/31/1899 | 1033 | SIMPLE | 0 | ONLINE | READ_WRITE | MULTI_USER | 987

      SQL databases seem to inherit the collation from the earlier versions of SQL Server.

      Another meaningful value for SQL databases is MaxSizeInBytes, which in my environment had a value of 3,298,534,883,328 bytes (÷ 1,073,741,824 = 3,072 GB).

      There are however also server properties. Here's the consolidated overview:

      -- server properties
      SELECT --SERVERPROPERTY('BuildClrVersion') BuildClrVersion
       SERVERPROPERTY('Collation') Collation
      --, SERVERPROPERTY('CollationID') CollationID
      , SERVERPROPERTY('ComparisonStyle') ComparisonStyle
      --, SERVERPROPERTY('ComputerNamePhysicalNetBIOS') ComputerNamePhysicalNetBIOS
      , SERVERPROPERTY('Edition') Edition
      --, SERVERPROPERTY('EditionID') EditionID
      , SERVERPROPERTY('EngineEdition') EngineEdition
      --, SERVERPROPERTY('FilestreamConfiguredLevel') FilestreamConfiguredLevel
      --, SERVERPROPERTY('FilestreamEffectiveLevel') FilestreamEffectiveLevel
      --, SERVERPROPERTY('FilestreamShareName') FilestreamShareName
      --, SERVERPROPERTY('HadrManagerStatus') HadrManagerStatus
      --, SERVERPROPERTY('InstanceDefaultBackupPath') InstanceDefaultBackupPath
      , SERVERPROPERTY('InstanceDefaultDataPath') InstanceDefaultDataPath
      --, SERVERPROPERTY('InstanceDefaultLogPath') InstanceDefaultLogPath
      --, SERVERPROPERTY('InstanceName') InstanceName
      , SERVERPROPERTY('IsAdvancedAnalyticsInstalled') IsAdvancedAnalyticsInstalled
      --, SERVERPROPERTY('IsBigDataCluster') IsBigDataCluster
      --, SERVERPROPERTY('IsClustered') IsClustered
      , SERVERPROPERTY('IsExternalAuthenticationOnly') IsExternalAuthenticationOnly
      , SERVERPROPERTY('IsExternalGovernanceEnabled') IsExternalGovernanceEnabled
      , SERVERPROPERTY('IsFullTextInstalled') IsFullTextInstalled
      --, SERVERPROPERTY('IsHadrEnabled') IsHadrEnabled
      --, SERVERPROPERTY('IsIntegratedSecurityOnly') IsIntegratedSecurityOnly
      --, SERVERPROPERTY('IsLocalDB') IsLocalDB
      --, SERVERPROPERTY('IsPolyBaseInstalled') IsPolyBaseInstalled
      --, SERVERPROPERTY('IsServerSuspendedForSnapshotBackup') IsServerSuspendedForSnapshotBackup
      --, SERVERPROPERTY('IsSingleUser') IsSingleUser
      --, SERVERPROPERTY('IsTempDbMetadataMemoryOptimized') IsTempDbMetadataMemoryOptimized
      , SERVERPROPERTY('IsXTPSupported') IsXTPSupported
      , SERVERPROPERTY('LCID') LCID
      , SERVERPROPERTY('LicenseType') LicenseType
      , SERVERPROPERTY('MachineName') MachineName
      , SERVERPROPERTY('NumLicenses') NumLicenses
      , SERVERPROPERTY('PathSeparator') PathSeparator
      --, SERVERPROPERTY('ProcessID') ProcessID
      , SERVERPROPERTY('ProductBuild') ProductBuild
      --, SERVERPROPERTY('ProductBuildType') ProductBuildType
      --, SERVERPROPERTY('ProductLevel') ProductLevel
      --, SERVERPROPERTY('ProductMajorVersion') ProductMajorVersion
      --, SERVERPROPERTY('ProductMinorVersion') ProductMinorVersion
      --, SERVERPROPERTY('ProductUpdateLevel') ProductUpdateLevel
      --, SERVERPROPERTY('ProductUpdateReference') ProductUpdateReference
      --, SERVERPROPERTY('ProductUpdateType') ProductUpdateType
      , SERVERPROPERTY('ProductVersion') ProductVersion
      , SERVERPROPERTY('ResourceLastUpdateDateTime') ResourceLastUpdateDateTime
      , SERVERPROPERTY('ResourceVersion') ResourceVersion
      , SERVERPROPERTY('ServerName') ServerName
      , SERVERPROPERTY('SqlCharSet') SqlCharSet
      , SERVERPROPERTY('SqlCharSetName') SqlCharSetName
      , SERVERPROPERTY('SqlSortOrder') SqlSortOrder
      , SERVERPROPERTY('SqlSortOrderName') SqlSortOrderName
      , SERVERPROPERTY('SuspendedDatabaseCount') SuspendedDatabaseCount
      

      Output (consolidated):

      Property | SQL database | Warehouse | Lakehouse
      Collation | SQL_Latin1_General_CP1_CI_AS | SQL_Latin1_General_CP1_CI_AS | SQL_Latin1_General_CP1_CI_AS
      ComparisonStyle | 196609 | 196609 | 196609
      Edition | SQL Azure | SQL Azure | SQL Azure
      EngineEdition | 12 | 11 | 11
      InstanceDefaultDataPath | NULL | NULL | NULL
      IsAdvancedAnalyticsInstalled | 1 | 1 | 1
      IsExternalAuthenticationOnly | 1 | 0 | 0
      IsExternalGovernanceEnabled | 1 | 1 | 1
      IsFullTextInstalled | 1 | 0 | 0
      IsXTPSupported | 1 | 1 | 1
      LCID | 1033 | 1033 | 1033
      LicenseType | DISABLED | DISABLED | DISABLED
      MachineName | NULL | NULL | NULL
      NumLicenses | NULL | NULL | NULL
      PathSeparator | \ | \ | \
      ProductBuild | 2000 | 502 | 502
      ProductVersion | 12.0.2000.8 | 12.0.2000.8 | 12.0.2000.8
      ResourceLastUpdateDateTime | 11/6/2024 3:41:27 PM | 3/5/2025 12:05:50 PM | 3/5/2025 12:05:50 PM
      ResourceVersion | 16.00.5751 | 17.00.502 | 17.00.502
      ServerName | ... | ... | ...
      SqlCharSet | 1 | 1 | 1
      SqlCharSetName | iso_1 | iso_1 | iso_1
      SqlSortOrder | 52 | 52 | 52
      SqlSortOrderName | nocase_iso | nocase_iso | nocase_iso
      SuspendedDatabaseCount | NULL | 0 | 0

      It's interesting that all three instances have the same general collation, while the Engine Edition of SQL databases differs from the one of the other repositories (12 vs. 11) [2]. The server names have been removed manually from the output for obvious reasons. The warehouse and lakehouse are in the same environment (the same SQL Azure instance, see sys.databases), and therefore the same values are shown (though this might happen independently of the environments used).

      The queries were run in a trial Microsoft Fabric environment. Other environments may have different properties, depending on the case. Just remove the "--" from the commented code to get a complete overview.

      The queries should also run in the other editions of SQL Server. If DATABASEPROPERTYEX is not supported, one should try DATABASEPROPERTY instead.
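
      A minimal sketch of the fallback, retrieving a few of the properties used above (the sets of properties supported by the two functions differ):

      -- fallback for editions in which DATABASEPROPERTYEX is not supported
      SELECT DATABASEPROPERTY(DB_NAME(), 'IsAutoCreateStatistics') IsAutoCreateStatistics
      , DATABASEPROPERTY(DB_NAME(), 'IsAutoUpdateStatistics') IsAutoUpdateStatistics
      , DATABASEPROPERTY(DB_NAME(), 'IsFulltextEnabled') IsFulltextEnabled
      , DATABASEPROPERTY(DB_NAME(), 'Version') Version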

      Happy coding!

      Previous Post <<||>> Next Post

      References:
      [1] Microsoft Learn (2024) SQL Server 2022: DATABASEPROPERTYEX (Transact-SQL) [link]
      [2] Microsoft Learn (2024) SQL Server 2022: SERVERPROPERTY (Transact-SQL) [link]

      🏭🗒️Microsoft Fabric: Azure Storage Account [Notes]

      Disclaimer: This is work in progress intended to consolidate information from various sources for learning purposes. For the latest information please consult the documentation (see the links below)! 

      Last updated: 16-Mar-2025

      [Microsoft Fabric] Azure Storage Account

      • {def} a container provided by Microsoft Azure that houses all storage data in the cloud, including blobs (binary large objects), files, queues, and tables 
      • provides a unique namespace for Azure Storage data that's accessible from anywhere in the world over HTTP or HTTPS [1]
        • every object stored has a URL address that includes the customer's unique account name [1]
          • the combination of the account name and the service endpoint forms the endpoints for the storage account [1]
      • is an ARM resource
        • ⇐ the deployment and management service for Azure
        • belongs to an Azure resource group
          • ⇐ a logical container for grouping Azure services [3]
      • {characteristic} durable
      • {characteristic} highly available
      • {characteristic} secure
      • {characteristic} massively scalable
      • {concept} storage account name
        • must be between 3 and 24 characters in length
        • may contain numbers and lowercase letters only
        • must be unique within Azure
      • {operation} create storage account 
      • {operation} migrate storage account
        • {scenario} move a storage account to a different subscription
        • {scenario} move a storage account to a different resource group
        • {scenario} move a storage account to a different region
        • {scenario} upgrade to a general-purpose v2 storage account
        • {scenario} migrate a classic storage account to Azure Resource Manager
      • {operation} delete storage account
        • deletes the entire account, including all data in the account [3]
        • recovery is not guaranteed
          • under certain circumstances, a deleted storage account might be recovered [3]
        • deleting the resource group deletes the storage account and any other resources in that resource group [3]
        • {recommendation} back up any data before deleting the account [3]
      • {operation} monitor storage accounts
        • see storage metrics in Azure Monitor
      • {operation} transfer data into a storage account
      • {type} general purpose v1 (GPv1)
        • can no longer be created from the Azure portal [3]
        • no new feature development is expected for this account type [3]
        • currently there's no plan to deprecate support [3]
          • at least one year's advance notice will be provided before deprecating [3]
      • {type} general purpose v2 (GPv2)
        • recommended for most scenarios [3]
        • supports the latest Azure Storage features and incorporates all of the functionality of GPv1 [4]
        • upgrading to a GPv2 storage account from GPv1 or Blob storage accounts is straightforward [4]
          • permanent and cannot be undone [4]
          • there's no downtime or risk of data loss associated [4]
          • happens via a simple ARM operation that changes the account type [4]
          • the upgrade is free
          • changing the storage access tier after the upgrade may result in bill changes [4]
      • blob access tiers 
        • enable you to choose the most cost-effective storage based on your anticipated usage patterns [4]
      • storage account encryption
        • all data is automatically encrypted on the service side [1]
      • provides three storage services: 
        • table storage 
          • a NoSQL key/value store
        • queue storage 
          • a simple queueing service
        • blob storage
          • used extensively throughout Azure to store things from logs to virtual machine disks [2]
          • enables storing files predominantly in three different formats [2]
            • ⇐ according to read/write workload [2]
            • block blobs 
              • optimized for storing text or binary files
              • allow for efficient parallel upload/download of a block or list of blocks and modification of a blob at the block granularity [2]
              • modifications to an individual blob take a two-phase approach [2]
                • {phase 1} one uploads the changes as a set of blocks. 
                • {phase 2} one commits the changes by identifying the list of uploaded blocks
                • commonly used for files where the typical workload is to read or write the entire file [2]
                  • e.g. text files, CSVs, and binary files 
            • append blobs
              • a variant of a block blob that is optimized for append-only write workloads [2]
                • e.g. logging
              • doesn't allow deleting or updating of existing blocks [2]
            • page blobs
              • optimized for predominantly random read/write workloads against portions of the blob, where data is stored in pages [2]
                • e.g. virtual machine disks
          • file structure for a blob in Blob Storage
            • container
              • logical grouping of blobs
                • similar to how folders group files on your local machine [2]
              • at the root of the Storage account [2]
              • it can be used to set access permissions on the blobs it contains [2]
              • each container can hold an unlimited quantity of blobs [2]
              • each blob (or more precisely, each container and blob pair) identifies a partition [2]
                • each file in Blob Storage is its own partition [2]
      • when creating the Storage account one can define the degree of replication of the data desired for high availability and disaster recovery purposes [2]
        • at minimum data stored within a Storage account is replicated on three separate nodes within a single facility (i.e., building)
          • {option} locally redundant storage (LRS)
            • stores three copies of the data within a facility (which is naturally within a specific geographic region) [2]
          • {option} zone-redundant storage (ZRS)
            • augments LRS by enabling a replica within another facility within the same region. 
            • supports only block blobs [2]
          • {option} geo-redundant storage (GRS)
            • automatically replicates blob storage to another geographic region that is hundreds of miles away from the primary [2]
            • this secondary replica is not readable unless the primary becomes unavailable 
              • when this happens, the failover is transparent to your application, but Azure will send you an email notification [2]
              • when new data arrives, that data is first replicated to the three local replicas and then asynchronously replicated to the secondary geographic replica (where it is also replicated three times) [2]
          • {option} read-only geo-redundant storage (RA-GRS)
            • variant of GRS providing a secondary endpoint that enables reading from the secondary Storage account [2]
        • Azure Monitor 
          • provides a unified monitoring experience [4]
          • stores metrics that include aggregated transaction statistics and capacity data about requests to the storage service [4]
          • receives metric data from Azure Storage [4]

      References:
      [1] Microsoft Learn (2025) Azure: Storage account overview [link]
      [2] Zoiner Tejada (2017) Mastering Azure Analytics 
      [3] Microsoft Learn (2025) Azure: Create a storage account [link]
      [4] Microsoft Learn (2025) Upgrade to a general-purpose v2 storage account [link]

      Resources:
      [R1] Microsoft Learn (2025) Fabric: What's new in Microsoft Fabric? [link]

      Acronyms:
      ARM - Azure Resource Manager 
      GRS - Geo-Redundant Storage
      LRS - Locally Redundant Storage
      RA-GRS - Read-Only Geo-Redundant Storage
      ZRS - Zone-Redundant Storage

      15 March 2025

      🏭🗒️Microsoft Fabric: Azure Data Lake Storage Gen2 (ADLS Gen2) [Notes]

      Disclaimer: This is work in progress intended to consolidate information from various sources for learning purposes.

      Last updated: 21-Jan-2025

      [Microsoft Fabric] Azure Data Lake Storage Gen2 (ADLS Gen2)

      • {def} a set of capabilities built on Azure Blob Storage and dedicated to big data analytics [27]
      • {capability} low-cost storage
        • provides cloud storage that is less expensive than the cloud storage that relational databases provide [25]
      • {capability} performant massive storage
        • designed to service multiple petabytes of information while sustaining hundreds of gigabits of throughput [27]
          • processing is executed at near-constant per-request latencies that are measured at the service, account, and file levels [27]
      • {capability} tiered storage
      • {capability} high availability/disaster recovery
      • {capability} file system semantics
      • {capability} file-level security
      • {capability} scalability
      • supports a range of tools and programming languages that enable large amounts of data to be reported on, queried, and transformed [25]
      • included in customer subscriptions
        • customers can bring their data lake and integrate it with D365FO
      • combines BYOD and Entity store [25]
      • entity store 
        • staged in the data lake and provides a set of simplified (denormalized) data structures to make reporting easier [25]
          • users can be given direct access to the relevant data and can create reports [25]
        • instead of exporting data by using BYOD, customers can select the data that is staged in the data lake
          • the data in the data lake is synchronized via the D365FO data feed service [25]
          • reflects the updated D365FO data within a few minutes after the changes occur [25]
        • further data can be brought into the data lake
        • the data can be consumed via cloud-based services [25]
        • {benefit} no need to monitor and manage complex data export and orchestration schedules [25]
        • {benefit} no user intervention is required to update data in the data lake [25]
        • {benefit} cost-effective data storage
      • [licensing] included in a customer's subscription
        • the customer must pay for 
          • data storage
          • I/O costs that are incurred when data is read and written to the data lake [25]
          • I/O costs because D365FO apps write data to the data lake or update the data in it [25]
      • {prerequisite} the data lakes must be provisioned in the same country or region as the D365 environment [25]
        • the stored data complies with the CDM folder standard
        • integration makes warm-path reporting available as the default reporting option
          • because data in a data lake is updated within minutes [25]
          • approach acceptable for most reporting scenarios (incl. near-real-time reporting) [25]
      • {benefit} designed to store large amounts of data [25]
      • {benefit} designed for big data analytics [25]
      • {benefit} has many associated services that enable analytics, data transformation, and the application of AI and machine learning [25]
        • hierarchical namespace 
          • organizes objects/files into a hierarchy of directories for efficient data access [27]
          • renaming or deleting a directory becomes a single atomic metadata operation on the directory [28]
          • no need to enumerate and process all objects that share the name prefix of the directory [28]
      • {feature} query acceleration 
        • enables applications and analytics frameworks to significantly optimize data processing by retrieving only the data that they require to perform a given operation [28]
        • accepts filtering predicates and column projections which enable applications to filter rows and columns at the time that data is read from disk [28]
          • only the data that meets the conditions of a predicate is transferred over the network to the application [28]
          • reduces network latency and compute cost [28]
            • a query acceleration request is specified via SQL (see the example after these notes)
          • a request processes only one file
        • advanced relational features of SQL aren't supported [28] 
          • e.g. joins and group by aggregates
        • supports CSV and JSON formatted data as input to each request
        • compatible with the blobs in storage accounts that don't have a hierarchical namespace enabled on them [28]
        • designed for distributed analytics frameworks 
          • e.g. Apache Spark and Apache Hive
            • these engines include query optimizers that can incorporate knowledge of the underlying I/O service's capabilities when determining an optimal query plan for user queries [28]
            • these frameworks integrate query acceleration
              • ⇒ users see improved query latency and a lower total cost of ownership without having to make any changes to the queries 
          • includes a storage abstraction layer within the framework  [28]
        • designed for data processing applications that perform large-scale data transformations that might not directly lead to analytics insights [28]
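
      A minimal sketch of the SQL dialect accepted by query acceleration, assuming a headerless CSV blob whose columns are addressed positionally as _1, _2, etc. (the request itself is submitted via the REST API or an SDK, not via a database endpoint):

      -- return two columns from the rows matching the predicate;
      -- only the filtered data is transferred over the network [28]
      SELECT _1, _2
      FROM BlobStorage
      WHERE _3 > 100

      The FROM clause always references BlobStorage, i.e. the file targeted by the request; joins and GROUP BY aggregates aren't supported (see the notes above).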


      References:
      [25] Azure Synapse Analytics (2020) Azure Data Lake overview [link]
      [27] Introduction to Azure Data Lake Storage Gen2 [link]
      [28] Azure Data Lake Storage query acceleration [link]

      Resources:
      [R1] Microsoft Learn (2025) Fabric: What's new in Microsoft Fabric? [link]

      Acronyms:
      AI - Artificial Intelligence
      BYOD - Bring Your Own Database
      CDM - Common Data Model
      CSV - Comma-Separated Values
      D365FO - Dynamics 365 for Finance and Operations
      I/O - Input/Output
      JSON - JavaScript Object Notation
