
23 February 2025

💎🏭SQL Reloaded: Microsoft Fabric's SQL Databases (Part X: Templates for Database Objects)

One of the features noticed in SQL databases while working on the previous post is the availability of templates for database objects. The functionality is useful even if it's kept to a minimum. Probably, more value can be obtained when it's used in combination with Copilot, which requires at least an F12 capacity [1].

Schemas

Schemas are used to create a logical grouping of objects such as tables, stored procedures, and functions. From a structural and security point of view it makes sense to create additional schemas to manage the various database objects and to use the default dbo schema only occasionally (e.g. for globally used objects).

-- generated template - schema
CREATE SCHEMA SchemaName

-- create schema
CREATE SCHEMA Test

One can look at sys.schemas to retrieve all the schemas available:

-- retrieve all schemas
SELECT schema_id
, name
, principal_id
FROM sys.schemas
ORDER BY schema_id

Tables

Tables, as the database objects that contain all the data in a database, are probably the elements that need the greatest attention in design and data processing. In some cases a table can be denormalized, storing all the data needed much like in MS Excel, respectively normalized into fact and dimension tables. 

Tables can be created explicitly by defining in advance their structure (see Option 1), respectively on the fly (see Option 2). 

-- Option 1
-- create the table manually
CREATE TABLE [Test].[Customers](
	[CustomerId] [int] NOT NULL,
	[AddressID] [int] NULL,
	[Title] [nvarchar](8) NULL,
	[FirstName] [nvarchar](50) NULL,
	[LastName] [nvarchar](50) NULL,
	[CompanyName] [nvarchar](128) NULL,
	[SalesPerson] [nvarchar](256) NULL
) ON [PRIMARY]
GO

-- insert records
INSERT INTO Test.Customers (CustomerId, Title, FirstName, LastName, CompanyName, SalesPerson)
SELECT CustomerId
, Title
, FirstName 
, LastName
, CompanyName
, SalesPerson
FROM SalesLT.Customer

-- checking the output
SELECT top 100 *
FROM Test.Customers
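
Option 2 creates the table on the fly via SELECT INTO; a minimal sketch, in which the target table's name is illustrative:

-- Option 2
-- create the table on the fly (illustrative table name)
SELECT CustomerId
, Title
, FirstName 
, LastName
, CompanyName
, SalesPerson
INTO Test.CustomersOTF
FROM SalesLT.Customer

-- checking the output
SELECT top 100 *
FROM Test.CustomersOTF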

One can look at sys.tables to retrieve all the tables available:

-- retrieve all tables
SELECT schema_name(schema_id) schema_name
, object_id
, name
FROM sys.tables
ORDER BY schema_name
, name

Views

Views are virtual tables based on the result-set of an SQL statement that combines data from one or more tables. They can be used to encapsulate logic, respectively to project horizontally or vertically a subset of the data. 

-- create view
CREATE OR ALTER VIEW Test.vCustomers
-- Customers 
AS
SELECT CST.CustomerId 
, CST.Title
, CST.FirstName 
, IsNull(CST.MiddleName, '') MiddleName
, CST.LastName 
, CST.CompanyName 
, CST.SalesPerson 
FROM SalesLT.Customer CST

-- test the view 
SELECT *
FROM Test.vCustomers
WHERE CompanyName = 'A Bike Store'

One can look at sys.views to retrieve all the views available:

-- retrieve all views
SELECT schema_name(schema_id) schema_name
, object_id
, name
FROM sys.views
ORDER BY schema_name
, name

User-Defined Functions

A user-defined function (UDF) allows one to create a function by using a SQL expression. It can be used alone or as part of a query, as in the example below.

-- generated template - user defined function
CREATE FUNCTION [dbo].[FunctionName] (
    @param1 INT,
    @param2 INT
)
RETURNS INT AS BEGIN RETURN
    @param1 + @param2
END

-- user-defined function: 
CREATE OR ALTER FUNCTION Test.GetFirstMiddleLastName (
    @FirstName nvarchar(50),
    @MiddleName nvarchar(50),
    @LastName nvarchar(50)
)
RETURNS nvarchar(150) AS 
BEGIN 
   RETURN IsNull(@FirstName, '') + IsNull(' ' + @MiddleName, '') + IsNull(' ' + @LastName, '') 
END

-- test UDF on single values
SELECT Test.GetFirstMiddleLastName ('Jack', NULL, 'Sparrow')
SELECT Test.GetFirstMiddleLastName ('Jack', 'L.', 'Sparrow')

-- test UDF on a whole table
SELECT TOP 100 Test.GetFirstMiddleLastName (FirstName, MiddleName, LastName)
FROM SalesLT.Customer

One can look at sys.objects to retrieve all the scalar functions available:

-- retrieve all scalar functions
SELECT schema_name(schema_id) schema_name
, name
, object_id
FROM sys.objects 
WHERE type_desc = 'SQL_SCALAR_FUNCTION'
ORDER BY schema_name
, name

However, UDFs prove to be especially useful when they mix the capabilities of functions with the ones of views, allowing one to create a "parametrized view" (see the next example) or even to encapsulate a multi-statement body that returns a dataset (see the sketch further below). Currently, there seems to be no template available for creating such functions.

-- table-valued function
CREATE OR ALTER FUNCTION Test.tvfGetCustomers (
    @CompanyName nvarchar(50) NULL
)
RETURNS TABLE
-- Customers by Company
AS
RETURN (
	SELECT CST.CustomerId 
	, CST.CompanyName
	, CST.Title
	, IsNull(CST.FirstName, '') + IsNull(' ' + CST.MiddleName, '') + IsNull(' ' + CST.LastName, '') FullName
	, CST.FirstName 
	, CST.MiddleName 
	, CST.LastName 
	FROM SalesLT.Customer CST
	WHERE CST.CompanyName = IsNull(@CompanyName, CST.CompanyName)
);

-- test function for values
SELECT *
FROM Test.tvfGetCustomers ('A Bike Store')
ORDER BY CompanyName
, FullName

-- test function for retrieving all values
SELECT *
FROM Test.tvfGetCustomers (NULL)
ORDER BY CompanyName
, FullName
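
UDFs can also encapsulate a multi-statement body; below is a minimal sketch of a multi-statement table-valued function built over the same data (the function's name and the returned columns are illustrative). Such functions appear in sys.objects with the type 'SQL_TABLE_VALUED_FUNCTION' and are thus not returned by the query further below, which targets only the inline variant:

-- multi-statement table-valued function (sketch)
CREATE OR ALTER FUNCTION Test.mtvfGetCustomers (
    @CompanyName nvarchar(50) NULL
)
RETURNS @Customers TABLE (
    CustomerId int NOT NULL
  , CompanyName nvarchar(128) NULL
  , FullName nvarchar(150) NULL
)
AS
BEGIN
    -- populate the table variable returned by the function
    INSERT INTO @Customers
    SELECT CST.CustomerId 
    , CST.CompanyName
    , IsNull(CST.FirstName, '') + IsNull(' ' + CST.MiddleName, '') + IsNull(' ' + CST.LastName, '') FullName
    FROM SalesLT.Customer CST
    WHERE CST.CompanyName = IsNull(@CompanyName, CST.CompanyName)

    RETURN
END

-- test the function
SELECT *
FROM Test.mtvfGetCustomers ('A Bike Store')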

One can look at sys.objects to retrieve all the inline table-valued functions available:

-- retrieve all table-valued functions
SELECT schema_name(schema_id) schema_name
, name
, object_id
FROM sys.objects 
WHERE type_desc = 'SQL_INLINE_TABLE_VALUED_FUNCTION'
ORDER BY schema_name , name

Stored Procedures

A stored procedure is a prepared SQL statement that is stored as a database object and precompiled. Typically, the statements considered in SQL functions can also be implemented as stored procedures, however the latter don't allow reusing the output directly (though the output can be captured indirectly, as sketched below).

-- get customers by company
CREATE OR ALTER PROCEDURE Test.spGetCustomersByCompany (
    @CompanyName nvarchar(50) NULL
)
AS
BEGIN
	SELECT CST.CustomerId 
	, CST.CompanyName
	, CST.Title
	, IsNull(CST.FirstName, '') + IsNull(' ' + CST.MiddleName, '') + IsNull(' ' + CST.LastName, '') FullName
	, CST.FirstName 
	, CST.MiddleName 
	, CST.LastName 
	FROM SalesLT.Customer CST
	WHERE CST.CompanyName = IsNull(@CompanyName, CST.CompanyName)
	ORDER BY CST.CompanyName
	, FullName
END 

-- test the procedure 
EXEC Test.spGetCustomersByCompany NULL -- all customers
EXEC Test.spGetCustomersByCompany 'A Bike Store' -- individual customer
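
Though the output can't be reused directly, it can be captured indirectly via INSERT ... EXEC into a table whose structure matches the procedure's result set; a minimal sketch (the temporary table's name is illustrative):

-- capture the procedure's output (sketch)
DROP TABLE IF EXISTS #CustomersByCompany

-- structure must match the procedure's result set
CREATE TABLE #CustomersByCompany (
    CustomerId int
  , CompanyName nvarchar(128)
  , Title nvarchar(8)
  , FullName nvarchar(150)
  , FirstName nvarchar(50)
  , MiddleName nvarchar(50)
  , LastName nvarchar(50)
)

INSERT INTO #CustomersByCompany
EXEC Test.spGetCustomersByCompany 'A Bike Store'

-- reuse the output
SELECT *
FROM #CustomersByCompany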

One can look at sys.objects to retrieve all the stored procedures available:

-- retrieve all stored procedures
SELECT schema_name(schema_id) schema_name
, name
, object_id
FROM sys.objects 
WHERE type_desc = 'SQL_STORED_PROCEDURE'
ORDER BY schema_name , name

In the end, don't forget to drop the objects created above (note the order of the dependencies):

-- drop function 
DROP FUNCTION IF EXISTS Test.GetFirstMiddleLastName

-- drop function 
DROP FUNCTION IF EXISTS Test.tvfGetCustomers 
-- drop function 
DROP FUNCTION IF EXISTS Test.mtvfGetCustomers

-- drop procedure
DROP PROCEDURE IF EXISTS Test.spGetCustomersByCompany

-- drop view
DROP VIEW IF EXISTS Test.vCustomers

-- drop tables
DROP TABLE IF EXISTS Test.Customers
DROP TABLE IF EXISTS Test.CustomersOTF

-- drop schema
DROP SCHEMA IF EXISTS Test

Previous Post <<||>> Next Post

References:
[1] Microsoft Learn (2024) Microsoft Fabric: Overview of Copilot in Fabric [link]

💎🏭SQL Reloaded: Microsoft Fabric's SQL Databases (Part IX: From OLTP to OLAP Data Models)

With SQL databases, Microsoft brought OLTP to Microsoft Fabric, which allows addressing a wider range of requirements, though it also involves some challenges that are usually addressed by the transition from OLTP to OLAP architectures. Typically, an abstraction layer is built on top of the OLTP data model to address the various OLAP requirements. As soon as OLTP and OLAP models are mixed together, the door is open to design and data quality issues that impact the adoption of solutions by users. Probably, those who worked with MS Access or even MS Excel directly or in combination with SQL Server can still remember the issues they ran into.

Ideally, there should be a separation layer between the OLTP and the OLAP data. This can be easily achieved in SQL databases by using two different schemas that mimic the interaction between the two types of architectures. So, supposing that the dbo schema from the SalesLT database contains the data as maintained by the OLTP layer, one can add an additional schema Test in which the OLAP logic is modeled. This scenario is not ideal, though it allows modeling the two aspects of the topic considered. The following steps are to be performed in the environment in which the SalesLT database was created. 

Independently of the layer in which one works, it's ideal to create a set of views that abstracts the logic and ideally simplifies the processing of data. So, in a first step it's recommended to abstract the data from the source by creating a set of views like the one below:

-- drop view (cleaning)
-- DROP VIEW IF EXISTS SalesLT.vCustomerLocations 

-- create view
CREATE VIEW SalesLT.vCustomerLocations
-- Customers with main office
AS
SELECT CST.CustomerId 
, CSA.AddressID
, CST.Title
, CST.FirstName 
, IsNull(CST.MiddleName, '') MiddleName
, CST.LastName 
, CST.CompanyName 
, CST.SalesPerson 
, IsNull(CSA.AddressType, '') AddressType
, IsNull(ADR.City, '') City
, IsNull(ADR.StateProvince, '') StateProvince
, IsNull(ADR.CountryRegion, '') CountryRegion
, IsNull(ADR.PostalCode, '') PostalCode
FROM SalesLT.Customer CST
	 LEFT JOIN SalesLT.CustomerAddress CSA
	   ON CST.CustomerID = CSA.CustomerID
	  AND CSA.AddressType = 'Main Office'
	 	LEFT JOIN SalesLT.Address ADR
		  ON CSA.AddressID = ADR.AddressID

The view uses LEFT instead of FULL joins because this allows more flexibility, respectively identifying the gaps existing between entities (e.g. customers without addresses). In these abstractions, the number of transformations is kept to a minimum to reflect the data as provided by the source. One may choose to minimize the occurrence of NULL values as this simplifies the logic for comparisons (see the use of IsNull).
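
For example, assuming the view above was created, such gaps can be identified directly:

-- customers without a main office address (gap identification)
SELECT *
FROM SalesLT.vCustomerLocations
WHERE AddressID IS NULL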

Once the abstraction from the OLTP layer was built, one can make the data available in the OLAP layer:

-- create schema
CREATE SCHEMA Test

-- dropping the target table (for cleaning)
-- DROP TABLE IF EXISTS Test.CustomerLocations

-- Option 1
-- create the table on the fly
SELECT *
INTO Test.CustomerLocations
FROM SalesLT.vCustomerLocations

-- Option 2
-- create the table manually (alternative to the previous step)
CREATE TABLE [Test].[CustomerLocations](
	[CustomerId] [int] NOT NULL,
	[AddressID] [int] NULL,
	[Title] [nvarchar](8) NULL,
	[FirstName] [nvarchar](50) NULL,
	[MiddleName] [nvarchar](50) NULL,
	[LastName] [nvarchar](50) NULL,
	[CompanyName] [nvarchar](128) NULL,
	[SalesPerson] [nvarchar](256) NULL,
	[AddressType] [nvarchar](50) NULL,
	[City] [nvarchar](30) NULL,
	[StateProvince] [nvarchar](50) NULL,
	[CountryRegion] [nvarchar](50) NULL,
	[PostalCode] [nvarchar](15) NULL
) ON [PRIMARY]
GO

-- insert records
INSERT INTO Test.CustomerLocations
SELECT *
FROM SalesLT.vCustomerLocations


-- checking the output (both scenarios)
SELECT top 100 *
FROM Test.CustomerLocations


-- drop the view (for cleaning)
-- DROP VIEW IF EXISTS Test.vCustomerLocations

-- create view
CREATE VIEW Test.vCustomerLocations
-- Customer locations
AS
SELECT CSL.CustomerId 
, CSL.AddressID
, CSL.Title
, CSL.FirstName 
, CSL.MiddleName 
, CSL.LastName 
, Concat(CSL.FirstName, ' ' + CSL.MiddleName, ' ', CSL.LastName) FullName
, CSL.CompanyName 
, CSL.SalesPerson 
, CSL.AddressType
, CSL.City
, CSL.StateProvince
, CSL.CountryRegion 
, CSL.PostalCode
FROM Test.CustomerLocations CSL

-- test the view
SELECT top 100 *
FROM Test.vCustomerLocations

Further on, one can create additional objects as required. Usually, a set of well-designed views is enough, offering the needed flexibility with a minimum of code duplication. In addition, one can build stored procedures and table-valued functions as needed:

-- drop the function (for cleaning)
-- DROP FUNCTION IF EXISTS Test.tvfGetCustomerAddresses

-- generated template - function
CREATE FUNCTION Test.tvfGetCustomerAddresses (
    @CountryRegion nvarchar(50) NULL,
    @StateProvince nvarchar(50) NULL
)
RETURNS TABLE
-- Customers by Country & State province
AS
RETURN (
SELECT CSL.CustomerId 
, CSL.AddressID
, CSL.Title
, CSL.FirstName 
, CSL.MiddleName 
, CSL.LastName 
, CSL.FullName
, CSL.CompanyName 
, CSL.SalesPerson 
, CSL.AddressType 
, CSL.City
, CSL.StateProvince 
, CSL.CountryRegion 
, CSL.PostalCode
FROM Test.vCustomerLocations CSL
WHERE CSL.CountryRegion = IsNull(@CountryRegion, CSL.CountryRegion)
  AND CSL.StateProvince = IsNull(@StateProvince, CSL.StateProvince)
);

-- retrieving all records
SELECT *
FROM Test.tvfGetCustomerAddresses(NULL, NULL)

-- providing parameters
SELECT *
FROM Test.tvfGetCustomerAddresses('United States', 'Utah')

-- filtering on non-parametrized columns
SELECT *
FROM Test.tvfGetCustomerAddresses('United States', 'Utah')
WHERE City = 'Salt Lake City'



-- drop the procedure (for cleaning)
-- DROP PROCEDURE IF EXISTS Test.spGetCustomerAddresses 

-- generated template - stored procedure
CREATE PROCEDURE Test.spGetCustomerAddresses (
    @CountryRegion nvarchar(50) NULL,
    @StateProvince nvarchar(50) NULL
)
-- Customers by Country & State province
AS
BEGIN
	SELECT CSL.CustomerId 
	, CSL.AddressID
	, CSL.Title
	, CSL.FirstName 
	, CSL.MiddleName 
	, CSL.LastName 
	, CSL.FullName
	, CSL.CompanyName 
	, CSL.SalesPerson 
	, CSL.AddressType 
	, CSL.City
	, CSL.StateProvince 
	, CSL.CountryRegion 
	, CSL.PostalCode
	FROM Test.vCustomerLocations CSL
	WHERE CSL.CountryRegion = IsNull(@CountryRegion, CSL.CountryRegion)
	AND CSL.StateProvince = IsNull(@StateProvince, CSL.StateProvince)
END 

-- retrieving all records
EXEC Test.spGetCustomerAddresses NULL, NULL

-- providing parameters
EXEC Test.spGetCustomerAddresses 'United States', 'Utah'

These steps can be repeated for each entity in scope.

This separation between OLTP and OLAP is usually necessary given that business processes need a certain amount of time until they are correctly reflected as per reporting needs. Otherwise, the gaps can negatively impact the quality of the data used for reporting. For some reports these deviations might be acceptable, though there will probably also be (many) exceptions. Independently of the solution used, one still needs to make sure that the data are appropriate for the processes and reporting. 

If no physical separation is needed between the two types of layers, one can remove the persisted tables from the logic and keep the objects as they are.

Independently of which architecture is chosen, one shouldn't forget to validate one's presumptions in what concerns the data model (e.g. customers without addresses, address types, etc.).
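
A minimal sketch of such validations, based on the tables considered above:

-- review the address types used
SELECT CSA.AddressType
, count(*) no_records
FROM SalesLT.CustomerAddress CSA
GROUP BY CSA.AddressType
ORDER BY CSA.AddressType

-- customers without any address
SELECT count(*) no_records
FROM SalesLT.Customer CST
     LEFT JOIN SalesLT.CustomerAddress CSA
       ON CST.CustomerID = CSA.CustomerID
WHERE CSA.CustomerID IS NULL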

Previous Post <<||>> Next Post

06 February 2025

🌌🏭KQL Reloaded: First Steps (Part V: Database Metadata)

When working with a new data repository, one of the first things to do is to look at the database's metadata, when available, and try to get a bird's-eye view of what's available: how big the database is in terms of size, tables and user-defined objects, how the schema was defined, how the data are stored, eventually how often backups are taken, which users have access and to what, etc. 

So, after creating some queries in KQL and figuring out how things work, I tried to check what metadata are available, how they can be accessed, etc. The target is not to provide a full list of the available metadata, but to understand what information is available, in what format, how easy it is to extract the important metadata, etc. 

So, the first set of metadata is related to the database:

// get database metadata
.show databases (ContosoSales)

// get database metadata (multiple databases)
.show databases (ContosoSales, Samples)

// get database schema metadata
.show databases (ContosoSales) schema

// get database schema metadata (multiple databases) 
.show databases (ContosoSales, Samples) schema

// get database schema violations metadata
.show database ContosoSales schema violations

// get database entities metadata
.show databases entities with (showObfuscatedStrings=true)
| where DatabaseName == "ContosoSales"

// get database metadata 
.show databases entities with (resolveFunctionsSchema=true)
| where DatabaseName == "ContosoSales" and EntityType == "Table"
//| summarize count () //get the number of tables

// get a function's details
.show databases entities with (resolveFunctionsSchema=true)
| where DatabaseName == "ContosoSales" 
    and EntityType == "Function" 
    and EntityName == "SalesWithParams"

// get external tables metadata
.show external tables

// get materialized views metadata
.show materialized-views

// get query results metadata
.show stored_query_results

// get entities groups metadata
.show entity_groups

Then, it's useful to look at the database objects. 

// get all tables 
.show tables 
//| count

// get tables metadata
.show tables (Customers, NewSales)

// get tables schema
.show table Customers cslschema

// get schema as json
.show table Customers schema as json

// get table size: Customers
Customers
| extend sizeEstimateOfColumn = estimate_data_size(*)
| summarize totalSize_MB=round(sum(sizeEstimateOfColumn)/1024.00/1024.00,2)
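
Record counts can also be taken over several tables at once via a union; a sketch based on two of the tables used above:

// get record counts for multiple tables
union withsource=TableName Customers, NewSales
| summarize number_records = count() by TableName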

Unfortunately, the public environment has restrictions in what concerns the creation of objects, and for some of the available features one needs to create objects first in order to query the corresponding metadata.

Furthermore, it would be interesting to understand who has access to the various repositories, what policies were defined, and so on. 

// get principal roles
.show database ContosoSales principal roles

// get principal roles for table
.show table Customers principal roles

// get principal roles for function:
.show function SalesWithParams principal roles

// get retention policies
.show table Customers policy retention

// get sharding policies
.show table Customers policy sharding

There are many more objects one can explore. It makes sense to document the features, respectively the objects used for the various purposes.
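
For example, a few further commands worth exploring, assuming the corresponding objects and policies exist in the environment:

// get the stored functions' metadata
.show functions

// get the caching policy for a table
.show table Customers policy caching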

In addition, one should check also the best practices available for the data repository (see [2]).

Happy coding!

Previous Post <<||>> Next Post

References:
[1] Microsoft Learn (2024) Management commands overview [link]
[2] Microsoft Learn (2024) Kusto: Best practices for schema management [link]

23 January 2025

💎SQL Reloaded: Number of Records VI (via sp_MSForEachTable Undocumented Stored Procedure)

Starting with SQL Server 2000 it's possible to execute a command via the undocumented stored procedure sp_MSForEachTable for each table available in a database, respectively for a subset of the tables. In a previous post I showed how the stored procedure can be used in several scenarios, including how to get the total number of records for each set of tables. However, the code used generates a result set for each table, which makes it difficult to aggregate the information for further processing. In many scenarios, it would be useful to store the result in a temporary or even a persisted table.

-- dropping the tables
DROP TABLE IF EXISTS #Tables
DROP TABLE IF EXISTS #TablesRecordCount

-- create a temporary table to store the input list
SELECT TableName
INTO #Tables 
FROM (VALUES ('Person.Address')
, ('Person.AddressType')
, ('Person.BusinessEntity')) DAT(TableName)


-- create a temporary table to store the results
CREATE TABLE dbo.#TablesRecordCount (
  table_name nvarchar(150) NOT NULL
, number_records bigint
, run_date datetime2(0)
, comment nvarchar(255)
)

-- getting the number of records for the list of tables into the result table
INSERT INTO #TablesRecordCount
EXEC sp_MSForEachTable @command1='SELECT ''?'' [Table], COUNT(*) number_records, GetDate() run_date, ''testing round 1'' comment FROM ?'
, @whereand = ' And Object_id In (Select Object_id(TableName) FROM #Tables)'

-- reviewing the result
SELECT *
FROM #TablesRecordCount
ORDER BY number_records DESC

The above solution uses two temporary tables, though it can be easily adapted to persist the result in a standard table: just replace the "#" with the schema part (e.g. "dbo."), as sketched below. This can be useful in troubleshooting scenarios, when the code is run at different points in time, eventually for different sets of tables. 
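
A minimal sketch of the persisted variant (the table's name is illustrative; the input list #Tables is the one created above):

-- create a persisted table to store the results
CREATE TABLE dbo.TablesRecordCount (
  table_name nvarchar(150) NOT NULL
, number_records bigint
, run_date datetime2(0)
, comment nvarchar(255)
)

-- getting the number of records for the list of tables into the persisted table
INSERT INTO dbo.TablesRecordCount
EXEC sp_MSForEachTable @command1='SELECT ''?'' [Table], COUNT(*) number_records, GetDate() run_date, ''testing round 2'' comment FROM ?'
, @whereand = ' And Object_id In (Select Object_id(TableName) FROM #Tables)'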

The code is pretty simple and can be extended as needed. Unfortunately, there's no guarantee that the sp_MSForEachTable stored procedure will be supported in future versions of SQL Server. For example, the stored procedure is not available in SQL databases, respectively in Fabric warehouses. In SQL databases the following error is thrown:

"Msg 2812, Level 16, State 62, Line 1, Could not find stored procedure 'sys.sp_MSForEachTable'."

To test whether the feature works in your environment, it's enough to run a call to the respective stored procedure:

-- retrieve the record count for all tables
EXEC sp_MSForEachTable @command1='SELECT ''?'' [Table], COUNT(*) number_records FROM ?'

Or, you can check whether it works for one table (replace the Person.AddressType table with one from your environment):

-- getting the number of records for a single table
EXEC sp_MSForEachTable @command1='SELECT ''?'' [Table], COUNT(*) number_records FROM ?'
, @whereand = ' And Object_id = Object_id(''Person.AddressType'')'

The solution could prove to be useful in multiple scenarios, though one should consider also the risk of being forced to rewrite the code when the used stored procedure becomes unavailable. Even if it takes more time to write, a solution based on cursors can be more feasible (see previous post).

Update 29-Jan-2025: Probably, despite their usefulness, the undocumented features will not be brought to SQL databases (see [1], 47:30). So, be careful about using the respective features as standard solutions in production environments!

Happy coding!

Previous Post <<||>> Next Post

References:
[1] Microsoft Reactor (2025) Ask The Expert - Fabric Edition - Fabric Databases [link]

💎🏭SQL Reloaded: Number of Records V (via Cursors, a Solution for Warehouses in Microsoft Fabric)

After deploying the sample warehouse available in Microsoft Fabric, I tried to check the number of records available in the deployed tables under the dbo schema. Surprisingly, the sys.partitions.count column has 0 values for all the tables associated with the respective schema (see post). 

There are only a few tables available, and taking a record count for each table should be enough, which is relatively simple with the undocumented sp_MSForEachTable. Unfortunately, this approach doesn't work either, so one needs to revert to the use of old-fashioned cursors (as I used to do in SQL Server 2000):

-- number of records via cursor
DECLARE @table_name nvarchar(150)
DECLARE @sql nvarchar(250)
DECLARE @number_records bigint 
DECLARE @number_tables int, @iterator int

DROP TABLE IF EXISTS dbo.#tables;

CREATE TABLE dbo.#tables (
  ranking int NOT NULL
, table_name nvarchar(150) NOT NULL
, number_records bigint
)

INSERT INTO #tables
SELECT row_number() OVER(ORDER BY object_id) ranking
, concat(schema_name(schema_id),'.', name) table_name
, NULL number_records
FROM sys.tables obj
WHERE obj.schema_id = schema_id('dbo')
ORDER BY table_name

SET @iterator = 1
SET @number_tables = IsNull((SELECT count(*) FROM #tables), 0)

WHILE (@iterator <= @number_tables)
BEGIN 
    SET @table_name = (SELECT table_name FROM #tables WHERE ranking = @iterator)
    SET @sql = CONCAT(N'SELECT @NumberRecords = count(*) FROM ', @table_name)

	BEGIN TRY
		--get the number of records
		EXEC sp_executesql @Query = @sql
		, @params = N'@NumberRecords bigint OUTPUT'
		, @NumberRecords = @number_records OUTPUT

		IF IsNull(@number_records, 0)> 0  
		BEGIN
                SET @sql = 'UPDATE #tables' 
             + ' SET number_records = ' + Str(@number_records)
             + ' WHERE table_name = ''' + @table_name + '''';

		 EXEC(@sql)
		END 
	END TRY
	BEGIN CATCH  
	 -- no action needed in case of error
        END CATCH;

	SET @iterator = @iterator + 1
END

SELECT *
FROM dbo.#tables;

--DROP TABLE IF EXISTS dbo.#tables;
Results:
ranking table_name number_records
1 dbo.Date 5844
2 dbo.Geography 305179
3 dbo.HackneyLicense 42958
4 dbo.Time 86400
5 dbo.Weather 526330
6 dbo.Trip 2838927
7 dbo.Medallion 13668

Comments:
1) It's a lot of code for a simple task, though the code can be easily duplicated and adapted for similar requirements. Unfortunately, this can lead in time to many instances of the same code. When possible, one should consider encapsulating the logic in a stored procedure (see the sketch after this list). 
2) It's usually a good idea to check how many records are available in the tables used for testing, as this can impact queries' performance and tables' appropriateness for the tests performed. Moreover, it's a good idea to understand the volume of data when taking over or working with a database. 
3) If one removes the row_number function, the code should run also in SQL Server 2000. Similar solutions were used then for retrieving the record count.
4) Microsoft recommends not dropping the temporary tables explicitly, but letting SQL Server handle the cleanup automatically, thus taking advantage of the Optimistic Latching Algorithm, which helps prevent contention on TempDB [2].
5) There are others who stumbled over this issue (see [1]).
6) The solution has been tested successfully also in SQL databases.
7) The whole code must be run together because the temporary table seems to have only a transitory scope! An attempt to rerun the last SELECT from #tables raises the error: "Invalid object name '#tables'"
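
As mentioned in 1), below is a minimal sketch of how the logic could be encapsulated in a stored procedure parametrized by schema (the procedure's name is illustrative; a sketch, not a definitive implementation):

-- encapsulate the record counts in a stored procedure
CREATE OR ALTER PROCEDURE dbo.spGetRecordCounts (
    @schema_name nvarchar(50) = 'dbo'
)
AS
BEGIN
    DECLARE @table_name nvarchar(150), @sql nvarchar(250)
    DECLARE @number_records bigint, @number_tables int, @iterator int

    -- temporary table scoped to the procedure's execution
    CREATE TABLE #tables (
      ranking int NOT NULL
    , table_name nvarchar(150) NOT NULL
    , number_records bigint
    )

    INSERT INTO #tables
    SELECT row_number() OVER(ORDER BY object_id) ranking
    , concat(schema_name(schema_id), '.', name) table_name
    , NULL number_records
    FROM sys.tables
    WHERE schema_id = schema_id(@schema_name)

    SET @iterator = 1
    SET @number_tables = IsNull((SELECT count(*) FROM #tables), 0)

    WHILE (@iterator <= @number_tables)
    BEGIN
        SET @table_name = (SELECT table_name FROM #tables WHERE ranking = @iterator)
        SET @sql = CONCAT(N'SELECT @NumberRecords = count(*) FROM ', @table_name)

        BEGIN TRY
            -- get the number of records for the current table
            EXEC sp_executesql @sql
            , N'@NumberRecords bigint OUTPUT'
            , @NumberRecords = @number_records OUTPUT

            UPDATE #tables
            SET number_records = @number_records
            WHERE table_name = @table_name
        END TRY
        BEGIN CATCH
            -- ignore tables that can't be queried
            SET @number_records = NULL
        END CATCH

        SET @iterator = @iterator + 1
    END

    SELECT *
    FROM #tables
END

-- call the procedure
EXEC dbo.spGetRecordCounts 'dbo'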

Happy coding!

Previous Post <<||>> Next Post

References:
[1] Koen Verbeeck (2024) Get row counts of all tables in a Microsoft Fabric warehouse [link]
[2] Haripriya SB (2024) Do NOT drop #temp tables [link]

17 January 2025

💎🏭SQL Reloaded: Microsoft Fabric's SQL Databases (Part VIII: Permissions) [new feature]

Data-based solutions usually target a set of users who (ideally) have restricted permissions to the functionality. Therefore, as part of the process several personas are defined that target different use cases, for which the permissions must be restricted accordingly. 

In the simplest scenario the user must have access to the underlying objects for querying the data. Supposing that an Entra User was already created, the respective user must also be given access in the Fabric database (see [1], [2]). From the database's main menu follow the path to assign read permissions:
Security >> Manage SQL Security >> (select role: db_datareader)

Manage SQL Security

Manage access >> Add >> (search for User)

Manage access

(select user) >> Share database >> (select additional permissions) >> Save

Manage additional permissions

The easiest way to test whether the permissions work before building the functionality is to log in via SQL Server Management Studio (SSMS) and check the access using Microsoft Entra MFA. Ideally, one should have a User's credentials that can be used only for testing purposes. After the above setup was done, the new User was able to access the data. 

A second User can be created for testing with the maximum of permissions allowed on the SQL database side, which is useful for troubleshooting. Alternatively, one can use only one User for testing and assign or remove the permissions as needed by the test scenario. 

It's a good idea to try to understand what's happening in the background. For example, the expectation was that a SQL user would also be created for the Entra User from above, which doesn't seem to be the case, at least per the current functionality available. 

Before diving deeper, it's useful to retrieve the User's details: 

-- retrieve current user
SELECT SUser_Name() sys_user_name
, User_Id() user_id 
, USER_NAME() user_name
, current_user [current_user]
, user [user]; 
Output:
sys_user_name user_id user_name current_user user
JamesClavell@[domain].onmicrosoft.com 0 JamesClavell@[domain].onmicrosoft.com JamesClavell@[domain].onmicrosoft.com JamesClavell@[domain].onmicrosoft.com

Retrieving the current User is useful especially when testing functionality in parallel with different Users. Strangely, the User's ID is 0 when only read permissions were assigned. However, a valid User ID is assigned, for example, when the db_datawriter role is also granted to the User. Removing the db_datawriter role afterwards keeps, as expected, the User's ID. For troubleshooting purposes, at least per the current functionality, it might be a good idea to create the Users with a valid User ID (e.g. by temporarily assigning the db_datawriter role, as sketched below). 
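
A minimal sketch for assigning, respectively removing the role (to be run with the needed administrative permissions; [domain] is a placeholder for the tenant's domain):

-- assign the db_datawriter role to the User
ALTER ROLE db_datawriter ADD MEMBER [JamesClavell@[domain].onmicrosoft.com];

-- remove the db_datawriter role from the User
ALTER ROLE db_datawriter DROP MEMBER [JamesClavell@[domain].onmicrosoft.com];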

The next step is to look at the Users with access to the database:

-- database access 
SELECT USR.uid
, USR.name
--, USR.sid 
, USR.hasdbaccess 
, USR.islogin
, USR.issqluser
--, USR.createdate 
--, USR.updatedate 
FROM sys.sysusers USR
WHERE USR.hasdbaccess = 1
  AND USR.islogin = 1
ORDER BY uid
Output:
uid name hasdbaccess islogin issqluser
1 dbo 1 1 1
6 CharlesDickens@[...].onmicrosoft.com 1 1 0
7 TestUser 1 1 1
9 JamesClavell@[...].onmicrosoft.com 1 1 0

For testing purposes, besides the standard dbo role and the two Entra-based Users, a SQL user was also created, to which access to the SalesLT schema was granted (see the initial post):

-- create the user
CREATE USER TestUser WITHOUT LOGIN;

-- assign access to SalesLT schema 
GRANT SELECT ON SCHEMA::SalesLT TO TestUser;
  
-- test impersonation (run together)
EXECUTE AS USER = 'TestUser';

SELECT * FROM SalesLT.Customer;

REVERT; 

Notes:
1) Strangely, even if access was given explicitly only to the SalesLT schema, the TestUser has access also to sys.sysusers and other DMVs. That's valid also for the access over SSMS.
2) For the above created User there are no records in the sys.user_token and sys.login_token DMVs, in contrast with the user(s) created for administering the SQL database. 

Let's look at the permissions granted explicitly:

-- permissions granted explicitly
SELECT DPR.principal_id
, DPR.name
, DPR.type_desc
, DPR.authentication_type_desc
, DPE.state_desc
, DPE.permission_name
FROM sys.database_principals DPR
     JOIN sys.database_permissions DPE
	   ON DPR.principal_id = DPE.grantee_principal_id
WHERE DPR.principal_id != 0 -- removing the public user
ORDER BY DPR.principal_id
, DPE.permission_name;
Result:
principal_id name type_desc authentication_type_desc state_desc permission_name
1 dbo SQL_USER INSTANCE GRANT CONNECT
6 CharlesDickens@[...].onmicrosoft.com EXTERNAL_USER EXTERNAL GRANT AUTHENTICATE
6 CharlesDickens@[...].onmicrosoft.com EXTERNAL_USER EXTERNAL GRANT CONNECT
7 TestUser SQL_USER NONE GRANT CONNECT
7 TestUser SQL_USER NONE GRANT SELECT
9 JamesClavell@[...].onmicrosoft.com EXTERNAL_USER EXTERNAL GRANT CONNECT

During troubleshooting it might be useful to check current user's permissions at the various levels via sys.fn_my_permissions:

-- retrieve database-scoped permissions for current user
SELECT *
FROM sys.fn_my_permissions(NULL, 'Database');

-- retrieve schema-scoped permissions for current user
SELECT *
FROM sys.fn_my_permissions('SalesLT', 'Schema');

-- retrieve object-scoped permissions for current user
SELECT *
FROM sys.fn_my_permissions('SalesLT.Customer', 'Object')
WHERE permission_name = 'SELECT';

Notes:
1) See also [1] and [4] in what concerns the limitations that apply to managing permissions in SQL databases.

Happy coding!

Previous Post <<||>> Next Post

References:
[1] Microsoft Learn (2024) Microsoft Fabric: Share your SQL database and manage permissions [link]
[2] Microsoft Learn (2024) Microsoft Fabric: Share data and manage access to your SQL database in Microsoft Fabric  [link]
[3] Microsoft Learn (2024) Authorization in SQL database in Microsoft Fabric [link]
[4] Microsoft Learn (2024) Authentication in SQL database in Microsoft Fabric [link]
[5] Microsoft Learn (2025) Manage access for SQL databases in Microsoft Fabric with workspace roles and item permissions [link]

