Showing posts with label views. Show all posts

22 March 2025

💠🛠️🗒️SQL Server: Indexed Views [Notes]

Disclaimer: This is work in progress based on notes gathered over the years, intended to consolidate information from the various sources. The content needs yet to be reviewed against the current documentation. 

Last updated: 22-Mar-2024

[SQL Server 2005] Indexed View

  • {def} a materialized view
    • materializes the data from the view queries, storing it in the database in a way similar to tables [6]
      • ⇒ its definition is computed and the resulting data stored just like a table [3]
      • the view is indexed by creating a unique clustered index on it
        • the resulting structure is physically identical to a table with a clustered index
          • ⇐ nonclustered indexes are also supported on this structure
      • can be created on a partitioned table, respectively can be partitioned [1]
    • {benefit} can improve the performance of some types of queries [3]
      • e.g. queries that aggregate many rows
      • ⇐ because the view is stored the same way a table with a clustered index is stored [1]
      • ⇐ not well-suited for underlying data that are frequently updated [3]
      •  more expensive to use and maintain than filtered indexes [5]
    • [query optimizer] 
      • can use it to speed up the query execution [1]
        • the view doesn't have to be referenced in the query for the optimizer to consider it for a substitution [1]
        • {downside} DML query performance can degrade significantly [1]
          • ⇐ in some cases, a query plan can't even be produced [1]
          • when executing UPDATE, DELETE or INSERT on the base table referenced, the indexed views must be updated as well [1]
          • {recommendation} test DML queries before production use [1]
            • analyze the query plan and tune/simplify the DML statement [1]
      • can use the structure to return results more efficiently to the user
        • contains logic to use this index in either of the cases 
          • the original query text referenced the view explicitly [2]
          • the user submits a query that uses the same components as the view (in any equivalent order) [2]
          • ⇐ the query processor expands indexed views early in the query pipeline and always uses the same matching code for both cases [2]
            • the WITH(NOEXPAND) hint tells the query processor not to expand the view definition [2]
            • also instructs the query processor to perform an index scan of the indexed view rather than expand it into its component parts [5]
            • when the indexed view is checked for consistency (e.g. by DBCC CHECKDB), any extra rows in it are reported as 8907 errors [5]
            • any missing rows are reported as 8908 errors [5]
      • expose some of the benefits of view materialization while retaining the benefits of global reasoning about query operations [2]
      • expanded (aka in-lined) before optimization begins
        • gives the Query Optimizer opportunities to optimize queries globally [2]
        • makes it difficult for the (query) optimizer to consider plans that perform the view evaluation first, then process the rest of the query [2]
          • arbitrary tree matching is a computationally complex problem, and the feature set of views is too large to perform this operation efficiently [2]
      • cases in which it does not match the view
        • indexed views are inserted into the Memo and evaluated against other plan choices
          • while they are often the best plan choice, this is not always the case [2]
          • the Query Optimizer can detect logical contradictions between the view definition and the query that references the view [2]
        • there are also some cases where the Query Optimizer does not recognize an indexed view even when it would be a good plan choice [2]
          • often, these cases deal with complex interactions between high-level features within the query processor (e.g. computed column matching, the algorithm to explore join orders) [2]
          • consider the WITH (NOEXPAND) hint to force the query processor to pick that indexed view [2]
            •  this usually is enough to get the plan to include the indexed view [2]
        • indexed view alternatives 
          • are generated and stored in the Memo 
          • are compared using costing equations against other possible plans
          • partial matches cost the residual operations as well
            • an indexed-view plan can be generated but not picked when the Query Optimizer considers other plans to have lower costs [2]
        • maintained as part of the update processing for tables on which the view is based
          • this makes sure that the view provides a consistent result if it is selected by the Query Optimizer for any query plan [2]
          • some query operations are incompatible with this design guarantee
            • restrictions are placed on the set of supported constructs in indexed views to make sure that the view can be created, matched, and updated efficiently [2]
        • {operation} updating indexed views
          • the core question behind the restrictions is “Can the query processor compute the necessary changes to the Indexed View clustered and nonclustered indexes without having to recompute the whole indexed view?” [2]
            • if so, the query processor can perform these changes efficiently as part of the maintenance of the base tables that are referenced in the view [2]
              • this property is relatively easy for filters, projections (compute scalar), and inner joins on keys [2]
              • operators that destroy or create data are more difficult to maintain, so often these are restricted from use in indexed views [2]
        • matching indexed views is supported in cases beyond exact matches of the query text to the view definition [2]
          • it also supports using an indexed view for inexact matches where the definition of the view is broader than the query submitted by the user [2]
            • then applies residual filters, projections (columns in the select list), and even aggregates to use the view as a partial precomputation of the query result [2]
    • {concept} statistics on indexed views
      • normally statistics aren't needed
        • because the substitution of the indexed views into the query plan is considered only after all the statistics for the underlying tables and indexes are attached to the query plan [3]
        • used if the view is directly referenced by the NOEXPAND hint in a FROM clause 
          • an error is generated and the plan is not created if the NOEXPAND hint is used on a view that does not also contain an index [3]
      • can’t be created by using sp_createstats or updated by using sp_updatestats
      • auto update and auto create statistics features work for indexed views
        • can also be created and updated manually
          • via CREATE STATISTICS on the indexed view columns
          • via UPDATE STATISTICS to update column or index statistics on indexed views
    • {operation} creating a view
      • requires that the underlying object’s schema can’t change
      • requires WITH SCHEMABINDING option [5]
      • ⇒ must include the two-part names of all referenced tables [5]
      • ⇐ the tables can't be dropped and the columns can't be altered while they participate in a view, unless the view is dropped [5]
      • ⇐ otherwise an error is raised [5]
      • the user must hold 
        • the CREATE VIEW permission in the database [1]
        • ALTER permission on the schema in which the view is being created [1]
        • if the base table resides within a different schema, the REFERENCES permission on the table is required as a minimum [1]
        • if the user creating the index differs from the users who created the view, for the index creation alone the ALTER permission on the view is required [1]
    • {operation} creating an index on the view
      • indexes can only be created on views that have the same owner as the referenced table or tables (aka intact ownership chain between the view and the tables) [1]
    • {operation} dropping a view
      • causes all indexes on the view to be dropped [1]
        • ⇐ all nonclustered indexes and auto-created statistics on the view are dropped when the clustered index is dropped [1]
          • {exception} user-created statistics on the view are maintained [1]
      • nonclustered indexes can be individually dropped [1]
      • dropping the clustered index on the view 
        • removes the stored result set [1]
        • the optimizer returns to processing the view like a standard view [1]
    • {operation} disable indexes on tables and views
      • when a clustered index on a table is disabled, indexes on views associated with the table are also disabled [1]
    • {option} EXPAND VIEWS
      • prevents the Database Engine from using indexed views [1]
        • if any of the required SET options are incorrectly set, this also prevents the optimizer from using the indexes on the views [1]
        • via OPTION (EXPAND VIEWS) hint
    • {recommendation} when using datetime and smalldatetime string literals in indexed views, explicitly convert the literal to the date type by using a deterministic date format style [1]
    • {limitation} AVG is not allowed {workaround} use SUM and COUNT_BIG [5]
    • {limitation} impacted by SET options [1]
      • {restriction} require fixed values for several SET options [1]
      • {recommendation} set the ARITHABORT user option to ON server-wide as soon as the first indexed view or index on a computed column is created in any database on the server [1]
    • {limitation} further requirements apply (see [1])
    • {limitation} aren't supported on top of temporal queries
      • ⇐ queries that use the FOR SYSTEM_TIME clause
    • {scenario} simplifying SQL queries
    • {scenario} abstracting data models from user models
    • {scenario} enforcing user security
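
The notes above on creating and using an indexed view can be illustrated with a minimal sketch. The table dbo.SalesOrders, the view dbo.vSalesByCustomer and the index name are hypothetical and don't come from the cited sources; the script also assumes that the session uses the SET options required for indexed views (see [1]).

```sql
-- Hypothetical base table, used only for illustration
CREATE TABLE dbo.SalesOrders (
    OrderID INT NOT NULL PRIMARY KEY,
    CustomerID INT NOT NULL,
    Amount DECIMAL(18, 2) NOT NULL
);
GO

-- Schema-bound view with two-part names; AVG isn't allowed,
-- therefore SUM and COUNT_BIG are stored instead
CREATE VIEW dbo.vSalesByCustomer
WITH SCHEMABINDING
AS
SELECT CustomerID,
    SUM(Amount) AS TotalAmount,
    COUNT_BIG(*) AS OrderCount
FROM dbo.SalesOrders
GROUP BY CustomerID;
GO

-- The unique clustered index materializes the view
CREATE UNIQUE CLUSTERED INDEX IX_vSalesByCustomer
    ON dbo.vSalesByCustomer (CustomerID);
GO

-- Reference the view explicitly, without expanding its definition
SELECT CustomerID,
    TotalAmount / OrderCount AS AvgAmount
FROM dbo.vSalesByCustomer WITH (NOEXPAND);

-- Query the base table and prevent the use of indexed views
SELECT CustomerID,
    SUM(Amount) AS TotalAmount
FROM dbo.SalesOrders
GROUP BY CustomerID
OPTION (EXPAND VIEWS);
```

The average is derived at query time from the stored sum and count, which is the workaround mentioned for the AVG limitation.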


References:
[1] Microsoft Learn (2024) SQL Server: Create indexed views [link]
[2] Kalen Delaney et al (2009) Microsoft® SQL Server® 2008 Internals
[3] Microsoft Learn (2024) SQL Server: Views [link]
[4] Microsoft Learn (2024) SQL Server: CREATE INDEX (Transact-SQL) [link]
[5] Kalen Delaney et al (2012) Microsoft® SQL Server® 2012 Internals
[6] Dmitri Korotkevitch (2016) Pro SQL Server Internals 2nd Ed.

Resources:
[R1] Microsoft Learn (2024) SQL Server: Optimize index maintenance to improve query performance and reduce resource consumption [link]

Acronyms:
DML - Data Manipulation Language
QO - Query Optimizer

19 March 2025

💠🛠️🗒️SQL Server: Views [Notes]

Disclaimer: This is work in progress based on notes gathered over the years, intended to consolidate information from the various sources. The content needs yet to be reviewed against the current documentation.

Last updated: 19-Mar-2024

[SQL Server 2005] View (aka virtual table)

  • {def} a database object that encapsulates a SQL statement and that can be used as a virtual table in further SQL statements
    • cannot be executed by itself
      •  it must be used within a query [15]
    • doesn't store any data 
      • except indexed views
      • data is dynamically produced from the underlying table when the view is used [32]
        • views depend on the underlying tables and act like a filter on the underlying tables [32]
    • used just like regular tables without incurring additional cost
      • unless the view is indexed [25]
    • turning a query into a view
      • remove the ORDER BY clause
      • assure there are no name duplicates
      • assure that each column has a name
    • projected columns
      • columns included in the view 
    • view’s column list 
      • renames every output column just as if every column had those alias names in the SELECT statement
      • a view is more self-documenting if the column names of the view are specified in the SELECT statement and not listed separately in the view [27]
    • {restriction} sorting is not allowed in a view
      •  unless the view includes a TOP predicate 
        • ORDER BY clause serves only to define which rows qualify for the TOP predicate [15]
          • the only way to logically guarantee sorted results is to define the ORDER BY clause in the executing query [15]
        • [SQL Server 2005] had a bug in the Query Optimizer that would enable an ORDER BY in a view using a TOP 100 PERCENT predicate [15]
          • the behavior was never documented or officially supported [15]
      •  or an OFFSET-FETCH clause
    • {restriction} parameters can’t be passed to a view  [100]
      • {alternative} use an inline table-valued function 
    • {restriction} cannot reference a variable inside the SELECT statement [100]
    • {restriction} cannot create a table, whether permanent or temporary
      • ⇒ cannot use the SELECT/INTO syntax in a view
    • {restriction} can reference only permanent tables
      • ⇒  cannot reference a temporary table [100]
    • {benefit} present the correct fields to the user
    • {benefit} enforce security 
        • by specifying 
          • only needed columns
            • projects a predefined set of columns [15]
            • hides sensitive, irrelevant, or confusing columns [15]
            • should be used in parallel with SQL Server–enforced security [15]
          • only needed records
        • by allowing users access to the view without the need to give access to the used tables
          • grant users read permission from only the views, and restrict access to the physical tables [15]
    • {benefit} maintainability
    • {benefit} provides a level of abstraction
      • hides the complexity of the underlying data structures 
      • encapsulates (business) logic
      • denormalize or flatten complex joins 
      • can consolidate data across databases/servers
      • can be used as single version of truth
    • {benefit} allow changing data in the base tables
    • {downside} layers of nested views add needless overhead to understanding the views
    • {downside} single-purpose views quickly become obsolete and clutter the database [15]
    • {downside} complex views are perceived as having poor performance [15]
    • {best practice} use generic/standard naming conventions
    • {best practice} use aliases for cryptic/recurring column names
    • {best practice} consider only the requested columns
    • {best practice} group specific-purpose views under their own schema
    • {best practice} avoid hardcoding values 
    • {best practice} use views for column-level security together with SQL Server–enforced security
    • {best practice} limit views to ad-hoc queries and reports
      • for extensibility and control [15]
      •  performance isn’t the reason [15]
    • {poor practices} create views for single-purpose queries (aka one time requests)
    • {operation} create a view
    • {operation} drop a view
    • {operation} alter a view
    • {operation} select data
    • {operation} update data
      • unless the view is a simple single table view, it’s difficult to update the underlying data through the view [15]
    • {type} inline views
      • exist only during the execution of a query [32]
      • simplify the development of a one-time query [32]
        • allows creating queries in steps
          • enables troubleshooting 
      • can replace inline UDFs
      • alternatives
        • inline UDFs
        • temporary tables
    • {type} indexed views
      • materialize the data, storing the results of the view in a clustered index on disk [15]
      • similar to a covering index 
        • but with greater control 
          • can include data from multiple data sources [15]
          • no need to include the clustered index keys [15]
        • designing an indexed view is more like designing an indexing structure than creating a view [15]
      • can cause deadlocks when two or more of the participating tables are modified (updated/inserted/deleted) from two or more sessions in parallel, such that the sessions block each other [29]
    • {type} compatibility views
      • allow accessing a subset of the SQL Server 2000 system tables
        • don’t contain any metadata related to features added after SQL Server 2000
      • views have the same names as many of the system tables in the previous version, as well as the same column names
        • ⇒ any code that uses the SQL Server 2000 system tables won’t break [16]
        • there’s no guarantee that the results returned are exactly the same as the ones from the corresponding tables in SQL Server 2000 [16]
      • accessible from any database
      • hidden in the resource database
        • e.g. sysobjects, sysindexes, sysusers, sysdatabases
    • {type} [SQL Server 2005] catalog views
      • general interface to the persisted system metadata
      • built on an inheritance model
        • ⇒  no need to redefine internally sets of attributes common to many objects
      • available over sys schema
        • must be included in object’s reference
      • some of the names are easy to remember because they are similar to the SQL Server 2000 system table names [16]
      • the columns displayed are very different from the columns in the compatibility views
      • some metadata appears only in the master database 
        • keeps track of system-wide data (e.g. databases and logins)
        • other metadata is available in every database (e.g. objects and permissions)
        • metadata appearing only in the msdb database isn’t available through catalog views but is still available in system tables, in the schema dbo (e.g. backup and restore, replication, Database Maintenance Plans, Integration Services, log shipping, and SQL Server Agent)
    • {type} partitioned views 
      • allow the data in a large table to be split into smaller member tables
        • the data is partitioned between the member tables based on ranges of data values in one of the columns [4]
        • the data ranges for each member table are defined in a CHECK constraint specified on the partitioning column [4]
        • a view that uses UNION ALL to combine selects of all the member tables into a single result set is then defined [4]
        • when SELECT statements referencing the view specify a search condition on the partition column, the query optimizer uses the CHECK constraint definitions to determine which member table contains the rows [4]
    • {type} distributed partition views (DPV)
      • local partitioned views
        • a single table is horizontally split into multiple tables, usually all have the same structure [30]
      • cross database partitioned views 
        • tables are split among different databases on the same server instance
      • distributed (across server or instance) partitioned views
        • tables participating in the view reside in different databases which reside on different servers or different instances
    • {type} nested views
      • views referred by other views [15]
      • can lead to an abstraction layer with nested views several layers deep 
        • too difficult to diagnose and maintain [15]
    • {type} updatable view
      • view that allows updating the underlying tables
        • only one table may be updated
        • if the view includes joins, then the UPDATE statement that references the view must change columns in only one table [15]
      • typically not a recommended solution for application design
      • WITH CHECK OPTION causes the WHERE clause of the view to check the data being inserted or updated through the view in addition to the data being retrieved [15]
        • it makes the WHERE clause a two-way restriction [15]
          • ⇒  can protect the data from undesired inserts and updates [15]
        • ⇒  useful when the view should limit inserts and updates with the same restrictions applied to the WHERE clause [15]
        • when CHECK OPTION isn’t used, records inserted through the view that don’t match the WHERE constraints will disappear (aka disappearing rows) [15]
    • {type} non-updatable views
      • views that don’t allow updating the underlying tables
      • {workaround} build an INSTEAD OF trigger that inspects the modified data and then performs a legal UPDATE operation based on that data [15]
    • {type} horizontally positioned views
      • used to enforce row-level security with the help of WITH CHECK OPTION
        • {downside} has a high maintenance cost [15] 
        •  {alternative} row-level security can be designed using user-access tables and stored procedures [15]
    • {type} schema-bound views
      • the SELECT statement must include the schema name for any referenced objects [15]
        • SELECT * (all columns) is not permitted [15]
    • {type} subscription views 
      • a view used to export Master Data Services data to subscribing systems
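
The behavior of WITH CHECK OPTION described under updatable views can be sketched as follows. The table dbo.Orders and the view dbo.vOrdersEMEA are hypothetical names chosen for illustration and don't come from the cited sources.

```sql
-- Hypothetical table, used only for illustration
CREATE TABLE dbo.Orders (
    OrderID INT NOT NULL PRIMARY KEY,
    Region NVARCHAR(20) NOT NULL,
    Amount DECIMAL(18, 2) NOT NULL
);
GO

-- The WHERE clause becomes a two-way restriction
CREATE VIEW dbo.vOrdersEMEA
AS
SELECT OrderID, Region, Amount
FROM dbo.Orders
WHERE Region = N'EMEA'
WITH CHECK OPTION;
GO

-- Succeeds: the row satisfies the view's WHERE clause
INSERT INTO dbo.vOrdersEMEA (OrderID, Region, Amount)
VALUES (1, N'EMEA', 100.00);

-- Fails: the row wouldn't be visible through the view
INSERT INTO dbo.vOrdersEMEA (OrderID, Region, Amount)
VALUES (2, N'APAC', 200.00);
```

Without WITH CHECK OPTION the second insert would succeed and the row would "disappear" from the view, being visible only in the base table.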

References:
[4] Microsoft (2013) SQL Server 2000 Documentation
[15] Adam Jorgensen et al (2012) Microsoft® SQL Server® 2012 Bible
[16] Bob Beauchemin et al (2012) Microsoft SQL Server 2012 Internals
[25] Basit A Masood-Al-Farooq et al (2014) SQL Server 2014 Development Essentials: Design, implement, and deliver a successful database solution with Microsoft SQL Server 2014
[30] Kevin Cox (2007) Distributed Partitioned Views / Federated Databases: Lessons Learned
[32] Sikha S Bagui & Richard W Earp (2006) Learning SQL on SQL Server 2005
[100] Itzik Ben-Gan et al (2012) Exam 70-461: Querying Microsoft SQL Server 2012

Acronyms:
DPV - Distributed Partition Views
UDF - User-Defined Function

26 February 2025

💎🤖SQL Reloaded: Copilot Stories (Part II: Create a Schema for Customer Addresses)

Since the first post on a related topic, Copilot became available in Microsoft 365, and this opens the door for prompt engineering experiments. A prompt is a "natural language instruction that tells a large language model (LLM) to perform a task" [1], a process also known as instruction tuning. Prompt engineering is thus "the process of creating and refining the prompt used by the model" [1].

For example, one can use Prompt Engineering to learn how to perform certain tasks in a programming language like SQL: learning how to define a table, write a query and database objects, etc. In a first experiment I created a set of prompts for creating a set of objects to store and perform CRUD (Create, Read, Update, Delete) operations. The experiment focused on Customer Addresses, Countries, Regions. Let's look at the prompts and their outputs. The following prompts depict the attempts in the order in which they were considered, so feel free to experiment!

Before Copilot, I used macros in Excel or stored procedures to generate similar code, which allowed me to considerably reduce the development time.

Prompt:
Give me the definition of a SQL Server stored procedure for updating the Customers from the dbo schema based on input parameters. 

Output:

CREATE PROCEDURE UpdateCustomer
    @CustomerID INT,
    @CustomerName NVARCHAR(100),
    @ContactName NVARCHAR(100),
    @Country NVARCHAR(50)
AS
BEGIN
    UPDATE dbo.Customers
    SET 
        CustomerName = @CustomerName,
        ContactName = @ContactName,
        Country = @Country
    WHERE 
        CustomerID = @CustomerID;
END;

Prompt:
Give me the definition of a SQL Server stored procedure for inserting Customers in the dbo schema based on input parameters.

Output:

CREATE PROCEDURE InsertCustomer
    @CustomerName NVARCHAR(100),
    @ContactName NVARCHAR(100),
    @Country NVARCHAR(50)
AS
BEGIN
    INSERT INTO dbo.Customers (CustomerName, ContactName, Country)
    VALUES (@CustomerName, @ContactName, @Country);
END;

Prompt:
Give me the definition of a SQL Server stored procedure for deleting a Customers from the dbo schema based on Customer Id.

Output:

CREATE PROCEDURE DeleteCustomer
    @CustomerID INT
AS
BEGIN
    DELETE FROM dbo.Customers
    WHERE CustomerID = @CustomerID;
END;

Prompt:
Give me the definition of the SQL Server table used for the above examples.

Output:

CREATE TABLE dbo.Customers (
    CustomerID INT IDENTITY(1,1) PRIMARY KEY,
    CustomerName NVARCHAR(100) NOT NULL,
    ContactName NVARCHAR(100),
    Country NVARCHAR(50)
);

Prompt:
Extend the definition of the previous table with Address information. 

Output:

CREATE TABLE dbo.Customers (
    CustomerID INT IDENTITY(1,1) PRIMARY KEY,
    CustomerName NVARCHAR(100) NOT NULL,
    ContactName NVARCHAR(100),
    Country NVARCHAR(50),
    AddressLine1 NVARCHAR(200),
    AddressLine2 NVARCHAR(200),
    City NVARCHAR(100),
    State NVARCHAR(100),
    PostalCode NVARCHAR(20)
);

Prompt:
Remove the Address information from the above table into a separate Addresses table with the Customer Id as foreign key.

Output:

-- Modified Customers Table:
CREATE TABLE dbo.Customers (
    CustomerID INT IDENTITY(1,1) PRIMARY KEY,
    CustomerName NVARCHAR(100) NOT NULL,
    ContactName NVARCHAR(100),
    Country NVARCHAR(50)
);

-- New Addresses Table:
CREATE TABLE dbo.Addresses (
    AddressID INT IDENTITY(1,1) PRIMARY KEY,
    CustomerID INT,
    AddressLine1 NVARCHAR(200),
    AddressLine2 NVARCHAR(200),
    City NVARCHAR(100),
    State NVARCHAR(100),
    PostalCode NVARCHAR(20),
    FOREIGN KEY (CustomerID) REFERENCES dbo.Customers(CustomerID)
);

Prompt:
Create a Countries table that can be referenced from the Addresses table with Country ISO codes.

Output:

--Create the Countries Table:
CREATE TABLE dbo.Countries (
    CountryCode CHAR(2) PRIMARY KEY, -- ISO 3166-1 alpha-2 code
    CountryName NVARCHAR(100) NOT NULL
);

--Modify the Addresses Table:
CREATE TABLE dbo.Addresses (
    AddressID INT IDENTITY(1,1) PRIMARY KEY,
    CustomerID INT,
    AddressLine1 NVARCHAR(200),
    AddressLine2 NVARCHAR(200),
    City NVARCHAR(100),
    State NVARCHAR(100),
    PostalCode NVARCHAR(20),
    CountryCode CHAR(2),
    FOREIGN KEY (CustomerID) REFERENCES dbo.Customers(CustomerID),
    FOREIGN KEY (CountryCode) REFERENCES dbo.Countries(CountryCode)
);

Prompt:
Extend the Countries table with ISO 3 characters code and ISO Number.

Output:

CREATE TABLE dbo.Countries (
    CountryCode CHAR(2) PRIMARY KEY, -- ISO 3166-1 alpha-2 code
    CountryName NVARCHAR(100) NOT NULL,
    ISO3Code CHAR(3), -- ISO 3166-1 alpha-3 code
    ISONumericCode INT -- ISO 3166-1 numeric code
);

Prompt:
Create a Regions table with Region code and description that allows to group the countries in territories. A country can be assigned to multiple regions.

Output:

-- Create the Regions Table:
CREATE TABLE dbo.Regions (
    RegionCode CHAR(3) PRIMARY KEY,
    RegionDescription NVARCHAR(100) NOT NULL
);

-- Create the Countries Table (including the previous columns):
CREATE TABLE dbo.Countries (
    CountryCode CHAR(2) PRIMARY KEY, -- ISO 3166-1 alpha-2 code
    CountryName NVARCHAR(100) NOT NULL,
    ISO3Code CHAR(3), -- ISO 3166-1 alpha-3 code
    ISONumericCode INT -- ISO 3166-1 numeric code
);

-- Create the CountryRegions Junction Table:
CREATE TABLE dbo.CountryRegions (
    CountryCode CHAR(2),
    RegionCode CHAR(3),
    PRIMARY KEY (CountryCode, RegionCode),
    FOREIGN KEY (CountryCode) REFERENCES dbo.Countries(CountryCode),
    FOREIGN KEY (RegionCode) REFERENCES dbo.Regions(RegionCode)
);

Prompt:
Create a view based on Customers that includes all the above tables. 

Output:

CREATE VIEW CustomerDetails AS
SELECT 
    c.CustomerID,
    c.CustomerName,
    c.ContactName,
    a.AddressLine1,
    a.AddressLine2,
    a.City,
    a.State,
    a.PostalCode,
    co.CountryCode,
    co.CountryName,
    co.ISO3Code,
    co.ISONumericCode,
    r.RegionCode,
    r.RegionDescription
FROM 
    dbo.Customers c
LEFT JOIN 
    dbo.Addresses a ON c.CustomerID = a.CustomerID
LEFT JOIN 
    dbo.Countries co ON a.CountryCode = co.CountryCode
LEFT JOIN 
    dbo.CountryRegions cr ON co.CountryCode = cr.CountryCode
LEFT JOIN 
    dbo.Regions r ON cr.RegionCode = r.RegionCode;

Prompt:
In the above view use 3 letter encoding for table's name. 

Output:

CREATE VIEW CustomerDetails AS
SELECT 
    cus.CustomerID,
    cus.CustomerName,
    cus.ContactName,
    add.AddressLine1,
    add.AddressLine2,
    add.City,
    add.State,
    add.PostalCode,
    cou.CountryCode,
    cou.CountryName,
    cou.ISO3Code,
    cou.ISONumericCode,
    reg.RegionCode,
    reg.RegionDescription
FROM 
    dbo.Customers cus
LEFT JOIN 
    dbo.Addresses add ON cus.CustomerID = add.CustomerID
LEFT JOIN 
    dbo.Countries cou ON add.CountryCode = cou.CountryCode
LEFT JOIN 
    dbo.CountryRegions cr ON cou.CountryCode = cr.CountryCode
LEFT JOIN 
    dbo.Regions reg ON cr.RegionCode = reg.RegionCode;

Unfortunately, the definition uses the reserved word "add" as the alias for the Addresses table. Hence the following prompt:

Prompt:
Modify the above view to use 3-letter aliases that aren't as reserved words. 

Output:

CREATE VIEW dbo.CustomerDetails AS
SELECT 
    cus.CustomerID, -- Customer ID
    cus.CustomerName, -- Customer name
    cus.ContactName, -- Contact name
    adr.AddressLine1, -- First line of the address
    adr.AddressLine2, -- Second line of the address
    adr.City, -- City of the address
    adr.State, -- State or region of the address
    adr.PostalCode, -- Postal or ZIP code of the address
    cou.CountryCode, -- Country code
    cou.CountryName, -- Country name
    cou.ISO3Code, -- ISO 3166-1 alpha-3 code
    cou.ISONumericCode, -- ISO 3166-1 numeric code
    reg.RegionCode, -- Region code
    reg.RegionDescription -- Region description
FROM 
    dbo.Customers cus
LEFT JOIN 
    dbo.Addresses adr ON cus.CustomerID = adr.CustomerID
LEFT JOIN 
    dbo.Countries cou ON adr.CountryCode = cou.CountryCode
LEFT JOIN 
    dbo.CountryRegions cr ON cou.CountryCode = cr.CountryCode
LEFT JOIN 
    dbo.Regions reg ON cr.RegionCode = reg.RegionCode;
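
As a side note, ADD is a reserved keyword in T-SQL, so it can be used as an alias only when it's delimited. The following is a minimal sketch against the Addresses table defined earlier, not something Copilot generated:

```sql
-- A reserved word can serve as an alias only when delimited with brackets
SELECT [add].AddressLine1,
    [add].City
FROM dbo.Addresses AS [add];
```

Renaming the alias, as in the output above, is arguably the more readable choice.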

Prompt:
Modify the schema of the above objects to include the schema name.

Prompt:
Change the definition of the above tables and test whether the tables already exists.

Prompt:
Annotate the definition of the above objects with a comment.

Prompt:
Generate the scripts for inserting 10 customers with different addresses in the above tables.

Prompt:
Aggregate the update and insert commands into the same stored procedure.

Output:

CREATE PROCEDURE dbo.UpsertCustomer
    @CustomerID INT = NULL, -- Optional parameter for insert
    @CustomerName NVARCHAR(100),
    @ContactName NVARCHAR(100),
    @Country NVARCHAR(50)
AS
BEGIN
    IF EXISTS (SELECT 1 FROM dbo.Customers WHERE CustomerID = @CustomerID)
    BEGIN
        -- Update existing customer
        UPDATE dbo.Customers
        SET 
            CustomerName = @CustomerName,
            ContactName = @ContactName,
            Country = @Country
        WHERE 
            CustomerID = @CustomerID;
    END
    ELSE
    BEGIN
        -- Insert new customer
        INSERT INTO dbo.Customers (CustomerName, ContactName, Country)
        VALUES (@CustomerName, @ContactName, @Country);
    END
END;

I used to create similar code in the past to consolidate the logic for inserts and updates within the same stored procedure. This reduces the number of objects created in a database and minimizes the impact of some other issues (e.g. code duplication for validation).
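
A short usage sketch for the procedure above, based on the Customers table defined earlier; the values are made up for illustration:

```sql
-- Insert: no @CustomerID is supplied, so the EXISTS check finds no row
EXEC dbo.UpsertCustomer
    @CustomerName = N'Contoso Ltd',
    @ContactName = N'Jane Doe',
    @Country = N'Germany';

-- Update: assumes that a customer with CustomerID = 1 exists
EXEC dbo.UpsertCustomer
    @CustomerID = 1,
    @CustomerName = N'Contoso Ltd',
    @ContactName = N'John Doe',
    @Country = N'Germany';
```

Note that when a @CustomerID is supplied but no matching record exists, the procedure as generated inserts a new record with a newly generated identity value instead of raising an error.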

I also prefer to put the comma at the beginning of each line where it applies (and not at the end). However, this can be addressed with the following prompt:

Prompt: 
Rewrite the above database objects by butting the comma in front of each line where it applies.

Prompt:
Modify the view so each join condition is written on its own line.

Prompt:
Modify the view with create or alter command.

Prompt:
Modify the view to check for null values.

Comments:
1) I could use the scripts also in a SQL database, though one must pay attention to the order in which the steps must be run.
2) All these are basic steps; so, a natural question: how far can we go? One can generate further objects, and knowledge about the data model seems to be available in Copilot, though normal data models are much more complex than this! Probably, there's a limit beyond which Copilot will start to be inefficient as long as the data model is not available explicitly. SQL Server Copilot would probably help to overcome such limitations, though the feature is still in limited preview.
3) I wonder whether given a set of best practices, Copilot will be able to use them and create the CRUD objects accordingly. More experiments are needed though.
4) It will be interesting to observe to what extent the prompts written by other people lead to similar or different outcomes. (In the past, I observed that nontechnical prompts generated different outcomes.)
5) The fact that I was able to change some formatting (e.g. comma before each line) for all objects with just a prompt frankly made my day! I can't stress enough how many hours of work the unformatted code required in the past! It will be interesting to check if this can be applied also to a complex database schema.

Happy coding!


References:
[1] Microsoft 365 (2024) Overview of prompts [link]

Resources:
[R1] Microsoft 365 (2025) Microsoft 365 Copilot release notes [link]
[R2] Microsoft 365 (2025) What’s new in Microsoft 365 Copilot | Jan 2025 [link]
[R3] Microsoft 365 (2025) Copilot is now included in Microsoft 365 Personal and Family [link]

Acronyms:
LLM - large language model
CRUD - Create, Read, Update, Delete

25 February 2025

🏭💠🗒️Microsoft Fabric: T-SQL Notebook [Notes]

Disclaimer: This is work in progress intended to consolidate information from various sources for learning purposes. For the latest information please consult the documentation (see the links below)! 

Last updated: 25-Feb-2025

[Microsoft Fabric] T-SQL notebook

  • {def} notebook that allows writing and running T-SQL code [1]
  • {feature} allows managing complex queries and writing better markdown documentation [1]
  • {feature} allows the direct execution of T-SQL on
    • connected warehouse
    • SQL analytics endpoint
    • ⇐ queries can be run directly on the connected endpoint [1]
      • multiple connections are allowed [1]
  • allows running cross-database queries to gather data from multiple warehouses and SQL analytics endpoints [1]
  • the code is run by the primary warehouse
    • used as the default in commands that support three-part naming when no warehouse is provided [1]
    • three-part naming consists of 
      • database name
        • the name of the warehouse or SQL analytics endpoint [1]
      • schema name
      • table name
  • {feature} autogenerate T-SQL code using the code template from the object explorer's context menu [1]
  • {concept} code cells
    • allow creating and running T-SQL code
      • each code cell is executed in a separate session [1]
        • {limitation} the variables defined in one cell are not available in another cell [1]
        • one can check the execution summary after the code is executed [1]
      • cells can be run individually or together [1]
      • one cell can contain multiple lines of code [1]
        • users can select and run subparts of a cell’s code [1]
    • {feature} Table tab
      • lists the records from the returned result set
        • if the execution returns multiple result sets, you can switch from one to another via the dropdown menu [1]
  • a query can be saved as 
    • view
      • via 'Save as' view
      • {limitation} does not support three-part naming [1]
        • the view is always created in the primary warehouse [1]
          • ⇒ to create the view in a different warehouse, set that warehouse as the primary warehouse [1]
    • table
      • via 'Save as' table
      • saved as CTAS 
    • ⇐ 'Save as' is only available for the selected query text
      • the query text must be selected before using the Save as options
  • {limitation} doesn’t support 
    • parameter cell
      • the parameter passed from pipeline or scheduler can't be used [1]
    • {feature} Recent Run 
      • {workaround} use the current data warehouse monitoring feature to check the execution history of the T-SQL notebook [1]
    • {feature} the monitor URL inside the pipeline execution
    • {feature} snapshot 
    • {feature} Git support 
    • {feature} deployment pipeline support 
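
The notes on the primary warehouse, three-part naming and cell sessions can be sketched as follows. This is only an illustration: SalesWarehouse, MarketingWarehouse and their tables and columns are hypothetical names, and the queries run only if corresponding objects exist and are connected to the notebook.

```sql
-- NOTE: SalesWarehouse, MarketingWarehouse and their tables are hypothetical names

-- two-part name: resolved against the primary warehouse
SELECT TOP 10 CST.CustomerId
, CST.CompanyName
FROM dbo.Customers CST;

-- three-part names (database.schema.table): cross-database query
SELECT CST.CustomerId
, CST.CompanyName
, CMP.CampaignName
FROM SalesWarehouse.dbo.Customers CST
     JOIN MarketingWarehouse.dbo.Campaigns CMP
       ON CST.CustomerId = CMP.CustomerId;

-- each cell runs in its own session: declare and use a variable in the same cell
DECLARE @CompanyName varchar(128) = 'A Bike Store';

SELECT CST.CustomerId
, CST.CompanyName
FROM SalesWarehouse.dbo.Customers CST
WHERE CST.CompanyName = @CompanyName;
```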

References:
[1] Microsoft Learn (2025) T-SQL support in Microsoft Fabric notebooks [link]
[2] Microsoft Learn (2025) Create and run a SQL Server notebook [link]
[3] Microsoft Learn (2025) T-SQL surface area in Microsoft Fabric [link]
[4] Microsoft Fabric Updates Blog (2024) Announcing Public Preview of T-SQL Notebook in Fabric [link]

Resources:
[R1] Microsoft Learn (2025) Fabric: What's new in Microsoft Fabric? [link]

Acronyms:
CTAS - Create Table as Select
T-SQL - Transact SQL

23 February 2025

💎🏭SQL Reloaded: Microsoft Fabric's SQL Databases (Part X: Templates for Database Objects)

One of the new features I noticed in SQL databases while working on the previous post is the availability of templates. The functionality is useful even if it is kept to a minimum. Probably, more value can be obtained when it's used in combination with Copilot, which requires at least a F12 capacity.

Schemas

Schemas are used to create a logical grouping of objects such as tables, stored procedures, and functions. From a structural and security point of view it makes sense to create additional schemas to manage the various database objects and use the default dbo schema only occasionally (e.g. for globally used objects).

-- generated template - schema
CREATE SCHEMA SchemaName

-- create schema
CREATE SCHEMA Test

One can look at the sys.schemas to retrieve all the schemas available:

-- retrieve all schemas
SELECT schema_id
, name
, principal_id
FROM sys.schemas
ORDER BY schema_id
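
To illustrate the security aspect, permissions can be granted at schema level instead of object by object. The following is a minimal sketch that assumes the Test schema created above exists and that the current user is allowed to create roles; the role name TestReaders is hypothetical.

```sql
-- create a database role (hypothetical name)
CREATE ROLE TestReaders;

-- allow the role to read from all the objects in the schema
GRANT SELECT ON SCHEMA::Test TO TestReaders;

-- review the permissions granted on the schema
SELECT PRM.permission_name
, PRM.state_desc
, schema_name(PRM.major_id) schema_name
, user_name(PRM.grantee_principal_id) grantee
FROM sys.database_permissions PRM
WHERE PRM.class_desc = 'SCHEMA'
  AND PRM.major_id = schema_id('Test');

-- cleanup (once the role is not needed anymore)
REVOKE SELECT ON SCHEMA::Test FROM TestReaders;
DROP ROLE TestReaders;
```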

Tables

Tables, as database objects that contain all the data in a database, are probably the elements that need the greatest attention in design and data processing. In some cases a table can be denormalized and store all the data needed, much like in MS Excel, or it can be normalized into fact and dimension tables. 

Tables can be created explicitly by defining in advance their structure (see Option 1), respectively on the fly (see Option 2). 

-- Option 1
-- create the table manually (structure defined in advance)
CREATE TABLE [Test].[Customers](
	[CustomerId] [int] NOT NULL,
	[AddressID] [int] NULL,
	[Title] [nvarchar](8) NULL,
	[FirstName] [nvarchar](50) NULL,
	[LastName] [nvarchar](50) NULL,
	[CompanyName] [nvarchar](128) NULL,
	[SalesPerson] [nvarchar](256) NULL
) ON [PRIMARY]
GO

-- insert records
INSERT INTO Test.Customers (CustomerId, Title, FirstName, LastName, CompanyName, SalesPerson)
SELECT CustomerId
, Title
, FirstName 
, LastName
, CompanyName
, SalesPerson
FROM SalesLT.Customer

-- checking the output (both scenarios)
SELECT top 100 *
FROM Test.Customers
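
The on-the-fly creation (Option 2) can be sketched via SELECT ... INTO, where the table structure is derived from the query. The table name Test.CustomersCopy is an assumption, chosen so the statement doesn't collide with the table created under Option 1.

```sql
-- Option 2
-- create the table on the fly (the structure is derived from the SELECT)
SELECT CustomerId
, Title
, FirstName 
, LastName
, CompanyName
, SalesPerson
INTO Test.CustomersCopy -- hypothetical table name
FROM SalesLT.Customer

-- checking the output
SELECT top 100 *
FROM Test.CustomersCopy

-- drop the table once it's not needed anymore
DROP TABLE IF EXISTS Test.CustomersCopy
```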

One can look at the sys.tables to retrieve all the tables available:

-- retrieve all tables
SELECT schema_name(schema_id) schema_name
, object_id
, name
FROM sys.tables
ORDER BY schema_name
, name

Views

Views are much like virtual tables based on the result set of a SQL statement that combines data from one or multiple tables. They can be used to encapsulate logic, respectively to project horizontally or vertically a subset of the data. 

-- create view
CREATE OR ALTER VIEW Test.vCustomers
-- Customers 
AS
SELECT CST.CustomerId 
, CST.Title
, CST.FirstName 
, IsNull(CST.MiddleName, '') MiddleName
, CST.LastName 
, CST.CompanyName 
, CST.SalesPerson 
FROM SalesLT.Customer CST

-- test the view 
SELECT *
FROM Test.vCustomers
WHERE CompanyName = 'A Bike Store'

One can look at the sys.views to retrieve all the views available:

-- retrieve all views
SELECT schema_name(schema_id) schema_name
, object_id
, name
FROM sys.views
ORDER BY schema_name
, name

User-Defined Functions

A user-defined function (UDF) allows creating a function based on a SQL expression. It can be used alone or as part of a query, as in the examples below.

-- generated template - user defined function
CREATE FUNCTION [dbo].[FunctionName] (
    @param1 INT,
    @param2 INT
)
RETURNS INT AS BEGIN RETURN
    @param1 + @param2
END

-- user-defined function: 
CREATE OR ALTER FUNCTION Test.GetFirstMiddleLastName (
    @FirstName nvarchar(50),
    @MiddleName nvarchar(50),
    @LastName nvarchar(50)
)
RETURNS nvarchar(150) AS 
BEGIN 
   RETURN IsNull(@FirstName, '') + IsNull(' ' + @MiddleName, '') + IsNull(' ' + @LastName, '') 
END

-- test UDF on single values
SELECT Test.GetFirstMiddleLastName ('Jack', NULL, 'Sparrow')
SELECT Test.GetFirstMiddleLastName ('Jack', 'L.', 'Sparrow')

-- test UDF on a whole table
SELECT TOP 100 Test.GetFirstMiddleLastName (FirstName, MiddleName, LastName)
FROM SalesLT.Customer

One can look at the sys.objects to retrieve all the scalar functions available:

-- retrieve all scalar functions
SELECT schema_name(schema_id) schema_name
, name
, object_id
FROM sys.objects 
WHERE type_desc = 'SQL_SCALAR_FUNCTION'
ORDER BY schema_name
, name

However, UDFs prove to be especially useful when they combine the capabilities of functions with those of views, which allows creating a "parameterized view" (see the next example) or even encapsulating multi-statement logic that returns a dataset. Currently, there seems to be no template available for creating such functions.

-- table-valued function
CREATE OR ALTER FUNCTION Test.tvfGetCustomers (
    @CompanyName nvarchar(50) NULL
)
RETURNS TABLE
-- Customers by Company
AS
RETURN (
	SELECT CST.CustomerId 
	, CST.CompanyName
	, CST.Title
	, IsNull(CST.FirstName, '') + IsNull(' ' + CST.MiddleName, '') + IsNull(' ' + CST.LastName, '') FullName
	, CST.FirstName 
	, CST.MiddleName 
	, CST.LastName 
	FROM SalesLT.Customer CST
	WHERE CST.CompanyName = IsNull(@CompanyName, CST.CompanyName)
);

-- test function for values
SELECT *
FROM Test.tvfGetCustomers ('A Bike Store')
ORDER BY CompanyName
, FullName

-- test function for retrieving all values
SELECT *
FROM Test.tvfGetCustomers (NULL)
ORDER BY CompanyName
, FullName

One can look at the sys.objects to retrieve all the table-valued functions available:

-- retrieve all table-valued functions
SELECT schema_name(schema_id) schema_name
, name
, object_id
FROM sys.objects 
WHERE type_desc = 'SQL_INLINE_TABLE_VALUED_FUNCTION'
ORDER BY schema_name
, name
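
For the multi-statement case mentioned above, a minimal sketch could look as follows. The function name Test.tvfGetCustomerCounts and its logic are hypothetical, chosen only to show the syntax, in which the result is first collected in a table variable and then returned.

```sql
-- multi-statement table-valued function (hypothetical example)
CREATE OR ALTER FUNCTION Test.tvfGetCustomerCounts (
    @MinCustomers int
)
RETURNS @Result TABLE (
    CompanyName nvarchar(128) NOT NULL
  , NoCustomers int NOT NULL
)
-- Companies with at least @MinCustomers customers
AS
BEGIN
    -- collect the data in the table variable
    INSERT INTO @Result (CompanyName, NoCustomers)
    SELECT CST.CompanyName
    , COUNT(*)
    FROM SalesLT.Customer CST
    WHERE CST.CompanyName IS NOT NULL
    GROUP BY CST.CompanyName
    HAVING COUNT(*) >= @MinCustomers;

    RETURN;
END
GO

-- test the function
SELECT *
FROM Test.tvfGetCustomerCounts (2)
ORDER BY NoCustomers DESC
, CompanyName

-- drop the function once it's not needed anymore
DROP FUNCTION IF EXISTS Test.tvfGetCustomerCounts
```

Such functions are listed in sys.objects with the type_desc 'SQL_TABLE_VALUED_FUNCTION'.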

Stored Procedures

A stored procedure is a set of SQL statements stored as a database object; its execution plan is compiled on first execution and cached for reuse. Typically, the statements used in SQL functions can also be implemented as a stored procedure, however the latter doesn't allow reusing the output directly (e.g. in a SELECT statement).

-- get customers by company
CREATE OR ALTER PROCEDURE Test.spGetCustomersByCompany (
    @CompanyName nvarchar(50) NULL
)
AS
BEGIN
	SELECT CST.CustomerId 
	, CST.CompanyName
	, CST.Title
	, IsNull(CST.FirstName, '') + IsNull(' ' + CST.MiddleName, '') + IsNull(' ' + CST.LastName, '') FullName
	, CST.FirstName 
	, CST.MiddleName 
	, CST.LastName 
	FROM SalesLT.Customer CST
	WHERE CST.CompanyName = IsNull(@CompanyName, CST.CompanyName)
	ORDER BY CST.CompanyName
	, FullName
END 

-- test the procedure 
EXEC Test.spGetCustomersByCompany NULL -- all customers
EXEC Test.spGetCustomersByCompany 'A Bike Store' -- individual customer

One can look at the sys.objects to retrieve all the stored procedures available:

-- retrieve all stored procedures
SELECT schema_name(schema_id) schema_name
, name
, object_id
FROM sys.objects 
WHERE type_desc = 'SQL_STORED_PROCEDURE'
ORDER BY schema_name
, name
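
Although a procedure's output can't be queried directly, it can be captured via INSERT ... EXEC into a table whose structure matches the result set. Below is a minimal sketch, assuming the procedure Test.spGetCustomersByCompany from above exists; the temporary table #Customers and its column lengths are assumptions.

```sql
-- temporary table matching the procedure's result set (lengths are assumptions)
CREATE TABLE #Customers (
  CustomerId int NOT NULL
, CompanyName nvarchar(128) NULL
, Title nvarchar(8) NULL
, FullName nvarchar(160) NULL
, FirstName nvarchar(50) NULL
, MiddleName nvarchar(50) NULL
, LastName nvarchar(50) NULL
);

-- capture the procedure's output
INSERT INTO #Customers
EXEC Test.spGetCustomersByCompany 'A Bike Store';

-- reuse the output
SELECT CompanyName
, COUNT(*) NoCustomers
FROM #Customers
GROUP BY CompanyName;

-- cleanup
DROP TABLE #Customers;
```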

In the end, don't forget to drop the objects created above (note the order of the dependencies):

-- drop function 
DROP FUNCTION IF EXISTS Test.GetFirstMiddleLastName

-- drop function 
DROP FUNCTION IF EXISTS Test.tvfGetCustomers 

-- drop procedure 
DROP PROCEDURE IF EXISTS Test.spGetCustomersByCompany 

-- drop view 
DROP VIEW IF EXISTS Test.vCustomers 

-- drop table 
DROP TABLE IF EXISTS Test.Customers 

-- drop schema 
DROP SCHEMA IF EXISTS Test


References:
[1] Microsoft Learn (2024) Microsoft Fabric: Overview of Copilot in Fabric [link]


About Me

Koeln, NRW, Germany
IT Professional with more than 25 years experience in IT in the area of full life-cycle of Web/Desktop/Database Applications Development, Software Engineering, Consultancy, Data Management, Data Quality, Data Migrations, Reporting, ERP implementations & support, Team/Project/IT Management, etc.