Showing posts with label DML. Show all posts
Showing posts with label DML. Show all posts

22 March 2025

💠🛠️🗒️SQL Server: Indexed Views [Notes]

Disclaimer: This is work in progress based on notes gathered over the years, intended to consolidate information from the various sources. The content needs yet to be reviewed against the current documentation. 

Last updated: 22-Mar-2024

[SQL Server 2005] Indexed View

  • {def} a materialized view
    • materializes the data from the view queries, storing it in the database in a way similar to tables [6]
      • ⇒ its definition is computed and the resulting data stored just like a table [3]
      • the view is indexed by creating a unique clustered index on it
        • the resulting structure is physically identical to a table with a clustered index
          • ⇐ nonclustered indexes also are supported on this structure
      • can be created on a partitioned table, respectively can be partitioned [1]
    • {benefit} can improve the performance of some types of queries [3]
      • e.g. queries that aggregate many rows
      • ⇐ because the view is stored the same way a table with a clustered index is stored [1]
      • ⇐ not well-suited for underlying data that are frequently updated [3]
      •  more expensive to use and maintain than filtered indexes [5]
    • [query optimizer] 
      • can use it to speed up the query execution [1]
        • the view doesn't have to be referenced in the query for the optimizer to consider it for a substitution [1]
        • {downside} DML query performance can degrade significantly [1]
          • ⇐ in some cases, a query plan can't even be produced [1]
          • when executing UPDATE, DELETE or INSERT on the base table referenced, the indexed views must be updated as well [1]
          • {recommendation} test DML queries before production use [1]
            • analyze the query plan and tune/simplify the DML statemen [1]
      • can use the structure to return results more efficiently to the user
        • contains logic to use this index in either of the cases 
          • the original query text referenced the view explicitly [2]
          • the user submits a query that uses the same components as the view (in any equivalent order) [2]
          • ⇐ the query processor expands indexed views early in the query pipeline and always uses the same matching code for both cases [2]
            • the WITH(NOEXPAND) hint tells the query processor not to expand the view definition [2]
            • also instructs the query processor to perform an index scan of the indexed view rather than expand it into its component parts [5]
            • any extra rows in the indexed view are reported as 8907 errors [5]
            • any missing rows are reported as 8908 errors [5]
      • expose some of the benefits of view materialization while retaining the benefits of global reasoning about query operations [2]
      • expanded (aka in-lined) before optimization begins
        • gives the Query Optimizer opportunities to optimize queries globally [2]
        • makes it difficult for the (query) optimizer to consider plans that perform the view evaluation first, then process the rest of the query [2]
          • arbitrary tree matching is a computationally complex problem, and the feature set of views is too large to perform this operation efficiently [2]
      • cases in which it does not match the view
        • indexed views are inserted into the Memo and evaluated against other plan choices
          • while they are often the best plan choice, this is not always the case [2]
          • the Query Optimizer can detect logical contradictions between the view definition and the query that references the view [2]
        • there are also some cases where the Query Optimizer does not recognize an indexed view even when it would be a good plan choice [2]
          • often, these cases deal with complex interactions between high-level features within the query processor (e.g. computed column matching, the algorithm to explore join orders) [2]
          • consider the WITH (NOEXPAND) hint to force the query processor to pick that indexed view [2]
            •  this usually is enough to get the plan to include the indexed view [2]
        • indexed view alternatives 
          • are generated and stored in the Memo 
          • are compared using costing equations against other possible plans
          • partial matches cost the residual operations as well
            • an indexed-view plan can be generated but not picked when the Query Optimizer considers other plans to have lower costs [2]
        • maintained as part of the update processing for tables on which the view is based
          • this makes sure that the view provides a consistent result if it is selected by the Query Optimizer for any query plan [2]
          • some query operations are incompatible with this design guarantee
            • restrictions are placed on the set of supported constructs in indexed views to make sure that the view can be created, matched, and updated efficiently [2]
        • {operation} updating indexed views
          • the core question behind the restrictions is “Can the query processor compute the necessary changes to the Indexed View clustered and nonclustered indexes without having to recompute the whole indexed view?” [2]
            • if so, the query processor can perform these changes efficiently as part of the maintenance of the base tables that are referenced in the view[2]
              • this property is relatively easy for filters, projections (compute scalar), and inner joins on keys[2]
              • operators that destroy or create data are more difficult to maintain, so often these are restricted from use in indexed views. [2]
        • matching indexed views is supported in cases beyond exact matches of the query text to the view definition [2]
          • it also supports using an indexed view for inexact matches where the definition of the view is broader than the query submitted by the user [2]
            • then applies residual filters, projections (columns in the select list), and even aggregates to use the view as a partial precomputation of the query result [2]
    • {concept} statistics on indexed views
      • normally statistics aren't needed
        • because the substitution of the indexed views into the query plan is considered only after all the statistics for the underlying tables and indexes are attached to the query plan [3]
        • used if the view is directly referenced by the NOEXPAND hint in a FROM clause 
          • an error is generated and the plan is not created if the NOEXPAND hint is used on a view that does not also contain an index [3]
      • can’t be created by using sp_createstats or updated by using sp_updatestats. 
      • auto update and auto create statistics features work for indexed views
        • created manually
          • via CREATE STATISTICS on the indexed view columns
          • via UPDATE STATISTICS to update column or index statistics on indexed views
    • {operation} creating a view
      • requires that the underlying object’s schema can’t change
      • requires WITH SCHEMABINDING option [5]
      • ⇒ must include the two-part names of all referenced tables [5]
      • ⇐ the tables can't be dropped and the columns can't be altetd when participate in a view unless the view is tropped [5]
      • ⇐ an error is raised [5]
      • the user must hold 
        • the CREATE VIEW permission in the database [1]
        • ALTER permission on the schema in which the view is being created [1]
        • if the base table resides within a different schema, the REFERENCES permission on the table is required as a minimum [1]
        • if the user creating the index differs from the users who created the view, for the index creation alone the ALTER permission on the view is required [1]
    • {operation} creating an index on the view[
      • indexes can only be created on views that have the same owner as the referenced table or tables (aka intact ownership chain between the view and the tables) [1]
    • {operation} dropping a view
      • makes all indexes on the view to be dropped  [1]
        • ⇐ all nonclustered indexes and auto-created statistics on the view are dropped when the clustered index is dropped [1]
          • {exception} ser-created statistics on the view are maintained [1]
      • nonclustered indexes can be individually dropped [1]
      • dropping the clustered index on the view 
        • removes the stored result set [1]
        • the optimizer returns to processing the view like a standard view [1]
    • {operation} disable indexes on tables and views
      • when a clustered index on a table is disabled, indexes on views associated with the table are also disabled [1]
    • {option} EXPAND VIEWS
      • allows to prevent the Database Engine from using indexed views [1]
        • if any of the listed options are incorrectly set, this option prevents the optimizer from using the indexes on the views [1]
        • via OPTION (EXPAND VIEWS) hint
    • {recommendation} when using datetime and smalldatetime string literals in indexed views, explicitly convert the literal to the date type by using a deterministic date format style [1]
    • {limitation} AVG is not allowed {workaround} use SUM and COUNT_BIG (5]
    • {limitation} impacted by SET options [1]
      • {restriction} require fixed values for several SET options [1]
      • {recommendation} set the ARITHABORT user option to ON server-wide as soon as the first indexed view or index on a computed column is created in any database on the server [1]
    • {limitation} further requirements apply (see [1])
    • {limitation} aren't supported on top of temporal queries
      • ⇐ queries that use FOR SYSTEM_TIME clause).
    • {scenario}simplifying SQL queries
    • {scenario} abstracting data models from user models
    • {scenario} enforcing user security


References:
[1] Microsoft Learn (2024) SQL Server: Create indexed views [link]
[2] Kalen Delaney et all (2009) Microsoft® SQL Server® 2008 Internals
[3] Microsoft Learn (2024) SQL Server: Views [link]
[4] Microsoft Learn (2024) SQL Server: CREATE INDEX (Transact-SQL) [link]
[5] Kalen Delaney et all (2012) Microsoft® SQL Server® 2012 Internals
[6] Dmitri Korotkevitch (2016) Pro SQL Server Internals 2nd Ed.

Resources:
[R1] Microsoft Learn (2024) SQL Server: Optimize index maintenance to improve query performance and reduce resource consumption [link]

Acronyms:
DML - Data Manipulation Language
QO - Query Optimizer

22 January 2025

🏭🗒️Microsoft Fabric: Clone Tables in Warehouses [Notes]

Disclaimer: This is work in progress intended to consolidate information from various sources for learning purposes. For the latest information please consult the documentation (see the links below)! 

Last updated: 22-Jan-2025

[Microsoft Fabric] Zero-copy Clone

  • {def} a replica of an existing OneLake table created by copying existing table's metadata and referencing its data files [1]
    • the metadata is copied while the underlying data of the table stored as parquet files is not copied [1]
    • its creation is like creating a delta table [1]
    • DML/DDL changes on the source 
      • are not reflected in the clone table [1]
      • are not reflected on the source [1]
    • can be created within or across schemas in a warehouse [1]
    • created based on either:
      • current point-in-time
        • based on the present state of the table [1]
      • previous point-in-time
        • based on a point-in-time up to seven days in the past
          • the table clone contains the data as it appeared at a desired past point in time
          • all CRUD operations are retained for seven calendar days
        • created with a timestamp based on UTC
  • {characteristic} autonomous existence
    • the original source and the clones can be deleted without any constraints [1]
    • once a clone is created, it remains in existence until deleted by the user [1]
  • {characteristic} inherits 
    • object-level SQL security from the source table of the clone [1]
      • DENY permission can be set on the table clone if desired [1]
        • the workspace roles provide read access by default [1]
    • all attributes that exist at the source table, whether the clone was created within the same schema or across different schemas in a warehouse [1]
    • the primary and unique key constraints defined in the source table [1]
  • a read-only delta log is created for every table clone that is created within the Warehouse [1]
  • {benefit} facilitates development and testing processes 
    • by creating copies of tables in lower environments [1]
  • {benefit} provides consistent reporting and zero-copy duplication of data for analytical workloads and ML modeling and testing [1]
  • {benefit} provides the capability of data recovery in the event of a failed release or data corruption by retaining the previous state of data [1]
  • {benefit} helps create historical reports that reflect the state of data as it existed as of a specific point-in-time in the past [1]
  • {limitation} table clones across warehouses in a workspace are not currently supported [1]
  • {limitation} table clones across workspaces are not currently supported [1]
  • {limitation} clone table is not supported on the SQL analytics endpoint of the Lakehouse [1]
  • {limitation} clone of a warehouse or schema is currently not supported [1]
  • {limitation} table clones submitted before the retention period of seven days cannot be created [1]
  • {limitation} cloned tables do not currently inherit row-level security or dynamic data masking [1]
  • {limitation} changes to the table schema prevent a clone from being created prior to the table schema change [1]
  • {best practice} create the clone tables in dedicated schema(s)
  • [syntax] CREATE TABLE <schema.clone_table_name> AS CLONE OF <schema.table_name>

Previous Post  <<||>> Next Post

References:
[1] Microsoft Learn (2023) Clone table in Microsoft Fabric [link]
[2] Microsoft Learn (2024) Tutorial: Clone tables in the Fabric portal [link]
[3] Microsoft Learn (2024) Tutorial: Clone a table with T-SQL in a Warehouse [link]
[4] Microsoft Learn (2024) SQL: CREATE TABLE AS CLONE OF [link]

Resources:
[R1] Microsoft Learn (2025) Fabric: What's new in Microsoft Fabric? [link]

31 January 2024

🏭🗒️Microsoft Fabric: Parquet Format [Notes]

Disclaimer: This is work in progress intended to consolidate information from various sources for learning purposes. For the latest information please consult the documentation (see the links below)! 

Last updated: 31-Jan-2024

[Microsoft Fabric] Parquet format

  • {definition} open source, column-oriented data file format designed for efficient data storage and retrieval [1]
    • provides efficient data compression and encoding schemes with enhanced performance to handle complex data in bulk [1]
    • designed to be a common interchange format for both batch and interactive workloads [1]
  • {characteristic} open source file format
    • similar to other columnar-storage file formats available in Hadoop [1]
      • e.g. RCFile, ORC
    • became an industry standard 
      •  {benefit} provides interoperability across multiple tools
  • {characteristic} language agnostic [1]
    • different programming languages can be used to manipulate the data
  • {characteristic} column-based format [1]
    • files are organized by column
      • ⇐ rather than by row
      • ⇒ saves storage space and speeds up analytics queries [1]
    •  reads only the needed columns 
      • ⇐ non-relevant data are skipped
      • ⇒ greatly minimizes the IO [1]
        • aggregation queries are less time-consuming compared to row-oriented databases [1]
    • {benefit} increased data throughput and performance [1]
      • ⇒ recommended for analytical workloads
  • {characteristic} highly efficient data compression/decompression [1]
    • supports flexible compression options and efficient encoding schemes [1]
      • data can be compressed by using one of the several codecs available [1]
        • ⇒ different data files can be compressed differently [1]
    •  reduced storage requirements [1]
      • by at least one-third on large datasets
      • ⇒ {benefit} saves on cloud storage space
    •  greatly improves scan and deserialization time [1]
      • ⇒ {benefit} reduces the processing costs
    • {downside} can be slower to write than row-based file formats
      • primarily because they contain metadata about the file contents 
      • though have fast read times
  • {characteristic} supports complex data types and advanced nested data structures [1]
    • implemented using the record-shredding and assembly algorithm
      • accommodates complex data structures that can be used to store the data [1]
      • optimized to work with complex data in bulk and features different ways for efficient data compression and encoding types [1]
        • the approach is best especially for those queries that need to read certain columns from a large table [1]
  • {characteristic} cloud-ready
    • works best with interactive and serverless technologies [1]
  • {characteristic} immutable
    • a file can't be update to modify the column name, reorder or drop columns [2]
      • ⇐ requires rewriting the whole file [2]
  • {characteristic} binary-based file
    • ⇒ not easily readable (by humans)
  • {characteristic} self-describing 
    •  contains metadata about schema and structure
    • {concept} row groups (aka segments) 
      • contains data from the same columns
        • {constraint} column names are case sensitive
    • {concept} file footer 
      • stores metadata statistics for each row group [2]
        • min/max statistics 
        • the number of rows
        • can be leveraged by data processing engines to run queries more efficiently [2]
          • ⇐ depending on the query, entire row group can be skipped [2]
    • {concept} file header
  •  large datasets can be split across multiple parquet files
    • ⇐ the structure can be flat or hierarchical 
    • managing multiple files has several challenges
    • the files can be used to define a table (aka parquet table)
      • ⇐ {constraint} the files must have the same definition
        • ⇐ schema enforcement must be coded manually [2]
      • {limitation} [Data Lake] no support for ACID transactions [2]
        • ⇒ easy to corrupt [2]
          • partially written files will break any subsequent read operations
            • the compute engine will try to read in the corrupt files and error out [2]
            • corrupted files must be manually identified and deleted manually to fix the issues [2]
      • {limitation} it's not easy to delete rows from it [2]
        • requires reading all the data, filtering out the data not needed, and then rewriting the entire table [2]
      • {limitation} doesn't support DML transactions [2]
      • {limitation} there is no change data feed [2]
      • {limitation} slow file listing [2]
        • small files require excessive I/O overhead
          • ideally the files should be between 64 MB and 1 GB
          • ideally the files should be compacted into larger files (aka small file compaction, bin-packing)
      • {limitation} expensive footer reads to gather statistics for file skipping [2]
        • fetching all the footers and building the file-level metadata for the entire table is slow [2]
          • ⇐ it requires a file-listing operation [2]
        • the effectiveness of data skipping depends on how many files can be can skipped when performing a query [2]
      • {limitation} doesn't support schema enforcement [2]
      • {limitation} doesn't support check constraints [2]
      • {limitation} doesn't support data versioning [2]
    • {concept} table partitioning
      • {definition} common optimization approach used to store the data of the same table in different directories, with partitioning column values encoded in the path of each partition directory [6]
      • {recommendation} avoid partitioning by columns with very high cardinality
    • {concept} bin-packing (aka compaction, bin-compaction)
      • aims to produce evenly-balanced data files with respect to their size on disk, 
        • ⇐ but not necessarily in respect to the number of tuples per file [7]
      • requires an algorithm that efficiently organizes the files into equal size containers [6]
      • {characteristic} idempotent
        •  if it is run twice on the same dataset, the second run has no effect [7]
  • {feature} [Microsoft Fabric] V-order
    • {definition} write time optimization to the parquet file format that enables lightning-fast reads under the MF compute engines [3]
    • applies special sorting, row group distribution, dictionary encoding and compression on parquet files [3]
      • requires less compute engines resources in to read it [3]
        • provides further cost efficiency and performance
          • has a 15% impact on average write times but provides up to 50% more compression [3]
    • {characteristic} open-source parquet format compliant
      • all parquet engines can read it as a regular parquet file [3]
      • ⇐ table properties and optimization commands can be used on control V-Order on its partitions [3]
      • compatible with other features [3]
    • applied at parquet file level [3]
    • enabled by default
  • {command} OPTIMIZE
    • merges all changes into bigger, consolidated parquet files (aka bin-compaction) [3]
    • [Spark] dynamically optimizes partitions while generating files with a default 128 MB size [5]
      • the target file size may be changed per workload requirements using configurations [5]
    • properly designing the table physical structure based on the ingestion frequency and expected read patterns is likely more important than running the optimization command [3]
    • running the compaction operation brings the data lake in an unusable state for readers [7]
    • {warning} manually compacting the files is inefficient and error prone [7]
      • no way to differentiate files that contain new data from files that contain existing data that was just compacted into new files [7]
  • [Delta Lake] when ZORDER and VORDER are used together, Apache Spark performs bin-compaction, ZORDER, VORDER sequentially [3]

Resources:
[1] Databricks (2023) What is Parquet? (link)
[2] Data Lake (2023) Delta Lake vs. Parquet Comparison (link)
[3] Data Mozart (2023) Parquet file format – everything you need to know! (link)
[4] Microsoft Learn (2023) Query Parquet files using serverless SQL pool in Azure Synapse Analytics (link)
[5] Microsoft Learn (2023) Lakehouse tutorial: Prepare and transform data in the lakehouse (link)
[6] Apache Spark (2023) Spark SQL Guide (link)
[7] Delta Lake (2023) Delta Lake Small File Compaction with OPTIMIZE (link)
[8] Delta Lake (2023) Optimizations (link)

Resources:
[R1] Microsoft Learn (2025) Fabric: What's new in Microsoft Fabric? [link]

Acronyms:
ACID - atomicity, consistency, isolation, durability
IO - Input/Output
MF - Microsoft Fabric
ORC - Optimized Row Columnar
RCFile - Record Columnar File

08 January 2023

💠🛠️SQL Server: DELETE vs. TRUNCATE TABLE Cheat Sheet

The comparison between the DELETE and TRUNCATE TABLE commands resumes to more than saying that one method is faster than the other or that one should always use TRUNCATE TABLE when deleting all the records from a table, which typically is not advisable in production environments. I tried to provide an overview of the two commands, though this should be considered as "work in progress".
 
Disclaimer: please refer to the SQL Server Docs for the complete set of features broken down on version.
 
DELETETRUNCATE TABLE
DefinitionDML command that removes one or more rows from a table or viewDDL command that removes all rows from a table or specified partitions of a table, without logging the individual row deletions
Scopetables, views, memory-optimized tables, common table expressions, MERGE, sp_MSforEachTable, linked serverstables, partitions
Behavior• removes rows one at a time and records an entry in the transaction log for each deleted row
  - for big tables the transaction log fills fast and may reach its limit
• fully logged ⇒rollback supported
• executed using a row lock, though locks might be escalated to a larger scope ⇒ performance may degrade
• [views] deletes the records only from the base table
• allows outputting the deleted records (via OUTPUT clause)
• [heaps] pages made empty may remain allocated ⇒ can’t be reused by other objects
• equivalent of a DROP & CREATE TABLE
• uses an optimized logging mode:
   - removes the data by deallocating the data pages used to store the table data and records only the page deallocations in the transaction log
   - a deferred-drop mechanism unhooks the allocations for the table and putting them on the ‘deferred-drop queue’ for later processing by a background task deallocates all the pages and extents
• fully logged, however rollback supported only with explicit transactions
• leaves zero pages in the table
• resets the identity property
Syntax (simple form)
DELETE <table_name>
FROM <database>
[WHERE <search_condition>]   
[OPTION (<query_options>)]
TRUNCATE TABLE <table_name>
[ WITH ( PARTITIONS ( { 
<partition_number_expression> 
| <range> }))]
Performance• degrades with the numbers of records
• [large tables] can cause the transaction log to become full
• the operation completes almost instantaneously
• best practice because is faster and uses fewer system and transaction log resources
Constraints• a DELETE may fail if - violates a trigger - tries to remove a row referenced by data in another table with a FOREIGN KEY constraint
• TOP can’t be used in a DELETE statement against partitioned views • doesn’t reset the identity property
• can’t be used with views
• can’t be used on tables:
   - referenced by a FOREIGN KEY constraint, except self-references
   - participate in an indexed view - published using transactional or merge replication - system-versioned temporal.
   - referenced by an EDGE constraint
• can’t activate a trigger
• can’t be used on views & memory-optimized tables
Permissions• [minimum] DELETE permission on target table, and SELECT permission, it if includes a WHERE clause
• default to
   - table owner
   - members of the sysadmin fixed server role   
   - db_owner and db_datawriter fixed database roles
• table owners & members of sysadmin, db_owner & db_securityadmin roles can transfer permissions to other users
• [minimum] ALTER permission on target table
• default to
   - table owner - members of the sysadmin fixed server role
   - db_owner and db_ddladmin fixed database roles
• permissions are not transferable
• doesn’t support direct permissions (workaround: use TRUNCATE in stored procedure, and assign the required permission to it using the EXECUTE AS clause)
Scenarios• delete a set of records from a table
   - based on fix constraints
   - based on records from another table
   - based on a join with a source table
• empty a set of tables from a database
Recommendations• use a TRUNCATE when is safe to delete all the records ⇒ make sure that a backup or copy of the data is available
• [large tables] consider dropping the indexes before performing a DELETE when this covers all or most of the data, and recreate them afterwards
• [large tables] if the volume of data to be deleted is big compared with the remaining data, consider moving the data to a table with a similar structure, perform a TRUNCATE and then move the data back (see [5])
• [large tables] consider deleting data in batches with log truncation in single-user mode (be careful in production environments)
• [heaps] specify the TABLOCK hint in the DELETE statement
• [heaps] create a clustered index on the heap before deleting the row
Myths•TRUNCATE TABLE is a non-logged operation (see [4])
•TRUNCATE TABLE is a minimally logged operation (see [3])
Related conceptsnon-logged/minimally-logged/fully-logged operations, deferred-drop mechanism, sp_MSforEachTable stored procedure, transaction log, MERGE, DDL, DML

Resources:
[1] Microsoft SQL Docs (2022) DELETE [link]
[2] Microsoft SQL Docs (2022) TRUNCATE TABLE [link]
[3] Microsoft TechNet (2017) SQL Server: Understanding Minimal Logging Under Bulk-Logged Recovery Model vs. Logging in Truncate Operation [link]
[4] Paul Randal (2013) The Myth that DROP and TRUNCATE TABLE are Non-Logged [link]
[5] SQL-troubles (2018) ERP Systems: Dynamics AX 2009 – Deleting Obsolete Companies [link]
[6] Microsoft TechNet (2014) SQL Server: An Examination of Logging in Truncate Table Statement and Its Comparison with Delete Statement [link]

01 January 2011

🔖Book Review: Lynn Beighley's "Head First SQL" (2011)

I love the Head First O’Reilly book series which besides the book on SQL, has acquired over its short existence several other titles like the ones on Data Analysis, Statistics and Excel, topics strongly correlated with the thematic of the current blog, but also related tangential topics like Project Management, Design Patterns, Object-Oriented Analysis and Design, Software Development, Web Design, Programming, PHP & MySQL, HTML with CSS & XHTML, C#, Java, AJAX, JavaScript, Pyhton, Servlets and JSP, EJB or Rails. (The book series contains more titles which seems to be available also at Head First Labs together with other supporting information.) 

What I like about this book series is that approaches the themes from a basic level, using a rich set of "brain-friendly" visual representations, conversational and personalized style, challenging food for thought, plus some advices on how to use book’s format, facts that allow the reader to make most of the learning process. In fact the books are approachable for any newbies in IT branch, so I recommend the whole set of books to beginners as well to experienced programmers, considering the fact that everybody has something to learn, at least when considering the presentational format of the book.

Coming back to the current title, Head First SQL, it covers almost everything somebody has to know about database design, DML and DDL statements, database objects and security in a nutshell. It focuses only on the SQL Standard (I wish I know which one as there are several versions/revisions, see SQL 1999), so it could be used for all type of RDBMS that implements the standard, even if the book is built around MySQL. A consequence of this aspect is that any vendor specific flavors won’t be found in here, so the book should be used in combination with the documentation or the books that target the vendor-specific functionality. There are also several topics not covered but mentioned roughly: reserved words and special characters, temporary tables, data conversion, some numeric functions and operators, indexes, etc.

I must remark that some of the definitions are somehow ambiguous, and I feel that in some cases additional clarifications are necessary. Some examples in this direction could be considered the very first important definitions, the ones of the database seen as "a container that holds tables and SQL structures related to those tables", the one of column seen as piece of data stored by a table, or of a row seen as a single set of columns that describe attributes of a single thing. The first definition is quite acceptable if it’s ignored the existence of non-relational databases (it’s also true that the book targets RDBMS even if not evident), however the other two definitions are too vague, at least from my point of view. 

It’s also true that more rigor and detail would involve the introduction of other concepts (e.g. entity, matrix, etc.), requiring more space and maybe complicate the topics. The reader could intuit from explanatory text and examples what’s all about, however expecting the reader to bridge such gaps should be avoided especially when we consider the intrinsic nature of this book in particular, and IT books in general. Anyway, this shouldn’t stop you discover and enjoy the book!

Notes:
1. As I learned on the hard way, not all book formats and writing styles are matching everybody’s taste and expectations. The simplistic and rich visually format of the book could annoy as well, so might be a good idea to browse first the book on Scribd before buying it.
2. The source code is available here.

09 March 2009

🛢DBMS: Trigger (Definitions)

"A special form of stored procedure that goes into effect when a user gives a change command such as insert, delete, or update to a specified table or column. Triggers are often used to enforce referential integrity." (Karen Paulsell et al, "Sybase SQL Server: Performance and Tuning Guide", 1996)

"A special form of stored procedure that goes into effect when data within a table is modified. Triggers are often created to enforce integrity or consistency among logically related data in different tables." (Patrick Dalton, "Microsoft SQL Server Black Book", 1997)

"A special type of stored procedure that is set off by actions taken on a table. Triggers allow for complex relationships between tables and complex business rules to be checked automatically." (Owen Williams, "MCSE TestPrep: SQL Server 6.5 Design and Implementation", 1998)

"A stored procedure that executes automatically when data in a specified table is modified. Triggers are often created to enforce referential integrity or consistency among logically related data in different tables." (Microsoft Corporation, "SQL Server 7.0 System Administration Training Kit", 1999)

"Code stored in the database that executes automatically when certain events occur. Traditionally associated only with table write events such as INSERT, UPDATE, or DELETE, newer versions of Oracle provide the ability to define triggers on views and on other system events such as logon, logoff, and system error." (Bill Pribyl & Steven Feuerstein, "Learning Oracle PL/SQL", 2001)

"A stored procedure that executes when data in a specified table is modified. Triggers are often created to enforce referential integrity or consistency among logically related data in different tables." (Anthony Sequeira & Brian Alderman, "The SQL Server 2000 Book", 2003)

"A trigger is a stored procedure that is fired when data is modified from a table using any of the three modification statements: DELETE, INSERT, or UPDATE. FOR and AFTER are synonymous, and are usually implied when referring to triggers, rather than INSTEAD OF triggers. Triggers are often created to enforce referential integrity or consistency among logically related data in different tables." (Thomas Moore, "EXAM CRAM™ 2: Designing and Implementing Databases with SQL Server 2000 Enterprise Edition", 2005)

"A chunk of code that executes when a specified event occurs, usually before or after an INSERT, UPDATE, or DELETE command." (Gavin Powell, "Beginning Database Design", 2006)

"A database method that is automatically invoked as the result of Data Manipulation Language (DML) activity within a persistence mechanism." (Pramod J Sadalage & Scott W Ambler, "Refactoring Databases: Evolutionary Database Design", 2006)

"A stored procedure that is fired when data is modified from a table using any of the three modification statements DELETE, INSERT, or UPDATE. FOR and AFTER are synonymous and are usually implied when referring to triggers rather than INSTEAD OF triggers. Triggers are often created to enforce referential integrity or consistency among logically related data in different tables." (Thomas Moore, "MCTS 70-431: Implementing and Maintaining Microsoft SQL Server 2005", 2006)

"A stored procedure that executes when certain conditions occurs such as when a record is created, modified, or deleted. Triggers can perform special actions such as creating other records or validating changes." (Rod Stephens, "Beginning Database Design Solutions", 2008)

"A type of stored procedure that fires in response to action on a table. DML triggers are associated with INSERT, UPDATE, and DELETE statements. DDL triggers are associated with CREATE, ALTER, and DROP statements." (Darril Gibson, "MCITP SQL Server 2005 Database Developer All-in-One Exam Guide", 2008)

"Stored in, and managed by, your database server, this software is executed when a certain event occurs. These events can range from information creation or modification to structural changes to your database. When the event occurs, the trigger is executed, causing a pre-determined set of actions to take place. These actions can encompass data validation, alerts, warnings, and other administrative operations. Triggers can invoke other triggers and stored procedures." (Robert D. Schneider and Darril Gibson, "Microsoft SQL Server 2008 All-In-One Desk Reference For Dummies", 2008)

"A stored procedure that executes in response to a Data Manipulation Language (DML) or Data Definition Language (DDL) event." (Jim Joseph, "Microsoft SQL Server 2008 Reporting Services Unleashed", 2009)

"A SQL program module that is executed when a specific data modification activity occurs. Triggers are stored in the database they manipulate." (Jan L Harrington, "SQL Clearly Explained" 3rd Ed., 2010)

"A stored procedure that can be triggered and executed automatically when a database operation such as insert, update, or delete takes place." (Paulraj Ponniah, "Data Warehousing Fundamentals for IT Professionals", 2010)

"A procedural SQL code that is automatically invoked by the relational database management system upon the occurrence of a data manipulation event." (Carlos Coronel et al, "Database Systems: Design, Implementation, and Management" 9th Ed., 2011)

"A software routine guaranteed to execute when an event occurs. Often a trigger will monitor changes to data values. A trigger includes a monitoring procedure, a set or range of values to check data integrity, and one or more procedures invoked in response, which may update other data or fulfill a data subscription." (DAMA International, "The DAMA Dictionary of Data Management", 2011)

"An event that causes a handler to be run." (Jon Orwant et al, "Programming Perl, 4th Ed.", 2012)

"An event-driven specialized procedure that is attached to database tables; typically implemented to support data integrity requirements." (Craig S Mullins, "Database Administration", 2012)

"A database object that is associated with a single base table or view and that defines a rule. The rule consists of a set of SQL statements that runs when an insert, update, or delete database operation occurs on the associated base table or view." (IBM, "Informix Servers 12.1", 2014)

"A database object that is associated with a single base table or view and that defines a rule. The rule consists of a set of SQL statements that runs when an insert, update, or delete database operation occurs on the associated base table or view." (Sybase, "Open Server Server-Library/C Reference Manual", 2019)

 "A PL/SQL or Java procedure that fires when a table or view is modified or when specific user or database actions occur. Procedures are explicitly run, whereas triggers are implicitly run." (Oracle, "Oracle Database Concepts")

"A stored procedure that executes in response to a data manipulation language (DML) or data definition language (DDL) event." (Microsoft Technet,)

Related Posts Plugin for WordPress, Blogger...

About Me

My photo
Koeln, NRW, Germany
IT Professional with more than 25 years experience in IT in the area of full life-cycle of Web/Desktop/Database Applications Development, Software Engineering, Consultancy, Data Management, Data Quality, Data Migrations, Reporting, ERP implementations & support, Team/Project/IT Management, etc.