Showing posts with label OPENQUERY. Show all posts
Showing posts with label OPENQUERY. Show all posts

09 February 2025

🏭🗒️Microsoft Fabric: Kusto Query Language (KQL) [Notes]

Disclaimer: This is work in progress intended to consolidate information from various sources for learning purposes. For the latest information please consult the documentation (see the links below)! 

Last updated: 9-Feb-2025

[Microsoft Fabric] Kusto Query Language (KQL)

  • {def} a read-only request to process query language [1]
    • designed for data exploration and summarization [1]
      • very similar to SQL
        • the explain command can be used to transform SQL into KQL code
          • ⇐ not all the SQL syntax can be translated
    • statements are sequenced being executed in the order of their arrangement
      • funnel like processing where data is piped from one operator to the next
        • data is filtered, rearranged or summarized at each step and then fed into the following step
        • statements are sequenced by a pipe (|)
    • returns data in a tabular or graph format
    • designed and developed to take advantage of cloud computing through clustering and scaling compute [2]
      • ideal engine to power fast, real-time dashboards
    • case-sensitive in general 
    • named after the undersea pioneer Jacques Cousteau [2]
    • operation sequence
      • filter data
      • aggregate data
      • order data
      • modify column output
  • supports standard data types 
    • string
      • a sequence of zero or more Unicode characters
      • characters are encoded in UTF-8.
    • int 
      • 32-bit whole-number integer
    • long
      • signed 64-bit whole-number integer
    • real (aka double)
      • 64-bit decimal-based number 
      • and provides high precision with decimal points.
    • decimal
      • a 128-bit decimal number
      • provides the highest precision of decimal points
      • {recommendation} if precision is not needed, use the real type instead [2]
    • bool 
      • a boolean value that can be a true (1), false (0), or null
    • datetime 
      • represents a date in the UTC zone
    • timespan 
      • represents a time interval
        •  days, hours, minutes, seconds, milliseconds, microseconds, tick
      • if no time frame is specified, it will default to day
    • dynamic
      • a special data type that can take 
        • any value from the other data types
        • arrays
        • a {name = value} property bag
    • guid
      • a 128-bit globally unique value
  • statement types
    • tabular expression statement
    • let statement
      • used to 
        • set variable names equal to an expression
        • create views
      • ⇐ used mostly to 
        • help break complex expressions into multiple parts, each represented by a variable
        • sett constants outside the query to aid in readability
    • set statement
      • used to set the query duration
  • {tool}Microsoft Santinel 
    • {def} a cloud native SIEM and SOAR that provides cyberthreat detection, investigation, response, and proactive hunting, with a bird's-eye view across your enterprise [3]
  • {tool} Kusto Explorer
    • {def} user-friendly interface to query and analyze data with KQL [4]
  • {tool} Azure Data Studio 
    • {def} lightweight, cross-platform data management and development tool for data professionals [5]

References:
[1] Microsoft (2024) Real-time Analytics: End-to-End Workshop
[2] Mark Morowczynski et al (2024) The Definitive Guide to KQL: Using Kusto Query Language for Operations, Defending, and Threat Hunting
[3] Microsoft Learn (2024) Azure: What is Microsoft Sentinel [link]
[4] Microsoft Learn (2024) Kusto: Kusto.Explorer installation and user interface [link]
[5] Microsoft Learn (2024) SQL: What is Azure Data Studio? [link]

Acronyms:
KQL - Kusto Query Language (
SIEM - security information and event management
SOAR - security orchestration, automation, and response
SQL - Structured Query Language
UTC - Universal Time Coordinated

12 September 2009

💎SQL Reloaded: Number of Records I (via COUNT)

Getting the number of records from a table is straightforward, a simple SELECT doing the job:

-- general syntax 
SELECT count(*) NoRecords
FROM <schema>.<table>

-- example 
SELECT count(*) NoRecords
FROM [Person].[Address]

Some prefer using a count(1) instead of a count(*), though it doesn't make a difference considering how SQL Server handles the two scripts in the background.

-- general syntax 
SELECT count(1) NoRecords
FROM <schema>.<table>

-- example 
SELECT count(1) NoRecords
FROM [Person].[Address]

Same can be achieved by using a SUM:

-- general syntax 
SELECT SUM(1) NoRecords
FROM <schema>.<table>

-- example 
SELECT SUM(1) NoRecords
FROM [Person].[Address]

However the SUM is typically used when counting only certain records within the same line (the attributes must be know however beforehand):

-- counting the occurrences for a certain values via SUM
SELECT SUM(CASE WHEN Class IS NULL THEN 1 ELSE 0 END) [Nulls]
, SUM(CASE WHEN Class = 'H' THEN 1 ELSE 0 END) [H]
, SUM(CASE WHEN Class = 'L' THEN 1 ELSE 0 END) [L]
, SUM(CASE WHEN Class = 'M' THEN 1 ELSE 0 END) [M]
, count(*) NoRecords
FROM [Production].[Product]

The sane output can be used via a GROUP BY based on the grouping attributes:

-- counting the occurrences for a certain attribute via COUNT
SELECT Class
, count(*) NoRecords
FROM [Production].[Product]
GROUP BY Class

-- counting the occurrences for a group of attributes via COUNT
SELECT Class
, Color
, count(*) NoRecords
FROM [Production].[Product]
GROUP BY Class
, Color

If you want to implement the functionality in a function or stored procedure and provide the name of the table as parameter, then things get a little more complicated as the query needs to be built dynamically on runtime. In stored procedures a query built dynamically can be usually executed using EXEC or EXECUTE stored procedures, though none of them will do as we need to reuse the actual value returned by the query. This can be achieved with the help of a third stored procedure called sp_executesql, and the work can be done relatively easily with the below stored procedure:

-- dropping the stored procedure
--DROP PROCEDURE IF EXISTS dbo.pGetNumberRecords

-- creating the stored procedure
CREATE PROCEDURE dbo.pGetNumberRecords(
  @Table varchar(100) 
, @NumberRecords int OUTPUT 
)
AS 
DECLARE @SglQuery nvarchar(250) 

IF IsNull(@table,'')<>'' 
BEGIN 
  SET @SglQuery = N'SELECT @NumberRecords = count(*) FROM ' + @table 

  --get the number of records 
  EXEC sp_executesql @Query = @SglQuery
                 , @params = N'@NumberRecords INT OUTPUT' 
                 , @NumberRecords = @NumberRecords OUTPUT 
END 

-- testing the stored procedure 
DECLARE @NumberRecords int 
EXEC dbo.pGetNumberRecords '[Production].[Product]', @NumberRecords OUTPUT 
SELECT @NumberRecords 

The overhead of using a stored procedure for getting the records count is bearable though the solution is not necessarily elegant, a nicer approach being the use of functions for the same purpose:

-- dropping the function
--DROP FUNCTION IF EXISTS dbo.GetNumberRecords

-- creating the test function
CREATE FUNCTION dbo.GetNumberRecords(
@table as nvarchar(100)) 
RETURNS int 
AS 
BEGIN 
DECLARE @sql varchar(200) 
DECLARE @NumberRecords int

IF IsNull(@table,'')<>'' 
BEGIN 
  SET @sql = N'SELECT @NumberRecords = count(*) FROM ' + @table 

  --get the number of records 
  EXEC sp_executesql @Query = @sql
                 , @params = N'@NumberRecords INT OUTPUT' 
                 , @NumberRecords = @NumberRecords OUTPUT 
END 

RETURN @NumberRecords
END

-- testing the function
SELECT dbo.GetNumberRecords('[AdventureWorks2014].[Production].[Product]')

Unfortunately UDFs have the restriction that they can’t execute stored procedures, and thus the above technique can’t be used. Even if the function is created, when attempting to run it, the following error occurs:
Msg 557, Level 16, State 2, Line 185
Only functions and some extended stored procedures can be executed from within a function.
There is actually an alternative for executing pass-through queries in functions by using OPENQUERY, OPENROWSET or OPENDATASOURCE functions, allowing thus to execute DML statements against a given linked server, as is the case of first two functions, respectively using ad-hoc connection information as is the case of the third function. For exemplification I chose OPENQUERY function, this being the code created to return the number of records from a specified table (you'll need to replace with your server, see below):

--dropping the test function 
-- DROP FUNCTION IF EXISTS dbo.GetNumberRecordsTest

-- creating the test function
CREATE FUNCTION dbo.GetNumberRecordsTest() 
RETURNS int 
AS 
BEGIN 
  RETURN ( 
     SELECT NoRecords 
  FROM OPENQUERY( <server_name>, 'SET NOCOUNT ON; SELECT count(*) NoRecords FROM [AdventureWorks2014].[Production].[Product]') AS O) 
END 

-- retrieving the number of records
SELECT dbo.GetNumberRecordsTest()

 For this piece of code to work is first necessary to refer the server as a linked server:

EXEC sp_serveroption <server_name>, 'Data Access', true

Withought doing this first you’ll get the error:
Msg 7411, Level 16, State 1, Procedure GetNumberRecords, Line 4 Server '>server_name<' is not configured for DATA ACCESS.
You can use sp_helpserver to get server's name:
-- getting server's name
EXEC sp_helpserver 

The above function doesn’t help much. It would be helpful if we could provide the table’s name as parameter and store the query string in a variable, something like:

-- creating the test function
CREATE FUNCTION dbo.GetNumberRecords(
@TableName as nvarchar(100)) 
RETURNS int 
AS 
BEGIN 
  DECLARE @sql varchar(200) 
  SET @sql = 'SET NOCOUNT ON; SELECT count(*) NoRecords FROM ' + @TableName

  RETURN ( 
     SELECT NoRecords 
  FROM OPENQUERY( , @sql) AS O) 
END 

-- retrieving the number of records
SELECT dbo.GetNumberRecords()

Unfortunately OPENQUERY function same as OPENROWSET and OPENDATASOURCE expect a constant parameter as query and therefore the error (as per SQL Server 2005):
Msg 102, Level 15, State 1, Procedure GetNumberRecords, Line 12 Incorrect syntax near '@sql'. Expecting STRING or TEXT_LEX
I wonder why the SQL Server team hasn’t allowed the use of variables as parameters for OPENQUERY functions, if not in the previous SQL Server versions at least in SQL Server 2008! Anyway, another impediment would be raised by the deterministic nature of functions.

Notes:
1) There are further alternatives, e.g. sys.partitions DMV provides similar information (see post) ideal for scenarios in which only an approximative value is needed.
2) For other scenarios one can use the sp_MSForEachTable undocumented stored procedure (see post)

Happy coding!

Previous Post <<||>> Next Post
Related Posts Plugin for WordPress, Blogger...

About Me

My photo
Koeln, NRW, Germany
IT Professional with more than 24 years experience in IT in the area of full life-cycle of Web/Desktop/Database Applications Development, Software Engineering, Consultancy, Data Management, Data Quality, Data Migrations, Reporting, ERP implementations & support, Team/Project/IT Management, etc.