08 March 2025

🏭🎗️🗒️Microsoft Fabric: Eventstreams [Notes]

Disclaimer: This is a work in progress intended to consolidate information from various sources for learning purposes. For the latest information, please consult the documentation (see the links below)! 

Last updated: 8-Mar-2025

Real-Time Intelligence architecture [4]

[Microsoft Fabric] Eventstream(s)

  • {def} feature in Microsoft Fabric's Real-Time Intelligence experience that allows bringing real-time events into Fabric
    • bring real-time events into Fabric, transform them, and then route them to various destinations without writing any code 
      • ⇐ aka no-code solution
      • {feature} drag and drop experience 
        • gives users an intuitive and easy way to create event data processing, transforming, and routing logic without writing any code 
    • work by creating a pipeline of events from multiple internal and external sources to different destinations
      • like a conveyor belt that moves data from one place to another [1]
      • transformations to the data can be added along the way [1]
        • filtering, aggregating, or enriching
  • {def} eventstream
    • an instance of the Eventstream item in Fabric [2]
    • {feature} end-to-end data flow diagram 
      • provides a comprehensive understanding of the data flow and organization [2]
  • {feature} eventstream visual editor
    • used to design pipelines by dragging and dropping different nodes [1]
    • sources
      • where event data comes from 
      • one can choose
        • the source type
        • the data format
        • the consumer group
      • Azure Event Hubs
        • allows getting event data from an Azure event hub [1]
        • allows creating a cloud connection with the appropriate authentication and privacy level [1]
      • Azure IoT Hub
        • SaaS service used to connect, monitor, and manage IoT assets with a no-code experience [1]
      • CDC-enabled databases
        • CDC is a software process that identifies and tracks changes to data in a database, enabling real-time or near-real-time data movement [1]
        • Azure SQL Database 
        • PostgreSQL Database
        • MySQL Database
        • Azure Cosmos DB
      • Google Cloud Pub/Sub
        • messaging service for exchanging event data among applications and services [1]
      • Amazon Kinesis Data Streams
        • collect, process, and analyze real-time, streaming data [1]
      • Confluent Cloud Kafka
        • fully managed service based on Apache Kafka for stream processing [1]
      • Fabric workspace events
        • events triggered by changes in Fabric Workspace
          • e.g. creating, updating, or deleting items. 
        • allows capturing, transforming, and routing events for in-depth analysis and monitoring within Fabric [1]
        • the integration offers enhanced flexibility in tracking and understanding workspace activities [1]
      • Azure blob storage events
        • system triggers for actions like creating, replacing, or deleting a blob [1]
          • these actions are linked to Fabric events
            • allowing Blob Storage events to be processed as continuous data streams for routing and analysis within Fabric [1]
        • support streamed or unstreamed events [1]
      • custom endpoint
        • a REST API or SDKs can be used to send event data from a custom app to the eventstream [1]
        • allows specifying the data format and the consumer group of the custom app [1]
      • sample data
        • out-of-box sample data
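The CDC-enabled sources above all rest on the same idea: detecting inserts, updates, and deletes as they happen in a database. As a rough conceptual illustration only (real connectors read the database's transaction log, not snapshots, and this mimics no Fabric API), a minimal snapshot-diff sketch in Python:

```python
def capture_changes(before: dict, after: dict) -> list[dict]:
    """Naive CDC-style diff between two table snapshots keyed by primary key.

    Illustrates the kinds of change events (insert/update/delete) a CDC
    source emits; production CDC reads the transaction log instead.
    """
    changes = []
    for key, row in after.items():
        if key not in before:
            changes.append({"op": "insert", "key": key, "row": row})
        elif before[key] != row:
            changes.append({"op": "update", "key": key, "row": row})
    for key in before:
        if key not in after:
            changes.append({"op": "delete", "key": key})
    return changes

before = {1: {"qty": 5}, 2: {"qty": 3}}
after = {1: {"qty": 7}, 3: {"qty": 1}}
for change in capture_changes(before, after):
    print(change)  # → update(1), insert(3), delete(2)
```

Each emitted change record corresponds to one event that would flow into the eventstream for downstream routing.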
    • destinations
      • where transformed event data is stored. 
        • in a table in an eventhouse or a lakehouse [1]
        • redirect data to 
          • another eventstream for further processing [1]
          • an activator to trigger an action [1]
      • Eventhouse
        • offers the capability to funnel your real-time event data into a KQL database [1]
      • Lakehouse
        • allows preprocessing real-time events before their ingestion into the lakehouse
          • the events are transformed into Delta Lake format and later stored in specific lakehouse tables [1]
            • facilitating the data warehousing needs [1]
      • custom endpoint
        • directs real-time event traffic to a bespoke application [1]
        • enables the integration of proprietary applications with the event stream, allowing for the immediate consumption of event data [1]
        • {scenario} aim to transfer real-time data to an independent system not hosted in Microsoft Fabric [1]
      • Derived Stream
        • specialized destination created after applying stream operations such as Filter or Manage Fields to an eventstream
        • represents the altered default stream after processing, which can be routed to various destinations within Fabric and monitored in the Real-Time hub [1]
      • Fabric Activator
        • enables triggering automated actions based on values in streaming data [1]
    • transformations
      • filter or aggregate the data as it is processed from the stream [1]
      • include common data operations
        • filtering
          • filter events based on the value of a field in the input
          • depending on the data type (number or text), the transformation keeps the values that match the selected condition, such as is null or is not null [1]
        • joining
          • transformation that combines data from two streams based on a matching condition between them [1]
        • aggregating
          • calculates an aggregation every time a new event occurs over a period of time [1]
            • Sum, Minimum, Maximum, or Average
          • allows renaming calculated columns, and filtering or slicing the aggregation based on other dimensions in your data [1]
          • one can have one or more aggregations in the same transformation [1]
        • grouping
          • allows calculating aggregations across all events within a certain time window [1]
            • one can group by the values in one or more fields [1]
          • allows for the renaming of columns
            • similar to the Aggregate transformation 
            • ⇐ provides more options for aggregation and includes more complex options for time windows [1]
          • allows adding more than one aggregation per transformation [1]
            • allows defining the logic needed for processing, transforming, and routing event data [1]
        • union
          • allows connecting two or more nodes and adding events with shared fields (with the same name and data type) into one table [1]
            • fields that don't match are dropped and not included in the output [1]
        • expand
          • array transformation that allows creating a new row for each value within an array [1]
        • manage fields
          • allows one to add, remove, rename, or change the data type of fields coming in from an input or another transformation [1]
        • temporal windowing functions 
          • enable analyzing data events within discrete time periods [1]
          • a way to perform operations on the data contained in temporal windows [1]
            • e.g. aggregating, filtering, or transforming streaming events that occur within a specified time period [1]
            • allow analyzing streaming data that changes over time [1]
              • e.g. sensor readings, web-clicks, on-line transactions, etc.
              • provide great flexibility to keep an accurate record of events as they occur [1]
          • {type} tumbling windows
            • divides incoming events into fixed and nonoverlapping intervals based on arrival time [1]
          • {type} sliding windows 
            • divides events into fixed and overlapping intervals based on time [1]
          • {type} session windows 
            • divides events into variable and nonoverlapping intervals that are based on a gap of inactivity [1]
          • {type} hopping windows
            • are different from tumbling windows as they model scheduled overlapping windows [1]
          • {type} snapshot windows 
            • groups events that have the same timestamp [1]
              • unlike the other windowing functions, which require a named function, a snapshot window is applied by adding System.Timestamp() to the GROUP BY clause [1]
          • {parameter} window duration
            • the length of each window interval [1]
            • can be in seconds, minutes, hours, and even days [1]
          • {parameter} window offset
            • optional parameter that shifts the start and end of each window interval by a specified amount of time [1]
          • {concept} grouping key
            • one or more columns in your event data that you wish to group by [1]
          • {concept} aggregation function
            • one or more of the functions applied to each group of events in each window [1]
              • where the counts, sums, averages, min/max, and even custom functions become useful [1]
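To make the windowing concepts concrete, the following self-contained Python sketch assigns events to tumbling windows by timestamp and computes a Sum aggregation per window and grouping key. It mimics the behavior of a windowed Group By, not any Fabric API; field names and the 10-second duration are illustrative:

```python
from collections import defaultdict

def tumbling_sum(events, window_duration, key_field, value_field):
    """Sum a field over fixed, nonoverlapping (tumbling) windows.

    events: iterable of dicts with a numeric 'timestamp' (in seconds).
    Returns {(window_start, group_key): sum_of_values}. A hopping window
    would differ only by advancing window_start in steps smaller than
    window_duration, so windows overlap.
    """
    result = defaultdict(float)
    for e in events:
        # Each event falls into exactly one fixed interval.
        window_start = (e["timestamp"] // window_duration) * window_duration
        result[(window_start, e[key_field])] += e[value_field]
    return dict(result)

events = [
    {"timestamp": 1, "sensor": "A", "value": 2.0},
    {"timestamp": 4, "sensor": "A", "value": 3.0},
    {"timestamp": 11, "sensor": "A", "value": 1.0},  # next 10-second window
]
print(tumbling_sum(events, 10, "sensor", "value"))
# {(0, 'A'): 5.0, (10, 'A'): 1.0}
```

Here `sensor` plays the role of the grouping key, `10` the window duration, and the summation the aggregation function described above.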
    • see the event data flowing through the pipeline in real-time [1]
    • handles the scaling, reliability, and security of the event stream automatically [1]
      • no need to write any code or manage any infrastructure [1]
    • {feature} eventstream editing canvas
      • used to 
        • add and manage sources and destinations [1]
        • see the event data [1]
        • check the data insights [1]
        • view logs for each source or destination [1]
  • {feature} Apache Kafka endpoint on the Eventstream item
    • {benefit} enables users to connect and consume streaming events through the Kafka protocol [2]
      • applications using the protocol can send or receive streaming events with specific topics [2]
      • requires updating the connection settings to use the Kafka endpoint provided in the Eventstream [2]
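Because the endpoint speaks the Kafka protocol, a standard Kafka client can typically be pointed at it by swapping in the endpoint's connection settings. The sketch below builds a client configuration following the SASL/PLAIN pattern common to Event Hubs-style Kafka endpoints; the bootstrap server and connection string are placeholders that must be copied from the eventstream's endpoint details, and the exact settings should be verified against the documentation:

```python
# Placeholders; copy the real values from the eventstream's Kafka endpoint
# pane (this sketch builds a config dict and does not connect anywhere).
BOOTSTRAP_SERVER = "<namespace>.servicebus.windows.net:9093"
CONNECTION_STRING = "<connection-string-from-the-portal>"

def kafka_producer_config(bootstrap_server: str, connection_string: str) -> dict:
    """Build a Kafka client configuration dict (e.g. for confluent-kafka)
    using the SASL_SSL / PLAIN pattern of Event Hubs-compatible endpoints."""
    return {
        "bootstrap.servers": bootstrap_server,
        "security.protocol": "SASL_SSL",
        "sasl.mechanism": "PLAIN",
        "sasl.username": "$ConnectionString",  # literal username convention
        "sasl.password": connection_string,
    }

config = kafka_producer_config(BOOTSTRAP_SERVER, CONNECTION_STRING)
# A real client would then be created as, e.g., Producer(config) with
# confluent-kafka, and produce to the topic shown in the endpoint details.
```

This is what "updating the connection settings" amounts to in practice: only the configuration changes, not the producing or consuming application code.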
  • {feature} support runtime logs and data insights for the connector sources in Live View mode [3]
    • allows examining detailed logs generated by the connector engines for a specific connector [3]
      • helps identify failure causes or warnings [3]
      • ⇐ accessible in the bottom pane of an eventstream by selecting the relevant connector source node on the canvas in Live View mode [3]
  • {feature} support data insights for the connector sources in Live View mode [3]
  • {feature} integrates eventstreams with CI/CD tools
    • {benefit} developers can efficiently build and maintain eventstreams end-to-end in a web-based environment, while ensuring source control and smooth versioning across projects [3]
  • {feature} REST APIs
    • allow automating and managing eventstreams programmatically
      • {benefit} simplify CI/CD workflows and make it easier to integrate eventstreams with external applications [3]
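As a sketch of what programmatic management looks like, the snippet below builds (but does not send) a request to list the items of a Fabric workspace using only the standard library. The URL pattern follows the public Fabric REST API convention but should be verified against the official reference; the workspace ID and the Microsoft Entra ID bearer token are placeholders:

```python
import urllib.request

def list_workspace_items_request(workspace_id: str, token: str) -> urllib.request.Request:
    """Build a GET request for the items of a Fabric workspace.

    The endpoint path is assumed from the documented Fabric REST API
    convention; sending it requires a valid Entra ID access token.
    """
    url = f"https://api.fabric.microsoft.com/v1/workspaces/{workspace_id}/items"
    return urllib.request.Request(
        url, headers={"Authorization": f"Bearer {token}"}, method="GET"
    )

req = list_workspace_items_request("<workspace-id>", "<token>")
# urllib.request.urlopen(req) would execute the call and return JSON
# describing the workspace's items, eventstreams included.
```

The same request/response pattern (with POST/PATCH/DELETE verbs) underlies the automation and CI/CD scenarios mentioned above.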
  • {recommendation} use the eventstreams feature with at least an F4 SKU [2]
  • {limitation} maximum message size: 1 MB [2]
  • {limitation} maximum retention period of event data: 90 days [2]
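Given the 1 MB message limit, a producer-side guard can reject or split oversized payloads before they are sent. The helper below is illustrative (only the 1 MB figure comes from the documentation; whether the limit is measured as 10^6 or 2^20 bytes should be checked against it):

```python
import json

MAX_MESSAGE_BYTES = 1024 * 1024  # documented 1 MB limit, assumed as 2^20 bytes

def within_size_limit(event: dict) -> bool:
    """Check the serialized event size against the message limit before
    sending, so oversized payloads can be trimmed or split upstream."""
    payload = json.dumps(event).encode("utf-8")
    return len(payload) <= MAX_MESSAGE_BYTES

print(within_size_limit({"sensor": "A", "value": 2.0}))  # True
print(within_size_limit({"blob": "x" * 2_000_000}))      # False
```

Checking size at the producer avoids silent drops or errors at ingestion time.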

References:
[1] Microsoft Learn (2024) Microsoft Fabric: Use real-time eventstreams in Microsoft Fabric [link]
[2] Microsoft Learn (2025) Microsoft Fabric: Fabric Eventstream - overview [link]
[3] Microsoft Learn (2024) Microsoft Fabric: What's new in Fabric event streams? [link]
[4] Microsoft Learn (2025) Real Time Intelligence L200 Pitch Deck [link]

Acronyms:
API - Application Programming Interface
CDC - Change Data Capture
CI/CD  - Continuous Integration/Continuous Delivery
DB - database
IoT - Internet of Things
KQL - Kusto Query Language
RTI - Real-Time Intelligence
SaaS - Software-as-a-Service
SDK - Software Development Kit
SKU - Stock Keeping Unit
