Showing posts with label Eventstream. Show all posts
Showing posts with label Eventstream. Show all posts

08 March 2025

🏭🎗️🗒️Microsoft Fabric: Real-Time Intelligence (RTI) [Notes]

Disclaimer: This is work in progress intended to consolidate information from various sources for learning purposes. For the latest information please consult the documentation (see the links below)! 

Last updated: 8-Mar-2025

[Microsoft Fabric] Real-Time Intelligence [RTI]

  • [def] a real-time SaaS platform within Microsoft Fabric 
    • service that empowers users to extract insights and visualize data in motion
    • offers an end-to-end solution for 
      • event-driven scenarios
        • ⇐ rather than schedule-driven solutions. 
      • streaming data
      • data logs
    • {benefit} help companies accelerate speed and precision of business by providing [4]
      • {goal} operational efficiency
        • by allowing to streamline processes and make data driven decisions with accurate, up to date information [4]
      • {goal} end-to-end visibility
        • by allowing to gain a holistic understanding of business health and discover actionable insights for timely action [4]
      • {goal} competitive advantage
        • by allowing to quickly react to shifting market trends, identify opportunities and mitigate risk in real time [4]
    • seamlessly connects time-based data from various sources using no-code connectors [1]
      • enables immediate 
        • visual insights
        • geospatial analysis
        • trigger-based reactions 
        • ⇐ all are part of an organization-wide data catalog [1]
      • ⇐ time oriented data is difficult to manage, yet critical for success [4]
        • {challenge} capture high throughput data from disparate sources in real time [4]
        • {challenge} model scenarios using event data [4]
        • {challenge} choose from an array of bespoke technologies and data formats [4]
        • {challenge} leverage the power of AI against data in real time [4]
        • without the ability to leverage time oriented data, businesses are vulnerable to risks [4]
          • {risk} poor decision-making
          • {risk} financial loss
          • {risk} reduced operational efficiency
          • {risk} impaired data integrity
          • {risk} non-compliance
          • {risk} negative user experience
      • {capability} single unified SaaS solution
        • in opposition to a fragmented, fragile tech stack
      • {capability} accessible data and analytics tools
        • in opposition to advanced skillsets required
      • {capability} real-time stream processing
        • in opposition to batch data processing
    • once  a stream of data is connected, the entire SaaS solution becomes accessible [1]
    • handles 
      • data ingestion
      • data transformation
      • data storage
      • data analytics
      • data visualization
      • data tracking
      • AI
      • real-time actions
    • can be used for 
      • data analysis
      • immediate visual insights
      • centralization of data in motion for an organization
      • actions on data
      • efficient querying, transformation, and storage of large volumes of structured or unstructured data [1]
  • helps evaluate data from 
    • IoT systems
    • system logs
    • free text
    • semi structured data, or contribute data for consumption by others in your organization, 
  • provides a versatile solution
    • transforms the data into a dynamic, actionable resource that drives value across the entire organization
  • its components are built on trusted, core Microsoft rather than schedule-driven solutions 
    • ⇐ together they extend the overall Fabric capabilities to provide event-driven solutions [1]
  • {feature} Real-Time hub 
    • serves as a centralized catalog that facilitates the easy access, addition, exploration, and data sharing [1]
    • expands the range of data sources
      • ⇐ it enables broader insights and visual clarity across various domains [1]
    • ensures that data is accessible to all [1]
      • promoting quick decision-making and informed action
    • the sharing of streaming data from diverse sources unlocks the potential to build BI solutions across the organization [1]
    • use the data consumption tools to explore the data [1]
  • {feature} Real-Time dashboards 
    • come equipped with out-of-the-box interactions 
      • {benefit} simplify the process of understanding data, making it accessible to anyone who wants to make decision based on data in motion using visual tools, Natural Language and Copilot
  • {feature} Fabric Activator
    • {benefit} allows to turn insights into actions by setting up alerts from various parts of Fabric to react to data patterns or conditions in real-time [1]
  • {feature} Real-Time hub events 
    • a catalog of data in motionless
    • contains:
      • data streams 
        • all data streams that are actively running in Fabric to which the user has access to
      • Microsoft sources: 
        • easily discover streaming sources that the users have and quickly configure ingestion of those sources into Fabric
          • e.g. Azure Event Hubs, Azure IoT Hub, Azure SQL DB CDC, Azure Cosmos DB CDC, PostgreSQL DB CDC
      • Fabric events
        • event-driven capabilities support real-time notifications and data processing 
          • ⇒ one can monitor and react to events [1]
            • e.g. Fabric Workspace Item events, Azure Blob Storage events
          • ⇐ the events can be used to trigger other actions or workflows [1]
            • e.g. invoking a data pipeline or sending a notification via email. 
        • the events can be sent to other destinations via eventstreams [1]
  • {feature} Eventstreams
    • event processing capabilities 
    • {benefit} allow to capture, transform, and route high volumes of real-time events to various destinations with a no-code experience [1]
    • support multiple data sources and data destinations [1]
    • {benefit} allow to do filtering, data cleansing, transformation, windowed aggregations, and dupe detection, to land the data in the needed shape [1]
    • one can use the content-based routing capabilities to send data to different destinations based on filters [1]
    • derived eventstreams allows constructing new streams as a result of transformations and/or aggregations that can be shared to consumers in Real-Time hub [1]
  • {feature} Eventhouses
    • the ideal analytics engine to process data in motion
    • tailored to time-based, streaming events with structured, semi structured, and unstructured data [1]
    • data is automatically indexed and partitioned based on ingestion time
      • ⇐ provides fast and complex analytic querying capabilities on high-granularity data [1]
    • the stored data can be made available in OneLake for consumption by other Fabric experiences [1]
      • ⇐ the data is ready for lightning-fast query using various code, low-code, or no-code options in Fabric [1]
    • the data can be queried in native KQL or in T-SQL in the KQL query set [1]
References:
[1] Microsoft Fabric (2024) What is Real-Time Intelligence? [link]
[2] Microsoft Fabric (2024) Real-Time Intelligence documentation in Microsoft Fabric [link
[3] Microsoft Fabric Updates Blog (2024) Fabric workloads are now generally available! [link]
[4] Microsoft Learn (2025) Real Time Intelligence L200 Pitch Deck [link]

Acronyms:
AI - Artificial Intelligence
CDC - Change Data Capture
IoT - Internet of Things
KQL  - Kusto Query Language
RTI - Real-Time Intelligence
SaaS - Software-as-a-Service
SQL - Structured Query Language

🏭🎗️🗒️Microsoft Fabric: Eventstreams [Notes]

Disclaimer: This is work in progress intended to consolidate information from various sources for learning purposes. For the latest information please consult the documentation (see the links below)! 

Last updated: 8-Mar-2025

Real-Time Intelligence architecture
Real-Time Intelligence architecture [4]

[Microsoft Fabric] Eventstream(s)

  • {def} feature in Microsoft Fabric's Real-Time Intelligence experience, that allows to bring real-time events into Fabric
    • bring real-time events into Fabric, transform them, and then route them to various destinations without writing any code 
      • ⇐ aka no-code solution
      • {feature} drag and drop experience 
        • gives users an intuitive and easy way to create your event data processing, transforming, and routing logic without writing any code 
    • work by creating a pipeline of events from multiple internal and external sources to different destinations
      • a conveyor belt that moves data from one place to another [1]
      • transformations to the data can be added along the way [1]
        • filtering, aggregating, or enriching
  • {def} eventstream
    • an instance of the Eventstream item in Fabric [2]
    • {feature} end-to-end data flow diagram 
      • provide a comprehensive understanding of the data flow and organization [2].
  • {feature} eventstream visual editor
    • used to design pipelines by dragging and dropping different nodes [1]
    • sources
      • where event data comes from 
      • one can choose
        • the source type
        • the data format
        • the consumer group
      • Azure Event Hubs
        • allows to get event data from an Azure event hub [1]
        • allows to create a cloud connection  with the appropriate authentication and privacy level [1]
      • Azure IoT Hub
        • SaaS service used to connect, monitor, and manage IoT assets with a no-code experience [1]
      • CDC-enabled databases
        • software process that identifies and tracks changes to data in a database, enabling real-time or near-real-time data movement [1]
        • Azure SQL Database 
        • PostgreSQL Database
        • MySQL Database
        • Azure Cosmos DB
      • Google Cloud Pub/Sub
        • messaging service for exchanging event data among applications and services [1]
      • Amazon Kinesis Data Streams
        • collect, process, and analyze real-time, streaming data [1]
      • Confluent Cloud Kafka
        • fully managed service based on Apache Kafka for stream processing [1]
      • Fabric workspace events
        • events triggered by changes in Fabric Workspace
          • e.g. creating, updating, or deleting items. 
        • allows to capture, transform, and route events for in-depth analysis and monitoring within Fabric [1]
        • the integration offers enhanced flexibility in tracking and understanding workspace activities [1]
      • Azure blob storage events
        • system triggers for actions like creating, replacing, or deleting a blob [1]
          • these actions are linked to Fabric events
            • allowing to process Blob Storage events as continuous data streams for routing and analysis within Fabric  [1]
        • support streamed or unstreamed events [1]
      • custom endpoint
        • REST API or SDKs can be used to send event data from custom app to eventstream [1]
        • allows to specify the data format and the consumer group of the custom app [1]
      • sample data
        • out-of-box sample data
    • destinations
      • where transformed event data is stored. 
        • in a table in an eventhouse or a lakehouse [1]
        • redirect data to 
          • another eventstream for further processing [1]
          • an activator to trigger an action [1]
      • Eventhouse
        • offers the capability to funnel your real-time event data into a KQL database [1]
      • Lakehouse
        • allows to preprocess real-time events before their ingestion in the lakehouse
          • the events are transformed into Delta Lake format and later stored in specific lakehouse tables [1]
            • facilitating the data warehousing needs [1]
      • custom endpoint
        • directs real-time event traffic to a bespoke application [1]
        • enables the integration of proprietary applications with the event stream, allowing for the immediate consumption of event data [1]
        • {scenario} aim to transfer real-time data to an independent system not hosted on the Microsoft Fabric [1]
      • Derived Stream
        • specialized destination created post-application of stream operations like Filter or Manage Fields to an eventstream
        • represents the altered default stream after processing, which can be routed to various destinations within Fabric and monitored in the Real-Time hub [1]
      • Fabric Activator
        • enables to use Fabric Activator to trigger automated actions based on values in streaming data [1]
    • transformations
      • filter or aggregate the data as is processed from the stream [1]
      • include common data operations
        • filtering
          • filter events based on the value of a field in the input
          • depending on the data type (number or text), the transformation keeps the values that match the selected condition, such as is null or is not null [1]
        • joining
          • transformation that combines data from two streams based on a matching condition between them [1]
        • aggregating
          • calculates an aggregation every time a new event occurs over a period of time [1]
            • Sum, Minimum, Maximum, or Average
          • allows renaming calculated columns, and filtering or slicing the aggregation based on other dimensions in your data [1]
          • one can have one or more aggregations in the same transformation [1]
        • grouping
          • allows to calculate aggregations across all events within a certain time window [1]
            • one can group by the values in one or more fields [1]
          • allows for the renaming of columns
            • similar to the Aggregate transformation 
            • ⇐ provides more options for aggregation and includes more complex options for time windows [1]
          • allows to add more than one aggregation per transformation [1]
            • allows to define the logic needed for processing, transforming, and routing event data [1]
        • union
          • allows to connect two or more nodes and add events with shared fields (with the same name and data type) into one table [1]
            • fields that don't match are dropped and not included in the output [1]
        • expand
          • array transformation that allows to create a new row for each value within an array [1]
        • manage fields
          • allows to add, remove, change data type, or rename fields coming in from an input or another transformation [1]
        • temporal windowing functions 
          • enable to analyze data events within discrete time periods [1]
          • way to perform operations on the data contained in temporal windows [1]
            • e.g. aggregating, filtering, or transforming streaming events that occur within a specified time period [1]
            • allow analyzing streaming data that changes over time [1]
              • e.g. sensor readings, web-clicks, on-line transactions, etc.
              • provide great flexibility to keep an accurate record of events as they occur [1]
          • {type} tumbling windows
            • divides incoming events into fixed and nonoverlapping intervals based on arrival time [1]
          • {type} sliding windows 
            • take the events into fixed and overlapping intervals based on time and divides them [1]
          • {type} session windows 
            • divides events into variable and nonoverlapping intervals that are based on a gap of lack of activity [1]
          • {type} hopping windows
            • are different from tumbling windows as they model scheduled overlapping window [1]
          • {type} snapshot windows 
            • group event stream events that have the same timestamp and are unlike the other windowing functions, which require the function to be named [1]
            • one can add the System.Timestamp() to the GROUP BY clause [1]
          • {type} window duration
            • the length of each window interval [1]
            • can be in seconds, minutes, hours, and even days [1]
          • {parameter} window offset
            • optional parameter that shifts the start and end of each window interval by a specified amount of time [1]
          • {concept} grouping key
            • one or more columns in your event data that you wish to group by [1]
          • aggregation function
            • one or more of the functions applied to each group of events in each window [1]
              • where the counts, sums, averages, min/max, and even custom functions become useful [1]
    • see the event data flowing through the pipeline in real-time [1]
    • handles the scaling, reliability, and security of event stream automatically [1]
      • no need to write any code or manage any infrastructure [1]
    • {feature} eventstream editing canvas
      • used to 
        • add and manage sources and destinations [1]
        • see the event data [1]
        • check the data insights [1]
        • view logs for each source or destination [1]
  • {feature} Apache Kafka endpoint on the Eventstream item
    • {benefit} enables users to connect and consume streaming events through the Kafka protocol [2]
      • application using the protocol can send or receive streaming events with specific topics [2]
      • requires updating the connection settings to use the Kafka endpoint provided in the Eventstream [2]
  • {feature} support runtime logs and data insights for the connector sources in Live View mode [3]
    • allows to examine detailed logs generated by the connector engines for the specific connector [3]
      • help with identifying failure causes or warnings [3]
      • ⇐ accessible in the bottom pane of an eventstream by selecting the relevant connector source node on the canvas in Live View mode [3]
  • {feature} support data insights for the connector sources in Live View mode [3]
  • {feature} integrates eventstreams CI/CD tools
      • {benefit} developers can efficiently build and maintain eventstreams from end-to-end in a web-based environment, while ensuring source control and smooth versioning across projects [3]
  • {feature} REST APIs
    •  allow to automate and manage eventstreams programmatically
      • {benefit} simplify CI/CD workflows and making it easier to integrate eventstreams with external applications [3]
  • {recommendation} use event streams feature with at least SKU: F4 [2]
  • {limitation} maximum message size: 1 MB [2]
  • {limitation} maximum retention period of event data: 90 days [2]

References:
[1] Microsoft Learn (2024) Microsoft Fabric: Use real-time eventstreams in Microsoft Fabric [link]
[2] Microsoft Learn (2025) Microsoft Fabric: Fabric Eventstream - overview [link]
[3] Microsoft Learn (2024) Microsoft Fabric: What's new in Fabric event streams? [link]
[4] Microsoft Learn (2025) Real Time Intelligence L200 Pitch Deck [link]

Acronyms:
API - Application Programming Interface
CDC - Change Data Capture
CI/CD  - Continuous Integration/Continuous Delivery
DB - database
IoT - Internet of Things
KQL - Kusto Query Language
RTI - Real-Time Intelligence
SaaS - Software-as-a-Service
SDK - Software Development Kit
SKU - Stock Keeping Unit

Related Posts Plugin for WordPress, Blogger...

About Me

My photo
Koeln, NRW, Germany
IT Professional with more than 25 years experience in IT in the area of full life-cycle of Web/Desktop/Database Applications Development, Software Engineering, Consultancy, Data Management, Data Quality, Data Migrations, Reporting, ERP implementations & support, Team/Project/IT Management, etc.