Disclaimer: This is work in progress intended to consolidate information from various sources for learning purposes. For the latest information please consult the documentation (see the links below)!
Last updated: 24-May-2025
-- create schema
CREATE SCHEMA IF NOT EXISTS <lakehouse_name>.<schema_name>

-- create a materialized lake view
CREATE MATERIALIZED LAKE VIEW IF NOT EXISTS <lakehouse_name>.<schema_name>.<view_name>
(
    CONSTRAINT <constraint_name> CHECK (<constraint>) ON MISMATCH DROP
)
AS
SELECT ...
FROM ...
-- WHERE ...
-- GROUP BY ...
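To make the template above concrete, here is a minimal sketch of a silver-layer view. It assumes a hypothetical lakehouse sales_lh with an existing bronze.orders table; every object, column and constraint name is illustrative and does not come from the referenced sources. Depending on the notebook environment, each statement may need to run in its own cell.

```sql
-- hypothetical names throughout: sales_lh, bronze.orders, silver.orders_clean
CREATE SCHEMA IF NOT EXISTS sales_lh.silver;

-- silver layer: keep only orders with a valid key and a non-negative amount;
-- ON MISMATCH DROP removes violating rows instead of failing the run
CREATE MATERIALIZED LAKE VIEW IF NOT EXISTS sales_lh.silver.orders_clean
(
    CONSTRAINT order_id_not_null CHECK (order_id IS NOT NULL) ON MISMATCH DROP,
    CONSTRAINT amount_not_negative CHECK (amount >= 0) ON MISMATCH DROP
)
AS
SELECT order_id
     , customer_id
     , CAST(order_date AS DATE) AS order_date
     , amount
FROM sales_lh.bronze.orders;
```

Each constraint is named so that, presumably, it can be identified individually in the data quality report.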
[Microsoft Fabric] Materialized Lake Views (MLV)
- {def} persisted, continuously updated view of data [1]
- {benefit} allows building declarative data pipelines using SQL, complete with built-in data quality rules and automatic monitoring of data transformations
- simplifies the implementation of multi-stage Lakehouse processing [1]
- streamline data workflows
- enable developers to focus on business logic [1]
- ⇐ not on infrastructural or data quality-related issues [1]
- the views can be created in a notebook [2]
- can have data quality constraints enforced and visualized for every run, showing completion status and conformance to data quality constraints defined in a single view [1]
- empowers developers to set up complex data pipelines with just a few SQL statements, while Fabric handles the rest automatically [1]
- faster development cycles
- trustworthy data
- quicker insights
- {goal} process only the new or changed data instead of reprocessing everything each time [1]
- ⇐ leverages Delta Lake’s CDF under the hood
- ⇒ it can update just the portions of data that changed rather than recompute the whole view from scratch [1]
- {operation} creation
- allows defining transformations at each layer [1]
- e.g. aggregation, projection, filters
- allows specifying certain checks that the data must meet [1]
- incorporate data quality constraints directly into the pipeline definition
- via CREATE MATERIALIZED LAKE VIEW
- the SQL syntax is declarative and Fabric figures out how to produce and maintain it [1]
- {operation} refresh
- refreshes only when its source has new data [1]
- if there’s no change, it can skip running entirely (saving time and resources) [1]
- {feature} automatically generate a visual report that shows trends on data quality constraints
- {benefit} makes it easy to identify the checks that introduce the most errors and the associated MLVs, which simplifies troubleshooting [1]
- {feature} can be combined with the Shortcut Transformation feature for CSV ingestion
- {benefit} allows building an end-to-end Medallion architecture
- {feature} dependency graph
- shows the dependencies that exist between the various objects [2]
- ⇐ automatically generated [2]
- {feature} data quality report
- built-in Power BI dashboard that shows several aggregated metrics [2]
- {feature|planned} support for PySpark
- {feature|planned} incremental refresh
- {feature|planned} integration with Data Activator
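The multi-stage processing and refresh behavior described in the notes above can be sketched by stacking one view on another. This is a minimal sketch under stated assumptions: the lakehouse sales_lh, the silver view orders_clean and all column names are hypothetical, and the on-demand REFRESH statement reflects the syntax as I understand it from the Microsoft Learn documentation, so it should be verified against the current docs before use.

```sql
-- hypothetical names throughout; sales_lh.silver.orders_clean is assumed to exist already
CREATE SCHEMA IF NOT EXISTS sales_lh.gold;

-- gold layer: daily revenue aggregated from the silver view;
-- referencing another MLV in the FROM clause is what lets the dependency
-- between the two objects be detected
CREATE MATERIALIZED LAKE VIEW IF NOT EXISTS sales_lh.gold.daily_revenue
AS
SELECT order_date
     , COUNT(*) AS order_count
     , SUM(amount) AS revenue
FROM sales_lh.silver.orders_clean
GROUP BY order_date;

-- assumption: on-demand refresh; FULL requests a complete recompute
-- rather than relying on change detection
REFRESH MATERIALIZED LAKE VIEW sales_lh.gold.daily_revenue FULL;
```

Because the gold view only declares what it selects from, the order in which the views are processed does not have to be spelled out anywhere; it follows from the dependency graph.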
References:
[1] Microsoft Fabric Update Blog (2025) Simplifying Medallion Implementation with Materialized Lake Views in Fabric [link|aka]
[2] Power BI Tips (2025) Microsoft Fabric Notebooks with Materialized Views - Quick Tips [link]
[3] Microsoft Learn (2025) [link]
Acronyms:
CDF - Change Data Feed
ETL - Extract, Transform, Load
MF - Microsoft Fabric
MLV - Materialized Lake View