24 May 2025

🏭🗒️Microsoft Fabric: Materialized Lake Views (MLV) [Notes] 🆕🗓️

Disclaimer: This is a work in progress intended to consolidate information from various sources for learning purposes. For the latest information, please consult the documentation (see the links below)! 

Last updated: 27-Jul-2025

-- create schema
CREATE SCHEMA IF NOT EXISTS <lakehouse_name>.<schema_name>

-- create a materialized lake view
CREATE MATERIALIZED LAKE VIEW IF NOT EXISTS <lakehouse_name>.<schema_name>.<view_name> 
[(
    CONSTRAINT <constraint_name> CHECK (<condition>) ON MISMATCH <DROP/FAIL> 
)] 
[PARTITIONED BY (col1, col2, ...)] 
[COMMENT "description or comment"] 
[TBLPROPERTIES ("key1"="val1", "key2"="val2")] 
AS 
SELECT ...
FROM ...
-- WHERE ...
-- GROUP BY ...
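
For illustration, a concrete instance of the syntax above, as a minimal sketch; the lakehouse, schema, table and column names are hypothetical:

-- example: gold-layer aggregation with a data quality constraint (hypothetical names)
CREATE MATERIALIZED LAKE VIEW IF NOT EXISTS SalesLakehouse.gold.orders_by_region 
(
    CONSTRAINT region_not_null CHECK (region IS NOT NULL) ON MISMATCH DROP 
) 
COMMENT "daily order totals per region" 
AS 
SELECT region, order_date, SUM(total_amount) AS total_amount 
FROM SalesLakehouse.silver.orders 
GROUP BY region, order_date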

[Microsoft Fabric] Materialized Lake Views (MLVs)

  • {def} persisted, continuously updated view of data [1]
    • {benefit} allows building declarative data pipelines using SQL, complete with built-in data quality rules and automatic monitoring of data transformations
      • simplifies the implementation of multi-stage Lakehouse processing [1]
        • ⇐ aids in the creation, management, and monitoring of views [3]
        • ⇐ improves transformations through a declarative approach [3]
        • streamline data workflows
        • enable developers to focus on business logic [1]
          • ⇐ not on infrastructural or data quality-related issues [1]
        • the views can be created in a notebook [2]
    • {benefit} allows developers to visualize lineage across all entities in the lakehouse, view the dependencies, and track execution progress [3]
      • data quality constraints can be enforced and visualized for every run, showing completion status and conformance to the defined data quality constraints in a single view [1]
      • empowers developers to set up complex data pipelines with just a few SQL statements and then handle the rest automatically [1]
        • faster development cycles 
        • trustworthy data
        • quicker insights
  • {goal} process only the new or changed data instead of reprocessing everything each time [1]
    • ⇐ leverages Delta Lake’s CDF under the hood (see the CDF sketch after these notes)
      • ⇒ it can update just the portions of data that changed rather than recompute the whole view from scratch [1]
  • {operation} creation
    • allows defining transformations at each layer [1]
      • e.g. aggregation, projection, filters
    • allows specifying certain checks that the data must meet [1]
      • incorporate data quality constraints directly into the pipeline definition
    • via CREATE MATERIALIZED LAKE VIEW
      • the SQL syntax is declarative and Fabric figures out how to produce and maintain it [1]
  • {operation} refresh
    • refreshes only when its source has new data [1]
      • if there’s no change, it can skip running entirely (saving time and resources) [1]
    • via REFRESH MATERIALIZED LAKE VIEW [workspace.lakehouse.schema].MLV_Identifier [FULL]; (see the example after these notes)
  • {operation} list views from schema [3] (the management commands are shown together in an example after these notes)
    • via SHOW MATERIALIZED LAKE VIEWS <IN/FROM> Schema_Name;
  • {operation} retrieve definition
    • via SHOW CREATE MATERIALIZED LAKE VIEW MLV_Identifier;
  • {operation} rename view
    • via ALTER MATERIALIZED LAKE VIEW MLV_Identifier RENAME TO MLV_Identifier_New;
  • {operation} drop view
    • via DROP MATERIALIZED LAKE VIEW MLV_Identifier;
    • {warning} dropping or renaming a materialized lake view affects the lineage view and scheduled refresh [3]
    • {recommendation} update the reference in all dependent materialized lake views [3]
  • {operation} schedule view run
    • lets users set how often the MLV should be refreshed based on business needs and lineage execution timing [5]
    • depends on
      • data update frequency: the frequency with which the data is updated [5]
      • query performance requirements: the business requirement to refresh the data at defined intervals [5]
      • system load: optimizing the time to run the lineage without overloading the system [5]
  • {operation} view run history
    • users can access the last 25 runs including lineage and run metadata
      • available from the dropdown for monitoring and troubleshooting
  • {concept} lineage
    • the sequence of MLVs that needs to be executed to refresh the MLV once new data is available [5]
  • {feature} automatically generated visual report that shows trends on data quality constraints 
    • {benefit} allows easily identifying the checks that introduce the most errors and the associated MLVs for easy troubleshooting [1]
  • {feature} can be combined with the Shortcut Transformation feature for CSV ingestion 
    • {benefit} facilitates the building of end-to-end Medallion architectures
  • {feature} dependency graph
    • allows seeing the dependencies between the various objects [2]
      • ⇐ automatically generated [2]
  • {feature} data quality
    • {benefit} allows composing precise queries to exclude poor-quality data from the source tables [5]
    • [medallion architecture] ensuring data quality is essential at every stage of the architecture [5]
    • maintained by setting constraints when defining the MLVs [5] (see the example after these notes)
    • {action} FAIL
      • stops refreshing an MLV if any constraint is violated [5]
      • {default} halts at the first violation
        • even without specifying the FAIL keyword [5]
        • takes precedence over DROP
    • {action} DROP
      • processes the MLV and removes records that don't meet the specified constraint [5]
        • provides the count of removed records in the lineage view [5]
    • {constraint} updating data quality constraints after creating an MLV isn't supported [5]
      • ⇐ the MLV must be recreated
    • {constraint} the use of functions and pattern searches with operators in the constraint condition is restricted [5]
      • e.g. LIKE, regex 
    • {known issue} the creation and refresh of an MLV with a FAIL action in constraint may result in a "delta table not found" error
      • {recommendation} recreate the MLV and avoid using the FAIL action [5] 
    • {feature} data quality report
      • built-in Power BI dashboard that shows several aggregated metrics [2]
  • {feature} monitor hub
    • centralized portal to browse MLV runs in the lakehouse [7]
    • {operation} view runs' status [7]
    • {operation} search and filter the runs [7]
      • based on different criteria
    • {operation} cancel in-progress run [7]
    • {operation} drill down run execution details [7]
  • doesn't support
    • {feature|planned} PySpark [3]
    • {feature|planned} incremental refresh [3]
    • {feature|planned} integration with Data Activator [3]
    • {feature|planned} API [3]
    • {feature|planned} cross-lakehouse lineage and execution [3]
    • {limitation} Spark properties set at the session level aren't applied during scheduled lineage refresh [4]
    • {limitation} creation with delta time-travel [4]
    • {limitation} DML statements [4]
    • {limitation} UDFs in CTAS [4] 
    • {limitation} temporary views can't be used to define MLVs [4]
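
As background to the CDF remark in the notes: Change Data Feed is a standard Delta Lake table property, enabled on the source table as below. This is generic Delta Lake syntax, shown only to illustrate the mechanism MLVs build on; the table name is hypothetical:

-- enable Change Data Feed on a (hypothetical) source Delta table
ALTER TABLE SalesLakehouse.silver.orders 
SET TBLPROPERTIES ('delta.enableChangeDataFeed' = 'true');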
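
The refresh statement from the notes, first letting the engine decide whether any work is needed, then forcing a full rebuild (hypothetical identifiers):

-- refresh a materialized lake view (skipped if the source has no new data)
REFRESH MATERIALIZED LAKE VIEW SalesLakehouse.gold.orders_by_region;

-- force a full recomputation
REFRESH MATERIALIZED LAKE VIEW SalesLakehouse.gold.orders_by_region FULL;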
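
The management commands from the notes, grouped into one sequence (hypothetical identifiers):

-- list the materialized lake views in a schema
SHOW MATERIALIZED LAKE VIEWS IN gold;

-- retrieve a view's definition
SHOW CREATE MATERIALIZED LAKE VIEW SalesLakehouse.gold.orders_by_region;

-- rename a view (affects lineage and scheduled refresh; update dependent MLVs!)
ALTER MATERIALIZED LAKE VIEW SalesLakehouse.gold.orders_by_region RENAME TO orders_by_region_daily;

-- drop a view
DROP MATERIALIZED LAKE VIEW SalesLakehouse.gold.orders_by_region_daily;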
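
The two constraint actions from the data quality notes, side by side, as a sketch with hypothetical names; given the known issue above, DROP is the safer choice at the time of writing:

-- DROP action: offending records are removed; the count shows up in the lineage view
CREATE MATERIALIZED LAKE VIEW IF NOT EXISTS SalesLakehouse.silver.customers_clean 
(
    CONSTRAINT email_present CHECK (email IS NOT NULL) ON MISMATCH DROP 
) 
AS 
SELECT customer_id, email, country 
FROM SalesLakehouse.bronze.customers;

-- FAIL action (also the default): the refresh halts at the first violation
CREATE MATERIALIZED LAKE VIEW IF NOT EXISTS SalesLakehouse.silver.orders_valid 
(
    CONSTRAINT amount_positive CHECK (total_amount > 0) ON MISMATCH FAIL 
) 
AS 
SELECT order_id, total_amount 
FROM SalesLakehouse.bronze.orders;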

    References:
    [1] Microsoft Fabric Update Blog (2025) Simplifying Medallion Implementation with Materialized Lake Views in Fabric [link|aka]
    [2] Power BI Tips (2025) Microsoft Fabric Notebooks with Materialized Views - Quick Tips [link]
    [3] Microsoft Learn (2025) What are materialized lake views in Microsoft Fabric? [link]
    [4] Microsoft Learn (2025) Materialized lake views Spark SQL reference [link]
    [5] Microsoft Learn (2025) Manage Fabric materialized lake views lineage [link] 
    [6] Microsoft Learn (2025) Data quality in materialized lake views [link]
    [7] Microsoft Learn (2025) Monitor materialized lake views [link]

    Acronyms:
    API - Application Programming Interface
    CDF - Change Data Feed
    CTAS - Create Table As Select
    DML - Data Manipulation Language
    ETL - Extract, Transform, Load
    MF - Microsoft Fabric
    MLV - Materialized Lake View
    UDF - User-Defined Function
