SQL Troubles: integration


06 October 2025

🏭🗒️Microsoft Fabric: Git [Notes]

Disclaimer: This is work in progress intended to consolidate information from various sources for learning purposes. For the latest information please consult the documentation (see the links below)!

Last updated: 6-Oct-2025

[Microsoft Fabric] Git

{def} an open source, distributed version control platform

enables developers to commit their work to a local repository and then sync their copy of the repository with the copy on the server [1]
to be differentiated from centralized version control

where clients must synchronize code with a server before creating new versions of code [1]

provides tools for isolating changes and later merging them back together

{benefit} simultaneous development

everyone has their own local copy of code and works simultaneously on their own branches

Git works offline since almost every operation is local

{benefit} faster release

branches allow for flexible and simultaneous development

{benefit} built-in integration

integrates into most tools and products

every major IDE has built-in Git support

this integration simplifies the day-to-day workflow

{benefit} strong community support

the volume of community support makes it easy to get help when needed

{benefit} works with any team

using Git with a source code management tool increases a team's productivity

by encouraging collaboration, enforcing policies, automating processes, and improving visibility and traceability of work

the team can either

settle on individual tools for version control, work item tracking, and continuous integration and deployment
choose a solution that supports all of these tasks in one place

e.g. GitHub, Azure DevOps

{benefit} pull requests

used to discuss code changes with the team before merging them into the main branch
help ensure code quality and increase knowledge across the team
platforms like GitHub and Azure DevOps offer a rich pull request experience

{benefit} branch policies

protect important branches by preventing direct pushes, requiring reviewers, and ensuring a clean build

used to ensure that pull requests meet requirements before completion

teams can configure their solution to enforce consistent workflows and processes across the team

{feature} continuous integration
{feature} continuous deployment
{feature} automated testing
{feature} work item tracking
{feature} metrics
{feature} reporting
{operation} commit

snapshot of all files at a point in time [1]

every time work is saved, Git creates a commit [1]
identified by a unique cryptographic hash of the committed content [1]
everything is hashed
it's impossible to make changes, lose information, or corrupt files without Git detecting it [1]

create links to other commits, forming a graph of the development history [2A]
{operation} revert code to a previous commit [1]
{operation} inspect how files changed from one commit to the next [1]
{operation} review information e.g. where and when changes were made [1]
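
The content hashing described above can be sketched in a few lines. The sketch assumes Git's default SHA-1 object format, in which a file (a "blob") is identified by the SHA-1 of a short header followed by the file's bytes; repositories created with the SHA-256 object format use a different hash function. The name git_blob_id is illustrative and not part of any Git tooling.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the object id Git would assign to a file with this content.

    Assumes the default SHA-1 object format: the hash covers the header
    b"blob <size>\\0" followed by the raw file bytes.
    """
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


if __name__ == "__main__":
    # Identical content always yields the same id; any change yields a new one,
    # which is how Git notices modified or corrupted files.
    print(git_blob_id(b""))
    print(git_blob_id(b"SELECT 1;\n"))
    print(git_blob_id(b"SELECT 2;\n"))
```

The ids can be compared with the output of git hash-object for the same files. Because unchanged content maps to the same id, the same mechanism also explains why Git can reuse a previously stored file when nothing changed between commits.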

{operation} branch

lightweight pointers to work in progress
each developer saves changes to their own local code repository

there can be many different changes based on the same commit

branches manage this separation

once work created in a branch is finished, it can be merged back into the team's main (or trunk) branch

main branch

contains stable, high-quality code from which programmers release

feature branches

contain work in progress, which is merged into the main branch upon completion
allow isolating development work and minimizing conflicts among multiple developers [2]

release branch

by separating the release branch from development in progress, it's easier to manage stable code and ship updates more quickly

if a file hasn't changed from one commit to the next, Git uses the previously stored file [1]
files are in one of three states

{state} modified

when a file is first modified, the changes exist only in the working directory

they aren't yet part of a commit or the development history

the developer must stage the changed files to be included in the commit
the staging area contains all changes to include in the next commit

{state} committed

once the developer is happy with the staged files, the files are packaged as a commit with a message describing what changed

this commit becomes part of the development history

{state} staged

staging lets developers pick which file changes to save in a commit to break down large changes into a series of smaller commits

by reducing the scope of commits, it's easier to review the commit history to find specific file changes
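
The three file states can be illustrated with a toy model. This is only a sketch of the idea, not how Git is implemented: the class ToyRepo and its methods are hypothetical names, and files are reduced to entries in plain dictionaries.

```python
class ToyRepo:
    """Toy model of the working directory, staging area and commit history."""

    def __init__(self):
        self.working = {}   # path -> content currently on disk
        self.staged = {}    # path -> content selected for the next commit
        self.history = []   # list of (message, snapshot) tuples

    def edit(self, path, content):
        # A change first exists only in the working directory.
        self.working[path] = content

    def stage(self, path):
        # Staging picks which file changes go into the next commit.
        self.staged[path] = self.working[path]

    def commit(self, message):
        if not self.staged:
            raise ValueError("nothing staged")
        previous = self.history[-1][1] if self.history else {}
        # A commit is a snapshot: unchanged files are carried over as they were.
        self.history.append((message, {**previous, **self.staged}))
        self.staged = {}

    def state(self, path):
        head = self.history[-1][1] if self.history else {}
        if path in self.staged and self.staged[path] == self.working.get(path):
            return "staged"
        if self.working.get(path) != head.get(path):
            return "modified"
        return "committed"


if __name__ == "__main__":
    repo = ToyRepo()
    repo.edit("model.tmdl", "v1")
    repo.edit("report.json", "v1")
    repo.stage("model.tmdl")          # only one of the two changes is staged
    repo.commit("add model definition")
    print(repo.state("model.tmdl"), repo.state("report.json"))
```

Staging only model.tmdl keeps the first commit small, while report.json stays modified and can go into a separate commit later.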

{best practice} set up a shared Git repository and CI/CD pipelines [2]

enables effective collaboration and deployment in PBIP [2]

enables implementing version control in PBIP [2]

it’s essential for managing project history and collaboration [2]

allows tracking changes throughout the model lifecycle [2]
enables effective governance and collaboration

provides robust version tracking and collaboration features, ensuring traceability

{best practice} use descriptive commit messages [2]

ensures clarity and facilitates collaboration in version control [2]

{best practice} avoid sharing Git credentials [2]

compromises security and accountability [2]

can lead to potential breaches [2]

{best practice} define naming conventions for files and communicate them accordingly [2]
{best practice} avoid merging changes directly into the master branch [2]

{risk} this can lead to integration issues [2]

{best practice} use git merge for integrating changes from one branch to another [2]

{benefit} ensures seamless collaboration [2]

{best practice} avoid skipping merges [2]

failing to merge regularly can lead to complex conflicts and integration challenges [2]


References:

[1] Microsoft Learn (2022) DevOps: What is Git? [link]
[2] M Anand, Microsoft Fabric Analytics Engineer Associate: Implementing Analytics Solutions Using Microsoft Fabric (DP-600), 2025

Acronyms:

PBIP - Power BI Project
CI/CD - Continuous Integration and Continuous Deployment
IDE - Integrated Development Environment

30 September 2025

🤖〽️Prompt Engineering: Copilot Unabridged (Part 102: Mapping the Next Decade of AI - A Journey Through Innovation)

Prompt Engineering Series

Prompt: "write a post of 600 words that depicts a map that reflects the evolution of AI for the next 10 years"

Introduction

Artificial Intelligence (AI) is no longer a distant dream - it’s a living, breathing force reshaping industries, societies, and even our sense of self. As we look ahead to the next ten years, the evolution of AI can be visualized as a dynamic map, charting a journey through five transformative landscapes: Foundation, Expansion, Integration, Autonomy, and Reflection.

Phase 1: Foundation (2025–2026)

We begin in the foothills of Foundation, where AI continues to solidify its core capabilities. Large language models, multimodal systems, and generative AI are refined for reliability, safety, and scalability.

Focus Areas: Model alignment, data efficiency, and ethical guardrails.
Key Players: Tech giants like Microsoft, OpenAI, Google, and emerging startups.
Milestones: AI becomes embedded in productivity tools, education platforms, and customer service systems.

This phase sets the groundwork for trust and usability, ensuring that AI is not just powerful - but responsible.

Phase 2: Expansion (2026–2028)

As we ascend into Expansion, AI spreads across domains like healthcare, law, finance, and agriculture. Specialized models emerge, trained on domain-specific data to deliver expert-level insights.

Focus Areas: Vertical AI, federated learning, and edge deployment.
Key Trends: AI-powered diagnostics, legal research assistants, and autonomous farming systems.
Challenges: Data privacy, regulatory frameworks, and workforce adaptation.

This is the era of AI democratization, where access and utility grow exponentially.

Phase 3: Integration (2028–2030)

Now we enter the bustling crossroads of Integration, where AI becomes deeply woven into the fabric of daily life. Human-AI collaboration reaches new heights, with systems that anticipate needs, adapt to preferences, and operate seamlessly across devices.

Focus Areas: Agentic AI, multimodal fusion, and ambient intelligence.
Examples: Personal AI copilots, smart cities, and AI-enhanced creativity tools.
Ethical Questions: How much autonomy should AI have? What boundaries must be drawn?

This phase marks a shift from tool to partner, as AI begins to co-author our experiences.

Phase 4: Autonomy (2030–2033)

In the highlands of Autonomy, AI systems gain the ability to make complex decisions independently. Autonomous agents manage logistics, negotiate contracts, and even conduct scientific research.

Focus Areas: Reinforcement learning, causal reasoning, and self-improving systems.
Breakthroughs: AI-led drug discovery, decentralized governance, and synthetic biology.
Risks: Misalignment, unintended consequences, and concentration of power.

Here, the map becomes more unpredictable - AI is no longer just reactive, but proactive, shaping outcomes with minimal human input.

Phase 5: Reflection (2033–2035)

Finally, we reach the summit: Reflection. Society begins to grapple with the philosophical, cultural, and existential implications of AI. What does it mean to be human in an age of intelligent machines?

Focus Areas: AI ethics, digital rights, and human-AI symbiosis.
Cultural Shifts: New forms of art, education, and governance emerge.
Global Dialogue: Nations collaborate on AI treaties, safety protocols, and shared values.

This phase is not just about technology - it’s about identity, values, and the future we choose to build.

The Road Ahead

This map is not linear - it’s a living terrain shaped by innovation, regulation, and imagination. Detours will occur. Some paths may converge, others may diverge. But one thing is clear: the next decade of AI will be defined not just by what we build, but by how we navigate it.

So as we set out on this journey, let’s carry not just tools - but wisdom. Not just ambition - but empathy. Because the future of AI is not just a destination - it’s a reflection of us.

Just try the prompt on Copilot or your favorite AI-powered assistant! Have you got a different/similar result? How big or important is the difference? Any other thoughts?
Just share the link to the post with me and I'll add it to this post as a resource!

Disclaimer: The whole text was generated by Copilot (under Windows 11) at the first attempt. This is just an experiment to evaluate the feature's ability to answer standard general questions, independently of whether they are correctly or incorrectly posed. Moreover, the answers may reflect hallucinations and other types of inconsistent or incorrect reasoning.


16 April 2025

🧮ERP: Implementations (Part XIV: A Never-Ending Story)

ERP Implementations Series

An ERP implementation is occasionally considered a one-time endeavor after which an organization will live happily ever after. In an ideal world that would be true, though the work never stops – things that were carved out from the implementation, optimizations, new features, new regulations, new requirements, integration with other systems, etc. An implementation is thus just the beginning of what comes next, and it's essential to get the foundation right – and that’s the purpose of the ERP implementation – to provide a foundation on which something bigger and solid can be erected.

No matter how well an ERP implementation is managed and executed, respectively how well people work towards the same goals, there’s always something forgotten or carved out from the initial project. Typically, the usual suspects are the integrations with other systems, though there can also be minor or even bigger features that are planned to be addressed later, if the implementation hasn’t already consumed all the financial resources available, as is usually the case. Some of the topics can be addressed as Change Requests or consolidated into projects of their own.

Even simple integrations can become complex when the processes are poorly designed, and that typically happens more often than people think. It’s not necessarily about the lack of skillset or about the technologies used, but about the degree to which the processes can work in a loosely coupled interconnected manner. Even unidirectional integrations can raise challenges, though everything increases in complexity when the flow of data is bidirectional. Moreover, the complexity increases with each system added to the overall architecture.

Like a sculpture’s manual creation, processes in an ERP implementation form a skeleton that needs chiseling and smoothing until the form reaches the desired optimized shape. However, optimization is not a one-time attempt but continuous work of exploring what is achievable, what works, what is optimal. Sometimes optimization is an exact science, while other times it’s about (scientific) experimentation in which theory, ideas and investments are put to good use. However, experimentation tends to be expensive at least in terms of time and effort, and probably these are the main reasons why some organizations don’t even attempt that – or maybe it’s just laziness, pure indifference or self-preservation. In fact, why change something that already works?

Typically, software manufacturers make available new releases on a periodic basis as part of their planning for growth and of attracting more businesses. Each release that touches used functionality typically needs proper evaluation, testing and whatever organizations consider as important as part of the release management process. Ideally, everything should go smoothly though life never ceases to surprise and even a minor release can have an important impact when earlier critical functionality stopped working. Test automation and other practices can make an important difference for organizations, though these require additional effort and investments that usually pay off when done right.

Regulations and other similar requirements must be addressed as they can involve penalties or other risks that are usually worth avoiding. Ideally such requirements should be supported by design, though even then a certain volume of work is involved. Moreover, the business context can change unexpectedly, and further requirements need to be considered eventually.

The work on an ERP system and the infrastructure built around it is a never-ending story. Therefore, organizations must have the resources not only for the initial project, but also for what comes after it. Of course, some work can be performed manually, some requirements can be delayed, some risks can be assumed, though the value of an ERP system increases with its extended usage, at least in theory.


15 April 2025

🧮ERP: Implementations (Part XII: The Process Perspective)

ERP Implementations Series

Technology can have a tremendous potential impact on organizations, helping them achieve their strategic goals and objectives; however, it takes more than an implementation of one or more technologies to leverage that potential! This applies to ERP and other technology implementations altogether, but the role of technology is more important in the latter through its transformative role. ERP implementations can be the foundation on which the whole future of the organization is built, and it’s ideal to have a broader strategy that looks at all the facets of an organization before, during and after the implementation.

One of the most important assets an organization has is its processes, an organization’s success depending on the degree to which the processes are used to leverage the various strategies. Many customers want their business processes to be implemented on the new platform and that's the point where many projects go in the wrong direction! There are probably areas where this approach makes sense, though organizations also need to look at the alternatives available in the new ecosystem, and identify and prioritize the missing features accordingly. There will also be extreme cases in which one or a mix of systems will be considered as not feasible, and this is an alternative that should be considered during such evaluations!

An ERP system allows organizations to implement their key value-creation processes by providing a technological skeleton with a set of configurations and features that can be used to address a wide set of requirements. Such a framework is an enabler - makes things possible - though the potential is not reached automatically, and this is one of the many false assumptions associated with such projects. Customers choose such a system and expect magic to happen! Many of the false perceptions are strengthened by implementers or the other parties involved in the projects. As in other IT areas, there are many misconceptions that pervade.

An ERP thus provides a basis on which an organization can implement its processes. Doing an ERP implementation without process redesign is seldom possible, even if many organizations want to avoid it at all costs. Even if an organization’s processes are highly standardized, expecting a system to model them by design is utopian, given that ERP systems tend to target the most important aspects identified across industries. And thus, customizations come into play, some of them done without looking for alternatives already existing in the intrinsic or extended range of solutions available in an ERP’s ecosystem.

One of the most important dangers is when an organization’s processes are so complex that their replication in the new environment creates more issues than the implementation can solve. At least in the first phases of the implementation, organizations must learn to compromise and focus on the critical aspects without which the organization can’t do its business. Moreover, the costs of implementations tend to increase exponentially when multiple complex requirements are added to address the gaps. Organizations should always look at alternatives – integrations with third party systems tend to be more cost-effective than rebuilding the respective functionality from scratch!

It's also true that some processes are too complex to be implemented, though the solution usually resides in the middle. Each customization adds another level of complexity, and a whole range of risks many customers take. Conversely, there’s no blueprint that works for everybody. Organizations must thus compromise and that’s probably one of the most important aspects they should be aware of! However, compromises must also be made in the right places, while evaluating alternatives and the possible outcomes. It’s important to be aware of the full extent of the implications of their decisions.


13 April 2025

🏭🗒️Microsoft Fabric: Continuous Integration & Continuous Deployment [CI/CD] [Notes]

Disclaimer: This is work in progress intended to consolidate information from various sources for learning purposes. For the latest information please consult the documentation (see the links below)!

Last updated: 13-Apr-2025

[Microsoft Fabric] Continuous Integration & Continuous Deployment [CI/CD]

{def} development processes, tools, and best practices used to automate the integration, testing, and deployment of code changes to ensure efficient and reliable development

can be used in combination with a client tool

e.g. VS Code, Power BI Desktop
don’t necessarily need a workspace

developers can

create branches
commit changes to that branch locally
push changes to the remote repo
create a pull request to the main branch
⇐ all steps can be performed without a workspace [1]

workspace is needed only as a testing environment [1]

to check that everything works in a real-life scenario [1]

addresses a few pain points [2]

manual integration issues

manual changes can lead to conflicts and errors

slow down development [2]

development delays

manual deployments are time-consuming and prone to errors

lead to delays in delivering new features and updates [2]

inconsistent environments

inconsistencies between environments cause issues that are hard to debug [2]

lack of visibility

can be challenging to

track changes through their lifetime [2]
understand the state of the codebase [2]

{process} continuous integration (CI)
{process} continuous deployment (CD)

architecture

{layer} development database

{recommendation} should be relatively small [1]

{layer} test database

{recommendation} should be as similar as possible to the production database [1]

{layer} production database

data items

items that store data
items' definition in Git defines how the data is stored [1]

{stage} development

{best practice} back up work to a Git repository

back up the work by committing it into Git [1]
{prerequisite} the work environment must be isolated [1]

so others don’t override the work before it gets committed [1]
commit to a branch no other developer is using [1]
commit together changes that must be deployed together [1]

helps later when

deploying to other stages
creating pull requests
reverting changes

{warning} big commits might hit the max commit size limit [1]

{bad practice} store large-size items in source control systems, even if it works [1]
{recommendation} consider ways to reduce items’ size if they have lots of static resources, like images [1]

{action} revert to a previous version

{operation} undo

revert the immediate changes made, as long as they aren't committed yet [1]
each item can be reverted separately [1]

{operation} revert

reverting to older commits

{recommendation} promote an older commit to be the HEAD

via git revert or git reset [1]
shows that there’s an update in the source control pane [1]
the workspace can be updated with that new commit [1]

{warning} reverting a data item to an older version might break the existing data and could possibly require dropping the data or the operation might fail [1]
{recommendation} check dependencies in advance before reverting changes back [1]
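
The two ways of making an older state the HEAD differ in what happens to the history, which a toy model can show. This is a simplified sketch under stated assumptions: Commit, reset_to and revert_head are hypothetical names, the model only covers undoing the most recent commit, and real Git stores far more than a dictionary per commit.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Commit:
    id: str
    snapshot: dict                      # item name -> definition at this commit
    parent: Optional["Commit"] = None   # link to the previous commit


def reset_to(target: Commit) -> Commit:
    """Toy version of a hard reset: the branch pointer moves back to target,
    so later commits are no longer part of the branch history."""
    return target


def revert_head(head: Commit, new_id: str) -> Commit:
    """Toy version of reverting the latest commit: a NEW commit is added on top
    whose snapshot equals the parent's, so the history is preserved."""
    if head.parent is None:
        raise ValueError("the root commit cannot be reverted in this toy model")
    return Commit(new_id, dict(head.parent.snapshot), parent=head)


if __name__ == "__main__":
    c1 = Commit("c1", {"report": "v1"})
    c2 = Commit("c2", {"report": "v2"}, parent=c1)

    after_reset = reset_to(c1)
    after_revert = revert_head(c2, "c3")

    print(after_reset.id, after_reset.snapshot)
    print(after_revert.id, after_revert.snapshot, "parent:", after_revert.parent.id)
```

In both cases the content at HEAD is the older definition, but only the revert keeps the undone commit visible in the history. That choice is a judgment call for shared branches; the source does not state a preference between the two.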

{concept} private workspace

a workspace that provides an isolated environment [1]

⇐ allows working in isolation [1]

{prerequisite} the workspace is assigned to a Fabric capacity [1]
{prerequisite} access to data to work in the workspace [1]
{step} create a new branch from the main branch [1]

provides the most up-to-date version of the content [1]
can be used for any future branch created by the user [1]

when a sprint is over, the changes are merged and one can start a fresh new task [1]

switch the connection to a new branch on the same workspace

this approach can be used when a bug needs to be fixed in the middle of a sprint [1]

{validation} connect to the correct folder in the branch to pull the right content into the workspace [1]

{best practice} make small incremental changes that are easy to merge and less likely to get into conflicts [1]

update the branch to resolve the conflicts first [1]

{best practice} change workspace’s configurations to enable productivity [1]

connection between items, or to different data sources or changes to parameters on a given item [1]

{recommendation} make sure you're working with the supported structure of the item you're authoring [1]

if you’re not sure, first clone a repo with content already synced to a workspace, then start authoring from there, where the structure is already in place [1]

{constraint} a workspace can only be connected to a single branch at a time [1]

{recommendation} treat this as a 1:1 mapping [1]

{stage} test

{best practice} simulate a real production environment for testing purposes [1]

{alternative} simulate this by connecting Git to another workspace [1]

factors to consider for the test environment

data volume
usage volume
production environment’s capacity

stage and production should have the same (minimal) capacity [1]

using the same capacity can make production unstable during load testing [1]

{recommendation} test using a different capacity similar in resources to the production capacity [1]
{recommendation} use a capacity for which you pay only for the testing time [1]

allows to avoid unnecessary costs [1]

{best practice} use deployment rules with a real-life data source

{recommendation} use data source rules to switch data sources in the test stage or parameterize the connection if not working through deployment pipelines [1]
{recommendation} separate the development and test data sources [1]
{recommendation} check related items

the changes made can also affect the dependent items [1]

{recommendation} verify that the changes don’t affect or break the performance of dependent items [1]

via impact analysis.

{operation} update data items in the workspace

imports items’ definition into the workspace and applies it on the existing data [1]
the operation is the same for Git and deployment pipelines [1]
{recommendation} know in advance what the changes are and what impact they have on the existing data [1]
{recommendation} use commit messages to describe the changes made [1]
{recommendation} upload the changes first to a dev or test environment [1]

{benefit} allows to see how that item handles the change with test data [1]

{recommendation} check the changes on a staging environment, with real-life data (or as close to it as possible) [1]

{benefit} allows to minimize the unexpected behavior in production [1]

{recommendation} consider the best timing when updating the Prod environment [1]

{benefit} minimize the impact errors might cause on the business [1]

{recommendation} perform post-deployment tests in Prod to verify that everything works as expected [1]
{recommendation} have a deployment, respectively a recovery plan [1]

{benefit} minimizes the effort, respectively the downtime [1]

{stage} production

{best practice} let only specific people manage sensitive operations [1]
{best practice} use workspace permissions to manage access [1]

applies to all BI creators for a specific workspace who need access to the pipeline

{best practice} limit access to the repo or pipeline by only enabling permissions to users who are part of the content creation process [1]
{best practice} set deployment rules to ensure production stage availability [1]

{goal} ensure the data in production is always connected and available to users [1]
{benefit} allows deployments to run while minimizing downtime
applies to data sources and parameters defined in the semantic model [1]

deployment into production using Git branches

{recommendation} use release branches [1]

requires changing the workspace's connection to the new release branch before every deployment [1]
if the build or release pipeline requires changing the source code, or running scripts in a build environment before deployment, then connecting the workspace to Git won't help [1]

{recommendation} after deploying to each stage, make sure to change all the configuration specific to that stage [1]


References:

[1] Microsoft Learn (2025) Fabric: Best practices for lifecycle management in Fabric [link]

[2] Microsoft Learn (2025) Fabric: CI/CD for pipelines in Data Factory in Microsoft Fabric [link]
[3] Microsoft Learn (2025) Fabric: Choose the best Fabric CI/CD workflow option for you [link]

Acronyms:

API - Application Programming Interface
BI - Business Intelligence
CI/CD - Continuous Integration and Continuous Deployment
VS - Visual Studio

20 January 2025

🏭🗒️Microsoft Fabric: [Azure] Service Principals (SPN) [Notes]

Disclaimer: This is work in progress intended to consolidate information from various sources for learning purposes. For the latest information please consult the documentation (see the links below)!

Last updated: 20-Jan-2025

[Azure] Service Principal (SPN)

{def} a non-human, application-based security identity used by applications or automation tools to access specific Azure resources [1]

can be assigned precise permissions, making them perfect for automated processes or background services

minimizes the risks of human error and identity-based vulnerabilities
supported in datasets, Gen1/Gen2 dataflows, datamarts [2]
authentication type

supported only by [2]

Azure Data Lake Storage
Azure Data Lake Storage Gen2
Azure Blob Storage
Azure Synapse Analytics
Azure SQL Database
Dataverse
SharePoint Online

doesn’t support

SQL data source with DirectQuery in datasets [2]

when registering a new application in Microsoft Entra ID, an SPN is automatically created for the app registration [4]

the access to resources is restricted by the roles assigned to the SPN

⇒ gives control over which resources can be accessed and at which level [4]

{recommendation} use SPN with automated tools [4]

rather than allowing them to sign in with a user identity [4]

{prerequisite} an active Microsoft Entra user account with sufficient permissions to

register an application with the tenant [4]
assign to the application a role in the Azure subscription [4]
⇐ requires Application.ReadWrite.All permission [4]

extended to support Fabric Data Warehouses [1]

{benefit} automation-friendly API Access

allows creating, updating, reading, and deleting Warehouse items via Fabric REST APIs using service principals [1]
enables automating repetitive tasks without relying on user credentials [1]

e.g. provisioning or managing warehouses
increases security by limiting human error

the warehouses thus created will be displayed in the Workspace list view in Fabric UI, with the Owner name of the SPN [1]
applicable to users with administrator, member, or contributor workspace role [3]
minimizes risk

the warehouses created with delegated account or fixed identity (owner’s identity) will stop working when the owner leaves the organization [1]

Fabric requires the user to log in every 30 days to ensure a valid token is provided for security reasons [1]
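
Automation without user credentials typically starts with the SPN obtaining its own token. The sketch below only builds the HTTP request for the Microsoft identity platform's client credentials flow and does not send it. The function name build_token_request and the placeholder IDs are hypothetical, and the scope value for the Fabric REST APIs is an assumption that should be checked against the current documentation.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed scope for the Fabric REST APIs; verify before use.
FABRIC_SCOPE = "https://api.fabric.microsoft.com/.default"


def build_token_request(tenant_id: str, client_id: str, client_secret: str) -> Request:
    """Build (but do not send) a client-credentials token request for an SPN."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    body = urlencode(
        {
            "grant_type": "client_credentials",   # app-only flow, no user sign-in
            "client_id": client_id,
            "client_secret": client_secret,
            "scope": FABRIC_SCOPE,
        }
    ).encode("ascii")
    return Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


if __name__ == "__main__":
    # Placeholder values; real secrets belong in a vault, not in source code.
    request = build_token_request("my-tenant-id", "my-client-id", "my-secret")
    print(request.get_method(), request.full_url)
```

Sending the request (for example with urllib.request.urlopen) would return a JSON document containing an access token, which is then passed as a Bearer token in the Authorization header of the API calls. In practice a maintained library such as MSAL is usually preferable to hand-built requests.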

{benefit} seamless integration with Client Tools:

tools like SSMS can connect to the Fabric DWH using SPN [1]
SPN provides secure access for developers to

run COPY INTO

with and without firewall enabled storage [1]

run any T-SQL query programmatically on a schedule with ADF pipelines [1]

{benefit} granular access control

Warehouses can be shared with an SPN through the Fabric portal [1]

once shared, administrators can use T-SQL commands to assign specific permissions to SPN [1]

allows controlling precisely which data and operations an SPN has access to [1]

GRANT SELECT ON <table name> TO <Service principal name>

warehouses' ownership can be changed from an SPN to user, and vice-versa [3]
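
When permissions are assigned from a deployment script, the GRANT template above is often filled in programmatically. The helper below is a minimal sketch: grant_select and quote_identifier are hypothetical names, the schema, table and principal values are placeholders, and the code only produces the statement text without executing it.

```python
def quote_identifier(name: str) -> str:
    """Wrap a T-SQL identifier in brackets, doubling any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def grant_select(schema: str, table: str, principal: str) -> str:
    """Fill in the GRANT SELECT template for a table and a service principal."""
    return (
        f"GRANT SELECT ON {quote_identifier(schema)}.{quote_identifier(table)} "
        f"TO {quote_identifier(principal)};"
    )


if __name__ == "__main__":
    # Placeholder names for illustration only.
    print(grant_select("dbo", "Sales", "my-deployment-spn"))
```

Quoting the identifiers keeps names with spaces or hyphens valid and avoids building statements from unescaped input. Whether additional steps are needed before the grant succeeds is not covered by the notes above.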

{benefit} improved DevOps and CI/CD Integration

SPN can be used to automate the deployment and management of DWH resources [1]

⇐ ensures faster, more reliable deployment processes while maintaining strong security postures [1]

{limitation} default semantic models are not supported for SPN created warehouses [3]

⇒ features such as listing tables in dataset view or creating a report from the default dataset don’t work [3]

{limitation} SPN for SQL analytics endpoints is not currently supported
{limitation} SPNs are currently not supported for COPY INTO error files [3]

⇐ Entra ID credentials are not supported either [3]

{limitation} SPNs are not supported for Git APIs. SPN support exists only for Deployment pipeline APIs [3]
monitoring tools

[DMV] sys.dm_exec_sessions.login_name column [3]
[Query Insights] queryinsights.exec_requests_history.login_name [3]
Query activity

submitter column in Fabric query activity [3]

Capacity metrics app:

compute usage for warehouse operations performed by SPN appears as the Client ID under the User column in Background operations drill through table [3]


References:

[1] Microsoft Fabric Updates Blog (2024) Service principal support for Fabric Data Warehouse [link]

[2] Microsoft Fabric Learn (2024) Service principal support in Data Factory [link]

[3] Microsoft Fabric Learn (2024) Service principal in Fabric Data Warehouse [link]

[4] Microsoft Fabric Learn (2024) Register a Microsoft Entra app and create a service principal [link]

[5] Microsoft Fabric Updates Blog (2024) Announcing Service Principal support for Fabric APIs [link]

Resources:

[R1] Microsoft Learn (2025) Fabric: What's new in Microsoft Fabric? [link]

Acronyms:

ADF - Azure Data Factory

API - Application Programming Interface

CI/CD - Continuous Integration/Continuous Deployment

DMV - Dynamic Management View

DWH - Data Warehouse

SPN - service principal

SSMS - SQL Server Management Studio

07 December 2024

🏭 💠Data Warehousing: Microsoft Fabric (Part VI: SQL Databases for OLTP scenarios) [new feature]

Data Warehousing Series

One of the interesting announcements at Ignite is the public preview availability of SQL databases in Microsoft Fabric, "a versatile and developer-friendly transactional database built on the foundation of Azure SQL database". With this, Fabric can address OLTP scenarios besides OLAP, thus evolving from an analytics platform to a data platform [1]. According to the announcement, besides the AI-optimized architectural aspects, the feature makes SQL Azure simple, autonomous and secure by design [1], and these latter aspects are considered in this post.

Simplicity revolves around the deployment and configuration of databases: creating a new database requires only giving it a name, and the database is created in seconds [1]. It’s a considerable improvement compared with the relatively complex setup needed for on-premises configurations, though sometimes more flexibility in configuration is needed upfront or over the database’s lifetime. To get a database ready for testing one can import a sample database or get specific data via data flows and/or pipelines [1]. As development tools one can use Visual Studio Code or SSMS [1], and probably more tools will be available in time.

The integration with both GitHub and Azure DevOps allows each database to be placed under source control, which is needed for many scenarios, especially when multiple resources make changes to the database objects [1]. Frankly, that’s mainly important during the development phase, respectively in scenarios in which multiple people make changes to the logic in parallel. It will be interesting to see how much overhead or how many challenges the feature adds to development and how smoothly everything works together!

The most important aspect for many solutions is the replication of data in near-real time to the (open-source) delta parquet format in OneLake and thus making the data available for analytics almost immediately [1]. Probably, from this aspect many cloud-based applications can benefit, even if the performance might not be as good as in other well-established architectures. However, there are many other scenarios in which one needs to maintain and use data for OLTP/OLAP purposes. This invites adequate testing and a good weighting of the advantages and disadvantages involved.

A SQL database is a native item in Fabric, and therefore it utilizes Fabric capacity units like other Fabric workloads [1]. One can use the Fabric SKU estimator (still in private preview) to estimate the costs [2], though it will be interesting to see how cost-effective the solutions are. Probably, especially when the infrastructure is already available outside of Fabric, it will be easier and more cost-effective to use the mirroring functionality. One should test and have a better estimate before moving blindly from the existing infrastructure to Fabric.

SQL databases in Fabric are autonomous by design, while providing the best performance and availability by default [1]. High availability is reached through zone redundancy, while performance is achieved by automatically scaling the storage and compute to accommodate the workloads [1]. The auto-optimization capability is achieved with the help of the latest Intelligent Query Processing (IQP) enhancements, respectively the creation of missing indexes to improve query performance [1]. It will be interesting to see how the whole process works, given that the maintenance of indexes usually involves some challenges (e.g. identifying covering indexes, indexes needed only for temporary workloads, duplicated indexes).

SQL databases in Fabric are automatically configured for high availability with zone redundancy, while storage and compute scale automatically to accommodate the user workload [1]. The database is auto-optimized through the latest IQP enhancements while the system creates any missing indexes to improve query performance. All data is replicated to OneLake by default [1]. Finally, the database always receives the latest security updates with auto-patching, while automatic backups help in disaster recovery scenarios [1], which can be of real help for database administrators.

References:
[1] Microsoft Fabric Updates Blog (2024) Announcing SQL database in Microsoft Fabric Public Preview [link]
[2] Microsoft Fabric Updates Blog (2024) Announcing New Recruitment for the Private Preview of Microsoft Fabric SKU Estimator [link]

06 May 2024

🧭🏭Business Intelligence: Microsoft Fabric (Part III: The Metrics Layer) 🆕

Introduction

One of the announcements of this year's first Microsoft Fabric Community Conference was the introduction of a metrics layer in Fabric which "allows organizations to create standardized business metrics, that are rooted in measures and are discoverable and intended for reuse" [1]. As it seems, the information provided at the conference was kept to a minimum given that the feature is still in private preview, though several webcasts are starting to catch up on the topic (see [2], [4]). Moreover, as part of their show, the Explicit Measures (@PowerBITips) hosts had as a guest Carly Newsome, the manager of the project, who unveiled more details about the project and the feature, details which became the main source for the information below.

The idea of a metric layer or metric store is not new; data professionals occasionally refer to their structure(s) of metrics as such. The terms gained weight in their modern conception relatively recently, in 2021-2022 (see [5], [6], [7], [8], [10]). Within the modern data stack, a metrics layer or metric store is an abstraction layer available between the data store(s) and end users. It allows business metrics to be centrally defined, stored, and managed. Thus, it allows us to standardize and enforce a single source of truth (SSoT), respectively solve several issues existing in the data stacks. As Benn Stancil earlier remarked, the metrics layer is one of the missing pieces from the modern data stack (see [10]).

Microsoft's Solution

Microsoft's business case for the metrics layer's implementation is based on three main ideas: (1) duplicate measures contribute to poor data quality, (2) complex data models hinder self-service, (3) data silos in Power BI need to be reduced. In Microsoft's conception the metrics layer provides several benefits: consistent definitions and descriptions, easy management via management views, searchable and discoverable metrics, and trust assured through indicators.

For this feature's implementation Microsoft introduces a new Fabric item called a metric set, which allows grouping several (business) metrics together as part of a mini-model that can be tailored to the needs of a subset of end-users and accessed by them via the standard tools already available. The metric set thus becomes a mini-model. Such mini-models allow breaking down and reducing the overall complexity of semantic models, while being easy to evolve and consume. The challenge then becomes how to break down existing and future semantic models into nonoverlapping mini-models, creating in extremis a partition (see the Lego metaphor for data products). The idea of mini-models is not new, [12] advocating the idea of using a Master Model, a technique for creating derivative tabular models based on a single tabular solution.

A (business) metric is a way to elevate the measures from the various semantic models existing in the organization within the mini-model defined by the metric set. A metric can be reused in other Fabric artifacts - currently in new reports on the Power BI service, respectively in notebooks by copying the code. Reusing metrics in other measures means that one can chain metrics, and the changes made will be propagated further downstream.

The Metrics Layer in Microsoft Fabric (adapted diagram)

Every metric is tied to the original semantic model, which thus allows tracking how a metric is used across the solutions and, looking forward to Purview, identifying the data's lineage. A measure is related to a "table", the source from which the measure came.

Users' Perspective

The Metrics Layer feature is available in the Microsoft Fabric service for Power BI within the Metrics menu element next to Scorecards. One starts by creating a metric set in an existing workspace, an operation which creates the actual artifact, to which the individual metrics are added. To create a metric, a user with build permissions can navigate through the semantic models across the different workspaces they have access to, pick a measure from one of them and elevate it to a metric, copying in the process the measure's definition and description. In this way the metric will always point back to the measure from the semantic model, while the metrics thus created are considered as a related collection and can be shared around accordingly.

Once a metric is added to the metric set, one can add dimensions to it in edit mode (e.g. Date, Category, Product Id, etc.). One can then further explore a metric's output and add filters (e.g. concentrate on only one product or category), from which point one can slice-and-dice the data as needed.

There is a panel where one can see where the metric has been used (e.g. in reports, scorecards, and other integrations), when it was last refreshed, respectively how many times it was used. Thus, one has the most important information in one place, which is great for developers as well as for the users. Probably, other metadata will be added, such as whether an increase in the metric would be favorable or unfavorable (like in Tableau Pulse, see [13]) or maybe levels of criticality, a unit of measure, or maybe its type - simple metric, performance indicator (PI), result indicator (RI), KPI, KRI etc.

Metrics can be persisted to the OneLake by saving their output to a delta table into the lakehouse. As demonstrated in the presentation(s), with just a copy-paste and a small piece of code one can materialize the data into a lakehouse delta table, from where the data can be reused as needed. Hopefully, the process will be further automated.

One can consume metrics and metric sets also in Power BI Desktop, where a new menu element called Metric sets was added under the OneLake data hub, which can be used to connect to a metric set from a semantic model and select the metrics needed for the project.

Tapping into the available Power BI solutions is done via an integration feature based on the Sempy fabric package, a dataframe for storage and propagation of Power BI metadata which is part of the Python-based Semantic Link in Fabric [11].

Further Thoughts

When dealing with a new feature, a natural idea comes to mind: what challenges does the feature involve, respectively how can it be misused? Given that the metrics layer can be built within a workspace and that it can tap into the existing measures, this means that one can build on the existing infrastructure. However, this can imply restructuring, refactoring, moving, and testing a lot of code in the process, hopefully with minimal implications for the solutions already available. Whether the process is as simple as imagined is another story. As for misuse, in extremis, data professionals might start building everything as metrics, though the danger might come when the data is persisted unnecessarily.

From a data mesh's perspective, a metric set is associated with a domain, though there will be metrics and data common to multiple domains. Moreover, a mini-model has the potential of becoming a data product. Distributing the logic across multiple workspaces and domains can add further challenges, especially in what concerns the synchronization and implementation of requirements in a way that doesn't lead to bottlenecks. But this is a general challenge for the development team(s).

The feature will probably undergo further changes until it is released in public preview (probably by September or the end of the year). I subscribe to other data professionals' opinion that the feature was long needed and that it can have an important impact on the solutions built.


References:
[1] Microsoft Fabric Blog (2024) Announcements from the Microsoft Fabric Community Conference (link)
[2] Power BI Tips (2024) Explicit Measures Ep. 236: Metrics Hub, Hot New Feature with Carly Newsome (link)
[3] Power BI Tips (2024) Introducing Fabric Metrics Layer / Power Metrics Hub [with Carly Newsome] (link)
[4] KratosBI (2024) Fabric Fridays: Metrics Layer Conspiracy Theories #40 (link)
[5] Chris Webb's BI Blog (2022) Is Power BI A Semantic Layer? (link)
[6] The Data Stack Show (2022) TDSS 95: How the Metrics Layer Bridges the Gap Between Data & Business with Nick Handel of Transform (link)
[7] Sundeep Teki (2022) The Metric Layer & how it fits into the Modern Data Stack (link)
[8] Nick Handel (2021) A brief history of the metrics store (link)
[9] Aurimas (2022) The Jungle of Metrics Layers and its Invisible Elephant (link)
[10] Benn Stancil (2021) The missing piece of the modern data stack (link)
[11] Microsoft Learn (2024) Sempy fabric Package (link)
[12] Michael Kovalsky (2019) Master Model: Creating Derivative Tabular Models (link)
[13] Christina Obry (2023) The Power of a Metrics Layer - and How Your Organization Can Benefit From It (link)
[14] KratosBI (2024) Introducing the Metrics Layer in #MicrosoftFabric with Carly Newsome (link)

Resources:

[R1] Microsoft Learn (2025) Fabric: What's new in Microsoft Fabric? [link]

02 November 2016

♟️Strategic Management: Integration (Just the Quotes)

"By integration we mean the process of achieving unity of effort among the various subsystems in the accomplishment of the organization's tasks." (Paul R Lawrence, "Organization and environment: Managing differentiation and integration", 1967)

"No matter how difficult or unprecedented the problem, a breakthrough to the best possible solution can come only from a combination of rational analysis, based on the real nature of things, and imaginative reintegration of all the different items into a new pattern, using nonlinear brainpower. This is always the most effective approach to devising strategies for dealing successfully with challenges and opportunities, in the market arena as on the battlefield." (Kenichi Ohmae, "The Mind Of The Strategist", 1982)

"Culture [is] a pattern of basic assumptions invented, discovered, or developed by a given group as it learns to cope with its problems of external adaptation and internal integration that has worked well enough to be considered valid and, therefore, to be taught to new members as the correct way to perceive, think, and feel in relation to those problems." (Edgar H Schein, "Organizational Culture and Leadership", 1985)

"To keep the business from disintegrating, the concept of information systems architecture is becoming less of an option and more of a necessity." (John Zachman, "A Framework for Information Systems Architecture", 1987)

"Conventional process structures are fragmented and piecemeal, and they lack the integration necessary to maintain quality and service. They are breeding grounds for tunnel vision, as people tend to substitute the narrow goals of their particular department for the larger goals of the process as a whole. When work is handed off from person to person and unit to unit, delays and errors are inevitable. Accountability blurs, and critical issues fall between the cracks." (Michael M Hammer, "Reengineering Work: Don't Automate, Obliterate", Magazine, 1990) [source]

"But the net effect of increasing scale, centralization of capital, vertical integration and diversification within the corporate form of enterprise has been to replace the 'invisible hand' of the market by the 'visible hand' of the managers." (David Harvey, "The Limits To Capital", 2006)

02 November 2007

🏗️Software Engineering: Integration (Just the Quotes)

"With increasing size and complexity of the implementations of information systems, it is necessary to use some logical construct (or architecture) for defining and controlling the interfaces and the integration of all of the components of the system." (John Zachman, "A Framework for Information Systems Architecture", 1987)

"The longer we wait between integrations and acceptance tests, the worse things get. Wait twice as long and we'll have four or more times the hassle. The reason is that one bug written just yesterday is pretty easy to find, while ten or a hundred written weeks ago can become almost impossible." (Ron Jeffries, "Extreme Programming Installed", 2001)

"The main activity of programming is not the origination of new independent programs, but in the integration, modification, and explanation of existing ones." (Terry Winograd, "Beyond Programming Languages", 1991)

"As the size of software systems increases, the algorithms and data structures of the computation no longer constitute the major design problems. When systems are constructed from many components, the organization of the overall system - the software architecture - presents a new set of design problems. This level of design has been addressed in a number of ways including informal diagrams and descriptive terms, module interconnection languages, templates and frameworks for systems that serve the needs of specific domains, and formal models of component integration mechanisms." (David Garlan & Mary Shaw, "An introduction to software architecture", Advances in software engineering and knowledge engineering Vol 1, 1993)

"Enterprise architecture is the organizing logic for business processes and IT infrastructure reflecting the integration and standardization requirements of a company's operation model. […] The key to effective enterprise architecture is to identify the processes, data, technology, and customer interfaces that take the operating model from vision to reality." (Jeanne W Ross et al, "Enterprise architecture as strategy: creating a foundation for business", 2006)

"Enterprise-architecture is the integration of everything the enterprise is and does. Even the term ‘architecture’ is perhaps a little misleading. It’s on a much larger scale, the scale of the whole rather than of single subsystems: more akin to city-planning than to the architecture of a single building. In something this large, there are no simple states of ‘as-is’ versus ‘to-be’, because its world is dynamic, not static. And it has to find some way to manage the messy confusion of what is, rather than the ideal that we might like it to be." (Tom Graves, "Real Enterprise-Architecture : Beyond IT to the whole enterprise", 2007)

"Acceptance testing relies on the ability to execute automated tests in a productionlike environment. However, a vital property of such a test environment is that it is able to successfully support automated testing. Automated acceptance testing is not the same as user acceptance testing. One of the differences is that automated acceptance tests should not run in an environment that includes integration to all external systems. Instead, your acceptance testing should be focused on providing a controllable environment in which the system under test can be run. 'Controllable' in this context means that you are able to create the correct initial state for our tests. Integrating with real external systems removes our ability to do this." (David Farley & Jez Humble, "Continuous Delivery: Reliable Software Releases through Build, Test, and Deployment Automation", 2010)

"Interim solutions, however, acquire inertia (or momentum, depending on your point of view). Because they are there, ultimately useful and widely accepted, there is no immediate need to do anything else. Whenever a stakeholder has to decide what action adds the most value, there will be many that are ranked higher than proper integration of an interim solution. Why? Because it is there, it works, and it is accepted. The only perceived downside is that it does not follow the chosen standards and guidelines - except for a few niche markets, this is not considered to be a significant force." (Klaus Marquardt, [in Kevlin Henney’s "97 Things Every Programmer Should Know", 2010])

"Many processes in software development are repetitive and easily automated. The DRY principle applies in these contexts, as well as in the source code of the application. Manual testing is slow, error-prone, and difficult to repeat, so automated test suites should be used where possible. Integrating software can be time consuming and error-prone if done manually, so a build process should be run as frequently as possible, ideally with every check-in. Wherever painful manual processes exist that can be automated, they should be automated and standardized. The goal is to ensure that there is only one way of accomplishing the task, and it is as painless as possible." (Steve Smith, [in Kevlin Henney’s "97 Things Every Programmer Should Know", 2010])

"In many applications, integration or functional tests are used by default as the standard way to test almost all aspects of the system. However integration and functional tests are not the best way to detect and identify bugs. Because of the large number of components involved in a typical end-to-end test, it can be very hard to know where something has gone wrong. In addition, with so many moving parts, it is extremely difficult, if not completely unfeasible, to cover all of the possible paths through the application." (John F Smart, "Jenkins: The Definitive Guide", 2011)
