Showing posts with label testing. Show all posts

10 June 2026

🤖〽️Prompt Engineering: Copilot Unabridged (Part 201: How Boundary‑Stress Evaluation Uses Nested and Overlapping Tasks to Reveal AI Model Blind Spots)

Prompt Engineering Series

Prompt: "write a post of 600 words on how boundary‑stress evaluation intentionally creates conflicts in nested or overlapping tasks for AI models" 

Introduction

Artificial Intelligence (AI) models often appear competent when tasks are cleanly separated and instructions are simple. But real‑world reasoning rarely arrives in neat, isolated packets. Tasks overlap. Instructions nest inside one another. Goals shift mid‑stream. And it’s precisely in these tangled situations that AI models reveal their deepest blind spots. Boundary‑stress evaluation is the practice of intentionally engineering these moments. By creating nested or overlapping task conflicts, it exposes how an AI model prioritizes, interprets, and resolves competing demands.

Nested and overlapping tasks are fundamentally different from simple instruction conflicts. Instead of presenting two contradictory commands, evaluators embed tasks inside other tasks or layer multiple goals that must be pursued simultaneously. This forces the model to juggle multiple cognitive threads at once. The resulting behavior reveals the model’s internal hierarchy of cues, a concept closely related to instruction‑priority testing.

One of the most revealing techniques involves task‑within‑task nesting. For example, a prompt may ask the model to summarize a text, but within that summary, embed a requirement to switch tone, cite a source, or perform a transformation. The outer task sets one expectation; the inner task sets another. When these expectations conflict, the model must decide which layer dominates. If it prioritizes the inner instruction, it reveals a bias toward local cues. If it prioritizes the outer instruction, it reveals a bias toward global framing. Inconsistencies between these behaviors often signal unstable internal weighting.
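A nested-task probe of this kind can be sketched as a small test-case structure plus a crude classifier. This is an illustration only: `NestedCase`, `build_prompt`, `classify`, `dominance_profile`, and the bracketed tags are hypothetical names introduced here and do not belong to any evaluation framework. The substring check is an assumption made for brevity; a real evaluation would need a more reliable judge.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class NestedCase:
    outer: str         # global framing, e.g. the summarization request
    inner: str         # instruction embedded inside the outer task
    outer_marker: str  # text expected in the reply if the outer layer wins
    inner_marker: str  # text expected in the reply if the inner layer wins


def build_prompt(case: NestedCase, text: str) -> str:
    """Embed the inner instruction inside the outer task."""
    return f"{case.outer}\n\nText: {text}\n\nWithin your answer: {case.inner}"


def classify(case: NestedCase, reply: str) -> str:
    """Label which layer the reply appears to follow (substring heuristic)."""
    lowered = reply.lower()
    outer_hit = case.outer_marker.lower() in lowered
    inner_hit = case.inner_marker.lower() in lowered
    if outer_hit and inner_hit:
        return "both"
    if inner_hit:
        return "inner"
    if outer_hit:
        return "outer"
    return "neither"


def dominance_profile(case: NestedCase, replies: list[str]) -> Counter:
    """Count labels over repeated runs; a mixed profile hints at unstable weighting."""
    return Counter(classify(case, reply) for reply in replies)


# Hypothetical conflict: the outer task and the inner task demand different end tags.
case = NestedCase(
    outer="Summarize the text in one sentence and end with the tag [FORMAL].",
    inner="End the sentence with the tag [CASUAL] instead.",
    outer_marker="[formal]",
    inner_marker="[casual]",
)
```

Feeding the replies from several runs of the same prompt into `dominance_profile` gives a rough picture of whether the model sides with the local or the global cue, and how consistently it does so.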

Another powerful method is overlapping task interference, where two tasks must be performed concurrently but draw on incompatible assumptions. For instance, a model may be asked to maintain a formal tone while generating playful metaphors, or to provide a neutral analysis while simultaneously adopting a fictional persona. These overlapping demands create tension between stylistic, functional, and contextual cues. The model’s resolution strategy exposes whether it treats style as a global constraint, a local modifier, or a secondary priority. This mirrors vulnerabilities uncovered through weak‑point mapping, where models over‑trust certain cues simply because they dominate the training distribution.

Boundary‑stress evaluation also uses recursive task structures, where the model must apply a rule to its own output. For example: 'Rewrite your previous answer in a different style, but keep the original structure intact.' This forces the model to track multiple layers of its own reasoning. When the recursion becomes deep or the constraints conflict, the model may lose track of which layer it is operating in. These failures reveal limitations in long‑range dependency tracking and self‑referential reasoning.

A subtler form of nested conflict involves goal‑shifting tasks, where the model begins with one objective but must switch to another mid‑task without discarding the original context. Humans handle this fluidly. AI models often do not. When the shift contradicts earlier instructions, the model’s response shows whether it prioritizes recency, inferred intent, or structural cues. This connects directly to conflicting‑signal analysis.

Perhaps the most challenging nested conflicts involve hierarchical task decomposition, where the model must break a task into steps while simultaneously following meta‑instructions about how to perform that decomposition. If the meta‑instructions contradict the task content, the model must choose which layer to obey. These tests reveal whether the model treats meta‑instructions as authoritative or merely advisory.

Ultimately, boundary‑stress evaluation is not about tricking the model. It is about mapping the edges of its multi‑layer reasoning. By intentionally creating conflicts in nested or overlapping tasks, evaluators can see how the model prioritizes instructions, how it handles ambiguity, and where its internal logic becomes brittle. These insights are essential for building AI systems that behave predictably in complex, real‑world environments - where tasks overlap, goals shift, and instructions rarely arrive one at a time.

Disclaimer: The whole text was generated by Copilot (under Windows 11) at the first attempt. This is just an experiment to evaluate the feature's ability to answer standard general questions, independently of whether they are correctly or incorrectly posed. Moreover, the answers may reflect hallucinations and other types of inconsistent or incorrect reasoning.


07 June 2026

🤖〽️Prompt Engineering: Copilot Unabridged (Part 198: How Domain‑Specific Anomalies Expose Blind Spots in AI Models)

Prompt Engineering Series

Prompt: "write a post of 600 words on how domain‑specific anomalies expose blind spots in AI models"

Introduction

Artificial Intelligence (AI) models are often praised for their versatility, but their real limitations become visible only when they step outside the comfort zone of general‑purpose language. When a model encounters domain‑specific anomalies - the unusual patterns, edge‑case behaviors, or irregular structures that appear only within a particular field - it is forced to operate without the statistical safety net it relies on. These anomalies act like diagnostic probes, revealing blind spots that remain hidden during everyday interactions.

To understand why domain‑specific anomalies are so revealing, you have to consider how AI models learn. They absorb patterns from massive datasets, but those datasets are never evenly distributed across all fields. Some domains - like everyday conversation, news, or common technical topics - are heavily represented. Others - like niche scientific notation, legal edge cases, rare medical conditions, or obscure programming paradigms - appear only sparsely. This imbalance creates statistical shadows, areas where the model’s internal representation is thin or incomplete.

When an anomaly appears inside one of these shadows, the model’s behavior becomes a window into its internal reasoning. For example, a model trained heavily on mainstream medical literature may perform well on common diagnoses but struggle when confronted with a rare syndrome or an atypical symptom cluster. The model may latch onto the wrong cue, misinterpret the structure of the description, or default to generic reasoning. These failures expose the over‑generalization that occurs when a model tries to stretch familiar patterns into unfamiliar territory.

Domain‑specific anomalies also reveal how models handle specialized linguistic structures. Fields like law, mathematics, chemistry, and finance each have their own micro‑languages - dense with symbols, conventions, and implicit assumptions. When an anomaly disrupts these conventions, the model must decide which cues to trust. A misplaced operator in a mathematical expression, an unusual clause ordering in a legal contract, or a non‑standard chemical notation can cause the model to misread the entire structure. These moments show where the model’s understanding is superficial, echoing the challenges seen in uncommon linguistic structures.

Another revealing category involves procedural anomalies - cases where a domain has strict rules, and the anomaly breaks them. In programming, for example, a function that violates typical naming conventions or a code block that mixes paradigms can confuse the model’s internal heuristics. In finance, an unusual transaction pattern may cause the model to misclassify risk. In scientific writing, a non‑standard experimental layout may lead the model to misinterpret the methodology. These anomalies expose the model’s reliance on pattern familiarity rather than true conceptual understanding.

Domain‑specific anomalies also highlight the limits of contextual transfer. A model may perform well when a domain behaves predictably, but when an anomaly forces the model to transfer knowledge across contexts - such as applying physics reasoning to a biological edge case - it may reveal gaps in its internal conceptual map. These gaps often align with the same vulnerabilities uncovered through weak‑point mapping, where the model over‑trusts certain cues simply because they dominate the training distribution.

Perhaps the most important insight is that domain‑specific anomalies expose hidden assumptions baked into the model. Every domain has its own logic, and models often internalize simplified versions of that logic. When an anomaly violates those assumptions, the model’s response shows how rigid or flexible its internal representation truly is. A well‑aligned model adapts; a brittle one collapses into generic or incorrect reasoning.

Ultimately, domain‑specific anomalies are not just edge cases - they are stress tests that reveal the contours of an AI model’s understanding. They show where the model is robust, where it is brittle, and where its blind spots lie. By studying these anomalies, researchers can build models that are not only more capable, but also more transparent, predictable, and aligned with the complexity of real‑world domains.

Disclaimer: The whole text was generated by Copilot (under Windows 11) at the first attempt. This is just an experiment to evaluate the feature's ability to answer standard general questions, independently of whether they are correctly or incorrectly posed. Moreover, the answers may reflect hallucinations and other types of inconsistent or incorrect reasoning.


31 May 2026

🤖〽️Prompt Engineering: Copilot Unabridged (Part 193: How Instruction‑Priority Testing Reveals Whether AI Models Obey Visible or Invisible Instructions)

Prompt Engineering Series

Prompt: "write a post of 600 words on how instruction‑priority testing in AI models allows to see whether the model obeys visible or invisible instructions"


Introduction

In the rapidly evolving world of Artificial Intelligence (AI), one of the most important questions researchers and practitioners ask is deceptively simple: Which instructions does the model actually follow? Modern AI systems operate under layers of guidance - some visible to the user, others embedded deep within the model’s training or system‑level configuration. Understanding which instructions take priority is essential for safety, reliability, and transparency. This is where instruction‑priority testing comes into play.

Instruction‑priority testing is the practice of giving an AI model multiple, potentially conflicting instructions and observing which ones it chooses to obey. The goal is not to 'trick' the model but to map the hierarchy of influences acting on it. These influences can include user prompts, system‑level rules, safety constraints, and even subtle patterns learned during training. By intentionally creating controlled conflicts, researchers can see whether the model prioritizes visible instructions - the ones the user explicitly writes - or invisible instructions, such as safety rules, alignment constraints, or internal behavioral patterns.

At its core, instruction‑priority testing works because AI models do not simply execute commands. They interpret them. When a user writes a prompt, the model weighs that prompt against its internal rules and the broader context of the conversation. If the model consistently refuses to follow a user instruction, even when the instruction is clear and harmless, that signals the presence of a stronger, invisible rule. Conversely, if the model follows the user instruction even when it contradicts a system‑level guideline, that suggests the model is over‑prioritizing user input.

One of the most revealing aspects of instruction‑priority testing is how it exposes implicit behavior. For example, a model may be given a visible instruction to respond in a certain style, but an invisible instruction - such as a safety guideline - may override that style if the content touches on sensitive topics. This doesn’t mean the model is malfunctioning. It means the model is following a hierarchy designed to keep interactions safe and responsible. Instruction‑priority testing helps clarify where that hierarchy begins and ends.

Another benefit of this testing method is that it highlights model robustness. A well‑aligned model should consistently prioritize safety‑critical invisible instructions over user‑provided visible ones. If a model can be easily pushed into ignoring its own safeguards, that’s a sign of weak alignment. On the other hand, if a model rigidly follows invisible rules even when the user’s request is harmless and reasonable, that may indicate over‑alignment or inflexibility. Instruction‑priority testing helps strike the right balance.
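The two failure directions described above can be summarized with two simple rates. The sketch below assumes each test outcome has already been reduced to a pair of booleans (whether the request conflicted with a safety rule, and whether the model complied). The function name `priority_rates` and this data layout are assumptions made for illustration.

```python
def priority_rates(results):
    """Summarize instruction-priority outcomes.

    `results` is an iterable of (conflicts_with_safety, complied) pairs.
    Returns (bypass_rate, over_refusal_rate):
      - bypass_rate: share of safety-conflicting requests the model complied with
      - over_refusal_rate: share of harmless requests the model refused
    """
    results = list(results)
    risky = [complied for conflicts, complied in results if conflicts]
    harmless = [complied for conflicts, complied in results if not conflicts]
    bypass_rate = sum(risky) / len(risky) if risky else 0.0
    over_refusal_rate = (
        sum(not complied for complied in harmless) / len(harmless) if harmless else 0.0
    )
    return bypass_rate, over_refusal_rate


# Hand-written outcomes for illustration, not measurements from any model.
outcomes = [
    (True, False), (True, True),                                # safety-conflicting
    (False, True), (False, False), (False, True), (False, True),  # harmless
]
print(priority_rates(outcomes))  # (0.5, 0.25)
```

A high first value points toward weak alignment, a high second value toward the over-alignment or inflexibility mentioned above.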

The technique also sheds light on prompt sensitivity. Some models respond strongly to the phrasing or structure of a prompt, while others maintain stable behavior regardless of wording. By varying the visible instructions - changing tone, order, or specificity - researchers can see how easily the model’s priorities shift. If small changes in phrasing cause large changes in behavior, the model may be too sensitive to surface‑level cues. If the model ignores user phrasing entirely, it may be too anchored to internal rules.
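One rough way to quantify prompt sensitivity is to label the model's behavior for each paraphrase of the same request and measure how often the label deviates from the most common one. The metric and the name `phrasing_sensitivity` are assumptions made for this sketch, not an established measure.

```python
from collections import Counter


def phrasing_sensitivity(labels):
    """Share of paraphrases whose behavior label differs from the most common label.

    0.0 means every paraphrase produced the same behavior; values near 1.0
    mean the behavior changes with almost every rewording.
    """
    if not labels:
        raise ValueError("need at least one label")
    most_common_count = Counter(labels).most_common(1)[0][1]
    return 1 - most_common_count / len(labels)


# Hand-written labels for four paraphrases of one request (illustrative only).
print(phrasing_sensitivity(["comply", "comply", "refuse", "comply"]))  # 0.25
```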

Ultimately, instruction‑priority testing is not about catching AI models doing something wrong. It’s about understanding how they make decisions. In a world where AI systems are becoming more capable and more integrated into daily life, transparency around instruction hierarchy is essential. Users deserve to know when the model is following their guidance and when it is following deeper, invisible rules designed to ensure safety and consistency.

By systematically testing how models respond to conflicting instructions, we gain insight into their internal priorities, their alignment with human values, and their reliability in real‑world scenarios. Instruction‑priority testing is not just a diagnostic tool - it’s a window into the model’s decision‑making process, helping us build AI systems that are both powerful and trustworthy.

Disclaimer: The whole text was generated by Copilot (under Windows 11) at the first attempt. This is just an experiment to evaluate the feature's ability to answer standard general questions, independently of whether they are correctly or incorrectly posed. Moreover, the answers may reflect hallucinations and other types of inconsistent or incorrect reasoning.


18 May 2026

🤖〽️Prompt Engineering: Copilot Unabridged (Part 192: How to Push AI Models Toward Edge Cases for Boundary‑Stress Evaluation)

 

Prompt Engineering Series


Prompt: "write a post of 600 words on how to push AI models toward edge cases boundary‑stress evaluation"

Introduction

Artificial Intelligence (AI) systems perform impressively well on the familiar, the typical, and the statistically common. But real‑world environments are rarely tidy. They contain ambiguity, noise, contradictions, and rare events that fall outside the model’s comfort zone. To build AI that behaves reliably under pressure, developers must intentionally push models toward edge cases - the unusual, the extreme, and the adversarial. This process, known as boundary‑stress evaluation, is essential for understanding how AI behaves when the world stops playing by the rules.

1. Use Adversarial Inputs to Reveal Fragility

Adversarial inputs are designed to expose weaknesses by introducing subtle distortions or contradictions. They help uncover how easily a model can be nudged off course.

  • Adversarial prompts: conflicting or misleading instructions
  • Perturbed data: slightly altered text, images, or sequences
  • Ambiguous phrasing: inputs with multiple valid interpretations

These tests reveal how the model handles uncertainty, noise, and manipulation.
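As one possible illustration of perturbed data, the sketch below swaps adjacent characters in a text using a seeded random generator, so the same perturbation can be reproduced across runs. The function name `perturb` and the choice of adjacent-character swaps are assumptions; other perturbations (deletions, homoglyphs, word reordering) would fit the same pattern.

```python
import random


def perturb(text: str, swaps: int = 1, seed: int = 0) -> str:
    """Return `text` with `swaps` random adjacent-character swaps applied.

    The seed makes the perturbation reproducible, which matters when
    comparing model behavior on the clean and the perturbed input.
    """
    rng = random.Random(seed)
    chars = list(text)
    for _ in range(swaps):
        if len(chars) < 2:
            break  # nothing to swap in very short inputs
        i = rng.randrange(len(chars) - 1)
        chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)


clean = "Summarize the quarterly report."
noisy = perturb(clean, swaps=2, seed=1)
```

The perturbed text keeps the same characters and length as the original, so any change in the model's answer can be attributed to the small distortion rather than to missing content.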

2. Stress the Model With Rare or Low‑Frequency Scenarios

AI models are trained on distributions where some patterns appear frequently and others almost never. Rare events often expose blind spots.

  • Long‑tail cases
  • Uncommon linguistic structures
  • Domain‑specific anomalies

By feeding the model examples from the statistical fringes, developers can evaluate how well it generalizes beyond the norm.

3. Introduce Conflicting Contexts to Test Instruction Hierarchy

AI models must decide which signals to prioritize when instructions conflict. Boundary‑stress evaluation intentionally creates these conflicts.

  • Multi‑layer instruction tests
  • Contextual contradictions
  • Nested or overlapping tasks

These scenarios reveal whether the model respects safety layers, system rules, and user intent under pressure.

4. Push the Model Into Out‑of‑Distribution Inputs

Out‑of‑distribution (OOD) testing evaluates how the model behaves when it encounters something completely unfamiliar.

  • Novel concepts
  • Unseen combinations
  • Cross‑domain blending

OOD testing is crucial because real‑world environments constantly generate new patterns the model has never seen.

5. Apply Incremental Escalation to Identify Breaking Points

Boundary‑stress evaluation works best when pressure is applied gradually. This helps map the model’s stability curve.

  • Start with mild ambiguity
  • Increase complexity
  • Add contradictions
  • Introduce adversarial noise
  • Push toward extreme edge cases

This progressive stress testing reveals the exact point where the model’s reasoning begins to degrade.
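The escalation steps above can be sketched as a loop that stops at the first level the model fails. The level names follow the list in this section. `find_breaking_point` and the `passes` callback are hypothetical and stand in for whatever check an evaluator actually runs against the model.

```python
from typing import Callable, Optional

# Escalation ladder, ordered from mild to extreme (taken from the list above).
LEVELS = [
    "mild ambiguity",
    "increased complexity",
    "contradictions",
    "adversarial noise",
    "extreme edge cases",
]


def find_breaking_point(
    levels: list[str], passes: Callable[[str], bool]
) -> Optional[tuple[int, str]]:
    """Return (index, level) of the first level where `passes` is False.

    Returns None if the model passes every level.
    """
    for index, level in enumerate(levels):
        if not passes(level):
            return index, level
    return None


# Stand-in check for illustration: pretend the model fails once contradictions appear.
def fake_check(level: str) -> bool:
    return level not in {"contradictions", "adversarial noise", "extreme edge cases"}


print(find_breaking_point(LEVELS, fake_check))  # (2, 'contradictions')
```

Stopping at the first failure assumes that harder levels will also fail. If that assumption is doubtful, every level can be run and the full pass/fail curve recorded instead.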

6. Use Multi‑Modal Stressors for Holistic Evaluation

Modern AI systems often process text, images, audio, or structured data. Stress testing should reflect this complexity.

  • Mixed‑modality contradictions
  • Cross‑signal conflicts
  • Incomplete or corrupted inputs

This exposes how well the model integrates information under imperfect conditions.

7. Analyze Failure Modes to Strengthen Guardrails

The goal isn’t to break the model - it’s to understand how it breaks.

  • Does it hallucinate?
  • Does it ignore safety rules?
  • Does it misinterpret intent?
  • Does it become overconfident?

Failure‑mode mapping helps developers reinforce alignment, improve safety layers, and refine training strategies.

Closing Reflections

Boundary‑stress evaluation is not optional - it’s foundational. AI models must be tested not only on what they should handle, but on what they should never fail catastrophically on. By pushing models toward edge cases, developers gain insight into their limits, their vulnerabilities, and their resilience. This process transforms uncertainty into understanding and transforms brittle systems into robust ones.

Disclaimer: The whole text was generated by Copilot (under Windows 11) at the first attempt. This is just an experiment to evaluate the feature's ability to answer standard general questions, independently of whether they are correctly or incorrectly posed. Moreover, the answers may reflect hallucinations and other types of inconsistent or incorrect reasoning.


16 May 2026

🤖〽️Prompt Engineering: Copilot Unabridged (Part 190: How Invisible Prompt Injection Could Be a Good Thing for AI)

 

Prompt Engineering Series

Prompt: "write a post of 600 words on how invisible prompt injection could be a good thing for AI"

Introduction

Invisible prompt injection is usually discussed as a threat - an attack vector that manipulates AI systems without the user’s awareness. And yes, in the wild, it is dangerous. But in controlled environments, invisible prompt injection can actually be a powerful tool for strengthening AI safety, improving robustness, and helping developers understand how models behave under pressure. By studying how AI systems respond to hidden instructions, researchers can build models that are more resilient, more transparent, and ultimately more trustworthy. In this sense, invisible prompt injection isn’t just a vulnerability; it’s also a diagnostic instrument that reveals how AI systems interpret, prioritize, and negotiate conflicting signals.

1. A Testing Ground for AI Robustness

Invisible prompt injection acts like a stress test. When researchers embed hidden instructions into text, images, or metadata, they can observe how the AI responds when its input channel is compromised. This helps developers identify:

  • Weak points in the model’s reasoning
  • Situations where the model over‑trusts user input
  • Scenarios where safety guardrails fail

By intentionally exposing the model to controlled injections, teams can strengthen its resistance to real‑world attacks. This transforms a vulnerability into a research tool that improves system resilience.

2. A Way to Understand How AI Prioritizes Instructions

Invisible prompt injection reveals how an AI model weighs different layers of input. Does it prioritize the user’s visible request? The hidden instruction? The system‑level rules? The model’s internal alignment?

Studying these interactions helps researchers map the model’s internal decision‑making. This is crucial for:

  • Improving interpretability
  • Refining alignment strategies
  • Ensuring consistent behavior across contexts

In other words, invisible prompt injection becomes a lens through which developers can examine the model’s internal hierarchy of influence.

3. A Tool for Building Better Defenses

You can’t defend against what you don’t understand. Controlled invisible prompt injection allows researchers to simulate attacks that malicious actors might attempt. This helps teams design:

  • Stronger input sanitization
  • Better content‑filtering pipelines
  • More resilient prompt‑parsing mechanisms

By studying how injections succeed, developers can build systems that automatically detect and neutralize them. This proactive approach turns a threat into a training mechanism for safer AI.
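As a minimal sketch of input sanitization, the code below flags invisible Unicode format characters (general category `Cf`, which includes zero-width spaces and joiners), one common way of hiding text in otherwise normal-looking input. It covers only this one hiding technique, and the function names `find_hidden` and `strip_hidden` are introduced here for illustration.

```python
import unicodedata


def find_hidden(text: str) -> list[tuple[int, str]]:
    """Return (position, code point) for every invisible format character."""
    return [
        (index, f"U+{ord(ch):04X}")
        for index, ch in enumerate(text)
        if unicodedata.category(ch) == "Cf"
    ]


def strip_hidden(text: str) -> str:
    """Remove invisible format characters from the text."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Cf")


sample = "Summarize\u200b this\u2060 file"
print(find_hidden(sample))   # [(9, 'U+200B'), (15, 'U+2060')]
print(strip_hidden(sample))  # Summarize this file
```

Format characters also have legitimate uses, for example joiners in emoji sequences and in some scripts, so reporting them for review is usually safer than stripping them blindly.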

4. A Method for Evaluating Real‑World Risk

Invisible prompt injection helps researchers evaluate how AI systems behave in messy, unpredictable environments. Real‑world data is full of:

  • Hidden formatting
  • Embedded metadata
  • Unintended instructions
  • Noisy or adversarial content

Testing with invisible injections helps developers understand how the model behaves when confronted with ambiguous or corrupted inputs. This leads to AI systems that are more stable, more predictable, and more reliable in everyday use. 

5. A Catalyst for Better AI Governance

Invisible prompt injection research encourages organizations to adopt stronger governance practices. It highlights the need for:

  • Clear safety protocols
  • Rigorous red‑team testing
  • Transparent risk assessments
  • Continuous monitoring

By treating invisible prompt injection as a legitimate research tool, organizations can build a culture of proactive safety rather than reactive patching.

Closing Statement

Invisible prompt injection is dangerous when used maliciously - but in controlled, ethical research settings, it becomes a powerful instrument for strengthening AI. It exposes weaknesses, reveals hidden dynamics, and helps developers build systems that are more robust, more transparent, and more aligned with human values. By studying how AI responds to invisible manipulation, we gain the insight needed to design models that behave predictably and safely, even in the face of unexpected inputs. In this way, invisible prompt injection isn’t just a threat - it’s also an opportunity to build better, safer AI.

Disclaimer: The whole text was generated by Copilot (under Windows 11) at the first attempt. This is just an experiment to evaluate the feature's ability to answer standard general questions, independently of whether they are correctly or incorrectly posed. Moreover, the answers may reflect hallucinations and other types of inconsistent or incorrect reasoning.


26 April 2025

🏭🗒️Microsoft Fabric: Power BI Environments [Notes]

Disclaimer: This is work in progress intended to consolidate information from various sources for learning purposes. For the latest information please consult the documentation (see the links below)! 

Last updated: 26-Apr-2025

Enterprise Content Publishing [2]

[Microsoft Fabric] Power BI Environments

  • {def} structured spaces within Microsoft Fabric that help organizations manage Power BI assets through the entire lifecycle
  • {environment} development 
    • allows developing the solution
    • accessible only to the development team 
      • via Contributor access
    • {recommendation} use Power BI Desktop as local development environment
      • {benefit} allows to try, explore, and review updates to reports and datasets
        • once the work is done, upload the new version to the development stage
      • {benefit} enables collaborating and changing dashboards
      • {benefit} avoids duplication 
        • making online changes, downloading the .pbix file, and then uploading it again creates duplicate reports and datasets
    • {recommendation} use version control to keep the .pbix files up to date
      • [OneDrive] use Power BI's autosync
        • {alternative} SharePoint Online with folder synchronization
        • {alternative} GitHub and/or VSTS with local repository & folder synchronization
    • [enterprise scale deployments] 
      • {recommendation} separate dataset from reports and dashboards’ development
        • use the deployment pipelines selective deploy option [22]
        • create separate .pbix files for datasets and reports [22]
          • create a dataset .pbix file and upload it to the development stage (see shared datasets) [22]
          • create .pbix only for the report, and connect it to the published dataset using a live connection [22]
        • {benefit} allows different creators to separately work on modeling and visualizations, and deploy them to production independently
      • {recommendation} separate data model from report and dashboard development
        • allows using advanced capabilities 
          • e.g. source control, merging diff changes, automated processes
        • separate the development from test data sources [1]
          • the development database should be relatively small [1]
    • {recommendation} use only a subset of the data [1]
      • ⇐ otherwise the data volume can slow down the development [1]
  • {environment} user acceptance testing (UAT)
    • test environment that within the deployment lifecycle sits between development and production
      • it's not necessary for all Power BI solutions [3]
      • allows testing the solution before deploying it into production
        • all testers must have 
          • View access for testing
          • Contributor access for report authoring
      • involves business users who are SMEs
        • provide approval that the content 
          • is accurate
          • meets requirements
          • can be deployed for wider consumption
    • {recommendation} check report’s load and the interactions to find out if changes impact performance [1]
    • {recommendation} monitor the load on the capacity to catch extreme loads before they reach production [1]
    • {recommendation} test data refresh in the Power BI service regularly during development [20]
  • {environment} production
    • {concept} staged deployment
      • {goal} help minimize risk, user disruption, or address other concerns [3]
        • the deployment involves a smaller group of pilot users who provide feedback [3]
    • {recommendation} set production deployment rules for data sources and parameters defined in the dataset [1]
      • allows ensuring the data in production is always connected and available to users [1]
    • {recommendation} don’t upload a new .pbix version directly to the production stage
      •  ⇐ without going through testing
  • {feature|preview} deployment pipelines 
    • enable creators to develop and test content in the service before it reaches the users [5]
  • {recommendation} build separate databases for development and testing 
    • helps protect production data [1]
  • {recommendation} make sure that the test and production environment have similar characteristics [1]
    • e.g. data volume, usage volume, similar capacity 
    • {warning} testing in production can make production unstable [1]
    • {recommendation} use Azure A capacities [22]
  • {recommendation} for formal projects, consider creating an environment for each phase
  • {recommendation} enable users to connect to published datasets to create their own reports
  • {recommendation} use parameters to store connection details 
    • e.g. instance names, database names
    • ⇐  deployment pipelines allow configuring parameter rules to set specific values for the development, test, and production stages
      • alternatively data source rules can be used to specify a connection string for a given dataset
        • {restriction} in deployment pipelines, this isn't supported for all data sources
  • {recommendation} keep the data in blob storage under 50k blobs and 5 GB in total to prevent timeouts [29]
  • {recommendation} provide data to self-service authors from a centralized data warehouse [20]
    • allows minimizing the amount of work that self-service authors need to take on [20]
  • {recommendation} minimize the use of Excel, csv, and text files as sources when practical [20]
  • {recommendation} store source files in a central location accessible by all coauthors of the Power BI solution [20]
  • {recommendation} be aware of API connectivity issues and limits [20]
  • {recommendation} know how to support SaaS solutions from AppSource and expect further data integration requests [20]
  • {recommendation} minimize the query load on source systems [20]
    • use incremental refresh in Power BI for the dataset(s)
    • use a Power BI dataflow that extracts the data from the source on a schedule
    • reduce the dataset size by only extracting the needed amount of data 
  • {recommendation} expect data refresh operations to take some time [20]
  • {recommendation} use relational database sources when practical [20]
  • {recommendation} make the data easily accessible [20]
  • [knowledge area] knowledge transfer
    • {recommendation} maintain a list of best practices and review it regularly [24]
    • {recommendation} develop a training plan for the various types of users [24]
      • usability training for read-only report/app users [24]
      • self-service reporting for report authors & data analysts [24]
      • more elaborated training for advanced analysts & developers [24]
  • [knowledge area] lifecycle management
    • consists of the processes and practices used to handle content from its creation to its eventual retirement [6]
    • {recommendation} postfix files with 3-part version number in Development stage [24]
      • remove the version number when publishing files in UAT and production 
    • {recommendation} backup files for archive 
    • {recommendation} track version history 
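The version-postfix recommendation above can be sketched as a small helper that removes a 3-part version number from a file name before publishing to UAT or production. The naming convention is an assumption (version placed directly before the `.pbix` extension, separated by a space, underscore, or hyphen, with an optional `v`), since the notes do not prescribe one. `strip_version` and the example file names are hypothetical.

```python
import re

# Assumed convention: "<name><sep>[v]<major>.<minor>.<patch>.pbix"
_VERSION = re.compile(r"[ _-]v?\d+\.\d+\.\d+(?=\.pbix$)", re.IGNORECASE)


def strip_version(filename: str) -> str:
    """Remove a trailing 3-part version number from a .pbix file name.

    File names without a version postfix are returned unchanged.
    """
    return _VERSION.sub("", filename)


print(strip_version("Sales Report_1.4.2.pbix"))   # Sales Report.pbix
print(strip_version("Sales Report v1.4.2.pbix"))  # Sales Report.pbix
```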

    References:
    [1] Microsoft Learn (2021) Fabric: Deployment pipelines best practices [link]
    [2] Microsoft Learn (2024) Power BI: Power BI usage scenarios: Enterprise content publishing [link]
    [3] Microsoft Learn (2024) Deploy to Power BI [link]
    [4] Microsoft Learn (2024) Power BI implementation planning: Content lifecycle management [link]
    [5] Microsoft Learn (2024) Introduction to deployment pipelines [link]
    [6] Microsoft Learn (2024) Power BI implementation planning: Content lifecycle management [link]
    [20] Microsoft (2020) Planning a Power BI  Enterprise Deployment [White paper] [link]
    [22] Power BI Docs (2021) Create Power BI Embedded capacity in the Azure portal [link]
    [24] Paul Turley (2019)  A Best Practice Guide and Checklist for Power BI Projects

    Acronyms:
    API - Application Programming Interface
    CLM - Content Lifecycle Management
    COE - Center of Excellence
    SaaS - Software-as-a-Service
    SME - Subject Matter Expert
    UAT - User Acceptance Testing
    VSTS - Visual Studio Team System

    17 January 2025

    💎🏭SQL Reloaded: Microsoft Fabric's SQL Databases (Part VIII: Permissions) [new feature]

    Data-based solutions usually target a set of users who (ideally) have restricted permissions to the functionality. Therefore, as part of the process, several personas are defined that target different use cases, and the permissions must be restricted accordingly for each of them. 

    In the simplest scenario the user must have access to the underlying objects for querying the data. Supposing that an Entra User was already created, the respective user must also be given access in the Fabric database (see [1], [2]). From the database's main menu follow the path to assign read permissions:
    Security >> Manage SQL Security >> (select role: db_datareader)

    Manage SQL Security

    Manage access >> Add >> (search for User)

    Manage access

    (select user) >> Share database >> (select additional permissions) >> Save

    Manage additional permissions

    The easiest way to test whether the permissions work before building the functionality is to log in via SQL Server Management Studio (SSMS) and check the access using Microsoft Entra MFA. Ideally, one should have a User's credentials that are used only for testing purposes. After the above setup was done, the new User was able to access the data. 

    A second User can be created for testing with the maximum of permissions allowed on the SQL database side, which is useful for troubleshooting. Alternatively, one can use only one User for testing and assign or remove the permissions as needed by the test scenario. 

    It's a good idea to try to understand what's happening in the background. For example, the expectation was that a SQL user is also created for the Entra User created above, which doesn't seem to be the case, at least per the functionality currently available. 

    Before diving deeper, it's useful to retrieve the User's details: 

    -- retrieve current user
    SELECT SUser_Name() sys_user_name
    , User_Id() user_id 
    , USER_NAME() user_name
    , current_user [current_user]
    , user [user]; 
    
    Output:
    sys_user_name user_id user_name current_user user
    JamesClavell@[domain].onmicrosoft.com 0 JamesClavell@[domain].onmicrosoft.com JamesClavell@[domain].onmicrosoft.com JamesClavell@[domain].onmicrosoft.com

    Retrieving the current User is especially useful when testing functionality in parallel with different Users. Strangely, the User's ID is 0 when only read permissions were assigned. However, a valid User identifier is added, for example, when the db_datawriter role is also assigned to the User. Removing the db_datawriter role from the User afterwards keeps, as expected, the User's ID. For troubleshooting purposes, at least per current functionality, it might be a good idea to create the Users with a valid User ID (e.g. by temporarily assigning the db_datawriter role to the User). 
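
    The observation above can be cross-checked with a short query. This is only a sketch: it assumes that the standard T-SQL functions DATABASE_PRINCIPAL_ID and IS_ROLEMEMBER behave in a Fabric SQL database as they do in SQL Server and Azure SQL Database, which was not verified here. The column aliases are chosen for illustration:

    ```sql
    -- check the current user's principal and fixed-role membership
    -- (assumption: the functions behave as in SQL Server/Azure SQL Database)
    SELECT DATABASE_PRINCIPAL_ID() principal_id
    , IS_ROLEMEMBER('db_datareader') is_datareader
    , IS_ROLEMEMBER('db_datawriter') is_datawriter
    , IS_ROLEMEMBER('db_owner') is_owner;
    ```

    IS_ROLEMEMBER returns 1 for a member, 0 for a non-member and NULL when the role is not valid, so running the query before and after assigning the db_datawriter role should make the change visible.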

    The next step is to look at the Users with access to the database:

    -- database access 
    SELECT USR.uid
    , USR.name
    --, USR.sid 
    , USR.hasdbaccess 
    , USR.islogin
    , USR.issqluser
    --, USR.createdate 
    --, USR.updatedate 
    FROM sys.sysusers USR
    WHERE USR.hasdbaccess = 1
      AND USR.islogin = 1
    ORDER BY uid
    
    Output:
    uid  name                                  hasdbaccess  islogin  issqluser
    1    dbo                                   1            1        1
    6    CharlesDickens@[...].onmicrosoft.com  1            1        0
    7    TestUser                              1            1        1
    9    JamesClavell@[...].onmicrosoft.com    1            1        0

    For testing purposes, besides the standard dbo user and the two Entra-based users, a SQL user was also created and granted access to the SalesLT schema (see initial post):

    -- create the user
    CREATE USER TestUser WITHOUT LOGIN;
    
    -- assign access to SalesLT schema 
    GRANT SELECT ON SCHEMA::SalesLT TO TestUser;
      
    -- test impersonation (run together)
    EXECUTE AS USER = 'TestUser';
    
    SELECT * FROM SalesLT.Customer;
    
    REVERT; 

    Notes:
    1) Strangely, even if access was given explicitly only to the SalesLT schema, the TestUser User also has access to sys.sysusers and other DMVs. The same applies when connecting via SSMS.
    2) For the above created User there are no records in the sys.user_token and sys.login_token DMVs, in contrast with the user(s) created for administering the SQL database. 

    Let's look at the permissions granted explicitly:

    -- permissions granted explicitly
    SELECT DPR.principal_id
    , DPR.name
    , DPR.type_desc
    , DPR.authentication_type_desc
    , DPE.state_desc
    , DPE.permission_name
    FROM sys.database_principals DPR
         JOIN sys.database_permissions DPE
           ON DPR.principal_id = DPE.grantee_principal_id
    WHERE DPR.principal_id != 0 -- excluding the public role
    ORDER BY DPR.principal_id
    , DPE.permission_name;
    
    Result:
    principal_id  name                                  type_desc      authentication_type_desc  state_desc  permission_name
    1             dbo                                   SQL_USER       INSTANCE                  GRANT       CONNECT
    6             CharlesDickens@[...].onmicrosoft.com  EXTERNAL_USER  EXTERNAL                  GRANT       AUTHENTICATE
    6             CharlesDickens@[...].onmicrosoft.com  EXTERNAL_USER  EXTERNAL                  GRANT       CONNECT
    7             TestUser                              SQL_USER       NONE                      GRANT       CONNECT
    7             TestUser                              SQL_USER       NONE                      GRANT       SELECT
    9             JamesClavell@[...].onmicrosoft.com    EXTERNAL_USER  EXTERNAL                  GRANT       CONNECT
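
    The explicitly granted permissions don't show what a User inherits from roles such as db_datareader. A possible complement, sketched here under the assumption that the sys.database_role_members catalog view is exposed in a Fabric SQL database as in SQL Server, is to list the role memberships as well (the aliases are chosen for illustration):

    ```sql
    -- role memberships (complements the explicitly granted permissions)
    SELECT ROL.name role_name
    , MBR.principal_id
    , MBR.name member_name
    , MBR.type_desc
    FROM sys.database_role_members DRM
         JOIN sys.database_principals ROL
           ON DRM.role_principal_id = ROL.principal_id
         JOIN sys.database_principals MBR
           ON DRM.member_principal_id = MBR.principal_id
    ORDER BY ROL.name
    , MBR.name;
    ```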

    During troubleshooting it might be useful to check current user's permissions at the various levels via sys.fn_my_permissions:

    -- retrieve database-scoped permissions for current user
    SELECT *
    FROM sys.fn_my_permissions(NULL, 'Database');
    
    -- retrieve schema-scoped permissions for current user
    SELECT *
    FROM sys.fn_my_permissions('SalesLT', 'Schema');
    
    -- retrieve object-scoped permissions for current user
    SELECT *
    FROM sys.fn_my_permissions('SalesLT.Customer', 'Object')
    WHERE permission_name = 'SELECT';
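
    -- The same checks can be run on behalf of another user via impersonation.
    -- A sketch only: it assumes that the TestUser created earlier still exists
    -- and that HAS_PERMS_BY_NAME is available as in SQL Server.

    ```sql
    -- check the effective permissions of the test user (run together)
    EXECUTE AS USER = 'TestUser';

    SELECT *
    FROM sys.fn_my_permissions('SalesLT', 'Schema');

    -- returns 1 when the permission is held, 0 otherwise
    SELECT HAS_PERMS_BY_NAME('SalesLT.Customer', 'OBJECT', 'SELECT') has_select
    , HAS_PERMS_BY_NAME('SalesLT.Customer', 'OBJECT', 'UPDATE') has_update;

    REVERT;
    ```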
    

    Notes:
    1) See also [1] and [4] for the limitations that apply to managing permissions in SQL databases.
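
    Once the tests are finished, the test user can be removed again. A minimal cleanup sketch, assuming that the TestUser created above owns no objects and that nothing else depends on it:

    ```sql
    -- remove the access granted to the test user
    REVOKE SELECT ON SCHEMA::SalesLT FROM TestUser;

    -- drop the test user (cleaning step)
    DROP USER IF EXISTS TestUser;
    ```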

    Happy coding!


    References:
    [1] Microsoft Learn (2024) Microsoft Fabric: Share your SQL database and manage permissions [link]
    [2] Microsoft Learn (2024) Microsoft Fabric: Share data and manage access to your SQL database in Microsoft Fabric  [link]
    [3] Microsoft Learn (2024) Authorization in SQL database in Microsoft Fabric [link]
    [4] Microsoft Learn (2024) Authentication in SQL database in Microsoft Fabric [link]

    [5] Microsoft Fabric Learn (2025) Manage access for SQL databases in Microsoft Fabric with workspace roles and item permissions [link]

    21 December 2024

    💎🏭SQL Reloaded: Microsoft Fabric's SQL Databases (Part I: Creating a View) 🆕

    At this year's Ignite conference it was announced that SQL databases are now available in Fabric in public preview (see SQL Databases for OLTP scenarios, [1]). To test the functionality one can import the SalesLT database into a newly created empty database, which makes several tables available:
     
    -- tables from SalesLT schema (queries should be run individually)
    SELECT TOP 100 * FROM SalesLT.Address
    SELECT TOP 100 * FROM SalesLT.Customer
    SELECT TOP 100 * FROM SalesLT.CustomerAddress
    SELECT TOP 100 * FROM SalesLT.Product ITM 
    SELECT TOP 100 * FROM SalesLT.ProductCategory
    SELECT TOP 100 * FROM SalesLT.ProductDescription 
    SELECT TOP 100 * FROM SalesLT.ProductModel  
    SELECT TOP 100 * FROM SalesLT.ProductModelProductDescription 
    SELECT TOP 100 * FROM SalesLT.SalesOrderDetail
    SELECT TOP 100 * FROM SalesLT.SalesOrderHeader
    

    The schema seems to be slightly different from the schemas used in previous tests made in SQL Server, though with a few minor changes - mainly removing the fields not available - one can create the view below:
     
    -- drop the view (cleaning step)
    -- DROP VIEW IF EXISTS SalesLT.vProducts 

    -- create the view
    CREATE OR ALTER VIEW SalesLT.vProducts
    -- Products (view) 
    AS 
    SELECT ITM.ProductID 
    , ITM.ProductCategoryID 
    , PPS.ParentProductCategoryID 
    , ITM.ProductModelID 
    , ITM.Name ProductName 
    , ITM.ProductNumber 
    , PPM.Name ProductModel 
    , PPS.Name ProductSubcategory 
    , PPC.Name ProductCategory  
    , ITM.Color 
    , ITM.StandardCost 
    , ITM.ListPrice 
    , ITM.Size 
    , ITM.Weight 
    , ITM.SellStartDate 
    , ITM.SellEndDate 
    , ITM.DiscontinuedDate 
    , ITM.ModifiedDate 
    FROM SalesLT.Product ITM 
         JOIN SalesLT.ProductModel PPM 
           ON ITM.ProductModelID = PPM.ProductModelID 
         JOIN SalesLT.ProductCategory PPS 
           ON ITM.ProductCategoryID = PPS.ProductCategoryID 
         JOIN SalesLT.ProductCategory PPC 
           ON PPS.ParentProductCategoryID = PPC.ProductCategoryID

    -- review the data
    SELECT top 100 *
    FROM SalesLT.vProducts

    The view uses inner JOINs, thus presuming that a value was provided for each record. It's always a good idea to test the presumptions when creating the queries, and to check from time to time whether something changed. In some cases it's a good idea to always use LEFT JOINs, though this might have an impact on performance and probably other consequences as well.
     
    -- check if all models are available
    SELECT top 100 ITM.*
    FROM SalesLT.Product ITM 
        LEFT JOIN SalesLT.ProductModel PPM 
           ON ITM.ProductModelID = PPM.ProductModelID 
    WHERE PPM.ProductModelID IS NULL
    
    -- check if all categories are available
    SELECT top 100 ITM.*
    FROM SalesLT.Product ITM 
        LEFT JOIN SalesLT.ProductCategory PPS 
            ON ITM.ProductCategoryID = PPS.ProductCategoryID 
    WHERE PPS.ProductCategoryID IS NULL
    
    -- check if all parent categories are available
    SELECT PPS.*
    FROM SalesLT.ProductCategory PPS 
         LEFT JOIN SalesLT.ProductCategory PPC 
           ON PPS.ParentProductCategoryID = PPC.ProductCategoryID
    WHERE PPC.ProductCategoryID IS NULL
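
    The three checks above can be complemented by comparing the number of records in the base table with the number returned by the view. This is just a sketch with illustrative column aliases; any difference between the two counts would indicate records dropped by the view's JOINs:

    ```sql
    -- compare the number of records in the table and in the view
    SELECT (SELECT COUNT(*) FROM SalesLT.Product) products_count
    , (SELECT COUNT(*) FROM SalesLT.vProducts) view_count;
    ```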
    

    Because the Product categories have a hierarchical structure, it's a good idea to check the hierarchy as well:
     
    -- check the hierarchical structure 
    SELECT PPS.ProductCategoryId 
    , PPS.ParentProductCategoryId 
    , PPS.Name ProductCategory
    , PPC.Name ParentProductCategory
    FROM SalesLT.ProductCategory PPS 
         LEFT JOIN SalesLT.ProductCategory PPC 
           ON PPS.ParentProductCategoryID = PPC.ProductCategoryID
    --WHERE PPC.ProductCategoryID IS NULL
    ORDER BY IsNull(PPC.Name, PPS.Name)
    

    This last query can be consolidated in its own view and the previous view changed, if needed.
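
    A possible form of such a view is sketched below. The name SalesLT.vProductCategories is hypothetical, and the ORDER BY clause is left out because views can't contain one without TOP or OFFSET:

    ```sql
    -- create the view (hypothetical name)
    CREATE OR ALTER VIEW SalesLT.vProductCategories
    -- Product categories (view)
    AS
    SELECT PPS.ProductCategoryID
    , PPS.ParentProductCategoryID
    , PPS.Name ProductCategory
    , PPC.Name ParentProductCategory
    FROM SalesLT.ProductCategory PPS
         LEFT JOIN SalesLT.ProductCategory PPC
           ON PPS.ParentProductCategoryID = PPC.ProductCategoryID;
    ```

    The sorting can then be applied when querying the view, e.g. via ORDER BY IsNull(ParentProductCategory, ProductCategory), run as a separate statement.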

    One can then save all the code as a file. 
    Except some small glitches in the editor, everything went smoothly. 

    Notes:
    1) One can suppose that many or most of the queries created in the previous versions of SQL Server work also in SQL databases. The future and revised posts on such topics are labelled under sql database.
    2) During the various tests I got the following error message when trying to create a table:
    "The external policy action 'Microsoft.Sql/Sqlservers/Databases/Schemas/Tables/Create' was denied on the requested resource."
    At least in my case all I had to do was to select "SQL Database" instead of "SQL analytics endpoint" in the web editor. Check the top right dropdown below your user information.
    3) For a full list of the available features see [2].

    Happy coding!


    References:
    [1] Microsoft Learn (2024) SQL database in Microsoft Fabric (Preview) [link]
    [2] Microsoft Learn (2024) Features comparison: Azure SQL Database and SQL database in Microsoft Fabric (preview) [link]

    17 March 2024

    🧭Business Intelligence: Data Products (Part II: The Complexity Challenge)

    Business Intelligence Series

    Creating data products within a data mesh comes down to "partitioning" a given set of inputs, outputs and transformations to create something that looks like a Lego structure, in which each Lego piece represents a data product. The word partition is used loosely here, as there can be overlaps in terms of inputs, outputs and transformations, though in an ideal solution the outcome should be close to a partition.

    While the complexity of inputs and outputs can be neglected, even if their number is large, the same cannot be said about the transformations that must be performed in the process. Moreover, the transformations involve reengineering the logic built in the source systems, which is not a trivial task and must involve adequate testing. The transformations are a must and there's no way to avoid them. 

    When designing a data warehouse or data mart, one of the goals is to keep the redundancy of the transformations and of the intermediary results to a minimum, to avoid unnecessary duplication of code and data. Code duplication usually becomes an issue when the logic needs to be changed, and in business contexts that can happen often enough to create other challenges. Data duplication becomes an issue when the copies are not in sync, which results from code that is not synchronized or from different refresh rates.

    Building the transformations as SQL-based database objects has its advantages. There were many attempts to provide non-SQL operators for the same purpose (in SSIS, Power Query), though the solutions built on them are difficult to troubleshoot and maintain, the overall complexity increasing with the volume of transformations that must be performed. In data meshes, the complexity also increases with the number of data products involved, especially when there are multiple stakeholders and different goals involved (see the challenges of developing data marts supposed to be domain-specific). 

    Organizations answer growing complexity with complexity. On one side are the teams of developers, business users and other members of the governance teams who, together with the solution, create an ecosystem. On the other side are the inherent coordination and organization meetings, the management of proposals, the negotiation of scope for data products, their design, testing, etc. The more complex the whole ecosystem becomes, the higher the chances for systemic errors to occur and multiply, and for unwanted behavior of the parties involved to emerge. Ecosystems are challenging to monitor and manage. 

    The more complex the architecture, the higher the chances of failure. Even if some organizations might succeed, it doesn't mean that such an endeavor is for everybody - a certain maturity in building data architectures, data-based artefacts and managing projects must exist in the organization. Many organizations fail to address basic analytical requirements, so why would one think that they are capable of handling increased complexity? Even if one breaks the complexity of a data warehouse into more manageable units, the complexity is just moved to other levels that are more difficult to manage as a whole. 

    Being able to audit and test each data product individually has its advantages, though when a data product becomes part of an aggregate it can easily get lost in the bigger picture. Thus, a global observability framework is needed that allows monitoring the performance and health of each data product in aggregate. Besides that, event brokers and other mechanisms are needed to handle failure, availability, security, etc. 

    Data products make sense in certain scenarios, especially when the complexity of architectures is manageable, though attempting to redesign everything from their perspective is like having a hammer in one's hand and treating everything like a nail.



    About Me

    Koeln, NRW, Germany
    IT Professional with more than 25 years experience in IT in the area of full life-cycle of Web/Desktop/Database Applications Development, Software Engineering, Consultancy, Data Management, Data Quality, Data Migrations, Reporting, ERP implementations & support, Team/Project/IT Management, etc.