SQL Troubles: testing

Showing posts with label testing. Show all posts

11 June 2026

🤖〽️Prompt Engineering: Copilot Unabridged (Part 202: How Pushing AI Models Into Out‑of‑Distribution Inputs Generates Novel Concepts)

Prompt Engineering Series

Prompt: "write a post of 600 words on how to push models into out‑of‑distribution iInputs to generate novel concepts in AI models"

Introduction

Artificial Intelligence (AI) models excel at interpolation - filling in the gaps between patterns they’ve already seen. But the frontier of creativity, innovation, and conceptual discovery lies outside those familiar boundaries. To reach that frontier, researchers use out‑of‑distribution (OOD) inputs: prompts, structures, or data patterns that sit beyond the model’s training distribution. When done intentionally and safely, this technique can reveal how models generalize, how they stretch their internal representations, and how they generate novel concepts that do not simply remix the past.

Pushing a model into OOD territory is not about confusing it. It’s about stress‑testing its conceptual elasticity. Models trained on massive datasets develop dense clusters of meaning - regions where concepts are richly represented - and sparse regions where the model has little experience. OOD inputs target those sparse regions. They force the model to navigate conceptual space without the usual statistical anchors, revealing how it constructs meaning when familiar patterns disappear. This connects directly to rare‑event blind‑spot analysis, where unusual inputs expose hidden weaknesses.

One powerful method for generating OOD conditions is structural perturbation. Instead of changing the content of a prompt, researchers alter its structure - using unusual syntax, hybrid formats, or nested instructions. For example, combining mathematical notation with poetic metaphor, or embedding code inside rhetorical questions. These hybrid structures push the model into regions where its learned representations overlap in unexpected ways. The model must reconcile incompatible patterns, often producing emergent conceptual blends that would not appear in standard prompting. This technique aligns with insights from uncommon linguistic structure testing.

Another approach involves semantic displacement - asking the model to apply concepts from one domain to another where they do not naturally belong. For example: 'Describe quantum entanglement using the logic of medieval guild economics.' This forces the model to map distant conceptual regions together, creating novel analogies or frameworks. These mappings are not random; they reveal how the model organizes knowledge internally. When the model is pushed far enough, it begins to generate new conceptual hybrids, not because it has seen them before, but because its internal geometry allows it to interpolate across distant domains.

A more advanced technique uses contradictory task layering, where the model must satisfy overlapping constraints that do not naturally coexist. For example: 'Invent a biological organism that obeys thermodynamic laws but violates known evolutionary principles.' These prompts push the model into conceptual dead zones - regions where no training example exists. The model must synthesize new structures to satisfy the constraints, often producing novel theoretical constructs. This method parallels the logic of boundary‑stress evaluation, where conflicting instructions reveal the model’s reasoning hierarchy.

OOD prompting also benefits from recursive abstraction, where the model is asked to generalize beyond its own generalizations. For instance: 'Generate a concept that is to machine learning what machine learning is to statistics.' This forces the model to climb the abstraction ladder, leaving the comfort of known categories. The resulting concepts often reflect the model’s latent ability to extrapolate beyond its training distribution.

Finally, OOD exploration can involve synthetic anomalies - inputs that deliberately violate the statistical norms of the training data. These anomalies act as conceptual shockwaves, pushing the model to reorganize its internal representations. When guided carefully, they can reveal new conceptual pathways, much like how scientific breakthroughs often emerge from anomalies that challenge existing theories.

Ultimately, pushing models into OOD inputs is not about breaking them. It is about discovering the edges of their conceptual space. By exploring those edges, researchers can uncover how models generalize, how they innovate, and how they generate ideas that go beyond the sum of their training data. OOD prompting is a tool for expanding the frontier of machine creativity - one carefully engineered anomaly at a time.

Disclaimer: The whole text was generated by Copilot (under Windows 11) at the first attempt. This is just an experiment to evaluate feature's ability to answer standard general questions, independently on whether they are correctly or incorrectly posed. Moreover, the answers may reflect hallucinations and other types of inconsistent or incorrect reasoning.

Previous Post <<||>> Next Post

10 June 2026

🤖〽️Prompt Engineering: Copilot Unabridged (Part 201: How Boundary‑Stress Evaluation Uses Nested and Overlapping Tasks to Reveal AI Model Blind Spots)

Prompt Engineering Series

Prompt: "write a post of 600 words on how boundary‑stress evaluation intentionally creates conflicts in nested or overlapping tasks for AI models"

Introduction

Artificial Intelligence (AI) models often appear competent when tasks are cleanly separated and instructions are simple. But real‑world reasoning rarely arrives in neat, isolated packets. Tasks overlap. Instructions nest inside one another. Goals shift mid‑stream. And it’s precisely in these tangled situations that AI models reveal their deepest blind spots. Boundary‑stress evaluation is the practice of intentionally engineering these moments. By creating nested or overlapping task conflicts, it exposes how an AI model prioritizes, interprets, and resolves competing demands.

Nested and overlapping tasks are fundamentally different from simple instruction conflicts. Instead of presenting two contradictory commands, evaluators embed tasks inside other tasks or layer multiple goals that must be pursued simultaneously. This forces the model to juggle multiple cognitive threads at once. The resulting behavior reveals the model’s internal hierarchy of cues, a concept closely related to instruction‑priority testing.

One of the most revealing techniques involves task‑within‑task nesting. For example, a prompt may ask the model to summarize a text, but within that summary, embed a requirement to switch tone, cite a source, or perform a transformation. The outer task sets one expectation; the inner task sets another. When these expectations conflict, the model must decide which layer dominates. If it prioritizes the inner instruction, it reveals a bias toward local cues. If it prioritizes the outer instruction, it reveals a bias toward global framing. Inconsistencies between these behaviors often signal unstable internal weighting.

Another powerful method is overlapping task interference, where two tasks must be performed concurrently but draw on incompatible assumptions. For instance, a model may be asked to maintain a formal tone while generating playful metaphors, or to provide a neutral analysis while simultaneously adopting a fictional persona. These overlapping demands create tension between stylistic, functional, and contextual cues. The model’s resolution strategy exposes whether it treats style as a global constraint, a local modifier, or a secondary priority. This mirrors vulnerabilities uncovered through weak‑point mapping, where models over‑trust certain cues simply because they dominate the training distribution.

Boundary‑stress evaluation also uses recursive task structures, where the model must apply a rule to its own output. For example: 'Rewrite your previous answer in a different style, but keep the original structure intact.' This forces the model to track multiple layers of its own reasoning. When the recursion becomes deep or the constraints conflict, the model may lose track of which layer it is operating in. These failures reveal limitations in long‑range dependency tracking and self‑referential reasoning.

A subtler form of nested conflict involves goal‑shifting tasks, where the model begins with one objective but must switch to another mid‑task without discarding the original context. Humans handle this fluidly. AI models often do not. When the shift contradicts earlier instructions, the model’s response shows whether it prioritizes recency, inferred intent, or structural cues. This connects directly to conflicting‑signal analysis.

Perhaps the most challenging nested conflicts involve hierarchical task decomposition, where the model must break a task into steps while simultaneously following meta‑instructions about how to perform that decomposition. If the meta‑instructions contradict the task content, the model must choose which layer to obey. These tests reveal whether the model treats meta‑instructions as authoritative or merely advisory.

Ultimately, boundary‑stress evaluation is not about tricking the model. It is about mapping the edges of its multi‑layer reasoning. By intentionally creating conflicts in nested or overlapping tasks, evaluators can see how the model prioritizes instructions, how it handles ambiguity, and where its internal logic becomes brittle. These insights are essential for building AI systems that behave predictably in complex, real‑world environments - where tasks overlap, goals shift, and instructions rarely arrive one at a time.

Previous Post <<||>> Next Post

07 June 2026

🤖〽️Prompt Engineering: Copilot Unabridged (Part 198: How Domain‑Specific Anomalies Expose Blind Spots in AI Models)

Prompt Engineering Series

Prompt: "write a post of 600 words on how domain‑specific anomalies expose blind spots in AI models"

Introduction

Artificial Intelligence (AI) models are often praised for their versatility, but their real limitations become visible only when they step outside the comfort zone of general‑purpose language. When a model encounters domain‑specific anomalies - the unusual patterns, edge‑case behaviors, or irregular structures that appear only within a particular field - it is forced to operate without the statistical safety net it relies on. These anomalies act like diagnostic probes, revealing blind spots that remain hidden during everyday interactions.

To understand why domain‑specific anomalies are so revealing, you have to consider how AI models learn. They absorb patterns from massive datasets, but those datasets are never evenly distributed across all fields. Some domains - like everyday conversation, news, or common technical topics - are heavily represented. Others - like niche scientific notation, legal edge cases, rare medical conditions, or obscure programming paradigms—appear only sparsely. This imbalance creates statistical shadows, areas where the model’s internal representation is thin or incomplete.

When an anomaly appears inside one of these shadows, the model’s behavior becomes a window into its internal reasoning. For example, a model trained heavily on mainstream medical literature may perform well on common diagnoses but struggle when confronted with a rare syndrome or an atypical symptom cluster. The model may latch onto the wrong cue, misinterpret the structure of the description, or default to generic reasoning. These failures expose the over‑generalization that occurs when a model tries to stretch familiar patterns into unfamiliar territory.

Domain‑specific anomalies also reveal how models handle specialized linguistic structures. Fields like law, mathematics, chemistry, and finance each have their own micro‑languages - dense with symbols, conventions, and implicit assumptions. When an anomaly disrupts these conventions, the model must decide which cues to trust. A misplaced operator in a mathematical expression, an unusual clause ordering in a legal contract, or a non‑standard chemical notation can cause the model to misread the entire structure. These moments show where the model’s understanding is superficial, echoing the challenges seen in uncommon linguistic structures.

Another revealing category involves procedural anomalies - cases where a domain has strict rules, and the anomaly breaks them. In programming, for example, a function that violates typical naming conventions or a code block that mixes paradigms can confuse the model’s internal heuristics. In finance, an unusual transaction pattern may cause the model to misclassify risk. In scientific writing, a non‑standard experimental layout may lead the model to misinterpret the methodology. These anomalies expose the model’s reliance on pattern familiarity rather than true conceptual understanding.

Domain‑specific anomalies also highlight the limits of contextual transfer. A model may perform well when a domain behaves predictably, but when an anomaly forces the model to transfer knowledge across contexts - such as applying physics reasoning to a biological edge case - it may reveal gaps in its internal conceptual map. These gaps often align with the same vulnerabilities uncovered through weak‑point mapping, where the model over‑trusts certain cues simply because they dominate the training distribution.

Perhaps the most important insight is that domain‑specific anomalies expose hidden assumptions baked into the model. Every domain has its own logic, and models often internalize simplified versions of that logic. When an anomaly violates those assumptions, the model’s response shows how rigid or flexible its internal representation truly is. A well‑aligned model adapts; a brittle one collapses into generic or incorrect reasoning.

Ultimately, domain‑specific anomalies are not just edge cases - they are stress tests that reveal the contours of an AI model’s understanding. They show where the model is robust, where it is brittle, and where its blind spots lie. By studying these anomalies, researchers can build models that are not only more capable, but also more transparent, predictable, and aligned with the complexity of real‑world domains.

Previous Post <<||>> Next Post

31 May 2026

🤖〽️Prompt Engineering: Copilot Unabridged (Part 193: How Instruction‑Priority Testing Reveals Whether AI Models Obey Visible or Invisible Instructions)

Prompt Engineering Series

Prompt: "write a post of 600 words on how instruction‑priority testing in AI models allows to see whether the model obeys visible or invisible instructions"

Introduction

In the rapidly evolving world of Artificial Intelligence (AI), one of the most important questions researchers and practitioners ask is deceptively simple: Which instructions does the model actually follow? Modern AI systems operate under layers of guidance—some visible to the user, others embedded deep within the model’s training or system‑level configuration. Understanding which instructions take priority is essential for safety, reliability, and transparency. This is where instruction‑priority testing comes into play.

Instruction‑priority testing is the practice of giving an AI model multiple, potentially conflicting instructions and observing which ones it chooses to obey. The goal is not to 'trick' the model but to map the hierarchy of influences acting on it. These influences can include user prompts, system‑level rules, safety constraints, and even subtle patterns learned during training. By intentionally creating controlled conflicts, researchers can see whether the model prioritizes visible instructions - the ones the user explicitly writes - or invisible instructions, such as safety rules, alignment constraints, or internal behavioral patterns.

At its core, instruction‑priority testing works because AI models do not simply execute commands. They interpret them. When a user writes a prompt, the model weighs that prompt against its internal rules and the broader context of the conversation. If the model consistently refuses to follow a user instruction, even when the instruction is clear and harmless, that signals the presence of a stronger, invisible rule. Conversely, if the model follows the user instruction even when it contradicts a system‑level guideline, that suggests the model is over‑prioritizing user input.

One of the most revealing aspects of instruction‑priority testing is how it exposes implicit behavior. For example, a model may be given a visible instruction to respond in a certain style, but an invisible instruction - such as a safety guideline - may override that style if the content touches on sensitive topics. This doesn’t mean the model is malfunctioning. It means the model is following a hierarchy designed to keep interactions safe and responsible. Instruction‑priority testing helps clarify where that hierarchy begins and ends.

Another benefit of this testing method is that it highlights model robustness. A well‑aligned model should consistently prioritize safety‑critical invisible instructions over user‑provided visible ones. If a model can be easily pushed into ignoring its own safeguards, that’s a sign of weak alignment. On the other hand, if a model rigidly follows invisible rules even when the user’s request is harmless and reasonable, that may indicate over‑alignment or inflexibility. Instruction‑priority testing helps strike the right balance.

The technique also sheds light on prompt sensitivity. Some models respond strongly to the phrasing or structure of a prompt, while others maintain stable behavior regardless of wording. By varying the visible instructions - changing tone, order, or specificity—researchers can see how easily the model’s priorities shift. If small changes in phrasing cause large changes in behavior, the model may be too sensitive to surface‑level cues. If the model ignores user phrasing entirely, it may be too anchored to internal rules.

Ultimately, instruction‑priority testing is not about catching AI models doing something wrong. It’s about understanding how they make decisions. In a world where AI systems are becoming more capable and more integrated into daily life, transparency around instruction hierarchy is essential. Users deserve to know when the model is following their guidance and when it is following deeper, invisible rules designed to ensure safety and consistency.

By systematically testing how models respond to conflicting instructions, we gain insight into their internal priorities, their alignment with human values, and their reliability in real‑world scenarios. Instruction‑priority testing is not just a diagnostic tool - it’s a window into the model’s decision‑making process, helping us build AI systems that are both powerful and trustworthy.

Previous Post <<||>> Next Post

18 May 2026

🤖〽️Prompt Engineering: Copilot Unabridged (Part 192: How to Push AI Models Toward Edge Cases for Boundary‑Stress Evaluation)

Prompt Engineering Series

Prompt: "write a post of 600 words on how to push AI models toward edge cases boundary‑stress evaluation"

Introduction

Artificial Intelligence (AI) systems perform impressively well on the familiar, the typical, and the statistically common. But real‑world environments are rarely tidy. They contain ambiguity, noise, contradictions, and rare events that fall outside the model’s comfort zone. To build AI that behaves reliably under pressure, developers must intentionally push models toward edge cases - the unusual, the extreme, and the adversarial. This process, known as boundary‑stress evaluation, is essential for understanding how AI behaves when the world stops playing by the rules.

1. Use Adversarial Inputs to Reveal Fragility

Adversarial inputs are designed to expose weaknesses by introducing subtle distortions or contradictions. They help uncover how easily a model can be nudged off course.

Adversarial prompts: conflicting or misleading instructions
Perturbed data: slightly altered text, images, or sequences
Ambiguous phrasing: inputs with multiple valid interpretations

These tests reveal how the model handles uncertainty, noise, and manipulation.

2. Stress the Model With Rare or Low‑Frequency Scenarios

AI models are trained on distributions where some patterns appear frequently and others almost never. Rare events often expose blind spots.

By feeding the model examples from the statistical fringes, developers can evaluate how well it generalizes beyond the norm.

3. Introduce Conflicting Contexts to Test Instruction Hierarchy

AI models must decide which signals to prioritize when instructions conflict. Boundary‑stress evaluation intentionally creates these conflicts.

Multi‑layer instruction tests
Contextual contradictions
Nested or overlapping tasks

These scenarios reveal whether the model respects safety layers, system rules, and user intent under pressure.

4. Push the Model Into Out‑of‑Distribution Inputs

Out‑of‑distribution (OOD) testing evaluates how the model behaves when it encounters something completely unfamiliar.

OOD testing is crucial because real‑world environments constantly generate new patterns the model has never seen.

5. Apply Incremental Escalation to Identify Breaking Points

Boundary‑stress evaluation works best when pressure is applied gradually. This helps map the model’s stability curve.

This progressive stress testing reveals the exact point where the model’s reasoning begins to degrade.

6. Use Multi‑Modal Stressors for Holistic Evaluation

Modern AI systems often process text, images, audio, or structured data. Stress testing should reflect this complexity.

This exposes how well the model integrates information under imperfect conditions.

7. Analyze Failure Modes to Strengthen Guardrails

The goal isn’t to break the model - it’s to understand how it breaks.

Does it hallucinate?
Does it ignore safety rules?
Does it misinterpret intent?
Does it become overconfident?

Failure‑mode mapping helps developers reinforce alignment, improve safety layers, and refine training strategies.

Closing Reflections

Boundary‑stress evaluation is not optional - it’s foundational. AI models must be tested not only on what they should handle, but on what they should never fail catastrophically on. By pushing models toward edge cases, developers gain insight into their limits, their vulnerabilities, and their resilience. This process transforms uncertainty into understanding and transforms brittle systems into robust ones.

Previous Post <<||>> Next Post

16 May 2026

🤖〽️Prompt Engineering: Copilot Unabridged (Part 190: How Invisible Prompt Injection Could Be a Good Thing for AI)

Prompt Engineering Series

Prompt: "write a post of 600 words on how invisible prompt injection could be a good thing for AI"

Introduction

Invisible prompt injection is usually discussed as a threat - an attack vector that manipulates AI systems without the user’s awareness. And yes, in the wild, it is dangerous. But in controlled environments, invisible prompt injection can actually be a powerful tool for strengthening AI safety, improving robustness, and helping developers understand how models behave under pressure. By studying how AI systems respond to hidden instructions, researchers can build models that are more resilient, more transparent, and ultimately more trustworthy. In this sense, invisible prompt injection isn’t just a vulnerability; it’s also a diagnostic instrument that reveals how AI systems interpret, prioritize, and negotiate conflicting signals.

1. A Testing Ground for AI Robustness

Invisible prompt injection acts like a stress test. When researchers embed hidden instructions into text, images, or metadata, they can observe how the AI responds when its input channel is compromised. This helps developers identify:

Weak points in the model’s reasoning
Situations where the model over‑trusts user input
Scenarios where safety guardrails fail

By intentionally exposing the model to controlled injections, teams can strengthen its resistance to real‑world attacks. This transforms a vulnerability into a research tool that improves system resilience.

2. A Way to Understand How AI Prioritizes Instructions

Invisible prompt injection reveals how an AI model weighs different layers of input. Does it prioritize the user’s visible request? The hidden instruction? The system‑level rules? The model’s internal alignment?

Studying these interactions helps researchers map the model’s internal decision‑making. This is crucial for:

Improving interpretability
Refining alignment strategies
Ensuring consistent behavior across contexts

In other words, invisible prompt injection becomes a lens through which developers can examine the model’s internal hierarchy of influence.

3. A Tool for Building Better Defenses

You can’t defend against what you don’t understand. Controlled invisible prompt injection allows researchers to simulate attacks that malicious actors might attempt. This helps teams design:

Stronger input sanitization
Better content‑filtering pipelines
More resilient prompt‑parsing mechanisms

By studying how injections succeed, developers can build systems that automatically detect and neutralize them. This proactive approach turns a threat into a training mechanism for safer AI.

4. A Method for Evaluating Real‑World Risk

Invisible prompt injection helps researchers evaluate how AI systems behave in messy, unpredictable environments. Real‑world data is full of:

Hidden formatting
Embedded metadata
Unintended instructions
Noisy or adversarial contention

Testing with invisible injections helps developers understand how the model behaves when confronted with ambiguous or corrupted inputs. This leads to AI systems that are more stable, more predictable, and more reliable in everyday use.

5. A Catalyst for Better AI Governance

Invisible prompt injection research encourages organizations to adopt stronger governance practices. It highlights the need for:and

Clear safety protocols
Rigorous red‑team testing
Transparent risk assessments
Continuous monitoring

By treating invisible prompt injection as a legitimate research tool, organizations can build a culture of proactive safety rather than reactive patching.

Closing Statement

Invisible prompt injection is dangerous when used maliciously - but in controlled, ethical research settings, it becomes a powerful instrument for strengthening AI. It exposes weaknesses, reveals hidden dynamics, and helps developers build systems that are more robust, more transparent, and more aligned with human values. By studying how AI responds to invisible manipulation, we gain the insight needed to design models that behave predictably and safely, even in the face of unexpected inputs. In this way, invisible prompt injection isn’t just a threat - it’s also an opportunity to build better, safer AI.

Previous Post <<||>> Next Post

26 April 2025

🏭🗒️Microsoft Fabric: Power BI Environments [Notes]

Disclaimer: This is work in progress intended to consolidate information from various sources for learning purposes. For the latest information please consult the documentation (see the links below)!

Last updated: 26-Apr-2025

Enterprise Content Publishing [2]

[Microsoft Fabric] Power BI Environments

{def} structured spaces within Microsoft Fabric that helps organizations manage the Power BI assets through the entire lifecycle
{environment} development

allows to develop the solution
accessible only to the development team

via Contributor access

{recommendation} use Power BI Desktop as local development environment

{benefit} allows to try, explore, and review updates to reports and datasets

once the work is done, upload the new version to the development stage

{benefit} enables collaborating and changing dashboards
{benefit} avoids duplication

making online changes, downloading the .pbix file, and then uploading it again, creates reports and datasets duplication

{recommendation} use version control to keep the .pbix files up to date

[OneDrive] use Power BI's autosync

{alternative} SharePoint Online with folder synchronization
{alternative} GitHub and/or VSTS with local repository & folder synchronization

[enterprise scale deployments]

{recommendation} separate dataset from reports and dashboards’ development

use the deployment pipelines selective deploy option [22]
create separate .pbix files for datasets and reports [22]

create a dataset .pbix file and uploaded it to the development stage (see shared datasets [22]
create .pbix only for the report, and connect it to the published dataset using a live connection [22]

{benefit} allows different creators to separately work on modeling and visualizations, and deploy them to production independently

{recommendation} separate data model from report and dashboard development

allows using advanced capabilities

e.g. source control, merging diff changes, automated processes

separate the development from test data sources [1]

the development database should be relatively small [1]

{recommendation} use only a subset of the data [1]

⇐ otherwise the data volume can slow down the development [1]

{environment} user acceptance testing (UAT)

test environment that within the deployment lifecycle sits between development and production

it's not necessary for all Power BI solutions [3]
allows to test the solution before deploying it into production

all tests must have

View access for testing
Contributor access for report authoring

involves business users who are SMEs

provide approval that the content

is accurate
meets requirements
can be deployed for wider consumption

{recommendation} check report’s load and the interactions to find out if changes impact performance [1]
{recommendation} monitor the load on the capacity to catch extreme loads before they reach production [1]
{recommendation} test data refresh in the Power BI service regularly during development [20]

{environment} production

{concept} staged deployment

{goal} help minimize risk, user disruption, or address other concerns [3]

the deployment involves a smaller group of pilot users who provide feedback [3]

{recommendation} set production deployment rules for data sources and parameters defined in the dataset [1]

allows ensuring the data in production is always connected and available to users [1]

{recommendation} don’t upload a new .pbix version directly to the production stage

⇐ without going through testing

{feature|preview} deployment pipelines

enable creators to develop and test content in the service before it reaches the users [5]

{recommendation} build separate databases for development and testing

helps protect production data [1]

{recommendation} make sure that the test and production environment have similar characteristics [1]

e.g. data volume, sage volume, similar capacity
{warning} testing into production can make production unstable [1]
{recommendation} use Azure A capacities [22]

{recommendation} for formal projects, consider creating an environment for each phase
{recommendation} enable users to connect to published datasets to create their own reports
{recommendation} use parameters to store connection details

e.g. instance names, database names
⇐ deployment pipelines allow configuring parameter rules to set specific values for the development, test, and production stages

alternatively data source rules can be used to specify a connection string for a given dataset

{restriction} in deployment pipelines, this isn't supported for all data sources

{recommendation} keep the data in blob storage under the 50k blobs and 5GB data in total to prevent timeouts [29]
{recommendation} provide data to self-service authors from a centralized data warehouse [20]

allows to minimize the amount of work that self-service authors need to take on [20]

{recommendation} minimize the use of Excel, csv, and text files as sources when practical [20]
{recommendation} store source files in a central location accessible by all coauthors of the Power BI solution [20]
{recommendation} be aware of API connectivity issues and limits [20]
{recommendation} know how to support SaaS solutions from AppSource and expect further data integration requests [20]
{recommendation} minimize the query load on source systems [20]

use incremental refresh in Power BI for the dataset(s)
use a Power BI dataflow that extracts the data from the source on a schedule
reduce the dataset size by only extracting the needed amount of data

{recommendation} expect data refresh operations to take some time [20]
{recommendation} use relational database sources when practical [20]
{recommendation} make the data easily accessible [20]
[knowledge area] knowledge transfer

{recommendation} maintain a list of best practices and review it regularly [24]
{recommendation} develop a training plan for the various types of users [24]

usability training for read only report/app users [24
self-service reporting for report authors & data analysts [24]
more elaborated training for advanced analysts & developers [24]

[knowledge area] lifecycle management

consists of the processes and practices used to handle content from its creation to its eventual retirement [6]
{recommendation} postfix files with 3-part version number in Development stage [24]

remove the version number when publishing files in UAT and production

{recommendation} backup files for archive
{recommendation} track version history

Previous Post <<||>> Next Post

References:

[1] Microsoft Learn (2021) Fabric: Deployment pipelines best practices [link]

[2] Microsoft Learn (2024) Power BI: Power BI usage scenarios: Enterprise content publishing [link]
[3] Microsoft Learn (2024) Deploy to Power BI [link]
[4] Microsoft Learn (2024) Power BI implementation planning: Content lifecycle management [link]
[5] Microsoft Learn (2024) Introduction to deployment pipelines [link]

[6] Microsoft Learn (2024) Power BI implementation planning: Content lifecycle management [link]

[20] Microsoft (2020) Planning a Power BI Enterprise Deployment [White paper] [link]

[22] Power BI Docs (2021) Create Power BI Embedded capacity in the Azure portal [link]

[24] Paul Turley (2019) A Best Practice Guide and Checklist for Power BI Projects

Resources:

Acronyms:
API - Application Programming Interface
CLM - Content Lifecycle Management
COE - Center of Excellence
SaaS - Software-as-a-Service
SME - Subject Matter Expert
UAT - User Acceptance Testing
VSTS - Visual Studio Team System
SME - Subject Matter Experts

17 January 2025

💎🏭SQL Reloaded: Microsoft Fabric's SQL Databases (Part VIII: Permissions) [new feature]

Data-based solutions usually target a set of users who (ideally) have restricted permissions to the functionality. Therefore, as part of the process are defined several personas that target different use cases, for which the permissions must be restricted accordingly.

In the simplest scenario the user must have access to the underlying objects for querying the data. Supposing that an Entra User was created already, the respective user must be given access also in the Fabric database (see [1], [2]). From database's main menu follow the path to assign read permissions:
Security >> Manage SQL Security >> (select role: db_datareader)

Manage SQL Security

Manage access >> Add >> (search for User)

Manage access

(select user) >> Share database >> (select additional permissions) >> Save

Manage additional permissions

The easiest way to test whether the permissions work before building the functionality is to login over SQL Server Management Studio (SSMS) and check the access using the Microsoft Entra MFA. Ideally, one should have a User's credentials that can be used only for testing purposes. After the above setup was done, the new User was able to access the data.

A second User can be created for testing with the maximum of permissions allowed on the SQL database side, which is useful for troubleshooting. Alternatively, one can use only one User for testing and assign or remove the permissions as needed by the test scenario.

It's a good idea to try to understand what's happening in the background. For example, the expectation was that for the Entra User created above also a SQL user is created, which doesn't seem to be the case, at least per current functionality available.

Before diving deeper, it's useful to retrieve User's details:

-- retrieve current user
SELECT SUser_Name() sys_user_name
, User_Id() user_id 
, USER_NAME() user_name
, current_user [current_user]
, user [user];

Output:

sys_user_name	user_id	user_name	current_user	user
JamesClavell@[domain].onmicrosoft.com	0	JamesClavell@[domain].onmicrosoft.com	JamesClavell@[domain].onmicrosoft.com	JamesClavell@[domain].onmicrosoft.com

Retrieving the current User is useful especially when testing in parallel functionality with different Users. Strangely, User's ID is 0 when only read permissions were assigned. However, a valid User identifier is added for example when to the User is assigned also the db_datawriter role. Removing afterwards the db_datawriter role to the User keeps as expected User's ID. For troubleshooting purposes, at least per current functionality, it might be a good idea to create the Users with a valid User ID (e.g. by assigning temporarily the db_datawriter role to the User).

The next step is to look at the Users with access to the database:

-- database access 
SELECT USR.uid
, USR.name
--, USR.sid 
, USR.hasdbaccess 
, USR.islogin
, USR.issqluser
--, USR.createdate 
--, USR.updatedate 
FROM sys.sysusers USR
WHERE USR.hasdbaccess = 1
  AND USR.islogin = 1
ORDER BY uid

Output:

uid	name	hasdbaccess	islogin	issqluser
1	dbo	1	1	1
6	CharlesDickens@[...].onmicrosoft.com	1	1	0
7	TestUser	1	1	1
9	JamesClavell@[...].onmicrosoft.com	1	1	0

For testing purposes, besides the standard dbo role and two Entra-based roles, it was created also a SQL role to which was granted access to the SalesLT schema (see initial post):

-- create the user
CREATE USER TestUser WITHOUT LOGIN;

-- assign access to SalesLT schema 
GRANT SELECT ON SCHEMA::SalesLT TO TestUser;
  
-- test impersonation (run together)
EXECUTE AS USER = 'TestUser';

SELECT * FROM SalesLT.Customer;

REVERT;

Notes:
1) Strangely, even if access was given explicitly only to the SalesLT schema, the TestUser User has access also to sys.sysusers and other DMVs. That's valid also for the access over SSMS
2) For the above created User there are no records in the sys.user_token and sys.login_token DMVs, in contrast with the user(s) created for administering the SQL database.

Let's look at the permissions granted explicitly:

-- permissions granted explicitly
SELECT DPR.principal_id
, DPR.name
, DPR.type_desc
, DPR.authentication_type_desc
, DPE.state_desc
, DPE.permission_name
FROM sys.database_principals DPR
     JOIN sys.database_permissions DPE
	   ON DPR.principal_id = DPE.grantee_principal_id
WHERE DPR.principal_id != 0 -- removing the public user
ORDER BY DPR.principal_id
, DPE.permission_name;

Result:

principal_id	name	type_desc	authentication_type_desc	state_desc	permission_name
1	dbo	SQL_USER	INSTANCE	GRANT	CONNECT
6	CharlesDickens@[...].onmicrosoft.com	EXTERNAL_USER	EXTERNAL	GRANT	AUTHENTICATE
6	CharlesDickens@[...].onmicrosoft.com	EXTERNAL_USER	EXTERNAL	GRANT	CONNECT
7	TestUser	SQL_USER	NONE	GRANT	CONNECT
7	TestUser	SQL_USER	NONE	GRANT	SELECT
9	JamesClavell@[...].onmicrosoft.com	EXTERNAL_USER	EXTERNAL	GRANT	CONNECT

During troubleshooting it might be useful to check current user's permissions at the various levels via sys.fn_my_permissions:

-- retrieve database-scoped permissions for current user
SELECT *
FROM sys.fn_my_permissions(NULL, 'Database');

-- retrieve schema-scoped permissions for current user
SELECT *
FROM sys.fn_my_permissions('SalesLT', 'Schema');

-- retrieve object-scoped permissions for current user
SELECT *
FROM sys.fn_my_permissions('SalesLT.Customer', 'Object')
WHERE permission_name = 'SELECT';

Notes:
1) See also [1] and [4] in what concerns the limitations that apply to managing permissions in SQL databases.

Happy coding!

Previous Post <<||>> Next Post

References:
[1] Microsoft Learn (2024) Microsoft Fabric: Share your SQL database and manage permissions [link]
[2] Microsoft Learn (2024) Microsoft Fabric: Share data and manage access to your SQL database in Microsoft Fabric [link]
[3] Microsoft Learn (2024) Authorization in SQL database in Microsoft Fabric [link]
[4] Microsoft Learn (2024) Authentication in SQL database in Microsoft Fabric [link]

[5] Microsoft Fabric Learn (2025) Manage access for SQL databases in Microsoft Fabric with workspace roles and item permissions [link]

21 December 2024

💎🏭SQL Reloaded: Microsoft Fabric's SQL Databases (Part I: Creating a View) 🆕

At this year's Ignite conference it was announced that SQL databases are available now in Fabric in public preview (see SQL Databases for OLTP scenarios, [1]). To test the functionality one can import the SalesLT database in a newly created empty database, which made available several tables:

-- tables from SalesLT schema (queries should be run individually)
SELECT TOP 100 * FROM SalesLT.Address
SELECT TOP 100 * FROM SalesLT.Customer
SELECT TOP 100 * FROM SalesLT.CustomerAddress
SELECT TOP 100 * FROM SalesLT.Product ITM 
SELECT TOP 100 * FROM SalesLT.ProductCategory
SELECT TOP 100 * FROM SalesLT.ProductDescription 
SELECT TOP 100 * FROM SalesLT.ProductModel  
SELECT TOP 100 * FROM SalesLT.ProductModelProductDescription 
SELECT TOP 100 * FROM SalesLT.SalesOrderDetail
SELECT TOP 100 * FROM SalesLT.SalesOrderHeader

The schema seems to be slightly different than the schemas used in previous tests made in SQL Server, though with a few minor changes - mainly removing the fields not available - one can create the below view:

-- drop the view (cleaning step)

-- DROP VIEW IF EXISTS SalesLT.vProducts

-- create the view
CREATE OR ALTER VIEW SalesLT.vProducts
-- Products (view) 
AS 
SELECT ITM.ProductID 
, ITM.ProductCategoryID 
, PPS.ParentProductCategoryID 
, ITM.ProductModelID 
, ITM.Name ProductName 
, ITM.ProductNumber 
, PPM.Name ProductModel 
, PPS.Name ProductSubcategory 
, PPC.Name ProductCategory  
, ITM.Color 
, ITM.StandardCost 
, ITM.ListPrice 
, ITM.Size 
, ITM.Weight 
, ITM.SellStartDate 
, ITM.SellEndDate 
, ITM.DiscontinuedDate 
, ITM.ModifiedDate 
FROM SalesLT.Product ITM 
     JOIN SalesLT.ProductModel PPM 
       ON ITM.ProductModelID = PPM.ProductModelID 
     JOIN SalesLT.ProductCategory PPS 
        ON ITM.ProductCategoryID = PPS.ProductCategoryID 
         JOIN SalesLT.ProductCategory PPC 
            ON PPS.ParentProductCategoryID = PPC.ProductCategoryID

-- review the data
SELECT top 100 *
FROM SalesLT.vProducts

In the view were used FULL JOINs presuming thus that a value was provided for each record. It's always a good idea to test the presumptions when creating the queries, and eventually check from time to time whether something changed. In some cases it's a good idea to always use LEFT JOINs, though this might have impact on performance and probably other consequences as well.

-- check if all models are available
SELECT top 100 ITM.*
FROM SalesLT.Product ITM 
    LEFT JOIN SalesLT.ProductModel PPM 
       ON ITM.ProductModelID = PPM.ProductModelID 
WHERE PPM.ProductModelID IS NULL

-- check if all models are available
SELECT top 100 ITM.*
FROM SalesLT.Product ITM 
    LEFT JOIN SalesLT.ProductCategory PPS 
        ON ITM.ProductCategoryID = PPS.ProductCategoryID 
WHERE PPS.ProductCategoryID IS NULL

-- check if all categories are available
SELECT PPS.*
FROM SalesLT.ProductCategory PPS 
     LEFT JOIN SalesLT.ProductCategory PPC 
       ON PPS.ParentProductCategoryID = PPC.ProductCategoryID
WHERE PPC.ProductCategoryID IS NULL

Because the Product categories have an hierarchical structure, it's a good idea to check the hierarchy as well:

-- check the hierarchical structure 
SELECT PPS.ProductCategoryId 
, PPS.ParentProductCategoryId 
, PPS.Name ProductCategory
, PPC.Name ParentProductCategory
FROM SalesLT.ProductCategory PPS 
     LEFT JOIN SalesLT.ProductCategory PPC 
       ON PPS.ParentProductCategoryID = PPC.ProductCategoryID
--WHERE PPC.ProductCategoryID IS NULL
ORDER BY IsNull(PPC.Name, PPS.Name)

This last query can be consolidated in its own view and the previous view changed, if needed.

One can then save all the code as a file.

Except some small glitches in the editor, everything went smoothly.

Notes:
1) One can suppose that many or most of the queries created in the previous versions of SQL Server work also in SQL databases. The future and revised posts on such topics are labelled under sql database.
2) During the various tests I got the following error message when trying to create a table:

"The external policy action 'Microsoft.Sql/Sqlservers/Databases/Schemas/Tables/Create' was denied on the requested resource."

At least in my case all I had to do was to select "SQL Database" instead of "SQL analytics endpoint" in the web editor. Check the top right dropdown below your user information.

[3] For a full least of the available features see [2].

Happy coding!

Previous Post <<||>> Next Post

References:
[1] Microsoft Learn (2024) SQL database in Microsoft Fabric (Preview) [link]
[2] Microsoft Learn (2024) Features comparison: Azure SQL Database and SQL database in Microsoft Fabric (preview) [link]

SQL Troubles

Pages

11 June 2026

🤖〽️Prompt Engineering: Copilot Unabridged (Part 202: How Pushing AI Models Into Out‑of‑Distribution Inputs Generates Novel Concepts)

10 June 2026

🤖〽️Prompt Engineering: Copilot Unabridged (Part 201: How Boundary‑Stress Evaluation Uses Nested and Overlapping Tasks to Reveal AI Model Blind Spots)

07 June 2026

🤖〽️Prompt Engineering: Copilot Unabridged (Part 198: How Domain‑Specific Anomalies Expose Blind Spots in AI Models)

31 May 2026

🤖〽️Prompt Engineering: Copilot Unabridged (Part 193: How Instruction‑Priority Testing Reveals Whether AI Models Obey Visible or Invisible Instructions)

18 May 2026

🤖〽️Prompt Engineering: Copilot Unabridged (Part 192: How to Push AI Models Toward Edge Cases for Boundary‑Stress Evaluation)

16 May 2026

🤖〽️Prompt Engineering: Copilot Unabridged (Part 190: How Invisible Prompt Injection Could Be a Good Thing for AI)

26 April 2025

🏭🗒️Microsoft Fabric: Power BI Environments [Notes]

17 January 2025

💎🏭SQL Reloaded: Microsoft Fabric's SQL Databases (Part VIII: Permissions) [new feature]

21 December 2024

💎🏭SQL Reloaded: Microsoft Fabric's SQL Databases (Part I: Creating a View) 🆕

About Me