
12 March 2024

🏭🗒️Microsoft Fabric: OneLake [Notes]

Disclaimer: This is work in progress intended to consolidate information from various sources for learning purposes. For the latest information please consult the documentation (see the links below)! 

Last updated: 12-Mar-2024

Microsoft Fabric & OneLake

[Microsoft Fabric] OneLake

  • a single, unified, logical data lake for the whole organization [2]
    • designed to be the single place for all an organization's analytics data [2]
    • provides a single, integrated environment for data professionals and the business to collaborate on data projects [1]
    • stores all data in a single open format [1]
    • its data is governed by default
    • combines storage locations across different regions and clouds into a single logical lake, without moving or duplicating data
      • similar to how Office applications are prewired to use OneDrive
      • saves time by eliminating the need to move and copy data 
  • comes automatically with every Microsoft Fabric tenant [2]
    • automatically provisions with no extra resources to set up or manage [2]
    • used as the native store without needing any extra configuration [1]
  • accessible by all analytics engines in the platform [1]
    • all the compute workloads in Fabric are preconfigured to work with OneLake
      • compute engines have their own security models (aka compute-specific security) 
        • always enforced when accessing data using that engine [3]
        • the conditions may not apply to users in certain Fabric roles when they access OneLake directly [3]
  • built on top of ADLS Gen2 [1]
    • supports the same ADLS Gen2 APIs and SDKs to be compatible with existing ADLS Gen2 applications [2]
    • inherits its hierarchical structure
    • provides a single-pane-of-glass file-system namespace that spans across users, regions and even clouds
  • data can be stored in any format
    • incl. Delta, Parquet, CSV, JSON
    • data can be addressed in OneLake as if it's one big ADLS storage account for the entire organization [2]
  • uses a layered security model built around the organizational structure of experiences within Microsoft Fabric [3]
    • derived from Microsoft Entra authentication [3]
    • compatible with user identities, service principals, and managed identities [3]
    • Microsoft Entra ID and Fabric components can be combined into robust security mechanisms across OneLake that keep data safe while reducing copies and minimizing complexity [3]
  • hierarchical in nature 
    • {benefit} simplifies management across the organization
    • its data is divided into manageable containers for easy handling
    • can have one or more capacities associated with it
      • different items consume different amounts of capacity at any given time
      • offered through Fabric SKU and Trials
  • {component} OneCopy
    • allows data to be read from a single copy, without moving or duplicating it [1]
  • {concept} Fabric tenant
    • a dedicated space for organizations to create, store, and manage Fabric items.
      • there's often a single instance of Fabric for an organization, and it's aligned with Microsoft Entra ID [1]
        • ⇒ one OneLake per tenant
      • maps to the root of OneLake and is at the top level of the hierarchy [1]
    • can contain any number of workspaces [2]
  • {concept} capacity
    • a dedicated set of resources that is available at a given time to be used [1]
    • defines the ability of a resource to perform an activity or to produce output [1]
  • {concept} domain
    • a way of logically grouping together the workspaces in an organization that are relevant to a particular area or field [1]
    • can have multiple [subdomains]
      • {concept} subdomain
        • a way of fine-tuning the logical grouping of the data
  • {concept} workspace 
    • a collection of Fabric items that brings together different functionality in a single tenant [1]
      • different data items appear as folders within those containers [2]
      • always lives directly under the OneLake namespace [4]
      • {concept} data item
        • a subtype of item that allows data to be stored within it using OneLake [4]
        • all Fabric data items store their data automatically in OneLake in Delta Parquet format [2]
      • {concept} Fabric item
        • a set of capabilities bundled together into a single component [4] 
        • can have permissions configured separately from the workspace roles [3]
        • permissions can be set by sharing an item or by managing the permissions of an item [3]
    • acts as a container that leverages capacity for the work that is executed [1]
      • provides controls for who can access the items in it [1]
        • security can be managed through Fabric workspace roles
      • enables different parts of the organization to distribute ownership and access policies [2]
      • part of a capacity that is tied to a specific region and is billed separately [2]
      • the primary security boundary for data within OneLake [3]
    • represents a single domain or project area where teams can collaborate on data [3]
  • [encryption] encrypted at rest by default using a Microsoft-managed key [3]
    • the keys are rotated appropriately per compliance requirements [3]
    • data is encrypted and decrypted transparently using 256-bit AES encryption, one of the strongest block ciphers available, and it is FIPS 140-2 compliant [3]
    • {limitation} encryption at rest using a customer-managed key is currently not supported [3]
  • {general guidance} write access
    • users must be part of a workspace role that grants write access [4] 
    • rule applies to all data items, so scope workspaces to a single team of data engineers [4] 
  • {general guidance} lake access
    • users must be part of the Admin, Member, or Contributor workspace roles, or have the item shared with them with ReadAll access [4] 
  • {general guidance} general data access 
    • any user with Viewer permissions can access data through the warehouses, semantic models, or the SQL analytics endpoint for the Lakehouse [4] 
  • {general guidance} object-level security
    • give users access to a warehouse or lakehouse SQL analytics endpoint through the Viewer role and use SQL DENY statements to restrict access to certain tables [4]
  • {feature|preview} trusted workspace access
    • allows firewall-enabled Storage accounts to be accessed securely by creating OneLake shortcuts to them and then using the shortcuts in Fabric items [5]
    • based on [workspace identity]
    • {benefit} provides secure seamless access to firewall-enabled Storage accounts from OneLake shortcuts in Fabric workspaces, without the need to open the Storage account to public access [5]
    • {limitation} available for workspaces in Fabric capacities F64 or higher
  • {concept} workspace identity
    • a unique identity that can be associated with workspaces that are in Fabric capacities
    • enables OneLake shortcuts in Fabric to access Storage accounts that have [resource instance rules] configured
    • {operation} creating a workspace identity
      • Fabric creates a service principal in Microsoft Entra ID to represent the identity [5]
  • {concept} resource instance rules
    • a way to grant access to specific resources based on the workspace identity or managed identity [5] 
    • {operation} create resource instance rules 
      • created by deploying an ARM template with the resource instance rule details [5]
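
Example (sketch): the notes above say that OneLake supports the ADLS Gen2 APIs and can be addressed as if it were one big storage account, with workspaces as containers and data items as folders. The snippet below only illustrates how such a path could be composed. The host name and the `<item>.<itemtype>` convention follow the OneLake documentation rather than these notes, and the workspace, item, and file names are hypothetical placeholders without spaces or special characters.

```python
# Assumption: OneLake endpoint as given in the OneLake documentation.
ONELAKE_HOST = "onelake.dfs.fabric.microsoft.com"


def onelake_https_url(workspace: str, item: str, item_type: str, path: str = "") -> str:
    """Compose an ADLS Gen2-style HTTPS URL.

    The workspace sits where the container would be in ADLS Gen2,
    and the data item appears as the first folder below it.
    """
    base = f"https://{ONELAKE_HOST}/{workspace}/{item}.{item_type}"
    return f"{base}/{path.strip('/')}" if path else base


def onelake_abfss_uri(workspace: str, item: str, item_type: str, path: str = "") -> str:
    """Compose the equivalent abfss:// URI, as commonly used from Spark."""
    base = f"abfss://{workspace}@{ONELAKE_HOST}/{item}.{item_type}"
    return f"{base}/{path.strip('/')}" if path else base


if __name__ == "__main__":
    # Hypothetical workspace "Sales" containing a lakehouse "Orders".
    print(onelake_https_url("Sales", "Orders", "Lakehouse", "Files/raw/2024.csv"))
    print(onelake_abfss_uri("Sales", "Orders", "Lakehouse", "Tables/orders"))
```

Because the tenant maps to the root of the namespace, the same two helpers would cover every workspace in the organization; only the workspace and item names change.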

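Example (sketch): the {general guidance} bullets above can be read as a small decision table. The toy model below is not a Fabric API. It only encodes those bullets, under the assumption that Admin, Member, and Contributor are the workspace roles that grant write access and that they also include everything a Viewer can do. The function and access names are hypothetical.

```python
from typing import Iterable, Optional, Set

# Assumption: these are the workspace roles that grant write access.
WRITE_ROLES = {"Admin", "Member", "Contributor"}


def allowed_access(role: Optional[str] = None,
                   item_permissions: Iterable[str] = ()) -> Set[str]:
    """Return the kinds of access implied by the guidance bullets.

    "write" = write to data items in the workspace
    "lake"  = read data directly in OneLake
    "sql"   = read through warehouses, semantic models,
              or the SQL analytics endpoint
    """
    access: Set[str] = set()
    if role in WRITE_ROLES:
        access |= {"write", "lake", "sql"}
    elif role == "Viewer":
        access.add("sql")
    # Sharing an item with ReadAll grants lake access without a write role.
    if "ReadAll" in set(item_permissions):
        access.add("lake")
    return access


if __name__ == "__main__":
    print(sorted(allowed_access("Viewer")))
    print(sorted(allowed_access("Viewer", ["ReadAll"])))
    print(sorted(allowed_access("Contributor")))
```

Object-level restrictions such as SQL DENY statements sit on top of the "sql" path and are deliberately left out of this model.
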
References:
[1] Microsoft Learn (2023) Administer Microsoft Fabric (link)
[2] Microsoft Learn (2023) OneLake, the OneDrive for data (link)
[3] Microsoft Learn (2023) OneLake security (link)
[4] Microsoft Learn (2023) Get started securing your data in OneLake (link)
[5] Microsoft Fabric Updates Blog (2024) Introducing Trusted Workspace Access for OneLake Shortcuts, by Meenal Srivastva (link)

Resources:
[R1] Microsoft Learn (2025) Fabric: What's new in Microsoft Fabric? (link)

Acronyms:
ADLS - Azure Data Lake Storage
AES - Advanced Encryption Standard 
ARM - Azure Resource Manager
FIPS - Federal Information Processing Standard
SKU - Stock Keeping Unit