
16 March 2025

🏭🗒️Microsoft Fabric: Azure Storage Account [Notes]

Disclaimer: This is work in progress intended to consolidate information from various sources for learning purposes. For the latest information please consult the documentation (see the links below)! 

Last updated: 16-Mar-2025

[Microsoft Fabric] Azure Storage Account

  • {def} a container provided by Microsoft Azure that houses all storage data in the cloud, including blobs (binary large objects), files, queues, and tables 
  • provides a unique namespace for Azure Storage data that's accessible from anywhere in the world over HTTP or HTTPS [1]
    • every stored object has a URL that includes the customer's unique account name [1]
      • the combination of the account name and the service endpoint forms the endpoints for the storage account [1]
  • is an ARM resource
    • ⇐ ARM is the deployment and management service for Azure
    • belongs to an Azure resource group
      • ⇐ a logical container for grouping Azure services [3]
  • {characteristic} durable
  • {characteristic} highly available
  • {characteristic} secure
  • {characteristic} massively scalable
  • {concept} storage account name
    • must be between 3 and 24 characters in length
    • may contain numbers and lowercase letters only
    • must be unique within Azure
  • {operation} create storage account 
  • {operation} migrate storage account
    • {scenario} move a storage account to a different subscription
    • {scenario} move a storage account to a different resource group
    • {scenario} move a storage account to a different region
    • {scenario} upgrade to a general-purpose v2 storage account
    • {scenario} migrate a classic storage account to Azure Resource Manager
  • {operation} delete storage account
    • deletes the entire account, including all data in the account [3]
    • recovery is not guaranteed
      • under certain circumstances, a deleted storage account might be recovered [3]
    • deleting the resource group deletes the storage account and any other resources in that resource group [3]
    • {recommendation} back up any data before deleting the account [3]
  • {operation} monitor storage accounts
    • see storage metrics in Azure Monitor
  • {operation} transfer data into a storage account
  • {type} general purpose v1 (GPv1)
    • can no longer be created from the Azure portal [3]
    • no new feature development is expected for this account type [3]
    • currently there's no plan to deprecate support [3]
      • at least one year's advance notice will be provided before deprecating [3]
  • {type} general purpose v2 (GPv2)
    • recommended for most scenarios [3]
    • supports the latest Azure Storage features and incorporates all of the functionality of GPv1 [4]
    • upgrading to a GPv2 storage account from GPv1 or Blob storage accounts is straightforward [4]
      • permanent and cannot be undone [4]
      • there's no downtime or risk of data loss associated [4]
      • happens via a simple ARM operation that changes the account type [4]
      • the upgrade is free
      • changing the storage access tier after the upgrade may change the bill [4]
  • blob access tiers 
    • enable you to choose the most cost-effective storage based on your anticipated usage patterns [4]
  • storage account encryption
    • all data is automatically encrypted on the service side [1]
  • provides three storage services: 
    • table storage 
      • a NoSQL key/value store
    • queue storage 
      • a simple queueing service
    • blob storage
      • used extensively throughout Azure to store things from logs to virtual machine disks [2]
      • enables storing files predominantly in three different formats [2]
        • ⇐ according to read/write workload [2]
        • block blobs 
          • optimized for storing text or binary files
          • allow for efficient parallel upload/download of a block or list of blocks and modification of a blob at the block granularity [2]
          • modifications to an individual blob take a two-phase approach [2]
            • {phase 1} upload the changes as a set of blocks
            • {phase 2} commit the changes by identifying the list of uploaded blocks
          • commonly used for files where the typical workload is to read or write the entire file [2]
            • e.g. text files, CSVs, and binary files
        • append blobs
          • a variant of a block blob that is optimized for append-only write workloads [2]
            • e.g. logging
          • doesn't allow deleting or updating of existing blocks [2]
        • page blobs
          • optimized for predominantly random read/write workloads against portions of the blob, where data is stored in pages [2]
            • e.g. virtual machine disks
      • file structure for a blob in Blob Storage
        • container
          • logical grouping of blobs
            • similar to how folders group files on your local machine [2]
          • at the root of the Storage account [2]
          • it can be used to set access permissions on the blobs it contains [2]
          • each container can hold an unlimited number of blobs [2]
          • each blob (or more precisely, each container and blob pair) identifies a partition [2]
            • each file stored in Blob Storage is its own partition [2]
  • when creating the Storage account one can define the degree of replication of the data desired for high availability and disaster recovery purposes [2]
    • at a minimum, data stored within a Storage account is replicated on three separate nodes within a single facility (i.e., building)
      • {option} locally redundant storage (LRS)
        • stores three copies of the data within a facility (which is naturally within a specific geographic region) [2]
      • {option} zone-redundant storage (ZRS)
        • augments LRS by enabling a replica within another facility within the same region
        • supports only block blobs [2]
      • {option} geo-redundant storage (GRS)
        • automatically replicates blob storage to another geographic region that is hundreds of miles away from the primary [2]
        • this secondary replica is not readable unless the primary becomes unavailable 
          • when this happens, the failover is transparent to your application, but Azure will send you an email notification [2]
          • when new data arrives, that data is first replicated to the three local replicas and then asynchronously replicated to the secondary geographic replica (where it is also replicated three times) [2]
      • {option} read-access geo-redundant storage (RA-GRS)
        • variant of GRS providing a secondary endpoint that enables reading from the secondary Storage account [2]
  • Azure Monitor
    • provides a unified monitoring experience [4]
    • receives metric data from Azure Storage [4]
    • stores metrics that include aggregated transaction statistics and capacity data about requests to the storage service [4]
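
The naming rules and the way the account name combines with a service endpoint, as noted above, can be sketched in a few lines of Python. This is a minimal illustration, not an official validator: the helper names are hypothetical, the `core.windows.net` suffix is assumed to be the default public-cloud suffix, and the "unique within Azure" rule cannot be checked offline.

```python
import re

# Rule from the notes: 3 to 24 characters, digits and lowercase letters only.
_NAME_PATTERN = re.compile(r"[a-z0-9]{3,24}")

# Assumption: default public-cloud endpoint suffix; other clouds use different ones.
_ENDPOINT_SUFFIX = "core.windows.net"
_SERVICES = ("blob", "file", "queue", "table")


def is_valid_account_name(name: str) -> bool:
    """Check length and character rules only (uniqueness needs a call to Azure)."""
    return _NAME_PATTERN.fullmatch(name) is not None


def service_endpoints(name: str) -> dict[str, str]:
    """Combine the account name with each service name to form its endpoint."""
    if not is_valid_account_name(name):
        raise ValueError(f"invalid storage account name: {name!r}")
    return {svc: f"https://{name}.{svc}.{_ENDPOINT_SUFFIX}" for svc in _SERVICES}
```

For a hypothetical account named `mystorage01`, `service_endpoints("mystorage01")["blob"]` yields `https://mystorage01.blob.core.windows.net`.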

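The two-phase modification of block blobs described above (upload blocks, then commit a block list) can be pictured with a toy in-memory model. This is only a sketch under simplifying assumptions: `ToyBlockBlob` and its methods are hypothetical and are not the Azure SDK, and unlike the real service the model only commits blocks that were staged since the last commit. In the REST API the two phases correspond to the Put Block and Put Block List operations.

```python
class ToyBlockBlob:
    """In-memory stand-in for a block blob (illustration only, not the Azure SDK)."""

    def __init__(self) -> None:
        self._staged: dict[str, bytes] = {}  # uploaded but not yet committed
        self._committed: list[bytes] = []    # blocks that make up the readable blob

    def stage_block(self, block_id: str, data: bytes) -> None:
        # Phase 1: upload a block; the readable content does not change yet.
        self._staged[block_id] = data

    def commit_block_list(self, block_ids: list[str]) -> None:
        # Phase 2: the ordered list of block IDs defines the new blob content.
        missing = [b for b in block_ids if b not in self._staged]
        if missing:
            raise KeyError(f"blocks were never staged: {missing}")
        self._committed = [self._staged[b] for b in block_ids]
        self._staged.clear()

    def read(self) -> bytes:
        # Readers only ever see committed blocks.
        return b"".join(self._committed)
```

Because the commit list fixes the order, blocks can be staged in any order (for example in parallel): staging `"b2"` before `"b1"` and then committing `["b1", "b2"]` still produces the content in the intended sequence, and `read()` returns nothing until the commit happens.
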
References:
[1] Microsoft Learn (2025) Azure: Storage account overview
[2] Zoiner Tejada (2017) Mastering Azure Analytics
[3] Microsoft Learn (2025) Azure: Create a storage account
[4] Microsoft Learn (2025) Upgrade to a general-purpose v2 storage account

Acronyms:
ARM - Azure Resource Manager 
GRS - Geo-Redundant Storage
LRS - Locally Redundant Storage
RA-GRS - Read-Access Geo-Redundant Storage
ZRS - Zone-Redundant Storage

03 July 2019

🧱IT: Redundant Array of Independent Disks [RAID] (Definitions)

"Installation of several disk drives to a system. Some drives contain mirrored information so data is not lost. RAID disk drives can be replaced quickly in cases of disk failure. This technology is good for Web and database servers, so that no information is lost and the information is always available." (Patrick Dalton, "Microsoft SQL Server Black Book", 1997)

"Sometimes referred to as redundant array of inexpensive disks, a system that uses multiple disk drives (an array) to provide performance and reliability. There are six levels describing RAID arrays, 0 through 5. Each level uses a different algorithm to implement fault tolerance." (Microsoft Corporation, "SQL Server 7.0 System Administration Training Kit", 1999)

"A disk system that comprises multiple disk drives (an array) to provide higher performance, reliability, storage capacity, and lower cost. Fault-tolerant arrays are categorized in six RAID levels: 0 through 5. Each level uses a different algorithm to implement fault tolerance." (Thomas Moore, "EXAM CRAM™ 2: Designing and Implementing Databases with SQL Server 2000 Enterprise Edition", 2005)

"A specific fault-tolerant disk array system design strategy that takes into account issues of cost benefit, reliability, and performance. It can be implemented at a hardware or a software level; each provides a different profile of cost, reliability, and performance. Depending on the person defining RAID, the word independent may be substituted with inexpensive." (Allan Hirt et al, "Microsoft SQL Server 2000 High Availability", 2004)

"A bunch of small, cheap disks. A RAID array is a group of disks used together as a single unit logical disk. RAID arrays can help with storage capacity, recoverability and performance, using what are called mirroring and striping. Mirroring creates duplicate copies of all physical data. Striping breaks data into many small pieces, where those small pieces can be accessed in parallel." (Gavin Powell, "Beginning Database Design", 2006)

"A schema for using groups of disks to increase performance, protect data, or both." (Tom Petrocelli, "Data Protection and Information Lifecycle Management", 2005)

"This is a grouping, or array, of hard disks that appear as a single, logical drive to the operating system." (Joseph L Jorden & Dandy Weyn, "MCTS Microsoft SQL Server 2005: Implementation and Maintenance Study Guide - Exam 70-431", 2006)

"RAID is an acronym for Redundant Array of Independent Disks. RAID is a collection of disks that operates as a single disk." (S. Sumathi & S. Esakkirajan, "Fundamentals of Relational Database Management Systems", 2007)

"A RAID array uses multiple physical disks to simulate one logical, larger disk, often with protection from disk failure. (The I can also stand for Independent, and the D can also stand for Drives.) " (Victor Isakov et al, "MCITP Administrator: Microsoft SQL Server 2005 Optimization and Maintenance (70-444) Study Guide", 2007)

"Using more disks than is necessary for the actual data itself, as a buffer against failure of one (or possibly more) disks." (David G Hill, "Data Protection: Governance, Risk Management, and Compliance", 2009)

"A category of disk drives that employ two or more drives in combination for fault tolerance and performance." (Martin Oberhofer et al, "The Art of Enterprise Information Architecture", 2010)

"A system of disk storage where data is distributed across several drives for faster access and improved fault tolerance." (Paulraj Ponniah, "Data Warehousing Fundamentals for IT Professionals", 2010)

"A technology for configuring a logical data storage device across multiple physical devices to improve performance, availability or both. The primary goal is fault tolerance as in most configurations data can be recovered after a device failure and in some cases, without interruption." (DAMA International, "The DAMA Dictionary of Data Management", 2011)

"An acronym that means Redundant Array of Independent Disks. RAID is used to provide balance between performance and fault tolerance. RAID systems use multiple disks to create virtual disks (storage volumes) formed by several individual disks. RAID systems provide performance improvement and fault tolerance." (Carlos Coronel et al, "Database Systems: Design, Implementation, and Management" 9th Ed., 2011)

"A category of disk drives that employ two or more drives in combination to deliver fault tolerance and improved performance." (Craig S Mullins, "Database Administration", 2012)

"A multi-disk storage system that optimizes performance, data safety, or both, depending on the type." (Faithe Wempen, "Computing Fundamentals: Introduction to Computers", 2015)

About Me

Koeln, NRW, Germany
IT Professional with more than 25 years experience in IT in the area of full life-cycle of Web/Desktop/Database Applications Development, Software Engineering, Consultancy, Data Management, Data Quality, Data Migrations, Reporting, ERP implementations & support, Team/Project/IT Management, etc.