"A data mesh is inherently multimodal, and data products can be provided via a variety of means. Event streams remain the best option for the majority of data products, as it is far easier to power both operational and analytical use cases through a stream than a batch of files at rest." (Adam Bellemare, "Building an Event-Driven Data Mesh: Patterns for Designing and Building Event-Driven Architectures", 2023)
"Bad data is costly to fix, and it’s more costly the more widespread it is. Everyone who has accessed, used, copied, or processed the data may be affected and may require mitigating action on their part. The complexity is further increased by the fact that not every consumer will “fix” it in the same way. This can lead to divergent results that are divergent with others and can be a nightmare to detect, track down, and rectify." (Adam Bellemare, "Building an Event-Driven Data Mesh: Patterns for Designing and Building Event-Driven Architectures", 2023)
"Creating data products requires that domain owners have a degree of autonomy in modeling, building, and delivering data to their consumers. However, by empowering them with autonomy and independence, you run the risk of a significant technological sprawl across data product implementations, making it more difficult for consumers to use the data products for their own ends. Federated governance focuses on finding an equilibrium between the needs of the consumers, the autonomy of the data product owners, the business compliance and security requirements, and global data product requirements." (Adam Bellemare, "Building an Event-Driven Data Mesh: Patterns for Designing and Building Event-Driven Architectures", 2023)
"Data has historically been treated as a second-class citizen, as a form of exhaust or by-product emitted by business applications. This application-first thinking remains the major source of problems in today’s computing environments, leading to ad hoc data pipelines, cobbled together data access mechanisms, and inconsistent sources of similar-yet-different truths. Data mesh addresses these shortcomings head-on, by fundamentally altering the relationships we have with our data. Instead of a secondary by-product, data, and the access to it, is promoted to a first-class citizen on par with any other business service." (Adam Bellemare, "Building an Event-Driven Data Mesh: Patterns for Designing and Building Event-Driven Architectures", 2023)
"Data mesh architectures are inherently decentralized, and significant responsibility is delegated to the data product owners. A data mesh also benefits from a degree of centralization in the form of data product compatibility and common self-service tooling. Differing opinions, preferences, business requirements, legal constraints, technologies, and technical debt are just a few of the many factors that influence how we work together." (Adam Bellemare, "Building an Event-Driven Data Mesh: Patterns for Designing and Building Event-Driven Architectures", 2023)
"Data mesh promotes data to a product with the same rigor, ownership, and feature management of any other product in your business. The free-for-all, 'figure it out yourself' data access is replaced with purpose-built, maintained, and supported modes. It is as much a social shift as it is a technological shift and requires both top-down and bottom-up buy-in." (Adam Bellemare, "Building an Event-Driven Data Mesh: Patterns for Designing and Building Event-Driven Architectures", 2023)
"Enforcing a schema at read time, instead of at write time, leads to a proliferation of what we call 'bad data'. The lack of write-time checks means that data written into HDFS may not adhere to the schemas that the readers are using in their existing work […]. Some bad data will cause consumers to halt processing, while other bad data may go silently undetected. While both of these are problematic, silent failures can be deadly and difficult to detect." (Adam Bellemare, "Building an Event-Driven Data Mesh: Patterns for Designing and Building Event-Driven Architectures", 2023)
"Federated governance can be roughly broken down into two main tasks. The first is establishing cross-organization policies, including data product standards and datah andling requirements, that apply to all users of the data mesh. The second is providing guidance on creating and using data products with self-service tools to make it easy to participate in the data mesh." (Adam Bellemare, "Building an Event-Driven Data Mesh: Patterns for Designing and Building Event-Driven Architectures", 2023)
"The premise of the data mesh solution is simple. Publish important business data sets to dedicated, durable, and easily accessible data structures known as data products. The original creators of the data are responsible for modeling, evolution, quality, and support of the data, treating it with the same first-class care given to any other product in the organization. Prospective consumers can explore, discover, and subscribe to the data products they need for their business use cases. The data products should be well-described, easy to interpret, and form the basis for a set of self-updating data primitives for powering both business services and analytics." (Adam Bellemare, "Building an Event-Driven Data Mesh: Patterns for Designing and Building Event-Driven Architectures", 2023)
"The problem of bad data has existed for a very long time. Data copies diverge as their original source changes. Copies get stale. Errors detected in one data set are not fixed in duplicate ones. Domain knowledge related to interpreting and understanding data remains incomplete, as does support from the owners of the original data." (Adam Bellemare, "Building an Event-Driven Data Mesh: Patterns for Designing and Building Event-Driven Architectures", 2023)