Showing posts with label data governance. Show all posts
Showing posts with label data governance. Show all posts

15 October 2024

🗄️Data Management: Data Governance (Part III: Taming the Complexity)

Data Management Series
Data Management Series

The Chief Data Officer (CDO) or the “Head of the Data Team” is one of the most challenging jobs because is more of a "political" than a technical role. It requires the ideal candidate to be able to throw and catch curved balls almost all the time, and one must be able to play ball with all the parties having an interest in data (aka stakeholders). It’s a full-time job that requires the combination of management and technical skillsets, and both are important! The focus will change occasionally in one direction more than in the other, with important fluctuations. 

Moreover, even if one masters the technical and managerial aspects, the combination of the two gives birth to situations that require further expertise – applied systems thinking being probably the most important. This, also because there are so many points of failure that it's challenging to address all the important causes. Therefore, it’s critical to be a system thinker, to have an experienced team and make use adequately of its experience! 

In a complex word, in which even the smallest constraint or opportunity can have an important impact especially when it’s involved in the early stages of the processes taking place in organizations. It relies on the manager’s and team’s skillset, their inspiration, the way the business reacts to the tasks involved and probably many other aspects that make things work. It takes considerable effort until the whole mechanism works, and even more time to make things work efficiently. The best metaphor is probably the one of a small combat team in which everybody has their place and skillset in the mechanism, independently if one talks about strategy, tactics or operations. 

Unfortunately, building such teams takes time, and the more people are involved, the more complex this endeavor becomes. The manager and the team must meet somewhere in the middle in what concerns the philosophy, the execution of the various endeavors, the way of working together to achieve the same goals. There are multiple forces pulling in all directions and it takes time until one can align the goals, respectively the effort. 

The most challenging forces are the ones between the business and the data team, respectively the business and data requirements, forces that don’t necessarily converge. Working in small organizations, the two parties have in theory more challenges to overcome the challenges and a team’s experience can weight a lot in the process, though as soon the scale changes, the number of challenges to be overcome changes exponentially (there are however different exponential functions in which the basis and exponent make the growth rapid). 

In big organizations can appear other parties that have the same force to pull the weight in one direction or another. Thus, the political aspects become more complex to the degree that the technologies must follow the political decisions, with all the positive and negative implications deriving from this. As comparison, think about the challenges from moving from two to three or more moving bodies orbiting each other, resulting in a chaotic dynamical system for most initial conditions. 

Of course, a business’ context doesn’t have to create such complexity, though when things are unchecked, when delays in decision-making as well as other typical events occur, when there’s no structure, strategy, coordinated effort, or any other important components, the chances for chaotic behavior are quite high with the pass of time. This is just a model to explain real life situations that seem similar on the surface but prove to be quite complex when diving deeper. That’s probably why a CDO’s role as tamer of complexity is important and challenging!

Previous Post <<||>> Next Post

14 September 2024

🗄️Data Management: Data Governance (Part II: Heroes Die Young)

Data Management Series
Data Management Series

In the call for action there are tendencies in some organizations to idealize and overcharge main actors' purpose and image when talking about data governance by calling them heroes. Heroes are those people who fight for a goal they believe in with all their being and occasionally they pay the supreme tribute. Of course, the image of heroes is idealized and many other aspects are ignored, though such images sell ideas and ideals. Organizations might need heroes and heroic deeds to change the status quo, but the heroism doesn't necessarily payoff for the "heroes"! 

Sometimes, organizations need a considerable effort to change the status quo. It can be people's resistance to new, to the demands, to the ideas propagated, especially when they are not clearly explained and executed. It can be the incommensurable distance between the "AS IS" and the "TO BE" perspectives, especially when clear paths aren't in sight. It can be the lack of resources (e.g., time, money, people, tools), knowledge, understanding or skillset that makes the effort difficult. 

Unfortunately, such initiatives favor action over adequate strategies, planning and understanding of the overall context. The call do to something creates waves of actions and reactions which in the organizational context can lead to storms and even extreme behavior that ranges from resistance to the new to heroic deeds. Finding a few messages that support the call for action can help, though they can't replace the various critical for success factors.

Leading organizations on a new path requires a well-defined realistic strategy, respectively adequate tactical and operational planning that reflects organizations' specific needs, knowledge and capabilities. Just demanding from people to do their best is not enough, and heroism has chances to appear especially in this context. Unfortunately, the whole weight falls on the shoulders of the people chosen as actors in the fight. Ideally, it should be possible to spread the whole weight on a broader basis which should be considered the foundation for the new. 

The "heroes" metaphor is idealized and the negative outcome probably exaggerated, though extreme situations do occur in organizations when decisions, planning, execution and expectations are far from ideal. Ideal situations are met only in books and less in practice!

The management demands and the people execute, much like in the army, though by contrast people need to understand the reasoning behind what they are doing. Proper execution requires skillset, understanding, training, support, tools and the right resources for the right job. Just relying on people's professionalism and effort is not enough and is suboptimal, but this is what many organizations seem to do!

Organizations tend to respond to the various barriers or challenges with more resources or pressure instead of analyzing and depicting the situation adequately, and eventually change the strategy, tactics or operations accordingly. It's also difficult to do this as long an organization doesn't have the capabilities and practices of self-check, self-introspection, self-reflection, etc. Even if it sounds a bit exaggerated, an organization must know itself to overcome the various challenges. Regular meetings, KPIs and other metrics give the illusion of control when self-control is needed. 

Things don't have to be that complex even if managing data governance is a complex endeavor. Small or midsized organizations are in theory more capable to handle complexity because they can be more agile, have a robust structure and the flow of information and knowledge has less barriers, respectively a shorter distance to overcome, at least in theory. One can probably appeal to the laws and characteristics of networks to understand more about the deeper implications, of how solutions can be implemented in more complex setups.

01 September 2024

🗄️Data Management: Data Governance (Part I: No Guild of Heroes)

Data Management Series
Data Management Series

Data governance appeared around 1980s as topic though it gained popularity in early 2000s [1]. Twenty years later, organizations still miss the mark, respectively fail to understand and implement it in a consistent manner. As usual, the reasons for failure are multiple and they vary from misunderstanding what governance is all about to poor implementation of methodologies and inadequate management or leadership. 

Moreover, methodologies tend to idealize the various aspects and is not what organizations need, but pragmatism. For example, data governance is not about heroes and heroism [2], which can give the impression that heroic actions are involved and is not the case! Actions for the sake of action don’t necessarily lead to change by themselves. Organizations are in general good at creating meaningless action without results, especially when people preoccupy themselves, miss or ignore the mark. Big organizations are very good at generating actions without effects. 

People do talk to each other, though they try to solve their own problems and optimize their own areas without necessarily thinking about the bigger picture. The problem is not necessarily communication or the lack of depth into business issues, people do communicate, know the issues without a business impact assessment. The challenge is usually in convincing the upper management that the effort needs to be consolidated, supported, respectively the needed resources made available. 

Probably, one of the issues with data governance is the attempt of creating another structure in the organization focused on quality, which has the chances to fail, and unfortunately does fail. Many issues appear when the structure gains weight and it becomes a separate entity instead of being the backbone of organizations. 

As soon organizations separate the data governance from the key users, management and the other important decisional people in the organization, it takes a life of its own that has the chances to diverge from the initial construct. Then, organizations need "alignment" and probably other big words to coordinate the effort. Also such constructs can work but they are suboptimal because the forces will always pull in different directions.

Making each manager and the upper management responsible for governance is probably the way to go, though they’ll need the time for it. In theory, this can be achieved when many of the issues are solved at the lower level, when automation and further aspects allow them to supervise things, rather than hiding behind every issue. 

When too much mircomanagement is involved, people tend to busy themselves with topics rather than solve the issues they are confronted with. The actual actors need to be empowered to take decisions and optimize their work when needed. Kaizen, the philosophy of continuous improvement, proved itself that it works when applied correctly. They’ll need the knowledge, skills, time and support to do it though. One of the dangers is however that this becomes a full-time responsibility, which tends to create a separate entity again.

The challenge for organizations lies probably in the friction between where they are and what they must do to move forward toward the various objectives. Moving in small rapid steps is probably the way to go, though each person must be aware when something doesn’t work as expected and react. That’s probably the most important aspect. 

So, the more functions are created that diverge from the actual organization, the higher the chances for failure. Unfortunately, failure is visible in the later phases, and thus self-awareness, self-control and other similar “qualities” are needed, like small actors that keep the system in check and react whenever is needed. Ideally, the employees are the best resources to react whenever something doesn’t work as per design. 

Previous Post <<||>> Next Post 

Resources:
[1] Wikipedia (2023) Data Management [link]
[2] Tiankai Feng (2023) How to Turn Your Data Team Into Governance Heroes [link]


28 March 2024

🗄️🗒️Data Management: Master Data Management [MDM] [Notes]

Disclaimer: This is work in progress intended to consolidate information from various sources. 
Last updated: 28-Mar-2024

Master Data Management (MDM)

  • {definition} the technologies, processes, policies, standards and guiding principles that enable the management of master data values to enable consistent, shared, contextual use across systems, of the most accurate, timely, and relevant version of truth about essential business entities [2],[3]
  • {goal} enable sharing of information assets across business domains and applications within an organization [4]
  • {goal} provide authoritative source of reconciled and quality-assessed master (and reference) data [4]
  • {goal} lower cost and complexity through use of standards, common data models, and integration patterns [4]
  • {driver} meeting organizational data requirements
  • {driver} improving data quality
  • {driver} reducing the costs for data integration
  • {driver} reducing risks 
  • {type} operational MDM 
    • involves solutions for managing transactional data in operational applications [1]
    • rely heavily on data integration technologies
  • {type} analytical MDM
    • involves solutions for managing analytical master data
    • centered on providing high quality dimensions with multiple hierarchies [1]
    • cannot influence operational systems
      • any data cleansing made within operational application isn’t recognized by transactional applications [1]
        • ⇒ inconsistencies to the main operational data [1]
      • transactional application knowledge isn’t available to the cleansing process
  • {type} enterprise MDM
    • involves solutions for managing both transactional and analytical master data 
      • manages all master data entities
      • deliver maximum business value
    • operational data cleansing
      • improves the operational efficiencies of the applications and the business processes that use the applications
    • cross-application data need
      • consolidation
      • standardization
      • cleansing
      • distribution
    • needs to support high volume of transactions
      • ⇒ master data must be contained in data models designed for OLTP
        • ⇐ ODS don’t fulfill this requirement 
  • {enabler} high-quality data
  • {enabler} data governance
  • {benefit} single source of truth
    • used to support both operational and analytical applications in a consistent manner [1]
  • {benefit} consistent reporting
    • reduces the inconsistencies experienced previously
    • influenced by complex transformations
  • {benefit} improved competitiveness
    • MDM reduces the complexity of integrating new data and systems into the organization
      • ⇒ increased flexibility and improves competitiveness
    • ability to react to new business opportunities quickly with limited resources
  • {benefit} improved risk management
    • more reliable and consistent data improves the business’s ability to manage enterprise risk [1]
  • {benefit} improved operational efficiency and reduced costs
    • helps identify business’ pain point
      • by developing a strategy for managing master data
  • {benefit} improved decision making
    • reducing data inconsistency diminishes organizational data mistrust and facilitates clearer (and faster) business decisions [1]
  • {benefit} more reliable spend analysis and planning
    • better data integration helps planners come up with better decisions
      • improves the ability to 
        • aggregate purchasing activities
        • coordinate competitive sourcing
        • be more predictable about future spending
        • generally improve vendor and supplier management
  • {benefit} regulatory compliance
    • allows to reduce compliance risk
      • helps satisfy governance, regulatory and compliance requirements
    • simplifies compliance auditing
      • enables more effective information controls that facilitate compliance with regulations
  • {benefit} increased information quality
    • enables organizations to monitor conformance more effectively
      • via metadata collection
      • it can track whether data meets information quality expectations across vertical applications, which reduces information scrap and rework
  • {benefit} quicker results
    • reduces the delays associated with extraction and transformation of data [1]
      • ⇒ it speeds up the implementation of application migrations, modernization projects, and data warehouse/data mart construction [1]
  • {benefit} improved business productivity
    • gives enterprise architects the chance to explore how effective the organization is in automating its business processes by exploiting the information asset [1]
      • ⇐ master data helps organizations realize how the same data entities are represented, manipulated, or exchanged across applications within the enterprise and how those objects relate to business process workflows [1]
  • {benefit} simplified application development
    • provides the opportunity to consolidate the application functionality associated with the data lifecycle [1]
      • ⇐ consolidation in MDM is not limited to the data
      • ⇒ provides a single functional to which different applications can subscribe
        • ⇐ introducing a technical service layer for data lifecycle functionality provides the type of abstraction needed for deploying SOA or similar architectures
  • factors to consider for implementing an MDM:
    • effective technical infrastructure for collaboration [1]
    • organizational preparedness
      • for making a quick transition from a loosely combined confederation of vertical silos to a more tightly coupled collaborative framework
      • {recommendation} evaluate the kinds of training sessions and individual incentives required to create a smooth transition [1]
    • metadata management
      • via a metadata registry 
        • {recommendation} sets up a mechanism for unifying a master data view when possible [1]
        • determines when that unification should be carried out [1]
    • technology integration
      • {recommendation} diagnose what technology needs to be integrated to support the process instead of developing the process around the technology [1]
    • anticipating/managing change
      • proper preparation and organization will subtly introduce change to the way people think and act as shown in any shift in pattern [1]
      • changes in reporting structures and needs are unavoidable
    • creating a partnership between Business and IT
      • IT roles
        • plays a major role in executing the MDM program[1]
      • business roles
        • identifying and standardizing master data [1]
        • facilitating change management within the MDM program [1]
        • establishing data ownership
    • measurably high data quality
    • overseeing processes via policies and procedures for data governance [1]
  • {challenge} establishing enterprise-wide data governance
    • {recommendation} define and distribute the policies and procedures governing the oversight of master data
      • seeking feedback from across the different application teams provides a chance to develop the stewardship framework agreed upon by the majority while preparing the organization for the transition [1]
  • {challenge} isolated islands of information
    • caused by vertical alignment of IT
      • makes it difficult to fix the dissimilarities in roles and responsibilities in relation to the isolated data sets because they are integrated into a master view [1]
    • caused by data ownership
      • the politics of information ownership and management have created artificial exclusive domains supervised by individuals who have no desire to centralize information [1]
  • {challenge} consolidating master data into a centrally managed data asset [1]
    • transfers the responsibility and accountability for information management from the lines of business to the organization [1]
  • {challenge} managing MDM
    • MDM should be considered a program and not a project or an application [1]
  • {challenge} achieving timely and accurate synchronization across disparate systems [1]
  • {challenge} different definitions of master metadata 
    • different coding schemes, data types, collations, and more
      • ⇐ data definitions must be unified
  • {challenge} data conflicts 
    • {recommendation} resolve data conflicts during the project [5]
    • {recommendation} replicate the resolved data issues back to the source systems [5]
  • {challenge} domain knowledge 
    • {recommendation} involve domain experts in an MDM project [5]
  • {challenge} documentation
    • {recommendation} properly document your master data and metadata [5]
  • approaches
    • {architecture} no central MDM 
      • isn’t a real MDM approach
      • used when any kind of cross-system interaction is required [5]
        • e.g. performing analysis on data from multiple systems, ad-hoc merging and cleansing
      • {drawback} very inexpensive at the beginning; however, it turns out to be the most expensive over time [5]
    • {architecture} central metadata storage 
      • provides unified, centrally maintained definitions for master data [5]
        • followed and implemented by all systems
      • ad-hoc merging and cleansing becomes somewhat simpler [5]
      • does not use a specialized solution for the central metadata storage [5]
        • ⇐ the central storage of metadata is probably in an unstructured form 
          • e.g. documents, worksheets, paper
    • {architecture} central metadata storage with identity mapping 
      • stores keys that map tables in the MDM solution
        • only has keys from the systems in the MDM database; it does not have any other attributes [5]
      • {benefit} data integration applications can be developed much more quickly and easily [5]
      • {drawback} raises problems in regard to maintaining master data over time [5]
        • there is no versioning or auditing in place to follow the changes [5]
          • ⇒ viable for a limited time only
            • e.g. during upgrading, testing, and the initial usage of a new ERP system to provide mapping back to the old ERP system
    • {architecture} central metadata storage and central data that is continuously merged 
      • stores metadata as well as master data in a dedicated MDM system
      • master data is not inserted or updated in the MDM system [5]
      • the merging (and cleansing) of master data from source systems occurs continuously, regularly [5]
      • {drawback} continuous merging can become expensive [5]
      • the only viable use for this approach is for finding out what has changed in source systems from the last merge [5]
        • enables merging only the delta (new and updated data)
      • frequently used for analytical systems
    • {architecture} central MDM, single copy 
      • involves a specialized MDM application
        • master data, together with its metadata, is maintained in a central location [5]
        • ⇒ all existing applications are consumers of the master data
      • {drawback} upgrade all existing applications to consume master data from central storage instead of maintaining their own copies [5]
        • ⇒ can be expensive
        • ⇒ can be impossible (e.g. for older systems)
      • {drawback} needs to consolidate all metadata from all source systems [5]
      • {drawback} the process of creating and updating master data could simply be too slow [5]
        • because of the processes in place
    • {architecture} central MDM, multiple copies 
      • uses central storage of master data and its metadata
        • ⇐ the metadata here includes only an intersection of common metadata from source systems [5]
        • each source system maintains its own copy of master data, with additional attributes that pertain to that system only [5]
      • after master data is inserted into the central MDM system, it is replicated (preferably automatically) to source systems, where the source-specific attributes are updated [5]
      • {benefit} good compromise between cost, data quality, and the effectiveness of the CRUD process [5]
      • {drawback} update conflicts
        • different systems can also update the common data [5]
          • ⇒ involves continuous merges as well [5]
      • {drawback} uses a special MDM application
Acronyms:
MDM - Master Data Management
ODS - Operational Data Store
OLAP - online analytical processing
OLTP - online transactional processing
SOA - Service Oriented Architecture

References:
[1] The Art of Service (2017) Master Data Management Course 
[2] DAMA International (2009) "The DAMA Guide to the Data Management Body of Knowledge" 1st Ed.
[3] Tony Fisher 2009 "The Data Asset"
[4] DAMA International (2017) "The DAMA Guide to the Data Management Body of Knowledge" 2nd Ed.
[5] Dejan Sarka et al (2012) Exam 70-463: Implementing a Data Warehouse with Microsoft SQL Server 2012 (Training Kit)

22 March 2024

🧭Business Intelligence: Monolithic vs. Distributed Architecture (Part III: Architectural Applications)

 

Business Intelligence
Business Intelligence Series

Now considering the 500 houses and the skyscraper model introduced in thee previous post, which do you think will be built first? A skyscraper takes 2-10 years to build, depending on the city in which is built and the architecture characteristics. A house may take 6-12 months depending on similar factors. But one needs to build 500 houses. For sure the process can be optimized when the houses look the same, though there are many constraints one needs to consider - the number of workers, tools, and the construction material available at a given time, the volume of planning, etc. 

Within a rough estimate, it can take 2-5 years for each architecture to be built considering that on the average the advantages and disadvantages from the various areas can balance each other out. Historical data are in general needed for estimating the actual development time. One can start with a rough estimate and reevaluate the estimates up and down as more information are gathered. This usually happens in Software Engineering as well. 

Monolith vs. Distributed Architecture
Monolith vs. Distributed Architecture - 500 families

There are multiple ways in which the work can be assigned to the contractors. When the houses are split between domains, each domain can have its own contractor(s) or the contractors can be specialized by knowledge areas, or a combination of the two. Contractors’ performance should be the same, though in practice no two contractors are the same. Conversely, the chances are higher for some contractors to deliver at the expected quality. It would be useful to have worked before with the contractors and have a partnership that spans years back. There are risks on both sides, even if the risks might favor one architecture over the other, and this depends also on the quality of the contractors, designs, and planning. 

The planning must be good if not perfect to assure smooth development and each day can cost money when contractors are involved. The first planning must be done for the whole project and then split individually for each contractor and/or group of buildings. A back-and-forth check between the various plans is needed. Managing by exception can work, though it can also go terribly wrong. 

Lot of communication must occur between domains to make sure that everything fits together. Especially at the beginning, all the parties must plan together, must make sure that the rules of the games (best practices, policies, procedures, processes, methodologies) are agreed upon. Oversight (governance) needs to happen at a small scale as well on aggregate to makes sure that the rules of the game are followed. 

Now, which of the architectures do you think will fit a data warehouse (DWH)? Probably multiple voices will opt for the skyscraper, at least this is how a DWH looks from the outside. However, when one evaluates the architecture behind it, it can resemble a residential complex in which parts are bound together, but there are parts that can be distributed if needed. For example, in a DWH the HR department has its own area that's isolated from the other areas as it has higher security demands. There can be 2-3 other areas that don't share objects, and they can be distributed as well. The reasons why all infrastructure is on one machine are the costs associated with the licenses, respectively the reporting tools point to only one address. 

In data marts based DWHs, there are multiple buildings within the architecture, and thus the data marts can be distributed across a wider infrastructure, with each domain responsible for its own data mart(s). The data marts are by definition domain-dependent, and this is one of the downsides imputed to this architecture. 

Previous Post <<||>> Next Post

23 December 2017

🗃️Data Management: Data Governance (Just the Quotes)

"Data migration is not just about moving data from one place to another; it should be focused on: realizing all the benefits promised by the new system when you entertained the concept of new software in the first place; creating the improved enterprise performance that was the driver for the project; importing the best, the most appropriate and the cleanest data you can so that you enhance business intelligence; maintaining all your regulatory, legal and governance compliance criteria; staying securely in control of the project." (John Morris, "Practical Data Migration", 2009)

"Are data quality and data governance the same thing? They share the same goal, essentially striving for the same outcome of optimizing data and information results for business purposes. Data governance plays a very important role in achieving high data quality. It deals primarily with orchestrating the efforts of people, processes, objectives, technologies, and lines of business in order to optimize outcomes around enterprise data assets. This includes, among other things, the broader cross-functional oversight of standards, architecture, business processes, business integration, and risk and compliance. Data governance is an organizational structure that oversees the compliance and standards of enterprise data." (Neera Bhansali, "Data Governance: Creating Value from Information Assets", 2014)

"Data governance is about putting people in charge of fixing and preventing data issues and using technology to help aid the process. Any time data is synchronized, merged, and exchanged, there have to be ground rules guiding this. Data governance serves as the method to organize the people, processes, and technologies for data-driven programs like data quality; they are a necessary part of any data quality effort." (Neera Bhansali, "Data Governance: Creating Value from Information Assets", 2014)

"Data governance is the process of creating and enforcing standards and policies concerning data. [...] The governance process isn't a transient, short-term project. The governance process is a continuing enterprise-focused program." (Neera Bhansali, "Data Governance: Creating Value from Information Assets", 2014)

"Understanding an organization's current processes and issues is not enough to build an effective data governance program. To gather business, functional, and technical requirements, understanding the future vision of the business or organization is important. This is followed with the development of a visual prototype or logical model, independent of products or technology, to demonstrate the data governance process. This business-driven model results in a definition of enterprise-wide data governance based on key standards and processes. These processes are independent of the applications and of the tools and technologies required to implement them. The business and functional requirements, the discovery of business processes, along with the prototype or model, provide an impetus to address the "hard" issues in the data governance process." (Neera Bhansali, "Data Governance: Creating Value from Information Assets", 2014)

"A big part of data governance should be about helping people (business and technical) get their jobs done by providing them with resources to answer their questions, such as publishing the names of data stewards and authoritative sources and other metadata, and giving people a way to raise, and if necessary escalate, data issues that are hindering their ability to do their jobs. Data governance helps answer some basic data management questions." (Mike Fleckenstein & Lorraine Fellows, "Modern Data Strategy", 2018)

"Data lake is an ecosystem for the realization of big data analytics. What makes data lake a huge success is its ability to contain raw data in its native format on a commodity machine and enable a variety of data analytics models to consume data through a unified analytical layer. While the data lake remains highly agile and data-centric, the data governance council governs the data privacy norms, data exchange policies, and the ensures quality and reliability of data lake." (Saurabh Gupta et al, "Practical Enterprise Data Lake Insights", 2018)

"Data governance policies must not enforce constraints on data - Data governance intends to control the level of democracy within the data lake. Its sole purpose of existence is to maintain the quality level through audits, compliance, and timely checks. Data flow, either by its size or quality, must not be constrained through governance norms. [...] Effective data governance elevates confidence in data lake quality and stability, which is a critical factor to data lake success story. Data compliance, data sharing, risk and privacy evaluation, access management, and data security are all factors that impact regulation." (Saurabh Gupta et al, "Practical Enterprise Data Lake Insights", 2018)

"Data governance presents a clear shift in approach, signals a dedicated focus on data management, distinctly identifies accountability for data, and improves communication through a known escalation path for data questions and issues. In fact, data governance is central to data management in that it touches on essentially every other data management function. In so doing, organizational change will be brought to a group is newly - and seriously - engaging in any aspect of data management." (Mike Fleckenstein & Lorraine Fellows, "Modern Data Strategy", 2018)

"Data is owned by the enterprise, not by systems or individuals. The enterprise should recognize and formalize the responsibilities of roles, such as data stewards, with specific accountabilities for managing data. A data governance framework and guidelines must be developed to allow data stewards to coordinate with their peers and to communicate and escalate issues when needed. Data should be governed cooperatively to ensure that the interests of data stewards and users are represented and also that value to the enterprise is maximized." (Mike Fleckenstein & Lorraine Fellows, "Modern Data Strategy", 2018)

"Data swamp, on the other hand, presents the devil side of a lake. A data lake in a state of anarchy is nothing but turns into a data swamp. It lacks stable data governance practices, lacks metadata management, and plays weak on ingestion framework. Uncontrolled and untracked access to source data may produce duplicate copies of data and impose pressure on storage systems." (Saurabh Gupta et al, "Practical Enterprise Data Lake Insights", 2018)

"Typically, a data steward is responsible for a data domain (or part of a domain) across its life cycle. He or she supports that data domain across an entire business process rather than for a specific application or a project. In this way, data governance provides the end user with a go-to resource for data questions and requests. When formally applied, data governance also holds managers and executives accountable for data issues that cannot be resolved at lower levels. Thus, it establishes an escalation path beginning with the end user. Most important, data governance determines the level - local, departmental or enterprise - at which specific data is managed. The higher the value of a particular data asset, the more rigorous its data governance." (Mike Fleckenstein & Lorraine Fellows, "Modern Data Strategy", 2018)

"Broadly speaking, data governance builds on the concepts of governance found in other disciplines, such as management, accounting, and IT. Think of it as a set of practices and guidelines that define the loci of accountability and responsibility related to data within the organization. These guidelines support the organization's business model through generating and consuming data." (Gregory Vial, "Data Governance in the 21st-Century Organization", 2020)

"Good [data] governance requires balance and adjustment. When done well, it can fuel digital innovation without compromising security." (Gregory Vial, "Data Governance in the 21st-Century Organization", 2020)

"Good data governance ensures that downstream negative effects of poor data are avoided and that subsequent reports, analyses and conclusions are based on reliable, trusted data." (Robert F Smallwood, "Information Governance: Concepts, Strategies and Best Practices" 2ndEd., 2020)

"Where data governance really takes place is between strategy and the daily management of operations. Data governance should be a bridge that translates a strategic vision acknowledging the importance of data for the organization and codifying it into practices and guidelines that support operations, ensuring that products and services are delivered to customers."  (Gregory Vial, "Data Governance in the 21st-Century Organization", 2020)

"In an era of machine learning, where data is likely to be used to train AI, getting quality and governance under control is a business imperative. Failing to govern data surfaces problems late, often at the point closest to users (for example, by giving harmful guidance), and hinders explainability (garbage data in, machine-learned garbage out)." (Jesús Barrasa et al, "Knowledge Graphs: Data in Context for Responsive Businesses", 2021)

"Governance requires a really fine balance - governing to the point where consistency is assured, but flexibility remains. There is no perfect formula, but finding the right governance level within your organization’s culture is a critical component to making the most of BI opportunities." (Mike Saliter)

05 May 2017

⛏️Data Management: Data Steward (Definitions)

"A person with responsibility to improve the accuracy, reliability, and security of an organization’s data; also works with various groups to clearly define and standardize data." (Margaret Y Chu, "Blissful Data ", 2004)

"Critical players in data governance councils. Comfortable with technology and business problems, data stewards seek to speak up for their business units when an organization-wide decision will not work for that business unit. Yet they are not turf protectors, instead seeking solutions that will work across an organization. Data stewards are responsible for communication between the business users and the IT community." (Tony Fisher, "The Data Asset", 2009)

"A business leader and/or subject matter expert designated as accountable for: a) the identification of operational and Business Intelligence data requirements within an assigned subject area, b) the quality of data names, business definitions, data integrity rules, and domain values within an assigned subject area, c) compliance with regulatory requirements and conformance to internal data policies and data standards, d) application of appropriate security controls, e) analyzing and improving data quality, and f) identifying and resolving data related issues. Data stewards are often categorized as executive data stewards, business data stewards, or coordinating data stewards." (DAMA International, "The DAMA Dictionary of Data Management", 2011)

[business data steward:] "A knowledge worker, business leader, and recognized subject matter expert assigned accountability for the data specifications and data quality of specifically assigned business entities, subject areas or databases, but with less responsibility for data governance than a coordinating data steward or an executive data steward." (DAMA International, "The DAMA Dictionary of Data Management", 2011)

"The person responsible for maintaining a data element in a metadata registry." (Microsoft, "SQL Server 2012 Glossary, 2012)

"The term stewardship is “the management or care of another person’s property” (NOAD). Data stewards are individuals who are responsible for the care and management of data. This function is carried out in different ways based on the needs of particular organizations." (Laura Sebastian-Coleman, "Measuring Data Quality for Ongoing Improvement ", 2012)

"The person responsible for maintaining a data element in a metadata registry." (Microsoft, SQL Server 2012 Glossary, 2012)

"An individual comfortable with both technology and business problems. Stewards are responsible for communicating between the business users and the IT community." (Jim Davis & Aiman Zeid, "Business Transformation: A Roadmap for Maximizing Organizational Insights", 2014)

"A role in the data governance organization that is responsible for the development of a uniform data model for business objects used across boundaries. The data steward is also often responsible for the development of master data management and ensures compliance with the governance rules." (Boris Otto & Hubert Österle, "Corporate Data Quality", 2015)

"A natural person assigned the responsibility to catalog, define, and monitor changes to critical data. Example: The data steward for finance critical data is Dan." (Gregory Lampshire, "The Data and Analytics Playbook", 2016)

"A person responsible for managing data content, quality, standards, and controls within an organization or function." (Jonathan Ferrar et al, "The Power of People", 2017)

"A data steward is a job role that involves planning, implementing and managing the sourcing, use and maintenance of data assets in an organization. Data stewards enable an organization to take control and govern all the types and forms of data and their associated libraries or repositories." (Richard T Herschel, "Business Intelligence", 2019)

26 January 2017

⛏️Data Management: Data Governance (Definitions)

"The infrastructure, resources, and processes involved in managing data as a corporate asset." (Jill Dyché & Evan Levy, "Customer Data Integration", 2006)

"A process focused on managing the quality, consistency, usability, security, and availability of information." (Alex Berson & Lawrence Dubov, "Master Data Management and Customer Data Integration for a Global Enterprise", 2007)

"The practice of organizing and implementing policies, procedures, and standards for the effective use of an organization's structured or unstructured information assets." (Laura Reeves, "A Manager's Guide to Data Warehousing", 2009)

"The process for addressing how data enters the organization, who is accountable for it, and how - using people, processes, and technologies - data achieves a quality standard that allows for complete transparency within an organization." (Tony Fisher, "The Data Asset", 2009)

"A framework of processes aimed at defining and managing the quality, consistency, usability, security, and availability of information with the primary focus on cross-functional, cross-departmental, and/or cross-divisional concerns of information management." (Alex Berson & Lawrence Dubov, "Master Data Management and Data Governance", 2010)

"The policies and processes that continually work to improve and ensure the availability, accessibility, quality, consistency, auditability, and security of data in a company or institution." (David Lyle & John G Schmidt, "Lean Integration", 2010)

"The exercise of authority, control, and shared decision-making (planning, monitoring, and enforcement) over the management of data assets." (DAMA International, "The DAMA Dictionary of Data Management", 2011)

"Data governance is the specification of decision rights and an accountability framework to encourage desirable behavior in the valuation, creation, storage, use, archival and deletion of data and information. It includes the processes, roles, standards and metrics that ensure the effective and efficient use of data and information in enabling an organization to achieve its goals." (Oracle, "Enterprise Information Management: Best Practices in Data Governance", 2011)

"Processes and controls at the data level; a newer, hybrid quality control discipline that includes elements of data quality, data management, information governance policy development, business process improvement, and compliance and risk management."(Robert F Smallwood, "Information Governance: Concepts, Strategies, and Best Practices", 2014)

"The process for addressing how data enters the organization, who is accountable for it, and how that data achieves the organization's quality standards that allow for complete transparency within an organization." (Jim Davis & Aiman Zeid, "Business Transformation", 2014) 

"A company-wide framework that determines which decisions must be made and who should make them. This includes the definition of roles, responsibilities, obligations and rights in handling the company’s resource data. In this, data governance pursues the goal of maximizing the value of the data in the company. While data governance determines how decisions should be made, data management makes the actual decisions and implements them." (Boris Otto & Hubert Österle, "Corporate Data Quality", 2015)

"The discipline of applying controls to data in order to ensure its integrity over time." (Gregory Lampshire, "The Data and Analytics Playbook", 2016)

"Data governance refers to the overall management of the availability, usability, integrity and security of the data employed in an enterprise. Sound data governance programs include a governing body or council, a defined set of procedures and a standard operating procedure." (Dennis C Guster, "Scalable Data Warehouse Architecture: A Higher Education Case Study", 2018)

"It is a combination of people, processes and technology that drives high-quality, high-value information. The technology portion of data governance combines data quality, data integration and master data management to ensure that data, processes, and people can be trusted and accountable, and that accurate information flows through the enterprise driving business efficiency." (Richard T Herschel, "Business Intelligence", 2019)

"The processes and technical infrastructure that an organization has in place to ensure data privacy, security, availability, usability, and integrity." (Lili Aunimo et al, "Big Data Governance in Agile and Data-Driven Software Development: A Market Entry Case in the Educational Game Industry", 2019)

"The management of data throughout its entire lifecycle in the company to ensure high data quality. Data Governance uses guidelines to determine which standards are applied in the company and which areas of responsibility should handle the tasks required to achieve high data quality." (Mohammad K Daradkeh, "Enterprise Data Lake Management in Business Intelligence and Analytics: Challenges and Research Gaps in Analytics Practices and Integration", 2021)

"A set of processes that ensures that data assets are formally managed throughout the enterprise. A data governance model establishes authority and management and decision making parameters related to the data produced or managed by the enterprise." (NSA/CSS)

"The management of the availability, usability, integrity and security of the data stored within an enterprise." (Solutions Review)

"The process of defining the rules that data has to follow within an organization." (Talend)

Data governance 2.0: "An agile approach to data governance focused on just enough controls for managing risk, which enables broader and more insightful use of data required by the evolving needs of an expanding business ecosystem." (Forrester)

"Data governance encompasses the strategies and technologies used to ensure data is in compliance with regulations and organization policies with respect to data usage." (Adobe)

"Data governance encompasses the strategies and technologies used to make sure business data stays in compliance with regulations and corporate policies." (Informatica) [source]

"Data Governance includes the people, processes and technologies needed to manage and protect the company’s data assets in order to guarantee generally understandable, correct, complete, trustworthy, secure and discoverable corporate data." (BI Survey) [source]

"Data governance is a control that ensures that data entry by a business user or an automated process meets business standards. It manages a variety of things including availability, usability, accuracy, integrity, consistency, completeness, and security of data usage. Through data governance, organizations are able to exercise positive control over the processes and methods to handle data." (Logi Analytics) [source]

"Data governance is a structure put in place allowing organisations to proactively manage data quality." (experian) [source]

"Data governance is an organization's internal policy framework that determines the way people make data management decisions. All aspects of data management must be carried out in accordance with the organization's governance policies." (Xplenty) [source]

"Data Governance is the exercise of decision-making and authority for data-related matters." (The Data Governance Institute)

"Data Governance is a system of decision rights and accountabilities for information-related processes, executed according to agreed-upon models which describe who can take what actions with what information, and when, under what circumstances, using what methods." (The Data Governance Institute)

"Data governance is the practice of organizing and implementing policies, procedures and standards for the effective use of an organization's structured/unstructured information assets." (Information Management)

"Data governance is the specification of decision rights and an accountability framework to ensure the appropriate behavior in the valuation, creation, consumption and control of data and analytics." (Gartner)

"The exercise of authority, control and shared decision making (planning, monitoring and enforcement) over the management of data assets. It refers to the overall management of the availability, usability, integrity, and security of the data employed in an enterprise. A sound data governance program includes a governing body or council, a defined set of procedures, and a plan to execute those procedures." (CODATA)

23 November 2006

🔢Neera Bhansali - Collected Quotes

"Are data quality and data governance the same thing? They share the same goal, essentially striving for the same outcome of optimizing data and information results for business purposes. Data governance plays a very important role in achieving high data quality. It deals primarily with orchestrating the efforts of people, processes, objectives, technologies, and lines of business in order to optimize outcomes around enterprise data assets. This includes, among other things, the broader cross-functional oversight of standards, architecture, business processes, business integration, and risk and compliance. Data governance is an organizational structure that oversees the compliance and standards of enterprise data." (Neera Bhansali, "Data Governance: Creating Value from Information Assets", 2014)

"Data governance is about putting people in charge of fixing and preventing data issues and using technology to help aid the process. Any time data is synchronized, merged, and exchanged, there have to be ground rules guiding this. Data governance serves as the method to organize the people, processes, and technologies for data-driven programs like data quality; they are a necessary part of any data quality effort." (Neera Bhansali, "Data Governance: Creating Value from Information Assets", 2014)

"Data governance is the process of creating and enforcing standards and policies concerning data. [...] The governance process isn't a transient, short-term project. The governance process is a continuing enterprise-focused program." (Neera Bhansali, "Data Governance: Creating Value from Information Assets", 2014)

"Having data quality as a focus is a business philosophy that aligns strategy, business culture, company information, and technology in order to manage data to the benefit of the enterprise. Data quality is an elusive subject that can defy measurement and yet be critical enough to derail a single IT project, strategic initiative, or even an entire company." (Neera Bhansali, "Data Governance: Creating Value from Information Assets", 2014)

"Understanding an organization's current processes and issues is not enough to build an effective data governance program. To gather business, functional, and technical requirements, understanding the future vision of the business or organization is important. This is followed with the development of a visual prototype or logical model, independent of products or technology, to demonstrate the data governance process. This business-driven model results in a definition of enterprise-wide data governance based on key standards and processes. These processes are independent of the applications and of the tools and technologies required to implement them. The business and functional requirements, the discovery of business processes, along with the prototype or model, provide an impetus to address the "hard" issues in the data governance process." (Neera Bhansali, "Data Governance: Creating Value from Information Assets", 2014)

Related Posts Plugin for WordPress, Blogger...

About Me

My photo
Koeln, NRW, Germany
IT Professional with more than 24 years experience in IT in the area of full life-cycle of Web/Desktop/Database Applications Development, Software Engineering, Consultancy, Data Management, Data Quality, Data Migrations, Reporting, ERP implementations & support, Team/Project/IT Management, etc.