12 February 2017

⛏️Data Management: Data Quality (Definitions)

"Data are of high quality if they are fit for their intended use in operations, decision-making, and planning." (Joseph M Juran, 1964)

"[…] data has quality if it satisfies the requirements of its intended use. It lacks quality to the extent that it does not satisfy the requirement. In other words, data quality depends as much on the intended use as it does on the data itself. To satisfy the intended use, the data must be accurate, timely, relevant, complete, understood, and trusted." (Jack E Olson, "Data Quality: The Accuracy Dimension", 2003)

"A set of measurable characteristics of data that define how well data represents the real-world construct to which it refers." (Alex Berson & Lawrence Dubov, "Master Data Management and Customer Data Integration for a Global Enterprise", 2007)

"The state of completeness, validity, consistency, timeliness and accuracy that makes data appropriate for a specific use." (Keith Gordon, "Principles of Data Management", 2007)

"Deals with data validation and cleansing services (to ensure relevance, validity, accuracy, and consistency of the master data), reconciliation services (aimed at helping cleanse the master data of duplicates as part of consistency), and cross-reference services (to help with matching master data across multiple systems)." (Martin Oberhofer et al,"Enterprise Master Data Management", 2008)

"A set of data properties (features, parameters, etc.) describing their ability to satisfy user’s expectations or requirements concerning data using for information acquiring in a given area of interest, learning, decision making, etc." (Juliusz L Kulikowski, "Data Quality Assessment", 2009)

"Assessment of the cleanliness, accuracy, and reliability of data." (Laura Reeves, "A Manager's Guide to Data Warehousing", 2009)

"A set of measurable characteristics of data that define how well the data represents the real-world construct to which it refers." (Alex Berson & Lawrence Dubov, "Master Data Management and Data Governance", 2010)

"This term refers to whether an organization’s data is reliable, consistent, up to date, free of duplication, and can be used efficiently across the organization." (Tony Fisher, "The Data Asset", 2009)

"A set of measurable characteristics of data that define how well the data represents the real-world construct to which it refers." (Alex Berson & Lawrence Dubov, "Master Data Management and Data Governance", 2010)

"The degree of data accuracy, accessibility, relevance, time-liness, and completeness." (Linda Volonino & Efraim Turban, "Information Technology for Management" 8th Ed., 2011)

"The degree of fitness for use of data in particular application. Also the degree to which data conforms to data specifications as measured in data quality dimensions. Sometimes used interchangeably with information quality." (John R Talburt, "Entity Resolution and Information Quality", 2011) 

"The degree to which data is accurate, complete, timely, consistent with all requirements and business rules, and relevant for a given use." (DAMA International, "The DAMA Dictionary of Data Management", 2011)

"Contextual data quality considers the extent to which data are applicable (pertinent) to the task of the data user, not to the context of representation itself. Contextually appropriate data must be relevant to the consumer, in terms of timeliness and completeness. Dimensions include: value-added, relevancy, timeliness, completeness, and appropriate amount of data (from the Wang & Strong framework.)" (Laura Sebastian-Coleman, "Measuring Data Quality for Ongoing Improvement ", 2012)

"Intrinsic data quality denotes that data have quality in their own right; it is understood largely as the extent to which data values are in conformance with the actual or true values. Intrinsically good data is accurate, correct, and objective, and comes from a reputable source. Dimensions include: accuracy objectivity, believability, and reputation (from the Wang & Strong framework)." (Laura Sebastian-Coleman, "Measuring Data Quality for Ongoing Improvement ", 2012)

"Representational data quality indicates that the system must present data in such a way that it is easy to understand (represented concisely and consistently) so that the consumer is able to interpret the data; understood as the extent to which data is presented in an intelligible and clear manner. Dimensions include: interpretability, ease of understanding, representational consistency, and concise representation (rom the Wang & Strong framework)." (Laura Sebastian-Coleman, "Measuring Data Quality for Ongoing Improvement ", 2012)

"The level of quality of data represents the degree to which data meets the expectations of data consumers, based on their intended use of the data." (Laura Sebastian-Coleman, "Measuring Data Quality for Ongoing Improvement", 2013) 

"The relative value of data, which is based on the accuracy of the knowledge that can be generated using that data. High-quality data is consistent, accurate, and unambiguous, and it can be processed efficiently." (Jim Davis & Aiman Zeid, "Business Transformation: A Roadmap for Maximizing Organizational Insights", 2014)

"The properties of data embodied by the “Five C’s”: clean, consistent, conformed, current, and comprehensive." (Daniel Linstedt & W H Inmon, "Data Architecture: A Primer for the Data Scientist", 2014)

"The degree to which data in an IT system is complete, up-to-date, consistent, and (syntactically and semantically) correct." (Tilo Linz et al, "Software Testing Foundations, 4th Ed", 2014)

"A measure for the suitability of data for certain requirements in the business processes, where it is used. Data quality is a multi-dimensional, context-dependent concept that cannot be described and measured by a single characteristic, but rather various data quality dimensions. The desired level of data quality is thereby oriented on the requirements in the business processes and functions, which use this data [...]" (Boris Otto & Hubert Österle, "Corporate Data Quality", 2015)

"[...] characteristics of data such as consistency, accuracy, reliability, completeness, timeliness, reasonableness, and validity. Data-quality software ensures that data elements are represented in a consistent way across different data stores or systems, making the data more trustworthy across the enterprise." (Judith S Hurwitz, "Cognitive Computing and Big Data Analytics", 2015)

"Refers to the accuracy, completeness, timeliness, integrity, and acceptance of data as determined by its users." (Gregory Lampshire, "The Data and Analytics Playbook", 2016)

"A measure of the useableness of data. An ideal dataset is accurate, complete, timely in publication, consistent in its naming of items and its handling of e.g. missing data, and directly machine-readable (see data cleaning), conforms to standards of nomenclature in the field, and is published with sufficient metadata that users can easily understand, for example, who it is published by and the meaning of the variables in the dataset." (Open Data Handbook) 

"Refers to the level of 'quality' in data. If a particular data store is seen as holding highly relevant data for a project, that data is seen as quality to the users." (Solutions Review)

"the processes and techniques involved in ensuring the reliability and application efficiency of data. Data is of high quality if it reliably reflects underlying processes and fits the intended uses in operations, decision making and planning." (KDnuggets)

"The narrow definition of data quality is that it's about data that is missing or incorrect. A broader definition is that data quality is achieved when a business uses data that is comprehensive, consistent, relevant and timely." (Information Management)

"Data Quality refers to the accuracy of datasets, and the ability to analyse and create actionable insights for other users." (experian) [source]

"Data quality refers to the current condition of data and whether it is suitable for a specific business purpose." (Xplenty) [source]

No comments:

Related Posts Plugin for WordPress, Blogger...

About Me

My photo
Koeln, NRW, Germany
IT Professional with more than 24 years experience in IT in the area of full life-cycle of Web/Desktop/Database Applications Development, Software Engineering, Consultancy, Data Management, Data Quality, Data Migrations, Reporting, ERP implementations & support, Team/Project/IT Management, etc.