30 January 2017

Data Management: Dirty Data (Definitions)

"Data that contain errors or cause problems when accessed and used. Some examples of dirty data are: Values in data elements that exceed a reasonable range, e.g., an employee with 4299 years of service. Values in data elements that are invalid, e.g., a value of 'X' in a gender field, where the only valid values are 'M' and 'F'. Missing values, e.g., a blank value in a gender field, where the only valid values are 'M' and 'F'. Incomplete data, e.g., a company has 10 products but data for only 8 products are included." (Margaret Y Chu, "Blissful Data", 2004)
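The four kinds of dirty data in the definition above (out-of-range, invalid, missing, and incomplete) can be illustrated with a few simple checks. The following is only a minimal sketch: the `Employee` record, the `MAX_YEARS_OF_SERVICE` limit of 60, and the function names are assumptions made for illustration and do not come from the quoted book.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumed rules for illustration only; real limits depend on the business context.
VALID_GENDERS = {"M", "F"}
MAX_YEARS_OF_SERVICE = 60


@dataclass
class Employee:
    """Hypothetical record used to demonstrate the checks."""
    name: str
    gender: Optional[str]
    years_of_service: int


def find_issues(emp: Employee) -> List[str]:
    """Return the dirty-data categories that apply to a single record."""
    issues = []
    # Out-of-range value, e.g. 4299 years of service.
    if not 0 <= emp.years_of_service <= MAX_YEARS_OF_SERVICE:
        issues.append("out_of_range")
    # Missing value: the field is empty or blank.
    if emp.gender is None or emp.gender.strip() == "":
        issues.append("missing")
    # Invalid value: present, but not one of the allowed codes.
    elif emp.gender not in VALID_GENDERS:
        issues.append("invalid")
    return issues


def missing_products(expected: List[str], loaded: List[str]) -> List[str]:
    """Incomplete data: expected items that are absent from the loaded set."""
    return sorted(set(expected) - set(loaded))


if __name__ == "__main__":
    print(find_issues(Employee("A", "X", 4299)))
    expected = [f"P{i}" for i in range(1, 11)]  # the company has 10 products
    loaded = [f"P{i}" for i in range(1, 9)]     # data for only 8 were loaded
    print(missing_products(expected, loaded))
```

Missing and invalid values are reported separately because they usually call for different remedies: a blank field has to be filled in from another source, while an invalid code may be correctable through a mapping.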

"Data that contain inaccuracies and/or inconsistencies." (Carlos Coronel et al, "Database Systems: Design, Implementation, and Management" 9th Ed., 2011)

"Poor quality data." (Linda Volonino & Efraim Turban, "Information Technology for Management" 8th Ed., 2011)

"Data that is incorrect, out-of-date, redundant, incomplete, or formatted incorrectly." (Craig S Mullins, "Database Administration", 2012)

"Data with inaccuracies and potential errors." (Hamid R Arabnia et al, "Application of Big Data for National Security", 2015)

