14 January 2018

Data Science: Unstructured Data (Definitions)

"Data that does not neatly fit into a tabular structure with well-defined and bounded definitions. Examples of unstructured data are e-mail messages and video streams. Many customer databases contain comment fields where customer service reps put in additional notes about customers." (Jill Dyché & Evan Levy, "Customer Data Integration: Reaching a Single Version of the Truth", 2006)

"Computerised information which does not have a data structure that is easily readable by a machine, including audio, video and unstructured text such as the body of a word-processed document - effectively this is the same as multimedia data." (Keith Gordon, "Principles of Data Management", 2007)

"Data that has no metadata, such as text files." (Victor Isakov et al, "MCITP Administrator: Microsoft SQL Server 2005 Optimization and Maintenance (70-444) Study Guide", 2007)

"Natively bitmapped data, such as video, audio, pictures, and MRI scans, that can be sensed either visually, audibly, or both." (David G Hill, "Data Protection: Governance, Risk Management, and Compliance", 2009)

"Data that does not fit into a structured data model or does not fit well into relational tables. Common examples include binary information such as video or audio and free-text information." (Evan Stubbs, "Delivering Business Analytics: Practical Guidelines for Best Practice", 2013)

"Data that does not follow a specified data format. Unstructured data can be text, video, images, and so on." (Marcia Kaufman et al, "Big Data For Dummies", 2013)

"Unstructured data has no real structure, such as the data in an email and a memo. Interestingly, estimates have 85% of all business information as unstructured data. There are now many products coming on the market that can put some structure into unstructured data so that it can be categorized or organized hierarchically." (Michael M David & Lee Fesperman, "Advanced SQL Dynamic Data Modeling and Hierarchical Processing", 2013)

"Data that exist in their original (raw) state; that is in the format in which they were collected." (Carlos Coronel & Steven Morris, "Database Systems: Design, Implementation, & Management" 11th Ed., 2014)

"Data whose logical organization is not apparent to the computer" (Daniel Linstedt & W H Inmon, "Data Architecture: A Primer for the Data Scientist", 2014)

"Information (typically stored digitally) that either does not have a predefined data model or is not organized in a predefined manner. Most unstructured data is created by humans and includes email, documents, text messages, tweets, blogs, and more." (Brenda L Dietrich et al, "Analytics Across the Enterprise", 2014)

"Text, audio, video, and other types of complex data that won’t easily fit into a conventional relational database. Unstructured data isn’t as simple as the numbers and short strings that most data analysts use." (Meta S Brown, "Data Mining For Dummies", 2014)

"Data that cannot fit cleanly into a predefined structure." (Evan Stubbs, "Big Data, Big Innovation", 2014)

"Data without data model or that a computer program cannot easily use (in the sense of understanding its content). Examples are word processing documents or electronic mail" (Hasso Plattner, "A Course in In-Memory Data Management: The Inner Mechanics of In-Memory Databases" 2nd Ed., 2014)

"Data (generally text-based) which is not presented in a structured form such as a database, ontology, table, etc. Newspaper articles, government reports, blogs, and e-mails are all examples of unstructured data." (Hamid R Arabnia et al, "Application of Big Data for National Security", 2015)

"Data that doesn’t fit into a fixed and strict definition. Things like sound files, images, text, and web pages can be considered unstructured data." (Jason Williamson, "Getting a Big Data Job For Dummies", 2015)

"Information that does not follow a specified data format. Unstructured data can be text, video, images, and such." (Judith S Hurwitz, "Cognitive Computing and Big Data Analytics", 2015)

"Data that does not have a specific format. It can be customer reviews, tweets, pictures, or even hashtags." (Brittany Bullard, "Style and Statistics", 2016)

"A type of data where each instance in the data set may have its own internal structure; that is, the structure is not necessarily the same in every instance. For example, text data are often unstructured and require a sequence of operations to be applied to them in order to extract a structured representation for each instance." (John D Kelleher & Brendan Tierney, "Data Science", 2018)
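
As an illustration of the "sequence of operations" mentioned in the last definition, here is a minimal sketch of one common way to derive a structured representation from free text: lowercase it, split it into tokens, and count occurrences against a fixed vocabulary. The function name `to_structured`, the sample notes, and the vocabulary are all hypothetical and chosen only for this example; none of the quoted authors prescribes this particular approach.

```python
import re
from collections import Counter


def to_structured(text, vocabulary):
    """Turn a free-text string into a fixed-length row of word counts.

    Illustrative helper only: the tokenisation rule and the vocabulary
    are assumptions made for this sketch.
    """
    # Step 1: normalise case and split into simple word tokens.
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    # Step 2: count how often each token occurs.
    counts = Counter(tokens)
    # Step 3: project the counts onto a fixed set of columns, so every
    # instance ends up with the same structure.
    return [counts[word] for word in vocabulary]


# Hypothetical comment-field notes, each with its own internal shape.
notes = [
    "Customer called about a late delivery.",
    "Late again! Customer wants a refund, refund requested twice.",
]
vocabulary = ["customer", "late", "refund", "delivery"]

rows = [to_structured(note, vocabulary) for note in notes]
for row in rows:
    print(dict(zip(vocabulary, row)))
```

Each note becomes a row with the same four columns, which is the kind of tabular shape the definitions above contrast unstructured data with. Real pipelines usually add further steps (stop-word removal, stemming, weighting), which are omitted here.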

