"Data is frequently missing or incongruous. If data is missing, do you simply ignore the missing points? That isn’t always possible. If data is incongruous, do you decide that something is wrong with badly behaved data (after all, equipment fails), or that the incongruous data is telling its own story, which may be more interesting? It’s reported that the discovery of ozone layer depletion was delayed because automated data collection tools discarded readings that were too low. In data science, what you have is frequently all you’re going to get. It’s usually impossible to get 'better' data, and you have no alternative but to work with the data at hand." (Mike Loukides, "What Is Data Science?", 2011).
"Data science isn’t just about the existence of data, or making guesses about what that data might mean; it’s about testing hypotheses and making sure that the conclusions you’re drawing from the data are valid.
"Data scientists combine entrepreneurship with patience, the willingness to build data products incrementally, the ability to explore, and the ability to iterate over a solution. They are inherently interdisciplinary. They can tackle all aspects of a problem, from initial data collection and data conditioning to drawing conclusions. They can think outside the box to come up with new ways to view the problem, or to work with very broadly defined problems: 'there’s a lot of data, what can you make from it?'" (Mike Loukides, "What Is Data Science?", 2011)
"Discovery is the key to building great data products, as opposed
to products that are merely good." (Mike Loukides, "The Evolution of Data
Products", 2011)
"New interfaces for data products are all about hiding the data itself, and getting to what the user wants." (Mike Loukides, "The Evolution of Data Products", 2011)
"The thread that ties most of these applications together is
that data collected from users provides added value. Whether that data is
search terms, voice samples, or product reviews, the users are in a feedback
loop in which they contribute to the products they use. That’s the beginning of
data science.
"Using data effectively requires something different from traditional statistics, where actuaries in business suits perform arcane but fairly well-defined kinds of analysis. What differentiates data science from statistics is that data science is a holistic approach. We’re increasingly finding data in the wild, and data scientists are involved with gathering data, massaging it into a tractable form, making it tell its story, and presenting that story to others" (Mike Loukides, "What Is Data Science?", 2011).
"Whether we’re talking about web server logs, tweet streams, online transaction records, 'citizen science', data from sensors, government data, or some other source, the problem isn’t finding data, it’s figuring out what to do with it." (Mike Loukides, "What Is Data Science?", 2011)
No comments:
Post a Comment