02 November 2018

📑Data Science: Data Analysts (Collected Quotes)

"[…] it is not enough to say: 'There's error in the data and therefore the study must be terribly dubious'. A good critic and data analyst must do more: he or she must also show how the error in the measurement or the analysis affects the inferences made on the basis of that data and analysis." (Edward R Tufte, "Data Analysis for Politics and Policy", 1974)

"The use of statistical methods to analyze data does not make a study any more 'scientific', 'rigorous', or 'objective'. The purpose of quantitative analysis is not to sanctify a set of findings. Unfortunately, some studies, in the words of one critic, 'use statistics as a drunk uses a street lamp, for support rather than illumination'. Quantitative techniques will be more likely to illuminate if the data analyst is guided in methodological choices by a substantive understanding of the problem he or she is trying to learn about. Good procedures in data analysis involve techniques that help to (a) answer the substantive questions at hand, (b) squeeze all the relevant information out of the data, and (c) learn something new about the world." (Edward R Tufte, "Data Analysis for Politics and Policy", 1974)

"Detailed study of the quality of data sources is an essential part of applied work. [...] Data analysts need to understand more about the measurement processes through which their data come. To know the name by which a column of figures is headed is far from being enough." (John W Tukey, "An Overview of Techniques of Data Analysis, Emphasizing Its Exploratory Aspects", 1982)

"Like a detective, a data analyst will experience many dead ends, retrace his steps, and explore many alternatives before settling on a single description of the evidence in front of him." (David Lubinsky & Daryl Pregibon, "Data analysis as search", Journal of Econometrics Vol. 38 (1–2), 1988)

"The four questions of data analysis are the questions of description, probability, inference, and homogeneity. Any data analyst needs to know how to organize and use these four questions in order to obtain meaningful and correct results. [...] 
THE DESCRIPTION QUESTION: Given a collection of numbers, are there arithmetic values that will summarize the information contained in those numbers in some meaningful way?
THE PROBABILITY QUESTION: Given a known universe, what can we say about samples drawn from this universe? [...]
THE INFERENCE QUESTION: Given an unknown universe, and given a sample that is known to have been drawn from that unknown universe, and given that we know everything about the sample, what can we say about the unknown universe? [...]
THE HOMOGENEITY QUESTION: Given a collection of observations, is it reasonable to assume that they came from one universe, or do they show evidence of having come from multiple universes?" (Donald J Wheeler, "Myths About Data Analysis", International Lean & Six Sigma Conference, 2012)

"[…] the data itself can lead to new questions too. In exploratory data analysis (EDA), for example, the data analyst discovers new questions based on the data. The process of looking at the data to address some of these questions generates incidental visualizations - odd patterns, outliers, or surprising correlations that are worth looking into further." (Danyel Fisher & Miriah Meyer, "Making Data Visual", 2018)

"Plotting numbers on a chart does not make you a data analyst. Knowing and understanding your data before you communicate it to your audience does." (Andy Kriebel & Eva Murray, "#MakeoverMonday: Improving How We Visualize and Analyze Data, One Chart at a Time", 2018)

"Also, remember that data literacy is not just a set of technical skills. There is an equal need and weight for soft skills and business skills. This can be misleading for some technical resources within an organization, as those technical resources may believe they are data literate by default as they are data architects or data analysts. They have the existing technical skills, but maybe they do not have any deep proficiencies in other skills such as communicating with data, challenging assumptions, and mitigating bias, or perhaps they do not have an open mindset to be open to different perspectives." (Angelika Klidas & Kevin Hanegan, "Data Literacy in Practice", 2022)

"The lack of focus and commitment to color is a perplexing thing. When used correctly, color has no equal as a visualization tool - in advertising, in branding, in getting the message across to any audience you seek. Data analysts can make numbers dance and sing on command, but they sometimes struggle to create visually stimulating environments that convince the intended audience to tap their feet in time." (Kate Strachnyi, "ColorWise: A Data Storyteller’s Guide to the Intentional Use of Color", 2023)
