"The graphical method has considerable superiority for the exposition of statistical facts over the tabular. A heavy bank of figures is grievously wearisome to the eye, and the popular mind is as incapable of drawing any useful lessons from it as of extracting sunbeams from cucumbers." (Arthur B Farquhar & Henry Farquhar, "Economic and Industrial Delusions", 1891)
"The essential quality of graphic representations is clarity. If the diagram fails to give a clearer impression than the tables of figures it replaces, it is useless. To this end, we will avoid complicating the diagram by including too much data." (Armand Julin, "Summary for a Course of Statistics, General and Applied", 1910)
"The primary purpose of a graph is to show diagrammatically how the values of one of two linked variables change with those of the other. One of the most useful applications of the graph occurs in connection with the representation of statistical data." (John F Kenney & E S Keeping, "Mathematics of Statistics" Vol. I, 1939)
"When numbers in tabular form are taboo and words will not do the work well as is often the case. There is one answer left: Draw a picture. About the simplest kind of statistical picture or graph, is the line variety. It is very useful for showing trends, something practically everybody is interested in showing or knowing about or spotting or deploring or forecasting." (Darell Huff, "How to Lie with Statistics", 1954)
"To understand the need for structuring information, we should examine its opposite - nonstructured information. Nonstructured information may be thought of as exists and can be heard" (or sensed with audio devices), but the mind attaches no rational meaning to the sound. In another sense, noise can be equated to writing a group of letters, numbers, and other symbols on a page without any design or key to their meaning. In such a situation, there is nothing the mind can grasp. Nonstructured information can be classified as useless, unless meaning exists somewhere in the jumble and a key can be found to unlock its hidden significance." (Cecil H Meyers, "Handbook of Basic Graphs: A modern approach", 1970)
"The scatterplot is a useful exploratory method for providing a first look at bivariate data to see how they are distributed throughout the plane, for example, to see clusters of points, outliers, and so forth." (William S Cleveland, "Visualizing Data", 1993)
"These questions can be applied to every kind of problem. They measure the usefulness of whatever construction or graphical invention allowing you to avoid useless graphics."" (Jacques Bertin [interview], 2003)
"Data are raw facts and figures that by themselves may be useless. To be useful, data must be processed into finished information, that is, data converted into a meaningful and useful context for specific users. An increasing challenge for managers is being able to identify and access useful information." (Richard L Daft & Dorothy Marcic,"Understanding Management" 5th Ed., 2006)
"To extract useful information from such large and structured data sets, a first step is to be able to visualize their structure, identifying interesting patterns, trends, and complex relationships between the items. The main idea of visual data exploration is to produce a representation of the data in such a way that the human eye can gain insight into their structure and patterns." (George Michailidis,Data Visualization Through Their Graph Representations" [in "Handbook of Data Visualization"], 2008)
"[...] the terms data visualization and information visualization (casually, data viz and info viz) are useful for referring to any visual representation of data that is: (•) algorithmically drawn" (may have custom touches but is largely rendered with the help of computerized methods); (•) easy to regenerate with different data" (the same form may be repurposed to represent different datasets with similar dimensions or characteristics);" (•) often aesthetically barren (data is not decorated); and (•) relatively data-rich (large volumes of data are welcome and viable, in contrast to infographics)." (Noah Iliinsky & Julie Steel, "Designing Data Visualizations", 2011)
"A common mistake is that all visualization must be simple, but this skips a step. You should actually design graphics that lend clarity, and that clarity can make a chart 'simple' to read. However, sometimes a dataset is complex, so the visualization must be complex. The visualization might still work if it provides useful insights that you wouldn’t get from a spreadsheet. […] Sometimes a table is better. Sometimes it’s better to show numbers instead of abstract them with shapes. Sometimes you have a lot of data, and it makes more sense to visualize a simple aggregate than it does to show every data point." (Nathan Yau, "Data Points: Visualization That Means Something", 2013)
"Without context, data is useless, and any visualization you create with it will also be useless. Using data without knowing anything about it, other than the values themselves, is like hearing an abridged quote secondhand and then citing it as a main discussion point in an essay. It might be okay, but you risk finding out later that the speaker meant the opposite of what you thought." (Nathan Yau, "Data Points: Visualization That Means Something", 2013)
"A signal is a useful message that resides in data. Data that isn’t useful is noise. […] When data is expressed visually, noise can exist not only as data that doesn’t inform but also as meaningless non-data elements of the display (e.g. irrelevant attributes, such as a third dimension of depth in bars, color variation that has no significance, and artificial light and shadow effects)." (Stephen Few, "Signal: Understanding What Matters in a World of Noise", 2015)
"Usually, diagrams contain some noise – information unrelated to the diagram’s primary goal. Noise is decorations, redundant, and irrelevant data, unnecessarily emphasized and ambiguous icons, symbols, lines, grids, or labels. Every unnecessary element draws attention away from the central idea that the designer is trying to share. Noise reduces clarity by hiding useful information in a fog of useless data. You may quickly identify noise elements if you can remove them from the diagram or make them less intense and attractive without compromising the function." (Vasily Pantyukhin, "Principles of Design Diagramming", 2015)
"With time series though, there is absolutely no substitute for plotting. The pertinent pattern might end up being a sharp spike followed by a gentle taper down. Or, maybe there are weird plateaus. There could be noisy spikes that have to be filtered out. A good way to look at it is this: means and standard deviations are based on the naïve assumption that data follows pretty bell curves, but there is no corresponding 'default' assumption for time series data" (at least, not one that works well with any frequency), so you always have to look at the data to get a sense of what’s normal. [...] Along the lines of figuring out what patterns to expect, when you are exploring time series data, it is immensely useful to be able to zoom in and out." (Field Cady, "The Data Science Handbook", 2017)
"People do care about how they are measured. What can we do about this? If you are in the position to measure something, think about whether measuring it will change people’s behaviors in ways that undermine the value of your results. If you are looking at quantitative indicators that others have compiled, ask yourself: Are these numbers measuring what they are intended to measure? Or are people gaming the system and rendering this measure useless?" (Carl T Bergstrom & Jevin D West, "Calling Bullshit: The Art of Skepticism in a Data-Driven World", 2020)

No comments:
Post a Comment