Pages

07 November 2011

📉Graphical Representation: Orientation (Just the Quotes)

"The scales of any curve-chart should be so selected that the chart will not be exaggerated in either the horizontal or the vertical direction. It is possible to cause a visual exaggeration of data by carelessly or intentionally selecting a scale which unduly stretches the chart in either the horizontal or the vertical direction. Just as the English language can be used to exaggerate to the ear, so charts can exaggerate to the eye." (Willard C Brinton, "Graphic Methods for Presenting Facts", 1919)

"If a chart contains a number of series which vary widely in individual magnitude, optical distortion may result from the necessarily sharp changes in the angle of the curves. The space between steeply rising or falling curves always appears narrower than the vertical distance between the plotting points." (Rufus R Lutz, "Graphic Presentation Simplified", 1949)

"The circle graph, or pie chart, appears to simple and 'nonstatistical', so it is a popular form of presentation for general readers. However, since the eye can compare linear distances more easily and accurately than angles or areas, the component parts of a total usually can be shown more effectively in a chart using linear measurement." (Peter H Selby, "Interpreting Graphs and Tables", 1976)

"First, it is generally inadvisable to attempt to portray a series of more than four or five categories by means of pie charts. If, for example, there are six, eight, or more categories, it may be very confusing to differentiate the relative values portrayed, especially if several small sectors are of approximately the same size. Second, the pie chart may lose its effectiveness if an attempt is made to compare the component values of several circles, as might be found in a temporal or geographical series. In such case the one-hundred percent bar or column chart is more appropriate. Third, although the proportionate values portrayed in a pie chart are measured as distances along arcs about the circle, actually there is a tendency to estimate values in terms of areas of sectors or by the size of subtended angles at the center of the circle." (Calvin F Schmid, "Handbook of Graphic Presentation", 1954)

"Stacked bar graphs do not show data structure well. A trend in one of the stacked variables has to be deduced by scanning along the vertical bars. This becomes especially difficult when the categories do not move in the same direction." (Gerald van Belle, "Statistical Rules of Thumb", 2002)

"We make angle judgments when we read a pie chart, but we don't judge angles very well. These judgments are biased; we underestimate acute angles (angles less than 90°) and overestimate obtuse angles" (angles greater than 90°). Also, angles with horizontal bisectors" (when the line dividing the angle in two is horizontal) appear larger than angles with vertical bisectors." (Naomi B Robbins, "Creating More effective Graphs", 2005)

"Before calculating a confidence interval for a mean, first check that one of the situations just described holds. To determine whether the data are bell-shaped or skewed, and to check for outliers, plot the data using a histogram, dotplot, or stemplot. A boxplot can reveal outliers and will sometimes reveal skewness, but it cannot be used to determine the shape otherwise. The sample mean and median can also be compared to each other. Differences between the mean and the median usually occur if the data are skewed - that is, are much more spread out in one direction than in the other." (Jessica M Utts & Robert F Heckard, "Mind on Statistics", 2007)

"The donut, its spelling betrays its origins, is nearly always more deceit friendly than the pie, despite being modelled on a life-saving ring. This is because the hole destroys the second most important value- defining element, by hiding the slice angles in the middle." (Nicholas Strange, "Smoke and Mirrors: How to bend facts and figures to your advantage", 2007)

"Mosaic plots are defined recursively, i.e., each variable that is introduced in a mosaic plot is plotted conditioned on the groups already established in the plot. As with barcharts, the area of bars or tiles is proportional to the number of observations (or the sum of the observation weights of a class). The direction along which bars are divided by a newly introduced variable is usually alternating, starting with the x-direction." (Martin Theus & Simon Urbanek, "Interactive Graphics for Data Analysis: Principles and Examples", 2009)

"Communication is the primary goal of data visualization. Any element that hinders - rather than helps - the reader, then, needs to be changed or removed: labels and tags that are in the way, colors that confuse or simply add no value, uncomfortable scales or angles. Each element needs to serve a particular purpose toward the goal of communicating and explaining information. Efficiency matters, because if you’re wasting a viewer’s time or energy, they’re going to move on without receiving your message." (Noah Iliinsky & Julie Steel, "Designing Data Visualizations", 2011)

"Correlation measures the degree to which two phenomena are related to one another. [...] Two variables are positively correlated if a change in one is associated with a change in the other in the same direction, such as the relationship between height and weight. [...] A correlation is negative if a positive change in one variable is associated with a negative change in the other, such as the relationship between exercise and weight." (Charles Wheelan, "Naked Statistics: Stripping the Dread from the Data", 2012)

"New information is constantly flowing in, and your brain is constantly integrating it into this statistical distribution that creates your next perception" (so in this sense 'reality' is just the product of your brain’s ever-evolving database of consequence). As such, your perception is subject to a statistical phenomenon known in probability theory as kurtosis. Kurtosis in essence means that things tend to become increasingly steep in their distribution [...] that is, skewed in one direction. This applies to ways of seeing everything from current events to ourselves as we lean 'skewedly' toward one interpretation, positive or negative. Things that are highly kurtotic, or skewed, are hard to shift away from. This is another way of saying that seeing differently isn’t just conceptually difficult - it’s statistically difficult." (Beau Lotto, "Deviate: The Science of Seeing Differently", 2017)

"Skewed data means data that is shifted in one direction or the other. Skewness can cause machine learning models to underperform. Many machine learning models assume normally distributed data or data structures to follow the Gaussian structure. Any deviation from the assumed Gaussian structure, which is the popular bell curve, can affect model performance. A very effective area where we can apply feature engineering is by looking at the skewness of data and then correcting the skewness through normalization of the data." (Anthony So et al, "The Data Science Workshop" 2nd Ed., 2020)

"Sparklines focus on the trend over time and the direction rather than the actual values. Sparklines are used to visualize volatility or outliers. They are usually kept quite narrow on dashboards but still maintain an aspect ratio of 2:3." (Lorna Brown, "Tableau Desktop Cookbook", 2020)

"Another word of caution for dot plots that show changes over time. The dot plot is, by definition, a summary chart. It does not show all of the data in the intervening years. If the data between the two dots generally move in the same direction, a dot plot is sufficient. But if the data contain sharp variations year by year, a dot plot will obscure that pattern" (as it also does for bar charts)." (Jonathan Schwabish, "Better Data Visualizations: A guide for scholars, researchers, and wonks", 2021)



No comments:

Post a Comment