20 December 2018

Data Science: Logarithms (Just the Quotes)

"With the ordinary scale, fluctuations in large factors are very noticeable, while relatively greater fluctuations in smaller factors are barely apparent. The semi-logarithmic scale permits the graphic representation of changes in every quantity on the same basis, without respect to the magnitude of the quantity itself. At the same time, it shows the actual value by reference to the numbers in the scale column. By indicating both absolute and relative value and changes to one scale, it combines the advantages of both the natural and percentage scale, without the disadvantages of either." (Allan C Haskell, "How to Make and Use Graphic Charts", 1919)

"The ratio chart not only correctly represents relative changes but also indicates absolute amounts at the same time. Because of its distinctive structure, it is referred to as a semilogarithmic chart. The vertical axis is ruled logarithmically and the horizontal axis arithmetically. The continued narrowing of the spacings of the scale divisions on the vertical axis is characteristic of logarithmic rulings; the equal intervals on the horizontal axis are indicative of arithmetic rulings." (Anna C Rogers, "Graphic Charts Handbook", 1961)

"Logging size transforms the original skewed distribution into a more symmetrical one by pulling in the long right tail of the distribution toward the mean. The short left tail is, in addition, stretched. The shift toward symmetrical distribution produced by the log transform is not, of course, merely for convenience. Symmetrical distributions, especially those that resemble the normal distribution, fulfill statistical assumptions that form the basis of statistical significance testing in the regression model." (Edward R Tufte, "Data Analysis for Politics and Policy", 1974)
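A minimal sketch of the effect Tufte describes, using synthetic data rather than anything from the book: the sample is drawn from a log-normal distribution (an assumption chosen because it is strongly right-skewed), and the `skewness` helper is a hypothetical moment-based estimator written for this illustration.

```python
import math
import random

def skewness(values):
    """Moment-based skewness: third central moment divided by variance ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5

# Synthetic right-skewed "sizes" (hypothetical data, not taken from the book).
rng = random.Random(0)
sizes = [rng.lognormvariate(0.0, 1.0) for _ in range(5000)]

# Taking logs pulls in the long right tail and stretches the short left one.
logged = [math.log(s) for s in sizes]

raw_skew = skewness(sizes)
log_skew = skewness(logged)
print(f"skewness before logging: {raw_skew:.2f}")
print(f"skewness after logging:  {log_skew:.2f}")
```

For this kind of data the skewness should drop from a clearly positive value to something close to zero; real data will rarely be this tidy.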

"Logging skewed variables also helps to reveal the patterns in the data. […] the rescaling of the variables by taking logarithms reduces the nonlinearity in the relationship and removes much of the clutter resulting from the skewed distributions on both variables; in short, the transformation helps clarify the relationship between the two variables. It also […] leads to a theoretically meaningful regression coefficient." (Edward R Tufte, "Data Analysis for Politics and Policy", 1974)
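One common reading of the "theoretically meaningful regression coefficient" is that, when both variables are logged, the slope estimates the percentage change in y associated with a one percent change in x. The sketch below assumes a made-up power-law relationship, y = 3 * x^0.5, and a hypothetical hand-written least-squares helper; it is not Tufte's example.

```python
import math

def ols_slope_intercept(xs, ys):
    """Ordinary least-squares fit of ys on xs; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical nonlinear relationship: y = 3 * x ** 0.5.
xs = [1, 2, 5, 10, 50, 100, 1000]
ys = [3.0 * x ** 0.5 for x in xs]

# After logging both variables the relationship is a straight line:
# log(y) = log(3) + 0.5 * log(x).
slope, intercept = ols_slope_intercept(
    [math.log(x) for x in xs],
    [math.log(y) for y in ys],
)
print(f"slope: {slope:.3f}, multiplier: {math.exp(intercept):.3f}")
```

The fitted slope recovers the exponent of the power law, and the exponentiated intercept recovers the multiplier.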

"The logarithmic transformation serves several purposes: (1) The resulting regression coefficients sometimes have a more useful theoretical interpretation compared to a regression based on unlogged variables. (2) Badly skewed distributions - in which many of the observations are clustered together combined with a few outlying values on the scale of measurement - are transformed by taking the logarithm of the measurements so that the clustered values are spread out and the large values pulled in more toward the middle of the distribution. (3) Some of the assumptions underlying the regression model and the associated significance tests are better met when the logarithm of the measured variables is taken." (Edward R Tufte, "Data Analysis for Politics and Policy", 1974)

"It is common for positive data to be skewed to the right: some values bunch together at the low end of the scale and others trail off to the high end with increasing gaps between the values as they get higher. Such data can cause severe resolution problems on graphs, and the common remedy is to take logarithms. Indeed, it is the frequent success of this remedy that partly accounts for the large use of logarithms in graphical data display." (William S Cleveland, "The Elements of Graphing Data", 1985)

"When magnitudes are graphed on a logarithmic scale, percents and factors are easier to judge since equal multiplicative factors and percents result in equal distances throughout the entire scale." (William S Cleveland, "The Elements of Graphing Data", 1985)
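Cleveland's point can be checked with a few lines of arithmetic. The sketch below uses a hypothetical helper, `log_distance`, and arbitrarily chosen pairs of values that each represent a doubling.

```python
import math

def log_distance(a, b, base=10.0):
    """Distance between two positive values on a logarithmic axis."""
    return math.log(b / a, base)

# Three doublings at very different magnitudes (values chosen for illustration).
pairs = [(1, 2), (50, 100), (4000, 8000)]

log_steps = [log_distance(a, b) for a, b in pairs]
linear_steps = [b - a for a, b in pairs]

print("distances on a log scale:   ", [round(d, 4) for d in log_steps])
print("distances on a linear scale:", linear_steps)
```

On the logarithmic scale every doubling covers the same distance, while on the linear scale the same factor occupies 1, 50, and 4000 units.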

"The logarithm is one of many transformations that we can apply to univariate measurements. The square root is another. Transformation is a critical tool for visualization or for any other mode of data analysis because it can substantially simplify the structure of a set of data. For example, transformation can remove skewness toward large values, and it can remove monotone increasing spread. And often, it is the logarithm that achieves this removal." (William S Cleveland, "Visualizing Data", 1993)

"Use a logarithmic scale when it is important to understand percent change or multiplicative factors. […] Showing data on a logarithmic scale can cure skewness toward large values." (Naomi B Robbins, "Creating More Effective Graphs", 2005)

"It is important to pay heed to the following detail: a disadvantage of logarithmic diagrams is that a graphical integration is not possible, i.e., the area under the curve (the integral) is of no relevance." (Manfred Drosg, "Dealing with Uncertainties: A Guide to Error Analysis", 2007)
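One way to see Drosg's caveat, assuming a chart whose horizontal axis is logarithmic: the area that appears on paper is measured against log x, not x. The sketch below uses a hypothetical constant curve, y = 1, and a small trapezoid helper written for this illustration.

```python
import math

def trapezoid_area(xs, ys):
    """Area under the piecewise-linear curve through the points (xs, ys)."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for x0, x1, y0, y1 in zip(xs, xs[1:], ys, ys[1:])
    )

# Hypothetical flat curve y = 1 over two decades of x.
decades = [(1, 10), (10, 100)]

for lo, hi in decades:
    true_area = trapezoid_area([lo, hi], [1.0, 1.0])
    # On a log-scaled x axis the curve is drawn at positions log10(x).
    drawn_area = trapezoid_area([math.log10(lo), math.log10(hi)], [1.0, 1.0])
    print(f"x from {lo} to {hi}: integral = {true_area}, drawn area = {drawn_area}")
```

Both decades occupy the same area on the logarithmic chart even though the second integral is ten times the first, so the drawn area cannot be read as the integral.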
