17 November 2011

📉Graphical Representation: Metaphor (Just the Quotes)

"Every metaphor is the tip of a submerged model. […] Use of theoretical models resembles the use of metaphors in requiring analogical transfer of a vocabulary. Metaphor and model-making reveal new relationships; both are attempts to pour new content into old bottles." (Max Black, "Models and Metaphors", 1962)

"One should employ a metaphor in science only when there is good evidence that an important similarity or analogy exists between its primary and secondary subjects. One should seek to discover more about the relevant similarities or analogies, always considering the possibility that there are no important similarities or analogies, or alternatively, that there are quite distinct similarities for which distinct terminology should be introduced. One should try to discover what the 'essential' features of the similarities or analogies are, and one should try to assimilate one’s account of them to other theoretical work in the same subject area - that is, one should attempt to explicate the metaphor." (Richard Boyd, "Metaphor and Theory Change: What Is ‘Metaphor’ a Metaphor For?", 1979)

"The essence of a graphic display is that a set of numbers having both magnitudes and an order are represented by an appropriate visual metaphor - the magnitude and order of the metaphorical representation match the numbers. We can display data badly by ignoring or distorting this concept." (Howard Wainer, "How to Display Data Badly", The American Statistician Vol. 38(2), 1984) 

"Despite the prevailing use of graphs as metaphors for communicating and reasoning about dependencies, the task of capturing informational dependencies by graphs is not at all trivial." (Judea Pearl, "Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference", 1988)

"Perhaps our ultimate understanding of scientific topics is measured in terms of our ability to generate metaphoric pictures of what is going on. Maybe understanding is coming up with metaphoric pictures." (Per Bak, "How Nature Works: the science of self-organized criticality", 1996)

"Make use of a simple data metaphor. Regardless of the concept you are trying to convey with an information graphic, you must make sure that the visual metaphor (i.e., a circle to represent a whole, as with a pie chart) be clear and logical. Don’t get so caught up in being clever that you make illogical comparisons or use unclear metaphors. In other words, don’t make your readers have to think too hard to get the point. They’ll appreciate you for it!" (Jennifer George-Palilonis, "A Practical Guide to Graphics Reporting: Information Graphics for Print, Web & Broadcast", 2006)

"Specific numbers, visual descriptions of objects or events and identifiable locations don’t always jump out, and a graphic may not always present itself right away. A good graphics reporter will often discover graphics potential in less obvious ways. Is the explanation in a story getting bogged down and hard to follow? If so, can the information be organized differently? Perhaps in a more graphic manner? Is there information that can be conveyed conceptually to put a thought or idea into a more visual perspective? Visual metaphors (or 'data metaphors' in the case of mathematical or quantifiable information) often make it easier for people to digest information." (Jennifer George-Palilonis, "A Practical Guide to Graphics Reporting: Information Graphics for Print, Web & Broadcast", 2006)

"All graphics by definition employ metaphors, but some are more metaphorical than others. Sometimes the metaphor escapes from its graphical cage, takes on a life of its own and provides exciting deception opportunities." (Nicholas Strange, "Smoke and Mirrors: How to bend facts and figures to your advantage", 2007)

"[…] a graph is nothing but a visual metaphor. To be truthful, it must correspond closely to the phenomena it depicts: longer bars or bigger pie slices must correspond to more, a rising line must correspond to an increasing amount. If a graphical depiction of data does not faithfully follow this principle, it is almost sure to be misleading. But the metaphoric attachment of a graphic goes farther than this. The character of the depiction is a necessary and sufficient condition for the character of the data. When the data change, so too must their depiction; but when the depiction changes very little, we assume that the data, likewise, are relatively unchanging. If this convention is not followed, we are usually misled." (Howard Wainer, "Graphic Discovery: A trout in the milk and other visuals" 2nd Ed, 2008)

"All sorts of metaphorical interpretations are culturally ingrained. An astute designer will think about these possible interpretations and work with them, rather than against them." (Noah Iliinsky & Julie Steele, "Designing Data Visualizations", 2011)

"Visual metaphors are about integrating a certain visual quality in your work that somehow conveys that extra bit of connection between the data, the design, and the topic. It goes beyond just the choice of visual variable, though this will have a strong influence. Deploying the best visual metaphor is something that really requires a strong design instinct and a certain amount of experience." (Andy Kirk, "Data Visualization: A successful design process", 2012)

16 November 2011

📉Graphical Representation: Action (Just the Quotes)

"The types of graphics used in operating a business fall into three main categories: diagrams, maps, and charts. Diagrams, such as organization diagrams, flow diagrams, and networks, are usually intended to graphically portray how an activity should be, or is being, accomplished, and who is responsible for that accomplishment. Maps such as route maps, location maps, and density maps, illustrate where an activity is, or should be, taking place, and what exists there. [...] Charts such as line charts, column charts, and surface charts, are normally constructed to show the businessman how much and when. Charts have the ability to graphically display the past, present, and anticipated future of an activity. They can be plotted so as to indicate the current direction that is being followed in relationship to what should be followed. They can indicate problems and potential problems, hopefully in time for constructive corrective action to be taken." (Robert D Carlsen & Donald L Vest, "Encyclopedia of Business Charts", 1977)

"Part of the strategy of regression modelling is to improve the model until the residuals look 'structureless', or like a simple random sample. They should only contain structure that is already taken into account (such as nonconstant variance) or imposed by the fitting process itself. By plotting them against a variety of original and derived variables, we can look for systematic patterns that relate to the model's adequacy. Although we talk about graphics for use after the model is fit, if problems with the fit are discovered at this stage of the analysis, we should take corrective action and refit the equation or a modified form of it." (John M Chambers et al, "Graphical Methods for Data Analysis", 1983)
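
A minimal sketch of the residual check described in the quote above, not the authors' own procedure: a straight line is fitted by ordinary least squares to made-up curved data, and the residuals are listed in order of x. The helper name `fit_line` and the data are illustrative assumptions.

```python
from statistics import mean

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx

# Made-up data: the true relationship is curved (y = x^2), not linear.
xs = list(range(7))
ys = [x * x for x in xs]

slope, intercept = fit_line(xs, ys)
residuals = [y - (slope * x + intercept) for x, y in zip(xs, ys)]

# Residuals listed against x: a systematic pattern means the model is inadequate.
for x, r in zip(xs, residuals):
    print(x, round(r, 2))
```

Here the residuals are positive at both ends and negative in the middle, which is the kind of structure that suggests refitting a modified equation, for example one with a squared term.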

"We can gain further insight into what makes good plots by thinking about the process of visual perception. The eye can assimilate large amounts of visual information, perceive unanticipated structure, and recognize complex patterns; however, certain kinds of patterns are more readily perceived than others. If we thoroughly understood the interaction between the brain, eye, and picture, we could organize displays to take advantage of the things that the eye and brain do best, so that the potentially most important patterns are associated with the most easily perceived visual aspects in the display." (John M Chambers et al, "Graphical Methods for Data Analysis", 1983)

"Many of the applications of visualization in this book give the impression that data analysis consists of an orderly progression of exploratory graphs, fitting, and visualization of fits and residuals. Coherence of discussion and limited space necessitate a presentation that appears to imply this. Real life is usually quite different. There are blind alleys. There are mistaken actions. There are effects missed until the very end when some visualization saves the day. And worse, there is the possibility of the nearly unmentionable: missed effects." (William S Cleveland, "Visualizing Data", 1993)

"Anyone who has seen, and especially used, a highly responsive interactive visualization tool will be struck by two features. First, that a mere rearrangement of how the data is displayed can lead to a surprising degree of additional insight into that data. Second, that the very property of interactivity can considerably enhance that tool's effectiveness, especially if the computer's response follows a user's action virtually immediately, say within a fraction of a second." (Robert Spence, "Information Visualization", 2001)

"All good KPIs that I have come across, that have made a difference, had the CEO’s constant attention, with daily calls to the relevant staff. [...] A KPI should tell you about what action needs to take place. [...] A KPI is deep enough in the organization that it can be tied down to an individual. [...] A good KPI will affect most of the core CSFs and more than one BSC perspective. [...] A good KPI has a flow on effect." (David Parmenter, "Pareto’s 80/20 Rule for Corporate Accountants", 2007)

"Many management reports are not a management tool; they are merely memorandums of information. As a management tool, management reports should encourage timely action in the right direction, by reporting on those activities the Board, management, and staff need to focus on. The old adage 'what gets measured gets done' still holds true." (David Parmenter, "Pareto’s 80/20 Rule for Corporate Accountants", 2007)

"A persuasive visualization primarily serves the relationship between the designer and the reader. It is useful when the designer wishes to change the reader’s mind about something. It represents a very specific point of view, and advocates a change of opinion or action on the part of the reader. In this category of visualization, the data represented is specifically chosen for the purpose of supporting the designer’s point of view, and is presented carefully so as to convince the reader of same." (Noah Iliinsky & Julie Steele, "Designing Data Visualizations", 2011)

"Data alone isn’t valuable. In fact, it can be expensive in time and resources to manage and maintain. The analysis of this data is closer to something that is valuable. A clearly communicated analysis starts to transform a reflection of the world into knowledge in the minds of people. Even so, knowledge alone does not make your organization better. It is the decisions and actions of people - based on this data-sourced knowledge - that is the goal. But these decisions are seldom made in a vacuum. In most organizations, decisions are a collaborative, social experience. People come together to discuss options, review their knowledge of the situation, and arrive at a path to go down. Herein is one of the great powers of effective data products: They can shape and guide these discussions. Conclusions are seldom clear-cut, even when there is data to support a direction." (Zach Gemignani et al, "Data Fluency", 2014)

"Data captures actions and characteristics of the real world and transforms them into something that can be examined and explored after the fact." (Zach Gemignani et al, "Data Fluency", 2014)

"Just because data is visualized doesn’t necessarily mean that it is accurate, complete, or indicative of the right course of action. Exhibiting a healthy skepticism is almost always a good thing." (Phil Simon, "The Visual Organization: Data Visualization, Big Data, and the Quest for Better Decisions", 2014)

"Further develop the situation or problem by covering relevant background. Incorporate external context or comparison points. Give examples that illustrate the issue. Include data that demonstrates the problem. Articulate what will happen if no action is taken or no change is made. Discuss potential options for addressing the problem. Illustrate the benefits of your recommended solution." (Cole N Knaflic, "Storytelling with Data: A Data Visualization Guide for Business Professionals", 2015)

"If you simply present data, it’s easy for your audience to say, Oh, that’s interesting, and move on to the next thing. But if you ask for action, your audience has to make a decision whether to comply or not. This elicits a more productive reaction from your audience, which can lead to a more productive conversation - one that might never have been started if you hadn’t recommended the action in the first place." (Cole N Knaflic, "Storytelling with Data: A Data Visualization Guide for Business Professionals", 2015)

"A data story starts out like any other story, with a beginning and a middle. However, the end should never be a fixed event, but rather a set of options or questions to trigger an action from the audience. Never forget that the goal of data storytelling is to encourage and energize critical thinking for business decisions." (James Richardson, 2017)

"Indicators represent a way of 'distilling' the larger volume of data collected by organizations. As data become bigger and bigger, due to the greater span of control or growing complexity of operations, data management becomes increasingly difficult. Actions and decisions are greatly influenced by the nature, use and time horizon (e.g., short or long-term) of indicators." (Fiorenzo Franceschini et al, "Designing Performance Measurement Systems: Theory and Practice of Key Performance Indicators", 2019)

"The intended endpoint or destination of a data story is to guide an audience toward a better understanding and appreciation of your main point or insight, which hopefully leads to discussion, action, and change. However, if you have several divergent findings and try to combine them into a single data story, you may run the risk of confusing your audience or overwhelming them with too much information. To tell a cohesive data story, you must prioritize and limit what you focus on. Sometimes an insight deserves its own data story rather than being appended to the narrative of another insight." (Brent Dykes, "Effective Data Storytelling: How to Drive Change with Data, Narrative and Visuals", 2019)

📉Graphical Representation: Designers (Just the Quotes)

"The numerous design possibilities include several varieties of line graphs that are geared to particular types of problems. The design of a graph should be adapted to the type of data being structured. The data might be percentages, index numbers, frequency distributions, probability distributions, rates of change, numbers of dollars, and so on. Consequently, the designer must be prepared to structure his graph accordingly." (Cecil H Meyers, "Handbook of Basic Graphs: A modern approach", 1970)

"Because ease of use is the purpose, this ratio of function to conceptual complexity is the ultimate test of system design. Neither function alone nor simplicity alone defines a good design. [...] Function, and not simplicity, has always been the measure of excellence for its designers." (Fred P Brooks, "The Mythical Man-Month: Essays", 1975)

"The interior decoration of graphics generates a lot of ink that does not tell the viewer anything new. The purpose of decoration varies - to make the graphic appear more scientific and precise, to enliven the display, to give the designer an opportunity to exercise artistic skills. Regardless of its cause, it is all non-data-ink or redundant data-ink, and it is often chartjunk." (Edward R Tufte, "The Visual Display of Quantitative Information", 1983)

"Good design protects you from the need for too many highly accurate components in the system. But such design principles are still, to this date, ill-understood and need to be researched extensively. Not that good designers do not understand this intuitively, merely it is not easily incorporated into the design methods you were taught in school. Good minds are still needed in spite of all the computing tools we have developed." (Richard Hamming, "The Art of Doing Science and Engineering: Learning to Learn", 1997)

"There is no end to the information we can use. A 'good' map provides the information we need for a particular purpose - or the information the mapmaker wants us to have. To guide us, a map’s designers must consider more than content and projection; any single map involves hundreds of decisions about presentation." (Peter Turchi, "Maps of the Imagination: The writer as cartographer", 2004)

"For a given dataset there is not a great deal of advice which can be given on content and context. Those who know their own data should know best for their specific purposes. It is advisable to think hard about what should be shown and to check with others if the graphic makes the desired impression. Design should be left to designers, though some basic guidelines should be followed: consistency is important (sets of graphics should be in similar style and use equivalent scaling); proximity is helpful (place graphics on the same page, or on the facing page, of any text that refers to them); and layout should be checked (graphics should be neither too small nor too large and be attractively positioned relative to the whole page or display)." (Antony Unwin, "Good Graphics?" [in "Handbook of Data Visualization"], 2008)

"The main goal of data visualization is its ability to visualize data, communicating information clearly and effectively. It doesn’t mean that data visualization needs to look boring to be functional or extremely sophisticated to look beautiful. To convey ideas effectively, both aesthetic form and functionality need to go hand in hand, providing insights into a rather sparse and complex dataset by communicating its key aspects in a more intuitive way. Yet designers often tend to discard the balance between design and function, creating gorgeous data visualizations which fail to serve its main purpose - communicate information." (Vitaly Friedman, "Data Visualization and Infographics", Smashing Magazine, 2008)

"Designers are responsible for the project’s fit and finish, that is, specifying the geometry and sizes of components so they properly mate with each other and are ergonomically and aesthetically acceptable within the operating environment." (Dennis K Lieu & Sheryl Sorby, "Visualization, Modeling, and Graphics for Engineering Design", 2009)

"Having a purposeless or poorly performing dashboard is more common than not. This happens when the underlying architecture is not designed properly to support the needs of dashboard interaction. There is an obvious disconnect between the design of the data warehouse and the design of the dashboards. The people who design the data warehouse do not know what the dashboard will do; and the people who design the dashboards do not know how the data warehouse was designed, resulting in a lack of cohesion between the two. A similar disconnect can also exist between the dashboard designer and the business analyst, resulting in a dashboard that may look beautiful and dazzling but brings very little business value." (Nils H Rasmussen et al, "Business Dashboards: A visual catalog for design and deployment", 2009)

"Be aware that bar charts provide ample opportunities for chart junk. The space within the bars is enticingly empty and it is tempting to put images or textures in the background. Some designers even swap out the standard bars for graphics." (Brian Suda, "A Practical Guide to Designing with Data", 2010)

"All sorts of metaphorical interpretations are culturally ingrained. An astute designer will think about these possible interpretations and work with them, rather than against them." (Noah Iliinsky & Julie Steele, "Designing Data Visualizations", 2011)

"A persuasive visualization primarily serves the relationship between the designer and the reader. It is useful when the designer wishes to change the reader’s mind about something. It represents a very specific point of view, and advocates a change of opinion or action on the part of the reader. In this category of visualization, the data represented is specifically chosen for the purpose of supporting the designer’s point of view, and is presented carefully so as to convince the reader of same." (Noah Iliinsky & Julie Steele, "Designing Data Visualizations", 2011)

"[...] visual art, primarily serves the relationship between the designer and the data. [...] it often entails unidirectional encoding of information, meaning that the reader may not be able to decode the visual presentation to understand the underlying information. [...] visual art merely translates the data into a visual form. The designer may intend only to condense it, translate it into a new medium, or make it beautiful; she may not intend for the reader to be able to extract anything from it other than enjoyment." (Noah Iliinsky & Julie Steele, "Designing Data Visualizations", 2011)

"Information design, when successful - whether in print, on the web, or in the environment - represents the functional balance of the meaning of the information, the skills and inclinations of the designer, and the perceptions, education, experience, and needs of the audience." (Joel Katz, "Designing Information: Human factors and common sense in information design", 2012)

"Good design is an important part of any visualization, while decoration (or chart-junk) is best omitted. Statisticians should also be careful about comparing themselves to artists and designers; our goals are so different that we will fare poorly in comparison." (Hadley Wickham, "Graphical Criticism: Some Historical Notes", Journal of Computational and Graphical Statistics Vol. 22(1), 2013)

"Developing a clear understanding of the requirements of a particular target audience is a tricky problem for a designer. While it might seem obvious to you that it would be a good idea to understand requirements, it’s a common pitfall for designers to cut corners by making assumptions rather than actually engaging with any target users." (Tamara Munzner, "Visualization Analysis and Design", 2014)

"Usually, diagrams contain some noise - information unrelated to the diagram’s primary goal. Noise is decorations, redundant, and irrelevant data, unnecessarily emphasized and ambiguous icons, symbols, lines, grids, or labels. Every unnecessary element draws attention away from the central idea that the designer is trying to share. Noise reduces clarity by hiding useful information in a fog of useless data. You may quickly identify noise elements if you can remove them from the diagram or make them less intense and attractive without compromising the function." (Vasily Pantyukhin, "Principles of Design Diagramming", 2015)

"Another problem is that while data visualizations may appear to be objective, the designer has a great deal of control over the message a graphic conveys. Even using accurate data, a designer can manipulate how those data make us feel. She can create the illusion of a correlation where none exists, or make a small difference between groups look big." (Carl T Bergstrom & Jevin D West, "Calling Bullshit: The Art of Skepticism in a Data-Driven World", 2020)

📉Graphical Representation: Composition (Just the Quotes)

"Nothing is so illuminating as a set of properly proportioned diagrams. [...] In addition to the significance of graphics in analytical work, it is likewise a valuable aid to the memory. A picture is manifestly more readily retained in mind than a description of the same subject, no matter how vividly it may have been expressed. A pictorial or diagrammatic illustration usually produces a firmer and more lasting impression than any composition of words or tabulation of figures, however well they may be arranged or set forth." (Allan C Haskell, "How to Make and Use Graphic Charts", 1919)

"Without adequate planning, it is seldom possible to achieve either proper emphasis of each component element within the chart or a presentation that is pleasing in its entirety. Too often charts are developed around a single detail without sufficient regard for the work as a whole. Good chart design requires consideration of these four major factors: (1) size, (2) proportion, (3) position and margins, and (4) composition." (Anna C Rogers, "Graphic Charts Handbook", 1961)

"As a general rule, plotted points and graph lines should be given more 'weight' than the axes. In this way the 'meat' will be easily distinguishable from the 'bones'. Furthermore, an illustration composed of lines of unequal weights is always more attractive than one in which all the lines are of uniform thickness. It may not always be possible to emphasise the data in this way however. In a scattergram, for example, the more plotted points there are, the smaller they may need to be and this will give them a lighter appearance. Similarly, the more curves there are on a graph, the thinner the lines may need to be. In both cases, the axes may look better if they are drawn with a somewhat bolder line so that they are easily distinguishable from the data." (Linda Reynolds & Doig Simmonds, "Presentation of Data in Science" 4th Ed, 1984)

"Functional visualizations are more than innovative statistical analyses and computational algorithms. They must make sense to the user and require a visual language system that uses color, shape, line, hierarchy and composition to communicate clearly and appropriately, much like the alphabetic and character-based languages used worldwide between humans." (Matt Woolman, "Digital Information Graphics", 2002)

"While visuals are an essential part of data storytelling, data visualizations can serve a variety of purposes from analysis to communication to even art. Most data charts are designed to disseminate information in a visual manner. Only a subset of data compositions is focused on presenting specific insights as opposed to just general information. When most data compositions combine both visualizations and text, it can be difficult to discern whether a particular scenario falls into the realm of data storytelling or not." (Brent Dykes, "Effective Data Storytelling: How to Drive Change with Data, Narrative and Visuals", 2019)

"A semantic approach to visualization focuses on the interplay between charts, not just the selection of charts themselves. The approach unites the structural content of charts with the context and knowledge of those interacting with the composition. It avoids undue and excessive repetition by instead using referential devices, such as filtering or providing detail-on-demand. A cohesive analytical conversation also builds guardrails to keep users from derailing from the conversation or finding themselves lost without context. Functional aesthetics around color, sequence, style, use of space, alignment, framing, and other visual encodings can affect how users follow the script." (Vidya Setlur & Bridget Cogley, "Functional Aesthetics for data visualization", 2022)

"Aligning on data ink can be a powerful way to build relationships across charts. It can be used to obscure the lines between charts, making the composition feel more seamless. [...] Alignment paradigms can also influence the layout design needed. [...] The layout added to the alignment further supports this relationship." (Vidya Setlur & Bridget Cogley, "Functional Aesthetics for data visualization", 2022)

"Beyond basic charts, practitioners must also learn to compose visualizations together elegantly. The perceptual stage focuses on making the literal charts more precise as well as working to de-emphasize the entire piece. Design choices start to consider distractions, reducing visual clutter and centering on the message. Minimalism is espoused as a core value with an emphasis on shifting toward precision as accuracy. This is the most common next step for practitioners. Minimalism is also a key stage in maturation. It is experimentation at one extreme that helps practitioners distill down to core, shared practices." (Vidya Setlur & Bridget Cogley, "Functional Aesthetics for data visualization", 2022)

"Chart choices can also create weight within the entire composition. Presenting information as a comprehensive visualization, such as in a dashboard, requires thinking beyond individual charts. In writing, we not only craft sentences, but write the composition as an entire piece. Certain sentences may drive the writing more, but all sentences play a role in conveying the message." (Vidya Setlur & Bridget Cogley, "Functional Aesthetics for data visualization", 2022)

"Visualizations are abstractions, relying on primary graphicacy skills to fully understand the composition." (Vidya Setlur & Bridget Cogley, "Functional Aesthetics for data visualization", 2022)

15 November 2011

📉Graphical Representation: Distribution (Just the Quotes)

"Some distributions [...] are symmetrical about their central value. Other distributions have marked asymmetry and are said to be skew. Skew distributions are divided into two types. If the 'tail' of the distribution reaches out into the larger values of the variate, the distribution is said to show positive skewness; if the tail extends towards the smaller values of the variate, the distribution is called negatively skew." (Michael J Moroney, "Facts from Figures", 1951)
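
The sign convention in the quote above can be checked numerically. This is a minimal sketch assuming the moment coefficient of skewness (the mean of cubed standardized deviations), which the quote itself does not specify; the data are made up for illustration.

```python
from statistics import mean, pstdev

def skewness(values):
    """Moment coefficient of skewness: average cubed z-score."""
    m = mean(values)
    s = pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

right_tailed = [1, 2, 2, 3, 3, 3, 4, 4, 15]  # tail reaches into larger values
left_tailed = [-v for v in right_tailed]      # mirror image: tail toward smaller values
symmetric = [1, 2, 3, 4, 5]

print(skewness(right_tailed))  # positive skewness
print(skewness(left_tailed))   # negative skewness
print(skewness(symmetric))     # zero, up to rounding
```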

"The impression created by a chart depends to a great extent on the shape of the grid and the distribution of time and amount scales. When your individual figures are a part of a series make sure your own will harmonize with the other illustrations in spacing of grid rulings, lettering, intensity of lines, and planned to take the same reduction by following the general style of the presentation." (Anna C Rogers, "Graphic Charts Handbook", 1961)

"The logarithmic transformation serves several purposes: (1) The resulting regression coefficients sometimes have a more useful theoretical interpretation compared to a regression based on unlogged variables. (2) Badly skewed distributions - in which many of the observations are clustered together combined with a few outlying values on the scale of measurement - are transformed by taking the logarithm of the measurements so that the clustered values are spread out and the large values pulled in more toward the middle of the distribution. (3) Some of the assumptions underlying the regression model and the associated significance tests are better met when the logarithm of the measured variables is taken." (Edward R Tufte, "Data Analysis for Politics and Policy", 1974)
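
Point (2) of the quote above can be illustrated with a small sketch. The data are an assumption chosen for clarity (powers of two, so most values are clustered low with a few large ones), and the gap between mean and median is used here as a rough indicator of how far the large values pull the distribution.

```python
import math
from statistics import mean, median

# Made-up, badly skewed measurements: 1, 2, 4, ..., 256.
raw = [2 ** k for k in range(9)]
logged = [math.log(v) for v in raw]

# On the raw scale the few large values drag the mean far above the median.
print("raw:   ", mean(raw), median(raw))

# After taking logs the values are evenly spread and mean and median agree.
print("logged:", mean(logged), median(logged))
```

The natural logarithm is used here; any base gives the same shape, only a different scale.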

"Plotting on power-transformed scales (either cube roots or logs) is recommended only in those cases where the distribution is very asymmetric and the reference configuration for the untransformed plot would be a straight line through the origin." (John M Chambers et al, "Graphical Methods for Data Analysis", 1983)

"Boxplots provide information at a glance about center (median), spread (interquartile range), symmetry, and outliers. With practice they are easy to read and are especially useful for quick comparisons of two or more distributions. Sometimes unexpected features such as outliers, skew, or differences in spread are made obvious by boxplots but might otherwise go unnoticed." (Lawrence C Hamilton, "Regression with Graphics: A second course in applied statistics", 1991)
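
The quantities named in the quote above can be computed directly. This is a minimal sketch, assuming the common convention that points beyond 1.5 times the interquartile range from the quartiles count as outliers (the quote does not state a rule); `boxplot_summary` is a hypothetical helper and the sample is made up.

```python
from statistics import quantiles

def boxplot_summary(values, whisker=1.5):
    """Return the numbers a boxplot displays, using a whisker * IQR outlier rule."""
    data = sorted(values)
    q1, med, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - whisker * iqr
    high_fence = q3 + whisker * iqr
    outliers = [v for v in data if v < low_fence or v > high_fence]
    return {"q1": q1, "median": med, "q3": q3, "iqr": iqr, "outliers": outliers}

sample = [2, 3, 4, 5, 5, 6, 7, 8, 9, 30]  # made-up data with one extreme value
summary = boxplot_summary(sample)
print(summary)
```

Quartile definitions differ between tools, so the exact box edges may not match a given plotting library.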

"Comparing normal distributions reduces to comparing only means and standard deviations. If standard deviations are the same, the task is even simpler: just compare means. On the other hand, means and standard deviations may be incomplete or misleading as summaries for nonnormal distributions." (Lawrence C Hamilton, "Regression with Graphics: A second course in applied statistics", 1991)

"If a distribution were perfectly symmetrical, all symmetry-plot points would be on the diagonal line. Off-line points indicate asymmetry. Points fall above the line when distance above the median is greater than corresponding distance below the median. A consistent run of above-the-line points indicates positive skew; a run of below-the-line points indicates negative skew." (Lawrence C Hamilton, "Regression with Graphics: A second course in applied statistics", 1991)

"Remember that normality and symmetry are not the same thing. All normal distributions are symmetrical, but not all symmetrical distributions are normal. With water use we were able to transform the distribution to be approximately symmetrical and normal, but often symmetry is the most we can hope for. For practical purposes, symmetry (with no severe outliers) may be sufficient. Transformations are not a magic wand, however. Many distributions cannot even be made symmetrical." (Lawrence C Hamilton, "Regression with Graphics: A second course in applied statistics", 1991)

"Many good things happen when data distributions are well approximated by the normal. First, the question of whether the shifts among the distributions are additive becomes the question of whether the distributions have the same standard deviation; if so, the shifts are additive. […] A second good happening is that methods of fitting and methods of probabilistic inference, to be taken up shortly, are typically simple and on well understood ground. […] A third good thing is that the description of the data distribution is more parsimonious." (William S Cleveland, "Visualizing Data", 1993)

"The quantile plot is a good general display since it is fairly easy to construct and does a good job of portraying many aspects of a distribution. Three convenient features of the plot are the following: First, in constructing it, we do not make any arbitrary choices of parameter values or cell boundaries [...] and no models for the data are fitted or assumed. Second, like a table, it is not a summary but a display of all the data. Third, on the quantile plot every point is plotted at a distinct location, even if there are duplicates in the data. The number of points that can be portrayed without overlap is limited only by the resolution of the plotting device. For a high resolution device several hundred points distinguished." (John M Chambers et al, "Graphical Methods for Data Analysis", 1983)

"Boxplots provide information at a glance about center (median), spread (interquartile range), symmetry, and outliers. With practice they are easy to read and are especially useful for quick comparisons of two or more distributions. Sometimes unexpected features such as outliers, skew, or differences in spread are made obvious by boxplots but might otherwise go unnoticed." (Lawrence C Hamilton, "Regression with Graphics: A second course in applied statistics", 1991)

"A useful feature of a stem plot is that the values maintain their natural order, while at the same time they are laid out in a way that emphasizes the overall distribution of where the values are concentrated (that is, where the longer branches are). This enables you easily to pick out key values such as the median and quartiles." (Alan Graham, "Developing Thinking in Statistics", 2006)

"When displaying information visually, there are three questions one will find useful to ask as a starting point. Firstly and most importantly, it is vital to have a clear idea about what is to be displayed; for example, is it important to demonstrate that two sets of data have different distributions or that they have different mean values? Having decided what the main message is, the next step is to examine the methods available and to select an appropriate one. Finally, once the chart or table has been constructed, it is worth reflecting upon whether what has been produced truly reflects the intended message. If not, then refine the display until satisfied; for example if a chart has been used would a table have been better or vice versa?" (Jenny Freeman et al, "How to Display Data", 2008)

"'Distribution' refers to how the vof a variable are placed along an axis, keeping the proportional distances taken from the values in the table. In descriptive statistics, there are two complementary ways to study a distribution: searching for what is common (the measures of central tendency) and searching for what is different along with how much different it is (measures of dispersion)." (Jorge Camões, "Data at Work: Best practices for creating effective charts and information graphics in Microsoft Excel", 2016)

"The simplest and most common way to represent the empirical distribution of a numerical variable is by showing the individual values as dots arranged along a line. The main difficulty with this plot concerns how to treat tied values. We usually don't want to represent them by the same point, since that means that the two values look like one. What we can do is 'jitter' the points a bit (i.e., move them back and forth at right angles to the plot axis) so that all points are visible. […] In addition to permitting you to identify individual points, dotplots allow you to look into some of the distributional properties of a variable. […] Dotplots can also be good for looking for modality. " (Forrest W Young et al, "Visual Statistics: Seeing data with dynamic interactive graphics", 2016)

"There is no ‘correct’ way to display sets of numbers: each of the plots we have used has some advantages: strip-charts show individual points, box-and-whisker plots are convenient for rapid visual summaries, and histograms give a good feel for the underlying shape of the data distribution." (David Spiegelhalter, "The Art of Statistics: Learning from Data", 2019)

📉Graphical Representation: Simplification (Just the Quotes)

"Judgment must be used in the showing of figures in any chart or numerical presentation, so that the figures may not give an appearance of greater accuracy than their method of collection would warrant. Too many otherwise excellent reports contain figures which give the impression of great accuracy when in reality the figures may be only the crudest approximations. Except in financial statements, it is a safe rule to use ciphers whenever possible at the right of all numbers of great size. The use of the ciphers greatly simplifies the grasping of the figures by the reader, and, at the same time, it helps to avoid the impression of an accuracy which is not warranted by the methods of collecting the data." (Willard C Brinton, "Graphic Methods for Presenting Facts", 1919) 

"The great difference between the graphic representation of yesterday, which was poorly dissociated from the figurative image, and the graphics of tomorrow, is the disappearance of the congential fixity of the image. […] When one can superimpose, juxtapose, transpose, and permute graphic images in ways that lead to groupings and classings, the graphic image passes from the dead image, the 'illustration,' to the living image, the widely accessible research instrument it is now becoming. The graphic is no longer only the 'representation' of a final simplification, it is a point of departure for the discovery of these simplifications and the means for their justification. The graphic has become, by its manageability, an instrument for information processing." (Jacques Bertin, "Semiology of graphics" ["Semiologie Graphique"], 1967)

"What about confusing clutter? Information overload? Doesn't data have to be ‘boiled down’ and  ‘simplified’? These common questions miss the point, for the quantity of detail is an issue completely separate from the difficulty of reading. Clutter and confusion are failures of design, not attributes of information. Often the less complex and less subtle the line, the more ambiguous and less interesting is the reading. Stripping the detail out of data is a style based on personal preference and fashion, considerations utterly indifferent to substantive content." (Edward R Tufte, "Envisioning Information", 1990)

"A good chart delineates and organizes information. It communicates complex ideas, procedures, and lists of facts by simplifying, grouping, and setting and marking priorities. By spatial organization, it should lead the eye through information smoothly and efficiently." (Mary H Briscoe, "Preparing Scientific Illustrations: A guide to better posters, presentations, and publications" 2nd ed., 1995)

"An axis is the ruler that establishes regular intervals for measuring information. Because it is such a widely accepted convention, it is often taken for granted and its importance overlooked. Axes may emphasize, diminish, distort, simplify, or clutter the information. They must be used carefully and accurately." (Mary H Briscoe, "Preparing Scientific Illustrations: A guide to better posters, presentations, and publications" 2nd ed., 1995)

"Good ideas do not communicate themselves. Ideas must be organized. Highly complex ideas need to be clarified and simplified whereas diffuse data may benefit from being combined. Ideas and data must be made interesting and comprehensible to those not familiar with them." (Mary H Briscoe, "Preparing Scientific Illustrations:  guide to better posters, presentations, and publications" 2nd ed., 1995)

"Mathematical models are continually invoking ideas of infinitely smooth surfaces, weightless strings, weightless beams, perfectly spherical balls, projectiles flying through airless space, gases which are perfectly compressible and liquids which are perfectly incompressible, and so on. The purpose of such simplifications is, in theory, to understand the world better despite the oversimplification, which you hope either will not matter or will be corrected when you construct a second (better) model." (David Wells, "You Are a Mathematician: A wise and witty introduction to the joy of numbers", 1995)

"Charts are used to represent quantitative data in a graphic format. A chart visually illustrates relationships between numbers. When creating a chart, keep in mind that the goal is to represent the data in a simplified and appealing way so as not to muddle the message the chart is meant to convey." (Dennis K Lieu & Sheryl Sorby, "Visualization, Modeling, and Graphics for Engineering Design", 2009)

"Information graphics are an essential component of technical communication. Very few technical documents or presentations can be considered complete without graphical elements to present some essential data. Because engineers are visually oriented, graphic aids allow their thoughts and ideas to be better understood by other engineers. Information graphics are essential in presenting data because they simplify the content, offer a visually pleasing alternative to gray text in a proposal or an article, and thereby invite interest." (Dennis K Lieu & Sheryl Sorby, "Visualization, Modeling, and Graphics for Engineering Design", 2009)

"The data is a simplification - an abstraction - of the real world. So when you visualize data, you visualize an abstraction of the world, or at least some tiny facet of it. Visualization is an abstraction of data, so in the end, you end up with an abstraction of an abstraction, which creates an interesting challenge. […] Just like what it represents, data can be complex with variability and uncertainty, but consider it all in the right context, and it starts to make sense." (Nathan Yau, "Data Points: Visualization That Means Something", 2013)

"Form simplification means simplifying relationships among the components of the whole, emphasizing the whole and reducing the relevance of individual components by standardizing and generalizing relationships. This results in an increased weight of useful information (signal) against useless information (noise)." (Jorge Camões, "Data at Work: Best practices for creating effective charts and information graphics in Microsoft Excel", 2016)

"GIGO is a famous saying coined by early computer scientists: garbage in, garbage out. At the time, people would blindly put their trust into anything a computer output indicated because the output had the illusion of precision and certainty. If a statistic is composed of a series of poorly defined measures, guesses, misunderstandings, oversimplifications, mismeasurements, or flawed estimates, the resulting conclusion will be flawed." (Daniel J Levitin, "Weaponized Lies", 2017)

14 November 2011

📉Graphical Representation: Boxplots (Just the Quotes)

"Boxplots provide information at a glance about center (median), spread (interquartile range), symmetry, and outliers. With practice they are easy to read and are especially useful for quick comparisons of two or more distributions. Sometimes unexpected features such as outliers, skew, or differences in spread are made obvious by boxplots but might otherwise go unnoticed." (Lawrence C Hamilton, "Regression with Graphics: A second course in applied statistics", 1991)

"A bar graph typically presents either averages or frequencies. It is relatively simple to present raw data" (in the form of dot plots or box plots). Such plots provide much more information. and they are closer to the original data. If the bar graph categories are linked in some way - for example, doses of treatments - then a line graph will be much more informative. Very complicated bar graphs containing adjacent bars are very difficult to grasp. If the bar graph represents frequencies. and the abscissa values can be ordered, then a line graph will be much more informative and will have substantially reduced chart junk." (Gerald van Belle, "Statistical Rules of Thumb", 2002)

"Before calculating a confidence interval for a mean, first check that one of the situations just described holds. To determine whether the data are bell-shaped or skewed, and to check for outliers, plot the data using a histogram, dotplot, or stemplot. A boxplot can reveal outliers and will sometimes reveal skewness, but it cannot be used to determine the shape otherwise. The sample mean and median can also be compared to each other. Differences between the mean and the median usually occur if the data are skewed - that is, are much more spread out in one direction than in the other." (Jessica M Utts & Robert F Heckard, "Mind on Statistics", 2007)

"Symmetry and skewness can be judged, but boxplots are not entirely useful for judging shape. It is not possible to use a boxplot to judge whether or not a dataset is bell-shaped, nor is it possible to judge whether or not a dataset may be bimodal." (Jessica M Utts & Robert F Heckard, "Mind on Statistics", 2007)

"Sorting data is one of the most efficient actions to derive different views of data in order to see the variables from many angles. Sorting is usually not applied to the data itself, but to statistical objects of a plot. We might want to sort the bars in a barchart, the variables in a parallel boxplot or the categories in a boxplot y by x." (Martin Theus & Simon Urbanek, "Interactive Graphics for Data Analysis: Principles and Examples", 2009)

"Need to consider outliers as they can affect statistics such as means, standard deviations, and correlations. They can either be explained, deleted, or accommodated (using either robust statistics or obtaining additional data to fill-in). Can be detected by methods such as box plots, scatterplots, histograms or frequency distributions." (Randall E Schumacker & Richard G Lomax, "A Beginner’s Guide to Structural Equation Modeling" 3rd Ed., 2010)

"A boxplot is a dotplot enhanced with a schematic that provides information about the center and spread of the data, including the median, quartiles, and so on. This is a very useful way of summarizing a variable's distribution. The dotplot can also be enhanced with a diamond-shaped schematic portraying the mean and standard deviation" (or the standard error of the mean)." (Forrest W Young et al, "Visual Statistics: Seeing data with dynamic interactive graphics", 2016)

"Visual clutter is one of the most serious issues with bar charts. Using a bar to represent a simple data point is clearly overkill that results in no room for more data. At times, this may make us overlook less obvious things. The population pyramids offer a glaring example of this. But dot plots are not only about reducing clutter and avoiding overstimulation. Because we don’t compare heights, dot plots actually allow us to break the scale to improve resolution, and that’s a big plus over bar charts." (Jorge Camões, "Data at Work: Best practices for creating effective charts and information graphics in Microsoft Excel", 2016)

"[…] the drawback of the box plot is that it tends to hide the values due to its design." (Andy Kriebel & Eva Murray, "#MakeoverMonday: Improving How We Visualize and Analyze Data, One Chart at a Time", 2018)

"Side-by-side box plots is a simpler approach that can give a crude understanding of the relationship between one quantitative variable and two or more qualitative variables. When we have many subgroups, side-by-side box-and-whisker plots can be very useful for comparing basic features of a distribution." (Deborah Nolan & Sara Stoudt, "Communicating with Data: The Art of Writing for Data Science", 2021)

📉Graphical Representation: Extremes (Just the Quotes)

"Missing data values pose a particularly sticky problem for symbols. For instance, if the ray corresponding to a missing value is simply left off of a star symbol, the result will be almost indistinguishable from a minimum (i.e., an extreme) value. It may be better either (i) to impute a value, perhaps a median for that variable, or a fitted value from some regression on other variables, (ii) to indicate that the value is missing, possibly with a dashed line, or (iii) not to draw the symbol for a particular observation if any value is missing." (John M Chambers et al, "Graphical Methods for Data Analysis", 1983)

"Skewness is a measure of symmetry. For example, it's zero for the bell-shaped normal curve, which is perfectly symmetric about its mean. Kurtosis is a measure of the peakedness, or fat-tailedness, of a distribution. Thus, it measures the likelihood of extreme values." (John L Casti, "Reality Rules: Picturing the world in mathematics", 1992)

"If the underlying pattern of the data has gentle curvature with no local maxima and minima, then locally linear fitting is usually sufficient. But if there are local maxima or minima, then locally quadratic fitting typically does a better job of following the pattern of the data and maintaining local smoothness." (William S Cleveland, "Visualizing Data", 1993)

"Variance and its square root, the standard deviation, summarize the amount of spread around the mean, or how much a variable varies. Outliers influence these statistics too, even more than they influence the mean. On the other hand. the variance and standard deviation have important mathematical advantages that make them (together with the mean) the foundation of classical statistics. If a distribution appears reasonably symmetrical, with no extreme outliers, then the mean and standard deviation or variance are the summaries most analysts would use." (Lawrence C Hamilton, "Data Analysis for Social Scientists: A first course in applied statistics", 1995)

"Clearly, the mean is greatly influenced by extreme values, but it can be appropriate for many situations where extreme values do not arise. To avoid misuse, it is essential to know which summary measure best reflects the data and to use it carefully. Understanding the situation is necessary for making the right choice. Know the subject!" (Herbert F Spirer et al, "Misused Statistics" 2nd Ed, 1998)

"A feature shared by both the range and the interquartile range is that they are each calculated on the basis of just two values - the range uses the maximum and the minimum values, while the IQR uses the two quartiles. The standard deviation, on the other hand, has the distinction of using, directly, every value in the set as part of its calculation. In terms of representativeness, this is a great strength. But the chief drawback of the standard deviation is that, conceptually, it is harder to grasp than other more intuitive measures of spread." (Alan Graham, "Developing Thinking in Statistics", 2006)

"Many scientists who work not just with noise but with probability make a common mistake: They assume that a bell curve is automatically Gauss's bell curve. Empirical tests with real data can often show that such an assumption is false. The result can be a noise model that grossly misrepresents the real noise pattern. It also favors a limited view of what counts as normal versus non-normal or abnormal behavior. This assumption is especially troubling when applied to human behavior. It can also lead one to dismiss extreme data as error when in fact the data is part of a pattern." (Bart Kosko, "Noise", 2006)

"Standard quantile graphs offer certain advantages over cumulative percent frequency graphs. Among these advantages are ease of construction, actual data points are shown as opposed to summaries of class intervals, no decisions are required as to what the best size class interval might be, the same curve functions as a less-than and greater-than curve, and the actual maximum and minimum values are shown on the graph." (Robert L Harris, "Information Graphics: A Comprehensive Illustrated Reference", 1996)

"[…] an outlier is an observation that lies an 'abnormal' distance from other values in a batch of data. There are two possible explanations for the occurrence of an outlier. One is that this happens to be a rare but valid data item that is either extremely large or extremely small. The other is that it is a mistake - maybe due to a measuring or recording error." (Alan Graham, "Developing Thinking in Statistics", 2006)

"Plotting data is a useful first stage to any analysis and will show extreme observations together with any discernible patterns. In addition the relative sizes of categories are easier to see in a diagram" (bar chart or pie chart) than in a table. Graphs are useful as they can be assimilated quickly, and are particularly helpful when presenting information to an audience. Tables can be useful for displaying information about many variables at once, while graphs can be useful for showing multiple observations on groups or individuals. Although there are no hard and fast rules about when to use a graph and when to use a table, in the context of a report or a paper it is often best to use tables so that the reader can scrutinise the numbers directly." (Jenny Freeman et al, "How to Display Data", 2008)

13 November 2011

📉Graphical Representation: Density (Just the Quotes)

"Although arguments can be made that high data density does not imply that a graphic will be good, nor one with low density bad, it does reflect on the efficiency of the transmission of information. Obviously, if we hold clarity and accuracy constant, more information is better than less. One of the great assets of graphical techniques is that they can convey large amounts of information in a small space." (Howard Wainer, "How to Display Data Badly", The American Statistician Vol. 38(2), 1984) 

"Equal variability is not always achieved in plots. For instance, if the theoretical distribution for a probability plot has a density that drops off gradually to zero in the tails (as the normal density does), then the variability of the data in the tails of the probability plot is greater than in the center. Another example is provided by the histogram. Since the height of any one bar has a binomial distribution, the standard deviation of the height is approximately proportional to the square root of the expected height; hence, the variability of the longer bars is greater." (John M Chambers et al, "Graphical Methods for Data Analysis", 1983)

"[…] the only worse design than a pie chart is several of them, for then the viewer is asked to compare quantities located in spatial disarray both within and between pies. […] Given their low data-density and failure to order numbers along a visual dimension, pie charts should never be used." (Edward R Tufte, "The Visual Display of Quantitative Information", 1983)

"Visual displays rich with data are not only an appropriate and proper complement to human capabilities, but also such designs are frequently optimal. If the visual task is contrast, comparison, and choice - as so often it is - then the more relevant information within eyespan, the better. Vacant, low-density displays, the dreaded posterization of data spread over pages and pages, require viewers to rely on visual memory - a weak skill - to make a contrast, a comparison, a choice." (Edward R Tufte, "Envisioning Information", 1990)

"We envision information in order to reason about, communicate, document, and preserve that knowledge - activities nearly always carried out on two-dimensional paper and computer screen. Escaping this flatland and enriching the density of data displays are the essential tasks of information design." (Edward R Tufte, "Envisioning Information", 1990)

"Using colour, itʼs possible to increase the density of information even further. A single colour can be used to represent two variables simultaneously. The difficulty, however, is that there is a limited amount of information that can be packed into colour without confusion." (Brian Suda, "A Practical Guide to Designing with Data", 2010)

"The use of the density scale to construct the histogram ensures that the area of each rectangle in the histogram will be proportional to the corresponding relative frequency. The formula for density can also be used when class widths are equal. However, when the intervals are of equal width, the extra arithmetic required to obtain the densities is unnecessary." (Roxy Peck et al, "Introduction to Statistics and Data Analysis" 4th Ed., 2012)

"Linking is a powerful dynamic interactive graphics technique that can help us better understand high-dimensional data. This technique works in the following way: When several plots are linked, selecting an observation's point in a plot will do more than highlight the observation in the plot we are interacting with - it will also highlight points in other plots with which it is linked, giving us a more complete idea of its value across all the variables. Selecting is done interactively with a pointing device. The point selected, and corresponding points in the other linked plots, are highlighted simultaneously. Thus, we can select a cluster of points in one plot and see if it corresponds to a cluster in any other plot, enabling us to investigate the high-dimensional shape and density of the cluster of points, and permitting us to investigate the structure of the disease space." (Forrest W Young et al, "Visual Statistics: Seeing data with dynamic interactive graphics", 2016)

"When there are few data points, place the data labels directly on the data. Data density refers to the amount of data shown in a visualization through encodings (points, bars, lines, etc.). A common mistake is presenting too much data in a single data graph. The data itself can obscure the insight. It can make the chart unreadable because the data values are not discernible. Examples include: overlapping data points, too many lines in a line chart, or too many slices in a pie chart. Selecting the appropriate amount of data requires a delicate balance. It is your job to determine how much detail is necessary." (Kristen Sosulski, "Data Visualization Made Simple: Insights into Becoming Visual", 2018)

📉Graphical Representation: Missing Data (Just the Quotes)

"Missing data values pose a particularly sticky problem for symbols. For instance, if the ray corresponding to a missing value is simply left off of a star symbol, the result will be almost indistinguishable from a minimum (i.e., an extreme) value. It may be better either (i) to impute a value, perhaps a median for that variable, or a fitted value from some regression on other variables, (ii) to indicate that the value is missing, possibly with a dashed line, or (iii) not to draw the symbol for a particular observation if any value is missing." (John M Chambers et al, "Graphical Methods for Data Analysis", 1983)

"We often think, naïvely, that missing data are the primary impediments to intellectual progress - just find the right facts and all problems will dissipate. But barriers are often deeper and more abstract in thought. We must have access to the right metaphor, not only to the requisite information. Revolutionary thinkers are not, primarily, gatherers of facts, but weavers of new intellectual structures." (Stephen J Gould, "The Flamingo's Smile: Reflections in Natural History", 1985)

"Statistics depend on collecting information. If questions go unasked, or if they are asked in ways that limit responses, or if measures count some cases but exclude others, information goes ungathered, and missing numbers result. Nevertheless, choices regarding which data to collect and how to go about collecting the information are inevitable." (Joel Best, "More Damned Lies and Statistics: How numbers confuse public issues", 2004)

"People tend to give greater weight to the data that they have just been exposed to than other relevant data. […] This phenomenon, where people give greater attention to recent or easily available data, is often referred to as an availability error." (Alan Graham, "Developing Thinking in Statistics", 2006)

"There are many reasons for the existence of missing values: the failure of a sensor, different recording standards for different parts of a sample, or structural differences of the objects observed that make it impossible to record all attributes for all observed instances." (Martin Theus & Simon Urbanek, "Interactive Graphics for Data Analysis: Principles and Examples", 2009)

"There are several key issues in the field of statistics that impact our analyses once data have been imported into a software program. These data issues are commonly referred to as the measurement scale of variables, restriction in the range of data, missing data values, outliers, linearity, and nonnormality." (Randall E Schumacker & Richard G Lomax, "A Beginner’s Guide to Structural Equation Modeling" 3rd Ed., 2010)

"[…] events will always occur that cannot be foreseen by following a chain of logical deductive reasoning. Successful prediction requires intuitive leaps and/or information that is not part of the original data available." (John L Casti, "X-Events: The Collapse of Everything", 2012)

"Missing data is the blind spot of statisticians. If they are not paying full attention, they lose track of these little details. Even when they notice, many unwittingly sway things our way. Most ranking systems ignore missing values." (Kaiser Fung, "Numbersense: How To Use Big Data To Your Advantage", 2013)

"Having NUMBERSENSE means: (•) Not taking published data at face value; (•) Knowing which questions to ask; (•) Having a nose for doctored statistics. [...] NUMBERSENSE is that bit of skepticism, urge to probe, and desire to verify. It’s having the truffle hog’s nose to hunt the delicacies. Developing NUMBERSENSE takes training and patience. It is essential to know a few basic statistical concepts. Understanding the nature of means, medians, and percentile ranks is important. Breaking down ratios into components facilitates clear thinking. Ratios can also be interpreted as weighted averages, with those weights arranged by rules of inclusion and exclusion. Missing data must be carefully vetted, especially when they are substituted with statistical estimates. Blatant fraud, while difficult to detect, is often exposed by inconsistency." (Kaiser Fung, "Numbersense: How To Use Big Data To Your Advantage", 2013)

"Accuracy and coherence are related concepts pertaining to data quality. Accuracy refers to the comprehensiveness or extent of missing data, performance of error edits, and other quality assurance strategies. Coherence is the degree to which data - item value and meaning are consistent over time and are comparable to similar variables from other routinely used data sources." (Aileen Rothbard, "Quality Issues in the Use of Administrative Data Records", 2015)

"There are several key issues in the field of statistics that impact our analyses once data have been imported into a software program. These data issues are commonly referred to as the measurement scale of variables, restriction in the range of data, missing data values, outliers, linearity, and nonnormality." (Randall E Schumacker & Richard G Lomax, "A Beginner’s Guide to Structural Equation Modeling" 3rd Ed., 2010)

"[…] people attempt to use highly flexible mathematical structures with large numbers of parameters that can be adjusted to fit the data, the result often being models that fit the data well but lack structural representation of the phenomena and thus are not predictive outside the range of the data. The situation is exacerbated by uncertainty regarding model parameters on account of insufficient data relative to model complexity, which in fact means uncertainty regarding the models themselves. More importantly from the standpoint of epistemology, the amount of available data is often miniscule in comparison to the amount needed for validation. The desire for knowledge has far outstripped experimental/observational capability. We are starved for data." (Edward R Dougherty, "The Evolution of Scientific Knowledge: From certainty to uncertainty", 2016)

"There are other problems with Big Data. In any large data set, there are bound to be inconsistencies, misclassifications, missing data - in other words, errors, blunders, and possibly lies. These problems with individual items occur in any data set, but they are often hidden in a large mass of numbers even when these numbers are generated out of computer interactions." (David S Salsburg, "Errors, Blunders, and Lies: How to Tell the Difference", 2017)

"Unless we’re collecting data ourselves, there’s a limit to how much we can do to combat the problem of missing data. But we can and should remember to ask who or what might be missing from the data we’re being told about. Some missing numbers are obvious […]. Other omissions show up only when we take a close look at the claim in question." (Tim Harford, "The Data Detective: Ten easy rules to make sense of statistics", 2020)

"Correlation does not imply causation: often some other missing third variable is influencing both of the variables you are correlating. […] The need for a scatterplot arose when scientists had to examine bivariate relations between distinct variables directly. As opposed to other graphic forms - pie charts, line graphs, and bar charts - the scatterplot offered a unique advantage: the possibility to discover regularity in empirical data (shown as points) by adding smoothed lines or curves designed to pass 'not through, but among them', so as to pass from raw data to a theory-based description, analysis, and understanding." (Michael Friendly & Howard Wainer, "A History of Data Visualization and Graphic Communication", 2021)

12 November 2011

📉Graphical Representation: Exploration (Just the Quotes)

"Modern data graphics can do much more than simply substitute for small statistical tables. At their best, graphics are instruments for reasoning about quantitative information. Often the most effective way to describe, explore, and summarize a set of numbers even a very large set - is to look at pictures of those numbers. Furthermore, of all methods for analyzing and communicating statistical information, well-designed data graphics are usually the simplest and at the same time the most powerful." (Edward R Tufte, "The Visual Display of Quantitative Information", 1983)

"Working with binned data directly addresses large data set issues of computation and plotting speed. Almost everything that can bc done with the original data can be done faster with binned data. Further, working with binned data allows image processing algorithms to be adapted and applied to bin cells. Thus tools can bc brought to bare that are not traditionally associated with exploratory data analysis." (Daniel B Carr, "Looking at Large Data Sets Using Binned Data Plots", [in "Computing and Graphics in Statistics"] 1991)

"The scatterplot is a useful exploratory method for providing a first look at bivariate data to see how they are distributed throughout the plane, for example, to see clusters of points, outliers, and so forth." (William S Cleveland, "Visualizing Data", 1993)

"Overview first, zoom and filter, then details on demand." (Ben Shneiderman “The Eyes Have It: A Task by Data Type Taxonomy for Information Visualizations.” IEEE Symposium on Visual Languages, 1996) 

"Construction refers to everything involved in the production of the graphical display, including questions of what to plot and how to plot. Deciding what to plot is not always easy and again depends on what we want to accomplish. In the initial phases of an analysis, two-dimensional displays of the response against each of the p predictors are obvious choices for gaining insights about the data, choices that are often recommended in the introductory regression literature. Displays of residuals from an initial exploratory fit are frequently used as well." (R Dennis Cook, "Regression Graphics: Ideas for studying regressions through graphics", 1998)

"If we attempt to map the world of a story before we explore it, we are likely either to (a) prematurely limit our exploration, so as to reduce the amount of material we need to consider, or" (b) explore at length but, recognizing the impossibility of taking note of everything, and having no sound basis for choosing what to include, arbitrarily omit entire realms of information. The opportunities are overwhelming." (Peter Turchi, "Maps of the Imagination: The writer as cartographer", 2004)

"Clearly principles and guidelines for good presentation graphics have a role to play in exploratory graphics, but personal taste and individual working style also play important roles. The same data may be presented in many alternative ways, and taste and customs differ as to what is regarded as a good presentation graphic. Nevertheless, there are principles that should be respected and guidelines that are generally worth following. No one should expect a perfect consensus where graphics are concerned." (Antony Unwin, Good Graphics?"[in "Handbook of Data Visualization"], 2008)

"There are two main reasons for using graphic displays of datasets: either to present or to explore data. Presenting data involves deciding what information you want to convey and drawing a display appropriate for the content and for the intended audience. [...] Exploring data is a much more individual matter, using graphics to find information and to generate ideas.Many displays may be drawn. They can be changed at will or discarded and new versions prepared, so generally no one plot is especially important, and they all have a short life span." (Antony Unwin, "Good Graphics?" [in "Handbook of Data Visualization"], 2008)

"Presentation graphics face the challenge to depict a key message in - usually a single - graphic which needs to fit very many observers at a time, without the chance to give further explanations or context. Exploration graphics, in contrast, are mostly created and used only by a single researcher, who can use as many graphics as necessary to explore particular questions. In most cases none of these graphics alone gives a comprehensive answer to those questions, but must be seen as a whole in the context of the analysis." (Martin Theus & Simon Urbanek, "Interactive Graphics for Data Analysis: Principles and Examples", 2009)

"All graphics present data and allow a certain degree of exploration of those same data. Some graphics are almost all presentation, so they allow just a limited amount of exploration; hence we can say they are more infographics than visualization, whereas others are mostly about letting readers play with what is being shown, tilting more to the visualization side of our linear scale. But every infographic and every visualization has a presentation and an exploration component: they present, but they also facilitate the analysis of what they show, to different degrees." (Alberto Cairo, "The Functional Art", 2011)

"A viewer’s eye must be guided to 'read' the elements in a logical order. The design of an exploratory graphic needs to allow for the additional component of discovery - guiding the viewer to first understand the overall concept and then engage her to further explore the supporting information." (Felice C Frankel & Angela H DePace, "Visual Strategies", 2012)

"The process of visual analysis can potentially go on endlessly, with seemingly infinite combinations of variables to explore, especially with the rich opportunities bigger data sets give us. However, by deploying a disciplined and sensible balance between deductive and inductive enquiry you should be able to efficiently and effectively navigate towards the source of the most compelling stories." (Andy Kirk, "Data Visualization: A successful design process", 2012)

"Early exploration of a dataset can be overwhelming, because you don’t know where to start. Ask questions about the data and let your curiosities guide you. […] Make multiple charts, compare all your variables, and see if there are interesting bits that are worth a closer look. Look at your data as a whole and then zoom in on categories and individual data points. […] Subcategories, the categories within categories" (within categories), are often more revealing than the main categories. As you drill down, there can be higher variability and more interesting things to see." (Nathan Yau, "Data Points: Visualization That Means Something", 2013)

"Good visualization is a winding process that requires statistics and design knowledge. Without the former, the visualization becomes an exercise only in illustration and aesthetics, and without the latter, one of only analyses. On their own, these are fine skills, but they make for incomplete data graphics. Having skills in both provides you with the luxury - which is growing into a necessity - to jump back and forth between data exploration and storytelling." (Nathan Yau, "Data Points: Visualization That Means Something", 2013)

"Put everything together - from understanding data, to exploration, clarity, and adapting to an audience - and you get a general process for how to make data graphics. " (Nathan Yau, "Data Points: Visualization That Means Something", 2013)

"Visualization can be appreciated purely from an aesthetic point of view, but it’s most interesting when it’s about data that’s worth looking at. That’s why you start with data, explore it, and then show results rather than start with a visual and try to squeeze a dataset into it. It’s like trying to use a hammer to bang in a bunch of screws. […] Aesthetics isn’t just a shiny veneer that you slap on at the last minute. It represents the thought you put into a visualization, which is tightly coupled with clarity and affects interpretation." (Nathan Yau, "Data Points: Visualization That Means Something", 2013)

"[...] communicating with data is less often about telling a specific story and more like starting a guided conversation. It is a dialogue with the audience rather than a monologue. While some data presentations may share the linear approach of a traditional story, other data products" (analytical tools, in particular) give audiences the flexibility for exploration. In our experience, the best data products combine a little of both: a clear sense of direction defined by the author with the ability for audiences to focus on the information that is most relevant to them. The attributes of the traditional story approach combined with the self-exploration approach leads to the guided safari analogy." (Zach Gemignani et al, "Data Fluency", 2014)

"Exploratory analysis is what you do to understand the data and figure out what might be noteworthy or interesting to highlight to others." (Cole N Knaflic, "Storytelling with Data: A Data Visualization Guide for Business Professionals", 2015)

"Exploring data generates hypotheses about patterns in our data. The visualizations and tools of dynamic interactive graphics ease and improve the exploration, helping us to 'see what our data seem to say'." (Forrest W Young et al, "Visual Statistics: Seeing data with dynamic interactive graphics", 2016)

"With time series though, there is absolutely no substitute for plotting. The pertinent pattern might end up being a sharp spike followed by a gentle taper down. Or, maybe there are weird plateaus. There could be noisy spikes that have to be filtered out. A good way to look at it is this: means and standard deviations are based on the naïve assumption that data follows pretty bell curves, but there is no corresponding 'default' assumption for time series data (at least, not one that works well with any frequency), so you always have to look at the data to get a sense of what’s normal. [...] Along the lines of figuring out what patterns to expect, when you are exploring time series data, it is immensely useful to be able to zoom in and out." (Field Cady, "The Data Science Handbook", 2017)

"Models are formal structures represented in mathematics and diagrams that help us to understand the world. Mastery of models improves your ability to reason, explain, design, communicate, act, predict, and explore." (Scott E Page, "The Model Thinker", 2018)

"The way we explore data today, we often aren't constrained by rigid hypothesis testing or statistical rigor that can slow down the process to a crawl. But we need to be careful with this rapid pace of exploration, too. Modern business intelligence and analytics tools allow us to do so much with data so quickly that it can be easy to fall into a pitfall by creating a chart that misleads us in the early stages of the process." (Ben Jones, "Avoiding Data Pitfalls: How to Steer Clear of Common Blunders When Working with Data and Presenting Analysis and Visualizations", 2020)

"Data that is well prepared makes the analysis easier and allows a deeper exploration of patterns. It helps the analyst sift through the data with less friction. Data that is well crafted holds up to rigorous analysis and presentation. It removes the wall between us and the data and allows us to see the patterns. Well-shaped data isn't only functional, it's also aesthetic." (Vidya Setlur & Bridget Cogley, "Functional Aesthetics for data visualization", 2022)

"We define analytical intent to be the goal that a consumer or analyst focuses on when performing either targeted or more open-ended data exploration and discovery. Analytical intent is expressed as part of a conversation between the user and a visualization interface." (Vidya Setlur & Bridget Cogley, "Functional Aesthetics for data visualization", 2022)

"Charts used to confirm are less formal, and designed well enough to be interpreted, but they don’t always have to be presentation worthy. […] Or maybe you don’t know what you’re looking for […] This is exploratory work - rougher still in design, usually iterative, sometimes interactive. Most of us don’t do as much exploratory work as we do declarative and confirmatory; we should do more. It’s a kind of data brainstorming." (Scott Berinato, "Good Charts : the HBR guide to making smarter, more persuasive data visualizations", 2023)

"Confirmation is a kind of focused exploration, whereas true exploration is more open-ended. The bigger and more complex the data, and the less you know going in, the more exploratory the work. If confirmation is hiking a new trail, exploration is blazing one." (Scott Berinato, "Good Charts : the HBR guide to making smarter, more persuasive data visualizations", 2023)

📉Graphical Representation: Expert Perspectives (Just the Quotes)

"Absorb the data. Read it, re-read it, read it backwards and understand the lyrical and human-centred contribution." (Kate McLean) [1]

"Admit that nothing you create on deadline will be perfect. However, it should never be wrong. I try to work by a motto my editor likes to say: 'No Heroics. Your code may not be beautiful, but if it works, it’s good enough.' A visualisation may not have every feature you could possibly want, but if it gets the message across and is useful to people, it’s good enough. Being 'good enough' is not an insult in journalism – it’s a necessity." (Lena Groeger) [1]

"After the data exploration phase you may come to the conclusion that the data does not support the goal of the project. The thing is: data is leading in a data visualization project – you cannot make up some data just to comply with your initial ideas. So, you need to have some kind of an open mind and 'listen to what the data has to say' and learn what its potential is for a visualization. Sometimes this means that a project has to stop if there is too much of a mismatch between the goal of the project and the available data. In other cases this may mean that the goal needs to be adjusted and the project can continue." (Jan Willem Tulp) [1]

"Although all our projects are very much data driven, visualisation is only part of the products and solutions we create. This day and age provides us with amazing opportunities to combine video, animation, visualisation, sound and interactivity. Why not make full use of this? Judging whether to include something or not is all about editing: asking 'is it really necessary?'. There is always an aspect of gut feel or instinct mixed with continuous doubt that drives me in these cases." (Thomas Clever) [1]

"At the beginning, there’s a process of 'interviewing' the data – first evaluating their source and means of collection/aggregation/computation, and then trying to get a sense of what they say – and how well they say it via quick sketches in Excel with pivot tables and charts. Do the data, in various slices, say anything interesting? If I’m coming into this with certain assumptions, do the data confirm them, or refute them?" (Alyson Hurt) [1]

"Context is key. You’ll hear that the most important quality of a visualisation is graphical honesty, or storytelling value, or facilitation of 'insights'. The truth is, all of these things (and others) are the most important quality, but in different times and places. There is no singular function of visualisation; what’s important shifts with the constraints of your audience, goals, tools, expertise, and data and time available.’ (Scott Murray) [1]

"Data and data sets are not objective; they are creations of human design. Hidden biases in both the collection and analysis stages present considerable risks [in terms of inference]." (Kate Crawford) [1]

"Data inspires me. I always open the data in its native format and look at the raw data just to get the lay of the land. It’s much like looking at a map to begin a journey." (Kim Rees) [1]

"'Everything must have a reason.' A principle that I learned as a graphic designer that still applies to data visualization. In essence, everything needs to be rationalized and have a logic to why it’s in the design/visualization, or it’s out." (Stefanie Posavec) [1]

"Good design is honest. It does not make a product appear more innovative, powerful or valuable than it really is. It does not attempt to manipulate the consumer with promises that cannot be kept." (Dieter Rams) [1]

"I focus on structural exploration on one side and on the reality and the landscape of opportunities in the other […] I try not to impose any early ideas of what the result will look like because that will emerge from the process. In a nutshell I first activate data curiosity, client curiosity, and then visual imagination in parallel with experimentation." (Santiago Ortiz) [1]

"I kick it over into a rough picture as soon as possible. When I can see something then I am able to ask better questions of it – then the what-about-this iterations begin. I try to look at the same data in as many different dimensions as possible. For example, if I have a spreadsheet of bird sighting locations and times, first I like to see where they happen, previewing it in some mapping software. I’ll also look for patterns in the timing of the phenomenon, usually using a pivot table in a spreadsheet. The real magic happens when a pattern reveals itself only when seen in both dimensions at the same time." (John Nelson) [1]

"I say begin by learning about data visualization’s 'black and whites' , the rules, then start looking for the greys. It really then becomes quite a personal journey of developing your conviction." (Jorge Camoes) [1]

"I suppose one could say our work has a certain signature. Style, to me, has a negative connotation of 'slapped on' = to prettify something without much meaning. We don’t make it our goal to have a recognisable (visual) signature, instead to create work that truly matters and is unique. Pretty much all our projects are bespoke and have a different end result. That is one of the reasons why we are more concerned with working according to values and principles that transcend individual projects and I believe that is what makes our work recognisable." (Thomas Clever) [1]

"I think this is something I’ve learned from experience rather than advice that was passed on. Less can often be more. In other words, don’t get carried away and try to tell the reader everything there is to know on a subject. Know what it is that you want to show the reader and don’t stray from that. I often find myself asking others 'do we need to show this?” or “is this really necessary'?' Let’s take it out." (Simon Scarr) [1]

"I truly feel that experimentation (even for the sake of experimentation) is important, and I would strongly encourage it. There are infinite possibilities in diagramming and visual communication, so we have much to explore yet. I think a good rule of thumb is to never allow your design or implementation to obscure the reader understanding the central point of your piece. However, I’d even be willing to forsake this, at times, to allow for innovation and experimentation. It ends up moving us all forward, in some way or another." (Kennedy Elliott) [1]

"I’m obsessed with alignments. Sloppy label placement on final files causes my confidence in the designer to flag. What other details haven’t been given full attention? Has the data been handled sloppily as well? [...] On the flip side, clean, layered, and logically built final files are a thing of beauty and my confidence in the designer, and their attention to detail, soars." (Jen Christiansen) [1]

"I’ve come to believe that pure beautiful visual works are somehow relevant in everyday life, because they can become a trigger to get people curious to explore the contents these visuals convey. I like the idea of making people say 'oh that’s beautiful! I want to know what this is about!' I think that probably (or, at least, lots of people pointed that out to us) being Italians plays its role on this idea of 'making things not only functional but beautiful'." (Giorgia Lupi) [1]

"It is easy to immerse yourself in a certain idea, but I think it is important to step back regularly and recognize that other people have different ways of interpreting things. I am very fortunate to work with people whom I greatly admire and who also see things from a different perspective. Their feedback is invaluable in the process." (Jane Pong) [1]

"Look at how other designers solve visual problems (but don’t copy the look of their solutions). Look at art to see how great painters use space, and organise the elements of their pictures. Look back at the history of infographics. It’s all been done before, and usually by hand! Draw something with a pencil (or pen [...] but NOT a computer!). Sketch often: The cat asleep. The view from the bus. The bus. Personally, I listen to music – mostly jazz – a lot." (Nigel Holmes) [1]

"My design approach requires that I immerse myself deeply in the problem domain and available data very early in the project, to get a feel for the unique characteristics of the data, its 'texture' and the affordances it brings. It is very important that the results from these explorations, which I also discuss in detail with my clients, can influence the basic concept and main direction of the project. To put it in Hans Rosling’s words, you need to 'let the data set change your mind set'." (Moritz Stefaner) [1]

"My main advice is not to be disheartened. Sometimes the data don’t show what you 
thought they would, or they aren’t available in a usable or comparable form. But [in my world] sometimes that research still turns up threads a reporter could pursue and turn into a really interesting story – there just might not be a viz in it. Or maybe there’s no story at all. And that’s all okay. At minimum, you’ve still hopefully learned something new in the process about a topic, or a data source (person or database), or a 'gotcha' in a particular dataset – lessons that can be applied to another project down the line." (Alyson Hurt) [1]

"Research is key. Data, without interpretation, is just a jumble of words and numbers – out of context and devoid of meaning. If done well, research not only provides a solid foundation upon which to build your graphic/visualisation, but also acts as a source of inspiration and a guidebook for creativity. A good researcher must be a team player with the ability to think critically, analytically, and creatively. They should be a proactive problem solver, identifying potential pitfalls and providing various roadmaps for overcoming them. In short, their inclusion should amplify, not restrain, the talents of others." (Amanda Hobbs) [1]

"The capability to cope with the technological dimension is a key attribute of successful students: coding – more as a logic and a mindset than a technical task – is becoming a very important asset for designers who want to work in Data Visualization. It doesn’t necessarily mean that you need to be able to code to find a job, but it helps a lot in the design process. The profile in the (near) future will be a hybrid one, mixing competences, skills and approaches currently separated into disciplinary silos." (Paolo Ciuccarelli) [1]

"The experience offered by a visualisation influences the interpreting phase of understanding. Whereas tone embodies a continuum, the judgement of the most suitable experience is more distinct and concerns different methods of enabling interpretation: explanatory, exhibitory or exploratory you degrade its existence and malign its importance. Words are not your enemy. Complex thoughts are not your enemy. Confusion is. Don’t confuse your audience. Don’t talk down to them, don’t mislead them, and certainly don’t lie to them." (Amanda Hobbs) [1]

"The key difference I think in producing data visualization/infographics in the service of journalism versus other contexts (like art) is that there is always an underlying, ultimate goal: to be useful. Not just beautiful or efficient – although something can (and should!) be all of those things. But journalism presents a certain set of constraints. A journalist has to always ask the question: How can I make this more useful? How can what I am creating help someone, teach someone, show someone something new?" (Lena Groeger) [1]

"There's a strand of the data viz world that argues that everything could be a bar chart. That’s possibly true but also possibly a world without joy." (Amanda Cox, [interview in ( Scott Berinato"The Power of Visualization’s 'Aha!' Moments, Harvard Business Review] 2013) (link) [1]

"Think of the reader – a specific reader, like a friend who’s curious but a novice to the subject and to data-viz – when designing the graphic. That helps. And I rely pretty heavily on that introductory text that runs with each graphic – about 100 words, usually, that should give the new-to-the-subject reader enough background to understand why this graphic is worth engaging with and sets them up to understand and contextualize the takeaway. And annotate the graphic itself. If there’s a particular point you want the reader to understand, make it! Explicitly!" (Katie Peek) [1]

"Using our eyes to switch between different views that are visible simultaneously has much 
lower cognitive load than consulting our mem￾ory to compare a current view with what was seen before." (Tamara Munzner) [1]

"We should pay as much attention to understanding the project’s goal in relation to its audience. This involves understanding principles of perception and cognition in addition to other relevant factors, such as culture and education levels, for example. More importantly, it means carefully matching the tasks in the representation to our audience’s needs, expectations, expertise, etc. Visualizations are human-centred projects, in that they are not universal and will not be effective for all humans uniformly. As producers of visualizations, whether devised for data exploration or communication of information, we need to take into careful consideration those on the other side of the equation, and who will face the challenges of decoding our representations." (Isabel Meirelles) [1]

"What is the least this can be? What is the minimum result that will 1) be factually accurate, 2) present the core concepts of this story in a way that a general audience will understand, and 3) be readable on a variety of screen sizes 
(desktop, mobile, etc.)? And then I judge what else can be done based on the time I have. 
Certainly, when we’re down to the wire it’s no time to introduce complex new features that require lots of testing and could potentially break other, working features." (Alyson Hurt) [1]

"When I first started learning about visualisation, I naively assumed that datasets arrived at your doorstep ready to roll. Begrudgingly I accepted that before you can plot or graph anything, you have to find the data, understand it, evaluate it, clean it, and perhaps restructure it." (Marcia Gray) [1]

"When something is not harmonious, it’s either boring or chaotic. At one extreme is a visual experience that is so bland that the viewer is not engaged. The human brain will reject understimulating information. At the other extreme is a visual experience that is so overdone, so chaotic, that the viewer can’t stand to look at it. The human brain rejects what it cannot organize, what it cannot understand." (Jill Morton) [1]

"When the data has been explored sufficiently, it is time to sit down and reflect – what were the most interesting insights? What surprised me? What were the recurring themes and facts throughout all views on the data? In the end, what do we find most important and most interesting? These are the things that will govern which angles and perspectives we want to emphasize in the subsequent project phases." (Moritz Stefaner) [1]

"You don’t get there [beauty] with cosmetics, you get there by taking care of the details, by polishing and refining what you have. This is ultimately a matter of trained taste, or what German speakers call fingerspitzengefühl ('finger-tip-feeling')." (Oliver Reichenstein) [1]

References:
[1] Andy Kirk, "Data Visualisation: A Handbook for Data Driven Design" 2nd Ed., 2019

💠SQL Server: SQL Server 2012 is almost here [new feature]

I have been quite quiet for the past 3-4 months, not for lack of blogging material but for lack of time. Instead of writing I preferred reading, diving into some special topics related to SQL Server (e.g. tempdb and security); I plan to post some of my notes in the near future. For a short time I was busy studying for the ITIL® v3 Foundation Certification, whose topics on Knowledge Management gave me more ideas for several posts waiting in the pipeline. I also started the online “Introduction to Databases” course offered by Stanford University, thus attempting a scholastic approach to the topic; of particular importance is the material on Relational Algebra, which I didn’t have the chance to study in the past.

From my perspective, two important events related to SQL Server took place during this time: the launch of Dynamics AX 2012 and, more recently, the introduction of SQL Server 2012 at the PASS (Professional Association for SQL Server) Summit 2011.

SQL Server 2012

Four of the newest SQL Server products were presented at PASS Summit 2011: SQL Server 2012 (code-named Denali), Power View (code-named Crescent), ColumnStore Index (code-named Apollo) and SQL Server Data Tools (code-named Juneau). The streamed PASS 2011 sessions are available online, with quite interesting material on SQL Server topics such as application and database development, database administration and deployment, BI, etc. If you want to learn more about SQL Server, check the CTP 3 Product Guide, which contains datasheets, white papers, technical presentations, demonstrations and links to videos, or the SQL Server 2012 Developer Training Kit Preview (requires Microsoft’s Web Platform Installer).

Dynamics AX 2012

Because lately I’ve been spending more and more time with Dynamics AX, Microsoft’s ERP (Enterprise Resource Planning) solution, I’d like to include related content in my posts, at least by presenting resources if I can’t yet get into the technical stuff. As its backend is based mainly on SQL Server, AX is the perfect environment in which to see SQL Server at work, or to perform configuration and administration activities. In addition, AX material related to SQL Server (best/good practices, methodologies, various other papers) could be applied to other environments. I welcome Microsoft’s decision to make more TechNet and MSDN content publicly available; previously most of the technical content was accessible mainly through Microsoft’s Partner Network and Customer Network. A good compilation of resources is available on the AX Technical Support Blog and the Inside Microsoft Dynamics AX blog.

As mentioned above, Microsoft Dynamics AX 2012 was launched recently (see global and local launch events). It’s interesting to point out that, with this edition, SSRS becomes the reporting platform for AX, a considerable step forward.

Books

As for free books, there are 3 “new” appearances: Jonathan Kehayias and Ted Krueger’s Troubleshooting SQL Server: A Guide for the Accidental DBA (zipped PDF), which provides a basic approach to troubleshooting; Fabiano Amorim’s Complete Showplan Operators (PDF, Epub); and Ross Mistry and Stacia Misner’s Introducing Microsoft SQL Server 2008 R2 (PDF, requires registration).
