09 December 2015

🪙Business Intelligence: Averages (Just the Quotes)

"It is difficult to understand why statisticians commonly limit their inquiries to Averages, and do not revel in more comprehensive views. Their souls seem as dull to the charm of variety as that of the native of one of our flat English counties, whose retrospect of Switzerland was that, if its mountains could be thrown into its lakes, two nuisances would be got rid of at once. An Average is but a solitary fact, whereas if a single other fact be added to it, an entire Normal Scheme, which nearly corresponds to the observed one, starts potentially into existence." (Sir Francis Galton, "Natural Inheritance", 1889)

"Statistics may rightly be called the science of averages. […] Great numbers and the averages resulting from them, such as we always obtain in measuring social phenomena, have great inertia. […] It is this constancy of great numbers that makes statistical measurement possible. It is to great numbers that statistical measurement chiefly applies." (Sir Arthur L Bowley, "Elements of Statistics", 1901)

"[…] the new mathematics is a sort of supplement to language, affording a means of thought about form and quantity and a means of expression, more exact, compact, and ready than ordinary language. The great body of physical science, a great deal of the essential facts of financial science, and endless social and political problems are only accessible and only thinkable to those who have had a sound training in mathematical analysis, and the time may not be very remote when it will be understood that for complete initiation as an efficient citizen of one of the new great complex world wide states that are now developing, it is as necessary to be able to compute, to think in averages and maxima and minima, as it is now to be able to read and to write." (Herbert G Wells, "Mankind In the Making", 1906)

"Of itself an arithmetic average is more likely to conceal than to disclose important facts; it is the nature of an abbreviation, and is often an excuse for laziness." (Arthur L Bowley, "The Nature and Purpose of the Measurement of Social Phenomena", 1915)

"Averages are like the economic man; they are inventions, not real. When applied to salaries they hide gaunt poverty at the lower end." (Julia Lathrop, 1919)

"Scientific laws, when we have reason to think them accurate, are different in form from the common-sense rules which have exceptions: they are always, at least in physics, either differential equations, or statistical averages." (Bertrand A Russell, "The Analysis of Matter", 1927)

"An average value is a single value within the range of the data that is used to represent all of the values in the series. Since an average is somewhere within the range of the data, it is sometimes called a measure of central value." (Frederick E Croxton & Dudley J Cowden, "Practical Business Statistics", 1937)

"An average is a single value which is taken to represent a group of values. Such a representative value may be obtained in several ways, for there are several types of averages. […] Probably the most commonly used average is the arithmetic average, or arithmetic mean." (John R Riggleman & Ira N Frisbee, "Business Statistics", 1938)

"Because they are determined mathematically instead of according to their position in the data, the arithmetic and geometric averages are not ascertained by graphic methods." (John R Riggleman & Ira N Frisbee, "Business Statistics", 1938)

"[…] statistical literacy. That is, the ability to read diagrams and maps; a 'consumer' understanding of common statistical terms, as average, percent, dispersion, correlation, and index number." (Douglas Scates, "Statistics: The Mathematics for Social Problems", 1943)

"[Disorganized complexity] is a problem in which the number of variables is very large, and one in which each of the many variables has a behavior which is individually erratic, or perhaps totally unknown. However, in spite of this helter-skelter, or unknown, behavior of all the individual variables, the system as a whole possesses certain orderly and analyzable average properties. [...] [Organized complexity is] not problems of disorganized complexity, to which statistical methods hold the key. They are all problems which involve dealing simultaneously with a sizable number of factors which are interrelated into an organic whole. They are all, in the language here proposed, problems of organized complexity." (Warren Weaver, "Science and Complexity", American Scientist Vol. 36, 1948)

"The economists, of course, have great fun - and show remarkable skill - in inventing more refined index numbers. Sometimes they use geometric averages instead of arithmetic averages (the advantage here being that the geometric average is less upset by extreme oscillations in individual items), sometimes they use the harmonic average. But these are all refinements of the basic idea of the index number [...]" (Michael J Moroney, "Facts from Figures", 1951)

"The mode would form a very poor basis for any further calculations of an arithmetical nature, for it has deliberately excluded arithmetical precision in the interests of presenting a typical result. The arithmetic average, on the other hand, excellent as it is for numerical purposes, has sacrificed its desire to be typical in favour of numerical accuracy. In such a case it is often desirable to quote both measures of central tendency." (Michael J Moroney, "Facts from Figures", 1951)

"An average does not tell the full story. It is hardly fully representative of a mass unless we know the manner in which the individual items scatter around it. A further description of the series is necessary if we are to gauge how representative the average is." (George Simpson & Fritz Kafka, "Basic Statistics", 1952)

"An average is a single value selected from a group of values to represent them in some way, a value which is supposed to stand for the whole group of which it is a part, as typical of all the values in the group." (Albert E Waugh, "Elements of Statistical Methods" 3rd Ed., 1952)

"Only when there is a substantial number of trials involved is the law of averages a useful description or prediction." (Darrell Huff, "How to Lie with Statistics", 1954)

"Place little faith in an average or a graph or a trend when those important figures are missing." (Darrell Huff, "How to Lie with Statistics", 1954)

"Every economic and social situation or problem is now described in statistical terms, and we feel that it is such statistics which give us the real basis of fact for understanding and analysing problems and difficulties, and for suggesting remedies. In the main we use such statistics or figures without any elaborate theoretical analysis; little beyond totals, simple averages and perhaps index numbers. Figures have become the language in which we describe our economy or particular parts of it, and the language in which we argue about policy." (Ely Devons, "Essays in Economics", 1961)

"The fact that index numbers attempt to measure changes of items gives rise to some knotty problems. The dispersion of a group of products increases with the passage of time, principally because some items have a long-run tendency to fall while others tend to rise. Basic changes in the demand is fundamentally responsible. The averages become less and less representative as the distance from the period increases." (Anna C Rogers, "Graphic Charts Handbook", 1961)

"Myth is more individual and expresses life more precisely than does science. Science works with concepts of averages which are far too general to do justice to the subjective variety of an individual life." (Carl G Jung, "Memories, Dreams, Reflections", 1963)

"An average is sometimes called a 'measure of central tendency' because individual values of the variable usually cluster around it. Averages are useful, however, for certain types of data in which there is little or no central tendency." (William A Spurr & Charles P Bonini, "Statistical Analysis for Business Decisions" 3rd Ed., 1967)

"The most widely used mathematical tools in the social sciences are statistical, and the prevalence of statistical methods has given rise to theories so abstract and so hugely complicated that they seem a discipline in themselves, divorced from the world outside learned journals. Statistical theories usually assume that the behavior of large numbers of people is a smooth, average 'summing-up' of behavior over a long period of time. It is difficult for them to take into account the sudden, critical points of important qualitative change. The statistical approach leads to models that emphasize the quantitative conditions needed for equilibrium - a balance of wages and prices, say, or of imports and exports. These models are ill suited to describe qualitative change and social discontinuity, and it is here that catastrophe theory may be especially helpful." (Alexander Woodcock & Monte Davis, "Catastrophe Theory", 1978)

"The arithmetic mean has another familiar property that will be useful to remember. The sum of the deviations of the values from their mean is zero, and the sum of the squared deviations of the values about the mean is a minimum. That is to say, the sum of the squared deviations is less than the sum of the squared deviations about any other value." (Charles T Clark & Lawrence L Schkade, "Statistical Analysis for Administrative Decisions", 1979)
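The two properties Clark and Schkade describe can be checked directly; a minimal Python sketch using a small made-up sample:

```python
# Sketch: checking the two quoted properties of the arithmetic mean
# on an invented sample.
values = [2, 4, 4, 4, 5, 5, 7, 9]
mean = sum(values) / len(values)  # 40 / 8 = 5.0

# Property 1: the deviations from the mean sum to zero.
dev_sum = sum(v - mean for v in values)

# Property 2: the mean minimizes the sum of squared deviations;
# any other candidate centre gives a larger sum.
def sse(center):
    return sum((v - center) ** 2 for v in values)

candidates = [mean - 1, mean - 0.1, mean + 0.1, mean + 1]
assert abs(dev_sum) < 1e-9
assert all(sse(mean) < sse(c) for c in candidates)
```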

"Averaging results, whether weighted or not, needs to be done with due caution and commonsense. Even though a measurement has a small quoted error it can still be, not to put too fine a point on it, wrong. If two results are in blatant and obvious disagreement, any average is meaningless and there is no point in performing it. Other cases may be less outrageous, and it may not be clear whether the difference is due to incompatibility or just unlucky chance." (Roger J Barlow, "Statistics: A guide to the use of statistical methods in the physical sciences", 1989)

"All the law [of large numbers] tells us is that the average of a large number of throws will be more likely than the average of a small number of throws to differ from the true average by less than some stated amount. And there will always be a possibility that the observed result will differ from the true average by a larger amount than the specified bound." (Peter L Bernstein, "Against the Gods: The Remarkable Story of Risk", 1996)

"It is a consequence of the definition of the arithmetic mean that the mean will lie somewhere between the lowest and highest values. In the unrealistic and meaningless case that all values which make up the mean are the same, all values will be equal to the average. In an unlikely and impractical case, it is possible for only one of many values to be above or below the average. By the very definition of the average, it is impossible for all values to be above average in any case." (Herbert F Spirer et al, "Misused Statistics" 2nd Ed, 1998)

"Averages, ranges, and histograms all obscure the time-order for the data. If the time-order for the data shows some sort of definite pattern, then the obscuring of this pattern by the use of averages, ranges, or histograms can mislead the user. Since all data occur in time, virtually all data will have a time-order. In some cases this time-order is the essential context which must be preserved in the presentation." (Donald J Wheeler," Understanding Variation: The Key to Managing Chaos" 2nd Ed., 2000)

"Since the average is a measure of location, it is common to use averages to compare two data sets. The set with the greater average is thought to ‘exceed’ the other set. While such comparisons may be helpful, they must be used with caution. After all, for any given data set, most of the values will not be equal to the average." (Donald J Wheeler, "Understanding Variation: The Key to Managing Chaos" 2nd Ed., 2000)

"A bar graph typically presents either averages or frequencies. It is relatively simple to present raw data (in the form of dot plots or box plots). Such plots provide much more information, and they are closer to the original data. If the bar graph categories are linked in some way - for example, doses of treatments - then a line graph will be much more informative. Very complicated bar graphs containing adjacent bars are very difficult to grasp. If the bar graph represents frequencies, and the abscissa values can be ordered, then a line graph will be much more informative and will have substantially reduced chart junk." (Gerald van Belle, "Statistical Rules of Thumb", 2002)

"If you want to hide data, try putting it into a larger group and then use the average of the group for the chart. The basis of the deceit is the endearingly innocent assumption on the part of your readers that you have been scrupulous in using a representative average: one from which individual values do not deviate all that much. In scientific or statistical circles, where audiences tend to take less on trust, the 'quality' of the average (in terms of the scatter of the underlying individual figures) is described by the standard deviation, although this figure is itself an average." (Nicholas Strange, "Smoke and Mirrors: How to bend facts and figures to your advantage", 2007)

"Prior to the discovery of the butterfly effect it was generally believed that small differences averaged out and were of no real significance. The butterfly effect showed that small things do matter. This has major implications for our notions of predictability, as over time these small differences can lead to quite unpredictable outcomes. For example, first of all, can we be sure that we are aware of all the small things that affect any given system or situation? Second, how do we know how these will affect the long-term outcome of the system or situation under study? The butterfly effect demonstrates the near impossibility of determining with any real degree of accuracy the long term outcomes of a series of events." (Elizabeth McMillan, "Complexity, Management and the Dynamics of Change: Challenges for practice", 2008)

"What is so unconventional about the statistical way of thinking? First, statisticians do not care much for the popular concept of the statistical average; instead, they fixate on any deviation from the average. They worry about how large these variations are, how frequently they occur, and why they exist. [...] Second, variability does not need to be explained by reasonable causes, despite our natural desire for a rational explanation of everything; statisticians are frequently just as happy to pore over patterns of correlation. [...] Third, statisticians are constantly looking out for missed nuances: a statistical average for all groups may well hide vital differences that exist between these groups. Ignoring group differences when they are present frequently portends inequitable treatment. [...] Fourth, decisions based on statistics can be calibrated to strike a balance between two types of errors. Predictably, decision makers have an incentive to focus exclusively on minimizing any mistake that could bring about public humiliation, but statisticians point out that because of this bias, their decisions will aggravate other errors, which are unnoticed but serious. [...] Finally, statisticians follow a specific protocol known as statistical testing when deciding whether the evidence fits the crime, so to speak. Unlike some of us, they don’t believe in miracles. In other words, if the most unusual coincidence must be contrived to explain the inexplicable, they prefer leaving the crime unsolved." (Kaiser Fung, "Numbers Rule the World", 2010)

"Having NUMBERSENSE means: (•) Not taking published data at face value; (•) Knowing which questions to ask; (•) Having a nose for doctored statistics. [...] NUMBERSENSE is that bit of skepticism, urge to probe, and desire to verify. It’s having the truffle hog’s nose to hunt the delicacies. Developing NUMBERSENSE takes training and patience. It is essential to know a few basic statistical concepts. Understanding the nature of means, medians, and percentile ranks is important. Breaking down ratios into components facilitates clear thinking. Ratios can also be interpreted as weighted averages, with those weights arranged by rules of inclusion and exclusion. Missing data must be carefully vetted, especially when they are substituted with statistical estimates. Blatant fraud, while difficult to detect, is often exposed by inconsistency." (Kaiser Fung, "Numbersense: How To Use Big Data To Your Advantage", 2013)

"A very different - and very incorrect - argument is that successes must be balanced by failures (and failures by successes) so that things average out. Every coin flip that lands heads makes tails more likely. Every red at roulette makes black more likely. […] These beliefs are all incorrect. Good luck will certainly not continue indefinitely, but do not assume that good luck makes bad luck more likely, or vice versa." (Gary Smith, "Standard Deviations", 2014)

"The indicators - through no particular fault of anyone in particular - have not kept up with the changing world. As these numbers have become more deeply embedded in our culture as guides to how we are doing, we rely on a few big averages that can never be accurate pictures of complicated systems for the very reason that they are too simple and that they are averages. And we have neither the will nor the resources to invent or refine our current indicators enough to integrate all of these changes." (Zachary Karabell, "The Leading Indicators: A short history of the numbers that rule our world", 2014)

"The search for better numbers, like the quest for new technologies to improve our lives, is certainly worthwhile. But the belief that a few simple numbers, a few basic averages, can capture the multifaceted nature of national and global economic systems is a myth. Rather than seeking new simple numbers to replace our old simple numbers, we need to tap into both the power of our information age and our ability to construct our own maps of the world to answer the questions we need answering." (Zachary Karabell, "The Leading Indicators: A short history of the numbers that rule our world", 2014)

"When a trait, such as academic or athletic ability, is measured imperfectly, the observed differences in performance exaggerate the actual differences in ability. Those who perform the best are probably not as far above average as they seem. Nor are those who perform the worst as far below average as they seem. Their subsequent performances will consequently regress to the mean." (Gary Smith, "Standard Deviations", 2014)

"The more complex the system, the more variable (risky) the outcomes. The profound implications of this essential feature of reality still elude us in all the practical disciplines. Sometimes variance averages out, but more often fat-tail events beget more fat-tail events because of interdependencies. If there are multiple projects running, outlier (fat-tail) events may also be positively correlated - one IT project falling behind will stretch resources and increase the likelihood that others will be compromised." (Paul Gibbons, "The Science of Successful Organizational Change",  2015)

"The no free lunch theorem for machine learning states that, averaged over all possible data generating distributions, every classification algorithm has the same error rate when classifying previously unobserved points. In other words, in some sense, no machine learning algorithm is universally any better than any other. The most sophisticated algorithm we can conceive of has the same average performance (over all possible tasks) as merely predicting that every point belongs to the same class. [...] the goal of machine learning research is not to seek a universal learning algorithm or the absolute best learning algorithm. Instead, our goal is to understand what kinds of distributions are relevant to the 'real world' that an AI agent experiences, and what kinds of machine learning algorithms perform well on data drawn from the kinds of data generating distributions we care about." (Ian Goodfellow et al, "Deep Learning", 2015)

"[…] average isn’t something that should be considered in isolation. Your average is only as good as the data that supports it. If your sample isn’t representative of the full population, if you cherry- picked the data, or if there are other issues with your data, your average may be misleading." (John H Johnson & Mike Gluck, "Everydata: The misinformation hidden in the little data you consume every day", 2016)

"If you’re looking at an average, you are - by definition - studying a specific sample set. If you’re comparing averages, and those averages come from different sample sets, the differences in the sample sets may well be manifested in the averages. Remember, an average is only as good as the underlying data." (John H Johnson & Mike Gluck, "Everydata: The misinformation hidden in the little data you consume every day", 2016)

"In the real world, statistical issues rarely exist in isolation. You’re going to come across cases where there’s more than one problem with the data. For example, just because you identify some sampling errors doesn’t mean there aren’t also issues with cherry picking and correlations and averages and forecasts - or simply more sampling issues, for that matter. Some cases may have no statistical issues, some may have dozens. But you need to keep your eyes open in order to spot them all." (John H Johnson & Mike Gluck, "Everydata: The misinformation hidden in the little data you consume every day", 2016)

"Just as with aggregated data, an average is a summary statistic that can tell you something about the data - but it is only one metric, and oftentimes a deceiving one at that. By taking all of the data and boiling it down to one value, an average (and other summary statistics) may imply that all of the underlying data is the same, even when it’s not." (John H Johnson & Mike Gluck, "Everydata: The misinformation hidden in the little data you consume every day", 2016)

"Keep in mind that a weighted average may be different than a simple (non- weighted) average because a weighted average - by definition - counts certain data points more heavily. When you’re thinking about an average, try to determine if it’s a simple average or a weighted average. If it’s weighted, ask yourself how it’s being weighted, and see which data points count more than others." (John H Johnson & Mike Gluck, "Everydata: The misinformation hidden in the little data you consume every day", 2016)
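Johnson and Gluck's distinction is easy to see in a toy example; the scores and weights below are invented purely for illustration:

```python
# Sketch: simple vs weighted average of two invented course scores.
scores  = [90, 50]  # exam, homework
weights = [4, 1]    # the exam counts four times as much as homework

simple   = sum(scores) / len(scores)                                   # 70.0
weighted = sum(s * w for s, w in zip(scores, weights)) / sum(weights)  # 82.0
```

Weighted toward the exam, the average jumps from 70 to 82 - which is exactly why it pays to ask how an average is being weighted.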

"Outliers make it very hard to give an intuitive interpretation of the mean, but in fact, the situation is even worse than that. For a real‐world distribution, there always is a mean (strictly speaking, you can define distributions with no mean, but they’re not realistic), and when we take the average of our data points, we are trying to estimate that mean. But when there are massive outliers, just a single data point is likely to dominate the value of the mean and standard deviation, so much more data is required to even estimate the mean, let alone make sense of it." (Field Cady, "The Data Science Handbook", 2017)

"Theoretically, the normal distribution is most famous because many distributions converge to it, if you sample from them enough times and average the results. This applies to the binomial distribution, Poisson distribution and pretty much any other distribution you’re likely to encounter (technically, any one for which the mean and standard deviation are finite)." (Field Cady, "The Data Science Handbook", 2017)

"A recurring theme in machine learning is combining predictions across multiple models. There are techniques called bagging and boosting which seek to tweak the data and fit many estimates to it. Averaging across these can give a better prediction than any one model on its own. But here a serious problem arises: it is then very hard to explain what the model is (often referred to as a 'black box'). It is now a mixture of many, perhaps a thousand or more, models." (Robert Grant, "Data Visualization: Charts, Maps and Interactive Graphics", 2019)

"Mean-averages can be highly misleading when the raw data do not form a symmetric pattern around a central value but instead are skewed towards one side [...], typically with a large group of standard cases but with a tail of a few either very high (for example, income) or low (for example, legs) values." (David Spiegelhalter, "The Art of Statistics: Learning from Data", 2019)

"Random forests are essentially an ensemble of trees. They use many short trees, fitted to multiple samples of the data, and the predictions are averaged for each observation. This helps to get around a problem that trees, and many other machine learning techniques, are not guaranteed to find optimal models, in the way that linear regression is. They do a very challenging job of fitting non-linear predictions over many variables, even sometimes when there are more variables than there are observations. To do that, they have to employ 'greedy algorithms', which find a reasonably good model but not necessarily the very best model possible." (Robert Grant, "Data Visualization: Charts, Maps and Interactive Graphics", 2019)

"Unfortunately, when an ‘average’ is reported in the media, it is often unclear whether this should be interpreted as the mean or median." (David Spiegelhalter, "The Art of Statistics: Learning from Data", 2019)
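The mean/median distinction Spiegelhalter raises shows up clearly on a skewed, made-up income sample:

```python
# Sketch: mean vs median on a right-skewed (made-up) income sample.
from statistics import mean, median

incomes = [25_000, 28_000, 30_000, 32_000, 35_000, 40_000, 1_000_000]

m_mean = mean(incomes)    # 170000.0 - pulled far up by the single outlier
m_med  = median(incomes)  # 32000 - the middle value, unaffected by it
```

A report calling either number "the average income" would be technically defensible, yet the two tell very different stories.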

"Average deviation is the average amount of scatter of the items in a distribution from either the mean or the median, ignoring the signs of the deviations. The average that is taken of the scatter is an arithmetic mean, which accounts for the fact that this measure is often called the mean deviation." (Charles T Clark & Lawrence L Schkade)
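The measure Clark and Schkade define can be sketched in a few lines of Python (sample data invented):

```python
# Sketch: mean (average) deviation about the mean and about the median.
from statistics import mean, median

data = [2, 4, 6, 8, 20]

def mean_deviation(xs, center):
    # average absolute scatter around a chosen centre, signs ignored
    return sum(abs(x - center) for x in xs) / len(xs)

md_mean   = mean_deviation(data, mean(data))    # about the mean (8.0) -> 4.8
md_median = mean_deviation(data, median(data))  # about the median (6) -> 4.4
```

The median minimizes mean absolute deviation, which is why the deviation about the median (4.4) comes out smaller than the one about the mean (4.8).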

"While the individual man is an insoluble puzzle, in the aggregate he becomes a mathematical certainty. You can, for example, never foretell what any one man will do, but you can say with precision what an average number will be up to. Individuals vary, but percentages remain constant. So says the statistician." (Sir Arthur C Doyle)

05 December 2015

🪙Business Intelligence: Indicators (Just the Quotes)

"If we view organizations as adaptive, problem-solving structures, then inferences about effectiveness have to be made, not from static measures of output, but on the basis of the processes through which the organization approaches problems. In other words, no single measurement of organizational efficiency or satisfaction - no single time-slice of organizational performance can provide valid indicators of organizational health." (Warren G Bennis, "General Systems Yearbook", 1962)

"The more any quantitative social indicator is used for social decision-making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor." (Donald T Campbell, "Assessing the impact of planned social change", 1976)

"Indicators tend to direct your attention toward what they are monitoring. It is like riding a bicycle: you will probably steer it where you are looking. If, for example, you start measuring your inventory levels carefully, you are likely to take action to drive your inventory levels down, which is good up to a point. But your inventories could become so lean that you can’t react to changes in demand without creating shortages. So because indicators direct one’s activities, you should guard against overreacting. This you can do by pairing indicators, so that together both effect and counter-effect are measured. Thus, in the inventory example, you need to monitor both inventory levels and the incidence of shortages. A rise in the latter will obviously lead you to do things to keep inventories from becoming too low." (Andrew S Grove, "High Output Management", 1983)

"So because indicators direct one’s activities, you should guard against overreacting. This you can do by pairing indicators, so that together both effect and counter-effect are measured. […] In sum, joint monitoring is likely to keep things in the optimum middle ground." (Andrew S Grove, "High Output Management", 1983)

"The first rule is that a measurement - any measurement - is better than none. But a genuinely effective indicator will cover the output of the work unit and not simply the activity involved. […] If you do not systematically collect and maintain an archive of indicators, you will have to do an awful lot of quick research to get the information you need, and by the time you have it, the problem is likely to have gotten worse." (Andrew S Grove, "High Output Management", 1983)

"The number of possible indicators you can choose is virtually limitless, but for any set of them to be useful, you have to focus each indicator on a specific operational goal. […] Put another way, which five pieces of information would you want to look at each day, immediately upon arriving at your office?" (Andrew S Grove, "High Output Management", 1983)

"All good KPIs that I have come across, that have made a difference, had the CEO’s constant attention, with daily calls to the relevant staff. [...] A KPI should tell you about what action needs to take place. [...] A KPI is deep enough in the organization that it can be tied down to an individual. [...] A good KPI will affect most of the core CSFs and more than one BSC perspective. [...] A good KPI has a flow on effect." (David Parmenter, "Pareto’s 80/20 Rule for Corporate Accountants", 2007)

"If the KPIs you currently have are not creating change, throw them out because there is a good chance that they may be wrong. They are probably measures that were thrown together without the in-depth research and investigation KPIs truly deserve." (David Parmenter, "Pareto’s 80/20 Rule for Corporate Accountants", 2007)

"Key performance indicators (KPIs) are the vital navigation instruments used by managers to understand whether their business is on a successful voyage or whether it is veering off the prosperous path. The right set of indicators will shine light on performance and highlight areas that need attention. ‘What gets measured gets done’ and ‘if you can’t measure it, you can’t manage it’ are just two of the popular sayings used to highlight the critical importance of metrics. Without the right KPIs managers are sailing blind." (Bernard Marr, "Key Performance Indicators (KPI): The 75 measures every manager needs to know", 2011)

"KRAs and KPIs KRA and KPI are two confusing acronyms for an approach commonly recommended for identifying a person’s major job responsibilities. KRA stands for key result areas; KPI stands for key performance indicators. As academics and consultants explain this jargon, key result areas are the primary components or parts of the job in which a person is expected to deliver results. Key performance indicators represent the measures that will be used to determine how well the individual has performed. In other words, KRAs tell where the individual is supposed to concentrate her attention; KPIs tell how her performance in the specified areas should be measured. Probably few parts of the performance appraisal process create more misunderstanding and bewilderment than do the notion of KRAs and KPIs. The reason is that so much of the material written about KPIs and KRAs is both." (Dick Grote, "How to Be Good at Performance Appraisals: Simple, Effective, Done Right", 2011)

"A statistical index has all the potential pitfalls of any descriptive statistic - plus the distortions introduced by combining multiple indicators into a single number. By definition, any index is going to be sensitive to how it is constructed; it will be affected both by what measures go into the index and by how each of those measures is weighted." (Charles Wheelan, "Naked Statistics: Stripping the Dread from the Data", 2012)
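Wheelan's point about weight sensitivity can be made concrete with two invented indicators and two weighting schemes:

```python
# Sketch: the same two (invented) indicators combined into an index
# under different weights can reverse a ranking.
countries = {"A": (80, 40), "B": (50, 70)}  # (indicator 1, indicator 2)

def index(scores, w1, w2):
    return scores[0] * w1 + scores[1] * w2

# Scheme 1 favours indicator 1; scheme 2 favours indicator 2.
rank1 = sorted(countries, key=lambda c: index(countries[c], 0.7, 0.3), reverse=True)
rank2 = sorted(countries, key=lambda c: index(countries[c], 0.3, 0.7), reverse=True)
# rank1 puts A first; rank2 puts B first - nothing changed but the weights.
```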

"Even if you have a solid indicator of what you are trying to measure and manage, the challenges are not over. The good news is that 'managing by statistics' can change the underlying behavior of the person or institution being managed for the better. If you can measure the proportion of defective products coming off an assembly line, and if those defects are a function of things happening at the plant, then some kind of bonus for workers that is tied to a reduction in defective products would presumably change behavior in the right kinds of ways. Each of us responds to incentives (even if it is just praise or a better parking spot). Statistics measure the outcomes that matter; incentives give us a reason to improve those outcomes." (Charles Wheelan, "Naked Statistics: Stripping the Dread from the Data", 2012)

"Once these different measures of performance are consolidated into a single number, that statistic can be used to make comparisons […] The advantage of any index is that it consolidates lots of complex information into a single number. We can then rank things that otherwise defy simple comparison […] Any index is highly sensitive to the descriptive statistics that are cobbled together to build it, and to the weight given to each of those components. As a result, indices range from useful but imperfect tools to complete charades." (Charles Wheelan, "Naked Statistics: Stripping the Dread from the Data", 2012)

"Defining an indicator as lagging, coincident, or leading is connected to another vital notion: the business cycle. Indicators are lagging or leading based on where economists believe we are in the business cycle: whether we are heading into a recession or emerging from one." (Zachary Karabell, "The Leading Indicators: A short history of the numbers that rule our world", 2014)

"[…] economics is a profession grounded in the belief that 'the economy' is a machine and a closed system. The more clearly that machine is understood, the more its variables are precisely measured, the more we will be able to manage and steer it as we choose, avoiding the frenetic expansions and sharp contractions. With better indicators would come better policy, and with better policy, states would be less likely to fall into depression and risk collapse." (Zachary Karabell, "The Leading Indicators: A short history of the numbers that rule our world", 2014)

"Our needs going forward will be best served by how we make use of not just this data but all data. We live in an era of Big Data. The world has seen an explosion of information in the past decades, so much so that people and institutions now struggle to keep pace. In fact, one of the reasons for the attachment to the simplicity of our indicators may be an inverse reaction to the sheer and bewildering volume of information most of us are bombarded by on a daily basis. […] The lesson for a world of Big Data is that in an environment with excessive information, people may gravitate toward answers that simplify reality rather than embrace the sheer complexity of it." (Zachary Karabell, "The Leading Indicators: A short history of the numbers that rule our world", 2014)

"Statistics are meaningless unless they exist in some context. One reason why the indicators have become more central and potent over time is that the longer they have been kept, the easier it is to find useful patterns and points of reference." (Zachary Karabell, "The Leading Indicators: A short history of the numbers that rule our world", 2014)

"The indicators - through no particular fault of anyone in particular - have not kept up with the changing world. As these numbers have become more deeply embedded in our culture as guides to how we are doing, we rely on a few big averages that can never be accurate pictures of complicated systems for the very reason that they are too simple and that they are averages. And we have neither the will nor the resources to invent or refine our current indicators enough to integrate all of these changes." (Zachary Karabell, "The Leading Indicators: A short history of the numbers that rule our world", 2014)

"We don’t need new indicators that replace old simple numbers with new simple numbers. We need instead bespoke indicators, tailored to the specific needs and specific questions of governments, businesses, communities, and individuals." (Zachary Karabell, "The Leading Indicators: A short history of the numbers that rule our world", 2014)

"Yet our understanding of the world is still framed by our leading indicators. Those indicators define the economy, and what they say becomes the answer to the simple question 'Are we doing well?'" (Zachary Karabell, "The Leading Indicators: A short history of the numbers that rule our world", 2014)

"[…] an overall green status indicator doesn’t mean anything most of the time. All it says is that the things under measurement seem okay. But there always will be many more things not under measurement. To celebrate green indicators is to ignore the unknowns. […] The tendency to roll up metrics into dashboards promotes ignorance of the real situation on the ground. We forget that we only see what is under measurement. We only act when something is not green." (Sriram Narayan, "Agile IT Organization Design: For Digital Transformation and Continuous Delivery", 2015)

"Financial measures are a quantification of an activity that has taken place; we have simply placed a value on the activity. Thus, behind every financial measure is an activity. I call financial measures result indicators, a summary measure. It is the activity that you will want more or less of. It is the activity that drives the dollars, pounds, or yen. Thus financial measures cannot possibly be KPIs." (David Parmenter, "Key Performance Indicators: Developing, implementing, and using winning KPIs" 3rd Ed., 2015)

"Key performance indicators (KPIs) are those indicators that focus on the aspects of organizational performance that are the most critical for the current and future success of the organization." (David Parmenter, "Key Performance Indicators: Developing, implementing, and using winning KPIs" 3rd Ed., 2015)

"Key Performance Indicators (KPIs) in many organizations are a broken tool. The KPIs are often a random collection prepared with little expertise, signifying nothing. [...] KPIs should be measures that link daily activities to the organization’s critical success factors (CSFs), thus supporting an alignment of effort within the organization in the intended direction." (David Parmenter, "Key Performance Indicators: Developing, implementing, and using winning KPIs" 3rd Ed., 2015)

"Most organizational measures are very much past indicators measuring events of the last month or quarter. These indicators cannot be and never were KPIs." (David Parmenter, "Key Performance Indicators: Developing, implementing, and using winning KPIs" 3rd Ed., 2015)

"We need indicators of overall performance that need only be reviewed on a monthly or bimonthly basis. These measures need to tell the story about whether the organization is being steered in the right direction at the right speed, whether the customers and staff are happy, and whether we are acting in a responsible way by being environmentally friendly. These measures are called key result indicators (KRIs)." (David Parmenter, "Key Performance Indicators: Developing, implementing, and using winning KPIs" 3rd Ed., 2015)

"Indicators represent a way of 'distilling' the larger volume of data collected by organizations. As data become bigger and bigger, due to the greater span of control or growing complexity of operations, data management becomes increasingly difficult. Actions and decisions are greatly influenced by the nature, use and time horizon (e.g., short or long-term) of indicators." (Fiorenzo Franceschini et al, "Designing Performance Measurement Systems: Theory and Practice of Key Performance Indicators", 2019)

"Indicators take on the role of real 'conceptual technologies', capable of driving organizational management in intangible terms, conditioning the 'what' to focus and the 'how'; in other words, they become the beating heart of the management, operational and technological processes." (Fiorenzo Franceschini et al, "Designing Performance Measurement Systems: Theory and Practice of Key Performance Indicators", 2019)

"Monitoring a process requires identifying specific activities, responsibilities and indicators for testing effectiveness and efficiency. Effectiveness means setting the right goals and objectives, making sure that they are properly accomplished (doing the right things); effectiveness is measured comparing the achieved results with target objectives. On the other hand, efficiency means getting the most (output) from the available (input) resources (doing things right): efficiency defines a link between process performance and available resources." (Fiorenzo Franceschini et al, "Designing Performance Measurement Systems: Theory and Practice of Key Performance Indicators", 2019)

"People do care about how they are measured. What can we do about this? If you are in the position to measure something, think about whether measuring it will change people’s behaviors in ways that undermine the value of your results. If you are looking at quantitative indicators that others have compiled, ask yourself: Are these numbers measuring what they are intended to measure? Or are people gaming the system and rendering this measure useless?" (Carl T Bergstrom & Jevin D West, "Calling Bullshit: The Art of Skepticism in a Data-Driven World", 2020)

"A KPI is a performance measure that demonstrates how effectively an organisation is achieving its critical objectives. They are used to track performance over a period of time to ensure the organisation is heading in the desired direction, and are quantifiable to guide whether activities need to be dialled up or down, resources adjusted or management resource focused on understanding what is in play that may be holding back the organisation." (Ian Wallis, "Data Strategy: From definition to execution", 2021)

"The KPI juggernaut has been misused and abused in too many organisations to the extent it has devalued the concept of KPIs. KPIs used well - the ten things that really matter to an organisation - can, in my experience, be a real galvanising force to get focus and attention put in those areas which really can make a difference. The rest is a distraction, there through some misplaced view that more adds value when actually it detracts through losing the focus from where it needs to be." (Ian Wallis, "Data Strategy: From definition to execution", 2021)

04 December 2015

🪙Business Intelligence: Measures/Metrics (Just the Quotes)

"The most important and frequently stressed prescription for avoiding pitfalls in the use of economic statistics, is that one should find out before using any set of published statistics, how they have been collected, analysed and tabulated. This is especially important, as you know, when the statistics arise not from a special statistical enquiry, but are a by-product of law or administration. Only in this way can one be sure of discovering what exactly it is that the figures measure, avoid comparing the non-comparable, take account of changes in definition and coverage, and as a consequence not be misled into mistaken interpretations and analysis of the events which the statistics portray." (Ely Devons, "Essays in Economics", 1961)

"If we view organizations as adaptive, problem-solving structures, then inferences about effectiveness have to be made, not from static measures of output, but on the basis of the processes through which the organization approaches problems. In other words, no single measurement of organizational efficiency or satisfaction - no single time-slice of organizational performance can provide valid indicators of organizational health." (Warren G Bennis, "General Systems Yearbook", 1962)

"[Management by objectives is] a process whereby the superior and the subordinate managers of an enterprise jointly identify its common goals, define each individual's major areas of responsibility in terms of the results expected of him, and use these measures as guides for operating the unit and assessing the contribution of each of its members." (Robert House, "Administrative Science Quarterly", 1971)

"A mature science, with respect to the matter of errors in variables, is not one that measures its variables without error, for this is impossible. It is, rather, a science which properly manages its errors, controlling their magnitudes and correctly calculating their implications for substantive conclusions." (Otis D Duncan, "Introduction to Structural Equation Models", 1975)

"Any observed statistical regularity will tend to collapse once pressure is placed upon it for control purposes." (Charles Goodhart, "Problems of Monetary Management: the U.K. Experience", 1975)

"The more any quantitative social indicator is used for social decision-making, the more subject it will be to corruption pressures and the more apt it will be to distort and corrupt the social processes it is intended to monitor." (Donald T Campbell, "Assessing the impact of planned social change", 1976)

"Reengineering is the fundamental rethinking and radical redesign of business processes to achieve dramatic improvements in critical contemporary measures of performance such as cost, quality, service and speed." (James A Champy & Michael M Hammer, "Reengineering the Corporation", 1993)

"Industrial managers faced with a problem in production control invariably expect a solution to be devised that is simple and unidimensional. They seek the variable in the situation whose control will achieve control of the whole system: tons of throughput, for example. Business managers seek to do the same thing in controlling a company; they hope they have found the measure of the entire system when they say 'everything can be reduced to monetary terms'." (Stanford Beer, "Decision and Control", 1994)

"A strategy is a set of hypotheses about cause and effect. The measurement system should make the relationships (hypotheses) among objectives (and measures) in the various perspectives explicit so that they can be managed and validated. The chain of cause and effect should pervade all four perspectives of a Balanced Scorecard." (Robert S Kaplan & David P Norton, "The Balanced Scorecard", Harvard Business Review, 1996)

"The Balanced Scorecard has its greatest impact when it is deployed to drive organizational change. [...] The Balanced Scorecard is primarily a mechanism for strategy implementation, not for strategy formulation. It can accommodate either approach for formulating business unit strategy-starting from the customer perspective, or starting from excellent internal-business-process capabilities. For whatever approach that SBU senior executives use to formulate their strategy, the Balanced Scorecard will provide an invaluable mechanism for translating that strategy into specific objectives, measures, and targets, and monitoring the implementation of that strategy during subsequent periods." (Robert S Kaplan & David P Norton, "The Balanced Scorecard", Harvard Business Review, 1996)

"The Balanced Scorecard translates mission and strategy into objectives and measures, organized into four different perspectives: financial, customer, internal business process, and learning and growth. The scorecard provides a framework, a language, to communicate mission and strategy; it uses measurement to inform employees about the drivers of current and future success." (Robert S Kaplan & David P Norton, "The Balanced Scorecard", Harvard Business Review, 1996)

"When a measure becomes a target, it ceases to be a good measure." (Marilyn Strathern, "‘Improving ratings’: audit in the British University system", 1997)

"Since the average is a measure of location, it is common to use averages to compare two data sets. The set with the greater average is thought to ‘exceed’ the other set. While such comparisons may be helpful, they must be used with caution. After all, for any given data set, most of the values will not be equal to the average." (Donald J Wheeler, "Understanding Variation: The Key to Managing Chaos" 2nd Ed., 2000)

"First, good statistics are based on more than guessing. [...] Second, good statistics are based on clear, reasonable definitions. Remember, every statistic has to define its subject. Those definitions ought to be clear and made public. [...] Third, good statistics are based on clear, reasonable measures. Again, every statistic involves some sort of measurement; while all measures are imperfect, not all flaws are equally serious. [...] Finally, good statistics are based on good samples." (Joel Best, "Damned Lies and Statistics: Untangling Numbers from the Media, Politicians, and Activists", 2001)

"Statistics depend on collecting information. If questions go unasked, or if they are asked in ways that limit responses, or if measures count some cases but exclude others, information goes ungathered, and missing numbers result. Nevertheless, choices regarding which data to collect and how to go about collecting the information are inevitable." (Joel Best, "More Damned Lies and Statistics: How numbers confuse public issues", 2004)

"If the KPIs you currently have are not creating change, throw them out because there is a good chance that they may be wrong. They are probably measures that were thrown together without the in-depth research and investigation KPIs truly deserve." (David Parmenter, "Pareto’s 80/20 Rule for Corporate Accountants", 2007)

"Key performance indicators (KPIs) are the vital navigation instruments used by managers to understand whether their business is on a successful voyage or whether it is veering off the prosperous path. The right set of indicators will shine light on performance and highlight areas that need attention. ‘What gets measured gets done’ and ‘if you can’t measure it, you can’t manage it’ are just two of the popular sayings used to highlight the critical importance of metrics. Without the right KPIs managers are sailing blind." (Bernard Marr, "Key Performance Indicators (KPI): The 75 measures every manager needs to know", 2011)

"A statistical index has all the potential pitfalls of any descriptive statistic - plus the distortions introduced by combining multiple indicators into a single number. By definition, any index is going to be sensitive to how it is constructed; it will be affected both by what measures go into the index and by how each of those measures is weighted." (Charles Wheelan, "Naked Statistics: Stripping the Dread from the Data", 2012)

"Even if you have a solid indicator of what you are trying to measure and manage, the challenges are not over. The good news is that 'managing by statistics' can change the underlying behavior of the person or institution being managed for the better. If you can measure the proportion of defective products coming off an assembly line, and if those defects are a function of things happening at the plant, then some kind of bonus for workers that is tied to a reduction in defective products would presumably change behavior in the right kinds of ways. Each of us responds to incentives (even if it is just praise or a better parking spot). Statistics measure the outcomes that matter; incentives give us a reason to improve those outcomes." (Charles Wheelan, "Naked Statistics: Stripping the Dread from the Data", 2012)

"Once these different measures of performance are consolidated into a single number, that statistic can be used to make comparisons […] The advantage of any index is that it consolidates lots of complex information into a single number. We can then rank things that otherwise defy simple comparison […] Any index is highly sensitive to the descriptive statistics that are cobbled together to build it, and to the weight given to each of those components. As a result, indices range from useful but imperfect tools to complete charades." (Charles Wheelan, "Naked Statistics: Stripping the Dread from the Data", 2012)

"No subjective metric can escape strategic gaming [...] The possibility of mischief is bottomless. Fighting ratings is fruitless, as they satisfy a very human need. If one scheme is beaten down, another will take its place and wear its flaws. Big Data just deepens the danger. The more complex the rating formulas, the more numerous the opportunities there are to dress up the numbers. The larger the data sets, the harder it is to audit them." (Kaiser Fung, "Numbersense: How To Use Big Data To Your Advantage", 2013)

"The urge to tinker with a formula is a hunger that keeps coming back. Tinkering almost always leads to more complexity. The more complicated the metric, the harder it is for users to learn how to affect the metric, and the less likely it is to improve it." (Kaiser Fung, "Numbersense: How To Use Big Data To Your Advantage", 2013)

"Until a new metric generates a body of data, we cannot test its usefulness. Lots of novel measures hold promise only on paper." (Kaiser Fung, "Numbersense: How To Use Big Data To Your Advantage", 2013)

"[…] an overall green status indicator doesn’t mean anything most of the time. All it says is that the things under measurement seem okay. But there always will be many more things not under measurement. To celebrate green indicators is to ignore the unknowns. […] The tendency to roll up metrics into dashboards promotes ignorance of the real situation on the ground. We forget that we only see what is under measurement. We only act when something is not green." (Sriram Narayan, "Agile IT Organization Design: For Digital Transformation and Continuous Delivery", 2015)

"Financial measures are a quantification of an activity that has taken place; we have simply placed a value on the activity. Thus, behind every financial measure is an activity. I call financial measures result indicators, a summary measure. It is the activity that you will want more or less of. It is the activity that drives the dollars, pounds, or yen. Thus financial measures cannot possibly be KPIs." (David Parmenter, "Key Performance Indicators: Developing, implementing, and using winning KPIs" 3rd Ed., 2015)

"'Getting it right the first time' is a rare achievement, and ascertaining the organization’s winning KPIs and associated reports is no exception. The performance measure framework and associated reporting is just like a piece of sculpture: you can be criticized on taste and content, but you can’t be wrong. The senior management team and KPI project team need to ensure that the project has a just-do-it culture, not one in which every step and measure is debated as part of an intellectual exercise." (David Parmenter, "Key Performance Indicators: Developing, implementing, and using winning KPIs" 3rd Ed., 2015)

"In order to get measures to drive performance, a reporting framework needs to be developed at all levels within the organization." (David Parmenter, "Key Performance Indicators: Developing, implementing, and using winning KPIs" 3rd Ed., 2015)

"Most organizational measures are very much past indicators measuring events of the last month or quarter. These indicators cannot be and never were KPIs." (David Parmenter, "Key Performance Indicators: Developing, implementing, and using winning KPIs" 3rd Ed., 2015)

"Rolling up fine-grained metrics to create high-level dashboards puts pressure on teams to keep the fine-grained metrics green even when it might not be the best use of their time." (Sriram Narayan, "Agile IT Organization Design: For Digital Transformation and Continuous Delivery", 2015)

"Scaling supervision using metrics is one thing; scaling results is quite another. The former doesn’t automatically ensure the latter." (Sriram Narayan, "Agile IT Organization Design: For Digital Transformation and Continuous Delivery", 2015)

"We need indicators of overall performance that need only be reviewed on a monthly or bimonthly basis. These measures need to tell the story about whether the organization is being steered in the right direction at the right speed, whether the customers and staff are happy, and whether we are acting in a responsible way by being environmentally friendly. These measures are called key result indicators (KRIs)." (David Parmenter, "Key Performance Indicators: Developing, implementing, and using winning KPIs" 3rd Ed., 2015)

"Unfortunately, setting the scale at zero is the best recipe for creating dull charts, in both senses of the word: boring and with little variation. The solution is not to break the scale, but rather to find a similar message that can be communicated using alternative metrics." (Jorge Camões, "Data at Work: Best practices for creating effective charts and information graphics in Microsoft Excel", 2016)

"GIGO is a famous saying coined by early computer scientists: garbage in, garbage out. At the time, people would blindly put their trust into anything a computer output indicated because the output had the illusion of precision and certainty. If a statistic is composed of a series of poorly defined measures, guesses, misunderstandings, oversimplifications, mismeasurements, or flawed estimates, the resulting conclusion will be flawed." (Daniel J Levitin, "Weaponized Lies", 2017)

"To be any good, a sample has to be representative. A sample is representative if every person or thing in the group you’re studying has an equally likely chance of being chosen. If not, your sample is biased. […] The job of the statistician is to formulate an inventory of all those things that matter in order to obtain a representative sample. Researchers have to avoid the tendency to capture variables that are easy to identify or collect data on - sometimes the things that matter are not obvious or are difficult to measure." (Daniel J Levitin, "Weaponized Lies", 2017)

"Statistical metrics can show us facts and trends that would be impossible to see in any other way, but often they’re used as a substitute for relevant experience, by managers or politicians without specific expertise or a close-up view." (Tim Harford, "The Data Detective: Ten easy rules to make sense of statistics", 2020)

02 December 2015

🪙Business Intelligence: Reporting (Just the Quotes)

"A man's judgment cannot be better than the information on which he has based it. Give him no news, or present him only with distorted and incomplete data, with ignorant, sloppy, or biased reporting, with propaganda and deliberate falsehoods, and you destroy his whole reasoning process and make him somewhat less than a man." (Arthur H Sulzberger, [speech] 1948)

"The secret language of statistics, so appealing in a fact-minded culture, is employed to sensationalize, inflate, confuse, and oversimplify. Statistical methods and statistical terms are necessary in reporting the mass data of social and economic trends, business conditions, 'opinion' polls, the census. But without writers who use the words with honesty and understanding and readers who know what they mean, the result can only be semantic nonsense." (Darell Huff, "How to Lie with Statistics", 1954)

"To be worth much, a report based on sampling must use a representative sample, which is one from which every source of bias has been removed." (Darell Huff, "How to Lie with Statistics", 1954)

"It is probable that one day we shall begin to draw organization charts as a series of linked groups rather than as a hierarchical structure of individual 'reporting' relationships." (Douglas McGregor, "The Human Side of Enterprise", 1960)

"[...] as the planning process proceeds to a specific financial or marketing state, it is usually discovered that a considerable body of 'numbers' is missing, but needed numbers for which there has been no regular system of collection and reporting; numbers that must be collected outside the firm in some cases. This serendipity usually pays off in a much better management information system in the form of reports which will be collected and reviewed routinely." (William H. Franklin Jr., Financial Strategies, 1987)

"Intangible assets [...] surpass physical assets in most business enterprises, both in value and contribution to growth, yet they are routinely expensed in the financial reports and hence remain absent from corporate balance sheets. This asymmetric treatment of capitalizing (considering as assets) physical and financial investment while expensing intangibles leads to biased and deficient reporting of firms’ performance and value." (Baruch Lev, "Intangibles: Management, Measurement, and Reporting", 2000)

"Project planning is the key to effective project management. Detailed and accurate planning of a project produces the managerial information that is the basis of project justification (costs, benefits, strategic impact, etc.) and the defining of the business drivers (scope, objectives) that form the context for the technical solution. In addition, project planning also produces the project schedules and resource allocations that are the framework for the other project management processes: tracking, reporting, and review." (Rob Thomsett, "Radical Project Management", 2002)

"Many management reports are not a management tool; they are merely memorandums of information. As a management tool, management reports should encourage timely action in the right direction, by reporting on those activities the Board, management, and staff need to focus on. The old adage 'what gets measured gets done' still holds true." (David Parmenter, "Pareto’s 80/20 Rule for Corporate Accountants", 2007)

"Reporting to the Board is a classic 'catch-22' situation. Boards complain about getting too much information too late, and management complains that up to 20% of their time is tied up in the Board reporting process. Boards obviously need to ascertain whether management is steering the ship correctly and the state of the crew and customers before they can relax and 'strategize' about future initiatives. The process of assessing the current status of the organization from the most recent Board report is where the principal problem lies. Board reporting needs to occur more efficiently and effectively for both the Board and management." (David Parmenter, "Pareto’s 80/20 Rule for Corporate Accountants", 2007)

"Readability in visualization helps people interpret data and make conclusions about what the data has to say. Embed charts in reports or surround them with text, and you can explain results in detail. However, take a visualization out of a report or disconnect it from text that provides context (as is common when people share graphics online), and the data might lose its meaning; or worse, others might misinterpret what you tried to show." (Nathan Yau, "Data Points: Visualization That Means Something", 2013)

"Another way to secure statistical significance is to use the data to discover a theory. Statistical tests assume that the researcher starts with a theory, collects data to test the theory, and reports the results - whether statistically significant or not. Many people work in the other direction, scrutinizing the data until they find a pattern and then making up a theory that fits the pattern." (Gary Smith, "Standard Deviations", 2014)

"These practices - selective reporting and data pillaging - are known as data grubbing. The discovery of statistical significance by data grubbing shows little other than the researcher’s endurance. We cannot tell whether a data grubbing marathon demonstrates the validity of a useful theory or the perseverance of a determined researcher until independent tests confirm or refute the finding. But more often than not, the tests stop there. After all, you won’t become a star by confirming other people’s research, so why not spend your time discovering new theories? The data-grubbed theory consequently sits out there, untested and unchallenged." (Gary Smith, "Standard Deviations", 2014)

"A dashboard is like the executive summary of a report. We read executive summaries and skip the body of the report if the summary is more or less in line with our expectations. Trouble is, measurement is never exhaustive. It is only when we dive in that we realize what areas may have been missed." (Sriram Narayan, "Agile IT Organization Design: For Digital Transformation and Continuous Delivery", 2015)

"'Getting it right the first time' is a rare achievement, and ascertaining the organization’s winning KPIs and associated reports is no exception. The performance measure framework and associated reporting is just like a piece of sculpture: you can be criticized on taste and content, but you can’t be wrong. The senior management team and KPI project team need to ensure that the project has a just-do-it culture, not one in which every step and measure is debated as part of an intellectual exercise." (David Parmenter, "Key Performance Indicators: Developing, implementing, and using winning KPIs" 3rd Ed., 2015)

"In order to get measures to drive performance, a reporting framework needs to be developed at all levels within the organization." (David Parmenter, "Key Performance Indicators: Developing, implementing, and using winning KPIs" 3rd Ed., 2015)

"Statistics, because they are numbers, appear to us to be cold, hard facts. It seems that they represent facts given to us by nature and it’s just a matter of finding them. But it’s important to remember that people gather statistics. People choose what to count, how to go about counting, which of the resulting numbers they will share with us, and which words they will use to describe and interpret those numbers. Statistics are not facts. They are interpretations. And your interpretation may be just as good as, or better than, that of the person reporting them to you." (Daniel J Levitin, "Weaponized Lies", 2017)

🪙Business Intelligence: Analytics (Just the Quotes)

"Data are essential, but performance improvements and competitive advantage arise from analytics models that allow managers to predict and optimize outcomes. More important, the most effective approach to building a model rarely starts with the data; instead it originates with identifying the business opportunity and determining how the model can improve performance." (Dominic Barton & David Court, "Making Advanced Analytics Work for You", 2012) 

"Even with simple and usable models, most organizations will need to upgrade their analytical skills and literacy. Managers must come to view analytics as central to solving problems and identifying opportunities - to make it part of the fabric of daily operations." (Dominic Barton & David Court, "Making Advanced Analytics Work for You", 2012)

"There is another important distinction pertaining to mining data: the difference between (1) mining the data to find patterns and build models, and (2) using the results of data mining. Students often confuse these two processes when studying data science, and managers sometimes confuse them when discussing business analytics. The use of data mining results should influence and inform the data mining process itself, but the two should be kept distinct." (Foster Provost & Tom Fawcett, "Data Science for Business", 2013)

"It is important to remember that predictive data analytics models built using machine learning techniques are tools that we can use to help make better decisions within an organization and are not an end in themselves. It is paramount that, when tasked with creating a predictive model, we fully understand the business problem that this model is being constructed to address and ensure that it does address it." (John D Kelleher et al, "Fundamentals of Machine Learning for Predictive Data Analytics: Algorithms, worked examples, and case studies", 2015)

"Machine learning takes many different forms and goes by many different names: pattern recognition, statistical modeling, data mining, knowledge discovery, predictive analytics, data science, adaptive systems, self-organizing systems, and more. Each of these is used by different communities and has different associations. Some have a long half-life, some less so." (Pedro Domingos, "The Master Algorithm", 2015)

"The human side of analytics is the biggest challenge to implementing big data." (Paul Gibbons, "The Science of Successful Organizational Change", 2015)

"One important thing to bear in mind about the outputs of data science and analytics is that in the vast majority of cases they do not uncover hidden patterns or relationships as if by magic, and in the case of predictive analytics they do not tell us exactly what will happen in the future. Instead, they enable us to forecast what may come. In other words, once we have carried out some modelling there is still a lot of work to do to make sense out of the results obtained, taking into account the constraints and assumptions in the model, as well as considering what an acceptable level of reliability is in each scenario." (Jesús Rogel-Salazar, "Data Science and Analytics with Python", 2017)

"One of the biggest truths about the real–time analytics is that nothing is actually real–time; it's a myth. In reality, it's close to real–time. Depending upon the performance and ability of a solution and the reduction of operational latencies, the analytics could be close to real–time, but, while day-by-day we are bridging the gap between real–time and near–real–time, it's practically impossible to eliminate the gap due to computational, operational, and network latencies." (Shilpi Saxena & Saurabh Gupta, "Practical Real-time Data Processing and Analytics", 2017)

"The tension between bias and variance, simplicity and complexity, or underfitting and overfitting is an area in the data science and analytics process that can be closer to a craft than a fixed rule. The main challenge is that not only is each dataset different, but also there are data points that we have not yet seen at the moment of constructing the model. Instead, we are interested in building a strategy that enables us to tell something about data from the sample used in building the model." (Jesús Rogel-Salazar, "Data Science and Analytics with Python", 2017) 

"Big data is revolutionizing the world around us, and it is easy to feel alienated by tales of computers handing down decisions made in ways we don’t understand. I think we’re right to be concerned. Modern data analytics can produce some miraculous results, but big data is often less trustworthy than small data. Small data can typically be scrutinized; big data tends to be locked away in the vaults of Silicon Valley. The simple statistical tools used to analyze small datasets are usually easy to check; pattern-recognizing algorithms can all too easily be mysterious and commercially sensitive black boxes." (Tim Harford, "The Data Detective: Ten easy rules to make sense of statistics", 2020)

"For advanced analytics, a well-designed data pipeline is a prerequisite, so a large part of your focus should be on automation. This is also the most difficult work. To be successful, you need to stitch everything together." (Piethein Strengholt, "Data Management at Scale: Best Practices for Enterprise Architecture", 2020)

"Data literacy is not a change in an individual’s abilities, talents, or skills within their careers, but more of an enhancement and empowerment of the individual to succeed with data. When it comes to data and analytics succeeding in an organization’s culture, the increase in the workforces’ skills with data literacy will help individuals to succeed with the strategy laid in front of them. In this way, organizations are not trying to run large change management programs; the process is more of an evolution and strengthening of individual’s talents with data. When we help individuals do more with data, we in turn help the organization’s culture do more with data." (Jordan Morrow, "Be Data Literate: The data literacy skills everyone needs to succeed", 2021)

"In the world of data and analytics, people get enamored by the nice, shiny object. We are pulled around by the wind of the latest technology, but in so doing we are pulled away from the sound and intelligent path that can lead us to data and analytical success. The data and analytical world is full of examples of overhyped technology or processes, thinking this thing will solve all of the data and analytical needs for an individual or organization. Such topics include big data or data science. These two were pushed into our minds and down our throats so incessantly over the past decade that they are somewhat of a myth, or people finally saw the light. In reality, both have a place and do matter, but they are not the only solution to your data and analytical needs. Unfortunately, though, organizations bit into them, thinking they would solve everything, and were left at the alter, if you will, when it came time for the marriage of data and analytical success with tools." (Jordan Morrow, "Be Data Literate: The data literacy skills everyone needs to succeed", 2021)

"Pure data science is the use of data to test, hypothesize, utilize statistics and more, to predict, model, build algorithms, and so forth. This is the technical part of the puzzle. We need this within each organization. By having it, we can utilize the power that these technical aspects bring to data and analytics. Then, with the power to communicate effectively, the analysis can flow throughout the needed parts of an organization." (Jordan Morrow, "Be Data Literate: The data literacy skills everyone needs to succeed", 2021)

22 October 2015

🪙Business Intelligence: Data Warehouse (Just the Quotes)

"Unfortunately, just collecting the data in one place and making it easily available isn’t enough. When operational data from transactions is loaded into the data warehouse, it often contains missing or inaccurate data. How good or bad the data is a function of the amount of input checking done in the application that generates the transaction. Unfortunately, many deployed applications are less than stellar when it comes to validating the inputs. To overcome this problem, the operational data must go through a 'cleansing' process, which takes care of missing or out-of-range values. If this cleansing step is not done before the data is loaded into the data warehouse, it will have to be performed repeatedly whenever that data is used in a data mining operation." (Joseph P Bigus,"Data Mining with Neural Networks: Solving business problems from application development to decision support", 1996)

"Having a purposeless or poorly performing dashboard is more common than not. This happens when the underlying architecture is not designed properly to support the needs of dashboard interaction. There is an obvious disconnect between the design of the data warehouse and the design of the dashboards. The people who design the data warehouse do not know what the dashboard will do; and the people who design the dashboards do not know how the data warehouse was designed, resulting in a lack of cohesion between the two. A similar disconnect can also exist between the dashboard designer and the business analyst, resulting in a dashboard that may look beautiful and dazzling but brings very little business value." (Nils H Rasmussen et al, "Business Dashboards: A visual catalog for design and deployment", 2009)

"Having multiple data lakes replicates the same problems that were created with multiple data warehouses - disparate data siloes and data fiefdoms that don't facilitate sharing of the corporate data assets across the organization. Organizations need to have a single data lake from which they can source the data for their BI/data warehousing and analytic needs. The data lake may never become the 'single version of the truth' for the organization, but then again, neither will the data warehouse. Instead, the data lake becomes the 'single or central repository for all the organization's data' from which all the organization's reporting and analytic needs are sourced." (Billl Schmarzo, "Driving Business Strategies with Data Science: Big Data MBA" 1st Ed., 2015)

"Unfortunately, some organizations are replicating the bad data warehouse practice by creating special-purpose data lakes - data lakes to address a specific business need. Resist that urge! Instead, source the data that is needed for that specific business need into an 'analytic sandbox' where the data scientists and the business users can collaborate to find those data variables and analytic models that are better predictors of the business performance. Within the 'analytic sandbox', the organization can bring together (ingest and integrate) the data that it wants to test, build the analytic models, test the model's goodness of fit, acquire new data, refine the analytic models, and retest the goodness of fit." (Billl Schmarzo, "Driving Business Strategies with Data Science: Big Data MBA" 1st Ed., 2015)

"Data quality in warehousing and BI is typically defined in terms of the 4 C’s - is the data clean, correct, consistent, and complete? When it comes to big data, there are two schools of thought that have different views and expectations of data quality. The first school believes that the gold standard of the 4 C’s must apply to all data (big and little) used for clinical care and performance metrics. The second school believes that in big data environments, a stringent data quality standard is impossible, too costly, or not required. While diametrically opposite opinions may play well in panel discussions, they do little to reconcile the realities of healthcare data quality." (Prashant Natarajan et al, "Demystifying Big Data and Machine Learning for Healthcare", 2017) 

"Data warehousing has always been difficult, because leaders within an organization want to approach warehousing and analytics as just another technology or application buy. Viewed in this light, they fail to understand the complexity and interdependent nature of building an enterprise reporting environment." (Prashant Natarajan et al, "Demystifying Big Data and Machine Learning for Healthcare", 2017)

"A data lake is a storage repository that holds a very large amount of data, often from diverse sources, in native format until needed. In some respects, a data lake can be compared to a staging area of a data warehouse, but there are key differences. Just like a staging area, a data lake is a conglomeration point for raw data from diverse sources. However, a staging area only stores new data needed for addition to the data warehouse and is a transient data store. In contrast, a data lake typically stores all possible data that might be needed for an undefined amount of analysis and reporting, allowing analysts to explore new data relationships. In addition, a data lake is usually built on commodity hardware and software such as Hadoop, whereas traditional staging areas typically reside in structured databases that require specialized servers." (Mike Fleckenstein & Lorraine Fellows, "Modern Data Strategy", 2018)

"A data warehouse follows a pre-built static structure to model source data. Any changes at the structural and configuration level must go through a stringent business review process and impact analysis. Data lakes are very agile. Consumption or analytical layer can be modified to fit in the model requirements. Consumers of a data lake are not constant; therefore, schema and modeling lies at the liberty of analysts and scientists." (Saurabh Gupta et al, "Practical Enterprise Data Lake Insights", 2018)

"Data warehousing, as we are aware, is the traditional approach of consolidating data from multiple source systems and combining into one store that would serve as the source for analytical and business intelligence reporting. The concept of data warehousing resolved the problems of data heterogeneity and low-level integration. In terms of objectives, a data lake is no different from a data warehouse. Both are primary advocates of terms like 'single source of truth' and 'central data repository'." (Saurabh Gupta et al, "Practical Enterprise Data Lake Insights", 2018)

"A defining characteristic of the data lakehouse architecture is allowing direct access to data as files while retaining the valuable properties of a data warehouse. Just do both!" (Bill Inmon et al, "Building the Data Lakehouse", 2021)

"The data lakehouse architecture presents an opportunity comparable to the one seen during the early years of the data warehouse market. The unique ability of the lakehouse to manage data in an open environment, blend all varieties of data from all parts of the enterprise, and combine the data science focus of the data lake with the end user analytics of the data warehouse will unlock incredible value for organizations. [...] "The lakehouse architecture equally makes it natural to manage and apply models where the data lives." (Bill Inmon et al, "Building the Data Lakehouse", 2021)

04 August 2015

Statistics: Median (Definitions)

"The middle value in an ordered set of values for which there are an equal number of values." (Jennifer George-Palilonis, "A Practical Guide to Graphics Reporting", 2006)

"The center-most value in an ordered set of values. If the set quantity is even, then the average of the two center-most values." (DAMA International, "The DAMA Dictionary of Data Management", 2011)

"The median is a statistical measure of variation. It represents the middle measurement when a set of measurements are collected in ascending order: 50% of the measurements are above the median and 50% are below it." (Laura Sebastian-Coleman, "Measuring Data Quality for Ongoing Improvement ", 2012)

"The middle value in a set of ordered numbers. The median value is determined by choosing the smallest value such that at least half of the values in the set are no greater than the chosen value. If the number of values within the set is odd, the median value corresponds to a single value. If the number of values within the set is even, the median value corresponds to the sum of the two middle values divided by two." (Microsoft, "SQL Server 2012 Glossary", 2012)

"The middle value in a set of values. Half the values fall below the median, and half the values fall above the median. See also average; mode." (E C Nelson & Stephen L Nelson, "Excel Data Analysis For Dummies ", 2015)

"To find the median, list the values of the data set in numerical order and identify which value appears in the middle of the list." (Christopher Donohue et al, "Foundations of Financial Risk: An Overview of Financial Risk and Risk-based Financial Regulation, 2nd Ed", 2015)

"Middle score in a distribution." (K  N Krishnaswamy et al, "Management Research Methodology: Integration of Principles, Methods and Techniques", 2016)

Statistics: Mean (Definitions)

"In a numerical sequence, the number that has an equal number of values before and after it. In the sequence 3, 5, 7, 9, 11, seven is the mean." (Dale Furtwengler, "Ten Minute Guide to Performance Appraisals", 2000)

"The average value of a sample of data that is typically gathered in a matrix experiment." (Clyde M Creveling, "Six Sigma for Technical Processes: An Overview for R Executives, Technical Leaders, and Engineering Managers", 2006)

"The sum of all values in a variable divided by the number of values." (Glenn J Myatt, "Making Sense of Data: A Practical Guide to Exploratory Data Analysis and Data Mining", 2006)

"The average value of a sample of data that is typically gathered in a matrix experiment." (Lynne Hambleton, "Treasure Chest of Six Sigma Growth Methods, Tools, and Best Practices", 2007)

"The sum of all values in a variable divided by the number of values." (Glenn J Myatt, "Making Sense of Data: A Practical Guide to Exploratory Data Analysis and Data Mining", 2007)

"The result of dividing the sum of all values within a set by the count of all values included." (DAMA International, "The DAMA Dictionary of Data Management", 2011)

"The mean is a statistical measure of central tendency. It is most easily understood as the mathematical average. It is calculated by summing the value of a set of measurements and dividing by the number of measurements taken." (Laura Sebastian-Coleman, "Measuring Data Quality for Ongoing Improvement", 2012)

"To find the mean add up the values in the data set and then divide by the number of values." (Christopher Donohue et al, "Foundations of Financial Risk: An Overview of Financial Risk and Risk-based Financial Regulation" 2nd Ed., 2015)

"Arithmetic averages of scores. The mean is the most commonly used measure of central tendency, but should be computed only for score data." (K  N Krishnaswamy et al, "Management Research Methodology: Integration of Principles, Methods and Techniques", 2016)

Statistics: Moving Average (Definitions)

"A trend-following indicator that works best in a trending environment. Moving averages smooth out price action but operate with a time lag. Any number of moving averages can be employed, with different time spans, to generate buy and sell signals. When only one average is employed, a buy signal is given when the price closes above the average. When two averages are employed, a buy signal is given when the shorter average crosses above the longer average. Technicians use three types: simple, weighted, and exponentially smoothed averages." (Guido Deboeck & Teuvo Kohonen (Eds), "Visual Explorations in Finance with Self-Organizing Maps 2nd Ed.", 2000)

"For a time series, an average that is updated as new information is received. With the moving average, the manager employs the most recent observations to calculate an average, which is used as the forecast for the next period." (Jae K Shim & Joel G Siegel, "Budgeting Basics and Beyond", 2008)

[exponential moving average:] "A moving average of data that gives more weight to the more recent data in the period and less weight to the older data in the period. The formula applies weighting factors which decrease exponentially. The weighting for each older data point decreases exponentially, giving much more importance to recent observations while still not discarding older observations entirely." (SQL Server 2012 Glossary, "Microsoft", 2012)

"An average that’s calculated by using only a specified set of values, such as an average based on just the last three values." (E C Nelson & Stephen L Nelson, "Excel Data Analysis For Dummies ", 2015)

"A mathematical average of data points over a specified period of time. Moving averages are used on financial price charts to show the average price over a selected interval of time. Examples are the SMA(9), SMA(20), SMA(50), or SMA(200) referring to 9-, 20-, 50-, or 200-period simple moving averages. Other types of moving averages also exist, such as an exponential moving average (EMA) and triangular moving averages (TMA). The EMA places more emphasis on the most recent data points. The TMA places more emphasis on the center data points of the specified range, that is, 9, 20, 50, 200, and so on." (Russell A Stultz, "The Option Strategy Desk Reference", 2019)

17 June 2015

📊Business Intelligence: Advanced Analytics (Definitions)

"A subset of analytical techniques that, among other things, often uses statistical methods to identify and quantify the influence and significance of relationships between items of interest, groups similar items together, creates predictions, and identifies mathematical optimal or near-optimal answers to business problems." (Evan Stubbs, "Delivering Business Analytics: Practical Guidelines for Best Practice", 2013)

"Algorithms for complex analysis of either structured or unstructured data. It includes sophisticated statistical models, machine learning, neural networks, text analytics, and other advanced data-mining techniques Advanced analytics does not include database query and reporting and OLAP cubes." (Marcia Kaufman et al, "Big Data For Dummies", 2013)

"A subset of analytical techniques that, among other things, often uses statistical methods to identify and quantify the influence and significant of relationships between items of interest, group similar items together, create predictions, and identify mathematical optimal or near-optimal answers to business problems." (Evan Stubbs, "Big Data, Big Innovation", 2014)

"Advanced Analytics is the autonomous or semi-autonomous examination of data or content using sophisticated techniques and tools, typically beyond those of traditional business intelligence (BI), to discover deeper insights, make predictions, or generate recommendations. Advanced analytic techniques include those such as data/text mining, machine learning, pattern matching, forecasting, visualization, semantic analysis, sentiment analysis, network and cluster analysis, multivariate statistics, graph analysis, simulation, complex event processing, neural networks. (Gartner)

"Analytic techniques and technologies that apply statistical and/or machine learning algorithms that allow firms to discover, evaluate, and optimize models that reveal and/or predict new insights." (Forrester)

"Advanced analytics describes data analysis that goes beyond simple mathematical calculations such as sums and averages, or filtering and sorting. Advanced analyses use mathematical and statistical formulas and algorithms to generate new information, to recognize patterns, and also to predict outcomes and their respective probabilities." (BI-Survey) [source]

"Advanced analytics is an umbrella term for a group of high-level methods and tools that can help you get more out of your data. The predictive capabilities of advanced analytics can be used to forecast trends, events, and behaviors. This gives organizations the ability to perform advanced statistical models such as 'what-if' calculations, as well as to future-proof various aspects of their operations." (Sisense) [source]

10 June 2015

📊Business Intelligence: Report Snapshot (Definitions)

"A SQL Server Reporting Services report that contains data that was queried at a particular point in time and has been stored on the Report Server." (Victor Isakov et al, "MCITP Administrator: Microsoft SQL Server 2005 Optimization and Maintenance (70-444) Study Guide", 2007)

"A report that contains data captured at a specific point in time. Since report snapshots hold datasets instead of queries, report snapshots can be used to limit processing costs by running the snapshot during off-peak times." (Darril Gibson, "MCITP SQL Server 2005 Database Developer All-in-One Exam Guide", 2008)

"A report that contains data captured at a specific point in time. A report snapshot is stored in an intermediate format containing retrieved data rather than a query and rendering definitions." (Jim Joseph et al, "Microsoft® SQL Server™ 2008 Reporting Services Unleashed", 2009)

"A static report that contains data captured at a specific point in time." (Microsoft, "SQL Server 2012 Glossary", 2012)

29 May 2015

🎓Knowledge Management: Keeping Current or the Quest for Lifelong Learning for IT Professionals

Introduction

    The pace at which technologies and the business change becomes faster and faster. If 5-10 years ago a vendor needed 3-5 years to come up with a new edition of a product, nowadays a new edition is released every 1-2 years. The release cycles become shorter and shorter, as vendors have to keep up with the changing technological trends. Changing trends also allow other vendors to enter the market with new products, increasing thus the competition and the need for responsiveness. On one side, the new tools/editions bring new functionality which mainly addresses technical and business requirements. On the other side, existing functionality gets deprecated and superseded. Knowledge isn't limited to the use of tools, but extends to the methodologies, procedures, best practices and processes used to make the most of the respective products. Moreover, the value of some tools increases when they are combined, with flexible infrastructures relying on the right mix of tools working together.

    For an IT person, keeping current with the advances in technology is a major requirement. First of all, because knowing modern technologies is a ticket to a good and/or better-paid job. Secondly, because many organizations try to incorporate into their IT infrastructure modern tools that would allow them to increase the ROI and achieve further benefits. Thirdly because, as I'd like to believe, most IT professionals are eager to learn new things and keep up with what's new. Being an adept of the continuous-learning philosophy is also a way to keep the brain challenged, a different type of challenge than the one met in daily tasks.

Knowledge Sources

    Face-to-face or computer-based trainings (CBTs) are the old-fashioned ways of keeping up-to-date with the advances in technologies, though paradoxically not all organizations can afford to train their IT employees. While CBTs are relatively affordable, face-to-face trainings are quite expensive for the average IT person, who therefore has to turn to other sources of knowledge. Fortunately, many important vendors like Microsoft or IBM provide, in one form or another, a wide range of resources that can be used for learning: Knowledge Bases (KBs), tutorials, forums, presentations and blogs. Similar resources exist from third parties, directly or indirectly interested in growing the knowledge pool.

    Nowadays, reading a book or following a course is no longer a requirement for learning a subject. Blogs, tutorials, articles and other similar materials can help more. Through their subject-oriented focus, they can bring some clarity in a small unit of time. Often they come with references to further materials, bring fresh perspectives, and are months or even years ahead of books or courses. Important professionals in the field can be followed on blogs, Twitter, LinkedIn, YouTube and other social media platforms. Seeing what topics they are interested in, how they code, what they think, and maybe how they think (some even share their expertise ad hoc when asked) can help an IT professional considerably, if he knows how to take advantage of these modern facilities.

    MOOCs have started to cover IT topics, as well as further topics that can come in handy for an IT professional. Most of them are free, or require only a small fee, especially if participants' identity needs to be verified. Such courses are a valuable source of information. The participant can see how such a course is structured, what topics are approached, and what the minimal knowledge base required is; the material is almost the same as in a normal university course, and in the end it's not the piece of paper with the testimonial that's important, but the change in perspective obtained by taking the course. In addition, the MOOC participant can interact with people with similar hobbies, collaborate with them on projects, and, why not, something useful can come out of it. Through MOOCs or direct vendor initiatives, free or freeware versions of software are available, sometimes with the whole functionality available for personal use. The professional is therefore no longer dependent on the software he can use only at work. New possibilities open for the person who wants to learn.

Maximizing the Knowledge Value

    Despite the considerable number of knowledge resources, for an IT professional the most important part of his experience comes from the hands-on experience acquired on the job. If not rooted in hands-on experience, his knowledge remains purely theoretical, with minimal value. Therefore, in order to maximize the value of his learning, an IT professional has to attempt to use his knowledge in practice as much and as soon as possible. One way to increase the value of experience is to be involved in projects dealing with new technologies or challenges that allow a professional to further extend his knowledge base. Sometimes we can choose such projects or gain exposure to the technologies, though other times no such opportunities can be seized or identified.

    An IT professional can probably use 10-30% of what he learned in his daily duties. This percentage can however be increased by getting involved in other types of personal or collective (open source or work) projects, which allow exploring the subjects from other perspectives. Considering that many projects involve overtime, and that many professionals also have a rich personal life, this looks difficult, though not impossible.

    Even if not achievable on a regular basis, a professional can allocate 1-3 hours per week from his working time for learning something new. It can be something that helps his organization directly or indirectly, though sometimes it pays off to learn technologies that have nothing to do with the actual job. Somebody may argue that the respective hours are not “billable”, that they are a waste of time and other resources, that the technologies are not available, that there are lots of due tasks, etc. With a little benevolence and the right argumentation, such criticism can be silenced. The arguments can, for example, be based on the fact that a skilled professional becomes more productive over time: a small investment in knowledge can later yield a bigger benefit for both parties, employee and employer. An older study showed that when IT professionals were given some freedom to approach personal projects at work, and to use some time for their own benefit, the value they brought to their organization increased. There are companies like Google that made a philosophy out of this type of work.

    A professional can also allocate 1-3 hours from his free time, while commuting or during other similar activities. Reading something before going to bed, or as relaxation after work, can prove a good way to shut the brain down from the daily problems. Where there’s interest in learning something new, a person will find the time, no matter how busy his schedule is. It’s important, however, to do that on a regular basis; with time the hours and the knowledge accumulate.

    It’s also important to have a focused effort that brings some kind of benefit. Learning just for the sake of learning brings little return on investment if it’s not adequately focused. For sure it’s interesting and fun to browse through different topics, and it’s even recommended to do so occasionally, though in the long run, if a person wants to increase the value of his knowledge, he needs to focus it in a given direction and apply it.

    Direction is obtained by choosing a career or learning path, and focusing on the directly or indirectly related topics that belong to that path. Focusing on the subjects related to a career path allows us to build further on existing knowledge and to understand a topic fully. On the other side, focusing on areas of applicability not directly linked with our professional work can broaden our perspective by looking at one topic from another topic’s perspective. This can be achieved, for example, by joining the knowledge base of a hobby with that of our professional work. In certain configurations, new opportunities for joint growth can be identified.

    The value of knowledge increases primarily when it’s used in day-to-day scenarios (a form of learning by doing). It would be useful, for example, for a professional to start a project that can bring some kind of benefit. It can be something simple like building a web page or a full website, an application that processes data, a solution based on a mix of technologies, etc. Such a project allows simulating day-to-day situations to some degree, in which the professional is forced to use and question some aspects, and to deal with situations that can’t be found in textbooks or other learning material. If such a project can bring a material benefit, the value of the knowledge increases even more.

    Another way to integrate the accumulated knowledge is through blogging and problem-solving. Topic- or problem-oriented blogging allows externalizing a person’s (tacit) knowledge, putting knowledge into new contexts in a small, focused unit of work, doing some research to see how others think about the same topic or problem, getting feedback, and correcting or improving some aspects. It’s also a way of documenting the various problems identified while learning or performing a task. Blogging helps a person improve his written communication skills and his vocabulary, and with a little more effort it can also become a visiting card for his professional experience.

    Trying to apply new knowledge in hands-on trainings or tutorials, or by writing a few lines of code to test functionality and its applicability, as well as structuring newly learned material into notes in the form of text or knowledge maps (e.g. concept maps, mind maps, causal maps, diagrams, etc.), allows learners to actively engage with the new concepts, increasing the overall retention of the material. Even if notes and knowledge maps don’t apply the learned material directly, they offer a new way of structuring the content and resources for further enrichment and review. Applied individually, but especially when combined, the different types of active learning help maximize the value of knowledge with a minimum of effort.

Conclusion

    The bottom line: given the fast pace with which new technologies enter the market and the business environment evolves, an IT professional has to keep up to date with today’s technologies. He now has more means than ever to do that: affordable computer-based training, tutorials, blogs, articles, videos, forums, studies, MOOCs and other types of learning material allow IT professionals to approach a wide range of topics. Through active, focused, sustainable and hands-on learning we can maximize the value of knowledge, and in the end it depends on each of us how we use the available resources to make the most of our learning experience.

08 May 2015

📊Business Intelligence: Data Analytics (Definitions)

"Business Intelligence procedures and techniques for exploration and analysis of data to discover and identify meaningful information and trends." (DAMA International, "The DAMA Dictionary of Data Management", 2011)

"Analytics is the systematic analysis of large databases to solve problems and make informed decisions." (John R Schermerhorn Jr, "Management" 12th Ed., 2012)

"Procedures and techniques for exploration and analysis of data to discover and identify new and meaningful information and trends." (Craig S Mullins, "Database Administration", 2012)

"A data-driven process that creates insight. These processes incorporate a wide variety of techniques and may include manual analysis, reporting, predictive models, time-series models, or optimization models." (Evan Stubbs, "Delivering Business Analytics: Practical Guidelines for Best Practice", 2013)

"A suite of technical solutions that uses mathematical and statistical methods. The solutions are applied to data to generate insight to help organizations understand historical business performance as well as forecast and plan for future decisions." (Jim Davis & Aiman Zeid, "Business Transformation", 2014) 

"Analytics is the discovery and communication of meaningful patterns in data." (Elaine Biech, "ASTD Handbook" 2nd Ed., 2014) 

"The business intelligence and analytics technologies that are grounded mostly in data mining and statistical analysis." (Xiuli He, "Supply Chain Analytics: Challenges and Opportunities", 2014)

"Data analytics refers to qualitative and quantitative techniques and processes used to enhance productivity and business gain." (Piyush K Shukla & Madhuvan Dixit, "Big Data: An Emerging Field of Data Engineering", 2015)

"The act of extracting and communicating meaningful information among the data sets." (Hamid R Arabnia et al, "Application of Big Data for National Security", 2015) 

"A broad term that includes quantitative analysis of data and building quantitative models. Analytics is the science of analysis and discovery. Analysis may process data from a data warehouse, may result in building model-driven DSS, or may occur in a special study using statistical or data mining software. In general, analytics refers to quantitative analysis and manipulation of data." (Daniel J Power & Ciara Heavin, "Decision Support, Analytics, and Business Intelligence" 3rd Ed., 2017)

"A scientific and systematic approach to examine raw data in order to draw valid conclusions about them. Data are extracted and structured, and qualitative and quantitative techniques are used to identify and analyze patterns." (Lesley S J Farmer, "Data Analytics for Strategic Management: Getting the Right Data", 2017)

"Techniques used to identify patterns in data sets. Qualitative and quantitative techniques are employed to derive meaning that may be valuable and could result in a positive business gain for an organization." (Daniel J Power & Ciara Heavin, "Decision Support, Analytics, and Business Intelligence" 3rd Ed., 2017)

"The discovery, interpretation, and communication of meaningful patterns in data to inform decision making and improve performance." (Jonathan Ferrar et al, "The Power of People: Learn How Successful Organizations Use Workforce Analytics To Improve Business Performance", 2017)

"Analytics refers to quantitative and statistical analysis and manipulation of data to derive meaning. Analytics is a broad umbrella term that includes business analytics and data analytics." (Daniel J. Power & Ciara Heavin, "Data-Based Decision Making and Digital Transformation", 2018)

"Involves drawing insights from the data including big data. Analytics uses simple to advanced tools depending upon the objectives. Analytics may involve visual display of data (charts and graphs), descriptive statistics, making predictions, forecasting future outcomes, or optimizing business processes." (Amar Sahay, "Business Analytics" Vol. I, 2018)

"Is the science of examining raw data with the purpose of drawing actionable information from it, data analytics is used to allow companies and organization to make better business decisions and in the sciences to verify or disprove existing theories." (Dennis C Guster, "Scalable Data Warehouse Architecture: A Higher Education Case Study", 2018)

"Data analytics is a process that examines, clears, converts and models data to explore useful information, draws conclusions and supports decision making." (A Aylin Tokuç, "Management of Big Data Projects: PMI Approach for Success", 2019)

"A rapidly emerging field of information science arising from the explosion of data generated by many Internet based applications and services. Data analytics embodies a sequential process of descriptive, diagnostic, predictive and prescriptive analytics. Each type has a different purpose and requires different techniques to gain meaningful outcomes. The latter two often employ machine learning to gain valuable insights and directional guidance in decision making, such as in self-driving automobiles." (Darrold L Cordes et al, "Transforming Urban Slums: Pathway to Functionally Intelligent Cities in Developing Countries", 2021)

"Discovery, interpretation, and communication of meaningful patterns in data; and the process of applying those patterns towards effective decision making." (Francisco S Gutierres & Pedro M Gome, "The Integrated Tourism Analysis Platform (ITAP) for Tourism Destination Management", 2021)

"The science of extracting meaningful information continuously with the assistance of specialized system for finding patterns to get feasible solutions." (Selvan C & S  R Balasundaram, "Data Analysis in Context-Based Statistical Modeling in Predictive Analytics", 2021)

"Analytics encompasses the discovery, interpretation, and communication of meaningful patterns in data. It relies on the simultaneous application of statistics, computer programming and operations research to quantify performance and is particularly valuable in areas with large amounts of recorded information. The goal of this exercise is to guide decision-making based on the business context. The analytics flow comprises descriptive, diagnostic, predictive analytics and eventually prescriptive steps." (Accenture)

"Data Analytics describes the end-to-end process by which data is cleaned, inspected and modeled. The objective is to discover useful and actionable information that supports decision-making." (Accenture)

"Data analytics enables organizations to analyze all their data (real-time, historical, unstructured, structured, qualitative) to identify patterns and generate insights to inform and, in some cases, automate decisions, connecting intelligence and action." (Tibco) [source]

"Data analytics is a set of technologies and practices that reveal meaning hidden in raw data." (Xplenty) [source]

"Data and analytics is the management of data for all uses (operational and analytical) and the analysis of data to drive business processes and improve business outcomes through more effective decision making and enhanced customer experiences." (Gartner)

"Data analytics (DA) is the process of examining data sets in order to draw conclusions about the information they contain, increasingly with the aid of specialized systems and software." (Techtarget) [source]

"Data analytics is the process of querying and interrogating data in the pursuit of valuable insight and information." (snowflake) [source]

"Data analytics is the pursuit of extracting meaning from raw data using specialized computer systems. These systems transform, organize, and model the data to draw conclusions and identify patterns." (Informatica) [source]

"Data analytics refers to the use of processes and technology to combine and examine datasets, identify meaningful patterns, correlations, and trends in them, and most importantly, extract valuable insights." (Qlik) [source]

"The discovery, interpretation, and communication of meaningful patterns in data. They are essentially the backbone of any data-driven decision making." (Insight Software)

"The process and techniques for the exploration and analysis of business data to discover and identify new and meaningful information and trends that allow for analysis to take place." (Information Management)