03 November 2018

🔭Data Science: Forecasting (Just the Quotes)

"Extrapolations are useful, particularly in the form of soothsaying called forecasting trends. But in looking at the figures or the charts made from them, it is necessary to remember one thing constantly: The trend to now may be a fact, but the future trend represents no more than an educated guess. Implicit in it is 'everything else being equal' and 'present trends continuing'. And somehow everything else refuses to remain equal." (Darell Huff, "How to Lie with Statistics", 1954)

"When numbers in tabular form are taboo and words will not do the work well as is often the case. There is one answer left: Draw a picture. About the simplest kind of statistical picture or graph, is the line variety. It is very useful for showing trends, something practically everybody is interested in showing or knowing about or spotting or deploring or forecasting." (Darell Huff, "How to Lie with Statistics", 1954)

"The moment you forecast you know you’re going to be wrong, you just don’t know when and in which direction." (Edgar R Fiedler, 1977)

"Many of the basic functions performed by neural networks are mirrored by human abilities. These include making distinctions between items (classification), dividing similar things into groups (clustering), associating two or more things (associative memory), learning to predict outcomes based on examples (modeling), being able to predict into the future (time-series forecasting), and finally juggling multiple goals and coming up with a good- enough solution (constraint satisfaction)." (Joseph P Bigus,"Data Mining with Neural Networks: Solving business problems from application development to decision support", 1996)

"Probability theory is a serious instrument for forecasting, but the devil, as they say, is in the details - in the quality of information that forms the basis of probability estimates." (Peter L Bernstein, "Against the Gods: The Remarkable Story of Risk", 1996)

"Under conditions of uncertainty, both rationality and measurement are essential to decision-making. Rational people process information objectively: whatever errors they make in forecasting the future are random errors rather than the result of a stubborn bias toward either optimism or pessimism. They respond to new information on the basis of a clearly defined set of preferences. They know what they want, and they use the information in ways that support their preferences." (Peter L Bernstein, "Against the Gods: The Remarkable Story of Risk", 1996)

"Time-series forecasting is essentially a form of extrapolation in that it involves fitting a model to a set of data and then using that model outside the range of data to which it has been fitted. Extrapolation is rightly regarded with disfavour in other statistical areas, such as regression analysis. However, when forecasting the future of a time series, extrapolation is unavoidable." (Chris Chatfield, "Time-Series Forecasting" 2nd Ed, 2000)

"Models can be viewed and used at three levels. The first is a model that fits the data. A test of goodness-of-fit operates at this level. This level is the least useful but is frequently the one at which statisticians and researchers stop. For example, a test of a linear model is judged good when a quadratic term is not significant. A second level of usefulness is that the model predicts future observations. Such a model has been called a forecast model. This level is often required in screening studies or studies predicting outcomes such as growth rate. A third level is that a model reveals unexpected features of the situation being described, a structural model, [...] However, it does not explain the data." (Gerald van Belle, "Statistical Rules of Thumb", 2002)

"Most long-range forecasts of what is technically feasible in future time periods dramatically underestimate the power of future developments because they are based on what I call the 'intuitive linear' view of history rather than the 'historical exponential' view." (Ray Kurzweil, "The Singularity is Near", 2005)

"A forecaster should almost never ignore data, especially when she is studying rare events […]. Ignoring data is often a tip-off that the forecaster is overconfident, or is overfitting her model - that she is interested in showing off rather than trying to be accurate."  (Nate Silver, "The Signal and the Noise: Why So Many Predictions Fail-but Some Don't", 2012)

"Whether information comes in a quantitative or qualitative flavor is not as important as how you use it. [...] The key to making a good forecast […] is not in limiting yourself to quantitative information. Rather, it’s having a good process for weighing the information appropriately. […] collect as much information as possible, but then be as rigorous and disciplined as possible when analyzing it. [...] Many times, in fact, it is possible to translate qualitative information into quantitative information." (Nate Silver, "The Signal and the Noise: Why So Many Predictions Fail-but Some Don't", 2012)

"In common usage, prediction means to forecast a future event. In data science, prediction more generally means to estimate an unknown value. This value could be something in the future (in common usage, true prediction), but it could also be something in the present or in the past. Indeed, since data mining usually deals with historical data, models very often are built and tested using events from the past." (Foster Provost & Tom Fawcett, "Data Science for Business", 2013)

"Using random processes in our models allows economists to capture the variability of time series data, but it also poses challenges to model builders. As model builders, we must understand the uncertainty from two different perspectives. Consider first that of the econometrician, standing outside an economic model, who must assess its congruence with reality, inclusive of its random perturbations. An econometrician’s role is to choose among different parameters that together describe a family of possible models to best mimic measured real world time series and to test the implications of these models. I refer to this as outside uncertainty. Second, agents inside our model, be it consumers, entrepreneurs, or policy makers, must also confront uncertainty as they make decisions. I refer to this as inside uncertainty, as it pertains to the decision-makers within the model. What do these agents know? From what information can they learn? With how much confidence do they forecast the future? The modeler’s choice regarding insiders’ perspectives on an uncertain future can have significant consequences for each model’s equilibrium outcomes." (Lars P Hansen, "Uncertainty Outside and Inside Economic Models", [Nobel lecture] 2013)

"One important thing to bear in mind about the outputs of data science and analytics is that in the vast majority of cases they do not uncover hidden patterns or relationships as if by magic, and in the case of predictive analytics they do not tell us exactly what will happen in the future. Instead, they enable us to forecast what may come. In other words, once we have carried out some modelling there is still a lot of work to do to make sense out of the results obtained, taking into account the constraints and assumptions in the model, as well as considering what an acceptable level of reliability is in each scenario." (Jesús Rogel-Salazar, "Data Science and Analytics with Python", 2017)

"Regression describes the relationship between an exploratory variable (i.e., independent) and a response variable (i.e., dependent). Exploratory variables are also referred to as predictors and can have a frequency of more than 1. Regression is being used within the realm of predictions and forecasting. Regression determines the change in response variable when one exploratory variable is varied while the other independent variables are kept constant. This is done to understand the relationship that each of those exploratory variables exhibits." (Danish Haroon, "Python Machine Learning Case Studies", 2017)

"The first myth is that prediction is always based on time-series extrapolation into the future (also known as forecasting). This is not the case: predictive analytics can be applied to generate any type of unknown data, including past and present. In addition, prediction can be applied to non-temporal (time-based) use cases such as disease progression modeling, human relationship modeling, and sentiment analysis for medication adherence, etc. The second myth is that predictive analytics is a guarantor of what will happen in the future. This also is not the case: predictive analytics, due to the nature of the insights they create, are probabilistic and not deterministic. As a result, predictive analytics will not be able to ensure certainty of outcomes." (Prashant Natarajan et al, "Demystifying Big Data and Machine Learning for Healthcare", 2017)

"We know what forecasting is: you start in the present and try to look into the future and imagine what it will be like. Backcasting is the opposite: you state your desired vision of the future as if it’s already happened, and then work backward to imagine the practices, policies, programs, tools, training, and people who worked in concert in a hypothetical past (which takes place in the future) to get you there." (Eben Hewitt, "Technology Strategy Patterns: Architecture as strategy" 2nd Ed., 2019)

"Ideally, a decision maker or a forecaster will combine the outside view and the inside view - or, similarly, statistics plus personal experience. But it’s much better to start with the statistical view, the outside view, and then modify it in the light of personal experience than it is to go the other way around. If you start with the inside view you have no real frame of reference, no sense of scale - and can easily come up with a probability that is ten times too large, or ten times too small." (Tim Harford, "The Data Detective: Ten easy rules to make sense of statistics", 2020)

02 November 2018

🔭Data Science: Nonlinearity (Just the Quotes)

"The term chaos is used in a specific sense where it is an inherently random pattern of behaviour generated by fixed inputs into deterministic (that is fixed) rules (relationships). The rules take the form of non-linear feedback loops. Although the specific path followed by the behaviour so generated is random and hence unpredictable in the long-term, it always has an underlying pattern to it, a 'hidden' pattern, a global pattern or rhythm. That pattern is self-similarity, that is a constant degree of variation, consistent variability, regular irregularity, or more precisely, a constant fractal dimension. Chaos is therefore order (a pattern) within disorder (random behaviour)." (Ralph D Stacey, "The Chaos Frontier: Creative Strategic Control for Business", 1991)

"In nonlinear systems - and the economy is most certainly nonlinear - chaos theory tells you that the slightest uncertainty in your knowledge of the initial conditions will often grow inexorably. After a while, your predictions are nonsense." (M Mitchell Waldrop, "Complexity: The Emerging Science at the Edge of Order and Chaos", 1992)

"In addition to dimensionality requirements, chaos can occur only in nonlinear situations. In multidimensional settings, this means that at least one term in one equation must be nonlinear while also involving several of the variables. With all linear models, solutions can be expressed as combinations of regular and linear periodic processes, but nonlinearities in a model allow for instabilities in such periodic solutions within certain value ranges for some of the parameters." (Courtney Brown, "Chaos and Catastrophe Theories", 1995)

"The dimensionality and nonlinearity requirements of chaos do not guarantee its appearance. At best, these conditions allow it to occur, and even then under limited conditions relating to particular parameter values. But this does not imply that chaos is rare in the real world. Indeed, discoveries are being made constantly of either the clearly identifiable or arguably persuasive appearance of chaos. Most of these discoveries are being made with regard to physical systems, but the lack of similar discoveries involving human behavior is almost certainly due to the still developing nature of nonlinear analyses in the social sciences rather than the absence of chaos in the human setting."  (Courtney Brown, "Chaos and Catastrophe Theories", 1995)

"So we pour in data from the past to fuel the decision-making mechanisms created by our models, be they linear or nonlinear. But therein lies the logician's trap: past data from real life constitute a sequence of events rather than a set of independent observations, which is what the laws of probability demand. [...] It is in those outliers and imperfections that the wildness lurks." (Peter L Bernstein, "Against the Gods: The Remarkable Story of Risk", 1996)

"There is a new science of complexity which says that the link between cause and effect is increasingly difficult to trace; that change (planned or otherwise) unfolds in non-linear ways; that paradoxes and contradictions abound; and that creative solutions arise out of diversity, uncertainty and chaos." (Andy P Hargreaves & Michael Fullan, "What’s Worth Fighting for Out There?", 1998)

"A system may be called complex here if its dimension (order) is too high and its model (if available) is nonlinear, interconnected, and information on the system is uncertain such that classical techniques can not easily handle the problem." (M Jamshidi, "Autonomous Control on Complex Systems: Robotic Applications", Current Advances in Mechanical Design and Production VII, 2000)

"Most physical systems, particularly those complex ones, are extremely difficult to model by an accurate and precise mathematical formula or equation due to the complexity of the system structure, nonlinearity, uncertainty, randomness, etc. Therefore, approximate modeling is often necessary and practical in real-world applications. Intuitively, approximate modeling is always possible. However, the key questions are what kind of approximation is good, where the sense of 'goodness' has to be first defined, of course, and how to formulate such a good approximation in modeling a system such that it is mathematically rigorous and can produce satisfactory results in both theory and applications." (Guanrong Chen & Trung Tat Pham, "Introduction to Fuzzy Sets, Fuzzy Logic, and Fuzzy Control Systems", 2001)

"Swarm intelligence can be effective when applied to highly complicated problems with many nonlinear factors, although it is often less effective than the genetic algorithm approach discussed later in this chapter. Swarm intelligence is related to swarm optimization […]. As with swarm intelligence, there is some evidence that at least some of the time swarm optimization can produce solutions that are more robust than genetic algorithms. Robustness here is defined as a solution’s resistance to performance degradation when the underlying variables are changed." (Michael J North & Charles M Macal, "Managing Business Complexity: Discovering Strategic Solutions with Agent-Based Modeling and Simulation", 2007)

"Thus, nonlinearity can be understood as the effect of a causal loop, where effects or outputs are fed back into the causes or inputs of the process. Complex systems are characterized by networks of such causal loops. In a complex, the interdependencies are such that a component A will affect a component B, but B will in general also affect A, directly or indirectly.  A single feedback loop can be positive or negative. A positive feedback will amplify any variation in A, making it grow exponentially. The result is that the tiniest, microscopic difference between initial states can grow into macroscopically observable distinctions." (Carlos Gershenson, "Design and Control of Self-organizing Systems", 2007)

"All forms of complex causation, and especially nonlinear transformations, admittedly stack the deck against prediction. Linear describes an outcome produced by one or more variables where the effect is additive. Any other interaction is nonlinear. This would include outcomes that involve step functions or phase transitions. The hard sciences routinely describe nonlinear phenomena. Making predictions about them becomes increasingly problematic when multiple variables are involved that have complex interactions. Some simple nonlinear systems can quickly become unpredictable when small variations in their inputs are introduced." (Richard N Lebow, "Forbidden Fruit: Counterfactuals and International Relations", 2010)

"Given the important role that correlation plays in structural equation modeling, we need to understand the factors that affect establishing relationships among multivariable data points. The key factors are the level of measurement, restriction of range in data values (variability, skewness, kurtosis), missing data, nonlinearity, outliers, correction for attenuation, and issues related to sampling variation, confidence intervals, effect size, significance, sample size, and power." (Randall E Schumacker & Richard G Lomax, "A Beginner’s Guide to Structural Equation Modeling" 3rd Ed., 2010)

"Complexity is a relative term. It depends on the number and the nature of interactions among the variables involved. Open loop systems with linear, independent variables are considered simpler than interdependent variables forming nonlinear closed loops with a delayed response." (Jamshid Gharajedaghi, "Systems Thinking: Managing Chaos and Complexity A Platform for Designing Business Architecture" 3rd Ed., 2011)

"We have minds that are equipped for certainty, linearity and short-term decisions, that must instead make long-term decisions in a non-linear, probabilistic world." (Paul Gibbons, "The Science of Successful Organizational Change", 2015)

"Random forests are essentially an ensemble of trees. They use many short trees, fitted to multiple samples of the data, and the predictions are averaged for each observation. This helps to get around a problem that trees, and many other machine learning techniques, are not guaranteed to find optimal models, in the way that linear regression is. They do a very challenging job of fitting non-linear predictions over many variables, even sometimes when there are more variables than there are observations. To do that, they have to employ 'greedy algorithms', which find a reasonably good model but not necessarily the very best model possible." (Robert Grant, "Data Visualization: Charts, Maps and Interactive Graphics", 2019)

"Exponentially growing systems are prevalent in nature, spanning all scales from biochemical reaction networks in single cells to food webs of ecosystems. How exponential growth emerges in nonlinear systems is mathematically unclear. […] The emergence of exponential growth from a multivariable nonlinear network is not mathematically intuitive. This indicates that the network structure and the flux functions of the modeled system must be subjected to constraints to result in long-term exponential dynamics." (Wei-Hsiang Lin et al, "Origin of exponential growth in nonlinear reaction networks", PNAS 117 (45), 2020)

"Non-linear associations are also quantifiable. Even linear regression can be used to model some non-linear relationships. This is possible because linear regression has to be linear in parameters, not necessarily in the data. More complex relationships can be quantified using entropy-based metrics such as mutual information. Linear models can also handle interaction terms. We talk about interaction when the model’s output depends on a multiplicative relationship between two or more variables." (Aleksander Molak, "Causal Inference and Discovery in Python", 2023)

🔭Data Science: Linearity (Just the Quotes)

"In addition to dimensionality requirements, chaos can occur only in nonlinear situations. In multidimensional settings, this means that at least one term in one equation must be nonlinear while also involving several of the variables. With all linear models, solutions can be expressed as combinations of regular and linear periodic processes, but nonlinearities in a model allow for instabilities in such periodic solutions within certain value ranges for some of the parameters." (Courtney Brown, "Chaos and Catastrophe Theories", 1995)

"So we pour in data from the past to fuel the decision-making mechanisms created by our models, be they linear or nonlinear. But therein lies the logician's trap: past data from real life constitute a sequence of events rather than a set of independent observations, which is what the laws of probability demand. [...] It is in those outliers and imperfections that the wildness lurks." (Peter L Bernstein, "Against the Gods: The Remarkable Story of Risk", 1996)

"All forms of complex causation, and especially nonlinear transformations, admittedly stack the deck against prediction. Linear describes an outcome produced by one or more variables where the effect is additive. Any other interaction is nonlinear. This would include outcomes that involve step functions or phase transitions. The hard sciences routinely describe nonlinear phenomena. Making predictions about them becomes increasingly problematic when multiple variables are involved that have complex interactions. Some simple nonlinear systems can quickly become unpredictable when small variations in their inputs are introduced." (Richard N Lebow, "Forbidden Fruit: Counterfactuals and International Relations", 2010)

"There are several key issues in the field of statistics that impact our analyses once data have been imported into a software program. These data issues are commonly referred to as the measurement scale of variables, restriction in the range of data, missing data values, outliers, linearity, and nonnormality." (Randall E Schumacker & Richard G Lomax, "A Beginner’s Guide to Structural Equation Modeling" 3rd Ed., 2010)

"Complexity is a relative term. It depends on the number and the nature of interactions among the variables involved. Open loop systems with linear, independent variables are considered simpler than interdependent variables forming nonlinear closed loops with a delayed response." (Jamshid Gharajedaghi, "Systems Thinking: Managing Chaos and Complexity A Platform for Designing Business Architecture" 3rd Ed., 2011)

"Without precise predictability, control is impotent and almost meaningless. In other words, the lesser the predictability, the harder the entity or system is to control, and vice versa. If our universe actually operated on linear causality, with no surprises, uncertainty, or abrupt changes, all future events would be absolutely predictable in a sort of waveless orderliness." (Lawrence K Samuels, "Defense of Chaos", 2013)

"An oft-repeated rule of thumb in any sort of statistical model fitting is 'you can't fit a model with more parameters than data points'. This idea appears to be as wide-spread as it is incorrect. On the contrary, if you construct your models carefully, you can fit models with more parameters than datapoints [...]. A model with more parameters than datapoints is known as an under-determined system, and it's a common misperception that such a model cannot be solved in any circumstance. [...] this misconception, which I like to call the 'model complexity myth' [...] is not true in general, it is true in the specific case of simple linear models, which perhaps explains why the myth is so pervasive." (Jake Vanderplas, "The Model Complexity Myth", 2015) [source]

"Random forests are essentially an ensemble of trees. They use many short trees, fitted to multiple samples of the data, and the predictions are averaged for each observation. This helps to get around a problem that trees, and many other machine learning techniques, are not guaranteed to find optimal models, in the way that linear regression is. They do a very challenging job of fitting non-linear predictions over many variables, even sometimes when there are more variables than there are observations. To do that, they have to employ 'greedy algorithms', which find a reasonably good model but not necessarily the very best model possible." (Robert Grant, "Data Visualization: Charts, Maps and Interactive Graphics", 2019)

"Non-linear associations are also quantifiable. Even linear regression can be used to model some non-linear relationships. This is possible because linear regression has to be linear in parameters, not necessarily in the data. More complex relationships can be quantified using entropy-based metrics such as mutual information. Linear models can also handle interaction terms. We talk about interaction when the model’s output depends on a multiplicative relationship between two or more variables." (Aleksander Molak, "Causal Inference and Discovery in Python", 2023)

🔭Data Science: Data Analysts (Just the Quotes)

"[…] it is not enough to say: 'There's error in the data and therefore the study must be terribly dubious'. A good critic and data analyst must do more: he or she must also show how the error in the measurement or the analysis affects the inferences made on the basis of that data and analysis." (Edward R Tufte, "Data Analysis for Politics and Policy", 1974)

"Quantitative techniques will be more likely to illuminate if the data analyst is guided in methodological choices by a substantive understanding of the problem he or she is trying to learn about. Good procedures in data analysis involve techniques that help to (a) answer the substantive questions at hand, (b) squeeze all the relevant information out of the data, and (c) learn something new about the world." (Edward R Tufte, "Data Analysis for Politics and Policy", 1974)

"The use of statistical methods to analyze data does not make a study any more 'scientific', 'rigorous', or 'objective'. The purpose of quantitative analysis is not to sanctify a set of findings. Unfortunately, some studies, in the words of one critic, 'use statistics as a drunk uses a street lamp, for support rather than illumination'. Quantitative techniques will be more likely to illuminate if the data analyst is guided in methodological choices by a substantive understanding of the problem he or she is trying to learn about. Good procedures in data analysis involve techniques that help to (a) answer the substantive questions at hand, (b) squeeze all the relevant information out of the data, and (c) learn something new about the world." (Edward R Tufte, "Data Analysis for Politics and Policy", 1974)

"Detailed study of the quality of data sources is an essential part of applied work. [...] Data analysts need to understand more about the measurement processes through which their data come. To know the name by which a column of figures is headed is far from being enough." (John W Tukey, "An Overview of Techniques of Data Analysis, Emphasizing Its Exploratory Aspects", 1982)

"Like a detective, a data analyst will experience many dead ends, retrace his steps, and explore many alternatives before settling on a single description of the evidence in front of him." (David Lubinsky & Daryl Pregibon , "Data analysis as search", Journal of Econometrics Vol. 38 (1–2), 1988)

"The four questions of data analysis are the questions of description, probability, inference, and homogeneity. Any data analyst needs to know how to organize and use these four questions in order to obtain meaningful and correct results. [...] 
THE DESCRIPTION QUESTION: Given a collection of numbers, are there arithmetic values that will summarize the information contained in those numbers in some meaningful way?
THE PROBABILITY QUESTION: Given a known universe, what can we say about samples drawn from this universe? [...]
THE INFERENCE QUESTION: Given an unknown universe, and given a sample that is known to have been drawn from that unknown universe, and given that we know everything about the sample, what can we say about the unknown universe? [...]
THE HOMOGENEITY QUESTION: Given a collection of observations, is it reasonable to assume that they came from one universe, or do they show evidence of having come from multiple universes?" (Donald J Wheeler, "Myths About Data Analysis", International Lean & Six Sigma Conference, 2012)

"[…] the data itself can lead to new questions too. In exploratory data analysis (EDA), for example, the data analyst discovers new questions based on the data. The process of looking at the data to address some of these questions generates incidental visualizations - odd patterns, outliers, or surprising correlations that are worth looking into further." (Danyel Fisher & Miriah Meyer, "Making Data Visual", 2018)

"Plotting numbers on a chart does not make you a data analyst. Knowing and understanding your data before you communicate it to your audience does."  (Andy Kriebel & Eva Murray, "#MakeoverMonday: Improving How We Visualize and Analyze Data, One Chart at a Time", 2018)

"Also, remember that data literacy is not just a set of technical skills. There is an equal need and weight for soft skills and business skills. This can be misleading for some technical resources within an organization, as those technical resources may believe they are data literate by default as they are data architects or data analysts. They have the existing technical skills, but maybe they do not have any deep proficiencies in other skills such as communicating with data, challenging assumptions, and mitigating bias, or perhaps they do not have an open mindset to be open to different perspectives." (Angelika Klidas & Kevin Hanegan, "Data Literacy in Practice", 2022)

"The lack of focus and commitment to color is a perplexing thing. When used correctly, color has no equal as a visualization tool - in advertising, in branding, in getting the message across to any audience you seek. Data analysts can make numbers dance and sing on command, but they sometimes struggle to create visually stimulating environments that convince the intended audience to tap their feet in time." (Kate Strachnyi, "ColorWise: A Data Storyteller’s Guide to the Intentional Use of Color", 2023)

Data Science: Torturing the Data in Statistics

Statistics, through its methods, techniques and models rooted in mathematical reasoning, allows exploring, analyzing and summarizing a given set of data, and is used to support decision-making, experiments and theories, and ultimately to gain and communicate insights. When used adequately, statistics can prove to be a useful toolset; however, as soon as its use deviates from the mathematical rigor and principles on which it was built, it can easily be misused. Moreover, results obtained with the help of statistics can easily be distorted in communication, even when the statistical results themselves are valid.

The ease with which statistics can be misused is probably best reflected in sayings like 'if you torture the data long enough it will confess'. Several sources attribute this formulation to the economist Ronald H Coase; however, according to Coase, the remark he made in the 1960s was slightly different: 'if you torture the data enough, nature will always confess' (see [1]). The latter formulation is not necessarily negative if one considers the persistence researchers need to reveal nature's secrets. The former formulation, in contrast, seems to stress only the negative aspect.

The word 'torture' is used where 'abuse' would do, but as a metaphor it carries more weight: it draws attention and sticks with the reader or audience. As the Quote Investigator remarks [1], 'torturing the data' was employed as a metaphor much earlier. For example, a 1933 article contains the following passage:

"The evidence submitted by the committee from its own questionnaire warrants no such conclusion. To torture the data given in Table I into evidence supporting a twelve-hour minimum of professional training is indeed a statistical feat, but one which the committee accomplishes to its own satisfaction." ("The Elementary School Journal" Vol. 33 (7), 1933)

More than a decade earlier, in a context similar to that of Coase's quote, John Dewey remarked:

"Active experimentation must force the apparent facts of nature into forms different to those in which they familiarly present themselves; and thus make them tell the truth about themselves, as torture may compel an unwilling witness to reveal what he has been concealing." (John Dewey, "Reconstruction in Philosophy", 1920)

Torture has been used metaphorically since the 1600s, if we consider the following quote from Sir Francis Bacon’s 'Advancement of Learning':

"Another diversity of Methods is according to the subject or matter which is handled; for there is a great difference in delivery of the Mathematics, which are the most abstracted of knowledges, and Policy, which is the most immersed […], yet we see how that opinion, besides the weakness of it, hath been of ill desert towards learning, as that which taketh the way to reduce learning to certain empty and barren generalities; being but the very husks and shells of sciences, all the kernel being forced out and expulsed with the torture and press of the method." (Sir Francis Bacon, "Advancement of Learning", 1605)

However, a similar metaphor with a closer meaning can be found more than two centuries later:

"One very reprehensible mode of theory-making consists, after honest deductions from a few facts have been made, in torturing other facts to suit the end proposed, in omitting some, and in making use of any authority that may lend assistance to the object desired; while all those which militate against it are carefully put on one side or doubted." (Henry De la Beche, "Sections and Views, Illustrative of Geological Phaenomena", 1830)

The following quote from Goethe probably also deserves some attention:

"Someday someone will write a pathology of experimental physics and bring to light all those swindles which subvert our reason, beguile our judgement and, what is worse, stand in the way of any practical progress. The phenomena must be freed once and for all from their grim torture chamber of empiricism, mechanism, and dogmatism; they must be brought before the jury of man's common sense." (Johann Wolfgang von Goethe)

Alternatives to Coase’s formulation were used in several later sources, replacing 'data' with 'statistics' or 'numbers':

"Beware of the problem of testing too many hypotheses; the more you torture the data, the more likely they are to confess, but confessions obtained under duress may not be admissible in the court of scientific opinion." (Stephen M Stigler, "Neutral Models in Biology", 1987)

"Torture numbers, and they will confess to anything." (Gregg Easterbrook, New Republic, 1989)

"[…] an honest exploratory study should indicate how many comparisons were made […] most experts agree that large numbers of comparisons will produce apparently statistically significant findings that are actually due to chance. The data torturer will act as if every positive result confirmed a major hypothesis. The honest investigator will limit the study to focused questions, all of which make biologic sense. The cautious reader should look at the number of ‘significant’ results in the context of how many comparisons were made." (James L Mills, "Data torturing", New England Journal of Medicine, 1993)

"This is true only if you torture the statistics until they produce the confession you want." (Larry Schweikart, "Myths of the 1980s Distort Debate over Tax Cuts", 2001)

"Even properly done statistics can’t be trusted. The plethora of available statistical techniques and analyses grants researchers an enormous amount of freedom when analyzing their data, and it is trivially easy to ‘torture the data until it confesses’." (Alex Reinhart, "Statistics Done Wrong: The Woefully Complete Guide", 2015)
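The warning from Stigler and Mills about making many comparisons can be made concrete with a little arithmetic. The sketch below is only an illustration, not anything taken from the quoted sources: it assumes independent tests, every null hypothesis being true, and a significance threshold of 0.05. Under those assumptions the chance of at least one 'significant' finding among m comparisons is 1 - (1 - 0.05)^m. The function names are made up for the example.

```python
import random


def familywise_error_rate(num_tests: int, alpha: float = 0.05) -> float:
    """Chance of at least one false positive among independent tests
    when every null hypothesis is actually true (illustrative assumption)."""
    return 1.0 - (1.0 - alpha) ** num_tests


def simulate_false_positives(num_tests: int, alpha: float = 0.05,
                             trials: int = 10_000, seed: int = 0) -> float:
    """Estimate the same chance by simulation.

    Under a true null hypothesis a p-value is uniformly distributed on [0, 1],
    so each comparison is modelled here as one uniform random draw.
    """
    rng = random.Random(seed)
    studies_with_a_hit = 0
    for _ in range(trials):
        # One simulated "study": num_tests comparisons, none with a real effect.
        if any(rng.random() < alpha for _ in range(num_tests)):
            studies_with_a_hit += 1
    return studies_with_a_hit / trials


if __name__ == "__main__":
    for m in (1, 5, 20, 100):
        print(f"{m:>3} comparisons: "
              f"formula {familywise_error_rate(m):.3f}, "
              f"simulation {simulate_false_positives(m):.3f}")
```

With 20 comparisons the formula gives roughly 0.64, so a study with no real effects at all would still report something 'significant' about two times out of three. This is why the number of comparisons made matters when reading the number of significant results.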

There is also a psychological component to torturing data or facts to fit reality, a tendency derived from the way the human mind works and from the limits and fallacies associated with its workings.

"What are the models? Well, the first rule is that you’ve got to have multiple models - because if you just have one or two that you’re using, the nature of human psychology is such that you’ll torture reality so that it fits your models, or at least you’ll think it does." (Charles Munger, 1994)

Independently of the formulation and context used, the fact remains: statistics (aka data, numbers) can be easily abused, and the reader/audience should be aware of it!

Previously published on quotablemath.blogspot.com.

🔭Data Science: Intelligence (Just the Quotes)

"To be able to discern that what is true is true, and that what is false is false, - this is the mark and character of intelligence." (Ralph W Emerson, "Essays", 1841)

"We study the complex in the simple; and only from the intuition of the lower can we safely proceed to the intellection of the higher degrees. The only danger lies in the leaping from low to high, with the neglect of the intervening gradations." (Samuel T Coleridge, "Physiology of Life", 1848)

"The accidental causes of science are only 'accidents' relatively to the intelligence of a man." (Chauncey Wright, "The Genesis of Species", North American Review, 1871)

"Does the harmony the human intelligence thinks it discovers in nature exist outside of this intelligence? No, beyond doubt, a reality completely independent of the mind which conceives it, sees or feels it, is an impossibility." (Henri Poincaré, "The Value of Science", 1905)

"No one can predict how far we shall be enabled by means of our limited intelligence to penetrate into the mysteries of a universe immeasurably vast and wonderful; nevertheless, each step in advance is certain to bring new blessings to humanity and new inspiration to greater endeavor." (Theodore W Richards, "The Fundamental Properties of the Elements", [Faraday lecture] 1911)

"It may be impossible for human intelligence to comprehend absolute truth, but it is possible to observe Nature with an unbiased mind and to bear truthful testimony of things seen." (Sir Richard A Gregory, "Discovery, Or, The Spirit and Service of Science", 1916)

"In other words then, if a machine is expected to be infallible, it cannot also be intelligent. There are several theorems which say almost exactly that. But these theorems say nothing about how much intelligence may be displayed if a machine makes no pretense at infallibility." (Alan M Turing, 1946)

"A computer would deserve to be called intelligent if it could deceive a human into believing that it was human." (Alan Turing, "Computing Machinery and Intelligence", Mind Vol. 59, 1950)

"All intelligent endeavor stands with one foot on observation and the other on contemplation." (Gerald Holton & Duane H D Roller, "Foundations of Modern Physical Science", 1950)

"What in fact is the schema of the object? In one essential respect it is a schema belonging to intelligence. To have the concept of an object is to attribute the perceived figure to a substantial basis, so that the figure and the substance that it thus indicates continue to exist outside the perceptual field. The permanence of the object seen from this viewpoint is not only a product of intelligence, but constitutes the very first of those fundamental ideas of conservation which we shall see developing within the thought process." (Jean Piaget, "The Psychology of Intelligence", 1950)

"[…] observation is not enough, and it seems to me that in science, as in the arts, there is very little worth having that does not require the exercise of intuition as well as of intelligence, the use of imagination as well as of information." (Kathleen Lonsdale, "Facts About Crystals", American Scientist Vol. 39 (4), 1951)

"Concepts are for me specific mental abilities exercised in acts of judgment, and expressed in the intelligent use of words (though not exclusively in such use)." (Peter T Geach, "Mental Acts: Their Content and their Objects", 1954)

"The following are some aspects of the artificial intelligence problem: […] If a machine can do a job, then an automatic calculator can be programmed to simulate the machine. […] It may be speculated that a large part of human thought consists of manipulating words according to rules of reasoning and rules of conjecture. From this point of view, forming a generalization consists of admitting a new word and some rules whereby sentences containing it imply and are implied by others. This idea has never been very precisely formulated nor have examples been worked out. […] How can a set of (hypothetical) neurons be arranged so as to form concepts. […] to get a measure of the efficiency of a calculation it is necessary to have on hand a method of measuring the complexity of calculating devices which in turn can be done. […] Probably a truly intelligent machine will carry out activities which may best be described as self-improvement. […] A number of types of 'abstraction' can be distinctly defined and several others less distinctly. […] the difference between creative thinking and unimaginative competent thinking lies in the injection of a some randomness. The randomness must be guided by intuition to be efficient." (John McCarthy et al, "A Proposal for the Dartmouth Summer Research Project on Artificial Intelligence", 1955)

"Solving problems is the specific achievement of intelligence." (George Polya, 1957)

"Computers do not decrease the need for mathematical analysis, but rather greatly increase this need. They actually extend the use of analysis into the fields of computers and computation, the former area being almost unknown until recently, the latter never having been as intensively investigated as its importance warrants. Finally, it is up to the user of computational equipment to define his needs in terms of his problems. In any case, computers can never eliminate the need for problem-solving through human ingenuity and intelligence." (Richard E Bellman & Paul Brock, "On the Concepts of a Problem and Problem-Solving", American Mathematical Monthly 67, 1960)

"Let an ultraintelligent machine be defined as a machine that can far surpass all the intellectual activities of any man however clever. Since the design of machines is one of these intellectual activities, an ultraintelligent machine could design even better machines; there would then unquestionably be an 'intelligence explosion', and the intelligence of man would be left far behind. Thus the first ultraintelligent machine is the last invention that man need ever make." (Irving J Good, "Speculations Concerning the First Ultraintelligent Machine", Advances in Computers Vol. 6, 1965)

"When intelligent machines are constructed, we should not be surprised to find them as confused and as stubborn as men in their convictions about mind-matter, consciousness, free will, and the like." (Marvin Minsky, "Matter, Mind, and Models", Proceedings of the International Federation of Information Processing Congress Vol. 1 (49), 1965)

"Artificial intelligence is the science of making machines do things that would require intelligence if done by men." (Marvin Minsky, 1968)

"Intelligence has two parts, which we shall call the epistemological and the heuristic. The epistemological part is the representation of the world in such a form that the solution of problems follows from the facts expressed in the representation. The heuristic part is the mechanism that on the basis of the information solves the problem and decides what to do." (John McCarthy & Patrick J Hayes, "Some Philosophical Problems from the Standpoint of Artificial Intelligence", Machine Intelligence 4, 1969)

"Questions are the engines of intellect, the cerebral machines which convert energy to motion, and curiosity to controlled inquiry." (David H Fischer, "Historians’ Fallacies", 1970)

"Man is not a machine, [...] although man most certainly processes information, he does not necessarily process it in the way computers do. Computers and men are not species of the same genus. [...] No other organism, and certainly no computer, can be made to confront genuine human problems in human terms. [...] However much intelligence computers may attain, now or in the future, theirs must always be an intelligence alien to genuine human problems and concerns." (Joseph Weizenbaum, "Computer Power and Human Reason: From Judgment to Calculation", 1976)

"Play is the only way the highest intelligence of humankind can unfold." (Joseph C Pearce, "Magical Child: Rediscovering Nature's Plan for Our Children", 1977)

"Because of mathematical indeterminacy and the uncertainty principle, it may be a law of nature that no nervous system is capable of acquiring enough knowledge to significantly predict the future of any other intelligent system in detail. Nor can intelligent minds gain enough self-knowledge to know their own future, capture fate, and in this sense eliminate free will." (Edward O Wilson, "On Human Nature", 1978)

"Collective intelligence emerges when a group of people work together effectively. Collective intelligence can be additive (each adds his or her part which together form the whole) or it can be synergetic, where the whole is greater than the sum of its parts." (Trudy and Peter Johnson-Lenz, "Groupware: Orchestrating the Emergence of Collective Intelligence", ca. 1980)

"Knowing a great deal is not the same as being smart; intelligence is not information alone but also judgement, the manner in which information is coordinated and used." (Carl Sagan, "Cosmos", 1980)

"The basic idea of cognitive science is that intelligent beings are semantic engines - in other words, automatic formal systems with interpretations under which they consistently make sense. We can now see why this includes psychology and artificial intelligence on a more or less equal footing: people and intelligent computers (if and when there are any) turn out to be merely different manifestations of the same underlying phenomenon. Moreover, with universal hardware, any semantic engine can in principle be formally imitated by a computer if only the right program can be found." (John Haugeland, "Semantic Engines: An introduction to mind design", 1981)

"There is a tendency to mistake data for wisdom, just as there has always been a tendency to confuse logic with values, intelligence with insight. Unobstructed access to facts can produce unlimited good only if it is matched by the desire and ability to find out what they mean and where they lead." (Norman Cousins, "Human Options: An Autobiographical Notebook", 1981)

"Cybernetic information theory suggests the possibility of assuming that intelligence is a feature of any feedback system that manifests a capacity for learning." (Paul Hawken et al, "Seven Tomorrows", 1982)

"We lose all intelligence by averaging." (John Naisbitt, "Megatrends: Ten New Directions Transforming Our Lives", 1982)

"Artificial intelligence is based on the assumption that the mind can be described as some kind of formal system manipulating symbols that stand for things in the world. Thus it doesn't matter what the brain is made of, or what it uses for tokens in the great game of thinking. Using an equivalent set of tokens and rules, we can do thinking with a digital computer, just as we can play chess using cups, salt and pepper shakers, knives, forks, and spoons. Using the right software, one system (the mind) can be mapped onto the other (the computer)." (George Johnson, "Machinery of the Mind: Inside the New Science of Artificial Intelligence", 1986)

"Cybernetics is simultaneously the most important science of the age and the least recognized and understood. It is neither robotics nor freezing dead people. It is not limited to computer applications and it has as much to say about human interactions as it does about machine intelligence. Today’s cybernetics is at the root of major revolutions in biology, artificial intelligence, neural modeling, psychology, education, and mathematics. At last there is a unifying framework that suspends long-held differences between science and art, and between external reality and internal belief." (Paul Pangaro, "New Order From Old: The Rise of Second-Order Cybernetics and Its Implications for Machine Intelligence", 1988)

"A popular myth says that the invention of the computer diminishes our sense of ourselves, because it shows that rational thought is not special to human beings, but can be carried on by a mere machine. It is a short step from there to the conclusion that intelligence is mechanical, which many people find to be an affront to all that is most precious and singular about their humanness." (Jeremy Campbell, "The Improbable Machine", 1989)

"Fuzziness, then, is a concomitant of complexity. This implies that as the complexity of a task, or of a system for performing that task, exceeds a certain threshold, the system must necessarily become fuzzy in nature. Thus, with the rapid increase in the complexity of the information processing tasks which the computers are called upon to perform, we are reaching a point where computers will have to be designed for processing of information in fuzzy form. In fact, it is the capability to manipulate fuzzy concepts that distinguishes human intelligence from the machine intelligence of current generation computers. Without such capability we cannot build machines that can summarize written text, translate well from one natural language to another, or perform many other tasks that humans can do with ease because of their ability to manipulate fuzzy concepts." (Lotfi A Zadeh, "The Birth and Evolution of Fuzzy Logic", 1989)

"Modeling underlies our ability to think and imagine, to use signs and language, to communicate, to generalize from experience, to deal with the unexpected, and to make sense out of the raw bombardment of our sensations. It allows us to see patterns, to appreciate, predict, and manipulate processes and things, and to express meaning and purpose. In short, it is one of the most essential activities of the human mind. It is the foundation of what we call intelligent behavior and is a large part of what makes us human. We are, in a word, modelers: creatures that build and use models routinely, habitually – sometimes even compulsively – to face, understand, and interact with reality." (Jeff Rothenberg, "The Nature of Modeling" [in "Artificial Intelligence, Simulation, and Modeling"], 1989)

"We haven't worked on ways to develop a higher social intelligence […] We need this higher intelligence to operate socially or we're not going to survive. […] If we don't manage things socially, individual high intelligence is not going to make much difference. [...] Ordinary thought in society is incoherent - it is going in all sorts of directions, with thoughts conflicting and canceling each other out. But if people were to think together in a coherent way, it would have tremendous power." (David Bohm, "New Age Journal", 1989)

"[Language comprehension] involves many components of intelligence: recognition of words, decoding them into meanings, segmenting word sequences into grammatical constituents, combining meanings into statements, inferring connections among statements, holding in short-term memory earlier concepts while processing later discourse, inferring the writer’s or speaker’s intentions, schematization of the gist of a passage, and memory retrieval in answering questions about the passage. [… The reader] constructs a mental representation of the situation and actions being described. […] Readers tend to remember the mental model they constructed from a text, rather than the text itself." (Gordon H Bower & Daniel G Morrow, 1990)

"The insight at the root of artificial intelligence was that these 'bits' (manipulated by computers) could just as well stand as symbols for concepts that the machine would combine by the strict rules of logic or the looser associations of psychology." (Daniel Crevier, "AI: The tumultuous history of the search for artificial intelligence", 1993)

"The leading edge of growth of intelligence is at the cultural and societal level. It is like a mind that is struggling to wake up. This is necessary because the most difficult problems we face are now collective ones. They are caused by complex global interactions and are beyond the scope of individuals to understand and solve. Individual mind, with its isolated viewpoints and narrow interests, is no longer enough." (Jeff Wright, "Basic Beliefs", [email] 1995)

"Adaptation is the process of changing a system during its operation in a dynamically changing environment. Learning and interaction are elements of this process. Without adaptation there is no intelligence." (Nikola K Kasabov, "Foundations of Neural Networks, Fuzzy Systems, and Knowledge Engineering", 1996)

"Artificial intelligence comprises methods, tools, and systems for solving problems that normally require the intelligence of humans. The term intelligence is always defined as the ability to learn effectively, to react adaptively, to make proper decisions, to communicate in language or images in a sophisticated way, and to understand." (Nikola K Kasabov, "Foundations of Neural Networks, Fuzzy Systems, and Knowledge Engineering", 1996)

"Learning is the process of obtaining new knowledge. It results in a better reaction to the same inputs at the next session of operation. It means improvement. It is a step toward adaptation. Learning is a major characteristic of intelligent systems." (Nikola K Kasabov, "Foundations of Neural Networks, Fuzzy Systems, and Knowledge Engineering", 1996)

"Intelligence is: (a) the most complex phenomenon in the Universe; or (b) a profoundly simple process. The answer, of course, is (c) both of the above. It's another one of those great dualities that make life interesting." (Ray Kurzweil, "The Age of Spiritual Machines: When Computers Exceed Human Intelligence", 1999)

"It [collective intelligence] is a form of universally distributed intelligence, constantly enhanced, coordinated in real time, and resulting in the effective mobilization of skills. I'll add the following indispensable characteristic to this definition: The basis and goal of collective intelligence is mutual recognition and enrichment of individuals rather than the cult of fetishized or hypostatized communities." (Pierre Levy, "Collective Intelligence", 1999)

"It is, however, fair to say that very few applications of swarm intelligence have been developed. One of the main reasons for this relative lack of success resides in the fact that swarm-intelligent systems are hard to 'program', because the paths to problem solving are not predefined but emergent in these systems and result from interactions among individuals and between individuals and their environment as much as from the behaviors of the individuals themselves. Therefore, using a swarm-intelligent system to solve a problem requires a thorough knowledge not only of what individual behaviors must be implemented but also of what interactions are needed to produce such or such global behavior." (Eric Bonabeau et al, "Swarm Intelligence: From Natural to Artificial Systems", 1999)

"Once a computer achieves human intelligence it will necessarily roar past it." (Ray Kurzweil, "The Age of Spiritual Machines: When Computers Exceed Human Intelligence", 1999)

"[…] when software systems become so intractable that they can no longer be controlled, swarm intelligence offers an alternative way of designing ‘intelligent’ systems, in which autonomy, emergence, and distributed functioning replace control, preprogramming, and centralization." (Eric Bonabeau et al, "Swarm Intelligence: From Natural to Artificial Systems", 1999)

"With the growing interest in complex adaptive systems, artificial life, swarms and simulated societies, the concept of 'collective intelligence' is coming more and more to the fore. The basic idea is that a group of individuals (e.g. people, insects, robots, or software agents) can be smart in a way that none of its members is. Complex, apparently intelligent behavior may emerge from the synergy created by simple interactions between individuals that follow simple rules." (Francis Heylighen, "Collective Intelligence and its Implementation on the Web", 1999)

"Ecological rationality uses reason – rational reconstruction – to examine the behavior of individuals based on their experience and folk knowledge, who are ‘naïve’ in their ability to apply constructivist tools to the decisions they make; to understand the emergent order in human cultures; to discover the possible intelligence embodied in the rules, norms and institutions of our cultural and biological heritage that are created from human interactions but not by deliberate human design. People follow rules without being able to articulate them, but they can be discovered." (Vernon L Smith, "Constructivist and ecological rationality in economics", 2002)

"But intelligence is not just a matter of acting or behaving intelligently. Behavior is a manifestation of intelligence, but not the central characteristic or primary definition of being intelligent. A moment's reflection proves this: You can be intelligent just lying in the dark, thinking and understanding. Ignoring what goes on in your head and focusing instead on behavior has been a large impediment to understanding intelligence and building intelligent machines." (Jeff Hawkins, "On Intelligence", 2004)

"Evolution moves towards greater complexity, greater elegance, greater knowledge, greater intelligence, greater beauty, greater creativity, and greater levels of subtle attributes such as love. […] Of course, even the accelerating growth of evolution never achieves an infinite level, but as it explodes exponentially it certainly moves rapidly in that direction." (Ray Kurzweil, "The Singularity is Near", 2005)

"Swarm Intelligence can be defined more precisely as: Any attempt to design algorithms or distributed problem-solving methods inspired by the collective behavior of the social insect colonies or other animal societies. The main properties of such systems are flexibility, robustness, decentralization and self-organization." ("Swarm Intelligence in Data Mining", Ed. Ajith Abraham et al, 2006)

"Swarm intelligence is sometimes also referred to as mob intelligence. Swarm intelligence uses large groups of agents to solve complicated problems. Swarm intelligence uses a combination of accumulation, teamwork, and voting to produce solutions. Accumulation occurs when agents contribute parts of a solution to a group. Teamwork occurs when different agents or subgroups of agents accidentally or purposefully work on different parts of a large problem. Voting occurs when agents propose solutions or components of solutions and the other agents vote explicitly by rating the proposal’s quality or vote implicitly by choosing whether to follow the proposal." (Michael J North & Charles M Macal, "Managing Business Complexity: Discovering Strategic Solutions with Agent-Based Modeling and Simulation", 2007)

"The brain and its cognitive mental processes are the biological foundation for creating metaphors about the world and oneself. Artificial intelligence, human beings’ attempt to transcend their biology, tries to enter into these scenarios to learn how they function. But there is another metaphor of the world that has its own particular landscapes, inhabitants, and laws. The brain provides the organic structure that is necessary for generating the mind, which in turn is considered a process that results from brain activity." (Diego Rasskin-Gutman, "Chess Metaphors: Artificial Intelligence and the Human Mind", 2009)

"Cultures are never merely intellectual constructs. They take form through the collective intelligence and memory, through a commonly held psychology and emotions, through spiritual and artistic communion." (Tariq Ramadan, "Islam and the Arab Awakening", 2012)

"An intuition is neither caprice nor a sixth sense but a form of unconscious intelligence." (Gerd Gigerenzer, "Risk Savvy", 2015)

"Artificial intelligence is the elucidation of the human learning process, the quantification of the human thinking process, the explication of human behavior, and the understanding of what makes intelligence possible." (Kai-Fu Lee, "AI Superpowers: China, Silicon Valley, and the New World Order", 2018)

"Deep learning has instead given us machines with truly impressive abilities but no intelligence. The difference is profound and lies in the absence of a model of reality." (Judea Pearl, "The Book of Why: The New Science of Cause and Effect", 2018)

"AI won’t be fool proof in the future since it will only [be] as good as the data and information that we give it to learn. It could be the case that simple elementary tricks could fool the AI algorithm and it may serve a complete waste of output as a result." (Zoltan Andrejkovics, "Together: AI and Human. On the Same Side", 2019)

"People who assume that extensions of modern machine learning methods like deep learning will somehow 'train up', or learn to be intelligent like humans, do not understand the fundamental limitations that are already known. Admitting the necessity of supplying a bias to learning systems is tantamount to Turing’s observing that insights about mathematics must be supplied by human minds from outside formal methods, since machine learning bias is determined, prior to learning, by human designers." (Erik J Larson, "The Myth of Artificial Intelligence: Why Computers Can’t Think the Way We Do", 2021)

More quotes on "Intelligence" at the-web-of-knowledge.blogspot.com

01 November 2018

🔭Data Science: Black Boxes (Just the Quotes)

"The terms 'black box' and 'white box' are convenient and figurative expressions of not very well determined usage. I shall understand by a black box a piece of apparatus, such as four-terminal networks with two input and two output terminals, which performs a definite operation on the present and past of the input potential, but for which we do not necessarily have any information of the structure by which this operation is performed. On the other hand, a white box will be similar network in which we have built in the relation between input and output potentials in accordance with a definite structural plan for securing a previously determined input-output relation." (Norbert Wiener, "Cybernetics: Or Control and Communication in the Animal and the Machine", 1948)

"The definition of a ‘good model’ is when everything inside it is visible, inspectable and testable. It can be communicated effortlessly to others. A ‘bad model’ is a model that does not meet these standards, where parts are hidden, undefined or concealed and it cannot be inspected or tested; these are often labelled black box models." (Hördur V Haraldsson & Harald U Sverdrup, "Finding Simplicity in Complexity in Biogeochemical Modelling" [in "Environmental Modelling: Finding Simplicity in Complexity", Ed. by John Wainwright and Mark Mulligan, 2004])

"Operational thinking is about mapping relationships. It is about capturing interactions, interconnections, the sequence and flow of activities, and the rules of the game. It is about how systems do what they do, or the dynamic process of using elements of the structure to produce the desired functions. In a nutshell, it is about unlocking the black box that lies between system input and system output." (Jamshid Gharajedaghi, "Systems Thinking: Managing Chaos and Complexity A Platform for Designing Business Architecture" 3rd Ed., 2011)

"The transparency of Bayesian networks distinguishes them from most other approaches to machine learning, which tend to produce inscrutable 'black boxes'. In a Bayesian network you can follow every step and understand how and why each piece of evidence changed the network’s beliefs." (Judea Pearl & Dana Mackenzie, "The Book of Why: The new science of cause and effect", 2018)

"A recurring theme in machine learning is combining predictions across multiple models. There are techniques called bagging and boosting which seek to tweak the data and fit many estimates to it. Averaging across these can give a better prediction than any one model on its own. But here a serious problem arises: it is then very hard to explain what the model is (often referred to as a 'black box'). It is now a mixture of many, perhaps a thousand or more, models." (Robert Grant, "Data Visualization: Charts, Maps and Interactive Graphics", 2019)

"Deep neural networks have an input layer and an output layer. In between, are “hidden layers” that process the input data by adjusting various weights in order to make the output correspond closely to what is being predicted. [...] The mysterious part is not the fancy words, but that no one truly understands how the pattern recognition inside those hidden layers works. That’s why they’re called 'hidden'. They are an inscrutable black box - which is okay if you believe that computers are smarter than humans, but troubling otherwise." (Gary Smith & Jay Cordes, "The 9 Pitfalls of Data Science", 2019)

"The concept of integrated information is clearest when applied to networks. Imagine a black box with input and output terminals. Inside are some electronics, such as a network with logic elements (AND, OR, and so on) wired together. Viewed from the outside, it will usually not be possible to deduce the circuit layout simply by examining the cause–effect relationship between inputs and outputs, because functionally equivalent black boxes can be built from very different circuits. But if the box is opened, it’s a different story. Suppose you use a pair of cutters to sever some wires in the network. Now rerun the system with all manner of inputs. If a few snips dramatically alter the outputs, the circuit can be described as highly integrated, whereas in a circuit with low integration the effect of some snips may make no difference at all." (Paul Davies, "The Demon in the Machine: How Hidden Webs of Information Are Solving the Mystery of Life", 2019)

"Big data is revolutionizing the world around us, and it is easy to feel alienated by tales of computers handing down decisions made in ways we don’t understand. I think we’re right to be concerned. Modern data analytics can produce some miraculous results, but big data is often less trustworthy than small data. Small data can typically be scrutinized; big data tends to be locked away in the vaults of Silicon Valley. The simple statistical tools used to analyze small datasets are usually easy to check; pattern-recognizing algorithms can all too easily be mysterious and commercially sensitive black boxes." (Tim Harford, "The Data Detective: Ten easy rules to make sense of statistics", 2020)

"If the data that go into the analysis are flawed, the specific technical details of the analysis don’t matter. One can obtain stupid results from bad data without any statistical trickery. And this is often how bullshit arguments are created, deliberately or otherwise. To catch this sort of bullshit, you don’t have to unpack the black box. All you have to do is think carefully about the data that went into the black box and the results that came out. Are the data unbiased, reasonable, and relevant to the problem at hand? Do the results pass basic plausibility checks? Do they support whatever conclusions are drawn?" (Carl T Bergstrom & Jevin D West, "Calling Bullshit: The Art of Skepticism in a Data-Driven World", 2020)

"This problem with adding additional variables is referred to as the curse of dimensionality. If you add enough variables into your black box, you will eventually find a combination of variables that performs well - but it may do so by chance. As you increase the number of variables you use to make your predictions, you need exponentially more data to distinguish true predictive capacity from luck." (Carl T Bergstrom & Jevin D West, "Calling Bullshit: The Art of Skepticism in a Data-Driven World", 2020)

🔭Data Science: Probabilistic Models (Just the Quotes)

"A deterministic system is one in which the parts interact in a perfectly predictable way. There is never any room for doubt: given a last state of the system and the programme of information by defining its dynamic network, it is always possible to predict, without any risk of error, its succeeding state. A probabilistic system, on the other hand, is one about which no precisely detailed prediction can be given. The system may be studied intently, and it may become more and more possible to say what it is likely to do in any given circumstances. But the system simply is not predetermined, and a prediction affecting it can never escape from the logical limitations of the probabilities in which terms alone its behaviour can be described." (Stafford Beer, "Cybernetics and Management", 1959)

"[...] there can be such a thing as a simple probabilistic system. For example, consider the tossing of a penny. Here is a perfectly simple system, but one which is notoriously unpredictable. It may be described in terms of a binary decision process, with a built-in even probability between the two possible outcomes." (Stafford Beer, "Cybernetics and Management", 1959)

"When loops are present, the network is no longer singly connected and local propagation schemes will invariably run into trouble. [...] If we ignore the existence of loops and permit the nodes to continue communicating with each other as if the network were singly connected, messages may circulate indefinitely around the loops and process may not converge to a stable equilibrium. […] Such oscillations do not normally occur in probabilistic networks […] which tend to bring all messages to some stable equilibrium as time goes on. However, this asymptotic equilibrium is not coherent, in the sense that it does not represent the posterior probabilities of all nodes of the network." (Judea Pearl, "Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference", 1988)

"We will use the convenient expression 'chosen at random' to mean that the probabilities of the events in the sample space are all the same unless some modifying words are near to the words 'at random'. Usually we will compute the probability of the outcome based on the uniform probability model since that is very common in modeling simple situations. However, a uniform distribution does not imply that it comes from a random source; […]" (Richard W Hamming, "The Art of Probability for Scientists and Engineers", 1991)

"Exploratory data analysis (EDA) is a collection of techniques that reveal (or search for) structure in a data set before calculating any probabilistic model. Its purpose is to obtain information about the data distribution (univariate or multivariate), about the presence of outliers and clusters, to disclose relationships and correlations between objects and/or variables." (Ildiko E  Frank & Roberto Todeschini, "The Data Analysis Handbook", 1994)

"To understand what kinds of problems are solvable by the Monte Carlo method, it is important to note that the method enables simulation of any process whose development is influenced by random factors. Second, for many mathematical problems involving no chance, the method enables us to artificially construct a probabilistic model (or several such models), making possible the solution of the problems." (Ilya M Sobol, "A Primer for the Monte Carlo Method", 1994)

"The role of graphs in probabilistic and statistical modeling is threefold: (1) to provide convenient means of expressing substantive assumptions; (2) to facilitate economical representation of joint probability functions; and (3) to facilitate efficient inferences from observations." (Judea Pearl, "Causality: Models, Reasoning, and Inference", 2000)

"The nice thing with Monte Carlo is that you play a game of let’s pretend, like this: first of all there are ten scenarios with different probabilities, so let’s first pick a probability. The dice in this case is a random number generator in the computer. You roll the dice and pick a scenario to work with. Then you roll the dice for a certain speed, and you roll the dice again to see what direction it took. The last thing is that it collided with the bottom at an unknown time so you roll dice for the unknown time. So now you have speed, direction, starting point, time. Given them all, I know precisely where it [could have] hit the bottom. You have the computer put a point there. Rolling dice, I come up with different factors for each scenario. If I had enough patience, I could do it with pencil and paper. We calculated ten thousand points. So you have ten thousand points on the bottom of the ocean that represent equally likely positions of the sub. Then you draw a grid, count the points in each cell of the grid, saying that 10% of the points fall in this cell, 1% in that cell, and those percentages are what you use for probabilities for the prior for the individual distributions." (Henry R Richardson) [in (Sharon B McGrayne, "The Theory That Would Not Die", 2011)]

"A major advantage of probabilistic models is that they can be easily applied to virtually any data type (or mixed data type), as long as an appropriate generative model is available for each mixture component. [...] A downside of probabilistic models is that they try to fit the data to a particular kind of distribution, which may often not be appropriate for the underlying data. Furthermore, as the number of model parameters increases, over-fitting becomes more common. In such cases, the outliers may fit the underlying model of normal data. Many parametric models are also harder to interpret in terms of intensional knowledge, especially when the parameters of the model cannot be intuitively presented to an analyst in terms of underlying attributes. This can defeat one of the important purposes of anomaly detection, which is to provide diagnostic understanding of the abnormal data generative process." (Charu C Aggarwal, "Outlier Analysis", 2013)

"The process of using a probabilistic model to answer a query, given evidence." (Avi Pfeffer, "Practical Probabilistic Programming", 2016)

"Monte Carlo simulations handle uncertainty by using a computer’s random number generator to determine outcomes. Done over and over again, the simulations show the distribution of the possible outcomes. [...] The beauty of these Monte Carlo simulations is that they allow users to see the probabilistic consequences of their decisions, so that they can make informed choices. [...] Monte Carlo simulations are one of the most valuable applications of data science because they can be used to analyze virtually any uncertain situation where we are able to specify the nature of the uncertainty [...]" (Gary Smith & Jay Cordes, "The 9 Pitfalls of Data Science", 2019)

"A simple probabilistic model would not be sufficient to generate the fantastic diversity we see." (Wolfgang Pauli)

31 October 2018

🔭Data Science: Deep Learning (Just the Quotes)

"Despite the enormous success of deep learning, relatively little is understood theoretically about why these techniques are so successful at feature learning and compression." (Pankaj Mehta & David J Schwab, "An exact mapping between the Variational Renormalization Group and Deep Learning", 2014)

"Deep learning is about using a stacked hierarchy of feature detectors. [...] we use pattern detectors and we build them into networks that are arranged in hundreds of layers and then we adjust the links between these layers, usually using some kind of gradient descent." (Joscha Bach, "Joscha: Computational Meta-Psychology", 2015)

"The power of deep learning models comes from their ability to classify or predict nonlinear data using a modest number of parallel nonlinear steps. A deep learning model learns the input data features hierarchy all the way from raw data input to the actual classification of the data. Each layer extracts features from the output of the previous layer." (N D Lewis, "Deep Learning Made Easy with R: A Gentle Introduction for Data Science", 2016)

"Although deep learning systems share some similarities with machine learning systems, certain characteristics make them sufficiently distinct. For example, conventional machine learning systems tend to be simpler and have fewer options for training. DL systems are noticeably more sophisticated; they each have a set of training algorithms, along with several parameters regarding the systems’ architecture. This is one of the reasons we consider them a distinct framework in data science." (Yunus E Bulut & Zacharias Voulgaris, "AI for Data Science: Artificial Intelligence Frameworks and Functionality for Deep Learning, Optimization, and Beyond", 2018)

"Deep learning broadly describes the large family of neural network architectures that contain multiple, interacting hidden layers." (Benjamin Bengfort et al, "Applied Text Analysis with Python", 2018)

"Deep learning has instead given us machines with truly impressive abilities but no intelligence. The difference is profound and lies in the absence of a model of reality." (Judea Pearl, "The Book of Why: The New Science of Cause and Effect", 2018)

"DL systems also tend to be more autonomous than their machine counterparts. To some extent, DL systems can do their own feature engineering. More conventional systems tend to require more fine-tuning of the feature-set, and sometimes require dimensionality reduction to provide any decent results. In addition, the generalization of conventional ML systems when provided with additional data generally don’t improve as much as DL systems. This is also one of the key characteristics that makes DL systems a preferable option when big data is involved." (Yunus E Bulut & Zacharias Voulgaris, "AI for Data Science: Artificial Intelligence Frameworks and Functionality for Deep Learning, Optimization, and Beyond", 2018)

"[…] deep learning has succeeded primarily by showing that certain questions or tasks we thought were difficult are in fact not. It has not addressed the truly difficult questions that continue to prevent us from achieving humanlike AI." (Judea Pearl & Dana Mackenzie, "The Book of Why: The new science of cause and effect", 2018)

"In essence, deep learning models are just chains of functions, which means that many deep learning libraries tend to have a functional or verbose, declarative style." (Benjamin Bengfort et al, "Applied Text Analysis with Python", 2018)

"The second big myth of data science is that every data science project needs big data and needs to use deep learning. In general, having more data helps, but having the right data is the more important requirement" (John D Kelleher & Brendan Tierney, "Data Science", 2018)

"People who assume that extensions of modern machine learning methods like deep learning will somehow 'train up', or learn to be intelligent like humans, do not understand the fundamental limitations that are already known. Admitting the necessity of supplying a bias to learning systems is tantamount to Turing’s observing that insights about mathematics must be supplied by human minds from outside formal methods, since machine learning bias is determined, prior to learning, by human designers." (Erik J Larson, "The Myth of Artificial Intelligence: Why Computers Can’t Think the Way We Do", 2021)

30 October 2018

💠🛠️SQL Server: Administration (Troubleshooting Login Failed for User)

    Since the installation of a SQL Server 2017 instance on a virtual machine (VM) in the Microsoft Cloud, records with the following message have been appearing in the error log:

Login failed for user '<domain>\<computer>$'. Reason: Could not find a login matching the name provided. [CLIENT: <local machine>]
Error: 18456, Severity: 14, State: 5.


   From the text it looked like a permission problem, which the documentation confirmed (see [1]): the error number and state correspond to a "User ID is not valid" situation. As a first step I attempted to give permissions to the local account (dollar sign included). The account wasn’t found in the Active Directory (AD), though by typing it directly into the “Login name” field I managed to temporarily give it sysadmin permissions. The error continued to appear in the error log. I then looked at the accounts under which the SQL Server services run and found nothing suspicious.

   Except for the error message, which was appearing with alarming frequency (a few seconds apart), everything seemed to be working on the server. The volume of records bloating the error log (a few hundred thousand over a few days), as well as the fact that I didn’t know what was going on, made me take the time to investigate the issue further.

  Looking today at the Windows Logs for Applications, I observed that the error was caused by an account used for the Microsoft SQL Server IaaS Agent and IaaS Query Service. Once I gave permissions to the account, the error disappeared.

   The search for a best practice on what permissions to give to the IaaS Agent and IaaS Query Service led me to [2]. To quote, the “Agent Service needs Local System rights to be able to install and configure SQL Server, attach disks and enable storage pool and manage automated security patching of Windows and SQL server”, while the “IaaS Query Service is started with an NT Service account which is a Sys Admin on the SQL Server”. In fact, this was the only resource I found that made a reference to the IaaS Query Service.
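
   Giving the permissions can be sketched in T-SQL roughly as follows. This is only an illustration under assumptions: <account_name> is a placeholder for the account reported in the Windows Application log (its actual name is not given here), and membership in the sysadmin role simply follows the quote from [2]. It is a broad permission, so it is worth checking whether it is appropriate for a given environment.

```sql
-- sketch only: <account_name> is a placeholder for the account
-- reported in the Windows Application log
USE master;

-- creating the Windows login if it doesn't exist yet
IF NOT EXISTS (SELECT 1 FROM sys.server_principals WHERE name = N'<account_name>')
    CREATE LOGIN [<account_name>] FROM WINDOWS;

-- adding the login to the sysadmin server role (SQL Server 2012 and later)
ALTER SERVER ROLE sysadmin ADD MEMBER [<account_name>];
```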

   This was just one of the many scenarios in which the above error appears. For more information see for example  [3], [4] or [5].

References:
[1] Microsoft (2017) MSSQLSERVER_18456 [Online] Available from: https://docs.microsoft.com/en-us/sql/relational-databases/errors-events/mssqlserver-18456-database-engine-error?view=sql-server-2017
[2] SQL Database Engine Blog (2018) SQL Server IaaS Extension Query Service for SQL Server on Azure VM, by Mine Tokus Altug [Online] Available from:  https://blogs.msdn.microsoft.com/sqlserverstorageengine/2018/10/25/sql-server-iaas-extension-query-service-for-sql-server-on-azure-vm/
[3] Microsoft Support (2018) "Login failed for user" error message when you log on to SQL Server [Online] Available from: https://support.microsoft.com/en-sg/help/555332/login-failed-for-user-error-message-when-you-log-on-to-sql-server
[4] Microsoft Technet (2018) How to Troubleshoot Connecting to the SQL Server Database Engine [Online] Available from: https://social.technet.microsoft.com/wiki/contents/articles/2102.how-to-troubleshoot-connecting-to-the-sql-server-database-engine.aspx
[5] Microsoft Blogs (2011) Troubleshoot Connectivity/Login failures (18456 State x) with SQL Server, by Sakthivel Chidambaram [Online] Available from: https://blogs.msdn.microsoft.com/sqlsakthi/2011/02/06/troubleshoot-connectivitylogin-failures-18456-state-x-with-sql-server/

29 October 2018

💠🛠️SQL Server: Administration (Searching the Error Log)

    Searching for a needle in a haystack is an achievable task, though it may turn out to be daunting. The same can be said about searching for a piece of information in the SQL Server error log. Fortunately, there is xp_readerrorlog, an undocumented (extended) stored procedure, which helps in the process. The stored procedure makes the content of the error log available and provides basic search capabilities via a small set of parameters. For example, it can be used to search for errors, warnings, failed backups, consistency checks, failed logins, database instant file initialization events, and so on. It helps identify whether an event occurred and the time at which it occurred.

   The following parameters are available with the stored procedure:

Parameter  Name          Type          Description
1          FileToRead    int           0 = Current, 1 or 2, 3, … n Archive Number
2          Logtype       int           1 = SQL Error Log and 2 = SQL Agent log
3          String1       varchar(255)  the string to match the logs on
4          String2       varchar(255)  a second string to match in combination with String1 (AND)
5          StartDate     datetime      beginning date to look from
6          EndDate       datetime      ending date to look up to
7          ResultsOrder                ASC or DESC sorting


Note:
If the SQL Server Agent hasn’t been active, then there will be no Agent log and the call to the stored procedure will return an error.

   Here are a few examples of using the stored procedure:

-- listing the content of the current SQL Server error log
EXEC xp_readerrorlog 0, 1

-- listing the content of the second SQL Server error log
EXEC xp_readerrorlog 1, 1

-- listing the content of the current SQL Server Agent log
EXEC xp_readerrorlog 0, 2

-- searching for errors 
EXEC xp_readerrorlog 0, 1, N'error'

-- searching for errors that have to do with consistency checks
EXEC xp_readerrorlog 0, 1, N'error', N'CHECKDB'

-- searching for failed backups
EXEC xp_readerrorlog 0, 1, N'failed', N'backups'

-- searching for warnings 
EXEC xp_readerrorlog 0, 1, N'warning'

-- searching who killed a session
EXEC xp_readerrorlog 0, 1, N'kill'

-- searching for I/O information
EXEC xp_readerrorlog 0, 1, N'I/O'

-- searching for consistency checks 
EXEC xp_readerrorlog 0, 1, N'CHECKDB'

-- searching for consistency checks performed via DBCC
EXEC xp_readerrorlog 0, 1, N'DBCC CHECKDB'

-- searching for failed logins  
EXEC xp_readerrorlog 0, 1, N'Login failed'

-- searching for entries tagged [INFO]
EXEC xp_readerrorlog 0, 1, N'[INFO]'

-- searching for shutdowns 
EXEC xp_readerrorlog 0, 1, N'shutdown'

-- searching for a database instant file initialization event  
EXEC xp_readerrorlog 0, 1, N'database instant file initialization'

   If the error log is too big, it’s important to narrow the search to a given time interval:

-- searching for errors starting with a given date 
DECLARE @StartDate as Date = DateAdd(d, -1, GetDate())
EXEC xp_readerrorlog 0, 1, N'error', N'', @StartDate

-- searching for errors within a time interval 
DECLARE @StartDate as Date = DateAdd(d, -14, GetDate())
DECLARE @EndDate as Date = DateAdd(d, -7, GetDate())
EXEC xp_readerrorlog 0, 1, N'', N'', @StartDate, @EndDate, N'desc' 

   The output can be dumped into a table, especially when a detailed analysis of the error log is needed. It might be interesting to check how often an error message occurred, as in the example below. One can thus take advantage of more complex pattern searching.

-- creating the error log table 
CREATE TABLE dbo.ErrorLogMessages (
    LogDate datetime2(0) 
  , ProcessInfo nvarchar(255)
  , [Text] nvarchar(max))

-- loading the errors 
INSERT INTO dbo.ErrorLogMessages
EXEC xp_readerrorlog 0, 1

-- checking the results 
SELECT *
FROM dbo.ErrorLogMessages

-- checking messages frequency 
SELECT [Text]
, count(*) NoOccurrences
, Min(LogDate) FirstOccurrence
FROM dbo.ErrorLogMessages
GROUP BY [Text]
HAVING count(*)>1
ORDER BY NoOccurrences DESC

-- getting the errors and the message logged right after each of them
SELECT *
FROM (
 SELECT *
 , Lead([Text], 1) OVER (PARTITION BY LogDate, ProcessInfo ORDER BY LogDate) NextMessage -- Lead returns the following row, which usually holds the error's description
 FROM dbo.ErrorLogMessages
 ) DAT
WHERE [Text] LIKE '%error:%[0-9]%'

-- cleaning up 
--DROP TABLE IF EXISTS dbo.ErrorLogMessages 
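
   The same approach can be extended to the archived logs. The following is only a sketch under assumptions: the temporary table #ErrorLogAll and the variables are names made up for the illustration, and it is assumed that the archives 1 and 2 exist on the instance (the call fails for a missing archive). It loads the current log and the two archives, then counts the failed logins per day.

```sql
-- staging table with the three columns returned by xp_readerrorlog
CREATE TABLE #ErrorLogAll (
    LogDate datetime2(0)
  , ProcessInfo nvarchar(255)
  , [Text] nvarchar(max));

DECLARE @LogNumber int = 0;      -- 0 = current log
DECLARE @LastLogNumber int = 2;  -- assumption: archives 1 and 2 exist

-- loading the current log and the archives one by one
WHILE @LogNumber <= @LastLogNumber
BEGIN
    INSERT INTO #ErrorLogAll
    EXEC xp_readerrorlog @LogNumber, 1;

    SET @LogNumber = @LogNumber + 1;
END

-- checking the number of failed logins per day
SELECT Cast(LogDate as date) LogDay
, count(*) NoOccurrences
FROM #ErrorLogAll
WHERE [Text] LIKE 'Login failed%'
GROUP BY Cast(LogDate as date)
ORDER BY LogDay;

-- cleaning up
DROP TABLE #ErrorLogAll;
```

   A temporary table is used so that the sketch doesn't interfere with dbo.ErrorLogMessages from the previous examples.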

   For those who don’t have admin permissions it is necessary to explicitly give execute permissions on the xp_readerrorlog stored procedure:

-- giving explicit permissions to account
GRANT EXECUTE ON xp_readerrorlog TO [<account_name>]

   Personally, I’ve been using the stored procedure mainly to check whether error messages were logged for a given time interval and whether the consistency checks ran without problems. Occasionally, I used it to check for failed logins or session terminations (aka kills).

Notes:
Microsoft warns that undocumented objects might change in future releases. Fortunately, xp_readerrorlog has survived from SQL Server 2005 to SQL Server 2017, so it might make it further…
The above code was also tested on SQL Server 2017.

Happy coding!
