12 April 2006

🖍️Bart Kosko - Collected Quotes

"A bell curve shows the 'spread' or variance in our knowledge or certainty. The wider the bell the less we know. An infinitely wide bell is a flat line. Then we know nothing. The value of the quantity, position, or speed could lie anywhere on the axis. An infinitely narrow bell is a spike that is infinitely tall. Then we have complete knowledge of the value of the quantity. The uncertainty principle says that as one bell curve gets wider the other gets thinner. As one curve peaks the other spreads. So if the position bell curve becomes a spike and we have total knowledge of position, then the speed bell curve goes flat and we have total uncertainty (infinite variance) of speed." (Bart Kosko, "Fuzzy Thinking: The new science of fuzzy logic", 1993)

"Bivalence trades accuracy for simplicity. Binary outcomes of yes and no, white and black, true and false simplify math and computer processing. You can work with strings of 0s and 1s more easily than you can work with fractions. But bivalence requires some force fitting and rounding off [...] Bivalence holds at cube corners. Multivalence holds everywhere else." (Bart Kosko, "Fuzzy Thinking: The new science of fuzzy logic", 1993)

"Fuzziness has a formal name in science: multivalence. The opposite of fuzziness is bivalence or two-valuedness, two ways to answer each question, true or false, 1 or 0. Fuzziness means multivalence. It means three or more options, perhaps an infinite spectrum of options, instead of just two extremes. It means analog instead of binary, infinite shades of gray between black and white." (Bart Kosko, "Fuzzy Thinking: The new science of fuzzy logic", 1993)

"The binary logic of modern computers often falls short when describing the vagueness of the real world. Fuzzy logic offers more graceful alternatives." (Bart Kosko & Satoru Isaka, "Fuzzy Logic,” Scientific American Vol. 269, 1993)

"A bit involves both probability and an experiment that decides a binary or yes-no question. Consider flipping a coin. One bit of in-formation is what we learn from the flip of a fair coin. With an unfair or biased coin the odds are other than even because either heads or tails is more likely to appear after the flip. We learn less from flipping the biased coin because there is less surprise in the outcome on average. Shannon's bit-based concept of entropy is just the average information of the experiment. What we gain in information from the coin flip we lose in uncertainty or entropy." (Bart Kosko, "Noise", 2006)

"A signal has a finite-length frequency spectrum only if it lasts infinitely long in time. So a finite spectrum implies infinite time and vice versa. The reverse also holds in the ideal world of mathematics: A signal is finite in time only if it has a frequency spectrum that is infinite in extent." (Bart Kosko, "Noise", 2006)

"Bell curves don't differ that much in their bells. They differ in their tails. The tails describe how frequently rare events occur. They describe whether rare events really are so rare. This leads to the saying that the devil is in the tails." (Bart Kosko, "Noise", 2006)

"Chaos can leave statistical footprints that look like noise. This can arise from simple systems that are deterministic and not random. [...] The surprising mathematical fact is that most systems are chaotic. Change the starting value ever so slightly and soon the system wanders off on a new chaotic path no matter how close the starting point of the new path was to the starting point of the old path. Mathematicians call this sensitivity to initial conditions but many scientists just call it the butterfly effect. And what holds in math seems to hold in the real world - more and more systems appear to be chaotic." (Bart Kosko, "Noise", 2006)

"'Chaos' refers to systems that are very sensitive to small changes in their inputs. A minuscule change in a chaotic communication system can flip a 0 to a 1 or vice versa. This is the so-called butterfly effect: Small changes in the input of a chaotic system can produce large changes in the output. Suppose a butterfly flaps its wings in a slightly different way. can change its flight path. The change in flight path can in time change how a swarm of butterflies migrates." (Bart Kosko, "Noise", 2006)

"I wage war on noise every day as part of my work as a scientist and engineer. We try to maximize signal-to-noise ratios. We try to filter noise out of measurements of sounds or images or anything else that conveys information from the world around us. We code the transmission of digital messages with extra 0s and 1s to defeat line noise and burst noise and any other form of interference. We design sophisticated algorithms to track noise and then cancel it in headphones or in a sonogram. Some of us even teach classes on how to defeat this nemesis of the digital age. Such action further conditions our anti-noise reflexes." (Bart Kosko, "Noise", 2006)

"Linear systems do not benefit from noise because the output of a linear system is just a simple scaled version of the input [...] Put noise in a linear system and you get out noise. Sometimes you get out a lot more noise than you put in. This can produce explosive effects in feedback systems that take their own outputs as inputs." (Bart Kosko, "Noise", 2006)

"Many scientists who work not just with noise but with probability make a common mistake: They assume that a bell curve is automatically Gauss's bell curve. Empirical tests with real data can often show that such an assumption is false. The result can be a noise model that grossly misrepresents the real noise pattern. It also favors a limited view of what counts as normal versus non-normal or abnormal behavior. This assumption is especially troubling when applied to human behavior. It can also lead one to dismiss extreme data as error when in fact the data is part of a pattern." (Bart Kosko, "Noise", 2006)

"Noise is a signal we don't like. Noise has two parts. The first has to do with the head and the second with the heart. The first part is the scientific or objective part: Noise is a signal. [...] The second part of noise is the subjective part: It deals with values. It deals with how we draw the fuzzy line between good signals and bad signals. Noise signals are the bad signals. They are the unwanted signals that mask or corrupt our preferred signals. They not only interfere but they tend to interfere at random." (Bart Kosko, "Noise", 2006)

"Noise is an unwanted signal. A signal is anything that conveys information or ultimately anything that has energy. The universe consists of a great deal of energy. Indeed a working definition of the universe is all energy anywhere ever. So the answer turns on how one defines what it means to be wanted and by whom." (Bart Kosko, "Noise", 2006)

"The central limit theorem differs from laws of large numbers because random variables vary and so they differ from constants such as population means. The central limit theorem says that certain independent random effects converge not to a constant population value such as the mean rate of unemployment but rather they converge to a random variable that has its own Gaussian bell-curve description." (Bart Kosko, "Noise", 2006)

"The flaw in the classical thinking is the assumption that variance equals dispersion. Variance tends to exaggerate outlying data because it squares the distance between the data and their mean. This mathematical artifact gives too much weight to rotten apples. It can also result in an infinite value in the face of impulsive data or noise. [...] Yet dispersion remains an elusive concept. It refers to the width of a probability bell curve in the special but important case of a bell curve. But most probability curves don't have a bell shape. And its relation to a bell curve's width is not exact in general. We know in general only that the dispersion increases as the bell gets wider. A single number controls the dispersion for stable bell curves and indeed for all stable probability curves - but not all bell curves are stable curves." (Bart Kosko, "Noise", 2006)

More quotes from Bart Kosko at QuotableMath.blogspot.com.

🖍️Tim Harford - Collected Quotes

"An algorithm, meanwhile, is a step-by-step recipe for performing a series of actions, and in most cases 'algorithm' means simply 'computer program'." (Tim Harford, "The Data Detective: Ten easy rules to make sense of statistics", 2020)

"Big data is revolutionizing the world around us, and it is easy to feel alienated by tales of computers handing down decisions made in ways we don’t understand. I think we’re right to be concerned. Modern data analytics can produce some miraculous results, but big data is often less trustworthy than small data. Small data can typically be scrutinized; big data tends to be locked away in the vaults of Silicon Valley. The simple statistical tools used to analyze small datasets are usually easy to check; pattern-recognizing algorithms can all too easily be mysterious and commercially sensitive black boxes." (Tim Harford, "The Data Detective: Ten easy rules to make sense of statistics", 2020)

"Each decision about what data to gather and how to analyze them is akin to standing on a pathway as it forks left and right and deciding which way to go. What seems like a few simple choices can quickly multiply into a labyrinth of different possibilities. Make one combination of choices and you’ll reach one conclusion; make another, equally reasonable, and you might find a very different pattern in the data." (Tim Harford, "The Data Detective: Ten easy rules to make sense of statistics", 2020)

"Each of us is sweating data, and those data are being mopped up and wrung out into oceans of information. Algorithms and large datasets are being used for everything from finding us love to deciding whether, if we are accused of a crime, we go to prison before the trial or are instead allowed to post bail. We all need to understand what these data are and how they can be exploited." (Tim Harford, "The Data Detective: Ten easy rules to make sense of statistics", 2020)

"Good statistics are not a trick, although they are a kind of magic. Good statistics are not smoke and mirrors; in fact, they help us see more clearly. Good statistics are like a telescope for an astronomer, a microscope for a bacteriologist, or an X-ray for a radiologist. If we are willing to let them, good statistics help us see things about the world around us and about ourselves - both large and small - that we would not be able to see in any other way." (Tim Harford, "The Data Detective: Ten easy rules to make sense of statistics", 2020)

"Ideally, a decision maker or a forecaster will combine the outside view and the inside view - or, similarly, statistics plus personal experience. But it’s much better to start with the statistical view, the outside view, and then modify it in the light of personal experience than it is to go the other way around. If you start with the inside view you have no real frame of reference, no sense of scale - and can easily come up with a probability that is ten times too large, or ten times too small." (Tim Harford, "The Data Detective: Ten easy rules to make sense of statistics", 2020)

"If we don’t understand the statistics, we’re likely to be badly mistaken about the way the world is. It is all too easy to convince ourselves that whatever we’ve seen with our own eyes is the whole truth; it isn’t. Understanding causation is tough even with good statistics, but hopeless without them. [...] And yet, if we understand only the statistics, we understand little. We need to be curious about the world that we see, hear, touch, and smell, as well as the world we can examine through a spreadsheet." (Tim Harford, "The Data Detective: Ten easy rules to make sense of statistics", 2020)

"[…] in a world where so many people seem to hold extreme views with strident certainty, you can deflate somebody’s overconfidence and moderate their politics simply by asking them to explain the details." (Tim Harford, "The Data Detective: Ten easy rules to make sense of statistics", 2020)

"It’d be nice to fondly imagine that high-quality statistics simply appear in a spreadsheet somewhere, divine providence from the numerical heavens. Yet any dataset begins with somebody deciding to collect the numbers. What numbers are and aren’t collected, what is and isn’t measured, and who is included or excluded are the result of all-too-human assumptions, preconceptions, and oversights." (Tim Harford, "The Data Detective: Ten easy rules to make sense of statistics", 2020)

"Making big data work is harder than it seems. Statisticians have spent the past two hundred years figuring out what traps lie in wait when we try to understand the world through data. The data are bigger, faster, and cheaper these days, but we must not pretend that the traps have all been made safe. They have not." (Tim Harford, "The Data Detective: Ten easy rules to make sense of statistics", 2020)

"Many people have strong intuitions about whether they would rather have a vital decision about them made by algorithms or humans. Some people are touchingly impressed by the capabilities of the algorithms; others have far too much faith in human judgment. The truth is that sometimes the algorithms will do better than the humans, and sometimes they won’t. If we want to avoid the problems and unlock the promise of big data, we’re going to need to assess the performance of the algorithms on a case-by-case basis. All too often, this is much harder than it should be. […] So the problem is not the algorithms, or the big datasets. The problem is a lack of scrutiny, transparency, and debate." (Tim Harford, "The Data Detective: Ten easy rules to make sense of statistics", 2020)

"Much of the data visualization that bombards us today is decoration at best, and distraction or even disinformation at worst. The decorative function is surprisingly common, perhaps because the data visualization teams of many media organizations are part of the art departments. They are led by people whose skills and experience are not in statistics but in illustration or graphic design. The emphasis is on the visualization, not on the data. It is, above all, a picture." (Tim Harford, "The Data Detective: Ten easy rules to make sense of statistics", 2020)

"Numbers can easily confuse us when they are unmoored from a clear definition." (Tim Harford, "The Data Detective: Ten easy rules to make sense of statistics", 2020)

"Premature enumeration is an equal-opportunity blunder: the most numerate among us may be just as much at risk as those who find their heads spinning at the first mention of a fraction. Indeed, if you’re confident with numbers you may be more prone than most to slicing and dicing, correlating and regressing, normalizing and rebasing, effortlessly manipulating the numbers on the spreadsheet or in the statistical package - without ever realizing that you don’t fully understand what these abstract quantities refer to. Arguably this temptation lay at the root of the last financial crisis: the sophistication of mathematical risk models obscured the question of how, exactly, risks were being measured, and whether those measurements were something you’d really want to bet your global banking system on." (Tim Harford, "The Data Detective: Ten easy rules to make sense of statistics", 2020)

"Sample error reflects the risk that, purely by chance, a randomly chosen sample of opinions does not reflect the true views of the population. The 'margin of error' reported in opinion polls reflects this risk, and the larger the sample, the smaller the margin of error. […] sampling error has a far more dangerous friend: sampling bias. Sampling error is when a randomly chosen sample doesn’t reflect the underlying population purely by chance; sampling bias is when the sample isn’t randomly chosen at all." (Tim Harford, "The Data Detective: Ten easy rules to make sense of statistics", 2020)

"Statistical metrics can show us facts and trends that would be impossible to see in any other way, but often they’re used as a substitute for relevant experience, by managers or politicians without specific expertise or a close-up view." (Tim Harford, "The Data Detective: Ten easy rules to make sense of statistics", 2020)

"Statisticians are sometimes dismissed as bean counters. The sneering term is misleading as well as unfair. Most of the concepts that matter in policy are not like beans; they are not merely difficult to count, but difficult to define. Once you’re sure what you mean by 'bean', the bean counting itself may come more easily. But if we don’t understand the definition, then there is little point in looking at the numbers. We have fooled ourselves before we have begun."(Tim Harford, "The Data Detective: Ten easy rules to make sense of statistics", 2020)

"So information is beautiful - but misinformation can be beautiful, too. And producing beautiful misinformation is becoming easier than ever." (Tim Harford, "The Data Detective: Ten easy rules to make sense of statistics", 2020)

"The contradiction between what we see with our own eyes and what the statistics claim can be very real. […] The truth is more complicated. Our personal experiences should not be dismissed along with our feelings, at least not without further thought. Sometimes the statistics give us a vastly better way to understand the world; sometimes they mislead us. We need to be wise enough to figure out when the statistics are in conflict with everyday experience - and in those cases, which to believe." (Tim Harford, "The Data Detective: Ten easy rules to make sense of statistics", 2020)

"The world is full of patterns that are too subtle or too rare to detect by eyeballing them, and a pattern doesn’t need to be very subtle or rare to be hard to spot without a statistical lens." (Tim Harford, "The Data Detective: Ten easy rules to make sense of statistics", 2020)

"The whole discipline of statistics is built on measuring or counting things. […] it is important to understand what is being measured or counted, and how. It is surprising how rarely we do this. Over the years, as I found myself trying to lead people out of statistical mazes week after week, I came to realize that many of the problems I encountered were because people had taken a wrong turn right at the start. They had dived into the mathematics of a statistical claim - asking about sampling errors and margins of error, debating if the number is rising or falling, believing, doubting, analyzing, dissecting - without taking the ti- me to understand the first and most obvious fact: What is being measured, or counted? What definition is being used?" (Tim Harford, "The Data Detective: Ten easy rules to make sense of statistics", 2020)

"Those of us in the business of communicating ideas need to go beyond the fact-check and the statistical smackdown. Facts are valuable things, and so is fact-checking. But if we really want people to understand complex issues, we need to engage their curiosity. If people are curious, they will learn." (Tim Harford, "The Data Detective: Ten easy rules to make sense of statistics", 2020)

"Unless we’re collecting data ourselves, there’s a limit to how much we can do to combat the problem of missing data. But we can and should remember to ask who or what might be missing from the data we’re being told about. Some missing numbers are obvious […]. Other omissions show up only when we take a close look at the claim in question." (Tim Harford, "The Data Detective: Ten easy rules to make sense of statistics", 2020)

"We don’t need to become emotionless processors of numerical information - just noticing our emotions and taking them into account may often be enough to improve our judgment. Rather than requiring superhuman control over our emotions, we need simply to develop good habits." (Tim Harford, "The Data Detective: Ten easy rules to make sense of statistics", 2020)

"We filter new information. If it accords with what we expect, we’ll be more likely to accept it. […] Our brains are always trying to make sense of the world around us based on incomplete information. The brain makes predictions about what it expects, and tends to fill in the gaps, often based on surprisingly sparse data." (Tim Harford, "The Data Detective: Ten easy rules to make sense of statistics", 2020)

"We should conclude nothing because that pair of numbers alone tells us very little. If we want to understand what’s happening, we need to step back and take in a broader perspective." (Tim Harford, "The Data Detective: Ten easy rules to make sense of statistics", 2020)

"[…] when it comes to interpreting the world around us, we need to realize that our feelings can trump our expertise. […] The more extreme the emotional reaction, the harder it is to think straight. […] It is not easy to master our emotions while assessing information that matters to us, not least because our emotions can lead us astray in different directions. […] We often find ways to dismiss evidence that we don’t like. And the opposite is true, too: when evidence seems to support our preconceptions, we are less likely to look too closely for flaws." (Tim Harford, "The Data Detective: Ten easy rules to make sense of statistics", 2020)

"When we are trying to understand a statistical claim - any statistical claim - we need to start by asking ourselves what the claim actually means. [...] A surprising statistical claim is a challenge to our existing worldview. It may provoke an emotional response - even a fearful one." (Tim Harford, "The Data Detective: Ten easy rules to make sense of statistics", 2020) 

🖍️Nikola K Kasabov - Collected Quotes

"A strategy is usually expressed by a set of heuristic rules. The heuristic rules ease the process of searching for an optimal solution. The process is usually iterative and at one step either the global optimum for the whole problem (state) space is found and the process stops, or a local optimum for a subspace of the state space of the problem is found and the problem continues, if it is possible to improve." (Nikola K Kasabov, "Foundations of Neural Networks, Fuzzy Systems, and Knowledge Engineering", 1996)

"Adaptation is the process of changing a system during its operation in a dynamically changing environment. Learning and interaction are elements of this process. Without adaptation there is no intelligence." (Nikola K Kasabov, "Foundations of Neural Networks, Fuzzy Systems, and Knowledge Engineering", 1996)

 "An artificial neural network (or simply a neural network) is a biologically inspired computational model that consists of processing elements (neurons) and connections between them, as well as of training and recall algorithms." (Nikola K Kasabov, "Foundations of Neural Networks, Fuzzy Systems, and Knowledge Engineering", 1996)

"Artificial intelligence comprises methods, tools, and systems for solving problems that normally require the intelligence of humans. The term intelligence is always defined as the ability to learn effectively, to react adaptively, to make proper decisions, to communicate in language or images in a sophisticated way, and to understand." (Nikola K Kasabov, "Foundations of Neural Networks, Fuzzy Systems, and Knowledge Engineering", 1996) 

"Data obtained without any external disturbance or corruption are called clean; noisy data mean that a small random ingredient is added to the clean data." (Nikola K Kasabov, "Foundations of Neural Networks, Fuzzy Systems, and Knowledge Engineering", 1996)

"Fuzzy systems are excellent tools for representing heuristic, commonsense rules. Fuzzy inference methods apply these rules to data and infer a solution. Neural networks are very efficient at learning heuristics from data. They are 'good problem solvers' when past data are available. Both fuzzy systems and neural networks are universal approximators in a sense, that is, for a given continuous objective function there will be a fuzzy system and a neural network which approximate it to any degree of accuracy." (Nikola K Kasabov, "Foundations of Neural Networks, Fuzzy Systems, and Knowledge Engineering", 1996)

"Fuzzy systems are rule-based expert systems based on fuzzy rules and fuzzy inference. Fuzzy rules represent in a straightforward way 'commonsense' knowledge and skills, or knowledge that is subjective, ambiguous, vague, or contradictory." (Nikola K Kasabov, "Foundations of Neural Networks, Fuzzy Systems, and Knowledge Engineering", 1996)

"Generalization is the process of matching new, unknown input data with the problem knowledge in order to obtain the best possible solution, or one close to it. Generalization means reacting properly to new situations, for example, recognizing new images, or classifying new objects and situations. Generalization can also be described as a transition from a particular object description to a general concept description. This is a major characteristic of all intelligent systems." (Nikola K Kasabov, "Foundations of Neural Networks, Fuzzy Systems, and Knowledge Engineering", 1996) 

"Generally speaking, problem knowledge for solving a given problem may consist of heuristic rules or formulas that comprise the explicit knowledge, and past-experience data that comprise the implicit, hidden knowledge. Knowledge represents links between the domain space and the solution space, the space of the independent variables and the space of the dependent variables." (Nikola K Kasabov, "Foundations of Neural Networks, Fuzzy Systems, and Knowledge Engineering", 1996)

"Heuristic (it is of Greek origin) means discovery. Heuristic methods are based on experience, rational ideas, and rules of thumb. Heuristics are based more on common sense than on mathematics. Heuristics are useful, for example, when the optimal solution needs an exhaustive search that is not realistic in terms of time. In principle, a heuristic does not guarantee the best solution, but a heuristic solution can provide a tremendous shortcut in cost and time." (Nikola K Kasabov, "Foundations of Neural Networks, Fuzzy Systems, and Knowledge Engineering", 1996)

"Heuristic methods may aim at local optimization rather than at global optimization, that is, the algorithm optimizes the solution stepwise, finding the best solution at each small step of the solution process and 'hoping' that the global solution, which comprises the local ones, would be satisfactory." (Nikola K Kasabov, "Foundations of Neural Networks, Fuzzy Systems, and Knowledge Engineering", 1996)

"Inference is the process of matching current facts from the domain space to the existing knowledge and inferring new facts. An inference process is a chain of matchings. The intermediate results obtained during the inference process are matched against the existing knowledge. The length of the chain is different. It depends on the knowledge base and on the inference method applied." (Nikola K Kasabov, "Foundations of Neural Networks, Fuzzy Systems, and Knowledge Engineering", 1996)

"Learning is the process of obtaining new knowledge. It results in a better reaction to the same inputs at the next session of operation. It means improvement. It is a step toward adaptation. Learning is a major characteristic of intelligent systems." (Nikola K Kasabov, "Foundations of Neural Networks, Fuzzy Systems, and Knowledge Engineering", 1996)

"Prediction (forecasting) is the process of generating information for the possible future development of a process from data about its past and its present development." (Nikola K Kasabov, "Foundations of Neural Networks, Fuzzy Systems, and Knowledge Engineering", 1996)

"Representation is the process of transforming existing problem knowledge to some of the known knowledge-engineering schemes in order to process it by applying knowledge-engineering methods. The result of the representation process is the problem knowledge base in a computer format." (Nikola K Kasabov, "Foundations of Neural Networks, Fuzzy Systems, and Knowledge Engineering", 1996)

"The most distinguishing property of fuzzy logic is that it deals with fuzzy propositions, that is, propositions which contain fuzzy variables and fuzzy values, for example, 'the temperature is high', 'the height is short'. The truth values for fuzzy propositions are not TRUE/FALSE only, as is the case in propositional boolean logic, but include all the grayness between two extreme values." (Nikola K Kasabov, "Foundations of Neural Networks, Fuzzy Systems, and Knowledge Engineering", 1996)

"Validation is the process of testing how good the solutions produced by a system are. The results produced by a system are usually compared with the results obtained either by experts or by other systems. Validation is an extremely important part of the process of developing every knowledge-based system. Without comparing the results produced by the system with reality, there is little point in using it." (Nikola K Kasabov, "Foundations of Neural Networks, Fuzzy Systems, and Knowledge Engineering", 1996)

11 April 2006

🖍️Matthew Kirk - Collected Quotes

"A good proxy for complexity in a machine learning model is how fast it takes to train it." (Matthew Kirk, "Thoughtful Machine Learning", 2015)

"Cross-validation is a method of splitting all of your data into two parts: training and validation. The training data is used to build the machine learning model, whereas the validation data is used to validate that the model is doing what is expected. This increases our ability to find and determine the underlying errors in a model." (Matthew Kirk, "Thoughtful Machine Learning", 2015)

"In statistics, there is a measure called power that denotes the probability of not finding a false negative. As power goes up, false negatives go down. However, what influences this measure is the sample size. If our sample size is too small, we just don’t have enough information to come up with a good solution." (Matthew Kirk, "Thoughtful Machine Learning", 2015)

"Machine learning is a science and requires an objective approach to problems. Just like the scientific method, test-driven development can aid in solving a problem. The reason that TDD and the scientific method are so similar is because of these three shared characteristics: Both propose that the solution is logical and valid. Both share results through documentation and work over time. Both work in feedback loops." (Matthew Kirk, "Thoughtful Machine Learning", 2015)

"Machine learning is the intersection between theoretically sound computer science and practically noisy data. Essentially, it’s about machines making sense out of data in much the same way that humans do." (Matthew Kirk, "Thoughtful Machine Learning", 2015)

"Machine learning is well suited for the unpredictable future, because most algorithms learn from new information. But as new information is found, it can also come in unstable forms, and new issues can arise that weren’t thought of before. We don’t know what we don’t know. When processing new information, it’s sometimes hard to tell whether our model is working." (Matthew Kirk, "Thoughtful Machine Learning", 2015)

"Precision and recall are ways of monitoring the power of the machine learning implementation. Precision is a metric that monitors the percentage of true positives. […] Recall is the ratio of true positives to true positive plus false negatives." (Matthew Kirk, "Thoughtful Machine Learning", 2015)

"Supervised learning, or function approximation, is simply fitting data to a function of any variety.  […] Unsupervised learning involves figuring out what makes the data special. […] Reinforcement learning involves figuring out how to play a multistage game with rewards and payoffs. Think of it as the algorithms that optimize the life of something." (Matthew Kirk, "Thoughtful Machine Learning", 2015)

"Underfitting is when a model doesn’t take into account enough information to accurately model real life. For example, if we observed only two points on an exponential curve, we would probably assert that there is a linear relationship there. But there may not be a pattern, because there are only two points to reference. [...] It seems that the best way to mitigate underfitting a model is to give it more information, but this actually can be a problem as well. More data can mean more noise and more problems. Using too much data and too complex of a model will yield something that works for that particular data set and nothing else." (Matthew Kirk, "Thoughtful Machine Learning", 2015)

🖍️DeWayne R Derryberry - Collected Quotes

"A complete data analysis will involve the following steps: (i) Finding a good model to fit the signal based on the data. (ii) Finding a good model to fit the noise, based on the residuals from the model. (iii) Adjusting variances, test statistics, confidence intervals, and predictions, based on the model for the noise.(DeWayne R Derryberry, "Basic data analysis for time series with R", 2014)

"A key difference between a traditional statistical problems and a time series problem is that often, in time series, the errors are not independent." (DeWayne R Derryberry, "Basic data analysis for time series with R", 2014)

"A stationary time series is one that has had trend elements (the signal) removed and that has a time invariant pattern in the random noise. In other words, although there is a pattern of serial correlation in the noise, that pattern seems to mimic a fixed mathematical model so that the same model fits any arbitrary, contiguous subset of the noise." (DeWayne R Derryberry, "Basic Data Analysis for Time Series with R" 1st Ed, 2014)

"A wide variety of statistical procedures (regression, t-tests, ANOVA) require three assumptions: (i) Normal observations or errors. (ii) Independent observations (or independent errors, which is equivalent, in normal linear models to independent observations). (iii) Equal variance - when that is appropriate (for the one-sample t-test, for example, there is nothing being compared, so equal variances do not apply).(DeWayne R Derryberry, "Basic data analysis for time series with R", 2014)

"Both real and simulated data are very important for data analysis. Simulated data is useful because it is known what process generated the data. Hence it is known what the estimated signal and noise should look like (simulated data actually has a well-defined signal and well-defined noise). In this setting, it is possible to know, in a concrete manner, how well the modeling process has worked." (DeWayne R Derryberry, "Basic Data Analysis for Time Series with R" 1st Ed, 2014)

"Either a logarithmic or a square-root transformation of the data would produce a new series more amenable to fit a simple trigonometric model. It is often the case that periodic time series have rounded minima and sharp-peaked maxima. In these cases, the square root or logarithmic transformation seems to work well most of the time.(DeWayne R Derryberry, "Basic data analysis for time series with R", 2014)

"For a confidence interval, the central limit theorem plays a role in the reliability of the interval because the sample mean is often approximately normal even when the underlying data is not. A prediction interval has no such protection. The shape of the interval reflects the shape of the underlying distribution. It is more important to examine carefully the normality assumption by checking the residuals […].(DeWayne R Derryberry, "Basic data analysis for time series with R", 2014)

"If the observations/errors are not independent, the statistical formulations are completely unreliable unless corrections can be made.(DeWayne R Derryberry, "Basic data analysis for time series with R", 2014)

"Not all data sets lend themselves to data splitting. The data set may be too small to split and/or the fitted model may be a local smoother. In the first case, there is too little data upon which to build a model if the data is split; and in the second case, it is not expected the model for any part of the data to directly interpolate/extrapolate to any other part of the model. For these cases, a different approach to cross-validation is possible, something similar to bootstrapping." (DeWayne R Derryberry, "Basic Data Analysis for Time Series with R" 1st Ed, 2014)

"Once a model has been fitted to the data, the deviations from the model are the residuals. If the model is appropriate, then the residuals mimic the true errors. Examination of the residuals often provides clues about departures from the modeling assumptions. Lack of fit - if there is curvature in the residuals, plotted versus the fitted values, this suggests there may be whole regions where the model overestimates the data and other whole regions where the model underestimates the data. This would suggest that the current model is too simple relative to some better model.(DeWayne R Derryberry, "Basic data analysis for time series with R", 2014)

"Prediction about the future assumes that the statistical model will continue to fit future data. There are several reasons this is often implausible, but it also seems clear that the model will often degenerate slowly in quality, so that the model will fit data only a few periods in the future almost as well as the data used to fit the model. To some degree, the reliability of extrapolation into the future involves subject-matter expertise.(DeWayne R Derryberry, "Basic data analysis for time series with R", 2014)

"[The normality] assumption is the least important one for the reliability of the statistical procedures under discussion. Violations of the normality assumption can be divided into two general forms: Distributions that have heavier tails than the normal and distributions that are skewed rather than symmetric. If data is skewed, the formulas we are discussing are still valid as long as the sample size is sufficiently large. Although the guidance about 'how skewed' and 'how large a sample' can be quite vague, since the greater the skew, the larger the required sample size. For the data commonly used in time series and for the sample sizes (which are generally quite large) used, skew is not a problem. On the other hand, heavy tails can be very problematic." (DeWayne R Derryberry, "Basic Data Analysis for Time Series with R" 1st Ed, 2014)

 "The random element in most data analysis is assumed to be white noise - normal errors independent of each other. In a time series, the errors are often linked so that independence cannot be assumed (the last examples). Modeling the nature of this dependence is the key to time series.(DeWayne R Derryberry, "Basic data analysis for time series with R", 2014)

"Transformations of data alter statistics. For example, the mean of a data set can be found, but it is not easy to relate the mean of a data set to the mean of the logarithm of that data set. The median is far friendlier to transformations. If the median of a data set is found, then the logarithm of the data set is analyzed; the median of the log transformed data will be the log of the original median.(DeWayne R Derryberry, "Basic data analysis for time series with R", 2014) 

"When data is not normal, the reason the formulas are working is usually the central limit theorem. For large sample sizes, the formulas are producing parameter estimates that are approximately normal even when the data is not itself normal. The central limit theorem does make some assumptions and one is that the mean and variance of the population exist. Outliers in the data are evidence that these assumptions may not be true. Persistent outliers in the data, ones that are not errors and cannot be otherwise explained, suggest that the usual procedures based on the central limit theorem are not applicable.(DeWayne R Derryberry, "Basic data analysis for time series with R", 2014)

"Whenever the data is periodic, at some level, there are only as many observations as the number of complete periods. This global feature of the data suggests caution in understanding more detailed features of the data. While a curvature model might be appropriate for this data, there is too little data to know this, and some skepticism might be in order if such a model were fitted to the data." (DeWayne R Derryberry, "Basic Data Analysis for Time Series with R" 1st Ed, 2014)

🖍️Erik J Larson - Collected Quotes

"A well-known theorem called the 'no free lunch' theorem proves exactly what we anecdotally witness when designing and building learning systems. The theorem states that any bias-free learning system will perform no better than chance when applied to arbitrary problems. This is a fancy way of stating that designers of systems must give the system a bias deliberately, so it learns what’s intended. As the theorem states, a truly bias- free system is useless." (Erik J Larson, "The Myth of Artificial Intelligence: Why Computers Can’t Think the Way We Do", 2021)

"First, intelligence is situational - there is no such thing as general intelligence. Your brain is one piece in a broader system which includes your body, your environment, other humans, and culture as a whole. Second, it is contextual - far from existing in a vacuum, any individual intelligence will always be both defined and limited by its environment. (And currently, the environment, not the brain, is acting as the bottleneck to intelligence.) Third, human intelligence is largely externalized, contained not in your brain but in your civilization. Think of individuals as tools, whose brains are modules in a cognitive system much larger than themselves - a system that is self-improving and has been for a long time." (Erik J Larson, "The Myth of Artificial Intelligence: Why Computers Can’t Think the Way We Do", 2021)

"Inference is to bring about a new thought, which in logic amounts to drawing a conclusion, and more generally involves using what we already know, and what we see or observe, to update prior beliefs. […] Inference is also a leap of sorts, deemed reasonable […] Inference is a basic cognitive act for intelligent minds. If a cognitive agent (a person, an AI system) is not intelligent, it will infer badly. But any system that infers at all must have some basic intelligence, because the very act of using what is known and what is observed to update beliefs is inescapably tied up with what we mean by intelligence. If an AI system is not inferring at all, it doesn’t really deserve to be called AI." (Erik J Larson, "The Myth of Artificial Intelligence: Why Computers Can’t Think the Way We Do", 2021)

"Machine learning bias is typically understood as a source of learning error, a technical problem. […] Machine learning bias can introduce error simply because the system doesn’t 'look' for certain solutions in the first place. But bias is actually necessary in machine learning - it’s part of learning itself." (Erik J Larson, "The Myth of Artificial Intelligence: Why Computers Can’t Think the Way We Do", 2021)

"People who assume that extensions of modern machine learning methods like deep learning will somehow 'train up', or learn to be intelligent like humans, do not understand the fundamental limitations that are already known. Admitting the necessity of supplying a bias to learning systems is tantamount to Turing’s observing that insights about mathematics must be supplied by human minds from outside formal methods, since machine learning bias is determined, prior to learning, by human designers." (Erik J Larson, "The Myth of Artificial Intelligence: Why Computers Can’t Think the Way We Do", 2021)

"[...] the focus on Big Data AI seems to be an excuse to put forth a number of vague and hand-waving theories, where the actual details and the ultimate success of neuroscience is handed over to quasi- mythological claims about the powers of large datasets and inductive computation. Where humans fail to illuminate a complicated domain with testable theory, machine learning and big data supposedly can step in and render traditional concerns about finding robust theories. This seems to be the logic of Data Brain efforts today." (Erik J Larson, "The Myth of Artificial Intelligence: Why Computers Can’t Think the Way We Do", 2021)

"The idea that we can predict the arrival of AI typically sneaks in a premise, to varying degrees acknowledged, that successes on narrow AI systems like playing games will scale up to general intelligence, and so the predictive line from artificial intelligence to artificial general intelligence can be drawn with some confidence. This is a bad assumption, both for encouraging progress in the field toward artificial general intelligence, and for the logic of the argument for prediction." (Erik J Larson, "The Myth of Artificial Intelligence: Why Computers Can’t Think the Way We Do", 2021)

"The problem-solving view of intelligence helps explain the production of invariably narrow applications of AI throughout its history. Game playing, for instance, has been a source of constant inspiration for the development of advanced AI techniques, but games are simplifications of life that reward simplified views of intelligence. […] Treating intelligence as problem-solving thus gives us narrow applications." (Erik J Larson, "The Myth of Artificial Intelligence: Why Computers Can’t Think the Way We Do", 2021)

"To accomplish their goals, what are now called machine learning systems must each learn something specific. Researchers call this giving the machine a 'bias'. […] A bias in machine learning means that the system is designed and tuned to learn something. But this is, of course, just the problem of producing narrow problem-solving applications." (Erik J Larson, "The Myth of Artificial Intelligence: Why Computers Can’t Think the Way We Do", 2021)

10 April 2006

🖍️Steven S Skiena - Collected Quotes

"Bias is error from incorrect assumptions built into the model, such as restricting an interpolating function to be linear instead of a higher-order curve. [...] Errors of bias produce underfit models. They do not fit the training data as tightly as possible, were they allowed the freedom to do so. In popular discourse, I associate the word 'bias' with prejudice, and the correspondence is fairly apt: an apriori assumption that one group is inferior to another will result in less accurate predictions than an unbiased one. Models that perform lousy on both training and testing data are underfit." (Steven S Skiena, "The Data Science Design Manual", 2017)

"Exploratory data analysis is the search for patterns and trends in a given data set. Visualization techniques play an important part in this quest. Looking carefully at your data is important for several reasons, including identifying mistakes in collection/processing, finding violations of statistical assumptions, and suggesting interesting hypotheses." (Steven S Skiena, "The Data Science Design Manual", 2017)

"Repeated observations of the same phenomenon do not always produce the same results, due to random noise or error. Sampling errors result when our observations capture unrepresentative circumstances, like measuring rush hour traffic on weekends as well as during the work week. Measurement errors reflect the limits of precision inherent in any sensing device. The notion of signal to noise ratio captures the degree to which a series of observations reflects a quantity of interest as opposed to data variance. As data scientists, we care about changes in the signal instead of the noise, and such variance often makes this problem surprisingly difficult." (Steven S Skiena, "The Data Science Design Manual", 2017)

"The advent of massive data sets is changing in the way science is done. The traditional scientific method is hypothesis driven. The researcher formulates a theory of how the world works, and then seeks to support or reject this hypothesis based on data. By contrast, data-driven science starts by assembling a substantial data set, and then hunts for patterns that ideally will play the role of hypotheses for future analysis." (Steven S Skiena, "The Data Science Design Manual", 2017)

"The danger of overfitting is particularly severe when the training data is not a perfect gold standard. Human class annotations are often subjective and inconsistent, leading boosting to amplify the noise at the expense of the signal. The best boosting algorithms will deal with overfitting though regularization. The goal will be to minimize the number of non-zero coefficients, and avoid large coefficients that place too much faith in any one classifier in the ensemble." (Steven S Skiena, "The Data Science Design Manual", 2017)

"Using noise (the uncorrelated variables) to fit noise (the residual left from a simple model on the genuinely correlated variables) is asking for trouble." (Steven S Skiena, "The Data Science Design Manual", 2017)

"Variables which follow symmetric, bell-shaped distributions tend to be nice as features in models. They show substantial variation, so they can be used to discriminate between things, but not over such a wide range that outliers are overwhelming." (Steven S Skiena, "The Data Science Design Manual", 2017)

"Variance is error from sensitivity to fluctuations in the training set. If our training set contains sampling or measurement error, this noise introduces variance into the resulting model. [...] Errors of variance result in overfit models: their quest for accuracy causes them to mistake noise for signal, and they adjust so well to the training data that noise leads them astray. Models that do much better on testing data than training data are overfit." (Steven S Skiena, "The Data Science Design Manual", 2017)

🖍️Arthur L Bowley - Collected Quotes

"A knowledge of statistics is like a knowledge of foreign languages or of algebra; it may prove of use at any time under any circumstances." (Arthur L Bowley, "Elements of Statistics", 1901)

"A statistical estimate may be good or bad, accurate or the reverse; but in almost all cases it is likely to be more accurate than a casual observer’s impression, and the nature of things can only be disproved by statistical methods." (Arthur L Bowley, "Elements of Statistics", 1901)

"Great numbers are not counted correctly to a unit, they are estimated; and we might perhaps point to this as a division between arithmetic and statistics, that whereas arithmetic attains exactness, statistics deals with estimates, sometimes very accurate, and very often sufficiently so for their purpose, but never mathematically exact." (Arthur L Bowley, "Elements of Statistics", 1901)

"Some of the common ways of producing a false statistical argument are to quote figures without their context, omitting the cautions as to their incompleteness, or to apply them to a group of phenomena quite different to that to which they in reality relate; to take these estimates referring to only part of a group as complete; to enumerate the events favorable to an argument, omitting the other side; and to argue hastily from effect to cause, this last error being the one most often fathered on to statistics. For all these elementary mistakes in logic, statistics is held responsible." (Arthur L Bowley, "Elements of Statistics", 1901)

"[…] statistics is the science of the measurement of the social organism, regarded as a whole, in all its manifestations." (Arthur L Bowley, "Elements of Statistics", 1901)

"Statistics may rightly be called the science of averages. […] Great numbers and the averages resulting from them, such as we always obtain in measuring social phenomena, have great inertia. […] It is this constancy of great numbers that makes statistical measurement possible. It is to great numbers that statistical measurement chiefly applies." (Arthur L Bowley, "Elements of Statistics", 1901)

"Statistics may, for instance, be called the science of counting. Counting appears at first sight to be a very simple operation, which any one can perform or which can be done automatically; but, as a matter of fact, when we come to large numbers, e.g., the population of the United Kingdom, counting is by no means easy, or within the power of an individual; limits of time and place alone prevent it being so carried out, and in no way can absolute accuracy be obtained when the numbers surpass certain limits." (Arthur L Bowley, "Elements of Statistics", 1901)

"By [diagrams] it is possible to present at a glance all the facts which could be obtained from figures as to the increase,  fluctuations, and relative importance of prices, quantities, and values of different classes of goods and trade with various countries; while the sharp irregularities of the curves give emphasis to the disturbing causes which produce any striking change." (Arthur L Bowley, "A Short Account of England's Foreign Trade in the Nineteenth Century, its Economic and Social Results", 1905)

"Of itself an arithmetic average is more likely to conceal than to disclose important facts; it is the nature of an abbreviation, and is often an excuse for laziness." (Arthur L Bowley, "The Nature and Purpose of the Measurement of Social Phenomena", 1915)

"[...] the problems of the errors that arise in the process of sampling have been chiefly discussed from the point of view of the universe, not of the sample; that is, the question has been how far will a sample represent a given universe? The practical question is, however, the converse: what can we infer about a universe from a given sample? This involves the difficult and elusive theory of inverse probability, for it may be put in the form, which of the various universes from which the sample may a priori have been drawn may be expected to have yielded that sample?" (Arthur L Bowley, "Elements of Statistics. 5th Ed., 1926)

"Statistics are numerical statements of facts in any department of inquiry, placed in relation to each other; statistical methods are devices for abbreviating and classifying the statements and making clear the relations." (Arthur L Bowley, "An Elementary Manual of Statistics", 1934)