01 January 2026

🔭Data Science: Policies (Just the Quotes)

"Every economic and social situation or problem is now described in statistical terms, and we feel that it is such statistics which give us the real basis of fact for understanding and analysing problems and difficulties, and for suggesting remedies. In the main we use such statistics or figures without any elaborate theoretical analysis; little beyond totals, simple averages and perhaps index numbers. Figures have become the language in which we describe our economy or particular parts of it, and the language in which we argue about policy." (Ely Devons,Essays in Economics", 1961)

"There are, indeed, plenty of ways in which statistics can help in the process of decision-taking. But exaggerated claims for the role they can play merely serve to confuse rather than clarify issues of public policy, and lead those responsible for action to oscillate between over-confidence and over-scepticism in using them." (Ely Devons,Essays in Economics", 1961)

"The formal structure of a decision problem in any area can be put into four parts: (1) the choice of an objective function denning the relative desirability of different outcomes; (2) specification of the policy alternatives which are available to the agent, or decisionmaker, (3) specification of the model, that is, empirical relations that link the objective function, or the variables that enter into it, with the policy alternatives and possibly other variables; and (4) computational methods for choosing among the policy alternatives that one which performs best as measured by the objective function." (Kenneth Arrow,The Economics of Information", 1984)

"Often, though, a policy or systems analyst is stuck with a bad model, that is, one that appeals to the analyst as adequately realistic but which is either: 1) contradicted by some data or is grossly implausible in some aspect it purports to represent, or 2) conjectural, that is, neither supported nor contradicted by data, either because data do not exist or because they are equivocal. [...] A model may have component parts that are not bad, but if, taken as a whole, it meets one of these criteria, it is a bad model." (James S Hodges, "Six (or So) Things You Can Do with a Bad Model", 1991)

"Management is not founded on observation and experiment, but on a drive towards a set of outcomes. These aims are not altogether explicit; at one extreme they may amount to no more than an intention to preserve the status quo, at the other extreme they may embody an obsessional demand for power, profit or prestige. But the scientist's quest for insight, for understanding, for wanting to know what makes the system tick, rarely figures in the manager's motivation. Secondly, and therefore, management is not, even in intention, separable from its own intentions and desires: its policies express them. Thirdly, management is not normally aware of the conventional nature of its intellectual processes and control procedures. It is accustomed to confuse its conventions for recording information with truths-about-the-business, its subjective institutional languages for discussing the business with an objective language of fact and its models of reality with reality itself." (Stanford Beer,Decision and Control", 1994)

"Garbage in, garbage out' is a sound warning for those in the computer field; it is every bit as sound in the use of statistics. Even if the 'garbage' which comes out leads to a correct conclusion, this conclusion is still tainted, as it cannot be supported by logical reasoning. Therefore, it is a misuse of statistics. But obtaining a correct conclusion from faulty data is the exception, not the rule. Bad basic data" (the 'garbage in') almost always leads to incorrect conclusions" (the 'garbage out'). Unfortunately, incorrect conclusions can lead to bad policy or harmful actions." (Herbert F Spirer et al,Misused Statistics" 2nd Ed, 1998)

"A sub-area of machine learning concerned with how an agent ought to take actions in an environment so as to maximize some notion of long-term reward. Reinforcement learning algorithms attempt to find a policy that maps states of the world to the actions the agent ought to take in those states. Differently from supervised learning, in this case there is no target value for each input pattern, only a reward based of how good or bad was the action taken by the agent in the existent environment." (Marley Vellasco et al,Hierarchical Neuro-Fuzzy Systems" Part II, Encyclopedia of Artificial Intelligence, 2009)

"There are three possible reasons for [the] absence of predictive power. First, it is possible that the models are misspecified. Second, it is possible that the model’s explanatory factors are measured at too high a level of aggregation [...] Third, [...] the search for statistically significant relationships may not be the strategy best suited for evaluating our model’s ability to explain real world events [...] the lack of predictive power is the result of too much emphasis having been placed on finding statistically significant variables, which may be overdetermined. Statistical significance is generally a flawed way to prune variables in regression models [...] Statistically significant variables may actually degrade the predictive accuracy of a model [...] [By using] models that are constructed on the basis of pruning undertaken with the shears of statistical significance, it is quite possible that we are winnowing our models away from predictive accuracy." (Michael D Ward et al,The perils of policy by p-value: predicting civil conflicts" Journal of Peace Research 47, 2010)

"Using random processes in our models allows economists to capture the variability of time series data, but it also poses challenges to model builders. As model builders, we must understand the uncertainty from two different perspectives. Consider first that of the econometrician, standing outside an economic model, who must assess its congruence with reality, inclusive of its random perturbations. An econometrician’s role is to choose among different parameters that together describe a family of possible models to best mimic measured real world time series and to test the implications of these models. I refer to this as outside uncertainty. Second, agents inside our model, be it consumers, entrepreneurs, or policy makers, must also confront uncertainty as they make decisions. I refer to this as inside uncertainty, as it pertains to the decision-makers within the model. What do these agents know? From what information can they learn? With how much confidence do they forecast the future? The modeler’s choice regarding insiders’ perspectives on an uncertain future can have significant consequences for each model’s equilibrium outcomes." (Lars P Hansen,Uncertainty Outside and Inside Economic Models", [Nobel lecture] 2013)

"Comparisons are the lifeblood of empirical studies. We can’t determine if a medicine, treatment, policy, or strategy is effective unless we compare it to some alternative. But watch out for superficial comparisons: comparisons of percentage changes in big numbers and small numbers, comparisons of things that have nothing in common except that they increase over time, comparisons of irrelevant data. All of these are like comparing apples to prunes." (Gary Smith,Standard Deviations", 2014)

"it stands, in the context of computational learning, for a family of algorithms aimed at approximating the best policy to play in a certain environment" (without building an explicit model of it) by increasing the probability of playing actions that improve the rewards received by the agent." (Fernando S Oliveira,Reinforcement Learning for Business Modeling", 2014)

"We know what forecasting is: you start in the present and try to look into the future and imagine what it will be like. Backcasting is the opposite: you state your desired vision of the future as if it’s already happened, and then work backward to imagine the practices, policies, programs, tools, training, and people who worked in concert in a hypothetical past" (which takes place in the future) to get you there." (Eben Hewitt,Technology Strategy Patterns: Architecture as strategy" 2nd Ed., 2019)

"Once we know something is fat-tailed, we can use heuristics to see how an exposure there reacts to random events: how much is a given unit harmed by them. It is vastly more effective to focus on being insulated from the harm of random events than try to figure them out in the required details" (as we saw the inferential errors under thick tails are huge). So it is more solid, much wiser, more ethical, and more effective to focus on detection heuristics and policies rather than fabricate statistical properties." (Nassim N Taleb,Statistical Consequences of Fat Tails: Real World Preasymptotics, Epistemology, and Applications" 2nd Ed., 2022)
