
07 April 2026

🤖〽️Prompt Engineering: Copilot Unabridged (Part 161: Why Guardrails and Safety Mechanisms Are Essential for Trustworthy AI)

Prompt Engineering Series


Prompt: "write a post of 600 words on the impact of guardrails and safety mechanisms in AI"

Introduction

As Artificial Intelligence (AI) becomes more powerful and more deeply integrated into our daily lives, one truth becomes impossible to ignore: capability alone isn’t enough. For AI to be genuinely useful, trustworthy, and safe, it must operate within a framework of well‑designed guardrails and safety mechanisms. These systems don’t limit innovation—they enable it. They ensure that AI behaves predictably, respects boundaries, and aligns with human values even as its capabilities grow.

Guardrails: The Structure That Keeps AI on Course

Guardrails are the rules, constraints, and design principles that define what an AI system should and should not do. They act like the lane markers on a highway - guiding the system toward its intended destination while preventing it from veering into dangerous territory.

Effective guardrails help AI:

  • Avoid harmful or inappropriate outputs
  • Stay within its domain of expertise
  • Respect ethical and legal boundaries
  • Interpret user requests safely and responsibly

Without guardrails, even well‑trained models can misinterpret intent, generate unsafe content, or take actions that conflict with human expectations. Guardrails don’t restrict intelligence - they shape it into something reliable.

Safety Mechanisms: The Fail‑Safes That Protect Users

Safety mechanisms complement guardrails by providing additional layers of protection. They monitor the AI’s behavior, detect potential risks, and intervene when necessary. Think of them as the airbags and anti‑lock brakes of AI systems - features you hope never activate, but you’re grateful for when they do.

These mechanisms include:

  • Content filters
  • Context‑aware refusal systems
  • Bias detection and mitigation tools
  • Monitoring systems that detect harmful patterns
  • Fallback responses when uncertainty is high

Together, they ensure that AI systems remain stable and responsible even in ambiguous or high‑risk situations.
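
To make the layering concrete, here is a minimal sketch of how such mechanisms can be composed in code. Everything in it - the blocklist terms, the confidence threshold, the function name - is an assumption made for the example, not the actual safety stack of any product:

  # Illustrative guardrail pipeline: content filter -> refusal -> low-confidence fallback.
  # BLOCKLIST and CONFIDENCE_THRESHOLD are made-up placeholders for real mechanisms.
  BLOCKLIST = {"credit card dump", "weapon blueprint"}
  CONFIDENCE_THRESHOLD = 0.6

  def guarded_answer(prompt: str, draft_answer: str, confidence: float) -> str:
      # Content filter: refuse requests that match known-harmful patterns.
      if any(term in prompt.lower() for term in BLOCKLIST):
          return "I can't help with that request."
      # Fallback: when the model is too uncertain, defer instead of guessing.
      if confidence < CONFIDENCE_THRESHOLD:
          return "I'm not confident enough to answer; please verify with a reliable source."
      return draft_answer

  print(guarded_answer("What is entropy?", "A measure of disorder in a system.", 0.92))

The point of the sketch is the ordering: filters and refusals run before the answer is released, and the fallback catches the ambiguous middle ground.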

Why Guardrails and Safety Matter More as AI Grows More Capable

As AI models become more advanced, they also become more sensitive to subtle cues in language and more capable of generating complex, high‑impact outputs. This increased capability amplifies both the potential benefits and the potential risks.

Guardrails and safety mechanisms help manage this complexity by:

  • Reducing the likelihood of harmful mistakes
  • Ensuring consistent behavior across diverse scenarios
  • Protecting users from unintended consequences
  • Maintaining trust in AI systems as they scale

In other words, the more powerful the AI, the more essential its safety infrastructure becomes.

The Balance Between Capability and Control

A common misconception is that guardrails limit creativity or reduce the usefulness of AI. In reality, they do the opposite. By providing structure and boundaries, guardrails allow AI systems to operate confidently and consistently. They reduce uncertainty, which in turn makes the AI more dependable and easier to integrate into real‑world workflows.

This balance - capability supported by control - is what enables AI to be both innovative and responsible.

Building Trust Through Safety

Trust is the currency of modern AI. Users need to know that the system will behave ethically, respect boundaries, and avoid causing harm. Guardrails and safety mechanisms are the foundation of that trust.

They help ensure that AI systems:

  • Communicate responsibly
  • Handle sensitive topics with care
  • Avoid generating harmful or misleading content
  • Stay aligned with human expectations

When users trust AI, they’re more willing to adopt it, rely on it, and explore its full potential.

The Bottom Line

Guardrails and safety mechanisms aren’t optional - they’re essential. They transform raw capability into responsible intelligence. They protect users, support ethical behavior, and ensure that AI systems remain aligned with human values even as they grow more powerful.

Disclaimer: The whole text was generated by Copilot (under Windows 11) at the first attempt. This is just an experiment to evaluate the feature's ability to answer standard general questions, independently of whether they are correctly or incorrectly posed. Moreover, the answers may reflect hallucinations and other types of inconsistent or incorrect reasoning.

Previous Post <<||>> Next Post

06 April 2026

🤖〽️Prompt Engineering: Copilot Unabridged (Part 160: How Structured Prompting and Clear User Intent Unlock the Full Power of AI)

Prompt Engineering Series



Prompt: "write a post of 600 words on the impact of consistent and high‑quality training data on AI"

Introduction

Artificial Intelligence (AI) has become remarkably capable, but even the most advanced models depend on one crucial ingredient to perform at their best: the clarity of the instructions they receive. Structured prompting and clear user intent aren’t just helpful—they fundamentally shape the quality, accuracy, and reliability of an AI system’s output. When users articulate what they want with precision and structure, AI becomes more predictable, more aligned, and far more effective.

Clear Intent: The Foundation of Effective AI Interaction

AI models don’t read minds. They infer meaning from the words, patterns, and context provided to them. When user intent is vague, the model must guess - and that guess can drift away from what the user actually needs.

Clear intent helps AI:

  • Interpret the goal behind the request
  • Prioritize the right information
  • Avoid unnecessary assumptions
  • Produce responses that match the user’s expectations

For example, asking an AI to 'summarize this document' is far less effective than saying, 'Summarize this document in three bullet points for a technical audience.' The second version gives the model direction, constraints, and purpose. It transforms a generic task into a targeted one.

In essence, clear intent reduces ambiguity, and ambiguity is the enemy of precision.

Structured Prompting: Giving AI the Blueprint It Needs

Structured prompting takes clarity a step further. It organizes instructions in a way that mirrors how AI models process information - logically, sequentially, and contextually. Instead of a single block of text, structured prompts break the task into components.

This might include:

  • Step‑by‑step instructions
  • Defined roles ('Act as a data analyst…')
  • Formatting requirements
  • Examples of desired output
  • Constraints or exclusions

These structures act like scaffolding. They guide the model’s reasoning, reduce misinterpretation, and help the AI stay aligned with the user’s expectations throughout the task.

A well‑structured prompt doesn’t just tell the AI what to do - it shows it how to think about the task.
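
To make this concrete, here is a minimal sketch in Python of assembling such a prompt from the components listed above. The function and field names are illustrative, not part of any specific prompting framework:

  # Assemble a structured prompt from role, task, constraints, and a sample output.
  def build_prompt(role, task, constraints, example):
      lines = [f"Act as {role}.", f"Task: {task}", "Constraints:"]
      lines += [f"- {c}" for c in constraints]
      lines += ["Example of desired output:", example]
      return "\n".join(lines)

  print(build_prompt(
      role="a data analyst",
      task="Summarize this document in three bullet points for a technical audience.",
      constraints=["At most 20 words per bullet", "No marketing language"],
      example="- Key finding 1\n- Key finding 2\n- Key finding 3",
  ))

Templating the prompt this way also makes the structure reusable: the role, constraints, and example change per task, while the scaffolding stays constant.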

The Synergy Between Intent and Structure

Clear intent and structured prompting are powerful on their own, but together they create a kind of conversational precision that dramatically improves AI performance.

When both are present, AI systems become:

  • More accurate, because they understand the target
  • More consistent, because the structure reduces randomness
  • More efficient, because they require fewer iterations
  • More aligned, because the user’s expectations are explicit

This synergy is especially important in high‑stakes environments like healthcare, finance, legal analysis, and enterprise automation, where misunderstandings can have real consequences.

Why This Matters as AI Becomes More Capable

As AI systems grow more advanced, they also become more sensitive to the nuances of human instruction. A small shift in phrasing can lead to a large shift in output. Clear intent and structured prompting act as stabilizers - they ensure that increased capability doesn’t come at the cost of unpredictability.

They also democratize AI. You don’t need to be a machine learning expert to get expert‑level results. You just need to communicate with purpose and structure.

The Bottom Line

Structured prompting and clear user intent aren’t just techniques - they’re the keys to unlocking AI’s full potential. They transform AI from a reactive tool into a collaborative partner. They reduce ambiguity, increase alignment, and create outputs that are more useful, more reliable, and more reflective of what humans actually want.

As AI continues to evolve, the ability to express intent clearly and structure prompts thoughtfully will become one of the most valuable skills in the digital world. It’s not about speaking the AI’s language - it’s about helping the AI understand yours.

Disclaimer: The whole text was generated by Copilot (under Windows 11) at the first attempt. This is just an experiment to evaluate the feature's ability to answer standard general questions, independently of whether they are correctly or incorrectly posed. Moreover, the answers may reflect hallucinations and other types of inconsistent or incorrect reasoning.

Previous Post <<||>> Next Post

01 September 2024

🗄️Data Management: Data Governance (Part I: No Guild of Heroes)

Data Management Series

Data governance appeared as a topic around the 1980s, though it gained popularity only in the early 2000s [1]. Twenty years later, organizations still miss the mark, failing to understand and implement it in a consistent manner. As usual, the reasons for failure are multiple, ranging from misunderstanding what governance is all about to poor implementation of methodologies and inadequate management or leadership.

Moreover, methodologies tend to idealize the various aspects, whereas what organizations need is pragmatism. For example, data governance is not about heroes and heroism [2], even if such metaphors can give the impression that heroic actions are involved, which is not the case. Actions for the sake of action don’t necessarily lead to change by themselves. Organizations, especially big ones, are in general good at creating meaningless action without results, particularly when people busy themselves with, miss, or ignore the mark.

People do talk to each other, though they try to solve their own problems and optimize their own areas without necessarily thinking about the bigger picture. The problem is not necessarily communication or a lack of insight into business issues; people do communicate and know the issues even without a business impact assessment. The challenge usually lies in convincing upper management that the effort needs to be consolidated and supported, and that the needed resources must be made available.

Probably one of the issues with data governance is the attempt to create another structure in the organization focused on quality, a structure that has good chances to fail, and unfortunately often does. Many issues appear when the structure gains weight and becomes a separate entity instead of being the backbone of the organization.

As soon as organizations separate data governance from the key users, management, and the other important decision-makers, it takes on a life of its own and is likely to diverge from the initial construct. Then organizations need "alignment" and probably other big words to coordinate the effort. Such constructs can work, but they are suboptimal because the forces will always pull in different directions.

Making each manager and the upper management responsible for governance is probably the way to go, though they’ll need the time for it. In theory, this can be achieved when many of the issues are solved at the lower levels, when automation and similar measures allow managers to supervise things rather than hide behind every issue.

When too much micromanagement is involved, people tend to busy themselves with topics rather than solve the issues they are confronted with. The actual actors need to be empowered to take decisions and optimize their work when needed. Kaizen, the philosophy of continuous improvement, has proven that it works when applied correctly. People will need the knowledge, skills, time, and support to do it, though. One of the dangers, however, is that this becomes a full-time responsibility, which tends to create a separate entity again.

The challenge for organizations probably lies in the friction between where they are and what they must do to move toward the various objectives. Moving in small, rapid steps is likely the way to go, though each person must notice when something doesn’t work as expected and react. That’s probably the most important aspect.

So, the more functions are created that diverge from the actual organization, the higher the chances of failure. Unfortunately, failure becomes visible only in the later phases, and thus self-awareness, self-control, and other similar "qualities" are needed - small actors that keep the system in check and react whenever needed. Ideally, the employees themselves are the best resources to react whenever something doesn’t work as designed.

Previous Post <<||>> Next Post 

Resources:
[1] Wikipedia (2023) Data Management [link]
[2] Tiankai Feng (2023) How to Turn Your Data Team Into Governance Heroes [link]


19 March 2024

𖣯Strategic Management: Inflection Points and the Data Mesh (Quote of the Day)

Strategic Management Series

"Data mesh is what comes after an inflection point, shifting our approach, attitude, and technology toward data. Mathematically, an inflection point is a magic moment at which a curve stops bending one way and starts curving in the other direction. It’s a point that the old picture dissolves, giving way to a new one. [...] The impacts affect business agility, the ability to get value from data, and resilience to change. In the center is the inflection point, where we have a choice to make: to continue with our existing approach and, at best, reach a plateau of impact or take the data mesh approach with the promise of reaching new heights." [1]

I tried to understand the metaphor behind the quote. As the author pinpoints through another quote, the metaphor is borrowed from Andrew Grove:

"An inflection point occurs where the old strategic picture dissolves and gives way to the new, allowing the business to ascend to new heights. However, if you don’t navigate your way through an inflection point, you go through a peak and after the peak the business declines. [...] Put another way, a strategic inflection point is when the balance of forces shifts from the old structure, from the old ways of doing business and the old ways of competing, to the new." [2]

The second part of the quote clarifies the role of the inflection point - the shift from one structure (organization or system) to a new one. The inflection point is not when we take a decision, but when the decision we took, and its impact, shifts the balance. If the data mesh comes after the inflection point (see A), then there must be some kind of causality that converges uniquely toward the data mesh, which is questionable, if not illogical. A data mesh makes sense only after organizations have reached a certain scale and is thus unlikely to be adopted by small to medium businesses. Even for large organizations the data mesh may not be a viable solution if it doesn't have a proven record of success.

I could understand if the author had said that the data mesh will lead to an inflection point after its adoption, as is the case with transformative or disruptive technologies. Unfortunately, the track record of BI and Data Analytics projects doesn't give much hope for such a magical moment. Becoming a data-driven organization could perhaps have such an effect, though for many organizations the results are still far from expectations.

There's another point to consider. A curve with inflection points can contain up and down concavities (see B) or there can be multiple curves passing through an inflection point (see C) and the continuation can be on any of the curves.

Examples of Inflection Points [3]

The change can be fast or slow (see D), and in the latter case it may take a long time for the change to be perceived. [2] also notes that the perception that something changed can happen in stages. Moreover, an inflection point can be only local and doesn't describe the future evolution of the curve, which is to say that the curve can change trajectory shortly afterward. It happens in business processes and policy implementations that, after a change was made in extremis to alleviate an issue, a slight improvement is recognized, after which performance decays sharply. That's the case when the symptoms rather than the root causes were addressed.
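
For reference, the calculus behind the metaphor (a standard textbook definition): a point x_0 is an inflection point of a twice-differentiable function f when

  f''(x_0) = 0 \quad\text{and}\quad f'' \text{ changes sign at } x_0

For example, f(x) = x^3 has f''(x) = 6x, concave for x < 0 and convex for x > 0, so x = 0 is an inflection point even though f keeps increasing throughout. What changes at an inflection point is the bending, not necessarily the direction of the trend - which supports the observation above that the point is local and says little about the curve's future evolution.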

More appropriate to describe the change would be a tipping point, which can be defined as a critical threshold beyond which a system (the organization) reorganizes or changes, often abruptly and/or irreversibly.

Previous Post <<||>> Next Post

References:
[1] Zhamak Dehghani (2021) Data Mesh: Delivering Data-Driven Value at Scale (book review)
[2] Andrew S Grove (1988) "Only the Paranoid Survive: How to Exploit the Crisis Points that Challenge Every Company and Career"
[3] SQL Troubles (2024) R Language: Drawing Function Plots (Part II - Basic Curves & Inflection Points) (link)

04 March 2024

🧭🏭Business Intelligence: Microsoft Fabric (Part II: Domains and the Data Mesh I -The Challenge of Structure Matching)

Business Intelligence Series

The holy grail of building a Data Analytics infrastructure seems nowadays to be the creation of a data mesh, a decentralized data architecture that organizes data by specific business domains. This endeavor proves difficult to achieve given the various challenges faced: data integration, data ownership, data product creation and ownership, enablement of data citizens, and enforcing security and governance in a federated manner.

Microsoft Fabric promises to facilitate the creation of data meshes with the help of domains and subdomains by providing built-in security, administration, and governance features associated with them. A domain is a way of logically grouping together all the data in an organization that is relevant to a particular area or field. A subdomain is a way of fine-tuning the logical grouping of the data.

Business domains & their entities

At a high level, the challenge of building a data mesh is how to match or aggregate structures. On one side is the high-level structure of the data mesh, while on the other is the structure of the business data entities. The data entities can be grouped within a taxonomy with multiple levels that extends down to the departments. That’s why it seems somehow natural to consider the departments as the top-most domains of the data mesh. The issue is that if the segmentation starts from a high level, it becomes inflexible in modeling. Moreover, one has only domains and subdomains, and thus a two-level structure, to model the main aspects of the data mesh.

Some organizations allow unrestricted access to the data belonging to a given department, while others break down the access to a more granular level. There are also organizations that don’t restrict access at all, though this may change later. Besides permissions and a way of grouping the entities together, what value does setting the domains as departments bring?

Therefore, I’m not convinced about using an organization’s departmental structure as domains, especially since such a structure may change, and this would imply a full range of further changes. Moreover, such a structure doesn’t reflect the span of processes or how permissions are assigned for the various roles, which are better reflected in how information systems are structured. Most probably the solution needs to accommodate both perspectives and lie somewhere in the middle.

Take for example the internal structure of the modules from Dynamics 365 (D365). The Finance area is broken down into Accounts Payable, Accounts Receivable, Fixed Assets, General Ledger, etc. In some organizations the departments reflect this delimitation to some degree, while in others they are just associated with finance-related roles. Moreover, the permissions are more granular, reflecting the data entities the users work with.

Conversely, SCM extends into Finance, as Purchase Orders, Sales Orders, and other business documents are the starting or intermediary points of processes that span modules. Similarly, there are processes that start in CRM or other systems. The span of processes seems more appropriate for structuring the data mesh, though the overlap of systems with the roles involved in the processes and the free definition of process boundaries can overcomplicate the whole design.

It makes sense to define the domains at a level that resembles the structure of the modules available in D365, with the macro data entities representing the subdomains. The subdomains would then represent master as well as transactional data entities from the perspective of the domains, though there will be entities that need to be shared between multiple domains. Such a structure has fewer chances to change over time, allowing more flexibility and smaller areas of focus, and is thus easier to design, develop, test, deploy, and maintain.
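
As a rough illustration of such a taxonomy (in Python, for brevity), with module and entity names borrowed from the D365 examples above - this is a sketch of the idea, not an actual Fabric configuration or API:

  # Domains modeled on D365 modules; subdomains group macro data entities
  # into master vs. transactional data. Entity names are illustrative.
  data_mesh = {
      "Finance": {
          "master": ["Vendors", "Customers", "Fixed Assets"],
          "transactional": ["Vendor Invoices", "GL Journals"],
      },
      "SCM": {
          "master": ["Products", "Vendors"],  # 'Vendors' is shared with Finance
          "transactional": ["Purchase Orders", "Sales Orders"],
      },
  }

  # Entities appearing under more than one domain must be shared across domains.
  from collections import Counter
  counts = Counter(e for domain in data_mesh.values()
                     for group in domain.values() for e in group)
  print([entity for entity, n in counts.items() if n > 1])  # ['Vendors']

Even this toy structure makes the sharing problem visible: master data entities such as vendors naturally belong to more than one domain, so the design must decide where they are owned and how they are exposed.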

Previous Post <<||>> Next Post

27 February 2024

🔖Book Review: Rolf Hichert & Jürgen Faisst's International Business Communication Standards (IBCS Version 1.2)

Over the last months I found several references to Rolf Hichert & Jürgen Faisst's booklet on business communication standards [1]. It drew my attention especially because it attempts to provide a standard for reports and data visualizations, which frankly seems like a tremendous endeavor if done right. The two authors founded the IBCS Institute 20 years ago, which is the host, training institute, and certification body of the Creative Commons project called IBCS.

The 150-page booklet considers various standardization techniques with the help of more than 180 instructive figures, the overall structure being based on a set of principles and rules rooted in an acronym that spells "SUCCESS": Say, Unify, Condense, Check, Express, Simplify, Structure. On one side the principles seem to form a solid foundation; however, that foundation seems to suffer from the rigidity that results from fitting something into a nicely spelled acronym.

Say, or conveying a message, reflects the principle that each report should convey a message, otherwise the report is just a data collection. According to this "definition", most operational reports are just collections of data. Conversely, a lot of communication in organizations revolves around issues, metrics, and decision making - scenarios in which the messages conveyed can be powerful, though dependent on the business context. Settling on only one message can make the message fall short.

Unify, or applying semantic notation, reflects the principle that things that have the same meaning should look the same. There are many patterns out there that can be standardized; however, it's questionable how far complex visualizations can be standardized, and how much liberty of expression the standardization allows for certain aspects.

Condense, or increasing the information density, reflects the requirement that all information necessary to understand the content should, if possible, be included on one page. This makes it easier to navigate the content and to prioritize what the audience is able to see. The principle, however, seems to have more to do with Tufte's data-ink ratio (see [2]).

Check or ensuring the visual integrity reflects the principle that the information should be presented in the most truthful and the most easily understood way. This is something that many data visualizations out there lack.

Express, or choosing the proper visualizations, is based on the principle that the visuals considered should be as intuitive as possible. In theory, the more intuitive a visual, the easier it is to understand and reuse; however, this depends on the "visual vocabulary" and "visual grammar" of each individual. Intuition is something that needs to grow through the interplay of these two areas. Expecting everything to be displayed in terms of basic elements is unrealistic and suboptimal.

Simplify, or avoiding clutter, refers to eliminating the unnecessary from a visualization, until there's nothing left to take out without changing its meaning. At least the principle is correctly stated, even if it is in general difficult to apply, because quite often one needs to build something more complex and then reduce the complexity through iterative steps until the simple is obtained.

Structure, or organizing the content, is based on the principle that content should follow a logically consistent structure. The interplay between function and structure is an important topic in itself.

Browsing through the many data visualizations given as examples, I'd say that many of the recommendations make sense, though from there to standardization there is still a long way. Readers should evaluate the described practices against their own judgment and consider what seems to work.

The book is available on the IBCS website as a PDF, though the Kindle version is 40% cheaper. Overall, it is worth a read.

Previous Post <<||>> Next Post

Resources:
[1] Rolf Hichert & Jürgen Faisst (2022) "International Business Communication Standards (IBCS Version 1.2): Conceptual, perceptual, and semantic design of comprehensible business reports, presentations, and dashboards" (link)
[2] Edward R Tufte (1983) "The Visual Display of Quantitative Information"
[3] IBCS Institute (2024) About (link)

21 May 2020

💼Project Management: Project Planning (Part III: Planning Correctly Misunderstood III)

Mismanagement

One of the most misunderstood topics in Project Management seems to be planning, probably because everyone has a good idea of what it means to plan an activity - we do it daily, and most of the time we (hopefully) hit the bull’s-eye, or at least have the impression we did. You must do this and that, you have that dependency, you must coordinate with a few people, you must first reach that milestone before going further, you do one step at a time, and so on. It’s pretty easy, isn’t it?

From a bird’s-eye view, project planning is like planning any other activity, though there are several important differences. The most important one is scale - the number of activities and resources involved, the level of coordination and communication as well as the quality with which they occur, and the level of uncertainty, control, and manageability. All these create a complexity that is hardly manageable by just one person.

Another difference lies in the detail needed for the planning and in the targets’ reachability. Some believe that the plan needs to be done down to the lowest level of detail, which even if possible can prove to be an impediment to planning. Projects’ environments share some important characteristics with a battlefield in terms of the complexity of interactions, their dynamics, and logistical requirements. Within an army’s structure there are levels of organization that require different mindsets and levels of planning. A general thinks primarily at the strategic level, at which troops and actions are seen as aggregations at the level of abstraction that makes their organization and planning manageable. The strategy is devised, however, in collaboration with other generals and upper structures, and once the strategic goals are defined, the general must devise the tactics together with the immediate subalterns. In theory, the project manager must regard the project from the same perspective. This results in three levels of planning: strategic, done with upper management; tactical, done with the team members; and logistical, done within the team. That’s a way of breaking down the complexity and dividing the responsibilities within the project.

Projects’ final destination seems to have the character of a wish list more or less anchored in reality. From a technical point of view the target can be achievable, though in big projects the most important challenges are of an organizational nature - being able to allocate and coordinate effectively the resources the project needs. The wish-like character is also reflected in the cost-scope-time triangle with respect to the expected quality: at some point one is forced to choose between two of them. On the other side, there’s the tendency to see the targets and milestones as fixed, with little room for deviation. One can easily forget that a strategic plan’s purpose is to set the objectives and identify the challenges and the possible lines of action, while a tactical plan’s objective is to devise the means to reach the objectives. Bringing everything together can easily obscure the view and, in extremis, the plan loses its actuality as soon as it was created (and approved).

The most confusing aspect is probably the adherence of a plan to a given methodology - dicing a project, and thus a plan, to fit a methodology by blindly following the rules and principles it imposes, instead of fitting the methodology to the project. Besides the fact that methodologies are best practices but not necessarily good practices for what fits a given organization, they tend to be either too general, specifying the what and not the how, or too restrictive in how they are interpreted.

30 July 2019

🧱IT: Network (Definitions)

"Mathematically defined structure of a computing system where the operations are performed at specific locations (nodes) and the flow of information is represented by directed arcs." (Guido Deboeck & Teuvo Kohonen (Eds), "Visual Explorations in Finance with Self-Organizing Maps 2nd Ed.", 2000)

"A system of interconnected computing resources (computers, servers, printers, and so on)." (Sharon Allen & Evan Terry, "Beginning Relational Data Modeling 2nd Ed.", 2005)

"A system of connected computers. A local area network (LAN) is contained within a single company, in a single office. A wide area network (WAN) is generally distributed across a geographical area — even globally. The Internet is a very loosely connected network, meaning that it is usable by anyone and everyone." (Gavin Powell, "Beginning Database Design", 2006)

"A system of interconnected devices that provides a means for data to be transmitted from point to point." (Janice M Roehl-Anderson, "IT Best Practices for Financial Managers", 2010)

"1.Visually, a graph of nodes and connections where more than one entry point for each node is allowed. 2.In architecture, a topological arrangement of hardware and connections to allow communication between nodes and access to shared data and software." (DAMA International, "The DAMA Dictionary of Data Management", 2011)

"The connection of computer systems (nodes) by communications channels and appropriate software. |" (Marcia Kaufman et al, "Big Data For Dummies", 2013)

"The means by which electronic communications occurs between two or more nodes" (Daniel Linstedt & W H Inmon, "Data Architecture: A Primer for the Data Scientist", 2014)

"Two or more computers connected to share data and resources." (Faithe Wempen, "Computing Fundamentals: Introduction to Computers", 2015)

"People working towards a common purpose or with common interests where there is no requirement for members of the network to have a work relationship with others, and there is no requirement for mutuality as there is with a team." (Catherine Burke et al, "Systems Leadership, 2nd Ed,", 2018)

24 April 2019

💼Project Management: Project Execution (Part V: The Butterflies of Project Management)

Mismanagement

Expressed metaphorically as "the flap of a butterfly’s wings in Brazil setting off a tornado in Texas", the "butterfly effect" in Chaos Theory is a hypothesis rooted in Edward N Lorenz’s work on weather forecasting, used to depict the sensitive dependence on initial conditions in nonlinear processes - systems in which the change in output is not proportional to the change in input.

Even if overstated, the flapping of wings advances the idea that a small change (the flap of wings) in the initial conditions of a system cascades into a large-scale chain of events leading to large-scale phenomena (the tornado). The chain of events is known as the domino effect and represents the cumulative effect produced when one event sets off a chain of similar events. If the butterfly metaphor doesn’t stick, maybe it’s easier to visualize the impact as a big surfing wave - it starts small and increases in size to the degree that it can bring a boat to the shore or make an armada drown under its force.
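
The sensitivity to initial conditions can be illustrated numerically with the logistic map, a standard toy model from chaos theory (chosen here for brevity; it is not Lorenz's weather model):

  # Logistic map x_{n+1} = r * x_n * (1 - x_n) in the chaotic regime (r = 4).
  # Two starting points differing by one part in a billion diverge completely.
  r = 4.0
  x, y = 0.2, 0.2 + 1e-9
  for n in range(1, 51):
      x = r * x * (1 - x)
      y = r * y * (1 - y)
      if n % 10 == 0:
          print(f"step {n:2d}: x={x:.6f}  y={y:.6f}  |x-y|={abs(x - y):.1e}")

After a few dozen iterations the two trajectories bear no resemblance to each other, even though they started nearly identical - the numerical analogue of the flap of wings.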

Projects start as narrow activities; however, the longer they take and the broader they become, the more they tend to accumulate force and behave like a wave, having the force to push or drown an organization in the flood that comes with it. A project is not only a system but a complex ecosystem - an aggregation of living organisms and nonliving components with complex interactions forming a unified whole, with emergent behavior deriving from the structure rather than its components. Groups of people tend to self-organize, to swarm in one direction or another, much like birds do, while knowledge seems to converge from unrelated sources (aka consilience).

Quite often ignored, the context in which a project starts is very important, especially because these initial factors or conditions can have a considerable impact on people’s perception of the state or outcomes of the project, a perception eventually reflected also in the decisions made during the later phases of the project. The positive or negative auspices can easily be reinforced by similar events. Given the complex correlations and implications, aspects not always correctly perceived and understood can have a domino effect.

The preparations for the project start - the business case, setting up the project structure, communicating the project’s expectations and addressing stakeholders’ expectations, the kick-off meeting, the approval of the needed resources, the knowledge available in the team - all have a certain influence on the project. A bad start can haunt a project long after, even if the project is on the right track and makes a positive impact. Conversely, a good start can shade away some mishaps along the way; however, there’s also the danger that the mishaps are ignored and have a greater negative impact on the project. It may look like common sense, but the first impression often counts and is kept in people’s memory for a long time.

As people are more perceptive to negative than to positive events, the chances are higher that a multitude of negative aspects will have a bigger impact on the project. It’s again something that one can address as the project progresses. It’s not necessarily about control, but about being receptive to the messages around and allowing people to give (constructive) feedback early in the project. It’s about using the positive force of a wave and turning a negative flow into a positive one.

Being aware of the importance of the initial context is just a first step toward harnessing the power of waves or winds; it takes action and leadership to pull the project in the right direction.

24 December 2018

🔭Data Science: Randomness (Just the Quotes)

"If the number of experiments be very large, we may have precise information as to the value of the mean, but if our sample be small, we have two sources of uncertainty: (I) owing to the 'error of random sampling' the mean of our series of experiments deviates more or less widely from the mean of the population, and (2) the sample is not sufficiently large to determine what is the law of distribution of individuals." William S Gosset, "The Probable Error of a Mean", Biometrika, 1908)

"The postulate of randomness thus resolves itself into the question, ‘of what population is this a random sample?’ which must frequently be asked by every practical statistician." (Ronald  A Fisher, "On the Mathematical Foundation of Theoretical Statistics", Philosophical Transactions of the Royal Society of London Vol. A222, 1922)

"The most important application of the theory of probability is to what we may call 'chance-like' or 'random' events, or occurrences. These seem to be characterized by a peculiar kind of incalculability which makes one disposed to believe - after many unsuccessful attempts - that all known rational methods of prediction must fail in their case. We have, as it were, the feeling that not a scientist but only a prophet could predict them. And yet, it is just this incalculability that makes us conclude that the calculus of probability can be applied to these events." (Karl R Popper, "The Logic of Scientific Discovery", 1934)

"The definition of random in terms of a physical operation is notoriously without effect on the mathematical operations of statistical theory because so far as these mathematical operations are concerned random is purely and simply an undefined term." (Walter A Shewhart & William E Deming, "Statistical Method from the Viewpoint of Quality Control", 1939)

"The first attempts to consider the behavior of so-called 'random neural nets' in a systematic way have led to a series of problems concerned with relations between the 'structure' and the 'function' of such nets. The 'structure' of a random net is not a clearly defined topological manifold such as could be used to describe a circuit with explicitly given connections. In a random neural net, one does not speak of 'this' neuron synapsing on 'that' one, but rather in terms of tendencies and probabilities associated with points or regions in the net." (Anatol Rapoport, "Cycle distributions in random nets", The Bulletin of Mathematical Biophysics 10(3), 1948)

"Time itself will come to an end. For entropy points the direction of time. Entropy is the measure of randomness. When all system and order in the universe have vanished, when randomness is at its maximum, and entropy cannot be increased, when there is no longer any sequence of cause and effect, in short when the universe has run down, there will be no direction to time - there will be no time." (Lincoln Barnett, "The Universe and Dr. Einstein", 1948)

"A random sequence is a vague notion embodying the idea of a sequence in which each term is unpredictable to the uninitiated and whose digits pass a certain number of tests traditional with statisticians and depending somewhat on the uses to which the sequence is to be put." (Derrick H Lehmer, 1951)

"We must emphasize that such terms as 'select at random', 'choose at random', and the like, always mean that some mechanical device, such as coins, cards, dice, or tables of random numbers, is used." (Frederick Mosteller et al, "Principles of Sampling", Journal of the American Statistical Association Vol. 49 (265), 1954)

"The concept of randomness arises partly from games of chance. The word ‘chance’ derives from the Latin cadentia signifying the fall of a die. The word ‘random’ itself comes from the French randir meaning to run fast or gallop." (G Spencer Brown, "Probability and Scientific Inference", 1957)

"[…] random numbers should not be generated with a method chosen at random. Some theory should be used." (Donald E Knuth, "The Art of Computer Programming" Vol. II, 1968)

"The generation of random numbers is too important to be left to chance." (Robert R Coveyou, [Oak Ridge National Laboratory] 1969)

"[...] too many users of the analysis of variance seem to regard the reaching of a mediocre level of significance as more important than any descriptive specification of the underlying averages Our thesis is that people have strong intuitions about random sampling; that these intuitions are wrong in fundamental respects; that these intuitions are shared by naive subjects and by trained scientists; and that they are applied with unfortunate consequences in the course of scientific inquiry. We submit that people view a sample randomly drawn from a population as highly representative, that is, similar to the population in all essential characteristics. Consequently, they expect any two samples drawn from a particular population to be more similar to one another and to the population than sampling theory predicts, at least for small samples." (Amos Tversky & Daniel Kahneman, "Belief in the law of small numbers", Psychological Bulletin 76(2), 1971)

"It appears to be a quite general principle that, whenever there is a randomized way of doing something, then there is a nonrandomized way that delivers better performance but requires more thought." (Edwin T Jaynes, "Probability Theory: The Logic of Science", 1979)

"From a purely operational point of viewpoint […] the concept of randomness is so elusive as to cease to be viable." (Mark Kac, 1983)

"Randomness is a difficult notion for people to accept. When events come in clusters and streaks, people look for explanations and patterns. They refuse to believe that such patterns - which frequently occur in random data - could equally well be derived from tossing a coin. So it is in the stock market as well." (Burton G Malkiel, "A Random Walk Down Wall Street", 1989)

"The term chaos is used in a specific sense where it is an inherently random pattern of behaviour generated by fixed inputs into deterministic (that is fixed) rules (relationships). The rules take the form of non-linear feedback loops. Although the specific path followed by the behaviour so generated is random and hence unpredictable in the long-term, it always has an underlying pattern to it, a 'hidden' pattern, a global pattern or rhythm. That pattern is self-similarity, that is a constant degree of variation, consistent variability, regular irregularity, or more precisely, a constant fractal dimension. Chaos is therefore order (a pattern) within disorder (random behaviour)." (Ralph D Stacey, "The Chaos Frontier: Creative Strategic Control for Business", 1991)

"When nearest neighbor effects exist, the randomized complete block analysis [can be] so poor as to deserver to be called catastrophic. It [can not] even be considered a serious form of analysis. It is extremely important to make this clear to the vast number of researchers who have near religious faith in the randomized complete block design." (Walt Stroup & D Mulitze, "Nearest Neighbor Adjusted Best Linear Unbiased Prediction", The American Statistician 45, 1991) 

"Chaos demonstrates that deterministic causes can have random effects […] There's a similar surprise regarding symmetry: symmetric causes can have asymmetric effects. […] This paradox, that symmetry can get lost between cause and effect, is called symmetry-breaking. […] From the smallest scales to the largest, many of nature's patterns are a result of broken symmetry; […]" (Ian Stewart & Martin Golubitsky, "Fearful Symmetry: Is God a Geometer?", 1992)

"Probability theory is an ideal tool for formalizing uncertainty in situations where class frequencies are known or where evidence is based on outcomes of a sufficiently long series of independent random experiments. Possibility theory, on the other hand, is ideal for formalizing incomplete information expressed in terms of fuzzy propositions." (George Klir, "Fuzzy sets and fuzzy logic", 1995)

"We use mathematics and statistics to describe the diverse realms of randomness. From these descriptions, we attempt to glean insights into the workings of chance and to search for hidden causes. With such tools in hand, we seek patterns and relationships and propose predictions that help us make sense of the world."  (Ivars Peterson, "The Jungles of Randomness: A Mathematical Safari", 1998)

"Events may appear to us to be random, but this could be attributed to human ignorance about the details of the processes involved." (Brain S Everitt, "Chance Rules", 1999)

"I sometimes think that the only real difference between Bayesian and non-Bayesian hierarchical modelling is whether random effects are labeled with Greek or Roman letters." (Peter Diggle, "Comment on Bayesian analysis of agricultural field experiments", Journal of Royal Statistical Society B vol. 61, 1999)

"The self-similarity of fractal structures implies that there is some redundancy because of the repetition of details at all scales. Even though some of these structures may appear to teeter on the edge of randomness, they actually represent complex systems at the interface of order and disorder."  (Edward Beltrami, "What is Random?: Chaos and Order in Mathematics and Life", 1999)

"Randomness is NOT the absence of a pattern." (Bill Venables," S-Plus User’s Conference", 1999)

"Most physical systems, particularly those complex ones, are extremely difficult to model by an accurate and precise mathematical formula or equation due to the complexity of the system structure, nonlinearity, uncertainty, randomness, etc. Therefore, approximate modeling is often necessary and practical in real-world applications. Intuitively, approximate modeling is always possible. However, the key questions are what kind of approximation is good, where the sense of 'goodness' has to be first defined, of course, and how to formulate such a good approximation in modeling a system such that it is mathematically rigorous and can produce satisfactory results in both theory and applications." (Guanrong Chen & Trung Tat Pham, "Introduction to Fuzzy Sets, Fuzzy Logic, and Fuzzy Control Systems", 2001)

"[…] we would like to observe that the butterfly effect lies at the root of many events which we call random. The final result of throwing a dice depends on the position of the hand throwing it, on the air resistance, on the base that the die falls on, and on many other factors. The result appears random because we are not able to take into account all of these factors with sufficient accuracy. Even the tiniest bump on the table and the most imperceptible move of the wrist affect the position in which the die finally lands. It would be reasonable to assume that chaos lies at the root of all random phenomena." (Iwo Białynicki-Birula & Iwona Białynicka-Birula, "Modeling Reality: How Computers Mirror Life", 2004)

"Chance is just as real as causation; both are modes of becoming. The way to model a random process is to enrich the mathematical theory of probability with a model of a random mechanism. In the sciences, probabilities are never made up or 'elicited' by observing the choices people make, or the bets they are willing to place. The reason is that, in science and technology, interpreted probability exactifies objective chance, not gut feeling or intuition. No randomness, no probability." (Mario Bunge, "Chasing Reality: Strife over Realism", 2006)

"Complexity arises when emergent system-level phenomena are characterized by patterns in time or a given state space that have neither too much nor too little form. Neither in stasis nor changing randomly, these emergent phenomena are interesting, due to the coupling of individual and global behaviours as well as the difficulties they pose for prediction. Broad patterns of system behaviour may be predictable, but the system's specific path through a space of possible states is not." (Steve Maguire et al, "Complexity Science and Organization Studies", 2006)

"A Black Swan is a highly improbable event with three principal characteristics: It is unpredictable; it carries a massive impact; and, after the fact, we concoct an explanation that makes it appear less random, and more predictable, than it was. […] The Black Swan idea is based on the structure of randomness in empirical reality. [...] the Black Swan is what we leave out of simplification." (Nassim N Taleb, "The Black Swan", 2007)

"[myth:] Random errors can always be determined by repeating measurements under identical conditions. […] this statement is true only for time-related random errors ." (Manfred Drosg, "Dealing with Uncertainties: A Guide to Error Analysis", 2007)

"To fulfill the requirements of the theory underlying uncertainties, variables with random uncertainties must be independent of each other and identically distributed. In the limiting case of an infinite number of such variables, these are called normally distributed. However, one usually speaks of normally distributed variables even if their number is finite." (Manfred Drosg, "Dealing with Uncertainties: A Guide to Error Analysis", 2007)

"While in theory randomness is an intrinsic property, in practice, randomness is incomplete information." (Nassim N Taleb, "The Black Swan", 2007)

"Regression toward the mean. That is, in any series of random events an extraordinary event is most likely to be followed, due purely to chance, by a more ordinary one." (Leonard Mlodinow, "The Drunkard’s Walk: How Randomness Rules Our Lives", 2008)

"The key to understanding randomness and all of mathematics is not being able to intuit the answer to every problem immediately but merely having the tools to figure out the answer." (Leonard Mlodinow,"The Drunkard’s Walk: How Randomness Rules Our Lives", 2008)

"Data always vary randomly because the object of our inquiries, nature itself, is also random. We can analyze and predict events in nature with an increasing amount of precision and accuracy, thanks to improvements in our techniques and instruments, but a certain amount of random variation, which gives rise to uncertainty, is inevitable." (Alberto Cairo, "The Functional Art", 2011)

"No matter what the laws of chance might tell us, we search for patterns among random events wherever they might occur–not only in the stock market but even in interpreting sporting phenomena." (Burton G Malkiel, "A Random Walk Down Wall Street: The Time-Tested Strategy For Successful Investing", 2011)

"Randomness might be defined in terms of order - its absence, that is. […] Everything we care about lies somewhere in the middle, where pattern and randomness interlace." (James Gleick, "The Information: A History, a Theory, a Flood", 2011)

"The storytelling mind is allergic to uncertainty, randomness, and coincidence. It is addicted to meaning. If the storytelling mind cannot find meaningful patterns in the world, it will try to impose them. In short, the storytelling mind is a factory that churns out true stories when it can, but will manufacture lies when it can't." (Jonathan Gottschall, "The Storytelling Animal: How Stories Make Us Human", 2012)

"When some systems are stuck in a dangerous impasse, randomness and only randomness can unlock them and set them free." (Nassim N Taleb, "Antifragile: Things That Gain from Disorder", 2012)

"Although cascading failures may appear random and unpredictable, they follow reproducible laws that can be quantified and even predicted using the tools of network science. First, to avoid damaging cascades, we must understand the structure of the network on which the cascade propagates. Second, we must be able to model the dynamical processes taking place on these networks, like the flow of electricity. Finally, we need to uncover how the interplay between the network structure and dynamics affects the robustness of the whole system." (Albert-László Barabási, "Network Science", 2016)

"Too little attention is given to the need for statistical control, or to put it more pertinently, since statistical control (randomness) is so rarely found, too little attention is given to the interpretation of data that arise from conditions not in statistical control." (William E Deming)

More quotes on "Randomness" at the-web-of-knowledge.blogspot.com

17 December 2018

🔭Data Science: Mathematical Models (Just the Quotes)

"Experience teaches that one will be led to new discoveries almost exclusively by means of special mechanical models." (Ludwig Boltzmann, "Lectures on Gas Theory", 1896)

"If the system exhibits a structure which can be represented by a mathematical equivalent, called a mathematical model, and if the objective can be also so quantified, then some computational method may be evolved for choosing the best schedule of actions among alternatives. Such use of mathematical models is termed mathematical programming."  (George Dantzig, "Linear Programming and Extensions", 1959)

“In fact, the construction of mathematical models for various fragments of the real world, which is the most essential business of the applied mathematician, is nothing but an exercise in axiomatics.” (Marshall Stone, cca 1960)

"[...] sciences do not try to explain, they hardly even try to interpret, they mainly make models. By a model is meant a mathematical construct which, with the addition of certain verbal interpretations, describes observed phenomena. The justification of such a mathematical construct is solely and precisely that it is expected to work - that is, correctly to describe phenomena from a reasonably wide area. Furthermore, it must satisfy certain aesthetic criteria - that is, in relation to how much it describes, it must be rather simple.” (John von Neumann, “Method in the physical sciences”, 1961)

“Mathematical statistics provides an exceptionally clear example of the relationship between mathematics and the external world. The external world provides the experimentally measured distribution curve; mathematics provides the equation (the mathematical model) that corresponds to the empirical curve. The statistician may be guided by a thought experiment in finding the corresponding equation.” (Marshall J Walker, “The Nature of Scientific Thought”, 1963)

"Thus, the construction of a mathematical model consisting of certain basic equations of a process is not yet sufficient for effecting optimal control. The mathematical model must also provide for the effects of random factors, the ability to react to unforeseen variations and ensure good control despite errors and inaccuracies." (Yakov Khurgin, "Did You Say Mathematics?", 1974)

"A mathematical model is any complete and consistent set of mathematical equations which are designed to correspond to some other entity, its prototype. The prototype may be a physical, biological, social, psychological or conceptual entity, perhaps even another mathematical model." (Rutherford Aris, "Mathematical Modelling", 1978)

"Mathematical model making is an art. If the model is too small, a great deal of analysis and numerical solution can be done, but the results, in general, can be meaningless. If the model is too large, neither analysis nor numerical solution can be carried out, the interpretation of the results is in any case very difficult, and there is great difficulty in obtaining the numerical values of the parameters needed for numerical results." (Richard E Bellman, "Eye of the Hurricane: An Autobiography", 1984)

“Theoretical scientists, inching away from the safe and known, skirting the point of no return, confront nature with a free invention of the intellect. They strip the discovery down and wire it into place in the form of mathematical models or other abstractions that define the perceived relation exactly. The now-naked idea is scrutinized with as much coldness and outward lack of pity as the naturally warm human heart can muster. They try to put it to use, devising experiments or field observations to test its claims. By the rules of scientific procedure it is then either discarded or temporarily sustained. Either way, the central theory encompassing it grows. If the abstractions survive they generate new knowledge from which further exploratory trips of the mind can be planned. Through the repeated alternation between flights of the imagination and the accretion of hard data, a mutual agreement on the workings of the world is written, in the form of natural law.” (Edward O Wilson, “Biophilia”, 1984)

“The usual approach of science of constructing a mathematical model cannot answer the questions of why there should be a universe for the model to describe. Why does the universe go to all the bother of existing?” (Stephen Hawking, "A Brief History of Time", 1988)

“Mathematical modeling is about rules - the rules of reality. What distinguishes a mathematical model from, say, a poem, a song, a portrait or any other kind of ‘model’, is that the mathematical model is an image or picture of reality painted with logical symbols instead of with words, sounds or watercolors.” (John L Casti, "Reality Rules, The Fundamentals", 1992)

“Pedantry and sectarianism aside, the aim of theoretical physics is to construct mathematical models such as to enable us, from the use of knowledge gathered in a few observations, to predict by logical processes the outcomes in many other circumstances. Any logically sound theory satisfying this condition is a good theory, whether or not it be derived from ‘ultimate’ or ‘fundamental’ truth.” (Clifford Truesdell & Walter Noll, “The Non-Linear Field Theories of Mechanics” 2nd Ed., 1992)

"Nature behaves in ways that look mathematical, but nature is not the same as mathematics. Every mathematical model makes simplifying assumptions; its conclusions are only as valid as those assumptions. The assumption of perfect symmetry is excellent as a technique for deducing the conditions under which symmetry-breaking is going to occur, the general form of the result, and the range of possible behaviour. To deduce exactly which effect is selected from this range in a practical situation, we have to know which imperfections are present." (Ian Stewart & Martin Golubitsky, "Fearful Symmetry", 1992)

“A model is an imitation of reality and a mathematical model is a particular form of representation. We should never forget this and get so distracted by the model that we forget the real application which is driving the modelling. In the process of model building we are translating our real world problem into an equivalent mathematical problem which we solve and then attempt to interpret. We do this to gain insight into the original real world situation or to use the model for control, optimization or possibly safety studies." (Ian T Cameron & Katalin Hangos, “Process Modelling and Model Analysis”, 2001)

"Formulation of a mathematical model is the first step in the process of analyzing the behaviour of any real system. However, to produce a useful model, one must first adopt a set of simplifying assumptions which have to be relevant in relation to the physical features of the system to be modelled and to the specific information one is interested in. Thus, the aim of modelling is to produce an idealized description of reality, which is both expressible in a tractable mathematical form and sufficiently close to reality as far as the physical mechanisms of interest are concerned." (Francois Axisa, "Discrete Systems" Vol. I, 2001)

"[…] interval mathematics and fuzzy logic together can provide a promising alternative to mathematical modeling for many physical systems that are too vague or too complicated to be described by simple and crisp mathematical formulas or equations. When interval mathematics and fuzzy logic are employed, the interval of confidence and the fuzzy membership functions are used as approximation measures, leading to the so-called fuzzy systems modeling." (Guanrong Chen & Trung Tat Pham, "Introduction to Fuzzy Sets, Fuzzy Logic, and Fuzzy Control Systems", 2001)

"Modeling, in a general sense, refers to the establishment of a description of a system (a plant, a process, etc.) in mathematical terms, which characterizes the input-output behavior of the underlying system. To describe a physical system […] we have to use a mathematical formula or equation that can represent the system both qualitatively and quantitatively. Such a formulation is a mathematical representation, called a mathematical model, of the physical system." (Guanrong Chen & Trung Tat Pham, "Introduction to Fuzzy Sets, Fuzzy Logic, and Fuzzy Control Systems", 2001)

“What is a mathematical model? One basic answer is that it is the formulation in mathematical terms of the assumptions and their consequences believed to underlie a particular ‘real world’ problem. The aim of mathematical modeling is the practical application of mathematics to help unravel the underlying mechanisms involved in, for example, economic, physical, biological, or other systems and processes.” (John A Adam, “Mathematics in Nature”, 2003)

“Mathematical modeling is as much ‘art’ as ‘science’: it requires the practitioner to (i) identify a so-called ‘real world’ problem (whatever the context may be); (ii) formulate it in mathematical terms (the ‘word problem’ so beloved of undergraduates); (iii) solve the problem thus formulated (if possible; perhaps approximate solutions will suffice, especially if the complete problem is intractable); and (iv) interpret the solution in the context of the original problem.” (John A Adam, “Mathematics in Nature”, 2003)

“Mathematical modeling is the application of mathematics to describe real-world problems and investigating important questions that arise from it.” (Sandip Banerjee, “Mathematical Modeling: Models, Analysis and Applications”, 2014)

"A mathematical model is a mathematical description (often by means of a function or an equation) of a real-world phenomenon such as the size of a population, the demand for a product, the speed of a falling object, the concentration of a product in a chemical reaction, the life expectancy of a person at birth, or the cost of emission reductions. The purpose of the model is to understand the phenomenon and perhaps to make predictions about future behavior. [...] A mathematical model is never a completely accurate representation of a physical situation - it is an idealization." (James Stewart, "Calculus: Early Transcendentals" 8th Ed., 2016)
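
As a concrete, deliberately idealized illustration of the quotes above, here is a toy population model with invented numbers, following the formulate/solve/interpret pattern Adam describes: assume exponential growth, P(t) = P0 * e^(r*t), fit the rate from two observations, and use the result to predict.

    import math

    # (i) 'Real world' problem: a population observed at two times (invented counts)
    P0, P1 = 1000.0, 1221.0       # at t = 0 and t = 1 (years)

    # (ii) Formulate: assume exponential growth, P(t) = P0 * exp(r * t)
    r = math.log(P1 / P0)         # solve for the growth rate from the two observations

    # (iii) Solve: the model is now a function we can evaluate anywhere
    def P(t):
        return P0 * math.exp(r * t)

    # (iv) Interpret: predict, remembering the model is an idealization
    print(f"estimated growth rate r = {r:.3f} per year")
    print(f"predicted population at t = 5: {P(5):.0f}")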

"Machine learning is about making computers learn and perform tasks better based on past historical data. Learning is always based on observations from the data available. The emphasis is on making computers build mathematical models based on that learning and perform tasks automatically without the intervention of humans." (Umesh R Hodeghatta & Umesha Nayak, "Business Analytics Using R: A Practical Approach", 2017)

"Mathematical modeling is the modern version of both applied mathematics and theoretical physics. In earlier times, one proposed not a model but a theory. By talking today of a model rather than a theory, one acknowledges that the way one studies the phenomenon is not unique; it could also be studied other ways. One's model need not claim to be unique or final. It merits consideration if it provides an insight that isn't better provided by some other model." (Reuben Hersh, ”Mathematics as an Empirical Phenomenon, Subject to Modeling”, 2017)

14 December 2018

🔭Data Science: Algorithms (Just the Quotes)

"Mathematics is an aspect of culture as well as a collection of algorithms." (Carl B Boyer, "The History of the Calculus and Its Conceptual Development", 1959)

"Design problems - generating or discovering alternatives - are complex largely because they involve two spaces, an action space and a state space, that generally have completely different structures. To find a design requires mapping the former of these on the latter. For many, if not most, design problems in the real world systematic algorithms are not known that guarantee solutions with reasonable amounts of computing effort. Design uses a wide range of heuristic devices - like means-end analysis, satisficing, and the other procedures that have been outlined - that have been found by experience to enhance the efficiency of search. Much remains to be learned about the nature and effectiveness of these devices." (Herbert A Simon, "The Logic of Heuristic Decision Making", [in "The Logic of Decision and Action"], 1966)

"An algorithm must be seen to be believed, and the best way to learn what an algorithm is all about is to try it." (Donald E Knuth, The Art of Computer Programming Vol. I, 1968)

"Scientific laws give algorithms, or procedures, for determining how systems behave. The computer program is a medium in which the algorithms can be expressed and applied. Physical objects and mathematical structures can be represented as numbers and symbols in a computer, and a program can be written to manipulate them according to the algorithms. When the computer program is executed, it causes the numbers and symbols to be modified in the way specified by the scientific laws. It thereby allows the consequences of the laws to be deduced." (Stephen Wolfram, "Computer Software in Science and Mathematics", 1984)

"Algorithmic complexity theory and nonlinear dynamics together establish the fact that determinism reigns only over a quite finite domain; outside this small haven of order lies a largely uncharted, vast wasteland of chaos." (Joseph Ford, "Progress in Chaotic Dynamics: Essays in Honor of Joseph Ford's 60th Birthday", 1988)

"On this view, we recognize science to be the search for algorithmic compressions. We list sequences of observed data. We try to formulate algorithms that compactly represent the information content of those sequences. Then we test the correctness of our hypothetical abbreviations by using them to predict the next terms in the string. These predictions can then be compared with the future direction of the data sequence. Without the development of algorithmic compressions of data all science would be replaced by mindless stamp collecting - the indiscriminate accumulation of every available fact. Science is predicated upon the belief that the Universe is algorithmically compressible and the modern search for a Theory of Everything is the ultimate expression of that belief, a belief that there is an abbreviated representation of the logic behind the Universe's properties that can be written down in finite form by human beings." (John D Barrow, New Theories of Everything", 1991)

"Algorithms are a set of procedures to generate the answer to a problem." (Stuart Kauffman, "At Home in the Universe: The Search for Laws of Complexity", 1995)

"Let us regard a proof of an assertion as a purely mechanical procedure using precise rules of inference starting with a few unassailable axioms. This means that an algorithm can be devised for testing the validity of an alleged proof simply by checking the successive steps of the argument; the rules of inference constitute an algorithm for generating all the statements that can be deduced in a finite number of steps from the axioms." (Edward Beltrami, "What is Random?: Chaos and Order in Mathematics and Life", 1999)

"The vast majority of information that we have on most processes tends to be nonnumeric and nonalgorithmic. Most of the information is fuzzy and linguistic in form." (Timothy J Ross & W Jerry Parkinson, "Fuzzy Set Theory, Fuzzy Logic, and Fuzzy Systems", 2002)

"Knowledge is encoded in models. Models are synthetic sets of rules, and pictures, and algorithms providing us with useful representations of the world of our perceptions and of their patterns." (Didier Sornette, "Why Stock Markets Crash - Critical Events in Complex Systems", 2003)

"Swarm Intelligence can be defined more precisely as: Any attempt to design algorithms or distributed problem-solving methods inspired by the collective behavior of the social insect colonies or other animal societies. The main properties of such systems are flexibility, robustness, decentralization and self-organization." ("Swarm Intelligence in Data Mining", Ed. Ajith Abraham et al, 2006)

"The burgeoning field of computer science has shifted our view of the physical world from that of a collection of interacting material particles to one of a seething network of information. In this way of looking at nature, the laws of physics are a form of software, or algorithm, while the material world - the hardware - plays the role of a gigantic computer." (Paul C W Davies, "Laying Down the Laws", New Scientist, 2007)

"An algorithm refers to a successive and finite procedure by which it is possible to solve a certain problem. Algorithms are the operational base for most computer programs. They consist of a series of instructions that, thanks to programmers’ prior knowledge about the essential characteristics of a problem that must be solved, allow a step-by-step path to the solution." (Diego Rasskin-Gutman, "Chess Metaphors: Artificial Intelligence and the Human Mind", 2009)

"Programming is a science dressed up as art, because most of us don’t understand the physics of software and it’s rarely, if ever, taught. The physics of software is not algorithms, data structures, languages, and abstractions. These are just tools we make, use, and throw away. The real physics of software is the physics of people. Specifically, it’s about our limitations when it comes to complexity and our desire to work together to solve large problems in pieces. This is the science of programming: make building blocks that people can understand and use easily, and people will work together to solve the very largest problems." (Pieter Hintjens, "ZeroMQ: Messaging for Many Applications", 2012)

"These nature-inspired algorithms gradually became more and more attractive and popular among the evolutionary computation research community, and together they were named swarm intelligence, which became the little brother of the major four evolutionary computation algorithms." (Yuhui Shi, "Emerging Research on Swarm Intelligence and Algorithm Optimization", Information Science Reference, 2014)

"[...] algorithms, which are abstract or idealized process descriptions that ignore details and practicalities. An algorithm is a precise and unambiguous recipe. It’s expressed in terms of a fixed set of basic operations whose meanings are completely known and specified. It spells out a sequence of steps using those operations, with all possible situations covered, and it’s guaranteed to stop eventually." (Brian W Kernighan, "Understanding the Digital World", 2017)

"An algorithm is the computer science version of a careful, precise, unambiguous recipe or tax form, a sequence of steps that is guaranteed to compute a result correctly." (Brian W Kernighan, "Understanding the Digital World", 2017)

"Again, classical statistics only summarizes data, so it does not provide even a language for asking [a counterfactual] question. Causal inference provides a notation and, more importantly, offers a solution. As with predicting the effect of interventions [...], in many cases we can emulate human retrospective thinking with an algorithm that takes what we know about the observed world and produces an answer about the counterfactual world." (Judea Pearl & Dana Mackenzie, "The Book of Why: The new science of cause and effect", 2018)

"Algorithms describe the solution to a problem in terms of the data needed to represent the  problem instance and a set of steps necessary to produce the intended result." (Bradley N Miller et al, "Python Programming in Context", 2019)

"An algorithm, meanwhile, is a step-by-step recipe for performing a series of actions, and in most cases 'algorithm' means simply 'computer program'." (Tim Harford, "The Data Detective: Ten easy rules to make sense of statistics", 2020)

"Big data is revolutionizing the world around us, and it is easy to feel alienated by tales of computers handing down decisions made in ways we don’t understand. I think we’re right to be concerned. Modern data analytics can produce some miraculous results, but big data is often less trustworthy than small data. Small data can typically be scrutinized; big data tends to be locked away in the vaults of Silicon Valley. The simple statistical tools used to analyze small datasets are usually easy to check; pattern-recognizing algorithms can all too easily be mysterious and commercially sensitive black boxes." (Tim Harford, "The Data Detective: Ten easy rules to make sense of statistics", 2020)

"Each of us is sweating data, and those data are being mopped up and wrung out into oceans of information. Algorithms and large datasets are being used for everything from finding us love to deciding whether, if we are accused of a crime, we go to prison before the trial or are instead allowed to post bail. We all need to understand what these data are and how they can be exploited." (Tim Harford, "The Data Detective: Ten easy rules to make sense of statistics", 2020)

"Many people have strong intuitions about whether they would rather have a vital decision about them made by algorithms or humans. Some people are touchingly impressed by the capabilities of the algorithms; others have far too much faith in human judgment. The truth is that sometimes the algorithms will do better than the humans, and sometimes they won’t. If we want to avoid the problems and unlock the promise of big data, we’re going to need to assess the performance of the algorithms on a case-by-case basis. All too often, this is much harder than it should be. […] So the problem is not the algorithms, or the big datasets. The problem is a lack of scrutiny, transparency, and debate." (Tim Harford, "The Data Detective: Ten easy rules to make sense of statistics", 2020)

More quotes on "Algorithms" at the-web-of-knowledge.blogspot.com.

13 December 2018

🔭Data Science: Bayesian Networks (Just the Quotes)

"The best way to convey to the experimenter what the data tell him about theta is to show him a picture of the posterior distribution." (George E P Box & George C Tiao, "Bayesian Inference in Statistical Analysis", 1973)

"In the design of experiments, one has to use some informal prior knowledge. How does one construct blocks in a block design problem for instance? It is stupid to think that use is not made of a prior. But knowing that this prior is utterly casual, it seems ludicrous to go through a lot of integration, etc., to obtain 'exact' posterior probabilities resulting from this prior. So, I believe the situation with respect to Bayesian inference and with respect to inference, in general, has not made progress. Well, Bayesian statistics has led to a great deal of theoretical research. But I don't see any real utilizations in applications, you know. Now no one, as far as I know, has examined the question of whether the inferences that are obtained are, in fact, realized in the predictions that they are used to make." (Oscar Kempthorne, "A conversation with Oscar Kempthorne", Statistical Science, 1995)

"Bayesian methods are complicated enough, that giving researchers user-friendly software could be like handing a loaded gun to a toddler; if the data is crap, you won't get anything out of it regardless of your political bent." (Brad Carlin, "Bayes offers a new way to make sense of numbers", Science, 1999)

"Bayesian inference is a controversial approach because it inherently embraces a subjective notion of probability. In general, Bayesian methods provide no guarantees on long run performance." (Larry A Wasserman, "All of Statistics: A concise course in statistical inference", 2004)

"Bayesian inference is appealing when prior information is available since Bayes’ theorem is a natural way to combine prior information with data. Some people find Bayesian inference psychologically appealing because it allows us to make probability statements about parameters. […] In parametric models, with large samples, Bayesian and frequentist methods give approximately the same inferences. In general, they need not agree." (Larry A Wasserman, "All of Statistics: A concise course in statistical inference", 2004)

"The Bayesian approach is based on the following postulates: (B1) Probability describes degree of belief, not limiting frequency. As such, we can make probability statements about lots of things, not just data which are subject to random variation. […] (B2) We can make probability statements about parameters, even though they are fixed constants. (B3) We make inferences about a parameter θ by producing a probability distribution for θ. Inferences, such as point estimates and interval estimates, may then be extracted from this distribution." (Larry A Wasserman, "All of Statistics: A concise course in statistical inference", 2004)

"The important thing is to understand that frequentist and Bayesian methods are answering different questions. To combine prior beliefs with data in a principled way, use Bayesian inference. To construct procedures with guaranteed long run performance, such as confidence intervals, use frequentist methods. Generally, Bayesian methods run into problems when the parameter space is high dimensional." (Larry A Wasserman, "All of Statistics: A concise course in statistical inference", 2004) 

"Bayesian networks can be constructed by hand or learned from data. Learning both the topology of a Bayesian network and the parameters in the CPTs in the network is a difficult computational task. One of the things that makes learning the structure of a Bayesian network so difficult is that it is possible to define several different Bayesian networks as representations for the same full joint probability distribution." (John D Kelleher et al, "Fundamentals of Machine Learning for Predictive Data Analytics: Algorithms, worked examples, and case studies", 2015) 

"Bayesian networks provide a more flexible representation for encoding the conditional independence assumptions between the features in a domain. Ideally, the topology of a network should reflect the causal relationships between the entities in a domain. Properly constructed Bayesian networks are relatively powerful models that can capture the interactions between descriptive features in determining a prediction." (John D Kelleher et al, "Fundamentals of Machine Learning for Predictive Data Analytics: Algorithms, worked examples, and case studies", 2015) 

"Bayesian networks use a graph-based representation to encode the structural relationships - such as direct influence and conditional independence - between subsets of features in a domain. Consequently, a Bayesian network representation is generally more compact than a full joint distribution (because it can encode conditional independence relationships), yet it is not forced to assert a global conditional independence between all descriptive features. As such, Bayesian network models are an intermediary between full joint distributions and naive Bayes models and offer a useful compromise between model compactness and predictive accuracy." (John D Kelleher et al, "Fundamentals of Machine Learning for Predictive Data Analytics: Algorithms, worked examples, and case studies", 2015)

"Bayesian networks inhabit a world where all questions are reducible to probabilities, or (in the terminology of this chapter) degrees of association between variables; they could not ascend to the second or third rungs of the Ladder of Causation. Fortunately, they required only two slight twists to climb to the top." (Judea Pearl & Dana Mackenzie, "The Book of Why: The new science of cause and effect", 2018)

"The main differences between Bayesian networks and causal diagrams lie in how they are constructed and the uses to which they are put. A Bayesian network is literally nothing more than a compact representation of a huge probability table. The arrows mean only that the probabilities of child nodes are related to the values of parent nodes by a certain formula (the conditional probability tables) and that this relation is sufficient. That is, knowing additional ancestors of the child will not change the formula. Likewise, a missing arrow between any two nodes means that they are independent, once we know the values of their parents. [...] If, however, the same diagram has been constructed as a causal diagram, then both the thinking that goes into the construction and the interpretation of the final diagram change." (Judea Pearl & Dana Mackenzie, "The Book of Why: The new science of cause and effect", 2018)

"The transparency of Bayesian networks distinguishes them from most other approaches to machine learning, which tend to produce inscrutable 'black boxes'. In a Bayesian network you can follow every step and understand how and why each piece of evidence changed the network’s beliefs." (Judea Pearl & Dana Mackenzie, "The Book of Why: The new science of cause and effect", 2018)

"With Bayesian networks, we had taught machines to think in shades of gray, and this was an important step toward humanlike thinking. But we still couldn’t teach machines to understand causes and effects. [...] By design, in a Bayesian network, information flows in both directions, causal and diagnostic: smoke increases the likelihood of fire, and fire increases the likelihood of smoke. In fact, a Bayesian network can’t even tell what the 'causal direction' is." (Judea Pearl & Dana Mackenzie, "The Book of Why: The new science of cause and effect", 2018)

