
07 November 2020

DBMS: Event Streaming Databases (More of a Kafka Story)

Database Management

Event streaming architectures are architectures in which data are generated by different sources and then processed, stored, analyzed, and acted upon in real time by the different applications tapped into the data streams. An event streaming database is then a database that keeps its data continuously up to date, providing specific functionality like the management of connectors, materialized views and the running of queries on data-in-motion (rather than on static data).
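
As a minimal sketch of what tapping into a stream looks like, the snippet below publishes one event and consumes it as it arrives. The kafka-python client, the broker at localhost:9092 and the 'orders' topic are assumptions made for illustration, not part of the original text.

from kafka import KafkaProducer, KafkaConsumer
import json

# Publish one event into the stream (broker address and topic are assumed)
producer = KafkaProducer(
    bootstrap_servers='localhost:9092',
    value_serializer=lambda v: json.dumps(v).encode('utf-8'))
producer.send('orders', {'order_id': 1, 'amount': 9.99})
producer.flush()

# Tap into the stream and act on each event as it arrives
consumer = KafkaConsumer(
    'orders',
    bootstrap_servers='localhost:9092',
    auto_offset_reset='earliest',
    value_deserializer=lambda b: json.loads(b.decode('utf-8')))
for event in consumer:
    print(event.value)  # an application could e.g. update a materialized view here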

Reading about this type of technology, one can easily start fantasizing about the Web as a database in which intelligent agents process streams of data in real time, in which knowledge is derived and propagated over the networks in an infinite and ever-growing flow whose limits are hardly perceptible, in which the agents act as a mind disconnected in the end from the human intent. One is struck by the fusion of elements of realism with the fantastic, much like in a Kafka story in which the metamorphosis of technologies and social aspects can easily lead to absurd implications.

The link to Kafka was somehow suggested by Apache Kafka, an open-source distributed event streaming platform, which seems to lead the trends within this newly developing market. Kafka provides database functionality and guarantees the ACID (atomicity, consistency, isolation, durability) properties of transactions while tapping into data streams.
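
A hedged sketch of those transactional guarantees, using the confluent-kafka Python client (the client choice, the broker address, the transactional id and the 'payments' topic are all assumptions): the two events become visible to read-committed consumers together, or not at all.

from confluent_kafka import Producer

# Transactional producer: either all sends commit or none do (atomicity)
producer = Producer({
    'bootstrap.servers': 'localhost:9092',
    'transactional.id': 'demo-tx-1'})
producer.init_transactions()

producer.begin_transaction()
try:
    producer.produce('payments', key='order-1', value='debit 9.99')
    producer.produce('payments', key='order-1', value='credit 9.99')
    producer.commit_transaction()
except Exception:
    producer.abort_transaction()  # roll back both events together
    raise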

Data streaming is an appealing concept, though it comes with important challenges: data overload or over-flooding, and the complexity of building the specific business and integrity rules needed for processing the data and for keeping data consistent and truthful within ever-growing and ever-changing flows.

Data overload or over-flooding occurs when applications cannot keep pace with the volume of data or events fired with each change. Imagine raindrops falling on a wide surface in which each millimeter or micrometer has its own rules for handling the raindrops, and this at large scale. The raindrops must infiltrate the surface to be processed and find their way to the water flows beneath, aggregating into streams that could nurture seas or even oceans. The same metaphor can be applied to data events, in which the data pervade applications, accumulating in streams processed by engines to derive value. However, heavy rains can easily lead to floods in which the water aggregates at the surface.
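
The over-flooding scenario is easy to reproduce with only the Python standard library: a bounded queue plays the role of the surface, and once the slower consumer falls behind, new raindrops have nowhere to go. The buffer size and the rates are arbitrary assumptions.

import queue
import threading
import time

buffer = queue.Queue(maxsize=100)  # the finite capacity of the 'surface'
dropped = 0

def consume(events_per_second=200):
    # A slower engine deriving value from the stream
    while True:
        buffer.get()
        time.sleep(1 / events_per_second)

threading.Thread(target=consume, daemon=True).start()

# Fire events faster than they can be absorbed
for i in range(5000):
    try:
        buffer.put_nowait(i)
    except queue.Full:
        dropped += 1  # the 'flood': events accumulating at the surface
    time.sleep(1 / 1000)

print(f'dropped events: {dropped}')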

Business applications rely on predefined paths in which the flow of data is tidily linked to specific actions, themselves arranged in process sequences that reflect the material or cash flow. Any deviation of the data flow from expectations will lead to inefficiencies and ultimately to chaos. Some benefit might be derived from data integration between business applications; however, the applications must be designed for this purpose and must handle extreme behaviors like over-flooding.

Data streams are maybe ideal for social media networks, in which one broadcasts data through the network and any consumer that can tap into it can further use the respective data. We can see, however, the problems of today's social media: data, better said information, flow through the networks and are changed to fit purposes that can easily diverge from the initial intent. Moreover, information gets denatured, misused and overused to the degree that it overloads the networks, making it more and more difficult to distinguish between reliable and unreliable information. If common sense helps in handling such information, the same cannot be said about machines or applications.

It will be challenging to deal with the vastness, vagueness, uncertainty, inconsistency, and deceit of the networks of the future; however, data streaming will more likely have a future as long as it can address such issues by design.


27 August 2019

Information Security: Distributed Denial of Service (Definitions)

"An electronic attack perpetrated by a person who controls legions of hijacked computers. On a single command, the computers simultaneously send packets of data across the Internet at a target computer. The attack is designed to overwhelm the target and stop it from functioning." (Andy Walker, "Absolute Beginner’s Guide To: Security, Spam, Spyware & Viruses", 2005)

"A type of DoS attack in which many (usually thousands or millions) of systems flood the victim with unwanted traffic. Typically perpetrated by networks of zombie Trojans that are woken up specifically for the attack." (Mark Rhodes-Ousley, "Information Security: The Complete Reference" 2nd Ed., 2013)

"A denial of service (DoS) attack that comes from multiple sources at the same time. Attackers often enlist computers into botnets after infecting them with malware. Once infected, the attacker can then direct the infected computers to attack other computers." (Darril Gibson, "Effective Help Desk Specialist Skills", 2014)

"A denial of service technique using numerous hosts to perform the attack. For example, in a network flooding attack, a large number of co-opted computers (e.g., a botnet) send a large volume of spurious network packets to disable a specified target system. See also denial of service; botnet." (O Sami Saydjari, "Engineering Trustworthy Systems: Get Cybersecurity Design Right the First Time", 2018)

"A DoS attack in which multiple systems are used to flood servers with traffic in an attempt to overwhelm available resources (transmission capacity, memory, processing power, and so on), making them unavailable to respond to legitimate users." (William Stallings, "Effective Cybersecurity: A Guide to Using Best Practices and Standards", 2018)

"DDoS stands for distributed denial of service. In this type of an attack, an attacker tends to overwhelm the targeted network in order to make the services unavailable to the intended or legitimate user." (Kirti R Bhatele et al, "The Role of Artificial Intelligence in Cyber Security", Countering Cyber Attacks and Preserving the Integrity and Availability of Critical Systems, 2019)

"In DDoS attack, the incoming network traffic affects a target (e.g., server) from many different compromised sources. Consequently, online services are unavailable due to the attack. The target's resources are affected with different malicious network-based techniques (e.g., flood of network traffic packets)." (Ana Gavrovska & Andreja Samčović, "Intelligent Automation Using Machine and Deep Learning in Cybersecurity of Industrial IoT", 2020)

"This refers to malicious attacks or threats on computer systems to disrupt or break computing activities so that their access and availability is denied to the consumers of such systems or activities." (Heru Susanto et al, "Data Security for Connected Governments and Organisations: Managing Automation and Artificial Intelligence", 2021)

"A denial of service technique that uses numerous hosts to perform the attack." (CNSSI 4009-2015)

"A distributed denial-of-service (DDoS) attack is a malicious attempt to disrupt normal traffic on a targeted server, service or network by overwhelming the target or its surrounding infrastructure with a flood of Internet traffic." (proofpoint) [source]

26 August 2019

Information Security: Denial of Service (Definitions)

"A type of attack on a computer system that ties up critical system resources, making the system temporarily unusable." (Tom Petrocelli, "Data Protection and Information Lifecycle Management", 2005)

"Any attack that affects the availability of a service. Reliability bugs that cause a service to crash or hang are usually potential denial-of-service problems." (Mark S Merkow & Lakshmikanth Raghavan, "Secure and Resilient Software Development", 2010)

"This is a technique for overloading an IT system with a malicious workload, effectively preventing its regular service use." (Martin Oberhofer et al, "The Art of Enterprise Information Architecture", 2010)

"Occurs when a server or Web site receives a flood of traffic - much more traffic or requests for service than it can handle, causing it to crash." (Linda Volonino & Efraim Turban, "Information Technology for Management 8th Ed", 2011)

"Causing an information resource to be partially or completely unable to process requests. This is usually accomplished by flooding the resource with more requests than it can handle, thereby rendering it incapable of providing normal levels of service." (Mark Rhodes-Ousley, "Information Security: The Complete Reference, Second Edition" 2nd Ed., 2013)

"Attacks designed to disable a resource such as a server, network, or any other service provided by the company. If the attack is successful, the resource is no longer available to legitimate users." (Darril Gibson, "Effective Help Desk Specialist Skills", 2014)

"An attack from a single attacker designed to disrupt or disable the services provided by an IT system. Compare to distributed denial of service (DDoS)." (Darril Gibson, "Effective Help Desk Specialist Skills", 2014)

"A coordinated attack in which the target website or service is flooded with requests for access, to the point that it is completely overwhelmed." (Faithe Wempen, "Computing Fundamentals: Introduction to Computers", 2015)

"An attack that can result in decreased availability of the targeted system." (Mike Harwood, "Internet Security: How to Defend Against Attackers on the Web" 2nd Ed., 2015)

"An attack that generally floods a network with traffic. A successful DoS attack renders the network unusable and effectively stops the victim organization’s ability to conduct business." (Weiss, "Auditing IT Infrastructures for Compliance" 2nd Ed., 2015)

"A type of cyberattack to degrade the availability of a target system." (O Sami Saydjari, "Engineering Trustworthy Systems: Get Cybersecurity Design Right the First Time", 2018)

"Any action, or series of actions, that prevents a system, or its resources, from functioning in accordance with its intended purpose." (Shon Harris & Fernando Maymi, "CISSP All-in-One Exam Guide" 8th Ed., 2018)

"The prevention of authorized access to resources or the delaying of time-critical operations." (William Stallings, "Effective Cybersecurity: A Guide to Using Best Practices and Standards", 2018)

"An attack shutting down running of a service or network in order to render it inaccessible to its users (whether human person or a processing device)." (Wissam Abbass et al, "Internet of Things Application for Intelligent Cities: Security Risk Assessment Challenges", 2021)

"Actions that prevent the NE from functioning in accordance with its intended purpose. A piece of equipment or entity may be rendered inoperable or forced to operate in a degraded state; operations that depend on timeliness may be delayed." (NIST SP 800-13)

"The prevention of authorized access to resources or the delaying of time-critical operations. (Time-critical may be milliseconds or it may be hours, depending upon the service provided)." (NIST SP 800-12 Rev. 1)

"The prevention of authorized access to a system resource or the delaying of system operations and functions." (NIST SP 800-82 Rev. 2)


25 August 2019

Information Security: Attack (Definitions)

[active attack:] "Any network-based attack other than simple eavesdropping (i.e., a passive attack)." (Mark S Merkow & Lakshmikanth Raghavan, "Secure and Resilient Software Development", 2010)

"Unauthorized activity with malicious intent that uses specially crafted code or techniques." (Mark Rhodes-Ousley, "Information Security: The Complete Reference" 2nd Ed., 2013)

"An attempt to destroy, expose, alter, disable, steal or gain unauthorised access to or make unauthorised use of an asset," (David Sutton, "Information Risk Management: A practitioner’s guide", 2014)

[active attack:] "Attack where the attacker does interact with processing or communication activities." (Adam Gordon, "Official (ISC)2 Guide to the CISSP CBK" 4th Ed., 2015)

[passive attack:] "Attack where the attacker does not interact with processing or communication activities, but only carries out observation and data collection, as in network sniffing." (Adam Gordon, "Official (ISC)2 Guide to the CISSP CBK" 4th Ed., 2015)

"An attempt to gain unauthorized access to system services, resources, or information, or an attempt to compromise system integrity." (Olivera Injac & Ramo Šendelj, "National Security Policy and Strategy and Cyber Security Risks", 2016)

"A sequence of actions intended to have a specified effect favorable to an actor that is adversarial to the owners of that system." (O Sami Saydjari, "Engineering Trustworthy Systems: Get Cybersecurity Design Right the First Time", 2018)

"An attempt to bypass security controls in a system with the mission of using that system or compromising it. An attack is usually accomplished by exploiting a current vulnerability." (Shon Harris & Fernando Maymi, "CISSP All-in-One Exam Guide" 8th Ed., 2018)

"Any kind of malicious activity that attempts to collect, disrupt, deny, degrade, or destroy information system resources or information itself." (William Stallings, "Effective Cybersecurity: A Guide to Using Best Practices and Standards", 2018)

"an aggressive action against a person, an organisation or an asset intended to cause damage or loss." (ISO/IEC 27000:2014)

23 August 2019

Information Security: Cybercrime (Definitions)

 "A variety of offenses related to information technology, including extortion, boiler-room investment and gambling fraud, and fraudulent transfers of funds." (Robert McCrie, "Security Operations Management" 2nd Ed., 2006)

"Any type of crime that targets computers, or uses computer networks or devices, and violates existing laws. Cybercrime includes cyber vandalism, cyber theft, and cyber-attacks." (Darril Gibson, "Effective Help Desk Specialist Skills", 2014)

"Any crime that is facilitated through the use of computers and networks. This can include crimes that are dependent on computers or networks in order to take place, as well as those whose impact and reach are increased by their use." (Hamid R Arabnia et al, "Application of Big Data for National Security", 2015)

"Cybercrime is defined as any illegal activity that uses a computer either as the object of the crime OR as a tool to commit an offense." (Sanjukta Pookulangara, "Cybersecurity: What Matters to Consumers - An Exploratory Study", 2016)

"Any crime that is facilitated or committed using a computer, network, or hardware device." (Anisha B D Gani & Yudi Fernando, "Concept and Practices of Cyber Supply Chain in Manufacturing Context", 2018)

"Is all illegal acts, the commission of which involves the use of information and communication technologies. It is generally thought of as any criminal activity involving a computer system."  (Thokozani I Nzimakwe, "Government's Dynamic Approach to Addressing Challenges of Cybersecurity in South Africa", 2018)

"Any criminal action perpetrated primarily through the use of a computer." (Christopher T Anglim, "Cybersecurity Legislation", 2020)

"Criminal activity involving computer systems, networks, and/or the internet." (Boaventura DaCosta & Soonhwa Seok, "Cybercrime in Online Gaming", 2020)

07 August 2019

Information Security: Certificate (Definitions)

"An asymmetric key, usually issued by a certificate authority, that contains the public key of a public/private key pair as well as identifying information, expiration dates, and other information and that provides the ability to authenticate its holder. Certificates are used in SQL Server 2005 to secure logins or other database objects." (Victor Isakov et al, "MCITP Administrator: Microsoft SQL Server 2005 Optimization and Maintenance (70-444) Study Guide", 2007)

"A certificate is an electronic document consisting of an asymmetric key with additional metadata such as an expiration date and a digital signature that allows it to be verified by a third-party like a certificate authority (CA)." (Michael Coles, "Pro T-SQL 2008 Programmer's Guide", 2008)

"A certificate is an electronic document that uses a digital signature to bind an asymmetric key with a public identity. In its simplest form, a certificate is essentially an asymmetric key which can have additional metadata, like a certificate name, subject, and expiration date. A certificate can be selfsigned or issued by a certificate authority." (Michael Coles & Rodney Landrum, , "Expert SQL Server 2008 Encryption", 2008)

"A data object that binds information about a person or some other entity to a public key. The binding is generally done using a digital signature from a trusted third party (a certification authority)." (Mark S Merkow & Lakshmikanth Raghavan, "Secure and Resilient Software Development", 2010)

"(1) A token of authorization or authentication. (2) In data security, a computer data security object that includes identity information, validity specification, and a key." (DAMA International, "The DAMA Dictionary of Data Management", 2011)

"A digital document that is commonly used for authentication and to help secure information on a network. A certificate binds a public key to an entity that holds the corresponding private key. Certificates are digitally signed by the certification authority that issues them, and they can be issued for a user, a computer, or a service." (Microsoft, "SQL Server 2012 Glossary", 2012)

"A bundle of information containing the encrypted public key of the server, and the identification of the key provider." (Manish Agrawal, "Information Security and IT Risk Management", 2014)

"An electronic document used to identify an individual, a system, a server, a company, or some other entity, and to associate a public key with the entity. A digital certificate is issued by a certification authority and is digitally signed by that authority." (IBM, "Informix Servers 12.1", 2014)

"A representation of a sender’s authenticated public key used to minimize malicious forgeries" (Nell Dale & John Lewis, "Computer Science Illuminated" 6th Ed., 2015)

"A small electronic file that serves to validate or encrypt a message or browser session. Digital certificates are often used to create a digital signature which offers non-repudiation of a user or a Web site." (Mike Harwood, "Internet Security: How to Defend Against Attackers on the Web" 2nd Ed., 2015)

"An electronic document consisting of an asymmetric key with additional metadata such as an expiration date and a digital signature that allows it to be verified by a third party like a certificate authority (CA)." (Miguel Cebollero et al, "Pro T-SQL Programmer’s Guide 4th Ed", 2015)

"Cryptography-related electronic documents that allow for node identification and authentication. Digital certificates require more administrative work than some other methods but provide greater security." (Weiss, "Auditing IT Infrastructures for Compliance" 2nd Ed., 2015)

"Digital identity used within a PKI. Generated and maintained by a certificate authority and used for authentication." (Adam Gordon, "Official (ISC)2 Guide to the CISSP CBK" 4th Ed., 2015)

"A cryptographic binding between a user identifier and their public key as signed by a recognized authority called a certificate authority." (O Sami Saydjari, "Engineering Trustworthy Systems: Get Cybersecurity Design Right the First Time", 2018)

"In computer security, a digital document that binds a public key to the identity of the certificate owner, thereby enabling the certificate owner to be authenticated. A certificate is issued by a certificate authority and is digitally signed by that authority." (Sybase, "Open Server Server-Library/C Reference Manual", 2019)

"An electronic document using a digital signature to assert the identity of a person, group, or organization. Certificates attest to the identity of a person or group and contain that organization’s public key. A certificate is signed by a certificate authority with its digital signature." (Daniel Leuck et al, "Learning Java" 5th Ed., 2020)

30 July 2019

IT: Network (Definitions)

"Mathematically defined structure of a computing system where the operations are performed at specific locations (nodes) and the flow of information is represented by directed arcs." (Guido Deboeck & Teuvo Kohonen (Eds), "Visual Explorations in Finance with Self-Organizing Maps 2nd Ed.", 2000)

"A system of interconnected computing resources (computers, servers, printers, and so on)." (Sharon Allen & Evan Terry, "Beginning Relational Data Modeling 2nd Ed.", 2005)

"A system of connected computers. A local area network (LAN) is contained within a single company, in a single office. A wide area network (WAN) is generally distributed across a geographical area — even globally. The Internet is a very loosely connected network, meaning that it is usable by anyone and everyone." (Gavin Powell, "Beginning Database Design", 2006)

"A system of interconnected devices that provides a means for data to be transmitted from point to point." (Janice M Roehl-Anderson, "IT Best Practices for Financial Managers", 2010)

"1.Visually, a graph of nodes and connections where more than one entry point for each node is allowed. 2.In architecture, a topological arrangement of hardware and connections to allow communication between nodes and access to shared data and software." (DAMA International, "The DAMA Dictionary of Data Management", 2011)

"The connection of computer systems (nodes) by communications channels and appropriate software. |" (Marcia Kaufman et al, "Big Data For Dummies", 2013)

"The means by which electronic communications occurs between two or more nodes" (Daniel Linstedt & W H Inmon, "Data Architecture: A Primer for the Data Scientist", 2014)

"Two or more computers connected to share data and resources." (Faithe Wempen, "Computing Fundamentals: Introduction to Computers", 2015)

"People working towards a common purpose or with common interests where there is no requirement for members of the network to have a work relationship with others, and there is no requirement for mutuality as there is with a team." (Catherine Burke et al, "Systems Leadership, 2nd Ed,", 2018)

28 July 2019

IT: Internet of Things (Definitions)

"A term used to describe the community or collection of people and items that use the Internet to communicate with other." (Kenneth A Shaw, "Integrated Management of Processes and Information", 2013)

"The embedding of objects with sensors, coupled with the ability of objects to communicate, driving an explosion in the growth of big data." (Brenda L Dietrich et al, "Analytics Across the Enterprise", 2014)

"The Internet of Things entails the aim of all physical or uniquely identifiable objects being connected through wired and wireless networks. In this notion, every object would be virtually represented. Connecting objects in this way offers a whole new universe of possibilities. Real-time analysis of big data streams could enhance productivity and safety of systems (for example, roadways and cars being part of the Internet of Things could help to manage traffic flow). It can also make everyday life more convenient and sustainable (such as connecting all household devices to save electricity)." (Martin Hoegl et al, "Using Thematic Thinking to Achieve Business Success, Growth, and Innovation", 2014)

"IOT refers to a network of machines that have sensors and are interconnected enabling them to collect and exchange data. This interconnection enables devices to be controlled remotely resulting in process efficiencies and lower costs." (Saumya Chaki, "Enterprise Information Management in Practice", 2015)

"An interconnected network of physical devices, vehicles, buildings, and other items embedded with sensors that gather and share data." (Jonathan Ferrar et al, "The Power of People: Learn How Successful Organizations Use Workforce Analytics To Improve Business Performance", 2017)

"Ordinary devices that are connected to the Internet at any time, anywhere, via sensors." (Jason Williamson, "Getting a Big Data Job For Dummies", 2015)

"Also referred to as IoT. Term that describes the connectivity of objects to the Internet and the ability for these objects to send and receive data from each other." (Brittany Bullard, "Style and Statistics", 2016)

"computing or 'smart' devices often with ­sensor capability and the ability to collect, share, and transfer data using the Internet." (Daniel J. Power & Ciara Heavin, "Data-Based Decision Making and Digital Transformation", 2018)

"The wide-scale deployment of small, low-power computing devices into everyday devices, such as thermostats, refrigerators, clothing, and even into people themselves to continuously monitor health." (O Sami Saydjari, "Engineering Trustworthy Systems: Get Cybersecurity Design Right the First Time", 2018)

"A network of physical objects that have, like cell phones and laptops, internet connectivity enabling automatic communication between them and any other machine connected to the internet without human intervention." (Sue Milton, "Data Privacy vs. Data Security", 2021)

"Integration of various processes such as identifying, sensing, networking, and computation." (Revathi Rajendran et al, "Convergence of AI, ML, and DL for Enabling Smart Intelligence: Artificial Intelligence, Machine Learning, Deep Learning, Internet of Things", 2021)

"It is an interdisciplinary field who is associated with the electronics and computer science. Electronics deals with the development of new sensors or hardware for IoT device and computer science deals with the development of software, protocols and cloud based solution to store the data generated form these IoT devices."  (Ajay Sharma, "Smart Agriculture Services Using Deep Learning, Big Data, and IoT", 2021)

"IoT is a network of real-world objects which consists of sensors, software, and other technologies to exchange data with the other systems over the internet." (Hari K Kondaveeti et al, "Deep Learning Applications in Agriculture: The Role of Deep Learning in Smart Agriculture", 2021)

"This refers to a system of inter-connected computing and smart devices, that are provided with unique identifiers and the ability to transfer data over a network without requiring human interaction." (Wissam Abbass et al, "Internet of Things Application for Intelligent Cities: Security Risk Assessment Challenges", 2021)

"describes the network where sensing elements such as sensors, cameras, and devices are increasingly linked together via the internet to connect, communicate and exchange information." (Accenture)

"ordinary devices that are connected to the internet at any time anywhere via sensors." (Analytics Insight)

"Technologies that enable objects and infrastructure to interact with monitoring, analytics, and control systems over internet-style networks." (Forrester)

27 July 2019

IT: Cloud (Definitions)

"A set of computers, typically maintained in a data center, that can be allocated dynamically and accessed remotely. Unlike a cluster, cloud computers are typically managed by a third party and may host multiple applications from different, unrelated users." (Michael McCool et al, "Structured Parallel Programming", 2012)

"A network that delivers requested virtual resources as a service." (IBM, "Informix Servers 12.1", 2014)

"A secure computing environment accessed via the Internet." (Faithe Wempen, "Computing Fundamentals: Introduction to Computers", 2015)

"Products and services managed by a third-party company and made available through the Internet." (David K Pham, "From Business Strategy to Information Technology Roadmap", 2016)

"It has the ability to offer and to assist any kind of useful information without any limitations for users." (Shigeki Sugiyama. "Human Behavior and Another Kind in Consciousness: Emerging Research and Opportunities", 2019)

"Remote server and distributed computing environment used to store data and provision computing related services as and when needed on a pay-as-you-go basis." (Wissam Abbass et al, "Internet of Things Application for Intelligent Cities: Security Risk Assessment Challenges", 2021)

"The virtual world in which information technology tools and services are available for hire, use and storage via the internet, Wi-Fi and physical attributes ranging from IT components to data storage." (Sue Milton, "Data Privacy vs. Data Security", 2021)

"uses a network of remote servers hosted on the internet to store, manage, and process data, rather than requiring a local server or a personal computer." (Accenture)

24 July 2019

IT: Virtualization (Definitions)

"Creation of a virtual, as opposed to a real, instance of an entity, such as an operating system, server, storage, or network." (David G Hill, "Data Protection: Governance, Risk Management, and Compliance", 2009)

"The process of partitioning a computer so that multiple operating system instances can run at the same time on a single physical computer." (John Goodson & Robert A Steward, "The Data Access Handbook", 2009)

"A concept that separates business applications and data from hardware resources, allowing companies to pool hardware resources, rather than dedicate servers to application and assign those resources to applications as needed." (Linda Volonino & Efraim Turban, "Information Technology for Management" 8th Ed, 2011)

"A technique that creates logical representations of computing resources that are independent of the underlying physical computing resources." (Carlos Coronel et al, "Database Systems: Design, Implementation, and Management" 9th Ed., 2011)

"A method for managing hardware assets used at the same time by different users or processes, or both, that makes the part assigned to each user or process appear to act as if it was running on a separate piece of equipment." (Kenneth A Shaw, "Integrated Management of Processes and Information", 2013)

"Virtual memory is the use of a disk to store active areas of memory to make the available memory appear larger. In a virtual environment, one computer runs software that allows it to emulate another machine. This kind of emulation is commonly known as virtualization." (Marcia Kaufman et al, "Big Data For Dummies", 2013)

"A technique common in computing, consisting in the creation of virtual (rather than actual) instance of any element, so it can be managed and used independently. Virtualization has been one of the key tools for resource sharing and software development, and now it is beginning to be applied to the network disciplines." (Diego R López & Pedro A. Aranda, "Network Functions Virtualization: Going beyond the Carrier Cloud", 2015)

"Creation of a simulated environment (hardware platform, operating system, storage, etc.) that allows for central control and scalability." (Adam Gordon, "Official (ISC)2 Guide to the CISSP CBK 4th Ed.", 2015)

"The creation of a virtual version of actual services, applications, or resources." (Mike Harwood, "Internet Security: How to Defend Against Attackers on the Web" 2nd Ed., 2015)

"The process of creating a virtual version of a resource, such as an operating system, hardware platform, or storage device." (Andrew Pham et al, "From Business Strategy to Information Technology Roadmap", 2016)

"A base component of the cloud that consists of software that emulates physical infrastructure." (Richard Ehrhardt, "Cloud Build Methodology", 2017)

"The process of presenting an abstraction of hardware resources to give the appearance of dedicated access and control to hardware resources, while, in reality, those resources are being shared." (O Sami Saydjari, "Engineering Trustworthy Systems: Get Cybersecurity Design Right the First Time", 2018)

16 July 2019

IT: Quality of Service (Definitions)

"The guaranteed performance of a network connection." (Tom Petrocelli, "Data Protection and Information Lifecycle Management", 2005)

"QoS (Quality of Service) is a metric for quantifying desired or delivered degree of service reliability, priority, and other measures of interest for its quality." (Bo Leuf, "The Semantic Web: Crafting infrastructure for agency", 2006)

"a criterion of performance of a service or element, such as the worst-case execution time for an operation." (Bruce P Douglass, "Real-Time Agility: The Harmony/ESW Method for Real-Time and Embedded Systems Development", 2009)

"The QoS describes the non-functional aspects of a service such as performance." (Martin Oberhofer et al, "The Art of Enterprise Information Architecture", 2010)

"QoS (Quality of Service) Networking technology that enables network administrators to manage bandwidth and give priority to desired types of application traffic as it traverses the network." (Mark Rhodes-Ousley, "Information Security: The Complete Reference" 2nd Ed., 2013)

"A negotiated contract between a user and a network provider that renders some degree of reliable capacity in the shared network." (Gartner)

"Quality of service (QoS) is the description or measurement of the overall performance of a service, especially in terms of the user’s experience. Typically it is used in reference to telephony or computer networks, or to online and cloud-hosted services." (Barracuda) [source]

"The measurable end-to-end performance properties of a network service, which can be guaranteed in advance by a Service Level Agreement between a user and a service provider, so as to satisfy specific customer application requirements. Note: These properties may include throughput (bandwidth), transit delay (latency), error rates, priority, security, packet loss, packet jitter, etc." (CNSSI 4009-2015)

13 July 2019

IT: Extranet (Definitions)

"A secure Internet site available only to a company’s internal staff and approved third-party partners. Extranets are flourishing in B2B environments where suppliers can have ready access to updated information from their business customers, and vice versa." (Evan Levy & Jill Dyché, "Customer Data Integration", 2006)

"Semi-public TCP/IP network used by several collaborating partners." (Martin J Eppler, "Managing Information Quality 2nd Ed.", 2006)

"Enterprise network using Web technologies for collaboration of internal users and selected external business partners." (Paulraj Ponniah, "Data Warehousing Fundamentals for IT Professionals", 2010)

"An internal network or intranet opened to selected business partners. Suppliers, distributors, and other authorized users can connect to a company’s network over the Internet or through private networks." (DAMA International, "The DAMA Dictionary of Data Management", 2011)

"Private, company-owned network that uses IP technology to securely share part of a business's information or operations with suppliers, vendors, partners, customers, or other businesses." (Linda Volonino & Efraim Turban, "Information Technology for Management 8th Ed", 2011)

"A network that is outside the control of the company. Extranets are usually connections to outside companies, service providers, customers, and business partners." (Mark Rhodes-Ousley, "Information Security: The Complete Reference" 2nd Ed., 2013)

"A special network set up by a business for its customers, staff, and business partners to access from outside the office network; may be used to share marketing assets and other non-sensitive items." (Faithe Wempen, "Computing Fundamentals: Introduction to Computers", 2015)

"An extension of the corporate intranet over the Internet so that vendors, business partners, customers, and others can have access to the intranet." (James R Kalyvas & Michael R Overly, "Big Data: A Businessand Legal Guide", 2015)

12 July 2019

IT: Intranet (Definitions)

"This is a network technology similar to the Internet that has been constructed by a company for its own benefit. Usually access to a company's intranet is limited to its employees, customers, and vendors." (Dale Furtwengler, "Ten Minute Guide to Performance Appraisals", 2000)

"A private network that uses web technology to distribute information. Usually used to make information available inside a company among employees." (Andy Walker, "Absolute Beginner’s Guide To: Security, Spam, Spyware & Viruses", 2005)

"An organization’s internal system of connected networks built on Internet-standard protocols and usually connected to the Internet via a firewall." (Sharon Allen & Evan Terry, "Beginning Relational Data Modeling 2nd Ed.", 2005)

"Internal company networks designed to provide a secure forum for sharing information, often in a web-browser type interface." (Martin J Eppler, "Managing Information Quality 2nd Ed.", 2006)

"The enterprise network using Web technologies for collaboration of internal users." (Paulraj Ponniah, "Data Warehousing Fundamentals for IT Professionals", 2010)

"A subset of the Internet used internally by an organization. Unlike the larger Internet, intranets are private and accessible only from within the organization. The use of Internet technologies over a private network." (DAMA International, "The DAMA Dictionary of Data Management", 2011)

"Network designed to serve the internal informational needs of a company, using Internet tools." (Linda Volonino & Efraim Turban, "Information Technology for Management" 8th Ed., 2011)

"a private web site available only to those within a company or organization." (Bill Holtsnider & Brian D Jaffe, "IT Manager's Handbook" 3rd Ed., 2012)

"A computer network designed to be used within a business or company. An intranet is so named because it uses much of the same technology as the Internet. Web browsers, email, newsgroups, HTML documents, and websites are all found on intranets.  In addition, the method for transmitting information on these networks is TCP/IP (Transmission Control Protocol/Internet Protocol). See Internet." (James R Kalyvas & Michael R Overly, "Big Data: A Businessand Legal Guide", 2015)

"A special network that only staff within the company network can access. For security reasons an intranet can only be accessed onsite and not remotely." (Faithe Wempen, "Computing Fundamentals: Introduction to Computers", 2015)

 "A trusted digital source of corporate communication and content designed to educate and empower employees and improve their workplace experiences." (Forrester)

08 July 2019

IT: Grid Computing (Definitions)

"A grid is an architecture for distributed computing and resource sharing. A grid system is composed of a heterogeneous collection of resources connected by local-area and/or wide-area networks (often the Internet). These individual resources are general and include compute servers, storage, application servers, information services, or even scientific instruments. Grids are often implemented in terms of Web services and integrated middleware components that provide a consistent interface to the grid. A grid is different from a cluster in that the resources in a grid are not controlled through a single point of administration; the grid middleware manages the system so control of resources on the grid and the policies governing use of the resources remain with the resource owners." (Beverly A Sanders, "Patterns for Parallel Programming", 2004)

"Clusters of cheap computers, perhaps distributed on a global basis, connected using even something as loosely connected as the Internet." (Gavin Powell, "Beginning Database Design", 2006)

"A step beyond distributed processing. Grid computing involves large numbers of networked computers, often geographically dispersed and possibly of different types and capabilities, that are harnessed together to solve a common problem." (Judith Hurwitz et al, "Service Oriented Architecture For Dummies" 2nd Ed., 2009)

"A web-based operation allowing companies to share computing resources on demand." (DAMA International, "The DAMA Dictionary of Data Management", 2011)

"The use of networks to harness the unused processing cycles of all computers in a given network to create powerful computing capabilities." (Linda Volonino & Efraim Turban, "Information Technology for Management" 8th Ed., 2011)

"A distributed set of computers that can be allocated dynamically and accessed remotely. A grid is distinguished from a cloud in that a grid may be supported by multiple organizations and is usually more heterogeneous and physically distributed." (Michael McCool et al, "Structured Parallel Programming", 2012)

"the use of multiple computing resources to leverage combined processing power. Usually associated with scientific applications." (Bill Holtsnider & Brian D Jaffe, "IT Manager's Handbook" 3rd Ed., 2012)

"A step beyond distributed processing, involving large numbers of networked computers (often geographically dispersed and possibly of different types and capabilities) that are harnessed to solve a common problem. A grid computing model can be used instead of virtualization in situations that require real time where latency is unacceptable." (Marcia Kaufman et al, "Big Data For Dummies", 2013)

"A named set of interconnected replication servers for propagating commands from an authorized server to the rest of the servers in the set." (IBM, "Informix Servers 12.1", 2014)

"A type of computing in which large computing tasks are distributed among multiple computers on a network." (Jim Davis & Aiman Zeid, "Business Transformation: A Roadmap for Maximizing Organizational Insights", 2014)

"Connecting many computer system locations, often via the cloud, working together for the same purpose." (Jason Williamson, "Getting a Big Data Job For Dummies", 2015)

"A computer network that enables distributed resource management and on-demand services." (Forrester)

"A computing architecture that coordinates large numbers of servers and storage to act as a single large computer." (Oracle, "Oracle Database Concepts")

"connecting different computer systems from various location, often via the cloud, to reach a common goal." (Analytics Insight)

06 July 2019

IT: Latency (Definitions)

"The fixed cost of servicing a request, such as sending a message or accessing information from a disk. In parallel computing, the term most often is used to refer to the time it takes to send an empty message over the communication medium, from the time the send routine is called to the time the empty message is received by the recipient. Programs that generate large numbers of small messages are sensitive to the latency and are called latency-bound programs." (Beverly A Sanders, "Patterns for Parallel Programming", 2004)

"The amount of time it takes a system to deliver data in response to a request. For mass storage devices, it is the time it takes to place the read or write heads over the desired spot on the media. In networks, it is a function of the electrical and software properties of the network connection." (Tom Petrocelli, "Data Protection and Information Lifecycle Management", 2005)

"The time delay it takes for a network packet to travel from one destination to another." (John Goodson & Robert A Steward, "The Data Access Handbook", 2009)

"The time it takes for a system to respond to an input." (W Roy Schulte & K Chandy, "Event Processing: Designing IT Systems for Agile Companies", 2009)

"A period of time that the computer must wait while a disk drive is positioning itself to read a particular block of data." (Rod Stephens, "Start Here!™ Fundamentals of Microsoft® .NET Programming", 2011)

"The measure of time between two events, such as the initiation and completion of an event, or the read on one system and the write to another system." (DAMA International, "The DAMA Dictionary of Data Management", 2011)

"The time period from start to completion of a unit of work." (Max Domeika, "Software Development for Embedded Multi-core Systems", 2011)

"The time it takes to complete a task - that is, the time between when the task begins and when it ends. Latency has units of time. The scale can be anywhere from nanoseconds to days. Lower latency is better in general." (Michael McCool et al, "Structured Parallel Programming", 2012)

"The amount of time lag before a service executes in an environment. Some applications require less latency and need to respond in near real time, whereas other applications are less time-sensitive." (Marcia Kaufman et al, "Big Data For Dummies", 2013)

"A delay. Can apply to the sending, processing, transmission, storage, or receiving of information." (Mike Harwood, "Internet Security: How to Defend Against Attackers on the Web" 2nd Ed., 2015)

"A period of waiting for another component to deliver data needed to proceed." (Faithe Wempen, "Computing Fundamentals: Introduction to Computers", 2015)

"The time it takes for the specified sector to be in position under the read/write head" (Nell Dale & John Lewis, "Computer Science Illuminated" 6th Ed., 2015)

"The delay between when an action such as transmitting data is taken and when it has an effect." (O Sami Saydjari, "Engineering Trustworthy Systems: Get Cybersecurity Design Right the First Time", 2018)

05 July 2019

IT: Gateway (Definitions)

"A network software product that allows computers or networks running dissimilar protocols to communicate, providing transparent access to a variety of foreign database management systems (DBMSs). A gateway moves specific database connectivity and conversion processing from individual client computers to a single server computer. Communication is enabled by translating up one protocol stack and down the other. Gateways usually operate at the session layer." (Microsoft Corporation, "SQL Server 7.0 System Administration Training Kit", 1999)

"Connectivity software that allows two or more computer systems with different network architectures to communicate." (Sybase, "Glossary", 2005)

"A generic term referring to a computer system that routes data or merges two dissimilar services together." (Paulraj Ponniah, "Data Warehousing Fundamentals for IT Professionals", 2010)

"A software product that allows SQL-based applications to access relational and non-relational data sources." (DAMA International, "The DAMA Dictionary of Data Management", 2011)

"An entrance point that allows users to connect from one network to another." (Linda Volonino & Efraim Turban, "Information Technology for Management" 8th Ed., 2011)

[database gateway:] "Software required to allow clients to access data stored on database servers over a network connection." (Craig S Mullins, "Database Administration: The Complete Guide to DBA Practices and Procedures" 2nd Ed., 2012)

"A connector box that enables you to connect two dissimilar networks." (Faithe Wempen, "Computing Fundamentals: Introduction to Computers", 2015)

"A node that handles communication between its LAN and other networks" (Nell Dale & John Lewis, "Computer Science Illuminated, 6th Ed.", 2015)

"A system or device that connects two unlike environments or systems. The gateway is usually required to translate between different types of applications or protocols." (Shon Harris & Fernando Maymi, "CISSP All-in-One Exam Guide" 8th Ed., 2018)

"An application that acts as an intermediary for clients and servers that cannot communicate directly. Acting as both client and server, a gateway application passes requests from a client to a server and returns results from the server to the client." (Sybase, "Open Server Server-Library/C Reference Manual", 2019)

24 December 2018

Data Science: Randomness (Just the Quotes)

"If the number of experiments be very large, we may have precise information as to the value of the mean, but if our sample be small, we have two sources of uncertainty: (I) owing to the 'error of random sampling' the mean of our series of experiments deviates more or less widely from the mean of the population, and (2) the sample is not sufficiently large to determine what is the law of distribution of individuals." William S Gosset, "The Probable Error of a Mean", Biometrika, 1908)

"The most important application of the theory of probability is to what we may call 'chance-like' or 'random' events, or occurrences. These seem to be characterized by a peculiar kind of incalculability which makes one disposed to believe - after many unsuccessful attempts - that all known rational methods of prediction must fail in their case. We have, as it were, the feeling that not a scientist but only a prophet could predict them. And yet, it is just this incalculability that makes us conclude that the calculus of probability can be applied to these events." (Karl R Popper, "The Logic of Scientific Discovery", 1934)

"The definition of random in terms of a physical operation is notoriously without effect on the mathematical operations of statistical theory because so far as these mathematical operations are concerned random is purely and simply an undefined term." (Walter A Shewhart & William E Deming, "Statistical Method from the Viewpoint of Quality Control", 1939)

"The first attempts to consider the behavior of so-called 'random neural nets' in a systematic way have led to a series of problems concerned with relations between the 'structure' and the 'function' of such nets. The 'structure' of a random net is not a clearly defined topological manifold such as could be used to describe a circuit with explicitly given connections. In a random neural net, one does not speak of 'this' neuron synapsing on 'that' one, but rather in terms of tendencies and probabilities associated with points or regions in the net." (Anatol Rapoport, "Cycle distributions in random nets", The Bulletin of Mathematical Biophysics 10(3), 1948)

"Time itself will come to an end. For entropy points the direction of time. Entropy is the measure of randomness. When all system and order in the universe have vanished, when randomness is at its maximum, and entropy cannot be increased, when there is no longer any sequence of cause and effect, in short when the universe has run down, there will be no direction to time - there will be no time." (Lincoln Barnett, "The Universe and Dr. Einstein", 1948)

"We must emphasize that such terms as 'select at random', 'choose at random', and the like, always mean that some mechanical device, such as coins, cards, dice, or tables of random numbers, is used." (Frederick Mosteller et al, "Principles of Sampling", Journal of the American Statistical Association Vol. 49 (265), 1954)

"[…] random numbers should not be generated with a method chosen at random. Some theory should be used." (Donald E Knuth, "The Art of Computer Programming" Vol. II, 1968)

"It appears to be a quite general principle that, whenever there is a randomized way of doing something, then there is a nonrandomized way that delivers better performance but requires more thought." (Edwin T Jaynes, "Probability Theory: The Logic of Science", 1979)

"From a purely operational point of viewpoint […] the concept of randomness is so elusive as to cease to be viable." (Mark Kac, 1983)

"Randomness is a difficult notion for people to accept. When events come in clusters and streaks, people look for explanations and patterns. They refuse to believe that such patterns - which frequently occur in random data - could equally well be derived from tossing a coin. So it is in the stock market as well." (Burton G Malkiel, "A Random Walk Down Wall Street", 1989)

"The term chaos is used in a specific sense where it is an inherently random pattern of behaviour generated by fixed inputs into deterministic (that is fixed) rules (relationships). The rules take the form of non-linear feedback loops. Although the specific path followed by the behaviour so generated is random and hence unpredictable in the long-term, it always has an underlying pattern to it, a 'hidden' pattern, a global pattern or rhythm. That pattern is self-similarity, that is a constant degree of variation, consistent variability, regular irregularity, or more precisely, a constant fractal dimension. Chaos is therefore order (a pattern) within disorder (random behaviour)." (Ralph D Stacey, "The Chaos Frontier: Creative Strategic Control for Business", 1991)

"Chaos demonstrates that deterministic causes can have random effects […] There's a similar surprise regarding symmetry: symmetric causes can have asymmetric effects. […] This paradox, that symmetry can get lost between cause and effect, is called symmetry-breaking. […] From the smallest scales to the largest, many of nature's patterns are a result of broken symmetry; […]" (Ian Stewart & Martin Golubitsky, "Fearful Symmetry: Is God a Geometer?", 1992)

"Probability theory is an ideal tool for formalizing uncertainty in situations where class frequencies are known or where evidence is based on outcomes of a sufficiently long series of independent random experiments. Possibility theory, on the other hand, is ideal for formalizing incomplete information expressed in terms of fuzzy propositions." (George Klir, "Fuzzy sets and fuzzy logic", 1995)

"We use mathematics and statistics to describe the diverse realms of randomness. From these descriptions, we attempt to glean insights into the workings of chance and to search for hidden causes. With such tools in hand, we seek patterns and relationships and propose predictions that help us make sense of the world."  (Ivars Peterson, "The Jungles of Randomness: A Mathematical Safari", 1998)

"Events may appear to us to be random, but this could be attributed to human ignorance about the details of the processes involved." (Brain S Everitt, "Chance Rules", 1999)

"The self-similarity of fractal structures implies that there is some redundancy because of the repetition of details at all scales. Even though some of these structures may appear to teeter on the edge of randomness, they actually represent complex systems at the interface of order and disorder."  (Edward Beltrami, "What is Random?: Chaos and Order in Mathematics and Life", 1999)

"Most physical systems, particularly those complex ones, are extremely difficult to model by an accurate and precise mathematical formula or equation due to the complexity of the system structure, nonlinearity, uncertainty, randomness, etc. Therefore, approximate modeling is often necessary and practical in real-world applications. Intuitively, approximate modeling is always possible. However, the key questions are what kind of approximation is good, where the sense of 'goodness' has to be first defined, of course, and how to formulate such a good approximation in modeling a system such that it is mathematically rigorous and can produce satisfactory results in both theory and applications." (Guanrong Chen & Trung Tat Pham, "Introduction to Fuzzy Sets, Fuzzy Logic, and Fuzzy Control Systems", 2001)

"[…] we would like to observe that the butterfly effect lies at the root of many events which we call random. The final result of throwing a dice depends on the position of the hand throwing it, on the air resistance, on the base that the die falls on, and on many other factors. The result appears random because we are not able to take into account all of these factors with sufficient accuracy. Even the tiniest bump on the table and the most imperceptible move of the wrist affect the position in which the die finally lands. It would be reasonable to assume that chaos lies at the root of all random phenomena." (Iwo Białynicki-Birula & Iwona Białynicka-Birula, "Modeling Reality: How Computers Mirror Life", 2004)

"Chance is just as real as causation; both are modes of becoming. The way to model a random process is to enrich the mathematical theory of probability with a model of a random mechanism. In the sciences, probabilities are never made up or 'elicited' by observing the choices people make, or the bets they are willing to place. The reason is that, in science and technology, interpreted probability exactifies objective chance, not gut feeling or intuition. No randomness, no probability." (Mario Bunge, "Chasing Reality: Strife over Realism", 2006)

"Complexity arises when emergent system-level phenomena are characterized by patterns in time or a given state space that have neither too much nor too little form. Neither in stasis nor changing randomly, these emergent phenomena are interesting, due to the coupling of individual and global behaviours as well as the difficulties they pose for prediction. Broad patterns of system behaviour may be predictable, but the system's specific path through a space of possible states is not." (Steve Maguire et al, "Complexity Science and Organization Studies", 2006)

"A Black Swan is a highly improbable event with three principal characteristics: It is unpredictable; it carries a massive impact; and, after the fact, we concoct an explanation that makes it appear less random, and more predictable, than it was. […] The Black Swan idea is based on the structure of randomness in empirical reality. [...] the Black Swan is what we leave out of simplification." (Nassim N Taleb, "The Black Swan", 2007)

"[myth:] Random errors can always be determined by repeating measurements under identical conditions. […] this statement is true only for time-related random errors ." (Manfred Drosg, "Dealing with Uncertainties: A Guide to Error Analysis", 2007)

"To fulfill the requirements of the theory underlying uncertainties, variables with random uncertainties must be independent of each other and identically distributed. In the limiting case of an infinite number of such variables, these are called normally distributed. However, one usually speaks of normally distributed variables even if their number is finite." (Manfred Drosg, "Dealing with Uncertainties: A Guide to Error Analysis", 2007)

"While in theory randomness is an intrinsic property, in practice, randomness is incomplete information." (Nassim N Taleb, "The Black Swan", 2007)

"Regression toward the mean. That is, in any series of random events an extraordinary event is most likely to be followed, due purely to chance, by a more ordinary one." (Leonard Mlodinow, "The Drunkard’s Walk: How Randomness Rules Our Lives", 2008)

"The key to understanding randomness and all of mathematics is not being able to intuit the answer to every problem immediately but merely having the tools to figure out the answer." (Leonard Mlodinow,"The Drunkard’s Walk: How Randomness Rules Our Lives", 2008)

"Data always vary randomly because the object of our inquiries, nature itself, is also random. We can analyze and predict events in nature with an increasing amount of precision and accuracy, thanks to improvements in our techniques and instruments, but a certain amount of random variation, which gives rise to uncertainty, is inevitable." (Alberto Cairo, "The Functional Art", 2011)

"Randomness might be defined in terms of order - its absence, that is. […] Everything we care about lies somewhere in the middle, where pattern and randomness interlace." (James Gleick, "The Information: A History, a Theory, a Flood", 2011)

"The storytelling mind is allergic to uncertainty, randomness, and coincidence. It is addicted to meaning. If the storytelling mind cannot find meaningful patterns in the world, it will try to impose them. In short, the storytelling mind is a factory that churns out true stories when it can, but will manufacture lies when it can't." (Jonathan Gottschall, "The Storytelling Animal: How Stories Make Us Human", 2012)

"When some systems are stuck in a dangerous impasse, randomness and only randomness can unlock them and set them free." (Nassim N Taleb, "Antifragile: Things That Gain from Disorder", 2012)

"Although cascading failures may appear random and unpredictable, they follow reproducible laws that can be quantified and even predicted using the tools of network science. First, to avoid damaging cascades, we must understand the structure of the network on which the cascade propagates. Second, we must be able to model the dynamical processes taking place on these networks, like the flow of electricity. Finally, we need to uncover how the interplay between the network structure and dynamics affects the robustness of the whole system." (Albert-László Barabási, "Network Science", 2016)

"Too little attention is given to the need for statistical control, or to put it more pertinently, since statistical control (randomness) is so rarely found, too little attention is given to the interpretation of data that arise from conditions not in statistical control." (William E Deming)

More quotes on "Randomness" at the-web-of-knowledge.blogspot.com

13 December 2018

Data Science: Bayesian Networks (Just the Quotes)

"The best way to convey to the experimenter what the data tell him about theta is to show him a picture of the posterior distribution." (George E P Box & George C Tiao, "Bayesian Inference in Statistical Analysis", 1973)

"In the design of experiments, one has to use some informal prior knowledge. How does one construct blocks in a block design problem for instance? It is stupid to think that use is not made of a prior. But knowing that this prior is utterly casual, it seems ludicrous to go through a lot of integration, etc., to obtain 'exact' posterior probabilities resulting from this prior. So, I believe the situation with respect to Bayesian inference and with respect to inference, in general, has not made progress. Well, Bayesian statistics has led to a great deal of theoretical research. But I don't see any real utilizations in applications, you know. Now no one, as far as I know, has examined the question of whether the inferences that are obtained are, in fact, realized in the predictions that they are used to make." (Oscar Kempthorne, "A conversation with Oscar Kempthorne", Statistical Science, 1995)

"Bayesian methods are complicated enough, that giving researchers user-friendly software could be like handing a loaded gun to a toddler; if the data is crap, you won't get anything out of it regardless of your political bent." (Brad Carlin, "Bayes offers a new way to make sense of numbers", Science, 1999)

"Bayesian inference is a controversial approach because it inherently embraces a subjective notion of probability. In general, Bayesian methods provide no guarantees on long run performance." (Larry A Wasserman, "All of Statistics: A concise course in statistical inference", 2004)

"Bayesian inference is appealing when prior information is available since Bayes’ theorem is a natural way to combine prior information with data. Some people find Bayesian inference psychologically appealing because it allows us to make probability statements about parameters. […] In parametric models, with large samples, Bayesian and frequentist methods give approximately the same inferences. In general, they need not agree." (Larry A Wasserman, "All of Statistics: A concise course in statistical inference", 2004)

"The Bayesian approach is based on the following postulates: (B1) Probability describes degree of belief, not limiting frequency. As such, we can make probability statements about lots of things, not just data which are subject to random variation. […] (B2) We can make probability statements about parameters, even though they are fixed constants. (B3) We make inferences about a parameter θ by producing a probability distribution for θ. Inferences, such as point estimates and interval estimates, may then be extracted from this distribution." (Larry A Wasserman, "All of Statistics: A concise course in statistical inference", 2004)

"The important thing is to understand that frequentist and Bayesian methods are answering different questions. To combine prior beliefs with data in a principled way, use Bayesian inference. To construct procedures with guaranteed long run performance, such as confidence intervals, use frequentist methods. Generally, Bayesian methods run into problems when the parameter space is high dimensional." (Larry A Wasserman, "All of Statistics: A concise course in statistical inference", 2004) 

"Bayesian networks can be constructed by hand or learned from data. Learning both the topology of a Bayesian network and the parameters in the CPTs in the network is a difficult computational task. One of the things that makes learning the structure of a Bayesian network so difficult is that it is possible to define several different Bayesian networks as representations for the same full joint probability distribution." (John D Kelleher et al, "Fundamentals of Machine Learning for Predictive Data Analytics: Algorithms, worked examples, and case studies", 2015) 

"Bayesian networks provide a more flexible representation for encoding the conditional independence assumptions between the features in a domain. Ideally, the topology of a network should reflect the causal relationships between the entities in a domain. Properly constructed Bayesian networks are relatively powerful models that can capture the interactions between descriptive features in determining a prediction." (John D Kelleher et al, "Fundamentals of Machine Learning for Predictive Data Analytics: Algorithms, worked examples, and case studies", 2015) 

"Bayesian networks use a graph-based representation to encode the structural relationships - such as direct influence and conditional independence - between subsets of features in a domain. Consequently, a Bayesian network representation is generally more compact than a full joint distribution (because it can encode conditional independence relationships), yet it is not forced to assert a global conditional independence between all descriptive features. As such, Bayesian network models are an intermediary between full joint distributions and naive Bayes models and offer a useful compromise between model compactness and predictive accuracy." (John D Kelleher et al, "Fundamentals of Machine Learning for Predictive Data Analytics: Algorithms, worked examples, and case studies", 2015)

"Bayesian networks inhabit a world where all questions are reducible to probabilities, or (in the terminology of this chapter) degrees of association between variables; they could not ascend to the second or third rungs of the Ladder of Causation. Fortunately, they required only two slight twists to climb to the top." (Judea Pearl & Dana Mackenzie, "The Book of Why: The new science of cause and effect", 2018)

"The main differences between Bayesian networks and causal diagrams lie in how they are constructed and the uses to which they are put. A Bayesian network is literally nothing more than a compact representation of a huge probability table. The arrows mean only that the probabilities of child nodes are related to the values of parent nodes by a certain formula (the conditional probability tables) and that this relation is sufficient. That is, knowing additional ancestors of the child will not change the formula. Likewise, a missing arrow between any two nodes means that they are independent, once we know the values of their parents. [...] If, however, the same diagram has been constructed as a causal diagram, then both the thinking that goes into the construction and the interpretation of the final diagram change." (Judea Pearl & Dana Mackenzie, "The Book of Why: The new science of cause and effect", 2018)

"The transparency of Bayesian networks distinguishes them from most other approaches to machine learning, which tend to produce inscrutable 'black boxes'. In a Bayesian network you can follow every step and understand how and why each piece of evidence changed the network’s beliefs." (Judea Pearl & Dana Mackenzie, "The Book of Why: The new science of cause and effect", 2018)

"With Bayesian networks, we had taught machines to think in shades of gray, and this was an important step toward humanlike thinking. But we still couldn’t teach machines to understand causes and effects. [...] By design, in a Bayesian network, information flows in both directions, causal and diagnostic: smoke increases the likelihood of fire, and fire increases the likelihood of smoke. In fact, a Bayesian network can’t even tell what the 'causal direction' is." (Judea Pearl & Dana Mackenzie, "The Book of Why: The new science of cause and effect", 2018)

12 December 2018

Data Science: Neural Networks (Just the Quotes)

"The terms 'black box' and 'white box' are convenient and figurative expressions of not very well determined usage. I shall understand by a black box a piece of apparatus, such as four-terminal networks with two input and two output terminals, which performs a definite operation on the present and past of the input potential, but for which we do not necessarily have any information of the structure by which this operation is performed. On the other hand, a white box will be similar network in which we have built in the relation between input and output potentials in accordance with a definite structural plan for securing a previously determined input-output relation." (Norbert Wiener, "Cybernetics: Or Control and Communication in the Animal and the Machine", 1948)

"A neural network is a massively parallel distributed processor that has a natural propensity for storing experiential knowledge and making it available for use. It resembles the brain in two respects: 1. Knowledge is acquired by the network through a learning process. 2. Interneuron connection strengths known as synaptic weights are used to store the knowledge." (Igor Aleksander, "An introduction to neural computing", 1990) 

"Neural Computing is the study of networks of adaptable nodes which through a process of learning from task examples, store experiential knowledge and make it available for use." (Igor Aleksander, "An introduction to neural computing", 1990)

"A neural network is characterized by (1) its pattern of connections between the neurons (called its architecture), (2) its method of determining the weights on the connections (called its training, or learning, algorithm), and (3) its activation function." (Laurene Fausett, "Fundamentals of Neural Networks", 1994)

"An artificial neural network is an information-processing system that has certain performance characteristics in common with biological neural networks. Artificial neural networks have been developed as generalizations of mathematical models of human cognition or neural biology, based on the assumptions that: (1) Information processing occurs at many simple elements called neurons. (2) Signals are passed between neurons over connection links. (3) Each connection link has an associated weight, which, in a typical neural net, multiplies the signal transmitted. (4) Each neuron applies an activation function (usually nonlinear) to its net input (sum of weighted input signals) to determine its output signal." (Laurene Fausett, "Fundamentals of Neural Networks", 1994)

"An artificial neural network (or simply a neural network) is a biologically inspired computational model that consists of processing elements (neurons) and connections between them, as well as of training and recall algorithms." (Nikola K Kasabov, "Foundations of Neural Networks, Fuzzy Systems, and Knowledge Engineering", 1996)

"Many of the basic functions performed by neural networks are mirrored by human abilities. These include making distinctions between items (classification), dividing similar things into groups (clustering), associating two or more things (associative memory), learning to predict outcomes based on examples (modeling), being able to predict into the future (time-series forecasting), and finally juggling multiple goals and coming up with a good- enough solution (constraint satisfaction)." (Joseph P Bigus,"Data Mining with Neural Networks: Solving business problems from application development to decision support", 1996)

"More than just a new computing architecture, neural networks offer a completely different paradigm for solving problems with computers. […] The process of learning in neural networks is to use feedback to adjust internal connections, which in turn affect the output or answer produced. The neural processing element combines all of the inputs to it and produces an output, which is essentially a measure of the match between the input pattern and its connection weights. When hundreds of these neural processors are combined, we have the ability to solve difficult problems such as credit scoring." (Joseph P Bigus,"Data Mining with Neural Networks: Solving business problems from application development to decision support", 1996)

"Neural networks are a computing model grounded on the ability to recognize patterns in data. As a consequence, they have many applications to data mining and analysis." (Joseph P Bigus,"Data Mining with Neural Networks: Solving business problems from application development to decision support", 1996)

"Neural networks are a computing technology whose fundamental purpose is to recognize patterns in data. Based on a computing model similar to the underlying structure of the human brain, neural networks share the brains ability to learn or adapt in response to external inputs. When exposed to a stream of training data, neural networks can discover previously unknown relationships and learn complex nonlinear mappings in the data. Neural networks provide some fundamental, new capabilities for processing business data. However, tapping these new neural network data mining functions requires a completely different application development process from traditional programming." (Joseph P Bigus, "Data Mining with Neural Networks: Solving business problems from application development to decision support", 1996)

"The most familiar example of swarm intelligence is the human brain. Memory, perception and thought all arise out of the nett actions of billions of individual neurons. As we saw earlier, artificial neural networks (ANNs) try to mimic this idea. Signals from the outside world enter via an input layer of neurons. These pass the signal through a series of hidden layers, until the result emerges from an output layer. Each neuron modifies the signal in some simple way. It might, for instance, convert the inputs by plugging them into a polynomial, or some other simple function. Also, the network can learn by modifying the strength of the connections between neurons in different layers." (David G Green, "The Serendipity Machine: A voyage of discovery through the unexpected world of computers", 2004)

"A neural network is a particular kind of computer program, originally developed to try to mimic the way the human brain works. It is essentially a computer simulation of a complex circuit through which electric current flows." (Keith J Devlin & Gary Lorden, "The Numbers behind NUMB3RS: Solving crime with mathematics", 2007)

 "Neural networks are a popular model for learning, in part because of their basic similarity to neural assemblies in the human brain. They capture many useful effects, such as learning from complex data, robustness to noise or damage, and variations in the data set. " (Peter C R Lane, Order Out of Chaos: Order in Neural Networks, 2007)

"A network of many simple processors ('units' or 'neurons') that imitates a biological neural network. The units are connected by unidirectional communication channels, which carry numeric data. Neural networks can be trained to find nonlinear relationships in data, and are used in various applications such as robotics, speech recognition, signal processing, medical diagnosis, or power systems." (Adnan Khashman et al, "Voltage Instability Detection Using Neural Networks", 2009)

"An artificial neural network, often just called a 'neural network' (NN), is an interconnected group of artificial neurons that uses a mathematical model or computational model for information processing based on a connectionist approach to computation. Knowledge is acquired by the network from its environment through a learning process, and interneuron connection strengths (synaptic weighs) are used to store the acquired knowledge." (Larbi Esmahi et al, "Adaptive Neuro-Fuzzy Systems", 2009)

"Generally, these programs fall within the techniques of reinforcement learning and the majority use an algorithm of temporal difference learning. In essence, this computer learning paradigm approximates the future state of the system as a function of the present state. To reach that future state, it uses a neural network that changes the weight of its parameters as it learns." (Diego Rasskin-Gutman, "Chess Metaphors: Artificial Intelligence and the Human Mind", 2009)

"The simplest basic architecture of an artificial neural network is composed of three layers of neurons - input, output, and intermediary (historically called perceptron). When the input layer is stimulated, each node responds in a particular way by sending information to the intermediary level nodes, which in turn distribute it to the output layer nodes and thereby generate a response. The key to artificial neural networks is in the ways that the nodes are connected and how each node reacts to the stimuli coming from the nodes it is connected to. Just as with the architecture of the brain, the nodes allow information to pass only if a specific stimulus threshold is passed. This threshold is governed by a mathematical equation that can take different forms. The response depends on the sum of the stimuli coming from the input node connections and is 'all or nothing'." (Diego Rasskin-Gutman, "Chess Metaphors: Artificial Intelligence and the Human Mind", 2009)

"Neural networks can model very complex patterns and decision boundaries in the data and, as such, are very powerful. In fact, they are so powerful that they can even model the noise in the training data, which is something that definitely should be avoided. One way to avoid this overfitting is by using a validation set in a similar way as with decision trees.[...] Another scheme to prevent a neural network from overfitting is weight regularization, whereby the idea is to keep the weights small in absolute sense because otherwise they may be fitting the noise in the data. This is then implemented by adding a weight size term (e.g., Euclidean norm) to the objective function of the neural network." (Bart Baesens, "Analytics in a Big Data World: The Essential Guide to Data Science and Its Applications", 2014)

"A neural network consists of a set of neurons that are connected together. A neuron takes a set of numeric values as input and maps them to a single output value. At its core, a neuron is simply a multi-input linear-regression function. The only significant difference between the two is that in a neuron the output of the multi-input linear-regression function is passed through another function that is called the activation function." (John D Kelleher & Brendan Tierney, "Data Science", 2018)

"Just as they did thirty years ago, machine learning programs (including those with deep neural networks) operate almost entirely in an associational mode. They are driven by a stream of observations to which they attempt to fit a function, in much the same way that a statistician tries to fit a line to a collection of points. Deep neural networks have added many more layers to the complexity of the fitted function, but raw data still drives the fitting process. They continue to improve in accuracy as more data are fitted, but they do not benefit from the 'super-evolutionary speedup'."  (Judea Pearl & Dana Mackenzie, "The Book of Why: The new science of cause and effect", 2018)

"A neural-network algorithm is simply a statistical procedure for classifying inputs (such as numbers, words, pixels, or sound waves) so that these data can mapped into outputs. The process of training a neural-network model is advertised as machine learning, suggesting that neural networks function like the human mind, but neural networks estimate coefficients like other data-mining algorithms, by finding the values for which the model’s predictions are closest to the observed values, with no consideration of what is being modeled or whether the coefficients are sensible." (Gary Smith & Jay Cordes, "The 9 Pitfalls of Data Science", 2019)

"Deep neural networks have an input layer and an output layer. In between, are “hidden layers” that process the input data by adjusting various weights in order to make the output correspond closely to what is being predicted. [...] The mysterious part is not the fancy words, but that no one truly understands how the pattern recognition inside those hidden layers works. That’s why they’re called 'hidden'. They are an inscrutable black box - which is okay if you believe that computers are smarter than humans, but troubling otherwise." (Gary Smith & Jay Cordes, "The 9 Pitfalls of Data Science", 2019)

"Neural-network algorithms do not know what they are manipulating, do not understand their results, and have no way of knowing whether the patterns they uncover are meaningful or coincidental. Nor do the programmers who write the code know exactly how they work and whether the results should be trusted. Deep neural networks are also fragile, meaning that they are sensitive to small changes and can be fooled easily." (Gary Smith & Jay Cordes, "The 9 Pitfalls of Data Science", 2019)

"The label neural networks suggests that these algorithms replicate the neural networks in human brains that connect electrically excitable cells called neurons. They don’t. We have barely scratched the surface in trying to figure out how neurons receive, store, and process information, so we cannot conceivably mimic them with computers." (Gary Smith & Jay Cordes, "The 9 Pitfalls of Data Science", 2019)

More quotes on "Neural Networks" at the-web-of-knowledge.blogspot.com.

22 November 2018

Data Science: On Signals (Just the Quotes)

"If statistical graphics, although born just yesterday, extends its reach every day, it is because it replaces long tables of numbers and it allows one not only to embrace at glance the series of phenomena, but also to signal the correspondences or anomalies, to find the causes, to identify the laws." (Émile Cheysson, cca. 1877)

"The term closed loop-learning process refers to the idea that one learns by determining what s desired and comparing what is actually taking place as measured at the process and feedback for comparison. The difference between what is desired and what is taking place provides an error indication which is used to develop a signal to the process being controlled." (Harold Chestnut, 1984) 

"Complexity is not an objective factor but a subjective one. Supersignals reduce complexity, collapsing a number of features into one. Consequently, complexity must be understood in terms of a specific individual and his or her supply of supersignals. We learn supersignals from experience, and our supply can differ greatly from another individual's. Therefore there can be no objective measure of complexity." (Dietrich Dorner, "The Logic of Failure: Recognizing and Avoiding Error in Complex Situations", 1989)

"An artificial neural network is an information-processing system that has certain performance characteristics in common with biological neural networks. Artificial neural networks have been developed as generalizations of mathematical models of human cognition or neural biology, based on the assumptions that: (1) Information processing occurs at many simple elements called neurons. (2) Signals are passed between neurons over connection links. (3) Each connection link has an associated weight, which, in a typical neural net, multiplies the signal transmitted. (4) Each neuron applies an activation function (usually nonlinear) to its net input (sum of weighted input signals) to determine its output signal." (Laurene Fausett, "Fundamentals of Neural Networks", 1994)

"Data are generally collected as a basis for action. However, unless potential signals are separated from probable noise, the actions taken may be totally inconsistent with the data. Thus, the proper use of data requires that you have simple and effective methods of analysis which will properly separate potential signals from probable noise." (Donald J Wheeler, "Understanding Variation: The Key to Managing Chaos" 2nd Ed., 2000)

"No matter what the data, and no matter how the values are arranged and presented, you must always use some method of analysis to come up with an interpretation of the data.
While every data set contains noise, some data sets may contain signals. Therefore, before you can detect a signal within any given data set, you must first filter out the noise." (Donald J Wheeler," Understanding Variation: The Key to Managing Chaos" 2nd Ed., 2000)

"While all data contain noise, some data contain signals. Before you can detect a signal, you must filter out the noise." (Donald J Wheeler, "Understanding Variation: The Key to Managing Chaos" 2nd Ed., 2000)

"The most familiar example of swarm intelligence is the human brain. Memory, perception and thought all arise out of the nett actions of billions of individual neurons. As we saw earlier, artificial neural networks (ANNs) try to mimic this idea. Signals from the outside world enter via an input layer of neurons. These pass the signal through a series of hidden layers, until the result emerges from an output layer. Each neuron modifies the signal in some simple way. It might, for instance, convert the inputs by plugging them into a polynomial, or some other simple function. Also, the network can learn by modifying the strength of the connections between neurons in different layers." (David G Green, "The Serendipity Machine: A voyage of discovery through the unexpected world of computers", 2004)

"Data analysis is not generally thought of as being simple or easy, but it can be. The first step is to understand that the purpose of data analysis is to separate any signals that may be contained within the data from the noise in the data. Once you have filtered out the noise, anything left over will be your potential signals. The rest is just details." (Donald J Wheeler," Myths About Data Analysis", International Lean & Six Sigma Conference, 2012)

"Finding patterns is easy in any kind of data-rich environment […] The key is in determining whether the patterns represent signal or noise." (Nate Silver, "The Signal and the Noise: Why So Many Predictions Fail-but Some Don’t", 2012)

"The signal is the truth. The noise is what distracts us from the truth." (Nate Silver, "The Signal and the Noise: Why So Many Predictions Fail-but Some Don't", 2012)

"A signal is a useful message that resides in data. Data that isn’t useful is noise. […] When data is expressed visually, noise can exist not only as data that doesn’t inform but also as meaningless non-data elements of the display (e.g. irrelevant attributes, such as a third dimension of depth in bars, color variation that has no significance, and artificial light and shadow effects)." (Stephen Few, "Signal: Understanding What Matters in a World of Noise", 2015)

"To find signals in data, we must learn to reduce the noise - not just the noise that resides in the data, but also the noise that resides in us. It is nearly impossible for noisy minds to perceive anything but noise in data. […] Signals always point to something. In this sense, a signal is not a thing but a relationship. Data becomes useful knowledge of something that matters when it builds a bridge between a question and an answer. This connection is the signal." (Stephen Few, "Signal: Understanding What Matters in a World of Noise", 2015)

"Information theory leads to the quantification of the information content of the source, as denoted by entropy, the characterization of the information-bearing capacity of the communication channel, as related to its noise characteristics, and consequently the establishment of the relationship between the information content of the source and the capacity of the channel. In short, information theory provides a quantitative measure of the information contained in message signals and help determine the capacity of a communication system to transfer this information from source to sink over a noisy channel in a reliable fashion." (Ali Grami, "Information Theory", 2016)