SQL Troubles: training


16 April 2025

🧮ERP: Implementations (Part XIII: On Project Management)

ERP Implementations Series

Given its intrinsic complexity and far-reaching implications, an ERP implementation can be considered a real test of endurance for a Project Manager and for the team managed. Such projects typically deal with multiple internal and external parties with various interests in the outcomes of the project. Moreover, such projects involve multiple technologies, systems, and even methodologies. More importantly, such projects tend to have specific characteristics associated with their mass, which makes them challenging to manage within the predefined constraints: time, scope, costs and quality.

From a Project Manager’s perspective, only the current project counts. From a PMO perspective, a project, independent of its type, must be placed within the broader perspective, while looking at the synergies and other important aspects that can help the organization. Unfortunately, for many organizations everything begins and ends with the implementation, independently of the project’s outcomes. Often failure lurks in the background, and small differences can have a considerable impact in the long term. ERP implementations are, more than other projects, sensitive to the initial conditions – the premises under which the project starts and progresses.

One way of coping with this inherent complexity is to split projects into several phases treated as projects or subprojects with their own boundaries. This allows organizations to narrow the focus and split the overall work into more manageable pieces, reducing the risks to some degree while learning in the process about the organization’s capabilities in addressing the various aspects. However, the phases are not necessarily sequential and often must overlap to better manage the resources and minimize waste.

Given that an implementation project can take years, it’s normal for people to come and go, some taking over work from colleagues, with or without knowledge transfer. The knowledge remains available as long as the resources don’t leave the organization, though knowledge transfer can’t be taken for granted. It’s also normal for resources to suddenly become unavailable or disappear, increasing the burden that needs to be shifted onto others’ shoulders. There’s seldom a project without such events, and one needs to make the best of each situation, even if several tries and iterations are needed in the process.

Somebody needs to manage all this, and the weight of the whole project falls on the PM’s shoulders. Management by exception and other management principles break under the weight of implementation projects, and it’s often challenging to make progress without addressing this. Fortunately, PMs can shift the burden onto Key Users and other parties involved in the project. Splitting a project into subprojects can help set boundaries, even if more management could occasionally be involved. Also, having clear responsibilities and resources who can take over the burdens when needed can be a sign of maturity of the teams and of the organization.

Teams in Project Management are often compared with teams in sports, though the metaphor is only partially right when each party has a ball to play with, while some of the players or even teams prefer to play alone at their own pace. It takes time to build effective teams that play well together, and team spirit or similar concepts can't fill all the gaps existing in organizations! Training in team sports has certain characteristics that must be mirrored in organizations to allow teams to improve. Various parties expect the PM to be the binder and troubleshooter of something that should have been part of an organization’s DNA! Bringing in external players to do the heavy lifting may sometimes work, though who’ll do the lifting after the respective resources are gone?


29 March 2021

Notes: Team Data Science Process (TDSP)

Team Data Science Process (TDSP)

an agile, iterative data science methodology to deliver predictive analytics solutions and intelligent applications efficiently [1]
{goal} help customers fully realize the benefits of their analytics program [1]
{component} data science lifecycle definition
- {description} a framework to structure the development of data science projects [1]
- {goal} designed for data science projects that ship as part of intelligent applications that deploy ML & AI models for predictive analytics [1]
- {benefit} can be used in the context of other DM methodologies as they have common ground [1]
  - e.g. CRISP-DM, KDD
- {benefit} exploratory data science projects or improvised analytics projects can also benefit from using this process [1]
{component} standardized project structure
- {description} a directory structure that includes templates for project documents
  - ⇒makes it easy for team members to find information [1]
  - ⇐templates for the folder structure and required documents are provided in standard locations [1]
  - all code and documents are stored in an agile VCS tracking repository [1]
    - {recommendation} create a separate repository for each project on the VCS for versioning, information security, and collaboration [1]
- {benefit} organizes the code for the various activities [1]
- {benefit} allows tracking the progress [1]
- {benefit} provides a checklist with key questions for each project to guarantee the quality of the process and deliverables [1]
- {benefit} enables team collaboration [1]
- {benefit} allows closer tracking of the code for individual features [1]
- {benefit} enables teams to obtain better cost estimates [1]
- {benefit} helps build institutional knowledge across the organization [1]
{component} recommended infrastructure
- {description} a set of recommendations for the infrastructure and resources needed for analytics and storage [1]
- {benefit} addresses cloud and/or on-premises requirements [1]
- {benefit} enables reproducible analysis [1]
- {benefit} avoids infrastructure duplication [1]
  - ⇒minimizes inconsistencies and unnecessary infrastructure costs [1]
- {tools} tools are provided to provision the shared resources, track them, and allow each team member to connect to those resources securely [1]
- {good practice} create a consistent compute environment [1]
  - ⇐allows team members to replicate and validate experiments [1]
{component} recommended tools and utilities
- {description} a set of recommendations for the tools and utilities needed for project’s execution [1]
- {benefit} helps lower the barriers to, and increase the consistency of, their adoption [1]
- {benefit} provides an initial set of tools and scripts to jump-start methodology’s adoption [1]
- {benefit} helps automate some of the common tasks in the data science lifecycle [1]
  - e.g. data exploration and baseline modeling [1]
- {benefit} well-defined structure provided for individuals to contribute shared tools and utilities into their team's shared code repository [1]
  - ⇐ resources can then be leveraged by other projects [1]
{phase} 1: business understanding
- {goal} define and document the business problem, its objectives, the needed attributes, and the metric(s) used to determine project’s success
- {goal} identify and document the relevant data sources
- {step} 1.1: define project’s objectives
  - elicit together with the stakeholders the requirements, define and document the problem and its objectives, respectively the metric(s) used to determine project’s success
    - requires a good understanding of the business processes, data and further characteristics
- {step} 1.2: identify data sources
  - identify the attributes and the data sources relevant to the problem under study
- {step} 1.3: define project plan and team*
  - develop a high-level milestone plan and identify the resources needed for executing it
- {tool} project charter
  - standard template that documents the business problem, the scope of the project, the business objectives and metric(s) used to determine project’s success
{phase} 2: data acquisition & understanding
- {goal} prepare the base dataset(s) in the target repository, as needed by the modeling phase
- {goal} build the data ETL/ELT architecture and processes needed for provisioning the base data
- {step} 2.1: ingest data
  - make the required data available for the team in the repository where the analytics operations take place
- {step} 2.2: explore data
  - understand data’s characteristics by leveraging specific tools (visualization, analysis)
  - prepare the data as needed for further processing
- {step} 2.3: set up pipelines
  - build the pipelines needed for data refresh and quality assessment [3]
  - set up a process to score new data or refresh the data regularly [3]
- {step} 2.4: feasibility analysis*
  - reevaluate the project to determine whether the value expected is sufficient to continue pursuing it
- {tool} data quality report
  - report that includes data summaries, data mappings, variable ranking, data quality assessment(s) and further information [3]
- {tool} solution architecture
  - diagram and/or textual-based description of the data pipeline(s), technical assumptions and further aspects
- {tool} data reports
  - document the structure and statistics of the raw data
- {tool} checkpoint decision
  - decision template document that
    - summarizes the findings of the feasibility analysis step
    - includes a set of choices and recommendations for the next steps
    - serves as basis for the decision on whether to continue or not the project, respectively what the next steps are
{phase} 3: modeling
- {goal} create a machine-learning model that addresses the prediction requirements and that's suitable for production
- {step} 3.1: feature engineering
  - the inclusion, aggregation, and transformation of raw variables to create the features used in the analysis [4]
    - ⇐requires a good understanding of how the features relate to each other and how the ML algorithms use those features [4]
- {step} 3.2: model selection*
  - choose one or more modeling algorithms that address problem’s characteristics the best
- {step} 3.3: model training
  - involves the following steps:
    - split the input data into training and test datasets
    - build the models by using the training dataset
    - evaluate the models on the training and the test datasets
    - determine the optimal setup and methods
- {step} 3.4: model evaluation
  - evaluate the performance of the model(s)
- {step} 3.5: feasibility analysis*
  - evaluate the readiness of the models for use in production, respectively whether they fulfill project’s objectives
- {tool} feature sets
  - describe the features developed for the modeling and how they were generated
  - contains pointers to the code used to generate the features
- {tool} model report
  - a standard, template-based report that provides details on each experiment’s outcomes
  - created for each model tried
- {tool} checkpoint decision
- {tool} model performance metrics
  - e.g. ROC curves or MSE
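
The model training and evaluation steps of phase 3 (steps 3.3 and 3.4) can be sketched in a few lines of plain Python. This is only a minimal illustration and not part of TDSP itself: the synthetic data, the single-feature least-squares model and the helper names `split_data`, `fit_line` and `mse` are assumptions made for the example.

```python
import random

def split_data(rows, test_ratio=0.25, seed=42):
    """Shuffle the rows and split them into training and test datasets."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_ratio))
    return shuffled[:cut], shuffled[cut:]

def fit_line(rows):
    """Fit y = a*x + b by ordinary least squares (hypothetical baseline model)."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in rows)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    a = sxy / sxx
    return a, mean_y - a * mean_x

def mse(model, rows):
    """Mean squared error of the fitted line on the given rows."""
    a, b = model
    return sum((a * x + b - y) ** 2 for x, y in rows) / len(rows)

# Synthetic data (assumption): y = 2x + 1 plus Gaussian noise
noise = random.Random(0)
data = [(x, 2 * x + 1 + noise.gauss(0, 0.5)) for x in range(40)]

train, test = split_data(data)   # split into training and test datasets
model = fit_line(train)          # build the model on the training dataset
train_mse = mse(model, train)    # evaluate on the training dataset
test_mse = mse(model, test)      # evaluate on the test dataset

print(f"slope={model[0]:.2f} intercept={model[1]:.2f}")
print(f"train MSE={train_mse:.3f} test MSE={test_mse:.3f}")
```

Comparing the two MSE values is one simple way to judge whether a model generalizes; a real project would typically repeat this for several candidate models and record the outcomes in the model report.
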
{phase} 4: deployment
- {goal} deploy the models and the data pipelines to the environment used for final user acceptance
- {step} 4.1: operationalize architecture
  - prepare the models and data pipelines for use in production
  - {best practice} expose the models over an open API interface
    - enables models’ consumption from various applications
  - {best practice} build telemetry and monitoring into the models and the data pipelines [5]
    - helps in monitoring and troubleshooting [5]
- {step} 4.2: deploy solution*
  - deploy the architecture into production
- {tool} status dashboard
  - displays data on system’s health and key metrics
- {tool} model report
  - the report in its final form with deployment information
- {tool} solution architecture
  - the document in its final form
{phase} 5: customer acceptance
- {goal} confirm that project’s objectives were fulfilled and get customer’s acceptance
- {step} 5.1: system validation
  - validate system’s performance and outcomes and confirm that it fulfills customer’s needs
- {step} 5.2: project signoff*
  - finalize and review documentation
  - hand over the solution and the related documentation to the customer
  - evaluate the project against the defined objectives and get customer’s signoff
- {tool} exit report
- {tool} technical report
  - contains all the details of the project that are useful for learning about how to operate the system [6]

Acronyms:

Artificial Intelligence (AI)

Cross-Industry Standard Process for Data Mining (CRISP-DM)

Data Mining (DM)

Knowledge Discovery in Databases (KDD)

Team Data Science Process (TDSP)

Version Control System (VCS)

Visual Studio Team Services (VSTS)

Resources:

[1] Microsoft Azure (2020) What is the Team Data Science Process?

[2] Microsoft Azure (2020) The business understanding stage of the Team Data Science Process lifecycle

[3] Microsoft Azure (2020) Data acquisition and understanding stage of the Team Data Science Process

[4] Microsoft Azure (2020) Modeling stage of the Team Data Science Process lifecycle

[5] Microsoft Azure (2020) Deployment stage of the Team Data Science Process lifecycle

[6] Microsoft Azure (2020) Customer acceptance stage of the Team Data Science Process lifecycle

12 December 2018

🔭Data Science: Neural Networks (Just the Quotes)

"The terms 'black box' and 'white box' are convenient and figurative expressions of not very well determined usage. I shall understand by a black box a piece of apparatus, such as four-terminal networks with two input and two output terminals, which performs a definite operation on the present and past of the input potential, but for which we do not necessarily have any information of the structure by which this operation is performed. On the other hand, a white box will be similar network in which we have built in the relation between input and output potentials in accordance with a definite structural plan for securing a previously determined input-output relation." (Norbert Wiener, "Cybernetics: Or Control and Communication in the Animal and the Machine", 1948)

"A neural network is a massively parallel distributed processor that has a natural propensity for storing experiential knowledge and making it available for use. It resembles the brain in two respects: 1. Knowledge is acquired by the network through a learning process. 2. Interneuron connection strengths known as synaptic weights are used to store the knowledge." (Igor Aleksander, "An introduction to neural computing", 1990)

"Neural Computing is the study of networks of adaptable nodes which through a process of learning from task examples, store experiential knowledge and make it available for use." (Igor Aleksander, "An introduction to neural computing", 1990)

"A neural network is characterized by (1) its pattern of connections between the neurons (called its architecture), (2) its method of determining the weights on the connections (called its training, or learning, algorithm), and (3) its activation function." (Laurene Fausett, "Fundamentals of Neural Networks", 1994)

"An artificial neural network is an information-processing system that has certain performance characteristics in common with biological neural networks. Artificial neural networks have been developed as generalizations of mathematical models of human cognition or neural biology, based on the assumptions that: (1) Information processing occurs at many simple elements called neurons. (2) Signals are passed between neurons over connection links. (3) Each connection link has an associated weight, which, in a typical neural net, multiplies the signal transmitted. (4) Each neuron applies an activation function (usually nonlinear) to its net input (sum of weighted input signals) to determine its output signal." (Laurene Fausett, "Fundamentals of Neural Networks", 1994)

"An artificial neural network (or simply a neural network) is a biologically inspired computational model that consists of processing elements (neurons) and connections between them, as well as of training and recall algorithms." (Nikola K Kasabov, "Foundations of Neural Networks, Fuzzy Systems, and Knowledge Engineering", 1996)

"Many of the basic functions performed by neural networks are mirrored by human abilities. These include making distinctions between items (classification), dividing similar things into groups (clustering), associating two or more things (associative memory), learning to predict outcomes based on examples (modeling), being able to predict into the future (time-series forecasting), and finally juggling multiple goals and coming up with a good- enough solution (constraint satisfaction)." (Joseph P Bigus,"Data Mining with Neural Networks: Solving business problems from application development to decision support", 1996)

"More than just a new computing architecture, neural networks offer a completely different paradigm for solving problems with computers. […] The process of learning in neural networks is to use feedback to adjust internal connections, which in turn affect the output or answer produced. The neural processing element combines all of the inputs to it and produces an output, which is essentially a measure of the match between the input pattern and its connection weights. When hundreds of these neural processors are combined, we have the ability to solve difficult problems such as credit scoring." (Joseph P Bigus,"Data Mining with Neural Networks: Solving business problems from application development to decision support", 1996)

"Neural networks are a computing model grounded on the ability to recognize patterns in data. As a consequence, they have many applications to data mining and analysis." (Joseph P Bigus,"Data Mining with Neural Networks: Solving business problems from application development to decision support", 1996)

"Neural networks are a computing technology whose fundamental purpose is to recognize patterns in data. Based on a computing model similar to the underlying structure of the human brain, neural networks share the brains ability to learn or adapt in response to external inputs. When exposed to a stream of training data, neural networks can discover previously unknown relationships and learn complex nonlinear mappings in the data. Neural networks provide some fundamental, new capabilities for processing business data. However, tapping these new neural network data mining functions requires a completely different application development process from traditional programming." (Joseph P Bigus, "Data Mining with Neural Networks: Solving business problems from application development to decision support", 1996)

"The most familiar example of swarm intelligence is the human brain. Memory, perception and thought all arise out of the nett actions of billions of individual neurons. As we saw earlier, artificial neural networks (ANNs) try to mimic this idea. Signals from the outside world enter via an input layer of neurons. These pass the signal through a series of hidden layers, until the result emerges from an output layer. Each neuron modifies the signal in some simple way. It might, for instance, convert the inputs by plugging them into a polynomial, or some other simple function. Also, the network can learn by modifying the strength of the connections between neurons in different layers." (David G Green, "The Serendipity Machine: A voyage of discovery through the unexpected world of computers", 2004)

"A neural network is a particular kind of computer program, originally developed to try to mimic the way the human brain works. It is essentially a computer simulation of a complex circuit through which electric current flows." (Keith J Devlin & Gary Lorden, "The Numbers behind NUMB3RS: Solving crime with mathematics", 2007)

"Neural networks are a popular model for learning, in part because of their basic similarity to neural assemblies in the human brain. They capture many useful effects, such as learning from complex data, robustness to noise or damage, and variations in the data set. " (Peter C R Lane, Order Out of Chaos: Order in Neural Networks, 2007)

"A network of many simple processors ('units' or 'neurons') that imitates a biological neural network. The units are connected by unidirectional communication channels, which carry numeric data. Neural networks can be trained to find nonlinear relationships in data, and are used in various applications such as robotics, speech recognition, signal processing, medical diagnosis, or power systems." (Adnan Khashman et al, "Voltage Instability Detection Using Neural Networks", 2009)

"An artificial neural network, often just called a 'neural network' (NN), is an interconnected group of artificial neurons that uses a mathematical model or computational model for information processing based on a connectionist approach to computation. Knowledge is acquired by the network from its environment through a learning process, and interneuron connection strengths (synaptic weighs) are used to store the acquired knowledge." (Larbi Esmahi et al, "Adaptive Neuro-Fuzzy Systems", 2009)

"Generally, these programs fall within the techniques of reinforcement learning and the majority use an algorithm of temporal difference learning. In essence, this computer learning paradigm approximates the future state of the system as a function of the present state. To reach that future state, it uses a neural network that changes the weight of its parameters as it learns." (Diego Rasskin-Gutman, "Chess Metaphors: Artificial Intelligence and the Human Mind", 2009)

"The simplest basic architecture of an artificial neural network is composed of three layers of neurons - input, output, and intermediary (historically called perceptron). When the input layer is stimulated, each node responds in a particular way by sending information to the intermediary level nodes, which in turn distribute it to the output layer nodes and thereby generate a response. The key to artificial neural networks is in the ways that the nodes are connected and how each node reacts to the stimuli coming from the nodes it is connected to. Just as with the architecture of the brain, the nodes allow information to pass only if a specific stimulus threshold is passed. This threshold is governed by a mathematical equation that can take different forms. The response depends on the sum of the stimuli coming from the input node connections and is 'all or nothing'." (Diego Rasskin-Gutman, "Chess Metaphors: Artificial Intelligence and the Human Mind", 2009)

"Neural networks can model very complex patterns and decision boundaries in the data and, as such, are very powerful. In fact, they are so powerful that they can even model the noise in the training data, which is something that definitely should be avoided. One way to avoid this overfitting is by using a validation set in a similar way as with decision trees.[...] Another scheme to prevent a neural network from overfitting is weight regularization, whereby the idea is to keep the weights small in absolute sense because otherwise they may be fitting the noise in the data. This is then implemented by adding a weight size term (e.g., Euclidean norm) to the objective function of the neural network." (Bart Baesens, "Analytics in a Big Data World: The Essential Guide to Data Science and Its Applications", 2014)

"A neural network consists of a set of neurons that are connected together. A neuron takes a set of numeric values as input and maps them to a single output value. At its core, a neuron is simply a multi-input linear-regression function. The only significant difference between the two is that in a neuron the output of the multi-input linear-regression function is passed through another function that is called the activation function." (John D Kelleher & Brendan Tierney, "Data Science", 2018)

"Just as they did thirty years ago, machine learning programs (including those with deep neural networks) operate almost entirely in an associational mode. They are driven by a stream of observations to which they attempt to fit a function, in much the same way that a statistician tries to fit a line to a collection of points. Deep neural networks have added many more layers to the complexity of the fitted function, but raw data still drives the fitting process. They continue to improve in accuracy as more data are fitted, but they do not benefit from the 'super-evolutionary speedup'." (Judea Pearl & Dana Mackenzie, "The Book of Why: The new science of cause and effect", 2018)

"A neural-network algorithm is simply a statistical procedure for classifying inputs (such as numbers, words, pixels, or sound waves) so that these data can mapped into outputs. The process of training a neural-network model is advertised as machine learning, suggesting that neural networks function like the human mind, but neural networks estimate coefficients like other data-mining algorithms, by finding the values for which the model’s predictions are closest to the observed values, with no consideration of what is being modeled or whether the coefficients are sensible." (Gary Smith & Jay Cordes, "The 9 Pitfalls of Data Science", 2019)

"Deep neural networks have an input layer and an output layer. In between, are “hidden layers” that process the input data by adjusting various weights in order to make the output correspond closely to what is being predicted. [...] The mysterious part is not the fancy words, but that no one truly understands how the pattern recognition inside those hidden layers works. That’s why they’re called 'hidden'. They are an inscrutable black box - which is okay if you believe that computers are smarter than humans, but troubling otherwise." (Gary Smith & Jay Cordes, "The 9 Pitfalls of Data Science", 2019)

"Neural-network algorithms do not know what they are manipulating, do not understand their results, and have no way of knowing whether the patterns they uncover are meaningful or coincidental. Nor do the programmers who write the code know exactly how they work and whether the results should be trusted. Deep neural networks are also fragile, meaning that they are sensitive to small changes and can be fooled easily." (Gary Smith & Jay Cordes, "The 9 Pitfalls of Data Science", 2019)

"The label neural networks suggests that these algorithms replicate the neural networks in human brains that connect electrically excitable cells called neurons. They don’t. We have barely scratched the surface in trying to figure out how neurons receive, store, and process information, so we cannot conceivably mimic them with computers." (Gary Smith & Jay Cordes, "The 9 Pitfalls of Data Science", 2019)

More quotes on "Neural Networks" at the-web-of-knowledge.blogspot.com.

13 November 2018

🔭Data Science: Training (Just the Quotes)

"[…] an obvious difference between our best classifiers and human learning is the number of examples required in tasks such as object detection. […] the difficulty of a learning task depends on the size of the required hypothesis space. This complexity determines in turn how many training examples are needed to achieve a given level of generalization error. Thus the complexity of the hypothesis space sets the speed limit and the sample complexity for learning." (Tomaso Poggio & Steve Smale, "The Mathematics of Learning: Dealing with Data", Notices of the AMS, 2003)

"Learning a complicated function that matches the training data closely but fails to recognize the underlying process that generates the data. As a result of overfitting, the model performs poor on new input. Overfitting occurs when the training patterns are sparse in input space and/or the trained networks are too complex." (Frank Padberg, "Counting the Hidden Defects in Software Documents", 2010)

"Decision trees are also considered nonparametric models. The reason for this is that when we train a decision tree from data, we do not assume a fixed set of parameters prior to training that define the tree. Instead, the tree branching and the depth of the tree are related to the complexity of the dataset it is trained on. If new instances were added to the dataset and we rebuilt the tree, it is likely that we would end up with a (potentially very) different tree." (John D Kelleher et al, "Fundamentals of Machine Learning for Predictive Data Analytics: Algorithms, Worked Examples, and Case Studies", 2015)

"Boosting defines an objective function to measure the performance of a model given a certain set of parameters. The objective function contains two parts: regularization and training loss, both of which add to one another. The training loss measures how predictive our model is on the training data. The most commonly used training loss function includes mean squared error and logistic regression. The regularization term controls the complexity of the model, which helps avoid overfitting." (Danish Haroon, "Python Machine Learning Case Studies", 2017)

"Decision trees are important for a few reasons. First, they can both classify and regress. It requires literally one line of code to switch between the two models just described, from a classification to a regression. Second, they are able to determine and share the feature importance of a given training set." (Russell Jurney, "Agile Data Science 2.0: Building Full-Stack Data Analytics Applications with Spark", 2017)

"Early stopping and regularization can ensure network generalization when you apply them properly. [...] With early stopping, the choice of the validation set is also important. The validation set should be representative of all points in the training set. When you use Bayesian regularization, it is important to train the network until it reaches convergence. The sum-squared error, the sum-squared weights, and the effective number of parameters should reach constant values when the network has converged. With both early stopping and regularization, it is a good idea to train the network starting from several different initial conditions. It is possible for either method to fail in certain circumstances. By testing several different initial conditions, you can verify robust network performance." (Mark H Beale et al, "Neural Network Toolbox™ User's Guide", 2017)

"Variance is a prediction error due to different sets of training samples. Ideally, the error should not vary from one training sample to another sample, and the model should be stable enough to handle hidden variations between input and output variables. Normally this occurs with the overfitted model." (Umesh R Hodeghatta & Umesha Nayak, "Business Analytics Using R: A Practical Approach", 2017)

"One of the most common problems that you will encounter when training deep neural networks will be overfitting. What can happen is that your network may, owing to its flexibility, learn patterns that are due to noise, errors, or simply wrong data. [...] The essence of overfitting is to have unknowingly extracted some of the residual variation (i.e., the noise) as if that variation represented the underlying model structure. The opposite is called underfitting - when the model cannot capture the structure of the data." (Umberto Michelucci, "Applied Deep Learning: A Case-Based Approach to Understanding Deep Neural Networks", 2018)

"The premise of classification is simple: given a categorical target variable, learn patterns that exist between instances composed of independent variables and their relationship to the target. Because the target is given ahead of time, classification is said to be supervised machine learning because a model can be trained to minimize error between predicted and actual categories in the training data. Once a classification model is fit, it assigns categorical labels to new instances based on the patterns detected during training." (Benjamin Bengfort et al, "Applied Text Analysis with Python: Enabling Language-Aware Data Products with Machine Learning", 2018)

"The trick is to walk the line between underfitting and overfitting. An underfit model has low variance, generally making the same predictions every time, but with extremely high bias, because the model deviates from the correct answer by a significant amount. Underfitting is symptomatic of not having enough data points, or not training a complex enough model. An overfit model, on the other hand, has memorized the training data and is completely accurate on data it has seen before, but varies widely on unseen data. Neither an overfit nor underfit model is generalizable - that is, able to make meaningful predictions on unseen data." (Benjamin Bengfort et al, "Applied Text Analysis with Python: Enabling Language-Aware Data Products with Machine Learning", 2018)

"There is a trade-off between bias and variance [...]. Complexity increases with the number of features, parameters, depth, training epochs, etc. As complexity increases and the model overfits, the error on the training data decreases, but the error on test data increases, meaning that the model is less generalizable." (Benjamin Bengfort et al, "Applied Text Analysis with Python: Enabling Language-Aware Data Products with Machine Learning", 2018)

"Cross-validation is a useful tool for finding optimal predictive models, and it also works well in visualization. The concept is simple: split the data at random into a 'training' and a 'test' set, fit the model to the training data, then see how well it predicts the test data. As the model gets more complex, it will always fit the training data better and better. It will also start off getting better results on the test data, but there comes a point where the test data predictions start going wrong." (Robert Grant, "Data Visualization: Charts, Maps and Interactive Graphics", 2019)

"Any machine learning model is trained based on certain assumptions. In general, these assumptions are the simplistic approximations of some real-world phenomena. These assumptions simplify the actual relationships between features and their characteristics and make a model easier to train. More assumptions means more bias. So, while training a model, more simplistic assumptions = high bias, and realistic assumptions that are more representative of actual phenomena = low bias." (Imran Ahmad, "40 Algorithms Every Programmer Should Know", 2020)

🔭Data Science: Training Data (Just the Quotes)

"Overfitting occurs when a formula describes a set of data very closely, but does not lead to any sensible explanation for the behavior of the data and does not predict the behavior of comparable data sets. In the case of overfitting, the formula is said to describe the noise of the system rather than the characteristic behavior of the system. Overfitting occurs frequently with models that perform iterative approximations on training data, coming closer and closer to the training data set with each iteration. Neural networks are an example of a data modeling strategy that is prone to overfitting." (Jules H Berman, "Principles of Big Data: Preparing, Sharing, and Analyzing Complex Information", 2013)

"Briefly speaking, to solve a Machine Learning problem means you optimize a model to fit all the data from your training set, and then you use the model to predict the results you want. Therefore, evaluating a model need to see how well it can be used to predict the data out of the training set. Usually there are three types of the models: underfitting, fair and overfitting model [...]. If we want to predict a value, both (a) and (c) in this figure cannot work well. The underfitting model does not capture the structure of the problem at all, and we say it has high bias. The overfitting model tries to fit every sample in the training set and it did it, but we say it is of high variance. In other words, it fails to generalize new data." (Shudong Hao, "A Beginner’s Tutorial for Machine Learning Beginners", 2014)

"A predictive model overfits the training set when at least some of the predictions it returns are based on spurious patterns present in the training data used to induce the model. Overfitting happens for a number of reasons, including sampling variance and noise in the training set. The problem of overfitting can affect any machine learning algorithm; however, the fact that decision tree induction algorithms work by recursively splitting the training data means that they have a natural tendency to segregate noisy instances and to create leaf nodes around these instances. Consequently, decision trees overfit by splitting the data on irrelevant features that only appear relevant due to noise or sampling variance in the training data. The likelihood of overfitting occurring increases as a tree gets deeper because the resulting predictions are based on smaller and smaller subsets as the dataset is partitioned after each feature test in the path." (John D Kelleher et al, "Fundamentals of Machine Learning for Predictive Data Analytics: Algorithms, Worked Examples, and Case Studies", 2015)

"Cross-validation is a method of splitting all of your data into two parts: training and validation. The training data is used to build the machine learning model, whereas the validation data is used to validate that the model is doing what is expected. This increases our ability to find and determine the underlying errors in a model." (Matthew Kirk, "Thoughtful Machine Learning", 2015)

"Tree pruning identifies and removes subtrees within a decision tree that are likely to be due to noise and sample variance in the training set used to induce it. In cases where a subtree is deemed to be overfitting, pruning the subtree means replacing the subtree with a leaf node that makes a prediction based on the majority target feature level (or average target feature value) of the dataset created by merging the instances from all the leaf nodes in the subtree. Obviously, pruning will result in decision trees being created that are not consistent with the training set used to build them. In general, however, we are more interested in creating prediction models that generalize well to new data rather than that are strictly consistent with training data, so it is common to sacrifice consistency for generalization capacity." (John D Kelleher et al, "Fundamentals of Machine Learning for Predictive Data Analytics: Algorithms, Worked Examples, and Case Studies", 2015)

"When memorization happens, you may have the illusion that everything is working well because your machine learning algorithm seems to have fitted the in sample data so well. Instead, problems can quickly become evident when you start having it work with out-of-sample data and you notice that it produces errors in its predictions as well as errors that actually change a lot when you relearn from the same data with a slightly different approach. Overfitting occurs when your algorithm has learned too much from your data, up to the point of mapping curve shapes and rules that do not exist [...]. Any slight change in the procedure or in the training data produces erratic predictions." (John P Mueller & Luca Massaron, "Machine Learning for Dummies", 2016)

"Bias is error from incorrect assumptions built into the model, such as restricting an interpolating function to be linear instead of a higher-order curve. [...] Errors of bias produce underfit models. They do not fit the training data as tightly as possible, were they allowed the freedom to do so. In popular discourse, I associate the word 'bias' with prejudice, and the correspondence is fairly apt: an apriori assumption that one group is inferior to another will result in less accurate predictions than an unbiased one. Models that perform lousy on both training and testing data are underfit." (Steven S Skiena, "The Data Science Design Manual", 2017)

"Bias occurs normally when the model is underfitted and has failed to learn enough from the training data. It is the difference between the mean of the probability distribution and the actual correct value. Hence, the accuracy of the model is different for different data sets (test and training sets). To reduce the bias error, data scientists repeat the model-building process by resampling the data to obtain better prediction values." (Umesh R Hodeghatta & Umesha Nayak, "Business Analytics Using R: A Practical Approach", 2017)

"From a typical training set, many alternative decision trees can be created. As a rule, smaller trees are to be preferred, their main advantages being interpretability, removal of irrelevant and redundant attributes, and lower danger of overfitting noisy training data." (Miroslav Kubat, "An Introduction to Machine Learning" 2nd Ed., 2017)

"High-bias models typically produce simpler models that do not overfit and in those cases the danger is that of underfitting. Models with low-bias are typically more complex and that complexity enables us to represent the training data in a more accurate way. The danger here is that the flexibility provided by higher complexity may end up representing not only a relationship in the data but also the noise. Another way of portraying the bias-variance trade-off is in terms of complexity v simplicity." (Jesús Rogel-Salazar, "Data Science and Analytics with Python", 2017)

"In machine learning, a model is defined as a function, and we describe the learning function from the training data as inductive learning. Generalization refers to how well the concepts are learned by the model by applying them to data not seen before. The goal of a good machine-learning model is to reduce generalization errors and thus make good predictions on data that the model has never seen." (Umesh R Hodeghatta & Umesha Nayak, "Business Analytics Using R: A Practical Approach", 2017)

"Multilayer perceptrons share with polynomial classifiers one unpleasant property. Theoretically speaking, they are capable of modeling any decision surface, and this makes them prone to overfitting the training data." (Miroslav Kubat," An Introduction to Machine Learning" 2nd Ed., 2017)

"The danger of overfitting is particularly severe when the training data is not a perfect gold standard. Human class annotations are often subjective and inconsistent, leading boosting to amplify the noise at the expense of the signal. The best boosting algorithms will deal with overfitting though regularization. The goal will be to minimize the number of non-zero coefficients, and avoid large coefficients that place too much faith in any one classifier in the ensemble." (Steven S Skiena, "The Data Science Design Manual", 2017)

"Variance is error from sensitivity to fluctuations in the training set. If our training set contains sampling or measurement error, this noise introduces variance into the resulting model. [...] Errors of variance result in overfit models: their quest for accuracy causes them to mistake noise for signal, and they adjust so well to the training data that noise leads them astray. Models that do much better on testing data than training data are overfit." (Steven S Skiena, "The Data Science Design Manual", 2017)

"The classifier accuracy would be extra ordinary when the test data and the training data are overlapping. But when the model is applied to a new data it will fail to show acceptable accuracy. This condition is called as overfitting." (Jesu V Nayahi J & Gokulakrishnan K, "Medical Image Classification", 2019)

18 May 2018

🔬Data Science: Boltzmann Machine (Definitions)

[Boltzmann machine (with learning):] "A net that adjusts its weights so that the equilibrium configuration of the net will solve a given problem, such as an encoder problem" (David H Ackley et al, "A learning algorithm for boltzmann machines", Cognitive Science Vol. 9 (1), 1985)

[Boltzmann machine (without learning):] "A class of neural networks used for solving constrained optimization problems. In a typical Boltzmann machine, the weights are fixed to represent the constraints of the problem and the function to be optimized. The net seeks the solution by changing the activations (either 1 or 0) of the units based on a probability distribution and the effect that the change would have on the energy function or consensus function for the net." (David H Ackley et al, "A learning algorithm for boltzmann machines", Cognitive Science Vol. 9 (1), 1985)

"neural-network model otherwise similar to a Hopfield network but having symmetric interconnects and stochastic processing elements. The input-output relation is optimized by adjusting the bistable values of its internal state variables one at a time, relating to a thermodynamically inspired rule, to reach a global optimum." (Teuvo Kohonen, "Self-Organizing Maps 3rd" Ed., 2001)

"A neural network model consisting of interacting binary units in which the probability of a unit being in the active state depends on its integrated synaptic inputs." (Terrence J Sejnowski, "The Deep Learning Revolution", 2018)

"An unsupervised network that maximizes the product of probabilities assigned to the elements of the training set." (Mário P Véstias, "Deep Learning on Edge: Challenges and Trends", 2020)

"Restricted Boltzmann machine (RBM) is an undirected graphical model that falls under deep learning algorithms. It plays an important role in dimensionality reduction, classification and regression. RBM is the basic block of Deep-Belief Networks. It is a shallow, two-layer neural networks. The first layer of the RBM is called the visible or input layer while the second is the hidden layer. In RBM the interconnections between visible units and hidden units are established using symmetric weights." (S Abirami & P Chitra, "The Digital Twin Paradigm for Smarter Systems and Environments: The Industry Use Cases", Advances in Computers, 2020)

"A deep Boltzmann machine (DBM) is a type of binary pairwise Markov random field (undirected probabilistic graphical model) with multiple layers of hidden random variables." (Udit Singhania & B. K. Tripathy, "Text-Based Image Retrieval Using Deep Learning", 2021)

"A Boltzmann machine is a neural network of symmetrically connected nodes that make their own decisions whether to activate. Boltzmann machines use a straightforward stochastic learning algorithm to discover “interesting” features that represent complex patterns in the database." (DeepAI) [source]

"Boltzmann Machines is a type of neural network model that was inspired by the physical process of thermodynamics and statistical mechanics. [...] Full Boltzmann machines are impractical to train, which is one of the reasons why a limited form, called the restricted Boltzmann machine, is used." (Accenture)

"RBMs [Restricted Boltzmann Machines] are a type of probabilistic graphical model that can be interpreted as a stochastic artificial neural network. RBNs learn a representation of the data in an unsupervised manner. An RBN consists of visible and hidden layer, and connections between binary neurons in each of these layers. RBNs can be efficiently trained using Contrastive Divergence, an approximation of gradient descent." (Wild ML)

16 May 2018

🔬Data Science: Training Set/Dataset (Definitions)

"set of data used as inputs in an adaptive process that teaches a neural network." (Teuvo Kohonen, "Self-Organizing Maps" 3rd Ed., 2001)

"A set of observations that are used in creating a prediction model." (Glenn J Myatt, "Making Sense of Data: A Practical Guide to Exploratory Data Analysis and Data Mining", 2006)

"the training set is composed by all labelled examples that are provided for constructing a classifier. The test set is composed by the new unlabelled patterns whose classes should be predicted by the classifier." (Óscar Pérez & Manuel Sánchez-Montañés, "Class Prediction in Test Sets with Shifted Distributions", 2009)

"A collection of data whose purpose is to be analyzed to discover patterns that can then be applied to other data sets." (DAMA International, "The DAMA Dictionary of Data Management", 2011)

"A training set for supervised learning is taken from the labeled instances. The remaining instances are used for validation." (Robert J Glushko, "The Discipline of Organizing: Professional Edition" 4th Ed., 2016)

"A set of known and predictable data used to train a data mining model." (Microsoft, "SQL Server 2012 Glossary", 2012)

"In data mining, a sample of data used at each iteration of the training process to evaluate the model fit." (Meta S Brown, "Data Mining For Dummies", 2014)

"Training Data is the data used to train a machine learning algorithm. Generally, data in machine learning is divided into three datasets: training, validation and testing data. In general, the more accurate and comprehensive training data is, the better the algorithm or classifier will perform." (Accenture)

14 May 2018

🔭Data Science: Reinforcement Learning (Just the Quotes)

"A neural network training method based on presenting input vector x and looking at the output vector calculated by the network. If it is considered 'good', then a 'reward' is given to the network in the sense that the existing connection weights get increased, otherwise the network is "punished"; the connection weights, being considered as 'not appropriately set,' decrease." (Nikola K Kasabov, "Foundations of Neural Networks, Fuzzy Systems, and Knowledge Engineering", 1996)

"A training paradigm where the neural network is presented with a sequence of input data, followed by a reinforcement signal." (Joseph P Bigus, "Data Mining with Neural Networks: Solving Business Problems from Application Development to Decision Support", 1996)

"learning mode in which adaptive changes of the parameters due to reward or punishment depend on the final outcome of a whole sequence of behavior. The results of learning are evaluated by some performance index." (Teuvo Kohonen, "Self-Organizing Maps" 3rd Ed., 2001)

"A learning method which interprets feedback from an environment to learn optimal sets of condition/response relationships for problem solving within that environment" (Pi-Sheng Deng, "Genetic Algorithm Applications to Optimization Modeling", Encyclopedia of Artificial Intelligence, 2009)

"A sub-area of machine learning concerned with how an agent ought to take actions in an environment so as to maximize some notion of long-term reward. Reinforcement learning algorithms attempt to find a policy that maps states of the world to the actions the agent ought to take in those states. Differently from supervised learning, in this case there is no target value for each input pattern, only a reward based of how good or bad was the action taken by the agent in the existent environment." (Marley Vellasco et al, "Hierarchical Neuro-Fuzzy Systems" Part II, Encyclopedia of Artificial Intelligence, 2009)

"a type of machine learning in which an agent learns, through its own experience, to navigate through an environment, choosing actions in order to maximize the sum of rewards." (Lisa Torrey & Jude Shavlik, "Transfer Learning", 2010)

"a machine learning technique whereby actions are associated with credits or penalties, sometimes with delay, and whereby, after a series of learning episodes, the learning agent has developed a model of which action to choose in a particular environment, based on the expectation of accumulated rewards." (Apostolos Georgas, "Scientific Workflows for Game Analytics", Encyclopedia of Business Analytics and Optimization", 2014)

"A type of machine learning in which the machine learns what to do by discovering through trial and error the way to maximize a reward." (Gloria Phillips-Wren, "Intelligent Systems to Support Human Decision Making", 2014)

"it stands, in the context of computational learning, for a family of algorithms aimed at approximating the best policy to play in a certain environment (without building an explicit model of it) by increasing the probability of playing actions that improve the rewards received by the agent." (Fernando S Oliveira, "Reinforcement Learning for Business Modeling", 2014)

"a special case of supervised learning in which the cognitive computing system receives feedback on its performance to guide it to a goal or good outcome." (Judith S Hurwitz, "Cognitive Computing and Big Data Analytics", 2015)

"The knowledge is obtained using rewards and punishments which there is an agent (learner) that acts autonomously and receives a scalar reward signal that is used to evaluate the consequences of its actions." (Nuno Pombo et al, "Machine Learning Approaches to Automated Medical Decision Support Systems", 2015)

"It is also known as learning with a critic. The agent takes a sequence of actions and receives a reward/penalty only at the very end, with no feedback during the intermediate actions. Using this limited information, the agent should learn to generate the actions to maximize the reward in later trials. For example, in chess, we do a set of moves, and at the very end, we win or lose the game; so we need to figure out what the actions that led us to this result were and correspondingly credit them." (Ethem Alpaydın, "Machine learning : the new AI", 2016)

"A learning algorithm for a robot or a software agent to take actions in an environment so as to maximize the sum of rewards through trial and error." (Tomohiro Yamaguchi et al, "Analyzing the Goal-Finding Process of Human Learning With the Reflection Subtask", 2018)

"Training/learning method aiming to automatically determine the ideal behavior within a specific context based on rewarding desired behaviors and/or punishing undesired one." (Ioan-Sorin Comşa et al, "Guaranteeing User Rates With Reinforcement Learning in 5G Radio Access Networks", 2019)

"Brach of the Artificial Intelligence field devoted to obtaining optimal control sequences for agents only by interacting with a concrete dynamical system." (Juan Parras & Santiago Zazo, "The Threat of Intelligent Attackers Using Deep Learning: The Backoff Attack Case", 2020)

"Machine learning approaches often used in robotics. A reward is used to teach a system a desired behavior." (Jörg Frochte et al, "Concerning the Integration of Machine Learning Content in Mechatronics Curricula", 2020)

"This area of deep learning includes methods which iterates over various steps in a process to get the desired results. Steps that yield desirable outcomes are content and steps that yield undesired outcomes are reprimanded until the algorithm is able to learn the given optimal process. In unassuming terms, learning is finished on its own or effort on feedback or content-based learning." (Amit K Tyagi & Poonam Chahal, "Artificial Intelligence and Machine Learning Algorithms", 2020)

"A machine learning paradigm that utilizes evaluative feedback to cultivate desired behavior." (Marten H L Kaas, "Raising Ethical Machines: Bottom-Up Methods to Implementing Machine Ethics", 2021)

"Is an area of machine learning that learn for the experience in order to maximize the rewards." (Walaa Alnasser et al, "An Overview on Protecting User Private-Attribute Information on Social Networks", 2021)

"Reinforcement learning is also a subset of AI algorithms which creates independent, self-learning systems through trial and error. Any positive action is assigned a reward and any negative action would result in a punishment. Reinforcement learning can be used in training autonomous vehicles where the goal would be obtaining the maximum rewards." (Vijayaraghavan Varadharajan & Akanksha Rajendra Singh, "Building Intelligent Cities: Concepts, Principles, and Technologies", 2021)

"Reinforcement Learning uses a kind of algorithm that works by trial and error, where the learning is enabled using a feedback loop of 'rewards' and 'punishments'. When the algorithm is fed a dataset, it treats the environment like a game, and is told whether it has won or lost each time it performs an action. In this way, reinforcement learning algorithms build up a picture of the 'moves' that result in success, and those that don't." (Accenture)

15 March 2018

🔬Data Science: Training (Definitions)

"A step by step procedure for adjusting the weights in a neural net." (Laurene V Fausett, "Fundamentals of Neural Networks: Architectures, Algorithms, and Applications", 1994)

[supervised training:] "Process of adjusting the weights in a neural net using a learning algorithm; the desired output for each of a set of training input vectors is presented to the net. Many iterations through the training data may be required." (Laurene V Fausett, "Fundamentals of Neural Networks: Architectures, Algorithms, and Applications", 1994)

[unsupervised training:] "A training procedure in which only input vectors x are supplied to a neural network; the network learns some internal features of the whole set of all the input vectors presented to it." (Nikola K Kasabov, "Foundations of Neural Networks, Fuzzy Systems, and Knowledge Engineering", 1996)

"The process of adjusting the connection weights in a neural network under the control of a learning algorithm." (Joseph P Bigus, "Data Mining with Neural Networks: Solving Business Problems from Application Development to Decision Support", 1996)

[supervised training:] "Training of a neural network when the training examples comprise input vectors x and the desired output vectors y; training is performed until the neural network 'learns' to associate each input vector x with its corresponding and desired output vector y." (Nikola K Kasabov, "Foundations of Neural Networks, Fuzzy Systems, and Knowledge Engineering", 1996)

"Exposing a neural computing system to a set of example stimuli to achieve a particular user-defined goal." (Guido J Deboeck and Teuvo Kohonen, "Visual explorations in finance with self-organizing maps", 2000)

"The process used to configure an artificial neural network by repeatedly exposing it to sample data. In feed-forward networks, as each incoming vector or individual input is processed, the network produces an output for that case. With each pass of every case vector in a sample (see epoch), connection weights between neurons are modified. A typical training regime may require tens to thousands of complete epochs before the network converges (see convergence)." (David Scarborough & Mark J Somers, "Neural Networks in Organizational Research: Applying Pattern Recognition to the Analysis of Organizational Behavior", 2006)

"The process a data mining model uses to estimate model parameters by evaluating a set of known and predictable data." (Microsoft, "SQL Server 2012 Glossary", 2012)

"In data mining, the process of fitting a model to data. This is an iterative process and may involve thousands of iterations or more." (Meta S Brown, "Data Mining For Dummies", 2014)

"The process of adjusting the weights and threshold values in a neural net to get a desired outcome" (Nell Dale & John Lewis, "Computer Science Illuminated" 6th Ed., 2015)

"Model training is the process of fitting a model to data." (Alex Thomas, "Natural Language Processing with Spark NLP", 2020)

"Model Training is how artificial intelligence (AI) is taught to perform its tasks, and in many ways follows the same process that new human recruits must also undergo. AI training data needs to be unbiased and comprehensive to ensure that the AI’s actions and decisions do not unintentionally disadvantage a set of people. A key feature of responsible AI is the ability to demonstrate how an AI has been trained." (Accenture)

14 December 2014

✨Performance Management: Training (Just the Quotes)

"When a fixed procedure is too long to be memorized economically, job aids in the form of check lists or some other cuing mechanism are provided to enable the man to become familiar with the task during training and to perform it dependably in the field situation." (Leslie J Briggs, "Problems in Stimulation and Programming in the Design of Complex Trainers", 1959)

"Management techniques are obviously essential, but what matters is leadership. [...] Leading the whole organization needs wisdom and flair and vision and they are another matter; they cannot be reduced to a system and incorporated into a training manual." (Anthony Jay, "Management and Machiavelli", 1967)

"A job in which young people are not given real training though, of course, the training need not be a formal 'training program' does not measure up to what they have a right and a duty to expect." (Peter F Drucker, "People and Performance", 1977)

"It makes little sense to subject all employees to training programs, to personnel policies, and to supervision designed for one group of employees, and in particular designed, as so many of the policies are, for yesterday's typical entrant into the labor force the fifteen or sixteen year old without any experience. More and more we will have to have personnel policies that fit the person rather than bureaucratic convenience or tradition." (Peter F Drucker, "Management in Turbulent Times", 1980)

"Because the importance of training is so commonly underestimated, the manager who wants to make a dramatic improvement in organizational effectiveness without challenging the status quo will find a training program a good way to start." (Theodore Caplow, "Managing an Organization", 1983)

"Training is the teaching of specific skills. It should result in the employee having the ability to do something he or she could not do before." (Mary A Allison & Eric Anderson, "Managing Up, Managing Down", 1984)

"Training frequently fails to pay off in behavioral changes on the job: Trainees go back to work and do it the way they've always done it instead of the way you taught them to do it." (Ruth C Clark, "Manager, Training and Information Services", Training, 1986)

"Training won't cover up for poor equipment and outmoded methods. It won't offset mediocre products or deteriorating markets. It won't compensate for poor compensation or abusive supervisory or management practices. And training definitely won't turn the unwilling and uncaring in your organization into motivated, devoted, gung-ho fireballs." (Ron Zemke, Training, 1986)

"You can change behavior in an entire organization, provided you treat training as a process rather than an event." (Edward W Jones, "Training", 1986)

"Effective training programs are essential to identify needs for improvement wherever they might occur, and develop solutions to the concerns that we discover [...] before they become problems." (T Allen McArtor, [speech] 1987)

"One of management's most important functions is to train people for their jobs." (Philip W Metzger, "Managing Programming People", 1987)

"The way to get higher productivity is to train better managers and have fewer of them." (William Woodside, "Thriving on Chaos", 1987)

"Some people are excited about learning a new piece of software. Other people get very depressed. Good managers anticipate both situations - they involve the persons to be affected in the process of selecting a particular program, and they provide time and resources for training. Training is the key in both cases." (Jonathan P Siegel, "Communications", 1988)

10 December 2014

✨Performance Management: Skills (Just the Quotes)

"By far the most valuable possession is skill. Both war and the chances of fortune destroy other things, but skill is preserved." Hipparchus, Commentaries, 2nd century BC)

"Let a man practice the profession which he best knows." (Cicero, "Tusculanarum Disputationum", cca. 45 BC)

"Numeracy has two facets - reading and writing, or extracting numerical information and presenting it. The skills of data presentation may at first seem ad hoc and judgmental, a matter of style rather than of technology, but certain aspects can be formalized into explicit rules, the equivalent of elementary syntax." (Andrew Ehrenberg, "Rudiments of Numeracy", Journal of Royal Statistical Society, 1977)

"Five coordinating mechanisms seem to explain the fundamental ways in which organizations coordinate their work: mutual adjustment, direct supervision, standardization of work processes, standardization of work outputs, and standardization of worker skills." (Henry Mintzberg, "The Structuring of Organizations", 1979)

"The skills that make technical professionals competent in their specialties are not necessarily the same ones that make them successful within their organizations." (Bernard Rosenbaum, "Training", 1986)

"[…] data analysis in the context of basic mathematical concepts and skills. The ability to use and interpret simple graphical and numerical descriptions of data is the foundation of numeracy […] Meaningful data aid in replacing an emphasis on calculation by the exercise of judgement and a stress on interpreting and communicating results." (David S Moore, "Statistics for All: Why, What and How?", 1990)

"[By understanding] I mean simply a sufficient grasp of concepts, principles, or skills so that one can bring them to bear on new problems and situations, deciding in which ways one’s present competencies can suffice and in which ways one may require new skills or knowledge." (Howard Gardner, "The Unschooled Mind", 1991)

"Education is not the piling on of learning, information, data, facts, skills, or abilities - that's training or instruction - but is rather making visible what is hidden as a seed." (Thomas W Moore, "The Education of the Heart", 1996)

"Even when you have skilled, motivated, hard-working people, the wrong team structure can undercut their efforts instead of catapulting them to success. A poor team structure can increase development time, reduce quality, damage morale, increase turnover, and ultimately lead to project cancellation." (Steve McConnell, "Rapid Development", 1996)

"Success or failure of a project depends upon the ability of key personnel to have sufficient data for decision-making. Project management is often considered to be both an art and a science. It is an art because of the strong need for interpersonal skills, and the project planning and control forms attempt to convert part of the 'art' into a science." (Harold Kerzner, "Strategic Planning for Project Management using a Project Management Maturity Model", 2001)

"Even with simple and usable models, most organizations will need to upgrade their analytical skills and literacy. Managers must come to view analytics as central to solving problems and identifying opportunities - to make it part of the fabric of daily operations." (Dominic Barton & David Court, "Making Advanced Analytics Work for You", 2012)

"The biggest thing to know is that data visualization is hard. Really difficult to pull off well. It requires harmonization of several skills sets and ways of thinking: conceptual, analytic, statistical, graphic design, programmatic, interface-design, story-telling, journalism - plus a bit of ‘gut feel.’ The end result is often simple and beautiful, but the process itself is usually challenging and messy." (David McCandless, 2013)

"Finding the right answer is important, of course. But more important is developing the ability to see that problems have multiple solutions, that getting from X to Y demands basic skills and mental agility, imagination, persistence, patience." (Mary H Futrell)

"Productivity is the name of the game, and gains in productivity will only come when better understanding and better relationships exist between management and the work force. [...] Managers have traditionally developed the skills in finance, planning, marketing and production techniques. Too often the relationships with their people have been assigned a secondary role. This is too important a subject not to receive first-line attention." (William Hewlett, "The Human Side of Management", [speech])

"Solving problems is a practical skill like, let us say, swimming. We acquire any practical skill by imitation and practice." (George Polya)

14 July 2014

🌡️Performance Management: Training (Definitions)

"Formal and informal learning options, which may include in-class training, informal mentoring, Web-based training, guided self-study, and formalized on-the-job training programs. The learning options selected for each situation are based on an assessment of the need for training and the performance gap to be addressed." (Sandy Shrum et al, "CMMI®: Guidelines for Process Integration and Product Improvement", 2003)

[cross-training:] "When an employee in one primary job task is trained in another or other tasks." (Robert McCrie, "Security Operations Management" 2nd Ed., 2006)

"An umbrella term to include training, development, and education, where training is learning that pertains to the job, development is learning for the growth of the individual that is not related to a specific job, and education is learning to prepare the individual but not related to a specific job." (Richard Caladine, "Taxonomies for Technology", 2008)

"Learning is a personal construction of knowledge. In order to learn a particular concept or skill, the learner needs to consider how new information relates to the existing understandings that the learner has. The process of sifting through available information in order to select the most appropriate information to use in knowledge construction requires the skills of information literacy. Good information literacy skills are a prerequisite for effective learning." (Carmel McNaught, "Information Literacy in the 21st Century", 2008)

"Activities undertaken to ensure that all individuals have the knowledge and skills required to perform their assignments." (Sally A Miller et al, "People CMM: A Framework for Human Capital Management" 2nd Ed., 2009)

"It is the process of fixing meaning to stimulus. It is the process of constructing new knowledge. Learning should proceed from learner’s sense of vocation, occur in settings or activity systems where the function and purposes of the learning are clear and explicit, focus primarily on developing the capacity to do and where learners seek to accomplish goals. In addition, learning should involve sharing meaning and building connection among meanings and different renditions of the meaning." (Kisilu M Kitainge, "Challenges of Training Motor Vehicle Mechanics for Changing World Contexts and Emergent Working Conditions: Cases of Kenya and Australia", 2009)

"Learning occurs through a cognitive process that occurs in the mind of the individual or, in contrast, learning occurs through a process of socialization and increasing participation rather than formal inquiry." (Mary F Ziegler, "Three Theoretical Perspectives on Informal Learning at Work", 2009)

"The process to obtain or transfer knowledge, skills, and abilities needed to carry out a specific activity or task." (Bettina M Davis & Wendy L Combs, "Demystifying Technical Training: Partnership, Strategy, and Execution", 2009)

[business training:] "Training on concepts that teach skills to understand and work effectively within a company." (Bettina M Davis & Wendy L Combs, "Demystifying Technical Training: Partnership, Strategy, and Execution", 2009)

[IT training:] "Training on content involving the development, maintenance, and use of computer systems, software, and networks." (Bettina M Davis & Wendy L Combs, "Demystifying Technical Training: Partnership, Strategy, and Execution", 2009)

[non-technical training:] "Training that is not technical training, for example, personal effectiveness or business training." (Bettina M Davis & Wendy L Combs, "Demystifying Technical Training: Partnership, Strategy, and Execution", 2009)

[cross-training:] "Enables personnel to learn tasks associated with more than one job." (Barry Berman & Joel R Evans, "Retail Management: A Strategic Approach" 12th Ed., 2013)

"Programs used to teach new (and existing) personnel how best to perform their jobs or how to improve themselves." (Barry Berman & Joel R Evans, "Retail Management: A Strategic Approach" 12th Ed., 2013)

"Is a multidimensional process that results in a relatively enduring change in a person or persons, and consequently how that person or persons will perceive the world and reciprocally respond to its affordances physically, psychologically, and socially. The process of learning has as its foundation the systemic, dynamic, and interactive relation between the nature of the learner and the objective of the learning as ecologically situated in a given time and place as well as over time." (Francisco Cua, "Authentic Education: Affording, Engaging, and Reflecting", 2014)

[on-the-job training:] "Training from an experienced employee to a new employee while working on the job. This is a form of one-on-one training." (Darril Gibson, "Effective Help Desk Specialist Skills", 2014)

"It can be defined as a mental activities by means of which knowledge, skill attitude are acquired, retained and utilized. It is defined it as changes in the particular form, change in behaviour tendency, resulting in relatively permanent practice. It involves that the changes, which occurs as a result of reinforced practice that gives new meaning and orientation. This leads to acquisition of new skills, behaviour tendency that is permanent." (Monsuru B Muraina, "Relevance of the Use of Instructional Materials in Teaching and Pedagogical Delivery: An Overview", 2015)

"Learning is a dynamic concept; it refers to the various processes by which skills and knowledge are acquired by individuals and, through them by organizations. Learning encompasses processes and outcomes as well as both, individual and organizational levels; its use in theory emphasizes the continually changing nature of organizations, and that goes beyond the view of organizations as bundles of resources. Learning includes the capacity to create new capabilities both internally and by acquiring knowledge from sources external to the firm. It also includes the methods for the diffusion of the new knowledge throughout the firm organization." (Arturo T Vargas & Javier J Villazul, "Learning and Innovation in Multinational Companies from Emerging Economies: The Case of CEMEX", 2016)

"The process of improving performance in one or more aspects of an employee’s work output through additional knowledge and or skill." (Fred MacKenzie, "7 Paths to Managerial Leadership", 2016)

"Learning is the act of gaining new knowledge, behaviors, skills, or ability. It may be regarded as a process, rather than a collection of factual and procedural knowledge. Human learning may occur as part of education, professional development, or training." (Chunfang Zhou, "Developing Creativity and Learning Design by Information and Communication Technology (ICT) in Developing Contexts", 2018)

[technical training:] "covers the acquisition of knowledge, skills and competencies leading to overall individual or company performance in the use and application of technology." (BCS Learning & Development Limited, "CEdMA Europe", 2019)

"Learning involves any process that in living organisms leads to permanent capacity change. Learning develops knowledge, abilities, understandings, emotions, attitudes, and sociality, which are important elements of the conditions and raw material of society." (Chunfang Zhou & Zhiliang Zhu, "Fostering Problem-Based Learning (PBL) in Chinese Universities for a Creative Society", 2019)

"The capacity of an individual and an organization to explore new challenges and contexts. It is an opportunity to unlearn which is a dynamic way of learning. It is through unlearning that people shape their brain, to readjust and continue learning. It is essential condition for transformation, creativity and innovation." (Ana Martins et al, "Unravelling Hurdles to Organizational Sustainability by Virtue of Sharing and Creating Knowledge", 2019)

"A shift of mind and what goes on inside learners as they undertake to gain or acquire new knowledge, understanding, skill, attitudes, values, and interests. The ‘what goes on’ could be described as perceiving - sensing and feeling concrete reality, thinking or reasoning abstractly; and internalizing or processing - making it a part of ourselves by actively jumping in and trying it, or reflecting on and watching what is happening; thus, the learner - anywhere along his/her life path, at any age - would have going on inside of him/her the perceiving and internalizing of new knowledges, understandings, skills, attitudes, values, and interests." (John A Henschke, "Leadership Ethics in Higher Education Administration: An Andragogical Perspective", 2020)
