16 May 2018

Data Science: Training Set/Dataset (Definitions)

"set of data used as inputs in an adaptive process that teaches a neural network." (Teuvo Kohonen, "Self-Organizing Maps" 3rd Ed., 2001)

"A set of observations that are used in creating a prediction model." (Glenn J Myatt, "Making Sense of Data: A Practical Guide to Exploratory Data Analysis and Data Mining", 2006)

"the training set is composed by all labelled examples that are provided for constructing a classifier. The test set is composed by the new unlabelled patterns whose classes should be predicted by the classifier." (Óscar Pérez & Manuel Sánchez-Montañés, "Class Prediction in Test Sets with Shifted Distributions", 2009)

"A collection of data whose purpose is to be analyzed to discover patterns that can then be applied to other data sets." (DAMA International, "The DAMA Dictionary of Data Management", 2011)

"A training set for supervised learning is taken from the labeled instances. The remaining instances are used for validation." (Robert J Glushko, "The Discipline of Organizing: Professional Edition" 4th Ed., 2016)

"A set of known and predictable data used to train a data mining model." (Microsoft, "SQL Server 2012 Glossary", 2012)

"In data mining, a sample of data used at each iteration of the training process to evaluate the model fit." (Meta S Brown, "Data Mining For Dummies", 2014)

"Training Data is the data used to train a machine learning algorithm. Generally, data in machine learning is divided into three datasets: training, validation and testing data. In general, the more accurate and comprehensive training data is, the better the algorithm or classifier will perform." (Accenture)

No comments:

Related Posts Plugin for WordPress, Blogger...

About Me

My photo
IT Professional with more than 24 years experience in IT in the area of full life-cycle of Web/Desktop/Database Applications Development, Software Engineering, Consultancy, Data Management, Data Quality, Data Migrations, Reporting, ERP implementations & support, Team/Project/IT Management, etc.