
03 June 2012

Data Migrations (DM): What is Data Migration?

Data Migrations Series

If you are working in a data-centric business, it’s almost impossible for the average worker not to have heard this term, even tangentially. Considering the meaning of “migration” (the act or process of moving from one place to another), intuition alone suggests what data migration is about: the process of moving data from one place to another. It’s pretty basic, isn’t it? However, data are moved over and over again between various places, for example between the layers of an application, between databases, between storage media, and so on, so we need some precision in defining the term, because not all of these can be considered examples of data migration. Actually, we can talk about data copying or data movement without speaking of data migration. So, what is data migration? Here are a few takes on defining it:

“process of transferring data from one platform or operating system to another” (Babylon)

"Data migration is the process of transferring data between storage types, formats, or computer systems." (Wikipedia)

"Data migration is the movement of legacy data to new media and technologies as the older ones are displaced." (Toolbox)

“The purpose of data migration is to transfer existing data to the new environment.” (Talend)

“Data Migration is the process of moving data from one or more sources into a target application” (Utopia Inc.)

“[…] is the one off selection, preparation and transportation of appropriate data, of the right quality, to the right place, at the right time.” (J. Morris)

Summarizing the above definitions, data migration can be defined as “the process of selecting, assessing, converting, preparing, validating and moving data from one or more information systems to another system”. The definition is far from perfect: first, some of the terms need further explanation; second, any of the steps may be skipped, or other important steps may be identified in the process; and third, further clarifications are needed. Still, it offers some precision and, at least for this reason, could be preferred to the above definitions.

To summarize, data migration involves the movement of data from one or more information systems, referred to as source systems, to another one, the target system. Typically the new system replaces the old systems, which are either retired or continue to be used with reduced scope, for example for reporting purposes. Even if performed in stages, the movement is typically a one-time activity, so everything has to be right the first time. That’s the purpose of the other steps: to minimize the risk of something going wrong. The choice of steps and their complexity depend on the type of information systems involved, on the degree of resemblance between source and target, on business needs, etc.
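To make the steps in the definition more tangible, here is a minimal Python sketch of a migration run from two source systems into one target. Everything in it is a hypothetical illustration: the system names (`legacy_crm`, `legacy_erp`), the field names and the validation rules are assumptions, and the assessing step is left out for brevity. Real migrations are far more involved.

```python
from dataclasses import dataclass, field


@dataclass
class TargetSystem:
    """Hypothetical in-memory stand-in for the target system."""
    customers: dict = field(default_factory=dict)


def select(source_rows):
    # Selecting: keep only the records still relevant for the target (here: active ones).
    return [row for row in source_rows if row.get("active")]


def convert(row):
    # Converting/preparing: map the source layout onto the target layout.
    return {
        "id": int(row["cust_no"]),
        "name": row["name"].strip().title(),
        "country": row.get("country", "").upper(),
    }


def validate(row):
    # Validating: return a list of problems; an empty list means the row may be moved.
    problems = []
    if not row["name"]:
        problems.append("missing name")
    if len(row["country"]) != 2:
        problems.append("country is not a 2-letter code")
    return problems


def migrate(sources, target):
    """Run select -> convert -> validate -> move over all sources; return the rejects."""
    rejected = []
    for source_name, rows in sources.items():
        for row in select(rows):
            converted = convert(row)
            problems = validate(converted)
            if problems:
                rejected.append((source_name, row["cust_no"], problems))
            else:
                target.customers[converted["id"]] = converted  # moving
    return rejected


# Two assumed source systems feeding one target system.
sources = {
    "legacy_crm": [
        {"cust_no": "1", "name": " alice smith ", "country": "de", "active": True},
        {"cust_no": "2", "name": "bob", "country": "de", "active": False},
    ],
    "legacy_erp": [
        {"cust_no": "3", "name": "", "country": "fr", "active": True},
        {"cust_no": "4", "name": "carol", "country": "France", "active": True},
    ],
}
target = TargetSystem()
rejected = migrate(sources, target)
```

Rejected records are collected rather than silently dropped, so they can be corrected in the source and re-run; this is one way the validation step reduces the risk of the one-time move.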

As mentioned above, not everything that involves data movement can be considered data migration. For example, data integration involves the movement and combination of data from various information systems in order to provide a unified view. Data synchronization involves the movement of data in order to reflect the changes made in one information system in another, when data in the two systems need to be consistent. Data mirroring is a form of synchronization in which an exact copy of the data is maintained continuously, in real time. Data backup involves the movement or copying of data at a given point in time for a possible restore in case of data loss. Data transfer refers to the movement of raw data between the layers of information systems. To make things even fuzzier, these types of data movement can occur within a data migration too, as data need to be locally integrated, synchronized, transferred, mirrored or backed up. Data migration is, overall, a complex topic.
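The difference between a backup and a synchronization can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the two "systems" are plain dictionaries, and the function names `backup` and `synchronize` are made up for the example.

```python
import copy


def backup(system):
    # Data backup: an independent point-in-time copy, kept for a later restore.
    return copy.deepcopy(system)


def synchronize(source, replica):
    """One-way synchronization: make `replica` reflect the current state of `source`.

    Returns the number of changes applied to the replica.
    """
    changes = 0
    # Propagate inserts and updates.
    for key, value in source.items():
        if key not in replica or replica[key] != value:
            replica[key] = copy.deepcopy(value)
            changes += 1
    # Propagate deletions.
    for key in list(replica):
        if key not in source:
            del replica[key]
            changes += 1
    return changes


system_a = {"C1": {"name": "Alice"}, "C2": {"name": "Bob"}}
system_b = {}

snapshot = backup(system_a)                   # frozen at this point in time
first_run = synchronize(system_a, system_b)   # initial load of the second system

# The first system keeps changing after the backup was taken.
system_a["C1"]["name"] = "Alice Smith"
del system_a["C2"]
system_a["C3"] = {"name": "Carol"}

second_run = synchronize(system_a, system_b)  # only the differences are applied
```

After the second run the two systems agree again, while the backup still holds the earlier state. Unlike a migration, the synchronization is expected to run repeatedly for as long as both systems are in use.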


16 February 2009

DBMS: Data Synchronization (Definitions)

"In replication, the process that ensures the publication and destination tables contain the same schema and data. This process must occur before a subscription server can receive replicated transactions from an article or a publication." (Patrick Dalton, "Microsoft SQL Server Black Book", 1997)

"Refers to the process in which the article or articles subscribed to on a subscription server are initially synchronized with the original article or articles on the publication server." (Owen Williams, "MCSE TestPrep: SQL Server 6.5 Design and Implementation", 1998)

[automatic synchronization:] "Synchronization that is accomplished automatically by SQL Server when a server initially subscribes to a publication. A snapshot of the table data and schema are written to files for transfer to the Subscriber. The table schema and data are transferred by the distribution agent. No operator intervention is required." (Microsoft Corporation, "SQL Server 7.0 System Administration Training Kit", 1999)

"The process of maintaining the same schema and data in a publication at a Publisher and in the replica of a publication at a Subscriber. See also initial snapshot." (Microsoft Corporation, "SQL Server 7.0 System Administration Training Kit", 1999)

"The process of ensuring that the publication and destination tables contain the same schema and data. This process must occur before a new Subscriber can receive replicated transactions from a publication. It is also called initial synchronization." (Microsoft Corporation, "Microsoft SQL Server 7.0 Data Warehouse Training Kit", 2000)

"Synchronization is the process in replication of maintaining the same schema and data at a Publisher and at a Subscriber." (Anthony Sequeira & Brian Alderman, "The SQL Server 2000 Book", 2003)

"Integrating, matching, or linking data from disparate sources." (Linda Volonino & Efraim Turban, "Information Technology for Management" 8th Ed., 2011)

"The continuous harmonization of data attribute values between two or more different systems, with the end result being the data attribute values are the same in all of the systems." (DAMA International, "The DAMA Dictionary of Data Management", 2011)

[initial synchronization:] "The first synchronization for a subscription, during which system tables and other objects that are required by replication, and the schema and data for each article, are copied to the Subscriber." (Microsoft, "SQL Server 2012 Glossary", 2012)

"The process by which a satellite downloads and runs the same DB2 database commands, operating system commands, and SQL statements from the satellite control server as the other members of its group download and then reports the results to the satellite control server." (Sybase, "Open Server Server-Library/C Reference Manual", 2019)

"A form of embedded middleware that allows applications to update data on two systems so that the data sets are identical. These services can run via a variety of different transports but typically require some application-specific knowledge of the context and notion of the data being synchronized." (Gartner)

"Data synchronization is the effort to ensure that, once data leaves a system or storage entity, it does not fall out of harmony with its source, thereby creating inconsistency in the data record." (Information)

"1. In replication, the process of data and schema changes being propagated between the Publisher and Subscribers after the initial snapshot has been applied at the Subscriber. 2. In database mirroring, when a mirroring session starts or resumes, the process in which log records of the principal database that have accumulated on the principal server are sent to the mirror server, which writes these log records to disk as quickly as possible to catch up with the principal server." (Microsoft Technet)

"The process of keeping selected data in multiple data sources in agreement." (Microsoft Technet)

"The term, Synchronization, refers to the process of replicating the changes made to documents on one database to the same documents in a second instance of that database." (Couchbase)
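Several of the definitions above distinguish an initial synchronization (a full snapshot copied to the subscriber) from the ongoing propagation of changes afterwards. The following Python sketch illustrates that two-phase idea with an in-memory change log. The `Publisher` and `Subscriber` classes are hypothetical and do not reproduce how SQL Server, Couchbase or any other product implements replication.

```python
import copy


class Publisher:
    """Hypothetical publisher that records every change in an ordered log."""

    def __init__(self):
        self.data = {}
        self.log = []  # entries are (operation, key, value)

    def upsert(self, key, value):
        self.data[key] = value
        self.log.append(("upsert", key, value))

    def delete(self, key):
        del self.data[key]
        self.log.append(("delete", key, None))


class Subscriber:
    """Hypothetical subscriber that tracks how much of the log it has applied."""

    def __init__(self):
        self.data = {}
        self.position = 0  # number of publisher log entries already applied

    def initial_synchronization(self, publisher):
        # Copy a full snapshot and remember where in the log it was taken.
        self.data = copy.deepcopy(publisher.data)
        self.position = len(publisher.log)

    def synchronize(self, publisher):
        # Apply only the log entries accumulated since the last run.
        pending = publisher.log[self.position:]
        for operation, key, value in pending:
            if operation == "upsert":
                self.data[key] = copy.deepcopy(value)
            else:
                self.data.pop(key, None)
        self.position += len(pending)
        return len(pending)


publisher = Publisher()
publisher.upsert("doc1", {"title": "Draft"})

subscriber = Subscriber()
subscriber.initial_synchronization(publisher)

# Changes made after the snapshot are propagated by the next synchronization.
publisher.upsert("doc1", {"title": "Final"})
publisher.upsert("doc2", {"title": "Notes"})
publisher.delete("doc2")

applied = subscriber.synchronize(publisher)
nothing_pending = subscriber.synchronize(publisher)
```

Tracking a log position means each run transfers only what changed since the previous one, rather than comparing the full data sets every time.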


About Me

IT Professional with more than 24 years of experience in IT, covering the full life cycle of Web/Desktop/Database Applications Development, Software Engineering, Consultancy, Data Management, Data Quality, Data Migrations, Reporting, ERP implementations & support, Team/Project/IT Management, etc.