Data Migrations Series |
Introduction
When designing a Data Migration (DM), respectively any software solution, it’s important to take inventory of project’s requirements, evaluate, document, communicate and monitor them accordingly. Each of them can have an important impact on the solution, as a solution’s success will be validated and judged upon them. Therefore, the identified requirements must be considered as baseline for conceptualization, design, implementation and sign-off, and should go through same procedures and rigor as other projects requirements. The existence of a standardized Requirements Management process can facilitate their management through project’s lifecycle.
The requirements are usually driven by the source and target systems (e.g. data import/export features, data models and their constraints), the environments they are hosted on (e.g. cloud vs. on-premise), respectively the layers in between (e.g. network, firewalls), project and business aspects that need to be considered (e.g. freeze window for the Go-Live, data availability dates, data quality, external dependencies, etc.). They resume to the solution itself as well to the data and processes involved, and are reflected but not limited to the following important aspects, that can be considered upon case also as quality acceptance criteria:
Accessibility
Accessibility is the degree to which the data are available for a solution so it can be processed when needed, in the form, by resources, or means intended for processing. It’s critical for a DM solution to access or have available the master, transaction, parameter and further data when needed. The team must make sure that the data become easily accessible.
Unavailability of data can impact the DM and can easily lead to delays in the project. This also means that the various project activities (parametrization, cleansing, enrichment, development) need to be synchronized with the migration activities.
Upon case, accessibility can involve the solution itself expressed as the degree to which it’s available to the resources supposed to use it. Certain architectural decisions can have impact on the carried activities. As the solution is usually deployed on a server, it can happen that only a limited number of people is able to access it concurrently. Moreover, a DM’s complexity makes the involvement of multiple developers challenging.
Accountability
Accountability is the degree to which accountability is enforced for the various resources involved in DM processes and related activities. As multiple resources are involved for parametrization, cleaning, processing, validation, software development, each resource needs to be aware about the extent they are accountable for. Without accountability made explicit, there’s the danger that the activities are neglected, with all the implications deriving from it – quality deviations, delays, data unavailability, etc.
Adaptability
Adaptability is the degree to which a solution can be adapted to environment or requirement changes. Even if typically, the environments don’t change, it doesn’t mean that this will not happen as the IT infrastructure goes through continuous changes that can affect directly or indirectly a migration. Same can be said about requirements, which however have higher probability to change even late in the process as new knowledge is acquired and needs to be integrated in the solution.
Atomicity
Atomicity is the degree to which data entities can be processes at the required level of abstraction in an atomic manner. Even if transformations occur during the various stages, the data belonging to an entity need to be kept and processed together (e.g. Customers and their Addresses). This can involve processing attributes in advance even if the data might be required later. There can be situations in which the data belonging to the same entity need to be processed on different paths, though in the end it’s important to keep the data together, when the processing logic allows it.