Showing posts with label OLEDB. Show all posts
Showing posts with label OLEDB. Show all posts

03 February 2021

Data Migrations (DM): Conceptualization IV (Data Access)

Data Migration
Data Migrations Series

Once the data sources for a Data Migration (DM) were identified the first question is how the data can be accessed. The legacy systems relying on ODBC-based databases are in theory relatively easy to access as long they allow the direct access to their data, which would enable thus a pull strategy. Despite this, there are organizations that don’t allow the direct access to the data even for read-only operations, being preferred to push the data directly to the consumers (aka push strategy) or push the data to a given location from where the consumer can use the data as needed (aka hybrid strategy). 

The direct access to the data allows in theory the best flexibility as the solution can extract the data when needed and this especially important during the initial phases of the project when the data need to be pulled more frequently until the requirements and logic is stabilized. A push strategy tends to add additional overhead as usually somebody else oversees the data exports, respectively the data need to be prepared in the expected format. On the other side, it would make sense to make an exception for a DM and allow the direct access to the data. 

 Hybrid strategies tend to be more complex and require additional resources or overhead as the data are stored temporarily at a separate location. Unfortunately, in certain scenarios this is the only approach can be used. Are preferred data files that allow keeping the integrity of the data and facilitate data consumption. Therefore, tabular text files or JSON files are preferred in the detriment of XML or Excel files. It’s preferable to export one data structure individually then storing parent-child solutions even if the latter can prove to be useful in certain scenarios. When there’s no other solution one can use also the standard reports available in the legacy systems.

When storing data outside the legacy systems for further processing it’s recommended to follow organization’s best practices, respectively to address the data security and privacy requirements. ETL tools allows accessing data from password protected areas like FTP, OneDrive or SharePoint. The fewer security layers in between the lower is in theory the overhead. Therefore, given its stability and simplicity FTP might prove to be a better storage solution than OneDrive, SharePoint or other similar technologies.

Ideally the extraction/export mechanisms should use the database objects that encapsulate already the logic in the legacy systems otherwise the team will need to reengineer the logic – for master data this can prove to be easy, though the logic of transactional data like on-hand or open invoices can be relatively complex to reengineer. Otherwise, the logic can be implemented directly in the extraction/export mechanisms or sometimes is more advisable to create database objects (usually on a different schema) on the legacy systems and just call the respective objects. 

When connecting directly to the data source it’s advisable using the data provider which allows the best performance and flexibility, however several tests might be needed to determine the best fit. It would be useful to check the limitations of each provider and find a stable driver version.  OLEDB and ADO.Net data providers provide in general a good performance, though native drivers of the legacy systems can be a better option upon case. 

Some legacy systems allow the access to their data only via service-based technologies like OData. OData tends to have poor performance for large data exports than standard access methods and therefore not indicated in such scenarios. In such cases might be a good idea to export the data directly from the legacy system. 

Previous Post <<||>> Next Post

07 February 2009

DBMS: Linked Server (Definitions)

"A definition of an OLE DB data source used by SQL Server distributed queries. The linked server definition specifies the OLE DB provider required to access the data, and includes enough addressing information for the OLE DB provider to connect to the data. Any rowsets exposed by the OLE DB data source can then be referenced as tables, called linked tables, in SQL Server distributed queries." (Microsoft Technet, "Glossary", 0)

"An abstraction of an OLE DB data source that looks like another server to the local SQL Server. A linked server has an associated OLE DB provider that manages the data source. Linked servers allow heterogeneous data access as if the data were local SQL Server data." (Microsoft Corporation, "SQL Server 7.0 System Administration Training Kit", 1999)

"A database object that represents a particular data source and the attributes, including security and collation attributes, necessary to access the data source." (Thomas Moore, "EXAM CRAM™ 2: Designing and Implementing Databases with SQL Server 2000 Enterprise Edition", 2005)

"An OLE DB data source used by SQL Server distributed queries." (Marilyn Miller-White et al, "MCITP Administrator: Microsoft® SQL Server™ 2005 Optimization and Maintenance 70-444", 2007)

"A definition that specifies an external OLE DB database source, such as another SQL Server, or an Oracle server. Once defined, the linked server can be used for distributed queries using only the four-part name in the query." (Darril Gibson, "MCITP SQL Server 2005 Database Developer All-in-One Exam Guide", 2008)

"Database objects that provide the connection information for remote data sources housed on another server. The other server could be a SQL Server, an Oracle server, a Microsoft Access database, or one of many other data sources. Linked servers are created when the remote data source will be accessed more than once or twice." (Robert D. Schneider and Darril Gibson, "Microsoft SQL Server 2008 All-In-One Desk Reference For Dummies", 2008)

"Database objects that provide the connection information for remote data sources housed on another server. The other server could be a SQL Server, an Oracle server, a Microsoft Access database, or one of many other data sources. Linked servers are created when the remote data source will be accessed more than once or twice. See also ad hoc query." (Robert D Schneider & Darril Gibson, "Microsoft SQL Server 2008 All-in-One Desk Reference For Dummies", 2008)

"A definition of an OLE DB data source used by SQL Server distributed queries. The data exposed by a linked server is then referenced as tables, called linked tables." (Jim Joseph, "Microsoft SQL Server 2008 Reporting Services Unleashed", 2009)

"A definition of an OLE DB data source used by SQL Server distributed queries. The linked server definition specifies the OLE DB provider required to access the data, and includes enough addressing information for the OLE DB provider to connect to the data." (Microsoft, "SQL Server 2012 Glossary", 2012)

"This is the remote servers in a distributed query." (Joseph L Jorden & Dandy Weyn, "MCTS Microsoft SQL Server 2005: Implementation and Maintenance Study Guide - Exam 70-431", 2006)

Related Posts Plugin for WordPress, Blogger...

About Me

My photo
IT Professional with more than 24 years experience in IT in the area of full life-cycle of Web/Desktop/Database Applications Development, Software Engineering, Consultancy, Data Management, Data Quality, Data Migrations, Reporting, ERP implementations & support, Team/Project/IT Management, etc.