Showing posts with label linked server. Show all posts
Showing posts with label linked server. Show all posts

25 October 2018

💎SQL Reloaded: Cursor and Linked Server for Data Import

There are times when is needed to pull some data (repeatedly) from one or more databases for analysis and SSIS is not available or there’s not much time to create individual packages via data imports. In such scenarios is needed to rely on the use of SQL Server Engine’s built-in support. In this case the data can be easily imported via a linked server into ad-hoc created tables in a local database. In fact, the process can be partially automated with the use of a cursor that iterates through a predefined set of tables.For exemplification I will use a SELECT instead of an EXEC just to render the results:

-- cleaning up
-- DROP TABLE dbo.LoadTables 

-- defining the scope
SELECT *
INTO dbo.LoadTables
FROM (VALUES ('dbo.InventTable')
           , ('dbo.InventDimCombination')
    , ('dbo.InventDim')
    , ('dbo.InventItemLocation')) DAT ([Table])


-- creating the stored procedure 
CREATE PROCEDURE dbo.pLoadData(
    @Table as nvarchar(50))
AS
/* loads the set of tables defiend in dbo.LoadTables */
BEGIN
   DECLARE @cTable varchar(50)

   -- creating the cursor
   DECLARE TableList CURSOR FOR
   SELECT [Table]
   FROM dbo.LoadTables
   WHERE [Table] = IsNull(@Table, [Table])
   ORDER BY [Table]

   -- opening the cursor
   OPEN TableList 

   -- fetching next record 
   FETCH NEXT FROM TableList
   INTO @cTable

   -- looping through each record 
   WHILE @@FETCH_STATUS = 0 
   BEGIN
 --- preparing the DROP TABLE statement 
        SELECT(' DROP TABLE IF EXISTS ' + @cTable + '')

        -- preparing the SELECT INTO STATEMENT
        SELECT( ' SELECT *' +
         ' INTO ' + @cTable +
                ' FROM [server].[database].[' + @cTable + ']')

 -- fetching next record 
 FETCH NEXT FROM TableList
 INTO @cTable
   END

   --closing the cursor
   CLOSE TableList 
   -- deallocating the cursor
   DEALLOCATE TableList 
END

Running the stored procedure for all the tables:

 -- Testing the procedure 
 EXEC dbo.pLoadData NULL -- loading all tables 

-- output 
 DROP TABLE IF EXISTS dbo.InventDim
 SELECT * INTO dbo.InventDim FROM [server].[database].[dbo.InventDim]

 DROP TABLE IF EXISTS dbo.InventDimCombination
 SELECT * INTO dbo.InventDimCombination FROM [server].[database].[dbo.InventDimCombination]

 DROP TABLE IF EXISTS dbo.InventItemLocation
 SELECT * INTO dbo.InventItemLocation FROM [server].[database].[dbo.InventItemLocation]

 DROP TABLE IF EXISTS dbo.InventTable
 SELECT * INTO dbo.InventTable FROM [server].[database].[dbo.InventTable]

Running the stored procedure for a specific table:

-- Testing the procedure 
EXEC dbo.pLoadData 'dbo.InventTable' -- loading a specific table

-- output 
DROP TABLE IF EXISTS dbo.InventTable
SELECT * INTO dbo.InventTable FROM [server].[database].[dbo.InventTable]

Notes:
Having an old example of using a cursor (see Cursor and Lists)  the whole mechanism for loading the data was available in 30 Minutes or so.
Tables can be added or removed after need, and the loading can be made more flexible by adding other parameters to the logic.
The solution is really easy to use and the performance is as well acceptable in comparison to SSIS packages.
Probably you already observed the use of DROP TABLE IF EXSISTS introduced with SQL Server 2016 (see also post)

Advantages:The stored procedure can be extended to any database for which can be created a linked server.
Structural changes of the source tables are reflected in each load.
Tables can be quickly updated when needed just by executing the stored procedure.

Disadvantages:
Such solutions are more for personal use and their use should be avoided in a production environment.
The metadata will be temporarily unavailable during the time the procedure is run. Indexes need to be created after each load.

Happy Coding!

20 June 2010

🎡SSIS: Using Oracle as Data Source

Making SSIS work with Oracle doesn’t seem to be a complicated task especially after several years of experience in doing that, though, as usual, something new appears in the landscape – new software versions, new requirements, an environment with its own particularities, etc. In general, when an application needs a connection to an Oracle server, is needed to install on the client computer several components that come with Oracle Client, add/configure the TNS (Transparent Network Substrate) name, eventually set-up some global variables in case the components were not installed in the default location, and depending on the chosen provider might be needed to configure also a DSN (Data Source Name) pointing to the Oracle server. SSIS makes actually no exception from these steps, once the steps performed you should be in theory ready to develop, test, deploy, schedule and run packages using a connection to an Oracle server.

Unfortunately, from my experience, every 2-3 installations, there is a problem with the Oracle Client and its configuration, most of the times the solution being quite simple – removing some declarations from the set-up files, correcting the global variables or the TNS name. As it seems the Oracle Client is quite sensible to the changes in the default installation path, therefore is indicated to install the Client using the default path unless you’d like to gain more experience in troubleshooting such installation issues. In addition, as any respectable company, some of the products come with their own defects, the patching of such issues being not so easy, therefore it’s advisable to check beforehand the known issues coming with the Client version you’d need to install, preferably in case you need to take advantage of the latest Oracle features, it makes sense to install the latest stable Client.  

Because the Oracle Client downloadable package has a few hundred MB, there is a thinner Oracle alternative to Oracle Client, namely Instant Client package, the components could be downloaded from Oracle site (here) and installed individually, as a minimum being necessary to install Instant Client Package – Basic or Basic Lite versions, the ODBC libraries and the SQL *Plus libraries in case you want to test the connection. Of course, after case could be installed also the JDBC, SDK or any other packages Oracle made available.

On 64 bit platforms might be needed to install in parallel the 32 and 64 bit Oracle/Instant Clients (see SSIS, Oracle and X64 post from Business Vision DEV Team), while in order to troubleshoot the various issues could be a good idea to check the differences between 32 bit and 64 bit registry (here). 

There are several drivers that allows you to connect to an Oracle database using SSIS, the most popular ones:
- Microsoft OLE BD Provider for Oracle
- Oracle Provider for OLE DB
- .Net Framework Data Provider for Oracle
- .Net Framework Data Provider for Odbc
- Oracle Data Provider for .NET

The Microsoft drivers come with MDAC (Microsoft Data Access Components), its latest version being MDAC 2.8. Starting with Windows Vista and Windows Server 2008, Microsoft changed MDAC into WDAC (Windows Data Access Components), including it as part of the operating system, removing thus the need to redistribute the components. See Data Access Technologies Road Map, FAQ and Troubleshooting MDAC/WDAC for MDAC/WDAC architecture, components and releases, respectively troubleshooting. Given the various issues that exist with a particular MDAC/WDAC library, see in KB301202 how you could check the current version by using the registers, while for  Windows version up to Windows Server 2003 could be used the Component Checker MDAC Utility.

If .Net Framework Data Provider for Oracle, respectively for Odbc come with the .Net framework, in exchange, Oracle Data Provider for .NET, see also FAQ is Oracle’s implementation for ADO.NET data access,  supposed to take advantage of advanced Oracle database functionality, it comes with ODAC (Oracle Data Access Components). Please note that you might need in install the respective components in addition to the Oracle Client.

There are also third party drivers for Oracle or even particularly for SSIS, for example Microsoft Connectors Version 1.1 for Oracle and Teradata for use with SSIS from Attunity; see also a few tips from SQL Server Performance blog.

Before taking any decision on which driver to use, it might be a good idea to look also at drivers’ limitations and advantages. In the past I often used the Microsoft Oracle ODBC Driver until I run into in an important limitation, namely its inexistent support for unicodes, this residing, according to KB244661, in the fact that “from Microsoft Data Access Components (MDAC) version 2.5 and later versions, both the Microsoft ODBC Driver and OLE DB Provider support ONLY Oracle 7 and Oracle 8i”. Also Oracle Provider for OLE DB seems to have its own limitations.

SQL Server provides linked servers, the powerful functionality of executing commands against OLE DB data sources on remote servers, including Oracle. As linked servers offer the ability to create cross-vendor distributed queries, in certain scenarios they could prove to be a powerful alternative of querying Oracle databases, in such cases not being needed to create in SSIS package an additional connection to Oracle. KB280106 and an article on Oracle Provider for OLE DB describe how to set up and troubleshoot a linked server to an Oracle database in SQL Server.

You might want to check also the SQL Server Integration Services with Oracle Database 10 White Paper coming from Microsoft. “Connectivity and SQL Server 2005 Integration Services”  MSDN article written by Bob Beauchemin, Scott Barrett’s blog or this SSIS 64 bit – Using Oracle Provider post.

11 May 2009

🛢DBMS: Federated Database (Definitions)

"A database that is spread across multiple servers, often in multiple geographical locations, is called a federated database. The servers that hold the different parts of a federated database are referred to as a federation, or federated database servers. A federation of database servers is used to spread the processing load across a group of servers. The data is horizontally partitioned allowing each of the servers to be independently managed, but distributed queries can be used to process requests on the entire database." (Darril Gibson, "MCITP SQL Server 2005 Database Developer All-in-One Exam Guide", 2008)

"A database system wherein constituent databases, that are geographically decentralized and using various computer systems, are interconnected via a computer network or software that allows metasearching via a single platform. Since the constituent database systems remain autonomous, a federated database system is a contrastable alternative to the task of merging together several disparate databases." (Mila M. Ramos et al, "The CGIAR Virtual Library Bridging the Gap Between Agricultural Research and Worldwide Users", 2009)

"A set of databases that are documented and then interconnected to operate as one database, even when those databases are on different platforms." (DAMA International, "The DAMA Dictionary of Data Management", 2011)

[federated data warehouse:] "1.A conceptual Data Warehouse made up of multiple decision support databases, potentially on multiple servers, but presented transparently to Business Intelligence users as a unified schema for query, analysis, and reporting. 2.An Enterprise Data Warehouse fed by extracts from departmental Data Warehouses and/or legacy Data Warehouses prior to their incorporation and/or retirement." (DAMA International, "The DAMA Dictionary of Data Management", 2011)

"A distributed database management system (DBMS) that consists of a DB2 instance that operates as a server, a database that serves as the federated database, one or more data sources, and clients (users and applications) who access the database and data sources. A federated system can be used to query and manipulate data that is located on other data servers." (Sybase, "Open Server Server-Library/C Reference Manual", 2019)

"In a federated system, the database that is within the federated server. Users and applications interface with the federated database. To these clients, the data sources and the federated database seem to be a single database." (Sybase, "Open Server Server-Library/C Reference Manual", 2019)

"A set of linked servers that shares the processing load of data by hosting partitions of a distributed partitioned view." (Microsoft Technet)

"A system in which multiple databases appear to function as a single entity. However, the databases typically involved in this kind of system exists independently of the others. Once the different databases are 'combined', one federated database is formed." (Solutions Review)

07 February 2009

🛢DBMS: Linked Server (Definitions)

"A definition of an OLE DB data source used by SQL Server distributed queries. The linked server definition specifies the OLE DB provider required to access the data, and includes enough addressing information for the OLE DB provider to connect to the data. Any rowsets exposed by the OLE DB data source can then be referenced as tables, called linked tables, in SQL Server distributed queries." (Microsoft Technet, "Glossary", 0)

"An abstraction of an OLE DB data source that looks like another server to the local SQL Server. A linked server has an associated OLE DB provider that manages the data source. Linked servers allow heterogeneous data access as if the data were local SQL Server data." (Microsoft Corporation, "SQL Server 7.0 System Administration Training Kit", 1999)

"A database object that represents a particular data source and the attributes, including security and collation attributes, necessary to access the data source." (Thomas Moore, "EXAM CRAM™ 2: Designing and Implementing Databases with SQL Server 2000 Enterprise Edition", 2005)

"An OLE DB data source used by SQL Server distributed queries." (Marilyn Miller-White et al, "MCITP Administrator: Microsoft® SQL Server™ 2005 Optimization and Maintenance 70-444", 2007)

"A definition that specifies an external OLE DB database source, such as another SQL Server, or an Oracle server. Once defined, the linked server can be used for distributed queries using only the four-part name in the query." (Darril Gibson, "MCITP SQL Server 2005 Database Developer All-in-One Exam Guide", 2008)

"Database objects that provide the connection information for remote data sources housed on another server. The other server could be a SQL Server, an Oracle server, a Microsoft Access database, or one of many other data sources. Linked servers are created when the remote data source will be accessed more than once or twice." (Robert D. Schneider and Darril Gibson, "Microsoft SQL Server 2008 All-In-One Desk Reference For Dummies", 2008)

"Database objects that provide the connection information for remote data sources housed on another server. The other server could be a SQL Server, an Oracle server, a Microsoft Access database, or one of many other data sources. Linked servers are created when the remote data source will be accessed more than once or twice. See also ad hoc query." (Robert D Schneider & Darril Gibson, "Microsoft SQL Server 2008 All-in-One Desk Reference For Dummies", 2008)

"A definition of an OLE DB data source used by SQL Server distributed queries. The data exposed by a linked server is then referenced as tables, called linked tables." (Jim Joseph, "Microsoft SQL Server 2008 Reporting Services Unleashed", 2009)

"A definition of an OLE DB data source used by SQL Server distributed queries. The linked server definition specifies the OLE DB provider required to access the data, and includes enough addressing information for the OLE DB provider to connect to the data." (Microsoft, "SQL Server 2012 Glossary", 2012)

"This is the remote servers in a distributed query." (Joseph L Jorden & Dandy Weyn, "MCTS Microsoft SQL Server 2005: Implementation and Maintenance Study Guide - Exam 70-431", 2006)

Related Posts Plugin for WordPress, Blogger...

About Me

My photo
Koeln, NRW, Germany
IT Professional with more than 24 years experience in IT in the area of full life-cycle of Web/Desktop/Database Applications Development, Software Engineering, Consultancy, Data Management, Data Quality, Data Migrations, Reporting, ERP implementations & support, Team/Project/IT Management, etc.