SELECT * -- specify explicitly the attributes FROM [Production].[Bikes] UNION ALL SELECT * -- specify explicitly the attributes FROM [Production].[ClothesAccessories] UNION ALL SELECT * -- specify explicitly the attributes FROM [Production].[Components]
If the datasets are distributed across several data sources then in theory could be created a package for each data source and dump the data locally, however this often implies the synchronization of all involved packages, eventually by calling each package from a parent package. A better and easier approach is to use the Union All Data Flow Transformation that merges the data from multiple sources into a common result set, a data source needing to be defined for each input dataset. As exemplification I will use the three Product tables created in the previous post, namely [Production].[Bikes] and [Production].[ClothesAccessories] and[Production].[Components]. Even if the tables are located on the same server and the datasets could be merged with a UNION, I will use the general approach and reuse the template SSIS package defined in Third Magic Class post. Copy paste the Package.dtsx in the project and rename it (e.g. Package Union All.dtsx), in Data Flow Task rename the Data Source (e.g. OLE DB Source Bikes) and change the SQL command text using the following query:
SELECT ITM.ProductID , ITM.ProductName , ITM.ProductNumber , ITM.ProductModel , ITM.ProductSubcategory , ITM.ProductCategory , ITM.MakeFlag , ITM.FinishedGoodsFlag , ITM.Color , ITM.StandardCost , ITM.ListPrice , ITM.Class , ITM.Style , ITM.SellStartDate , ITM.SellEndDate , ITM.DiscontinuedDate , ITM.ModifiedDate FROM Production.Bikes ITM
Once this step completed copy the OLE DB Source, paste it again in the current Data Flow Task twice, and modify the queries of the two new added data sources to point to Production.ClothesAccessories, respectively to Production.Components tables. From Toolbox add a Union All Transformation and link to it the (green) connectors of the three sources, and once you open the Union All Transformation Editor you’ll see that SQL Server mapped the columns automatically:
Accept the mapping and link the (green) connector of the Union All Transformation to the existing OLE DB Destination. When attempting to use the OLE DB Destination Editor SQL Server will first show the Restore Invalid Column References Editor in which you could update the invalid references.
Accept the default settings and in the OLE DB Destination Editor create a new table based on the existing input (e.g. [Production].[ProductsOfInterest]), and test the Data Flow Task:
TRUNCATE TABLE [Production].[ProductsOfInterest]
Save the project, test (debug) the package (twice) and don’t forget to validate the output data:
SELECT * FROM [Production].[ProductsOfInterest]
Note:
In case you need to reuse the datasets in multiple packages, it makes sense to load each dataset in its own staging table rather then loading the data over and over again.
No comments:
Post a Comment