12 March 2010

SSIS: The Conditional Split Data Flow Transformation

    In Third Magic Class I was showing how to create a Data Flow Task using the SQL Server 2008 Business Intelligence Development Studio (BIDS), the respective SSIS package being based only on Products with the Product Category having the value ‘Bikes’. What if would be needed to create a table for ‘Components’ Product Category, and another one for ‘Clothes’ and ‘Accessories’? Normally this would equate with creating a package for each destination table, though SSIS provides the Conditional Split component that allows to partition a data set horizontally based on a set of constraints.

   For current example copy paste the Package.dtsx created in the mentioned tutorial and rename it (e.g. Package Conditional Split.dtsx). The first important change is to leverage the scope to the four Product Categories considered - 'Bikes', 'Components', 'Clothing' and 'Accessories', therefore in the ‘OLD DB Source’ change the query as follows: 
 
SELECT ITM.ProductID 
,ITM.ProductName 
, ITM.ProductNumber 
,ITM.ProductModel 
,ITM.ProductSubcategory 
,ITM.ProductCategory 
,ITM.MakeFlag 
,ITM.FinishedGoodsFlag 
,ITM.Color 
,ITM.StandardCost 
,ITM.ListPrice 
,ITM.Class 
,ITM.Style 
,ITM.SellStartDate 
,ITM.SellEndDate 
,ITM.DiscontinuedDate 
,ITM.ModifiedDate 
FROM Production.vProducts ITM 
WHERE ProductCategory IN ('Bikes', 'Components', 'Clothing', 'Accessories') 

   From the Toolbox add the Conditional Split component, delete the previous connector and attach it to the new added transformation element. Double click on the Conditional Split in order to access the Conditional Split Transformation Editor in which you’ll need to specify the conditions used for the split. Considering as the first condition the case treated in the previous tutorial, define as Condition [ProductCategory]==”Bikes” and rename the default ‘Case 1’ Output Name to ‘Case Bikes’. For the second condition consider [ProductCategory] == "Components" as Constraint and ‘Case Components’ as ‘Output Name’, as in below screenshot: 
 
SSIS - Conditional Split
 
   There is no need to create a third Case for 'Clothing' and 'Accessories', because they will be considered on the Conditional Split Default branch. Eventually you could rename the ‘Conditional Split Default Output’ Default output name to ‘Case Else’ as above.

    For each Case in the Conditional Split, including the Default one, you’ll need to have a destination, therefore drop two more OLE DB destinations and link to all three destination one green connector from the Conditional Split. For each connector you’ll have to select the constraint the branch is supposed to address, and be sure that you selected ‘Case Bikes’ for the destination inherited from the template package! As for the other two you’ll have to create two destination tables using the default structure and rename them [Production].[ClothesAccessories], respectively [Production].[Components]. The two tables will have the same definition as [Production].[Bikes] table. Do not forget to check the ‘Keep identity’ and ‘Keep nulls’ checkboxes too!

   Save the package and debug it, the resulted package showing similarly with the one from the below screenshot: 
 
SSIS - Conditional Split Package  

   We have used the Conditional Split Transformation and tested it, though we haven’t finished yet! As I remarked in the previous post, the data from the destination table(s) needs to be deleted each time the package is run. For this in Control Flow tab modify the Execute SQL Task and in the SQL Statement replace the existing statement with the following lines: 

TRUNCATE TABLE [Production].[Bikes] 
TRUNCATE TABLE [Production].[ClothesAccessories] 
TRUNCATE TABLE [Production].[Components] 

   Link the Execution SQL Task connector to the Data Flow Task, assure that the Constraint option value is set to ‘Success’, save the project, test (debug) the package (twice) and don’t forget to validate the output data!

Note:
   I preferred to keep things as simple as possible, especially when considering the constraints used for the Conditional Split, however this shouldn't stop you to attempt using the supported library of functions and operators in order to create more complex constraints. This post is dependent on Third Magic Class post and the SSIS “template” package create in it, so please start with the respective post!

1 comment:

Anonymous said...

Thanks, very helpfull

Related Posts Plugin for WordPress, Blogger...

About Me

My photo
IT Professional with more than 24 years experience in IT in the area of full life-cycle of Web/Desktop/Database Applications Development, Software Engineering, Consultancy, Data Management, Data Quality, Data Migrations, Reporting, ERP implementations & support, Team/Project/IT Management, etc.