Saturday, March 09, 2013 9:02 AM
I am creating a ssis solution which encompasses a parent - child design pattern. In my child package there a table AID which needs to be checked for data. The table consists of 2 fields: [tableid] [recID] TableID is the identifier of a table where data resides. recID is the row identifier for the data in that table. There is alson another table called usageTables which consists fo fields [TableID][TableName] There are total of 7 different tables where data can reside. The data retrieved from these tables then goes into a DQS component to be cleansed.
I am trying to decide the most efficient way to send data from the data sources to the DQS component. is it to have have 7 data flow tasks for each table?
Saturday, March 09, 2013 12:44 PM
That's one way.
My preference is to create 7 nearly identical SSIS packages that do a single table each. Once you have the packages, save them to a folder on the server and then create 7 SQL jobs with each one pointing to the different package. But my purpose for this is to follow their progress and verify the integrity (which could be done with an email task just as easily). I can click the job, close the window, click the next, close that, and so on until all 7 packages are running asynchronously on the server. Your way is probably more efficient, and in the development environment you can follow the progress as well but once the package is automated somewhere, that ability goes away.
I question whether you need the sort of control I'm suggesting here over your SSIS tasks or not. Your way is more efficient that what I am suggesting here. The only difference being, my way creates a line in the Event log when it runs as a job.
Saturday, March 09, 2013 1:48 PM
That kind of control is not needed in my solution. Also it the complexity in every package varies and having a separate package will mean creating a lot of packages which is not wanted for this solution.
Assuming I have the following data AID table as follows:
...AND SO ON
If I have 7 data sources for each table in 1 data flow task. How can I pass the recIDs to each of the separate data sources? For example with the data above, the data source component for table 102789 must return the row with recID 10. Also if the are no records for a given table then the data source component does not have to execute. Any ideas on how to dynamically create this in SSIS?
- Edited by Sauce1979 Saturday, March 09, 2013 2:07 PM
Monday, March 11, 2013 4:32 PMI managed to come up with a solution where I Created 7 data sources, each with a filter in the in a where clause on TableID. I then executed them in parallel.
- Marked As Answer by Sauce1979 Monday, March 11, 2013 4:32 PM