locked
Control Flow vs. Data Flow RRS feed

  • Question

  • I find it very difficult why SSIS has  distinction between Control Flow and Data Flow. In many cases; you can do things in either flow. In control flow one can use execute sql database tasks; save result sets in variables and then pass it to  another execute sql tasks. Control flow tasks have precedence constraints; one can use them to decide if then else logic(instead of conditional transformation)  Data flow does not provide precedence constraint; wonder why? Control flow does not provide data viewer; why?

    Some cases it is clear cut to use control flow; e.g.; to loop over a set if files; however in other cases it is. Can someone point me to some link which provides guidelines when to  use  it.

    Sunday, May 26, 2013 8:04 PM

Answers

  • Yes you are correct that sometimes execute sql task and data flow task, both, can be used to achieve same result and it depends on your requirement which one to use. If you are doing a data load within same server and you can easily write the transformation rules using sql then you can wrap all the data load logic within a stored procedure and call it using a execute sql task. If your source and destination database are on different servers then you have to use a data flow task unless yo use linked servers in your query. Using a dataflow task allows you to use more than one connection managers unlike execute sql task which allows to connect to only one server using a connection manager. It also allows to use various inbuilt data flow components that can be easily used to implement data transformation rules. A developer who is not well versed in writing sql can also implement these rules using data flow components. If you want to continue your data load even if sourec data has some incorrect record, you can do it using data flow task and it's error redirection/ ignore features and at same time you can easily log these recorsd for later use. So,,yes you can use execute sql task to replace a data flow task sometimes but not always.

    Nitesh Rai- Please mark the post as answered if it answers your question

    • Marked as answer by Prem Mehrotra Monday, May 27, 2013 12:24 PM
    Monday, May 27, 2013 7:25 AM

All replies

  • An SSIS package not only does the actual data loading but also does some processing and stage setting activities before loading the data to a database. Control flow allows to do these activities including:

    • creating some tables in database
    • checking some files in a folder
    • executing a batch file or application
    • downloading some files from specific location
    • Ensuring a data load to happen only if certain set of files are present using precedence constraints etc.) . 

    Data flow task only allows to work with the incoming data from source and any logic implemented within data flow task is based on the actual data values. So, data flow task revolves completely around the data stored in the memory buffers. It means that all the activities mentioned (bullet points) earlier cannot be done by data flow task.

    Control flow task includes various steps and precedence constraints are used to govern the execution of these steps (and precedence constraint is in no way similar to conditional split. See last section). Precedence constraints allow you to specify certain conditions so that a control flow task can run (or not run)

    Precedence constraint allows you to ensure that a control flow task runs on success or failure or completion of another task while conditional split checks values for incoming source data and decides to either pass it or filter it based on the condition specified. As I have already mentioned earlier that precedence constraints determines the execution of control flow tasks based on execution status of other tasks and since data flow task itself is a control flow task, precedence constraints cannot be used inside a data flow task.

    As already mentioned earlier that data flow task deals with the data using memory buffers, concept of data viewers within data flow task does makes sense. It allows you to see the data flowing from source to various components in the memory buffer. However control flow does not deal with the data in memory buffers flowing from one task to other there is no concept of data viewer.


    Nitesh Rai- Please mark the post as answered if it answers your question

    • Proposed as answer by SSISJoostMVP Monday, May 27, 2013 5:32 AM
    Sunday, May 26, 2013 9:25 PM
  • Nitesh:

    Thanks for detailed explanation. I have seen way too many exanples of for each loop container and exceute sql task which can be done in data flow using ole db source, ole db command etc. So it is quite confusing to me when to use Control flow vs Data Flow.  Whatever I can do in data flow; I try that first;  and use control flow only when it cannot be done in data flow.

    Sunday, May 26, 2013 11:08 PM
  • Yes you are correct that sometimes execute sql task and data flow task, both, can be used to achieve same result and it depends on your requirement which one to use. If you are doing a data load within same server and you can easily write the transformation rules using sql then you can wrap all the data load logic within a stored procedure and call it using a execute sql task. If your source and destination database are on different servers then you have to use a data flow task unless yo use linked servers in your query. Using a dataflow task allows you to use more than one connection managers unlike execute sql task which allows to connect to only one server using a connection manager. It also allows to use various inbuilt data flow components that can be easily used to implement data transformation rules. A developer who is not well versed in writing sql can also implement these rules using data flow components. If you want to continue your data load even if sourec data has some incorrect record, you can do it using data flow task and it's error redirection/ ignore features and at same time you can easily log these recorsd for later use. So,,yes you can use execute sql task to replace a data flow task sometimes but not always.

    Nitesh Rai- Please mark the post as answered if it answers your question

    • Marked as answer by Prem Mehrotra Monday, May 27, 2013 12:24 PM
    Monday, May 27, 2013 7:25 AM