SSIS Package Execution: Multi-core machine utilization

  • Question

  • Hi,

    I have been working with SQL Server Integration Services for some time now, and the level of flexibility available for ETL is amazing. One thing that bothers me is how it operates in an enterprise-level environment.

    Usually, when the amount of data increases, organizations tend to ramp up the hardware. With this comes the possibility of using multi-core machines. The problem with multi-core machines is that load sharing is done at the operating system level. Unless the application programmers have written code that makes use of multiple cores in the server machine, it is up to Windows to share the load as and when it sees fit.

    That is, if Windows encounters 2 threads of an application, it can execute them on separate cores. But if it encounters 1 thread, it will continue to execute that thread on a single core while the rest of the cores remain idle, unless the application developer wrote that single thread to make use of multiple cores, using an API that allows for processor instruction-level distribution and parallelism on multiple cores.
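    To make the point above concrete, here is a minimal sketch of what "writing the application to use multiple cores" tends to look like: the work is split into independent chunks and handed to a pool of workers, so the operating system has more than one unit of work to schedule. This is only an illustration and says nothing about how the SSIS pipeline engine works internally. The names `split_into_chunks`, `load_chunk`, and `parallel_load` are hypothetical, and `load_chunk` is a stand-in that merely counts rows. Note that CPython threads share a global interpreter lock, so for CPU-bound work one would likely swap in `ProcessPoolExecutor`; the partitioning structure stays the same.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(rows, parts):
    """Split rows into at most `parts` contiguous, near-equal chunks."""
    size, extra = divmod(len(rows), parts)
    chunks, start = [], 0
    for i in range(parts):
        # The first `extra` chunks take one additional row each.
        end = start + size + (1 if i < extra else 0)
        if end > start:  # skip empty chunks when rows < parts
            chunks.append(rows[start:end])
        start = end
    return chunks


def load_chunk(chunk):
    """Hypothetical stand-in for loading one partition; returns rows handled."""
    return len(chunk)


def parallel_load(rows, workers=None):
    """Hand one chunk to each worker and return the total rows handled."""
    workers = workers or os.cpu_count() or 1
    chunks = split_into_chunks(rows, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(load_chunk, chunks))
```

    With a single chunk there is only one unit of work, and the remaining workers (and cores) have nothing to do, which is exactly the situation described above.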

    My problem with SSIS starts when I have to load more than 1 TB of data into a data warehouse. When loading this amount of data, would the SSIS package (that is, the IS pipeline engine) make use of a multi-core machine if one is available?

    And if it doesn't, how can I use SSIS in such an environment? I am thinking of maybe using multiple machines (all multi-core boxes) running Windows HPC Server 2008 in order to distribute tasks across multiple platforms. That still doesn't solve the problem of distributing the load across multiple cores.

    Has anybody got any ideas?
    Wednesday, July 23, 2008 8:39 AM

Answers

  •  Hi Irtiza,

    This forum is for software developers who are using the Open Protocol Specification documentation to assist them in developing systems, services, and applications that are interoperable with Windows.

    The Open Protocol Specifications can be found at: http://msdn2.microsoft.com/en-us/library/cc203350.aspx.

    Since your post does not appear to be related to the Open Protocol Specification documentation set but rather to an implementation question, I suggest you post your question on a different forum.

    For future reference, I find the SQL Server Integration Services forum (http://forums.microsoft.com/msdn/ShowForum.aspx?ForumID=80&SiteID=1) the most appropriate for your question.

    Thanks and regards,

    SEBASTIAN CANEVARI - MSFT SEE Protocol Documentation Team
    • Marked as answer by Chris Mullaney Tuesday, September 30, 2008 10:00 PM
    Wednesday, July 23, 2008 5:43 PM

All replies

  • Have you got an answer? I have a similar question.
    Wednesday, February 9, 2011 11:15 PM
  • Who marked the above post as abusive? I would also like to ask whether you have found an answer to your question; if so, please share it. @administrators: Please move this post to the SSIS forums.

    --------------------------------------------------------

    Surender Singh Bhadauria


    Tuesday, December 27, 2011 7:26 AM