none
How do I troubleshoot SSIS package hanged for running?

    General discussion

  • [This is just for sharing information for those common asked questions collected from forums. If you have any better way or feedback, please directly reply in this thread so that community users can benefit from your answers.]

    Question:
    My SSIS package is running in a SQL Server Agent job, but when it is running, it hangs. I could not see error messages from the job history. What is the problem?
     
    Answer:
    There might be various reasons to cause a SSIS package hanged for running and generally for a hang issue, there is no log information helpful for diagnostics. These are the difficulties of troubleshooting hang issues. In this FAQ, we would like to give you three common aspects to help you perform self-troubleshooting:
    1. Resource bottleneck
    2. Blocking or Deadlock
    3. Poor performance query

    Resource Bottleneck
    Generally resource bottleneck includes bottlenecks of memory, processor, physical disk or network interface. You can run Performance Monitor (perfmon.exe) to collect the trace with the following counters:
    Processor: %Processor Time
    Processor: %DPC Time
    Processor: %Privileged Time
    Processor: Interrupts/sec
    Memory: Available Bytes
    Memory: Pages/sec
    Memory: Page Faults/sec
    Memory: Committed Bytes
    PhysicalDisk: Avg. Disk Queue Length
    PhysicalDisk: Avg. Disk Read Queue Length
    PhysicalDisk: Avg. Disk Write Queue Length
    PhysicalDisk: Disk Reads/sec
    PhysicalDisk: Disk Writes/sec
    PhysicalDisk: Disk Transfers/sec
    Network Interface: Bytes Total/sec
    Network Interface: Bytes Sent/sec
    Network Interface: Bytes Received/sec
    Network Interface: Current Bandwidth
    Process: *
    System: Processor Queue Length
    System: Threads

    (*) means all the related performance counters


    Memory pressure
    When you observe that there is not much available memory and the Pages/sec is over 20, it indicates that you are probably facing a memory bottleneck.
    You can check or try the following methods to reduce memory usage:
    • Check if your SSIS package is trying to load a large amount of data with a Sort transformation that is memory intensive, if there is a sort transformation in your SSIS package, consider removing it and put the sort operation to your database level.
    • Avoid including unnecessary columns from your source component
    • Stop all unnecessary applications or services to release some memory;
    • Reduce the values of DefaultBufferMaxRows and DefaultBufferSize.
    • If your SSIS package uses SQL Server data source, try configuring "min server memory" and "max server memory" on your SQL Server instance so as to restrict the memory usage by SQL Server;
    • Split your SSIS package into multiple packages with each one only transferring small amount of data.
    • If your server is 32-bit with 4GB memory, you can enable /3GB switch in your system boot file so as to assign 1GB more memory to applications running in user mode.
    If these steps still do not help, consider adding physical memory to your machine.

    High CPU
    If you encounter high CPU issues during the package execution, you can identify which process is intensively using the CPU by observing the Process: %Processor Time counter or simply using Task Manager to sort the process by CPU. If the high CPU is for dtexec process, you need to check if there are too many concurrent threads for SSIS tasks. If so, try reducing the package property MaxConcurrentExecutables value or combining parallel tasks in sequence to see if this situation will be improved.
     
    If your SSIS package is accessing a local SQL Server instance, you need to check if the high CPU is caused by a nonoptimal query plan. You can refer to this article on further troubleshooting, http://msdn.microsoft.com/en-us/library/cc966540.aspx.

    If the high CPU is caused by another process, you need to observe the relationship between the process and your SSIS package, and evaluate if you can temporarily stop the process or schedule to run your SSIS package in a different time.
     
    Disk bottleneck
    If you see PhysicalDisk: Avg. Disk Queue Length >2, it usually indicates that you are probably facing a disk I/O bottleneck issue. You need to contact your hardware vendor to perform I/O stress test and diagnostics.

    Network bottleneck
    For NIC, you can check the network utilization by observing the counters Network Interface: Bytes Total/sec and Network Interface: Current Bandwidth. If the average value of (bytes total/sec) / current bandwidth is over 70% during your SSIS package execution, it means that your NIC may encounter a bottleneck and you may consider replacing it with a faster NIC.

    In addition, you can run Network Monitor (netmon.exe) to monitor the network traffic on your NIC, if you see a significantly large number of TCP re-transmissions, it could indicate that your network condition is not healthy and it may cause intermittent connection issue.
    You can also refer to this article for using perfmon to diagnose performance issues:
    http://technet.microsoft.com/en-us/magazine/2008.08.pulse.aspx

    Blocking & Deadlock
    If you do not encounter resource bottleneck problems, you can check if there are blocking/deadlock issue. Blocking issues may happen in the underlying data sources, files or script tasks.

    For SQL Server, there are a lot of methods to monitor blocking and deadlock issues. A common way is to use SQL Server Profiler to capture the trace of Locking events. You can also refer to this KB article to monitor blocking issue:
    http://support.microsoft.com/kb/271509
    You can also enable the trace flags 1204 and 1222 on your SQL Server to check if there is any deadlock happening:
    DBCC TRACEON(1204, 1222, -1)
    Enabling these trace flags will output deadlock information to your SQL Server error logs and you can find the detailed information from your SQL Server Error log. This article describe more information about deadlock:
    http://msdn.microsoft.com/en-us/library/ms178104.aspx

    When you identify blockers, you can kill them with the KILL command. Blocking and deadlock are very common for concurrent access and sometimes it is by design and cannot be avoided, but you can refer to the following article to minimize the blocking opportunity:
    http://technet.microsoft.com/en-us/magazine/2008.04.blocking.aspx

    For file access, avoiding concurrent access usually can resolve blocking issues.
     
    For Script task, a common blocking issue happens on concurrently reading/writing variables. You need to use VariableDispenser.LockForRead or VariableDispenser.LockForWrite for read or write variables. Remember to explicitly call Variables.Unlock() to release the locks once you finish handling the variables. More information can be found here:
    Variables.Unlock Method
    http://msdn.microsoft.com/en-us/library/microsoft.sqlserver.dts.runtime.variables.unlock.aspx
     
    Poor Performance Query
    If through your observation, you find that your SSIS package is hanging at the step of an Execute SQL Task or other tasks running some queries to your SQL Server database, in this case, you can use SQL Server Profiler to capture the trace of the queries and identify the cause of the issue by analyzing the profiler trace file. Then you can use Database Engine Tuning Advisor (DTA) to analyze the queries and get some suggestions from DTA on how to improve the query performance such as creating missing indexes. You can refer to this article for more information:
    http://msdn.microsoft.com/en-us/library/ms166575.aspx

    Using DTA is a simple way to help you perform quick tuning on your queries.
    Besides if your database is a large database with busy CRUD operations, you may need to pay special attentions on your database statistics and fragmentations. If database statistics are not updated in-time, the cached query execution plan may be out and thus is not the best optimized one for the current query, it is recommended that you schedule a job to timely update statistics on those heavy tables.
    Also huge fragmentations also degrade query performance, you can run the DMF sys.dm_db_index_physical_stats on your table indexes to check the value of avg_fragmentation_in_percent, if it is between 5% and 30%, try re-organizing your indexes, if it is larger than 30%, try rebuilding indexes.

    If you want to know more detailed information of troubleshooting SQL Server performance issue, you can refer to the following two articles:
    Troubleshooting Performance Problems in SQL Server 2005
    http://msdn.microsoft.com/en-us/library/cc966540.aspx
    Troubleshooting Performance Problems in SQL Server 2008
    http://msdn.microsoft.com/en-us/library/dd672789.aspx

    If you have a large amount of data to be transferred, it is recommended that you enable the FastLoad option which requires a TABLE LOCK on the destination table. If your destination is SQL Server, please use SQL Server Destination for the best performance.
     
    Other
    Some problems are caused by known issues which are addressed in the latest SQL Server Service Packs. It is recommended that you apply the latest service pack before you perform in-depth troubleshooting.
     
    If you still cannot resolve this issue after taking the above jobs, please collect a hang dump file with ADPlus and contact Microsoft Customer Support Services for further assistance.
    How to use ADPlus to troubleshoot "hangs" and "crashes"
    http://support.microsoft.com/kb/286350


    Please remember to mark the replies as answers if they help and unmark them if they provide no help
    Tuesday, June 22, 2010 3:55 AM

All replies

  • Charles,

    Another scenario of jobs 'hanging' is when the packages run a process taht requires user's input. This specially dificult to validate when the package is run via SQL Server agent job or similar.


    Help Others! Don't forget to mark your thread as answered

    Rafael Salas - SSIS and more
    Tuesday, October 05, 2010 11:16 PM