Failed to start a new STA thread

  • Question

  • Hi!

    We are seeing some really odd failures where a .NET 1.1 DLL fails to start a new STA thread. The odd part is that the caller does not get any kind of exception, apart from the one we throw ourselves when resetEvent.WaitOne does not succeed within 5 minutes.

    The method is called from a BizTalk 2004 orchestration. The worker thread uses a third-party COM DLL that does not work in an MTA.

    Here is a short code snippet to show what we are trying to do.

        public abstract class MyBaseClass
        {
            private ManualResetEvent resetEvent = new ManualResetEvent(false);
    
            protected void Execute()
            {
                try
                {
                    Thread STAThread = new Thread(new ThreadStart(RunCommand));
                    STAThread.ApartmentState = ApartmentState.STA;
                    STAThread.Name = "STA worker";
    
                    STAThread.Start();
    
                    if (!resetEvent.WaitOne(new TimeSpan(0, 0, 5, 0, 0), false))
                    {
                        throw new ApplicationException("Thread did not respond within timeout limit");
                    }
                }
                catch (Exception ex)
                {
                    // dumping of the exception to the eventlog omitted for brevity
                    throw;
                }
            }
    
            private void RunCommand()
            {
                try
                {
                    /*
                     * this  is where the real work is done
                     */ 
                }
                catch ( Exception ex)
                {
                    // dumping of the exception to the eventlog omitted for brevity
                    throw;
                }
                finally
                {
                    resetEvent.Set();
                }
            }
        }
    


    I've added a lot of debugging code to that function. It shows that most of the time the function works fine, but a few times a day the child thread just doesn't start.

    We've also noticed that it quite often takes a few seconds before the RunCommand delegate actually does anything.

    Do you have any idea what could cause this and how to investigate it further?

    With regards

    Tapio Kulmala
    Thursday, November 5, 2009 8:51 AM

Answers

All replies

  • Hello Tapio

    It's not clear from your description whether the timeout is caused by a failure to create the thread or by a long run of "RunCommand". Could you please add a timer in the RunCommand method to record its run time? That way we can narrow down the problem. I look forward to your test results.
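
    A minimal sketch of that kind of timing, under stated assumptions: `TimedWorker`, `StartLatency` and `RunDuration` are hypothetical names standing in for the poster's `MyBaseClass`, and the STA setting and the COM call are left out. It uses `DateTime.UtcNow` rather than `Stopwatch`, which is not available in .NET 1.1. It records the delay before the delegate starts separately from the time spent inside it.

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for MyBaseClass; all names here are illustrative only.
public class TimedWorker
{
    private readonly ManualResetEvent done = new ManualResetEvent(false);
    private DateTime startRequestedUtc;

    // Time between calling Start() and the first line of RunCommand.
    public TimeSpan StartLatency;
    // Time spent inside RunCommand itself.
    public TimeSpan RunDuration;

    public void Execute()
    {
        Thread worker = new Thread(new ThreadStart(RunCommand));
        worker.Name = "STA worker";

        // Take the timestamp immediately before asking the thread to start.
        startRequestedUtc = DateTime.UtcNow;
        worker.Start();

        if (!done.WaitOne(new TimeSpan(0, 5, 0), false))
        {
            throw new ApplicationException("Thread did not respond within timeout limit");
        }
    }

    private void RunCommand()
    {
        DateTime enteredUtc = DateTime.UtcNow;
        StartLatency = enteredUtc - startRequestedUtc;
        try
        {
            // The real work (the COM call) would go here.
        }
        finally
        {
            RunDuration = DateTime.UtcNow - enteredUtc;
            done.Set();
        }
    }
}
```

    A large StartLatency with a small RunDuration would point at thread start-up rather than at the work itself; if the timeout fires and neither value was ever written, the delegate never ran.
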
    Regards,
    Jialiang Ge
    MSDN Subscriber Support in Forum
    If you have any feedback of our support, please contact msdnmg@microsoft.com.
    Please remember to mark the replies as answers if they help and unmark them if they provide no help.
    Welcome to the All-In-One Code Framework! If you have any feedback, please tell us.
    Thursday, November 5, 2009 10:08 AM
    Moderator
  • Hi!

    The problem is in the thread creation. The delegate does not even start, or it takes far too long before the delegate does anything. I have a lot of trace-logging code in those methods. Quite often RunCommand takes less than a second, but it takes a few seconds before it even starts.

    Tapio 
    Thursday, November 5, 2009 10:20 AM
  • Hello Tapio

    STAThread.Start(); causes the operating system to change the state of the target thread to "running"; however, that does not mean the target thread will run immediately. It runs based on .NET thread scheduling:
    http://msdn.microsoft.com/en-us/library/2k34xtf3.aspx
    If there are a great many threads in your process, or if some running thread has a higher priority, your thread may be delayed.

    Could you please attach a debugger (e.g. the VS debugger or WinDbg + SOS) to the BizTalk process that loads and runs your .NET module? When the timeout happens, please break into the process and check the running threads in the debugger. How many threads do you see? Could you please paste here the basic information the debugger prints for these threads?
    Regards,
    Jialiang Ge
    Thursday, November 5, 2009 11:27 AM
    Moderator
  • Hello Again

    It wasn't easy to repeat the problem....   :(

    !threads
    ThreadCount: 109
    UnstartedThread: 2
    BackgroundThread: 74
    PendingThread: 2
    DeadThread: 13


    Tapio

    Monday, November 9, 2009 2:50 PM
  • I've investigated this further, and my trace logging clearly shows that when resetEvent.WaitOne times out after 5 minutes, the state of the thread is still Unstarted. Why doesn't Start change the state of the thread? The server is under very light load and has plenty of CPUs/cores (2 × quad-core Xeon). There shouldn't be any reason for this.
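
    For readers who want to capture the same evidence, here is a small sketch of how the thread's state could be written to the log at the moment of the timeout. `StartDiagnostics` and `Describe` are hypothetical names, not part of the original code; `Thread.ThreadState` and `Thread.IsAlive` are the standard properties being read.

```csharp
using System;
using System.Threading;

// Hypothetical helper; the class and method names are illustrative only.
public class StartDiagnostics
{
    // Builds a one-line description of a thread for a log entry.
    public static string Describe(Thread t)
    {
        // ThreadState is a flags enum, so test the Unstarted bit explicitly.
        bool unstarted = (t.ThreadState & ThreadState.Unstarted) != 0;

        return "name=" + t.Name
            + ", state=" + t.ThreadState
            + ", isAlive=" + t.IsAlive
            + ", unstarted=" + unstarted;
    }
}
```

    In the timeout branch of Execute, the result of StartDiagnostics.Describe(STAThread) could be appended to the ApplicationException message, so that each failure records whether the thread ever left the Unstarted state.
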

    Tapio
    Tuesday, November 10, 2009 9:22 AM
  • Hello Tapio

    Thanks for your effort to reproduce the problem. Please make sure to capture a process dump file when the problem happens. I'd like to dig into the dump and see whether I can find anything wrong. You can follow the article http://support.microsoft.com/default.aspx/kb/286350 to capture the dump file of the application. After you get the dump, please let me know your email address by sending a mail to jialge@microsoft.com. Then I will create a file transfer workspace where you can upload your dump file. The dump will be kept confidential.
    Regards,
    Jialiang Ge
    Tuesday, November 10, 2009 11:51 AM
    Moderator
  • Thanks. I'll try to recreate the situation again. I'll probably also have to get permission for this.

    One more thing. When the problem happens, the whole system starts to behave erratically. For example, many outbound web-service SOAP calls start failing with timeouts at System.Net.HttpWebRequest.GetRequestStream. All other BizTalk activity slows down as well. It feels like everything is waiting for something and not getting it in a timely fashion.


    Tapio     
    Tuesday, November 10, 2009 12:42 PM
  • Hello

    Here is a small update on our research into the case. From the output of !threadpool, we found that CPU utilization is 100%. This can explain why the newly created thread does not start in time. The high CPU utilization may be related to some third-party modules. Tapio is digging into it.


    Regards,
    Jialiang Ge
    Monday, December 7, 2009 4:08 AM
    Moderator
  • Hi Jialiang,

    Did you find a solution for this case? I have the same problem trying to start a new STA thread in a library called from a BizTalk orchestration, but it never starts and there is no error to explain why.

    Regards,

    Thursday, January 16, 2014 9:10 AM