Hundreds of threads from Worker Role connecting to SignalR on Web Role

  • Question

  • User-1625935058 posted

    My system has a Cloud Service with a Worker Role that reads messages from a queue (Azure Service Bus) and spawns a thread that uses the C# SignalR client to connect to a Cloud Service running a Web Role hosting the SignalR Hub.  The worker thread runs for about 5 minutes doing various things including intermittently sending messages to the Hub - maybe 25 messages total.  I am scaling out with Azure Service Bus topics - the default of 5.  The Cloud Services are separate but reside in the same Virtual Network - the Worker Role points to the load balancer probes for the Web Role (but right now I am only running a single instance of each Role).

    I am trying to determine the capacity of both the Worker Role (with the SignalR clients) and the Web Role (hosting the SignalR hub).

    I can run 200 concurrent threads on the Worker Role with each connecting, exchanging messages, and disconnecting cleanly.  Neither Role experiences more than a 35% CPU spike during the testing.  SignalR Performance counters all look great - there are no errors, no SSE or LP connections, and no scaleout queueing or scaleout errors.

    When I try 300, suddenly all but one of my threads on the Worker Role fail to connect and throw TimeoutExceptions that read "Transport timed out trying to connect".  I enabled tracing on the C# client in the Worker Role and I see that WebSockets, SSE, and LP all fail (Auto: Failed to connect to using transport webSockets/serverSideEvents/longPolling).

    I am hoping to understand if:

    a) my expectations are off: should I be able to have more than 200 concurrent connections from my Worker Role to my Web Role?

    b) are the IIS settings for a Web Role adequate out of the box?  Note that I have applied the SignalR performance changes supplied in the Wiki.

    c) are there Worker Role configurations / limitations with the number of concurrent connections I can make to a single source?  Note that I applied the system.net configuration to allow a max of 1000.

    d) is the Cloud Service size inhibiting me in any way?  Both are set to "Medium" size, which is 2 cores and 3.5 GB.  Am I short-changing anything by staying small?  The idea was to find the limits of this server size and then be able to add more instances in real time as needed.
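
    For reference, the system.net setting mentioned in (c) is typically expressed in the Worker Role's app.config along the lines below.  This is only a sketch: the value 1000 comes from the post, but the wildcard address is an assumption, since the actual configuration file is not shown.

    ```xml
    <configuration>
      <system.net>
        <connectionManagement>
          <!-- address="*" is an assumption: it applies the limit to every remote host. -->
          <!-- maxconnection="1000" is the maximum stated in (c). -->
          <add address="*" maxconnection="1000" />
        </connectionManagement>
      </system.net>
    </configuration>
    ```

    The same limit can also be set in code through ServicePointManager.DefaultConnectionLimit before any connection is opened.  Either way, this setting only caps outbound HTTP connections per host, so it would not by itself rule out other limits on the client or the server.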

    It should be stated that if I add instances, I can get past this limitation.  But I want to understand why my current bottleneck is 200.

    Any ideas or comments are welcome.  I'm kind of stuck.

    Tuesday, January 19, 2016 10:27 PM


All replies

  • User61956409 posted

    Hi gregheidorn,

    According to your description, the problem is related to Azure, so you could try asking for help on the Azure forums.


    Best Regards,

    Fei Han

    • Marked as answer by Anonymous Thursday, October 7, 2021 12:00 AM
    Wednesday, January 20, 2016 7:13 AM
  • User-1625935058 posted

    I will cross-post there as well.  I was hoping to attract some SignalR expertise here.

    Wednesday, January 20, 2016 4:42 PM