Answered Output adapter random? dance

  • Monday, March 04, 2013 3:57 PM
     
     

    Hello again,

    I have an interesting situation here. I installed the premium edition of SI. I implemented a push input adapter that receives events at a rate of approx 10k. I'm using an in-process server and I logged pretty much everything I could think of. I wrote a simple "select from" query and let it running for one minute.

    I found that the input adapter (IA) never enters a Suspended state. The output adapter however hits Resume over 11300 times in 60 seconds. Is this normal? To be more specific, what's the time interval at which the output adapter pulls data from the server?

    Keeping that question in mind, going between Running and Suspended seems to be pretty random for the output adapter,  i.e., right after I start the query:

    - The output adapter (OA) becomes Suspended, having not received any events yet from the server. The input adapter hasn't received anything yet as well.

    - OA resumes and outputs one event. Then it enters Suspended state! THis is indeed surprising, because the input adapter has sent 1100 tuples to the SI server. Also, between the moment that one event was enqueued and its dequeueing time, 140 milliseconds have passed, which is quite long.

    - the OA resumes and processes 3 events, then it becomes Suspended again. At this time, over 1200 events have been enqueued by the IA since the query was started. 

    - the OA continues to dance like this, alternating Suspended and Running states, and only processing a small amount of input events. Meanwhile, the IA continues to push events to the server.

    - then the OA processes about 1000 events until it becomes Suspended again.

    ANd it goes on and on. I ran the query multiple times, each time the behavior was different, i.e., sometimes the OA output 1000 events and only after that went into Suspended, etc.

    WHat's more, every time the OA Resumes, events seem to arrive later at the sink (the difference between enqueueing time and dequeueing time can go to a couple of thousands of milliseconds), but after the SI processes a (variable) number of events this latency is drastically reduced. Until the next time the OA resumes, when it reaches astronomic figures again.

    Also, I found batches of output events with the exact same latency (10 consecutive events processed in 2.9298 ms, the next three processed in 0? ms, aso asf).

    What do you think about this?

    Cheers




All Replies

  • Monday, March 04, 2013 5:14 PM
     
     

    How are you enqueuing your CTIs?

    The OA will enter a suspended state when the current output queue is drained (returns DequeueOperationResult.Empty). When this happens, you call "Ready()" to tell StreamInsight when you are ready for more. It will then wake you up (by calling Resume()) when you have more events to dequeue.

    The IA will enter a suspended state when an enqueue operation returns EnqueueOperationResult.Full. This means that there are 100,000 events in the queue waiting to be processed. You can actually get this to happen ... but it does take some doing.

    I have to wonder if your adapter threads are contending with the StreamInsight processing threads and that's what is causing your latency. How many cores do you have? Keep in mind that HyperThreading isn't a real core - it's a bit of voodoo majick that Intel pulls to make it seem like you have another core by using time that would otherwise be idle for other tasks. If, however, the core wouldn't be idle ... it does you no good. See http://en.wikipedia.org/wiki/Hyper-threading.


    DevBiker (aka J Sawyer)
    Microsoft MVP - Sql Server (StreamInsight)


    Ruminations of J.net


    If I answered your question, please mark as answer.
    If my post was helpful, please mark as helpful.

  • Monday, March 04, 2013 5:58 PM
     
     

    I'm enqueueing my CTIs declaratively, by inheriting from ITypedDeclareAdvanceTimeProperties. I enqueue one CTI event after every INSERT event.

    For the OA.. that's exactly what I'm doing. Only that the ConsumeEvents (the writer routine) is called by Resume or Start on a separate worker thread.

    Ok, it's not my goal to flood the input adapter, but it's good to know the 100k limit (although I suppose it depends on the payload as well).

    I can't answer this question now, I'm running this app on a remote windows machine and I see 4 cores, however it is a VM; I'll get back when I have an exact answer.

    To sum up, I don't understand why the OA becomes Suspended so often. And why, when the OA resumes, the latency is huge (up to 4, 5 seconds) and then decreases to almost 1 millisecond (for the most of the time the OA is Running, until it is Suspended again). If it was a hardware issue, wouldn't this big latency be constant?

    Thanks,



    • Edited by winterquery Monday, March 04, 2013 6:01 PM
    • Edited by winterquery Monday, March 04, 2013 6:01 PM
    •  
  • Monday, March 04, 2013 7:03 PM
     
     

    OK ... so you are doing it declaratively. What settings are you using? IncreasingStartTime? StrictlyIncreasingStartTime? Something else? This can make a big difference because you may be dropping events due to CTI violations. How are you determining the StartTime for your inserts?

    Have you tried enabling the performance counters for the queries to see what the counters say? Specifically, the input queue length? The counters and diagnostic views will also show you if any types are being dropped.

    As for the 100K limit, it's a hard limit and doesn't depend on the payload. StreamInsight will queue input events that it can't schedule immediately for processing, typically due to CPU utilization. By the way ... what does your CPU utilization look like? Are you spiked the entire time?


    DevBiker (aka J Sawyer)
    Microsoft MVP - Sql Server (StreamInsight)


    Ruminations of J.net


    If I answered your question, please mark as answer.
    If my post was helpful, please mark as helpful.

  • Monday, March 11, 2013 6:20 PM
     
     

    Hey,

    Every CTI has the timestamp of the previous INSERT and I set the AdvanceTimePolicy to Adjust in the AdapterAdvanceTimeSettings.

    The CPU usage goes to about 98% during query processing. 

    I did enable the perf counters and I found (about 5 minutes after starting the query):

    - at the server level - the average # events in output queues is about 10 * # events in input queue

    - at the query level - there are about 6 times less produced CTIs than insert events - due to CPU perhaps(?)

    - the output adapter's produced events / sec is about 1000 events higher than the input adapter\s incoming events/sec. So maybe here is the key. Since the OA constantly defeats the input adapter it goes to Suspended (almost 100k times in 10 minutes).

    Thanks for all your suggestions.

  • Tuesday, March 12, 2013 3:17 PM
     
     

    Note that Adjust in your AdapterAdvanceTimeSettings only works for interval events - it won't adjust point events. You didn't mention if you were seeing any dropped events.

    And don't worry much about the OA going into suspended mode - as long as it is working properly, you're good. It just means that you are emptying the output queue.

    Now ... with the high CPU utilization ... that may be introducing some latency into your queries. Do your input adapters have multiple threads? Keep in mind that any threads  you create will be contending for CPU time with your StreamInsight queries and the query scheduler - there's only so much CPU to go around and, at 98% utilization, you'll have some contention.


    DevBiker (aka J Sawyer)
    Microsoft MVP - Sql Server (StreamInsight)


    Ruminations of J.net


    If I answered your question, please mark as answer.
    If my post was helpful, please mark as helpful.

  • Wednesday, March 13, 2013 9:59 AM
     
     

    Hi,

    No, I don't see any dropped events, just a smaller (than expected) number of CTIs. 

    Ok, that's one tick off. But is there a way of knowing how often the OA "wakes" up to check the output queue, and, if any event there, Resumes? 

    Are there performance counters for the OA? The Resumes/sec would be interesting to see for the OA as well. The doc doesn't specify any.

    My IA has just one thread, but there are other threads running in the process. But I'm good with the results I get now.

    Thanks,


    • Edited by winterquery Wednesday, March 13, 2013 10:06 AM
    •  
  • Wednesday, March 13, 2013 8:19 PM
     
     Answered
    Question ... when you enqueue your events, do you have several events with the same start time? And are you using AdvanceTimeSettings or enqueuing them in your adapter? If you are using AdvanceTimeSettings by count of 1, it won't enqueue for each event but for each unique start time. So ... if you have 6 events with the exact same start time, you'll only get 1 CTI for the bunch ... and that's enqueued when a new event with a new start time is enqueued.

    DevBiker (aka J Sawyer)
    Microsoft MVP - Sql Server (StreamInsight)


    Ruminations of J.net


    If I answered your question, please mark as answer.
    If my post was helpful, please mark as helpful.

    • Marked As Answer by winterquery Tuesday, March 19, 2013 11:38 AM
    •  
  • Tuesday, March 19, 2013 11:36 AM
     
     

    I'm using the latter. This explanation does clear things up.

    Thanks again.

    Update. I'm gonna mark this last post as the answer, but all the replies in this thread have been of great help and contain 'pieces' of answer.
    • Edited by winterquery Tuesday, March 19, 2013 11:38 AM
    •