Multiple or Single Input Sink

  • Thursday, February 14, 2013 8:42 PM
     
     

    Hello,

    My current application takes bursty data from multiple (100+) TCP streams, puts it into a single concurrent list, and then decodes and analyzes the packet contents to provide various real-time dashboard KPIs. I also aggregate the KPIs and insert them into SQL Server for historical analysis.

    I would like to investigate changing the solution to use StreamInsight, in the hope of managing the incoming data more efficiently and making dashboard reporting more flexible. My question is how to go about doing this:

    Should I create an input sink for each TCP socket, or continue to manage a list that joins all the TCP data and feeds a single sink?

    As well as averaging and reporting KPIs across all the TCP sockets to provide a network-wide view of performance, I want the ability to drill down and query individual TCP streams. What would be a good approach to achieve this?

    Thanks for any pointers you can provide!

All Replies

  • Thursday, February 14, 2013 10:03 PM
     
     Proposed Answer

    What does your data look like? This is important to know so you can create a proper payload class for use in StreamInsight. At a minimum, it will contain some kind of TCP stream identifier that identifies which event belongs to which TCP stream.
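    As a rough illustration of the kind of payload meant here, the sketch below uses a flat Python record rather than a StreamInsight payload class. The class name `StreamEvent` and every field other than the stream identifier are assumptions made for illustration; the real fields depend on what the decoded packets contain.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class StreamEvent:
    """Hypothetical flat payload: one decoded measurement from one TCP stream."""
    stream_id: str       # identifies which TCP stream the event belongs to
    timestamp: datetime  # assumed: application time of the measurement
    metric: str          # assumed: name of the measurement, e.g. "power"
    value: float         # assumed: decoded numeric reading


# Example event; all values are made up.
event = StreamEvent(
    stream_id="stream-017",
    timestamp=datetime(2013, 2, 14, 20, 42, tzinfo=timezone.utc),
    metric="power",
    value=3.2,
)
```

    Carrying the stream identifier on every event is what later allows both network-wide aggregates and per-stream drill-down from the same data.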

    1. Are all these TCP streams connecting to the same socket? If you are using the Rx-based approach in StreamInsight 2.1, inputs are sources while sinks are outputs. You would need to create an input source for each socket you want to accept data on.

    2. You could create a WCF sink that would allow your dashboard(s) to subscribe to the data they want to display. By default, a dashboard would subscribe to the aggregate data. To drill down into a specific TCP stream, it would subscribe to that stream's events, and the WCF sink would handle filtering so that only the relevant events are delivered.

    • Proposed As Answer by TXPower125 Friday, February 15, 2013 4:27 AM
  • Friday, February 15, 2013 4:03 AM
     
     

    I have multiple sensors, each with its own socket, sending data back to the server. The data is nothing more than a byte array containing the sensor identifier and a payload, which I then have to decode to get status information on sensor conditions.

    As an example, I display (among other metrics) the average power level across all the sensors over the past minute, but I also want to monitor the individual sensors and trigger an alert when any sensor's power level exceeds a certain threshold over a 30-second window.
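    To make the two requirements concrete, here is a minimal plain-Python sketch of both windows. It is not StreamInsight code. The class `SensorWindows` and its parameters are hypothetical, events are assumed to arrive in timestamp order, and "exceeds a threshold over a 30-second window" is interpreted as the 30-second average exceeding the threshold, which is only one possible reading.

```python
from collections import defaultdict, deque


class SensorWindows:
    """Hypothetical sketch: a 60 s network-wide average plus 30 s per-sensor alerts."""

    def __init__(self, network_window=60.0, sensor_window=30.0, threshold=5.0):
        self.network_window = network_window
        self.sensor_window = sensor_window
        self.threshold = threshold
        self._network = deque()                # (ts, power) for all sensors
        self._per_sensor = defaultdict(deque)  # sensor_id -> (ts, power)

    @staticmethod
    def _evict(window, cutoff):
        # Drop readings at or before the cutoff time.
        while window and window[0][0] <= cutoff:
            window.popleft()

    def add(self, sensor_id, ts, power):
        # Assumes timestamps (in seconds) never go backwards.
        self._network.append((ts, power))
        self._per_sensor[sensor_id].append((ts, power))
        self._evict(self._network, ts - self.network_window)
        self._evict(self._per_sensor[sensor_id], ts - self.sensor_window)

    def network_average(self):
        if not self._network:
            return None
        return sum(p for _, p in self._network) / len(self._network)

    def sensor_alert(self, sensor_id):
        window = self._per_sensor.get(sensor_id)
        if not window:
            return False
        return sum(p for _, p in window) / len(window) > self.threshold


w = SensorWindows(threshold=5.0)
w.add("a", 0.0, 4.0)
w.add("b", 10.0, 2.0)
w.add("a", 20.0, 8.0)
avg_before = w.network_average()  # average of 4.0, 2.0 and 8.0
alert_a = w.sensor_alert("a")     # sensor "a" averages 6.0 over its window
alert_b = w.sensor_alert("b")     # sensor "b" averages 2.0
```

    In this sketch windows are only trimmed when a new reading arrives, so a sensor that goes quiet keeps its old readings. A stream engine advances time independently of individual sensors, which is one reason to prefer it over hand-rolled buffers.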

    Hope this makes sense.

  • Friday, February 15, 2013 4:27 AM
     
     

    Yep, makes perfect sense. This is the kind of thing that StreamInsight is built to do. Sounds like a cool project.

  • Friday, February 15, 2013 3:02 PM
     
     

    Does each sensor/socket combination have the same deserialization requirements? Can each one be placed into the same schema (I'm assuming yes for this)? While StreamInsight can certainly handle 100+ input sinks, I would be cautious from a manageability perspective: writing a query where you have 100+ input sinks that you then union, and have to synchronize in time, could be quite challenging. And while my initial, knee-jerk instinct is to do one sink per socket, synchronizing the CTIs across all of those sensors, which report at different event rates, could get ugly. If you do go down the path of multiple sockets into one sink, you'll have to think long and hard about your AdvanceTimePolicy and what kind of delay you'll need to make sure that you handle any late-arriving events from sensors.
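    The trade-off around delay and late-arriving events can be sketched in plain Python. This is not StreamInsight code and does not use its API. The class `DelayedMerger` is hypothetical: it merges events from many sockets into one time-ordered output by holding them back for a fixed delay, and it drops anything that arrives later than that delay allows. Dropping is only one possible policy, chosen here to keep the sketch short.

```python
import heapq


class DelayedMerger:
    """Hypothetical sketch: merge many feeds into one ordered stream with a fixed delay."""

    def __init__(self, delay):
        self.delay = delay
        self._heap = []                  # buffered events, ordered by timestamp
        self._max_seen = float("-inf")   # newest timestamp seen on any socket
        self._seq = 0                    # tie-breaker for equal timestamps
        self.dropped = 0                 # count of events that arrived too late

    @property
    def watermark(self):
        # Events at or before this time are considered complete.
        return self._max_seen - self.delay

    def push(self, ts, sensor_id, value):
        """Buffer one event and return the events that are now safe to release."""
        if ts < self.watermark:
            self.dropped += 1            # too late: output has already moved past ts
            return []
        self._max_seen = max(self._max_seen, ts)
        heapq.heappush(self._heap, (ts, self._seq, sensor_id, value))
        self._seq += 1
        released = []
        while self._heap and self._heap[0][0] <= self.watermark:
            ts_out, _, sid, val = heapq.heappop(self._heap)
            released.append((ts_out, sid, val))
        return released


m = DelayedMerger(delay=5.0)
out = []
out += m.push(10.0, "a", 1.0)  # buffered
out += m.push(8.0, "b", 2.0)   # out of order but within the delay, buffered
out += m.push(16.0, "a", 3.0)  # moves the watermark to 11.0, releasing 8.0 and 10.0
out += m.push(9.0, "b", 4.0)   # behind the watermark, dropped
```

    A larger delay tolerates slower sensors but makes every dashboard figure that much staler, which is the tension described above.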

    Make sense?


    DevBiker (aka J Sawyer)
    Microsoft MVP - Sql Server (StreamInsight)



  • Friday, February 15, 2013 3:50 PM
     
     

    Every byte array will be decoded in the same manner. Different measurements are reported in each packet at differing rates, so a sensor may report power measurements at one point in time, light levels at another, and perhaps nothing at all for a period of time. My dashboard needs to show various user-defined measurements (such as average power or maximum light level), so at times there may be nothing to report for the current interval, which would indicate a potential issue.
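    One way to picture the "nothing to report" case is to track when each sensor last reported each measurement and flag the ones that have gone quiet. The sketch below is plain Python under stated assumptions: the class `SilenceDetector`, the timeout value, and the sensor and metric names are all hypothetical.

```python
class SilenceDetector:
    """Hypothetical sketch: flag (sensor, metric) pairs with no recent reading."""

    def __init__(self, timeout):
        self.timeout = timeout  # seconds of silence tolerated
        self._last_seen = {}    # (sensor_id, metric) -> latest timestamp

    def observe(self, sensor_id, metric, ts):
        key = (sensor_id, metric)
        # Keep the newest timestamp even if readings arrive out of order.
        self._last_seen[key] = max(ts, self._last_seen.get(key, ts))

    def silent(self, now):
        """Return the pairs whose last reading is older than the timeout."""
        return sorted(
            key for key, ts in self._last_seen.items()
            if now - ts > self.timeout
        )


d = SilenceDetector(timeout=60.0)
d.observe("s1", "power", 0.0)
d.observe("s1", "light", 50.0)
d.observe("s2", "power", 90.0)
quiet = d.silent(now=100.0)  # only ("s1", "power") has been silent for more than 60 s
```

    Note that this only knows about pairs that have reported at least once; a sensor that never connects would need a separate list of expected sensors.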

    I only stumbled on StreamInsight in the past week. From a high-level read it seemed to fit the design need, but I now have to dig into the nuts and bolts and wasn't clear on how to approach the problem. It's good to hear that StreamInsight should be a good solution for this task, though.

    Thanks for your advice.