Answered Streaminsight vs event store

  • Monday, November 26, 2012 6:11 AM
     
     
    We use NServiceBus heavily to publish and subscribe to various events in our system. We need to store these events and analyze them. We started looking at geteventstore.com as our storage mechanism so that we could browse and analyze the events with a tool we use called Tableau. How does StreamInsight compare to geteventstore.com? I realize the latter does not provide analysis, but is that the main advantage? Additionally, what is the write performance? Any benchmarks? Thank you in advance.

All Replies

  • Monday, November 26, 2012 4:52 PM
     
     

    Well, you are comparing apples and oranges. They are kinda similar (both are fruits) but very different.

    First, geteventstore.com is a store. StreamInsight is not. StreamInsight operates on streams of events in memory, in near real-time, not on stored events. It tells you what is happening now. GetEventStore (from what little I've read so far) operates on stored events and provides historical analysis of what already happened.

    That said, your output adapters/sinks in StreamInsight can feed a data store for historical analysis. It could certainly be the initial source for data that goes into GetEventStore from NServiceBus (which would be an input adapter/source). In this case, StreamInsight can do additional analysis on the data before storage. This could be downsampling, aggregating or correlating events. For example, if you have a data source that provides 1 sample/second (1 Hz), it's feasible to store every value. When you have 10,000 such sources, it becomes less feasible. StreamInsight could downsample these events and store only unique values (on change) or aggregated values. Or you could have a data source that operates at 10 kHz (10,000 events/sec) ... you wouldn't want to save every value there either. StreamInsight does this very well. You may also have several sources that are related to each other and need to do additional calculations or correlation before storing ... again, StreamInsight does that very well.

    There are also those cases where you want the minimum amount of latency ... anything that involves storage to disk greatly increases your latency because disk is inherently slow. Since StreamInsight operates in memory, your latency is limited by your CPU speed and memory throughput ... not by disk ... as well as by how fast you can get the data to StreamInsight from its source (for example, network bandwidth).

    Write performance isn't inherent to StreamInsight; it is a function of the write performance of your sink's target. Event throughput, however, is very, very high. We've tested over 100,000 events/sec using StreamInsight Premium on a dual quad-core Xeon ... at around 32% average CPU utilization.
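
    To make the downsampling options mentioned above concrete (store on change, or store an aggregate per time window), here is a minimal sketch in plain Python. It is not the StreamInsight API; the names on_change and window_average and the (timestamp, value) sample shape are assumptions chosen for illustration.

```python
from typing import Iterable, Iterator, Tuple

# Illustrative only: plain Python, not the StreamInsight API.
Sample = Tuple[float, float]  # (timestamp in seconds, value)


def on_change(samples: Iterable[Sample]) -> Iterator[Sample]:
    """Yield a sample only when its value differs from the previous value."""
    last = object()  # sentinel that never equals a real value
    for ts, value in samples:
        if value != last:
            yield ts, value
            last = value


def window_average(samples: Iterable[Sample], width: float) -> Iterator[Sample]:
    """Average values over consecutive fixed-width windows.

    Assumes samples arrive sorted by timestamp. Yields (window_start, average).
    """
    window_start = None
    total, count = 0.0, 0
    for ts, value in samples:
        start = (ts // width) * width  # start of the window this sample falls in
        if window_start is not None and start != window_start:
            yield window_start, total / count  # close the previous window
            total, count = 0.0, 0
        window_start = start
        total += value
        count += 1
    if count:
        yield window_start, total / count  # flush the final window
```

    Either generator could sit in front of whatever writes to the store, so that only the reduced stream reaches disk.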

    I like to compare StreamInsight and the store/retrieve paradigm using the analogy of driving down the freeway. As you drive, you are looking around at everything going on around you and your brain is, in real-time, analyzing traffic and everything else going on around you. Some details are important and followed actively, some are stored away for later retrieval, but the vast majority is discarded by your brain when it is no longer necessary. StreamInsight works in a similar fashion and, like your brain, inherently understands time as a dimension. A store/retrieve paradigm is like driving with a digital camera ... you take a picture and then look at it. Now, you may be taking a lot of pictures very quickly, but there's still latency involved, so you may not be able to respond to fast-moving events in time. And, well, you wind up with a bunch of saved pictures that really aren't important ... so you have to have some way of going back and deleting those old pictures that you no longer need. And time is an attribute of the picture that is helpful, but it's not an inherent dimension of the data itself.

    You can also use an event store with StreamInsight to "replay" events ... kinda like a DVD player ... but at higher speed and keeping the original timestamps ... so a super fast forward on the DVD player. (Example: we analyzed about 4 months' worth of sensor data using StreamInsight's temporal analysis in about 45 minutes.)
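
    The idea behind replay at higher speed is that the analysis runs on the timestamps carried by the events, not on the wall clock. A rough sketch in plain Python (again, not the StreamInsight API; replay, gaps and the (timestamp, payload) event shape are hypothetical names for illustration):

```python
from typing import Iterable, Iterator, Tuple

# Illustrative only: plain Python, not the StreamInsight API.
Event = Tuple[float, str]  # (original timestamp in seconds, payload)


def replay(stored: Iterable[Event]) -> Iterator[Event]:
    """Emit stored events in timestamp order, as fast as the consumer reads them."""
    yield from sorted(stored, key=lambda e: e[0])


def gaps(events: Iterable[Event], max_gap: float) -> Iterator[Tuple[float, float]]:
    """Report (start, end) of every silence longer than max_gap.

    The gap is measured in event time (the original timestamps), so the
    result is the same whether events arrive live or are replayed quickly.
    """
    prev = None
    for ts, _payload in events:
        if prev is not None and ts - prev > max_gap:
            yield prev, ts
        prev = ts
```

    Because nothing here sleeps or reads the system clock, months of stored events can be pushed through as quickly as the machine can process them.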

    I wouldn't look at StreamInsight and GetEventStore necessarily as an either-or. They do different things and can actually be quite complementary, with StreamInsight providing real-time analytics, event detection, storage optimization and correlation while GetEventStore provides longer-term trending.

    Make sense?


    DevBiker (aka J Sawyer)
    Microsoft MVP - Sql Server (StreamInsight)


    Ruminations of J.net


    If I answered your question, please mark as answer.
    If my post was helpful, please mark as helpful.

  • Monday, November 26, 2012 5:57 PM
     
     
    Awesome. Thanks a lot for the detailed explanation; it really helps. Just one last piece: without the need for in-memory analysis or pre-storage logic, would you still use StreamInsight with, say, SQL Server storage and then somehow use StreamInsight to analyze the data later from storage? Or would you say that at this point StreamInsight is not meant for reporting and other tools are better suited?
  • Wednesday, November 28, 2012 3:40 AM
     
     Answered

    Well, StreamInsight doesn't do storage. It does require a SQL Server license but it doesn't need any other SQL Server components. Now, you wouldn't use StreamInsight to directly feed traditional reports ... monitoring dashboards, certainly, but not a direct feed to a report. Traditional reports are still a request-response paradigm: you ask the question (query) of data that is stored somewhere, get your answer (results) and you're done. With StreamInsight you ask the question (LINQ query) of data in-flight (stream) and it continually gives you the answer; you aren't done until you stop the query.
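
    The difference between the two paradigms can be sketched in a few lines of plain Python. This is only an analogy, not StreamInsight or LINQ code; query_stored and standing_query are hypothetical names.

```python
from typing import Iterable, Iterator, List

# Illustrative only: plain Python, not StreamInsight or LINQ.


def query_stored(values: List[float]) -> float:
    """Request-response: one question over data at rest, one answer, done."""
    return max(values)


def standing_query(stream: Iterable[float]) -> Iterator[float]:
    """Continuous: emit the running maximum every time a new event arrives.

    The query keeps producing answers until the stream ends or the
    consumer stops reading.
    """
    current = None
    for value in stream:
        if current is None or value > current:
            current = value
        yield current
```

    The first function returns once; the second keeps answering for as long as events keep flowing.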

    That said, it is perfectly feasible to replay stored data through StreamInsight, with the results then stored and later retrieved for a report. The temporal semantics of StreamInsight make it very easy to do things that are more difficult to do in traditional SQL queries. For example, if you want a 1-hour moving average updated every 5 minutes ... easy in StreamInsight. If you want to do calculations and have reports based on things like rate-of-change or change-from-previous ... again, easy in StreamInsight. These kinds of things typically require cursors and all kinds of manipulations in SQL queries to make happen. Again, it's not an either-or but an "it depends" ... and without knowing more about your use case and scenarios, I can't really say if StreamInsight would be a good fit for your solution or not.
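
    For readers unfamiliar with these temporal operations, here is roughly what a moving average over a hopping window and a rate-of-change calculation compute, written in plain Python rather than StreamInsight LINQ. The function names and the (timestamp, value) sample shape are assumptions, and the window scan is written for clarity, not speed.

```python
from typing import Iterable, Iterator, List, Tuple

# Illustrative only: plain Python, not the StreamInsight API.
Sample = Tuple[float, float]  # (timestamp in seconds, value)


def hopping_average(samples: List[Sample], window: float, hop: float) -> Iterator[Sample]:
    """Every `hop` seconds, average the samples from the preceding `window` seconds.

    Assumes samples are sorted by timestamp. Yields (window_end, average);
    windows that contain no samples are skipped.
    """
    if not samples:
        return
    first, last = samples[0][0], samples[-1][0]
    end = (first // hop + 1) * hop  # first hop boundary after the first sample
    while end <= last + window:     # last window that can still contain a sample
        values = [v for ts, v in samples if end - window <= ts < end]
        if values:
            yield end, sum(values) / len(values)
        end += hop


def rate_of_change(samples: Iterable[Sample]) -> Iterator[Sample]:
    """Yield (timestamp, change per second) relative to the previous sample."""
    prev = None
    for ts, value in samples:
        if prev is not None and ts > prev[0]:
            yield ts, (value - prev[1]) / (ts - prev[0])
        prev = (ts, value)
```

    A 1-hour moving average updated every 5 minutes would correspond to hopping_average(samples, window=3600.0, hop=300.0) in this sketch.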


    DevBiker (aka J Sawyer)
    Microsoft MVP - Sql Server (StreamInsight)



    • Marked As Answer by TimJohnson_1 Wednesday, November 28, 2012 4:29 PM