Query with a group by : how can I get the StartTime of the Event

Answered Query with a group by : how can I get the StartTime of the Event

  • Monday, November 12, 2012 10:15 AM
     
      Has Code

    Hello,

    I have a query with a group by to compute a Sum group by a specific dimension.

    I have a stream that contains purchase orders. The start time of each event is the date/time of the Purchase Order.

    My query is the following one :

    var orderSum = from p in orderStream 
                   group p by new { p.Country, p.Region } into countryAndRegionGroup
                   from window in porteGroup.HoppingWindow(TimeSpan.FromMinutes(5), TimeSpan.FromMinutes(5), HoppingWindowOutputPolicy.ClipToWindowEnd)             
                  select new {
                    Country=p.Country,
                    Region=p.Region,
                    Number=window.Count(),
                    ComputationDateTime=DateTime.Now
                  };

    This query compute the number of orders, each 5 minutes, that are created in the last 5 minutes (if my query is right :)).

    What I need to provide is the date/time of the Computation of the Count, but in this query I use the system date/time when the computation is generated, but what I need is to provide the Order date/time. If I replace DateTime.Now by p.OrderDateTime, I have an error because this field is not in the group.

    How can I do that ?

    Thanks.


    David GROSPELIER ReachSOA - Architect and co-founder dgr@reachsoa.com http://www.reachsoa.com

All Replies

  • Monday, November 12, 2012 3:41 PM
     
     Answered

    First, since it's an aggregate, you'll have multiple order date/times. Which one are you looking for?

    Second, this query will result in an interval with a start time as the start of the window and the end time as the end of the window. It won't necessarily be one of the date times for the orders in the window; probably won't be, in fact. However, the end time of the window will be the timestamp, in application time, of the calculation. If you want to get the timestamp of the last order, you can use window.Max(e=> e.OrderDateTime).

    I am a little confused, though, about exactly what you are looking for here.


    DevBiker (aka J Sawyer)
    Microsoft MVP - Sql Server (StreamInsight)


    Ruminations of J.net


    If I answered your question, please mark as answer.
    If my post was helpful, please mark as helpful.

    • Marked As Answer by Dgrospelier Friday, November 16, 2012 2:46 PM
    •  
  • Monday, November 12, 2012 4:35 PM
     
     

    Hi,

    What I want exactly is the number of orders that are created in the last 5 minutes, and compute the count every 5 minutes and I want to keep this values because I have multiple applications interested in these indicators (in my case, I store the results in an AppFabric cache). When I store the result, I want to have these values : at 01/01/2012 10:00:00 -> 2 orders was created, at 01/01/2012 10:05:00 -> 4 orders was created, ...

    As the date/time field (01/01/2012 10:00:00) I don't want the date/time when the result is calculated because the orders are not sent in real-time to StreamInsight. I want to know, that, if the orders was sent in real-time, I will have this trend.

    The back-end system sent a bunch of orders every 10 minutes. I received this list of orders via BizTalk Server, process it in BizTalk (out-of-scope of StreamInsight) and after the process, I split the file and send the order to StreamInsight. So the date/time StreamInsight received the order is not the date/time of the order. 

    Is it more clear ? (sorry for the english).




    David GROSPELIER ReachSOA - Architect and co-founder dgr@reachsoa.com http://www.reachsoa.com

  • Tuesday, November 13, 2012 2:49 PM
     
     

    I would put the order creation time into your payload fields as an explicity payload field and not rely on the event timestamp

  • Tuesday, November 13, 2012 3:19 PM
     
     Answered

    You do realize that the timestamp of the events in StreamInsight doesn't have to be the time that the event was received/processed by StreamInsight, right? The times are determined by the input adapter - so you can enqueue the timestamps of the original order, rather than the time it was received (DateTimeOffset.Now). That's why it's call application time ... not system time. If you look at the samples in LinqPad, very few, if any, of them actually use system time for the events in the samples. When you are doing an aggregate window, a 5 minute window is done based on application time (progression of CTIs), not on system time.

    If that's not appropriate (and I'm not sure why it wouldn't be but ...), you would need to have your timestamp in the payload. As I mentioned earlier, you can then use an aggregate to get the timestamp from the payload (Min, Max or a User Defined Aggregate).


    DevBiker (aka J Sawyer)
    Microsoft MVP - Sql Server (StreamInsight)


    Ruminations of J.net


    If I answered your question, please mark as answer.
    If my post was helpful, please mark as helpful.

    • Marked As Answer by Dgrospelier Friday, November 16, 2012 2:46 PM
    •  
  • Friday, November 16, 2012 2:46 PM
     
     

    Hello,

    Yes I know that I can choose what I want to be the StartTime. In my case, I choose the OrderDateTime as the event StartTime. So I'm totally agree with you, I don't need to add the OrderDateTime in the Payload.

    I will try to use the Min or Max value to get the DateTime of the last order in the timeframe of my computation. 

    Thanks.


    David GROSPELIER ReachSOA - Architect and co-founder dgr@reachsoa.com http://www.reachsoa.com