locked
Saved state in Stream queries? RRS feed

  • Question

  • Hi,

    Is it possible to write a stream analytics query, such that it returns a result whenever a value changes?

    For example:

    Take a stream of events. Write a query that return true/false when Event named "X" has occurred 5 times in the past 2 days.

    Is there anyway to signal that the output of this query has changed from false to true, without polling the output?

    Ideally, I'd love a message to be sent to an output eventhub whenever False changes to True, AND when it goes back again.

    Thanks,


    Andy

    Monday, March 30, 2015 7:59 AM

Answers

All replies

  • Andy, you don't necessarily need to save the state to do what you described. You can use a self-join on the true/false event stream can do comparison. The query can be a bit complicated though. However, in the next production deployment, we will introduce a new analytics function called Lag, which will allow you compare with the event right before. Stay tuned.
    Saturday, April 4, 2015 10:37 PM
  • Thanks, Very interesting. Would you be able to show me an example. I looked at the join syntax and don't quite see how you would get the old state using the join. Do you just do a tolerance within a small window, like 0.5 seconds? Is this expensive to run in practice I.e. Would this do continuous polling behind the scenes and run up a large bill? Looking forward to the lag feature. What timescale are we looking at? Weeks? Months? Thanks again for your help

    Andy

    Sunday, April 5, 2015 3:31 AM
    • Marked as answer by Zafar Abbas Thursday, April 23, 2015 6:18 PM
    Friday, April 17, 2015 5:45 PM
  • Hi Zhong,

    I am trying to use LAG function but it seems to be only working with the fields which are at the top level in JSON e.g. 

    This will not work (Invalid column name 'Attributes.ABC')-

    SELECT Attributes.ABC, 
      LAG(Attributes.ABC) OVER (LIMIT DURATION(ss, 5)) as previousA
    FROM Input TIMESTAMP BY [TimeStamp]

    But this will work -

    SELECT XYZ, 
      LAG(XYZ) OVER (LIMIT DURATION(ss, 5)) as previousA
    FROM Input TIMESTAMP BY [TimeStamp]

    Wonder if nested fields are not supported in LAG function?

    Update: using brackets around the field names avoids the error but still no output-

    SELECT [Attributes.ABC], 
      LAG([Attributes.ABC]) OVER (LIMIT DURATION(ss, 5)) as previousA
    FROM Input TIMESTAMP BY [TimeStamp]

    Thanks


    • Edited by SuneetN Thursday, May 14, 2015 8:24 AM
    Thursday, May 14, 2015 8:19 AM
  • Update: I spoke to one of our assigned MS CSA (Rupert) around this and the problem now has work around which I've blogged about here... if anyone is interested. I guess Rupert will speak to the product team to see if this a bug or designed like this.

    http://suneet.net/azure-stream-analytics-lag-and-lag/

    Cheers

    Monday, May 18, 2015 11:09 AM