locked
LAG unexpected behaviour RRS feed

  • Question

  • Hi!

    My query of Stream Analytics have an unexpected behaviour and I don't know what is happening.

    The purpose of the query is the following:

    I have devices that perform a some tests. The result of this tests is compared with thresholds data (reference data) inside blob storage. Depending on the threshold surpassed the event is marked like critical (alertLevel = 1) or warning (alertLevel =2).

    The critical events always generate an alert (isAlert = 1) but for the warning events, I only want to generate an alert with two consecutive warning events for the same device and test. Each test generate a value every 3 minutes.

    I use lag with this purpose, to search the existance of the previous event marked as warning (alertLevel =2) with the same deviceId and idtest of the current warning event within a limit of 4 minutes. If the lag operator return a value not NULL, then it's the second warning event in a row for the same device and test. Then I mark this event as an alert (isAlert = 1).

    In fact, to perform this I only need to execute lag once (to calculate prevAlertResult). I have added twoPrevAlertTime and twoPrevAlertResult to make more clear what is happening.

    WITH alertStep1 AS
    (
        SELECT
        CAST(I1.testResult AS float) AS testResult,
        CAST(I1.linkSpeed AS float) AS linkSpeed,
        CAST (bl.warnValue as float) AS warnValue,
        CAST (bl.critValue as float) AS critValue,
        0 AS isDiscarded,
        I1.*,
        CASE WHEN (I1.idTest = 1 AND (I1.testResult > CAST(bl.critValue as float))) OR (I1.idTest != 1 AND (I1.testResult < CAST(bl.critValue as float))) OR (I1.testResult = -1) THEN 1 ELSE 2 END as alertLevel
    	FROM testdataeventhubinput I1 TIMESTAMP BY eventTime INNER JOIN alertthersholdblobinput bl on ((I1.deviceId = bl.deviceId) AND (I1.idTest = CAST(bl.testId as BIGINT)))
    	WHERE (I1.idTest = 1 AND (I1.testResult > CAST(bl.warnValue as float))) OR (I1.idTest != 1 AND (I1.testResult < CAST(bl.warnValue as float))) OR (I1.testResult = -1)
    ),
    
    alertStep2 AS
    (
    SELECT
        *,
        CASE
        WHEN (alertLevel = 2) THEN LAG(eventTime, 1) OVER (PARTITION BY deviceId,idTest,alertLevel LIMIT DURATION (Second, 210)) 
        ELSE NULL
        END
        AS prevAlertTime,
        CASE
        WHEN (alertLevel = 2) THEN LAG(eventTime, 2) OVER (PARTITION BY deviceId,idTest,alertLevel LIMIT DURATION (Second, 210)) 
        ELSE NULL
        END
        AS twoPrevAlertTime,
        CASE
        WHEN (alertLevel = 2) THEN LAG(testResult, 1) OVER (PARTITION BY deviceId,idTest,alertLevel LIMIT DURATION (Second, 210))
        ELSE NULL
        END
        AS prevAlertResult,
        CASE
        WHEN (alertLevel = 2) THEN LAG(testResult, 2) OVER (PARTITION BY deviceId,idTest,alertLevel LIMIT DURATION (Second, 210))
        ELSE NULL
        END
        AS twoPrevAlertResult
        FROM alertStep1 aS1
    )
        
    SELECT
        *,
        CASE
        WHEN (alertLevel = 1) THEN 1
        WHEN (prevAlertTime IS NOT NULL) THEN 1
        ELSE 0
        END AS isAlert,
     
    INTO
        alertstableoutput
    FROM
        alertStep2

    This query works pretty well the first few days. One day or two after starting the Stream Analytics job (and the issue always start with a change of day) it stop working properly. The lag operator starts to return always the value of the current event and not the previous event.

    This is the output of Stream Analytics when the LAG operator it's not working:

    Look at the value of prevalerttime and twoprevalerttime that matches eventtime. 


    If I restart the Stream Analytics job the lag operator works again and the alerts are recalculated well.

    This is the output of Stream Analytics after the restart of the job:


    Note: In this screenshot the columns twoPrevAlertTime and twoPrevAlertResult doesn't appear because the value is always null. This is the expected result. As each test generate a value every 3 minutes, it's impossible to find the two previous value within a range of 4 minutes.

    I can't be restarting the Stream Analytics job every day and I don't know why is this happening.

    Any ideas?

    Thank you!


    • Edited by MarinaMT Wednesday, July 27, 2016 1:56 PM
    Wednesday, July 27, 2016 1:52 PM

All replies

  • Anyone?
    Monday, August 1, 2016 9:58 AM
  • Apologies for the delay in getting back to you. Will need some additional info from you, could you please email us at askasa at Microsoft dot com so that we can get these additional details from you and investigate the issue for you.
    Tuesday, August 2, 2016 10:02 PM