aggregated values all of a sudden multiplied by 8, requiring restart of ASA stream

  • Question

  • I've had an ASA query running for about 2 weeks without any problems.

    It is a query which groups events from 2 event hubs by hour, by means of a tumbling window, aggregating 3 values: a SUM over a float, a SUM over a bigint, and a COUNT. It uses 3 reference data JSON files.

    The result is sent to an Azure SQL Database.
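
    In outline the query has this shape: a minimal sketch, simplified to one event hub input and one reference data join (the second event hub input and the other two reference inputs are omitted), and with all input, column, and output names made up, not the real job's names:

    -- Illustrative sketch only: [eventhub-input], [reference-input],
    -- [azure-sql-output] and the column names are placeholders.
    SELECT
        System.Timestamp    AS window_end_utc,  -- end of the 1-hour tumbling window
        SUM(e.float_value)  AS sum_float,       -- the float SUM
        SUM(e.bigint_value) AS sum_bigint,      -- the bigint SUM
        COUNT(*)            AS counts           -- the COUNT
    INTO [azure-sql-output]                     -- the Azure SQL Database output
    FROM [eventhub-input] e TIMESTAMP BY event_timestampUTC
    JOIN [reference-input] r ON r.device_id = e.device_id
    GROUP BY TumblingWindow(hour, 1), r.device_group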

    This worked perfectly until Friday the 10th of November, 6:00 AM UTC. After that (as of the next window, the 7:00 AM UTC tumbling window) all aggregated values were multiplied by 8!

    Nothing was changed in the query. Nevertheless, I started to analyse everything: the query, the input data, the reference data. I sampled data from the stream input and ran this via Test mode. It gave the expected correct output, not the output the running ASA query gave!

    It was on Monday around 1 PM that I started the analysis. That was also the moment I stopped the ASA query. At the end of the evening I was desperate and decided to just start the query again to see what would happen.

    I restarted the stream as of the Last Stopped Time. The result was that the query ran as expected again!

    This meant all data between 7 AM Friday and 1 PM Monday was multiplied by 8. By restarting the stream the problem seemed to be resolved.

    This was really driving me crazy! I decided to stop the stream again, delete the corrupted data from the SQL Database, and restart the stream again with start time 7 AM on Friday. The result: NO output at all!

    The above case is the second time we have encountered inconsistent behaviour with Stream Analytics. A couple of weeks ago a colleague of mine also went crazy over a certain query, which by accident recovered to normal behaviour after a few days!

    I hope the ASA team contacts me so they can solve this problem! These kinds of problems give me the feeling that I am working with immature software, which I cannot accept because we are building an application for commercial customers.


    • Edited by FIC_Leader Tuesday, November 15, 2016 8:31 PM
    Tuesday, November 15, 2016 8:23 PM

All replies

  • Hi FIC_Leader,

    In order to understand the root cause of the issues you described, the wrong aggregate results and no output after restart, we need detailed information about your job and a thorough investigation. May I suggest you open a support case so that an engineer from the product team will follow up with you?

    Regards,

    Min He, Program Manager, Azure Stream Analytics


    This posting is provided "AS IS" with no warranties, and confers no rights.

    Tuesday, November 15, 2016 9:36 PM
  • OK, I figured out the following:

    If I restart the query at 12:00 (or earlier) on that Friday: no output.

    Restarting the query at 13:00 that Friday: gives output, and correct output!


    • Edited by FIC_Leader Tuesday, November 15, 2016 10:28 PM
    Tuesday, November 15, 2016 10:28 PM
  • So I let it run overnight to see in the morning what had happened.

    Just after the start I see a peak of output events.

    And then, again, all of a sudden the output stops. The last output was the data for 15 November, 10 AM.


    • Edited by FIC_Leader Wednesday, November 16, 2016 12:34 PM
    Wednesday, November 16, 2016 8:27 AM
  • Hi FIC_Leader,

    As I have not yet seen a support case opened by you, here are the general troubleshooting steps:

    1) Log on to portal.azure.com

    2) Locate your Stream Analytics job

    3) Under the job find "Diagnose and solve problems"

    4) Go through the listed steps under "My job is not outputting data"

    If you still cannot figure out the root cause of why you are not getting the expected output, I would suggest you contact Microsoft support and provide your Stream Analytics job information, so that we can investigate.


    This posting is provided "AS IS" with no warranties, and confers no rights.

    Thursday, November 17, 2016 6:15 AM
  • Hi Min He,

    I just submitted a support ticket.

    This is what I did after my last post.

    See also: https://social.msdn.microsoft.com/Forums/en-US/68c2b5c3-a44f-4fa8-a246-f08bdaa24f9c/maximum-event-hub-receivers-exceeded-only-5-receivers-per-partition-are-allowed-please-use-a?forum=AzureStreamAnalytics

    I started to strip the query down in order to find the problematic subquery. Output goes to a debug output (a blob in a Storage Account).

    Every time, I restart the query as of the same moment, 11 November 11:00 AM, in order to have the same test conditions for all steps.

    The first subquery is this, and it works! It gives output, no degraded messages, etc.

    WITH Source AS
    (
        -- Raw events from the event hub input, timestamped on the event's own UTC time
        SELECT
            *
        FROM
            [wgperform-wgvision-unloaded] TIMESTAMP BY event_timestampUTC
    )
    , InputFlattened AS
    (
        -- Flatten the events and join the reference data
        SELECT
            wp.IdWorkplace           AS workplace_id,
            w.Uri                    AS workplace_uri,
            w.IdSite                 AS idsite,
            l.site_publisher         AS site_publisher,
            l.timezone_offset        AS timezoneoffset,
            l.data.post.customercode AS customercode,
            l.data.post.categorycode AS categorycode,
            l.data.post.weight       AS weight,
            l.event_timestampUTC     AS event_timestamputc
        FROM
            Source l
        INNER JOIN [workplace-publisher] wp ON wp.PublisherUri = l.site_publisher
        INNER JOIN [workplace] w ON w.Id = wp.IdWorkplace
        WHERE wp.Settings = 'unloaded' AND wp.IdCalculationType = 2
    )

    Then I add this:

    , Proces0 AS   -- appended to the WITH clause above; note the leading comma
    (
        -- Hourly tumbling-window aggregation per workplace/customer/category
        SELECT
            System.Timestamp AS event_timestamputc,
            SUM(bi.weight)   AS weight,
            COUNT(*)         AS counts,
            bi.workplace_id,
            bi.workplace_uri,
            bi.idsite,
            bi.site_publisher,
            bi.timezoneoffset,
            bi.customercode,
            bi.categorycode
        FROM InputFlattened bi
        GROUP BY
            TumblingWindow(hour, 1),
            bi.workplace_id,
            bi.workplace_uri,
            bi.idsite,
            bi.site_publisher,
            bi.timezoneoffset,
            bi.customercode,
            bi.categorycode
    )
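
    For completeness: each test run ends with a final select that writes the last step to the debug blob, along these lines (the [debug-blob] output alias here is illustrative, not the real output name):

    SELECT * INTO [debug-blob] FROM Proces0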

    This fails: No output, Degraded messages...

    What is wrong?


    • Edited by FIC_Leader Friday, November 18, 2016 9:49 AM
    Friday, November 18, 2016 9:48 AM