Reading historical data

  • Question

  • In May and June we did a proof of concept with, among other things, Stream Analytics. In this POC we tried to read historical data from a Blob. Normally you can't read data older than 20 days because of the late arrival setting. However, the documentation mentions that setting the late arrival setting to -1 makes it indefinite. You cannot set this in the Stream Analytics GUI, but it is possible when using an ARM template. We tested this with data older than 20 days (to be exact, a month and a few days), and the events were imported and processed successfully (using a GROUP BY with a tumbling window and averages within this window).

    In July we started implementing a new platform in which we use Stream Analytics for the ingestion of real-time data, but also for historical data. Now, 4 months after the POC, when we try to read historical data older than 20 days, no output events are generated, nor do we get late arrival events. The number of input events keeps increasing, but no output events appear.

    Has something changed in the way old data are processed?






    Friday, September 14, 2018 12:03 PM

All replies

  • Can you please let us know the link of the document that you are referring to?
    Friday, September 14, 2018 9:12 PM
    Moderator
  • Hi Johan, 

    Take a look at the thread here:

    https://social.msdn.microsoft.com/Forums/en-US/0d917ed4-d599-46ff-8134-cb5e64cd80ac/processing-a-stream-of-historical-data?forum=AzureStreamAnalytics

    If you were using the testing feature within the Portal or Visual Studio, you may have been able to work with data older than 20 days, as time policies are not enforced there.

    Azure Stream Analytics does not support a late arrival policy of greater than 20 days.

    Friday, September 14, 2018 10:16 PM
    Moderator
  • Hi Johan, 

    Take a look at the thread here:

    https://social.msdn.microsoft.com/Forums/en-US/0d917ed4-d599-46ff-8134-cb5e64cd80ac/processing-a-stream-of-historical-data?forum=AzureStreamAnalytics

    If you were using the testing feature within the Portal or Visual Studio, you may have been able to work with data older than 20 days, as time policies are not enforced there.

    Azure Stream Analytics does not support a late arrival policy of greater than 20 days.

    It does. See here:

    https://docs.microsoft.com/en-us/azure/templates/microsoft.streamanalytics/streamingjobs

    Setting the late arrival setting to -1 allows for an indefinite late arrival. You can also set this value by using resources.azure.com. In May this worked correctly. We tested this with data between one and two months old.

    But it seems as if this no longer works: I've tested with data that is 8 months old and one month old, and with files containing 2 months, 1 month, 2 weeks and 1 week of data. For some reason the input events keep increasing without any output events being generated. When using data that is less than 20 days old, output events are generated within a minute after the input events are processed.
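For reference, a minimal sketch of the job resource fragment that carries this setting, built as a Python dict and printed as JSON. The property names are taken from the ARM template reference linked above as I read it; the helper name, job name, location, API version and the other policy values are placeholder assumptions, not the values from the actual template.

```python
import json

# Upper bound of the finite range, as I read the template reference
# (1814399 seconds, just under 21 days). -1 means "wait indefinitely".
MAX_LATE_ARRIVAL_SECONDS = 1814399


def streaming_job_resource(name, location, late_arrival_seconds=-1):
    """Hypothetical helper: build the streamingjobs resource of an ARM template."""
    if late_arrival_seconds != -1 and not (
        0 <= late_arrival_seconds <= MAX_LATE_ARRIVAL_SECONDS
    ):
        raise ValueError("late arrival delay must be -1 or 0..1814399 seconds")
    return {
        "type": "Microsoft.StreamAnalytics/streamingjobs",
        "apiVersion": "2016-03-01",  # assumption: use the version your template targets
        "name": name,
        "location": location,
        "properties": {
            "sku": {"name": "Standard"},
            "eventsOutOfOrderPolicy": "Adjust",          # placeholder value
            "eventsOutOfOrderMaxDelayInSeconds": 0,      # placeholder value
            "eventsLateArrivalMaxDelayInSeconds": late_arrival_seconds,
            "outputErrorPolicy": "Stop",                 # placeholder value
        },
    }


if __name__ == "__main__":
    # "historical-job" and "westeurope" are made-up example values.
    print(json.dumps(streaming_job_resource("historical-job", "westeurope"), indent=2))
```

The fragment still needs inputs, outputs and a transformation before it describes a complete job; it only shows where the late arrival value goes.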


    Monday, September 17, 2018 6:25 AM
  • Can you please let us know the link of the document that you are referring to?

    https://docs.microsoft.com/en-us/azure/templates/microsoft.streamanalytics/streamingjobs

    As I mentioned above, I can also set the late arrival to -1 using resources.azure.com.


    Monday, September 17, 2018 6:28 AM
  • After some tests yesterday with a newly created job (in the Azure GUI) and a test with the old job from May (in the old POC environment), we found that it is not Stream Analytics in general that is failing, but the SA job we created with an ARM template in PowerShell.

    The old job from May and the newly created job were writing output events for historical data (from February onward) just fine, but the SA job created with an ARM template last week, as mentioned before, was just increasing the number of input events without producing any output events. What could be wrong here? As SA is a black box for us, it's quite difficult to troubleshoot issues like the ones that occurred last week.
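One way to narrow this down might be to export the definition of a working job and of the failing job (for example as JSON from resources.azure.com) and compare them property by property. The sketch below assumes both definitions are already loaded as Python dicts; the function names and the sample values are made up for illustration and say nothing about what actually differs between the jobs in this thread.

```python
MISSING = "<missing>"


def flatten(obj, prefix=""):
    """Flatten nested dicts and lists into {"a.b[0].c": value} pairs."""
    flat = {}
    if isinstance(obj, dict):
        for key, value in obj.items():
            flat.update(flatten(value, f"{prefix}.{key}" if prefix else key))
    elif isinstance(obj, list):
        for index, value in enumerate(obj):
            flat.update(flatten(value, f"{prefix}[{index}]"))
    else:
        flat[prefix] = obj
    return flat


def diff_jobs(working, failing):
    """Return {path: (working_value, failing_value)} for every differing path."""
    a, b = flatten(working), flatten(failing)
    return {
        path: (a.get(path, MISSING), b.get(path, MISSING))
        for path in sorted(a.keys() | b.keys())
        if a.get(path, MISSING) != b.get(path, MISSING)
    }


if __name__ == "__main__":
    # Made-up sample definitions, not the real jobs.
    working = {"properties": {"eventsLateArrivalMaxDelayInSeconds": -1,
                              "eventsOutOfOrderPolicy": "Adjust"}}
    failing = {"properties": {"eventsLateArrivalMaxDelayInSeconds": 5,
                              "eventsOutOfOrderPolicy": "Adjust"}}
    for path, (ok, bad) in diff_jobs(working, failing).items():
        print(f"{path}: working={ok!r} failing={bad!r}")
```

Any path that shows up in the diff is a candidate to check first, such as the time policies, the compatibility level, or the input serialization settings.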

    Tuesday, September 18, 2018 7:41 AM