locked
No data and no output into POWER BI vs stream analytics RRS feed

  • Question

  • I am newbie in Azure stream analytics,

    I create demo stream analytics real time to power bi dashboard.

    step1: i move file test2.json from my computer to blob storage via Cloubberry Explorer (container is : uploadedresources).

    step2: i create a stream analytics job, input i enter general and serialization information (EVENT SERIALIZATION FORMAT: json, ENCODING : UTF8, and about PATH PREFIX PATTERN in enter : 'test2')

    stEP 3: I ENTER QUERY:

    select 
    product,
    count([units sold]) units
    into output
    from test
    group by product,
    TumblingWindow(second, 10)


    PRODUCT AND UNITS SOLD are columns IN MY json file, test is input alias, output is output alias.

    step 4:  i configure output power bi WITH AUTHOR IS organization ACCOUNT (DIFFERENT FROM AZURE ACCOUNT), named dataset : demo1, table: demo1, group name: my workspace .

    INPUT ,OUTPUT CONNECTION IS successful, THEN i START JOB, BUT NO DATASET APPEAR IN POWER BI. THEN, I TEST OUTPUT INTO a CSV FILE IN BLOB STORAGE AND AFTER START DONE, NO DATA IN CSV FILE.

    I find documents to check my issues, i havent found what issues make no output in my job. who have experience in this case, please help me.

    Thank you very much.

    cherryle



    • Edited by cherryle Saturday, December 5, 2015 12:07 PM
    Saturday, December 5, 2015 11:59 AM

Answers

  • Please make sure to select a job start time that is at or before the start time of your data. Since you are not using "TIMESTAMP BY" in your query, all your data (a single file) will be viewed as coming in with a single arrival time - the timestamp the blob was assigned when you uploaded your file. With that, you either get no output (starting the job at a time later than that arrival time) or you get all of your file aggregated at once.

    To get more meaningful behavior, make sure your test data has a timestamp field per event and that you add "TIMESTAMP BY timestampField" to your FROM clause - where "timestampField" is the name of your events' timestamp field. Then make sure that the events in your file are (roughly - you can control this with your out-of-order arrival policy settings) sorted by that timestamp field. Finally, specify a job start time that is at or before your first event's timestamp.

    Note that Stream Analytics will apply a late arrival policy that drops all data that is (logically) arriving more than (your policy setting, cannot be 0) time units after the current wallclock time. That will also lead Stream Analytics to giving you no output if your test data has event timestamps that are too far in the past.

    Hope this helps.

    Clemens

    • Marked as answer by cherryle Thursday, December 10, 2015 3:10 AM
    Tuesday, December 8, 2015 6:33 PM

All replies

  • Please make sure to select a job start time that is at or before the start time of your data. Since you are not using "TIMESTAMP BY" in your query, all your data (a single file) will be viewed as coming in with a single arrival time - the timestamp the blob was assigned when you uploaded your file. With that, you either get no output (starting the job at a time later than that arrival time) or you get all of your file aggregated at once.

    To get more meaningful behavior, make sure your test data has a timestamp field per event and that you add "TIMESTAMP BY timestampField" to your FROM clause - where "timestampField" is the name of your events' timestamp field. Then make sure that the events in your file are (roughly - you can control this with your out-of-order arrival policy settings) sorted by that timestamp field. Finally, specify a job start time that is at or before your first event's timestamp.

    Note that Stream Analytics will apply a late arrival policy that drops all data that is (logically) arriving more than (your policy setting, cannot be 0) time units after the current wallclock time. That will also lead Stream Analytics to giving you no output if your test data has event timestamps that are too far in the past.

    Hope this helps.

    Clemens

    • Marked as answer by cherryle Thursday, December 10, 2015 3:10 AM
    Tuesday, December 8, 2015 6:33 PM
  • Thank Clemens, 

    I resolved my issues base on your suggestion. 

    I create a file csv sample has timestamp column( i get system.TimeStamp  at start job time). 

    my example file:

    winstarttime winendtime deviceid avgtemperature eventcount
    2015-12-10T02:51:50.000Z 2015-12-10T02:51:55.000Z 701
    2015-12-10T02:52:35.000Z 2015-12-10T02:52:40.000Z 701
    2015-12-10T02:52:45.000Z 2015-12-10T02:52:50.000Z 701
    2015-12-10T02:53:05.000Z 2015-12-10T02:53:10.000Z 701
    2015-12-10T02:53:15.000Z 2015-12-10T02:53:20.000Z 701
    2015-12-10T03:13:45.000Z 2015-12-10T03:13:50.000Z 701

    i input query:

    SELECT
        a.*    
        into output
    FROM input a
    TIMESTAMP by WinEndTime

    Start job, and output show in power BI. 

    One again, thank you very much, Clemens.


    • Edited by cherryle Thursday, December 10, 2015 3:19 AM
    Thursday, December 10, 2015 3:18 AM