How to use Avro Serialization for ASA input

  • Question

  • Hi There,

    I am trying to use the Avro serialization format for Stream Analytics input data, with Event Hub as my data source. Based on the answer to a similar question in this forum about CSV format, it seems that ASA expects header/schema information to be included with each event rather than defined externally. So each of my input events is a proper Avro data file: the schema, followed by multiple binary records. I tried to verify this format using the "Test" button in the Azure Stream Analytics portal. When I upload a file containing my Avro event, the portal returns an "Unable to parse JSON" error. I have double-checked that my input is marked as Avro serialization and not JSON.

    Can you let me know what the expected Avro data format is? 

    Thanks,

    Dave

    Tuesday, December 23, 2014 10:26 PM

Answers

  • Hi Dave,

    Currently in-portal query testing only supports JSON serialization regardless of what was set in the INPUTS tab.

    We will be looking into adding additional serialization formats for in-portal query testing in the future.

    Thanks!

    Ziv.

    • Marked as answer by dQuig Wednesday, January 7, 2015 12:29 AM
    • Unmarked as answer by dQuig Thursday, January 15, 2015 12:38 AM
    • Marked as answer by Ryan CrawCour [MSFT] Friday, July 29, 2016 9:53 PM
    Tuesday, January 6, 2015 11:41 PM

All replies

  • When I try to run the same query, I get the following in the operations logs. It looks like a field mismatch error, but the expected fields are the same as the fields found.

    Correlation ID: 5fc34f7a-941e-41d3-9aa7-36673df0e320
    Error:
    Message: Missing fields specified in create table. Fields expected: index, outlet_pressure, inlet_pressure, flow, temperature. Fields found: index, outlet_pressure, inlet_pressure, flow, temperature.
    Message Time: 2014-12-24 01:09:32Z
    Microsoft.Resources/EventNameV2: sharedNode92F920DE-290E-4B4C-861A-F85A4EC01D82.flow-data_0_c76f7247_25b7_4ca6_a3b6_c7bf192ba44a#0.output
    Microsoft.Resources/Operation: Information
    Microsoft.Resources/ResourceUri: /subscriptions/d5c15d24-d2ef-4443-ba7e-8389a86591ff/resourceGroups/StreamAnalytics-Default-Central-US/providers/Microsoft.StreamAnalytics/streamingjobs/avro-events
    Type: FieldsMismatchError

    Wednesday, December 24, 2014 1:20 AM
  • Hi Dave,

    Could you let me know which query you used?

    Thanks,

    Mahith


    Tuesday, January 6, 2015 10:45 PM
  • Hi Mahith,

    The query was "select * from [input]". This is now working on Event Hub data; it just does not work with the query tester. I will post more detail when I have some free time this evening.

    Tuesday, January 6, 2015 11:04 PM
  • Yes, when I try to upload test data using the query tester, I get a "Failed to parse JSON" error. The same data works fine, however, when it comes from the Event Hub data source.

    To help others using Avro: each Event Hub event should be a valid Avro data file with the schema included, followed by a set of records. The number of records included in each event can be configured based on your specific needs for throughput and latency.
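
    To illustrate the format described above, here is a minimal sketch that builds one such event body using only the Python standard library. The field names come from the operation log earlier in this thread; the field types (long for index, double for the rest), the record name FlowReading, and the helper names are assumptions for illustration. In practice an Avro library would normally produce these bytes. The layout follows the Avro object container file format: magic bytes, a metadata map carrying the schema, a sync marker, then one block of binary records.

```python
import io
import json
import os
import struct

# Hypothetical schema: field names are taken from the log in this thread,
# the types and the record name are assumptions.
SCHEMA = {
    "type": "record",
    "name": "FlowReading",
    "fields": [
        {"name": "index", "type": "long"},
        {"name": "outlet_pressure", "type": "double"},
        {"name": "inlet_pressure", "type": "double"},
        {"name": "flow", "type": "double"},
        {"name": "temperature", "type": "double"},
    ],
}
DOUBLE_FIELDS = ("outlet_pressure", "inlet_pressure", "flow", "temperature")


def encode_long(n):
    """Avro long: zigzag encoding, then a variable-length base-128 integer."""
    z = (n << 1) ^ (n >> 63)
    out = bytearray()
    while z & ~0x7F:
        out.append((z & 0x7F) | 0x80)
        z >>= 7
    out.append(z)
    return bytes(out)


def encode_bytes(b):
    """Avro bytes/string: length as a long, followed by the raw bytes."""
    return encode_long(len(b)) + b


def encode_record(rec):
    """Fields are written in schema order with no per-field tags."""
    out = encode_long(rec["index"])
    for name in DOUBLE_FIELDS:
        out += struct.pack("<d", rec[name])  # 8-byte little-endian double
    return out


def build_avro_event(records, sync=None):
    """Return one Avro data file (schema header + one block of records)."""
    sync = sync or os.urandom(16)  # 16-byte sync marker
    meta = {
        "avro.schema": json.dumps(SCHEMA).encode("utf-8"),
        "avro.codec": b"null",  # no compression
    }
    buf = io.BytesIO()
    buf.write(b"Obj\x01")  # magic bytes
    buf.write(encode_long(len(meta)))  # metadata map: one block of pairs
    for key, value in meta.items():
        buf.write(encode_bytes(key.encode("utf-8")))
        buf.write(encode_bytes(value))
    buf.write(encode_long(0))  # end of metadata map
    buf.write(sync)
    body = b"".join(encode_record(r) for r in records)
    buf.write(encode_long(len(records)))  # number of records in the block
    buf.write(encode_long(len(body)))  # block size in bytes
    buf.write(body)
    buf.write(sync)
    return buf.getvalue()
```

    The returned bytes would be sent as the body of a single Event Hub event. Passing more records per call means fewer, larger events, which trades latency for throughput.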

    Tuesday, January 6, 2015 11:13 PM
  • Thanks Ziv,

    It might be helpful for the documentation to be more explicit about the expected format of Avro events. JSON-only support in the in-portal query tester is fine for me, but it took some time to get the Avro format correct.
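
    For anyone who wants to use the in-portal tester in the meantime, here is a small sketch that writes the same kind of records as JSON instead of Avro. It assumes the tester accepts a JSON array of objects; the sample values and the function name are hypothetical.

```python
import json

# Hypothetical sample readings; field names match the log earlier in the thread.
SAMPLE_RECORDS = [
    {"index": 1, "outlet_pressure": 101.2, "inlet_pressure": 98.7,
     "flow": 12.5, "temperature": 21.0},
    {"index": 2, "outlet_pressure": 101.4, "inlet_pressure": 98.9,
     "flow": 12.7, "temperature": 21.1},
]


def to_test_json(records):
    """Serialize records as a JSON array (assumed tester input format)."""
    return json.dumps(records, indent=2)


test_payload = to_test_json(SAMPLE_RECORDS)
```

    The resulting string can be saved to a file and uploaded to the query tester, while the job itself keeps reading Avro from Event Hub.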

    Wednesday, January 7, 2015 12:28 AM
  • Hi Dave,

    I just wanted to follow up on the exception message you were getting - the 'FieldsMismatchError'. Is that still an issue you are encountering?

    Thanks,
    Mahith

    Wednesday, January 7, 2015 6:23 PM
  • I am unable to reproduce the FieldsMismatchError.
    Wednesday, January 7, 2015 6:26 PM
  • Glad to hear that!

    Wednesday, January 7, 2015 7:18 PM
  • Hi Mahith,

    I am now able to reproduce the FieldsMismatchError. I am attaching the log. Could you help debug this issue?

                        
    Correlation ID: 000593f0-4c05-412b-b096-2cf818bf6e9f
    Error:
    Message: Missing fields specified in create table. Fields expected: avgLight, avgOffLight, avgHum, avgLrTemp, avgBrLight, avgBrTemp, avgLrLight, avgOffHum, avgLrHum, avgOffTemp, avgBrHum, avgTemp, groupId, ts. Fields found: avgLight, avgOffLight, avgHum, avgLrTemp, avgBrLight, avgBrTemp, avgLrLight, avgOffHum, avgLrHum, avgOffTemp, avgBrHum, avgTemp, groupId, ts.
    Message Time: 2015-01-15 00:28:59Z
    Microsoft.Resources/EventNameV2: sharedNode92F920DE-290E-4B4C-861A-F85A4EC01D82.hvac-input_0_c76f7247_25b7_4ca6_a3b6_c7bf192ba44a#0.output
    Microsoft.Resources/Operation: Information
    Microsoft.Resources/ResourceUri: /subscriptions/d5c15d24-d2ef-4443-ba7e-8389a86591ff/resourceGroups/StreamAnalytics-Default-Central-US/providers/Microsoft.StreamAnalytics/streamingjobs/hvac
    Type: FieldsMismatchError
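
    As a quick sanity check that the two field lists in the message really are identical, here is a small sketch that parses the message text and compares the lists. The function name is hypothetical, and the check only rules out a name difference; it does not explain what causes the error.

```python
import re

# Message text copied from the log above.
MESSAGE = (
    "Missing fields specified in create table. "
    "Fields expected: avgLight, avgOffLight, avgHum, avgLrTemp, avgBrLight, "
    "avgBrTemp, avgLrLight, avgOffHum, avgLrHum, avgOffTemp, avgBrHum, "
    "avgTemp, groupId, ts. "
    "Fields found: avgLight, avgOffLight, avgHum, avgLrTemp, avgBrLight, "
    "avgBrTemp, avgLrLight, avgOffHum, avgLrHum, avgOffTemp, avgBrHum, "
    "avgTemp, groupId, ts."
)


def parse_fields(message):
    """Return (expected, found) field-name lists from the error message."""
    match = re.search(
        r"Fields expected: (.*?)\. Fields found: (.*?)\.?\s*$", message
    )
    if match is None:
        raise ValueError("message does not have the expected shape")
    expected = [name.strip() for name in match.group(1).split(",")]
    found = [name.strip() for name in match.group(2).split(",")]
    return expected, found


expected, found = parse_fields(MESSAGE)
missing = sorted(set(expected) - set(found))  # expected but not found
extra = sorted(set(found) - set(expected))    # found but not expected
same_order = expected == found
```

    For this log both `missing` and `extra` come out empty, so the mismatch is not a difference in the field names as printed.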

    Thursday, January 15, 2015 12:38 AM