locked
Flatten Nested JSON Array in Stream Analytics RRS feed

  • Question

  • Hi,

    I’ve been stuck in the following query for quite some times, any suggestions/ideas will be really helpful, thanks!

    I’ve got a JSON data file saved in Azure Blob, file format as below. I am using Azure Stream Analytics to retrieve the data and output to a SQL DB. 

    {
    	"event":[{"name":"FORM1","count":1}],
    	"context":{"application":{"version":"1"},
    	"custom":{
    		"dimensions":[
    				{"MessageType":"EventLog"},
    				{"Properties":
    						"{\"TYPE\":\"VISIT\",\"PACKAGE_NAME\":\"app\",\"DEVICE_ID\":\"8ea5-b8c3cdf67b1c\",\"USER_IDENTIFIER\":\"38b423e61c6\"}"},
    						{"WrapperSdkVersion":"2.1"}
    					]
    			}
    		}
    }
    

    Here’s my query in Stream Analytics. I try to retrieve the value from Properties, which is one level down under custom.dimensions.  However the value always come back as Null. It seems the flatten function can only go as far as the custom.dimensions. Would it because of the backslash symbol ‘\’? 

    WITH Temp AS (
        SELECT
            GetRecordPropertyValue(GetArrayElement(A.event, 0), 'name') as EventName,
            UDF.flattenCustomDimensions(A.context.custom.dimensions) as dim
    		
        FROM ClientAnalyticsContinuousInput A
    )
    SELECT
    
         EventName,
         UserId = Temp.dim.Properties.USER_IDENTIFIER,
         DeviceId = Temp.dim.Properties.DEVICE_ID
    
    FROM Temp
    

    Regards

    Stephen


    Thursday, February 27, 2020 11:27 AM

Answers

  • Hi Kranthipakala,

    Many Thanks for the suggestions! Yep it’s a JSON format under properties with back slash.

    I worked it out over the weekend by adding an extra functions to process the JSON under Properties.

    Here’s couple try I did, it might be helpful for others that had the same issues:

    First, I tried to removed the backslash ‘\’ out from the Properties. However it didn’t work. It seems Azure Stream Analytics Query cannot simply replace \” as just double quotation mark “  

    Then what I did instead, is following the suggestion on Microsoft Document to write a separate Function parseJSON to parse the key/values under Properties. Then it work! Apparently the function can be able to recognise the Properties as JSON and parse the key/value under it, even with a backslash symbol. 

    https://docs.microsoft.com/en-us/azure/stream-analytics/stream-analytics-parsing-json#parse-json-record-in-sql-reference-data

    Here’s the two functions I built:

    //parseJson
    function parseJson(string) {
    if (string ){
    return JSON.parse(string);
    }
    }
    
    //flattenCustomerDimensions
    function main(dimensions) {
      let output = {};
      for(let i in dimensions) {
        let dim = dimensions[i];
        for(let key in dim) {
          output[key] = dim[key];
        }
      }
      return output;
    }
    


    The rest of thing is relative straight forward, I just need to parse the JSON in Properties to get the value I needed:

    UDF.parseJson((UDF.FlattenCustomDimensions(I.context.custom.dimensions)).Properties) as PropertiesDim,

    Regards

    Stephen

    • Marked as answer by stephen huo Monday, March 2, 2020 11:56 AM
    Monday, March 2, 2020 11:55 AM

All replies

  • Hi stephen huo,

    Thanks for reaching out. As per my understanding your QUERY is is expecting your UDF to return below JSON structure, but according to your current JSON structure (because of back slash '\'),  the value ("{\"TYPE\":\"VISIT\",\"PACKAGE_NAME\":\"app\",\"DEVICE_ID\":\"8ea5-b8c3cdf67b1c\",\"USER_IDENTIFIER\":\"38b423e61c6\"}") of "Properties" is considered as a string. 


    {
    	"event": [
    		{
    			"name": "FORM1",
    			"count": 1
    		}
    	],
    	"context": {
    		"application": {
    			"version": "1"
    		},
    		"custom": {
    			"dimensions": [
    				{
    					"MessageType": "EventLog"
    				},
    				{
    					"Properties": {
    						"TYPE": "VISIT",
    						"PACKAGE_NAME": "app",
    						"DEVICE_ID": "8ea5-b8c3cdf67b1c",
    						"USER_IDENTIFIER": "38b423e61c6"
    					}
    				},
    				{
    					"WrapperSdkVersion": "2.1"
    				}
    			]
    		}
    	}
    }



    Could you please make sure that your Flatten function is returning the JSON data (value of "Properties") as mentioned above?

    If you need further assistance, would you please share the flatten function code to dive deeper on this issue?


    Thank you

    If a post helps to resolve your issue, please click the "Mark as Answer" of that post and/or click Answered"Vote as helpful" button of that post. By marking a post as Answered and/or Helpful, you help others find the answer faster.



    Friday, February 28, 2020 7:44 PM
  • Hi Kranthipakala,

    Many Thanks for the suggestions! Yep it’s a JSON format under properties with back slash.

    I worked it out over the weekend by adding an extra functions to process the JSON under Properties.

    Here’s couple try I did, it might be helpful for others that had the same issues:

    First, I tried to removed the backslash ‘\’ out from the Properties. However it didn’t work. It seems Azure Stream Analytics Query cannot simply replace \” as just double quotation mark “  

    Then what I did instead, is following the suggestion on Microsoft Document to write a separate Function parseJSON to parse the key/values under Properties. Then it work! Apparently the function can be able to recognise the Properties as JSON and parse the key/value under it, even with a backslash symbol. 

    https://docs.microsoft.com/en-us/azure/stream-analytics/stream-analytics-parsing-json#parse-json-record-in-sql-reference-data

    Here’s the two functions I built:

    //parseJson
    function parseJson(string) {
    if (string ){
    return JSON.parse(string);
    }
    }
    
    //flattenCustomerDimensions
    function main(dimensions) {
      let output = {};
      for(let i in dimensions) {
        let dim = dimensions[i];
        for(let key in dim) {
          output[key] = dim[key];
        }
      }
      return output;
    }
    


    The rest of thing is relative straight forward, I just need to parse the JSON in Properties to get the value I needed:

    UDF.parseJson((UDF.FlattenCustomDimensions(I.context.custom.dimensions)).Properties) as PropertiesDim,

    Regards

    Stephen

    • Marked as answer by stephen huo Monday, March 2, 2020 11:56 AM
    Monday, March 2, 2020 11:55 AM
  • Hi stephen huo,

    Glad to know you were unblocked  and thanks much for sharing your findings. It will be helpful for other community members with similar requirement. :)


    Thank you

    If a post helps to resolve your issue, please click the "Mark as Answer" of that post and/or click Answered "Vote as helpful" button of that post. By marking a post as Answered and/or Helpful, you help others find the answer faster.

    Monday, March 2, 2020 9:25 PM