locked
custom dimensions query by name RRS feed

  • Question

  • Hi

    Is there a way to query by name in the custom dimensions json object from a blob being made available any time soon?, i couldn't find any answer on that online.

    If not, would it be possible to cross apply the fields, then do some sort of query to flatten the results?

    Kind regards,

    Sasha

    Monday, October 24, 2016 10:05 PM

Answers

  • Sasha, with some assumptions I think you still can achieve what you want.
    Assuming that:
    - you know the set of possible objects in dimensions, and
    - you do not have duplicate objects in this array, and
    - there is something you can uniquely identify your initial rows (like the id in the example below).

    Then following should do what you want.

    Example data:

    { "id":"1", "custom": {"dimensions": [ {"Obj1":"O1v1"}, {"Obj2":"O2v1"}, {"Obj3":"O3v1"} ] } }
    { "id":"2", "custom": {"dimensions": [ {"Obj1":"O1v2"}, {"Obj2":"O2v2"}, {"Obj3":"O3v2"} ] } }
    { "id":"3", "custom": {"dimensions": [ {"Obj2":"O2v3"}, {"Obj1":"O1v3"}, {"Obj3":"O3v3"} ] } }
    { "id":"4", "custom": {"dimensions": [ {"Obj2":"O2v4"}, {"Obj3":"O3v4"}, {"Obj1":"O1v4"} ] } }
    { "id":"5", "custom": {"dimensions": [                  {"Obj2":"O2v5"}, {"Obj3":"O3v5"} ] } }
    { "id":"6", "custom": {"dimensions": [ {"Obj1":"O1v6"},                  {"Obj3":"O3v5"} ] } }
    { "id":"7", "custom": {"dimensions": [                                                   ] } }
    { "id":"8", "custom": {"dimensions": [                                   {"ObjX":"OXvX"} ] } }
    

    Query:

    SELECT
        id = input.id,
        Obj1 = min(cast(GetRecordPropertyValue(d.arrayvalue, 'Obj1') as nvarchar(max))),
        Obj2 = min(cast(GetRecordPropertyValue(d.arrayvalue, 'Obj2') as nvarchar(max))),
        Obj3 = min(cast(GetRecordPropertyValue(d.arrayvalue, 'Obj3') as nvarchar(max)))
    FROM
      input CROSS APPLY GetArrayElements(input.custom.dimensions) d
    GROUP BY System.Timestamp, input.id
    

    Result:

    {"id": "1", "obj1": "O1v1", "obj2": "O2v1", "obj3": "O3v1" }
    {"id": "2", "obj1": "O1v2", "obj2": "O2v2", "obj3": "O3v2" }
    {"id": "3", "obj1": "O1v3", "obj2": "O2v3", "obj3": "O3v3" }
    {"id": "4", "obj1": "O1v4", "obj2": "O2v4", "obj3": "O3v4" }
    {"id": "5", "obj1":   null, "obj2": "O2v5", "obj3": "O3v5" }
    {"id": "6", "obj1": "O1v6", "obj2":   null, "obj3": "O3v5" }
    {"id": "8", "obj1":   null, "obj2":   null, "obj3":   null }
    

    Note, that missing object values represented with nulls, unexpected objects (like ObjX) are ignored, and rows with empty dimensions do not appear at all.

    Hope this helps,
    Alex.


    Azure Stream Analytics.

    • Marked as answer by Sasha_za Friday, November 4, 2016 9:35 PM
    Thursday, October 27, 2016 7:32 PM

All replies

  • Yes, this is available.

    See following MSDN articles:

    Record Functions - GetRecordProperties finction in particular for flattening with CROSS APPLY;

    Array Functions; and

    CROSS APPLY which also have some examples tying together with above functions.

    Thanks,
    Alex. 


    Azure Stream Analytics.

    Monday, October 24, 2016 10:28 PM
  • thanks Alex

    I don't think Record functions will work for me, unless I am missing something, so basically my data looks like this:

    "custom": {"dimensions":[{"Obj1":"Obj1val"},{"Obj2":"Obj2val"},{"Obj3":"Obj3val"}]}

    so I dont get any results if I use GetrecordProperties on the dimensions object, since dimensions has an array of objects.

    In some records in the blob, there might not be an Obj3, while in others the position of Obj2 and Obj3 will be swapped.

    Using the Array functions will only work if the position remains the same in the array, but the position of the objects change.

    Sasha

    Monday, October 24, 2016 10:55 PM
  • Sasha, with some assumptions I think you still can achieve what you want.
    Assuming that:
    - you know the set of possible objects in dimensions, and
    - you do not have duplicate objects in this array, and
    - there is something you can uniquely identify your initial rows (like the id in the example below).

    Then following should do what you want.

    Example data:

    { "id":"1", "custom": {"dimensions": [ {"Obj1":"O1v1"}, {"Obj2":"O2v1"}, {"Obj3":"O3v1"} ] } }
    { "id":"2", "custom": {"dimensions": [ {"Obj1":"O1v2"}, {"Obj2":"O2v2"}, {"Obj3":"O3v2"} ] } }
    { "id":"3", "custom": {"dimensions": [ {"Obj2":"O2v3"}, {"Obj1":"O1v3"}, {"Obj3":"O3v3"} ] } }
    { "id":"4", "custom": {"dimensions": [ {"Obj2":"O2v4"}, {"Obj3":"O3v4"}, {"Obj1":"O1v4"} ] } }
    { "id":"5", "custom": {"dimensions": [                  {"Obj2":"O2v5"}, {"Obj3":"O3v5"} ] } }
    { "id":"6", "custom": {"dimensions": [ {"Obj1":"O1v6"},                  {"Obj3":"O3v5"} ] } }
    { "id":"7", "custom": {"dimensions": [                                                   ] } }
    { "id":"8", "custom": {"dimensions": [                                   {"ObjX":"OXvX"} ] } }
    

    Query:

    SELECT
        id = input.id,
        Obj1 = min(cast(GetRecordPropertyValue(d.arrayvalue, 'Obj1') as nvarchar(max))),
        Obj2 = min(cast(GetRecordPropertyValue(d.arrayvalue, 'Obj2') as nvarchar(max))),
        Obj3 = min(cast(GetRecordPropertyValue(d.arrayvalue, 'Obj3') as nvarchar(max)))
    FROM
      input CROSS APPLY GetArrayElements(input.custom.dimensions) d
    GROUP BY System.Timestamp, input.id
    

    Result:

    {"id": "1", "obj1": "O1v1", "obj2": "O2v1", "obj3": "O3v1" }
    {"id": "2", "obj1": "O1v2", "obj2": "O2v2", "obj3": "O3v2" }
    {"id": "3", "obj1": "O1v3", "obj2": "O2v3", "obj3": "O3v3" }
    {"id": "4", "obj1": "O1v4", "obj2": "O2v4", "obj3": "O3v4" }
    {"id": "5", "obj1":   null, "obj2": "O2v5", "obj3": "O3v5" }
    {"id": "6", "obj1": "O1v6", "obj2":   null, "obj3": "O3v5" }
    {"id": "8", "obj1":   null, "obj2":   null, "obj3":   null }
    

    Note, that missing object values represented with nulls, unexpected objects (like ObjX) are ignored, and rows with empty dimensions do not appear at all.

    Hope this helps,
    Alex.


    Azure Stream Analytics.

    • Marked as answer by Sasha_za Friday, November 4, 2016 9:35 PM
    Thursday, October 27, 2016 7:32 PM
  • Thanks Alex

    I was attempting to use the group by but it kept giving me errors, I now used the example you provided, and its getting what i want.

    Much appreciated

    Kind regards,

    Sasha

    Friday, November 4, 2016 9:35 PM