Copy json array from blob to blob "as is"

    Question

  • I have many small JSON files, each containing a set of JSON objects, that I need to merge into one large file. So I set up a pipeline to do this using the ADF V2 Copy Data activity. My source and destination are Azure Blob Storage.

    All goes wonderfully well except for arrays. They end up as a string, as shown below in the Payload property, for example:

    {
        "Id": "f81d3099-eeba-4b24-b924-3ac70e602698",
        "Timestamp": "2018-06-25T20:26:14.3749484+02:00",
        "EventEntry": {
            "EventId": 9,
            "Payload": "[\"7063d251-b371-48a9-bfb2-04ece42c80e5\",905,905,\"be0d1ccf-133a-469f-b125-e4bd6d4521b3\",\"10.1.3.26\",0,1440.0,\"4732e3c7-70b8-4836-81ba-61e83e57f667\",\"11.29.0\",\"11.29.18151.18\",\"12770522F7C4C564CAA46A3645252EBFFF9C56ED\",0]",
            "ProcessId": 1920,
            "ThreadId": 4728,
            "EventName": "ApiSessionStartStart",
            "KeyWords": 18,
            "KeyWordsDescription": "Session WebApi",
            "ProviderName": "DeHeerSoftware-PlanCare-Diagnostics",
            "Level": 4,
            "Version": 5,
            "Task": 0,
            "TaskName": null,
            "ProviderId": "3bdad957-1215-5270-f285-16d22ac9f6c4",
            "PayloadSchema": "[\"instanceSessionId\",\"userId\",\"onBehalfOf\",\"apiKey\",\"remoteIp\",\"legSid\",\"expiration\",\"orgId\",\"pc2Version\",\"pc2Build\",\"envId\",\"cosId\"]"
        },
        "Session": {
            "SessionId": "05bc544a-dab8-46f4-89f2-c7964107b6d9",
            "EnvironmentId": "12770522F7C4C564CAA46A3645252EBFFF9C56ED",
            "Build": "11.29.18151.18",
            "Version": "11.29.0",
            "OrganisationId": "4732e3c7-70b8-4836-81ba-61e83e57f667",
            "UserId": "905",
            "Process": "xxx",
            "Origin": "xxx",
            "Organisation": "xxx"
        }
    }

    My source JSON looks like this (different content, but the same schema as the example above):

    {
        "Id": "10458d5c-69cd-4e8a-9927-8fa36988bb6e",
        "Timestamp": "2018-06-25T02:27:39.9334378+02:00",
        "EventEntry": {
            "EventId": 14,
            "Payload": [
                "0fca851e-295f-426d-a8c2-5045e6281bc1",
                "GET",
                "https://xxx.xxx.nl:5003/PlanCare2Api/Relations/19517/Correlations?correlationTypeId=2",
                516.8307,
                "OK",
                true,
                "DHS.PlanCare.Web.Api.Controllers.Relation.RelationsController",
                "FetchRelationCorrelationsAsync",
                0,
                63,
                0,
                "00000000-0000-0000-b245-0080020000ed",
                "Server - Plancare 2 @IO Server_Prod (4502)"
            ],
            "ProcessId": 4332,
            "ThreadId": 5196,
            "EventName": "WebMethodDuration",
            "KeyWords": 1,
            "KeyWordsDescription": "Duration",
            "ProviderName": "DeHeerSoftware-PlanCare-Telemetry",
            "Level": 4,
            "Version": 2,
            "Task": 0,
            "TaskName": null,
            "ProviderId": "97813ec8-8ea2-5078-2d5b-40183bbbcd58",
            "PayloadSchema": [
                "instanceSessionId",
                "httpMethod",
                "uri",
                "durationInMilliseconds",
                "responseCode",
                "isSuccessStatusCode",
                "controller",
                "action",
                "respContentLenght",
                "repsHdrLenght",
                "reqContentLenght",
                "correlationId",
                "context"
            ]
        },
        "Session": {
            "SessionId": "1b08e0d1-307d-45af-8984-55b20e767dfc",
            "EnvironmentId": "3EC0E59C152685CDEE6E1298CFBD9BAB7F5B53AB",
            "Build": "11.28.18094.3",
            "Version": "11.28.0.1",
            "OrganisationId": "b228eb05-1a5e-4f0f-b315-59c57f703cc0",
            "UserId": "6463",
            "Process": "xxx",
            "Origin": "xxx",
            "Organisation": "xxx"
        }
    }

    So how can I copy the JSON array as is?


    • Edited by Expecho Monday, August 6, 2018 10:00 AM
    Monday, August 6, 2018 9:59 AM

All replies

  • Hi Expecho,

    If you want to copy files as-is, keep 'Binary Copy' selected in the UI to skip the format section in both the source and sink dataset definitions.

    Thanks

    Tuesday, August 7, 2018 1:17 AM
  • Hi,

    You need to do schema mapping and explicit column mapping in the Copy activity before the sink, or transform your data; otherwise your source and target will end up with different schemas.

    Please refer to this link to map the columns and schema.

     


    Murugesa Pandian MCSA,MCSE,MCPD

    Gear up for some solid action by doing.
    

    Tuesday, August 7, 2018 1:38 AM
  • Hi Expecho,

    If you want to copy files as-is, keep 'Binary Copy' selected in the UI to skip the format section in both the source and sink dataset definitions.

    Thanks


    But that disallows the merge functionality, and that is the whole point of using ADF for me.
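
    If Binary Copy rules out the merge, one alternative is to do the merge outside the Copy activity, e.g. in a small script or an Azure Function that runs over the blobs. A minimal sketch, assuming each small file holds one JSON object and the target is a JSON Lines file (the directory layout and file names here are hypothetical):

    ```python
    import json
    from pathlib import Path

    def merge_json_files(source_dir: str, target_file: str) -> int:
        """Append each source object to one JSON Lines file, arrays intact."""
        count = 0
        with open(target_file, "w", encoding="utf-8") as out:
            for path in sorted(Path(source_dir).glob("*.json")):
                obj = json.loads(path.read_text(encoding="utf-8"))
                # json.dumps round-trips arrays as arrays, never as strings
                out.write(json.dumps(obj) + "\n")
                count += 1
        return count
    ```

    Because the objects are parsed and re-serialized by a JSON-aware library, Payload and PayloadSchema stay real arrays instead of being flattened to strings.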
    Tuesday, August 7, 2018 11:12 AM
  • Hi,

    You need to do schema mapping and explicit column mapping in the Copy activity before the sink, or transform your data; otherwise your source and target will end up with different schemas.

    Please refer to this link to map the columns and schema.

     


    Murugesa Pandian MCSA,MCSE,MCPD

    Gear up for some solid action by doing.
    

    But when I use explicit column mapping, I cannot use array as a type; only simple types seem to be supported.
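
    If the copy has already run and produced stringified arrays as in the first snippet, another option is to repair the output afterwards: the stringified Payload and PayloadSchema values are themselves valid JSON, so a post-processing step can parse them back into real arrays. A hedged sketch (field names taken from the example above; this is not an ADF feature, just a cleanup script):

    ```python
    import json

    def restore_arrays(record: dict) -> dict:
        """Parse stringified JSON arrays in EventEntry back into real lists."""
        event = record.get("EventEntry", {})
        for key in ("Payload", "PayloadSchema"):
            value = event.get(key)
            if isinstance(value, str):
                try:
                    event[key] = json.loads(value)
                except json.JSONDecodeError:
                    pass  # leave values that are not valid JSON untouched
        return record
    ```

    Running this over each merged record turns `"Payload": "[\"a\",1]"` back into `"Payload": ["a", 1]` while leaving ordinary string fields alone.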
    Tuesday, August 7, 2018 11:13 AM