Transforming JSON input to csv output

  • Question

  • Hi,

    Is there a way I could transform input events that are in JSON format to CSV and store them in Blob storage?

    e.g. Input

    {"col1":"val1", "col2":"val2"}
    {"col1":"val11", "col2":"val22"}

    e.g. Output CSV

    val1, val2
    val11, val22

    The reason I need to do this is so that I can load the CSV data copied to Blob storage into Azure SQL Data Warehouse using PolyBase.

    I don't care if I lose the header in the CSV output.



    • Edited by Nik_SJ Thursday, April 19, 2018 8:15 PM
    Thursday, April 19, 2018 8:14 PM

All replies

  • Is this a streaming scenario? Where does the JSON data come from, and does it arrive in batches (daily/hourly) or in smaller increments?

    Two things to test, both pretty easy to try by the way.

    1. As I recall, the Azure Data Factory (ADF) copy activity handles JSON to PolyBase natively. Give it a try and see if it fits your needs. It's more batch oriented (15 minutes or hourly): Blob input, SQL DW output with PolyBase.

    https://docs.microsoft.com/en-us/azure/data-factory/copy-activity-overview

    The ADF Blob connector supports JSON and CSV: https://docs.microsoft.com/en-us/azure/data-factory/connector-azure-blob-storage

    The ADF SQL DW connector supports PolyBase with SQL authentication: https://docs.microsoft.com/en-us/azure/data-factory/connector-azure-sql-data-warehouse

    2. Stream Analytics can do ETL, but it is geared towards temporal streaming data, of course. It is less about managing batch inserts (ADF is meant for that) and more about constant aggregates, such as averaging and counting data as it flows through to the destination.

    Stream Analytics does support JSON, CSV, and Avro inputs from Event Hub or Blob storage, and it can write output to Blob storage or SQL too. Details here:

    https://docs.microsoft.com/en-us/azure/stream-analytics/stream-analytics-define-inputs

    https://docs.microsoft.com/en-us/azure/stream-analytics/stream-analytics-define-outputs


    The query in Stream Analytics doesn't have to transform the data. It can pass the data through as is: SELECT * INTO output FROM input. The output could be Blob or SQL DW, and the input could be Blob or Event Hub for the scenario you describe.
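
    To make the shape of the transformation concrete, here is a minimal local sketch in Python of what JSON events in, headerless CSV rows out looks like. It is not Stream Analytics or ADF code, only an illustration on the sample events from the question (written as valid JSON). The function name json_lines_to_csv and its columns parameter are made up for this example.

```python
import csv
import io
import json


def json_lines_to_csv(text, columns, include_header=False):
    """Convert newline-delimited JSON objects into CSV text.

    Hypothetical helper for illustration only; `columns` fixes the
    field order because JSON objects carry no guaranteed column order.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    if include_header:
        writer.writerow(columns)
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between events
        event = json.loads(line)
        # Missing keys become empty fields so every row has the same width.
        writer.writerow([event.get(name, "") for name in columns])
    return buffer.getvalue()


# Sample events from the question, one JSON object per line.
sample = '{"col1":"val1", "col2":"val2"}\n{"col1":"val11", "col2":"val22"}\n'

if __name__ == "__main__":
    print(json_lines_to_csv(sample, ["col1", "col2"]), end="")
```

    Passing the column list explicitly keeps the field order stable from row to row, which matters when the CSV is loaded into a table with a fixed schema. The header is off by default, since the question says it is not needed.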

    Thanks, Jason



    Thursday, April 19, 2018 8:48 PM