How to compress the JSON files into Parquet files in ADLS

All replies

  • Hi

    Please note that Parquet is not a compression format but a columnar storage format that lets you compress individual columns with different compression schemes.

    Since JSON documents are not relational but hierarchical, you would first need to map your hierarchy into some form of columns. Then you can write the rowset that you produced into Parquet format.

    For example, you can use ADLA (Azure Data Lake Analytics) and U-SQL to do it. A sample JSON extractor that shows how to extract information from JSON is available at https://github.com/Azure/usql/tree/master/Examples/DataFormats, and the Parquet support is documented here.
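
    To illustrate the two steps described above (map the hierarchy to columns, then write the rowset as Parquet), here is a minimal sketch in Python rather than U-SQL. It is not the ADLA approach. The helper names flatten and write_parquet, the "_" separator, the choice to store arrays as JSON text, and the sample document are all assumptions made for illustration. The Parquet step assumes the third-party pyarrow package is installed and writes to a local path; copying the result into ADLS is not shown.

```python
import json


def flatten(record, parent_key="", sep="_"):
    """Map a nested JSON object to a flat dict of column name -> value."""
    columns = {}
    for key, value in record.items():
        name = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Nested objects become additional columns with a prefixed name.
            columns.update(flatten(value, name, sep))
        elif isinstance(value, list):
            # Assumption: arrays are kept as JSON text in a single column.
            columns[name] = json.dumps(value)
        else:
            columns[name] = value
    return columns


def write_parquet(rows, path, compression="snappy"):
    """Write a list of flat dicts to a Parquet file (hypothetical helper)."""
    # pyarrow is a third-party dependency, imported only when needed.
    import pyarrow as pa
    import pyarrow.parquet as pq

    table = pa.Table.from_pylist(rows)
    # The compression codec is applied to the column data inside the file.
    pq.write_table(table, path, compression=compression)


# Hypothetical sample document, used only to show the flattening step.
sample = json.loads(
    '{"id": 1, "user": {"name": "a", "geo": {"lat": 1.5}}, "tags": ["x", "y"]}'
)
row = flatten(sample)
```

    With pyarrow available, calling write_parquet([row], "sample.parquet") would write the flattened row to a local file. How to flatten arrays (one column of JSON text, as here, or one output row per element) depends on how the data will be queried.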


    Michael Rys

    Thursday, June 7, 2018 9:27 AM
    Moderator