Relational Data in the Data Lake store

  • Question

  • The Azure Data Lake Store documentation states that you can store all data in its native format, and it includes an image that shows relational as a type of data in the store, but it seems like you can only add files. Can you store relational data in its original format in the Data Lake Store, or would you have to export it to a delimited file or something similar first?
    Thursday, January 14, 2016 4:16 PM

Answers

  • In addition to Rajesh's reply, you can create your own relational tables in the Azure Data Lake with U-SQL (or with Hive via HDInsight).

    If you have existing relational data (e.g., a SQL Server MDF file), you can store it in ADLS, but we currently do not support a way to interpret the content of the file. So neither U-SQL nor Hive can query it unless you write custom extractors or SerDes.

    Best regards

    Michael


    Michael Rys

    • Marked as answer by E.Hanson Friday, January 15, 2016 1:32 PM
    Friday, January 15, 2016 8:37 AM
    Moderator
  • The way you store the data depends on how you will eventually access and process it. If the application processing the data can read it from the data lake through the Data Lake APIs, parse the format, and process it, Azure Data Lake does not get in the way: it does not mandate a predefined schema, format, or structure. It exposes Hadoop File System (HDFS) capability that analytics applications can then leverage.

    If you are running standard Hadoop applications (Hive, Pig, MapReduce, U-SQL) to process the relational data, the typical pattern is to store the data in delimited form, as you describe.

    Hope that helps

    Thanks,

    Rajesh Dadhia

    Group Program Manager

    Azure Data Lake 

    • Marked as answer by E.Hanson Friday, January 15, 2016 1:32 PM
    Thursday, January 14, 2016 9:36 PM
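
To illustrate the delimited-file pattern described in the answers above, here is a minimal Python sketch of exporting one relational table to comma-delimited text. It is only an illustration under stated assumptions: an in-memory SQLite database stands in for the real relational source, and the `customers` table, its rows, and the helper name `export_table_to_csv` are hypothetical. Uploading the resulting file to the Data Lake Store is not shown.

```python
import csv
import io
import sqlite3


def export_table_to_csv(conn, table, out):
    """Write all rows of `table` to the text stream `out` as comma-delimited
    text, with the column names as a header row."""
    # The table name is interpolated into the SQL text, so it must come
    # from a trusted source (identifiers cannot be bound as parameters).
    cursor = conn.execute(f'SELECT * FROM "{table}"')
    writer = csv.writer(out, lineterminator="\n")
    # cursor.description holds one tuple per column; item 0 is the name.
    writer.writerow([column[0] for column in cursor.description])
    writer.writerows(cursor)


# Hypothetical sample data standing in for an existing relational database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [(1, "Ada", "London"), (2, "Grace", "New York")],
)

# Export into an in-memory buffer; a real export would write to a file
# that is then uploaded to the store.
buffer = io.StringIO()
export_table_to_csv(conn, "customers", buffer)
delimited_text = buffer.getvalue()
print(delimited_text)
```

Keeping the header row makes the column names available to whatever reads the file later, since the store itself does not impose a schema on the file.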
