none
What is the use of SQL database in U-Sql or data lake analytics

    Question

  • I can see the we can read any type of file from data lake storage/azure blob etc in U-Sql. I am looking at some example of USQL scripts there I can see that we can use SQL database of Azure data lake store. I want to know what is the difference of Azure Sql database vs Data lake SQL database. Also it will be better if someone tell me in which case we should use data lake SQL database. I don't see there is any need to use SQL database inside U-Sql. We can write data into files of data lake and read it, then we there is facility of SQL database in data lake store.



    Sunday, June 12, 2016 4:25 PM

Answers

  • Hi Mahesh

    I am a bit confused about what your question is. There is no SQL Database per se in ADLS. U-SQL provides concepts such as databases and tables to manage and share data objects such as tables and views and code objects such as assemblies. It also gives you federated query support to run U-SQL against SQL Server instances running in Azure.

    The U-SQL tables can give you performance benefits because they provide internal clustering, partitions and statistics that help the query optimizer to chose better plans.

    You can find a short tutorial in the U-SQL hands-on-lab here: http://aka.ms/usql-hol and you can find more samples here: https://github.com/Azure/usql/tree/master/Examples


    Michael Rys


    Thursday, June 16, 2016 12:47 AM
    Moderator
  • Hi Mahesh,

    Adding to what Michael has already said. Azure Data Lake Analytics (ADLA) and Azure SQL Databases (or SQL Server database VMs in Azure) can be complimentary to each other.

    One scenario where both work well together is when ADLA is used to do the heavy lifting (ETL/ELT), transforming large or varying data sources (either semi-structured or unstructured) into something that a relational database, such as an Azure SQL Database, can easily handle. You can then load the refined, structured output into an Azure SQL database (e.g. via Azure Data Factory) for interactive querying/analytics purposes. Interactive querying is something that ADLA isn't great at just yet (although addressing this type of workload is on the road map). By loading the output into an Azure SQL Database, you can take advantage of in-memory analytics technologies such as the clustered columnstore index to achieve highly interactive query speeds.

    Another scenario where you could use the two technologies is when you have various raw files that can be easily read and manipulated with ADLA via U-SQL, and you have some structured data existing in a Azure SQL Database that you want to use to augment it with. Rather than exporting the data from the SQL Database to a flat file first, you'd use ADLA's federated query feature to avoid that data movement altogether and still be able to join these datasets residing in different backing stores.


    Regards,

    Michael Amadi

    Please use the 'Mark as answer' link to mark a post that answers your question. If you find a reply helpful, please remember to vote it as helpful :)

    Website: http://www.nimblelearn.com, Twitter: @nimblelearn

    Monday, June 20, 2016 4:23 PM

All replies

  • Hi Mahesh

    I am a bit confused about what your question is. There is no SQL Database per se in ADLS. U-SQL provides concepts such as databases and tables to manage and share data objects such as tables and views and code objects such as assemblies. It also gives you federated query support to run U-SQL against SQL Server instances running in Azure.

    The U-SQL tables can give you performance benefits because they provide internal clustering, partitions and statistics that help the query optimizer to chose better plans.

    You can find a short tutorial in the U-SQL hands-on-lab here: http://aka.ms/usql-hol and you can find more samples here: https://github.com/Azure/usql/tree/master/Examples


    Michael Rys


    Thursday, June 16, 2016 12:47 AM
    Moderator
  • Hi Mahesh,

    Adding to what Michael has already said. Azure Data Lake Analytics (ADLA) and Azure SQL Databases (or SQL Server database VMs in Azure) can be complimentary to each other.

    One scenario where both work well together is when ADLA is used to do the heavy lifting (ETL/ELT), transforming large or varying data sources (either semi-structured or unstructured) into something that a relational database, such as an Azure SQL Database, can easily handle. You can then load the refined, structured output into an Azure SQL database (e.g. via Azure Data Factory) for interactive querying/analytics purposes. Interactive querying is something that ADLA isn't great at just yet (although addressing this type of workload is on the road map). By loading the output into an Azure SQL Database, you can take advantage of in-memory analytics technologies such as the clustered columnstore index to achieve highly interactive query speeds.

    Another scenario where you could use the two technologies is when you have various raw files that can be easily read and manipulated with ADLA via U-SQL, and you have some structured data existing in a Azure SQL Database that you want to use to augment it with. Rather than exporting the data from the SQL Database to a flat file first, you'd use ADLA's federated query feature to avoid that data movement altogether and still be able to join these datasets residing in different backing stores.


    Regards,

    Michael Amadi

    Please use the 'Mark as answer' link to mark a post that answers your question. If you find a reply helpful, please remember to vote it as helpful :)

    Website: http://www.nimblelearn.com, Twitter: @nimblelearn

    Monday, June 20, 2016 4:23 PM