Storage account v2 Data lake storage gen 2 feature

  • Question

  • Hi Team,

    Can anyone tell me the difference between ADLS Gen2 and the Data Lake Storage Gen2 feature of a storage account v2?

    Basically, I wanted to know whether enabling the Data Lake Storage Gen2 feature in a storage account would give us all the benefits of ADLS.

    Wednesday, July 17, 2019 6:55 AM

Answers

  • Cost effectiveness is made possible as Data Lake Storage Gen2 is built on top of the low-cost Azure Blob storage. The additional features further lower the total cost of ownership for running big data analytics on Azure.

    The following articles describe some of the main concepts of Data Lake Storage Gen2 and detail how to store, access, manage, and gain insights from your data:

    Hierarchical namespace

    Create a storage account

    Use a Data Lake Storage Gen2 account in Azure Databricks

    Azure Data Lake Storage Gen2: Designed for enterprise big data analytics

    Azure Data lake storage Gen2

    Azure Storage GPV2

    Regional availability: available in all Azure regions.

    Would enabling the Data Lake Storage Gen2 feature in a storage account give us all the benefits of ADLS?

    Yes

    Because Azure Data Lake Storage Gen2 is integrated into the Azure Storage platform, applications can use either the Blob APIs or the Azure Data Lake Storage Gen2 file system APIs to access data. The Blob APIs let you leverage your existing investments in Blob storage and continue to take advantage of the large ecosystem of first- and third-party applications already available, while the Azure Data Lake Storage Gen2 file system APIs are optimized for analytics engines such as Hadoop and Spark.
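
    As a rough illustration of the two access paths described above, the sketch below builds the address of the same object on the Blob endpoint and on the Data Lake Storage Gen2 (DFS) endpoint, plus the `abfss://` form that Hadoop and Spark use. The account, container, and path names are hypothetical placeholders, and the host suffixes assume the Azure public cloud.

```python
from urllib.parse import quote

# Hypothetical names, used only for illustration.
ACCOUNT = "mystorageaccount"
CONTAINER = "raw"  # called a "file system" in the Data Lake Storage Gen2 APIs
PATH = "sales/2019/07/orders.csv"


def object_url(account: str, container: str, path: str, api: str) -> str:
    """Return the HTTPS URL of one object on the chosen endpoint.

    api="blob" targets the Blob endpoint; api="dfs" targets the
    Data Lake Storage Gen2 file system endpoint. Both address the same data.
    """
    if api not in ("blob", "dfs"):
        raise ValueError("api must be 'blob' or 'dfs'")
    host = f"{account}.{api}.core.windows.net"
    # Keep "/" unescaped so the directory structure stays visible in the URL.
    return f"https://{host}/{container}/{quote(path, safe='/')}"


def abfss_uri(account: str, container: str, path: str) -> str:
    """Return the URI form used by the ABFS driver in Hadoop and Spark."""
    return f"abfss://{container}@{account}.dfs.core.windows.net/{path}"


if __name__ == "__main__":
    print(object_url(ACCOUNT, CONTAINER, PATH, "blob"))
    print(object_url(ACCOUNT, CONTAINER, PATH, "dfs"))
    print(abfss_uri(ACCOUNT, CONTAINER, PATH))
```

    Only the host name changes between the two HTTPS URLs; the container and path are shared, which is why existing Blob tooling and analytics engines can work against the same account.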

    Additional benefits from integration with Azure Storage include:

    • Unlimited scale and performance due to significant advances made in storage account architecture.
    • Performance improvements when reading and writing individual objects, resulting in significantly higher throughput and concurrency.
    • Removes the need for customers to decide a priori, at data ingestion time, whether they want to run analytics. In Azure Storage we believe all data can and will be used for analytics.
    • Data protection capabilities, including encryption of all data at rest using either Microsoft-managed or customer-managed keys.
    • Integrated network firewall capabilities that allow you to define rules restricting access to only those requests originating from specified networks.
    • Durability options such as Zone Redundant Storage and Geo-Redundant Storage to enable your applications to be designed for high-availability and disaster recovery.
    • Linux integration - BlobFUSE allows customers to mount Blob Storage from their Linux VMs and interact with Azure Data Lake Storage Gen2 using standard Linux shell commands.

    Azure Storage is also built on a platform grounded in strong consistency, which guarantees that writes are made durable before success is acknowledged to the client. This is critically important for big data workloads, where the output of one task is often the input to the next. It greatly simplifies the development of big data applications, since they do not have to work around the issues that surface with weaker consistency models such as eventual consistency.

    Hope this helps!

    Kindly let us know if the above helps or you need further assistance on this issue.
    ------------------------------------------------------------------------------------------

    Please click "Mark as Answer" and upvote the post that helps you; this can be beneficial to other community members.


    Wednesday, July 17, 2019 10:44 AM
    Moderator
  • You can create an account by using PowerShell or the Azure CLI, but with those tools you can't perform operations or set access control lists (ACLs) on file systems, directories, and files. You can grant that access, including at the folder level, through Storage Explorer.

    However, you can refer to the suggestion mentioned in this GitHub link.
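
    To make the folder-level idea concrete, here is a minimal sketch of the POSIX-style ACL entries a user would need in order to list one directory: execute (`--x`) on every ancestor, including the root, and read plus execute (`r-x`) on the target directory. It only formats the entries and does not call any Azure API. The helper names and the principal ID are hypothetical; in practice the ID would be an Azure AD object ID.

```python
def acl_entry(principal_id: str, permissions: str, default: bool = False) -> str:
    """Format one ACL entry for a named user: [default:]user:<id>:<rwx>."""
    valid = len(permissions) == 3 and all(
        p in (flag, "-") for p, flag in zip(permissions, "rwx")
    )
    if not valid:
        raise ValueError("permissions must look like 'r-x' or 'rwx'")
    prefix = "default:" if default else ""
    return f"{prefix}user:{principal_id}:{permissions}"


def read_access_plan(directory: str, principal_id: str) -> dict:
    """Map each directory level to the entry a user needs to list `directory`.

    Ancestors (including the root "/") only need execute so the user can
    traverse them; the target directory needs read and execute.
    """
    parts = [p for p in directory.strip("/").split("/") if p]
    if not parts:
        # The target is the root of the file system itself.
        return {"/": acl_entry(principal_id, "r-x")}
    plan = {"/": acl_entry(principal_id, "--x")}
    current = ""
    for i, part in enumerate(parts):
        current += "/" + part
        is_target = i == len(parts) - 1
        plan[current] = acl_entry(principal_id, "r-x" if is_target else "--x")
    return plan


if __name__ == "__main__":
    # "0000-hypothetical-object-id" is a placeholder, not a real identity.
    for path, entry in read_access_plan("/sales/2019", "0000-hypothetical-object-id").items():
        print(path, entry)
```

    Sibling folders that carry no entry for the user stay inaccessible to them through ACLs, which is the kind of restriction below the container level that the question asks about.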

    Kindly let us know if the above helps or you need further assistance on this issue.
    ------------------------------------------------------------------------------------------

    Please click "Mark as Answer" and upvote the post that helps you; this can be beneficial to other community members.

    Wednesday, July 17, 2019 12:09 PM
    Moderator

All replies

  • Hello Sumanth,

    Thank you for the detailed description :)

    I had one question:

    In ADLS, we can give access at a folder level rather than at the parent level.

    So if we enable the Data Lake Storage Gen2 feature in a storage account v2, can I restrict access at the folder level rather than at the parent level? (In Blob storage, the finest level at which we can give access is the container; the folders within the container get the default access.)

    Wednesday, July 17, 2019 11:11 AM
  • Just checking in to see if the above answer helped. If it answers your query, please click “Mark as Answer” and upvote it, which might be beneficial to other community members reading this thread. If you have any further queries, do let us know.
    Friday, July 19, 2019 9:36 AM
    Moderator
  • @Nandan Hegde Is there any update on the issue?

    If the suggested answer helped with your issue, please click "Mark as Answer" and “Vote as Helpful” on the post that helps you; this can be beneficial to other community members.

    Monday, July 22, 2019 4:37 AM
    Moderator