Hierarchical Name Space property

  • Question

  • Hello - 

    I'm new to Azure. I'm being asked to build an Azure Data Lake for one of our departments, which has multiple big data teams. Management wants all of the teams in this department to share one data lake. I have heard that getting the isHnsEnabled property right is critical because it cannot be changed in the future. I have to present the options to the managers and make a recommendation, but I don't know the implications of having or not having Hierarchical Name Space available for a data lake, in terms of how that choice enables or limits the services that can be deployed in that data lake. Is there an online resource anyone knows of that covers this question? I haven't found any helpful articles online.

    Thanks much,

    Michael

    Wednesday, May 6, 2020 5:00 PM

All replies

  • Hello mikebutak and thank you for your question.  Doc link

    You heard correctly.  The Hierarchical namespace feature must be selected during account creation.  It cannot be changed after account creation.
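    Because the setting is fixed at creation time, it can help to make it explicit in whatever template the account is deployed from. Below is a minimal sketch that builds an ARM template fragment as a Python dictionary. The `isHnsEnabled` property and the `StorageV2` kind are the real settings; the account name, location, SKU, and API version are placeholder assumptions for illustration, not recommendations.

```python
import json


def storage_account_resource(name: str, location: str, hns_enabled: bool) -> dict:
    """Build an ARM resource definition for a general-purpose v2 storage account."""
    return {
        "type": "Microsoft.Storage/storageAccounts",
        "apiVersion": "2019-06-01",        # assumed API version
        "name": name,                      # placeholder account name
        "location": location,              # placeholder region
        "sku": {"name": "Standard_LRS"},   # assumed SKU
        "kind": "StorageV2",
        # The property discussed in this thread: chosen once, at creation.
        "properties": {"isHnsEnabled": hns_enabled},
    }


def build_template(name: str, location: str, hns_enabled: bool) -> dict:
    """Wrap the storage account resource in a minimal ARM deployment template."""
    return {
        "$schema": "https://schema.management.azure.com/schemas/2019-04-01/deploymentTemplate.json#",
        "contentVersion": "1.0.0.0",
        "resources": [storage_account_resource(name, location, hns_enabled)],
    }


if __name__ == "__main__":
    print(json.dumps(build_template("examplelake001", "eastus", True), indent=2))
```

    Keeping the flag in a template means the choice is reviewed and recorded before the account exists, which is the only point at which it can be made.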

    The Hierarchical Name Space is greatly preferred for certain applications and services, in particular anything Hadoop-like that would use HDFS. These tend to be distributed computing applications such as HDInsight.

    The more immediate and widespread implication of using or not using Hierarchical Name Space is the granularity of permissions. This is a consequence of how the concept of 'folders' or 'directories' is implemented.

    With the Hierarchical Name Space, you can grant / restrict access to folders or files.  Without Hierarchical Name Space, this is restricted to the container level. ACL Doc Link
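    To give a feel for what folder-level permissions look like, here is a small sketch that assembles a POSIX-style ACL string of the kind used with Hierarchical Name Space accounts. The helper names (`acl_entry`, `build_acl`) and the all-zero object ID are hypothetical, and the validation is deliberately simplified. A string like this is what you would typically pass as the `acl` argument when setting access control on a directory, for example through the `azure-storage-file-datalake` SDK; that call is not shown because it needs a real account and credentials.

```python
def acl_entry(kind: str, permissions: str, principal_id: str = "", default: bool = False) -> str:
    """Format one ACL entry as [default:]type:id:permissions.

    kind         -- "user", "group", "mask", or "other"
    permissions  -- three characters such as "rwx", "r-x", or "---"
    principal_id -- Azure AD object ID for a named user/group; empty for the owner entries
    default      -- True for a default ACL entry inherited by new child items
    """
    if kind not in ("user", "group", "mask", "other"):
        raise ValueError(f"unknown ACL entry type: {kind}")
    allowed = ("r-", "w-", "x-")  # simplified: one allowed set per position
    if len(permissions) != 3 or any(ch not in ok for ch, ok in zip(permissions, allowed)):
        raise ValueError(f"invalid permission string: {permissions}")
    scope = "default:" if default else ""
    return f"{scope}{kind}:{principal_id}:{permissions}"


def build_acl(entries: list) -> str:
    """Join individual entries into the comma-separated ACL string."""
    return ",".join(entries)


if __name__ == "__main__":
    team_object_id = "00000000-0000-0000-0000-000000000000"  # hypothetical object ID
    acl = build_acl([
        acl_entry("user", "rwx"),                   # owning user
        acl_entry("group", "r-x", team_object_id),  # one team gets read + traverse
        acl_entry("group", "r-x"),                  # owning group
        acl_entry("other", "---"),                  # everyone else: no access
    ])
    print(acl)
```

    With several teams sharing one lake, each team's folder could carry its own entries like these, which is not possible when access can only be granted at the container level.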

    Different drivers and protocols are used depending on whether you are using Hierarchical Name Space. Most services call out which ones they support. Storage without Hierarchical Name Space is often called 'Blob Storage'. Storage with Hierarchical Name Space is often called 'Azure Data Lake Gen2'.
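    As a rough illustration of the driver difference, Hadoop-style tools commonly address Hierarchical Name Space accounts through the ABFS driver (`abfss://` against the `dfs` endpoint) and plain Blob Storage through the WASB driver (`wasbs://` against the `blob` endpoint). The sketch below only builds the two URI shapes; the function name and the account, container, and path values are made up for the example.

```python
def storage_uri(account: str, container: str, path: str = "", hns: bool = True) -> str:
    """Return a Hadoop-style URI for a path in an Azure storage account.

    hns=True  -> ABFS driver, dfs endpoint (Hierarchical Name Space account)
    hns=False -> WASB driver, blob endpoint (plain Blob Storage)
    """
    scheme, endpoint = ("abfss", "dfs") if hns else ("wasbs", "blob")
    return f"{scheme}://{container}@{account}.{endpoint}.core.windows.net/{path.lstrip('/')}"


if __name__ == "__main__":
    # Placeholder account, container, and path names.
    print(storage_uri("examplelake001", "raw", "sales/2020/data.parquet", hns=True))
    print(storage_uri("exampleblob001", "raw", "sales/2020/data.parquet", hns=False))
```

    When checking whether a given service fits your account, the scheme and endpoint it asks for are usually a quick hint as to which kind of storage it expects.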

    Is there a set of services you are looking at in particular?

    Wednesday, May 6, 2020 6:10 PM
  • Is there an online resource anyone knows of that covers this question?  I haven't found any helpful articles online.  

    If you enable HNS, the storage account becomes Azure Data Lake Storage Gen2. The blog post linked below covers its advantages.

    Also remember: Azure Data Lake Gen1 + Azure Blob Storage benefits = Azure Data Lake Gen2 (the latest version).

    You mentioned that the big data teams are asking for a data lake. If these teams query or process data stored in the lake (as text, Parquet, etc.) from Azure Databricks or Azure HDInsight, then you will probably have to enable HNS.

    https://www.blue-granite.com/blog/10-things-to-know-about-azure-data-lake-storage-gen2


    - Vaibhav

    Wednesday, May 6, 2020 6:23 PM
  • Hi mikebutak,

    Following up to see whether any of the above suggestions were helpful. If one of them answers your question, please consider clicking "Mark as Answer" and "Up-Vote" on the comment that helped, as it may benefit other community members reading this thread.

    If you have any further questions, do let us know.


    Thank you


    Thursday, May 28, 2020 10:03 PM