Accessing DataLake Store from Azure DWH

    Question

  • Hi,

    I've created an Azure DWH instance (North Europe), with some (ORC formatted) files located in Data Lake Store.

    I can successfully load these files with CTAS. However, when I turn the Data Lake Store Firewall on, I get an error back saying:

    Msg 105019, Level 16, State 1, Line 143
    EXTERNAL TABLE access failed due to internal error: 'Java exception raised on call to HdfsBridge_IsDirExist. Java exception message:
    Operation GETFILESTATUS failed with HTTP403 : null
    Last encountered exception thrown after 1 tries [HTTP403(null)]: Error [Operation GETFILESTATUS failed with HTTP403 : null
    Last encountered exception thrown after 1 tries [HTTP403(null)]] occurred while accessing external file.'

    The only change is enabling the Data Lake Firewall, which is configured with the Client VM (Azure VM) IP Address. 

    Many thanks

    Nick Haslam


    Thursday, February 16, 2017 12:48 PM

Answers

  • Hi Nick,

    The firewall capability in ADLS cannot at present be used for this scenario. When you query ADLS via Azure DW, the machines that access ADLS belong to the Azure DW infrastructure, not to your client VM, so their requests are rejected by the firewall. We are investigating adding a switch in the firewall to allow all Azure services access to an ADLS account. I do not have firm dates on this yet. In the meantime, please feel free to upvote this related feature request.

    https://feedback.azure.com/forums/327234-data-lake/suggestions/16703647-adding-ip-filtering-firewall-to-azure-data-lake-st

    Thanks,

    Amit

    • Marked as answer by Nick__H Monday, March 20, 2017 7:46 AM
    Thursday, February 16, 2017 5:39 PM
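
    One way to picture why the load fails once the firewall is on: the firewall admits requests by source IP range, and, per the answer above, the requests come from Azure DW infrastructure rather than from the client VM whose address was allowed. The sketch below is only an illustration of that check. The function `is_allowed` and both IP addresses are hypothetical (documentation-range addresses, not real Azure values), and it does not call any Azure API.

```python
import ipaddress


def is_allowed(source_ip, allowed_ranges):
    """Return True if source_ip lies inside any (start, end) range, inclusive."""
    ip = ipaddress.ip_address(source_ip)
    return any(
        ipaddress.ip_address(start) <= ip <= ipaddress.ip_address(end)
        for start, end in allowed_ranges
    )


# Hypothetical addresses for illustration only.
CLIENT_VM_IP = "203.0.113.10"   # the Azure VM added to the firewall
DW_NODE_IP = "198.51.100.25"    # a machine in the Azure DW infrastructure

# The firewall is configured with the client VM address only.
firewall_rules = [(CLIENT_VM_IP, CLIENT_VM_IP)]

for label, ip in [("client VM", CLIENT_VM_IP), ("DW node", DW_NODE_IP)]:
    status = 200 if is_allowed(ip, firewall_rules) else 403
    print(f"{label} ({ip}): HTTP {status}")
```

    Under these assumptions the client VM passes the check while the DW node does not, which would match the HTTP403 on GETFILESTATUS reported in the question.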

All replies

  • Thanks Amit.

    So, just to be clear, the Data Lake Firewall is only really for connecting to Data Lake from non-Azure sites?

    For us, all traffic to/from the Data Lake will come from within Azure, either via an Azure VM or Azure SQL DW. Can you confirm how best to secure this? Is there anything else we can do aside from restricting permissions?

    Many thanks

    Nick.


    Monday, February 20, 2017 8:24 AM
  • That is correct. For now, there is nothing further you can do with the firewall for this scenario. You will need to wait for the "Allow Azure services" option to become available. We are working on it.

    Thanks!

    Monday, March 20, 2017 4:43 AM