Reading data from Azure Data Lake using Service Principal App IDs

    Question

  • Hi,

    We have a link to an Azure Data Lake Store path, something like the below:
    Azure Data Lake path - adl://test.azuredatalakestore.net/folder1/folder2/folder3
    File Name - testfile.tsv

    The Service Principal App ID that has access to the above source data lake is:
    App ID - GUIDNumber
    App Name - TestAppName

    I wanted to know, in detail, the ways of reading data from the above data lake path into a SQL database.

    Thanks in advance,

    Raksha




    Saturday, October 20, 2018 3:54 PM

All replies

  • Hello Raksha,

    Without knowing that much about what you are trying to accomplish, have you considered Azure Data Factory?  Take a look at what a copy activity in ADF looks like:

    https://docs.microsoft.com/en-us/azure/data-factory/copy-activity-overview

    You may also be interested in transforming data:

    https://docs.microsoft.com/en-us/azure/data-factory/transform-data

    ADF offers connectors for ADL and various databases:

    https://docs.microsoft.com/en-us/azure/data-factory/copy-activity-overview#supported-data-stores-and-formats

    If this isn't what you are looking for, could you explain in more detail what you are trying to do with ADL and your SQL database?
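
    If you do go the ADF route, note that the pipeline can also be deployed and triggered from PowerShell, which may make it easier to slot into existing automation. The outline below is only a minimal sketch, assuming the AzureRM.DataFactoryV2 module and a factory whose linked services and datasets already exist; every name in it (resource group, factory, datasets) is a placeholder:

    # Minimal sketch: register and run a copy-activity pipeline from PowerShell.
    # "TestRG", "TestDataFactory", and the dataset names are all placeholders;
    # the datasets are assumed to already point at the ADLS file and the SQL table.
    $pipeline = @{
        name = "CopyAdlToSql"
        properties = @{
            activities = @(
                @{
                    name    = "CopyFromAdlsToSql"
                    type    = "Copy"
                    inputs  = @(@{ referenceName = "AdlsSourceDataset"; type = "DatasetReference" })
                    outputs = @(@{ referenceName = "SqlStagingDataset"; type = "DatasetReference" })
                    typeProperties = @{
                        source = @{ type = "AzureDataLakeStoreSource" }
                        sink   = @{ type = "SqlSink" }
                    }
                }
            )
        }
    }
    $pipeline | ConvertTo-Json -Depth 10 | Set-Content ".\CopyAdlToSql.json"

    # Register the pipeline definition with the factory, then trigger a run.
    Set-AzureRmDataFactoryV2Pipeline -ResourceGroupName "TestRG" -DataFactoryName "TestDataFactory" -Name "CopyAdlToSql" -DefinitionFile ".\CopyAdlToSql.json"
    Invoke-AzureRmDataFactoryV2Pipeline -ResourceGroupName "TestRG" -DataFactoryName "TestDataFactory" -PipelineName "CopyAdlToSql"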

    Monday, October 22, 2018 8:42 PM
    Moderator
  • Hi Jason,

    Thank you for the reply.

    This is basically a data warehouse, and one of the sources is now ADL.
    I just have to dump the file at the ADL source into my staging table DAILY, and then we have some business logic to load it into the main dbo table.

    1) Currently I am stuck on how to pull the ADL data to a SQL database/local system.

    As you mentioned, ADF is one option. I am looking for some more options, such as:
    a) PowerShell
    b) SSIS

    I am mainly looking at the above options because we already have SSIS in place.

    2) If we use ADF, how can we integrate it with the existing SSIS framework?

    3) We have the details of the Principal ID & Principal Name. Do we need any other details?

    Any guidance would be of great help.

    Thanks,
    Raksha


    Friday, October 26, 2018 4:50 PM
  • We could achieve the requirement using PowerShell.
    Below are the values that we require:
    1) Client ID (this is nothing but the App ID which has access to the data lake folder)
    2) Access key (the service principal's key/secret)
    3) Tenant ID
    4) Subscription name
    5) ADL account name
    6) Source path in ADL, along with the file name
    7) Destination path where the file needs to be downloaded


    $clientID = "xyz-pqr-abcd"
    $AccessKey =  "******="
    $TenantId = "pqr-dgsahg-ggddgg"

    $azurePassword    = ConvertTo-SecureString $AccessKey -AsPlainText -Force
    $psCred           = New-Object System.Management.Automation.PSCredential($clientID, $azurePassword)
    Login-AzureRmAccount -Credential $psCred -ServicePrincipal -Tenant $TenantId


    Get-AzureRmSubscription -SubscriptionName "DEVELOPMENT - Test Subscription"  | Set-AzureRmContext

    Export-AzureRmDataLakeStoreItem -AccountName "adlacct.azuredatalakestore.net" -Path "srcpath/folder/file.csv" -Destination "C:\Destination\file.csv"
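
    To finish the load from the same script, the downloaded TSV can then be pushed into the SQL staging table, for example with bcp. This is only a minimal sketch assuming the SQL Server command-line utilities are installed; the server, table, and credential values are placeholders:

    # Minimal sketch: bulk-load the downloaded TSV into the staging table with bcp.
    # Server, table, and credential values are all placeholders.
    $server = "yourserver.database.windows.net"
    $table  = "StagingDb.dbo.StagingTable"   # three-part name: database.schema.table
    $file   = "C:\Destination\file.csv"

    # -c = character mode, -t = field terminator (tab, to match the TSV), -F 2 = skip the header row
    bcp $table in $file -S $server -U "sqluser" -P "password" -c -t "`t" -F 2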

    Raksha

    Tuesday, December 4, 2018 6:02 AM