Multiple file upload in parallel using Data Factory and Data Management Gateway


  • Hello,

    I am trying to upload 20 GB of data to Data Lake using ADF and Data Management Gateway. It is copying the files to Data Lake, but it copies them sequentially, meaning it waits for one file to finish before starting the next. I assumed that a copy pipeline would copy files in parallel; the sequential behavior is making the process take a long time. By contrast, when I upload files manually using the upload feature, it copies 50 files at once, which is faster than the ADF-pipeline-DMG approach.

    I am not sure whether any specific settings are needed when defining the Gateway, Linked service, Dataset, or Pipeline. If so, which settings should we define to speed up the process? See the attachment below, which shows how the manual upload copies files in parallel:

    Manual File upload in parallel

    Thanks, Manthan Upadhyay

    Wednesday, March 9, 2016 4:41 AM
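
    For reference, a minimal sketch of where such a setting would live in ADF v1 JSON authoring. The Copy Activity exposes a `parallelCopies` property under `typeProperties`; the pipeline, activity, and dataset names below are hypothetical, and whether this property takes effect for gateway-based sources depends on the service's support at the time:

    ```json
    {
        "name": "CopyOnPremToDataLake",
        "properties": {
            "activities": [
                {
                    "name": "CopyFilesActivity",
                    "type": "Copy",
                    "inputs": [ { "name": "OnPremFileDataset" } ],
                    "outputs": [ { "name": "DataLakeStoreDataset" } ],
                    "typeProperties": {
                        "source": { "type": "FileSystemSource" },
                        "sink": { "type": "AzureDataLakeStoreSink" },
                        "parallelCopies": 8
                    }
                }
            ],
            "start": "2016-03-09T00:00:00Z",
            "end": "2016-03-10T00:00:00Z"
        }
    }
    ```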

All replies

  • Thanks for using Data Management Gateway to move on-premises data to Azure Data Lake Store. As of now, Data Management Gateway only supports sequential data movement from on-premises sources to Azure Data Lake Store. We are working on releasing a new feature (in the coming weeks) that will enable parallel data movement from Azure Blob to Azure Data Lake Store. We are also working on enabling other combinations with Azure Data Lake Store, including scenarios using Data Management Gateway. We will share updates as we get closer to lighting up parallel data movement for all combinations with Azure Data Lake Store.
    Thursday, March 10, 2016 2:27 AM