Azure Data Factory pipeline copy activity slow?

  • Question

  • I have a pipeline with 4 scheduled copy activities. I have just created one copy activity where the input and output datasets are both Azure SQL. The source table has more than 1 lakh (100,000) records, and copying them to the destination takes a long time. The copy activity timeout is 1 hour, so it always times out, and only 50,000 records are copied to the destination. I have noticed that one column, "description", is what makes the process slow.
    Wednesday, August 24, 2016 6:43 AM

All replies

  • Hi harshu288, 

    The activity timeout is configurable; you can try extending it and see whether that helps.

            "policy": {
              "concurrency": 1,
              "executionPriorityOrder": "OldestFirst",
              "retry": 0,
              "timeout": "01:00:00"
            }

    How many rows are there in total? What is the performance tier (S0? P1?) of the source and target Azure SQL Databases? It would also be great if you could share the Azure subscription ID, the data factory name, and the "runid" of the slow run. Then we can take a look at whether there is any way to optimize further.

    In general, several factors affect throughput: the performance tier of the data source, the schema of the dataset (a smaller row size might need a larger batch size), and other workloads on the source/target DB. The more you can share, the better we can help troubleshoot. Here you can find reference data for copy throughput among different data sources:

    https://azure.microsoft.com/en-us/documentation/articles/data-factory-copy-activity-performance/
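
    As a rough sketch of where the timeout and batch size mentioned above would be set, a copy activity definition might look like the following. The activity and dataset names are hypothetical, and the values are placeholders to experiment with rather than recommendations; with a wide "description" column, a smaller "writeBatchSize" may be worth trying, but that is an assumption to verify against the actual run.

```json
{
  "name": "CopySqlToSql",
  "description": "Illustrative only: names and values here are assumptions, not taken from the actual pipeline",
  "type": "Copy",
  "inputs": [ { "name": "SourceSqlDataset" } ],
  "outputs": [ { "name": "TargetSqlDataset" } ],
  "typeProperties": {
    "source": {
      "type": "SqlSource"
    },
    "sink": {
      "type": "SqlSink",
      "writeBatchSize": 1000,
      "writeBatchTimeout": "00:30:00"
    }
  },
  "policy": {
    "concurrency": 1,
    "executionPriorityOrder": "OldestFirst",
    "retry": 0,
    "timeout": "03:00:00"
  }
}
```

    Here "writeBatchSize" is the number of rows buffered before each insert into the target table, and "writeBatchTimeout" is how long a single batch insert may take; the "policy" block is the same one shown earlier, with a longer timeout.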

    Regards.

    Chao


    panchao

    Wednesday, August 24, 2016 8:49 AM
  • Well, I have already increased the timeout from 1 hr to 3 hr and am still getting the timeout error. Let me check the article you linked. If possible, please share your email ID.

    Thanks !!

    Thursday, August 25, 2016 6:27 AM
  • As I mentioned, could you please share the runid, as well as the performance tier of the source and target databases? You can reach me at chaopan@microsoft.com. Thanks.

    panchao

    Sunday, September 4, 2016 1:53 PM