Copy only changes from AWS S3 bucket to Azure blob container

  • Question

  • Is it possible to define a pipeline that copies only the changes from an AWS S3 bucket to an Azure Blob container? I do not want every file to be copied each time the pipeline runs, only files that are new or have changed since the last run.

    Also, is the inbound dataset's data stored in the data factory's region at any point before it is moved to the outbound dataset? For example, if my pipeline is in the US and my Azure Blob container is in Asia, would the data be temporarily stored in the US while the pipeline is running?

    Thanks

    Friday, September 15, 2017 4:42 AM

All replies

  • At the moment, ADF only supports incremental loads if you partition your data into folders (e.g. create one folder per day and load the full folder); you cannot make a selective load based on the last-modified date. (A custom activity would of course work; see the sketch below.)

    The data has to be processed by ADF, but I do not know whether it is physically staged anywhere during this process.

    -gerhard


    Gerhard Brueckl
    blogging @ http://blog.gbrueckl.at
    working @ http://www.pmOne.com

    Friday, September 15, 2017 6:42 AM
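
    As a rough illustration of the selective (custom) approach mentioned above, not something ADF does for you out of the box, the sketch below lists S3 objects with boto3, keeps only those modified since the last run, and uploads them with the azure-storage-blob SDK. The bucket name, container name, connection string, and the hard-coded last-run timestamp are placeholders you would replace with your own values and state tracking.

    ```python
    from datetime import datetime, timezone

    import boto3
    from azure.storage.blob import BlobServiceClient

    S3_BUCKET = "my-source-bucket"                         # placeholder
    AZURE_CONN_STR = "<azure-storage-connection-string>"   # placeholder
    AZURE_CONTAINER = "my-target-container"                # placeholder

    # Timestamp of the previous successful run; in practice you would persist
    # this (e.g. in a blob or a control table) rather than hard-coding it.
    last_run = datetime(2017, 9, 14, tzinfo=timezone.utc)

    s3 = boto3.client("s3")
    blob_service = BlobServiceClient.from_connection_string(AZURE_CONN_STR)
    container = blob_service.get_container_client(AZURE_CONTAINER)

    # Page through the bucket and copy only objects changed since the last run.
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=S3_BUCKET):
        for obj in page.get("Contents", []):
            if obj["LastModified"] > last_run:
                body = s3.get_object(Bucket=S3_BUCKET, Key=obj["Key"])["Body"].read()
                container.upload_blob(name=obj["Key"], data=body, overwrite=True)
                print(f"copied {obj['Key']}")
    ```

    The same filtering logic could live inside an ADF custom activity; the point is simply that the selection on last-modified date happens in your own code rather than in the built-in copy activity.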