none
how to get file Count in Azure data factory and do activities based on it RRS feed

  • Question

  • Hi,

    I have a sftp server folder from where I am copying all the child files to a file system(destination). After the copy activity is finished, I have to delete the source folder. But before deleting the source folder, I need to confirm that number of files in the source was copied to the destination folder.

    Is there a way to check whether the number of files in source and destination are same, and if they are same proceed, else exit,

    Thanks

    Monday, December 2, 2019 7:50 AM

All replies

  • Hello Piyush Gopinath and thank you for your question.  There is a way to count the child items in a SFTP folder.

    I used a Get Metadata activity on my SFTP dataset.  In the 'Field list' I selected 'Child Items'.  This returns a list of filenames.  After that I used a Set Variable activity to count and store the result.  The expression I used was:

    @string(length(activity('Get Metadata1').output.childItems))

    This approach might not be suitable if you have a very very large list of items.  The Get Metadata activity's output is limited to 1 MB in size.  To clarify, that means the result of the activity, not the size of the files.

    Monday, December 2, 2019 10:55 PM
    Moderator
  • Thanks Martin for the reply,

    I have already tried the getmetadata activity but I was getting error, maybe because the filelist is huge

    Tuesday, December 3, 2019 5:34 AM
  • Since Get Metadata activity didn't work for you, I'll brainstorm some alternatives.

    • You could try using a Custom Activity to connect to the SFTP server and do the count.

    Going beyond solutions in the box of Data Factory, I do have more ideas.

    • On your SFTP server, create a script to count the items in the folder, and write the result to a separate file.  Have any writes to your target folder trigger the script.  Data Factory can then do a lookup on the count-file.
    • On your SFTP server, expose an endpoint whose sole capability is to return the count.  Data Factory can call this endpoint.
    • Have Data Factory call another service such as Azure Functions or Azure Logic App to go and count the files and return the result to Data Factory.
    Thursday, December 5, 2019 1:50 AM
    Moderator
  • Please let me know if you need any more help, else I will assume this resolved.
    Friday, December 6, 2019 7:23 PM
    Moderator