Delete all files and folders (in ADLS G1) BELOW a given dataset with the Delete activity ("recursively" checkbox issue?)

  • Question

  • I know there is a "Delete" activity for ADF (deleting in ADLS G1) and I am trying to use it to delete all files and folders below a given dataset, for example:

    Main Folder:  datalakeName/processedFiles

    Now I want to delete everything in it, including subfolders, BUT KEEP the actual folder itself; just delete everything below it.

    Some examples of what would be in it:

    • datalakeName/processedFiles/recType1Files/file1.txt
    • datalakeName/processedFiles/recType1Files/file32.txt
    • datalakeName/processedFiles/recType1Files/file1asdf4.txt
    • datalakeName/processedFiles/recType2Files/file7771.txt
    • datalakeName/processedFiles/recType2Files/file35552.txt
    • datalakeName/processedFiles/recType2Files/file155asdf4.txt

    When I ran the Delete activity with the "Recursively" box checked, it deleted everything INCLUDING the main folder at "datalakeName/processedFiles". However, I want to LEAVE that base folder and just delete all the files that were processed for that day.

    I also tried adding a wildcard like "*.*" (which just deletes all the files and leaves empty folders) and just a "/" (which gives an error).

    How do I wind up with an empty folder called "datalakeName/processedFiles", so that when I go to process more files the next day, it's empty and ready for them?


    Edit: changed the title to include ADLS G1. I had only mentioned it in a follow-up post from yesterday.

    • Edited by mschandler Tuesday, February 11, 2020 11:55 AM
    Monday, February 10, 2020 5:11 PM

All replies

  • I tried getting creative and thought that if I just provided a filter date, that would delete all the files and folders, since the base folder is older. But then I get this error, indicating it is not supported:

    The activity payload is currently not supported. If you want to delete files filtered by LastmodifiedDate, please set fileName value in your dataSet payload. For example: fileName: '*'. If you want to delete the entire folder, please remove all LastmodifiedDate properties from payload.

    I just want to empty a folder...

    Another option I was looking at was to just let it delete everything and recreate the base folder, but there's no "create folder" option in ADF for ADLS G1 as far as I know.

    • Edited by mschandler Monday, February 10, 2020 5:46 PM
    Monday, February 10, 2020 5:40 PM
  • Hi there,

    You are using Azure Blob Storage. Folders are a virtual concept in Blob Storage, while containers are not. This means that folders are deleted when their underlying files are deleted. This is a limitation of Azure Blob Storage by design.

    As a workaround, you can chain a Copy activity to your Delete activity to copy an empty file to the folder you want.

    Hope this helps.

    Tuesday, February 11, 2020 10:25 AM
  • No, I'm not using Blob Storage. I'm using ADLS G1 (Azure Data Lake Storage Gen1), which does support folders. I wasn't clear about that in my original post, only in the earlier follow-ups, so I modified the original and now it's mentioned in both. Thanks.

    • Edited by mschandler Tuesday, February 11, 2020 2:42 PM
    Tuesday, February 11, 2020 11:49 AM
  • Hi there,

    Thanks for clarifying the ask. You can chain a Web activity to your Delete activity to create folders using the REST APIs for ADLS G1 (https://docs.microsoft.com/bs-latn-ba/azure/data-lake-store/data-lake-store-data-operations-rest-api#create-folders).

    This workaround should recreate the deleted folder(s).
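
For reference, here is a minimal sketch of the kind of request such a Web activity would need to send, assuming the WebHDFS-style `MKDIRS` operation described on the linked page (a `PUT` with an empty body and a bearer token). The helper name `build_mkdirs_request` and the `<access-token>` placeholder are hypothetical, and the account and folder names are simply the ones from the question. The code only builds the request and does not send it; how the token is obtained inside ADF is not covered in this thread.

```python
from urllib.parse import quote
import urllib.request


def build_mkdirs_request(account_name, folder_path, bearer_token):
    """Build (but do not send) a WebHDFS-style MKDIRS request for ADLS Gen1.

    Hypothetical helper for illustration only; the URL shape is assumed
    to follow the "create folders" example in the linked documentation.
    """
    # Drop leading/trailing slashes and percent-encode the path ("/" is kept).
    clean_path = quote(folder_path.strip("/"))
    url = (
        f"https://{account_name}.azuredatalakestore.net"
        f"/webhdfs/v1/{clean_path}?op=MKDIRS"
    )
    return urllib.request.Request(
        url,
        data=b"",      # empty request body
        method="PUT",  # MKDIRS is issued as a PUT
        headers={"Authorization": f"Bearer {bearer_token}"},
    )


# Recreate the base folder from the question after the recursive delete.
request = build_mkdirs_request("datalakeName", "processedFiles", "<access-token>")
print(request.get_method(), request.full_url)
```

In a pipeline, the same URL, method, and header would go into the Web activity's settings, with the Web activity running after the Delete activity succeeds.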

    Tuesday, February 18, 2020 7:13 AM