Clear up storage after task is finished

  • Question

  • I am using ITaskSubmissionHelper to define my batch tasks. The result (among other things) is that lots of new blob containers appear in my storage account. One of these contains all the files that are involved with at least one task that I submitted, the others are empty (I assume they correspond to the created tasks, as their number agrees).

    My problem is that even though I delete the work item once it is finished, these blob containers are still there in my storage account and I could not find the means to efficiently remove them.

    Can you tell me how I can get a list of the blob containers that I should remove when I am removing a work item? I am pretty sure there is a simple way to do this, I just could not find it.

    Thanks,

      Sándor Kolumbán

    Wednesday, May 27, 2015 6:32 PM

Answers

  • Here is how it is supposed to work. I will get to your observed behavior towards the end of this reply:

    ITaskSubmissionHelper.CommitAsync() begins a workflow that front-loads file staging.

    As each implementation of IFileStagingProvider is called, it is asked to populate IJobCommitUnboundArtifacts.FileStagingArtifacts (if an entry is missing).  Each implementation is free to expose, in its artifacts, whatever data it feels is interesting.

    The FileToStage implementation populates its artifacts with the name of the new container.

    SequentialFileStagingArtifact.BlobContainerCreated will be populated with the name of the new container.  The property is set early in the execution tree and is available in real time.  It is a best practice for implementations of IFileStagingProvider to expose, in real time, any artifacts that might be left behind in case of failure.  This enables cleanup, as you seek here, and something close to resilient programming patterns.
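
    The "expose artifacts in real time" practice described above can be sketched roughly as follows. This is a minimal, language-neutral illustration in Python, not the Batch client library: StagingArtifacts, stage_files, cleanup and the callback parameters are all hypothetical names standing in for the real staging provider and storage client.

```python
from typing import Callable, Iterable, List


class StagingArtifacts:
    """Hypothetical stand-in for a file staging artifact object."""

    def __init__(self) -> None:
        # Names of containers created so far, readable at any time.
        self.containers_created: List[str] = []


def stage_files(
    artifacts: StagingArtifacts,
    container_name: str,
    files: Iterable[str],
    create_container: Callable[[str], None],
    upload: Callable[[str, str], None],
) -> None:
    """Create one container, record it immediately, then upload the files."""
    create_container(container_name)
    # Record the name before uploading, so it stays visible if an upload fails.
    artifacts.containers_created.append(container_name)
    for path in files:
        upload(container_name, path)


def cleanup(
    artifacts: StagingArtifacts,
    delete_container: Callable[[str], None],
) -> List[str]:
    """Delete every recorded container and return the names that were deleted."""
    deleted: List[str] = []
    while artifacts.containers_created:
        name = artifacts.containers_created.pop()
        delete_container(name)
        deleted.append(name)
    return deleted
```

    Because the container name is recorded before the first upload, a caller that catches a staging failure can still pass the same artifacts object to cleanup() and remove whatever was left behind.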

    ITaskSubmissionHelper.CommitAsync()/Commit() should only be called once.  Results from multiple threads/async tasks are undefined.

    FileToStage names each container beginning with Batch.Constants.DefaultConveniencePrefix.  If a NamingFragment was set prior to Commit(), that value is appended.  Finally, a timestamp based on DateTime.UtcNow is appended.
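
    As a rough illustration of that naming scheme, the sketch below builds a name from a prefix, an optional naming fragment and a UTC timestamp, and then picks cleanup candidates out of a list of container names by prefix. It is Python rather than the .NET API, and several details are assumptions: the value of ASSUMED_PREFIX (the actual value of Batch.Constants.DefaultConveniencePrefix is not given in this thread), the "-" separators, and the timestamp format.

```python
from datetime import datetime, timezone
from typing import Iterable, List, Optional

# Hypothetical value; the real DefaultConveniencePrefix is not stated in this thread.
ASSUMED_PREFIX = "example-prefix-"


def build_container_name(naming_fragment: Optional[str], now: datetime) -> str:
    """Prefix, then the optional naming fragment, then a UTC timestamp."""
    parts = [ASSUMED_PREFIX]
    if naming_fragment:
        parts.append(naming_fragment + "-")
    # The timestamp format is an assumption made for this sketch.
    parts.append(now.astimezone(timezone.utc).strftime("%Y-%m-%d-%H-%M-%S"))
    return "".join(parts)


def cleanup_candidates(
    container_names: Iterable[str],
    naming_fragment: Optional[str] = None,
) -> List[str]:
    """Return the names that start with the prefix (and fragment, if given)."""
    wanted = ASSUMED_PREFIX + (naming_fragment + "-" if naming_fragment else "")
    return [name for name in container_names if name.startswith(wanted)]
```

    A prefix match alone would also pick up containers staged for other work items, so filtering on a naming fragment that is unique per work item is the safer of the two; reading the container name from the staging artifacts avoids guessing altogether.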

    FileToStage creates one and only one new container for each call to CommitAsync() (really, IFileStagingProvider.StageFilesAsync()).

    Assuming you are using FileToStage, I cannot explain why you are seeing empty containers, and I cannot explain why they correspond to created tasks.  Perhaps you could share some of your code?


    daryl

    Thursday, May 28, 2015 8:47 PM

All replies

  • Thanks for the info. So what I need is to collect the artifacts from the created FileToStage objects, so I can clean up once the tasks are finished. That is good news, I will try that. Maybe in the meantime I will discover when the empty blob containers are created.

    ITaskSubmissionHelper.Commit() is called only once, so that is not the issue.

    Saturday, May 30, 2015 11:35 AM