Has UploadToBlob changed its behavior when creating directories?

All replies

  • I couldn't quite follow... what's the expected behavior, what are you seeing, and what's "UploadToBlob"?

    Also, what do you mean by a normal file versus a directory file?  Are you talking about blobs in the cloud showing up the wrong way, or the wrong local files getting created?  Maybe an example would help me understand.

    Saturday, November 20, 2010 1:32 AM
  • This code from the sample suggests to me that a blob is created for each directory on the local machine:

        // Called by AzureBlobSyncProvider.InsertItem.
        internal SyncedBlobAttributes InsertFile(FileData fileData, string relativePath, Stream dataStream)
        {
    ...
          if (fileData.IsDirectory)
          {
            // Directories have no stream
            dataStream = new MemoryStream();
          }
    
          // Specify an optimistic concurrency check to prevent races with other endpoints syncing at the same time.
          BlobRequestOptions opts = new BlobRequestOptions();
          opts.AccessCondition = AccessCondition.IfNotModifiedSince(uninitTime);
          
          try
          {
            blob.UploadFromStream(dataStream,opts);
    ...
    
    It does look like the sample was updated in June, so it could be that this is new behavior.  It seems like a good idea to me... otherwise you can't keep track of empty directories.  (Blob storage has no notion of a directory, just blobs with paths that contain slashes.)

    Saturday, November 20, 2010 1:37 AM
  • When I send the blob to Azure (with the directory attribute set, I believe), it shows up in the CloudBerry browser as an ordinary file, which I'm 99% sure means that the type is not directory.  I'm asking because this is running the stock software on CodePlex from the Sync Framework team for syncing files to blob storage. I'm also 95% sure that this stock software used to work, so that if you tried to sync just a single empty directory, it would show up in CloudBerry as an empty directory.

    Now, it is showing up in CloudBerry as a file instead of as a directory.  What I was looking for was "oh yeah, we just changed the way we..." but obviously that wasn't the response.  I'll need to dig in further and really figure out what is going on.  I only found this because the sync software I'm developing stopped working correctly.  I went back to the one from the Sync Framework team, and that no longer works correctly either, which prompted my note here.

    Thanks Steve,


    Peter Kellner http://peterkellner.net Microsoft MVP • ASPInsider
    Saturday, November 20, 2010 1:37 AM
  • Yeah, I'd check with the Sync Framework guys to see if they changed something.

    Blob storage doesn't and never has had a concept of a directory, so nothing changed there.  (A container is a bit like a directory, but there's only one level of it.)

    Saturday, November 20, 2010 1:57 AM
  • I've been pretty heads down on this for the last month, so it would be a recent thing.  At the moment, a mystery. I'm sure I'll figure it out.

    So, exactly how do products like CloudBerry know the file is a directory and not just an empty file?  The behavior I see in CloudBerry is the following:

    I upload one file in a subdirectory, and CloudBerry shows it as a directory and a file in that directory.

    I upload just a directory, and CloudBerry shows me an empty file and a directory with the same name.

    The Sync Framework sets an attribute on the blob called IsDirectory, which is how it knows.

    Thanks,

    (BTW, I get the container concept, I'm talking about inside the container)

     


    Peter Kellner http://peterkellner.net Microsoft MVP • ASPInsider
    Saturday, November 20, 2010 2:08 AM
  • I don't know what CloudBerry does.  What we store is:

    mycontainer

    - myblob1

    - another/one/of/my/blobs

    - yet/another/blob

    We have a List Blobs API that allows you to treat the forward slash (or any other delimiter) as a magic character and return a list like this (for the above):

    mycontainer

    - myblob1

    - another*

    - yet*

    I put the asterisk there because the API actually sends back information that says those two aren't really blobs but rather "blob prefixes," which means there's at least one blob with a name that starts with that string (and then the delimiter).

    Most of the tools use the default forward slash as a delimiter and display results like what List Blobs gives you.  Empty directories, though, are a bit problematic, since we only store blobs.  If there's no blob that starts with "foo", then there's no listing for a blob prefix called "foo."
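    The delimiter behavior described above can be sketched in plain Python. This is only a local simulation over a list of names, not a call to the storage service; `list_blobs` and its parameters are hypothetical names chosen for illustration. It assumes each returned prefix includes the trailing delimiter.

```python
def list_blobs(blob_names, prefix="", delimiter="/"):
    """Simulate a delimiter-based listing over a flat set of blob names.

    Returns (blobs, prefixes): the names that sit directly under `prefix`,
    and the distinct "blob prefixes" that end with the delimiter.
    """
    blobs, prefixes = [], []
    for name in sorted(blob_names):
        if not name.startswith(prefix):
            continue
        rest = name[len(prefix):]
        cut = rest.find(delimiter)
        if cut == -1:
            # No delimiter left in the name: this is a real blob.
            blobs.append(name)
        else:
            # Collapse everything past the first delimiter into one prefix.
            p = prefix + rest[:cut + len(delimiter)]
            if p not in prefixes:
                prefixes.append(p)
    return blobs, prefixes


container = ["myblob1", "another/one/of/my/blobs", "yet/another/blob"]
top_blobs, top_prefixes = list_blobs(container)
print(top_blobs)     # ['myblob1']
print(top_prefixes)  # ['another/', 'yet/']
```

    Listing with `prefix="foo/"` returns two empty lists, since no blob name starts with that string. That is why an empty directory has nothing to show up as.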

    I downloaded CloudBerry to try to figure out what it did, and I didn't see it storing empty directories at all.  I dragged an empty directory over, and nothing happened.  When I dragged a directory with a file in it and then deleted the blob, an empty directory stayed there, but only until I refreshed.  Are you seeing something different?

    Saturday, November 20, 2010 5:45 AM
  • Hi Peter,

    As you're aware, blob storage has a two-level hierarchy (containers and blobs), so you can't upload an empty directory and expect it to persist in blob storage. I think what's going on behind the scenes in CloudBerry Explorer is that when you upload an empty folder, it creates an empty file in that folder to persist it, so that you can upload files into that folder later. I used SpaceBlock a long time back, and that was the approach it used at the time.

    As for your other question about how tools know if a file is a directory, please look at the REST API documentation for listing blobs here: http://msdn.microsoft.com/en-us/library/dd135734.aspx . Basically, anything under a <BlobPrefix> node is treated as a folder (with some logic, of course, which could vary from vendor to vendor). So if we take Steve's example above, when you list blobs in mycontainer, Azure storage returns two entries under <BlobPrefix> nodes: another and yet. Tools like our Cloud Storage Studio treat each of those as a folder, so that when you click on the folder in our tool, we use its value as the blob prefix and list blobs whose names start with that (e.g. "another/").
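    As a rough illustration of how a tool might separate blobs from folders, here is a minimal Python sketch that parses a listing response. The XML below is a hand-written, heavily abridged stand-in for a List Blobs response to Steve's example (real responses carry more elements and attributes), and `split_listing` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# Hand-written, abridged sample; not captured from a real service call.
SAMPLE_RESPONSE = """
<EnumerationResults>
  <Delimiter>/</Delimiter>
  <Blobs>
    <Blob><Name>myblob1</Name></Blob>
    <BlobPrefix><Name>another/</Name></BlobPrefix>
    <BlobPrefix><Name>yet/</Name></BlobPrefix>
  </Blobs>
</EnumerationResults>
"""


def split_listing(xml_text):
    """Return (blob_names, folder_prefixes) from a listing response."""
    root = ET.fromstring(xml_text.strip())
    # <Blob> entries are real blobs; <BlobPrefix> entries are what a
    # tool would typically render as folders.
    blobs = [e.findtext("Name") for e in root.iter("Blob")]
    folders = [e.findtext("Name") for e in root.iter("BlobPrefix")]
    return blobs, folders


blobs, folders = split_listing(SAMPLE_RESPONSE)
print(blobs)    # ['myblob1']
print(folders)  # ['another/', 'yet/']
```

    A tool would then pass a folder value such as "another/" back as the prefix of the next listing request to show that folder's contents.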


    Hope this helps.

    Thanks

    Gaurav Mantri

    Cerebrata Software

    http://www.cerebrata.com

    Saturday, November 20, 2010 6:55 AM
  • OK, I'm pretty sure I follow.  Let me paraphrase and please correct me if I'm wrong.

    1) Empty Directories Are Always Ignored

    2) Empty Files Are Always Ignored

    So, my problem is that I need to faithfully reproduce an existing directory tree.  That is, I first sync my directory from local to Azure storage; then I need to be able to point the Azure container I synced to at a different directory root and have it come back the same as the original.  Both empty files and empty directories are legitimate artifacts I have to reproduce.

    So, to do this, do I need to invent some metadata tags and actually stick some data in the blob so it works?  That is, I'm the only one reading and writing these so I have control of it.

    Please poke holes in my plan.  It would be a big help for the plan to fall on its face now, rather than later.

    BTW Gaurav, I've just started using your explorer again.  I hadn't used it because it seemed harder/slower than "that other one".  Now I've found the checkbox "show dev storage" and that seems to have done the trick.  I'll keep using it and see how it goes.


    Peter Kellner http://peterkellner.net Microsoft MVP • ASPInsider
    Saturday, November 20, 2010 5:49 PM
  • In both of your bullets "... are always ignored", I'm not 100% sure who's doing the ignoring.  Are you asking about a specific client?  In terms of Windows Azure storage:

    1. "Empty directories are always ignored" - Windows Azure doesn't have directories, period.  It has a way to retrieve blob prefixes, which are just portions of blob names.  (So if there's no blob, there's no prefix.)
    2. "Empty files are always ignored" - Windows Azure can store zero-length blobs.

    I'm assuming your code looks something like this:

    upload(directory)
        for each file in directory
            upload file to blob fullpathof(file).replace('\\', '/')
        for each subdirectory in directory
            upload(subdirectory)

    Code like that doesn't do anything for empty directories.  You'll need to do something, as you said, and using custom metadata (as the sync sample does) to represent an NTFS directory seems sensible.
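    One way to cover empty directories is to plan a marker entry for every directory, tagged with metadata. The sketch below is a local Python illustration only: it builds an upload plan and makes no storage calls. `plan_upload` is a hypothetical helper, and the `IsDirectory` key is borrowed from the attribute name mentioned earlier in this thread; its exact form in the sync sample is an assumption.

```python
import os
import tempfile


def plan_upload(root):
    """Return a list of (blob_name, metadata) for everything under root.

    Every directory gets its own marker entry, so empty directories
    survive; files (including zero-length ones) get ordinary entries.
    """
    plan = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # make the traversal order deterministic
        rel_dir = os.path.relpath(dirpath, root)
        if rel_dir != ".":
            name = rel_dir.replace(os.sep, "/")
            plan.append((name, {"IsDirectory": "true"}))
        for filename in sorted(filenames):
            rel = os.path.relpath(os.path.join(dirpath, filename), root)
            plan.append((rel.replace(os.sep, "/"), {"IsDirectory": "false"}))
    return plan


# Build a small tree with an empty directory and a zero-length file.
with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "docs", "empty"))
    open(os.path.join(root, "docs", "zero.txt"), "w").close()
    plan = plan_upload(root)

for blob_name, metadata in plan:
    print(blob_name, metadata)
```

    Note that a marker named "docs" coexists with blobs named "docs/...", so a tool that lists by delimiter would likely show both a blob and a folder with the same name, which may be what Peter observed in CloudBerry.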

    Saturday, November 20, 2010 9:43 PM