Upload folders to Azure Blob Container using Python SDK

  • Question

  • Hello,

    I am trying to write a script to upload folders containing image files stored locally on my linux machine to an Azure blob storage account. The images are satellite remote sensing tiles gathered over a range of dates. I need a separate container for each tile, and within that container should be separate folders containing images for each date. This structure is easily achieved using the Azure Storage Explorer by simply clicking and dragging the folders into the correct container, but I am struggling to achieve it in script.

    |-Tile

        |--date

              |-- image files

    My current code is shown below, but it returns the error: "IsADirectoryError: [Errno 21] Is a directory".

    import os
    from azure.storage.blob import BlockBlobService  # legacy azure-storage SDK

    def send_to_blob(blob_account_name, blob_account_key, tile, L1Cpath):
        """
        Function uploads processed folders of L2A products for each date to blob storage account.
        Upload folder of images to container matching tile ID.
        Check if container matching tile name already exists - if so add folder of imgs to that
        container, if not, create new container with name = tile.
        :param blob_account_name:
        :param blob_account_key:
        :param tile:
        :return:
        """
        block_blob_service = BlockBlobService(account_name=blob_account_name,
                                              account_key=blob_account_key)
        local_path = L1Cpath
        containers = block_blob_service.list_containers()
        container_name = tile

        if any(tile in filenames for filenames in containers):
            # add files to existing container matching tile name
            for file in os.listdir(local_path):
                block_blob_service.create_blob_from_path(container_name, file,
                                                         os.path.join(local_path, file))
        else:
            # Create a container with the tile as filename, then add files.
            block_blob_service.create_container(container_name)
            for file in os.listdir(local_path):
                block_blob_service.create_blob_from_path(container_name, file,
                                                         os.path.join(local_path, file))
        return

    Any thoughts how to solve this?

    Many thanks!

    Monday, April 15, 2019 10:03 AM

Answers

  • I think that error means you are trying to upload a directory/folder as if it were a blob. Real directories are not supported in blob storage; use virtual folders instead. Simply prepend the path to the blob name, e.g. "date/filename". This creates a virtual directory, and you can keep adding files to it by consistently prepending the same path (the date) to the name of each file you upload.

    I created an example which lists blobs via the Python SDK at https://github.com/adamsmith0016/Azure-storage ; the del-blob.py script shows how these blobs are listed.
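    For example, a blob name can be built from the file's local relative path. The sketch below is illustrative only: build_blob_names is a made-up helper (not part of the SDK) that maps each local file to a "date/filename" style blob name you could then pass to create_blob_from_path.

    ```python
    import os

    def build_blob_names(local_path):
        """Map each local file under local_path to a 'date/filename' blob name.

        Blob storage has no real directories: uploading to a name like
        '2019-04-15/tile.jp2' makes '2019-04-15' appear as a virtual folder.
        """
        blob_names = {}
        for dirpath, _dirnames, filenames in os.walk(local_path):
            for name in filenames:
                full_path = os.path.join(dirpath, name)
                # the relative path, with forward slashes, becomes the blob name
                rel = os.path.relpath(full_path, local_path).replace(os.sep, "/")
                blob_names[full_path] = rel
        return blob_names

    # Each (source, blob_name) pair would then be uploaded with e.g.
    # block_blob_service.create_blob_from_path(container_name, blob_name, source)
    ```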

    Let me know if this helps.

    Monday, April 15, 2019 9:39 PM

All replies

  • Thanks Adam, that has helped a lot. I have a solution that works - it is part of a much larger script but here is the relevant function in case it is useful to you or others.

    many thanks again,

    Joe

    import os
    import fnmatch
    import numpy as np
    from azure.storage.blob import BlockBlobService  # legacy azure-storage SDK

    def send_to_blob(blob_account_name, blob_account_key, tile, local_path):
        """
        Function uploads processed L2A products to blob storage.
        Check if container matching tile name already exists - if so add files to that
        container, if not, create new container with name = tile. This function is
        called in a loop where tile is selected from a list of strings.
        :param blob_account_name:
        :param blob_account_key:
        :param tile:
        :return:
        """
        block_blob_service = BlockBlobService(account_name=blob_account_name,
                                              account_key=blob_account_key)
        tile = tile[0]
        container_name = tile.lower()  # container names must be lowercase

        # find files to upload and append names to list
        for folder in os.listdir(local_path):
            file_names = []
            file_paths = []
            filtered_paths = []
            filtered_names = []
            folder_path = str(local_path + folder)

            # append all file paths and names to list, then filter to the relevant jp2 files
            for (dirpath, dirnames, filenames) in os.walk(folder_path):
                file_paths += [os.path.join(dirpath, file) for file in filenames]
                file_names += [name for name in filenames]
            for path in fnmatch.filter(file_paths, "*.jp2"):
                filtered_paths.append(path)
            for file in fnmatch.filter(file_names, "*.jp2"):
                filtered_names.append(file)

            # check for existing containers
            existing_containers = block_blob_service.list_containers()
            existing_container_names = []
            for item in existing_containers:
                existing_container_names.append(item.name)

            if any(tile.lower() in p for p in existing_container_names):
                print("*** CONTAINER {} ALREADY EXISTS IN STORAGE ACCOUNT ***".format(tile))
                # add files to existing container matching tile name
                for i in np.arange(0, len(filtered_paths)):
                    print("*** UPLOADING FOLDERS TO EXISTING CONTAINER {} ***".format(tile))
                    source = str(filtered_paths[i])
                    destination = str(folder + '/' + filtered_names[i])
                    try:
                        block_blob_service.create_blob_from_path(container_name, destination, source)
                    except:
                        print("Uploading to blob failed")
            else:
                print("*** CONTAINER DOES NOT ALREADY EXIST. CREATING NEW CONTAINER {} ***".format(tile))
                # Create a container with the tile as filename, then add files.
                block_blob_service.create_container(container_name)
                print("*** CONTAINER CREATED. UPLOADING FOLDERS TO NEW CONTAINER ***")
                for i in np.arange(0, len(filtered_paths)):
                    source = str(filtered_paths[i])
                    destination = str(folder + '/' + filtered_names[i])
                    try:
                        block_blob_service.create_blob_from_path(container_name, destination, source)
                    except:
                        print("Uploading to blob failed")
        return



    • Edited by tothepoles Thursday, April 18, 2019 9:17 AM: updated to work for multiple subdirectories in local_path
    Wednesday, April 17, 2019 3:34 PM
  • Glad it helped, and thanks for sharing your solution with the community as well :)
    Wednesday, April 17, 2019 6:00 PM