I/O to Azure Files is noticeably slower on Ubuntu VM compared with Windows VM

  • Question

  • I'm using Azure Files in North Central US. I have a shared folder that is primarily read by several Ubuntu 14.04 LTS machines (from the gallery). 

    When I do a "rsync -avh --progress someBigFileOnAzureFiles /dev/null" I get around 1-9 MB/s of reads (and slightly slower for writes). However, when I use a Windows VM on the exact same file, I get 60-70 MB/s (Azure Files' limits) for reads and writes.

    The shared folder is mounted as:

    mount -t cifs //ACCOUNT_NAME.file.core.windows.net/SHARE_NAME /local/path -o vers=2.1,username=USERNAME,password=PASSWORD,ro

    How can I get faster Linux read speeds to match the Windows VM speed? This happens even when both VMs are D4s.
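
For anyone trying to reproduce these numbers without rsync's own overhead, a rough sketch along the following lines could time a plain sequential read. It assumes GNU date (for sub-second timestamps) and awk, as found on stock Ubuntu. The function name read_throughput_mbs and the example path are made up for illustration and are not part of the original test.

```shell
#!/usr/bin/env bash
# Hypothetical helper: time one sequential read of a file and print MB/s
# (1 MB = 1,000,000 bytes). A repeat run on the same file may be served from
# the Linux page cache, so pick a file that has not been read recently.
read_throughput_mbs() {
  local file="$1"
  local bytes start end
  bytes=$(wc -c < "$file") || return 1
  start=$(date +%s.%N)          # GNU date: seconds.nanoseconds
  cat "$file" > /dev/null || return 1
  end=$(date +%s.%N)
  awk -v b="$bytes" -v s="$start" -v e="$end" 'BEGIN {
    t = e - s
    if (t <= 0) t = 0.000001    # guard against a zero-length interval
    printf "%.1f\n", b / t / 1000000
  }'
}

# Example (path is a placeholder for a file on the CIFS mount):
#   read_throughput_mbs /local/path/someBigFileOnAzureFiles
```

Running the same function against a file on the VM's local disk gives a baseline to compare the mounted share against.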


    • Edited by Jeff.Moser Tuesday, June 2, 2015 10:50 PM
    Tuesday, June 2, 2015 10:46 PM

All replies

  • We will take a look and try to repro and get back to you.

    jason

    Wednesday, June 3, 2015 3:57 AM
  • Thanks for looking into this! I only did the rsync test after noticing that the performance felt much slower.

    This performance issue is currently causing problems with our application that was expecting higher performance (for both random and sequential I/O). Getting it to the point where we could max out Azure Files like we can do on Windows would be a big help.

    Wednesday, June 3, 2015 4:49 PM
  • Hi Jeff. Can you provide:

    1. Storage Account Name

    2. Timestamp for when you did this test.  

    3. Network Trace

    It would be preferable if you did this test again and provided us with a timestamp and network trace so that it's easier for us to look into this. 

    Thanks.

    Thursday, June 18, 2015 11:22 PM
  • Hi Jeff,

    Apologies for the delay.

    Can you please tell me how you are measuring the read and write speeds, and with what tools (on both the Windows and Linux VMs)? I would like to repro this in our labs and see if we get similar results.

    Thanks,
    Liju

    Friday, June 19, 2015 2:49 AM
  • Thanks for the reply. This is still happening to us on our "kagglesds" storage account. I just confirmed it right now (2:12 PM UTC today, 22 Jun 2015).

    On my D14 Ubuntu 14.04 LTS VM (from the Azure gallery), when I do a "rsync -avh --progress /path/to/large/file/on/CIFS/volume /dev/null" I get a peak of about 9 MB/s. This is similar to the speed I get with a normal "cp" command. However, if I use the REST API from the exact same Linux machine, I get a speed of 40-55 MB/s as measured by cURL.

    If I start up a similar D4+ sized Windows machine and mount the same file share, I get an Explorer copy speed of around 60-70 MB/s.

    Thus, it seems like something is inefficient with the Ubuntu CIFS driver when talking to the Azure Files Preview share that makes it ~7-10X slower than the Windows version. I was wondering if there's something I should tweak?

    Monday, June 22, 2015 2:24 PM
  • Hi Jeff,

    >>However, if I use the REST API from the exact same Linux machine, I get a speed of 40-55 MB/s as measured by cURL.

    What is the exact command, and where do you see the speeds of 40-55 MB/s? Can you copy and paste the output here?

    >>I get an Explorer copy speed of around 60-70 MB/s.

    When you mention Explorer copy, do you mean that you get a speed of 60-70 MB/s when you copy and paste data from a local drive to the Windows Azure Files drive?

    Thanks,

    Liju

    Thursday, June 25, 2015 5:21 PM
  • I downloaded a 67 GB file from our network share using the REST API via this script, which calls cURL:

    #!/usr/bin/env bash
    # Code adapted from http://stackoverflow.com/questions/20103258/accessing-azure-blob-storage-using-bash-curl
    #echo "usage: ${0##*/} <storage-account-name> <share-name> <access-key> <remote-path> <local-path>"
    
    storage_account="$1"
    share_name="$2"
    access_key="$3"
    remote_path="$4"
    local_path="$5"
    
    echo "fetching ${remote_path} to ${local_path}"
    
    
    file_store_url="file.core.windows.net"
    authorization="SharedKey"
    
    request_method="GET"
    request_date=$(TZ=GMT date "+%a, %d %h %Y %H:%M:%S %Z")
    storage_service_version="2014-02-14"
    
    # HTTP Request headers
    x_ms_date_h="x-ms-date:$request_date"
    x_ms_version_h="x-ms-version:$storage_service_version"
    
    # Build the signature string
    canonicalized_headers="${x_ms_date_h}\n${x_ms_version_h}"
    canonicalized_resource="/${storage_account}/${share_name}${remote_path}"
    
    string_to_sign="${request_method}\n\n\n\n\n\n\n\n\n\n\n\n${canonicalized_headers}\n${canonicalized_resource}"
    
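    # Decode the Base64 encoded access key, convert to Hex.
    # (The key is quoted so the shell does not word-split or glob it.)
    decoded_hex_key="$(printf '%s' "$access_key" | base64 -d | xxd -p -c256)"
    
    # Create the HMAC signature for the Authorization header.
    # '%b' expands the \n escapes without treating a '%' in the path as a format directive.
    signature=$(printf '%b' "$string_to_sign" | openssl dgst -sha256 -mac HMAC -macopt "hexkey:$decoded_hex_key" -binary | base64 -w0)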
    
    authorization_header="Authorization: $authorization $storage_account:$signature"
    
    curl \
      -o "${local_path}" \
      -H "$x_ms_date_h" \
      -H "$x_ms_version_h" \
      -H "$authorization_header" \
      "https://${storage_account}.${file_store_url}/${share_name}${remote_path}"

    For Explorer copy on the Windows box, I did a copy and paste from the shared folder to the local desktop.

    Thursday, June 25, 2015 5:42 PM
  • Hi Jeff,

    Thank you for the additional details. Let me try to repro the issue in the lab servers and I will revert.

    Appreciate the patience.

    Regards,
    Liju

    Monday, July 6, 2015 3:15 AM
  • Hi Jeff,

    Apologies for the delayed response. As of now, we don't recommend any specific setting for Ubuntu when using Azure Files. I have involved the performance team on this issue; it could very well be that there is some sort of caching involved when using Azure Files with a Windows VM. We will get back to you with our findings.

    Please note that, as there are multiple teams involved, it might take some time before they come back with updates.

    Thanks,
    Liju


    Saturday, July 18, 2015 3:37 AM
  • Any update on this issue?
    Friday, July 31, 2015 3:40 PM
  • Hi Jeff,

    Apologies for the delay. I have been communicating with the developers. They tested it and told me that they were getting better read speeds in the lab environments.

    The developers asked for additional information that can't be collected from you over the forum. Can you please open a support case so that we can get the developers involved and resolve the issue?

    Thank you for your patience.

    Regards,
    Liju

    Monday, August 17, 2015 9:36 AM
  • I created support request 115081713048675 with further details. Please advise on that ticket or here if you need any further information. Thanks! 

    Monday, August 17, 2015 3:25 PM
  • Hi Jeff,

    Thanks. I have tracked down the owner of the Support Request. I will resume communication over the Service Request.

    Regards,
    Liju

    Tuesday, August 18, 2015 2:39 AM
  • Hi Jeff, Hi Liju,

    We have the same problem with Azure Files and Linux (we're using CentOS from OpenLogic).

    We get only 10 MB/s maximum via Azure Files on Linux. On a Windows machine we get more than 60 MB/s.

    Thanks for any feedback,

    Robert

    Wednesday, October 7, 2015 5:56 PM
  • The final "resolution" of this ticket was filing a bug report with cifs-utils. It hasn't gone anywhere since.
    • Edited by Jeff.Moser Wednesday, October 7, 2015 5:59 PM
    Wednesday, October 7, 2015 5:59 PM
  • Jeff, you may want to try the Azure CLI, which has been optimized for Linux users transferring data into Azure: https://azure.microsoft.com/en-us/documentation/articles/storage-azure-cli/

    Wednesday, October 21, 2015 11:56 PM
  • Robert, are you using rsync as well?

    You may want to try the Azure CLI, which has been optimized for Linux users transferring data into Azure: https://azure.microsoft.com/en-us/documentation/articles/storage-azure-cli/

    Wednesday, October 21, 2015 11:57 PM
  • Hi Jason,

    Maybe I'm misunderstanding your suggestion. I am able to download the full file at closer-to-Windows speeds using the cURL script I posted above. However, I can't stream files at that speed using system file read calls (e.g. sequentially reading an "on disk" file descriptor). Thus, legacy programs that depend on file paths (e.g. SQLite, most command line utilities, etc.) are much slower when using Azure Files on Linux than equivalent ones on Windows. The only workaround I've found that helps is to download local copies of the files to the machine's SSD and then access that cache directly at SSD speeds. However, the SSDs aren't nearly as large as the 5 TB that Azure offers per share, so it requires a custom caching solution to make this workaround feasible.
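
The copy-to-local-SSD workaround described above might be sketched roughly as follows. This is only an illustration under stated assumptions: the function name cached_path, the directory layout, and the example paths are hypothetical, and it has no eviction, so it does nothing about the SSD being far smaller than the share.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print a local path for a file that lives on the share,
# copying it into a local cache directory on first use or when the share copy
# is newer. Arguments: <share-root> <cache-root> <path-relative-to-share>.
cached_path() {
  local share_root="$1"
  local cache_root="$2"
  local rel="$3"
  local src="$share_root/$rel"
  local dst="$cache_root/$rel"

  if [ ! -f "$dst" ] || [ "$src" -nt "$dst" ]; then
    mkdir -p "$(dirname "$dst")" || return 1
    # Copy to a temporary name first so a partial copy is never visible
    # under the final name; -p keeps the source modification time.
    cp -p "$src" "$dst.tmp.$$" || return 1
    mv "$dst.tmp.$$" "$dst" || return 1
  fi
  printf '%s\n' "$dst"
}

# Example (all paths are placeholders):
#   sqlite3 "$(cached_path /local/path /mnt/resource/cache data/big.db)"
```

A program that needs a file path is then pointed at the returned local copy rather than at the CIFS mount. The first access still pays the full transfer cost.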

    I'm thinking about sharing read-only mounts of Premium Storage, but since that's not available yet in NCUS, I haven't made it a priority yet to investigate that.

    This is something that affects us each day, so I welcome further suggestions.

    Jeff

    Thursday, October 22, 2015 12:27 AM
  • I just wanted to comment that I was having issues copying large files to an Azure Files share (I would receive FAILED TO CLOSE errors using cp / mv / rsync), so I installed the Azure CLI and used the "azure storage file upload" command to transfer my large files. The transfer is still slow, however.

    Not 100% ideal imo but at least it works, thanks for the tip Jason!

    Thursday, December 3, 2015 6:24 PM