How to optimize large file transactions using a WCF Service as an endpoint for the Azure storage?

  • March 11, 2012 17:25
     
     

      I have a WCF REST service as an endpoint for Azure Storage. The service handles uploads and downloads of files that are usually 5-10 MB. While a stream is being handled (for both download and upload), the bytes are held in the Azure VM's RAM, right? Even if the data is split into 4 MB blocks for upload, those 4 MB stay in RAM until the upload is complete. For download, the bytes are kept until the download is complete. So, if I have 1000 users downloading a file at the same time, the Azure VM would need 4 GB of RAM just for the transfers.

      Is there a way to optimize this? Correct me if I'm wrong in assuming that the data is kept in the VM's RAM until the operation is finished. Should I use Microsoft's Azure REST service instead? Where does that service keep the data until the transfer is finished?

All replies

  • March 11, 2012 17:41
     
     

    You could also use CloudBlockBlob instead of CloudBlob, which lets you stream a file to Windows Azure storage in blocks.
    Once you've written all the blocks to storage, you call the PutBlockList operation, which commits the IDs of the blocks you've uploaded. That way you don't have to load the whole file into memory to write it to blob storage; you can write the file in separate blocks of whatever size you choose (within the service's block size limit).
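
    The block-by-block approach described above can be sketched roughly as follows. This assumes the Microsoft.WindowsAzure.StorageClient library (SDK 1.x), where CloudBlockBlob exposes PutBlock and PutBlockList. The class name BlockUploader, the method UploadInBlocks, and the six-digit block ID scheme are hypothetical choices for illustration, not part of the library.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using Microsoft.WindowsAzure.StorageClient;

// Hypothetical helper, for illustration only.
public static class BlockUploader
{
    // Uploads 'source' to 'blob' one block at a time, so only a single
    // block-sized buffer is held in memory during the upload.
    public static void UploadInBlocks(CloudBlockBlob blob, Stream source, int blockSize)
    {
        var blockIds = new List<string>();
        var buffer = new byte[blockSize];
        int blockNumber = 0;

        while (true)
        {
            // Fill the buffer; Read may return fewer bytes than requested.
            int filled = 0;
            while (filled < blockSize)
            {
                int read = source.Read(buffer, filled, blockSize - filled);
                if (read == 0)
                    break;
                filled += read;
            }
            if (filled == 0)
                break; // end of the source stream

            // Block IDs are Base64 strings and must all have the same length
            // within one blob, hence the fixed-width counter (an assumption
            // made for this sketch).
            string blockId = Convert.ToBase64String(
                Encoding.UTF8.GetBytes(blockNumber.ToString("d6")));

            using (var blockData = new MemoryStream(buffer, 0, filled))
            {
                blob.PutBlock(blockId, blockData, null); // null: no MD5 check
            }

            blockIds.Add(blockId);
            blockNumber++;
        }

        // Commit the uploaded blocks, in order, as the blob's content.
        blob.PutBlockList(blockIds);
    }
}
```

    With this shape, memory use per upload follows the block size you pass in rather than the size of the file.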

    If you're using a CloudBlob, the upload is automatically split into multiple blocks when the file is larger than 32 MB, a limit defined by the SingleBlobUploadThresholdInBytes property on the CloudBlobClient. See:
    http://msdn.microsoft.com/en-us/library/microsoft.windowsazure.storageclient.cloudblobclient.paralleloperationthreadcount.aspx

    You can find some basic code in chapter 10:
    http://robbincremers.me/2012/02/27/everything-you-need-to-know-about-windows-azure-blob-storage/

    Lemme know if this could help improve your solution



    Be nice to nerds ... Chances are you'll end up working for one!


  • March 11, 2012 21:00
     
     Answered, has code

    Hi Johnny,

    You have 2 options here:

    1. Upload directly to blob storage. But if you use a WCF service I'm assuming this isn't an option for you (maybe you're exposing the service to a customer).
    2. Use WCF streaming

    You can use WCF streaming with MTOM to optimize file transfers AND avoid holding all the data in memory until the transfer completes. I've tested this with a 1 GB file and my memory usage was minimal (about 40 MB). You'll need to configure a basicHttpBinding like this:

          <basicHttpBinding>
            <binding name="transferBinding" closeTimeout="00:10:00" openTimeout="00:10:00" sendTimeout="00:10:00" maxReceivedMessageSize="262144000" messageEncoding="Mtom" transferMode="Streamed">
              <security mode="Transport">
                <transport clientCredentialType="None" proxyCredentialType="None" />
              </security>
            </binding>
          </basicHttpBinding>

    Then, you'll need a message contract containing the stream:

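        using System;
        using System.ServiceModel;

        [MessageContract]
        public class UploadRequest : IDisposable
        {
            // Metadata travels in a SOAP header so the body holds only the stream.
            [MessageHeader(MustUnderstand = true)]
            public string Filename;

            // In streamed transfer mode the Stream must be the only body member.
            [MessageBodyMember(Order = 1)]
            public System.IO.Stream Stream;

            public void Dispose()
            {
                try
                {
                    if (Stream != null)
                    {
                        Stream.Close();
                        Stream = null;
                    }
                }
                catch
                {
                    // Best effort: ignore errors while closing the stream.
                }
            }
        }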
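
    The service contract that receives this message is not shown here. A minimal sketch of what it could look like follows; the interface name IUploadService is an assumption (inferred from the UploadServiceClient proxy name used further down), and UploadRequest refers to the message contract class above, so this compiles alongside it rather than on its own.

```csharp
using System.ServiceModel;

// Hypothetical contract name, for illustration only.
[ServiceContract]
public interface IUploadService
{
    // An operation that takes a message contract must take exactly one
    // parameter, and must return either void or another message contract.
    [OperationContract]
    void Upload(UploadRequest request);
}
```

    WCF does not allow mixing a message contract with ordinary parameters or return types in the same operation, so a result such as a status flag would have to be wrapped in its own [MessageContract] class.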

    On the side of the server, you'll simply be working with the stream:

    const int chunkSize = 4096;
    byte[] buffer = new byte[chunkSize];

    // Copy the incoming stream to disk one small chunk at a time, so only
    // chunkSize bytes of the upload are held in this buffer at once.
    using (var stream = File.Create(outputFilePath))
    {
        int bytesRead;
        while ((bytesRead = uploadRequest.Stream.Read(buffer, 0, chunkSize)) > 0)
        {
            stream.Write(buffer, 0, bytesRead);
        }
    }
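
    The loop above writes to a local file. If the goal is to pass the upload on to blob storage, one option is to write into a blob stream instead. The following is a minimal sketch, assuming the Microsoft.WindowsAzure.StorageClient library (SDK 1.x) and its CloudBlob.OpenWrite method; BlobForwarder, CopyToBlob, and the 64 KB buffer size are hypothetical choices. The blob stream is expected to buffer roughly one block before sending it, so memory per upload should stay near the block size rather than the file size, but that is worth verifying under load.

```csharp
using System.IO;
using Microsoft.WindowsAzure.StorageClient;

// Hypothetical helper, for illustration only.
public static class BlobForwarder
{
    // Copies 'input' (for example, the stream from the upload request)
    // into a block blob without reading the whole payload into memory.
    public static void CopyToBlob(Stream input, CloudBlobContainer container, string blobName)
    {
        CloudBlockBlob blob = container.GetBlockBlobReference(blobName);
        var buffer = new byte[64 * 1024]; // copy buffer size is an arbitrary choice

        // Disposing the blob stream flushes and commits the blob.
        using (BlobStream output = blob.OpenWrite())
        {
            int bytesRead;
            while ((bytesRead = input.Read(buffer, 0, buffer.Length)) > 0)
            {
                output.Write(buffer, 0, bytesRead);
            }
        }
    }
}
```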

    And finally, your client will have a simple implementation:

    var client = new UploadServiceClient();
    // Dispose the file stream once the upload call returns.
    using (var file = new FileStream("Video.avi", FileMode.Open, FileAccess.Read))
        client.Upload("Video.avi", file);

    This should be enough to implement a WCF service that allows you (or your customers) to upload files to Azure with good performance and a low memory footprint.

    Sandrino


    Sandrino Di Mattia | Twitter: http://twitter.com/sandrinodm | Azure Blog: http://fabriccontroller.net/blog | Blog: http://sandrinodimattia.net/blog

  • March 12, 2012 8:15
     
     

    For downloading, instead of downloading the blob through your WCF service can you not download it directly from blob storage? Each blob has a URL which you can expose in your application. For added security you may want to look into shared access signature.
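
    A shared access signature link could be generated along these lines. This is a sketch assuming the Microsoft.WindowsAzure.StorageClient library (SDK 1.x), where CloudBlob.GetSharedAccessSignature returns a query string to append to the blob URL; DownloadLinks and GetReadOnlyUrl are hypothetical names.

```csharp
using System;
using Microsoft.WindowsAzure.StorageClient;

// Hypothetical helper, for illustration only.
public static class DownloadLinks
{
    // Builds a time-limited, read-only URL that a client can use to
    // download the blob straight from storage, bypassing the WCF service.
    public static string GetReadOnlyUrl(CloudBlob blob, TimeSpan lifetime)
    {
        var policy = new SharedAccessPolicy
        {
            Permissions = SharedAccessPermissions.Read,
            SharedAccessExpiryTime = DateTime.UtcNow.Add(lifetime)
        };

        // The signature comes back as a query string that is appended
        // to the blob's own URL.
        string sasToken = blob.GetSharedAccessSignature(policy);
        return blob.Uri.AbsoluteUri + sasToken;
    }
}
```

    The service would then return only this URL, and the download bytes never pass through the VM's memory. Keeping the lifetime short limits how long a leaked link stays usable.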

    Hope this helps.

  • July 1, 2012 9:19
     
     
    @Sandrino: I would like to ask you a question.

    I am trying to send files between the client and the service. I am using your service configuration (the basicHttpBinding) and a class like yours with two members: a header for the file name and a Stream for the binary data.

    However, if I decorate the class with MessageContract and run the service, nothing is listening on the port. For example, this command returns nothing:

    netstat -ona | find "7997"

    So when I try to create the proxy with svcutil, I get an error saying that there is no endpoint listening on that port.

    However, if I decorate the class with the DataContract attribute, everything works fine.

    What could the problem be? I need a MessageContract because I need to send additional information (the name of the file) along with the binary data.



    Thanks.