Azure Storage Queue - Put Message Speed

  • General discussion

  • I put this original reply on a separate thread, but wanted to get more eyes on it for your feedback. I'm evaluating the queue service in a possible move to Azure (I love the new stuff you are doing) for one of our processes, as a mechanism to queue up a large batch of personalized information (100k+ records of things like firstname, lastname, etc.). However, based on some small preliminary tests, I'm not sure it will perform, at least in my case. My threshold is 5k records per minute.

    Test batch of 1000 small messages (< 20 characters of text): 

    Put Messages to Azure Queue (South US):

    • Dev machine (at my office): 470000 ms to put 1000 messages
    • Azure Virtual Machine (East US, extra small): 234000 ms

    I got curious and wanted to test this against Amazon SQS (I'm looking at moving to Azure, but I'm agnostic about which solution I use for the queue):

    Put messages to Amazon SQS (US East):

    • Dev machine: 150000 ms
    • Azure Virtual Machine (East US, extra small): 18000 ms

    So I got curious again after those two tests and realized that the virtual machine was in East US, the Azure storage account was in South US, and the Amazon SQS service was in US East, so I reran the Azure test after changing the storage account to East US:

    Put Messages to Azure Queue (East US):

    • Dev machine: 593000 ms *This makes sense because I'm much closer to US South (S. America); however, I'm not sure why the Amazon region performs so much quicker.
    • Azure Virtual Machine: 202000 ms *I expected this to be a lot faster than Amazon.

    I'm just not sure what to take away from this. I realize I can run things in parallel, but realistically Azure storage seems slow, especially compared to Amazon's service. Guys, any suggestions here?

    The same message load to Amazon's SQS takes 18 seconds, whereas the Azure queue takes 202 seconds?


    Living the dream http://conchadeltoro.com

    Wednesday, June 20, 2012 10:45 PM

All replies

  • It's important to separate the performance of the queue service and the performance of your client code.

    Queues handle 500 messages per second, and I've seen them go much higher, so enqueuing 1000 messages shouldn't take longer than 2 seconds. If it takes longer, it's because the client sending the messages is the bottleneck.

    To comment on why the client is the bottleneck, we'd need more information (like what programming language and library you're using, and probably the exact code). Based on the fact that you mention "I realize I can run things in parallel," I assume you're doing things sequentially now, so I would expect that to be the bottleneck. (You're effectively measuring latency, when what you care about is probably throughput.)
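
    To put rough numbers on that, here is a back-of-the-envelope model (mine, not anything from the SDK) of how per-request latency and the number of in-flight requests interact:

```javascript
// Toy model: every enqueue round trip takes latencyMs, and we keep
// `concurrency` requests in flight at once. Total wall-clock time is then:
function enqueueTimeMs(messages, latencyMs, concurrency) {
    return Math.ceil(messages / concurrency) * latencyMs;
}

// A sequential client pays the full round-trip latency per message...
console.log(enqueueTimeMs(1000, 202, 1));   // 202000 (about the 202 s reported)
// ...while 50 requests in flight cut the same batch to a few seconds.
console.log(enqueueTimeMs(1000, 202, 50));  // 4040
```

    The model ignores per-request overhead and service-side throttling, but it shows why the sequential numbers say more about round-trip latency than about what the queue service can sustain.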

    Wednesday, June 20, 2012 11:27 PM
  • BTW, this fairly naive Node.js code took just over 22 seconds on my laptop (talking from Seattle to the US East data center):

    var start = new Date();
    var queues = azure.createQueueService();

    // Kick off all 1000 createMessage calls at once and wait for them all.
    Q.all(_.map(_.range(1000), function () {
        return Q.ncall(queues.createMessage, queues, 'testqueue', 'hello world');
    })).then(function () {
        console.log('Completed in ' + (new Date() - start) / 1000 + ' seconds.');
    });
    

    I assume it would run faster from within the data center, but that depends on what the bottleneck is. I also assume that running 500 iterations each on two different computers would double the speed, but again, that's a guess based on where I think the bottleneck is.

    Wednesday, June 20, 2012 11:35 PM
  • Thanks for the response. You are correct: as part of measuring how many messages can be processed, I have to measure both. My specific question is why the latency is so high from an Azure VM in the same region as the queue.

    More answers: I am using C#, and the exact code I'm using is from this page: https://www.windowsazure.com/en-us/develop/net/how-to-guides/queue-service/#insert-message  I built a small console app.

                      
    CloudStorageAccount storageAccount = CloudStorageAccount.Parse(
        CloudConfigurationManager.GetSetting("StorageConnectionString"));
    CloudQueueClient queueClient = storageAccount.CreateCloudQueueClient();
    CloudQueue queue = queueClient.GetQueueReference("######");  // hard coded to my queue, http
    queue.CreateIfNotExist();

    int j = 0;
    Stopwatch sw = Stopwatch.StartNew();
    foreach (string s in Templates)  // Templates holds the message strings
    {
        CloudQueueMessage message = new CloudQueueMessage(s);
        queue.AddMessage(message);
        j++;
        if (j % 100 == 0)
        {
            Console.WriteLine("Added:" + j + " to the queue");
        }
    }
    sw.Stop();
    Console.WriteLine("Adding to the queue: " + sw.ElapsedMilliseconds);

    The code I'm using to test the Amazon SQS service is the equivalent of the above.

    So the queue service can accept 500 messages per second, yet due to latency I'm getting closer to 5 per second, with something being the bottleneck. I'm not concerned about the slowness of my dev machine, but I'd expect the VM to put messages to the Azure queue faster than to Amazon SQS. There has to be an issue somewhere.

    Wednesday, June 20, 2012 11:41 PM
  • Can you run that from a Virtual Machine in East US and see what happens? I understand that I can do some in parallel, but with the latency and issues I'm seeing, it isn't even worth me looking at that part yet.
    Wednesday, June 20, 2012 11:44 PM
  • The simplest guess would be that it just takes longer to write a message to a Windows Azure queue than to SQS. But I can't imagine why you would care about the latency, so it doesn't seem worth investigating. What use case do you have where the latency is important?
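
    If it helps, here is a minimal sketch of keeping a fixed number of requests in flight from one client. It uses modern plain promises rather than the Q library above, and sendMessage is a stand-in that resolves immediately (not the real storage SDK call), so treat it as the shape of the idea rather than working queue code:

```javascript
// Stand-in for a real enqueue call; resolves with the message body.
function sendMessage(body) {
    return Promise.resolve(body);
}

// Start `concurrency` workers that each pull the next message until none remain.
function enqueueAll(messages, concurrency) {
    var next = 0;
    var sent = 0;
    function worker() {
        if (next >= messages.length) return Promise.resolve();
        var body = messages[next++];
        return sendMessage(body).then(function () {
            sent++;
            return worker();
        });
    }
    var workers = [];
    for (var i = 0; i < concurrency; i++) workers.push(worker());
    return Promise.all(workers).then(function () { return sent; });
}

enqueueAll(['a', 'b', 'c', 'd', 'e'], 2).then(function (n) {
    console.log('sent ' + n + ' messages');   // sent 5 messages
});
```

    With real network calls, total time drops roughly in proportion to the number of workers, until some other bottleneck (CPU, GC, service limits) takes over.
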
    Thursday, June 21, 2012 12:03 AM
  • Not easily. But that's the code. Just install Node, run "npm install azure underscore q", and add this to the top:

    var _ = require('underscore'),
        azure = require('azure'),
        Q = require('q');
    
    process.env.AZURE_STORAGE_ACCOUNT='<your account>';
    process.env.AZURE_STORAGE_KEY='<your key>';

    Apologies in advance if I didn't get the names of the environment variables correct... I already deleted the code.

    I don't know why the latency is deterring you from investigating throughput.

    Thursday, June 21, 2012 12:06 AM
  • From your description, it seems you experienced very high latency. Can you turn on Storage Analytics to trace your requests and see whether the latency comes from the server side or the client side? Storage Analytics logs each request's end-to-end (E2E) latency and server latency. (More details about analytics can be found in this post.) You can also turn on analytics for your storage account in the new portal (manage.windows.azure.com). In many cases we see high CPU usage and .NET GC cause client-side slowness that leads to overall latency; this shows up as a smaller server latency in the storage logs compared to the E2E latency. Can you also profile your CPU usage and .NET GC?
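
    As a rough illustration of how to read those two numbers per request (the helper below is made up for illustration, not part of any SDK):

```javascript
// Given one request's end-to-end latency and server latency from the analytics
// log, estimate where the time went.
function diagnose(endToEndMs, serverMs) {
    var outsideMs = endToEndMs - serverMs;   // client + network time
    return outsideMs > serverMs
        ? 'mostly client/network: ' + outsideMs + ' ms outside the service'
        : 'mostly server: ' + serverMs + ' ms inside the service';
}

// ~200 ms round trips with single-digit server latency would point at the
// client or the network, not the storage service.
console.log(diagnose(202, 5));   // mostly client/network: 197 ms outside the service
```
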

    Thursday, June 21, 2012 1:42 AM
  • I've enabled the two metrics and will report back later after further tests.
    Thursday, June 21, 2012 2:13 PM
  • Steve,

    I appreciate the code. In fact, I had not done anything in Node until I saw your post; I have since built a little test Node app based partly on your code.

    Thursday, June 21, 2012 2:28 PM