Azure Table inserts do not scale

  • Question

  • After investigating the error I reported in this post, I have found evidence that something is going wrong with Azure Storage performance. After (54070 * 10 * 100) rows in a partition, any insert takes a disproportionate amount of time.

    To reproduce it, I just inserted dummy data into the partition and monitored how long it took to insert every 10 batches.
    The code is attached.
    I suspect this problem is very recent; I already had this much data without any problem at all.

    Here is the source code to reproduce the scaling problem (you have to wait around (54070 * 10) batches before you start having problems):

    using System;
    using System.Diagnostics;
    using System.IO;
    using System.Linq;
    using Microsoft.WindowsAzure.Storage;
    using Microsoft.WindowsAzure.Storage.Auth;
    using Microsoft.WindowsAzure.Storage.Table;

    class Program
    {
        static void Main(string[] args)
        {
            StorageCredentials creds = new StorageCredentials("account", "pass");
            CloudStorageAccount account = new CloudStorageAccount(creds, false);
            var table = account.CreateCloudTableClient().GetTableReference("scaleproblem3");
            table.CreateIfNotExists();

            Random rand = new Random();
            var random = new byte[32];
            FileStream fs = File.Open("data.csv", FileMode.Append);
            StreamWriter writer = new StreamWriter(fs);
            int total = 0;
            int count = 0;
            Stopwatch watch = new Stopwatch();
            watch.Start();
            while (true)
            {
                // Build a batch of 100 entities, all in the same partition ("a"),
                // with random row keys.
                TableBatchOperation batch = new TableBatchOperation();
                for (int i = 0; i < 100; i++)
                {
                    rand.NextBytes(random);
                    var rowKey = String.Join("", random.Select(b => b.ToString("00")).ToArray());
                    batch.Add(TableOperation.InsertOrReplace(new DynamicTableEntity("a", rowKey)));
                }
                table.ExecuteBatch(batch);
                count++;

                // Every 10 batches, write "totalBatches,elapsedSeconds" to data.csv.
                if (count == 10)
                {
                    total += count;
                    count = 0;
                    var elapsedSec = (int)watch.Elapsed.TotalSeconds;
                    watch.Restart();
                    string line = total + "," + elapsedSec;
                    writer.WriteLine(line);
                    writer.Flush();
                    Console.WriteLine(line);
                }
            }
        }
    }
    The storage account is in West Europe, like the VM that ran the test, and both are in the same affinity group.






    Saturday, January 10, 2015 7:24 PM

All replies

  • My understanding is that once your table has (54070 * 10 * 100) rows in it and you start your batch import, you will immediately experience slow performance.

    What is your calculation of the index size and data size?

    Can it be due to the new storage library 4.3 which was made available Sept 16, 2014?

    Will using Insert rather than InsertOrReplace make a difference in performance?


    Frank

    Saturday, January 10, 2015 8:37 PM
  • Yes TChiang.

    The 54070 is approximately where I started seeing slowness (reading from the graph I generated with the data from the repro code).

    I was taking one sample (batchCount, insertTime) every 10 batches.

    This means that 54070 corresponded to 10 * 54070 batches, and so 10 * 54070 * 100 rows.

    Insert instead of InsertOrReplace does not change anything. Each row is approximately 100 bytes.

    You can run this code yourself. You will see that around 54070 +/- 5000 you'll start experiencing the same problem. (It took me several hours.)

    [UPDATE]

    No, in fact, since the x-axis is the number of batches, the bug appeared after 54070 * 100 rows.
    Nevertheless, the point is that after some threshold, inserts suddenly take an unreasonable amount of time.

    I am almost sure this is a bug in the storage service, since I have never seen this before, and no documentation talks about the effect of partition size on insert time.
    The fact that it jumps suddenly rather than progressively is also very strange.

    Also, the limit does not apply to the table but to the partition.
    It is not the storage library's fault: as I showed in the previous post, Fiddler clearly shows that it is the storage service that takes that long to respond to the HTTP requests.


    Sunday, January 11, 2015 12:36 AM
  • It looks like an architectural limit to me, and for a partition it may mean 54M rows. I wonder if you would consider creating different partitions, as this practice is supposed to improve performance. How about storage library version 4.3.0.0? Of course, we would all love to hear from Microsoft about this possible limitation.

    Frank

    Sunday, January 11, 2015 2:13 AM
  • This limit is undocumented.
    Moreover, I never had the problem before, so I would like confirmation about it from Microsoft, and to know whether they will fix it.

    I am not considering using more partitions. Using more would impact the RAM requirements of my application. (Every time I add 1 bit of information to the partition key, I need to double the amount of data I am storing in RAM.)
    If Microsoft can't do anything about it and is clear about the partition limits, then I'll think about re-architecting my solution. With the system already deployed, it will also mean redeploying everything with the new partitioning strategy and reindexing tons of data, which will take a lot of time.

    As I said, the storage library can't do anything about it. It sends the request as it should, but it can't influence how much time the storage server takes for the insert. The Fiddler screenshot in the previous post proves that.


    Sunday, January 11, 2015 2:38 AM
  • Hi,

    Thank you for posting in here.

    We are looking into this and will get back to you as soon as possible. Your patience is greatly appreciated.

    Regards,

    Manu Rekhar

    Sunday, January 11, 2015 2:16 PM
  • I tried again, in North Europe this time.

    In North Europe, insert time goes up smoothly over time (which still indicates that inserts within a single partition do not scale, and this is documented nowhere).

    The "steps" effect is because I truncated my measurements down to the second (a millisecond-resolution variant is sketched after this post).

    West Europe is messed up. The storage takes twice as long to do the same work, and on top of that the insert time jumps to 20 seconds at around 50K batches...

    Tuesday, January 13, 2015 11:55 AM
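    For reference, here is a minimal variation of the logging step from the repro code above that records elapsed milliseconds instead of truncated whole seconds, which removes the "steps" from the graph. This is only a sketch reusing the `total`, `count`, `watch` and `writer` variables from that program; it is not the code that produced the graphs mentioned in the thread.

    // Same sampling as before, but with millisecond resolution.
    if (count == 10)
    {
        total += count;
        count = 0;
        long elapsedMs = watch.ElapsedMilliseconds;
        watch.Restart();
        string line = total + "," + elapsedMs;
        writer.WriteLine(line);
        writer.Flush();
        Console.WriteLine(line);
    }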
  • I can see that you have already spent a lot of time trying to document and resolve a problem that didn't exist before, and that this problem is important to your application. Re-architecting your solution will mean a lot more work for you, so I think it is worth opening a support ticket so that you get official support and an answer. It may be free anyway, since this problem didn't exist before and you may have discovered a deficiency in their system.

    Frank

    Tuesday, January 13, 2015 2:47 PM
  • May I ask where to find the support ticket and official support?
    I thought the Microsoft support people were around here.

    I ended up migrating by using 4 times more partitions and moving out of West Europe; I am reindexing everything (it takes 3 or 4 days).
    Even the insert throughput at time 0 is twice as fast in North Europe as in West Europe.

    10 batches of 100 take 2 seconds in West Europe but less than 1 second in North Europe (given my test program).
    I have seen this general slowness of West Europe for a long time; only the big jump in insert time is new.

    But I still want to know the reasons. This shook some of my deep assumptions about Azure; if this is by design and not a bug, then I need to know for the future.

    Tuesday, January 13, 2015 10:53 PM
  • After logging in, click your account name and you will see a dropdown with a 'contact support' menu item; there you can create a support ticket. My experience in the past was that you discuss your case with a support engineer and they evaluate it. I haven't done this for quite a while, so I'm not 100% sure how it goes now. I suppose the Microsoft staff here can give you the authoritative answer.


    Frank

    Wednesday, January 14, 2015 1:14 AM
  • Thanks for the help; there is no support with my BizSpark account. ;(

    I'll just cross my fingers that support comes across this post! ;)

    Wednesday, January 14, 2015 2:01 AM
  • Hi,

    I am not sure this is a bug. Within a single partition, the scalability target for accessing (inserting) entities is 2,000 entities per second.

    You are probably being throttled because you are approaching one of the scalability targets of the system, or the system is rebalancing your partition to allow for higher throughput. Do you see any errors in the client library, for example “503 Server Busy” or “500 Timeout”?

    If this is the case, the default behavior of the client library is to use an exponential back-off policy and wait a few seconds before it tries to insert the batch again (see the sketch after this post for one way to observe this). Is this what is happening?

    Hope this helps....

    Edward

    Thursday, January 15, 2015 7:26 PM
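    For reference, a minimal sketch of how the .NET storage client library's retry behavior can be made visible and tuned. The back-off values are illustrative, and `table` and `batch` are the variables from the repro program earlier in the thread; this is not a fix, only a way to check whether throttling and retries are happening.

    using System;
    using Microsoft.WindowsAzure.Storage;
    using Microsoft.WindowsAzure.Storage.RetryPolicies;
    using Microsoft.WindowsAzure.Storage.Table;

    // Explicit exponential back-off: 2 s base delay, at most 5 attempts.
    var options = new TableRequestOptions
    {
        RetryPolicy = new ExponentialRetry(TimeSpan.FromSeconds(2), 5)
    };

    // The Retrying event fires each time the client retries, e.g. after "503 Server Busy".
    var context = new OperationContext();
    context.Retrying += (sender, e) =>
        Console.WriteLine("Retrying after HTTP " + e.RequestInformation.HttpStatusCode);

    table.ExecuteBatch(batch, options, context);

    If no Retrying events fire while the slowdown is visible in Fiddler, the extra latency is coming from the service itself rather than from client-side retries.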
  • It seems to me that this isn't really a real world scenario... What situation could possibly force only one PartitionKey with 54 million rows underneath?

    How do you plan to find a single or group of records in that mess?

    I'd follow the azure storage best practices and partition your data properly.  It's all about retrieving data and how to store it to favour fast reads.  If you can read it fast, you're storing it correctly.  Unless you never have to read, but then why store it in the first place?

    In terms of Microsoft not documenting the limitation - I'd say Microsoft hoped to catch 99% of the data use cases, and you're the 1% (assuming you have a legitimate reason for not being able to do it any other way).

    You'll have the same problem, perhaps at different limits, with every other storage solution I've ever worked with, including NTFS, SQL Server, or Oracle...

    I'm dying to hear more...


    Darin R.

    Thursday, January 15, 2015 7:36 PM
  • I am not sure this is a bug. Within a single partition, the scalability target for accessing (inserting) entities is 2,000 entities per second.

    5 seconds to insert 500 entities (100 entities per second) is not what I call "2,000 entities per second".
    And if throttling were the cause, the problem would manifest from the beginning, not after a magic threshold.

    Thursday, January 15, 2015 8:08 PM
  • It seems to me that this isn't really a real world scenario... What situation could possibly force only one PartitionKey with 54 million rows underneath? How do you plan to find a single record or group of records in that mess?

    I have sparse data, which means that very little of it is accessed, but when it is accessed, I do a range query, which is really fast within a partition. I initially had 255 partitions.

    The scenario is indexing the Bitcoin blockchain. There are more than 300 000 million lines to index, and the number increases every day. I need to insert all that data in the most efficient way I can, without blowing up my RAM. Having 255 partitions was good enough for me and took only 100 MB of RAM. Improving that means I need to store more entities in RAM during the insert. (now taking 10 bytes, which is 400 MB in RAM)

    My query load is very low, maybe one query per minute, so I don't care about throttling once everything is indexed.

    I am fine with Microsoft not supporting such a big partition... but they should document it, because it suddenly messed up my environment and made me reindex everything from scratch.
    I can adapt to smaller partitions, but the partitioning strategy is an architectural decision that has a high impact if you need to change it (a sketch of a wider fan-out scheme follows this post).
    You don't change 400 000 million lines with a clap of the hands.
    Thursday, January 15, 2015 8:08 PM
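    For illustration, one common way to fan a single hot partition out over more partitions while keeping range queries local is to derive the partition key from the leading bits of the key being queried, so each extra bit doubles the partition count (and, in this design, the RAM needed for the in-memory index). This is only a sketch with assumed names (`keyBytes`, `fanOutBits`); it is not the poster's actual schema.

    // Map a key to one of 2^fanOutBits partitions using its leading bits.
    // A range query whose keys share a prefix longer than fanOutBits bits
    // still lands in a single partition.
    static string GetPartitionKey(byte[] keyBytes, int fanOutBits)
    {
        int prefix = (keyBytes[0] << 8) | keyBytes[1];   // first 16 bits of the key
        int partition = prefix >> (16 - fanOutBits);     // keep only the top fanOutBits bits
        return partition.ToString("x4");                 // fanOutBits = 8 gives 256 partitions
    }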
  • Try querying your 54 million entities as you would in your app before you go any further.

    I don't see how "Microsoft messed up your environment suddenly."  You just ran into an effect of a poor design for the platform - Read the scaling and performance docs for azure, and their best practices.  Watch their Channel 9 videos.  Check out the MSDN magazine blogs, such as this one.  Read about partitioning and scaling and continuation tokens.

    Remember, you're probably doing something at the upper limits, and you'll have to be exceptional at making your data match the benefits of the platform.

    I don't have enough info to help you do it a better way, but if you can let me know what you'll be searching for in the blockchain after the records are all stored, I might be able to come up with a better partitioning method. Again, if you can't find it quickly, there's no need to bother storing it.


    Darin R.

    Thursday, January 15, 2015 10:06 PM
  • Try querying your 54 million entities as you would in your app before you go any further.

    It works with no problem, near instantly, since these are range queries. I am well aware of how to optimize queries and have never had a problem with them. My problem is about inserts and the magic threshold.

    Read the scaling and performance docs for azure, and their best practices.  Watch their Channel 9 videos.  Check out the MSDN magazine blogs, such as this one.  Read about partitioning and scaling and continuation tokens.

    I have read it all and listened to it all, and nobody ever mentioned that you should not put 54 million entities in a partition.
    What they say is that you should not query more than 2,000 entities per second per partition. And I don't.
    Nothing says that insert time depends on the size of the partition.

    I even quote Microsoft: "In general, write and query times are less affected by how the table is partitioned." (Niranjan Nilakantan of Microsoft, Source)

    It is not as if I have never used Azure or don't know how the storage works. I have been using it in production for several years, and I am even a Microsoft Certified Trainer on it. The data is several TB.

    What I am complaining about is that the limits are not documented and, worse, depend on the location of your storage region. As I said, changing my partitioning strategy impacts the RAM... If I don't want to impact the RAM, I need to make smaller batches, which means more transactions. More transactions hurt my performance through added latency, as well as the price.

    West Europe is a shitty region.

    I did not choose my partitioning at random.

    Worse, the problem in West Europe came after 5 million lines at 100 bytes each, which is 500 MB... hardly big data. The problem is with the number of lines, not the size.





    Friday, January 16, 2015 12:59 AM
  • I'd definitely start a support call with Microsoft if you've got something contradictory to the courses you've taken.

    re: Nothing says that the time of insert depends on the size of the partition.

    I guess I just assumed that based on other laws of storage. It makes sense that storing items in an existing structure containing 1 million items would be faster than storing items in an existing group of 1 billion. I can't see how Azure would be exempt.

    Re: "In general, write and query times are less affected by how the table is partitioned"

    Interesting quote. I'd love to see the article/bigger context it was taken from, but I can't seem to find the referenced article. I do think it's slightly out of context: if partitioning and RowKeys weren't important, there wouldn't be so many articles stressing their importance (including the very blog post that quote was featured on).

    Re: What I complain is that the limits are not documented,

    I believe that's why they publish "best practices" and all the articles about choosing Partition and RowKeys. Individual developers have to take their needs, build a POC of a variety of different methods, and then find the best one for them. Support could help you with that, I suppose.

    Re: and worse, depends on the location of your storage region.

    Not surprised at that at all.  Different regions would have different storage or network loads.  I'd also bet this would change over time as Microsoft deployed resources to data centers that have been found to be loaded more than others.

    Sorry I couldn't be of any real help.  I've not done it your way for any of my projects, and don't have any of the issues you've had. 

    Based on your posts, it sounds like you're pretty cemented into one design, so I'm not sure how anyone other than Microsoft could help out.  Start a support ticket with them, see what they offer. 

    I'd be very interested to hear how it's resolved  - post a solution if they have one.


    Darin R.

    Friday, January 16, 2015 4:54 PM
  • Re : I guess I just assumed that based on other laws of storage.  It makes sense that storing items in an existing structure containing 1 million would be faster than storing items in an existing group of 1 billion.

    It does not. Since range queries are fast, it means that internally Microsoft is surely using binary trees to store the items in a partition. Insertion into a binary tree is O(log n), so I should get log(n) growth in insert time, not a WTF bump after a magic threshold (see the back-of-the-envelope calculation after this post).

    It would make sense that partitions that grow big in size get split across storage nodes after some threshold... but geez, 500 MB is not the end of the world for one storage node... Splitting would make my performance slower for a while, until the split is finished and the second storage node is operational. (I should take a look at the MS white paper.)

    Re : The individual developer would have to take their need and build a POC of a variety of different methods, then find the best one for them.

    And I did! The magic started happening suddenly, not at the time of my POC.

    I'll try to make a call, or ask some connections who can get in contact with MS.

    [UPDATE]

    Reading the Windows Azure Storage white paper http://www-bcf.usc.edu/~minlanyu/teach/csci599-fall12/papers/11-calder.pdf to solve the mystery...

    [/UPDATE] 

    [Conclusions]

    West Europe is shit; don't put your big data there. Please do not mark this as an answer, since Microsoft did not give any solution or response.

    [/Conclusions]


    Friday, January 16, 2015 5:06 PM
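    To put a rough number on the O(log n) argument above (a purely illustrative calculation using round figures, not the poster's exact counts):

    // If each insert cost O(log n), growing a partition from 1 million to
    // 5.4 million entities would slow inserts by roughly this factor:
    double factor = Math.Log(5400000) / Math.Log(1000000);
    Console.WriteLine(factor);   // ~1.12, i.e. about 12% slower, nothing like a sudden jump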
  • The larger the partition gets, the slower random writes may become relative to a small partition. Small partitions are better for performance and scale, as we can load-balance them to match the scale of your traffic. That said, it would be good to look at your storage account to confirm whether what you are seeing is due only to the large partition or to some other reason. If you can send an email to ascl@microsoft.com with your account name and the approximate timeline of this slowness, we can have a look at it. Thanks.
    Monday, March 9, 2015 9:08 PM
  • I sent a mail, but it got rejected (it said "not authenticated"), so I copy it here.

    Please don't mark a response as the answer when no solution has been provided.

    I contact you for the scaling problem noticed in West Europe and detailed here : https://social.msdn.microsoft.com/Forums/azure/en-US/dbde4333-26a7-44b5-a2a6-8d373dd12d89/azure-table-inserts-do-not-scale?forum=windowsazuredata

    I repeat myself: I am not affected in North Europe until way, way more data is inserted. In North Europe, the time a batch insert takes goes up linearly, and only after much more data has been inserted.
    Again, please read my thread on the forum carefully; I documented this carefully.
    Microsoft tends to give copy/pasted responses on the forum after reading the first 3 words, without reading the content. Please don't waste the time I spent documenting the problem.
    The difference in performance behavior between regions has clearly led me to advise some regions over others to my customers, depending not on the geographic position of their users but on the quality of the region's storage. The state of West Europe storage is BAD, and not advisable for production for some of my customers.
    The performance has gotten even worse since the time I posted on the forum.
    I reproduced the problem on the storage account "nbitcoin".

    A tool running my test program is running right now, and it takes 3 seconds per batch to insert. I noticed the same pattern I'm talking about in the thread.

    This graph shows how insert time degrades after 21,570 batches in one partition (or 2,157,000 entities).
    The same graph for North Europe only starts going up linearly after more than 150,000 batches.

    Entity size is approximately 32 bytes (total approx. 65 MB, which is hardly "a big partition").

    Now, Microsoft, please: either do not respond to this thread, or respond after having read my actual complaint COMPLETELY. But do not mark it solved until you have an answer about why West Europe sucks so much compared to North Europe.

    I am very pleased and excited by Azure products, but the way questions are handled on this forum has been a big waste of time so far, and it is not the first time this has happened.

    Sorry if this seems aggressive, but it hurts to spend so much time documenting your flaws and see it ignored with the copy/pasted response you give everywhere.



    Tuesday, March 10, 2015 1:38 AM
  • My apologies for the inconvenience. I have a different contact who should be able to help you out with your problem: Manish.Chablani@microsoft.com. Please include me on the TO line as well: micurd@microsoft.com. Thanks.
    Thursday, March 12, 2015 5:54 PM