Performance - Table Service, SQL Azure - inserts. Query speed on large amounts of data.

    Question

  • Hi! I've read many posts and articles comparing SQL Azure and Table Service, and most of them say that Table Service is more scalable than SQL Azure. 1 2 3 4 5 6 7 8 9  But this benchmark shows a different picture.

    My case. Using SQL Azure: one table with many inserts, about 172,000,000 per day (2,000 per second). Can I expect good performance for inserts and selects when I have 2 million records or 9999...9 billion records in one table?
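
    As a quick check on the rates quoted above, here is a minimal sketch of the arithmetic. The function names are made up for illustration; the only inputs are the 2,000 inserts per second and the 2 million row count from the question.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def inserts_per_day(rate_per_second: int) -> int:
    """Total inserts in one day at a constant insert rate."""
    return rate_per_second * SECONDS_PER_DAY


def seconds_to_reach(total_rows: int, rate_per_second: int) -> float:
    """Time in seconds needed to accumulate total_rows at a constant rate."""
    return total_rows / rate_per_second


if __name__ == "__main__":
    # 2,000 inserts per second, as in the question.
    print(inserts_per_day(2000))
    # How quickly the 2 million row mark is passed at that rate.
    print(seconds_to_reach(2_000_000, 2000))
```

    At a steady 2,000 inserts per second the table passes 2 million rows in under 20 minutes, so the interesting case is the multi-billion one.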

    Using Table Service: one table with some number of partitions. The number of partitions can be large, very large.

    Question #1: Does Table Service have any limitations or best practices for creating many, many, many partitions in one table?

    Question #2: In a single partition I have a large number of small entities, as in the SQL Azure example above. Can I expect good performance for inserts and selects when I have 2 million entities or 9999 billion entities in one partition?

    I know about sharding and partitioning solutions, but this is a cloud service: isn't the cloud powerful enough to do all that work without extra code from me?

    Question #3: Can anybody show me benchmarks for querying large amounts of data in SQL Azure and Table Service?

    Question #4: Maybe you could suggest a better solution for my case.

    Thanks in advance.

     

     

    Wednesday, October 06, 2010 12:06 PM

Answers

  • The scalability targets for Azure Storage are described in this post from the Azure Storage Team. The scalability statements about NoSQL stores like Azure Storage refer to the quantity of data stored, not to query performance.

    1) A Table Service table can have any number of partitions, from one up to the number of entities. The partition structure is very application dependent, i.e. it depends on how the data is used.

    Smaller partitions help performance because the storage service can migrate hot partitions. Furthermore, the maximum number of operations per second on a table is somewhat higher than that for a single partition, so more partitions allow higher throughput through parallel operations.
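
    To make the parallelism point concrete, here is a rough sketch of how one might estimate the smallest number of partitions a write load has to be spread over. The function name is hypothetical, and the per-partition figure of 500 operations per second in the example is only a placeholder; take the real number from the scalability-targets post. It also assumes the load is spread evenly across partitions, which real key distributions rarely achieve.

```python
import math


def min_partitions(target_ops_per_second: float,
                   per_partition_ops_per_second: float) -> int:
    """Lower bound on partitions needed if load is spread perfectly evenly."""
    if per_partition_ops_per_second <= 0:
        raise ValueError("per-partition rate must be positive")
    return math.ceil(target_ops_per_second / per_partition_ops_per_second)


if __name__ == "__main__":
    # 2,000 inserts/s from the question; 500 ops/s per partition is an
    # assumed placeholder, not a figure confirmed in this thread.
    print(min_partitions(2000, 500))
```

    This is only a lower bound: the table-wide and account-wide limits still cap the total, however many partitions there are.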

    Larger partitions provide more opportunity for entity group transactions. They also reduce the likelihood of receiving continuation tokens, which slow queries down.
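
    Since an entity group transaction only covers entities that share a PartitionKey, a bulk insert has to be grouped by partition first. The sketch below shows that grouping in plain Python without calling any storage library. The helper name is made up, entities are modelled as plain dicts, and the limit of 100 entities per batch is an assumption that should be checked against the current Table Service documentation.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List

MAX_BATCH_SIZE = 100  # assumed per-transaction entity limit; verify before use


def batches_by_partition(entities: Iterable[dict],
                         max_batch_size: int = MAX_BATCH_SIZE
                         ) -> Iterator[List[dict]]:
    """Yield lists of entities that share a PartitionKey, capped in size."""
    grouped: Dict[str, List[dict]] = defaultdict(list)
    for entity in entities:
        grouped[entity["PartitionKey"]].append(entity)
    for items in grouped.values():
        # Split each partition's entities into chunks of at most max_batch_size.
        for start in range(0, len(items), max_batch_size):
            yield items[start:start + max_batch_size]


if __name__ == "__main__":
    sample = [{"PartitionKey": "a", "RowKey": str(i)} for i in range(250)]
    sample += [{"PartitionKey": "b", "RowKey": str(i)} for i in range(30)]
    for batch in batches_by_partition(sample):
        print(batch[0]["PartitionKey"], len(batch))
```

    Each yielded batch could then be submitted as one transaction, and batches for different partitions could be sent in parallel.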

    2) The PartitionKey provides auto-sharding, since its value is used to distribute entities to different partitions automatically. Inside a single partition, entities are indexed by RowKey, so any query would need to include it to avoid scanning the whole partition. I have not seen any performance numbers for large numbers of entities in a single partition.
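
    To illustrate what "including the RowKey" looks like, here is a small sketch that builds filter strings in the OData style the Table Service accepts for queries. The helper names are hypothetical and nothing here talks to the service; it only shows a point lookup and a RowKey range within one partition.

```python
def _quote(value: str) -> str:
    """Quote a string literal for a filter, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def point_query_filter(partition_key: str, row_key: str) -> str:
    """Filter that addresses exactly one entity by both keys."""
    return (f"PartitionKey eq {_quote(partition_key)} "
            f"and RowKey eq {_quote(row_key)}")


def range_query_filter(partition_key: str, row_key_from: str,
                       row_key_to: str) -> str:
    """Filter for a half-open RowKey range [from, to) inside one partition."""
    return (f"PartitionKey eq {_quote(partition_key)} "
            f"and RowKey ge {_quote(row_key_from)} "
            f"and RowKey lt {_quote(row_key_to)}")


if __name__ == "__main__":
    print(point_query_filter("sensor-1", "0001"))
    print(range_query_filter("sensor-1", "0001", "0100"))
```

    A filter on any other property, without the RowKey, leaves the service to scan the entities in the partition.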

    • Marked as answer by Mog Liang Wednesday, October 13, 2010 9:24 AM
    Wednesday, October 06, 2010 4:14 PM

All replies

  • This recently published paper documents observations of Azure performance and describes results for Azure Tables among other aspects.
    Wednesday, October 06, 2010 7:35 PM
  • Sorry for my silence. I've been drilling down into cloud computing and doing a little research of my own: a simple stress test. I need time to collect statistics now, and I think I'll share my results some day :)
    Thank you for the replies! The PDF with the Azure benchmark is great! Thanks a lot!
    Friday, October 22, 2010 6:27 AM