A couple basic questions on Table Storage

  • Question

  • I have a few basic questions on Table Storage.

    1. Will records that are in different tables, so long as they are in the same partition (having the same PartitionKey), be stored on the same physical machine?  If so, I won't degrade access performance by storing things in different tables, so long as they are stored with the same PartitionKey?
    2. Is it only for doing transactions that I need to have the records stored in the same table and the same partition?
    3. Can the operations in a transaction batch include read operations, or only CUD operations?

    Thanks,

    Ray.

    Thursday, December 16, 2010 4:25 PM

Answers

  • The following is from the Azure Storage Team scalability post referred to earlier:

    a table partition are all of the entities in a table with the same partition key value, and most tables have many partitions.

    In the post ray247 refers to, Yi-Lun Luo writes:

    Within a single storage account, entities belong to the same partition are always stored on the same server, even if they're in different tables. Entities with different partition keys are stored on different servers, so queries can be load balanced.

    This is the only time I have ever seen mention that entities with the same PartitionKey are stored on the same server regardless of table. This may be a completely undocumented implementation detail and as such it is not something you should rely on. It could, of course, be an uncharacteristic mistake.

    You should assume that data in different tables is in different partitions and as such may be handled by different partition servers.

    • Marked as answer by ray247ray Thursday, December 16, 2010 10:05 PM
    Thursday, December 16, 2010 9:44 PM
    Answerer

All replies

  • Hi,

    1) I think that if objects are in different tables with the same partition key, the partitions for each table can be distributed across multiple servers.

    2) Entity group transactions can currently include up to 100 entities with the same partition key, and the payload must be under 4 MB.

    3) I think they are used for CUD operations only.

    Regards,

    Alan

     


    http://www.CloudCasts.net - Community Webcasts Powered by Azure
    Thursday, December 16, 2010 8:12 PM
  • Hi Ray,

    In addition to the answers Alan gave, I suggest reading this blog post about Azure storage scalability. It also includes some more info on partitioning, load balancing and batch transactions.

    Hope this helps.

    Edward 

    Thursday, December 16, 2010 8:21 PM
  • Records that are in different tables, so long as they are in the same partition (having the same PartitionKey), they will be stored on the same physical machine?  Thus, I won't degrade access performance by storing things in different tables, so long as they are stored with the same PartitionKey?

    A partition is defined by table name and PartitionKey. Consequently, objects in different tables are in different partitions regardless of the values of the PartitionKey. Storing things in different tables may actually improve performance, since there are different scalability targets (see the post referred to by Edward Bakker) for partitions and storage accounts. Indeed, using distinct storage accounts - for example, for diagnostics - could improve performance even further. These performance benefits require the use of parallel queries.
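
    To illustrate the point about partition identity, here is a minimal Python sketch. It does not call the storage service; the `PartitionId` type, the `group_by_partition` helper and the table names are all hypothetical, used only to model the rule that a partition is the pair (table name, PartitionKey).

```python
from collections import defaultdict
from typing import NamedTuple


class PartitionId(NamedTuple):
    """Hypothetical identifier: a partition is scoped to a single table."""
    table: str
    partition_key: str


def group_by_partition(entities):
    """Group (table, partition_key, row_key) triples by their partition."""
    groups = defaultdict(list)
    for table, partition_key, row_key in entities:
        groups[PartitionId(table, partition_key)].append(row_key)
    return dict(groups)


# Made-up sample data: two tables that share the PartitionKey "contoso".
entities = [
    ("Customers", "contoso", "c1"),
    ("Customers", "contoso", "c2"),
    ("Orders", "contoso", "o1"),
]
partitions = group_by_partition(entities)
```

    The three entities share one PartitionKey but fall into two partitions, one per table, so nothing in this model implies they are served together.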

    It is only for doing transactions that I need to have the records stored in the same table and the same partition?

    No. There are also performance benefits from doing a range scan within a single partition. You may encounter continuation tokens if a range scan crosses a partition boundary (into a different PartitionKey). A continuation token indicates that you need to go back to the server for more data, and this has a performance impact.
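
    The cost of continuation tokens can be sketched with a small simulation. This is not the storage client API: `query_page` is a hypothetical stand-in for the server, and a fixed page size stands in for whatever actually causes the service to stop and return a token. The point is only that each token means one more round trip.

```python
def query_page(rows, page_size, token=None):
    """Simulated server call: return one page and a continuation token (or None)."""
    start = token or 0
    end = start + page_size
    next_token = end if end < len(rows) else None
    return rows[start:end], next_token


def query_all(rows, page_size):
    """Follow continuation tokens until the simulated server has no more data."""
    results, token, round_trips = [], None, 0
    while True:
        page, token = query_page(rows, page_size, token)
        results.extend(page)
        round_trips += 1
        if token is None:
            return results, round_trips


rows = list(range(5))
all_rows, trips = query_all(rows, page_size=2)
```

    With five rows and a page size of two, the loop makes three round trips to collect the full result.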

    Will the operations in a transaction batch have read operations, or only CUD operations?

    The rules for entity group transactions are documented here. You can submit a transaction with create, update and delete OR you can submit a transaction with queries. You can't mix them.
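
    The rules mentioned in this thread (at most 100 entities, one table and one PartitionKey, payload under 4 MB, no mixing of queries with create/update/delete) can be collected into a rough pre-flight check. This is only a sketch: `validate_batch` and the operation dictionaries are hypothetical, and the documentation remains the authority on the exact limits.

```python
MAX_ENTITIES = 100
MAX_PAYLOAD_BYTES = 4 * 1024 * 1024  # 4 MB, as quoted in this thread


def validate_batch(operations):
    """Return a list of rule violations for a proposed batch.

    Each operation is a dict with the hypothetical keys
    'kind' ('create', 'update', 'delete' or 'query'),
    'table', 'partition_key' and 'size_bytes'.
    """
    problems = []
    if len(operations) > MAX_ENTITIES:
        problems.append("more than 100 entities")
    if len({(op["table"], op["partition_key"]) for op in operations}) > 1:
        problems.append("operations span more than one partition")
    # Assumption: a payload of exactly 4 MB is treated as allowed here.
    if sum(op["size_bytes"] for op in operations) > MAX_PAYLOAD_BYTES:
        problems.append("payload exceeds 4 MB")
    kinds = {op["kind"] for op in operations}
    if "query" in kinds and len(kinds) > 1:
        problems.append("queries mixed with create/update/delete")
    return problems


def make_op(kind, table="Orders", partition_key="contoso", size_bytes=1024):
    """Build a sample operation with made-up defaults."""
    return {"kind": kind, "table": table,
            "partition_key": partition_key, "size_bytes": size_bytes}


ok_batch = [make_op("create"), make_op("update"), make_op("delete")]
mixed_batch = [make_op("create"), make_op("query")]
cross_table_batch = [make_op("create"), make_op("create", table="Customers")]
```

    Here `validate_batch(ok_batch)` returns an empty list, while the mixed and cross-table batches each report one problem.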

    Thursday, December 16, 2010 8:59 PM
    Answerer
  • Thanks, guys, for your input.  I read in this post (http://social.msdn.microsoft.com/Forums/en-US/windowsazuredata/thread/5ed96f0f-e4e2-4b0a-9c39-2e227f10b1f1) that records with the same partition key, regardless of table, will be stored on the same server.  I just want to confirm whether this is true.
    Thursday, December 16, 2010 9:23 PM
  • Just to confirm, entities in different tables are in different partitions, regardless of the partition key.  (I'm agreeing with what Neil said, and disagreeing with the post ray247 cited.  I believe that was just a mistake.)
    Friday, December 17, 2010 1:01 AM