none
How to Generate Unique PartitionKeys in Azure Table Storage

    Question

  • SQL databases provide an autonumber primary key field that is incremented for each new data row inserted.  This makes it easy to add a new customer to tblCustomers and ensure their customerID is unique.  How do I do this in Azure Tables?

    Thank you!


    Tuesday, February 7, 2017 4:23 PM

All replies

  • Hello, 

    Please check the link below:

    https://docs.microsoft.com/en-us/azure/storage/storage-dotnet-how-to-use-tables

    Warm regards,

    Siby Philip

    Tuesday, February 7, 2017 4:51 PM
  • HI Siby - That page has zero information on what I am asking.

    How do I generate a unique PartitionKey in Azure Tables?

    Do you have any specific information on how to do this?

    Thank you.

    Tuesday, February 7, 2017 7:17 PM
  • More specifics for you:

    I have a table for customers that uses customerID as the primary key. In SQL Server, this field is an integer and auto number that is incremented by 1 for each new row added to the SQL database table.  This feature is built into SQL server but is not available in Azure Tables.

    What is the best way (performance, consistency, scalability) to accomplish this goal in Azure Tables?  If the largest customerID value is 1234, what is the best way to add a new row with customerID 1235?

    If I call the table to get the latest value, then manually increment that value and insert it into the table...the latency involved with those steps opens up the possibility of duplicate customerIDs.

    I hope that helps. I am assuming this is a common issue and was hoping for a 'best possible' solution.

    Thank you.

    Tuesday, February 7, 2017 7:27 PM
  • There's currently no built-in auto-increment feature in Azure Table Storage. You can vote for this feature here.

    Also, it would be worth checking out the guide for Designing a Scalable Partitioning Strategy for Azure Table Storage. The concept of a primary key in Azure Table is different from that of SQL Server.

    --------------------------------------------------------------------------------------------

    Md. Shihab

    Thursday, February 9, 2017 9:34 AM
  • Thank you for the reply.  You are correct.  Does MSFT have a best practices work around for this?
    Friday, February 10, 2017 2:41 AM
  • There are three alternatives for you:
    1) Use a GUID instead.  It’s bigger, but basically eliminates the need for any coordination.
    2) Build a parallel mechanism for providing incremented values.  This can be tough if concurrency isn't properly handled (either requires building a service, or multiple table accesses per key, the latter may not scale well).
    3) Stick with SQL.  Azure Tables isn’t meant to completely replace SQL, it’s not a full database, it’s just a place to store some data with table like semantics.
    --------------------------------------------------------------------------------------------
    Md. Shihab
    • Proposed as answer by Md Shihab Sunday, February 19, 2017 7:31 AM
    Monday, February 13, 2017 2:45 AM
  • Also worth mentioning that Append Only pattern like sequential partition keys, will likely have a negative impact on performance of inserting a new entity because of the fact that storage service may create range partitions for these entities and bundle them to be served from the same partition server. The idea behind that is I believe to optimize for range queries. If they are not frequently using range queries then they may be better off not using sequential partition keys for scalability.

    If you choose to use the sequential partition keys, an alternative solution is to use a header entity when you insert a new entity to the table and do a batch insert. The header entity would have the same – supposedly latest – partition key and a fixed row key (ie. something like a constant string “HeaderRow”).

    The operation would look like batch inserting ((PK1, “HeaderEntity”) and (PK1, RK1)).  If another process did insert (PK1, “HeaderEntity”)in the meantime, the batch insert operation would fail because the header entity would exist in the table already.

    Two outcomes of that, firstly batch operation is atomic so no dupe keys / entities would be created, secondly the failed batch operation would give you the exact http status code for that error and it would also give you the index of the failed operation in the batch  (embedded inside its extended error info) which would be 0 in this case assuming that the first operation in your batch is your header entity. The order does not matter really, since batch operation is atomic, just pointing out the correlation. Based on that you can deterministically understand that, in the interim some other process has created the same partition key and so he can re-compute the partition key and resubmit the request. All of this could be done programmatically in n autonomous retry logic.

    If you are not familiar with parsing the Storage Exception, there is a handy nugget package to do that and extract failing index of the operation, http status code etc. here  https://www.nuget.org/packages/AzureStorageExceptionParser/

    --------------------------------------------------------------------------------------------

    Md. Shihab


    • Edited by Md Shihab Monday, February 13, 2017 10:31 AM
    Monday, February 13, 2017 10:30 AM