
POST to Azure tables w/o PartitionKey/RowKey: that's a bug, right?

  • Saturday, December 13, 2008 6:20 AM, Mike Amundsen
     
    i noticed tonight that i can POST an entity body to a table w/o supplying a PartitionKey or RowKey. I can see the new Entity when I query against the table. I can even do a DELETE against the table by passing empty Partition/Row elements.

    that's a bug, right?

    my traces:

    REQUEST **************************************
    POST /moretables HTTP/1.1
    User-Agent: amundsen/1.0
    Content-Type: application/atom+xml
    Accept: application/atom+xml
    Content-MD5: p+JNsXK0mGi539eht4+d/g==
    x-ms-date: Sat, 13 Dec 2008 06:11:11 GMT
    authorization: SharedKey mamund:******************
    Host: mamund.table.core.windows.net
    Content-Length: 1392
    Expect: 100-continue
    Connection: Keep-Alive

    <?xml version="1.0" encoding="utf-8" standalone="yes"?>
      .. interesting stuff removed...
      <!-- two required items - do not change -->
      <d:PartitionKey></d:PartitionKey>
      <d:RowKey></d:RowKey>
      <!-- two required items - do not change -->
    ....

    RESPONSE ************************************
    HTTP/1.1 201 Created
    Cache-Control: no-cache
    Transfer-Encoding: chunked
    Content-Type: application/atom+xml;charset=utf-8
    ETag: W/"datetime'2008-12-13T06%3A10%3A54.60097Z'"
    Location: http://mamund.table.core.windows.net/moretables(PartitionKey='',RowKey='')
    Server: Table Service Version 1.0 Microsoft-HTTPAPI/2.0
    x-ms-request-id: da949d7f-1536-41a0-9f42-abe860aad743
    Date: Sat, 13 Dec 2008 06:10:53 GMT

    5A9
    <?xml version="1.0" encoding="utf-8" standalone="yes"?>
    <entry xml:base="http://mamund.table.core.windows.net/" xmlns:d="http://schemas.microsoft.com/ado/2007/08/dataservices" xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata" m:etag="W/&quot;datetime'2008-12-13T06%3A10%3A54.60097Z'&quot;" xmlns="http://www.w3.org/2005/Atom">
      <id>http://mamund.table.core.windows.net/moretables(PartitionKey='',RowKey='')</id>
      <title type="text"></title>
      <updated>2008-12-13T06:10:54Z</updated>
      <author>
        <name />
      </author>
      <link rel="edit" title="moretables" href="moretables(PartitionKey='',RowKey='')" />
    ...
     

    REQUEST *******************************
    GET /moretables(PartitionKey='',RowKey='') HTTP/1.1
    User-Agent: amundsen/1.0
    Content-Type: application/atom+xml
    Accept: application/atom+xml
    x-ms-date: Sat, 13 Dec 2008 06:16:30 GMT
    authorization: SharedKey mamund:****************************
    Host: mamund.table.core.windows.net
    Connection: Keep-Alive

    RESPONSE ******************************
    HTTP/1.1 200 OK
    Cache-Control: no-cache
    Transfer-Encoding: chunked
    Content-Type: application/atom+xml;charset=utf-8
    ETag: W/"datetime'2008-12-13T06%3A10%3A54.60097Z'"
    Server: Table Service Version 1.0 Microsoft-HTTPAPI/2.0
    x-ms-request-id: 3d6e6997-3d2d-4f9c-a531-618e79bc36b1
    Date: Sat, 13 Dec 2008 06:15:33 GMT

    5AA
    <?xml version="1.0" encoding="utf-8" standalone="yes"?>
    <entry ...  m:etag="W/&quot;datetime'2008-12-13T06%3A10%3A54.60097Z'&quot;" xmlns="http://www.w3.org/2005/Atom">
      <id>http://mamund.table.core.windows.net/moretables(PartitionKey='',RowKey='')</id>
    ...

    REQUEST ****************************************************
    DELETE /moretables(PartitionKey='',RowKey='') HTTP/1.1
    User-Agent: amundsen/1.0
    Content-Type: application/atom+xml
    Accept: application/atom+xml
    x-ms-date: Sat, 13 Dec 2008 06:16:30 GMT
    authorization: SharedKey mamund:*********************************
    if-match: W/"datetime'2008-12-13T06%3A10%3A54.60097Z'"
    Host: mamund.table.core.windows.net
    Content-Length: 0

    RESPONSE ***************************************************
    HTTP/1.1 204 No Content
    Cache-Control: no-cache
    Content-Length: 0
    Server: Table Service Version 1.0 Microsoft-HTTPAPI/2.0
    x-ms-request-id: ba6fb0c5-28f6-4d75-97f3-a456ba1def20
    Date: Sat, 13 Dec 2008 06:15:33 GMT




    Mike Amundsen [http://amundsen.com/blog/]
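
    The entity address used in the GET and DELETE traces above can be sketched as a small helper. This is only an illustration: `entity_path` is a hypothetical name, not part of any Azure library, and it assumes the usual OData convention of doubling an embedded single quote inside a string literal. It does no percent-encoding.

```python
def entity_path(table: str, partition_key: str, row_key: str) -> str:
    """Build the path of a single entity, in the form seen in the traces."""

    def literal(value: str) -> str:
        # OData string literals are single-quoted; an embedded quote is doubled.
        return "'" + value.replace("'", "''") + "'"

    return f"/{table}(PartitionKey={literal(partition_key)},RowKey={literal(row_key)})"


# Empty strings are ordinary values, so they still produce a usable address.
print(entity_path("moretables", "", ""))
```

    With both keys empty, the helper yields the same path that appears in the request lines above.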

All Replies

  • Saturday, December 13, 2008 7:47 PM, Jai Haridas, MSFT
     Answer
    Hi Mike,
     We allow an empty string for the partition and row keys and only reject nulls, i.e. when the following is sent to the server:
      <d:PartitionKey m:null="true"></d:PartitionKey>   
      <d:RowKey m:null="true"></d:RowKey>   
     
    So, empty string is by design.

    Thanks,
    Jai
    • Marked as answer by Mike Amundsen, Saturday, December 13, 2008 9:00 PM
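
    The difference between an empty key and a null key can be sketched as follows. This is an illustrative client-side model only: `key_element` and `is_acceptable_key` are hypothetical helpers, and the check merely mirrors the rule stated in the reply above (empty string accepted, `m:null="true"` rejected). The two namespace URIs are the ones shown in the response trace.

```python
import xml.etree.ElementTree as ET
from typing import Optional

# Namespace URIs as they appear in the <entry> element of the trace.
D = "http://schemas.microsoft.com/ado/2007/08/dataservices"
M = "http://schemas.microsoft.com/ado/2007/08/dataservices/metadata"
ET.register_namespace("d", D)
ET.register_namespace("m", M)


def key_element(name: str, value: Optional[str]) -> ET.Element:
    """Build a d:PartitionKey or d:RowKey element; None becomes m:null="true"."""
    elem = ET.Element(f"{{{D}}}{name}")
    if value is None:
        elem.set(f"{{{M}}}null", "true")
    else:
        elem.text = value
    return elem


def is_acceptable_key(elem: ET.Element) -> bool:
    """Hypothetical check: only an explicit null is rejected."""
    return elem.get(f"{{{M}}}null") != "true"


empty_key = key_element("PartitionKey", "")
null_key = key_element("PartitionKey", None)
print(ET.tostring(empty_key, encoding="unicode"), is_acceptable_key(empty_key))
print(ET.tostring(null_key, encoding="unicode"), is_acceptable_key(null_key))
```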
  • Saturday, December 13, 2008 9:55 PM, Mike Amundsen
     
    "empty string is by design"

    thanks for the reply.





  • Friday, December 19, 2008 6:36 PM, Roger Jennings
     
    How can String.Empty for both PartitionKey and RowKey ensure a unique primary key value?
     
    Jim Nakashima says in step 7 of his "Windows Azure Walkthrough: Simple Table Storage": "This default of assigning the PartitionKey and setting the RowKey to a hard coded value (String.Empty) gives the storage system the freedom to distribute the data."

    Even that seems strange to me. It appears to create a different entity group from each entity. Does that mean that using a common PartitionKey to define an entity group prevents distributing the data?

    --rj

    OakLeaf Blog
  • Friday, December 19, 2008 7:45 PM, Niranjan Nilakantan, MSFT
     Proposed Answer
    Having String.Empty for both PartitionKey and RowKey will ensure one unique entity.  This does imply that your table will have only one entity under this scheme.

    If only the PartitionKey is set on each entity (but not the RowKey), then as you pointed out, each entity group has only one entity, and this allows maximum distribution.

    If the PartitionKey were empty (or constant) and only the RowKey were set for each entity, there would be only one entity group for all the entities in the table, which allows the minimum distribution, i.e. none.

    A hybrid where both are set allows the application to choose the size of a partition/entity group.
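
    The four key schemes described above can be sketched with a toy in-memory model. This is not how the service is implemented; it only assumes what the post states: the (PartitionKey, RowKey) pair is the unique key, and entities sharing a PartitionKey form one entity group. `partition_sizes` is a hypothetical helper, and the model simply collapses duplicate keys rather than modelling how the service responds to a second insert.

```python
from collections import defaultdict


def partition_sizes(keys):
    """Group (PartitionKey, RowKey) pairs by PartitionKey and count the rows."""
    unique = set(keys)  # the pair is the primary key, so duplicates collapse
    sizes = defaultdict(int)
    for partition_key, _row_key in unique:
        sizes[partition_key] += 1
    return dict(sizes)


ids = ["a", "b", "c"]
both_empty = [("", "") for _ in ids]          # one partition, one entity in total
pk_only = [(i, "") for i in ids]              # one entity per partition
rk_only = [("", i) for i in ids]              # a single partition holds everything
hybrid = [("x", "a"), ("x", "b"), ("y", "c")]  # the application picks the group size

for name, keys in [("both empty", both_empty), ("PartitionKey only", pk_only),
                   ("RowKey only", rk_only), ("hybrid", hybrid)]:
    print(name, partition_sizes(keys))
```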
  • Friday, December 19, 2008 9:29 PM, Mike Amundsen
     Proposed Answer
    Niranjan:

    ok, my summary of your post follows (please check me if i have this incorrect):
    #1 - P=empty, R=empty (one partition, one row)
    #2 - P=data, R=empty (multiple partitions, one row per partition)
    #3 - P=empty, R=data (one partition, multiple rows for the one partition)
    #4 - P=data, R=data (multiple partitions, multiple rows for each partition)

    i think i understand the logic of this arrangement, but have yet to come up w/ a compelling use case for it.  Can you point me to some example scenarios where any one of the options offers a clear advantage/disadvantage (i.e. read speed, write speed, query speed) over the others?

    for example:
    - does #2 offer faster read times (for a large collection of rows) over #3 or #4?
    - does #3 offer the fastest write time? #2?
    - does #3 offer the fastest query time?
    - other than specificity, does #4 have any advantage?

    also, is option #1 just a (unintended) consequence of the desire to support option #2 & #3?

    finally, once billing comes into play, will any of the above scenarios have an impact on account charges (i.e. more partitions = more cost, etc.)?

    Thanks

  • Saturday, December 20, 2008 9:13 PM, Niranjan Nilakantan, MSFT
     
    The matrix you have is correct.

    #1 is a byproduct of not constraining the values in PartitionKey or RowKey, other than that they need to be strings.

    If the key for your logical data model has more than one property, you should default to #4.

    If the key for your logical data model has only one property, you would default to #2.

    We have two columns in the key to separate what defines uniqueness (PartitionKey and RowKey) from what defines scalability (just PartitionKey).

    In general, write and query times are less affected by how the table is partitioned. They are affected more by whether you specify the PartitionKey and/or RowKey in the query.

    The overall recommendation is "Use option 4 (or 2 if you have only one key property), unless your application has a special need to use only 1 partition".
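
    One way to read this recommendation is as a mapping from a logical key to the two table keys. The sketch below is only an illustration under stated assumptions: `to_table_keys` is a hypothetical helper, using the first property as the PartitionKey is an arbitrary choice (the post does not say which property should define the partition), and the `|` separator is likewise made up.

```python
def to_table_keys(logical_key, separator="|"):
    """Map a logical key (a sequence of properties) to (PartitionKey, RowKey)."""
    parts = [str(p) for p in logical_key]
    if not parts:
        raise ValueError("logical key must have at least one property")
    partition_key = parts[0]             # assumption: first property scopes the partition
    row_key = separator.join(parts[1:])  # "" when there is only one property (option #2)
    return partition_key, row_key


print(to_table_keys(("cust42",)))                   # option #2: RowKey left empty
print(to_table_keys(("cust42", "2008-12-20", 7)))   # option #4: both keys set
```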
  • Saturday, December 20, 2008 11:28 PM, Mike Amundsen
     Proposed Answer
    Niranjan:

    thanks for the quick reply. your comments give a new perspective on the thinking behind designing Azure Tables.

    "We have two columns in the key to separate what defines uniqueness (PartitionKey and RowKey) from what defines scalability (just PartitionKey)."
    Not sure i get this. Partitions are for scaling, then?

    "In general, write and query times are less affected by how the table is partitioned. They are affected more by whether you specify the PartitionKey and/or RowKey in the query."
    So if you want to improve query times, use only one of the two keys, not both, right?

    again, thanks for the info.


  • Sunday, December 21, 2008 2:20 AM, Niranjan Nilakantan, MSFT
     Answer
    Yes, partitions are the units of scaling.

    For the best query times, use a filter on the PartitionKey first, and then a filter on the RowKey if possible.
    A query that filters on just the RowKey (but not on a partition key) will only be as efficient as a filter on any other property.

    Best: Filter on PartitionKey and RowKey
    Next: Filter on PartitionKey and some other property
    Last: Filter on just RowKey or any other property

    For efficiency, you should specify the filter on the PartitionKey even if it has the same value for all entities.
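
    The ranking above can be illustrated with a toy cost model. It assumes, purely for illustration, that entities are grouped by PartitionKey and indexed by RowKey within a partition; the thread does not describe the service's internals, and `entities_examined` is a hypothetical function.

```python
def entities_examined(table, partition_key=None, row_key=None):
    """Count candidate entities a query must look at in this toy model.

    `table` maps PartitionKey -> {RowKey -> entity}.
    """
    if partition_key is not None and row_key is not None:
        # Best: both keys given, so at most one entity is a candidate.
        return 1 if row_key in table.get(partition_key, {}) else 0
    if partition_key is not None:
        # Next: only the named partition has to be scanned.
        return len(table.get(partition_key, {}))
    # Last: without a PartitionKey, a RowKey filter is no better than a
    # filter on any other property, so every entity is a candidate.
    return sum(len(rows) for rows in table.values())


table = {"x": {"a": {}, "b": {}}, "y": {"c": {}}}
print(entities_examined(table, partition_key="x", row_key="a"))
print(entities_examined(table, partition_key="x"))
print(entities_examined(table, row_key="a"))
```

    In this model, supplying the PartitionKey narrows the work even when every entity shares the same value, because the query no longer has to consider that other partitions might exist.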
  • Sunday, December 21, 2008 3:03 AM, Mike Amundsen
     
    Niranjan:

    makes sense now that i see how things are designed.

    thank you for the added comments.


