POST to Azure tables w/o PartitionKey/RowKey: that's a bug, right?
- i noticed tonight that i can POST an entity body to a table w/o supplying a PartitionKey or RowKey. I can see the new Entity when I query against the table. I can even do a DELETE against the table by passing empty Partition/Row elements.
that's a bug, right?
my traces:
REQUEST **************************************
POST /moretables HTTP/1.1
User-Agent: amundsen/1.0
Content-Type: application/atom+xml
Accept: application/atom+xml
Content-MD5: p+JNsXK0mGi539eht4+d/g==
x-ms-date: Sat, 13 Dec 2008 06:11:11 GMT
authorization: SharedKey mamund:******************
Host: mamund.table.core.windows.net
Content-Length: 1392
Expect: 100-continue
Connection: Keep-Alive
<?xml version="1.0" encoding="utf-8" standalone="yes"?>
.. interesting stuff removed...
<!-- two required items - do not change -->
<d:PartitionKey></d:PartitionKey>
<d:RowKey></d:RowKey>
<!-- two required items - do not change -->
....
RESPONSE ************************************
HTTP/1.1 201 Created
Cache-Control: no-cache
Transfer-Encoding: chunked
Content-Type: application/atom+xml;charset=utf-8
ETag: W/"datetime'2008-12-13T06%3A10%3A54.60097Z'"
Location: http://mamund.table.core.windows.net/moretables(PartitionKey='',RowKey='')
Server: Table Service Version 1.0 Microsoft-HTTPAPI/2.0
x-ms-request-id: da949d7f-1536-41a0-9f42-abe860aad743
Date: Sat, 13 Dec 2008 06:10:53 GMT
5A9
<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<entry xml:base="http://mamund.table.core.windows.net/" xmlns:d="http://schemas.microsoft.com/ado/2007/08/dataservices" xmlns:m="http://schemas.microsoft.com/ado/2007/08/dataservices/metadata" m:etag="W/"datetime'2008-12-13T06%3A10%3A54.60097Z'"" xmlns="http://www.w3.org/2005/Atom">
<id>http://mamund.table.core.windows.net/moretables(PartitionKey='',RowKey='')</id>
<title type="text"></title>
<updated>2008-12-13T06:10:54Z</updated>
<author>
<name />
</author>
<link rel="edit" title="moretables" href="moretables(PartitionKey='',RowKey='')" />
...
REQUEST *******************************
GET /moretables(PartitionKey='',RowKey='') HTTP/1.1
User-Agent: amundsen/1.0
Content-Type: application/atom+xml
Accept: application/atom+xml
x-ms-date: Sat, 13 Dec 2008 06:16:30 GMT
authorization: SharedKey mamund:****************************
Host: mamund.table.core.windows.net
Connection: Keep-Alive
RESPONSE ******************************
HTTP/1.1 200 OK
Cache-Control: no-cache
Transfer-Encoding: chunked
Content-Type: application/atom+xml;charset=utf-8
ETag: W/"datetime'2008-12-13T06%3A10%3A54.60097Z'"
Server: Table Service Version 1.0 Microsoft-HTTPAPI/2.0
x-ms-request-id: 3d6e6997-3d2d-4f9c-a531-618e79bc36b1
Date: Sat, 13 Dec 2008 06:15:33 GMT
5AA
<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<entry ... m:etag="W/"datetime'2008-12-13T06%3A10%3A54.60097Z'"" xmlns="http://www.w3.org/2005/Atom">
<id>http://mamund.table.core.windows.net/moretables(PartitionKey='',RowKey='')</id>
...
REQUEST ****************************************************
DELETE /moretables(PartitionKey='',RowKey='') HTTP/1.1
User-Agent: amundsen/1.0
Content-Type: application/atom+xml
Accept: application/atom+xml
x-ms-date: Sat, 13 Dec 2008 06:16:30 GMT
authorization: SharedKey mamund:*********************************
if-match: W/"datetime'2008-12-13T06%3A10%3A54.60097Z'"
Host: mamund.table.core.windows.net
Content-Length: 0
RESPONSE ***************************************************
HTTP/1.1 204 No Content
Cache-Control: no-cache
Content-Length: 0
Server: Table Service Version 1.0 Microsoft-HTTPAPI/2.0
x-ms-request-id: ba6fb0c5-28f6-4d75-97f3-a456ba1def20
Date: Sat, 13 Dec 2008 06:15:33 GMT
Mike Amundsen [http://amundsen.com/blog/]
All replies
- Hi Mike,
We allow empty strings for partition and row keys and only prevent nulls, i.e. when the following is sent to the server:
<d:PartitionKey m:null="true"></d:PartitionKey>
<d:RowKey m:null="true"></d:RowKey>
So, empty string is by design.
Thanks,
Jai
- Marked as answer by Mike Amundsen, December 13, 2008, 9:00 PM
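To make the empty-versus-null distinction concrete, here is a minimal Python sketch that builds both forms of a key element. The two namespace URIs are taken from the trace in the question; the helper names `key_element` and `is_accepted_key` are hypothetical, and the check only mirrors the rule as described in the reply, not the service's actual validation code.

```python
import xml.etree.ElementTree as ET

# Namespaces as they appear in the <entry> element of the trace.
D_NS = "http://schemas.microsoft.com/ado/2007/08/dataservices"
M_NS = "http://schemas.microsoft.com/ado/2007/08/dataservices/metadata"


def key_element(name, value):
    """Build a <d:PartitionKey> or <d:RowKey> element (hypothetical helper).

    value == ""   -> empty element, i.e. an empty-string key
    value is None -> element carrying m:null="true", i.e. a null key
    """
    elem = ET.Element(f"{{{D_NS}}}{name}")
    if value is None:
        elem.set(f"{{{M_NS}}}null", "true")
    else:
        elem.text = value
    return elem


def is_accepted_key(elem):
    """Client-side mirror of the described rule: only nulls are rejected."""
    return elem.get(f"{{{M_NS}}}null") != "true"


empty_key = key_element("PartitionKey", "")
null_key = key_element("PartitionKey", None)
print(is_accepted_key(empty_key), is_accepted_key(null_key))
```

An empty string is still a value, so it can take part in a key; a null is the absence of a value, which is what gets refused.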
- "empty string is by design"
thanks for the reply.
Mike Amundsen [http://amundsen.com/blog/]
- How can String.Empty for both PartitionKey and RowKey ensure a unique primary key value?
Jim Nakashima says in step 7 of his "Windows Azure Walkthrough: Simple Table Storage": "This default of assigning the PartitionKey and setting the RowKey to a hard coded value (String.Empty) gives the storage system the freedom to distribute the data."
Even that seems strange to me. It appears to create a different entity group from each entity. Does that mean that using a common PartitionKey to define an entity group prevents distributing the data?
--rj
OakLeaf Blog
- Having String.Empty for both PartitionKey and RowKey will ensure one unique entity. This does imply that your table will have only one entity under this scheme.
If only the PartitionKey is set on each entity (but not the RowKey), then as you pointed out, each entity group has only one entity, and this allows maximum distribution.
If PartitionKey is empty (or constant) and only the RowKey is set for each entity, there is only one entity group for all the entities in the table, and this allows the minimum distribution, i.e. none.
A hybrid where both are set allows the application to choose the size of a partition/entity group.
- Proposed as answer by Aleks Gershaft [MSFT], Moderator, December 19, 2008, 7:54 PM
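The key schemes described above can be sketched in a few lines of Python. This is only an in-memory illustration under stated assumptions: `partition_sizes` is a hypothetical helper, entities are reduced to (PartitionKey, RowKey) pairs, and the duplicate check stands in for the uniqueness rule on the key pair, not for how the service stores anything.

```python
from collections import defaultdict


def partition_sizes(entities):
    """Count rows per PartitionKey for an iterable of (PartitionKey, RowKey).

    Raises ValueError when a (PartitionKey, RowKey) pair repeats, since the
    pair is what identifies a single entity.
    """
    seen = set()
    sizes = defaultdict(int)
    for pk, rk in entities:
        if (pk, rk) in seen:
            raise ValueError(f"duplicate key: ({pk!r}, {rk!r})")
        seen.add((pk, rk))
        sizes[pk] += 1
    return dict(sizes)


ids = ["a", "b", "c"]

# Only PartitionKey set: one entity per partition, maximum distribution.
only_partition_key = partition_sizes((i, "") for i in ids)

# Only RowKey set: a single partition holds every entity, no distribution.
only_row_key = partition_sizes(("", i) for i in ids)

# Both set: the application decides how large each partition is.
hybrid = partition_sizes([("x", "1"), ("x", "2"), ("y", "1")])

print(only_partition_key, only_row_key, hybrid)
```

With both keys left empty, the second insert would raise, which matches the point that such a table can hold only one entity.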
- Niranjan:
ok, my summary of your post follows (please check me if i have this incorrect):
#1 - P=empty, R=empty (one partition, one row)
#2 - P=data, R=empty (multiple partitions, one row per partition)
#3 - P=empty, R=data (one partition, multiple rows for the one partition)
#4 - P=data, R=data (multiple partitions, multiple rows for each partition)
i think i understand the logic of this arrangement, but have yet to come up w/ a compelling use case for it. Can you point me to some example scenarios where any one of the options offers a clear advantage/disadvantage (i.e. read speed, write speed, query speed) over the others?
for example:
- does #2 offer faster read times (for a large collection of rows) over #3 or #4?
- does #3 offer the fastest write time? #2?
- does #3 offer the fastest query time?
- other than specificity, does #4 have any advantage?
also, is option #1 just a (unintended) consequence of the desire to support option #2 & #3?
finally, once billing comes into play, will any of the above scenarios have an impact on account charges (i.e. more partitions = more cost, etc.)?
Thanks
Mike Amundsen [http://amundsen.com/blog/]
- Proposed as answer by Roger Jennings, April 2, 2009, 9:56 PM
- The matrix you have is correct.
#1 is a byproduct of not constraining the values in PartitionKey or RowKey, other than that they need to be strings.
If the key for your logical data model has more than 1 property, you should default to 4.
If the key for your logical data model has only one property, you would default to 2.
We have two columns in the key to separate what defines uniqueness(PartitionKey and RowKey) from what defines scalability(just PartitionKey).
In general, write and query times are less affected by how the table is partitioned. It is affected more by whether you specify the PartitionKey and/or RowKey in the query.
The overall recommendation is "Use option 4 (or 2 if you have only one key property), unless your application has a special need to use only 1 partition".
- Niranjan:
thanks for the quick reply. your comments give a new perspective on the thinking behind designing Azure Tables.
"We have two columns in the key to separate what defines uniqueness(PartitionKey and RowKey) from what defines scalability(just PartitionKey)."
Not sure i get this. Partitions are for scaling, then?
"In general, write and query times are less affected by how the table is partitioned. It is affected more by whether you specify the PartitionKey and/or RowKey in the query."
So if you want to improve query times, use only one of the two keys, not both, right?
again, thanks for the info.
Mike Amundsen [http://amundsen.com/blog/]
- Proposed as answer by Roger Jennings, April 2, 2009, 9:56 PM
- Yes, partitions are the units of scaling.
For the best query times, use a filter on the PartitionKey first, and then a filter on the RowKey if possible.
A query that filters on just the RowKey (but not on a partition key) will only be as efficient as a filter on any other property.
Best : Filter on PartitionKey and RowKey
Next : Filter on PartitionKey and some other property
Last : Filter on just RowKey or any other property.
For efficiency, you should specify the filter on the PartitionKey even if it has the same value for all entities.
- Marked as answer by Mike Amundsen, December 21, 2008, 3:03 AM
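The Best/Next/Last ranking above can be written down as a small Python sketch. Both function names are hypothetical. `build_filter` assumes simple string values with no embedded quotes and produces an equality-only `$filter` expression. Treating a filter on PartitionKey alone as "next" is an assumption, since the list does not name that case explicitly.

```python
def filter_rank(filtered_properties):
    """Classify a query by which properties its filter mentions."""
    props = set(filtered_properties)
    if "PartitionKey" in props and "RowKey" in props:
        return "best"
    if "PartitionKey" in props:
        # PartitionKey alone or with any non-key property (assumed "next").
        return "next"
    # RowKey alone is treated like a filter on any other property.
    return "last"


def build_filter(**equals):
    """Join equality comparisons into a $filter expression.

    Assumes plain string values; quotes inside values are not escaped.
    """
    return " and ".join(f"{name} eq '{value}'" for name, value in equals.items())


query = {"PartitionKey": "customers", "RowKey": "42"}
print(filter_rank(query), "|", build_filter(**query))
```

Even when every entity shares one PartitionKey value, passing it in the filter moves the query from "last" to "next" or "best" in this ranking, which is the point of the last sentence in the reply.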
- Niranjan:
makes sense now that i see how things are designed.
thank you for the added comments.
Mike Amundsen [http://amundsen.com/blog/]

