We are now considering moving to Azure Table storage. Our current data is stored in a relational database, but the data model is mostly non-relational and can easily be adapted to a NoSQL model.
The application is an online game with about 2 million registered users and about 10,000 users playing concurrently.
In almost all our SQL tables, the unique user name is the key, or part of the key.
What is the best strategy for choosing a partition key in our case? Is it the user name, a part of the user name, a hash of the user name, or some other method?
Since you have already identified the user name as your primary key, a simple split at the first 2 characters will give about 26 ^ 2 = 676 partitions, each holding roughly 2,000,000 / 676 =~ 3000 users, assuming user names start with case-insensitive letters spread evenly. Partitions will be smaller if user names allow digits and mixed case, since there are then more possible prefixes. This would be a decent strategy with enough room to grow, unless user names are heavily skewed, with a majority sharing the same 2 starting characters.
For example, a user name of Lucifure would have the partition key ‘Lu’ and the row key ‘cifure’.
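The split and the size estimate above can be sketched in a few lines of Python. This is only an illustration, not part of the original answer: the function names are made up, the handling of very short user names is an assumption (the thread does not cover it), and the prefix counter is just one simple way to check an existing user list for skew.

```python
from collections import Counter


def split_username(username, prefix_len=2):
    """Return (partition_key, row_key) by splitting at prefix_len characters."""
    if len(username) <= prefix_len:
        # Assumption: names this short need their own rule; not covered in the thread.
        raise ValueError("user name too short to split")
    return username[:prefix_len], username[prefix_len:]


def average_partition_size(total_users, alphabet_size, prefix_len=2):
    """Average users per partition if prefixes were uniformly distributed."""
    return total_users / alphabet_size ** prefix_len


def largest_partitions(usernames, prefix_len=2, top=5):
    """Count users per prefix and return the `top` most common prefixes."""
    counts = Counter(name[:prefix_len] for name in usernames)
    return counts.most_common(top)


print(split_username("Lucifure"))                     # ('Lu', 'cifure')
print(round(average_partition_size(2_000_000, 26)))   # 2959 (letters only)
print(round(average_partition_size(2_000_000, 62)))   # 520 (letters in both cases plus digits)
```

Running largest_partitions over the real user names before migrating would show whether a few prefixes dominate, which is the skew risk mentioned above.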
- Marked as answer by dkenan, June 21, 2012 16:37
Hi Dani - what Lucifure said is right, but I would make sure, as they noted, that you don't end up skewed dramatically toward one pair of characters. You can also simply use the user name as the partition key. Is there a reason not to do that? It allows very simple queries for things like "Get all of this user's data" (the filter becomes just "PartitionKey == UserName").
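As a rough sketch of the user-name-as-partition-key option, the Python below builds an entity and the matching filter string. The helper names and the idea of using an item identifier as the row key are assumptions made for illustration. The filter uses the OData form that the Table service expects ("PartitionKey eq '...'"), with single quotes in the value doubled.

```python
def user_entity(username, item_id, **properties):
    """Build an entity dict whose partition key is the full user name.

    item_id as the row key is a hypothetical choice, not from the thread.
    """
    entity = {"PartitionKey": username, "RowKey": item_id}
    entity.update(properties)
    return entity


def partition_filter(username):
    """OData filter selecting every entity in one user's partition."""
    escaped = username.replace("'", "''")  # double single quotes inside the literal
    return f"PartitionKey eq '{escaped}'"


print(user_entity("Lucifure", "profile", Level=12))
print(partition_filter("Lucifure"))  # PartitionKey eq 'Lucifure'
```

The resulting string would then be passed to whatever query call your client library exposes, for example the query_filter argument of TableClient.query_entities in the azure-data-tables package.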