sábado, 4 de agosto de 2012 11:35
I am completely inexperienced with database sharding and partitioning and it's ok for now. However, I try to pick up as many info as possible about these terms.
We are currently working on an application which uses SQL Server (not SQL Azure) as the database system and we consider moving to SQL Azure. We know that some parts are not the same as SQL Server. So, we design the database as it will run on SQL Azure.
Multiple servers is no concern for us ATM. However, it might be in the future so we would like to design the system in a such way that we can easily use SQL Azure Federations to partition the database in a later point. This brings up a few concerns and one of them is the Primary key fields of the tables.
Cihan Biyiklioglu has a nice blog post about this: ID Generation in Federations in Azure SQL Database: Identity, Sequences, Timestamp and GUIDs (Uniqueidentifier)
He recommends using uniqueidentifiers as primary keys. However, it seems to me that using integers are viewed as a better approach around the community. Most of the concerns are with indexes.
In this case, is there anything (restrictions, recommendations, etc) I need to be aware of before considering using SQL Azure federations?
- Editado Tugberk_Ugurlu_ sábado, 4 de agosto de 2012 11:45
Todas as Respostas
sábado, 4 de agosto de 2012 13:13
First, let's leave the SQL Azure part aside. The statement of "using int for PK is widely implemented" or "suggested by the community" is just a generalisation. If you are in a datawarehousing scenario or simply doing replication you will better use GUIDs instead to get globally unique ids.
That's maybe the most important part; "globally unique ID".
Back to our SQL Azure Federation discussion. Currently the only supported partitioning type is uniqueidentifiers. Obviously that does not mean your partitioning field needs to be your PK but still I would suggest to keep your PK as GUIDs if you are planning to go with a massivly federated sql db just to be on the safe side.
If you think you don't need globally unique values you can go back to your integer PKs. That would for sure give relatively better performance.
Hope this helps.
daron yöndem | Sr.Director MEA @ IdentityMine | Microsoft Regional Director, MEA | INETA MEA President | Windows Azure MVP | http://daron.yondem.com
sábado, 4 de agosto de 2012 13:20
No doubt about using integers instead of GUIDs in primary keys. When we talk about GUIDs, It is a whopping 4 times larger than the traditional 4-byte index value; this can have serious performance and storage implications if you're not careful. But, think about the real advantages of using GUIDs instead of integers... especially on SQL Federations scenarios, just like Cihan wrote on his blog post.
I would also encourage you to read this post.
- Unique across every table, every database, every server
- Allows easy merging of records from different databases
- Allows easy distribution of databases across multiple servers
- You can generate IDs anywhere, instead of having to roundtrip to the database
- Most replication scenarios require GUID columns anyway
GUID Cons- It is a whopping 4 times larger than the traditional 4-byte index value; this can have serious performance and storage implications if you're not careful
- The generated GUIDs should be partially sequential for best performance (eg, newsequentialid() on SQL 2005) and to enable use of clustered indexes
And also this one. And think about the trade-offs!
Hope this helps!
Cheers, Carlos Sardo
quarta-feira, 8 de agosto de 2012 11:39
"Currently the only supported partitioning type is uniqueidentifiers."
Actually this is untrue. The federation key type can be several possibilities including INT
- The type name for the federation key type. The type must be of type INT, BIGINT, UNIQUEIDENTIFIER or VARBINARY(n), where n can be a maximum of 900.
quarta-feira, 8 de agosto de 2012 11:51
Please note that newsequentialid() will not work currently on SQL Azure and would have various issues if it were and then used in an Azure build out because right now the newsequentialid is guanteed to be greater than any other previously generated GUID on the given machine. This presents a unique problem for a SQL Azure instance in that you are never gauarenteed to be a particular machine. Therefore, because of background failovers and movement what would be considered the highest generated GUID on one instance may not be the same on another.
Since the database would be designed with the cloud in mind. These are considerations that have to be considered before implementing, otherwise it will require some rewriting of the application at a later stage.
The federation key should be thought of not as the PK of a table but instead the PK of an Atomic Unit.
- Editado Jones0878 quarta-feira, 8 de agosto de 2012 11:58
quinta-feira, 9 de agosto de 2012 10:28I'm using int as Federation Key and uniqueidentifier as PK in the federated tables. Works well. However I've not seriously stress tested it yet.