Encryption
I have a user table with Fields:
UID, name, SSN, phone,address. I need to encrypt all the fields except UID. My company recently moved to sql server 2005 and I have to encrypt old data. We do perform query search based on username and SSN
I have very shallow knowledge of encryption and indexes. I did looked at one of the articles on encrypting data on existing application but could not understand the indexing part(:()
Do i have to define new fields with varbinary as datatype(for ssn, name,phone number) ? I am asking this because in the demo only ssn_index has varbinary and the ssn field is still nvarchar?
Any help will be appreciated.
Thanks,
K
Answers
I am not sure if you have read these articles, but I hope they will be helpful:
· Indexing encrypted data http://blogs.msdn.com/raulga/archive/2006/03/11/549754.aspx
· SQL Server 2005 –Encrypting data on existing applications http://blogs.msdn.com/raulga/pages/589691.aspx
If you have any additional questions or feedback, please let us know, we will be glad to help.
Thanks,
-Raul Garcia
SDE/T
SQL Server Engine
- Thanks for the response. Your demo application was very helpful. Good job!
Since my table already contains a unique identifier UserID, I am assuming I can go with the first option: Create a new identifier value.
http://blogs.msdn.com/raulga/archive/2006/03/11/549754.aspxQuestion: why is the datatype of ssn_cipher nvarchar and not varbinary, as we will be storing encrypted information in that column?
I have created a checklist based on your demo application to encrypt old data in sql server 2005. Please let me know if I need to make any changes.
[Old TABLE] t_user (UserID, name, ssn , address) is not encrypted and has 3000 records.
I need to encrypt name, ssn, and address.- Create master key
- Create certificate
- Create symmetric key
- No need to create MAC keys as I already have a unique identifier
- Create new table to store encrypted data : [new table]
t_userRaw : User Id identity
Name_cipher Varbinary
SSN_cipher Varbinary
Address_cipher Varbinary· Create insert/update triggers on t_userRaw table
· Open symmetric key decrypt by certificate
· Fetch data from t_user to t_userRaw
· Delete old table:t_user
· Create view on t_rawuser to decrypted view of data
· Set user privileges
Thanks a lot for your time.K
You are absolutely correct. The ciphertext column should be a varbinary and not a nvarchar. I think it is a bug on my sample code, but it works because it implicitly casts the varbinary data into nvarchar, but it is not the recommended as the ciphertext is not proper string value. Good catch!
The checklist seems correct, just make sure you have enough space in the new table to store the ciphertext, for example varbinary(100). To estimate how long should the ciphertext column be, my recommendation is to encrypt a string with the maximum number of bytes the original table allowed (i.e. if the original column was nvarchar(30), it would be 30 Unicode characters = 60 bytes) and encrypt it with the same key to calculate the ciphertext length (you can keep an extra 8/16 bytes of space as a precaution).
I also strongly suggest keeping the old table (just remove any permission grated to low privileged users from it) until you can verify that the new table works as expected and that you are not missing any entry.
Thanks a lot,
-Raul Garcia
SDE/T
SQL Server Engine
Thanks a lot. I am really glad that we are able to help the SQL Server community and we really appreciate your comments.
Unfortunately I am not aware of such kind of attacks, but I am not a cryptographer myself. If anyone has information regarding this subject I will greatly appreciate it. On the other hand, and attacker will always try the easiest route first. For example, if you have the same value hashed twice with 2 different algorithms, I would try to find a match using the weakest algorithm and verify the match using the stronger algorithm. In the case of 2 HMACs (assuming same algorithm and different, but same strength keys both unknown to the attacker), the second copy should only help to verify if a given match is indeed a match or a collision.
Being able to specify the IV may lead to using a fixed IV (or a small number of IVs) that may compromised some of the information directly (i.e. via leaking patterns on PT that are longer than 1 block) or via simple observation (i.e. If attacker sees known ciphertexts , she can determine that the messages are related to some particular event) and therefore lead to a false sense of security under many circumstances. I cannot promise anything, but we will discuss this possibility for future releases of the product.
Once more, thanks a lot for your feedback. We really appreciate it.
-Raul Garcia
SDE/T
SQL Server Engine
All Replies
I am not sure if you have read these articles, but I hope they will be helpful:
· Indexing encrypted data http://blogs.msdn.com/raulga/archive/2006/03/11/549754.aspx
· SQL Server 2005 –Encrypting data on existing applications http://blogs.msdn.com/raulga/pages/589691.aspx
If you have any additional questions or feedback, please let us know, we will be glad to help.
Thanks,
-Raul Garcia
SDE/T
SQL Server Engine
- Thanks for the response. Your demo application was very helpful. Good job!
Since my table already contains a unique identifier UserID, I am assuming I can go with the first option: Create a new identifier value.
http://blogs.msdn.com/raulga/archive/2006/03/11/549754.aspxQuestion: why is the datatype of ssn_cipher nvarchar and not varbinary, as we will be storing encrypted information in that column?
I have created a checklist based on your demo application to encrypt old data in sql server 2005. Please let me know if I need to make any changes.
[Old TABLE] t_user (UserID, name, ssn , address) is not encrypted and has 3000 records.
I need to encrypt name, ssn, and address.- Create master key
- Create certificate
- Create symmetric key
- No need to create MAC keys as I already have a unique identifier
- Create new table to store encrypted data : [new table]
t_userRaw : User Id identity
Name_cipher Varbinary
SSN_cipher Varbinary
Address_cipher Varbinary· Create insert/update triggers on t_userRaw table
· Open symmetric key decrypt by certificate
· Fetch data from t_user to t_userRaw
· Delete old table:t_user
· Create view on t_rawuser to decrypted view of data
· Set user privileges
Thanks a lot for your time.K
You are absolutely correct. The ciphertext column should be a varbinary and not a nvarchar. I think it is a bug on my sample code, but it works because it implicitly casts the varbinary data into nvarchar, but it is not the recommended as the ciphertext is not proper string value. Good catch!
The checklist seems correct, just make sure you have enough space in the new table to store the ciphertext, for example varbinary(100). To estimate how long should the ciphertext column be, my recommendation is to encrypt a string with the maximum number of bytes the original table allowed (i.e. if the original column was nvarchar(30), it would be 30 Unicode characters = 60 bytes) and encrypt it with the same key to calculate the ciphertext length (you can keep an extra 8/16 bytes of space as a precaution).
I also strongly suggest keeping the old table (just remove any permission grated to low privileged users from it) until you can verify that the new table works as expected and that you are not missing any entry.
Thanks a lot,
-Raul Garcia
SDE/T
SQL Server Engine
- Thanks for the quick response and thanks a lot for your time.
What do you think the best type of encryption is?
__________________________________________________________________
- Raul,
I can't tell you how helpful I found this post. However, I am curious... I seem to remember that there are some theoretical attacks that can benefit from having the same data hashed or encrypted in different systems. Also, for every column that I wish to encrypt that I still need to match values on (with a hash/MAC), I now have two columns.
Would it be better to allow the database developer to specify the IV during the encryption? That would seem like it would meet both goals (encryption and indexing) without really losing any security (frequency analysis of values) without adding additional overhead in columns, functions, or tables to store the values needed for your MAC solution.
Since that functionality isn't available today, it's likely that I will end up implementing something very similar to your solution (I have many columns per table). However, it would be nice if the Encrypt functions would just let me set the IV instead. I could store IVs in a separate table or something, if it was needed.
Nathan Thanks a lot. I am really glad that we are able to help the SQL Server community and we really appreciate your comments.
Unfortunately I am not aware of such kind of attacks, but I am not a cryptographer myself. If anyone has information regarding this subject I will greatly appreciate it. On the other hand, and attacker will always try the easiest route first. For example, if you have the same value hashed twice with 2 different algorithms, I would try to find a match using the weakest algorithm and verify the match using the stronger algorithm. In the case of 2 HMACs (assuming same algorithm and different, but same strength keys both unknown to the attacker), the second copy should only help to verify if a given match is indeed a match or a collision.
Being able to specify the IV may lead to using a fixed IV (or a small number of IVs) that may compromised some of the information directly (i.e. via leaking patterns on PT that are longer than 1 block) or via simple observation (i.e. If attacker sees known ciphertexts , she can determine that the messages are related to some particular event) and therefore lead to a false sense of security under many circumstances. I cannot promise anything, but we will discuss this possibility for future releases of the product.
Once more, thanks a lot for your feedback. We really appreciate it.
-Raul Garcia
SDE/T
SQL Server Engine


