Search for a word in encrypted text

  • Question

  • User1886661451 posted

    I use TripleDES and Cryptography in C# to encrypt my text and then save it in a database. Now I want to be able to search for a single word in that encrypted text in the database. I thought that if I encrypted the word I want to search for, I could use that encrypted word to search in my database (SQL Server with FREETEXT). But the encrypted string of the word doesn't appear anywhere in the encrypted text.

    How can I achieve what I want, namely that a user types in a word to look for and my database returns the matching record? Encryption must remain in .NET though. I don't know how to create the identical encryption on the database.

    Monday, October 24, 2016 11:45 AM

All replies

  • User-654786183 posted

    It's not going to be easy; this requirement is rarely implemented. When you encrypt a text that contains multiple words, the result is not the same as encrypting each word separately and concatenating the results. If you encrypt the words individually, concatenate them, and save that in the database, then a search is possible, but that defeats the purpose of encryption and security.

    I used an online tool to encrypt a sample text as a whole and then its words individually.

    The text I used is "Encrypted Text" 

    Encrypted - EnCt2b2f49be0d85dc75d8d889cfb47b9567926a21365b2f49be0d85dc75d8d889cfbTOxgXwK/DgP
    WQTAYDlgqVUz/66zwe38=

    Text - EnCt24219f3d5016e5f97a300a822ce7680f1cd46941a4219f3d5016e5f97a300a822VM103m/UfAP
    Nk2cYDlhRr6na

    Encrypted Text - EnCt26fd8d42c089621c91586a80479387d6ee56b40c96fd8d42c089621c91586a804c4UVM509fwK
    tpoQYDljU7m5y/U1eCbUIwWyD/w==

    Here are some interesting links that discuss similar requirements:

    http://stackoverflow.com/questions/1228924/encrypted-fields-full-text-search-best-approach

    http://outsourcedbits.org/2014/08/21/how-to-search-on-encrypted-data-searchable-symmetric-encryption-part-5/

    Monday, October 24, 2016 2:27 PM
  • User-629417786 posted

    Hi Ellen,

    Fortunately (but unfortunately for your specific use case), that's not the way cryptography is designed to work. To maximize privacy, a string of words is typically encrypted as a whole. In fact, special measures (e.g. cryptographically random initialization vectors) are in place to ensure that even if you encrypt the same string twice, you get very different results. Of course, decrypting either result still yields the original string.

    For example:

    input = "Ellen wants to search cipher text"
    // note encrypt results in a raw byte array, represented below as a base64 string
    encrypted the 1st time = "zbMAAAIEGMYcKd8iOW5YaMzMHQIgSHTCm5pNiqaHR6Ofg8Zc3QAAou5V6wVkahQ0h5sC3xXKVPqEyOqNm768WSD0vMaAjlvsN4y0+0JIP866mqOZMNGiXA=="
    encrypted the 2nd time = "zbMAAAIEGO7YPPXb36KwPKYZZgIghYVvvXo40JYWbk++ZLnd1QAARfQ/490WYgRaklpqyiACwswVPIzVY+FwgFhV27YDay1rqJ3XFosu4kfdGQ6n8/Avfw=="
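
    The effect shown above can be reproduced with a minimal sketch like the one below. It is an illustration under stated assumptions, not the code that produced the strings above: it assumes AES in its default CBC mode via `Aes.Create()`, a 32-byte key supplied by the caller, and a hypothetical class name `RandomIvDemo`. The random IV is prepended to the ciphertext so that decryption can recover it. The sketch omits authentication (e.g. an HMAC over the output), which real code would likely need.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper: illustrates why equal plaintexts give different ciphertexts.
public static class RandomIvDemo
{
    // Encrypts with AES-CBC under a fresh random IV; returns Base64(IV || ciphertext).
    public static string Encrypt(string plaintext, byte[] key)
    {
        using (Aes aes = Aes.Create())
        {
            aes.Key = key;
            aes.GenerateIV(); // new random IV on every call
            using (ICryptoTransform encryptor = aes.CreateEncryptor())
            {
                byte[] input = Encoding.UTF8.GetBytes(plaintext);
                byte[] cipher = encryptor.TransformFinalBlock(input, 0, input.Length);

                // Store the IV in front of the ciphertext; it is not secret.
                byte[] iv = aes.IV;
                byte[] output = new byte[iv.Length + cipher.Length];
                Buffer.BlockCopy(iv, 0, output, 0, iv.Length);
                Buffer.BlockCopy(cipher, 0, output, iv.Length, cipher.Length);
                return Convert.ToBase64String(output);
            }
        }
    }

    // Splits Base64(IV || ciphertext) back apart and decrypts it.
    public static string Decrypt(string encoded, byte[] key)
    {
        byte[] data = Convert.FromBase64String(encoded);
        using (Aes aes = Aes.Create())
        {
            aes.Key = key;
            byte[] iv = new byte[aes.BlockSize / 8]; // 16 bytes for AES
            Buffer.BlockCopy(data, 0, iv, 0, iv.Length);
            aes.IV = iv;
            using (ICryptoTransform decryptor = aes.CreateDecryptor())
            {
                byte[] plain = decryptor.TransformFinalBlock(
                    data, iv.Length, data.Length - iv.Length);
                return Encoding.UTF8.GetString(plain);
            }
        }
    }
}
```

    Calling `RandomIvDemo.Encrypt` twice with the same input and key should return two different Base64 strings, while `RandomIvDemo.Decrypt` returns the original text for both. That is also why searching the stored ciphertext for a separately encrypted word cannot match.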

    For what you want to do, you need to break the string into individual words and then encrypt each word in a deterministic manner. However, that opens your encrypted data to frequency analysis attacks, since there is now a 1:1 correlation between a plaintext word and its ciphertext counterpart. Also, it's not clear why you're using 3DES when AES exists. AES is a lot more efficient and more future-proof, but the caveats mentioned above still apply.
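
    One way to sketch the deterministic per-word idea is shown below. Note the substitution: instead of deterministically encrypting each word, this sketch derives a keyed hash (HMAC-SHA256) per word, which is a swapped-in technique and not something prescribed in this thread. The class name `WordTokenIndex`, the separate index key, and the lower-casing and separator rules are all assumptions made for illustration. The frequency analysis caveat above applies to these tokens just as it does to deterministic ciphertext.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

// Hypothetical helper: deterministic keyed tokens for exact-word lookup.
public static class WordTokenIndex
{
    private static readonly char[] Separators =
        { ' ', '\t', '\r', '\n', '.', ',', ';', ':', '!', '?' };

    // Same word + same key always yields the same token (this is what makes it searchable,
    // and also what exposes word frequencies).
    public static string Token(string word, byte[] indexKey)
    {
        using (var hmac = new HMACSHA256(indexKey))
        {
            byte[] normalized = Encoding.UTF8.GetBytes(word.Trim().ToLowerInvariant());
            return Convert.ToBase64String(hmac.ComputeHash(normalized));
        }
    }

    // Distinct tokens for all words of a text; these would be stored alongside
    // the (randomly encrypted) record, e.g. in a separate token table.
    public static HashSet<string> TokensFor(string text, byte[] indexKey)
    {
        return new HashSet<string>(
            text.Split(Separators, StringSplitOptions.RemoveEmptyEntries)
                .Select(word => Token(word, indexKey)));
    }
}
```

    At search time the application would compute `WordTokenIndex.Token(searchWord, indexKey)` in .NET and look for an exact match on the stored tokens with an ordinary equality query. This only supports whole-word matches; linguistic matching of the kind FREETEXT performs is not available on tokens.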

    Since you seem to be new to the cryptographic field, I would HIGHLY recommend reading the following two articles 

    1. The REAL problem with encryption: you're doing it wrong!
    2. Practical searchable encryption 

    In the interest of full disclosure, I do work there but those articles are educational and very relevant to your question. 

    Thanks
    Sid

    Thursday, November 3, 2016 3:36 AM
  • User-1320437544 posted

    That's not really the way to go. Once the entire text is encrypted with TripleDES, it is stored as one large string in your database field. The word (or substring) you are looking for is not a recognizable part of that encrypted string, so searching by word is not possible. Cryptography was not made to be used in this way.

    On the other hand, you could divide the string into substrings, encrypt each one separately, concatenate the results, and store the concatenated string in the database field. Then it would be possible to search by encrypted substring.

    Performance and complexity are further issues that you would have to deal with.

    Another solution could be to load all the information into a hash table (unencrypted) and run the search on the hash table rather than on the real database (an in-memory search). Or you could create a temporary in-memory SQL table during the search that holds an unencrypted copy of the text field.

    Whether those two options are practical depends entirely on how much information you plan to keep in the database.
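
    The in-memory option can be sketched roughly as follows. This is only an illustration under assumptions: the class name `InMemorySearch` is hypothetical, the encrypted rows are assumed to have been loaded into a dictionary keyed by record id, and the `decrypt` delegate stands in for whatever TripleDES (or AES) decryption routine the application already has.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: decrypts rows in memory and searches the plaintext there.
public static class InMemorySearch
{
    private static readonly char[] Separators =
        { ' ', '\t', '\r', '\n', '.', ',', ';', ':', '!', '?' };

    // Returns the ids of records whose decrypted text contains the word
    // as a whole word (case-insensitive).
    public static List<int> FindRecords(
        IDictionary<int, string> encryptedById,
        Func<string, string> decrypt,
        string word)
    {
        var hits = new List<int>();
        foreach (KeyValuePair<int, string> row in encryptedById)
        {
            // Plaintext exists only inside this loop iteration.
            string plain = decrypt(row.Value);
            bool found = plain
                .Split(Separators, StringSplitOptions.RemoveEmptyEntries)
                .Any(w => string.Equals(w, word, StringComparison.OrdinalIgnoreCase));
            if (found)
            {
                hits.Add(row.Key);
            }
        }
        return hits;
    }
}
```

    Every search decrypts every row, so the cost grows linearly with the amount of stored text, which is the size dependency mentioned above.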

    Let us know if we could help you.

    Cheers

    Wednesday, August 15, 2018 5:28 AM