Control DB encoding

  • Question

  • A Windows application works with several types of database. Its users come from different cultures and therefore use different encodings. The DBA sets the database character set so that every user's encoding is a subset of the database encoding.
    For example:
    1. User1 works with EncodingSubset1
    2. User2 works with EncodingSubset2
    where
    EncodingSubset1 & EncodingSubset2 < DatabaseEncoding

    The application handles encoding as follows:
    WriteToDB function:
    Client (EncodingSubset1) -> writes character <Е = 0xA9> -> DB stores character <® = 0xA9> using the DB encoding
    ReadFromDB function:
    DB sends <® = 0xA9> to the client -> client reads 0xA9 as Е.
    The previous version of the application (MFC) works as described above.
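
The write/read logic above can be sketched as follows. This is only an illustration in Python, not the MFC code (which the post does not show). The function names are hypothetical, and the code pages Windows-1251 and Windows-1252 are assumptions, since the post only says "Cyrillic" and "Western European". With these real code pages the letter Е is byte 0xC5 rather than the didactic 0xA9.

```python
# Assumed code pages; the post does not name the actual ones.
CLIENT_ENCODING = "cp1251"  # hypothetical client code page (Cyrillic)
DB_ENCODING = "cp1252"      # hypothetical database code page (Western European)


def write_to_db(text: str) -> bytes:
    """Legacy-style write: the client's bytes are stored unchanged."""
    return text.encode(CLIENT_ENCODING)


def read_from_db(stored: bytes) -> str:
    """Legacy-style read: the stored bytes are interpreted in the client's code page."""
    return stored.decode(CLIENT_ENCODING)


stored = write_to_db("\u0415")               # Cyrillic capital letter IE
as_seen_by_db = stored.decode(DB_ENCODING)   # what the database believes it holds
round_trip = read_from_db(stored)            # what the legacy client displays

print(stored.hex(), as_seen_by_db, round_trip)
```

The byte never changes on its way in or out, so the legacy client always gets its own character back even though the database regards the value as a different one.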
    -----------------------------------------------------------------------------------------------------------------------
    The new version of the application, written in C#, cannot use the same logic because:
    * strings are Unicode;
    * when reading from or writing to the DB, the .NET Framework automatically converts strings from the DB encoding to Unicode and vice versa.
    -----------------------------------------------------------------------------------------------------------------------
    This creates a compatibility issue with data written using the old approach. The following scenario illustrates it (please treat the example as didactic):
    A client used the MFC version of the application one year ago and wrote <Е = 0xA9> using Cyrillic. This character was saved into the DB, which has a Western European encoding. The Cyrillic encoding is a subset of the Latin encoding (every Cyrillic character maps to a Latin one), so the <Е = 0xA9> character maps to <® = 0xA9>.
    Today the client uses the .NET application. It reads <® = 0xA9> from the DB and automatically converts it to Unicode <® = 0xAE>.

    Thus, the client writes <Е> with the MFC application and reads <®> with the .NET one.
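
The mismatch in this scenario, and one possible way to undo it on the client side, can be sketched as follows. This is a hedged illustration in Python, not the application's code. The function names are hypothetical, and Windows-1251 and Windows-1252 are assumed stand-ins for the "Cyrillic" and "Western European" encodings, so the byte for Е is 0xC5 here rather than the didactic 0xA9.

```python
# Assumed code pages; the post does not name the actual ones.
CLIENT_ENCODING = "cp1251"  # hypothetical code page used by the legacy client
DB_ENCODING = "cp1252"      # hypothetical code page declared for the database


def driver_read(stored: bytes) -> str:
    """Mimics a data provider that converts DB-encoded bytes to Unicode text."""
    return stored.decode(DB_ENCODING)


def reinterpret(text: str) -> str:
    """Undo the provider's conversion, then decode with the client's code page."""
    return text.encode(DB_ENCODING).decode(CLIENT_ENCODING)


legacy_bytes = "\u0415".encode(CLIENT_ENCODING)  # byte written by the old application
seen = driver_read(legacy_bytes)                 # wrong character after auto-conversion
fixed = reinterpret(seen)                        # original character recovered

print(repr(seen), repr(fixed))
```

The same idea in .NET would be to call Encoding.GetEncoding(1252).GetBytes on the string returned by the provider and pass the result to Encoding.GetEncoding(1251).GetString. The repair is only reliable if the automatic conversion is lossless for every byte value. Bytes that the database code page leaves undefined would not survive the round trip.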

    Q:
    Is there any way to specify a custom encoding to be used when reading strings from the DB?


    Monday, February 25, 2008 1:37 PM

Answers

  • An RDBMS uses collations with associated code pages, while your application layer uses encodings, and no, you cannot create one custom encoding for all databases because collation implementations vary. However, if you are developing a web application, you could put a SQL Server Express user instance in the App_Data folder and save it as Cyrillic using the advanced save option in VS 2005/2008. You could also specify the collation in your SQL string in most recent RDBMS versions.

    Monday, February 25, 2008 2:13 PM