none
Character encoding RRS feed

  • Question

  • I have a spreadsheet that one column contains a string. I read it in using a managed provider then for debugging purposes I write the values out. Looking at the files it seems that there some character encoding that is messed up. There seems to be only one character that is out of wack. It is the apostrophe. So what looks fine in the spread sheet as "women's" appears as "women(some garbage characters)s" when I write it out. I have two concerns. 1) Since I am using the files as a sort of debugging tool I would like the string to appear right and 2) this string value will be used to compare to another string stored in another database so I would like to get a handle on what I need to do to make sure these strings compare OK (without any encoding issues).

    Any comments?

    Thank you.

    Kevin


    Kevin Burton
    Monday, November 29, 2010 11:51 PM

Answers

  • SQL statments supported with Excel have only basic syntax support, no COLLATE
    Val Mazur (MVP) http://www.xporttools.net
    • Marked as answer by KevinBurton Thursday, December 2, 2010 2:50 AM
    Wednesday, December 1, 2010 10:58 AM
    Moderator

All replies

  • Which managed provider are you using and what is underlying data provider?
    Val Mazur (MVP) http://www.xporttools.net
    Tuesday, November 30, 2010 11:43 AM
    Moderator
  • I am not sure if this answers your question but the connection string that I am using is as follows:

       string connectionString = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=" +
              pathToFile +
              ";";
       connectionString += "Extended Properties=\"Excel 12.0 Xml;HDR=YES;IMEX=1\"";
    
    

    Also does OleDB/Excel support COLLATE? When I add the COLLATE clause to my SQL statement I get an exception.

     


    Kevin Burton

    • Edited by KevinBurton Tuesday, November 30, 2010 5:14 PM Update
    Tuesday, November 30, 2010 2:58 PM
  • SQL statments supported with Excel have only basic syntax support, no COLLATE
    Val Mazur (MVP) http://www.xporttools.net
    • Marked as answer by KevinBurton Thursday, December 2, 2010 2:50 AM
    Wednesday, December 1, 2010 10:58 AM
    Moderator
  • So with the data (text) in Excel as Unicode and the data (also text) in the database as ASCII, how do I resolve this. I have tried reading the data into a temporary table but I cannot seem to force the Unicode characters to ASCII. Any suggestions or examples?

    Kevin


    Kevin Burton
    Wednesday, December 1, 2010 2:42 PM
  • If character has Unicode encoding code, but the code lies within ASCII range then it is converted into ASCI, but if character’s Unicode code is outside of ASCII range and database is not configured for Unicode, all you can do is to convert Unicode characters inside of the code using some sort of replacing function. There is no magic and characters are characters and there is no automatic translation between encoding tables.


    Val Mazur (MVP) http://www.xporttools.net
    Thursday, December 2, 2010 11:54 AM
    Moderator
  • Since I am reading the data from an Excel spreadsheet how do I configure the database for Unicode? In general how do I check and how do I set a database configured for Unicode? Isn't that what the type nvarchar and nchar is for?

    Thank you for the tip.

    Kevin


    Kevin Burton
    Thursday, December 2, 2010 1:29 PM
  • NVarchar is the proper datytype for Unicode. You would need to use N... datatypes to store the data, but first you need to create column(s) in your database table that are also of NChar or NVarChar datatatype
    Val Mazur (MVP) http://www.xporttools.net
    Friday, December 3, 2010 11:50 AM
    Moderator