Japanese (UTF-8) displaying as ?????

  • Question

  • User-1875557500 posted
    Hi All

    I have a MySQL database table that contains Japanese text. When I SELECT the contents of the table and display them on a page using VB.NET, the text shows as a series of question marks (??????).

    The db table is set to UTF-8, and the content comes from a .txt file that has been saved as UTF-8. The ResponseEncoding is set to UTF-8 on the page. There is no globalization set in the web.config.
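
    One way question marks can appear even when the table, the source file, and the page are all UTF-8 is a lossy conversion at some other hop, for example a database connection that uses a non-Unicode character set. Nothing in this thread confirms that this is the cause here. The Python sketch below only illustrates the general effect, with Latin-1 as an assumed stand-in for the intermediate character set.

```python
# Illustration only: Latin-1 is an assumed stand-in for whatever
# non-Unicode character set one hop in the chain might be using.
text = "日本語"

# UTF-8 end to end: the bytes round-trip without loss.
utf8_bytes = text.encode("utf-8")
round_tripped = utf8_bytes.decode("utf-8")

# One hop that cannot represent Japanese: each character is replaced
# by "?" and the original text can no longer be recovered downstream.
lossy_bytes = text.encode("latin-1", errors="replace")
lossy_text = lossy_bytes.decode("latin-1")

print(round_tripped)
print(lossy_text)
```

    If something like this is happening, the place to look would be the client or connection character set rather than the table definition. MySQL 4.1 has a `SET NAMES utf8` statement for the connection character set, though whether it applies to this setup is an assumption.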

    I have successfully worked through the tutorial http://www.asp.net/Tutorials/quickstart.aspx 'setting culture and encoding', without any problems.

    Can anyone help??

    Many thanks.

    For reference the db table source code is:

    SQL STARTS

    CREATE TABLE `newjpcontent` (

    `Id` int(11) NOT NULL auto_increment,

    `content` varchar(17) default NULL,

    PRIMARY KEY (`Id`)

    ) ENGINE=InnoDB DEFAULT CHARSET=utf8;


    SQL ENDS

    The asp.net source code is:

    ASP.NET STARTS

    <%@Page Language="VB" ResponseEncoding="UTF-8"%>

    <%@ Import Namespace="System.Data"%>

    <%@ Import Namespace="System.Data.Odbc"%>

    <script runat=server>

    Protected myConnection As New OdbcConnection(ConfigurationSettings.AppSettings("ConnectionString"))

    Sub page_load(ByVal sender As Object, ByVal e As EventArgs)

    myConnection.Open()

    Dim da As New OdbcDataAdapter

    Dim dt As New DataTable

    Dim dr As DataRow

    da.SelectCommand = myConnection.CreateCommand()

    da.SelectCommand.CommandText = "SELECT * FROM newjpcontent;"

    da.Fill(dt)

    da.Dispose()

    myConnection.Close()

    For Each dr In dt.Rows

    litContent.Text &= "<p>Varchar content goes here</p><p>" & dr("content") & "</p>"

    Next

    End Sub

    </script>

    <html>

    <head>

    </head>

    <body>

    <asp:Literal ID="litContent" runat="server"></asp:Literal>

    </body>

    </html>



    ASP.NET ENDS


    Tuesday, June 28, 2005 6:48 AM

All replies

  • User1416329745 posted

    What you are getting is called character conversion. Try the links below to get started.  Hope this helps.

    http://www.microsoft.com/downloads/details.aspx?FamilyID=6b6fb09f-f25c-48e9-9e26-b55144600da1&DisplayLang=en

    http://www.aspnetresources.com/blog/unicode_in_vsnet.aspx

    Tuesday, June 28, 2005 10:16 AM
  • User-1875557500 posted

    Hi Caddre

    Thanks for your post. I have worked through the MSDN resources tutorial and have no problems; I can get button controls to display Japanese or English.

    As for the link to aspnetresources.com, I am not using VS (I am using Visual Web Developer), but I have double checked as far as possible that all source files are being saved as UTF8.

    Do you have any more hints? I found the links you posted very interesting.

    If it helps, I have posted the page to a live server (http://domain835520.sites.fasthosts.com/default2.aspx). You will need to have the Japanese character pack installed to see the Japanese text in the page. The content that comes from the MySQL db is under the paragraph titled 'Varchar content goes here'. All I see is ????????.

    Thanks

    Thursday, June 30, 2005 4:33 AM
  • User1416329745 posted

    Try these links; one has a CREATE TABLE statement that may make a difference.  Hope this helps.

    http://dev.mysql.com/doc/mysql/en/charset-charsets.html

    http://forum.armkb.com/showthread.php?t=17146

    Thursday, June 30, 2005 7:13 AM
  • User-1875557500 posted
    Hi

    Thanks for the response.

    I have worked through the table creation using 'collations'. This has made no difference. The version of MySQL I am using is 4.1.12a.

    Just to be sure, can you tell me which collation I should be using for UTF8? I have tried 'general' and 'unicode'.

    Does the database engine make any difference with encoding? (I have used both InnoDB and MyISAM.)

    Thanks!

    Friday, July 1, 2005 7:15 AM
  • User1416329745 posted
    For Japanese you can choose a language-specific collation instead of UTF8, which is why I gave you the first link. I know that in SQL Server 7.0 Unicode requires a reinstall, so I would check the MySQL documentation.  If it is text or XML you can save the file as Unicode in Notepad before loading it into the database.  The example below is from the MySQL character set chart.  Hope this helps.
    | ujis     | EUC-JP Japanese             | ujis_japanese_ci    |
    | sjis     | Shift-JIS Japanese          | sjis_japanese_ci    |
    Friday, July 1, 2005 8:24 AM
  • User-1875557500 posted

    Hi

    Thanks for trying Caddre!

    I followed the MySQL documentation in creating a new database as Shift-JIS, and then converted the source files to Unicode. When I entered the text into the table column it immediately converted to ?????.

    Using UTF8 db tables, and source files saved as UTF8 I can populate the DB correctly, however the problem is then displaying the text in the webpage. I have set globalization to utf8, and can display the same text using static code (stored directly in the webpage).

    I converted some characters to numeric character references (e.g. &#39080;), and these displayed without a problem. Is there a method to convert a UTF-8 string to that form?

    I feel like I have come to the end of the error-solving process, and I guess there is something incorrect with my setup (though I have used both a server hosted by an external hosting company and a local development server).
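
    The `&#39080;` form is a decimal numeric character reference: the number is the character's Unicode code point (39080 is U+98A8, 風). Below is a minimal sketch of producing such references from a string, written in Python rather than VB.NET purely for illustration. The function name `to_numeric_references` is hypothetical.

```python
def to_numeric_references(text: str) -> str:
    """Replace every non-ASCII character with a decimal &#N; reference."""
    # "xmlcharrefreplace" is a standard Python codec error handler that
    # substitutes &#N; (N = decimal code point) for each character the
    # target encoding cannot represent; ASCII characters pass through.
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


print(to_numeric_references("風"))
print(to_numeric_references("abc 日本語"))
```

    This only sidesteps the output encoding of the page. It does not help if the text has already been turned into literal question marks before it reaches the page code.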

    I'd appreciate any further assistance.

    Thanks

    Monday, July 4, 2005 7:13 AM
  • User-1565848253 posted
    I am having the exact same problem.  I've tried creating the database in UTF8, SJIS, CP932, and UJIS character sets.  It just doesn't work.

    I've tried pulling the same content out of an Access database through the same .NET page, and it works just fine.  This leads me to believe that it's a MySQL issue, and not a .NET issue.

    Any thoughts?

    Tuesday, July 26, 2005 5:14 PM
  • User1416329745 posted
    I agree with you: MySQL's Unicode definition is not clear and may not work.  Hope this helps.
    Wednesday, August 3, 2005 7:22 PM
  • User-1794599080 posted

    Try changing the datatype of the content column in your table to NVarchar instead of varchar. NVarchar is the Unicode type.

    This should work.


    Regards,
    Shilpa

    Thursday, August 11, 2005 11:50 PM
  • User-1875557500 posted
    Thanks for your post Shilpa

    Unfortunately nVarChar is not a valid Data Type for MySQL.

    Regards
    Friday, August 12, 2005 4:24 AM