Character coding in http post operation for swedish characters

  • Question

  • Hi,

    I'm building a client application that reads a .txt file on the local computer with a StreamReader (encoding set to UTF-8).

    After this, I send the text with a POST operation (ContentType = application/x-www-form-urlencoded; this is the only content type that works at all).

    When I get the response back, the Swedish characters have been replaced.

    These are the code lines:

    Dim strR As New StreamReader("c:\test\myfile.txt", Encoding.UTF8)

    --

    Dim objRequest As HttpWebRequest = WebRequest.Create(url)

    objRequest.ContentType = "application/x-www-form-urlencoded"

    --

    Dim objResponse As HttpWebResponse = objRequest.GetResponse()

    Please tell me how to fix this.

    Thank you

    Sunday, March 24, 2013 9:16 PM
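
The question does not show how the request body is written, so the following is only a minimal sketch of one way to build an application/x-www-form-urlencoded body from text containing Swedish characters. The field name "text" and the helper BuildFormBody are hypothetical. Uri.EscapeDataString percent-encodes the UTF-8 bytes of each non-ASCII character, so the resulting body is plain ASCII.

```vb
Imports System
Imports System.Text

Module FormBodySketch
    ' Builds an application/x-www-form-urlencoded body for a single field.
    ' The field name and this helper are illustrative, not from the original post.
    Function BuildFormBody(fieldName As String, fieldValue As String) As Byte()
        Dim body As String = Uri.EscapeDataString(fieldName) & "=" & Uri.EscapeDataString(fieldValue)
        ' After escaping, the body contains only ASCII characters.
        Return Encoding.ASCII.GetBytes(body)
    End Function

    Sub Main()
        Dim bytes As Byte() = BuildFormBody("text", "åäö")
        Console.WriteLine(Encoding.ASCII.GetString(bytes))   ' text=%C3%A5%C3%A4%C3%B6
        ' To send: set objRequest.Method = "POST" and objRequest.ContentLength = bytes.Length,
        ' then write the bytes to the stream returned by objRequest.GetRequestStream().
    End Sub
End Module
```

Because the body is escaped before it is turned into bytes, no byte order mark or code page choice can affect what the server receives.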

Answers

All replies

  • You need to include the language in the header of the WebRequest. See the webpage below.

    http://stackoverflow.com/questions/10883062/how-to-know-users-language

    Dim objRequest As HttpWebRequest = WebRequest.Create(url)
    objRequest.Headers.Add(HttpRequestHeader.AcceptLanguage, "sv-SE")
    Dim objResponse As HttpWebResponse = objRequest.GetResponse()


    jdweng

    Monday, March 25, 2013 1:00 AM
  • Thanks for your help.

    I can't get it to work, though. It doesn't seem to change anything.

    Is there something else I should be doing? This is the first time I'm using POST.

    Thanks again.

    Monday, March 25, 2013 6:57 AM
  • I would start by checking the headers of the webpage manually in IE. Go to the webpage and use the menu View - Source. In the head of the page there should be a line that contains the language encoding and the content type. Set the properties in your code to match what IE is using.

    Normally I use Wireshark to trace the network traffic and compare the results I get using IE with the results I get from my code, then change my code to match the IE results. Make sure you delete all the cookies from your web browser between changes. The VS code uses the same cookies as your web browser, so sometimes your code starts working only after you have run the browser, and then stops working the next time you delete the cookies. Deleting the cookies before each test makes sure that, once your code works, it will always work.


    jdweng

    Monday, March 25, 2013 9:44 AM
  • I'm building a client application that reads a .txt file on the local computer with a StreamReader (encoding set to UTF-8)

    Is the original .txt file encoded with UTF-8? If you read it in with your StreamReader and then write it back out to another file using UTF-8 encoding, are all the characters preserved?

    --
    Andrew

    Monday, March 25, 2013 9:53 AM
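
The round-trip check suggested above could be sketched roughly as follows. The copy path and the helper name SurvivesRoundTrip are assumptions made for illustration. A file that is really in another code page (for example Windows-1252) usually decodes to U+FFFD replacement characters when read as UTF-8, so the sketch checks for those as well as comparing the text.

```vb
Imports System
Imports System.IO
Imports System.Text

Module RoundTripCheck
    ' Reads sourcePath as UTF-8, writes the text to copyPath as UTF-8,
    ' reads the copy back, and reports whether the text survived unchanged.
    Function SurvivesRoundTrip(sourcePath As String, copyPath As String) As Boolean
        Dim original As String = File.ReadAllText(sourcePath, Encoding.UTF8)
        File.WriteAllText(copyPath, original, New UTF8Encoding(False))
        Dim copy As String = File.ReadAllText(copyPath, Encoding.UTF8)
        ' U+FFFD is what the decoder substitutes for bytes that are not valid UTF-8.
        Dim hasReplacementChar As Boolean = original.IndexOf(Convert.ToChar(&HFFFD)) >= 0
        Return String.Equals(original, copy, StringComparison.Ordinal) AndAlso Not hasReplacementChar
    End Function

    Sub Main()
        ' The second path is a hypothetical location for the copy.
        Console.WriteLine(SurvivesRoundTrip("c:\test\myfile.txt", "c:\test\myfile_copy.txt"))
    End Sub
End Module
```
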
  • Andrew: The problem is the response of the webpage. A multi-language webpage looks at the header in the POST message to determine what language to use for the response. I don't think the header in the POST message is correct, which is causing the wrong language to be returned.

    jdweng

    Monday, March 25, 2013 9:56 AM
  • Check step by step where the encoding goes wrong.

    Currently you are trying to find a needle in a haystack.

    Once you know in which statement the error occurs, it is easier to find the solution.


    Success
    Cor

    Monday, March 25, 2013 9:58 AM
  • Thank you for your replies.

    I don't think I have an encoding set.

    This is how the page is rendered:

    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml">
    <head><title> </title></head>
    <body>
    <form method="post" action="Reciever.aspx" id="form1">
    <div class="aspNetHidden">
    <input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="/wEPDwUJOTc4NDg4ODk5ZGS3QpcWpqtjpswTBa0POHTJnZhZh4k7KhZof+leNd4THQ==" />
    </div>
    <div> </div>
    </form>
    </body>
    </html>

    Andrew Morton: Yes, I've checked that. Thanks.


    Monday, March 25, 2013 7:06 PM
  • I don't think anybody can do anything with that part. The view state is a Base64-encoded block of page state, and that is all there is on that page.

    Be aware that to use Swedish characters, both computers have to be set up for them.

    On the client, though, Swedish simply uses the Western European code page, the same one that, for instance, the USA uses.

    http://en.wikipedia.org/wiki/Windows-1252

    Be aware that the page has a small error: 1252 is used for all Western European languages, including the most widely used ones in the world (Spanish, French, Portuguese, English and German), and not just for English and some other languages.

    Other language pages of Wikipedia show the correct text about that.


    Success
    Cor

    Monday, March 25, 2013 7:27 PM
  • I don't think I have an encoding set

    There are three ways you could do it listed at en.wikipedia.org/wiki/Character_encodings_in_HTML.

    HTH,

    Andrew

    Monday, March 25, 2013 7:32 PM
  • Thank you for your replies.

    The page now renders like this:

    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
     
    <html xmlns="http://www.w3.org/1999/xhtml">
    <head><meta http-equiv="content-type" content="text/html; charset=utf-8" /><title>
     
    </title></head>
     
    <body>
     
        <form method="post" action="Reciever.aspx" id="form1">
    <div class="aspNetHidden">
    <input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="/wEPDwUKMTUxOTgwMTYyM2Rk1hLH52UR9BSJnR95pcAPxegwv94C9FLEMZEZh+B3+3s=" />
    </div>
     
     
        <div>
        
        </div>
        </form>
    </body>
    </html>

    Unfortunately, it makes no difference to the characters.

    I'm developing this with Visual Basic .NET 2010 and testing it locally with Visual Web Developer 2010.

    Thanks again.

    Tuesday, March 26, 2013 7:14 AM
  • I forgot to mention:

    I removed the view state from the form, and it didn't help.

    Tuesday, March 26, 2013 7:15 AM
  • Yes, but on what server is that page running? Your own desktop?


    Success
    Cor

    Tuesday, March 26, 2013 8:18 AM
  • Yes, localhost so far. I haven't tried it on a real server yet.

    Wednesday, March 27, 2013 6:45 AM
  • Duh,

    I thought somebody had already given you the link to this class.

    http://msdn.microsoft.com/en-us/library/system.web.util.httpencoder.aspx

    What it does is encode and decode HTTP strings that contain characters used in Swedish, French, etc.


    Success
    Cor


    • Edited by Cor Ligthert Wednesday, March 27, 2013 11:17 AM
    • Marked as answer by Black Santa Tuesday, April 2, 2013 5:30 AM
    Wednesday, March 27, 2013 11:17 AM
  • Thanks Cor Ligthert,

    I checked it out and installed it, but I'm not sure how to use it.

    I couldn't figure it out from the documentation, I'm afraid.

    Do you have an example? Should I use it in the local program as well?

    Thanks

    Thursday, March 28, 2013 7:26 AM
  • Did you look at the methods?

    http://msdn.microsoft.com/en-us/library/system.web.util.httpencoder.htmldecode.aspx

    Normally it is just

    HtmlDecode(Text,OutputText)

    There is not much of a sample to make, in my opinion.


    Success
    Cor

    Thursday, March 28, 2013 8:02 AM
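
For readers unsure how such a call looks in practice, here is a small sketch. It deliberately swaps in System.Net.WebUtility.HtmlDecode, which returns the decoded string directly and needs no reference to System.Web; System.Web.HttpUtility.HtmlDecode offers a similar one-argument overload as well as the two-argument form that writes to a TextWriter. Note that HtmlDecode deals with HTML character references such as &#229; and not with URL-style escapes such as %C3%A5 or %u00e5.

```vb
Imports System
Imports System.Net

Module HtmlDecodeSketch
    Sub Main()
        ' Numeric character references for å, ä and ö, as they might appear in HTML.
        Dim encoded As String = "&#229;&#228;&#246;"
        Dim decoded As String = WebUtility.HtmlDecode(encoded)
        Console.WriteLine(decoded = "åäö")   ' True
    End Sub
End Module
```
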
  • Thanks again, Cor.

    I tried that and I couldn't get it to work.

    When I hard-code, for instance, "ö" in the response, it works, i.e.:

    Response.Write("ö") sends "ö" to the local app.

    So I figure something goes wrong either in the Page.Request.Form.ToString operation or in the instantiation of the StreamWriter.

    But then I realized something: the letters I send are returned before the HTML in the response stream, i.e. at the top, not inside the head or form tags. Is it supposed to do that?

    server code (the request is "åäö" as well):

    Response.Write("åäö" & Page.Request.Form.ToString)
    returns in local app:
    åäö%ufeff%u00e5%u00e4%u00f6
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml">
    <head><meta http-equiv="content-type" content="text/html;&#32;charset=utf-8" />
    <title>
    </title>
    </head>
    <body>
    <form method="post" action="Reciever.aspx" id="form1"><input type="hidden" name="__VIEWSTATE" id="__VIEWSTATE" value="/wEPDwUKMTUxOTgwMTYyM2Rk1hLH52UR9BSJnR95pcAPxegwv94C9FLEMZEZh+B3+3s=" />
    <div>
    </div>
    </form>
    </body>
    </html>



    • Edited by Black Santa Friday, March 29, 2013 10:08 AM
    Friday, March 29, 2013 10:02 AM
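
The %ufeff at the start of the echoed form data is U+FEFF, the byte order mark, and the %uXXXX sequences after it look like URL-style escapes of å, ä and ö. The client code that writes the request body is not shown in the thread, so the following is only an assumption about one common cause: a StreamWriter created with Encoding.UTF8 emits the three-byte UTF-8 preamble before the text, whereas New UTF8Encoding(False) does not. The sketch uses a MemoryStream in place of the request stream to make the difference visible.

```vb
Imports System
Imports System.IO
Imports System.Text

Module BomDemo
    ' Returns the bytes a StreamWriter produces for the given text and encoding.
    Function WriteBytes(content As String, enc As Encoding) As Byte()
        Using ms As New MemoryStream()
            Using writer As New StreamWriter(ms, enc)
                writer.Write(content)
            End Using
            ' ToArray still works after the stream has been closed.
            Return ms.ToArray()
        End Using
    End Function

    Sub Main()
        Dim withBom As Byte() = WriteBytes("åäö", Encoding.UTF8)
        Dim withoutBom As Byte() = WriteBytes("åäö", New UTF8Encoding(False))
        Console.WriteLine(BitConverter.ToString(withBom))      ' EF-BB-BF-C3-A5-C3-A4-C3-B6
        Console.WriteLine(BitConverter.ToString(withoutBom))   ' C3-A5-C3-A4-C3-B6
    End Sub
End Module
```
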
  • server code (the request is "åäö" as well):
    Response.Write("åäö" & Page.Request.Form.ToString)

    As you've discovered, Response.Write is not what you need. Instead, put an <asp:Literal> control on the page, e.g. <asp:Literal Id="msg" RunAt="Server" /> and set its text in the code-behind by using msg.Text = "åäö"

    HTH,

    Andrew

    Friday, March 29, 2013 8:19 PM
  • Thanks everyone for your answers.

    It turns out the HtmlDecode() method works. For some reason this works:

    New StreamReader(Request.InputStream)

    and this didn't:

    New StreamReader(page.Request.InputStream)

    It's now working like it should, thanks again!

    Tuesday, April 2, 2013 5:30 AM
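
The final server code is not shown, so the following is only a sketch of the general idea of reading the raw request body with a StreamReader and decoding it. A MemoryStream stands in for Request.InputStream so the example runs on its own. The helper ReadFormValue, the single-field body and the use of Uri.UnescapeDataString are assumptions for illustration, not the poster's actual solution.

```vb
Imports System
Imports System.IO
Imports System.Text

Module ReadBodySketch
    ' Reads a request body stream as UTF-8 text and undoes the percent-encoding.
    ' Assumes the body holds exactly one name=value pair.
    Function ReadFormValue(body As Stream) As String
        Using reader As New StreamReader(body, Encoding.UTF8)
            Dim raw As String = reader.ReadToEnd()
            Dim encodedValue As String = raw.Substring(raw.IndexOf("="c) + 1)
            ' Form encoding uses "+" for a space; Uri.UnescapeDataString does not handle that.
            Return Uri.UnescapeDataString(encodedValue.Replace("+", " "))
        End Using
    End Function

    Sub Main()
        ' Stand-in for Request.InputStream, holding "åäö" percent-encoded as UTF-8.
        Dim sample As Byte() = Encoding.ASCII.GetBytes("text=%C3%A5%C3%A4%C3%B6")
        Using fakeBody As New MemoryStream(sample)
            Console.WriteLine(ReadFormValue(fakeBody) = "åäö")   ' True
        End Using
    End Sub
End Module
```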