How To Make Unicode Readable Of Different Languages?


  • Wednesday, September 15, 2010 12:17 AM
     
    string prvTranslateText2(string inputStr, string fromLanguage, string toLanguage)
    {
      UnicodeEncoding tmpEncoding = new UnicodeEncoding();

      string url = string.Format("http://translate.google.com/translate_a/t?client=t&text={0}&hl=en&sl={1}&tl={2}&multires=1&pc=0&sc=1", inputStr, fromLanguage, toLanguage);
      WebClient tmpClient = new WebClient();
      tmpClient.Encoding = System.Text.Encoding.ASCII;

      string result = tmpEncoding.GetString(tmpClient.DownloadData(url));
      return result;
    }
    

    Method Call :
    prvTranslateText2("Hello World","en","ur");
    

    Return Value (string result):
    { ur: ?????????????=?????? }

    Browser Return Value Of URL :
    [[["ہیلو دنیا","Hello World",""]],,"en"]

    Direct URL:
    http://translate.google.com/translate_a/t?client=t&text=Hello%20World&hl=en&sl=en&tl=ur&multires=1&pc=0&sc=1

    Required:
    How can I get the response string the way the browser shows it (I mean readable)?

    Am working On It :)

All Replies

  • Wednesday, September 15, 2010 12:34 AM
     
     
    Ahh, this is eating me up. I didn't want to create a web application just for this job, even though that would automatically solve the problem for me. I thought it would be easy to do. But no replies yet, even though I created multiple threads on it :(
  • Wednesday, September 15, 2010 9:27 AM
     
     

    Did you try another encoding? ASCII cannot handle any non-Latin script.
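To illustrate the point about ASCII, here is a minimal sketch that needs no network call. It encodes the Urdu word from the question as UTF-8 bytes and decodes the same bytes with two different encodings. The class and method names (`EncodingDemo`, `Decode`) are made up for the example.

```csharp
using System;
using System.Text;

static class EncodingDemo
{
    // Decode the given bytes with whatever encoding the caller picks.
    public static string Decode(byte[] bytes, Encoding encoding)
    {
        return encoding.GetString(bytes);
    }

    static void Main()
    {
        // "ہیلو" written with escapes so the source file encoding does not matter.
        string urdu = "\u06C1\u06CC\u0644\u0648";
        byte[] bytes = Encoding.UTF8.GetBytes(urdu);   // 4 letters, 2 bytes each in UTF-8

        Console.WriteLine(bytes.Length);
        // ASCII cannot represent bytes above 0x7F, so each one becomes '?'.
        Console.WriteLine(Decode(bytes, Encoding.ASCII));
        // Decoding with the encoding that produced the bytes round-trips cleanly.
        Console.WriteLine(Decode(bytes, Encoding.UTF8) == urdu);
    }
}
```

The takeaway is that the bytes have to be decoded with the same encoding the server used to produce them; picking ASCII (or UTF-16, as `UnicodeEncoding` does) for UTF-8 data loses the text.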

  • Wednesday, September 15, 2010 1:46 PM
     
     

    Yes, I tried UTF8 encoding in webClient.Encoding.

    I also used Encoding.UTF8.GetString(DownloadData(url)); but that doesn't seem to correct the problem either.


    Mark Post As Answer If It Helped You and Also Take Some Time Out To Mark The Thread Resolved.
  • Wednesday, September 15, 2010 3:22 PM
     
     

    I'm not sure what you're looking for, but I have the idea that HTML encoding can probably help you. If not, no matter.

    http://msdn.microsoft.com/en-us/library/73z22y6h.aspx


    Success
    Cor
  • Wednesday, September 15, 2010 4:12 PM
     
     
    Thanks for replying, Cor Ligthert.

    Some languages, like Arabic, Chinese, Urdu, etc., are written as Unicode text.



    In English "Hello" String: "Hello"
    In English "Hello" Bytes (BitConverter.ToString) : 
    Length : 30
    Output : 5B-5B-5B-22-48-65-6C-6C-6F-22-2C-22-48-65-6C-6C-6F-22-2C-22-22-5D-5D-2C-2C-22-65-6E-22-5D


    In Urdu "Hello" Translation String: "ہیلو"
    In Urdu "Hello" Bytes (BitConverter.ToString) : 
    Length : 34
    Output : 5B-5B-5B-22-C0-5C-75-30-36-43-43-E1-E6-22-2C-22-48-65-6C-6C-6F-22-2C-22-22-5D-5D-2C-2C-22-65-6E-22-5D

    I seem to be able to use all these languages in any text editor on my computer, just not in the string returned from WebClient.
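One possible reading of the Urdu dump above, offered as an editorial guess rather than something the thread confirms: the bytes C0, E1 and E6 are the Windows-1256 codes for ہ, ل and و, and the run 5C-75-30-36-43-43 is the literal text `\u06CC`, a JSON-style escape for ی. The sketch below decodes the dump under that assumption. `DumpDemo`, `ParseHex` and `Decode` are hypothetical names, and the `RegisterProvider` call assumes .NET Core 3.0 or later (on .NET Framework, where code page 1256 is built in, that line can be dropped).

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

static class DumpDemo
{
    // Turn a BitConverter.ToString-style dump ("5B-5B-...") back into bytes.
    public static byte[] ParseHex(string dump)
    {
        string[] parts = dump.Split('-');
        byte[] bytes = new byte[parts.Length];
        for (int i = 0; i < parts.Length; i++)
            bytes[i] = Convert.ToByte(parts[i], 16);
        return bytes;
    }

    // Assumption: the body is Windows-1256 text that contains \uXXXX escapes.
    public static string Decode(byte[] body)
    {
        // Makes legacy code pages such as 1256 available on .NET Core / .NET 5+.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        string text = Encoding.GetEncoding(1256).GetString(body);

        // Replace each \uXXXX escape with the character it names.
        return Regex.Replace(text, @"\\u([0-9A-Fa-f]{4})",
            m => ((char)Convert.ToInt32(m.Groups[1].Value, 16)).ToString());
    }

    static void Main()
    {
        byte[] body = ParseHex(
            "5B-5B-5B-22-C0-5C-75-30-36-43-43-E1-E6-22-2C-22-48-65-6C-6C-6F-22-2C-22-22-5D-5D-2C-2C-22-65-6E-22-5D");

        // The browser-style result for "Hello", with the Urdu word written as escapes.
        string expected = "[[[\"\u06C1\u06CC\u0644\u0648\",\"Hello\",\"\"]],,\"en\"]";
        Console.WriteLine(Decode(body) == expected);
    }
}
```

If that reading is right, the response was not UTF-8 at all, which would explain why decoding it with `Encoding.UTF8` did not help.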

  • Wednesday, September 15, 2010 4:13 PM
     
     
    I tried the HtmlDecode method, but it cannot convert text in other languages.
  • Wednesday, September 15, 2010 4:51 PM
     
     Answered

    Google actually doesn't seem to be using Unicode to return the data. Looking at the ResponseHeaders, it's actually some ISO-8x charset for Arabic.

    This code extracts what you want, though it's not very robust atm ;):

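        using System;
        using System.Net;
        using System.Text;
        using System.Text.RegularExpressions;
        using System.Web; // HttpUtility; needs a reference to System.Web.dll

        class Program
        {
          static void Main(string[] args)
          {
            // URL-encode the text before it goes into the query string.
            string result = prvTranslateText2(HttpUtility.UrlEncode("I like cheese"), "en", "ar");
            Console.WriteLine(result);
          }

          public static string prvTranslateText2(string inputStr, string fromLanguage, string toLanguage)
          {
            string url = string.Format("http://translate.google.com/translate_a/t?client=t&text={0}&hl=en&sl={1}&tl={2}&multires=1&pc=0&sc=1", inputStr, fromLanguage, toLanguage);
            WebClient tmpClient = new WebClient();
            // Note: UTF8Encoding.Default resolves to Encoding.Default, not UTF-8,
            // and DownloadData returns raw bytes without using this property.
            tmpClient.Encoding = UTF8Encoding.Default;

            byte[] resultData = tmpClient.DownloadData(url);
            // Read the charset the server declared in the Content-Type response header.
            string charset = Regex.Match(tmpClient.ResponseHeaders["Content-Type"], "(?<=charset=)[\\w-]+").Value;

            // Decode the raw bytes with that charset.
            string result = Encoding.GetEncoding(charset).GetString(resultData);
            return result;
          }
        }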
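Since the code above is described as not very robust, here is a minimal sketch of one way to harden the charset lookup. It assumes a hypothetical helper named `FromContentType` (not part of the thread or of any library) that falls back to UTF-8 when the header is missing, carries no charset, or names an encoding the runtime does not know.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

static class CharsetHelper
{
    // Hypothetical helper: pick the encoding named in a Content-Type header value,
    // falling back to UTF-8 when there is no usable charset.
    public static Encoding FromContentType(string contentType)
    {
        if (string.IsNullOrEmpty(contentType))
            return Encoding.UTF8;

        Match m = Regex.Match(contentType, "charset=\"?([\\w-]+)", RegexOptions.IgnoreCase);
        if (!m.Success)
            return Encoding.UTF8;

        try
        {
            return Encoding.GetEncoding(m.Groups[1].Value);
        }
        catch (ArgumentException)
        {
            // Thrown for names the runtime does not recognise.
            return Encoding.UTF8;
        }
    }

    static void Main()
    {
        Console.WriteLine(FromContentType("text/javascript; charset=UTF-8").WebName);
        Console.WriteLine(FromContentType("text/plain").WebName);
        Console.WriteLine(FromContentType(null).WebName);
    }
}
```

In the method above it would be used as `CharsetHelper.FromContentType(tmpClient.ResponseHeaders["Content-Type"]).GetString(resultData)`. Falling back to UTF-8 is a design choice for this sketch; a different default may suit other servers better.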
    

     

     

     

  • Wednesday, September 15, 2010 8:18 PM
     
     

    Thanks, Jesse Houwing, for taking the time to write code for me :).

    I got the response headers from Firebug:

     


     

    Date: Wed, 15 Sep 2010 20:07:26 GMT
    Expires: Wed, 15 Sep 2010 20:07:26 GMT
    Cache-Control: private, max-age=3600
    Content-Type: text/javascript; charset=UTF-8
    Content-Language: ar
    X-Content-Type-Options: nosniff
    Content-Encoding: gzip
    Server: translation
    Content-Length: 91

     

    It seems it's UTF-8. Well, whatever the charset is, your code is still returning some invalid, unreadable characters for me. Did the code work for you?

     

     

     


  • Wednesday, September 15, 2010 8:24 PM
     
     
    I think that's why Google didn't put Urdu and some other languages in the API they provide. But overall, thanks guys. Thanks Jesse, your code helped me translate to Arabic at last, but it doesn't seem to work for some other languages. Well, you did your job and you deserve to be marked as the answer :)
  • Thursday, September 16, 2010 1:04 PM
     
     
    I haven't looked into other languages that might be failing. I took Arabic as an example because I can recognize it, though I can only assume Google did the right job in translating it ;). Do you have the parameters that are failing, plus a sample of the output you're expecting?
  • Thursday, September 16, 2010 1:45 PM
     
     

    Abdul,

    This is an often stated problem.

    People sometimes don't realize that this depends on the language (in fact the code page) installed on their computer.

    The Western European one is 1252 (it includes the characters used in languages such as English, German, Dutch and French).

    The Arabic one is 1256.

    http://en.wikipedia.org/wiki/Windows-1256

    On every Windows computer where both need to be displayed, they should be installed under the languages.

    (Installing the Arabic language is enough to get the code page; there is no need to install all the other languages that use it.)


    Success
    Cor
  • Thursday, September 16, 2010 2:42 PM
     
     Answered

    Two ideas:

    1) Why don't you use DownloadString?

    2) Maybe add an Accept-Charset header with the value ISO-8859-1,utf-8;q=0.7,*;q=0.7 (or something similar, grab it with Firebug) to the request; then you will probably get the response in UTF-8.

    The second idea should help; why else would Firefox get a UTF-8 response from Google...

  • Thursday, September 16, 2010 4:14 PM
     
    :)
    This is the method call:
    prvTranslateText2("Hello World","en","ur");

    This is the value I'm getting (string result):
    { ur: ?????????????=?????? }

    This is the value I should get (the browser returns it for the same URL):
    [[["ہیلو دنیا","Hello World",""]],,"en"]

    Here is the direct URL the method is calling:
    http://translate.google.com/translate_a/t?client=t&text=Hello%20World&hl=en&sl=en&tl=ur&multires=1&pc=0&sc=1

    Now, with your code, it translates short words but messes up long sentences, in all languages :).

  • Thursday, September 16, 2010 4:20 PM
     
     

    I think you are right, sir. I didn't notice the request headers Firefox sends. Let me change my method call and get back to you.

     


  • Thursday, September 16, 2010 6:49 PM
     
     Answered

    Thanks boothwine.

    How could I be such a fool :)

    I thought

    WebClient.Encoding = Encoding.UTF8;
    

    would work everything out :)

    But thanks.

    Here's the code, now working fully correctly:

        private static string prvTranslateText2(string inputStr, string fromLanguage, string toLanguage)
        {
          string url = string.Format("http://translate.google.com/translate_a/t?client=t&text={0}&hl=en&sl={1}&tl={2}&multires=1&pc=0&sc=1", inputStr, fromLanguage, toLanguage);
          WebClient tmpClient = new WebClient();
          // UTF8Encoding.Default resolves to Encoding.Default; DownloadData returns raw bytes and does not use it.
          tmpClient.Encoding = UTF8Encoding.Default;

          // Mimic the request headers Firefox sends (copied from Firebug).
          tmpClient.Headers["User-Agent"] = "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.9) Gecko/20100824 Firefox/3.6.9";
          tmpClient.Headers["Accept-Language"] = "en-us,en;q=0.5";
          tmpClient.Headers["Accept-Charset"] = "ISO-8859-1,utf-8;q=0.7,*;q=0.7";

          byte[] resultData = tmpClient.DownloadData(url);
          // Take the charset name from the Content-Type response header.
          string CharSet = Regex.Match(tmpClient.ResponseHeaders["Content-Type"], "(?<=charset=)[\\w-]+").Value;

          // Decode the bytes with the charset the server declared.
          string result = Encoding.GetEncoding(CharSet).GetString(resultData);
          return result;
        }
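One thing the method above leaves to the caller is escaping the text for the query string. A minimal sketch of doing that while building the URL, assuming a hypothetical helper called `BuildUrl` (the name and the class are made up for illustration; the URL template is the one from the thread):

```csharp
using System;

static class TranslateUrl
{
    // Hypothetical helper: fill in the query string, escaping each value
    // so that spaces and non-ASCII characters are percent-encoded.
    public static string BuildUrl(string text, string fromLanguage, string toLanguage)
    {
        return string.Format(
            "http://translate.google.com/translate_a/t?client=t&text={0}&hl=en&sl={1}&tl={2}&multires=1&pc=0&sc=1",
            Uri.EscapeDataString(text),
            Uri.EscapeDataString(fromLanguage),
            Uri.EscapeDataString(toLanguage));
    }

    static void Main()
    {
        // Should match the "Direct URL" quoted in the question.
        Console.WriteLine(BuildUrl("Hello World", "en", "ur"));
    }
}
```

Whether missing escaping is what caused longer sentences to come back garbled is not established in the thread; the sketch only shows how to rule it out.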
    


    • Proposed As Answer by boothwine Friday, September 17, 2010 7:30 AM
    • Marked As Answer by Abdul Sami Siddiqui Sunday, September 19, 2010 9:55 AM