How to make Unicode text in different languages readable?
-
Wednesday, September 15, 2010 12:17 AM
string prvTranslateText2(string inputStr, string fromLanguage, string toLanguage)
{
    UnicodeEncoding tmpEncoding = new UnicodeEncoding();
    string url = string.Format("http://translate.google.com/translate_a/t?client=t&text={0}&hl=en&sl={1}&tl={2}&multires=1&pc=0&sc=1", inputStr, fromLanguage, toLanguage);
    WebClient tmpClient = new WebClient();
    tmpClient.Encoding = System.Text.Encoding.ASCII;
    string result = tmpEncoding.GetString(tmpClient.DownloadData(url));
    return result;
}
Method call: prvTranslateText2("Hello World","en","ur");
Return value: string result { ur: ?????????????=?????? }
Browser return value for the URL: [[["ہیلو دنیا","Hello World",""]],,"en"]
Direct URL: http://translate.google.com/translate_a/t?client=t&text=Hello%20World&hl=en&sl=en&tl=ur&multires=1&pc=0&sc=1
Required: How can I get the response string the way the browser shows it (I mean readable)?
Am working On It :)
All Replies
-
Wednesday, September 15, 2010 12:34 AM
Ahh, it's eating me up. I didn't want to create a web application for such a job, which would automatically solve this problem for me. I thought it would be easy to do. But there are no replies, even though I created multiple threads on it :(
-
Wednesday, September 15, 2010 9:27 AM
Did you try another encoding? ASCII cannot handle any non-Latin script.
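To illustrate the point about ASCII, here is a minimal sketch that decodes the same UTF-8 bytes twice, once as ASCII and once as UTF-8. The class and method names (AsciiVsUtf8Demo, Decode) are made up for this example and are not part of the original code.

```csharp
using System;
using System.Text;

public static class AsciiVsUtf8Demo
{
    // Decodes the given bytes with the given encoding.
    public static string Decode(byte[] data, Encoding encoding)
    {
        return encoding.GetString(data);
    }

    public static void Main()
    {
        // UTF-8 bytes for the Urdu text from the question.
        byte[] data = Encoding.UTF8.GetBytes("ہیلو دنیا");

        Console.OutputEncoding = Encoding.UTF8;
        // ASCII has no mapping for bytes above 0x7F, so each one becomes '?'.
        Console.WriteLine(Decode(data, Encoding.ASCII));
        // Decoding with the encoding the bytes were written in restores the text.
        Console.WriteLine(Decode(data, Encoding.UTF8));
    }
}
```

The first line printed consists of question marks, much like the return value in the question; the second is the original text (whether the console renders it depends on the console font).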
-
Wednesday, September 15, 2010 1:46 PM
Yes, I tried UTF-8 encoding in webClient.Encoding.
I also tried Encoding.UTF8.GetString(DownloadData(url)); that doesn't seem to correct the problem either.
Mark Post As Answer If It Helped You and Also Take Some Time Out To Mark The Thread Resolved. -
Wednesday, September 15, 2010 3:22 PM
I'm not sure what you're looking for, but I have the idea that HTML encoding can probably help you. If not, it does not matter.
http://msdn.microsoft.com/en-us/library/73z22y6h.aspx
Success
Cor -
Wednesday, September 15, 2010 4:12 PM
Thanks for replying, Cor Ligthert.
Some languages, like Arabic, Chinese, Urdu, etc., are Unicode languages. Their text looks like this.
In English, "Hello":
String: "Hello"
Bytes (BitConverter.ToString):
Length: 30
Output: 5B-5B-5B-22-48-65-6C-6C-6F-22-2C-22-48-65-6C-6C-6F-22-2C-22-22-5D-5D-2C-2C-22-65-6E-22-5D
In Urdu, the translation of "Hello":
String: "ہیلو"
Bytes (BitConverter.ToString):
Length: 34
Output: 5B-5B-5B-22-C0-5C-75-30-36-43-43-E1-E6-22-2C-22-48-65-6C-6C-6F-22-2C-22-22-5D-5D-2C-2C-22-65-6E-22-5D
I seem to be able to use all languages in any text editor I found on my computer, just not in the string returned from WebClient.
-
Wednesday, September 15, 2010 4:13 PM
I tried the HtmlDecode method, but it cannot convert text in other languages.
-
Wednesday, September 15, 2010 4:51 PM
Google actually doesn't seem to be using Unicode to return the data. Looking at the ResponseHeaders, it's actually some ISO-8x charset for Arabic.
This code extracts what you want, though it's not very robust atm ;):
static void Main(string[] args)
{
    prvTranslateText2(HttpUtility.UrlEncode("I like cheese"), "en", "ar");
}

public static string prvTranslateText2(string inputStr, string fromLanguage, string toLanguage)
{
    string url = string.Format("http://translate.google.com/translate_a/t?client=t&text={0}&hl=en&sl={1}&tl={2}&multires=1&pc=0&sc=1", inputStr, fromLanguage, toLanguage);
    WebClient tmpClient = new WebClient();
    tmpClient.Encoding = UTF8Encoding.Default;
    byte[] resultData = tmpClient.DownloadData(url);
    string charset = Regex.Match(tmpClient.ResponseHeaders["Content-Type"], "(?<=charset=)[\\w-]+").Value;
    string result = Encoding.GetEncoding(charset).GetString(resultData);
    return result;
}
- Marked As Answer by Abdul Sami Siddiqui Wednesday, September 15, 2010 8:24 PM
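The reply above describes its code as not very robust. One possible gap is the charset lookup: if the Content-Type header is missing, has no charset, or names a charset the runtime does not know, Encoding.GetEncoding would throw. A minimal sketch of a more forgiving lookup follows. The names ResponseDecoder, EncodingFromContentType and Decode are hypothetical, and falling back to UTF-8 is an assumption rather than something the thread establishes.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

public static class ResponseDecoder
{
    // Picks the encoding named in a Content-Type header value.
    // Assumption: fall back to UTF-8 when the header is missing,
    // has no charset parameter, or names an unknown charset.
    public static Encoding EncodingFromContentType(string contentType)
    {
        if (!string.IsNullOrEmpty(contentType))
        {
            Match m = Regex.Match(contentType, @"charset\s*=\s*""?([\w-]+)", RegexOptions.IgnoreCase);
            if (m.Success)
            {
                try
                {
                    return Encoding.GetEncoding(m.Groups[1].Value);
                }
                catch (ArgumentException)
                {
                    // Unknown charset name: fall through to the default.
                }
            }
        }
        return Encoding.UTF8;
    }

    // Decodes a response body using the charset from its Content-Type header.
    public static string Decode(byte[] body, string contentType)
    {
        return EncodingFromContentType(contentType).GetString(body);
    }

    public static void Main()
    {
        // Stand-in for the bytes returned by WebClient.DownloadData.
        byte[] body = Encoding.UTF8.GetBytes("[[[\"ہیلو دنیا\",\"Hello World\",\"\"]],,\"en\"]");
        Console.OutputEncoding = Encoding.UTF8;
        Console.WriteLine(Decode(body, "text/javascript; charset=UTF-8"));
    }
}
```

In the method above, this would replace the Regex.Match and Encoding.GetEncoding lines, with tmpClient.ResponseHeaders["Content-Type"] passed as contentType.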
-
Wednesday, September 15, 2010 8:18 PM
Thanks, Jesse Houwing, for taking the time to write code for me :). Thanks.
I got the response headers from Firebug:
Date: Wed, 15 Sep 2010 20:07:26 GMT
Expires: Wed, 15 Sep 2010 20:07:26 GMT
Cache-Control: private, max-age=3600
Content-Type: text/javascript; charset=UTF-8
Content-Language: ar
X-Content-Type-Options: nosniff
Content-Encoding: gzip
Server: translation
Content-Length: 91
It seems it's UTF-8. Well, it doesn't matter what the charset is, but your code is still returning some invalid, unreadable characters. Did the code work for you?
-
Wednesday, September 15, 2010 8:24 PM
I think that's why Google didn't put Urdu and some other languages in the API they provide. But overall, thanks guys. Thanks Jesse, your code helped me translate to Arabic at last. But it doesn't seem to work for some other languages. Well, you did your job and you deserve to be marked as the answer :)
-
Thursday, September 16, 2010 1:04 PM
I haven't looked into other languages that might be failing. I took Arabic as an example because I can recognize it, though I can only assume Google did the right job in translating it ;). Do you have the parameters that are failing, plus a sample of the output you're expecting?
-
Thursday, September 16, 2010 1:45 PM
Abdul,
This is an often stated problem.
People sometimes don't realize that it depends on the language (in fact the code page) installed on their computer.
The Western European one is 1252 (it includes the characters used in languages such as English, German, Dutch and French).
The Arabic one is 1256:
http://en.wikipedia.org/wiki/Windows-1256
On every Windows computer where both need to be displayed, they should be installed under languages.
(Installing the Arabic language is enough to get the code page; there is no need to install all the other languages that use that code page.)
Success
Cor -
Thursday, September 16, 2010 2:42 PM
Two ideas:
1) Why don't you use DownloadString?
2) Maybe add the Accept-Charset header with the value ISO-8859-1,utf-8;q=0.7,*;q=0.7 (or something similar, grab it with Firebug) to the request; then you will probably get the response in UTF-8.
The second idea should help. Why else would Firefox get a UTF-8 response from Google...
- Marked As Answer by Abdul Sami Siddiqui Thursday, September 16, 2010 6:47 PM
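The two ideas in this reply can be combined roughly as follows. This is only a sketch: TranslateClientSketch, CreateClient and Download are hypothetical names, the header value is the one quoted in the reply, and whether the server honours Accept-Charset is an assumption. Main only prints the configured header, so it runs without network access.

```csharp
using System;
using System.Net;
using System.Text;

public static class TranslateClientSketch
{
    // Creates a WebClient that asks the server for UTF-8 via Accept-Charset.
    public static WebClient CreateClient()
    {
        WebClient client = new WebClient();
        // Encoding is what WebClient uses when uploading and downloading strings.
        client.Encoding = Encoding.UTF8;
        client.Headers["Accept-Charset"] = "ISO-8859-1,utf-8;q=0.7,*;q=0.7";
        return client;
    }

    // Idea 1: DownloadString returns the decoded text directly.
    // Requires network access; the URL is supplied by the caller.
    public static string Download(string url)
    {
        using (WebClient client = CreateClient())
        {
            return client.DownloadString(url);
        }
    }

    public static void Main()
    {
        using (WebClient client = CreateClient())
        {
            Console.WriteLine(client.Headers["Accept-Charset"]);
        }
    }
}
```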
-
Thursday, September 16, 2010 4:14 PM
:)
This is the method call:
prvTranslateText2("Hello World","en","ur");
This is the value I'm getting:
string result { ur: ?????????????=?????? }
This is the value I should get (the browser also gives it for the URL):
[[["ہیلو دنیا","Hello World",""]],,"en"]
Here is the link the method is calling:
http://translate.google.com/translate_a/t?client=t&text=Hello%20World&hl=en&sl=en&tl=ur&multires=1&pc=0&sc=1
Now, with your code, it's translating small words and messing up large sentences, in all languages :).
-
Thursday, September 16, 2010 4:20 PM
I think you are right, sir. I didn't notice Firefox's request headers. Let me change my method call and get back.
-
Thursday, September 16, 2010 6:49 PM
Thanks, boothwine.
How could I be such a fool :)
I thought
WebClient.Encoding = Encoding.UTF8;
would work everything out :)
But thanks.
Here's the code, working fully correctly:
private static string prvTranslateText2(string inputStr, string fromLanguage, string toLanguage)
{
    UnicodeEncoding tmpEncoding = new UnicodeEncoding();
    string url = string.Format("http://translate.google.com/translate_a/t?client=t&text={0}&hl=en&sl={1}&tl={2}&multires=1&pc=0&sc=1", inputStr, fromLanguage, toLanguage);
    WebClient tmpClient = new WebClient();
    // Has no effect on DownloadData, which returns the raw bytes.
    tmpClient.Encoding = UTF8Encoding.Default;
    // Send the same request headers Firefox sends (captured with Firebug).
    tmpClient.Headers["User-Agent"] = "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.2.9) Gecko/20100824 Firefox/3.6.9";
    tmpClient.Headers["Accept-Language"] = "en-us,en;q=0.5";
    tmpClient.Headers["Accept-Charset"] = "ISO-8859-1,utf-8;q=0.7,*;q=0.7";
    byte[] resultData = tmpClient.DownloadData(url);
    // Read the charset from the response's Content-Type header and decode with it.
    string CharSet = Regex.Match(tmpClient.ResponseHeaders["Content-Type"], "(?<=charset=)[\\w-]+").Value;
    string result = Encoding.GetEncoding(CharSet).GetString(resultData);
    return result;
}
- Proposed As Answer by boothwine Friday, September 17, 2010 7:30 AM
- Marked As Answer by Abdul Sami Siddiqui Sunday, September 19, 2010 9:55 AM
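One detail worth noting about the final method: it puts inputStr into the query string as-is, while the direct URL in the question has the space escaped as %20 and the earlier reply passed its input through HttpUtility.UrlEncode. Below is a minimal sketch of building the URL with the text escaped. TranslateUrlBuilder and Build are hypothetical names, and the thread does not confirm that missing escaping caused any of the reported failures.

```csharp
using System;

public static class TranslateUrlBuilder
{
    // Same URL pattern as in the thread.
    private const string UrlFormat =
        "http://translate.google.com/translate_a/t?client=t&text={0}&hl=en&sl={1}&tl={2}&multires=1&pc=0&sc=1";

    // Escapes each value before placing it in the query string,
    // so spaces and non-ASCII characters are percent-encoded.
    public static string Build(string text, string fromLanguage, string toLanguage)
    {
        return string.Format(UrlFormat,
            Uri.EscapeDataString(text),
            Uri.EscapeDataString(fromLanguage),
            Uri.EscapeDataString(toLanguage));
    }

    public static void Main()
    {
        // Prints the same direct URL that is quoted in the question.
        Console.WriteLine(Build("Hello World", "en", "ur"));
    }
}
```

The result of Build can be passed to DownloadData in place of the string.Format call in the method above.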

