Unicode Characters Not Similar to Unicode Table

  • Question

  • Hi everyone,

    I have a string of characters written in Arabic or even in English. I am using the Encoding class to get the Unicode numbers of this input string, as follows:

                Encoding _Encoding = System.Text.Encoding.GetEncoding("Unicode");
                byte[] _UnicodeBytes = _Encoding.GetBytes(inputString);
                foreach (byte _Current in _UnicodeBytes)
                    outputString += _Current;

    The problem is that when I check a Unicode code chart (such as http://unicode-table.com/en/#0041), none of the numbers generated by my code match the ones in the chart.

    For example, the letter ‘A’ is (0041) in the Unicode table, but my code produces (650).

    How can I solve that? Why does the .NET Unicode encoder do this, and where do these generated numbers come from?


    Wa'el Mohsen


    • Edited by Wa'el Monday, April 22, 2013 8:26 AM
    Monday, April 22, 2013 8:25 AM

Answers

  • Hi RohitArora,

    I solved the problem by formatting the input string's characters in hex format, as follows:

    string.Format("{0:X4}", Convert.ToInt64(inputCharacter));
      

    It is working well. Your solution, however, runs into the little-endian byte order: applied to the encoded bytes, its output for the letter (A) is (410) instead of (0041).
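
    To illustrate where (650) and (410) come from, here is a minimal sketch. It assumes C# 9 or later (top-level statements), and the helper name ToCodeUnits is hypothetical, not something from the thread. Encoding.Unicode is UTF-16 little-endian, so 'A' is encoded as the two bytes 0x41 and 0x00, low byte first.

    ```csharp
    using System;
    using System.Linq;
    using System.Text;

    // UTF-16 little-endian bytes of "A": the low byte 0x41 (65) comes first, then 0x00.
    byte[] bytes = Encoding.Unicode.GetBytes("A");

    // Concatenating the bytes as decimal numbers reproduces the value from the question.
    string asDecimal = string.Concat(bytes.Select(b => b.ToString()));

    // Formatting each byte with "X" (no padding) reproduces the other value.
    string asHexBytes = string.Concat(bytes.Select(b => b.ToString("X")));

    // Hypothetical helper: format each char (a UTF-16 code unit) as four hex digits.
    static string ToCodeUnits(string text) =>
        string.Join(" ", text.Select(c => ((int)c).ToString("X4")));

    Console.WriteLine(asDecimal);        // 650
    Console.WriteLine(asHexBytes);       // 410
    Console.WriteLine(ToCodeUnits("A")); // 0041
    ```

    Characters outside the Basic Multilingual Plane occupy two UTF-16 code units, so this sketch prints two values for them; char.ConvertToUtf32 can be used when a single code point is needed.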

    Thank you for your great help, I appreciate it.


    Wa'el Mohsen

    • Marked as answer by Wa'el Tuesday, April 23, 2013 6:48 AM
    Tuesday, April 23, 2013 6:48 AM

All replies

  • In your code the values are shown in decimal representation, while documentation usually shows them in hexadecimal.

    string hexValue = decValue.ToString("X");

    Try using the above.
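
    As a rough sketch of the difference between decimal and hexadecimal output (assuming C# 9 or later top-level statements; the variable names other than decValue and hexValue are made up for illustration): the "X" format gives the shortest hex form, while "X4" pads to the four digits used in Unicode charts.

    ```csharp
    using System;
    using System.Globalization;

    int decValue = 65; // the letter 'A' as a decimal number

    string hexValue = decValue.ToString("X");  // shortest hex form
    string padded = decValue.ToString("X4");   // zero-padded to four hex digits

    // Parsing the padded hex string gives back the original decimal value.
    int roundTrip = int.Parse(padded, NumberStyles.HexNumber, CultureInfo.InvariantCulture);

    Console.WriteLine($"{hexValue} {padded} {roundTrip}"); // 41 0041 65
    ```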


    Mark Answered, if it solves your question and Vote if you found it helpful.
    Rohit Arora

    Monday, April 22, 2013 10:50 AM