How to Decode from UTF-8 to Clean Text

  • Question

  • Goal:
    Decode from UTF-8 to clean text

    Problem:
    With the code below, "masavÃ¤g" is not decoded to "masaväg".

    What part am I missing?

    Thank you!

    Info:
    Decoding from "masavÃ¤g" to "masaväg" does work on this page: https://www.browserling.com/tools/utf8-decode


        UTF8Encoding utf8 = new UTF8Encoding();
        String unicodeString = "masavÃ¤g";
        // Encode the string.
        Byte[] encodedBytes = utf8.GetBytes(unicodeString);
        // Decode bytes back to string.
        String decodedString = utf8.GetString(encodedBytes);

    Thursday, December 14, 2017 2:13 PM

All replies

  • You are converting from UTF8 to UTF8, so nothing changes here. The site you pointed to doesn't clarify what "clear text" means to them. Looking at the script they are running, they seem to just be stripping out some characters and adjusting others. Which encoding they are using is unclear to me.

    For your code you need to decide what encoding you want to convert to. There is no such thing as the "clear text" encoding. The Encoding class defines the standard encodings supported by .NET you can convert to.
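
    To make that concrete, here is a minimal sketch of how a string like "masavÃ¤g" can come about and why a UTF-8 round trip leaves it unchanged. It assumes the text was UTF-8 bytes that an earlier step decoded as ISO-8859-1 (Latin-1); the thread does not confirm which encoding was really involved, and the class and method names are made up for illustration.

```csharp
using System;
using System.Text;

public static class MojibakeDemo
{
    // Encodes text with one encoding and decodes the resulting bytes with another.
    public static string Reinterpret(string text, Encoding encodeWith, Encoding decodeWith)
    {
        return decodeWith.GetString(encodeWith.GetBytes(text));
    }

    public static void Main()
    {
        // Assumption: the wrong decoder was ISO-8859-1 (Latin-1).
        Encoding latin1 = Encoding.GetEncoding("ISO-8859-1");

        string clean = "masav\u00E4g"; // "masaväg"

        // UTF-8 bytes read back as Latin-1: this produces the garbled form.
        string garbled = Reinterpret(clean, Encoding.UTF8, latin1);

        // Same encoding in both directions: the garbled string comes back as-is.
        string roundTrip = Reinterpret(garbled, Encoding.UTF8, Encoding.UTF8);

        // Reversing the wrong step: Latin-1 bytes decoded as UTF-8.
        string repaired = Reinterpret(garbled, latin1, Encoding.UTF8);

        Console.WriteLine(garbled == "masav\u00C3\u00A4g"); // True
        Console.WriteLine(roundTrip == garbled);            // True
        Console.WriteLine(repaired == clean);               // True
    }
}
```

    So the decision is less about which encoding to convert to and more about which encoding was wrongly applied earlier, since that is the step that has to be undone.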


    Michael Taylor http://www.michaeltaylorp3.net

    Thursday, December 14, 2017 3:13 PM
    Moderator
  • This works on my OS:

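    string sString = "masavÃ¤g";
    // UTF-8 bytes of the garbled string.
    byte[] utf8Bytes = Encoding.UTF8.GetBytes(sString);
    // Re-encode as Windows-1252, which recovers the original raw bytes.
    byte[] win1252Bytes = Encoding.Convert(Encoding.UTF8, Encoding.GetEncoding("Windows-1252"), utf8Bytes);
    // Decode those raw bytes as UTF-8.
    string sConvertedString = Encoding.UTF8.GetString(win1252Bytes);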
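
    The Encoding.Convert call above decodes the UTF-8 bytes back into the same string and re-encodes it as Windows-1252, so the UTF-8 step cancels out. A shorter sketch of the same idea follows. The helper name is hypothetical, and the provider registration is an assumption about the target runtime: .NET Core and .NET 5+ do not expose Windows-1252 until CodePagesEncodingProvider is registered, while on .NET Framework the encoding is built in.

```csharp
using System;
using System.Text;

public static class Win1252Repair
{
    static Win1252Repair()
    {
        // Assumption: running on .NET Core / .NET 5+, where code page
        // encodings such as 1252 must be registered before use.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
    }

    // Recovers the raw bytes via Windows-1252, then decodes them as UTF-8.
    public static string Fix(string garbled)
    {
        Encoding win1252 = Encoding.GetEncoding(1252);
        return Encoding.UTF8.GetString(win1252.GetBytes(garbled));
    }

    public static void Main()
    {
        // "masavÃ¤g" -> "masaväg"
        Console.WriteLine(Fix("masav\u00C3\u00A4g") == "masav\u00E4g"); // True

        // "â€™" -> right single quotation mark (U+2019); this case needs
        // Windows-1252 because U+20AC and U+2122 are not Latin-1 characters.
        Console.WriteLine(Fix("\u00E2\u20AC\u2122") == "\u2019");       // True
    }
}
```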

    Thursday, December 14, 2017 3:52 PM
  • If you cannot yet fix the problem in other parts to avoid such incorrect strings, then try this temporary workaround too:

    string bad_string = "masavÃ¤g";
    string result = Encoding.UTF8.GetString( bad_string.Select( c => (byte)c ).ToArray() );
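
    One caveat with the cast: it only recovers the original bytes when every character is in the range U+0000 to U+00FF. Anything above that is silently truncated by (byte)c. A guarded variant might look like the sketch below; the TryFix name and the choice to return the input unchanged on failure are my own assumptions, not part of the workaround above.

```csharp
using System;
using System.Linq;
using System.Text;

public static class CastRepair
{
    // Returns false when some character does not fit in one byte,
    // because the (byte) cast would then drop its high bits.
    public static bool TryFix(string garbled, out string repaired)
    {
        if (garbled.Any(c => c > 0xFF))
        {
            repaired = garbled; // leave the input untouched
            return false;
        }

        // Each char is treated as one raw byte, then the bytes are decoded as UTF-8.
        repaired = Encoding.UTF8.GetString(garbled.Select(c => (byte)c).ToArray());
        return true;
    }

    public static void Main()
    {
        string repaired;

        Console.WriteLine(TryFix("masav\u00C3\u00A4g", out repaired)); // True
        Console.WriteLine(repaired == "masav\u00E4g");                 // True

        // "â€™" contains U+20AC and U+2122, which do not fit in a byte.
        Console.WriteLine(TryFix("\u00E2\u20AC\u2122", out repaired)); // False
    }
}
```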

    Thursday, December 14, 2017 7:16 PM
  • Hello Sakura,

    First you need to know which encoding produced "masavÃ¤g"; that is the root of your issue. The following example uses reflection to iterate over all the static properties of "Encoding" that return an encoding, encodes the string with each one, and decodes the bytes as UTF-8.

    String unicodeString = "masavÃ¤g";

    // Try every static property of Encoding that returns an Encoding (UTF8, Unicode, ASCII, ...).
    foreach (PropertyInfo property in typeof(Encoding).GetProperties(BindingFlags.Public | BindingFlags.Static))
    {
        if (property.PropertyType == typeof(Encoding))
        {
            var encoding = (Encoding)property.GetValue(null, null);

            // Encode with the candidate, then decode the bytes as UTF-8.
            byte[] bytes = encoding.GetBytes(unicodeString);

            Console.WriteLine(property.Name + ": " + Encoding.UTF8.GetString(bytes));
        }
    }

    Console.Read();

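
    The static properties only cover a handful of encodings. If one garbled string and its expected clean form are known, a wider search over Encoding.GetEncodings() is possible, roughly as sketched here. The class and method names are hypothetical, and the set of encodings returned depends on the runtime and on which providers are registered.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class EncodingProbe
{
    // Lists encodings E for which UTF8.GetString(E.GetBytes(garbled)) equals expected.
    public static List<Encoding> FindCandidates(string garbled, string expected)
    {
        var matches = new List<Encoding>();

        foreach (EncodingInfo info in Encoding.GetEncodings())
        {
            try
            {
                Encoding candidate = info.GetEncoding();
                string attempt = Encoding.UTF8.GetString(candidate.GetBytes(garbled));
                if (attempt == expected)
                {
                    matches.Add(candidate);
                }
            }
            catch (NotSupportedException)
            {
                // A listed encoding may not be usable on this runtime; skip it.
            }
        }

        return matches;
    }

    public static void Main()
    {
        // Known pair: "masavÃ¤g" should become "masaväg".
        foreach (Encoding e in FindCandidates("masav\u00C3\u00A4g", "masav\u00E4g"))
        {
            Console.WriteLine(e.CodePage + " " + e.WebName);
        }
    }
}
```

    Several encodings can match a single sample, so it helps to test with more than one garbled string before settling on a candidate.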

    Best regards,

    Neil Hu


    MSDN Community Support
    Please remember to click "Mark as Answer" the responses that resolved your issue, and to click "Unmark as Answer" if not. This can be beneficial to other community members reading this thread. If you have any compliments or complaints to MSDN Support, feel free to contact MSDNFSF@microsoft.com.

    Friday, December 15, 2017 11:08 AM
    Moderator
  • Hello Sakura,

    Is there any update or any other assistance I could provide? You could mark the helpful reply as answer if the issue has been solved. And if you have any concerns, please do not hesitate to let us know.

    Best regards,

    Neil Hu



    Sunday, December 24, 2017 8:50 AM
    Moderator