converting HTML <sub></sub> tags to string using unicode subscripts

  • Question

  • I'm trying to put subscript characters into a text box with the following code:

    private void TxtRemoveML_KeyDown(object sender, KeyEventArgs e)
    {
        switch (e.KeyCode)
        {
            case Keys.Enter:
                {
                    txtRemoveML.Text = "\u0082\u0083\u0084\u0082\u0083";
                }
                break;
        }
    }
    

    Executing this code results in an empty text box.

    Shouldn't this just work?

    Christ



    my code is perfect until i don't find a bug

    Sunday, August 25, 2019 10:20 PM

Answers

  • Interesting, because this works for me:

    using System;
    using System.Windows.Forms;
    class Program
    {
        static public void Main(string[] args)
        {
            MessageBox.Show( "A\u2080B\u2081C\u2082D\u2083E");
            Console.WriteLine("Once");
        }
    }
    


    Tim Roberts | Driver MVP Emeritus | Providenza & Boekelheide, Inc.

    Monday, August 26, 2019 6:53 PM
  • >>I was using \u2082, and the edited version above fails to show that.

    Do you want \u0082 or \u2082? The following does what I think you want when I run it.

    textBox1.Text = "\u2082\u2083\u2084\u2082\u2083";
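
For digits, the subscript code points run in order from U+2080 (subscript zero) to U+2089 (subscript nine), so the string above can also be built from plain digits. This is only a sketch; `SubscriptDigits` and `ToSubscriptDigits` are hypothetical names, not something from the thread.

```csharp
using System;
using System.Text;

static class SubscriptDigits
{
    // Hypothetical helper: maps ASCII digits '0'..'9' to U+2080..U+2089
    // and copies every other character through unchanged.
    public static string ToSubscriptDigits(string text)
    {
        var sb = new StringBuilder(text.Length);
        foreach (char c in text)
        {
            if (c >= '0' && c <= '9')
                sb.Append((char)(0x2080 + (c - '0')));
            else
                sb.Append(c);
        }
        return sb.ToString();
    }
}
```

With this in place, `textBox1.Text = SubscriptDigits.ToSubscriptDigits("23423");` should set the same text as the literal with the five \u208x escapes.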



    Sam Hobbs
    SimpleSamples.Info

    Monday, August 26, 2019 8:23 PM

All replies

  • Hi Christ,

    Thank you for posting here.

    First, I want to mention that your code uses a Unicode string, which might not be displayed directly. I suggest converting it to an ASCII string; you could try the following code.

    // Requires: using System.Text;
    private void TextBox1_KeyDown(object sender, KeyEventArgs e)
    {
        switch (e.KeyCode)
        {
            case Keys.Enter:
                {
                    string unicode = "\u0633\u0637\u0648\u0631 \u0639\u0628\u0631 \u0627\u0644\u0623\u064a\u0627\u0645 1";
                    textBox1.Text = Converttostring(unicode);
                }
                break;
        }
    }

    private string Converttostring(string text)
    {
        // Create two different encodings.
        Encoding ascii = Encoding.ASCII;
        Encoding unicode = Encoding.Unicode;

        // Convert the string into a byte array.
        byte[] unicodeBytes = unicode.GetBytes(text);

        // Perform the conversion from one encoding to the other.
        // Note: characters with no ASCII mapping are replaced with '?'.
        byte[] asciiBytes = Encoding.Convert(unicode, ascii, unicodeBytes);

        // Convert the new byte[] into a char[] and then into a string.
        char[] asciiChars = new char[ascii.GetCharCount(asciiBytes, 0, asciiBytes.Length)];
        ascii.GetChars(asciiBytes, 0, asciiBytes.Length, asciiChars, 0);
        return new string(asciiChars);
    }

    Second, what result do you expect? I can't get a readable string from the Unicode string you posted.

    Last but not least, could you tell me how this relates to the HTML tags in your title?

    Best regards,

    Jack


    MSDN Community Support
    Please remember to click "Mark as Answer" the responses that resolved your issue, and to click "Unmark as Answer" if not. This can be beneficial to other community members reading this thread. If you have any compliments or complaints to MSDN Support, feel free to contact MSDNFSF@microsoft.com.

    Monday, August 26, 2019 1:57 AM
  • Thanks for your reply.

    I tried your code and it didn't work.

    this was the result



    Monday, August 26, 2019 8:27 AM
  • Hi Christ,

    Thanks for the feedback.

    >>I tried your code and it didn't work.

    It depends on whether your computer supports the language.

    What result do you expect from the following code?

    txtRemoveML.Text = "\u0082\u0083\u0084\u0082\u0083";

    Best Regards,

    Jack



    Monday, August 26, 2019 8:31 AM
  • Jack's advice was not useful.  If you're trying to display Unicode characters, then it is silly to convert to ASCII first.  Those Unicode characters do not have ASCII mappings.

    What are you expecting to see here?  The characters from U+0080 to U+009F are control characters that do not display anything.
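
Both points can be checked in code. The sketch below is an illustration only (`CodePointCheck` and its methods are hypothetical names): it tests whether a string consists solely of control characters, and shows what a round trip through ASCII does to a subscript character.

```csharp
using System;
using System.Text;

static class CodePointCheck
{
    // True when every character in the string is a control character
    // (char.IsControl covers U+0000..U+001F and U+007F..U+009F).
    public static bool AllControl(string s)
    {
        foreach (char c in s)
        {
            if (!char.IsControl(c))
                return false;
        }
        return true;
    }

    // Encodes to ASCII and back. Characters with no ASCII mapping
    // come back as '?' under the default Encoding.ASCII behavior.
    public static string AsciiRoundTrip(string s)
    {
        return Encoding.ASCII.GetString(Encoding.ASCII.GetBytes(s));
    }
}
```

`CodePointCheck.AllControl("\u0082\u0083\u0084\u0082\u0083")` is true, which matches the empty-looking text box, while the same call on `"\u2082\u2083\u2084"` is false. `CodePointCheck.AsciiRoundTrip("A\u2082")` gives `"A?"`, so converting to ASCII loses the subscript.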


    Tim Roberts | Driver MVP Emeritus | Providenza & Boekelheide, Inc.

    Monday, August 26, 2019 5:18 PM
  • I was using \u2082, and the edited version above fails to show that.

    I'm running C# 2019 (preview) on Windows 10.

    I'm not sure what I need to do for my computer to support the language.

    I tried your code above and got a blank text box.

    Thanks for your help, but I worked around the subscript/superscript issue (converting the contents of a <sub> content </sub> HTML tag to subscript in my text box) using a character map.

    public static string HTML_Render_Sub_Script_Numerals(string strHTMLSource)
    {
        // Subscript characters and the plain characters they replace, matched by index.
        string strSubScript_Replace = "₀₁₂₃₄₅₆₇₈₉₊₋₌₍₎ₐₑₒₓₔ";
        string strSubscript_Test = "0123456789+-=()aeoxə";

        string strStart = "<sub>", strEnd = "</sub>";
        // HTML_GetNext is a helper not shown in this post; from its use here it appears
        // to return the next strStart...strEnd span, tags included.
        string strSubSource = HTML_GetNext(strHTMLSource, strStart, strEnd);
        while (strSubSource.Length > strStart.Length + strEnd.Length)
        {
            string strSubContent = strSubSource.Replace(strEnd, "").Replace(strStart, "");

            string strSubReplace = "";
            foreach (char c in strSubContent)
            {
                int intIndex = strSubscript_Test.IndexOf(c);
                // Keep characters that have no subscript form instead of dropping them.
                strSubReplace += intIndex >= 0 ? strSubScript_Replace[intIndex] : c;
            }

            if (strHTMLSource.Contains(strSubSource))
                strHTMLSource = strHTMLSource.Replace(strSubSource, strSubReplace);

            strSubSource = HTML_GetNext(strHTMLSource, strStart, strEnd);
        }
        return strHTMLSource;
    }

    with a matching superscript function.
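
For readers without the `HTML_GetNext` helper, the same character-map idea can be sketched with `Regex.Replace`, handling subscript and superscript tags together. This is a swapped-in technique, not the poster's code: `HtmlScriptTags`, `Map`, and `Render` are hypothetical names, only digits and `+-=()` are mapped, nested tags are not handled, and characters with no mapped form are kept as-is.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

static class HtmlScriptTags
{
    // Plain characters and their subscript/superscript forms, matched by index.
    const string Plain = "0123456789+-=()";
    const string Sub = "\u2080\u2081\u2082\u2083\u2084\u2085\u2086\u2087\u2088\u2089\u208A\u208B\u208C\u208D\u208E";
    const string Sup = "\u2070\u00B9\u00B2\u00B3\u2074\u2075\u2076\u2077\u2078\u2079\u207A\u207B\u207C\u207D\u207E";

    // Replaces each mappable character; anything else is copied unchanged.
    static string Map(string content, string target)
    {
        var sb = new StringBuilder(content.Length);
        foreach (char c in content)
        {
            int i = Plain.IndexOf(c);
            sb.Append(i >= 0 ? target[i] : c);
        }
        return sb.ToString();
    }

    // Converts the contents of <sub>...</sub> and <sup>...</sup> and drops the tags.
    public static string Render(string html)
    {
        RegexOptions options = RegexOptions.IgnoreCase | RegexOptions.Singleline;
        html = Regex.Replace(html, "<sub>(.*?)</sub>", m => Map(m.Groups[1].Value, Sub), options);
        html = Regex.Replace(html, "<sup>(.*?)</sup>", m => Map(m.Groups[1].Value, Sup), options);
        return html;
    }
}
```

A call such as `textBox1.Text = HtmlScriptTags.Render("H<sub>2</sub>O");` would then put the subscript two (U+2082) between the H and the O.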




    Monday, August 26, 2019 5:34 PM
  • I was using the Unicode values in the U+2080 range, but they didn't come through correctly on this page.

    Thanks for your help.



    Monday, August 26, 2019 5:35 PM