IndexOf( this is unstable !!!!!)

  • Question

  • i've used the String.IndexOf() method in a hundred different projects. 

    its my BFF.

    but right now, its on crack!

    look at my debugging session screen-capture.

    you can see the value for intIndexStart should be 0.  other times (today) when it was skipping the first instance of a single character search i resorted to using IndexOfAny() 

    intCut_Open = strRetVal.IndexOf(strOpen);
    char[] chrClose = { '>', '>' };
    intCut_Close = strRetVal.IndexOfAny (chrClose);

    just to stop from cracking my head against the wall.  but I can't do that with a string???!!!??

    is my computer high? is there anti-virus for this? 

    what is the problem?

    Christ


    my code is perfect until i don't find a bug

    Sunday, September 15, 2019 5:32 PM

All replies

  • Take a closer look at strText. Enumerate all its characters one by one and examine their binary value. It's possible that it contains a non-printable character, or that it contains a character that when printed looks the same as the characters that you are comparing but in reality has a different code point, so IndexOf would not consider that it is a match.
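    A minimal sketch of that check, assuming you just want one line per character with its index, code point, and Unicode category. The class and method names (CharDump, Describe) are made up for illustration and are not part of the framework:

```csharp
using System;
using System.Globalization;
using System.Text;

public static class CharDump
{
    // Returns one line per UTF-16 code unit: index, code point in hex, Unicode category.
    public static string Describe(string text)
    {
        var sb = new StringBuilder();
        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];
            sb.AppendFormat(CultureInfo.InvariantCulture,
                            "{0,3}: U+{1:X4} {2}",
                            i, (int)c, CharUnicodeInfo.GetUnicodeCategory(c));
            sb.AppendLine();
        }
        return sb.ToString();
    }
}
```

    Calling Console.Write(CharDump.Describe(strText)) makes invisible or look-alike characters stand out, because anything outside the expected ASCII range shows up with an unfamiliar code point.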

    Sunday, September 15, 2019 5:55 PM
    Moderator
  • thanks for your suggestion but i've already had a look

    you can see from this screen shot that it skips right over the instance at index = 20 and returns the next one at index = 43...

    Christ


    Sunday, September 15, 2019 6:03 PM
  • Looks bad. It appears that IndexOf is skipping the first occurrence of ">" and only finding the second one.

    This is completely unexpected. What could be happening? In a different case I would recommend copying the code for IndexOf from the source reference:

    https://referencesource.microsoft.com/#mscorlib/system/string.cs

    then pasting it into your code and stepping through it with the debugger, to try to determine what the problem might be. Unfortunately, in the case of IndexOf it ultimately calls an external method called InternalFindNLSStringEx, which is the one that does all the work. So you won't be able to debug through it.

    If everything else fails, you could try implementing your own version of IndexOf as an extension method for String (with a different name) and use that one in your code. At least then, if it fails, you will be able to step through it to find the cause of the problem.
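    As a rough sketch of what such an extension method could look like: the name IndexOfOrdinalManual and the class StringSearchExtensions are hypothetical, and the code simply compares UTF-16 code units one by one with no culture rules applied. Extension methods live in a static class in your own project; nothing needs to be added to string.cs.

```csharp
using System;

public static class StringSearchExtensions
{
    // Hypothetical helper: plain code-unit comparison, no culture-specific rules.
    public static int IndexOfOrdinalManual(this string text, string search, int start = 0)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));
        if (search == null) throw new ArgumentNullException(nameof(search));
        if (start < 0 || start > text.Length) throw new ArgumentOutOfRangeException(nameof(start));

        // Try every position where the whole search string still fits.
        for (int i = start; i + search.Length <= text.Length; i++)
        {
            int j = 0;
            while (j < search.Length && text[i + j] == search[j]) j++;
            if (j == search.Length) return i; // all characters matched at position i
        }
        return -1; // not found
    }
}
```

    With that in place, a call such as strRetVal.IndexOfOrdinalManual(">") can be stepped through in the debugger like any other code of yours.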

    Sunday, September 15, 2019 6:49 PM
    Moderator
  • Did you check what is between “>” and “bi” (strRetVal[21])?

    Sunday, September 15, 2019 7:51 PM
  • ok. that's great.

    i've written this :

    // convenience overloads: search from index 0, ignoring case by default
    public int CK_IndexOf(string strText, string strSearch) { return CK_IndexOf(strText, strSearch, 0); }
    public int CK_IndexOf(string strText, string strSearch, int intStart) { return CK_IndexOf(strText, strSearch, intStart, false); }
    public int CK_IndexOf(string strText, string strSearch, int intStart, bool bolMatchCase)
    {
        int intRetVal = intStart;   // candidate match position in strText
        int intTestChar = 0;        // characters of strSearch matched so far
        if  (!bolMatchCase )
        {
            // case-insensitive search: compare lower-cased copies
            strText = strText.ToLower();
            strSearch = strSearch.ToLower();
        }
    
        while (intRetVal + intTestChar < strText.Length )
        {
            // advance while the characters at the candidate position keep matching
            while (intRetVal + intTestChar < strText.Length
                    && intTestChar < strSearch.Length 
                    && strText[intRetVal + intTestChar] == strSearch[intTestChar])
                intTestChar++;
            if (intTestChar == strSearch.Length)
                return intRetVal;
            // mismatch: restart the comparison one position further along
            intTestChar = 0;
            intRetVal++;
        }
    
        return -1;
    }

    it seems fine.  but how do i add it to the string.cs?

    and where is this file on my HD?

    Christ

    Sunday, September 15, 2019 9:21 PM
  • not until now,

    strRetVal[21] = 712 'ˈ'  (value copied from debugger-watch of strRetVal[21])

    think that's the problem?



    Sunday, September 15, 2019 9:38 PM
  • Hi,

    // Please try passing an explicit start position and see the result
    int StartPos = 0;
    
    int index = strRetVal.IndexOf(strClose, StartPos);

    Sunday, September 15, 2019 11:22 PM
  • Quoting the earlier reply:

    "not until now, strRetVal[21] = 712 'ˈ' (value copied from debugger-watch of strRetVal[21]). think that's the problem?"

    Yeah, I also noticed that char, but assumed it was just dirt on my screen.

    Just copy the string and paste here, I'll try it on my PC.

    


    • Edited by RobbKirk Sunday, September 15, 2019 11:38 PM
    Sunday, September 15, 2019 11:31 PM
  • This worked

    int ind = s.IndexOf(x, StringComparison.Ordinal);


    • Proposed as answer by RobbKirk Sunday, September 15, 2019 11:54 PM
    • Unproposed as answer by RobbKirk Monday, September 16, 2019 1:45 AM
    Sunday, September 15, 2019 11:53 PM
  • tried that earlier, 

    and again just now.

    if (strHTML.Length > 21 && strHTML[21] == (char)712)
        strHTML = strHTML.Replace(strHTML[21], ' ');
    
    intCut_Close = strRetVal.IndexOf(strClose, 0);
                    

    i even tried changing the character that follows the one i'm looking for, but the Replace() line above made no difference to the value of strHTML.

    and it still reports the following instance of '>'...



    Monday, September 16, 2019 12:05 AM
  • I took the time to uninstall C#2019 Preview & reinstall the newer (slicker black screened hopefully not so buggy) C#2019 Community version.

    the IndexOf() issue is still present, unfortunately.

    and here is my string : "[em]<span class=\"mw\">ˈbi(l)-​yən(t)th</span>[/em]"

    public static string HTML_RemoveMarkUpLanguage(string strHTML)
    {
        string strOpen = "<";
        string strClose = ">";
        int intCut_Open=-1, intCut_Close=-1;
        string strRetVal = strHTML;
        do
        {
            intCut_Open = strRetVal.IndexOf(strOpen);
            intCut_Close = strRetVal.IndexOf(strClose, 0);
            if (intCut_Close > intCut_Open && intCut_Open >= 0)
            {
                strRetVal = strRetVal.Substring(0, intCut_Open)
                                + strRetVal.Substring(intCut_Close + 1);
            }
            else if (intCut_Close >= 0 && intCut_Open > intCut_Close)
            {
                strRetVal = strRetVal.Substring(intCut_Close + 1);
            }
            else if (intCut_Open >= 0)
                strRetVal = strRetVal.Substring(0, intCut_Open);
    
            intCut_Open = strRetVal.IndexOf(strOpen);
        } while (intCut_Open >= 0);
    
        return strRetVal;
    }
    

    tell me how it turns out.

    Christ

    Monday, September 16, 2019 12:10 AM
  • This worked, try it:

    int ind = s.IndexOf(x, StringComparison.Ordinal);

    • Marked as answer by Christ Kennedy Monday, September 16, 2019 1:12 AM
    Monday, September 16, 2019 12:33 AM
  • woohoo!!!

    public static string HTML_RemoveMarkUpLanguage(string strHTML)
    {
        string strOpen = "<";
        string strClose = ">";
        int intCut_Open=-1, intCut_Close=-1;
        string strRetVal = strHTML;
        do
        {
            intCut_Open = strRetVal.IndexOf(strOpen);
            //intCut_Close = strRetVal.IndexOf(strClose, 0);
            //int ind = s.IndexOf(x, StringComparison.Ordinal);
            intCut_Close = strRetVal.IndexOf(strClose, StringComparison.Ordinal);
            if (intCut_Close > intCut_Open && intCut_Open >= 0)
            {
                strRetVal = strRetVal.Substring(0, intCut_Open)
                                + strRetVal.Substring(intCut_Close + 1);
            }
            else if (intCut_Close >= 0 && intCut_Open > intCut_Close)
            {
                strRetVal = strRetVal.Substring(intCut_Close + 1);
            }
            else if (intCut_Open >= 0)
                strRetVal = strRetVal.Substring(0, intCut_Open);
    
            intCut_Open = strRetVal.IndexOf(strOpen);
        } while (intCut_Open >= 0);
    
        return strRetVal;
    }

    yepper!

    that worked... thanks

    Christ


    Monday, September 16, 2019 1:05 AM
  • Quoting the earlier reply:

    "not until now, strRetVal[21] = 712 'ˈ' (value copied from debugger-watch of strRetVal[21]). think that's the problem?"

    712 is “Modifier Letter Vertical Line” (U+02C8) and, according to its descriptions, it is a modification of the preceding character, resulting in a distinct grapheme. Therefore, the default (culture-sensitive) IndexOf seems to search correctly: it does not see a standalone ">" at index 20.
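    To illustrate the difference, here is a small sketch using a shortened version of the string from this thread (the class name OrdinalDemo and the shortened sample are assumptions made for the example). The ordinal search compares raw code units, so it finds ">" at index 20. On the .NET Framework setup in this thread the culture-sensitive call reportedly skipped that position; other runtimes may behave differently, so no value is claimed for it here.

```csharp
using System;

public static class OrdinalDemo
{
    // Shortened sample: the opening part of the thread's string, with U+02C8 written as an escape.
    public const string Sample = "[em]<span class=\"mw\">\u02C8bi</span>[/em]";

    // Ordinal: compares UTF-16 code units only, ignoring linguistic rules.
    public static int FirstCloseOrdinal()
    {
        return Sample.IndexOf(">", StringComparison.Ordinal); // 20
    }

    // Culture-sensitive: the result depends on the runtime's collation rules.
    public static int FirstCloseCulture()
    {
        return Sample.IndexOf(">", StringComparison.CurrentCulture);
    }
}
```

    For parsing markup-like delimiters such as "<" and ">", the ordinal comparison is usually the one you want, since linguistic equivalence is not relevant there.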


    Monday, September 16, 2019 5:06 AM