equivalent of strlen in C#

  • Question

  • I am supplied with this string:  "\002RECORD\003\002LEFT\003\002STANDBY\003\004"
    In my legacy C++ program, strlen tells me that its length is 24.  When I do the same in C# through the Length property, it tells me that it is 46 (or whatever the actual length is).

    My problem is that I need my program to determine the length of the string so that it returns the same value that strlen would.  Is this something that I would have to do manually?
    Monday, April 25, 2011 11:59 AM

Answers

  • Just to clear up some misconceptions here. Christoph already mentioned this in his reply above, but it seems to have been ignored.

    If the VERBATIM string is "\002RECORD\003\002LEFT\003\002STANDBY\003\004" as read from, say, a file, then C's strlen() function will indeed return the full verbatim length (45 by a character count, i.e. the "46 or so" from the question), not 24.

    However, if the string is hard-coded into a C/C++ source file, then the C/C++ compiler converts the escape sequences in the literal BEFORE the string is ever passed to the strlen() function.

    So what seems to be required here is to apply the same conversion to the string that the C/C++ compiler applies to string literals. The C strlen() function does NOT repeat NOT parse the \nnn escape sequences. That is ONLY repeat ONLY done by the C/C++ compiler when it translates the source code.

    IT IS IMPORTANT TO UNDERSTAND THE DIFFERENCE!

    So, the answer to "what is the equivalent to strlen()" is "string.Length". But that is the wrong question. :)

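    The real answer was already given: If you want the C# compiler to decode the codes, write them as \x002 (or, less ambiguously, \u0002) instead of \002. Note that \x in C# takes one to four hex digits, so it can swallow a following letter A-F; \u always takes exactly four.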
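
    To make the difference concrete, here is a minimal sketch of the three ways the string can be written in C# source and the Length each one yields. The class and constant names are made up for illustration; only the literals come from this thread.

```csharp
using System;

public static class LiteralLengths
{
    // Verbatim literal: backslashes are ordinary characters, nothing is decoded.
    public const string Verbatim = @"\002RECORD\003\002LEFT\003\002STANDBY\003\004";

    // Regular C# literal: only "\0" is an escape here, so "\002" becomes NUL + '0' + '2'.
    public const string CSharpEscaped = "\002RECORD\003\002LEFT\003\002STANDBY\003\004";

    // \uXXXX always consumes exactly four hex digits, so each code is one character,
    // which matches what the C/C++ compiler produces from the octal escapes.
    public const string Decoded = "\u0002RECORD\u0003\u0002LEFT\u0003\u0002STANDBY\u0003\u0004";

    public static void Main()
    {
        Console.WriteLine(Verbatim.Length);      // 45
        Console.WriteLine(CSharpEscaped.Length); // 38
        Console.WriteLine(Decoded.Length);       // 24
    }
}
```

    Only the last form gives the 24 that the legacy strlen() call reports.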

     


    • Marked as answer by jmgreen7 Monday, April 25, 2011 6:44 PM
    Monday, April 25, 2011 1:05 PM

All replies

  • Why is it returning 24? It should be returning 46. Is it parsing it within the method?

    Adam


    Ctrl+Z
    Monday, April 25, 2011 12:10 PM
  • I don't quite understand your question, but this might help you out: http://stackoverflow.com/questions/1083550/how-to-implement-strlen-in-c
    Mitja
    Monday, April 25, 2011 12:11 PM
  • As best I can tell, strlen is not counting the \00 and is only counting the 2s, 3s, 4s and the words.  I don't know why.  I can understand the \0 not being counted as it's a null terminator, but the other 0, I don't know.
    Monday, April 25, 2011 12:16 PM
  • Mitja-

    I read that thread, but the examples don't address null terminators embedded in the string.  I'm wondering if I need to parse this into multiple strings based on the terminators and then get the length that way.

    Monday, April 25, 2011 12:26 PM
  • I would say yes you'll need to parse out the terminators or regex out the wanted text. C# won't recognize those so you'll have to do some work with it somewhere.

    Adam


    Ctrl+Z
    Monday, April 25, 2011 12:30 PM
  • In C++, \XXX in a string literal represents a character given by its octal code. C# (or rather .NET) does not support that, so you may not be able to use that same hard-coded string directly in C# code. You have two workarounds:

    (1) Write a C DLL that returns the count of the passed string, and P/Invoke it. Alternatively, use a C++/CLI wrapper.

    (2) Write these strings to disk, and then read them back from disk using C#. This time you will get the correct length.
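
    A third possibility, if the text arrives with the backslashes still in it, is to decode the octal escapes by hand. The following is only a sketch: OctalEscapes, Decode and StrLen are hypothetical names, and the decoder handles nothing but a backslash followed by one to three octal digits, which is all this particular string needs.

```csharp
using System;
using System.Text;

public static class OctalEscapes
{
    // Decodes backslash + 1..3 octal digits into a single character, the way a
    // C/C++ compiler reads them in a string literal. Everything else is copied as-is.
    public static string Decode(string text)
    {
        if (text == null) throw new ArgumentNullException(nameof(text));

        var sb = new StringBuilder(text.Length);
        int i = 0;
        while (i < text.Length)
        {
            if (text[i] == '\\' && i + 1 < text.Length && IsOctal(text[i + 1]))
            {
                int value = 0;
                int digits = 0;
                i++; // skip the backslash
                while (digits < 3 && i < text.Length && IsOctal(text[i]))
                {
                    value = value * 8 + (text[i] - '0');
                    i++;
                    digits++;
                }
                sb.Append((char)value);
            }
            else
            {
                sb.Append(text[i]);
                i++;
            }
        }
        return sb.ToString();
    }

    // Mimics strlen: counts characters up to, but not including, the first NUL.
    public static int StrLen(string decoded)
    {
        int nul = decoded.IndexOf('\0');
        return nul < 0 ? decoded.Length : nul;
    }

    private static bool IsOctal(char c)
    {
        return c >= '0' && c <= '7';
    }
}
```

    With this, OctalEscapes.StrLen(OctalEscapes.Decode(@"\002RECORD\003\002LEFT\003\002STANDBY\003\004")) comes out as 24, while a string that decodes to a leading NUL would give 0, as strlen() does.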


    http://blog.voidnish.com
    Monday, April 25, 2011 12:34 PM
  • Wanted to add that your C++ code is correct. The intended length of that string is indeed 24.
    http://blog.voidnish.com
    Monday, April 25, 2011 12:35 PM
  • The default implementation of C/C++ strlen would look for the first appearance
    of \0 (8-bit char) or \00 (16-bit wchar_t).
    It would return zero for both char widths.
    A function that returns 24 as the string length for
    \002RECORD\003\002LEFT\003\002STANDBY\003\004
    is a very specialized function. It seems to strip off the \00 first,
    leaving "2RECORD32LEFT32STANDBY34" as the remaining string, and then checks its length,
    which is indeed 24.
    But again, this is not what the C-lib function strlen would do.

    In C#, it'll be:


        public static int GetLengthWithoutBackslashZeroZero(string instr)
        {
           return string.IsNullOrEmpty(instr) ? 0 : instr.Replace(@"\00", null).Length;
        }
    


    Monday, April 25, 2011 12:37 PM
  • This is what i ended up using.  It was modified from the StackOverflow site that Mitja posted:

     

          if (s == null)
          {
            // Handle the error here
          }
    
          int length = 0;
    
          fixed (char* pStr = s)
          {
            for (int i = 0; i < s.Length; i++)
            {
              if (pStr[i] == '\0')
              {
                i ++;
              }
              else
              {
                length++;
              }
            }
          }
    
          return length;

    Monday, April 25, 2011 12:37 PM
  • Hello jmgreen, the issue is the way that you define hexadecimal char codes inside a C# string:

     

    "\x002RECORD\x003\x002LEFT\x003\x002STANDBY\x003\x004"
    

     

    Hope this helps,

    Miguel.

    Monday, April 25, 2011 12:45 PM
  • The problem is that the escape sequences do not mean the same thing in C#. In C++, "\002" is one character by the time strlen sees it, while C# only sees the first digit "\0" as an escape char, so the following two characters, e.g. "02", "03" or "04", are counted as normal characters. I don't know how you're planning to use such strings in C#; it looks like some old-style field delimiter to me (just guessing). If these are the only escape chars you have in your strings, then you could simply replace them with whatever character should be there, like:

    		string s = @"\002RECORD\003\002LEFT\003\002STANDBY\003\004";
    		string[] cppesc = { "\\002", "\\003", "\\004" };
    		string[] csesc = { "\u0002", "\u0003", "\u0004" };
    		StringBuilder sb = new StringBuilder(s);
    		for (int i = 0; i < cppesc.Length; i++)
    		{
    			sb.Replace(cppesc[i], csesc[i]);
    		}
    		int len = sb.Length;
    

    /Calle


    - Still confused, but on a higher level -
    Monday, April 25, 2011 12:45 PM
  • This is what i ended up using.  It was modified from the StackOverflow site that Mitja posted:

     

     

       if (s == null)
       {
        // Handle the error here
       }
    
       int length = 0;
    
       fixed (char* pStr = s)
       {
        for (int i = 0; i < s.Length; i++)
        {
         if (pStr[i] == '\0')
         {
          i ++;
         }
         else
         {
          length++;
         }
        }
       }
    
       return length;

     


    That's odd - I can't see how that would work properly. Consider your string:

    "\002RECORD\003\002LEFT\003\002STANDBY\003\004"

    If that is how you enter the string in the C# program code, it will be converted to this (where <nul> is the ascii nul character):

    <nul>02RECORD<nul>03<nul>02LEFT<nul>03<nul>02STANDBY<nul>03<nul>04

    The code you posted will count the number of non-nul characters, which for the specified string will be 31 - which is surely not the answer!

    Monday, April 25, 2011 1:21 PM
  • Matt,

    The i++ will move the pointer over 1 extra spot, which will allow it to skip that leading zero.

    The string that i presented in my problem statement is provided to me via an external interface.  Based on legacy code, all we did to determine which command was coming in was to look at the length of the string.  I'm kinda stuck with that logic.
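
    For reference, the same skip-one-after-NUL counting can be sketched without unsafe code or pointers. NulSkippingLength and Count are hypothetical names; the logic is the loop quoted above, and it assumes the string was entered as a regular (non-verbatim) C# literal, so that each "\002" is stored as NUL, '0', '2'.

```csharp
using System;

public static class NulSkippingLength
{
    // Counts every character except each NUL and the single character right after it.
    public static int Count(string s)
    {
        if (s == null) throw new ArgumentNullException(nameof(s));

        int length = 0;
        for (int i = 0; i < s.Length; i++)
        {
            if (s[i] == '\0')
            {
                i++; // also skip the '0' that follows the NUL
            }
            else
            {
                length++;
            }
        }
        return length;
    }
}
```

    For "\002RECORD\003\002LEFT\003\002STANDBY\003\004" written as a regular C# literal this returns 24. It depends on every code having exactly that NUL-plus-'0' shape, so it is tied to this particular interface.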

    Monday, April 25, 2011 1:52 PM
  • Oh yes, I overlooked the i++ bit.

    I'm still a bit confused as to how you are actually getting the string in the format with embedded nul characters, unless it was produced by a C# program... If that same string (with the leading nul character) was presented to strlen(), it would return 0.

    Monday, April 25, 2011 2:55 PM
  • Your guess is as good as mine.  I facepalm every time I have to deal with this particular interface.  I don't know if this matters or not, but that string is hard-coded in my file (i.e., I had to copy it into my code by hand).

    I have my copy of the string as it appeared when it originated from the source (external interface).  To figure out which command I'm looking at, I loop through my copy of the commands until I find the one that is the right size.  My guess is that the C++ program stripped it all out, and that is why I was only seeing the 24 bytes.  I would rather not change the actual command to fit C#; I would rather change how I determine the length.
    I know what is going on, and why, but someone else coming in may not.

    Monday, April 25, 2011 3:07 PM
  • Okay,

    I got it all sorted out.  I used the System.Text.RegularExpressions.Regex.Unescape call.  My local copy of the string was prefaced with the @ symbol.  That allowed me to do everything that i needed to do.
    thanks.
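
    Roughly, the approach looks like the sketch below. UnescapeDemo and DecodedLength are names made up here; Regex.Unescape is the framework call. One caveat: Regex.Unescape follows regular-expression escape rules and throws ArgumentException on an escape it does not recognize, so it fits strings like this one but is not a general C-literal decoder.

```csharp
using System;
using System.Text.RegularExpressions;

public static class UnescapeDemo
{
    // Decodes the \nnn codes in a verbatim copy of the command, then measures it.
    public static int DecodedLength(string verbatim)
    {
        return Regex.Unescape(verbatim).Length;
    }

    public static void Main()
    {
        // The @ prefix keeps the backslashes intact so Unescape can process them.
        string command = @"\002RECORD\003\002LEFT\003\002STANDBY\003\004";

        Console.WriteLine(command.Length);         // 45
        Console.WriteLine(DecodedLength(command)); // 24
    }
}
```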

    • Edited by jmgreen7 Monday, April 25, 2011 6:43 PM correct
    Monday, April 25, 2011 5:11 PM