none
Very strange. My "SubstringThenTrim" code is much slower than what it is supposed to be RRS feed

  • Question

  • I want to improve the performance of code like the following,

    .Substring.Trim Approach

    var str = "                abc       def         ";
    return str.Substring(5, 10).Trim(); //*

    I want to improve the performance of line * by the following new extension method on System.String.

            public unsafe static string FastSubstringThenTrim(this string str, int startIndex, int length)
            {
                var endIndex = startIndex + length - 1;
    
                while (startIndex <= endIndex)
                {
                    if (!char.IsWhiteSpace(str[startIndex])) break;
                    ++startIndex;
                }
    
                while (endIndex >= startIndex)
                {
                    if (!char.IsWhiteSpace(str[endIndex])) break;
                    --endIndex;
                }
    
                return str.Substring(startIndex, endIndex - startIndex + 1);
            }

    and then the first code piece can be rewritten as

    .FastSubstringThenTrim Approach

    var str = "                abc       def         ";
    return str.FastSubstringThenTrim(5, 10); //**

    The FastSubstringThenTrim method is written in a way exactly the same as .NET's build-in Trim method, and it reduces internal substring operations from twice to once. It should be faster, but actually it is slower by 20%. I don't know why and I have been stumbled on this issue for entire one day.

    The following is the source code of .NET's official String.Trim method.

    [SecuritySafeCritical] private string TrimHelper(int trimType) { int end = this.Length - 1; int start = 0; if (trimType != 1) { start = 0; while (start < this.Length) { if (!char.IsWhiteSpace(this[start]) && !IsBOMWhitespace(this[start])) { break; } start++; } } if (trimType != 0) { end = this.Length - 1; while (end >= start) { if (!char.IsWhiteSpace(this[end]) && !IsBOMWhitespace(this[start])) { break; } end--; } } return this.CreateTrimmedString(start, end); }

    [SecurityCritical]
    private string CreateTrimmedString(int start, int end)
    {
        int length = (end - start) + 1;
        if (length == this.Length)
        {
            return this;
        }
        if (length == 0)
        {
            return Empty;
        }
        return this.InternalSubString(start, length);
    }




    Sunday, May 4, 2014 10:16 AM

Answers

    • Have you tried to run the Release version of your project? What I see here is that in the Debug version your "SubstringThenTrim" method is indeed slower, its "cost" is 0.18 while the cost of the Substring+Trim is 0.13.

    The Release version shows different results, 0.12 for Substring + Trims vs. 0.08 for SubstringThenTrim.

    And some notes about your DiagnosticsEx:

    • Writing 1000 method calls is a waste of time, the whole thing should be a loop and then you only need one tc method, not 4 of them. Maybe you think that using a loop adds overhead but anyway the delegate call has higher overhead than the loop itself.
    • You should use Stopwatch instead of Environment.TickCount. TickCount's resolution is 16ms which is too low for performance measurements.
    • Normally you want to ignore the result of the first timing iteration, that result is likely to include time spent JIT-compiling your code.
    • Rather than summing/averaging the time of each iteration you should keep the individual results and either display them all or use standard deviation to reject values that are too different.
    Monday, May 5, 2014 6:19 AM
    Moderator

All replies

  • I don't see how it reduces internal substring operations.

    Edit: You're right - it should be reducing this from 2 to 1.

    If you're referring to omitting "IsBOMWhitespace" - this is just returns 'false' anyway:

    http://referencesource.microsoft.com/#mscorlib/system/string.cs#bbf058af7f3f71df


    Convert between VB, C#, C++, & Java (http://www.tangiblesoftwaresolutions.com)
    Instant C# - VB to C# Converter
    Instant VB - C# to VB Converter


    Sunday, May 4, 2014 4:12 PM
  • How do you measure the performance? In my tests your FastSubstringThenTrim is 2x faster than the Substring + Trim approach.
    Sunday, May 4, 2014 4:24 PM
    Moderator
  • Thank you for the reply :) I will add the IsBOMWhitespace.

    Monday, May 5, 2014 3:51 AM
  • Hi Mike, 

    I run the two code pieces 1 million times in Release, Any CPU mode. The result is always the .Substring.Trim approach 20% faster than .SubstringThenTrim on my machine. 

    If possible, could you give your source code to measure my method? Thanks a lot :)

    I use the following extension method TimeCost to measure the performance.

           

    public static class DiagnosticsEx

    {

    /// <summary> /// Specifies rough prediction for the timecost of executing a <see cref="System.Action"/> delegate once. /// </summary> public enum TimeCostPrediction { /// <summary> /// The execution time is more than 1 second. /// </summary> Long, /// <summary> /// The execution time is more than 0.1 second. /// </summary> Medium, /// <summary> /// The execution time is less than 0.01 second. /// </summary> Short, /// <summary> /// The execution time is less than 0.001 second. /// </summary> VeryShort } /// <summary> /// Tests the time cost of the specified <see cref="System.Action"/> delegate. /// </summary> /// <param name="method">The action method to test.</param> /// <param name="testCount">Indicates how many times the test will be executed.</param> /// <param name="average">Indicates whether to return the average time cost of each test or the total time cost of all tests.</param> /// <param name="prediction">Roughly predicts how long the <see cref="System.Action"/> delegate will cost.</param> /// <returns>The test result measured in ticks.</returns> public static double TimeCost(this Action method, int testCount, bool average, TimeCostPrediction prediction) { switch (prediction) { case TimeCostPrediction.Long: return _tc(method, testCount, average); default: return _tc20(method, testCount, average); case TimeCostPrediction.Short: return _tc100(method, testCount, average); case TimeCostPrediction.VeryShort: return _tc1000(method, testCount, average); } } private static double _tc1000(Action method, int count, bool average) { int c = count; int rlt = 0; int val; while (c-- > 0) { val = Environment.TickCount; method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); rlt += Environment.TickCount - val; } if (average) return rlt / ((double)count * 1000); else return rlt / 1000f; } private static double _tc100(Action method, int count, bool average) { int c = count; int rlt = 0; int val; while (c-- > 0) { val = Environment.TickCount; method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); rlt += Environment.TickCount - val; } if (average) return rlt / ((double)count * 100); else return rlt / 100f; } private static double _tc20(Action method, int count, bool average) { int c = count; int rlt = 0; int val; while (c-- > 0) { val = Environment.TickCount; method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); method(); rlt += Environment.TickCount - val; } if (average) return rlt / ((double)count * 20); else return rlt / 20f; } private static double _tc(Action method, int count, bool average) { int c = count; int rlt = 0; int val; while (c-- > 0) { val = Environment.TickCount; method(); rlt += Environment.TickCount - val; } if (average) return rlt / ((double)count); else return rlt; }

    }

    Then I use the following code to perform the test.

            static void Main()
            {
                var testStr = "          abc     def       ";
                var act1 = new Action(() => testStr.Substring(5, 10).Trim());
                var act2 = new Action(() => testStr.SubstringThenTrim(5, 10));
                Console.WriteLine(act1.TimeCost(1000, false, DianosticsEx.TimeCostPrediction.VeryShort));
                Console.WriteLine(act2.TimeCost(1000, false, DianosticsEx.TimeCostPrediction.VeryShort));
            }


    Monday, May 5, 2014 4:09 AM
  • Thank you for the reply :) I will add the IsBOMWhitespace.

    No - you don't need it - I was just thinking maybe you thought that your method should be faster because you omitted it. 

    Convert between VB, C#, C++, & Java (http://www.tangiblesoftwaresolutions.com)
    Instant C# - VB to C# Converter
    Instant VB - C# to VB Converter

    Monday, May 5, 2014 4:45 AM
    • Have you tried to run the Release version of your project? What I see here is that in the Debug version your "SubstringThenTrim" method is indeed slower, its "cost" is 0.18 while the cost of the Substring+Trim is 0.13.

    The Release version shows different results, 0.12 for Substring + Trims vs. 0.08 for SubstringThenTrim.

    And some notes about your DiagnosticsEx:

    • Writing 1000 method calls is a waste of time, the whole thing should be a loop and then you only need one tc method, not 4 of them. Maybe you think that using a loop adds overhead but anyway the delegate call has higher overhead than the loop itself.
    • You should use Stopwatch instead of Environment.TickCount. TickCount's resolution is 16ms which is too low for performance measurements.
    • Normally you want to ignore the result of the first timing iteration, that result is likely to include time spent JIT-compiling your code.
    • Rather than summing/averaging the time of each iteration you should keep the individual results and either display them all or use standard deviation to reject values that are too different.
    Monday, May 5, 2014 6:19 AM
    Moderator
  • Why have you made FastSubstringThenTrim method unsafe? That could be the culprit. Ususally unsafe methods always run slower than managed methods.

    I hope that helps.


    Please mark this post as answer if it solved your problem. Happy Programming!

    Monday, May 5, 2014 7:12 AM