Fastest way to build large strings.

  • Question

  • Hi all,

    I want to optimize the part of my app that builds strings of integers.

    I made this test example, which creates a string from an array of integers. Each integer occupies a 10-character field in the string, left-aligned and padded with spaces. For example, if the array is:

    index  value

     0   333
     1   22
     2   0
     3   999999


    then the string should be:

    333       22        0         999999

    Each field is 10 characters wide (columns 0 to 9), as this column ruler shows:
    0123456789012345678901234567890123456789

    The array can hold from 1 to 999,999 integers.

    The append function in the example takes 10 seconds on my system this morning. That should be easy to improve.

    Here is the test example. It is designed so that more functions can be added.

    Each result line shows the length of the string, the time to build it, and the first and last three fields (30 characters each) of the string.

    Tell me whether it is a valid test; maybe I made an error. The test can be run multiple times by setting totalTests, but for now one test seems long enough.

    Edit: Latest Test Results V6.



    'build string test v6
    Public Class Form3
        Private Label1 As New Label With {.Parent = Me, .Location = New Point(50, 20), .AutoSize = True}
        Private Label2 As New Label With {.Parent = Me, .Location = New Point(50, 60), .AutoSize = True}
        Private Label3 As New Label With {.Parent = Me, .Location = New Point(50, 100), .AutoSize = True}
        Private Label4 As New Label With {.Parent = Me, .Location = New Point(50, 140), .AutoSize = True}
        Private Label5 As New Label With {.Parent = Me, .Location = New Point(50, 180), .AutoSize = True}
        Private Label6 As New Label With {.Parent = Me, .Location = New Point(50, 220), .AutoSize = True}
        Private WithEvents button1 As New Button With {.Parent = Me, .Location = New Point(100, 260), .Text = "Run Test"}
        Private sw As New Stopwatch
        Private totalTests As Integer = 1
        Private ObjectArray(50000) As Integer '50,001 elements, indexes 0 To 50000, all filled in Form3_Load
    
        Private Sub Form3_Load(sender As Object, e As EventArgs) Handles MyBase.Load
            ClientSize = New Size(500, 300)
    
            'make the object array
            For i As Integer = 0 To 50000
                ObjectArray(i) = i
            Next
    
        End Sub
    
        Private Sub Button1_Click(sender As Object, e As EventArgs) Handles button1.Click
            button1.Enabled = False
            button1.Refresh()
    
            Dim str As String = ""
    
            sw.Reset()
            sw.Start()
    
            'test append method
            For i As Integer = 1 To totalTests
                str = BuildAppendString(ObjectArray)
            Next
            EndTest(Label2, "Append: ", str)
    
            'Cor's StringBuilder method
            For i As Integer = 1 To totalTests
                str = BuildStringCor(ObjectArray)
            Next
            EndTest(Label3, "String Builder Cor: ", str)
    
            'Les's StringBuilder method with PadRight
            For i As Integer = 1 To totalTests
                str = BuildStringLes(ObjectArray)
            Next
            EndTest(Label4, "String Builder Les: ", str)
    
            sw.Stop()
            Label1.Text = "Test Complete"
            button1.Enabled = True
            button1.Refresh()
        End Sub
    
        Private Sub EndTest(theLabel As Label, results As String, thisStr As String)
    
            sw.Stop()
    
            theLabel.Text = results & "  " & thisStr.Length.ToString & "   " &
                        (CInt(sw.ElapsedMilliseconds / (totalTests)).ToString) & " ms" &
                        "  -" & thisStr.Substring(0, 30) &
                        "..." & thisStr.Substring(thisStr.Length - 30, 30) & "-"
    
            theLabel.Refresh()
    
            'start next test
            sw.Reset()
            sw.Start()
    
        End Sub
    
        Private Function BuildAppendString(thisArray() As Integer) As String
            Dim str As String = ""
            Dim lastCount As Integer = 10000
            Label1.Text = "Append Starting..."
            Label1.Refresh()
    
            For i As Integer = 0 To thisArray.Length - 1
                'pad by the length of the value, not of the loop index
                str += thisArray(i).ToString & Space(10 - Len(thisArray(i).ToString))
                If i >= lastCount Then
                    Label1.Text = "Append: " & lastCount.ToString
                    Label1.Refresh()
                    lastCount = i + 10000
                End If
            Next
    
            Return str
    
        End Function
    
        Private Function BuildStringCor(thisArray() As Integer) As String
            Dim str As String = ""
            Dim lastCount As Integer = 10000
            Label1.Text = "String Builder Cor Starting..."
            Label1.Refresh()
    
            Dim sb As New System.Text.StringBuilder
    
            For i As Integer = 0 To thisArray.Length - 1
                If i >= lastCount Then
                    Label1.Text = "Cor String Builder..." & lastCount.ToString
                    Label1.Refresh()
                    lastCount += 10000
                End If
                'pad by the length of the value, not of the loop index
                sb.Append(thisArray(i).ToString & Space(10 - Len(thisArray(i).ToString)))
            Next
            str = sb.ToString
    
            Return str
    
        End Function
    
        Private Function BuildStringLes(thisArray() As Integer) As String
            Dim str As String = ""
            Dim lastCount As Integer = 10000
            Label1.Text = "String Builder Les Starting..."
            Label1.Refresh()
    
            Dim sb As New System.Text.StringBuilder
    
            For i As Integer = 0 To thisArray.Length - 1
                If i >= lastCount Then
                    Label1.Text = "Les String Builder..." & lastCount.ToString
                    Label1.Refresh()
                    lastCount += 10000
                End If
    
                sb.Append(thisArray(i).ToString.PadRight(10))
    
            Next
            str = sb.ToString
    
            Return str
    
        End Function
    End Class



    Friday, August 18, 2017 2:00 PM

All replies

  • Tommy,

    I had just started adding a function to your previous code, so I did this first.

    A string is an immutable object, so when you append to a string, the old string is first copied into a new one and the new text is then added to that copy.

    You do that 50,001 times. The StringBuilder solves this. Here is the changed code. Try it and be surprised.

       Private Sub Form2_Shown(sender As Object, e As EventArgs) Handles Me.Shown
            button1.Enabled = False
            Refresh()
            StartTest()
            Dim sb As New System.Text.StringBuilder
            Dim nextcount As Integer = 0
            For i As Integer = 0 To 50000
                If i > nextcount Then
                    Label1.Text = "Building Big String..." & nextcount.ToString
                    Label1.Refresh()
                    nextcount += 10000
                End If
                sb.Append(i.ToString & Space(10 - Len(i.ToString)))
            Next
            BigString = sb.ToString
            sw.Stop()
            EndTest(Label1, "Build String: " & BigString.Length.ToString & "  " & sw.ElapsedMilliseconds & " ms")
            button1.Enabled = True
        End Sub


    I used your previous project


    Success
    Cor




    • Edited by Cor Ligthert Friday, August 18, 2017 2:33 PM
    • Marked as answer by tommytwotrain Saturday, August 19, 2017 12:19 AM
    Friday, August 18, 2017 2:30 PM
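
Cor's point about immutability can be put in rough numbers. The sketch below is only a cost model, not a benchmark: it assumes that each `str += field` allocates a new string and writes every character of the old string plus the new field, which is how concatenating immutable strings behaves. The module and function names are made up for illustration.

```vb
Imports System

Module ConcatCostSketch
    ' Counts the characters written when fieldCount fields of fieldWidth
    ' characters are joined by repeated String concatenation.
    ' Assumption: every concatenation writes the complete new string.
    Function CharsCopiedByConcat(fieldCount As Integer, fieldWidth As Integer) As Long
        Dim copied As Long = 0
        Dim currentLength As Long = 0
        For i As Integer = 1 To fieldCount
            currentLength += fieldWidth   ' length of the string after this append
            copied += currentLength       ' the whole new string is written out
        Next
        Return copied
    End Function

    Sub Main()
        ' Total characters written by 50,001 concatenations of 10-character fields.
        Console.WriteLine(CharsCopiedByConcat(50001, 10))
        ' Length of the finished string, for comparison.
        Console.WriteLine(50001L * 10L)
    End Sub
End Module
```

Under that model, 50,001 fields of 10 characters cost 12,500,750,010 character writes to produce a string of only 500,010 characters. A StringBuilder avoids most of that work by appending into a buffer that grows as needed.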
  • Tommy,

    With large strings, you're better off using a System.Text.StringBuilder than anything else I'm aware of. It's designed to be a mutable object (a string is immutable), so fast or not, it's the safe choice. ;-)


    "A problem well stated is a problem half solved." - Charles F. Kettering


    • Edited by Frank L. Smith Friday, August 18, 2017 2:33 PM ...added link to MSDN documentation
    Friday, August 18, 2017 2:32 PM
  • Cor,

    Yes, I seem to recall that in the old days it was best to dimension the entire string at its final length, filled with spaces, and then overwrite each 10-character field with the new integer characters. That way the string is created only once at the start and then changed 50,001 times. I was trying to do that with Insert... no, not right... then Replace... still trying.

    PS: I will add yours to the test if I understand it.

    It's nice if you all write a function that I can just add, or even add a function to the example test yourselves. That way I won't mess it up.


    PS: In fact, that is the next challenge: the fastest replace. Or maybe that will be the answer to this one too?
    Friday, August 18, 2017 2:42 PM
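
Tommy's idea of allocating the whole string once and overwriting each field can be sketched with a Char array, since a String itself cannot be changed in place. This is a minimal sketch, not code from the thread: `BuildFixedFields` and `FieldWidth` are hypothetical names, and it assumes non-negative values of at most 10 digits. Whether it is faster than a StringBuilder would have to be measured.

```vb
Imports System

Module FixedFieldBufferSketch
    Const FieldWidth As Integer = 10

    ' Allocates the whole buffer once, pre-filled with spaces, then writes each
    ' value's digits at the start of its own 10-character field.
    ' Assumption: every value is non-negative, so it has at most 10 digits.
    Function BuildFixedFields(values() As Integer) As String
        Dim buffer As Char() = New String(" "c, values.Length * FieldWidth).ToCharArray()
        For i As Integer = 0 To values.Length - 1
            Dim digits As String = values(i).ToString()
            ' Copy the digits over the spaces at this field's offset.
            digits.CopyTo(0, buffer, i * FieldWidth, digits.Length)
        Next
        ' The only String allocation of the final size happens here.
        Return New String(buffer)
    End Function

    Sub Main()
        ' Brackets make the trailing padding of the last field visible.
        Console.WriteLine("[" & BuildFixedFields(New Integer() {333, 22, 0, 999999}) & "]")
    End Sub
End Module
```

For the four-value example from the question, the result is 40 characters long, with each value left-aligned in its field.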
  • PS Cor,

    Wow, all right, I added your StringBuilder function: 15 ms to build the string vs. 9000 ms. Nice. See the updated original test above.

    Friday, August 18, 2017 3:14 PM
  • Hi

    A bit late to the party, but in common with the other posts, I use a StringBuilder. Over 1000 tests of 50,000 strings, the average here was 9 ms per test; over 10 tests of 1,000,000 strings, the average was 205 ms per test.

    ' Form1 with TextBox1 (multiline)
    ' and Button1
    Option Strict On
    Option Explicit On
    Public Class Form1
        Private sw As New Stopwatch
        Private totalTests As Integer = 100
        Private ObjectArray(50000) As Integer
        Private HoldTimes As New List(Of Long)
        Private Sub Form1_Load(sender As Object, e As EventArgs) Handles MyBase.Load
            For i As Integer = 0 To ObjectArray.Count - 1
                ObjectArray(i) = i
            Next
        End Sub
    
        Private Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
            Button1.Enabled = False
            Dim BigString As String
            HoldTimes.Clear() 'start each run empty so HoldTimes(i - 1) is this run's time
            For i As Integer = 1 To totalTests
                BigString = BuildAppendString(ObjectArray)
                TextBox1.AppendText("Test " & i.ToString & "  Elapsed Time = " & HoldTimes(i - 1).ToString("#") & vbCrLf)
            Next
            TextBox1.AppendText("Test Complete:  Average: " & HoldTimes.Average.ToString("#") & " ms  for  " & (ObjectArray.Count - 1).ToString & " strings" & vbCrLf & vbCrLf)
            Button1.Enabled = True
        End Sub
        Private Function BuildAppendString(thisArray() As Integer) As String
            Dim sb As New Text.StringBuilder
            sw.Restart()
            For i As Integer = 0 To thisArray.Count - 1
                sb.Append(thisArray(i).ToString.PadRight(10))
            Next
            HoldTimes.Add(sw.ElapsedMilliseconds)
            Return sb.ToString
        End Function
    End Class


    Regards Les, Livingston, Scotland




    • Edited by leshay Friday, August 18, 2017 3:34 PM
    • Marked as answer by tommytwotrain Saturday, August 19, 2017 12:19 AM
    Friday, August 18, 2017 3:29 PM
  • Thanks Les,

    I added your example above to see if pad right was different.

    Friday, August 18, 2017 3:56 PM
  • Tommy wrote: "I added your example above to see if PadRight was different."

    Hi

    Using the example with 10 tests of 1,000,000 strings, PadRight appears to save approximately 30% in my trials.


    Regards Les, Livingston, Scotland


    • Edited by leshay Friday, August 18, 2017 4:08 PM change 100 to 10
    Friday, August 18, 2017 4:06 PM
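
One further variation on the two padding styles compared here, offered only as a sketch with no timing claim attached: `StringBuilder.Append(Char, Integer)` can add the padding spaces directly, so no padded intermediate string is created per field, and passing the final size to the StringBuilder constructor avoids regrowing its buffer. The function names are hypothetical, and the code assumes each value's text is at most 10 characters long.

```vb
Imports System
Imports System.Text

Module PaddedBuilderSketch
    Const FieldWidth As Integer = 10

    ' The PadRight approach from the thread, with the capacity set up front.
    Function BuildWithPadRight(values() As Integer) As String
        Dim sb As New StringBuilder(values.Length * FieldWidth)
        For Each value As Integer In values
            sb.Append(value.ToString().PadRight(FieldWidth))
        Next
        Return sb.ToString()
    End Function

    ' Appends the number, then tops the field up with spaces.
    ' Assumption: the number's text never exceeds FieldWidth characters,
    ' otherwise the repeat count would be negative and Append would throw.
    Function BuildWithRepeatedSpace(values() As Integer) As String
        Dim sb As New StringBuilder(values.Length * FieldWidth)
        For Each value As Integer In values
            Dim startLength As Integer = sb.Length
            sb.Append(value)
            sb.Append(" "c, FieldWidth - (sb.Length - startLength))
        Next
        Return sb.ToString()
    End Function

    Sub Main()
        Dim sample() As Integer = {333, 22, 0, 999999}
        ' Both functions should produce the same fixed-width layout.
        Console.WriteLine(BuildWithPadRight(sample) = BuildWithRepeatedSpace(sample))
    End Sub
End Module
```

Both functions are meant to produce identical output, so either could be dropped into the test form above as another method to time.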
  • Hi

    Here is a version with a CheckBox for choosing between PadRight and Space(10 - Len(...)).

    ' Form1 with TextBox1 (multiline)
    ' Button1 and CheckBox1
    Option Strict On
    Option Explicit On
    Public Class Form1
        Private sw As New Stopwatch
        Private totalTests As Integer = 10
        Private ObjectArray(1000000) As Integer
        Private HoldTimes As New List(Of Long)
        Private Sub Form1_Load(sender As Object, e As EventArgs) Handles MyBase.Load
            For i As Integer = 0 To ObjectArray.Count - 1
                ObjectArray(i) = i
            Next
        End Sub
    
        Private Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
            Button1.Enabled = False
            CheckBox1.Enabled = False
            Dim BigString As String
            HoldTimes.Clear()
    
            For i As Integer = 1 To totalTests
                BigString = BuildAppendString(ObjectArray, CheckBox1.Checked)
                TextBox1.AppendText("Test " & i.ToString & "  Elapsed Time = " & HoldTimes(i - 1).ToString("#") & vbCrLf)
            Next
            Dim pd As String = " (Using Space(10 - Len(i.ToString))"
            If CheckBox1.Checked Then pd = " (Using PadRight)"
            TextBox1.AppendText("Test Complete:  Average: " & HoldTimes.Average.ToString("#") & " ms  for  " & (ObjectArray.Count - 1).ToString & " strings" & vbCrLf & pd & vbCrLf & vbCrLf)
            Button1.Enabled = True
            CheckBox1.Enabled = True
        End Sub
        Private Function BuildAppendString(thisArray() As Integer, testPAD As Boolean) As String
            Dim sb As New Text.StringBuilder
            sw.Restart()
            If testPAD Then
                For i As Integer = 0 To thisArray.Count - 1
                    sb.Append(thisArray(i).ToString.PadRight(10))
                Next
            Else
                For i As Integer = 0 To thisArray.Count - 1
                    sb.Append(thisArray(i).ToString & Space(10 - Len(thisArray(i).ToString)))
                Next
            End If
            HoldTimes.Add(sw.ElapsedMilliseconds)
            Return sb.ToString
        End Function
    End Class


    Regards Les, Livingston, Scotland

    Friday, August 18, 2017 4:24 PM
  • Les wrote: "Using the 10 test 1000000 string example, it would appear that using PadRight saves approx 30% in my trials."

    Les,

    Oh yes, I see it now. I took out the slow append and ran a 10-test sample: PadRight is 30 percent or more faster than the 10 - Len way. It does vary from test to test.

    Edit: V7, with the control refreshes removed from the timing loop.

    This test builds the string 10 times with each method and reports the average. The results still vary by several ms from run to run.

    'build string test v7
    Public Class Form3
        Private Label1 As New Label With {.Parent = Me, .Location = New Point(50, 20), .AutoSize = True}
        Private Label2 As New Label With {.Parent = Me, .Location = New Point(50, 60), .AutoSize = True}
        Private Label3 As New Label With {.Parent = Me, .Location = New Point(50, 100), .AutoSize = True}
        'Private Label4 As New Label With {.Parent = Me, .Location = New Point(50, 140), .AutoSize = True}
        'Private Label5 As New Label With {.Parent = Me, .Location = New Point(50, 180), .AutoSize = True}
        'Private Label6 As New Label With {.Parent = Me, .Location = New Point(50, 220), .AutoSize = True}
        Private WithEvents button1 As New Button With {.Parent = Me, .Location = New Point(100, 260), .Text = "Run Test"}
        Private sw As New Stopwatch
        Private totalTests As Integer = 1
        Private ObjectArray(50000) As Integer '50,001 elements, indexes 0 To 50000, all filled in Form3_Load
    
        Private Sub Form3_Load(sender As Object, e As EventArgs) Handles MyBase.Load
            ClientSize = New Size(500, 300)
    
            'make the object array
            For i As Integer = 0 To 50000
                ObjectArray(i) = i
            Next
    
        End Sub
    
        Private Sub Button1_Click(sender As Object, e As EventArgs) Handles button1.Click
            button1.Enabled = False
            button1.Refresh()
    
            Dim str As String = ""
            Label1.Text = "Starting Test..."
            Label1.Refresh()
    
            sw.Reset()
            sw.Start()
    
            totalTests = 10
    
            '10-Len StringBuilder method
            For i As Integer = 1 To totalTests
                str = BuildStringSbLen(ObjectArray)
            Next
            EndTest(Label2, "String Builder Len: ", str)
    
            'pad right string builder method
            For i As Integer = 1 To totalTests
                str = BuildStringSbPad(ObjectArray)
            Next
            EndTest(Label3, "String Builder Pad: ", str)
    
            sw.Stop()
            Label1.Text = "Test Complete"
            button1.Enabled = True
            button1.Refresh()
        End Sub
    
        Private Sub EndTest(theLabel As Label, results As String, thisStr As String)
    
            sw.Stop()
    
            theLabel.Text = results & "  " & thisStr.Length.ToString & "   " &
                        (CInt(sw.ElapsedMilliseconds / (totalTests)).ToString) & " ms" &
                        "  -" & thisStr.Substring(0, 30) &
                        "..." & thisStr.Substring(thisStr.Length - 30, 30) & "-"
    
            theLabel.Refresh()
    
            'start next test
            sw.Reset()
            sw.Start()
    
        End Sub
    
        Private Function BuildStringSbLen(thisArray() As Integer) As String
            Dim str As String = ""
            Dim sb As New System.Text.StringBuilder
    
            For i As Integer = 0 To thisArray.Length - 1
                'pad by the length of the value, not of the loop index
                sb.Append(thisArray(i).ToString & Space(10 - Len(thisArray(i).ToString)))
            Next
            str = sb.ToString
    
            Return str
    
        End Function
    
        Private Function BuildStringSbPad(thisArray() As Integer) As String
            Dim str As String = ""
            Dim sb As New System.Text.StringBuilder
    
            For i As Integer = 0 To thisArray.Length - 1
                sb.Append(thisArray(i).ToString.PadRight(10))
            Next
            str = sb.ToString
    
            Return str
    
        End Function
    End Class


    Friday, August 18, 2017 4:30 PM
  • Hi Tommy, 

    That is not fair; I was just showing the difference between StringBuilder and String using your old code.

    You, as a graphics guy, should know that this part of what I showed is very inefficient:

                    Label1.Text = "Building Big String..." & nextcount.ToString
                    Label1.Refresh()

    But my goal was not to show the best performance, just the difference between appending to strings and using a StringBuilder.


    Success
    Cor

    Friday, August 18, 2017 4:35 PM
  • Cor wrote: "That is not fair; I was just showing the difference between StringBuilder and String using your old code."

    LOL. Yes, that is right, Cor.

    You had my old 10 - Len code dragging along.

    Friday, August 18, 2017 4:40 PM
  • Cor wrote: "You, as a graphics guy, should know that this part of what I showed is very inefficient:

                    Label1.Text = "Building Big String..." & nextcount.ToString
                    Label1.Refresh()

    But my goal was not to show the best performance, just the difference between appending to strings and using a StringBuilder."

    PS. Yes. Good point.

    But it's the only way to update the display in real time while the slow routine was taking seconds. It's the same in all tests and only adds a delay once in every 10,000 loops, so I figure it's not bad?

    PS: Feel free to update the example; that's one reason I post it. We can all use the latest.

    Plus, I miss errors, etc.

    PS again: Thinking about it now, those updates should be removed for the tests that take only a few milliseconds; they would account for a lot of the variation between runs.

    Friday, August 18, 2017 4:45 PM

  • Tommy wrote: "But it's the only way to update the display in real time while the slow routine was taking seconds. It's the same in all tests and only adds a delay once in every 10,000 loops, so I figure it's not bad?"

    Yes, but from a quick look, Len is used once per pass through the loop, so 10 times over, so it is done 100,000 times in the code I showed, and that is by far the slowest part of that code.

    So you should not say that my way is slower.

    :-)

    Not that I care much, by the way.


    Success
    Cor

    Friday, August 18, 2017 4:51 PM

  • Cor wrote: "Yes, but from a quick look, Len is used once per pass through the loop [...] and that is by far the slowest part of that code. So you should not say that my way is slower. :-)"

    Yes, you are right. The control updates etc. should be removed from the loop at such small millisecond intervals. They could be making the times vary by, what, 50 percent at 15 ms intervals?

    PS: Just for info, I took the refreshes out, but the results are still too varied to see any difference, i.e., there are 3 ms differences between tests, just from the system?

    PS: I posted an updated test, v7, a few posts up.

    Friday, August 18, 2017 4:56 PM
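
The run-to-run variation discussed above is common when the timed work takes only a few milliseconds. Below is a hedged sketch of one way to reduce the noise: keep all UI updates out of the timed region, make one untimed warm-up call, read `Stopwatch.Elapsed.TotalMilliseconds` to get fractional milliseconds, and report the median of several runs instead of a single figure. All names are hypothetical, it runs as a console program rather than in the test form, and no particular timings are implied.

```vb
Imports System
Imports System.Diagnostics
Imports System.Text

Module TimingSketch
    ' Times a single call of action in fractional milliseconds.
    Function TimeOnce(action As Action) As Double
        Dim sw As Stopwatch = Stopwatch.StartNew()
        action()
        sw.Stop()
        Return sw.Elapsed.TotalMilliseconds
    End Function

    ' Runs action once untimed (warm-up), then returns "runs" timed samples.
    Function Measure(action As Action, runs As Integer) As Double()
        action()
        Dim samples(runs - 1) As Double
        For i As Integer = 0 To runs - 1
            samples(i) = TimeOnce(action)
        Next
        Return samples
    End Function

    ' Median of a non-empty sample array; the input array is left unchanged.
    Function Median(samples() As Double) As Double
        Dim sorted() As Double = CType(samples.Clone(), Double())
        Array.Sort(sorted)
        Dim middle As Integer = sorted.Length \ 2
        If sorted.Length Mod 2 = 1 Then Return sorted(middle)
        Return (sorted(middle - 1) + sorted(middle)) / 2.0
    End Function

    ' The PadRight builder from the thread, used here as the code under test.
    Function BuildPadded(values() As Integer) As String
        Dim sb As New StringBuilder(values.Length * 10)
        For Each value As Integer In values
            sb.Append(value.ToString().PadRight(10))
        Next
        Return sb.ToString()
    End Function

    Sub Main()
        Dim values(50000) As Integer
        For i As Integer = 0 To values.Length - 1
            values(i) = i
        Next
        Dim samples() As Double = Measure(Sub() BuildPadded(values), 10)
        Console.WriteLine("median ms: " & Median(samples).ToString("F3"))
    End Sub
End Module
```

The median is less sensitive than the average to a single slow run, for example one interrupted by a garbage collection or by other processes on the system.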