locked
Measure the performance of a LINQ sort RRS feed

  • Question

  •  

    In this thread:

     

    http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=2434432&SiteID=1

     

    there is a discussion about the performance of various methods of sorting (order by descending) an array.  I thought that using LINQ might be the fastest method but either it is much slower than using Array.Sort followed by Array.Reverse or I don't understand how to fairly measure LINQ performance for this task.  (I am sure it is the latter!)

     

    So, how can you compare the speed of a LINQ sort vs using the Array methods?

    Tuesday, November 20, 2007 9:05 PM

Answers

  • LINQ's OrderBy operator is about five times slower than Array.Sort with large sequences. Here's how you can compare the two:

        var data = new byte [1000000];
        System.Security.Cryptography.RandomNumberGenerator.Create ().GetBytes (data);

        var sw = Stopwatch.StartNew ();
        data.OrderBy (b => b).ToArray ();   // execute the query into an array
        sw.Stop ();
        Console.WriteLine (sw.Elapsed);

        sw = Stopwatch.StartNew ();
        Array.Sort (data);
        sw.Stop ();
        Console.WriteLine (sw.Elapsed);   // about five times faster

    In the first (LINQ) query, I've called ToArray() after OrderBy for two reasons: (1) to get the result back into an array (so we're on par with Array.Sort) - and (2) to force the query to execute.

    It's easy to attribute the difference in performance to the creation of this additional array. (In fact, it's worse than that: LINQ's OrderBy method has to create two more intermediate arrays into which to sort and store keys). Interestingly, though, the creation of the three extra arrays doesn't account for the increased execution times: creating and copying arrays is actually quite fast compared to performing the sort.  The reason for OrderBy being relatively slow is due to the extra level of indirection that's required to compare two elements. I know - because I reimplemented all the query operators from scratch in LINQBridge! The difficulty in writing OrderBy is that you have to invoke a delegate to obtain each sorting key - and your logic must also allow for a subsequent ThenBy operator. This increases the time for every comparison, which is where the problem lies.

    I guess it might be possible to optimize the special case of an identity lambda with no ThenBy operator.  But then again, the inefficiency might not matter in most real life scenarios.  You can still sort a million elements in under a second!

    Joe

    Wednesday, November 21, 2007 9:22 AM

All replies

  • LINQ's OrderBy operator is about five times slower than Array.Sort with large sequences. Here's how you can compare the two:

        var data = new byte [1000000];
        System.Security.Cryptography.RandomNumberGenerator.Create ().GetBytes (data);

        var sw = Stopwatch.StartNew ();
        data.OrderBy (b => b).ToArray ();   // execute the query into an array
        sw.Stop ();
        Console.WriteLine (sw.Elapsed);

        sw = Stopwatch.StartNew ();
        Array.Sort (data);
        sw.Stop ();
        Console.WriteLine (sw.Elapsed);   // about five times faster

    In the first (LINQ) query, I've called ToArray() after OrderBy for two reasons: (1) to get the result back into an array (so we're on par with Array.Sort) - and (2) to force the query to execute.

    It's easy to attribute the difference in performance to the creation of this additional array. (In fact, it's worse than that: LINQ's OrderBy method has to create two more intermediate arrays into which to sort and store keys). Interestingly, though, the creation of the three extra arrays doesn't account for the increased execution times: creating and copying arrays is actually quite fast compared to performing the sort.  The reason for OrderBy being relatively slow is due to the extra level of indirection that's required to compare two elements. I know - because I reimplemented all the query operators from scratch in LINQBridge! The difficulty in writing OrderBy is that you have to invoke a delegate to obtain each sorting key - and your logic must also allow for a subsequent ThenBy operator. This increases the time for every comparison, which is where the problem lies.

    I guess it might be possible to optimize the special case of an identity lambda with no ThenBy operator.  But then again, the inefficiency might not matter in most real life scenarios.  You can still sort a million elements in under a second!

    Joe

    Wednesday, November 21, 2007 9:22 AM
  • Thank you for the detailed analysis. 

    Wednesday, November 21, 2007 12:30 PM