Huge performance decrease (40 times) when using the IList<> interface of an array (int[])

  • Question

  • There is a huge performance difference between the first and second runs of the TestAddArray method.
    Both IList<int>.Count and the IList<int>.Item[int index] indexer slow down dramatically after I have called AddValues(IList<int> list) with a List<int> as the argument.
    It seems the JIT has somehow recompiled the code and made it perform horribly for the int[]: the second run is 40 times slower. It does not matter whether I use .NET 2.0 or 3.5.

    Can anybody explain this difference, and is there a way around it that still lets me use the array's IList<> interface?

    You can test this with the simple console app test code provided below.



    First run with Array: 29 ms.
    First run with IList: 7 ms.
    Second run with Array: 1160 ms. (40 times slower than the first run)

    using System;  
    using System.Collections.Generic;  
    using System.Diagnostics;  
     
    namespace TestAdd4  
      {  
      class Program  
        {  
        public static int AddValues(IList<int> list)  
          {  
          int add = 0;  
          for (int i = 0; i < list.Count; i++)  
            {  
            add += list[i];  
            }  
            
          return add;  
          }  
     
        static void TestAddArray(int[] items)  
          {  
          Stopwatch timer = new Stopwatch();  
          timer.Start();  
          int val = AddValues(items);  
          timer.Stop();  
     
          Console.WriteLine("Array: {0} ms. Value = {1}", timer.ElapsedMilliseconds, val);  
          }  
     
     
        static void TestAddIList(int[] items)  
          {  
          List<int> cl = new List<int>(items);  
          Stopwatch timer = new Stopwatch();  
          timer.Start();  
          int val = AddValues(cl);  
          timer.Stop();  
     
          Console.WriteLine("IList: {0} ms. Value = {1}", timer.ElapsedMilliseconds, val);  
          }  
     
        static void Main(string[] args)  
          {  
          int[] items = new int[500000];  
          for (int i = 0; i < items.Length; i++)  
            items[i] = (int) i - (items.Length / 2);  
     
          TestAddArray(items);  
          TestAddIList(items);  
          TestAddArray(items);  
            
          Console.WriteLine("Ready!");  
          Console.ReadKey();  
          }  
        }  
      }  
     

    Programmer
    • Edited by Dutch Idiot Friday, June 20, 2008 11:07 AM Forgot question
    Friday, June 20, 2008 11:05 AM

Answers

  • Microsoft has found the cause of this problem. It is a bug in .NET:

    http://blogs.msdn.com/kirillosenkov/archive/2008/07/13/a-curious-subtlety-about-how-clr-does-interface-dispatch-on-array-types.aspx

    To react to earlier posts: I don't agree that this is something you shouldn't do. Using the IList<> interface abstracts away the use of an array, so if you want to plug in another implementation (instead of an array), the consuming code doesn't have to change. In our case we can have implementations backed by a memory-mapped file, a database, or an array, so it makes good sense to hide the actual implementation behind an interface. I see a lot of people using int[] and the like directly in code, but in some cases that is actually a bad design choice, in my opinion.

    Regards,

    Robert Kieboom
    Monday, July 14, 2008 7:10 PM
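    The abstraction argument above can be sketched as follows (the Sum method and class name are illustrative, not from the thread): the consumer depends only on IList<int>, so an array, a List<int>, or a custom memory-mapped or database-backed implementation can be plugged in without changing it.

```csharp
using System;
using System.Collections.Generic;

class AbstractionDemo
{
    // Consumer code written once against the interface.
    static int Sum(IList<int> values)
    {
        int total = 0;
        for (int i = 0; i < values.Count; i++)
            total += values[i];
        return total;
    }

    static void Main()
    {
        int[] array = { 1, 2, 3 };
        List<int> list = new List<int> { 1, 2, 3 };
        // Both work unchanged; any other IList<int> implementation
        // could be swapped in the same way.
        Console.WriteLine(Sum(array)); // 6
        Console.WriteLine(Sum(list));  // 6
    }
}
```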

All replies

  • Really odd: replacing the for loop with a foreach makes it run fast for all three calls, but with the for loop it behaves exactly as you described.

    First I thought there could be a GC happening at some point, but that's not the case either. The third call runs consistently slower, and I tried array sizes of 5000, 50000, 500000, and 5000000 (with the last, a GC happened after the first call, and I just aborted the third call).

    This was an interesting problem.

    Then I thought that perhaps this was a CPU cache problem, but if I add another call to the array method, so that the pattern is array, list, array, array, the last two both run slow. If this were a CPU cache problem, the second of those should run fast again.

    Could there be some kind of pagefile problem related to this? I wouldn't think so: when I ran the array test with 5000 elements, 20 KB shouldn't make the program page out.

    http://presentationmode.blogspot.com/
    Please remember to mark the replies as answers if they help and unmark them if they provide no help.
    Friday, June 20, 2008 11:52 AM
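    For reference, the foreach variant described above might look like this (a sketch; AddValuesForeach is my name for the modified method, not one from the thread). foreach obtains the enumerator once, instead of making an interface-dispatched Count and indexer call on every iteration:

```csharp
using System;
using System.Collections.Generic;

class ForeachDemo
{
    // Same summing logic as AddValues, but iterating with foreach
    // rather than list.Count and list[i].
    public static int AddValuesForeach(IList<int> list)
    {
        int add = 0;
        foreach (int item in list)
            add += item;
        return add;
    }

    static void Main()
    {
        int[] items = { 1, 2, 3, 4 };
        // Works both through the array's IList<int> view and a List<int>.
        Console.WriteLine(AddValuesForeach(items));                // 10
        Console.WriteLine(AddValuesForeach(new List<int>(items))); // 10
    }
}
```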
  • Also, if the pattern is array, array, list, array, array, the results are fast, fast, fast, slow, slow.

    http://presentationmode.blogspot.com/
    Friday, June 20, 2008 11:53 AM
  • Even stranger.

    When I have 50000 elements in the list, and run the following pattern, I get the following result:

    array: 3 ms
    ilist: 0 ms
    array: 201 ms

    If I add more calls to the array method, they all fluctuate around 200 ms.

    However, if I call the list first, I get this result:

    ilist: 0 ms
    array: 605 ms
    ilist: 0 ms
    array: 577 ms

    This even ran slower than when I just went straight for the array to begin with.

    This is the strangest behavior I've seen in a long time. I sure hope someone can explain it; it looks like something I need to know about and watch out for.

    http://presentationmode.blogspot.com/
    Friday, June 20, 2008 11:56 AM
  • An int[] is not readily convertible to an IList<int>. In TestAddIList(), you help the runtime by doing the conversion explicitly. It uses the List<T>(IEnumerable<T>) constructor, which efficiently copies the array. In TestAddArray(), the runtime must adapt the array dynamically. It uses the hidden System.SZArrayHelper class to do this: it doesn't convert the array wholesale, it acts as a shim that maps the IList<>.Item and IList<>.Count members to the equivalent array operations. That's expensive, and some optimizations are no longer possible; for one, it must check that the index is within bounds every time you index the IList<>.

    I have no explanation for why the first call is fast. Something is getting cached the first time but not the second time. The implementation of generics in the JIT compiler is a complicated beast. All of this is fairly academic, though: converting an array to an IList just isn't a very good idea.

    Hans Passant.
    Friday, June 20, 2008 12:16 PM
    Moderator
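    The dispatch difference described above can be illustrated by summing the same array directly and through its IList<int> view. This is a minimal sketch (the class name is mine, and the timings will vary by machine and CLR version); in the direct loop the JIT can typically eliminate per-element bounds checks, while every Count and indexer access on the interface goes through the array shim:

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

class DispatchDemo
{
    static void Main()
    {
        int[] items = new int[500000];
        for (int i = 0; i < items.Length; i++)
            items[i] = i;

        // Direct access: this loop shape lets the JIT hoist the
        // length check and skip per-element bounds checks.
        var sw = Stopwatch.StartNew();
        long direct = 0;
        for (int i = 0; i < items.Length; i++)
            direct += items[i];
        sw.Stop();
        Console.WriteLine("direct: {0} ms", sw.ElapsedMilliseconds);

        // Through the interface: each list.Count and list[i] is an
        // interface-dispatched call into the runtime's array shim.
        IList<int> list = items;
        sw = Stopwatch.StartNew();
        long viaInterface = 0;
        for (int i = 0; i < list.Count; i++)
            viaInterface += list[i];
        sw.Stop();
        Console.WriteLine("interface: {0} ms", sw.ElapsedMilliseconds);

        Console.WriteLine(direct == viaInterface); // True: same sum either way
    }
}
```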
  • That SZArrayHelper class was news to me.

    I've learned two things today, I should've saved one of them for tomorrow, now I have to find something to learn tomorrow as well, or it's wasted. Oh well.

    Thanks for the info, though I'd still like to know what specifically made the difference, at least now I know there is some magic here that I didn't know about.

    I tried duplicating the array method and calling first one, then the other, just so that I could get dotTrace to time them separately (it combines method calls from the same method into one entry with multiple calls and sums their time), but it didn't reveal anything different.

    The actual code being executed doesn't seem to change either.

    http://presentationmode.blogspot.com/
    Friday, June 20, 2008 12:39 PM
  • nobugz said:

    An int[] is not readily convertible to an IList<int>. In TestAddIList(), you help the runtime by doing the conversion explicitly. It uses the List<T>(IEnumerable<T>) constructor, which efficiently copies the array. In TestAddArray(), the runtime must adapt the array dynamically. It uses the hidden System.SZArrayHelper class to do this: it doesn't convert the array wholesale, it acts as a shim that maps the IList<>.Item and IList<>.Count members to the equivalent array operations. That's expensive, and some optimizations are no longer possible; for one, it must check that the index is within bounds every time you index the IList<>.

    I have no explanation for why the first call is fast. Something is getting cached the first time but not the second time. The implementation of generics in the JIT compiler is a complicated beast. All of this is fairly academic, though: converting an array to an IList just isn't a very good idea.


    Hans Passant.



    Everybody uses these kinds of constructions everywhere; that is why they exist. And I don't see any reason for a generated wrapper around an array to be 150 times slower than the List<int> class when using indexers. I think there is a little glitch in the JIT compiler and I hope the Microsoft team can look into it. This is something that will be used by thousands of programmers. I understand it won't be as fast as using the array directly, but it should be near the performance of the List<> class.

    Also, I would like to know what causes it so I can look out for it in other parts of my code. Maybe some of my other code has the same behavior and will suddenly perform a lot worse than when I originally tested its performance in a separate piece of code.
    Programmer
    Friday, June 20, 2008 12:57 PM
  • The Rotor source code provides sort of a hint. Check out clr\src\bcl\system\internal.cs, CommonlyUsedGenericInstantiations_HACK(). It smells like the runtime can use the NGen-ed copy of SZArrayHelper<int> the first time but, somehow, not on subsequent calls. I didn't check whether it JITs a new instance for every call; that would be a runaway memory problem. Meh, something to do over the weekend.
    Hans Passant.
    • Marked as answer by Bruno Yu Thursday, June 26, 2008 1:44 AM
    • Unmarked as answer by nobugzModerator Monday, July 14, 2008 7:35 PM
    Friday, June 20, 2008 1:03 PM
    Moderator
  • Dutch Idiot said:

    I think there is a little glitch in the JIT compiler and I hope the Microsoft team can look into it.

    Nobody from the Microsoft CLR and JIT teams looks at the threads here; there isn't but a skeleton crew left anyway. You should post to the Connect web site to get their attention. Don't expect a change any time soon.

    Hans Passant.
    • Marked as answer by Bruno Yu Thursday, June 26, 2008 1:44 AM
    • Unmarked as answer by nobugzModerator Monday, July 14, 2008 7:35 PM
    Friday, June 20, 2008 1:09 PM
    Moderator
  • nobugz said:

    An int[] is not readily convertible to an IList<int>. In TestAddIList(), you help the runtime by doing the conversion explicitly.



    Note that your conversion is outside your timing block. How does it compare if you put the conversion inside your timing block?
    Curt - http://www.codeneverwritten.com/
    Friday, June 20, 2008 6:12 PM
  • Curt Nichols said:

    nobugz said:

    An int[] is not readily convertible to an IList<int>. In TestAddIList(), you help the runtime by doing the conversion explicitly.



    Note that your conversion is outside your timing block. How does it compare if you put the conversion inside your timing block?
    Curt - http://www.codeneverwritten.com/



    But that could only make the performance decrease even more. It definitely won't get any better, so I don't think it really matters. The point is that .Count and the Item[int index] indexer suddenly perform 40 times slower than the first time. That is the main issue. I cannot really trust any of my performance tests at the moment, because when running in the full application the performance could suddenly drop.
    Programmer
    Friday, June 20, 2008 9:59 PM
  • nobugz said:

    Dutch Idiot said:

    I think there is a little glitch in the JIT compiler and I hope the Microsoft team can look into it.

    Nobody from the Microsoft CLR and JIT teams looks at the threads here; there isn't but a skeleton crew left anyway. You should post to the Connect web site to get their attention. Don't expect a change any time soon.

    Hans Passant.



    Thanks, I'll have a look at it.
    Programmer
    Friday, June 20, 2008 9:59 PM
  • Dutch Idiot said:

    But that could only make the performance decrease even more.

    No, it wouldn't; you were complaining about the direct int[]-to-IList<> conversion. Curt is right: you should put the explicit conversion inside the timer to compare apples to apples. I did; the List<>(IEnumerable<>) constructor is fast, and it doesn't change the observation.

    I still think it is something I'll never do in my own code; I don't see any reason to. Maybe the real bug is that they allowed it at all. It certainly surprised me that the code even compiled.

    Hans Passant.
    • Marked as answer by Bruno Yu Thursday, June 26, 2008 1:43 AM
    • Unmarked as answer by nobugzModerator Monday, July 14, 2008 7:36 PM
    Friday, June 20, 2008 11:36 PM
    Moderator
  • nobugz said:

    Dutch Idiot said:

    But that could only make the performance decrease even more.

    No, it wouldn't; you were complaining about the direct int[]-to-IList<> conversion. Curt is right: you should put the explicit conversion inside the timer to compare apples to apples. I did; the List<>(IEnumerable<>) constructor is fast, and it doesn't change the observation.

    I still think it is something I'll never do in my own code; I don't see any reason to. Maybe the real bug is that they allowed it at all. It certainly surprised me that the code even compiled.

    Hans Passant.



    Well, whether anybody uses this code or not is not really important.
    The most important thing in all this is that two runs of exactly the same code with the same input show a huge performance difference (40 times).
    When can that happen?
    How can I do performance tests and optimize my code if performance can suddenly decrease 40 times?
    That, to me, is the most important issue. I now know to avoid using this construction, but what I don't know is whether I can expect this sudden performance decrease anywhere else in my code.
    I want to know what is happening here so I can avoid pitfalls like this.
    Programmer
    Saturday, June 21, 2008 12:00 AM
  • Alright, now it is just getting ridiculous. When I use an array of 99 elements for the List<T>, the array code performs just fine.

    The first (fast) run of TestAddArray takes 60 ms.

    When I use an array of fewer than 100 elements (say 99) for the List<T> in between, TestAddArray is just as fast as the first run.
    When I use an array of exactly 100 elements for the List<T> in between, TestAddArray takes 1.2 seconds (20 times slower).
    When I use an array of more than 100 elements (say 101) for the List<T>, TestAddArray takes 2.4 seconds (40 times slower).

    So we've got a magic number, 100, which is the turning point where all performance goes down the drain.

    By the way, I've heard from some of my other sources that it does not happen when compiling for the x64 platform.

    Also, this has come up on another forum before:
    http://blogs.msdn.com/ricom/archive/2006/03/12/549987.aspx


    The rest of the code is the same, but I've changed Main:

    static void Main(string[] args)  
          {  
          int[] items = new int[1000000];  
          for (int i = 0; i < items.Length; i++)  
            items[i] = (int) i - (items.Length / 2);  
     
          int[] items2 = new int[100];  
          for (int i = 0; i < items2.Length; i++)  
            items2[i] = (int)i - (items2.Length / 2);  
     
          TestAddArray(items);  
          TestAddIList(items2);  
          TestAddArray(items);  
            
          Console.WriteLine("Ready!");  
          Console.ReadKey();  
          } 

    Programmer
    Tuesday, June 24, 2008 9:29 AM
  • Running the List<int> test twice with 50 elements causes the same performance decrease as one run with 100 elements. So it seems to keep some statistics or something, and once it has reached the magic number of 100, performance suddenly decreases, maybe caused by some recompilation of the code.


    Programmer
    • Edited by Dutch Idiot Tuesday, June 24, 2008 9:43 AM addition
    Tuesday, June 24, 2008 9:41 AM
  • Microsoft has found the cause of this problem. It is a bug in .NET:

    http://blogs.msdn.com/kirillosenkov/archive/2008/07/13/a-curious-subtlety-about-how-clr-does-interface-dispatch-on-array-types.aspx

    To react to earlier posts: I don't agree that this is something you shouldn't do. Using the IList<> interface abstracts away the use of an array, so if you want to plug in another implementation (instead of an array), the consuming code doesn't have to change. In our case we can have implementations backed by a memory-mapped file, a database, or an array, so it makes good sense to hide the actual implementation behind an interface. I see a lot of people using int[] and the like directly in code, but in some cases that is actually a bad design choice, in my opinion.

    Regards,

    Robert Kieboom
    Monday, July 14, 2008 7:10 PM