none
IList<double> parameter much slower than double[] parameter when function does for looping RRS feed

  • Question

  • My code needs to iterate through a large amount of data and calculate some values.  The number of data elements could be billions and the way it's stored is not yet decided, but will most likely have to be read in clusters from a file.  The data will be presented to a function to do some processing.  I would like the function to have an interface (IList<>) parameter, but I am noticing slowdowns when using the interface parameter. 

    I created some code to measure the times needed with different argument types and for or foreach looping.  The test functions just look for the max value, so they do an index lookup.  Here are the times I came up with:


    (Output - release build - start without debug - VS2005 Console Application)

    Start Test.
    =============================
    Running tests with 1000000 points
    For_DoubleArr(double[] a) with double[] = 2.8ms
    For_IListOfDouble(IList<double> a) with double[] = 124.0ms
    For_ListOfDouble(List<double> a) with List<double> = 5.1ms
    For_IListOfDouble(IList<double> a) with List<double> = 28.5ms
    -------------
    Foreach_DoubleArr(double[] a) with double[] = 3.0ms
    Foreach_IListOfDouble(IList<double> a) with double[] = 34.9ms
    Foreach_ListOfDouble(List<double> a) with List<double> = 7.9ms
    Foreach_IListOfDouble(IList<double> a) with List<double> = 45.8ms
    End Test.  Press enter.

     



    source code:

    using System;
    using System.Collections.Generic;
    using System.Text;
    using System.Diagnostics;
    using System.Windows;
    
    namespace Benchmark2 {
        class Program {
            static void Main(string[] args) {
                Console.WriteLine("Start Test.");
    
                Test test = new Test();
                test.RunTest();
    
                Console.WriteLine("\nEnd Test.  Press enter.");
                Console.Read();
    
            }
        }
    
        class Test {
            public void RunTest() {
                int count = 1000000;
                double[] doubleArr = new double[count];
                
                for (int i = 0; i < doubleArr.Length; i++)
                    doubleArr[i] = i;
                List<double> doublelist = new List<double>(count);
                doublelist.AddRange(doubleArr);
    
    
    
                Stopwatch stopWatch;
                Output("\n=============================");
                Output("Running tests with {0} points", count);
    
                stopWatch = new Stopwatch();
                stopWatch.Start();
                For_DoubleArr(doubleArr);
                stopWatch.Stop();
                Output(FormatTime(stopWatch, "For_DoubleArr(double[] a) with double[]"));
    
                stopWatch = new Stopwatch();
                stopWatch.Start();
                For_IListOfDouble(doubleArr);
                stopWatch.Stop();
                Output(FormatTime(stopWatch, "For_IListOfDouble(IList<double> a) with double[]"));
    
                stopWatch = new Stopwatch();
                stopWatch.Start();
                For_ListOfDouble(doublelist);
                stopWatch.Stop();
                Output(FormatTime(stopWatch, "For_ListOfDouble(List<double> a) with List<double>"));
    
                stopWatch = new Stopwatch();
                stopWatch.Start();
                For_IListOfDouble(doublelist);
                stopWatch.Stop();
                Output(FormatTime(stopWatch, "For_IListOfDouble(IList<double> a) with List<double>"));
    
                Output("-------------");
    
                stopWatch = new Stopwatch();
                stopWatch.Start();
                Foreach_DoubleArr(doubleArr);
                stopWatch.Stop();
                Output(FormatTime(stopWatch, "Foreach_DoubleArr(double[] a) with double[]"));
    
                stopWatch = new Stopwatch();
                stopWatch.Start();
                Foreach_IListOfDouble(doubleArr);
                stopWatch.Stop();
                Output(FormatTime(stopWatch, "Foreach_IListOfDouble(IList<double> a) with double[]"));
    
                stopWatch = new Stopwatch();
                stopWatch.Start();
                Foreach_ListOfDouble(doublelist);
                stopWatch.Stop();
                Output(FormatTime(stopWatch, "Foreach_ListOfDouble(List<double> a) with List<double>"));
    
                stopWatch = new Stopwatch();
                stopWatch.Start();
                Foreach_IListOfDouble(doublelist);
                stopWatch.Stop();
                Output(FormatTime(stopWatch, "Foreach_IListOfDouble(IList<double> a) with List<double>"));
    
    
    
            }
    
            private string FormatTime(Stopwatch stopWatch, string label) {
                return string.Format("{0} = {1:F1}ms", label, (stopWatch.ElapsedTicks * 1000.0) / Stopwatch.Frequency);
            }
    
            private void Output(string str, params object[] args) {
                Console.Write(string.Format(str + "\n", args));
            }
    
    
            public void For_DoubleArr(double[] a) {
                double max = double.MinValue;
                for (int i = 0; i < a.Length; i++) {
                    if (max < a[i])
                        max = a[i];
                }
            }
    
            public void For_IListOfDouble(IList<double> a) {
                double max = double.MinValue;
                for (int i = 0; i < a.Count; i++) {
                    if (max < a[i])
                        max = a[i];
                }
            }
    
            public void For_ListOfDouble(List<double> a) {
                double max = double.MinValue;
                for (int i = 0; i < a.Count; i++) {
                    if (max < a[i])
                        max = a[i];
                }
            }
    
    
            /**********************************************/
    
            public void Foreach_DoubleArr(double[] a) {
                double max = double.MinValue;
                foreach (double d in a) {
                    if (max < d)
                        max = d;
                }
            }
    
            public void Foreach_IListOfDouble(IList<double> a) {
                double max = double.MinValue;
                foreach (double d in a) {
                    if (max < d)
                        max = d;
                }
            }
    
    
            public void Foreach_ListOfDouble(List<double> a) {
                double max = double.MinValue;
                foreach (double d in a) {
                    if (max < d)
                        max = d;
                }
            }
    
        }
    }

    I found an article that addresses a similar issue but I don't think that I am experiencing what's described there:
    http://blogs.msdn.com/kirillosenkov/archive/2008/07/13/a-curious-subtlety-about-how-clr-does-interface-dispatch-on-array-types.aspx


    I though doing an index access with an interface would just do an index access on the array it interfaces and there should not be a big slowdown.  But as you can see there is a 2 ms to 124ms difference with the case of an array.  I guess I don't understand what's going on with the interface parameters and wish someone could explain or point me to a good reference?
    • Edited by ChromeCompass Friday, June 26, 2009 3:50 PM Messed up less than sign :)
    Friday, June 26, 2009 3:36 PM

Answers

  • No, when IList is involved it is not a direct array access any more.  The runtime needs to figure out what to do, and this is known to have performance problems, as the article you pointed out mentions.  That article also gives some possible workarounds.

    Friday, June 26, 2009 7:31 PM

All replies

  • No, when IList is involved it is not a direct array access any more.  The runtime needs to figure out what to do, and this is known to have performance problems, as the article you pointed out mentions.  That article also gives some possible workarounds.

    Friday, June 26, 2009 7:31 PM
  • Whatever it's doing it seems from my current knowledge standpoint a bit inefficient.  Here is a little test where I define my own list interface and implement a list.  Then I run the same benchmark as above.

        public interface ILongList<T> {
            long Count { get; }
            T this[long index] { get; set; }
    
        }
    

    public class LongList<T> : ILongList<T> { private long _Count; public long Count { get { return _Count; } set { _Count = value; } } private T[] Data; public LongList(long size) { Data = new T[size]; Count = size; } public T this[long index] { get { return Data[index]; } set { Data[index] = value; } } }



    And next... I run the same benchmarks as before:

            public void For_ILongListOfDouble(ILongList<double> a) {
                double max = double.MinValue;
                for (int i = 0; i < a.Count; i++) {
                    if (max < a[i])
                        max = a[i];
                }
            }
    
    
            public void For_LongListOfDouble(LongList<double> a) {
                double max = double.MinValue;
                for (int i = 0; i < a.Count; i++) {
                    if (max < a[i])
                        max = a[i];
                }
            }

    And the result is:

    Start Test.
    
    =============================
    Running tests with 1000000 points
    For_LongListOfDouble(LongList<double> a) with LongList<double> = 6.1ms
    For_ILongListOfDouble(ILongList<double> a) with LongList<double> = 15.2ms
    
    
    End Test.  Press enter.
    
    
    

    As you can see the the test function taking the ILongList is only 3x slower than the test function taking LongList as an argument.  Compared to 44x slower with the IList vs List argument test.

    Monday, June 29, 2009 9:17 PM
  • It might, and I use that term loosely, be where the data resides. I have an article on my blog entitled Tribal Knowledge: Large Object Heap and Double Arrays in .Net. Where I demonstrate that double arrays are placed in the LOH before the 85K, around 20K. There are non-related performance hits to that. Is your test runnning into that hit...maybe its the test and not the cirucumstance?
    William Wegerson (www.OmegaCoder.Com)
    Monday, June 29, 2009 10:11 PM
    Moderator
  • All the data collections of size million indicated from GC.GetGeneration that they are on the Large Object Heap (2).  Changing the data types of the collections from double to float did not alter the for loop performance.
    Tuesday, June 30, 2009 8:07 PM