none
Working with a vertor/array subset. Request for comments RRS feed

  • Question

  • I have done alot of numerical work with C# and repretedly run into a scenario where I need to work with a subset of a vector or array. So the brute force way would be to copy the subset to a separate array. Call a routine to work on that subset and then put the computed values back. Like the following pseudo code:

    B(double[] a)
    {
     . . . . .
    }
    
    A()
    {
    double [] aa;
    . . . . 
    double[] sa = new double[...];
    for(int i = 0; i < ...; i++)
      sa[i] = aa[i + ....];
    
    B(sa)
    
    for(int i = 0; i < ....; i++)
      aa[i + .....] = sa[i];
    

    This not only makes the code hard to read but it is terriblly inefficient. With other languages (C, C++, FORTRAN) it is a simple matter of passing the *address* of the beginining of the subset and the callled routine knows no difference. Two solutions come to mind and I was wondering if there are others:

    1) Pass an offset along with the array to the called routine.

    2) Use "unsafe" code and pass a pointer like C, C++, and FORTRAN.

    I am no sure of the performance impact of 2). And I could see that littering unsafe code throughout a library could be not very beneficial. But if it was localized (think of porting the BLAS library to C#) it might be OK.

    Any other ideas?

     

    Thank you.

     

    Kevin

     

    Wednesday, April 21, 2010 3:41 PM

Answers

  • #2 seems reasonable to me.

    #3: Create a wrapper class that will access only subarray of the array (solution #1 encapsulated in a class). Something like:
    class SubArray { double[] arr; int firstIndex; double Get(int index) { return arr[firstIndex + index]; } ... }

    -Karel

    • Marked as answer by KevinBurton Wednesday, April 21, 2010 4:35 PM
    Wednesday, April 21, 2010 4:33 PM
    Moderator
  • Once you get to pointers, you don't have the array anymore (no dimensions, no methods on array) - just try it and you will see. It's better, easier, faster and more realiable than asking questions. Consider using unsafe C# and/or C++/CLI.

    I don't see any reason why indexing jagged arrays should be faster. Luckily there is a way how to get the answer - measure it. If you want to make design decisions based on perf, create mock scenarios with load similar to the real load (this is very tricky!), then measure it.
    In most cases there is not ultimate answer what is faster (A or B). All depends on the scenario and it is very hard to describe your scenario to others. Therefore measuring it will save you a lot of trouble and uncertainty.

    -Karel

    • Marked as answer by KevinBurton Wednesday, April 21, 2010 5:49 PM
    Wednesday, April 21, 2010 5:39 PM
    Moderator

All replies

  • #2 seems reasonable to me.

    #3: Create a wrapper class that will access only subarray of the array (solution #1 encapsulated in a class). Something like:
    class SubArray { double[] arr; int firstIndex; double Get(int index) { return arr[firstIndex + index]; } ... }

    -Karel

    • Marked as answer by KevinBurton Wednesday, April 21, 2010 4:35 PM
    Wednesday, April 21, 2010 4:33 PM
    Moderator
  • Assuming that #2 is reasonable. What if I want to pass a subset of a jagged array? Say I have an array like:

    1 2 3 4 5
    6 7 8 9 0
    1 2 3 4 5
    6 7 8 9 0
    1 2 3 4 5 

     

    and I want only the lower corner of that array and like it said the array is jagged so it is declared as a[][]. I want something like a[2][2] to be an array (jagged for performance) that would look like an array like

    3 4 5 
    8 9 0
    3 4 5

    to the called routine that is also expecting a jagged array. Is this even possible? I know that using unsafe and some pointer arithmetic I can get the address of the beginning of the sub-array but how do I make it look like a jagged array to the called routine? If the called routine operates on this array will the results get "magically" applied to the original array/matrix?

    Thanks again.

    Wednesday, April 21, 2010 5:06 PM
  • OK. The more I think about it expecting the subarray to also be jagged may be unreasonable. What if this restriction was relaxed and I used "normal" two dimensional arrays ([,]).  Then using unsafe and pointer arithmetic I can get to the begining of the subarray. Can I simply pass this "address" to another method and that method will interpret it as a proper sub array. Will methods in the Array class return the proper dimensions of this subarray (the subarray in the example above is 3x3 and the original array is 5x5). I have heard that indexing jagged arrays in .NET is faster than "normal" arrays. If I convert my arrays to the non-jagged type am I giving up too much as far as performance?

     

    Thanks again.

     

    Kevin

    Wednesday, April 21, 2010 5:25 PM
  • Once you get to pointers, you don't have the array anymore (no dimensions, no methods on array) - just try it and you will see. It's better, easier, faster and more realiable than asking questions. Consider using unsafe C# and/or C++/CLI.

    I don't see any reason why indexing jagged arrays should be faster. Luckily there is a way how to get the answer - measure it. If you want to make design decisions based on perf, create mock scenarios with load similar to the real load (this is very tricky!), then measure it.
    In most cases there is not ultimate answer what is faster (A or B). All depends on the scenario and it is very hard to describe your scenario to others. Therefore measuring it will save you a lot of trouble and uncertainty.

    -Karel

    • Marked as answer by KevinBurton Wednesday, April 21, 2010 5:49 PM
    Wednesday, April 21, 2010 5:39 PM
    Moderator
  • You may want to look into Nito.Linq, a library that I'm working on. http://nitolinq.codeplex.com/

    It has IList-aware operators such as Skip and Slice. It doesn't support multidimensional arrays, though; it works like the "wrapper class" that Karel posted (only with IntelliSense documentation and unit tests). The first problem can be solved just using the Slice operator. The second problem could use the Take, Skip, and Select operators: a.Take(3).Select(x => x.Skip(2));

    The disadvantage to this approach is that the numerical methods would need to be updated to take IList<> types instead of array types.

    If you need ultimate speed, I would expect multidimensional arrays in unsafe code would be the fastest (with methods changed to use pointers instead of arrays). If you prefer jagged arrays or safe code (and IMO more readable code), then Nito.Linq is an option.

            -Steve

    P.S. Nito.Linq is still in a prerelease state; it is mostly stable at this point but there are a few changes that need to go in to support source lists that change size.


    Programming blog: http://nitoprograms.blogspot.com/
      Including my TCP/IP .NET Sockets FAQ
      and How to Implement IDisposable and Finalizers: 3 Easy Rules
    Microsoft Certified Professional Developer

    How to get to Heaven according to the Bible
    Wednesday, April 21, 2010 5:56 PM
  • Thank you.

    I am a little confused as to the target audience for this library. Is it meant to replace LINQ (many of the operators that I see in the overview are available with LINQ) or is it a stand in for those who don't want to make the jump to .NET 3.5?

    Kevin

     

    Wednesday, April 21, 2010 6:06 PM
  • It augments (and depends on) LINQ, actually. It does duplicate some of the additional LINQ operators in Microsoft's Rx library, so that's why there are "with Rx" and "without Rx" versions. But there is no duplication of core LINQ operators, even when Nito.LINQ operators have the same name.

    For example, LINQ includes the Take and Skip operators, but they always return IEnumerable<T>. Nito.LINQ includes Take and Skip operators for IList<T> that return IList<T>. To use your original example, you could make a copy of the source array by using "a.Skip(n).Take(y).ToArray()", which copies the selected range. Using Nito.LINQ, you could say "a.Skip(n).Take(y)" (or the equivalent "a.Slice(n, y)"), which returns an IList<T> that manipulates its index arguments instead of making a copy.

           -Steve


    Programming blog: http://nitoprograms.blogspot.com/
      Including my TCP/IP .NET Sockets FAQ
      and How to Implement IDisposable and Finalizers: 3 Easy Rules
    Microsoft Certified Professional Developer

    How to get to Heaven according to the Bible
    Wednesday, April 21, 2010 6:42 PM
  • I don't see any reason why indexing jagged arrays should be faster. Luckily there is a way how to get the answer - measure it. If you want to make design decisions based on perf, create mock scenarios with load similar to the real load (this is very tricky!), then measure it.
    In most cases there is not ultimate answer what is faster (A or B). All depends on the scenario and it is very hard to describe your scenario to others. Therefore measuring it will save you a lot of trouble and uncertainty.

    -Karel


    Thank you. The reason I ask is that there seems to be disagreement amoung the published benchmarks as to the performance implications of jagged vs. "normal" arrays (See http://msdn.microsoft.com/en-us/magazine/cc163995.aspx
    http://msdn.microsoft.com/en-us/library/ms182277.aspx
    http://dotnetperls.com/jagged-array). I just wanted some feedback to help me make the decision and to make sure my test covered all the bases.

     

    Wednesday, April 21, 2010 7:09 PM
  • I see - Easiest thing to know for sure is to rerun the benchmarks from this article http://dotnetperls.com/jagged-array.

    -Karel

    Thursday, April 22, 2010 12:53 AM
    Moderator