Inconsistency between BatchBlock and BatchedJoinBlock

  • June 25, 2012, 15:58

    I just noticed an inconsistency between BatchBlock and BatchedJoinBlock: BatchBlock places its batches into an array (it's ISourceBlock<T[]>), while BatchedJoinBlock uses IList<T> (it's ISourceBlock<Tuple<IList<T1>, IList<T2>>>). Is there some reason for this? Why don't both use IReadOnlyList<T> (new in .NET 4.5)?

    Also, while I'm at it, why does BatchedJoinBlock produce a tuple of lists instead of a list of tuples? Is it for performance reasons (fewer allocated Tuple objects)?

All replies

  • June 26, 2012, 23:26
    Owner
    Answered

    re: "Is there some reason for this?"

    It could have been an IList<T> for BatchBlock as well. For the most part, with BatchBlock the size of the output collection is known in advance, i.e. it is the user-specified batch size (the exceptions are when no more data will arrive at the block or when batching is explicitly triggered). So the block can hand out an array, giving the consumer full access to the array without having to go through an interface to access the data. It's a minor thing, and you could make an argument for IList<T> as well.
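
    As an illustration of the point above, here is a minimal sketch (not taken from the original reply) of a BatchBlock<int> handing out arrays, including the smaller final batch produced once the block is completed. The batch size of 3, the seven posted values, and the class name BatchBlockDemo are arbitrary choices made for the example.

```csharp
using System;
using System.Threading.Tasks.Dataflow;

class BatchBlockDemo
{
    static void Main()
    {
        // Batch size chosen arbitrarily for this sketch.
        var batcher = new BatchBlock<int>(3);

        // Post seven items: two full batches plus one leftover item.
        for (int i = 1; i <= 7; i++)
        {
            batcher.Post(i);
        }

        // Completing the block flushes the leftover item as a smaller batch.
        batcher.Complete();

        // OutputAvailableAsync yields false once the block is done and empty.
        while (batcher.OutputAvailableAsync().Result)
        {
            int[] batch = batcher.Receive(); // the output type is a plain array
            Console.WriteLine("{0} item(s): {1}", batch.Length, string.Join(",", batch));
        }
    }
}
```

    With these inputs the loop should see two arrays of length 3 followed by one array of length 1, which is the "no more data will arrive" exception mentioned above.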

    re: "Why don't both use IReadOnlyList<T>"

    Because there's no reason to restrict the consumer of the data to being read-only.  You're handed a collection, and you can do with it what you will.

    re: "Why does BatchedJoinBlock produce tuples of lists instead of lists of tuples"

    The number of elements in each list isn't necessarily equal. For example, with a batch size of 10, you might have 7 elements of type T1 and 3 elements of type T2, so you're handed two lists, one with the 7 T1s and one with the 3 T2s. It's not clear what this would look like if it were instead a list of tuples. A tuple of lists maps naturally to the output.
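
    The 7-and-3 case described above can be reproduced with a short sketch like the following. The element types (int and string), the posted values, and the class name BatchedJoinDemo are assumptions made purely for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks.Dataflow;

class BatchedJoinDemo
{
    static void Main()
    {
        // A batch is emitted once 10 items in total have arrived across both targets.
        var join = new BatchedJoinBlock<int, string>(10);

        // Seven items go to the first target...
        for (int i = 0; i < 7; i++)
        {
            join.Target1.Post(i);
        }

        // ...and three to the second, reaching the batch size of 10.
        for (int i = 0; i < 3; i++)
        {
            join.Target2.Post("item" + i);
        }

        // The output is a tuple of two independently sized lists.
        Tuple<IList<int>, IList<string>> batch = join.Receive();
        Console.WriteLine("T1 count: {0}, T2 count: {1}", batch.Item1.Count, batch.Item2.Count);
    }
}
```

    Here Item1 should hold the 7 ints and Item2 the 3 strings, so there is no natural way to pair the elements into a single list of tuples.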

  • June 26, 2012, 23:59

    Yes, in both cases you could make an argument for an array or for IList<T>. But I still don't understand why one block would use an array and the other IList<T>. It's an inconsistency that doesn't make sense to me.

    About tuples of lists: I misunderstood how BatchedJoinBlock works. I thought it behaved exactly like a JoinBlock linked to a BatchBlock, but it doesn't. The MSDN documentation is currently very limited, but I should have looked in your “Introduction to TPL Dataflow”, which explains it well.

  • June 27, 2012, 01:07
    Owner

    The difference is that in one case, the size is typically fixed, and in the other case, the size is typically dynamic.