TPL Dataflow and complex object structures


  • I have a question regarding TPL Dataflow and how I can use it to process a complex tree of objects. To illustrate the question I think it's a good idea to use an example, so here it goes.

    I have to make a schedule for a week. I have a couple of employees I can schedule, and when I schedule an employee on a particular day I need to run various checks to make sure I can actually schedule that employee (things such as "Is the employee available on that day at that particular time?" and "Won't the employee work too many hours this week if I schedule him/her for this period of time?", etc.). Since I don't know all the checks I need to perform up front, and since this all needs to be done fast and asynchronously, I thought TPL Dataflow might be a good option to look at.

    Now I'm trying to create a proof of concept to see if it can be done using TPL Dataflow but I'm running into a problem where I think I don't understand it well enough yet, so I was hoping somebody can help me with this.

    What I have right now is a class which implements an interface that defines a single property Input of type ITargetBlock<Employee>. Since I want these checks to be pluggable, I imagined creating multiple implementations of the same interface and then linking them together. Inside each of those classes I want to create the necessary DataflowBlocks and link them to the Input property to actually perform the check I want to do. Currently I have just one implementation of the interface which should perform the check to see if the employee is available at the time I have scheduled him/her. However, one employee might have several shifts scheduled, so I need to create a combination of the employee and a shift. Here is a rough sketch of what the class structure looks like:

    class Employee
    {
        public IEnumerable<Period> Shifts { get; }
        public IEnumerable<Period> Availability { get; }
    }

    class Period
    {
        public DateTime Date { get; set; }
        public TimeSpan StartTime { get; set; }
        public TimeSpan EndTime { get; set; }
    }

    As I said, my input is an ITargetBlock<Employee>, which is internally implemented as a BufferBlock<Employee>. Now this is where the problems start. I can of course link a TransformManyBlock<Employee, Period>(employee => employee.Shifts) to it, but how am I then going to join the employee and one particular shift back into a single Tuple that contains both? I tried using a JoinBlock<Employee, Period>, but I don't see how I can guarantee that the Employee and the Period that come out of that JoinBlock actually belong together. Am I missing something here, or is what I'm trying to do completely wrong?
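
    For reference, one way the pairing described above could be sketched is to have the TransformManyBlock emit the (employee, shift) pair itself, so nothing has to be re-joined afterwards. This is only a minimal sketch under assumptions: the constructor on Employee, the IsAvailable check, and the ShiftPairing/FindConflictsAsync names are hypothetical and exist purely for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public class Period
{
    public DateTime Date { get; set; }
    public TimeSpan StartTime { get; set; }
    public TimeSpan EndTime { get; set; }
}

public class Employee
{
    // Assumption: a constructor is added here so the get-only properties can be populated.
    public Employee(IEnumerable<Period> shifts, IEnumerable<Period> availability)
    {
        Shifts = shifts;
        Availability = availability;
    }

    public IEnumerable<Period> Shifts { get; }
    public IEnumerable<Period> Availability { get; }
}

public static class ShiftPairing
{
    // Emits one (employee, shift) tuple per shift, so each shift stays attached to its employee.
    public static TransformManyBlock<Employee, Tuple<Employee, Period>> CreatePairingBlock()
    {
        return new TransformManyBlock<Employee, Tuple<Employee, Period>>(
            employee => employee.Shifts.Select(shift => Tuple.Create(employee, shift)));
    }

    // Hypothetical check: the shift must fall entirely inside one availability period on the same day.
    public static bool IsAvailable(Employee employee, Period shift)
    {
        return employee.Availability.Any(a =>
            a.Date.Date == shift.Date.Date &&
            a.StartTime <= shift.StartTime &&
            a.EndTime >= shift.EndTime);
    }

    // Wires BufferBlock -> TransformManyBlock -> ActionBlock and collects the pairs that fail the check.
    public static async Task<List<Tuple<Employee, Period>>> FindConflictsAsync(IEnumerable<Employee> employees)
    {
        var input = new BufferBlock<Employee>();
        var pairs = CreatePairingBlock();
        var conflicts = new List<Tuple<Employee, Period>>();

        // The ActionBlock processes one item at a time by default, so the plain List is safe here.
        var check = new ActionBlock<Tuple<Employee, Period>>(pair =>
        {
            if (!IsAvailable(pair.Item1, pair.Item2))
            {
                conflicts.Add(pair);
            }
        });

        var linkOptions = new DataflowLinkOptions { PropagateCompletion = true };
        input.LinkTo(pairs, linkOptions);
        pairs.LinkTo(check, linkOptions);

        foreach (var employee in employees)
        {
            input.Post(employee);
        }

        input.Complete();
        await check.Completion;
        return conflicts;
    }
}
```

    Because the pair is created inside the transform, the question of whether an Employee and a Period "belong together" does not arise; a JoinBlock would only be needed if the two values came from independent sources.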

    Wednesday, December 21, 2011 11:53 AM


  • Hi Jonathan-

    From your description, it's not clear to me that TPL Dataflow is the right solution here.

    If you were implementing this synchronously and sequentially, what would the high-level pseudocode look like? It sounds like you're implementing an optimization problem, and I'd expect that for such problems you'd be better off parallelizing the solution using Parallel.For or PLINQ, and if you need it to run asynchronously from the caller, wrapping the whole thing in a Task.Run (or Task.Factory.StartNew) call. That's just a guess, without knowing more about your needs.
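
    To make that suggestion concrete, here is a rough sketch of what the PLINQ-inside-Task.Run approach might look like. It is only an illustration under assumptions: the Employee constructor, the IsAvailable check, and the ScheduleValidator/FindConflictsAsync names are hypothetical, and the real checks would likely be more involved.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public class Period
{
    public DateTime Date { get; set; }
    public TimeSpan StartTime { get; set; }
    public TimeSpan EndTime { get; set; }
}

public class Employee
{
    // Assumption: a constructor is added here so the get-only properties can be populated.
    public Employee(IEnumerable<Period> shifts, IEnumerable<Period> availability)
    {
        Shifts = shifts;
        Availability = availability;
    }

    public IEnumerable<Period> Shifts { get; }
    public IEnumerable<Period> Availability { get; }
}

public static class ScheduleValidator
{
    // Hypothetical check: the shift must fall entirely inside one availability period on the same day.
    public static bool IsAvailable(Employee employee, Period shift)
    {
        return employee.Availability.Any(a =>
            a.Date.Date == shift.Date.Date &&
            a.StartTime <= shift.StartTime &&
            a.EndTime >= shift.EndTime);
    }

    // Task.Run keeps the caller asynchronous; PLINQ spreads the per-shift checks across cores.
    public static Task<List<Tuple<Employee, Period>>> FindConflictsAsync(IEnumerable<Employee> employees)
    {
        return Task.Run(() =>
            employees
                .AsParallel()
                .SelectMany(e => e.Shifts.Select(s => Tuple.Create(e, s)))
                .Where(pair => !IsAvailable(pair.Item1, pair.Item2))
                .ToList());
    }
}
```

    A caller would simply await ScheduleValidator.FindConflictsAsync(employees). Additional checks could be added as further Where clauses, or passed in as a list of predicates if they need to be pluggable.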

    Wednesday, January 4, 2012 8:18 PM