Exception handling in blocks

  • Question

  • Hello guys

    While doing some experiments I ran into a snag: all of a sudden my blocks didn't want to accept data anymore. It turned out that an exception had been thrown inside one of the blocks, but I could still post to the block; it just didn't do anything. Is there a particular reason for this behavior?

    I can imagine that there are reasons for Post not throwing an InvalidOperationException in these cases, especially when the poster is another linked block, but it makes finding exceptions a little more difficult, especially since exceptions don't propagate 'forwards' [a links to b, b throws an exception, b gets faulted but a doesn't show any sign of trouble].

    Also, there doesn't seem to be a way to recover from an exception. I suppose you could put a continuation on the CompletionTask and link in a new block somehow, but is that really the way to go?

     

    Wednesday, February 9, 2011 12:16 AM

All replies

  • OK, the CompletionTask thing is actually pretty nice; I just have to remember to attach continuation tasks to all the blocks in the flow. For composite blocks I could use Task.Factory.ContinueWhenAny and plug in the CompletionTasks of all the sub-blocks.

    I noticed a property called LinkedTargets that could be useful for this, but it seems to be available only to the debugger, not something you can actually call to walk a dataflow. Perhaps there is another way to do that?

    Still, it would be nice to have more options for what should happen if an exception is thrown, like dropping the item or linking to another block. Maybe a LinkToOnException that sends you the original object as well as the exception, or something like that.

    Wednesday, February 9, 2011 2:58 PM
  • Hi Allan,

     

    Thank you for experimenting with TPL Dataflow! It’s really exciting to see how you discover the intended usage patterns on your own.

     

    Regarding exceptions - yes, the designed approach is to continue off of a block’s completion task. You can still process all exceptions at a single place. One way to do that is for your continuation task to post the exception (along with a reference to the block that encountered it) to an ActionBlock<Tuple<IDataflowBlock, Exception>>.
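
    The single-place handling described above can be sketched roughly as follows. This is a minimal illustration, not the team's implementation: it assumes the later released System.Threading.Tasks.Dataflow library, where the property is named Completion rather than the CTP's CompletionTask, and the names FaultReporting, ReportFaultsTo and Demo are hypothetical.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class FaultReporting
{
    // Hypothetical helper: when 'block' faults, post (block, exception) to a shared sink.
    public static Task ReportFaultsTo(
        this IDataflowBlock block,
        ITargetBlock<Tuple<IDataflowBlock, Exception>> sink)
    {
        return block.Completion.ContinueWith(
            t => sink.Post(Tuple.Create(block, (Exception)t.Exception)),
            TaskContinuationOptions.OnlyOnFaulted);
    }

    // Small demonstration: returns the type name of the exception the sink received.
    public static string Demo()
    {
        var seen = new ConcurrentQueue<string>();

        // One place that processes every reported fault.
        var sink = new ActionBlock<Tuple<IDataflowBlock, Exception>>(
            pair => seen.Enqueue(pair.Item2.GetBaseException().GetType().Name));

        // A block whose callback throws on bad input.
        var worker = new TransformBlock<string, int>(s => int.Parse(s));
        Task reported = worker.ReportFaultsTo(sink);

        worker.Post("not a number");   // int.Parse throws, so worker faults
        reported.Wait();               // the continuation has posted to the sink

        sink.Complete();
        sink.Completion.Wait();        // the sink has processed the report

        seen.TryDequeue(out string name);
        return name;
    }
}
```

    Calling ReportFaultsTo on every block as it is constructed keeps the hook-up next to the code that builds the network, so inner blocks do not need to be discoverable later.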

     

    Regarding replacing a block at runtime – what you really want is to have the “throwing” block fixed. Notice that the block that gets faulted is the “catcher” of the exception and that block may be perfectly solid. The reason why we’ve designed “innocent” blocks to fault themselves is to give you a signal right away that something is wrong in the network and needs to be fixed. If the catcher block didn’t fault itself, you might incur some data loss and you wouldn’t know about it. That’s why we define a strict propagation protocol and we expect blocks to comply with it 100%. Any exception means either the environment is bad (lack of resources) or there is an incorrectly implemented block. Either case would require human investigation. Automatically replacing a block (let alone the one that caught the exception) would not help.

     

     

    Zlatko Michailov

    Software Development Engineer, Parallel Computing Platform

    Microsoft Corp.

     

    If you are satisfied by this post, please mark it as “Answer”.

     


    This posting is provided "AS IS" with no warranties, and confers no rights.
    Wednesday, February 9, 2011 4:55 PM
  • Alright, so blocks should not throw exceptions, I'm cool with that :) But that makes it even more important to be able to find blocks that aren't playing nice.

    I think in a lot of scenarios you'll expose only the entry block of a network, and perhaps even have dynamic networks that change with load and things like that. In those cases you won't have access to all the blocks and can't hook up continuations.

    If you were able to walk networks to find blocks, that would be one way out, but it would also be handy if you could control the CompletionTask of the "front-facing" blocks. That way you could make those blocks become faulted if an exception occurs deeper in the network.

    I do have a cold and a fever today, so I might be a little thick :) I'm not sure I follow your "throwing" as opposed to "catching" block example. Consider this code:

       var block1 = new BufferBlock<string>( );
       var block2 = new TransformBlock<string, string>( s => { throw new Exception( ); return s; } );
       var block3 = new ActionBlock<string>( s => { } );
    
       block1.LinkTo( block2 );
       block2.LinkTo( block3 );
    
       block1.CompletionTask.ContinueWith( t => Console.WriteLine( "block1 completed" ) );
       block2.CompletionTask.ContinueWith( t => Console.WriteLine( "block2 completed" ) );
       block3.CompletionTask.ContinueWith( t => Console.WriteLine( "block3 completed" ) );
    
       block1.Post( "foo" );
    
    This code will only print that block2 completed. Isn't block2 both the throwing and the catching block? Or do you mean that the "throwing" block is the block that passed some invalid data into another block, causing it to throw an exception?
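
    For comparison, here is a rough sketch of the same three-block chain against the later released System.Threading.Tasks.Dataflow library, which is an assumption on top of this thread: there the property is called Completion, and links can opt in to forward propagation through DataflowLinkOptions.PropagateCompletion. The class and method names are hypothetical.

```csharp
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class ForwardPropagation
{
    public static (bool block1Done, bool block2Faulted, bool block3Faulted) Run()
    {
        // Typed delegate so the compiler picks the synchronous TransformBlock overload.
        Func<string, string> failing = s => throw new InvalidOperationException("bad callback");

        var block1 = new BufferBlock<string>();
        var block2 = new TransformBlock<string, string>(failing);
        var block3 = new ActionBlock<string>(s => { });

        // Completion and faults flow from source to target along these links only.
        var options = new DataflowLinkOptions { PropagateCompletion = true };
        block1.LinkTo(block2, options);
        block2.LinkTo(block3, options);

        block1.Post("foo");

        // Waiting on a faulted completion task throws, so swallow that here.
        try { block3.Completion.Wait(TimeSpan.FromSeconds(5)); }
        catch (AggregateException) { }

        return (block1.Completion.IsCompleted,
                block2.Completion.IsFaulted,
                block3.Completion.IsFaulted);
    }
}
```

    Under these assumptions block2 and block3 end up faulted while block1 stays incomplete, so the fault travels downstream but the block in front still shows no sign of trouble.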
    Wednesday, February 9, 2011 8:20 PM
  • Hi Allan,

     

    When a callback throws an exception, there is no ambiguity about where the exception is coming from. The case I was referring to is something like this:

        class ThrowerBlock<T> : IPropagatorBlock<T, T>
        {
            // Violates the propagation protocol by throwing while a target consumes a message.
            DataflowMessage<T> ISourceBlock<T>.ConsumeMessage(DataflowMessage<T> message, ITargetBlock<T> target)
            {
                throw new Exception();
            }

            ...
        }

        var thrower = new ThrowerBlock<int>();
        var catcher = new BufferBlock<int>();

        thrower.LinkTo(catcher);
        thrower.Post(1);

        // catcher will catch the exception from thrower.ConsumeMessage(), and the only way for it to signal to you that something has gone wrong is to fault itself.
        // Alternatively, we could publicly expose an IDataflowBlock.Fault(Exception e) method so that catcher can fault thrower, but this feature could easily be abused.

     

     

    I don’t quite get your argument that the inner blocks of the network are inaccessible. The code that constructs those blocks should be able to hook up continuations right away. Am I missing something?

     

    Zlatko Michailov

    Thursday, February 10, 2011 2:30 AM
  • Ah, now I understand what you mean. But then we are talking about two distinct scenarios. In your example the implementation of the ThrowerBlock is clearly at fault, but that is not really the case in my example, since it was I, the user, who passed in a poorly implemented callback.

    To take an example that really illustrates both the block visibility problem and the exception handling problem, consider a slightly modified version of the TDFObjectPool example I made in another thread. [This was actually the reason for my original post.]

     

    public class TDFObjectPool<TInput, TPool, TOutput> {
      JoinBlock<TInput, TPool> _poolQueue = new JoinBlock<TInput, TPool>( );
      Func<TInput, TPool, TOutput> _process = null;
      TransformBlock<Tuple<TInput, TPool>, TOutput> _processor;
      BufferBlock<TOutput> _outputBuffer = new BufferBlock<TOutput>( );
      public IPropagatorBlock<TInput, TOutput> Block { get; private set; }
    
      public TDFObjectPool( Func<TInput, TPool, TOutput> processorFunction ) {
       _process = processorFunction;
    
       _processor = new TransformBlock<Tuple<TInput, TPool>, TOutput>( pair => {
        try { return _process( pair.Item1, pair.Item2 ); }
        finally { _poolQueue.Target2.Post( pair.Item2 ); }
       } );
    
       _poolQueue.LinkTo( _processor );
       _processor.LinkTo( _outputBuffer );
    
       Block = DataflowBlockExtensions.Encapsulate( _poolQueue.Target1, _outputBuffer );
      }
    
      public void AddPoolItem( TPool item ) {
       _poolQueue.Target2.Post( item );
      }
     }
    
     class Program {
      static void Main( string[ ] args ) {
       var pool = new TDFObjectPool<string, WebClient, int>( ( adress, wc ) => wc.DownloadString( adress ).Length );
       var adressBuffer = new BufferBlock<string>( );
       var action = new ActionBlock<int>( l => Console.WriteLine( "Got a response with a length of " + l ) );
    
       adressBuffer.LinkTo( pool.Block );
       pool.Block.LinkTo( action );
       adressBuffer.Post( "urlthatdoesntexsist" );
       pool.AddPoolItem( new WebClient( ) );
       Console.ReadLine( );
      }
     }

     

    Let's say I pass a URL that doesn't exist into the pool. The JoinBlock at the front of the pool accepts it and everything is fine until the TransformBlock hidden inside the object pool tries to fetch it using a WebClient. That TransformBlock is now faulted and stops accepting messages, but I have no real way of telling, since neither the JoinBlock nor the output buffer is actually faulted.

    This is a case where it would have been useful to be able to tell the TransformBlock what to do with exceptions: drop the item, link it to somewhere else, or propagate the exception up the link chain to the block that sits in front of it. Some of this could of course be solved by writing a better callback for the TransformBlock, but still.

    If LinkTo had options for dealing with exceptions, that would be very useful in these scenarios. You could have the option of faulting only the block with the bad callback (the behavior right now), or of faulting that block and the linking block as well, the JoinBlock in this case. [And the adressBuffer could choose to be faulted as well by specifying that argument in its LinkTo call.]

    I would like to avoid doing Posts inside the callbacks of blocks, since it hurts composability as well as future toolability, so it would be very useful if blocks had a source for exceptions. You could have a system where, if a callback threw an exception and nobody wanted that message, the block would become faulted, but if someone accepted the message, the block would continue working.

     

    Thursday, February 10, 2011 12:00 PM
  • So to recap, I'd like to do this :)

     bool FaultIfLinkTargetFaults = true;
     _poolQueue.LinkTo( _processor, FaultIfLinkTargetFaults );
    

     

    and this

    _processor.LinkToOnException( exceptionAndItemTuple => Console.WriteLine( "oops: " + exceptionAndItemTuple.exception.Message ) );
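
    LinkToOnException is a wish, not an existing method. Something close to it can be approximated by catching inside the callback and routing failures as ordinary messages. The sketch below assumes the later released System.Threading.Tasks.Dataflow library and its LinkTo overload that takes a predicate; Outcome, ErrorRouting, Guarded and Demo are hypothetical names introduced only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

// Hypothetical envelope: carries the original item plus either a value or an exception.
public sealed class Outcome<TIn, TOut>
{
    public TIn Input;
    public TOut Value;
    public Exception Error;
}

public static class ErrorRouting
{
    // Wraps a callback so a throw becomes a message instead of faulting the block.
    public static TransformBlock<TIn, Outcome<TIn, TOut>> Guarded<TIn, TOut>(Func<TIn, TOut> body)
    {
        return new TransformBlock<TIn, Outcome<TIn, TOut>>(input =>
        {
            try { return new Outcome<TIn, TOut> { Input = input, Value = body(input) }; }
            catch (Exception ex) { return new Outcome<TIn, TOut> { Input = input, Error = ex }; }
        });
    }

    public static (List<int> ok, List<string> failed) Demo()
    {
        var ok = new List<int>();
        var failed = new List<string>();

        var parser = Guarded<string, int>(s => int.Parse(s));
        var onSuccess = new ActionBlock<Outcome<string, int>>(o => ok.Add(o.Value));
        var onError = new ActionBlock<Outcome<string, int>>(o => failed.Add(o.Input));

        // Predicates decide which target receives each outcome.
        var options = new DataflowLinkOptions { PropagateCompletion = true };
        parser.LinkTo(onSuccess, options, o => o.Error == null);
        parser.LinkTo(onError, options, o => o.Error != null);

        parser.Post("12");
        parser.Post("oops");
        parser.Post("30");
        parser.Complete();
        Task.WaitAll(onSuccess.Completion, onError.Completion);

        return (ok, failed);
    }
}
```

    With this shape the block with the bad input keeps working, and the error target receives the original item together with the exception, which is roughly what the wished-for method would deliver.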
    Thursday, February 10, 2011 12:17 PM
  • Hi Allan,

     

    [I think] I see your point.

     

    As far as taking a recovery action goes, I wouldn’t count on that, because ultimately there is a faulty code somewhere. Whether it is in a callback or inside a block’s implementation, replacing a block instance while still reusing the faulted code won’t lead to any significant improvement.

     

    Having said that, I agree with you that it is useful to enable blocks to fault one another. In the Encapsulate case in particular, an encapsulated block will be able to fault the last block in the encapsulation chain whose completion task is exposed by the encapsulating block. That way, the outside code can get notified of the failure.
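
    As a rough illustration of blocks faulting one another around an encapsulated chain, the sketch below uses the later released System.Threading.Tasks.Dataflow library, which is an assumption relative to the CTP discussed here: that library exposes IDataflowBlock.Fault, DataflowBlock.Encapsulate and DataflowLinkOptions.PropagateCompletion. The class name and the three-block layout are hypothetical.

```csharp
using System;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

public static class EncapsulatedFaults
{
    public static (bool outerFaulted, bool acceptedAfterFault) Demo()
    {
        var head = new BufferBlock<string>();
        var inner = new TransformBlock<string, int>(s => int.Parse(s));
        var tail = new BufferBlock<int>();

        // Forward direction: an inner fault reaches the tail through the links.
        var forward = new DataflowLinkOptions { PropagateCompletion = true };
        head.LinkTo(inner, forward);
        inner.LinkTo(tail, forward);

        // Backward direction is not built in, so fault the head by hand.
        inner.Completion.ContinueWith(
            t => ((IDataflowBlock)head).Fault(t.Exception),
            TaskContinuationOptions.OnlyOnFaulted);

        // Only the combined block is handed to outside code; its Completion is the tail's.
        IPropagatorBlock<string, int> outer = DataflowBlock.Encapsulate(head, tail);

        outer.Post("not a number");   // the hidden inner block faults on this input

        // WaitAll rethrows the faults once both tasks have finished.
        try { Task.WaitAll(new[] { head.Completion, outer.Completion }, TimeSpan.FromSeconds(5)); }
        catch (AggregateException) { }

        // A faulted head declines new messages, so Post returns false.
        return (outer.Completion.IsFaulted, outer.Post("42"));
    }
}
```

    Outside code then sees the failure both ways: the exposed completion task is faulted, and further posts to the front of the chain are declined.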

     

    Regarding the thrower-catcher scenario I showed earlier, we’ll probably continue faulting the catcher, because the thrower has already proven that it is faulty and it may not want to fault itself.

     

     

    Zlatko Michailov

    Thursday, February 10, 2011 5:54 PM
  • "In the Encapsulate case in particular, an encapsulated block will be able to fault the last block in the encapsulation chain whose completion task is exposed by the encapsulating block"

    How do you do that? The only way I can make that happen is if either the target or the source block passed to Encapsulate faults, but how would you get any blocks in between to cause either of these two to fault?

    "[..] because the thrower has already proven that it is faulty and it may not want to fault itself. "

    That sounds very strange to me. How has the thrower proven it's faulty if it doesn't fault itself?

    I can understand that reactivating a faulted block might not bring a significant improvement [although I think it would be useful]. But I really hope you add something like the LinkTo option to fault the source block if the target block faults :) I think that would make a lot of scenarios more intuitive.

    Thursday, February 10, 2011 8:46 PM
  • Re: faulting blocks from outside

    What I meant was that we don’t have that feature yet, but your ask is well taken. We’ve been discussing adding it, and your scenario adds more weight in favor of it.

     

    Re: how has a thrower proven it is faulty

    Sorry for overloading the word “faulty”. I should have used “incorrectly written.” Thus a thrower is “incorrectly written”, because it doesn’t comply with the protocol. Our protocol is very strict about what should be returned in each situation. (If you find any ambiguity in the protocol, do let us know.) There is no situation when a block may throw an exception as part of the message propagation [modulo argument validation.]

     

     

    Zlatko Michailov

    Thursday, February 10, 2011 9:24 PM