A few philosophical questions regarding the TPL

    Question

  • Hello all,

    First post here. I'm working on the OWIN spec (http://owin.github.com) and we would like to support Tasks to allow OWIN applications to asynchronously provide HTTP response data. I originally posted these questions to the .NET HTTP Abstraction list (http://groups.google.com/group/net-http-abstractions) but didn't get a response.

    Am I correct that the only way to create a truly asynchronous task is using Task.Factory.FromAsync? Why isn't there a Task constructor with the signature of FromAsync? I might guess that there's something special going on under the hood of the task factory besides just creating tasks on a specific scheduler with specific creation, continuation, and cancellation options.

    I see that FromAsync automatically starts the Task, and there doesn't seem to be any way to defer it other than going the Task<Task<T>> route: you start the outer task, which synchronously creates and returns the inner, truly async task. Unless I'm missing something, there must be some good reason why the TPL doesn't allow this, and I'm very interested in what that is.

    Task.Status has a lot of possible values, and it's not exactly clear how to determine if Start can safely be called on the Task without throwing an exception. Any pointers? This is important to the OWIN spec because if the response enumerable is allowed to return Task<T>, there needs to be a clear algorithm for the host to determine whether it should call Start, or if it's safe to access the Result or Exception properties without them blocking, or if it should provide a callback using the ContinueWith mechanism. It's also not clear under what TaskStatus conditions ContinueWith will execute the callback, although the fact that FromAsync doesn't allow you to defer starting the task suggests that ContinueWith will "just work" even for tasks that have already completed at the time ContinueWith is called.

    Is it possible to create a TaskScheduler which doesn't use the ThreadPool? I would like to build a single-threaded cooperative multitasking system using the Task interface.  This brings me to the most important question: why is TaskScheduler an abstract class and not an interface(!)?

    Many thanks,

    Benjamin van der Veen

    • Edited by Benjamin van der Veen Tuesday, December 14, 2010 2:43 AM specify 'single-threaded' cooperative multitasking
    Tuesday, December 14, 2010 2:33 AM

Answers

  • Hi Benjamin-

    re: "Am I correct that the only way to create a truly asynchronous task is using Task.Factory.FromAsync?"

    Not quite.  The TaskCompletionSource<TResult> type is the primary mechanism in TPL for creating tasks to represent arbitrary operations.  You instantiate one of these, and it hands you back a Task.  No threads are used, no delegates are provided, etc.  Instead, you completely control this task, through the completion methods on TCS, e.g. SetResult, SetException, SetCanceled, etc.  In fact, FromAsync is itself built on top of this mechanism, which allows you to represent any arbitrary asynchronous operation.  For example, let's say you wanted to build your own Stream.WriteAsync method; it might look something like this:

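    using System;
    using System.IO;
    using System.Threading.Tasks;

    public static class StreamExtensions // extension methods must live in a static class; the name is illustrative
    {
        public static Task WriteAsync(this Stream stream, byte[] buffer, int offset, int count)
        {
            var tcs = new TaskCompletionSource<object>();
            stream.BeginWrite(buffer, offset, count, iar =>
            {
                // EndWrite returns void, so call it first and then complete the task with a null result.
                try { stream.EndWrite(iar); tcs.SetResult(null); }
                catch (Exception exc) { tcs.SetException(exc); }
            }, null);
            return tcs.Task;
        }
    }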
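
    For comparison, the same wrapper can be sketched with FromAsync rather than a hand-written TaskCompletionSource. This is only an illustration: the class and method names (FromAsyncExample, WriteViaFromAsync) are made up for this example, and it assumes the Stream.BeginWrite/EndWrite pair is available on the stream in question.

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class FromAsyncExample
{
    // Hypothetical helper: wraps Stream.BeginWrite/EndWrite with FromAsync
    // instead of managing a TaskCompletionSource by hand.
    public static Task WriteViaFromAsync(Stream stream, byte[] buffer, int offset, int count)
    {
        // The returned task is already "hot": BeginWrite has been invoked
        // by the time FromAsync returns, so the caller never calls Start.
        return Task.Factory.FromAsync(stream.BeginWrite, stream.EndWrite,
                                      buffer, offset, count, null);
    }
}
```

    Calling FromAsyncExample.WriteViaFromAsync(stream, data, 0, data.Length) and then waiting on (or continuing from) the returned task should behave like the TaskCompletionSource version above.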

    re: "there must be some good reason why the TPL doesn't allow this, and I'm very interested in what that is"

    In general, the TPL model is based on "hot" tasks, where APIs hand out tasks that are already running.  This helps to make async methods much more like sync methods, where you call them and they start running, and then with async methods you're handed back a token (the task) that represents the completion of the operation at some point in the future.  We added the Start method to Task primarily for cases where internal to a method you, for some reason, need to separate the construction and scheduling of the task, e.g. if you derive from Task, if the task body needs access to the initialized reference to the task, etc.  We generally don't encourage usage of Start except in very specific cases, and thus haven't proliferated that model through the APIs.  If you want to hand out a representation for creating a task, you can do so with a Func<Task> or a Func<Task<TResult>>.  This approach also jibes nicely with the new async support in C# and VB (see http://msdn.com/vstudio/async), where you can write asynchronous lambdas/anonymous methods, which end up producing Func<Task> or Func<Task<TResult>>.
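
    A minimal sketch of the Func<Task<TResult>> idea, assuming the work is a simple delegate. The names DeferredTaskExample and CreateDeferred are hypothetical and exist only for this illustration.

```csharp
using System;
using System.Threading.Tasks;

public static class DeferredTaskExample
{
    // Hypothetical helper: hands out a factory instead of a task, so no
    // work is scheduled until the caller invokes the returned delegate.
    public static Func<Task<int>> CreateDeferred(Func<int> work)
    {
        // Each invocation of the delegate produces a new, already-started task.
        return () => Task.Factory.StartNew(work);
    }
}
```

    The caller decides when the operation begins by invoking the delegate, e.g. var deferred = DeferredTaskExample.CreateDeferred(() => 42); and later Task<int> t = deferred();. The task it gets back is hot, so Start never needs to be called.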

    re: "Task.Status has a lot of possible values, and it's not exactly clear how to determine if Start can safely be called on the Task without throwing an exception. Any pointers?"

    Start may be called if, and only if, the TaskStatus == Created.  Going back to the "hot" model comment, though, APIs should always hand out tasks that represent already scheduled or running operations.  For tasks backed by a delegate, this means the API has either already called Start or has scheduled something else that will call Start; for tasks representing other async operations, it means the API has already kicked off the work that will complete the task.  The caller of the API should not be responsible for starting the task.  If you want the semantic where the caller gets to decide when to kick off the operation, you should hand out a Func<>.
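
    If a host does need to defend against being handed a cold task, the check might look roughly like this. StartIfCreated is a hypothetical helper, not part of the TPL, and the check-then-start sequence is not atomic, so it assumes nobody else is starting the same task concurrently.

```csharp
using System;
using System.Threading.Tasks;

public static class TaskStartExample
{
    // Hypothetical helper: starts the task only if it is still in the
    // Created state, and reports whether it did so.
    public static bool StartIfCreated(Task task)
    {
        // Any other status (WaitingForActivation, Running, RanToCompletion, ...)
        // means Start would throw InvalidOperationException.
        if (task.Status != TaskStatus.Created) return false;
        task.Start();
        return true;
    }
}
```

    A task created with new Task(...) returns true the first time and false afterwards; a task obtained from a TaskCompletionSource is never in the Created state, so the helper leaves it alone.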

    re: "It's also not clear under what TaskStatus conditions ContinueWith will execute the callback"

    Continuations will be scheduled/executed once the antecedent task reaches a final state, which is defined as one of the following three statuses: RanToCompletion, Faulted, or Canceled.  These same three states are what cause Task.IsCompleted to return true.  These same three states are also what cause Wait'ing on the task to complete.  In short, all actions based on the task being done occur once Task.IsCompleted is true, which is true if and only if TaskStatus == RanToCompletion, Faulted, or Canceled.
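
    To illustrate that a continuation still runs when it is attached to a task that has already reached a final state, here is a small sketch. ContinuationExample and ObserveFinalStatus are names invented for this example, and it uses the default continuation options.

```csharp
using System;
using System.Threading.Tasks;

public static class ContinuationExample
{
    // Hypothetical helper: attaches a continuation that reports the
    // antecedent's final status. With default options the continuation runs
    // for RanToCompletion, Faulted, and Canceled alike, whether or not the
    // antecedent had already completed when ContinueWith was called.
    public static Task<TaskStatus> ObserveFinalStatus(Task antecedent)
    {
        return antecedent.ContinueWith(t => t.Status);
    }
}
```

    For example, after calling SetResult, SetException, or SetCanceled on a TaskCompletionSource, passing its Task to ObserveFinalStatus yields a task whose Result is the corresponding final status.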

    re: "Is it possible to create a TaskScheduler which doesn't use the ThreadPool?"

    Yes.  There are several samples of this in the ParallelExtensionsExtras project at http://code.msdn.microsoft.com/ParExtSamples.
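
    As a rough idea of what such a scheduler could look like (this is not taken from the samples project; PumpTaskScheduler and RunPending are hypothetical names), the sketch below queues tasks and runs them only when the owning thread explicitly pumps the queue:

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical scheduler: never touches the ThreadPool. Tasks are queued
// and executed only on whichever thread calls RunPending().
public sealed class PumpTaskScheduler : TaskScheduler
{
    private readonly Queue<Task> _queue = new Queue<Task>();

    protected override void QueueTask(Task task)
    {
        lock (_queue) _queue.Enqueue(task);
    }

    // Refuse inlining so tasks only ever run inside RunPending. The
    // trade-off: waiting on a queued task from the pump thread will block.
    protected override bool TryExecuteTaskInline(Task task, bool taskWasPreviouslyQueued)
    {
        return false;
    }

    // Used by debuggers to enumerate pending tasks.
    protected override IEnumerable<Task> GetScheduledTasks()
    {
        lock (_queue) return _queue.ToArray();
    }

    public override int MaximumConcurrencyLevel { get { return 1; } }

    // Runs queued tasks on the calling thread until the queue is empty and
    // returns how many were executed.
    public int RunPending()
    {
        int executed = 0;
        while (true)
        {
            Task next;
            lock (_queue)
            {
                if (_queue.Count == 0) return executed;
                next = _queue.Dequeue();
            }
            if (TryExecuteTask(next)) executed++;
        }
    }
}
```

    Tasks would be scheduled with new TaskFactory(scheduler).StartNew(...) and then run by calling scheduler.RunPending() from the thread that owns the loop. Continuations attached from inside such a task pick up the same scheduler through TaskScheduler.Current.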

    re: "why is TaskScheduler an abstract class and not an interface"

    Because it provides some built-in functionality, i.e. not all of its members are abstract.

    I hope that helps.

    Tuesday, December 14, 2010 4:27 PM

All replies

  • Excellent, detailed answers Stephen, thank you. This gives me a lot to work with.

    Tuesday, December 14, 2010 10:51 PM