Is it OK to use Tasks for Windows Service workers?

    Question

  • We often have scenarios where we need to monitor hundreds of directories for new files and process them. For years we have successfully used Windows Services that create one worker thread per directory using Thread.Start, each with a FileSystemWatcher and processing logic. This works fine and is robust.

    I am playing with the following Task-based solution now. Does this make sense?

    Private _watchPaths As New List(Of String) From {"x:\Dir1","y:\Dir2","z:\Dir1", ...}
    Private _workers As New List(Of Task)
    Private _cancellationTokenSource As New CancellationTokenSource()
    Private _cancellationToken As CancellationToken = _cancellationTokenSource.Token
    
    Protected Overrides Sub OnStart(ByVal args() As String)
        For Each path In _watchPaths
            _workers.Add(
                Task.Factory.StartNew(
                    Sub()
                        Dim fileProcessor As New FileProcessor
                        fileProcessor.StartWorking(path, _cancellationToken)
                    End Sub, _cancellationToken))
        Next
    End Sub
    
    Protected Overrides Sub OnStop()
        _cancellationTokenSource.Cancel()
        Task.WaitAll(_workers.ToArray)
    End Sub
    End Class
    
    Class FileProcessor
        Private _newFiles As New BlockingCollection(Of String)
        
        Sub _fileWatcher_Created(sender As Object, e As FileSystemEventArgs) _
                                Handles _fileWatcher.Created
            _newFiles.Add(e.FullPath, _cancellationToken)
        End Sub
    
        Async Function ProcessNewFiles() As Task
            Do
                Await ProcessFile(_newFiles.Take(_cancellationToken))
            Loop
        End Function
        '...
    End Class

    I do not start the top-level worker Tasks as LongRunning because, even though they live forever (interrupted only by server reboots), they are often idle and only trigger work in response to FileSystemWatcher events.

    Wednesday, September 19, 2012 7:26 PM

All replies

  • The two main things I see:

    1) I would suggest creating your "top-level" tasks as LongRunning.  Since they run for the entire lifetime of the service, this makes sense and avoids tying up ThreadPool threads for them.

    2) Your ProcessNewFiles function should likely have a way to exit the loop cleanly, i.e.:

        Async Function ProcessNewFiles() As Task
            Do
                Await ProcessFile(_newFiles.Take(_cancellationToken))
    
                ' Throw when we cancel to exit?
                _cancellationToken.ThrowIfCancellationRequested()
            Loop
        End Function
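
    A minimal sketch of both suggestions combined, written as a console stand-in for the service. The workers are started with TaskCreationOptions.LongRunning, and the shutdown path handles the AggregateException that Task.WaitAll raises for cancelled tasks. The paths and the blocking stand-in for StartWorking are hypothetical and not part of the original sample.

    ```vb
    Imports System
    Imports System.Collections.Generic
    Imports System.Threading
    Imports System.Threading.Tasks

    Module LongRunningWorkersSketch
        Sub Main()
            Dim cts As New CancellationTokenSource()
            Dim token As CancellationToken = cts.Token
            Dim workers As New List(Of Task)

            ' Hypothetical paths; the real list would come from the service.
            For Each watchPath In {"C:\Temp\Dir1", "C:\Temp\Dir2"}
                Dim localPath = watchPath ' copy the loop variable before capturing it
                workers.Add(
                    Task.Factory.StartNew(
                        Sub()
                            Console.WriteLine("Watching " & localPath)
                            ' Stand-in for FileProcessor.StartWorking: block until cancelled.
                            token.WaitHandle.WaitOne()
                            token.ThrowIfCancellationRequested()
                        End Sub,
                        token,
                        TaskCreationOptions.LongRunning,
                        TaskScheduler.Default))
            Next

            ' Equivalent of OnStop: signal cancellation, then wait for the workers.
            cts.Cancel()
            Try
                Task.WaitAll(workers.ToArray())
            Catch ex As AggregateException
                ' Cancelled tasks surface as TaskCanceledException, which derives from
                ' OperationCanceledException; any other exception is rethrown by Handle.
                ex.Handle(Function(inner) TypeOf inner Is OperationCanceledException)
            End Try
        End Sub
    End Module
    ```

    Without the Try/Catch, the WaitAll call in OnStop would propagate the cancellation as an AggregateException even though the shutdown itself went as intended.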


    Reed Copsey, Jr. - http://reedcopsey.com
    If a post answers your question, please click "Mark As Answer" on that post and "Mark as Helpful".

    Wednesday, September 19, 2012 9:32 PM
  • @1) That is one of the things I am wondering about. Should one consider a Task LongRunning if it exists for a long time but mostly waits for file system events and does its processing in Parallel.For loops on the ThreadPool (not shown in the sample)? By not making the top-level Tasks LongRunning, I expect all work to run on the ThreadPool and to consume threads only when really doing something. Each LongRunning top-level Task in my example would own one CLR = OS thread forever.

    @2) _newFiles.Take(_cancellationToken) seems to throw an OperationCanceledException automatically.
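
    This behaviour can be checked with a small console sketch: Take blocks on an empty collection and is interrupted when the token is cancelled. The 200 ms delay and the names used here are illustrative assumptions only.

    ```vb
    Imports System
    Imports System.Collections.Concurrent
    Imports System.Threading

    Module TakeCancellationSketch
        Sub Main()
            Dim newFiles As New BlockingCollection(Of String)
            Dim cts As New CancellationTokenSource()

            ' Cancel shortly after Take has started blocking on the empty collection.
            cts.CancelAfter(TimeSpan.FromMilliseconds(200))

            Try
                Dim filePath = newFiles.Take(cts.Token) ' blocks: nothing was ever added
                Console.WriteLine("Got " & filePath)
            Catch ex As OperationCanceledException
                ' Reached once the token is cancelled while Take is waiting.
                Console.WriteLine("Take was cancelled")
            End Try
        End Sub
    End Module
    ```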
    Thursday, September 20, 2012 2:27 AM
  • For 1) If you have a thread that is blocked waiting on operations, that thread is tied up for the entire lifetime of the process.  By flagging the task LongRunning, the default scheduler will use a dedicated thread.  Without that, you'll tie up one of the ThreadPool threads "forever".


    Reed Copsey, Jr. - http://reedcopsey.com

    Thursday, September 20, 2012 3:39 PM
  • The FileProcessor Tasks in my example are not sitting in a tight loop doing compute-bound work, as is often the case with our Windows Services. Most of the time they are waiting for FileSystemWatcher events or awaiting a Take() on a BlockingCollection. I assume their threads are released to the ThreadPool while they are waiting. Is this assumption correct?
    Thursday, September 20, 2012 4:28 PM
  • "The FileProcessor Tasks in my example are not sitting in a tight loop doing compute-bound work, as is often the case with our Windows Services. Most of the time they are waiting for FileSystemWatcher events or awaiting a Take() on a BlockingCollection. I assume their threads are released to the ThreadPool while they are waiting. Is this assumption correct?"

    No.  When the thread is blocked on a Take() call, for example, it's blocked and effectively "used" - it's not available for use elsewhere.  The ThreadPool may introduce new threads since it's seeing blocked ones, but they're not recycled/reused.

    As such, it's more appropriate to use a dedicated thread, i.e., use LongRunning, if you know the thread will be "alive" (even blocked) for very long periods of time.
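
    One way to observe the difference is to print Thread.CurrentThread.IsThreadPoolThread from inside a default task and a LongRunning task. This is only a sketch; that the default scheduler gives LongRunning tasks a non-pool thread is current implementation behaviour and is treated as an assumption here.

    ```vb
    Imports System
    Imports System.Threading
    Imports System.Threading.Tasks

    Module DedicatedThreadSketch
        Sub Main()
            ' Reports whether the calling code runs on a ThreadPool thread.
            Dim report As Action(Of String) =
                Sub(label) Console.WriteLine(label & ": pool thread = " &
                                             Thread.CurrentThread.IsThreadPoolThread.ToString())

            Dim pooled = Task.Factory.StartNew(Sub() report("default"))
            Dim dedicated = Task.Factory.StartNew(Sub() report("LongRunning"),
                                                  TaskCreationOptions.LongRunning)
            Task.WaitAll(pooled, dedicated)
        End Sub
    End Module
    ```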


    Reed Copsey, Jr. - http://reedcopsey.com

    Thursday, September 20, 2012 4:35 PM
  • This is interesting!
    Ok, my assumption was partly wrong. Take() does not release its thread to the pool while waiting.  The FileSystemWatchers do not use threads while watching.

    I modified my example by using a Dataflow ActionBlock instead of the BlockingCollection:

    Class FileProcessor
        Private _newFilesActionBlock As New ActionBlock(Of String)(
            Async Function(filePath)
                Await ProcessFile(filePath)
            End Function,
                New ExecutionDataflowBlockOptions With {
                    .CancellationToken = _cancellationToken})
        
        Sub _fileWatcher_Created(sender As Object, e As FileSystemEventArgs) _
                                Handles _fileWatcher.Created
            _newFilesActionBlock.Post(e.FullPath)
        End Sub
        '...
    End Class
    

    This solution does not consume threads while idly watching for new files. It spins up threads to process new files and releases them when done. Correct?

    Does this approach make sense?

    Friday, September 21, 2012 5:06 AM
  • This approach seems reasonable - await/async, in general, free up the threads and don't block, so they're a great option for this type of thing.
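
    A minimal console sketch of the ActionBlock variant, including one possible way to drain it on shutdown with Complete() and Completion. The ProcessFile body and the file paths are hypothetical stand-ins, since the thread does not show them, and the System.Threading.Tasks.Dataflow package is assumed to be referenced.

    ```vb
    Imports System
    Imports System.Threading.Tasks
    Imports System.Threading.Tasks.Dataflow

    Module ActionBlockSketch
        ' Hypothetical stand-in for the ProcessFile method that the thread does not show.
        Async Function ProcessFile(filePath As String) As Task
            Await Task.Delay(100) ' simulated asynchronous work; no thread is held while waiting
            Console.WriteLine("Processed " & filePath)
        End Function

        Sub Main()
            Dim block As New ActionBlock(Of String)(
                Async Function(filePath)
                    Await ProcessFile(filePath)
                End Function)

            ' In the service, these calls would come from the FileSystemWatcher handler.
            block.Post("x:\Dir1\a.txt")
            block.Post("x:\Dir1\b.txt")

            ' Equivalent of OnStop: accept no more items, then wait for queued ones to finish.
            block.Complete()
            block.Completion.Wait()
        End Sub
    End Module
    ```

    If a CancellationToken is passed through ExecutionDataflowBlockOptions, as in the sample above, cancelling it makes the block drop queued items instead of draining them, so Completion ends as cancelled and Wait() throws.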


    Reed Copsey, Jr. - http://reedcopsey.com

    • Marked as answer by pmeinl Monday, September 24, 2012 5:50 PM
    Friday, September 21, 2012 3:51 PM