locked
Caching of lazy asynchronous operation RRS feed

  • Question

  • Hi,

    I have a data I need to query over the network in asynchronous way.
    I'd like to cache the result of this operation only if it succeeds.
    This is what I came up with:

        class Cache
        {
            private Task<FatData> _data;

            public Task<FatData> Data
            {
                get
                {
                    if (_data == null)
                    {
                        var tcs = new TaskCompletionSource<FatData>();

                        GetDataAsync()
                        .ContinueWith(d =>
                        {
                            if (_data != null)
                            {
                                tcs.SetResult(_data.Result);
                            }
                            else
                            {
                                try
                                {
                                    tcs.SetResult(d.Result);
                                    _data = tcs.Task;
                                }
                                catch (Exception ex)
                                {
                                    tcs.SetException(ex);
                                }
                            }
                        });
                        return tcs.Task;
                    }
                    return _data;
                }
            }
        }

    Judging by System.Threading.Tasks.Task.ContinueWithCore is seems like it can be done better.

    Should I update _data using Interlocked.CompareExchange instead of _data = tcs.Task?

    Can you see any other improvements can be made to this code?

     

    Thanks

    Friday, September 30, 2011 7:30 AM

Answers

  • Hi Shweduke-

    I don't know all of the constraints you're operating under, but from cases I've seen related to this, if you only want to cache when the operation is successful, then it's made sense to have all operations do their own asynchronous operation until a successful value is cached.  If that works for you, a simple solution is something like:

        public sealed class LazyAsyncValue<T>
        {
            private Func<Task<T>> m_func;
            private Task<T> m_successTask;

            public LazyAsyncValue(Func<Task<T>> func) { m_func = func; }

            public Task<T> GetValueAsync()
            {
                if (m_successTask != null) return m_successTask;
                var t = m_func();
                t.ContinueWith(
                    completed => Interlocked.CompareExchange(ref m_successTask, completed, null),
                    CancellationToken.None,
                    TaskContinuationOptions.OnlyOnRanToCompletion | TaskContinuationOptions.ExecuteSynchronously,
                    TaskScheduler.Default);
                return t;
            }
        }

    Calling GetValueAsync will give you back the cached success task if it already exists.  If it doesn't exist, the operation just uses the supplied func to create a new task which is returned to the caller.  Additionally, when that task completes, if it completes successfully we try to store it as the cached success task.

    Regarding Monitor vs SpinLock, I typically recommend folks always use Monitor unless through performance analysis they determine that SpinLock yields better performance.

    I hope that helps.


    Sunday, December 4, 2011 5:10 PM
    Moderator
  • ps Your #3 has a bug... if the operation gets canceled, you never set _value to null nor complete the TCS, so the returned task will never complete.
    Sunday, December 4, 2011 5:19 PM
    Moderator

All replies

  • Here is my another take on it:

        class Cache
        {
            private readonly object _syncLock = new object();
            private Task<string> _data;

            public Task<string> Data
            {
                get
                {
                    if (_data == null)
                    {
                        lock (_syncLock)
                        {
                            if (_data == null)
                            {
                                var tcs = new TaskCompletionSource<string>();

                                GetDataAsync().ContinueWith(d =>
                                {
                                    try
                                    {
                                        tcs.SetResult(d.Result);
                                    }
                                    catch (Exception ex)
                                    {
                                        tcs.SetException(ex);
                                        lock (_syncLock)
                                        {
                                            _data = null;
                                        }
                                    }
                                });
                                _data = tcs.Task;
                            }
                        }
                    }
                    return _data;
                }
            }
        }

    Any comments will be appreciated.

    Friday, September 30, 2011 10:34 AM
  • I see that you put into cache only the first received value. Does it mean that data does not change?
    Friday, September 30, 2011 7:31 PM
  • In both solutions all callers of Cache.Data will be blocked till the first GetDataAsync() completed with success but on other side in the code sample #1, in case there is no dat in cache, GetDataAsync() will be called per each concurrent caller, the code will receive a number of data instances and throw away all except first.

    So I'd prefer the solution #2 as it wastes less resources, plus I'm also concerning about not using Interlocked.CompareExchange in solution #1

    Friday, September 30, 2011 8:32 PM
  • Yes, it would not be OK to cache it otherwise.
    Monday, October 3, 2011 1:44 AM
  • There is a bug in #2.

    Here is #3, which is more straight forward and much more likely to be correct:

    public sealed class LazyAsyncValue<T>
    {
        private readonly object _syncLock = new object();
        private Func<Task<T>> _getValue;
        private Task<T> _value;

        internal LazyAsyncValue(Func<Task<T>> getValue)
        {
            _getValue = getValue;
        }

        public Task<T> GetValue()
        {
            lock (_syncLock)
            {
                if (_value == null)
                {
                    var tcs = new TaskCompletionSource<T>();

                    _getValue().ContinueWith(d =>
                    {
                        if (d.IsFaulted)
                        {
                            lock (_syncLock)
                            {
                                _value = null;
                            }
                            tcs.SetException(d.Exception.InnerExceptions);
                        }
                        else if (!d.IsCanceled)
                        {
                            tcs.SetResult(d.Result);
                            _getValue = null;
                        }
                    }, TaskContinuationOptions.ExecuteSynchronously);

                    _value = tcs.Task;
                }
                return _value;
            }
        }
    }

    public static class LazyAsyncValue
    {
        public static LazyAsyncValue<T> Create<T>(Func<Task<T>> getValue)
        {
            if (getValue == null) throw new ArgumentNullException("getValue");
            return new LazyAsyncValue<T>(getValue);
        }
    }

    I guess we could use SpinLock instead of Monitor to save a bit of the memory, but it's a too voodoo to try at the moment.

     

    Maybe Stephen could advise on that.

    Monday, October 3, 2011 5:56 AM
  • Hi Shweduke-

    I don't know all of the constraints you're operating under, but from cases I've seen related to this, if you only want to cache when the operation is successful, then it's made sense to have all operations do their own asynchronous operation until a successful value is cached.  If that works for you, a simple solution is something like:

        public sealed class LazyAsyncValue<T>
        {
            private Func<Task<T>> m_func;
            private Task<T> m_successTask;

            public LazyAsyncValue(Func<Task<T>> func) { m_func = func; }

            public Task<T> GetValueAsync()
            {
                if (m_successTask != null) return m_successTask;
                var t = m_func();
                t.ContinueWith(
                    completed => Interlocked.CompareExchange(ref m_successTask, completed, null),
                    CancellationToken.None,
                    TaskContinuationOptions.OnlyOnRanToCompletion | TaskContinuationOptions.ExecuteSynchronously,
                    TaskScheduler.Default);
                return t;
            }
        }

    Calling GetValueAsync will give you back the cached success task if it already exists.  If it doesn't exist, the operation just uses the supplied func to create a new task which is returned to the caller.  Additionally, when that task completes, if it completes successfully we try to store it as the cached success task.

    Regarding Monitor vs SpinLock, I typically recommend folks always use Monitor unless through performance analysis they determine that SpinLock yields better performance.

    I hope that helps.


    Sunday, December 4, 2011 5:10 PM
    Moderator
  • ps Your #3 has a bug... if the operation gets canceled, you never set _value to null nor complete the TCS, so the returned task will never complete.
    Sunday, December 4, 2011 5:19 PM
    Moderator