none
SelectRec: LINQ over recursive data

    General discussion

  • Hi, Rx team!

    Today way to deal with iterating over the recursive data structures is to define recursive C# iterator, but all we know about the problem growing complexity of iterating the recursive iterators. Implementing yield foreach in C# will solve the problem, but it looks like next version of C# will haven't this feature. Today it would be great to have composable LINQ operator to deal with recursive data structures (with a trampoline inside), like this:

    using System;
    using System.Collections.Generic;
    
    public static class RecExtensions
    {
    	#region Public surface
    
    	public static IEnumerable<TSource> SelectRec<TSource>(
    		this IEnumerable<TSource> source,
    		Func<TSource, IEnumerable<TSource>> selector)
    	{
    		return SelectRec(source, _ => true, selector);
    	}
    
    	public static IEnumerable<TSource> SelectRec<TSource>(
    		this IEnumerable<TSource> source,
    		Func<TSource, bool> predicate,
    		Func<TSource, IEnumerable<TSource>> selector)
    	{
    		if (source == null)
    			throw new ArgumentNullException("source");
    		if (predicate == null)
    			throw new ArgumentNullException("predicate");
    		if (selector == null)
    			throw new ArgumentNullException("selector");
    
    		return SelectRecImpl(source, predicate, selector);
    	}
    
    	public static IEnumerable<TSource> SelectRec<TSource>(
    		TSource source,
    		Func<TSource, IEnumerable<TSource>> selector)
    	{
    		return SelectRec(new[] { source }, selector);
    	}
    
    	public static IEnumerable<TSource> SelectRec<TSource>(
    		TSource source,
    		Func<TSource, bool> predicate,
    		Func<TSource, IEnumerable<TSource>> selector)
    	{
    		return SelectRec(new[] { source }, predicate, selector);
    	}
    
    	#endregion
    	#region Implementation
    
    	static IEnumerable<TSource> SelectRecImpl<TSource>(
    		IEnumerable<TSource> source,
    		Func<TSource, bool> predicate,
    		Func<TSource, IEnumerable<TSource>> selector)
    	{
    		LinkedList<TSource> list = null;
    		try
    		{
    			IEnumerator<TSource> e = null;
    			while (true)
    			{
    				// get the new enumerator if needed
    				if (e == null) e = source.GetEnumerator();
    				try
    				{
    					// iterate over the current enumerator
    					while (e.MoveNext())
    					{
    						var o = e.Current;
    						yield return o;
    
    						// if current - is inner enumerable
    						if (predicate(o))
    						{
    							// get the new enumerable
    							source = selector(o);
    							// store current
    							list = new LinkedList<TSource>(e, list);
    							e = null;
    
    							break; // delay enumerator
    						}
    					}
    				}
    				finally // dispose the enumerator if not delayed
    				{
    					if (e != null) e.Dispose();
    				}
    
    				if (e == null) continue; // inner enumerable
    				if (list == null) break; // nothing to enumerate
    				else
    				{
    					e = list.Value;
    					list = list.Next;
    				}
    			}
    		}
    		finally { DisposeRec(list); } // ensure ALL enumerators disposed
    	}
    
    	/// <summary>
    	/// Recursively disposes enumerators from stack
    	/// with the correct exception handling in cases
    	/// when some .Dispose() may throw an exception.
    	/// </summary>
    	static void DisposeRec<T>(LinkedList<T> xs)
    	{
    		if (xs != null)
    		{
    			IDisposable disposable = xs.Value;
    			try { disposable.Dispose(); }
    			finally { DisposeRec(xs.Next); }
    		}
    	}
    
    	sealed class LinkedList<T> // single-linked list
    	{
    		public readonly IEnumerator<T> Value;
    		public readonly LinkedList<T> Next;
    
    		public LinkedList(IEnumerator<T> enumerator, LinkedList<T> next)
    		{
    			this.Value = enumerator;
    			this.Next = next;
    		}
    	}
    
    	#endregion
    }
    
    Usages:

    var dirs = DriveInfo
      .GetDrives()
      .Where(d => d.IsReady)
      .Select(d => d.RootDirectory)
      .SelectRec(dir => {
        try { return dir.EnumerateDirectories(); }
        catch (UnauthorizedAccessException) { }
        return Enumerable.Empty<DirectoryInfo>();
      });
    
    or:

    RecExtensions
    	.SelectRec(
    		treeRoot,
    		node => node is Leaf,
    		node => ((Leaf)node).GetInnerNodes())
    	.Select(x => ...)
    

    What are you thinking about this?

    p.s. Sorry me my English :(

    Friday, November 05, 2010 11:34 AM

All replies

  • Glad to see the community starting to contribute these kind of operators.

    You probably want to take a look at Wes' blog on a simular subject: http://blogs.msdn.com/b/wesdyer/archive/2007/03/23/all-about-iterators.aspx and Erik's paper on the same: http://citeseer.ist.psu.edu/cache/papers/cs2/355/http:zSzzSzwww.cs.kuleuven.ac.bezSz~frankzSzPAPERSzSzFTfJP2005.pdf/iterators-revisited-proof-rules.pdf

     

     

    Friday, November 05, 2010 9:02 PM
  • Hi, Jeffrey!

    Any chance to see this kind of operators as part of Rx?

    Friday, November 05, 2010 10:11 PM
  • I used:

     

     


      public static class RecursiveLINQExtensions
      {
        public static IEnumerable<T> SelectRecursive<T>(this IEnumerable<T> source, Func<IEnumerable<T>, IEnumerable<T>> selector)
        {
          if (source == null)
            throw new ArgumentNullException("source");
          if (selector == null)
            throw new ArgumentNullException("selector");
          Contract.EndContractBlock();

          var hasResults = false;
          do
          {
            foreach (var item in source)
            {
              hasResults = true;
              yield return item;
            }

            source = selector(source);
          }
          while (hasResults);
        }
      }

      public static class LinqExceptionExtensions
      {
        public static TResult Try<T, TResult>(this T obj, Func<T, TResult> selector)
        {
          return obj.Try(selector, default(TResult));
        }

        public static TResult Try<T, TResult>(this T obj, Func<T, TResult> selector, TResult defaultValue)
        {
          var result = default(TResult);
          try
          {
            result = selector(obj);
          }
          catch
          {
            result = defaultValue;
          }
          return result;
        }
      }

     

     

    And:

     

        static void Main(string[] args)
        {
          var dirs = DriveInfo
            .GetDrives()
            .Where(d => d.IsReady)
            .Select(d => (FileSystemInfo)d.RootDirectory)
            .SelectRecursive(parent =>
              from item in parent.OfType<DirectoryInfo>()
              from subitem in item.Try((dir) => dir.GetFileSystemInfos()) ?? Enumerable.Empty<FileSystemInfo>()
              select subitem
              )
              .Take(100);


          foreach (var item in dirs)
          {
            Console.WriteLine(item.FullName);
          }
          return;
        }

     

    I don't see why you had to create a linked list of objects, when a while loop with "yield foreach" and some different requirements produced the same results.

     

    I also implemented mine in the image of the recursive common table expression in SQL. That is, you write some query, then "union all" recursions that increase the depth of that query, by some function of the previous layer of results. You continue until you no longer have any results. Using this style, you can still use a declarative syntax inside the call to SelectRecursive.

     


    All posted code is licensed under the Creative Commons CC-BY license unless stated otherwise.
    Monday, November 08, 2010 1:16 AM
  • Hi, Aaron!

    Linked list is used to reduce enumeration complexity - you always have the current IEnumerator<T> at the list head and shouldn't call MoveNext at every nested IEnumerator<T> to get the next value. Read Erik's paper, it describes the ``yield foreach`` problem :)

    Saturday, February 12, 2011 10:28 AM
  • Hy, RX Team!

    Great release v1.0.2855!

    I've found new Observable.Expand() operator doing the same thing, as I've described in the topic above. Very thank you for including operators like this into Rx!

    It would be great to have interactive and async version of Expand too!

    Saturday, February 12, 2011 10:34 AM
  • Re: ControlFlow, could you link me to Erik's paper on the yield foreach problem? 

     

    Thanks,

    Aaron Friel


    All posted code is licensed under the Creative Commons CC-BY license unless stated otherwise.
    Wednesday, February 23, 2011 3:23 PM
  • Hy, RX Team!

    Great release v1.0.2855!

    I've found new Observable.Expand() operator doing the same thing, as I've described in the topic above. Very thank you for including operators like this into Rx!

    It would be great to have interactive and async version of Expand too!

    Any chance the Expand operator will be in RxJS any time soon?
    Wednesday, March 28, 2012 9:43 AM
  • It will come in our v2.0 Beta of RxJS, soon to be released. It's part of the experimental subset for the time being.

    using (Microsoft.Sql.Cloud.DataProgrammability.Rx) { Signature.Emit("Bart De Smet"); }

    Wednesday, March 28, 2012 9:19 PM