none
Enumerable 'Except' method implementation RRS feed

  • Question

  • Good day.

    In Enumerable.cs i found implementation of extension method 'Except'. She uses 'Set<TElement>' class to optimize search. Why not used 'SortedSet<T>' class instead?


    Monday, July 11, 2016 9:48 AM

Answers

  • I think that since Set is an internal class, it probably was specially optimised for some of operations, while the general SortedSet and HashSet classes work better in other, common circumstances. The internal class does not even implement the usual interfaces, such as ISet and ICollection, therefore it has a limited optimised usage.


    Monday, July 11, 2016 11:12 AM

All replies

  • Erm, because the Except extension method gives you back all elements in one set that do not appear in the second set. As documented. And because it is an extension method of IEnumerable it will work with any collection that implements IEnumerable (which is most if not all of them).

    Whereas, SortedSet is a class for maintaining a set of objects in order. You would use the Except() extension method with a SortedSet if you wanted to of course, because SortedSet implements IEnumerable.

    (SortedSet also has its own ExceptWith method, though that does something slightly different)

    Monday, July 11, 2016 10:17 AM
  • Let me explain my question.

    Here is an implementation of the standard library:

            public static IEnumerable<TSource> Except<TSource>(this IEnumerable<TSource> first, IEnumerable<TSource> second, IEqualityComparer<TSource> comparer)
            {
                if (first == null) throw Error.ArgumentNull("first");
                if (second == null) throw Error.ArgumentNull("second");
                return ExceptIterator<TSource>(first, second, comparer);
            }
     
            static IEnumerable<TSource> ExceptIterator<TSource>(IEnumerable<TSource> first, IEnumerable<TSource> second, IEqualityComparer<TSource> comparer) {
                Set<TSource> set = new Set<TSource>(comparer);
                foreach (TSource element in second) set.Add(element);
                foreach (TSource element in first)
                    if (set.Add(element)) yield return element;
            }

    As you can see, method 'ExceptIterator' uses class Set for optimization.

    SortedSet is a class that implements red-black tree data structure.

    Why use Set, not SortedSet?

    It will search in SortedSet more effective?





    Monday, July 11, 2016 11:03 AM
  • I think that since Set is an internal class, it probably was specially optimised for some of operations, while the general SortedSet and HashSet classes work better in other, common circumstances. The internal class does not even implement the usual interfaces, such as ISet and ICollection, therefore it has a limited optimised usage.


    Monday, July 11, 2016 11:12 AM
  • Ah, I see.

    Interestingly, the Set<> class is an internal class defined in Enumerable.cs. Have a scroll down and look in Enumerable.cs.

    It uses some system of 'buckets' and 'slots' to maintain its internal list. I'm afraid I don't have the mathematical analysis knowledge to be able to compare this with SortedSet, but one would hope that this is designed for fast lookups of existing items!

    Monday, July 11, 2016 11:17 AM
  • It would be interesting to compare the performance of these data structures. You have not seen anything similar?
    Monday, July 11, 2016 12:24 PM