I can not understand the Group Join in Linq RRS feed

  • Question

  • I have already searched on the forums and found a diagram.



    LINQ supports grouping joins which are a superset of inner and left outer join. A grouping join produces a hierarchical result that pairs each element from the outer sequence with its corresponding sequence of elements from the inner sequence. For example (from the Standard Query Operators documentation):

    can be written as a grouping join followed by an iteration of the grouped orders

    The query can be turned into a left outer join by applying the DefaultIfEmpty operator to the grouped orders

    DefaultIfEmpty() turns an empty sequence into a sequence with a single null element. Thus, the code must now be prepared to deal with a null o.
    Hope this helps. For more details, check out the Standard Query Operators documentation (in the Docs directory under your LINQ preview install) section 1.6.2.

    But I still can not understand the group join,
    Code Block
    var query = from customerRow in db.CustomerTable
                join orderRow in db.OrderTable
                    on customerRow.CustomerID equals orderRow.CustomerID into customerOrders
                from orderRow in customerOrders.DefaultIfEmpty()
                select new
                    OrderID = orderRow.OrderID == null ? -1L : orderRow.OrderID



    what is the customerOrders here?
    and I do not understand from orderRow in customerOrders.DefaultIfEmpty() , I think if the orderRow is empty, the DefaultIfEmpty() should return null, so the empty order rows shoule not be included in the result, but it is there.
    So what is the customerOrders indeed? what hierarchical structure it is?
    I have two tables:
    ( CustomerID | CustomerName)
       1                   John
       2                   Jerry
       3                   Joke
       4                   Jenny
    ( OrderID | CustomerID )
       1                    1
       2                    1
       3                    1
       4                    2
    what is the group join result of these tables?
    Friday, November 2, 2007 8:46 AM


  • A group join breaks up the joined table into sequences of objects each matching/corresponding to an object from the other table.  In your example, when you use the join/into you are performing a group join.  All the order objects associated with a given customer are collected into a separate collection that you have named 'customerOrders'.  This is similar to how a GroupBy works, except a GroupBy doesn't join two tables, it turns a single table into a sequence of keys each paired with a collection of matching objects in that group.


    For example, if I were to use a GroupBy with the OrderTable, I could group the orders by customer.


    var q = from o in db.OrderTable

              group o.OrderID by o.CustomerID into g

              select new {Customer = g.Key, Orders = g};


    what I get is this:


      Customer       Orders

      -------------  --------

      1              1




      2              4


    This is not a normal SQL Group By.  A LINQ GroupBy creates a heirarchy.  Every item resulting from the GroupBy has a Key value and a sequence.  The sequence (or group) value is a collection with zero or more items.


    A Group Join is very much like a Join and a Group By smashed together.  A SQL join takes two tables, performs a cartesian product, filters it and produces a resulting table.  A Group Join takes two tables and forms a pairing of left hand side objects (keys) and sequences of right-hand side matching objects (groups).


    var q - from c in db.CustomerTable

             join o in db.OrderTable on c.CustomerID equals o.CustomerID into g

             select new {Customer = c.CustomerName, Order = g.Select(o => o.OrderID) };


    Customer         Orders

    ---------------  ----------

    John             1




    Jerry            4



    Sunday, November 4, 2007 4:27 AM