How to use distinct in Group clause

  • Question

  • How do I apply Distinct to the records of every group in a LINQ query?

    Some groups contain duplicate records, but I want only distinct records in each group.

    For example:

     

    from currentTest in tfsDataContext.TestResultAnalysis
    join baseTest in tfsDataContext.TestResultAnalysis on currentTest.TestId equals baseTest.TestId
    where retreivalOptions.BaseRunName.Contains(baseTest.Name) && baseTest.Analyzed == true &&
          retreivalOptions.CurrentRunName.Contains(currentTest.Name) && currentTest.Analyzed == false
    orderby baseTest.TestId ascending
    group baseTest by new {runName = baseTest.Name, time = baseTest.DateFinished} into groupedTests
    orderby groupedTests.Key.time
    select groupedTests;

     

    I get the following groups:

    G1 - {Test1, Test2, Test3, Test3, Test4}

    G2 - {Test3, Test2, Test2, Test4}

    What if I want only unique records in every group?

    Please suggest a way to fix this problem.

    Tuesday, March 25, 2008 4:18 PM

Answers

  • from currentTest in tfsDataContext.TestResultAnalysis
    join baseTest in tfsDataContext.TestResultAnalysis on currentTest.TestId equals baseTest.TestId
    where retreivalOptions.BaseRunName.Contains(baseTest.Name) && baseTest.Analyzed == true &&
          retreivalOptions.CurrentRunName.Contains(currentTest.Name) && currentTest.Analyzed == false
    orderby baseTest.TestId ascending
    group baseTest by new {runName = baseTest.Name, time = baseTest.DateFinished} into groupedTests
    orderby groupedTests.Key.time
    select groupedTests.Distinct()

    Wednesday, March 26, 2008 2:25 AM
    Moderator
  • There is a huge difference. If you put the query in parentheses, Distinct is applied to the result of the whole query. The query returns a sequence of grouped sequences, so Distinct there compares entire groups with each other rather than the items inside each group, and the duplicates within a group remain. If you don't use the parentheses, Distinct is part of the select expression: it operates on just 'groupedTests', so each group is reduced to a sequence of its distinct items.
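
    The difference can be sketched with LINQ to Objects. This is only an illustration under stated assumptions: the in-memory `Rows` array and the class name `DistinctInGroupDemo` are hypothetical stand-ins, not part of the original query, which runs against `tfsDataContext` and may behave differently once its provider translates it. The data mirrors the G1 and G2 groups from the question.

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;

    public static class DistinctInGroupDemo
    {
        // Hypothetical stand-in for the query's rows: (run name, test name).
        public static readonly (string Run, string Test)[] Rows =
        {
            ("G1", "Test1"), ("G1", "Test2"), ("G1", "Test3"), ("G1", "Test3"), ("G1", "Test4"),
            ("G2", "Test3"), ("G2", "Test2"), ("G2", "Test2"), ("G2", "Test4"),
        };

        // Distinct inside the select clause: it runs once per group.
        public static List<List<string>> DistinctPerGroup() =>
            (from row in Rows
             group row.Test by row.Run into g
             orderby g.Key
             select g.Distinct().ToList()).ToList();

        // Distinct around the whole query: it compares the group objects
        // themselves, so duplicates inside each group are left untouched.
        public static List<List<string>> DistinctOverWholeQuery() =>
            (from row in Rows
             group row.Test by row.Run into g
             orderby g.Key
             select g).Distinct().Select(g => g.ToList()).ToList();

        public static void Main()
        {
            foreach (var group in DistinctPerGroup())
                Console.WriteLine(string.Join(", ", group));

            foreach (var group in DistinctOverWholeQuery())
                Console.WriteLine(string.Join(", ", group));
        }
    }
    ```

    With this data, `DistinctPerGroup()` yields groups of 4 and 3 items, while `DistinctOverWholeQuery()` still yields 5 and 4 items, because the two group objects are already distinct from each other.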

     

    Friday, March 28, 2008 2:53 AM
    Moderator

All replies

  • I have a question.

    Let the above query be represented as QUERY. What is the difference between the following two ways of writing it?

    Case 1:

    var results = (QUERY).Distinct()

    Case 2:

    var results = QUERY.Distinct()

    I tried case 1 earlier and was not able to get the required output.

     

    Wednesday, March 26, 2008 5:38 AM