none
Count Distinct with deferred query RRS feed

  • Question

  • Hi,

     

    I am using LINQ to do a count of records across a one to many entity relationship, as follows:

     

     

    bool includeActive = true;
    bool includeInactive = false
    ;
    string username = "frim"
    ;

    var
    count = vs_PollOptions
    .Where(v => (v.vs_Polls.AddedBy == username && v.OptionStatus ==
    1
    ))
    .Select(
        v =>
    new

    {
        PollID = v.vs_Polls.PollID,
        IsActive = v.vs_Polls.IsActive
    }
    );

    if (includeActive && !includeInactive)
    {
        count = count.Where(i => i.IsActive ==
    true
    );
    }
    else if
    (!includeActive && includeInactive)
    {
        count = count.Where(i => i.IsActive ==
    false
    );
    }

    int
    resultCount = count.Count();

     

     

    The resulting SQL is as follows:

     

     

    SELECT COUNT(*) AS [value]
    FROM [vs_PollOptions] AS [t0]
    INNER JOIN [vs_Polls] AS [t1] ON [t1].[PollID] = [t0].[PollID]
    WHERE ([t1].[IsActive] = 1) AND ([t1].[AddedBy] = @p0) AND ([t0].[OptionStatus] = @p1)
    -- @p0: Input NVarChar (Size = 4; Prec = 0; Scale = 0) [frim]
    -- @p1: Input Int (Size = 0; Prec = 0; Scale = 0) [1]

     

     

    This is pretty much as I would expect. However, I want to do a Distinct Count on a single column. I can achieve this pretty easily in SQL by just changing 'COUNT (*)', like this:

     

     

    SELECT COUNT(DISTINCT [t1].[PollID]) AS [value]
    FROM [vs_PollOptions] AS [t0]
    INNER JOIN [vs_Polls] AS [t1] ON [t1].[PollID] = [t0].[PollID]
    WHERE ([t1].[IsActive] = 1) AND ([t1].[AddedBy] = @p0) AND ([t0].[OptionStatus] = @p1)
    -- @p0: Input NVarChar (Size = 4; Prec = 0; Scale = 0) [frim]
    -- @p1: Input Int (Size = 0; Prec = 0; Scale = 0) [1]

     

     

    But I cannot seem to manage the same with LINQ to SQL. The closest I can come up with is this:

     

     

    bool includeActive = true;
    bool includeInactive = false
    ;
    string username = "frim"
    ;

    var
    count = vs_PollOptions
    .Where(v => (v.vs_Polls.AddedBy == username && v.OptionStatus ==
    1
    ))
    .Select(
        v =>
    new

    {
        PollID = v.vs_Polls.PollID,
        IsActive = v.vs_Polls.IsActive
    }
    ).Distinct();

    if (includeActive && !includeInactive)
    {
        count = count.Where(i => i.IsActive ==
    true
    );
    }
    else if
    (!includeActive && includeInactive)
    {
        count = count.Where(i => i.IsActive ==
    false
    );
    }

    int resultCount = count.Count();

     

     

    All I have done is append 'Distinct()' onto the end of the original query. But this generates the following SQL:

     

     

    SELECT COUNT(*) AS [value]
    FROM (
        SELECT DISTINCT [t1].[PollID], [t1].[IsActive]
        FROM [vs_PollOptions] AS [t0]
        INNER JOIN [vs_Polls] AS [t1] ON [t1].[PollID] = [t0].[PollID]
        WHERE ([t1].[AddedBy] = @p0) AND ([t0].[OptionStatus] = @p1)
        ) AS [t2]
    WHERE [t2].[IsActive] = 1
    -- @p0: Input NVarChar (Size = 4; Prec = 0; Scale = 0) [frim]
    -- @p1: Input Int (Size = 0; Prec = 0; Scale = 0) [1]

     

     

    Which I don't think is as efficient.

     

    Is there any way I can convince LINQ to SQL to give me the simpler 'COUNT (DISTINCT [t1.PollID])' version, while still giving me the flexibility of deferred execution so I can add in my logic to determine whether Active or Inactive records are returned?

     

    Any thoughts very much appreciated.

     

    Thanks...

    Thursday, July 10, 2008 3:36 PM

Answers

  • Well, having a look at the Query Execution Plans for both versions, SQL Server 2005 seems to treat them exactly the same. So perhaps there is no efficiency issue here after all.

     

    Stilll, knowing the answer to this question (if there is one) would give useful insight into how LINQ to SQL handles Distinct().

     

     

    Thursday, July 10, 2008 4:37 PM