none
LINQ against datatable, getting distinct count RRS feed

  • Question

  • I seem to be going around in a circle trying to get the results I'm looking for. I'm returning a datatable from a SQL table that essentially looks like:

    Filename varchar(60)
    UserID varchar(60)

    I want to generate a list of filenames, grouped by the filename, along with the count of the number of users, and the count of distinct users, ex.

    Code Snippet

    Filename    UserID
    ---------------------------------------------
    Filename1, SomePerson
    Filename1, SomePerson
    Filename2, SomeOtherPerson
    Filename3, AnotherPerson



    should generate:

    Code Snippet

    Filename    Count       Unique Count
    --------------------------------------------------
    Filename1,   2,           1
    Filename2,   1,           1
    Filename3,   1,           1



    The SQL code would just be something like:

    Code Snippet

    SELECT Filename,
           COUNT(*) AS Views,
           COUNT(DISTINCT UserID) AS UniqueViews
      FROM MyTable
     GROUP BY Filename



    Finding the count is easy enough with this:

    Code Snippet

    var result = from row in files.DataSet.Tables[0].AsEnumerable()
                 group row by new { File = row.Field<string>("Filename") } into grp                           
                 select new
                 {                            
                      File = grp.Key.File,
                      Views = grp.Count(),                            
                 };


    But I can't seem to get the unique count in there. I've tried a number of variations with subselects in the select, grouping by both Filename and UserID w/subselects, but end up with wrong counts.

    Any ideas?



    Monday, July 7, 2008 2:31 PM

Answers

  • Wow, talk about bad timing with debugging this. I initially ran my stored procedure inside of SQL Mgt. Studio then manually counted the unique views for an item I picked at random (against our production server). Then I constructed the LINQ query and tested it against my expected results; when I compared the counts they were different. I assumed that the issue was with my LINQ query. It didn't occur to me that, yes, it's a production server so the data can change. What I've been thinking was the incorrect count was in fact correct. Doh!

    At any rate, here's what I ended up with:


    Code Snippet

    var result = from row in files.DataSet.Tables[0].AsEnumerable()
                 group row by new
                 {
                     File = row.Field<string>("OriginalFilename")                          
                 } into grp
                 select new
                 {
                     File = grp.Key.File,
                     Views = grp.Count(),
                     UniqueViews = (from p in grp
                                    select p.Field<string>("UserID")).Distinct().Count()                                    
                 };

    Wednesday, July 9, 2008 1:38 PM

All replies

  • I would suggest that you post the question in the DataSet forum.

     

    Thanks,

     

    --Samir

     

     

    Tuesday, July 8, 2008 11:22 PM
  • I did originally. Someone moved it here...
    Wednesday, July 9, 2008 12:42 PM
  • Wow, talk about bad timing with debugging this. I initially ran my stored procedure inside of SQL Mgt. Studio then manually counted the unique views for an item I picked at random (against our production server). Then I constructed the LINQ query and tested it against my expected results; when I compared the counts they were different. I assumed that the issue was with my LINQ query. It didn't occur to me that, yes, it's a production server so the data can change. What I've been thinking was the incorrect count was in fact correct. Doh!

    At any rate, here's what I ended up with:


    Code Snippet

    var result = from row in files.DataSet.Tables[0].AsEnumerable()
                 group row by new
                 {
                     File = row.Field<string>("OriginalFilename")                          
                 } into grp
                 select new
                 {
                     File = grp.Key.File,
                     Views = grp.Count(),
                     UniqueViews = (from p in grp
                                    select p.Field<string>("UserID")).Distinct().Count()                                    
                 };

    Wednesday, July 9, 2008 1:38 PM
  • Paul,

    What if you had to add the following column to your final table?  I am having trouble getting this information using LINQ statements in a similar scenario...

    INITIAL TABLE:

    Filename    UserID
    ---------------------------------------------
    Filename1, SomePerson

    Filename1, AnotherPerson

    Filename1, SomePerson

    Filename2, SomeOtherPerson
    Filename3, AnotherPerson

    FINAL TABLE:

    Filename    Count       Unique Count                UserID
    -------------------------------------------------------------------------
    Filename1,   2,           1                                   SomePerson, AnotherPerson
    Filename2,   1,           1                                   SomeOtherPerson
    Filename3,   1,           1                                   AnotherPerson

    Thanks,

    Daniel




    • Edited by drmercado Friday, September 14, 2012 2:12 AM
    Friday, September 14, 2012 2:10 AM