Using LINQ with duplicate and distinct ID numbers in DataTable RRS feed

  • Question

  • My project is to read an Excel file into memory and process each row.

    I am reading the Excel file into a Data.DataTable and it has over 100

    columns and over 250 rows.  One column serves as the ID column for

    the DataTable and it contains multiple rows for some, but not all, ID

    numbers.  So, ID "1" could have one row, ID "2" could have 5 rows,

    ID "3" could have 20 rows and I need to process each row, as well as

    keep track of the distinct ID numbers I have processed in groups of

    10 distinct ID numbers.  So for the scenario I cited above, I would

    process distinct ID numbers 1 through 10 but possibly process from

    10 to 200 rows for those distinct ID numbers.  I am new to LINQ, but

    I was told I could use it to help me keep track of the distinct ID numbers

    in groups of ten, and also use it to process all the necessary rows (1 to 20)

    for each ID number.  I have to process each row separately, so I cannot

    remove any duplicate rows (somewhere in the 100+ columns there is unique

    data for that row that need to be entered into the mainframe.

    How would I do this?  Thanks.

    • Edited by RandyHJ57 Tuesday, September 5, 2017 3:21 PM
    Tuesday, September 5, 2017 3:18 PM

All replies

  • I need someone who is an expert or really proficient with LINQ to help me to solve this problem with my work project.  I have gotten as far as reading the Excel file into memory in a DataTable and even managed to get an IEnumerable<DataRow> collection from the DataTable with this code:

    Dim ids = From row In sampleTable.AsEnumerable()
    however, I don't know where to go from here.  I need to process every row, but I also need to keep track of distinct ids in  order to subdivide and close units of work.  Can anyone help me?

    Tuesday, September 5, 2017 5:39 PM
  • Hi,

    First, I recommend you start from :   Single-Table Queries (LINQ to DataSet)

    Here have whole explanation about LINQ to DataSet.

    After reader your post, I admit that I'm not completely sure what you are actually trying to achieve.

    The following group query maybe is what you want.

                           var query = from data in dt.AsEnumerable()
                            where data.Field<double>("ID") >= 1 && data.Field<double>("ID") <= 10
                            group data by new
                                ID = data.Field<double>("ID"),
                            } into gro
                            select new { Id = gro.Key, count = gro.Count() };

    If I have any misunderstanding, I suggest you try to rephrase your question and give us more context.

    Best Regards,


    MSDN Community Support
    Please remember to click "Mark as Answer" the responses that resolved your issue, and to click "Unmark as Answer" if not. This can be beneficial to other community members reading this thread. If you have any compliments or complaints to MSDN Support, feel free to contact MSDNFSF@microsoft.com.

    Wednesday, September 6, 2017 12:15 PM