none
How to get differences from two Datatables RRS feed

  • Question

  • See below two Data Tables:

    Table1:

       File ID, File VersionID

       file1,    1

       file1,    2

       file2,    1

      Table 2:

     File ID, File VersionID

       file1,   1

       file1,   3

       file2,   1

    I want to compare Table1 with Table2, get the results to a new Data Table, which rows could not be found in Table2, see below

    Table 3:

    File ID, File VersionID

     file1,    2

      I try to use Merge two tables (preserver changes), and then use GetChanges method, but this does not work.


    • Edited by ScottQ Wednesday, June 20, 2018 1:00 AM
    Wednesday, June 20, 2018 12:59 AM

Answers

  • If the table schemas are identical then merge would give you the result of merging them together, not the differences. 

    For differences I would lean toward using LINQ.

    //DataTable doesn't have generic collections so unless you are using DataSetExtensions you need to force to IEnumerable<T>
    var sourceRows = source.Rows.OfType<DataRow>();
    var destinationRows = destination.Rows.OfType<DataRow>();
    
    var onlyInSource = sourceRows.Except(destinationRows, DataRowComparer.Default);


    Michael Taylor http://www.michaeltaylorp3.net

    • Marked as answer by ScottQ Thursday, June 21, 2018 2:39 AM
    Wednesday, June 20, 2018 1:52 PM
    Moderator
  • Hi ScottQ,

    The maximum number of rows that a DataTable can store is 16,777,216. It is okay for use.

    Please try the code below as well.

    DataTable dt1 = new DataTable();
                dt1.Columns.Add("File ID");
                dt1.Columns.Add("File VersionID");
                dt1.Rows.Add("file1", 1);
                dt1.Rows.Add("file1", 2);
                dt1.Rows.Add("file2", 1);
    
                DataTable dt2 = new DataTable();
                dt2.Columns.Add("File ID");
                dt2.Columns.Add("File VersionID");
                dt2.Rows.Add("file1", 1);
                dt2.Rows.Add("file1", 3);
                dt2.Rows.Add("file2", 1);
    
                DataTable dtmismatch = dt1.AsEnumerable().Except(dt2.AsEnumerable(), DataRowComparer.Default).CopyToDataTable<DataRow>();

    Best Regards,

    Wendy


    MSDN Community Support
    Please remember to click "Mark as Answer" the responses that resolved your issue, and to click "Unmark as Answer" if not. This can be beneficial to other community members reading this thread. If you have any compliments or complaints to MSDN Support, feel free to contact MSDNFSF@microsoft.com.

    • Marked as answer by ScottQ Thursday, June 21, 2018 2:39 AM
    Thursday, June 21, 2018 1:55 AM
    Moderator

All replies

  • If the table schemas are identical then merge would give you the result of merging them together, not the differences. 

    For differences I would lean toward using LINQ.

    //DataTable doesn't have generic collections so unless you are using DataSetExtensions you need to force to IEnumerable<T>
    var sourceRows = source.Rows.OfType<DataRow>();
    var destinationRows = destination.Rows.OfType<DataRow>();
    
    var onlyInSource = sourceRows.Except(destinationRows, DataRowComparer.Default);


    Michael Taylor http://www.michaeltaylorp3.net

    • Marked as answer by ScottQ Thursday, June 21, 2018 2:39 AM
    Wednesday, June 20, 2018 1:52 PM
    Moderator
  • Thank you, Michael Taylor! 

    The example codes you provided work for me.  And then use blow code to convert to DataTable object:

    DataTable dtOnlyInSource = onlyInSource.CopyToDataTable<DataRow>();

    Maybe I need to handle large number of rows. For example there are more than 1,000,000 or 500,000 data rows. 

    And I am also worried about the DataTable object if it is able to contain the large number of data rows.

    Do you have any suggestions about the performance?  I really appreciate for your help.

    Scott 

    Thursday, June 21, 2018 1:21 AM
  • Hi ScottQ,

    The maximum number of rows that a DataTable can store is 16,777,216. It is okay for use.

    Please try the code below as well.

    DataTable dt1 = new DataTable();
                dt1.Columns.Add("File ID");
                dt1.Columns.Add("File VersionID");
                dt1.Rows.Add("file1", 1);
                dt1.Rows.Add("file1", 2);
                dt1.Rows.Add("file2", 1);
    
                DataTable dt2 = new DataTable();
                dt2.Columns.Add("File ID");
                dt2.Columns.Add("File VersionID");
                dt2.Rows.Add("file1", 1);
                dt2.Rows.Add("file1", 3);
                dt2.Rows.Add("file2", 1);
    
                DataTable dtmismatch = dt1.AsEnumerable().Except(dt2.AsEnumerable(), DataRowComparer.Default).CopyToDataTable<DataRow>();

    Best Regards,

    Wendy


    MSDN Community Support
    Please remember to click "Mark as Answer" the responses that resolved your issue, and to click "Unmark as Answer" if not. This can be beneficial to other community members reading this thread. If you have any compliments or complaints to MSDN Support, feel free to contact MSDNFSF@microsoft.com.

    • Marked as answer by ScottQ Thursday, June 21, 2018 2:39 AM
    Thursday, June 21, 2018 1:55 AM
    Moderator
  • Thank you, Wendy!
    Thursday, June 21, 2018 2:38 AM
  • I would recommend against using a datatable for large sets of data. The overhead (especially if actually coming from a database) is high. A data row keeps track of the original and current values so the more rows you have the larger it gets. There is also the overhead of the schema. If you're trying to work with that much data then I really think doing it directly in a  DB would be a better choice.

    If that isn't an option then consider batching instead. Of course if the only thing your app does is work with 1 monolithic table then it would be OK but if this is part of a larger app then you're potentially going to see heavy memory usage. 


    Michael Taylor http://www.michaeltaylorp3.net

    Thursday, June 21, 2018 3:54 AM
    Moderator
  • Thanks for your reminder, Michael Taylor! 

    I will put the values of the two data table objects into database.

    Friday, June 22, 2018 12:18 AM