none
SQL Fastest Way to find Data Difference Between Two Tables RRS feed

  • Question

  • I am trying to locate fastest way to find difference between two tables.  The first table is populated from text files daily, the second table is maintained in our database.  We need to add data that is new or has changed from the text files. The comparison is done on the primary key.

        create table dbo.CustomerTransaction
        (
            CustomerTransactionId int primary key,
            CustomerName varchar(50),
            ProductName varchar(50),
            QuantityBought int
        )

    So row values Table 1: (1,'Bob','Table',8) is the Same as Table 2: (1,'Bob','Table',8) 

    These are different (1,'Bob','Table',8) ,  (1,'Bob','Chair',8) , different on the primary key.

    I am seeing a lot of articles discussing methods, with Left Join, Except, tablediff, Visual Studio DataComparison, Union group.

    However, no one talks about performance/speed.  Which is the fastest internal algorithm way? I am guessing it is the TableDiff Utility, which is a SQL tool designed specifically for this purpose.

    We do not have access to CDC in legacy text file system, so we are extracting all new data into SQL Server daily, and comparing to previous data. We are finding incremental load values, and placing into Kimball Data Warehouse.

    Neither table is actively being used at the time we are doing the comparison.

    https://dba.stackexchange.com/questions/64878/how-to-find-out-the-content-differences-between-2-sql-tables-and-produce-sync-sq
    https://stackoverflow.com/questions/4602083/sql-compare-data-from-two-tables
    Monday, October 15, 2018 10:55 PM

All replies

  • Hi,

    One of the preferred way is CHECKSUM. You can generate a value with checksum function for all the table columns and store it in the target tables and compare source CHECKSUM values (generate them on the fly at the time of comparison) with target CHECKSUM values, so in this case no need to match all the source & target columns.



    Cheers,

    Tuesday, October 16, 2018 1:41 AM
  • are you actually referring to table rows? find checksums for all rows, thats what we want to compare, and find delta rows between tables

    Thanks,

    Tuesday, October 16, 2018 2:55 PM
  • Yes Absolutely.

    Cheers,

    Monday, October 29, 2018 8:15 AM