locked
how to compare two tables? RRS feed

  • Question

  • I need to compare about 40 tables between two databases.  In one database they will be the same data as the other, except they will only go back a few months in history.  (the fields are all the same and in the same order.)
    Is there a sql command/query that would return only those records that are different, between two tables, and I can use a where clause on the larger table to keep it the same dates?

    thanks


    • Edited by danielsn Thursday, July 14, 2016 11:38 PM
    Thursday, July 14, 2016 11:37 PM

Answers

  • You can use INTERSECT statement

    SELECT * FROM tableA 
    INTERSECT 
    SELECT * FROM tableB 

    -EXCEPT returns any distinct values from the left query that are not also found on the right query.

    -INTERSECT returns any distinct values that are returned by both the query on the left and right sides of the INTERSECT operand.

    http://msdn.microsoft.com/en-us/library/ms188055.aspx

    • Marked as answer by danielsn Friday, July 15, 2016 9:07 PM
    Friday, July 15, 2016 12:49 AM
  • Good day danielsn,

    * To clarify: is all rows in the old table, which go back a few months in history, are the same in as these in the new table? Or old rows might be updated as well? If old rows can be update, is only the new tables can be updated or the old as well?

    * direct use of INTERSECT will not help as this return the "rows that are output by both" tables, and you need to find the differences and not the rows that are in both tables.

    If you only insert rows in the new table, and there is no update then (1) you can use EXCEPT. (2) You can use LEFT JOIN with filter all rows that are only in the LEFT table.

    --DROP TABLE IF EXISTS old
    create table old (id int)
    --DROP TABLE IF EXISTS new
    create table new (id int)
    GO
    
    insert old(id) values (1),(2),(4),(6),(8)
    insert new(id) values (1),(2),(4),(6),(8)
    GO
    
    insert new(id) values (21),(22)
    GO
    
    SELECT * FROM new 
    EXCEPT 
    SELECT * FROM old
    GO 

    But if you have updates then EXCEPT will gives only partial information, since its only check one direction. In this case you can use UNION ALL of two EXCEPT queries, but this is will probably gives bad performance.

    * Using JOIN without explicitly say the type of JOIN will uses INNER JOIN, Personally I HIGHLY recommend always to write explicitly "INNER JOIN" instead of "JOIN".

    * INNER JOIN returns the rows that are on both tables, therefore it will not help (directly) to compare, as it will not returns the differences.

    How to compare tables data?

    Option 1: Use tablediff utility

    https://msdn.microsoft.com/en-us/library/ms162843.aspx

    * You can dynamically compare all the tables in specific list or all the tables in the database.
    http://stackoverflow.com/questions/1722846/use-tablediff-to-compare-all-tables

    * In order for tablediff to be included in your installation you must install the SQL Server Replication features along with the Database Engine Services.

    Option 2: Use Visual Studio "Data Compare" tool

    On the SQL menu, point to Data Compare, and then click New Data Comparison. This tool will open a new windows wizard. very simple to use.

    Option 3: Using query

    > 3.1 You can use EXCEPT

    > 3.2 You can use LEFT/RIGHT JOIN

    ** Check this blog: https://www.mssqltips.com/sqlservertip/2779/ways-to-compare-and-find-differences-for-sql-server-tables-and-data/

    ** It will help to post a query that fits your needs if you post your DDL+DML
    ** There are more options as always :-)


    signature   Ronen Ariely
     [Personal Site]    [Blog]    [Facebook]    [Linkedin]



    • Edited by pituachMVP Friday, July 15, 2016 7:00 AM
    • Marked as answer by danielsn Friday, July 15, 2016 9:07 PM
    Friday, July 15, 2016 6:53 AM
  • Here's something to try

    SELECT *
    FROM dbo.table1 c
    JOIN dbo.table1_archive c2 ON c.pk= c2.pk
    AND CHECKSUM(c.col1, c.col2) <> CHECKSUM(c2.col1, c2.col2)

    • Marked as answer by danielsn Friday, July 15, 2016 9:06 PM
    Thursday, July 14, 2016 11:51 PM
  • Check also this article

     T-SQL: Finding Difference in Columns in the Table


    For every expert, there is an equal and opposite expert. - Becker's Law


    My blog


    My TechNet articles

    • Marked as answer by danielsn Monday, July 18, 2016 5:02 PM
    Saturday, July 16, 2016 12:42 AM

All replies

  • Here's something to try

    SELECT *
    FROM dbo.table1 c
    JOIN dbo.table1_archive c2 ON c.pk= c2.pk
    AND CHECKSUM(c.col1, c.col2) <> CHECKSUM(c2.col1, c2.col2)

    • Marked as answer by danielsn Friday, July 15, 2016 9:06 PM
    Thursday, July 14, 2016 11:51 PM
  • You can use INTERSECT statement

    SELECT * FROM tableA 
    INTERSECT 
    SELECT * FROM tableB 

    -EXCEPT returns any distinct values from the left query that are not also found on the right query.

    -INTERSECT returns any distinct values that are returned by both the query on the left and right sides of the INTERSECT operand.

    http://msdn.microsoft.com/en-us/library/ms188055.aspx

    • Marked as answer by danielsn Friday, July 15, 2016 9:07 PM
    Friday, July 15, 2016 12:49 AM
  • >> I need to compare about 40 tables between two databases. In one database they will be the same data as the other, except they will only go back a few months in history. <<

    I have to ask why this is so. The goal of all databases, not just RDBMS, was to reduce redundancy. One fact, represented one way, in one place, one time. If the second date database is an archive, why would you do be doing comparisons do not you trust your archival software?

    >> (the fields [sic] are all the same and in the same order [sic].) <<

    First of all fields and columns are totally different concepts. You might want to look up what a field actually is in SQL (part of a temporal data value). Secondly, rows in a table have no guaranteed order. This is the whole idea of a set oriented database language!

    >> Is there an SQL command/query that would return only those records [sic] that are different, between two tables, and I can use a where clause on the larger table to keep it the same dates? <<

    The short answer is that you can write some set logic using EXCEPT and INTERSECT. The better answer is that several software vendors have products that will do a lot more work when you are comparing across databases. Google around for Redgate, IDERA, Minion, etc. and see you can find a tool. I am old. I hate reinventing the wheel every few years

    --CELKO-- Books in Celko Series for Morgan-Kaufmann Publishing: Analytics and OLAP in SQL / Data and Databases: Concepts in Practice Data / Measurements and Standards in SQL SQL for Smarties / SQL Programming Style / SQL Puzzles and Answers / Thinking in Sets / Trees and Hierarchies in SQL

    Friday, July 15, 2016 3:55 AM
  • one database is our gold standard and the other is not expected to be a perfect match as this will be first round QA. By having a query which shows the difference between the golden database and the one written to by the apps under development it will hopefully point to where in the newly developed code the issues are that are causing the differences. Each time that it fails I'll make a few adjustments, and start over with a newly created set of tables until It is right.
    Friday, July 15, 2016 5:12 AM
  • Good day danielsn,

    * To clarify: is all rows in the old table, which go back a few months in history, are the same in as these in the new table? Or old rows might be updated as well? If old rows can be update, is only the new tables can be updated or the old as well?

    * direct use of INTERSECT will not help as this return the "rows that are output by both" tables, and you need to find the differences and not the rows that are in both tables.

    If you only insert rows in the new table, and there is no update then (1) you can use EXCEPT. (2) You can use LEFT JOIN with filter all rows that are only in the LEFT table.

    --DROP TABLE IF EXISTS old
    create table old (id int)
    --DROP TABLE IF EXISTS new
    create table new (id int)
    GO
    
    insert old(id) values (1),(2),(4),(6),(8)
    insert new(id) values (1),(2),(4),(6),(8)
    GO
    
    insert new(id) values (21),(22)
    GO
    
    SELECT * FROM new 
    EXCEPT 
    SELECT * FROM old
    GO 

    But if you have updates then EXCEPT will gives only partial information, since its only check one direction. In this case you can use UNION ALL of two EXCEPT queries, but this is will probably gives bad performance.

    * Using JOIN without explicitly say the type of JOIN will uses INNER JOIN, Personally I HIGHLY recommend always to write explicitly "INNER JOIN" instead of "JOIN".

    * INNER JOIN returns the rows that are on both tables, therefore it will not help (directly) to compare, as it will not returns the differences.

    How to compare tables data?

    Option 1: Use tablediff utility

    https://msdn.microsoft.com/en-us/library/ms162843.aspx

    * You can dynamically compare all the tables in specific list or all the tables in the database.
    http://stackoverflow.com/questions/1722846/use-tablediff-to-compare-all-tables

    * In order for tablediff to be included in your installation you must install the SQL Server Replication features along with the Database Engine Services.

    Option 2: Use Visual Studio "Data Compare" tool

    On the SQL menu, point to Data Compare, and then click New Data Comparison. This tool will open a new windows wizard. very simple to use.

    Option 3: Using query

    > 3.1 You can use EXCEPT

    > 3.2 You can use LEFT/RIGHT JOIN

    ** Check this blog: https://www.mssqltips.com/sqlservertip/2779/ways-to-compare-and-find-differences-for-sql-server-tables-and-data/

    ** It will help to post a query that fits your needs if you post your DDL+DML
    ** There are more options as always :-)


    signature   Ronen Ariely
     [Personal Site]    [Blog]    [Facebook]    [Linkedin]



    • Edited by pituachMVP Friday, July 15, 2016 7:00 AM
    • Marked as answer by danielsn Friday, July 15, 2016 9:07 PM
    Friday, July 15, 2016 6:53 AM
  • HI<gs class="GINGER_SOFTWARE_mark" ginger_software_uiphraseguid="121a1ead-de33-4c58-a153-880095dfaf2b" id="1b94135e-2bb8-4ed9-8acb-edd21ad3da28">,</gs>

    <gs class="GINGER_SOFTWARE_mark" ginger_software_uiphraseguid="8bfca42b-afc7-4365-8750-865a6dd174d9" id="269e2e29-5fc8-4645-812f-6709597f09a5">using</gs> inner join<gs class="GINGER_SOFTWARE_mark" ginger_software_uiphraseguid="8bfca42b-afc7-4365-8750-865a6dd174d9" id="cd729f9b-c259-4c86-9648-2aad3d11d372">,</gs> we can check it.


    Please click Mark As Answer if my post helped.

    Friday, July 15, 2016 7:58 AM
  • thanks all for the help and much appreciated!
    Friday, July 15, 2016 9:08 PM
  • Check also this article

     T-SQL: Finding Difference in Columns in the Table


    For every expert, there is an equal and opposite expert. - Becker's Law


    My blog


    My TechNet articles

    • Marked as answer by danielsn Monday, July 18, 2016 5:02 PM
    Saturday, July 16, 2016 12:42 AM
  • Thanks Naomi!
    Monday, July 18, 2016 5:03 PM