locked
compare two CSV files RRS feed

  • Question

  • Hi Experts,

    I have been given two CSV files both contains same number of columns and same column-names. (approx each files contains 50 columns). In both the csv files we have one column saying ID through which I need to compare columns between two files.

    Now with respect to this ID column we need to compare value of column1 of 1st CSV file with the value of column1 of 2nd CSV file (either FAIL OR PASS), likewise w.r.t value of ID column we need to compare value of column2 of 1st CSV file with the value of column2 of 2nd CSV file (either FAIL OR PASS)........ in such comparing 50 columns.

    Suppose I say above 2 CSV files makes one 1 batch likewise I have 40 batches, which means I have 80 CSV files with each file contain approx of 50 columns what is the easist way to perform and how?

    I tried loading 2 csv files (1 batch) into 2 DB Tables using import-export wizard and then writing 50 CASE statements between 2 DB Tables to say either PASS or FAIL, but this is just 1 batch likewise I have 40 batches so either is the good solution and how?

    Is their any good way to do using SSIS package? If so can anybody tell me how?

    Thanks

     

    Regards,

    Kumar

    Monday, April 5, 2010 12:10 PM

Answers

  • Staging the CSV files into DB tables sounds the right approach. Assuming the data from two batched CSVs already is in the respective tables, you can use the MERGE statement in SQL 2008 to find out if the row sets beign compared are same or not. In case you are not using SQL 2008, Jamie's blog on this topic will help implementing alternative approaches for this.

    Hope this helps.

     

    Cheers!!

    Muqadder.

    Monday, April 5, 2010 12:20 PM
  • Moving the data to a DB would be a simpler approach and I would be inclined to take this approach.

    After moving the data to the table I would rather create one dynamic stored proc to get the data rather than writing 40 queries having the same logic.


    Sudeep's Domain
    • Marked as answer by gk1393 Tuesday, April 6, 2010 3:08 AM
    • Marked as answer by gk1393 Tuesday, April 6, 2010 3:08 AM
    Monday, April 5, 2010 1:59 PM

All replies

  • Staging the CSV files into DB tables sounds the right approach. Assuming the data from two batched CSVs already is in the respective tables, you can use the MERGE statement in SQL 2008 to find out if the row sets beign compared are same or not. In case you are not using SQL 2008, Jamie's blog on this topic will help implementing alternative approaches for this.

    Hope this helps.

     

    Cheers!!

    Muqadder.

    Monday, April 5, 2010 12:20 PM
  • Moving the data to a DB would be a simpler approach and I would be inclined to take this approach.

    After moving the data to the table I would rather create one dynamic stored proc to get the data rather than writing 40 queries having the same logic.


    Sudeep's Domain
    • Marked as answer by gk1393 Tuesday, April 6, 2010 3:08 AM
    • Marked as answer by gk1393 Tuesday, April 6, 2010 3:08 AM
    Monday, April 5, 2010 1:59 PM
  • If this is one time job, You can use excel. use =+E1=A1 (assuming that you need to compare E1 and A1 cells) . This will simply give you the TRUE and FALSE answeres.

    Then filter for FALSE.

     


    Blog: http://dineshasanka.spaces.live.com
    Monday, April 5, 2010 8:14 PM