none
comparing two excel files and writing the difference into a new worksheet using c#.net RRS feed

  • Question

  • Could anyone help me, i'm doing a program using excel files in c# for the first time. My code starts from fetching two/multiple files from a specific folder. But after that, im already lost as to what will be the next process. Any help will be much appreciated.

    The program should output the differences seen from the two/multiple excel files.

    Thank you in advance. God bless.:)

    • Moved by Leo Liu - MSFT Tuesday, September 13, 2011 6:24 AM Moved for better support. (From:Visual C# General)
    Sunday, September 11, 2011 2:44 PM

Answers

  • Dear -Maez-

    for you work just follow this simple steps.

    1. download component one flexgrid classic control from his site.

    2. install it & add reference to your application.

    3. Add 3 Flexgrid to you form (1 for excel file1, 2nd for excel file2, 3rd for your output file)

    4. you can use flexgrid openexcelfile command to load excel file directly to grid. (.xls or .xlsx both files are supported)

    5. wright a for next loop and read & compare both grids cell value & store it in 3rd grid.

    6. after finish your loop, use flexgrid's saveexcel command to save grid as a excel file. (.xls & .xlsx Both format supported).

    7. Now open your saved excel file in excel & your result is ready as your samples image.


    Sunday, September 11, 2011 6:28 PM

All replies

  • Hello,

    If your file is CSV you can open and read like text file

    string[] file1Rows = File.ReadAllLines("filePath.csv", Encoding.UTF8);
    
    foreach (string row in file1Rows)
    {
        string[] cells = row.Split(',');//we split cells of row
        //you can do comparison here...
    }
    

    if your excel files are xls or xlsx you can use a component to read file contents and compare cell values of each file to another.

    ComponentOne Excel™ for .NET
    Any fool can know. The point is to understand.(Albert Einstein)
    Sunday, September 11, 2011 2:59 PM
  • Hi,

    There is no direct and simple way to do this.

    Do you need to support Office 2003. Then you need to use Excel com components to interact with Excel File. But better way is to buy some third party components so that underlying complexity of COM is isolated from you code.

    Alternatively  you can use Jet engine to query your sheets for 2003, take help from google. But this way creating new sheet is not easy. Atleast I never tried.

    Do you need to support Office 2007. Then you can Open XML SDK 2.0. This will expose your entire excel file data as .Net objects.

    /Srinivas

    Sunday, September 11, 2011 4:25 PM
  • Hi Maez,

    I would use the File Helpers to read the spreadsheets (http://filehelpers.sourceforge.net/)

    But what do you mean by the differences? Are the files going to have exactly the same number of rows? Has each row got an ID? Can the order of the records (rows) can differ between the two files?

    Regards,

    Leo

     

    Sunday, September 11, 2011 4:44 PM
  • The question is not clear enough.

    What do you mean under differences - do you care about colors, formattings, formulas, merging, ...; or you care just the values?

    What about sheets - one is enough, or there might be more than one; do you care about names, colors of the tab background, ordering, ...?

    What Excel versions do you need to support?

    What is the format of the source files - XLS, XLSX, CSV or another?

    • Edited by MCCZ Sunday, September 11, 2011 5:04 PM
    Sunday, September 11, 2011 5:03 PM
  • Dear -Maez-

    for you work just follow this simple steps.

    1. download component one flexgrid classic control from his site.

    2. install it & add reference to your application.

    3. Add 3 Flexgrid to you form (1 for excel file1, 2nd for excel file2, 3rd for your output file)

    4. you can use flexgrid openexcelfile command to load excel file directly to grid. (.xls or .xlsx both files are supported)

    5. wright a for next loop and read & compare both grids cell value & store it in 3rd grid.

    6. after finish your loop, use flexgrid's saveexcel command to save grid as a excel file. (.xls & .xlsx Both format supported).

    7. Now open your saved excel file in excel & your result is ready as your samples image.


    Sunday, September 11, 2011 6:28 PM
  • Hi Maez,
    I have created a class file that will compare two excel file. It can return the changes in datatable or may write to a CSV. Here is the class...
    ---------------------------
     
    using System;
    using System.Data;
    using System.Data.OleDb;
    using System.IO;
    
    
    namespace TestWinForm.Utils
    {
        public class ExcelHandler
        {
            #region Properties
            public string Directory { get; set; }
            public string FirstFile { get; set; }
            public string FirstFileSheetName { get; set; }
            public string SecondFile { get; set; }
            public string SecondFileSheetName { get; set; }
            public DataTable ReturnDataSet { get; set; }
            public bool Excel2007 { get; set; }
            public bool UseHeader { get; set; }
            #endregion
            
            #region Constructor
            public ExcelHandler() { }
    
            public ExcelHandler(string Dir, string File1, string File1SheetName, string File2, string File2SheetName)
            {
                this.Directory = Dir;
                this.FirstFile = File1;
                this.SecondFile = File2;
                this.FirstFileSheetName = File1SheetName;
                this.SecondFileSheetName = File2SheetName;
            }
            #endregion
    
            #region Match Files
            public DataTable CheckExcelFiles()
            {
                DataTable dtRet = new DataTable();
                //Read the first excel
                try
                {
                    //Read the excel
                    DataTable dt1 = GetDataTableFromExcel(this.Directory, this.FirstFile, this.FirstFileSheetName);
                    DataTable dt2 = GetDataTableFromExcel(this.Directory, this.SecondFile, this.SecondFileSheetName);
    
                    //Compare two
                    dtRet = getDifferentRecords(dt1, dt2);
                }
                catch (Exception ex) { }
                return dtRet;
            }
    
            //Overload method to write to csv
            public void CheckExcelFiles(string strFilePath)
            {
                DataTable dtRet = new DataTable();
                //Read the first excel
                try
                {
                    //Read the excel
                    DataTable dt1 = GetDataTableFromExcel(this.Directory, this.FirstFile, this.FirstFileSheetName);
                    DataTable dt2 = GetDataTableFromExcel(this.Directory, this.SecondFile, this.SecondFileSheetName);
    
                    //Compare two
                    dtRet = getDifferentRecords(dt1, dt2);
                    ExportDataTableToExcel(dtRet, strFilePath);
                }
                catch (Exception ex) { }
                
            }
    
            //Get Datatable reading Excel
            private DataTable GetDataTableFromExcel(string strDir, string strFileName, string strSheetName)
            {
                var fileName = string.Format("{0}\\" + strFileName, strDir);
                string connectionString;
                if (Excel2007)
                    //read a 2007 file  
                    connectionString = string.Format("Provider=Microsoft.ACE.OLEDB.12.0;Data Source={0};Extended Properties=\"Excel 8.0;HDR=" + (UseHeader == true ? "YES" : "NO") + ";\"", fileName);
                else
                    //read a 97-2003 file  
                    connectionString = string.Format("Provider=Microsoft.Jet.OLEDB.4.0;Data Source={0};Extended Properties=\"Excel 8.0;HDR=" + (UseHeader == true ? "YES" : "NO") + ";\"", fileName);  
    
                
                //var connectionString = string.Format("Provider=Microsoft.Jet.OLEDB.4.0; data source={0}; Extended Properties=Excel 8.0;", fileName);
    
                var adapter = new OleDbDataAdapter("SELECT * FROM [" + strSheetName + "$]", connectionString);
                var ds = new DataSet();
                adapter.Fill(ds, fileName + strSheetName);
                return ds.Tables[fileName + strSheetName];
            }
    
            //Compare datatables
            private DataTable CompareDataTable(DataTable A, DataTable B)
            {
                A.PrimaryKey = new DataColumn[] { A.Columns["PK"] };
                B.PrimaryKey = new DataColumn[] { B.Columns["PK"] };
                A.Merge(B, true); // this will add to A any records that are in B but not A 
                A.AcceptChanges();
                return A.GetChanges(DataRowState.Added); // returns records originally only in B
            }
    
    
            //Provided here http://social.msdn.microsoft.com/Forums/en-US/csharpgeneral/thread/23703a85-20c7-4759-806a-fabf4e9f5be6/
            //Provided by Guo Surfer
            #region Compare two DataTables and return a DataTable with DifferentRecords  
    
            /// <summary>  
            /// Compare two DataTables and return a DataTable with DifferentRecords  
            /// </summary>  
            /// <param name="FirstDataTable">FirstDataTable</param>  
            /// <param name="SecondDataTable">SecondDataTable</param>  
            /// <returns>DifferentRecords</returns>  
            public DataTable getDifferentRecords(DataTable FirstDataTable, DataTable SecondDataTable)
            {
                //Create Empty Table  
                DataTable ResultDataTable = new DataTable("ResultDataTable");
                //use a Dataset to make use of a DataRelation object  
                using (DataSet ds = new DataSet())
                {
                    //Add tables  
                    ds.Tables.AddRange(new DataTable[] { FirstDataTable.Copy(), SecondDataTable.Copy() });
                    //Get Columns for DataRelation  
                    DataColumn[] firstColumns = new DataColumn[ds.Tables[0].Columns.Count];
                    for (int i = 0; i < firstColumns.Length; i++)
                    {
                        firstColumns[i] = ds.Tables[0].Columns[i];
                    }
                    DataColumn[] secondColumns = new DataColumn[ds.Tables[1].Columns.Count];
                    for (int i = 0; i < secondColumns.Length; i++)
                    {
                        secondColumns[i] = ds.Tables[1].Columns[i];
                    }
                    //Create DataRelation  
                    DataRelation r1 = new DataRelation(string.Empty, firstColumns, secondColumns, false);
                    ds.Relations.Add(r1);
                    DataRelation r2 = new DataRelation(string.Empty, secondColumns, firstColumns, false);
                    ds.Relations.Add(r2);
                    //Create columns for return table  
                    for (int i = 0; i < FirstDataTable.Columns.Count; i++)
                    {
                        ResultDataTable.Columns.Add(FirstDataTable.Columns[i].ColumnName, FirstDataTable.Columns[i].DataType);
                    }
                    //If FirstDataTable Row not in SecondDataTable, Add to ResultDataTable.  
                    ResultDataTable.BeginLoadData();
                    foreach (DataRow parentrow in ds.Tables[0].Rows)
                    {
                        DataRow[] childrows = parentrow.GetChildRows(r1);
                        if (childrows == null || childrows.Length == 0)
                            ResultDataTable.LoadDataRow(parentrow.ItemArray, true);
                    }
                    //If SecondDataTable Row not in FirstDataTable, Add to ResultDataTable.  
                    foreach (DataRow parentrow in ds.Tables[1].Rows)
                    {
                        DataRow[] childrows = parentrow.GetChildRows(r2);
                        if (childrows == null || childrows.Length == 0)
                            ResultDataTable.LoadDataRow(parentrow.ItemArray, true);
                    }
                    ResultDataTable.EndLoadData();
                }
                return ResultDataTable;
            }
            
            #endregion  
    
            private void ExportDataTableToExcel(DataTable dt, string strFilePath)
            {
            // Create the CSV file to which grid data will be exported.
            StreamWriter sw = new StreamWriter(strFilePath, false);
            // First we will write the headers.
            //DataTable dt = m_dsProducts.Tables[0];
            int iColCount = dt.Columns.Count;
            for (int i = 0; i < iColCount; i++)
            {
                sw.Write(dt.Columns[i]);
                if (i < iColCount - 1)
                {
                    sw.Write(",");
                }
            }
            sw.Write(sw.NewLine);
            // Now write all the rows.
            foreach (DataRow dr in dt.Rows)
            {
                for (int i = 0; i < iColCount; i++)
                {
                    if (!Convert.IsDBNull(dr[i]))
                    {
                        sw.Write(dr[i].ToString());
                    }
                    if (i < iColCount - 1)
                    {
                        sw.Write(",");
                    }
                }
                sw.Write(sw.NewLine);
            }
            sw.Close();
     
            }
    
        
    
            #endregion
    
        }
    }
    --------------------------
     
    Here how to use....
    ----------------------------
     
    //Declare the class and set the excel file dir/path and sheet name
    Utils.ExcelHandler eh = new Utils.ExcelHandler() { Directory = "c:\\temp", FirstFile = "Book1.xls", FirstFileSheetName = "Sheet1", SecondFile = "Book2.xls", SecondFileSheetName = "Sheet1" };
    //Get the datatable            
    DataTable dt = eh.CheckExcelFiles();
    //Or write to CSV top open with excel
    eh.CheckExcelFiles("c:\\temp\\test123.csv");
    ----------------------------
     
    Hope this will solve your problem.
    Thanks

    Anirban Bhattacharya (Lead Consultant - ESSPL)
    Sunday, September 11, 2011 6:59 PM
  • Ehsan Mohammadi

    how i wish the problem is just that simple..thank you for the link and for giving your answer as well.:)


    • Edited by -zeam- Monday, September 12, 2011 1:20 PM
    Monday, September 12, 2011 12:47 PM
  • SVentrapragada

    im doing it on an office 2003..the problem is that we are not allowed to use any other utilities/component..aside from having the the said excel file version and the .net environment. Thank you for the suggestion.;)

     


    • Edited by -zeam- Monday, September 12, 2011 1:20 PM
    Monday, September 12, 2011 12:49 PM
  •  

    Possible that Pair A in Excel 1 is not same row with Pair A in Excel 2, but Pair B in Excel 1 and 2 is with same row..and only Pairs with differences must be written to the output excel file. Thank you.;)

    • Edited by -zeam- Monday, September 12, 2011 1:21 PM
    Monday, September 12, 2011 12:53 PM
  • Different in terms of values/characters/symbols.

    Im using ms office 2003, its in xls format.

    Possible that Pair A in Excel 1 is not same row with Pair A in Excel 2, but Pair B in Excel 1 and 2 is with same row..and only Pairs with differences must be written to the output excel file. Thank you.;)


    • Edited by -zeam- Monday, September 12, 2011 1:21 PM
    Monday, September 12, 2011 12:56 PM
  •  

    Kamlesh R. Sanchla

     

    Thank you.:)..ill try this..but the problem is that we are not allowed to export other than components present in our existing .net environment.:(

    • Edited by -zeam- Monday, September 12, 2011 1:22 PM
    Monday, September 12, 2011 12:59 PM
  •  

    Anirban Bhattacherya

     

    This code will be of great help..thank you!:)

    1. Is it right that if i have multiple files in Folder A (dt1) to be read, i must only put it in a foreach?

    2. For example I have a_username.xls and b_username.xls in Folder A (dt1) and i have a_username1.xls and b_username1.xls in Folder B (dt2)..is it possible to compare same .xls..with different username?Thank you.


    • Edited by -zeam- Monday, September 12, 2011 1:23 PM
    Monday, September 12, 2011 1:15 PM
  • Thank you.:)..ill try this..but the problem is that we are not allowed to export other than components present in our existing .net environment.:(


    Hello,

    You can use SmartAssembly from red-gate to merge components to your original assembly.


    Any fool can know. The point is to understand.(Albert Einstein)
    Monday, September 12, 2011 1:21 PM
  • Hi Maez,

    Yes you can. In the sceneria, please add one extra property for Second Directory. I used the same directory thats why is something like that. You modify it as per your requirement. I think it will not be complex for you, otherwise just give me a buzz.

    Thanks


    Anirban Bhattacharya (Lead Consultant - ESSPL)
    Monday, September 12, 2011 1:59 PM
  • One more, yes, you can use foreach for multiple checking. But it compares only 2 excel at a time.

    Thanks


    Anirban Bhattacharya (Lead Consultant - ESSPL)
    Monday, September 12, 2011 2:01 PM
  • Any update?

    Thanks


    Anirban Bhattacharya (Lead Consultant - ESSPL)
    Tuesday, September 13, 2011 7:21 AM
  •  

    "Any update?"

     

    im sorry I wasn't able to reply immediately..right now i'm still studying your code..coz i'm still not familiar with the used objects..I have to know how those objects work, i guess i have to understand it myself..:)thank you for the big help..b4 i got home..im thinking of using 4 multidimensional array to store/retrieve the data..hahaha..but after checking your code..i saw object in c# that i can use rather than using multidim array. Ill keep you posted. Thank you very much.;)

    • Edited by -zeam- Tuesday, September 13, 2011 12:24 PM
    Tuesday, September 13, 2011 12:23 PM
  • Hi Anirban, I used the class you have provided as solution. But the problem is I have an excel file containing 80 columns and as a result, what happens is there is an InvalidOperationException that is thrown when I am trying to create a DataRelation.

    So Please help as to how to come over the exception when we have more than 32 columns. Yep the number of columns is restricted to 32. can we do something about it.

    thanks



    • Edited by SandepKU Monday, December 19, 2011 5:50 AM
    Monday, December 19, 2011 5:10 AM
  • Hi Sandep,

    Yes your are right, DataRelation fails working over 32 cols. In that scenerio, we need to use Microsoft Sync Framework. The details of Sync Framework and for file Synchronization you will get the details here... http://msdn.microsoft.com/en-us/sync/bb887623

    Thanks

     


    Anirban Bhattacharya (Lead Consultant - ESSPL)
    Wednesday, December 21, 2011 4:36 AM
  • Hi Anirban, can you pls provide a link to any article where this has been implemented

    thanks

    Thursday, December 22, 2011 5:14 AM
  • Is there any way to add colors for differentiating the difference in excel for the output file

    eg:

    excel 1                         excel 2

    1(red color)                 3 (red color)

    2 (no color)                 2 (no color)

    3 (red color)                4(red color)      

    Thanks

    karthik

    Wednesday, June 26, 2013 11:49 AM
  • Hi, 

    Can you please let me know if you have a solution to compare more than 32 columns?

    Thanks,

    Avanthi

    Monday, July 29, 2013 9:55 AM
  • Very sorry for late reply.

    I am out of touch to programing.

    if you have valid license, then there is no problem to export or redistribute component one flex grid Component.

    Thanks.

    Saturday, June 18, 2016 8:24 AM
  • Hi,

    For Your Problem more than 32 column?

    Please read my above solution.

    Just simple change in loop.

    For Loop Inside For Loop

    1st For Loop For Row & 2nd Inside For Loop For Column.

    And your problem is solved.

    Thanks

    Saturday, June 18, 2016 8:27 AM