locked
How to retrieve cell details from a Excel .csv file and combine it with results from .xlsx file RRS feed

  • Question

  • User-1880744392 posted

    Can someone help me with a more flexible way to solve the puzzle I have solved, they method I have used so far is not very flexible, because if there were more docs it would take a long time, as the code I used is only suitable based on the amount of rows and cols in the docs I have looked at. It has done the trick so far but yeah probably need something more flexible for situations such as docs with 100s of rows.

    Basically I am supposed to scan an xlsx excel document and also a csv excel doc, and merged the information into one document, with specific info.

    Here is what I have got so far

    // First we are going to read all the files and data and assign them into variables
    
                // Opening and reading existing Excel xlsx file
                FileInfo fi = new FileInfo(@"someFilePath.xlsx");
                ExcelPackage excelPackage = new ExcelPackage(fi);
                {
                    //So it doesn't throw an exception if worksheet not found, but return null in case it doesn't find it
                    ExcelWorksheet spreadSheet1 =
                    excelPackage.Workbook.Worksheets.FirstOrDefault(x => x.Name == "spreadsheet1");
    
    // Read CSV
     var rows = File.ReadAllLines(@"filePath.csv");
                    var totalcols = rows[0].Split(',').Length;
    
                    // declare variables outside loop so can use it outside
                    string columns = "", csvCell = "";
                    string[] column;
                    List<string> csvList = new List<string>();
                    // break the row into columns on the commas
                    foreach (var words in rows)
                    {
                        columns = words;
                        // create string array of columns so we can get length
                        column = columns.Split(new[] { ',' }, StringSplitOptions.None);
    
                        // return all column info where there is a match of "Mr" or Ms
                        for (int i = 0; i < column.Length; i++)
                        {
                            if (columns.Split(',')[0].Equals("Mr") || columns.Split(',')[0].Equals("Ms"))
                            {
                                csvCell = columns.Split(',')[i];
                                Console.WriteLine(csvCell);
                                csvList.Add(csvCell);
                            }
                        }
                    }


    //Create a new ExcelPackage and write retrieved data onto it
    ExcelPackage excelPackage2 = new ExcelPackage();

    //Create the WorkSheet
    ExcelWorksheet worksheet = excelPackage2.Workbook.Worksheets.Add("Sheet 1");


    // Write xlsx data //loop all rows starting from second row to skip header, iterate for the length of all the columns in xlsx spreadsheet1, start from as per excel base 1 not 0 for (int i = worksheet.Dimension.Start.Row + 1; i <= spreadSheet1.Dimension.End.Row; i++) { for (int j = worksheet.Dimension.Start.Column; j <= spreadSheet1.Dimension.End.Column; j++) { // copy values from each instance of xlsx to the other worksheet.Cells[i, j].Value = spreadSheet1.Cells[i, j].Value; } } // Write CSV data to the file // j starts at 8 the position where it ends adding data from previous xlsx sheet, and iterates from the point + the number of rows in the csv file for (int j = worksheet.Dimension.End.Row; j < (worksheet.Dimension.End.Row + rows.Count()); j++) { // for each of the rows we hit, iterate through row values from list if (j == spreadSheet1.Dimension.End.Row + 1) // gets row 9 { for (int k = 1; k < totalcols + 1; k++) // start from the end of the current number o { worksheet.Cells[j, k].Value = csvList[k - 1]; } } else if (j == spreadSheet1.Dimension.End.Row + 2) { for (int k = 22; k < 2* totalcols + 2; k++) { worksheet.Cells[j, k - totalcols - 1].Value = csvList[k - 2]; } } else if (j == spreadSheet1.Dimension.End.Row + 3) { for (int k = 43; k < 3* totalcols + 3; k++) { worksheet.Cells[j, k - totalcols - 2].Value = csvList[k - 3]; } } else if (j == spreadSheet1.Dimension.End.Row + 4) // we are adding by 1 each time because we are skipping the cells that has already been added { for (int k = 64; k < 4* totalcols + 4; k++) { worksheet.Cells[j, k - totalcols -3].Value = csvList[k - 4]; } } }

    I think mainly the CSV reading part is the weirdest lol but the documentation online are not so helpful, as I am trying to combine search results from the other file, and merge into search results off the CSV.

    Thursday, March 5, 2020 1:03 PM

Answers

  • User-1330468790 posted

    Hi, TroydonAnabolic,

    I understand that you are finding a neat way which could be achieved by using the existing method from OfficeOpenXml package (which you already used) and a datastructure 'DataTable' in package 'System.Data'.

    There are three ways that can improve the code:

    1. Filter the data by DataTable and DataView
    2. Copy the data from the original excel to a new excel by the code "excelPackage2.Workbook.Worksheets.Add("Sheet 1", spreadSheet1);" 
    3. Load the data from the DataTable by the method "ExcelRange.LoadFromDataTable();"

    More details, you can refer to below code:

    static void Main(string[] args)
            {
                Test();
    
                Console.ReadKey();
    
    
            }
    
            public static DataTable CSVToDT(string[] csvRows)
            {
                DataTable dt = new DataTable();
                string[] headers = csvRows[0].Split(',');
                foreach(string header in headers)
                {
                    dt.Columns.Add(header);
                }
    
                for(int i = 1; i< csvRows.Length; i++)
                {
                    DataRow dr = dt.NewRow();
                    string[] row = csvRows[i].Split(',');
                    for(int j = 0; j < row.Length; j++)
                    {
                        dr[j] = row[j];
                    }
                    dt.Rows.Add(dr);
                }
    
                return dt;
    
            }
    
    
            public static void Test()
            {
                // First we are going to read all the files and data and assign them into variables
    
                // Opening and reading existing Excel xlsx file
                FileInfo fi = new FileInfo(@"FolderPath\tablelist.xlsx");
                ExcelPackage excelPackage = new ExcelPackage(fi);
                {
                    //So it doesn't throw an exception if worksheet not found, but return null in case it doesn't find it
                    ExcelWorksheet spreadSheet1 = excelPackage.Workbook.Worksheets.FirstOrDefault(x => x.Name == "Sheet1");
                   
                    // Read CSV
                    var rows = File.ReadAllLines(@"FolderPath\tableTest.csv");
    
                    //Read CSV as DataTable
                    DataTable dt = CSVToDT(rows);
                    //Filter the data
                    DataView dv = dt.DefaultView;
                    dv.RowFilter = "Gender = 'Mr' or Gender = 'Ms'";
                    dt = dv.ToTable();
    
    
                    FileInfo outputFI = new FileInfo(@"TargetFolderPath\newTable.xlsx");
                    //Create a new ExcelPackage and write retrieved data onto it
                    ExcelPackage excelPackage2 = new ExcelPackage(outputFI);
                  
    
                    //directly copy the original sheet
                    var worksheet = excelPackage2.Workbook.Worksheets.Add("Sheet 1", spreadSheet1);
                   
                    //Load data from DataTable
    //Bear in mind that the excel is 1-base while the datatable is 0-base int startRow = worksheet.Dimension.End.Row + 1; int startColumn = 1; worksheet.Cells[startRow,startColumn].LoadFromDataTable(dt,false); //Save the ExcelPackage excelPackage2.Save(); } }

    Demo:

    original .xlsx file

    .csv file

    New .xlsx file

    Hope this can help you.

    Best regards,

    Sean

    • Marked as answer by Anonymous Thursday, October 7, 2021 12:00 AM
    Friday, March 6, 2020 8:49 AM

All replies

  • User-1330468790 posted

    Hi, TroydonAnabolic,

    I understand that you are finding a neat way which could be achieved by using the existing method from OfficeOpenXml package (which you already used) and a datastructure 'DataTable' in package 'System.Data'.

    There are three ways that can improve the code:

    1. Filter the data by DataTable and DataView
    2. Copy the data from the original excel to a new excel by the code "excelPackage2.Workbook.Worksheets.Add("Sheet 1", spreadSheet1);" 
    3. Load the data from the DataTable by the method "ExcelRange.LoadFromDataTable();"

    More details, you can refer to below code:

    static void Main(string[] args)
            {
                Test();
    
                Console.ReadKey();
    
    
            }
    
            public static DataTable CSVToDT(string[] csvRows)
            {
                DataTable dt = new DataTable();
                string[] headers = csvRows[0].Split(',');
                foreach(string header in headers)
                {
                    dt.Columns.Add(header);
                }
    
                for(int i = 1; i< csvRows.Length; i++)
                {
                    DataRow dr = dt.NewRow();
                    string[] row = csvRows[i].Split(',');
                    for(int j = 0; j < row.Length; j++)
                    {
                        dr[j] = row[j];
                    }
                    dt.Rows.Add(dr);
                }
    
                return dt;
    
            }
    
    
            public static void Test()
            {
                // First we are going to read all the files and data and assign them into variables
    
                // Opening and reading existing Excel xlsx file
                FileInfo fi = new FileInfo(@"FolderPath\tablelist.xlsx");
                ExcelPackage excelPackage = new ExcelPackage(fi);
                {
                    //So it doesn't throw an exception if worksheet not found, but return null in case it doesn't find it
                    ExcelWorksheet spreadSheet1 = excelPackage.Workbook.Worksheets.FirstOrDefault(x => x.Name == "Sheet1");
                   
                    // Read CSV
                    var rows = File.ReadAllLines(@"FolderPath\tableTest.csv");
    
                    //Read CSV as DataTable
                    DataTable dt = CSVToDT(rows);
                    //Filter the data
                    DataView dv = dt.DefaultView;
                    dv.RowFilter = "Gender = 'Mr' or Gender = 'Ms'";
                    dt = dv.ToTable();
    
    
                    FileInfo outputFI = new FileInfo(@"TargetFolderPath\newTable.xlsx");
                    //Create a new ExcelPackage and write retrieved data onto it
                    ExcelPackage excelPackage2 = new ExcelPackage(outputFI);
                  
    
                    //directly copy the original sheet
                    var worksheet = excelPackage2.Workbook.Worksheets.Add("Sheet 1", spreadSheet1);
                   
                    //Load data from DataTable
    //Bear in mind that the excel is 1-base while the datatable is 0-base int startRow = worksheet.Dimension.End.Row + 1; int startColumn = 1; worksheet.Cells[startRow,startColumn].LoadFromDataTable(dt,false); //Save the ExcelPackage excelPackage2.Save(); } }

    Demo:

    original .xlsx file

    .csv file

    New .xlsx file

    Hope this can help you.

    Best regards,

    Sean

    • Marked as answer by Anonymous Thursday, October 7, 2021 12:00 AM
    Friday, March 6, 2020 8:49 AM
  • User-1880744392 posted

    Okay, that looks way better than the manual way i was trying to use, the inbuilt DLL of EPPPlus seems to simplify it so much.. too bad there is hardly any documentation on it!

    I did manage to come so far in my code to simplifying it but got stuck with increasing the iteration parameter each time i reached the end of the row

                    // Write CSV data to the file
                    // j starts at 8 the position where it ends adding data from previous xlsx sheet, and iterates from the point + the number of rows in the csv file  
                    int count = 1;
                    int count2 = 0;
                    for (int j = spreadSheet1.Dimension.End.Row + 1; j < 13; j++)
                    {
                        // start from the first column in the left and this is starting at 1, each time count2 increases we increase value of starting point k each time we reach end of column + 1
                        // doing this we can start assigning the k values into the next row of the next j iteration (later try setting count = 0 to start of with, to replace int k = count2*totalcols, use only one incrase on count for start  =1;
                        for (int k = count2 * totalcols + count; k <= count2 * totalcols + totalcols; k++)
                        {
                            worksheet.Cells[j + count2, k - count2 * totalcols - count2].Value = csvList[k - count]; // we iterated one times too many to avoid 0 base of excel so when adding value we increase by 1 iteration
                                                                                                                     // when k reaches the end of the total columns and we increase counters to allow to start assigning values on the next row
                            if (k == (count2 * totalcols) + (count * totalcols + count)) // first iteration: 0*20 + 20 + 1
                            {
                                ++count; // we increase the incremental value whenever we reach the total col length
                                ++count2; // then increase the value we multiply the start index for loop starting point
                            }
                        }
    

    Anyhow, will give the code you have showed ago and see if it resolves my Issue!

    Friday, March 6, 2020 8:55 AM
  • User-1330468790 posted

    Hi, TroydonAnabolic,

     

    I would be glad if it could help you.


    If you still meet problems, you can post it here and people here will be happy to solve the problem.


    If you find that the answer does solve your issues, I suggest you could mark the answer which is right and helpful.

    This will help other people who faces the same issue to find the right answer faster.

     

    Best regards,

    Sean

    Thursday, March 12, 2020 8:08 AM