none
read an excel file in C# RRS feed

  • Question

  • Hello all,

    I need to create something quickly in C# that basically takes everything from an excel file, and put them into a 2D string array. Now I have no experience in programming in C# and has minimal database knowledge, and I've been reading up on the OLEDB and ODBC stuff which I think that's what I want, but I'm having some problems finding some basic instructions online. Can anyone give me a good pointer?

    thanks!
    Tuesday, April 29, 2008 6:00 PM

Answers

  • 1. Here is article that explains about 255 characters limitation. It would require registry settings from you

     

    http://support.microsoft.com/kb/189897/en-us

     

    2. Your second question is not clear. What is "define->name"

     

    3. Since Excel is not really a database with fixed columns structure and each cell has its own data type, Jet tries to guess types of column based on initial scanning of first N rows. Based on that scan it will select some type. Since most of rows contain mix of numbers and string, Jet will use string as a type for the columns in many cases. You cannot force Jet to use specific data type. In some cases you even may get NULL values instead of actual one for the rows that do not match type of the columns. For example, if Jet (based on initial scan) defines column as numeric and later gets some string values, it will return NULL. To resolve this issue you can use Visual Studio Tools for Office. They are COM-based, but provide full access to the data without any loss. I also developed 100% .NET managed Excel reader that resolves limitations with data types and returns each cell value in its native type. If you are interested you could try it downloading it from my web site.

     

    Wednesday, April 30, 2008 2:27 AM
    Moderator
  • Yes, you can query data from the whole spreadsheet using next kind of query

     

    SELECT * FROM [SheetName$]

     

    As you see you need to append $ to the actual spreadsheet name

     

    Wednesday, April 30, 2008 11:42 PM
    Moderator

All replies

  • Thanks, that was actually the reference that I was using however... it helped me on some initialization stuff pretty well.
    Let me post some of my code actually... I did a define->name to the items that I wanted to be included, and the name is "Test".

    Anyways, here's what I have:

                OleDbConnection connection = new OleDbConnection("Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" +
                                                              filename +
                                                              "Extended Properties=Excel 8.0;");

                OleDbDataReader reader;
                connection.Open();
                OleDbCommand query = new OleDbCommand("SELECT * From Test", connection);
                reader = query.ExecuteReader();

                while (reader.Read())
                {
                    System.Console.WriteLine(reader["ID"] + " "+ reader["Name"] + " " + reader["Description"]);
                    Console.WriteLine();
                }

    Here's my questions:
    1. Although this code works in printing out most of the data in the cells to the console, I have multiple cells that contains 488 characters (include spaces).  The printout that I obtained chops off everything after about 250 characters. Using DataTable shows the same result

    2. Is there any way I can use the names of the sheets instead of doing the define->name? The excel file that I have has two pages in total. I'm currently doing a select all and defining a name for everything that is selected in that page.

    3. Everything seems to be treated as strings as far as what I see. However, the IDs are actually numbers. I previously had used Console.WriteLine(reader.GetInt32(0) + ", " + reader.GetString(1)); as an attempt to get the ID as number, but I get the error that says the cast is not valid. I actually need the IDs to be in numbers (the biggest number for the ID field is around 5000 so int32 should be fine) Any good tips on how to get them in numbers?

    Thanks!
    Tuesday, April 29, 2008 8:23 PM
  • 1. Here is article that explains about 255 characters limitation. It would require registry settings from you

     

    http://support.microsoft.com/kb/189897/en-us

     

    2. Your second question is not clear. What is "define->name"

     

    3. Since Excel is not really a database with fixed columns structure and each cell has its own data type, Jet tries to guess types of column based on initial scanning of first N rows. Based on that scan it will select some type. Since most of rows contain mix of numbers and string, Jet will use string as a type for the columns in many cases. You cannot force Jet to use specific data type. In some cases you even may get NULL values instead of actual one for the rows that do not match type of the columns. For example, if Jet (based on initial scan) defines column as numeric and later gets some string values, it will return NULL. To resolve this issue you can use Visual Studio Tools for Office. They are COM-based, but provide full access to the data without any loss. I also developed 100% .NET managed Excel reader that resolves limitations with data types and returns each cell value in its native type. If you are interested you could try it downloading it from my web site.

     

    Wednesday, April 30, 2008 2:27 AM
    Moderator
  • Thanks for the answer provided!

    1. Unfortunately, the registry settings change is not a solution for me because the Excel file is not completely populated, and changing registry settings could only guess up to 16 rows according to that document, which the row in question is actually in the 300s row.

    I could get over with this problem by inserting random character junk in the first row of an excel document that exceeds say, 1000 characters, then ignore the first row. (or second row of the sheet)


    2. Sorry about not being clear for my question. Here is the part of the code I was having questions about:

    OleDbCommand query = new OleDbCommand("SELECT * From Test", connection);
    OleDbDataAdapter oleAdapter = new OleDbDataAdapter();
    oleAdapter.SelectCommand = query;

    Notice the SELECT statement has "Test" as the "table" to choose from. In order for excel to recognize this, I highlighted all cells within the sheet1, and on the top menu, I hit Insert->Name->Define to Define the name "Test" for all the columns in order to use that Query statement.

    My question is that, is there any way I can make the query to refer to everything in sheet1, instead of using the Name->Define method?


    3. I have previously looked into Visual Studio Tools for Office before, however, VSTO is not included in the free express version of Visual Studio, therefore I dropped the idea.



    Thanks a lot for everything! =)




    Wednesday, April 30, 2008 5:23 PM
  • Yes, you can query data from the whole spreadsheet using next kind of query

     

    SELECT * FROM [SheetName$]

     

    As you see you need to append $ to the actual spreadsheet name

     

    Wednesday, April 30, 2008 11:42 PM
    Moderator
  • Thanks a lot!! that worked! =)
    Friday, May 2, 2008 4:43 PM