none
How To Load Data To Datatable Faster Using Interop.Excel

    Question

  • I use Interop excel to load data from excel sheet to DataTable (NO OLEDB !!). However it's slow process iterating each excel row. Is there any other faster approach using array or whatever else to make it faster? keep in mind i would like for some of columns to say that i want to either take value as .Value or .Text as in the example.

    Dim dt As New DataTable()
    
            Dim rows As Integer = _xlWorkSheet.UsedRange.Rows.Count
                Dim cols As Integer = _xlWorkSheet.UsedRange.Columns.Count
    
            Dim noofrow As Integer = 1
    
                'If 1st Row Contains unique Headers for datatable include this part else remove it
                'Start
                For z As Integer = 1 To cols
                Dim colname As String = _xlWorkSheet.Cells(1, z).Text
                dt.Columns.Add(colname)
                    noofrow = 2
                Next
                'END
    
                For gr As Integer = noofrow To rows
                    Dim drr As DataRow = dt.NewRow()
                For columnn As Integer = 1 To cols
    
                    If columnn = 11 Then
                        Dim dsajdnjsd As Excel.Range = _xlWorkSheet.Cells(gr, columnn)
                        drr(columnn - 1) = dsajdnjsd.Text    '<--- here as Text
                    Else
                        drr(columnn - 1) = _xlWorkSheet.Cells(gr, columnn).value  '<--- here is Value
                    End If
    
    
                Next
                dt.Rows.Add(drr)
                Next

    • Edited by JimmyJimm Monday, April 03, 2017 8:16 AM
    Saturday, April 01, 2017 3:59 PM

All replies

  • Hello,

    With no disrespect, Excel automation is over-kill. Instead consider using OleDb. In the following code sample I have a xlsx file with two sheets, People1, first row is data, People2, first row is data. Depending on the first row you need to setup the connection string accordingly which is handled om Connections class along with handling if you want to use a xls or xlsx file.

    People1 sheet

    People2 sheet

    In the code below the following screenshot I show three ways to load data. The screenshot is from People1 sheet where I create my own column names which are used in the DataTable.

    Connection class

    Imports System.Data.OleDb
    Public Class Connections
        Public Sub New()
        End Sub
        ''' <summary>
        ''' Create a connection where first row contains column names
        ''' </summary>
        ''' <param name="FileName"></param>
        ''' <param name="IMEX"></param>
        ''' <returns></returns>
        ''' <remarks></remarks>
        <DebuggerStepThrough()>
        Public Function HeaderConnectionString(ByVal FileName As String, Optional ByVal IMEX As Integer = 1) As String
            Dim Builder As New OleDbConnectionStringBuilder
            If IO.Path.GetExtension(FileName).ToUpper = ".XLS" Then
                Builder.Provider = "Microsoft.Jet.OLEDB.4.0"
                Builder.Add("Extended Properties", String.Format("Excel 8.0;IMEX={0};HDR=Yes;", IMEX))
            Else
                Builder.Provider = "Microsoft.ACE.OLEDB.12.0"
                Builder.Add("Extended Properties", String.Format("Excel 12.0;IMEX={0};HDR=Yes;", IMEX))
            End If
    
            Builder.DataSource = FileName
    
            Return Builder.ToString
    
        End Function
        ''' <summary>
        ''' Create a connection where first row contains data
        ''' </summary>
        ''' <param name="FileName"></param>
        ''' <param name="IMEX"></param>
        ''' <returns></returns>
        ''' <remarks></remarks>
        <DebuggerStepThrough()>
        Public Function NoHeaderConnectionString(ByVal FileName As String, Optional ByVal IMEX As Integer = 1) As String
            Dim Builder As New OleDbConnectionStringBuilder
            If IO.Path.GetExtension(FileName).ToUpper = ".XLS" Then
                Builder.Provider = "Microsoft.Jet.OLEDB.4.0"
                Builder.Add("Extended Properties", String.Format("Excel 8.0;IMEX={0};HDR=No;", IMEX))
            Else
                Builder.Provider = "Microsoft.ACE.OLEDB.12.0"
                Builder.Add("Extended Properties", String.Format("Excel 12.0;IMEX={0};HDR=No;", IMEX))
            End If
    
            Builder.DataSource = FileName
    
            Return Builder.ToString
    
        End Function
    End Class

    Form code. The Excel file in this case is in the same folder as the executable.

    Imports System.Data.OleDb
    Public Class Form1
        Private FileName As String = IO.Path.Combine(AppDomain.CurrentDomain.BaseDirectory, "PeopleData.xlsx")
        Private Connection As Connections = New Connections
        Private Sub Form1_Load(sender As Object, e As EventArgs) Handles MyBase.Load
            ' normally you would load from here but I have three version to demo
        End Sub
    
        Private Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
            Dim SelectStatement As String = "SELECT * FROM [People1$]"
            Using cn As New OleDbConnection With {.ConnectionString = Connection.NoHeaderConnectionString(FileName)}
                Using cmd As New OleDbCommand With {.Connection = cn, .CommandText = SelectStatement}
                    cn.Open()
                    Dim dt As New DataTable
                    Try
                        dt.Load(cmd.ExecuteReader)
                    Catch ex As Exception
                        MessageBox.Show(ex.Message)
                    End Try
                    DataGridView1.DataSource = dt
                End Using
            End Using
        End Sub
    
        Private Sub Button2_Click(sender As Object, e As EventArgs) Handles Button2.Click
            Dim SelectStatement As String = "SELECT * FROM [People2$]"
            Using cn As New OleDbConnection With {.ConnectionString = Connection.HeaderConnectionString(FileName)}
                Using cmd As New OleDbCommand With {.Connection = cn, .CommandText = SelectStatement}
                    cn.Open()
                    Dim dt As New DataTable
                    Try
                        dt.Load(cmd.ExecuteReader)
                    Catch ex As Exception
                        MessageBox.Show(ex.Message)
                    End Try
                    DataGridView1.DataSource = dt
                End Using
            End Using
        End Sub
    
        Private Sub Button3_Click(sender As Object, e As EventArgs) Handles Button3.Click
            Dim SelectStatement As String =
                <SQL>
                    SELECT 
                        F1 As [Given Name], 
                        F2 As Middle, 
                        F3 As [Sur Name], 
                        F4 As City, 
                        F5 As State, 
                        F6 As Zip, 
                        F7 As Email 
                    FROM [People1$]
                </SQL>.Value
    
            Using cn As New OleDbConnection With {.ConnectionString = Connection.NoHeaderConnectionString(FileName)}
                Using cmd As New OleDbCommand With {.Connection = cn, .CommandText = SelectStatement}
                    cn.Open()
                    Dim dt As New DataTable
                    Try
                        dt.Load(cmd.ExecuteReader)
                    Catch ex As Exception
                        MessageBox.Show(ex.Message)
                    End Try
                    DataGridView1.DataSource = dt
                End Using
            End Using
        End Sub
    End Class
    Two overall notes, I highly suggest never using SELECT * but instead always spell out field names and second, in Button3, I used XML Literals to create a nicely formatted SELECT statement which works using Framework 3.5 and higher.


    Please remember to mark the replies as answers if they help and unmark them if they provide no help, this will help others who are looking for solutions to the same or similar problem. Contact via my Twitter (Karen Payne) or Facebook (Karen Payne) via my MSDN profile but will not answer coding question on either.
    VB Forums - moderator
    profile for Karen Payne on Stack Exchange, a network of free, community-driven Q&A sites


    Saturday, April 01, 2017 8:17 PM
    Moderator
  • If you have access to your SQL Server you can import data in BULK from Excel using Sql Server Integration Services. That is fastest method - no matter what kind of programmatic approach you take. You can try this with the import/export wizard from SQL Server Management Studio and save your package. You can write code to manipulate the package to import new data when needed or simply change the Excel file and make sure you use the same location and filename as the one used when created the package and execute the package. You can schedule SQL Server job to do this on schedule or even get a bit more sophisticated with writing SQL Server stored procedure to handle the some data cleaning/pruning before and after import.

    I guess what I am trying to say that there is plenty of choice if you do this on SQL Server side, however that requires a bit more then novice knowledge of SQL server. 

    Saturday, April 01, 2017 11:13 PM
  • sorry i didn't mentioned i dont want to use odbc but just interop. excel. So have to read range of specific sheet and as an array "in one shoot" populate datatable or datasource.
    Monday, April 03, 2017 7:32 AM
  • Hi JimmyJimm,
    Please refer to the code below.

    Private Sub SaveFileToDatabase()
            Dim strConnection As [String] = "Data Source=(LocalDB)\MSSQLLocalDB;AttachDbFilename=D:\C# and VB Support\VB(Demo)\VB(Demo)\Database1.mdf;Integrated Security=True"
    
            Dim excelConnString As [String] = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source=D:\Date.xlsx;Extended Properties='Excel 12.0 Xml; HDR = YES';"
            'Create Connection to Excel work book 
            Using excelConnection As New OleDbConnection(excelConnString)
                'Create OleDbCommand to fetch data from Excel 
                Using cmd As New OleDbCommand("Select [ID],[Name],[Sex],[Number] from [Sheet1$]", excelConnection)
                    excelConnection.Open()
                    Using dReader As OleDbDataReader = cmd.ExecuteReader()
                        Using sqlBulk As New SqlBulkCopy(strConnection)
                            'Give your Destination table name 
                            sqlBulk.DestinationTableName = "test"
                            sqlBulk.WriteToServer(dReader)
                            MessageBox.Show("OK!")
    
                        End Using
                    End Using
                End Using
            End Using
        End Sub
    
        Private Sub button1_Click(sender As Object, e As EventArgs)
            SaveFileToDatabase()
        End Sub
    

    Best Regards,

    Cherry Bu


    MSDN Community Support
    Please remember to click "Mark as Answer" the responses that resolved your issue, and to click "Unmark as Answer" if not. This can be beneficial to other community members reading this thread. If you have any compliments or complaints to MSDN Support, feel free to contact MSDNFSF@microsoft.com.

    Monday, April 03, 2017 8:10 AM
    Moderator
  • Can you provide without using oledb ? Just regulat interop excel.
    Monday, April 03, 2017 8:14 AM
  • Hi can you post without oledb? I am just using interop.excel.
    Monday, April 03, 2017 8:15 AM
  • I am not using oledb just regulat interop excel.
    Monday, April 03, 2017 8:15 AM
  • Can you provide without using oledb ? Just regulat interop excel.

    Why, do you also try to sleep with your eyes open and things like that?

    No problem, but don't ask help with idiotic behaviour. 


    Success
    Cor

    Monday, April 03, 2017 10:33 AM
  • Excel automation is always slower then OleDb. OleDb can read a specific range rather than an entire sheet.


    Please remember to mark the replies as answers if they help and unmark them if they provide no help, this will help others who are looking for solutions to the same or similar problem. Contact via my Twitter (Karen Payne) or Facebook (Karen Payne) via my MSDN profile but will not answer coding question on either.
    VB Forums - moderator
    profile for Karen Payne on Stack Exchange, a network of free, community-driven Q&A sites

    Monday, April 03, 2017 11:43 AM
    Moderator
  • It's not idiotic behaviour there is difrfence between using OLEDB approach and Excel Interop Object. See here e.g: http://stackoverflow.com/questions/3608248/which-one-is-best-oledb-or-excel-object-or-database

    I rpefer using Excel Object Interop even if it slower because it gives you full access to Excel objects hierarchy see in this link.

    • Edited by JimmyJimm Monday, April 03, 2017 12:06 PM
    Monday, April 03, 2017 11:57 AM
  • There advantages using excel automation because sometime there is no excel version issues when using automation and in case of oledb there are. Especially when you developed on one system and then migrating to diffrent one therefore i prefer excel automation. There was very nice and quick way to do it using arrays but i dont remember and can't found it. Besides Excel Interop Object has complete access to Excel's object model, you can extract almost every information (for example: formatting information, colors, frames etc) that is stored in your Excel file and your sheets can be as complex structured as you want.
    Monday, April 03, 2017 12:04 PM
  • Your question was. How To Load Data To Datatable faster Using Interop.Excel. What is written in that link is not about a datatable. It is about using an interop worksheet. Be aware that if is written that in Liechtenstein plays better Baseball than in the USA, than it is not direct the truth. You need it for a datatable, a part of the same namespace as the OleDB approach, than asking not to use that is in my perception ..................................

    Success
    Cor


    Monday, April 03, 2017 12:28 PM
  • COM Interop is slow so I would dump the contents of an Excel Range to a two dimensional array and then load your DataTable from the array. The below example demonstrates how to copy the data to an array:

        Public Sub ExcelRangeToArray()
    
            Dim excelApplication As New Excel.Application
            Dim excelWorkbook As Excel.Workbook
            Dim excelWorksheet As Excel.Worksheet
            Dim excelRange As Excel.Range
            Dim excelWorkbooks As Excel.Workbooks = excelApplication.Workbooks
    
            excelWorkbooks = excelApplication.Workbooks
            excelWorkbook = excelWorkbooks.Open("C:\Users\...\Documents\My Database\Excel\Book2007.xlsx")
            excelWorksheet = excelWorkbook.Sheets("Sheet3")
            excelRange = excelWorksheet.UsedRange
    
            Dim excelRows As Object(,)
            excelRows = excelRange.Value
    
            For excelRow As Integer = excelRows.GetLowerBound(0) To excelRows.GetUpperBound(0)
                For excelColumn As Integer = excelRows.GetLowerBound(1) To excelRows.GetUpperBound(1)
                    'Populate DataTable here
                    'Dim excelValue As Object = excelRows(excelRow, excelColumn)
                    'Console.WriteLine(String.Concat("At row ", excelRow, " column ", excelColumn, " the value is ", excelValue))
                Next excelColumn
            Next excelRow
    
            excelRange = Nothing
            excelWorksheet = Nothing
            excelWorkbook.Close()
            excelWorkbook = Nothing
            excelWorkbooks = Nothing
            excelApplication.Quit()
            excelApplication = Nothing
    
        End Sub


    Paul ~~~~ Microsoft MVP (Visual Basic)

    Monday, April 03, 2017 2:25 PM
  • use xml to import excel large data Because it is characterized by high speed in dealing with data
    Monday, April 03, 2017 7:46 PM
  • Hi Paul,

    you answer is vey similar to what i found so far:

    Dim data As Object(,) = DirectCast(_xlWorkSheet.UsedRange.Value2, Object(,))

    If you look i typed to get .Value2 for all data. What i am missing, do you know how to say for instance for two of the columns i want to use .Text instead of .Value2? At the moment all are .Value2.

    Monday, April 03, 2017 8:16 PM
  • any example? Note that for all columns i want to say either .Value/.Value2/.Text has to be used.
    • Edited by JimmyJimm Monday, April 03, 2017 8:19 PM
    Monday, April 03, 2017 8:19 PM
  • Here is a complete working code sample in VS2015 that reads a specific sheet with headers into a DataTable. Granted I don't uses the names in the first row as DataColumn ColumnName (but it would be easy for you to insert that logic) nor do I do updates but I give you the row index.

    Class for Excel

    Imports Excel = Microsoft.Office.Interop.Excel
    Public Class Operations
        ''' <summary>
        ''' 
        ''' </summary>
        ''' <param name="FileName">Excel file to open</param>
        ''' <param name="SheetName">SheetName to read</param>
        ''' <param name="StartCell">Start cell</param>
        ''' <param name="EndCell">End Cell</param>
        ''' <param name="Used">By default, use StartCell and EndCell, if set to true read all used cells</param>
        ''' <returns></returns>
        Public Function OpenExcelAndIterate(
            ByVal FileName As String,
            ByVal SheetName As String,
            ByVal StartCell As String,
            ByVal EndCell As String, Optional Used As Boolean = False) As DataTable
    
            Dim dt As New DataTable
    
            Dim Proceed As Boolean = False
    
            Dim xlApp As Excel.Application = Nothing
            Dim xlWorkBooks As Excel.Workbooks = Nothing
            Dim xlWorkBook As Excel.Workbook = Nothing
            Dim xlWorkSheet As Excel.Worksheet = Nothing
            Dim xlWorkSheets As Excel.Sheets = Nothing
            Dim xlCells As Excel.Range = Nothing
    
            xlApp = New Excel.Application
            xlApp.DisplayAlerts = False
            xlWorkBooks = xlApp.Workbooks
            xlWorkBook = xlWorkBooks.Open(FileName)
    
            xlApp.Visible = False
    
            xlWorkSheets = xlWorkBook.Sheets
    
    
            '
            ' For/Next finds our sheet
            '
            For x As Integer = 1 To xlWorkSheets.Count
                xlWorkSheet = CType(xlWorkSheets(x), Excel.Worksheet)
    
                If xlWorkSheet.Name = SheetName Then
                    Proceed = True
                    Exit For
                End If
    
                Runtime.InteropServices.Marshal.FinalReleaseComObject(xlWorkSheet)
                xlWorkSheet = Nothing
    
            Next
    
            If Proceed Then
    
                dt.Columns.AddRange(New DataColumn() _
                    {
                        New DataColumn With {.ColumnName = "Identifier", .DataType = GetType(Int32), .AutoIncrement = True, .AutoIncrementSeed = 1},
                        New DataColumn With {.ColumnName = "CompanyName", .DataType = GetType(String)},
                        New DataColumn With {.ColumnName = "ContactName", .DataType = GetType(String)},
                        New DataColumn With {.ColumnName = "CompanyTitle", .DataType = GetType(String)}
                    }
                )
    
    
                Dim xlUsedRange As Excel.Range = Nothing
                If Used Then
                    xlUsedRange = xlWorkSheet.UsedRange
                Else
                    xlUsedRange = xlWorkSheet.Range(StartCell, EndCell)
                End If
    
                Try
    
                    Dim ExcelArray(,) As Object = CType(xlUsedRange.Value(Excel.XlRangeValueDataType.xlRangeValueDefault), Object(,))
    
                    If ExcelArray IsNot Nothing Then
                        ' Get bounds of the array.
                        Dim bound0 As Integer = ExcelArray.GetUpperBound(0)
                        Dim bound1 As Integer = ExcelArray.GetUpperBound(1)
    
                        For j As Integer = 1 To bound0
                            If Used AndAlso j = 1 Then
                                Continue For
                            End If
                            If (ExcelArray(j, 1) IsNot Nothing) Then
                                Try
                                    dt.Rows.Add(New Object() {Nothing, ExcelArray(j, 1), ExcelArray(j, 2), ExcelArray(j, 3)})
                                Catch ex As Exception
                                    dt.Rows.Add(New Object() {Nothing, ex.Message})
                                End Try
                            Else
                                dt.Rows.Add(New Object() {Nothing})
                            End If
                        Next
                    End If
                Finally
                    ReleaseComObject(xlUsedRange)
                End Try
    
            Else
                MessageBox.Show(SheetName & " not found.")
            End If
    
            xlWorkBook.Close()
            xlApp.UserControl = True
            xlApp.Quit()
    
            ReleaseComObject(xlCells)
            ReleaseComObject(xlWorkSheets)
            ReleaseComObject(xlWorkSheet)
            ReleaseComObject(xlWorkBook)
            ReleaseComObject(xlWorkBooks)
            ReleaseComObject(xlApp)
    
    
            Return dt
    
        End Function
        Private Sub ReleaseComObject(ByVal sender As Object)
            Try
                If sender IsNot Nothing Then
                    Runtime.InteropServices.Marshal.ReleaseComObject(sender)
                    sender = Nothing
                End If
            Catch ex As Exception
                sender = Nothing
            End Try
        End Sub
    End Class
    

    Form code

    Public Class Form1
        Private Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
            Dim ops As New Operations
            Dim dt As DataTable = ops.OpenExcelAndIterate(
                IO.Path.Combine(AppDomain.CurrentDomain.BaseDirectory, "Customers.xlsx"),
                "Customers", "A2", "C90", True)
            DataGridView1.DataSource = dt
        End Sub
    End Class
    


    Please remember to mark the replies as answers if they help and unmark them if they provide no help, this will help others who are looking for solutions to the same or similar problem. Contact via my Twitter (Karen Payne) or Facebook (Karen Payne) via my MSDN profile but will not answer coding question on either.
    VB Forums - moderator
    profile for Karen Payne on Stack Exchange, a network of free, community-driven Q&A sites

    Monday, April 03, 2017 10:49 PM
    Moderator
  • Hi Paul,

    you answer is vey similar to what i found so far:

    Dim data As Object(,) = DirectCast(_xlWorkSheet.UsedRange.Value2, Object(,))

    If you look i typed to get .Value2 for all data. What i am missing, do you know how to say for instance for two of the columns i want to use .Text instead of .Value2? At the moment all are .Value2.


    Why do you want to use Text? Text contains formatted values which you don't need. Also, there is no reason to use Value2. Use Value as in my example.

    Paul ~~~~ Microsoft MVP (Visual Basic)

    Tuesday, April 04, 2017 2:19 AM
  • Hi there is specific reason why. I got one column which format of it is set to [h]:mm:ss so means hour can exceed normal clock hour e.g 657:10:20. In this case when you get it by .Value you get double value which i tried to convert back again to 657:10:20, however it is very hard even using NumberFormat. I already raised couple topics about that and so far no answer. Therefore i tested only for that column to get it by .Text where i get exactly what i want (only for that column). Therefore i ask how to tell only for that column to get .Text
    Tuesday, April 04, 2017 7:47 AM
  • . I already raised couple topics about that and so far no answer. 

    Of course you get no answers. 

    You also don't get them when you ask: "How to drive with my head in a bucket".

    What you can do once is try yourself if what you write is the truth. 

    Currently you act like a messiah which has developed his own believe based on what he has seen and judged about that. 

    For instance 657:10:20 is not a normal clock hour. The maximum of a clock is 23:59:59 999

    But that is not even a topic for this thread, that was speed


    Success
    Cor



    Tuesday, April 04, 2017 9:40 AM
  • If you are getting a Double for the date then look at using DateTime.FromOADate.

    Please remember to mark the replies as answers if they help and unmark them if they provide no help, this will help others who are looking for solutions to the same or similar problem. Contact via my Twitter (Karen Payne) or Facebook (Karen Payne) via my MSDN profile but will not answer coding question on either.
    VB Forums - moderator
    profile for Karen Payne on Stack Exchange, a network of free, community-driven Q&A sites

    Tuesday, April 04, 2017 10:21 AM
    Moderator
  •  I have given a solution to this in post 10# of the link below.  8)

    How to load data to datatable faster using Interop.Excel


    If you say it can`t be done then i`ll try it

    Tuesday, April 04, 2017 6:03 PM