SQLBulkCopy and SQLServer Network IO

  • Question


    Greetings.

     

    We recently updated our system to use SqlBulkCopy (C#, .NET 2.0) to import data directly into tables in our database. Has anyone else encountered high network IO and CPU utilization on SQL Server with this import method?

     

    I plan on setting the BatchSize property to a fixed value next.
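
    For reference, a rough sketch of the kind of call we make is below (the connection string, destination table name, batch size, and source DataTable are placeholders here, not our real values):

    // C#, .NET 2.0 -- requires System.Data and System.Data.SqlClient.
    // connectionString and sourceTable (a populated DataTable) are assumed to exist.
    using (SqlConnection conn = new SqlConnection(connectionString))
    {
        conn.Open();
        using (SqlBulkCopy bulkCopy = new SqlBulkCopy(conn))
        {
            bulkCopy.DestinationTableName = "dbo.ImportTarget";  // placeholder name
            bulkCopy.BatchSize = 500;                            // the fixed value we plan to experiment with
            bulkCopy.BulkCopyTimeout = 60;                       // seconds
            bulkCopy.WriteToServer(sourceTable);
        }
    }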

     

    Thanks.

    Dave

     

    Monday, April 21, 2008 8:55 PM

All replies

  • Can you be more specific about how high the network IO and CPU utilization are? Since bulk copy is designed to upload a large amount of data quickly, it is possible that you will see higher network and CPU usage.

    Thanks.

    Monday, April 21, 2008 9:09 PM
  • We're using Veritas I3 to collect this transaction data. For a 33-second single-table import, the breakdown looks like this:

     

    Sql Server Time: 3 Seconds

    Network IO Time: 30 Seconds

     

    Along with this, we see CPU usage take a significant jump (>50%). Under production load testing with multiple requests, we use a lot of CPU time and consistently see CPU utilization steadily above 80-90%.

     

    Thanks.

    Dave

     

    Monday, April 21, 2008 9:20 PM
  • While it depends on factors like your data size and how you are using SqlBulkCopy, it is not unusual for CPU usage on the server to increase (many insertions are happening), especially if you are doing concurrent insertions into tables, which can create locking on the server.

     

    The network is also frequently a bottleneck if your data volume is large enough.

     

    So the answer to your first question is: yes, it is common to see an increase in CPU and network usage... but see below.

    On your second question: setting the batch size will show you whether your buffer is too large (consuming too much memory and causing degradation if your client has RAM constraints) or too small (so that you are not exploiting the network capacity; if it is too small, the communication gets too chatty, which might be your problem). The batch size is a fixed value by default, though, so you probably mean increasing or decreasing it to fine-tune.

     

    We still don't have enough information to tell whether this is good or bad performance, because we don't know how much data you are transferring. So please clarify:

     

    • What method did you use before?
    • Why do you find the performance unreasonable?
    • How much data are you transferring (e.g. number of rows, average data size per row)?
    • What kind of client / server / network do you have?
    • Why are you doing multiple requests? Concurrent bulk copies are not impossible, but they are not the common pattern.

    Some other things you may want to take a look at:

    • Do you have indexes on the destination tables? If so, are they created before or after the load? I have found it makes a significant difference to use unindexed tables for the transfer and then move the data into the final production table in the more efficient order (and with the network factor removed).
    • Do you have additional validations, e.g. constraints, unique indexes, or data conversions, that might slow your process down? Again, sometimes having a destination table with all string columns, accepting nulls, and no constraints makes the load faster; you then do the validation / conversion / transfer on the server side, for example using a stored procedure (a rough sketch follows this list).
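
    For illustration, a rough sketch of that staging pattern (the staging table, stored procedure, connection string, and source DataTable names below are hypothetical, not taken from your schema):

    // Hypothetical sketch -- requires System.Data and System.Data.SqlClient.
    // Bulk copy into an unindexed, constraint-free staging table, then let the
    // server validate / convert / move the rows in one set-based step.
    using (SqlConnection conn = new SqlConnection(connectionString))
    {
        conn.Open();

        using (SqlBulkCopy bulkCopy = new SqlBulkCopy(conn))
        {
            bulkCopy.DestinationTableName = "dbo.ImportStaging";  // no indexes, no constraints, nullable columns
            bulkCopy.WriteToServer(sourceTable);
        }

        // Server-side validation / conversion / transfer into the real table.
        using (SqlCommand cmd = new SqlCommand("dbo.usp_LoadFromImportStaging", conn))
        {
            cmd.CommandType = CommandType.StoredProcedure;
            cmd.ExecuteNonQuery();
        }
    }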
    Tuesday, April 22, 2008 1:11 AM
  • Here is some extra detail about the destination table (I've disguised the names for confidentiality purposes). Please excuse the use of sql_variant (not my first choice).

     

    CREATE TABLE [dbo].[XXXX](
     [c1] [int] NOT NULL,
     [c2] [int] NOT NULL,
     [c3] [bit] NULL,
     [c4] [bit] NULL,
     [c5] [sql_variant] NULL,
     [c6] [bit] NULL,
     [c7] [bit] NULL,
     [c8] [bit] NULL,
     [c9] [varchar](40) NULL,
     [c10] [timestamp] NULL,
     CONSTRAINT [XXXX_PK] PRIMARY KEY CLUSTERED
    (
     [c1] ASC,
     [c2] ASC
    )WITH (IGNORE_DUP_KEY = OFF) ON [PRIMARY]
    ) ON [PRIMARY]


    ALTER TABLE [dbo].[XXXX]  WITH NOCHECK ADD  CONSTRAINT [FKTable_XXXX_FK1] FOREIGN KEY([c2])
    REFERENCES [dbo].[FKTable] ([c1])
    GO
    ALTER TABLE [dbo].[XXXX] CHECK CONSTRAINT [FKTable_XXXX_FK1]
    GO
    ALTER TABLE [dbo].[XXXX]  WITH NOCHECK ADD  CONSTRAINT [FKTable2_XXXX_FK1] FOREIGN KEY([c2])
    REFERENCES [dbo].[FKTable2] ([c1])
    ON UPDATE CASCADE
    ON DELETE CASCADE
    GO
    ALTER TABLE [dbo].[XXXX] CHECK CONSTRAINT [FKTable2_XXXX_FK1]

     

  • What method did you use before? We used components that used ADO (COM) bulk insert and the bulk insert capability of LLBLGen (an ORM for .NET).
  • Why do you find the performance unreasonable? I don't necessarily find it unreasonable. For the reasons you state in your response (indexes, constraints, etc.), I can completely understand why it performs the way it does.
  • How much data are you transferring (e.g. number of rows, average data size per row)? We're averaging around 1,000 rows per import for this table. (We import many other tables this way as well.)
  • What kind of client / server / network do you have? Windows 2003 64-bit and SQL Server 2005 64-bit (an 8-way server with a high-performance SAN). The network has plenty of bandwidth.
  • Why are you doing multiple requests? Concurrent bulk copies are not impossible, but they are not the common pattern. I think this comment is the key issue. We can run many of these imports (10+) simultaneously at any time during a typical day.

     

    I really think this is not so much an issue with SqlBulkCopy as it is an overall pattern issue. In past systems I have worked on, I always staged imported data into temporary tables before moving it into the final destination tables. Basically, SqlBulkCopy (and any data import task) should follow standard database ETL and general data import practices.

     

    We are importing data directly into operational tables that are being read by clients concurrently with the import. This is where I think I should focus.
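
    For example (purely illustrative names, assuming a staging table like dbo.XXXX_Staging), once the data is staged, the move into the operational table could be a single short, set-based transaction, so concurrent readers are only affected briefly rather than for the whole bulk copy:

    // Illustrative only: transfer staged rows into the operational table in one
    // short transaction. Column c10 (timestamp) is omitted because SQL Server
    // populates it automatically.
    string transferSql =
        "BEGIN TRANSACTION; " +
        "INSERT INTO dbo.XXXX (c1, c2, c3, c4, c5, c6, c7, c8, c9) " +
        "SELECT c1, c2, c3, c4, c5, c6, c7, c8, c9 FROM dbo.XXXX_Staging; " +
        "TRUNCATE TABLE dbo.XXXX_Staging; " +
        "COMMIT TRANSACTION;";

    using (SqlConnection conn = new SqlConnection(connectionString))
    using (SqlCommand cmd = new SqlCommand(transferSql, conn))
    {
        conn.Open();
        cmd.ExecuteNonQuery();
    }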

     

    I welcome further comments.

     

    Thanks for discussing this!

     

Tuesday, April 22, 2008 5:40 PM