locked
BULK INSERT - Will not accept datetime value regardless of format. RRS feed

  • Question

  • I have 6 million rows worth of data I want to insert into my SQL Server database. I can do it the slow way with 6 million INSERT statements (by my calculation it would take 18 hours to run) or I can try BULK INSERT.

    BULK INSERT has issues with not being able to escape characters, but the data in this case is very simple and so shouldn't run into this problem.

    However SQL Server doesn't seem to like any form of date/time data to be inserted into a field.

    Here is the table (psuedo-SQL)

    CREATE TABLE Tasks (
      TaskId bigint NOT NULL IDENTITY(1,1) PRIMARY KEY,
      TriggerId bigint NOT NULL FOREIGN KEY,
      Created datetime NOT NULL,
      Modified datetime NOT NULL,
      ScheduledFor datetime NULL,
      LastRan datetime NULL,
      -- and about 10 more fields after this
    )
    
    

    Here is my BULK INSERT statement:

    SET DATEFORMAT dmy
    BULK INSERT Calls
    FROM 'C:\TasksBulk.dat'
    WITH (
      -- CHECK_CONSTRAINTS is not necessary as the only constraints are always enforced regardless of this option (UNIQUE, PRIMARY KEY, and NOT NULL)
      CODEPAGE = 'RAW',
      DATAFILETYPE = 'native',
    
      KEEPIDENTITY,
      MAXERRORS = 1,
      ORDER ( CallId ASC ),
    
      FIELDTERMINATOR = '\t',
      ROWTERMINATOR  = '\0'
    )
    

    And here is the first row of data in TasksBulk.dat:

    1000\t1092\t01/01/2010 04:00:17\t01/01/2010 04:00:17\t\t01/01/2010 04:00:14\0

    However when I run the BULK INSERT statement, I get this error:

    Msg 4864, Level 16, State 1, Line 2 Bulk load data conversion error (type mismatch or invalid character for the specified codepage) for row 1, column 3 (Created).

    I have tried using different row and field terminators and every different date/time format (including "01/01/2010", "2010-01-01", both with and without the "04:00:17" time component). I don't know what I'm doing wrong here.

    Sunday, August 21, 2011 2:00 AM

Answers

  • You want DATAFILETYPE = 'char', not 'native'.  Native is if the data is in SQL Server internal format and requires no conversion.  Usually, this is only true if you have used bcp to output the file and specified native in the bcp command.

    It's not the cause of your problem here, but the always safe way for datetime formats with character data are yyyymmdd if you don't specify time and either

    yyyy-mm-ddThh:mm:ss or yyyy-mm-ssThh:mm:ss.ttt if you do specify a time.

    Tom


    • Proposed as answer by Naomi N Sunday, August 21, 2011 2:53 AM
    • Marked as answer by DavidAmazing Sunday, August 21, 2011 3:06 AM
    Sunday, August 21, 2011 2:40 AM

All replies

  • You want DATAFILETYPE = 'char', not 'native'.  Native is if the data is in SQL Server internal format and requires no conversion.  Usually, this is only true if you have used bcp to output the file and specified native in the bcp command.

    It's not the cause of your problem here, but the always safe way for datetime formats with character data are yyyymmdd if you don't specify time and either

    yyyy-mm-ddThh:mm:ss or yyyy-mm-ssThh:mm:ss.ttt if you do specify a time.

    Tom


    • Proposed as answer by Naomi N Sunday, August 21, 2011 2:53 AM
    • Marked as answer by DavidAmazing Sunday, August 21, 2011 3:06 AM
    Sunday, August 21, 2011 2:40 AM
  • Yes, thank you. Changing the DATAFILETYPE to 'char' fixed the date formatting problem.

    However I have a column of type varbinary(16), and it rejects my "0xaabbccdd" data format. Is there a way to specify binary values in a BULK INSERT data file or should I insert it into a staging table and then do a conversion that way?

    Sunday, August 21, 2011 3:06 AM
  • Yes, thank you. Changing the DATAFILETYPE to 'char' fixed the date formatting problem.

    However I have a column of type varbinary(16), and it rejects my "0xaabbccdd" data format. Is there a way to specify binary values in a BULK INSERT data file or should I insert it into a staging table and then do a conversion that way?


    You probably can't do it without a format file.  Once you specify the file you are importing from is a character file, you can't use that file to load a column from that file unless that column is a type that can be implicitly converted to from varchar.  So you can, for example, insert into integer columns, or datetime.  But not varbinary because SQL won't implicitly convert character to varbinary.  A format file could perhaps do this, I'm not sure, I rarely use them.  Perhaps someone else can help you.

    I would load the file into a staging table.  I almost always do that anyway.  It allows you to do checks on the data in the staging table before it is loaded into your real tables. 

    Tom

    Sunday, August 21, 2011 6:18 AM
  • Thankfully it is not as bad as Tom says; you can load binary values directly with BULK INSERT. What it is putting you off is the 0x prefix. Thankfully, you can work around this with a format file. Assuming that you have 7 columns in your table, and the varbinary colunm is the fifth column, this is how your format file should look like:

    9.0
    7
    1 SQLCHAR 0 0 "\t"    1 col1 Latin1_General_CI_AS
    2 SQLCHAR 0 0 "\t"    2 col2 Latin1_General_CI_AS
    3 SQLCHAR 0 0 "\t"    3 col3 Latin1_General_CI_AS
    4 SQLCHAR 0 0 "\t"    4 col4 Latin1_General_CI_AS
    5 SQLCHAR 0 0 "\t0x"  5 col5 Latin1_General_CI_AS
    6 SQLCHAR 0 0 "\t"    6 col6 Latin1_General_CI_AS
    7 SQLCHAR 0 0 "\r\n"  7 col7 Latin1_General_CI_AS

    That is, you specify 0x as being part of the delimiter.


    Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
    Sunday, August 21, 2011 10:33 AM
  • Thankfully it is not as bad as Tom says; you can load binary values directly with BULK INSERT. What it is putting you off is the 0x prefix. Thankfully, you can work around this with a format file. Assuming that you have 7 columns in your table, and the varbinary colunm is the fifth column, this is how your format file should look like:

    9.0
    7
    1 SQLCHAR 0 0 "\t"  1 col1 Latin1_General_CI_AS
    2 SQLCHAR 0 0 "\t"  2 col2 Latin1_General_CI_AS
    3 SQLCHAR 0 0 "\t"  3 col3 Latin1_General_CI_AS
    4 SQLCHAR 0 0 "\t"  4 col4 Latin1_General_CI_AS
    5 SQLCHAR 0 0 "\t0x" 5 col5 Latin1_General_CI_AS
    6 SQLCHAR 0 0 "\t"  6 col6 Latin1_General_CI_AS
    7 SQLCHAR 0 0 "\r\n" 7 col7 Latin1_General_CI_AS

    That is, you specify 0x as being part of the delimiter.


    Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se


    Thanks, Erland

    One minor correction, if the fifth column is varbinary, then the delimiter for col4 should be the \t0x and the delimiter for col5 should be just \t.  Since the 0x is in the front of the fifth column, you need to treat it as part of the forth column's delimiter.

    Tom

    Sunday, August 21, 2011 8:20 PM
  • Thanks Tom, you are correct. This is the correct format file:

    9.0
    7
    1 SQLCHAR 0 0 "\t"    1 col1 Latin1_General_CI_AS
    2 SQLCHAR 0 0 "\t"    2 col2 Latin1_General_CI_AS
    3 SQLCHAR 0 0 "\t"    3 col3 Latin1_General_CI_AS
    4 SQLCHAR 0 0 "\t0x"  4 col4 Latin1_General_CI_AS
    5 SQLCHAR 0 0 "\t"    5 col5 Latin1_General_CI_AS
    6 SQLCHAR 0 0 "\t"    6 col6 Latin1_General_CI_AS
    7 SQLCHAR 0 0 "\r\n"  7 col7 Latin1_General_CI_AS

    Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
    Sunday, August 21, 2011 10:00 PM