none
Bulk inserting a csv in SQL using a formatfile to remove double quotes RRS feed

  • Question

  • I have a CSV file in the following format:

    data, data, "timestamp", data, data, data, data, data

    I need to remove the double quotes from around the timestamp data, then insert it into the table as a DATETIME data type.

    After researching formatfiles, I have come up with this:

    10.0
    8
    1   SQLCHAR 0   12  ","     1   Data        SQL_Latin1_General_CP1_CI_AS
    2   SQLCHAR 0   12  ","     2   Data        SQL_Latin1_General_CP1_CI_AS
    3   SQLCHAR 0   26  "",""   3   Timestamp   SQL_Latin1_General_CP1_CI_AS
    4   SQLCHAR 0   41  ","     4   Data        SQL_Latin1_General_CP1_CI_AS
    5   SQLCHAR 0   41  ","     5   Data        SQL_Latin1_General_CP1_CI_AS
    6   SQLCHAR 0   41  ","     6   Data        SQL_Latin1_General_CP1_CI_AS
    7   SQLCHAR 0   5   ","     7   Data        SQL_Latin1_General_CP1_CI_AS
    8   SQLCHAR 0   12  "0x0a"  6   Data        SQL_Latin1_General_CP1_CI_AS

    where the 3rd row, Timestamp, is the item with the double quotes around it.

    Attempting to use this file in a Bulk Insert results in the error message "Msg 4823, Level 16, State 1, Line 2 Cannot bulk load. Invalid column number in the format file."

    I've tried ' "," ', and "\",\"" instead of "","" with no success. In most threads I've also seen examples of using a dummy column to strip out the first set of quotes, but that is when every entity in the csv file is surrounded with quotes.

    Is there a way I can alter my formatfile to do what I need?

    Tuesday, November 27, 2012 9:50 PM

All replies

  • How about doing it the simple way: SSIS Import/Export Wizard:

    http://www.sqlusa.com/bestpractices/ssis-wizard/

    I just don't understand why so many software engineers struggle & suffer & waste time with bcp/bulk insert?


    Kalman Toth SQL 2008 GRAND SLAM
    New Book: Beginner Database Design & SQL Programming Using Microsoft SQL Server 2012


    Tuesday, November 27, 2012 10:06 PM
    Moderator
  • Thanks for the advice Kalman. I should have mentioned that I intend to create the table using a stored procedure, so using the SSIS Wizard is not an option unfortunately.
    Tuesday, November 27, 2012 10:22 PM
  • I see... It looks like bcp is here to stay for another 100 years.... most people don't even know what it means: bulk copy program...

    Why don't you use bcp to make a format file for you?  Article:

    http://msdn.microsoft.com/en-us/library/ms191516.aspx


    Kalman Toth SQL 2008 GRAND SLAM
    New Book: Beginner Database Design & SQL Programming Using Microsoft SQL Server 2012



    Tuesday, November 27, 2012 10:36 PM
    Moderator
  • Have a look at this:

    http://blogs.msdn.com/b/sqlserverfaq/archive/2010/02/04/how-to-remove-unwanted-quotation-marks-while-importing-a-data-file.aspx

    Chuck Pedretti | Magenic – North Region | magenic.com

    I did read that article, but the "\",\"" did not work. I get "Cannot bulk load. Invalid destination table column number for source column 3 in the format file."
    Tuesday, November 27, 2012 10:37 PM
  • From the sample you have posted, you should have this format file:

    10.0
    8
    1   SQLCHAR 0   12  ","     1   Data        SQL_Latin1_General_CP1_CI_AS
    2   SQLCHAR 0   12  ","     2   Data        SQL_Latin1_General_CP1_CI_AS
    3   SQLCHAR 0   26  ",\""   3   Timestamp   SQL_Latin1_General_CP1_CI_AS
    4   SQLCHAR 0   41  "\","   4   Data        SQL_Latin1_General_CP1_CI_AS
    5   SQLCHAR 0   41  ","     5   Data        SQL_Latin1_General_CP1_CI_AS
    6   SQLCHAR 0   41  ","     6   Data        SQL_Latin1_General_CP1_CI_AS
    7   SQLCHAR 0   5   ","     7   Data        SQL_Latin1_General_CP1_CI_AS
    8   SQLCHAR 0   12  "\r\n"  6   Data        SQL_Latin1_General_CP1_CI_AS

    That is, the delimiter for the second field is ," and the timestamp field is terminated by ",

    I also changed the delimiter for the last line. Assuming that this is a Windows file, you should have \r\n. if it is indeed a Unix file, you should keep \n as you have now.


    Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
    Tuesday, November 27, 2012 10:40 PM
  • IF OBJECT_ID('#tmp') IS NOT NULL DROP TABLE #tmp
    CREATE TABLE #tmp ( a INT, b INT, c DATETIME, d INT, e INT, f INT, g INT, h INT )
    GO
    
    BULK INSERT #tmp
    FROM 'c:\temp\temp.csv'
    WITH ( FORMATFILE = 'c:\temp\format.txt' )
    GO
    
    SELECT * FROM #tmp

    I got this to work with the following sample data:

    1,2,"1 Jan 2012",4,5,6,7,8
    9,10,"1 Apr 2012",12,13,14,15,16
    17,18,"28 Feb 2012",19,20,21,22,23

    This XML format file:

    <?xml version="1.0"?>
    <BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
    	<RECORD>
    		<FIELD ID='1' xsi:type='CharTerm' TERMINATOR=',' MAX_LENGTH='50' COLLATION='Latin1_General_CI_AS'/>
    		<FIELD ID='2' xsi:type='CharTerm' TERMINATOR=',"' MAX_LENGTH='50' COLLATION='Latin1_General_CI_AS'/>
    		<FIELD ID='3' xsi:type='CharTerm' TERMINATOR='",' MAX_LENGTH='50' COLLATION='Latin1_General_CI_AS'/>
    		<FIELD ID='4' xsi:type='CharTerm' TERMINATOR=',' MAX_LENGTH='50' COLLATION='Latin1_General_CI_AS'/>
    		<FIELD ID='5' xsi:type='CharTerm' TERMINATOR=',' MAX_LENGTH='50' COLLATION='Latin1_General_CI_AS'/>
    		<FIELD ID='6' xsi:type='CharTerm' TERMINATOR=',' MAX_LENGTH='50' COLLATION='Latin1_General_CI_AS'/>
    		<FIELD ID='7' xsi:type='CharTerm' TERMINATOR=',' MAX_LENGTH='50' COLLATION='Latin1_General_CI_AS'/>
    		<FIELD ID='8' xsi:type='CharTerm' TERMINATOR='\r\n' MAX_LENGTH='50' COLLATION='Latin1_General_CI_AS'/>
    	</RECORD>
    	<ROW>
    		<COLUMN SOURCE='1' NAME='a' xsi:type='SQLVARYCHAR'/>
    		<COLUMN SOURCE='2' NAME='b' xsi:type='SQLVARYCHAR'/>
    		<COLUMN SOURCE='3' NAME='c' xsi:type='SQLDATETIME'/>
    		<COLUMN SOURCE='4' NAME='d' xsi:type='SQLVARYCHAR'/>
    		<COLUMN SOURCE='5' NAME='e' xsi:type='SQLVARYCHAR'/>
    		<COLUMN SOURCE='6' NAME='f' xsi:type='SQLVARYCHAR'/>
    		<COLUMN SOURCE='7' NAME='g' xsi:type='SQLVARYCHAR'/>
    		<COLUMN SOURCE='8' NAME='h' xsi:type='SQLVARYCHAR'/>
    	</ROW>
    </BCPFORMAT>

    And this script:

    Tuesday, November 27, 2012 10:43 PM
  • I did read that article, but the "\",\"" did not work. I get "Cannot bulk load. Invalid destination table column number for source column 3 in the format file."

    So how does the CREATE TABLE statement for your table look like? Does your table have eight columns?


    Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
    Wednesday, November 28, 2012 8:41 AM
  • Hi,

    Please take a look on the url hope it will work fine:

    http://www.codeproject.com/Articles/439843/Handling-BULK-Data-insert-from-CSV-to-SQL-Server


    Ahsan Kabir Please remember to click Mark as Answer and Vote as Helpful on posts that help you. This can be beneficial to other community members reading the thread. http://www.aktechforum.blogspot.com/

    Wednesday, November 28, 2012 9:00 AM
  • I did read that article, but the "\",\"" did not work. I get "Cannot bulk load. Invalid destination table column number for source column 3 in the format file."

    So how does the CREATE TABLE statement for your table look like? Does your table have eight columns?


    Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
    CREATE TABLE [dbo].[TABLE] 
    (ID INT,
     TagId INT NOT NULL,
     DayTime DATETIME NOT NULL,
     X DECIMAL(18,5) NOT NULL,
     Y DECIMAL(18,5) NOT NULL,
     Z DECIMAL(18,5) NOT NULL,
     Status TINYINT NOT NULL,
     AssetID INT DEFAULT NULL,
     )

    I tried your suggestion Erland:

    3   SQLCHAR 0   26  ",\""   3   Timestamp   SQL_Latin1_General_CP1_CI_AS
    4   SQLCHAR 0   41  "\","   4   Data        SQL_Latin1_General_CP1_CI_AS
    But I get "Cannot bulk load because the file "C:\...\FormatFile.txt" could not be read. Operating system error code (null)."

    Wednesday, November 28, 2012 2:09 PM
  • So there are 8 eight columns in the table good.

    But I get "Cannot bulk load because the file "C:\...\FormatFile.txt" could not be read. Operating system error code (null)."

    Make sure that there is a line break after the last line. Also make sure the spaces between the fields are real spaces and not no-break spaces. A copy-and-paste from a forum post can induce that. And BCP is a very sensitive fellow.

    You can replace hard spaces with a find-replace dialog in your favourite editor. To enter a no-break space in the Find box, press ALT and keep it pressed and type 0160 on the numeric keypad.


    Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se
    Wednesday, November 28, 2012 2:48 PM