none
SSIS Wizard cannot import text columns longer then 255 using Excel source

    General discussion

  • (Applies to SQLServer 2005 SP1)

    We have found that using the SSIS "Import and Export Wizard" using the "Microsoft Excel" data source that there appears to be a maximum column length of 255 characters for any row.

    Even when defining the destination table columns as nvarchar(4000), the wizard fails with the errors shown below.

    We have found no workaround except manually changing the imput data. There doesn't appear to be any "Advanced" options for the Excel importer as there are for the flat-text importer. So, no question here, just posting the bug so that *next* time someone searches the web for an answer, this post comes up


    Messages
    Error 0xc020901c: Data Flow Task: There was an error with output column "English String" (18) on output "Excel Source Output" (9). The column status returned was: "Text was truncated or one or more characters had no match in the target code page.".
    (SQL Server Import and Export Wizard)

    Error 0xc020902a: Data Flow Task: The "output column "English String" (18)" failed because truncation occurred, and the truncation row disposition on "output column "English String" (18)" specifies failure on truncation. A truncation error occurred on the specified object of the specified component.
    (SQL Server Import and Export Wizard)

    Error 0xc0047038: Data Flow Task: The PrimeOutput method on component "Source - Sheet1$" (1) returned error code 0xC020902A. The component returned a failure code when the pipeline engine called PrimeOutput(). The meaning of the failure code is defined by the component, but the error is fatal and the pipeline stopped executing.
    (SQL Server Import and Export Wizard)

    Error 0xc0047021: Data Flow Task: Thread "SourceThread0" has exited with error code 0xC0047038.
    (SQL Server Import and Export Wizard)

    Error 0xc0047039: Data Flow Task: Thread "WorkThread0" received a shutdown signal and is terminating. The user requested a shutdown, or an error in another thread is causing the pipeline to shutdown.
    (SQL Server Import and Export Wizard)

    Error 0xc0047021: Data Flow Task: Thread "WorkThread0" has exited with error code 0xC0047039.
    (SQL Server Import and Export Wizard)




    edit: After searching further this is documented under "Excel Source" in BOL which provides a registry-based workaround.  I guess the issue is that the wizard considers truncation to be  a 'fail' case and there's no easy way to override this behaviour, specify the column types nor determine which line is in error)

    Truncated text. When the driver determines that an Excel column contains text data, the driver selects the data type (string or memo) based on the longest value that it samples. If the driver does not discover any values longer than 255 characters in the rows that it samples, it treats the column as a 255-character string column instead of a memo column. Therefore, values longer than 255 characters may be truncated. To import data from a memo column without truncation, you must make sure that the memo column in at least one of the sampled rows contains a value longer than 255 characters, or you must increase the number of rows sampled by the driver to include such a row. You can increase the number of rows sampled by increasing the value of TypeGuessRows under the HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Excel registry key.
    )
    Wednesday, November 29, 2006 2:05 AM

All replies

  • Something I flashed on looking for this answer said something similar to set the {} key to 0 to make it scan all rows.  May have applied either to BULK IMPORT command lines or registry entries.  In any case, I set the above TypeGuessRows entry to 0 (I'm not as cautious with the registry as I should be-but it was on the PC with Management Studio not the SQL SERVER itself).  After closing completely out of Excel (probably N/A), SQL Server 2005 and reopening SQL, I then had to recreate my SSIS package.  My import of a field with 300+ characters worked.  More testing to see if my limits apply to everything.  I have files with several thousand lines and fields containing 30K characters yet to test. .dt

     

    Thursday, June 05, 2008 9:30 PM
  • Hi,

    i had some problem, i looked and i found the solution :

    http://msdn.microsoft.com/fr-fr/library/ms141683.aspx

    (u can use google translator for translate French to english)
    Wednesday, October 01, 2008 2:10 PM
  • Or just translate the "fr-fr" to "en-us" in the URL: http://msdn.microsoft.com/en-us/library/ms141683.aspx

     

    The relevant section:

     

    Truncated text. When the driver determines that an Excel column contains text data, the driver selects the data type (string or memo) based on the longest value that it samples. If the driver does not discover any values longer than 255 characters in the rows that it samples, it treats the column as a 255-character string column instead of a memo column. Therefore, values longer than 255 characters may be truncated. To import data from a memo column without truncation, you must make sure that the memo column in at least one of the sampled rows contains a value longer than 255 characters, or you must increase the number of rows sampled by the driver to include such a row. You can increase the number of rows sampled by increasing the value of TypeGuessRows under the HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Excel registry key. For more information, see PRB: Transfer of Data from Jet 4.0 OLEDB Source Fails w/ Error.

    Interesting... you learn something new everyday!

     

    Thanks, nizaration!

    Wednesday, October 01, 2008 3:32 PM
  • Hi a.dt,
    Does changing the registry for TypeGuessRows to 0 works for you for field with more than 300+ characters? Or do you know if it works for around 2000 char?


    Thanks for your help,
    Clarice
    Friday, May 01, 2009 8:50 AM
  • I've encountered this error, but am a bit confused by the implementation of the solution. If I have an excel file which I know will be changing and those updates will be loaded on a set schedule - how am I to go about augmenting the registry?

    So in the original load I know that there is a comment field with approx 265 characters in it. This of courses causes the above error. This record is 2061 in the excel file. Am I to increase my TypeGuessRows to 2100?

    Or am I suppose to just move that record to #1? If this is true then the problem takes on a new form, as I do not control the master file. Therefore I can move this record for the initial load, but the master file will then get overwritten on subsequent loads.

    What I'm working on is a very time limited piece of work, so I am not treating this as an SCD, I'm completely truncating the dest table and re-writing all data over top of it.

    I could manually change the record order before each load, but I'd rather not.

    any other thoughts?

    Thanks!

    Friday, July 31, 2009 6:11 PM
  • Read up on Douglas Laudenschlager's blog: http://dougbert.com/blogs/tags/Excel/default.aspx  Great articles there.

    TypeGuessRows only takes numbers from 0 to 16, IIRC.  Zero is the number you want, which basically tells the Excel driver to check every row, not the "first" bunch.
    Todd McDermid's Blog
    Friday, July 31, 2009 6:28 PM
  • Besides TypeGuessRows registry, FYI there are other tweaks (connection string extended properties) which the canned Excel Source doesn't expose in the Import Export wizard.

    As BOL points out… http://support.microsoft.com/kb/194124 This article tells us we can add these tweaking switches to the properties, but I found its only possible from the Jet Extended Properties, which the Excel Source in SSIS doesn't expose.

    You can get to Excel from Import/Export Wizard using the Jet provider like this:

    1. Open the Import Export Wizard (Right click > All Tasks > import Data)
    2. Pick the source is “Microsoft Access”
    3. Point to the .xls file in the File Name box (browse)
    3A. During the browse, Change the File name filter to [All files (*.*)] to find your .xls file.. Click “Open” to select the .xls.
    4. Click the Advanced button.
    4A. Click the ALL tab of the Data Link Properties
    4B. Double Click the “Extended Properties” item and type in the string “Excel 8.0;HDR=NO;IMEX=1;” without quotes
    As per http://www.connectionstrings.com/excel
    "HDR=Yes;" indicates that the first row contains columnnames, not data. "HDR=No;" indicates the first row is data.
    "IMEX=1;" tells the driver to always read "intermixed" (numbers, dates, strings etc) data columns as text.
    To query the worksheet in Excel, the query will be something like SELECT * FROM [sheet1$]

    5. Next
    6. Pick the SQL Server (SQL Native Client) as the destination
    7. Next.
    8. Click the Edit Mappings button and you will see they are all nvarchar(255), but I think the size can be adjusted.
    9. Next - save the package if you want, or just Execute it.

    Thx, Jason
    Didn't get enough help here? Submit a case with the Microsoft Customer Support team for deeper investigation - http://support.microsoft.com/select/default.aspx?target=assistance
    Saturday, August 01, 2009 4:17 AM
  • Jason.  will this work on SQL 2008 on a 64-bit Vista Home Premium system (no Jet 4.0) ?  I am having a VERY hard time getting more than 255 characters to import from any text fields in Excel.  Whatever method I use, the 'source column' is ALWAYS VarChar (255), and only 255 characters get imported.  I have changed my TypeGuessRows to 0, and i tried the 'pretend it's Access' method you described above.  All to no avail.
    Friday, August 07, 2009 9:56 PM
  • For myself, if I moved a record which contained more than 255 chars to row 1, and then redid the data flow (so as to ensure the meta data was all up to speed). When SSIS created the target table it create that field as type NTEXT and it worked perfectly. I have not played around with the registry solution as of yet. I am also waiting on installing Office 2007 to see if the problems still persist.

    Friday, August 07, 2009 10:08 PM
  • This change worked for me.  I did not have to "re-create my SSIS package", just made the change, rebooted, and then it worked.  Sweet!

    Monday, April 19, 2010 6:02 PM
  • I know this is a bit old, but maybe it can help someone.  On a 64-bit Windows Server 2008 R2 machine, I found the TypeGuessRows value in HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Microsoft\Jet\4.0.  The value also exists in HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Microsoft\Jet\3.5 for me, so I changed both of them to 0.  For my application, it seemed to help with my initial problem, but there was data further down in the 40,000 lines I was importing that it did not properly detect the length.  I ended up padding the first dummy record to get SQL 2008 to pick up the length I wanted it to.

    Thanks.

    John

    Tuesday, July 26, 2011 3:34 PM
  • Michael Gilligan,

     

    Were you able to resolve the issue because even my situation is similar to yours.

    I am using sql server 2008 and excel 2010 and it does not change the 255 datalength. This is sooooooo irritating. I dont know why microsoft sells unfinished work or atleast provide us with a workaround.

     

    Regards,

    Simon

    Thursday, August 11, 2011 12:20 AM
  • The steps you have to take are pretty simple.

    Ensure that one of your rows with text longer than 255 characters is within the first eight rows, OR

    Change the TypeGuessRows registry key to zero (so it does a full scan of all rows).

    The Excel provider will then interpret the column as a MEMO, and deliver it to SSIS as DT_TEXT.


    Todd McDermid's Blog Talk to me now on
    Thursday, August 11, 2011 4:48 PM
  • You may also need to change the key here: HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Microsoft\Office\12.0\Access Connectivity Engine\Engines\Excel\TypeGuessRows which is where my machine (Windows 7 64-bit) was looking for it.
    Thursday, September 08, 2011 5:01 PM
  • This has not worked for me.  I'm importing into SQL 2005 a spreadsheet that now has a Comment column of 355 characters in length.  The servers have 'typeGuessRows' set to 0 under HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Excel.  My SSIS are built in BIDS 2005, so I assume I don't need to change any other leaves in the registry other than this one.  I'm running them as a scheduled task in the DB as a command prompt so that I can run it with the 32-bit version of DTSEXEC.

    The Excel Source in my SSIS package has the Comment field as NTEXT, which I get why that happens.  I have a data conversion to convert that DT_NTEXT to DT_TEXT, then another to convert from DT_TEXT to DT_STR(1000).  I've also tried going into the advanced editor for the Excel Source and changing the field it outputs to DT_NSTR(1000), then converting DT_NSTR to DT_STR(1000).  I've tried setting the column in the destination server to Varchar(4000) and to Varchar(max), I've tried adding IMEX=1 to my connection string.  Nothing seems to work other than 'ignore truncation errors'.  

    I get various errors depending on what permutation of settings I've chosen: "Failed to retrieve long data", "cannot convert between unicode and non-unicode string data types", "Multiple-step OLE DB operation generated errors. Check each OLE DB status value, if available. No work was done."  The one thing that is true is that no work is getting done.  I don't have any control over how the spreadsheets are organized, they are sent to us from an external organization and we have little control over them.

    Can anyone please point out to me what I have been missing on this?


    • Edited by JBrune Tuesday, September 13, 2011 9:47 PM
    Tuesday, September 13, 2011 9:44 PM
  • Start by posting a new thread.  In that thread, give us some sample data - say the first ten rows of your spreadsheet.  Then tell us - for each scenario you tried:

    • What the scenario was.  ("I set typeguessrows to zero, the Excel source showed the column as NTEXT, ...)
    • What a data viewer attached to the Excel Source output looked like.
    • What errors or warnings (if any) you see executing the package.

    If you've tried several things, post up the same info for each scenario.


    Todd McDermid's Blog Talk to me now on
    Wednesday, September 14, 2011 3:25 PM
  • This worked perfectly for me.  Thank you kind sir.
    Friday, December 16, 2011 11:11 PM
  • The above discussions are helpful to conceptually understand the Excel Truncation issue. 

    SSIS: How to resolve Excel Import 255 character Truncation issue?

    http://www.singhvikash.in/2012/05/ssis-how-to-resolve-excel-import-255.html


    Vikash Kumar Singh || www.singhvikash.in

    Saturday, May 26, 2012 1:27 AM
  • I am having the same problem but with a flat file import of .csv file it will only sample first 10000 rows

    How do I increase the default?

    Wednesday, August 22, 2012 4:10 AM
  • Are you using the Flat File Source, or Excel source?

    Have you tried flat file source, comma delimited?

    Thanks, Jason


    Didn't get enough help here? Submit a case with the Microsoft Customer Support team for deeper investigation - http://support.microsoft.com/select/default.aspx?target=assistance

    Wednesday, August 22, 2012 4:19 AM
  • Brilliant! It is found in the same location on 64 bit Windows 7 as well!!
    Tuesday, January 28, 2014 3:46 PM