none
SSIS excel data source numeric values returned as null RRS feed

  • Question

  • Hello everyone, long time no see!, I have a problem and was hoping someone can help me:
    I'm using SSIS 2005 Enterprise edition,  I'm creating a package that reads an excel (xls) file using the "excel source" component, and it dumps the data into an OLEDB destination (a sql server).
    When I drag the excel source component and create the excel connection to my file the component automatically reads the columns and their datatypes.

    The problem is that I have a column which has numeric data and the package uploads as NULL every number that starts with a zero. (note: in excel this column is formatted as "text", despite it has only numbers, because it's the only way excel maintains the left sided zeros).
    So I checked the data types by right clicking the excel source component -> show advanced editor and my surprise is that this column's data type is detected as double-precision float, and it doesn't let me change it.

    I tried the answer posted here: http://devselekta.blogspot.com/2007/09/ssis-excel-data-source-values-returned.html but it only works when the first row of data has a number beginning with zero on this column.
    Someone knows how to get the data imported correctly? Thanks in advance.
    Odin_Dark
    Friday, May 8, 2009 10:30 PM

Answers

  • You might want to check out this page in Books Online: http://msdn.microsoft.com/en-us/library/ms141683.aspx

    You've nailed the problem - the Excel driver is scanning the first few rows to determine the type of the column. If the data has no leading zeroes, it interprets it as numeric, and then NULLs out values that have leading zeroes because these appear as text to the driver. Excel defaults to only using the first 8 rows of data to do this guessing. You can increase the number of rows it uses by modifying the TypeGuessRows key in the registry under HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Excel.

    Hope this helps.


    John Welch | www.mariner-usa.com | www.agilebi.com | ssisUnit.codeplex.com
    • Marked as answer by Odin_dark Monday, May 11, 2009 10:46 PM
    Monday, May 11, 2009 9:54 PM
    Moderator

All replies

  • Hi,

    Use DataConversion tool. The Data Conversion transformation converts the data in an input column to a different data type and then copies it to a new output column.

    Change datatype of that filed to "DTS_Str" using DataConversion. 
    So now your order is change in DataFlowTask-

    1. ExcelSource
    2. DataConversion
    3. OLEDB Destination 

    Goto this url for more details: http://msdn.microsoft.com/en-us/library/ms141706.aspx
    Kapil Khalas
    Saturday, May 9, 2009 7:13 AM
  • Hi Kapil, first of all thanks for your response.
    Unfortunately I believe that the problem is on the Excel Source component, because when I add it and create the connection there is a "preview" button, and there all numeric values that start with zero are previewed as null, which means that the problem is that the excel source component reads the values as null from the very beggining, so when the data arrives to the DataConversion this data is already null.
    I tried your suggestion and it didn't work, It's still giving null values.

    I appreciate your response, thanks in advance!



    Odin_Dark
    Monday, May 11, 2009 3:10 PM
  • I was able to view data from excel using Excel Source as expected. The excel columns were type text with values having leading zeroes. The only thing I can thing of is that you have the incorrect version of Excel selected in the connection manager.
    -Will
    Monday, May 11, 2009 3:48 PM
  • Thanks wmlpez,
    I'm connecting to an excel 2003 file (.xls) using the following connection string:
    Provider=Microsoft.Jet.OLEDB.4.0;Data Source=C:\MyExcel.xls;Extended Properties="Excel 8.0;HDR=Yes;IMEX=1";

    The curious thing is that, if the first rows of data begin with zero, the values are uploaded correctly, but if the first rows of data does NOT have zeroes, then the data with zeroes at beginning are uplaoded as null, example:

    .....this is uploaded correclty (no nulls because first rows have zeroes at beginning)
    Row # | Numeric Data
    ------------------------------
    1        | 01548
    2        | 00157
    3        | 78996
    4        | 73956
    5        | 01547


    ....this is uploaded INCORRECTLY (null values are read where zeroes are at beginning)
    Row # | Numeric Data
    ------------------------------
    1        | 81548
    2        | 89157
    3        | 78996
    4        | 48956
    5        | 71547
    6        | 91547
    7        | 11548
    8        | 91547
    9        | 41547
    10      | 71543
    11      | 01547
    12      | 00595


    It looks like SSIS is only reading data correctly when first N rows of data have zero at beginning....
    Do you know what could be possibly happenning? If you can have the right behavior, can you send me an example of how you  do it? (or a zipped project?)
    Thanks in advance!






    Odin_Dark
    • Proposed as answer by Camilosser Tuesday, March 10, 2015 7:19 PM
    Monday, May 11, 2009 8:47 PM
  • You might want to check out this page in Books Online: http://msdn.microsoft.com/en-us/library/ms141683.aspx

    You've nailed the problem - the Excel driver is scanning the first few rows to determine the type of the column. If the data has no leading zeroes, it interprets it as numeric, and then NULLs out values that have leading zeroes because these appear as text to the driver. Excel defaults to only using the first 8 rows of data to do this guessing. You can increase the number of rows it uses by modifying the TypeGuessRows key in the registry under HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Jet\4.0\Engines\Excel.

    Hope this helps.


    John Welch | www.mariner-usa.com | www.agilebi.com | ssisUnit.codeplex.com
    • Marked as answer by Odin_dark Monday, May 11, 2009 10:46 PM
    Monday, May 11, 2009 9:54 PM
    Moderator
  • Hello jwelch,
    Thanks a lot for your response, in fact that's the problem, and now I understand how IMEX & guessrows works, since I don't really like to configure the registry on the server because maintenance will be a little complicated (and guessing 65000 rows to assure finding the data type will be very slow), I did the following:

    1-I set the IMEX=1 property on my connection string on the connection manager
    2-On the connection manager, I unchecked the "first row has column names" option (since my titles will always be text, the driver will recognize text
       always and treat all data as such)
    3-I remove manually (sql query) the first row of data since it has only the column names.
    4-I do manually any data type conversion I need.

    By the way, I don't really understand why microsoft always try to guess data types, instead of letting us choose which is the data type of the column on the excel source component and respect that.....   I really really don't see why guessing is the only way...

    If I could put a suggestion to microsoft I would suggest that.

    Thanks a lot for your amazing support guys!



    Odin_Dark
    Monday, May 11, 2009 10:46 PM
  • My understanding of the issue is that Excel really doesn't have typed data, not in the same way as a database does. The JET driver that is used to query Excel does it's own guessing about the types, as described above. The SSIS team's not really involved in that - a different group manages the JET provider. And since they are not really planning any updates to it, I doubt we'll see anything different in the near term.
    John Welch | www.mariner-usa.com | www.agilebi.com | ssisUnit.codeplex.com
    Tuesday, May 12, 2009 2:27 PM
    Moderator
  • Coonection manager does not let me change the properties but i changed expresion like this

    "Provider=Microsoft.Jet.OLEDB.4.0; Data Source=" + @[User::FileName] + ";Extended Properties=\"EXCEL 8.0;HDR=YES;IMEX=1\";"

    Column which has datalike this 23456A it work fine but column which has 123456 it show like 2.01375e+006 but actual number is like this 2012449.

    Please help me to solve this issue ASAP

    • Proposed as answer by Radhai Thursday, January 3, 2013 9:09 AM
    • Unproposed as answer by Radhai Thursday, January 3, 2013 9:10 AM
    Wednesday, October 20, 2010 6:10 PM
  • please check the following link.

    http://support.microsoft.com/kb/194124/en-us

     

         Set Db = OpenDatabase("C:\Temp\Book1.xls", _
                False, True, "Excel 8.0; HDR=NO; IMEX=1;")
    


    Regards, Ahmad Elayyan
    • Proposed as answer by Tech_consultant Thursday, October 18, 2012 10:50 AM
    Tuesday, September 27, 2011 10:16 PM
  • Coonection manager does not let me change the properties but i changed expresion like this

    "Provider=Microsoft.Jet.OLEDB.4.0; Data Source=" + @[User::FileName] + ";Extended Properties=\"EXCEL 8.0;HDR=YES;IMEX=1\";"

    Column which has datalike this 23456A it work fine but column which has 123456 it show like 2.01375e+006 but actual number is like this 2012449.

    Please help me to solve this issue ASAP

    This connection string value helped me solve my issue..Thanks !

    --------------------------- Radhai Krish | Golden Age is no more far | --------------------------

    Thursday, January 3, 2013 9:10 AM
  • thanks a lot for this...it worked for me

    Tuesday, June 4, 2013 5:02 PM
  • try this connection string in the excel connection manager properties

    Provider= Microsoft.Jet.OLEDB.4.0;Data Source=C:\path\myfyle.xls; Extended Properties="Excel 8.0;IMEX=1;HDR=YES;TYPEGUESSROWS=0;IMPORTMIXEDTYPES=TEXT\"

    Friday, June 28, 2013 3:37 PM
  • That doesn't work, it simply reverses the issue.  So if I have a column with "A" in the first 10 rows and numbers in the rest it either puts NULLs in for the first 10 rows (if the TypeGuessRows is > 10) or NULLs for the numbers (if the TypeGuessRows is <= 10).
    Monday, November 11, 2013 3:46 PM
  • After many, many unsuccessful attempted solutions involving the advanced editor and the TypeGuessRows registry key this worked for me.  Make sure the IMEX=1 is inside the same quotation marks as the "EXCEL 8.0;HDR=YES" as these are all the extended properties.
    • Proposed as answer by Hizzo K Friday, November 29, 2013 10:27 AM
    Monday, November 11, 2013 3:53 PM
  • I just solved this problem by changing (in the connection manager) the Excel version from 97-2003 to 2007.

    Thursday, November 12, 2015 3:23 PM
  • try this connection string in the excel connection manager properties

    Provider= Microsoft.Jet.OLEDB.4.0;Data Source=C:\path\myfyle.xls; Extended Properties="Excel 8.0;IMEX=1;HDR=YES;TYPEGUESSROWS=0;IMPORTMIXEDTYPES=TEXT\"

    Thanks fernava79!  This worked great for me!

    Monday, September 11, 2017 7:24 PM
  • After many hours of testing and looking for a resolution, the above solution worked for me. I alter the ConnectionString for my Excel Source in SSIS using VS 2017.

    The solution worked and when previewing the data I can now see the data; prior to the change was being displayed as NULL values. In Excel a warning was being displayed on the cell;

    Will continue with testing and will need to see if this change will work the OLE DB Source connection.

    Cheers!

    Monday, June 25, 2018 8:42 PM