locked
SSIS FlatFile Source Datetime Column Format RRS feed

  • Question

  • Hi All,

    Our project has a datetime source column from a Flatfile.

    When I was reviewing the file, I found two values

    1. 20/06/2013
    2. 06/20/2013

    To my surprise SSIS extracted both of them as 20/06/2013(DD/MM/YYYY). I'm afraid if this poor data quality issue will go unnoticed.(Though I'm happy that SSIS is intelligent enough!)

    Does anyone know how to handle this kind of scenario? Can we specify the input date format?


    Rajkumar

    Tuesday, August 6, 2013 8:04 AM

Answers

  • Hi guys,

    Agreeing on a date format will not solve this problem.  I've ran into similar issues.  This usually happens when the file is not system generated, but users create the files based on templates.  Even though the specification might specify dd/mm/yyyy as the data format, there is nothing that stops a user from entering a date in a different format.

    The issue is that you want SSIS to fail when it encounters different date formats in a file, but it doesn't.

    This can easily be resolved by applying a standard approach to importing text files into your staging database.  In your flat file connection manager, set the data type for all the fields to a large enough string data type, e.g. DT_STR 255.  Following the Flat File Source Adapter will be a Script Component that validates the data types.  After validation, the data will be inserted into the staging database with an OLE DB destination adapter. The data flow is shown below.

    data flow

    The code in the script component is shown below.

    Script Component Code

    The script component uses the TryParseExact method to parse the string date to a datetime data type. The exact format can be specified (dd-MM-yyyy in this example). If it date string is not in that format, it raises an error, which in turn causes the data flow task to fail.

    It would be nice if SSIS could allow you to explicitly specify the input format in the flat file connection manager, but in the meantime,  I use this framework. I follow similar logic with each field in my file (integers, decimals, dates, etc). 

    I hope this was helpful. Let me know if I can email you my package.

    Regards,


    Marius  (Microsoft BI Solutions Architect)
    Please use Marked as Answer if my post solved your problem and use Vote As Helpful if a post was useful.

    • Proposed as answer by Mike Yin Sunday, August 11, 2013 2:18 PM
    • Marked as answer by Mike Yin Wednesday, August 14, 2013 4:04 PM
    Thursday, August 8, 2013 9:29 AM
  • Hi Marius,

    I agree with you. Agreeing upon the data means there is no need for data quality tools. I had used a similar solution. You can find it here.

    http://social.technet.microsoft.com/wiki/contents/articles/18943.ssis-flat-file-source-datetime-column-format-issue-solution.aspx


    Rajkumar

    • Proposed as answer by Mike Yin Sunday, August 11, 2013 2:18 PM
    • Marked as answer by Mike Yin Wednesday, August 14, 2013 4:04 PM
    Friday, August 9, 2013 1:54 PM

All replies

  • SSIS simply sampled several top records and made an assumption. Which is bad, especially with dates.

    The date simply must be expressed as a datetime datatype on the source side or converted using the data conversion tranform. The conversion or metadata will be set based on the machine locale.

    You then must get an error on trying to convert 20/06/2013 if you are under a North American locale.


    Arthur My Blog

    Tuesday, August 6, 2013 2:01 PM
  • Thanks Arthur. But that is the issue. I do not get any error, though I use a Data Conversion Transformation.

    Rajkumar

    Wednesday, August 7, 2013 4:54 AM
  • HI,

    You can set the data type of column in Flatfile connrction manager properties.So that it will take prorper datatype is used for given columns.

    Sonal Totla

    Wednesday, August 7, 2013 12:05 PM
  • Rajkumar,

    I agree, not very nice that the source adapter does not fail.  As far as I can figure out, SSIS converts the dates to the format spesified by your regional setting.  If this fails, it implicitly tries to fix it by swapping day and month.

    The only way to catch this would be to use DQS and build a rule to check for the date format.  Alternatively, you could also do this with standard SSIS tasks.

    Kind regards
    Marius

    Wednesday, August 7, 2013 1:57 PM
  • Attempting processing data that is not proper is a danger. E.g. how would you know whether 8/7/2013 was 7 th of Aug or 8th of July?

    I suggest you do not bend the input but rather ask for the data be sanitized.


    Arthur My Blog

    Wednesday, August 7, 2013 2:56 PM
  • Arthur, is right. You should ask for date format that is agreed upon one style format, MM/DD/YYYY or DD/MM/YYYY. This way it ensures  that input data is clean and correct. And in case SSIS converted implicitly 20/06/2013 to 06/20/2013, it is not correct...


    Thanks, hsbal

    Wednesday, August 7, 2013 3:51 PM
  • Hi guys,

    Agreeing on a date format will not solve this problem.  I've ran into similar issues.  This usually happens when the file is not system generated, but users create the files based on templates.  Even though the specification might specify dd/mm/yyyy as the data format, there is nothing that stops a user from entering a date in a different format.

    The issue is that you want SSIS to fail when it encounters different date formats in a file, but it doesn't.

    This can easily be resolved by applying a standard approach to importing text files into your staging database.  In your flat file connection manager, set the data type for all the fields to a large enough string data type, e.g. DT_STR 255.  Following the Flat File Source Adapter will be a Script Component that validates the data types.  After validation, the data will be inserted into the staging database with an OLE DB destination adapter. The data flow is shown below.

    data flow

    The code in the script component is shown below.

    Script Component Code

    The script component uses the TryParseExact method to parse the string date to a datetime data type. The exact format can be specified (dd-MM-yyyy in this example). If it date string is not in that format, it raises an error, which in turn causes the data flow task to fail.

    It would be nice if SSIS could allow you to explicitly specify the input format in the flat file connection manager, but in the meantime,  I use this framework. I follow similar logic with each field in my file (integers, decimals, dates, etc). 

    I hope this was helpful. Let me know if I can email you my package.

    Regards,


    Marius  (Microsoft BI Solutions Architect)
    Please use Marked as Answer if my post solved your problem and use Vote As Helpful if a post was useful.

    • Proposed as answer by Mike Yin Sunday, August 11, 2013 2:18 PM
    • Marked as answer by Mike Yin Wednesday, August 14, 2013 4:04 PM
    Thursday, August 8, 2013 9:29 AM
  • Hi Marius,

    I agree with you. Agreeing upon the data means there is no need for data quality tools. I had used a similar solution. You can find it here.

    http://social.technet.microsoft.com/wiki/contents/articles/18943.ssis-flat-file-source-datetime-column-format-issue-solution.aspx


    Rajkumar

    • Proposed as answer by Mike Yin Sunday, August 11, 2013 2:18 PM
    • Marked as answer by Mike Yin Wednesday, August 14, 2013 4:04 PM
    Friday, August 9, 2013 1:54 PM