Answered by:
SSIS FlatFile Source Datetime Column Format

Question
-
Hi All,
Our project has a datetime source column from a Flatfile.
When I was reviewing the file, I found two values
- 20/06/2013
- 06/20/2013
To my surprise SSIS extracted both of them as 20/06/2013(DD/MM/YYYY). I'm afraid if this poor data quality issue will go unnoticed.(Though I'm happy that SSIS is intelligent enough!)
Does anyone know how to handle this kind of scenario? Can we specify the input date format?
Rajkumar
Tuesday, August 6, 2013 8:04 AM
Answers
-
Hi guys,
Agreeing on a date format will not solve this problem. I've ran into similar issues. This usually happens when the file is not system generated, but users create the files based on templates. Even though the specification might specify dd/mm/yyyy as the data format, there is nothing that stops a user from entering a date in a different format.
The issue is that you want SSIS to fail when it encounters different date formats in a file, but it doesn't.
This can easily be resolved by applying a standard approach to importing text files into your staging database. In your flat file connection manager, set the data type for all the fields to a large enough string data type, e.g. DT_STR 255. Following the Flat File Source Adapter will be a Script Component that validates the data types. After validation, the data will be inserted into the staging database with an OLE DB destination adapter. The data flow is shown below.
The code in the script component is shown below.
The script component uses the TryParseExact method to parse the string date to a datetime data type. The exact format can be specified (dd-MM-yyyy in this example). If it date string is not in that format, it raises an error, which in turn causes the data flow task to fail.
It would be nice if SSIS could allow you to explicitly specify the input format in the flat file connection manager, but in the meantime, I use this framework. I follow similar logic with each field in my file (integers, decimals, dates, etc).
I hope this was helpful. Let me know if I can email you my package.
Regards,
Marius (Microsoft BI Solutions Architect)
Please use Marked as Answer if my post solved your problem and use Vote As Helpful if a post was useful.Thursday, August 8, 2013 9:29 AM -
Hi Marius,
I agree with you. Agreeing upon the data means there is no need for data quality tools. I had used a similar solution. You can find it here.
Rajkumar
Friday, August 9, 2013 1:54 PM
All replies
-
SSIS simply sampled several top records and made an assumption. Which is bad, especially with dates.
The date simply must be expressed as a datetime datatype on the source side or converted using the data conversion tranform. The conversion or metadata will be set based on the machine locale.
You then must get an error on trying to convert 20/06/2013 if you are under a North American locale.
Arthur My Blog
Tuesday, August 6, 2013 2:01 PM -
Thanks Arthur. But that is the issue. I do not get any error, though I use a Data Conversion Transformation.
Rajkumar
Wednesday, August 7, 2013 4:54 AM -
HI,
You can set the data type of column in Flatfile connrction manager properties.So that it will take prorper datatype is used for given columns.
Sonal Totla
Wednesday, August 7, 2013 12:05 PM -
Rajkumar,
I agree, not very nice that the source adapter does not fail. As far as I can figure out, SSIS converts the dates to the format spesified by your regional setting. If this fails, it implicitly tries to fix it by swapping day and month.
The only way to catch this would be to use DQS and build a rule to check for the date format. Alternatively, you could also do this with standard SSIS tasks.
Kind regards
Marius- Edited by Marius van Schalkwyk Wednesday, August 7, 2013 1:58 PM
Wednesday, August 7, 2013 1:57 PM -
Attempting processing data that is not proper is a danger. E.g. how would you know whether 8/7/2013 was 7 th of Aug or 8th of July?
I suggest you do not bend the input but rather ask for the data be sanitized.
Arthur My Blog
- Proposed as answer by Tim MitchellMVP Wednesday, August 7, 2013 4:43 PM
Wednesday, August 7, 2013 2:56 PM -
Arthur, is right. You should ask for date format that is agreed upon one style format, MM/DD/YYYY or DD/MM/YYYY. This way it ensures that input data is clean and correct. And in case SSIS converted implicitly 20/06/2013 to 06/20/2013, it is not correct...
Thanks, hsbal
Wednesday, August 7, 2013 3:51 PM -
Hi guys,
Agreeing on a date format will not solve this problem. I've ran into similar issues. This usually happens when the file is not system generated, but users create the files based on templates. Even though the specification might specify dd/mm/yyyy as the data format, there is nothing that stops a user from entering a date in a different format.
The issue is that you want SSIS to fail when it encounters different date formats in a file, but it doesn't.
This can easily be resolved by applying a standard approach to importing text files into your staging database. In your flat file connection manager, set the data type for all the fields to a large enough string data type, e.g. DT_STR 255. Following the Flat File Source Adapter will be a Script Component that validates the data types. After validation, the data will be inserted into the staging database with an OLE DB destination adapter. The data flow is shown below.
The code in the script component is shown below.
The script component uses the TryParseExact method to parse the string date to a datetime data type. The exact format can be specified (dd-MM-yyyy in this example). If it date string is not in that format, it raises an error, which in turn causes the data flow task to fail.
It would be nice if SSIS could allow you to explicitly specify the input format in the flat file connection manager, but in the meantime, I use this framework. I follow similar logic with each field in my file (integers, decimals, dates, etc).
I hope this was helpful. Let me know if I can email you my package.
Regards,
Marius (Microsoft BI Solutions Architect)
Please use Marked as Answer if my post solved your problem and use Vote As Helpful if a post was useful.Thursday, August 8, 2013 9:29 AM -
Hi Marius,
I agree with you. Agreeing upon the data means there is no need for data quality tools. I had used a similar solution. You can find it here.
Rajkumar
Friday, August 9, 2013 1:54 PM