none
I am trying to import a Tab-delimited Text file with multiple tabs in a single column using SSIS package to sql server database. Basically the data is bleeding over to the next columns. Transformations to clean the format and load to SQL Table? RRS feed

  • Question

  • So for example i have columns:

    ID Name Address City State Zip-code Email Phone Number

    1 abc zzzzz vgse MN 85241 wsget@abc.com 5214632854

    2 sge zzzz vgse MN 85241 sge@sge.com 4569872314

    So, this is an example of the bad data rows that have multiple tabs in a single column. In Row 1 you have 3 tabs between the zip-code and Email and in row 2 you have 2 tabs in between ID and Name, 3 tabs in between zip-code and email. As a result, data is bleeding over to the column next to it and causing data_type errors. There should be a way to clean the format of the tab-delimited text file. thank you.

    Friday, September 16, 2016 5:56 PM

Answers

  • In addition to David's answer,

    Another option to use T-SQL instead of SSIS project (can be part of JOB if you need schedule). This will give you more flexibility to parse the text. BUT! Best option is probably using simple external app like C#, PowerShell, VBS and so on (as David mentioned).

    Choose your approach and we will be able to help you implement it :-)

    Q to think about: How can you know that 2 tabs are actually extra tab on the same column's value or that these are two different separators and the middle column is null? In first glance seems like your issue is far from been fit for any simple automatic solution. You should remember to cover all options in your script.

    * IN any case it will help if you can upload a sample file and post the table DDL (query to create the table)


    signature   Ronen Ariely
     [Personal Site]    [Blog]    [Facebook]    [Linkedin]



    • Edited by pituachMVP Saturday, September 17, 2016 9:32 AM
    • Proposed as answer by Seif Wang Thursday, September 29, 2016 7:08 AM
    • Marked as answer by Eric__ZhangModerator Thursday, September 29, 2016 7:50 AM
    Saturday, September 17, 2016 9:30 AM
  • You have an corrupt data file.  You'll have to fix it by hand, or write a program (SSIS Script or Powershell) to fix it.  SSIS can't magically do it for you.

    David


    David http://blogs.msdn.com/b/dbrowne/

    Friday, September 16, 2016 6:30 PM
  • Hi super_vdawg,

    According to my personal experience, fix the delimiter in Source Flat File may be a workaround.

    And as far as I know, there is a feature named “Treat consecutive delimiters as one” in Excel Text Import Wizard tools could be used to fix  the extra delimiters in Flat file. For more details, please try to reference the following steps.

    1. Create a new excel sheet and open the Excel Text Import Wizard tools(DATA>From Text). 
    2. Enable “Treat consecutive delimiters as one” in the second step. 


    3. Export the normal format data into  a new text file, and call this new text file in SSIS package.

    Regards,

    Seif


    Regards, Seif

    • Proposed as answer by Seif Wang Monday, September 26, 2016 11:54 AM
    • Edited by Seif Wang Monday, September 26, 2016 11:54 AM
    • Marked as answer by Eric__ZhangModerator Thursday, September 29, 2016 7:50 AM
    Tuesday, September 20, 2016 10:25 AM

All replies

  • You have an corrupt data file.  You'll have to fix it by hand, or write a program (SSIS Script or Powershell) to fix it.  SSIS can't magically do it for you.

    David


    David http://blogs.msdn.com/b/dbrowne/

    Friday, September 16, 2016 6:30 PM
  • When you say SSIS Script -  you mean using Script task?
    Friday, September 16, 2016 6:34 PM
  • Could be a script task to rewrite the file, or you could use a Script Transform to actually parse the rows at the beginning of a Data Flow.

    David


    David http://blogs.msdn.com/b/dbrowne/

    • Proposed as answer by Seif Wang Thursday, September 29, 2016 7:08 AM
    Friday, September 16, 2016 6:35 PM
  • In addition to David's answer,

    Another option to use T-SQL instead of SSIS project (can be part of JOB if you need schedule). This will give you more flexibility to parse the text. BUT! Best option is probably using simple external app like C#, PowerShell, VBS and so on (as David mentioned).

    Choose your approach and we will be able to help you implement it :-)

    Q to think about: How can you know that 2 tabs are actually extra tab on the same column's value or that these are two different separators and the middle column is null? In first glance seems like your issue is far from been fit for any simple automatic solution. You should remember to cover all options in your script.

    * IN any case it will help if you can upload a sample file and post the table DDL (query to create the table)


    signature   Ronen Ariely
     [Personal Site]    [Blog]    [Facebook]    [Linkedin]



    • Edited by pituachMVP Saturday, September 17, 2016 9:32 AM
    • Proposed as answer by Seif Wang Thursday, September 29, 2016 7:08 AM
    • Marked as answer by Eric__ZhangModerator Thursday, September 29, 2016 7:50 AM
    Saturday, September 17, 2016 9:30 AM
  • Hi super_vdawg,

    According to my personal experience, fix the delimiter in Source Flat File may be a workaround.

    And as far as I know, there is a feature named “Treat consecutive delimiters as one” in Excel Text Import Wizard tools could be used to fix  the extra delimiters in Flat file. For more details, please try to reference the following steps.

    1. Create a new excel sheet and open the Excel Text Import Wizard tools(DATA>From Text). 
    2. Enable “Treat consecutive delimiters as one” in the second step. 


    3. Export the normal format data into  a new text file, and call this new text file in SSIS package.

    Regards,

    Seif


    Regards, Seif

    • Proposed as answer by Seif Wang Monday, September 26, 2016 11:54 AM
    • Edited by Seif Wang Monday, September 26, 2016 11:54 AM
    • Marked as answer by Eric__ZhangModerator Thursday, September 29, 2016 7:50 AM
    Tuesday, September 20, 2016 10:25 AM