Ask a questionAsk a question
 

Proposed AnswerModifying Rows From a Flat File Source

  • Wednesday, November 04, 2009 10:26 PMknovak Users MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     
    I'm importing data from a flat file via the Flat File Source and need to make some modifications to this data before writing into SQL Server via the OLE DB Destination task.

    The format of the data from the flat file is as follows...

    Column 0  Column 1
    Asia         
                    cn
                    hk
                    jp
    Europe
                    at
                    be
                    ch


    I would like to massage the data to this...

    Column 0  Column 1
    Asia          cn
    Asia          hk
    Asia          jp
    Europe      at
    Europe      be
    Europe      ch

    However I'm not sure about how to go about this.  I would think a Script task, but again I'm not sure if this is the proper approach.  If indeed I do need to use a Script task, how do I get the dataset out of the Flat File Source?

    Any ideas on how to massage this data would be greatly appreciated.

    Thanks,

    Kyle

All Replies

  • Wednesday, November 04, 2009 10:29 PMPhil BrammerMVP, ModeratorUsers MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     Proposed Answer
    A Script Component (not a Task) in the data flow can easily do this. 

    Basically, you will want to read in Column0 and store it in a variable, presenting it back to the data flow where Column1 is not null or empty.  When the variable <> Column0, then you know you have a new data set, so repeat the process.

    I can write the code for you, but would encourage you to at least make an effort first. 
    Phil Brammer | http://www.ssistalk.com | Twitter: http://twitter.com/PhilBrammer
    • Proposed As Answer byKunal Joshi Wednesday, November 04, 2009 11:20 PM
    •  
  • Wednesday, November 04, 2009 11:20 PMKunal Joshi Users MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     
    HI novak
    Phil is correct U can use Script Component [Source] to achieve this sort of transformation
    add rows to ur  outputBuffer with Column0 variable value and not null column 1,2,3....
    switch when the variable column0 value changes ...
    I guess given by ur flat file sample u'd require to save Column0 value in variable and add rows to output buffer varColumn0,Column1......CoulmnN till u find not null or empty  Coulmn0 value in ur flat file rows.
     variable <> Column0 i guess might not work properly the second row itself as it would treat as new data set in ff when it'll compare Asia Vs emptystr/ null

    any ways it should be piece doing by script component if u like and comfortable in coding..
    well for now Script Component should solve ur problem
    Please check for script source Component help its quiet detailed document : http://msdn.microsoft.com/en-us/library/ms136060.aspx


    _______________________________________________________________________________
    Please mark post/s helpful / answered if they answer ur query -- kunal
  • Friday, November 06, 2009 6:49 PMknovak Users MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     Has Code
    Okay so I've used the Script Component as a Transformation to get almost there.  Right now my output data looks like...


    <blank> <blank>
    Asia       cn
    Asia       hk
    <blank> <blank>
    Europe   at
    Europe   bt

    etc..

    I'm trying to remove the blank rows via the PipelineBuffer.RemoveRow method.  However, it doesn't seem as the PrimeOutput method is firing when I override it like this...

        Dim normalBuffer As PipelineBuffer
        Dim region As String = String.Empty
    
        Public Overrides Sub PrimeOutput(ByVal Outputs As Integer, ByVal OutputIDs() As Integer, ByVal Buffers() As Microsoft.SqlServer.Dts.Pipeline.PipelineBuffer)
    
            'MessageBox.Show("OutputIDs length is " + OutputIDs.Length.ToString)
    
            'If (Buffers(0) Is Nothing) Then
            '    MessageBox.Show("Buffers is NULL")
            'End If
    
            normalBuffer = Buffers(0)
            MyBase.PrimeOutput(Outputs, OutputIDs, Buffers)
        End Sub
    
    

    When I call the normalBuffer.RemoveRow in my ProcessInputRow method, I receive an "Variable not set to an instance of an object".  I've followed this thread http://social.msdn.microsoft.com/Forums/en-US/sqlintegrationservices/thread/6d44468c-8db4-4b29-a92c-dd4cd96dfd25/
    but it's not working for me.

    Do what do I need to do to remove a row?

    Thanks,

    Kyle



  • Friday, November 06, 2009 7:04 PMAdam TappisModeratorUsers MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     
    You could easily add another output to the script task. You then get an additional method generated for you that you can call directly on the rowe.g. row.DirectRowTo<output name>. Simply ignore this output.

    Have a look at this blog post by Jamie Thomson http://consultingblogs.emc.com/jamiethomson/archive/2005/09/05/2113.aspx
    Adam Tappis. (MCSD VB6 & .NET, MCDBA SQL 2000)
  • Saturday, November 07, 2009 5:29 AMKunal Joshi Users MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     
    Okay so I've used the Script Component as a Transformation to get almost there.  Right now my output data looks like...


    <blank> <blank>
    Asia       cn
    Asia       hk
    <blank> <blank>
    Europe   at
    Europe   bt

    etc..
    Thanks,

    Kyle



    HI could u share ur script component code ...
    the solution suggest above shouldnt give the blank rows in first place it self ...
    i think there is some small logical calculation error in ur code ...
    u neednt direct these blank rows .... in the output at 1st place itself ...
    why worry about removing them later ....
    _______________________________________________________________________________
    Please mark post/s helpful / answered if they answer ur query -- kunal