none
SSIS Script Component - Iterating through input columns

    Question

  •  

    I'm trying write a reusable script component that takes data from rows that were rejected from a SQL Destination operation and put them into a common SQL error table.

     

    This script would basically function to take the input columns selected in the script, and build a delimited string, (similar to the 'Flat File Source Error Output' that is contains redirected rows from reading a flat file) and insert this string into a SQL table called 'SourceData' to store errors.

     

    I'm trying to script the component to iterate through all input columns (as selected in the input columns screen) and build a simple string.

     

     

    Code Block

    Public Overrides Sub Input0_ProcessInputRow(ByVal Row As Input0Buffer)

    'Use the incoming error number as a parameter to GetErrorDescription

    Row.ErrorDescription = ComponentMetaData.GetErrorDescription(Row.ErrorCode)

    Try

    Row.ErrorColumnName = ComponentMetaData.InputCollection(0).InputColumnCollection(Row.ErrorColumn).Name

    Catch ex As Exception

    Row.ErrorColumnName = String.Concat("Column Name retrieval failure. Details", ex.Message)

    End Try

    '

    'Build input data

    '

    Dim inData As String

    For Each inputCol As IDTSInputColumn90 In ComponentMetaData.InputCollection(0).InputColumnCollection

    inData = String.Concat(inData, "~", inputCol.Name) 'I don't want the name, but the value.

    Next

    Row.SourceData = inData

    '

    End Sub

     

     

    I've only got as far as iterating the names of columns in the input buffer, but how do i get the values?

     

    The result i'm trying to achieve is :

    Selected columns in 'Input Column' screen : Name, Address, Phone

    OutPut column 'SourceData' value : Harry~Melbourne~None

     

    I don't want to write the code as:

    Code Block

    inData = Row.Name

    indata = String.Concat(inData,"~",Row.Address)

    indata = string.concat(inData,"~",Row.Phone)

     

    as this make my code not very reusable. I've got some tables which are 100+ columns long and I don't wish modify the code too much Smile

     

    I have also tried overriding the ProcessInput() function of the script component to iterate through the buffer columns :

    Code Block

    Public Overrides Sub ProcessInput(ByVal InputID As Integer, ByVal Buffer As Microsoft.SqlServer.Dts.Pipeline.PipelineBuffer)

    MyBase.ProcessInput(InputID, Buffer)

     

    Dim iCnt As Integer = 0

    Dim inData As String

    If Buffer.ColumnCount > 0 Then

    For iCnt = 0 To Buffer.ColumnCount - 1

    inData = String.Concat(Buffer.Item(iCnt)) 'Error thrown here: PipelineBuffer has encountered an invalid row index value.

    Next

    End If

    End Sub

     

     

    but i get an error when i run it.

     

    Please help.

     

    Monday, November 26, 2007 4:24 AM

Answers

  • You should not need to use both ProcessInput and ProcessInputRow. The latter is called once per row, so doing the column names there is not a good idea.

     

    Here is some very rough code that shows getting all values -

     

    Public Class ScriptMain

    Inherits UserComponent

    Private columns As Integer()

    Public Overrides Sub PreExecute()

    Dim input As IDTSInput90 = ComponentMetaData.InputCollection(0)

    ReDim columns(input.InputColumnCollection.Count)

    columns = Me.GetColumnIndexes(input.ID)

    System.Windows.Forms.MessageBox.Show(columns.Length.ToString())

    End Sub

    Public Overrides Sub ProcessInput(ByVal InputID As Integer, ByVal Buffer As Microsoft.SqlServer.Dts.Pipeline.PipelineBuffer)

     

    While Buffer.NextRow()

    Dim values As New System.Text.StringBuilder

    For Each index As Integer In columns

    Dim value As Object = Buffer(index)

    'If value Is Not Nothing Then

    values.Append(value)

    'End If

    values.Append(",")

    Next

    '' TODO: Write line to destination here

    System.Windows.Forms.MessageBox.Show(values.ToString())

    End While

    End Sub

    End Class

     

     

    I think this would be much better served as a custom component, This sound like a pattern you would wish to follow in all your packages, so make it easier to reuse, and maintain by doing in a custom component.

    Monday, November 26, 2007 6:25 AM
  • This may have some additional ideas for you, though I tend to agree with Darren that in the long run, you are better off with a custom component.

     

    http://agilebi.com/cs/blogs/jwelch/archive/2007/10/21/address-columns-generically-in-a-script-task.aspx

     

    Tuesday, November 27, 2007 3:27 AM

All replies

  • You should not need to use both ProcessInput and ProcessInputRow. The latter is called once per row, so doing the column names there is not a good idea.

     

    Here is some very rough code that shows getting all values -

     

    Public Class ScriptMain

    Inherits UserComponent

    Private columns As Integer()

    Public Overrides Sub PreExecute()

    Dim input As IDTSInput90 = ComponentMetaData.InputCollection(0)

    ReDim columns(input.InputColumnCollection.Count)

    columns = Me.GetColumnIndexes(input.ID)

    System.Windows.Forms.MessageBox.Show(columns.Length.ToString())

    End Sub

    Public Overrides Sub ProcessInput(ByVal InputID As Integer, ByVal Buffer As Microsoft.SqlServer.Dts.Pipeline.PipelineBuffer)

     

    While Buffer.NextRow()

    Dim values As New System.Text.StringBuilder

    For Each index As Integer In columns

    Dim value As Object = Buffer(index)

    'If value Is Not Nothing Then

    values.Append(value)

    'End If

    values.Append(",")

    Next

    '' TODO: Write line to destination here

    System.Windows.Forms.MessageBox.Show(values.ToString())

    End While

    End Sub

    End Class

     

     

    I think this would be much better served as a custom component, This sound like a pattern you would wish to follow in all your packages, so make it easier to reuse, and maintain by doing in a custom component.

    Monday, November 26, 2007 6:25 AM
  • Darren, thanks. That was very helpful, but i still have a problem.

    My intention is to include the string of data into a new output column on the row being processed so i can use downstream.

     

    How do i do this?

     

    Code Block

    Public Overrides Sub ProcessInput(ByVal InputID As Integer, ByVal Buffer As Microsoft.SqlServer.Dts.Pipeline.PipelineBuffer)

    While Buffer.NextRow

    For Each colIndex As Integer In columns

    Try

    Dim value As String = CStr(Buffer(colIndex))

    inData = String.Concat(inData, "~", value)

    MsgBox(inData.ToString)

    Catch ex As Exception

    MsgBox(ex.Message)

    End Try

    Next

    End While

    MyBase.ProcessInput(InputID, Buffer)

    End Sub

     

     

    Monday, November 26, 2007 7:19 AM
  •  DarrenSQLIS wrote:

    You should not need to use both ProcessInput and ProcessInputRow.

     

    Darren,

    I need to use the ProcessInputRow as well, as I'm using it get the error descriptions. It appears that if i call the mybase.ProcessInput() in my overriding sub, it fails, telling the pipeline buffer has already been accessed.

     

    Of course, i could make that another script component, but its starting to look really odd that my error handling flow is getting longer than my actual data import flow Smile

    Monday, November 26, 2007 7:42 AM
  • If you want the dynamic nature of ProcessInput, then you cannot have the static spoon feeding of ProcessInputRow, they have the same purpose, so use one or the other. Unfortunately I cannot find a decent way to identify the error columns in ProcessInput. The values are there and available to you, but you need to know which is which.

     

    You could do it all in ProcessInputRow, but it would mean lots of hand coding, referencing each column by name. If you really want a dynamic solution then use a custom component. The Script component dumbs things down to make it faster and easier to write code, but it takes away flexibility. It is a trade off, and the flexible stuff you want to do would be so much easier in a custom component.

    Monday, November 26, 2007 9:36 AM
  • Darren,

    I agree with you on using ProcessInput() then.

     

    If I'm using ProcessInput to process row by row:

    1. Do I have to manually generate corresponding output rows when using ProcessInput()? If yes, how do I do this? A rough code sample would be extremely appreciated.

    2. How do I assign a value to the output column that I want to into my output?

     

    I'm totally lost as to building custom components, but i'll start taking fresh look into it. But due to my development timeline, I'm pressed for a solution using existing components until i can propose a more robust one.

     

    Thanks for your time. You've always been a super source of info.

    Tuesday, November 27, 2007 2:39 AM
  • This may have some additional ideas for you, though I tend to agree with Darren that in the long run, you are better off with a custom component.

     

    http://agilebi.com/cs/blogs/jwelch/archive/2007/10/21/address-columns-generically-in-a-script-task.aspx

     

    Tuesday, November 27, 2007 3:27 AM
  •  

    Thank you. That was what I was looking for, having access to the PipelineBuffer inside the ProcessInputRow().

     

    I've got it working now. Thanks to the both of you.

     

    Just a side question:

    If I wanted get the columnName and value, what approach should i take?

     

    I can now get the column values from the buffer. But I don't think the Buffer.Item(index) matches the componentmetadata.inputcolumn(0).inputcolumncollection(index), does it?

     

    Is there some way I can match the buffer item i'm processing to the column metadata name?

    Tuesday, November 27, 2007 5:57 AM
  • I don't believe buffer index and input column index are the same thing, unless the script component has simplified this, in full custom components it is certainly not. A buffer can span multiple transforms, and columns are hidden, from upstream components but still exist if they are created in downstream components. That says to me there is a risk that the two can be different.

     

    This is a key reason, why again, I think a custom component is the way to do this. I should caveat that, I know exactly how to do it in a custom component, but I cannot think of flexible way to do it in a script component, but there may still be a way.

     

    John's code cool, but I'd worry about performance. Development time vs re-use vs performance ... ?

    Tuesday, November 27, 2007 11:14 AM
  • The reflection code is definately slow. I don't recommend it in high volume scenarios. The index based approach isn't bad, but I haven't really tested that approach with a large number of rows. I still agree with Darren that a custom component is the best approach.

    Wednesday, November 28, 2007 4:19 AM
  • Hi Enzoe...

     

    I know this is late but I need to do almost the same thing in my project. I want the entire erroring out row as XML in a table. Could you please tell me if you were able to do so...I have been trying out all the above examples but could not get exactly what I wanted i.e xml string with column name and value.

     

    Will be grateful if you could help me with this.

    Thanks in advance.

     

     

     Enzoe wrote:

     

    Thank you. That was what I was looking for, having access to the PipelineBuffer inside the ProcessInputRow().

     

    I've got it working now. Thanks to the both of you.

     

    Just a side question:

    If I wanted get the columnName and value, what approach should i take?

     

    I can now get the column values from the buffer. But I don't think the Buffer.Item(index) matches the componentmetadata.inputcolumn(0).inputcolumncollection(index), does it?

     

    Is there some way I can match the buffer item i'm processing to the column metadata name?

    Monday, February 04, 2008 7:20 AM
  • Thank you so much John. I needed this only Smile

     

    Monday, February 04, 2008 6:15 PM
  • Hi

    How did u get the exact column name on which the error had occured .

     

    I am also  trying to build some customized code but i m unable to get the exact column name in which error occured 

     

    Row(Errorcolumn) does not give the exact  details to get the column name

     

    Plz reply

    How to achieve this ??

     

     I m newbie so i couldn't follow much in the doc's plz help

    Thanks in advance

    anlis

     

     

     

    Thursday, July 03, 2008 12:37 PM
  • My approach to getting columns names may not work in many scenarios.

    I log my errors to a SQL table destination, called ImportError

     

    I also have another table in the database that has a list of packages, task names, column names and column ids, i call this table ColumnMetadata. This table is populated by a separate program that programmatically opens the packages and retrieves component's column names and Ids and bulk inserts it into the ColumnMetadata table. I usually do this as a post deployment step. This table serves as a lookup table.

     

    On my ImportError table, i have a column that stores the ErrorColumn, ErrorCode and ErrorDescription. The ErrorDescription is populated from the SSIS Script Component using the me.componentMetadata.GetErrorDescription function.

     

    I also have another column called ErrorColumnName, which is a computed column, which looks up the ErrorCode in the ComponentMetadata table's ColumnId column and returns the ColumnName. This means that I don't 'need' to retrieve the error column name in my SSIS package. I get the column name when i query my ImportError table.

     

    If you had to get the column name in the SSIS package, then just use a lookup component to get the column name from the ColumnMetadata table. (note: remember to handle the lookup component error if a lookup failed).

     

    Cheers.

     

    Monday, July 21, 2008 1:14 AM
  • thks for ur valuable time & information

    but i needed a scenario wherein the custom component has to refer to metadata which is not known.

    It should work for any table as its  oledb source .

     

    any ideas ...

    bye

    anlis

     

     

     

     

    Monday, July 21, 2008 1:23 PM