none
Storing an array in an SSIS variable

    Question

  • I am not much of a VB.NET programmer, so forgive me if the solution to this problem is obvious. What I am trying to do is store an array from a Script Task in an SSIS variable and then access it from later script tasks. In brief, this is what I have tried so far:

    Init (Script Task)
    This task initializes an array:

    Dim processes(0) As System.Diagnostics.Process
    Dts.Variables("Processes").Value = processes

    LOOP - Start (Script Task)
    This task is contained in another loop and starts a number of processes, while storing the process objects in the "growing" array:

    Dim processes() As System.Diagnostics.Process
    processes = DirectCast(Dts.Variables("Processes").Value, System.Diagnostics.Process())
    ReDim Preserve processes(processes.Length + 1)
    processes(processes.Length - 1) = System.Diagnostics.Process.Start("...")

    Wait (Script Task)
    This task waits, after the loop, for the processes to finish:

    Dim processes() As System.Diagnostics.Process
    processes = DirectCast(Dts.Variables("Processes").Value, System.Diagnostics.Process())
    Dim proc As System.Diagnostics.Process
    For Each proc In processes
      proc.WaitForExit()
    Next

    The following things doesn't seem to be working. The ReDim of the array doesn't seem to have any effect, which I gathered by doing a MsgBox with the length after the redim and seeing that it doesn't change while the loop iterates. I am guessing that this is also causing the problem in the last task, where I get an error complaining about the object state of proc in the For Each section.

    Any help is appreciated!

    Regards,
    Lars Rönnbäck

    Thursday, December 15, 2005 6:43 PM

Answers

  • Thanks for your help, and I believe you got to the core of the problem, but I ran into more trouble with the array. For some reason it seems that the array gets converted to an Object[] array, which refuses to be cast back to a Process[] array with a DirectCast. I am sure this is because I don't know how to handle arrays properly in VB.NET, so I resorted to using an ArrayList instead, which conveniently takes care of the array handling.

    Everything is now working as intended, so I thought I could post what I have done in case anyone else needs to do anything similar. What I have done is a master package that will execute other packages in parallell in a controlled manner. Performance tests showed us that running four packages sequentially took about an hour, while running them in parallell cut the time in half (on a quad Xeon). This is the package structure.

    For Each Loop - Loops over groups of jobs that can run in parallel

    Script Task - Initialization of an Object variable
    Imports System
    Imports System.Data
    Imports System.Math
    Imports Microsoft.SqlServer.Dts.Runtime
    Public Class ScriptMain
     Public Sub Main()
            Dim processes As New System.Collections.ArrayList()
            Dts.Variables("Processes").Value = processes
      Dts.TaskResult = Dts.Results.Success
     End Sub
    End Class

    For Each Loop - Loops over the jobs in the current parallel execution group 

    Script Task - Executing the job
    Imports System
    Imports System.Data
    Imports System.Math
    Imports System.Diagnostics
    Imports Microsoft.SqlServer.Dts.Runtime
    Public Class ScriptMain
     Public Sub Main()
            Dim processes As System.Collections.ArrayList
            Dim dtexec As Process = New Process()
            processes = DirectCast(Dts.Variables("Processes").Value, System.Collections.ArrayList)
            processes.Add(dtexec)
            dtexec.StartInfo.FileName = "dtexec.exe"
            dtexec.StartInfo.Arguments = "/Reporting E /File """ + Dts.Variables("CurrentJobScript").Value.ToString + """ " + Dts.Variables("Parameters").Value.ToString
            dtexec.Start()
            Dts.Variables("Processes").Value = processes
            Dts.TaskResult = Dts.Results.Success
        End Sub
    End Class

    Loop Ends

    Script Task - Wait for jobs in this group to finish
    Imports System
    Imports System.Data
    Imports System.Math
    Imports System.Diagnostics
    Imports Microsoft.SqlServer.Dts.Runtime
    Public Class ScriptMain
        Public Sub Main()
            Dim processes As System.Collections.ArrayList
            processes = DirectCast(Dts.Variables("Processes").Value, System.Collections.ArrayList)
            Dim proc As Process
            Dim code As Integer
            For Each proc In processes
                proc.WaitForExit()
                code += proc.ExitCode
                proc.Close()
            Next
            processes.Clear()
            If code > 0 Then
                Dts.TaskResult = Dts.Results.Failure
            Else
                Dts.TaskResult = Dts.Results.Success
            End If
        End Sub
    End Class

    Loop End - Process next group of jobs that can run in parallel

    Friday, December 16, 2005 12:41 PM

All replies

  • I'm not much of VB.NET programmer either, but looking at MSDN, "ReDim releases the existing array and creates a new array with the same rank."

    So you get local variable processes that contains new array, but the Variable still contains old array. After ReDim'ing the array you need to assign new array to variable value:

    ...
    processes(processes.Length - 1) = System.Diagnostics.Process.Start("...")
    Dts.Variables("Processes").Value = processes

    Thursday, December 15, 2005 8:57 PM
  • Thanks for your help, and I believe you got to the core of the problem, but I ran into more trouble with the array. For some reason it seems that the array gets converted to an Object[] array, which refuses to be cast back to a Process[] array with a DirectCast. I am sure this is because I don't know how to handle arrays properly in VB.NET, so I resorted to using an ArrayList instead, which conveniently takes care of the array handling.

    Everything is now working as intended, so I thought I could post what I have done in case anyone else needs to do anything similar. What I have done is a master package that will execute other packages in parallell in a controlled manner. Performance tests showed us that running four packages sequentially took about an hour, while running them in parallell cut the time in half (on a quad Xeon). This is the package structure.

    For Each Loop - Loops over groups of jobs that can run in parallel

    Script Task - Initialization of an Object variable
    Imports System
    Imports System.Data
    Imports System.Math
    Imports Microsoft.SqlServer.Dts.Runtime
    Public Class ScriptMain
     Public Sub Main()
            Dim processes As New System.Collections.ArrayList()
            Dts.Variables("Processes").Value = processes
      Dts.TaskResult = Dts.Results.Success
     End Sub
    End Class

    For Each Loop - Loops over the jobs in the current parallel execution group 

    Script Task - Executing the job
    Imports System
    Imports System.Data
    Imports System.Math
    Imports System.Diagnostics
    Imports Microsoft.SqlServer.Dts.Runtime
    Public Class ScriptMain
     Public Sub Main()
            Dim processes As System.Collections.ArrayList
            Dim dtexec As Process = New Process()
            processes = DirectCast(Dts.Variables("Processes").Value, System.Collections.ArrayList)
            processes.Add(dtexec)
            dtexec.StartInfo.FileName = "dtexec.exe"
            dtexec.StartInfo.Arguments = "/Reporting E /File """ + Dts.Variables("CurrentJobScript").Value.ToString + """ " + Dts.Variables("Parameters").Value.ToString
            dtexec.Start()
            Dts.Variables("Processes").Value = processes
            Dts.TaskResult = Dts.Results.Success
        End Sub
    End Class

    Loop Ends

    Script Task - Wait for jobs in this group to finish
    Imports System
    Imports System.Data
    Imports System.Math
    Imports System.Diagnostics
    Imports Microsoft.SqlServer.Dts.Runtime
    Public Class ScriptMain
        Public Sub Main()
            Dim processes As System.Collections.ArrayList
            processes = DirectCast(Dts.Variables("Processes").Value, System.Collections.ArrayList)
            Dim proc As Process
            Dim code As Integer
            For Each proc In processes
                proc.WaitForExit()
                code += proc.ExitCode
                proc.Close()
            Next
            processes.Clear()
            If code > 0 Then
                Dts.TaskResult = Dts.Results.Failure
            Else
                Dts.TaskResult = Dts.Results.Success
            End If
        End Sub
    End Class

    Loop End - Process next group of jobs that can run in parallel

    Friday, December 16, 2005 12:41 PM
  • > it seems that the array gets converted to an Object[] array, which refuses to be cast back to a Process[] array

    Unfortunately, this seems like an artifact of COM interop - the underlying Variable object in unmanaged, and Process[] gets converted to SAFEARRAY when assigned to Variable value, then converted to object[] when retrieved back.

    I think your solution of using ArrayList is the best choice here - since only a reference to the ArrayList is stored in the variable, you don't have to worry about this convertion, and since it is dynamically sized (instead of creating completely new object as was the case with plain array) there is no need to assign it back to variable - in this new version you can probably remove assignment Dts.Variables("Processes").Value  except during the initialization.

    Saturday, December 17, 2005 3:17 AM