Importing a File with Dynamic Columns

  • Question

  • I am new to SSIS and C#. In SQL Server 2008 I am importing data from .csv files. The columns are dynamic: there are usually around 22 of them (sometimes more, sometimes fewer), so I created a staging table with 25 columns and import the data into it. In essence, each flat file that I import has a different number of columns, but they are all properly formatted. My task is to import all the rows from a .csv flat file, including the headers. I want to put this in a job so I can import multiple files into the table daily.

    So inside a Foreach Loop I have a Data Flow Task containing a Script Component. Based on research online, I came up with the C# code below, but I get this error:

    Index was outside the bounds of the array.

    I tried to find the cause using MessageBox calls, and I found that the first line is read fine, but the index goes outside the bounds of the array after the first line.

    1.) I need your help fixing the code.

    2.) File1Conn is a flat file connection; instead, I want to read the file path directly from a variable, User::FileName, that my Foreach Loop keeps updating. Please help me modify the code.

    Thanks in advance.

    This is my flat file:

    https://drive.google.com/file/d/0B418ObdiVnEIRnlsZFdwYTRfTFU/view?usp=sharing

    using System;
    using System.Data;
    using Microsoft.SqlServer.Dts.Pipeline.Wrapper;
    using Microsoft.SqlServer.Dts.Runtime.Wrapper;
    using System.Windows.Forms;
    using System.IO;
    
    [Microsoft.SqlServer.Dts.Pipeline.SSISScriptComponentEntryPointAttribute]
    public class ScriptMain : UserComponent
    {
        private StreamReader SR;
        private string File1;
    
        public override void AcquireConnections(object Transaction)
        {
            // Get the connection for File1
            IDTSConnectionManager100 CM = this.Connections.File1Conn;
            File1 = (string)CM.AcquireConnection(null);
        }
    
        public override void PreExecute()
        {
            base.PreExecute();
            SR = new StreamReader(File1);
        }
    
        public override void PostExecute()
        {
            base.PostExecute();
            SR.Close();
        }
    
        public override void CreateNewOutputRows()
        {
            // Declare variables
            string nextLine;
            string[] columns;
            char[] delimiters;
            int Col4Count;
            String[] Col4Value = new string[50];
    
            // Set the delimiter
            delimiters = ";".ToCharArray();
    
            // Read the first line (header)
            nextLine = SR.ReadLine();
    
            // Split the line into columns
            columns = nextLine.Split(delimiters);
    
            // Find out how many Col3 there are in the file
            Col4Count = columns.Length - 3;
            //MessageBox.Show(Col4Count.ToString());
    
            // Read the second line and loop until the end of the file
            nextLine = SR.ReadLine();
            
        while (nextLine != null)
        {
            // Split the line into columns
            columns = nextLine.Split(delimiters);

            // Add a row
            File1OutputBuffer.AddRow();

            // Set the values of the Script Component output according to the file content
            File1OutputBuffer.SampleID = columns[0];
            File1OutputBuffer.RepNumber = columns[1];
            File1OutputBuffer.Product = columns[2];
            File1OutputBuffer.Col1 = columns[3];
            File1OutputBuffer.Col2 = columns[4];
            File1OutputBuffer.Col3 = columns[5];
            File1OutputBuffer.Col4 = columns[6];
            File1OutputBuffer.Col5 = columns[7];
            File1OutputBuffer.Col6 = columns[8];
            File1OutputBuffer.Col7 = columns[9];
            File1OutputBuffer.Col8 = columns[10];
            File1OutputBuffer.Col9 = columns[11];
            File1OutputBuffer.Col10 = columns[12];
            File1OutputBuffer.Col11 = columns[13];
            File1OutputBuffer.Col12 = columns[14];
            File1OutputBuffer.Col13 = columns[15];
            File1OutputBuffer.Col14 = columns[16];
            File1OutputBuffer.Col15 = columns[17];
            File1OutputBuffer.Col16 = columns[18];

            // Read the next line
            nextLine = SR.ReadLine();
        }
        }
    
    }

    These are the Script Transformation output columns. All columns are String [DT_STR] with length 50.


    svk


    • Edited by czarvk Friday, July 17, 2015 3:55 AM
    Thursday, July 16, 2015 11:14 PM

Answers

All replies

  • Use a Split task to separate out the first row, which is your header information.

    Then figure out a way to use that header text to create the column names in your staging table.
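    A rough illustration of that idea (plain C#; the table name and the VARCHAR(50) sizing are assumptions for the sketch, not something from this thread) is to build a CREATE TABLE statement from the header line:

    ```csharp
    using System;
    using System.Linq;

    public static class HeaderToDdl
    {
        // Builds a CREATE TABLE statement from a semicolon-separated header line.
        // Hypothetical sketch: the table name and VARCHAR(50) sizing are assumptions.
        public static string Build(string headerLine, string tableName)
        {
            var cols = headerLine.Split(';')
                                 .Select(h => "[" + h.Trim() + "] VARCHAR(50)");
            return "CREATE TABLE [" + tableName + "] (" + string.Join(", ", cols) + ")";
        }

        public static void Main()
        {
            Console.WriteLine(Build("SampleID;RepNumber;Product", "Staging1"));
            // CREATE TABLE [Staging1] ([SampleID] VARCHAR(50), [RepNumber] VARCHAR(50), [Product] VARCHAR(50))
        }
    }
    ```

    The generated statement could then be run with an Execute SQL Task before loading the data.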

    Thursday, July 16, 2015 11:34 PM
  • Thanks for the response, but that is not what I need. I already have a staging table, and I just want to insert rows into it, including the headers. Although my columns are dynamic, my staging table has 25 columns, which is enough to hold all of them. Since I have the headers, I will deal with separating them out at a later stage.

    svk

    Friday, July 17, 2015 1:07 AM
  • If you have fewer than 19 columns, the "columns[18]" line will produce that error. You should wrap these assignments in a check to make sure the columns array has the number of elements you expect. You could also wrap this in a try/catch block and ignore errors of this type.
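    To make that concrete, here is a sketch (assuming the buffer column names from the posted code) of a small helper that returns an empty string for any index past the end of the array, so every assignment is safe however many columns the row actually has:

    ```csharp
    using System;

    public static class SafeSplit
    {
        // Returns the value at position i, or an empty string when the
        // row has fewer columns than the staging table expects.
        public static string Get(string[] columns, int i)
        {
            return i < columns.Length ? columns[i] : string.Empty;
        }

        public static void Main()
        {
            // A row with only 5 of the expected 19 fields.
            string[] columns = "S1;1;ProdA;10;20".Split(';');

            Console.WriteLine(Get(columns, 0));  // S1
            Console.WriteLine(Get(columns, 4));  // 20
            Console.WriteLine(Get(columns, 18)); // empty string, no exception
        }
    }
    ```

    In the script component, each assignment would then become, for example, File1OutputBuffer.Col16 = SafeSplit.Get(columns, 18);. For the second question, one way to avoid the connection manager entirely is to add User::FileName to the component's ReadOnlyVariables and read this.Variables.FileName in PreExecute instead of using AcquireConnections.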
    Friday, July 17, 2015 11:32 AM
  • In essence, each flat file that I import has a different number of columns, but they are all properly formatted.

    Why are you loading all the files' data into a single staging table? You should have a different table for each file, since each file is consistently formatted. Then you won't have to write extra, complicated logic later to separate out the rows per file.


    Thanks, hsbal


    PS: Your Google Drive sample file has 16 semicolon-separated columns. These should be loaded into a staging table with 16 columns.
    • Edited by Harry Bal Friday, July 17, 2015 7:58 PM Checked the flat file
    • Proposed as answer by Katherine Xiong Tuesday, July 21, 2015 6:34 AM
    • Marked as answer by Katherine Xiong Tuesday, July 28, 2015 2:11 AM
    Friday, July 17, 2015 7:53 PM
  • I would recommend checking out the commercial COZYROC Data Flow Task Plus. It is an extension of the standard Data Flow Task that supports dynamic column metadata at runtime, so you can process files with a variable number of columns without writing any code.

    SSIS Tasks Components Scripts Services | http://www.cozyroc.com/

    Saturday, July 18, 2015 6:55 AM