locked
Loop through excel file sheets in ssis RRS feed

  • Question

  • HI..

    I have one excel file which is having dimensions meta data info in each sheet (One dimension info in each sheet) with same format. Sheet name having the dimension name.

    Now, i want to access these sheets and load into my target table(look up on sheet name and target table name).

    So, i need to perform 2 tasks now.

    1) Loop thru all sheets in excel and

    2) Save the sheet name in variable each time, using this variable to look up on target table name .

    Can anyone help me to achieve this?

    • Changed type SSISJoostMVP Friday, February 13, 2015 7:25 AM More appropriate type
    Thursday, February 12, 2015 10:22 AM

Answers

  • Thank You Vaibhav, it solved the issue and i am able to create the Excel Conn now.

    I am bit new in accessing excel files in ssis.

    After creating Excel connection it is not displaying under Excel Connection Manager in my excel source connection part.

    Please create New Excel Connection which points to your Excel file SourceToTarget and then you will see this Connection in drop down in Excel source.


    -Vaibhav Chaudhari

    • Proposed as answer by Katherine Xiong Friday, February 13, 2015 8:30 AM
    • Marked as answer by Naveen Mbv Monday, February 16, 2015 8:39 AM
    Thursday, February 12, 2015 5:52 PM
  • To overcome this you need to check the variable value inside the foreachloop container.  

    in my way .. 

    So you can take a variable @validsheet and a script task .. in script task you can check the value of @sheetname 

    if it contain "Print_area"  or "print_Titles" then set @validsheet =0 else @validsheet =1 

    After that in precedence constraint you can check the value @validsheet  to proceed.. 

     

    Thanks,

    Saurabh 


    http://www.linkedin.com/in/sbhadauria http://www.experts-exchange.com/M_6313078.html

    • Marked as answer by Naveen Mbv Monday, February 16, 2015 7:10 PM
    Monday, February 16, 2015 11:16 AM

All replies

  • Go through below links ... there is a very good way to do this.. 

    http://beyondrelational.com/modules/24/syndicated/398/Posts/18163/ssis-how-to-loop-through-multiple-excel-sheets-and-load-them-into-a-sql-table.aspx

    http://www.codeproject.com/Tips/395541/How-to-load-data-from-multiple-Excel-sheets-to-any

    Thanks,

    Saurabh 


    http://www.linkedin.com/in/sbhadauria http://www.experts-exchange.com/M_6313078.html

    Thursday, February 12, 2015 10:48 AM
  • You can refer the below step by step procedure. Just change would be that, inside OLE DB destination you need provide Data Access Mode: Table Name or View Name Variable
                 Variable Name: User::SheetName

    SSIS: How to loop through Multiple Excel sheets and load them into a SQL Table?


    -Vaibhav Chaudhari

    Thursday, February 12, 2015 10:50 AM
  • You just need to use for each loop with ADO schema rowset enumerator to iterate through sheets within an Excel file. Then rest is straightforward

    http://sqlage.blogspot.in/2013/12/ssis-read-multiple-sheets-from-excel.html


    Please Mark This As Answer if it solved your issue
    Please Vote This As Helpful if it helps to solve your issue
    Visakh
    ----------------------------
    My Wiki User Page
    My MSDN Page
    My Personal Blog
    My Facebook Page

    Thursday, February 12, 2015 12:38 PM
  • Thank You Visakh and all for your links and tips...

    I was trying to follow the link http://beyondrelational.com/modules/24/syndicated/398/Posts/18163/ssis-how-to-loop-through-multiple-excel-sheets-and-load-them-into-a-sql-table.aspx and implementing same in my scenario. But i am facing issue in excel source.

    While creating the Foreach ADO.NET Schema Rowset Enumerator connection , i am facing the below issue.


    • Edited by Naveen Mbv Thursday, February 12, 2015 3:00 PM
    Thursday, February 12, 2015 2:59 PM
  • It throws error because Jet 4.0 OLE DB provider only supports .xls files and you have given .xlsx file.

    To use .xlsx file, you need to select MS Office 12.0 Access DB related driver and provide Extended property as Excel 12.0

    Refer: https://dataintegrity.wordpress.com/2009/10/16/xlsx/

    Another workaround (If you can use .xls)

    1. Please open SourceToTarge.xlsx file in Excel and save it back with .xls (97-2003) format
    2. Come back to connection manager and provide this newly saved.xls file.
    2. Click on All tab and provide Extended Properties: Excel 8.0 then hit Test Connection button.


    -Vaibhav Chaudhari


    Thursday, February 12, 2015 3:19 PM
  • Thank You Vaibhav, it solved the issue and i am able to create the Excel Conn now.

    I am bit new in accessing excel files in ssis.

    After creating Excel connection it is not displaying under Excel Connection Manager in my excel source connection part.

    Thursday, February 12, 2015 3:51 PM
  • Hi Visakh,

    I am trying to follow your link http://sqlage.blogspot.in/2013/12/ssis-read-multiple-sheets-from-excel.html

    I forgot to tell you about source query in my scenario

    I need to select only few cells from each sheet for execution , below is my source query for each sheet in excel

    "SELECT * FROM [DimAlder$A14:R14] UNION ALL SELECT * FROM [DimAlder$A16:R1000]"

    But as showing in STEP 3, if i take ADO.NET Source i am not able to select cells like above query as this source component is allowing only for 'Table or View' and 'SqlCommand'

    But i want 'sqlcommandfromvariable' also as i configured the above query in variable expression(for each excel sheet/tablename it has to generate same query )

    Thursday, February 12, 2015 4:37 PM
  • Thank You Vaibhav, it solved the issue and i am able to create the Excel Conn now.

    I am bit new in accessing excel files in ssis.

    After creating Excel connection it is not displaying under Excel Connection Manager in my excel source connection part.

    Please create New Excel Connection which points to your Excel file SourceToTarget and then you will see this Connection in drop down in Excel source.


    -Vaibhav Chaudhari

    • Proposed as answer by Katherine Xiong Friday, February 13, 2015 8:30 AM
    • Marked as answer by Naveen Mbv Monday, February 16, 2015 8:39 AM
    Thursday, February 12, 2015 5:52 PM
  • Thanks Vaibhav, i am able to create the connection now.

    But, when i am accesing the file in excel source, i was getting all worksheets and named spaces. I want to load only worksheets, when i googled it they were suggesting to create script task to select only worksheets.

    In web, i find the below code to read tables from excel.

    publicclass ScriptMain { publicvoid Main() { string excelFile; string connectionString; OleDbConnection excelConnection; DataTable tablesInFile; int tableCount = 0; string currentTable; int tableIndex = 0; string[] excelTables = newstring[5]; excelFile = Dts.Variables["ExcelFile"].Value.ToString(); connectionString = "Provider=Microsoft.Jet.OLEDB.4.0;" + "Data Source=" + excelFile + ";Extended Properties=Excel 8.0"; excelConnection = new OleDbConnection(connectionString); excelConnection.Open(); tablesInFile = excelConnection.GetSchema("Tables"); tableCount = tablesInFile.Rows.Count; foreach (DataRow tableInFile in tablesInFile.Rows) { currentTable = tableInFile["TABLE_NAME"].ToString(); excelTables[tableIndex] = currentTable; tableIndex += 1; } Dts.Variables["ExcelTables"].Value = excelTables; Dts.TaskResult = (int)ScriptResults.Success; } }

    But,this is to read both worksheets and named spaces. where i need to modify this code to extract only worksheets.

    Monday, February 16, 2015 8:43 AM
  • Thanks Vaibhav, i am able to create the connection now.

    But, when i am accesing the file in excel source, i was getting all worksheets and named spaces. I want to load only worksheets, when i googled it they were suggesting to create script task to select only worksheets.

    <<Code>>    

    But,this is to read both worksheets and named spaces. where i need to modify this code to extract only worksheets.

    Hi Naveen,

    Your aim was loop through Excel and load sheets data to SQL. For that I suggested below step by step.

    Please refer: http://beyondrelational.com/modules/24/syndicated/398/Posts/18163/ssis-how-to-loop-through-multiple-excel-sheets-and-load-them-into-a-sql-table.aspx

    So at the Excel source, you created a new Excel source connection manager and used and now you see all worksheets there. You don't need to set any particular worksheet there. Your foreach loop cotainer will return the Sheet Name in variable User::SheetName. You have to select it there like below:

    Please correct if my assumption is wrong. (FYI, you don't need script task here)


    -Vaibhav Chaudhari

    Monday, February 16, 2015 9:49 AM
  • Correct, i need to select worksheet name which is coming from ForEach. But here my concern is ForEach should return only worksheets not any namedrange.

    First, here i have two requirements.

    1. Read multiple worksheets(only few cells) from the excel sheet and

    2. Select only worksheets and exclude named ranges

    So, for req 1, i have configured my excel source editor to 'sql cmd from variable' as below

    "SELECT * FROM ["+ @[User::SheetName]  +"A16:R1000] WHERE F1 is NOT NULL"

    the above one shld generate the same query for all sheets exist in excel , but only for worksheets

    For req 2,As per below link , excel sheet will give worksheets and named ranges(i dont require names range)

    https://msdn.microsoft.com/en-us/library/ms345182.aspx

    'The list of tables in an Excel workbook includes both worksheets (which have the $ suffix) and named ranges. If you have to filter the list for only worksheets or only named ranges, you may have to write custom code in a Script task for this purpose.'

    This is the reason i am looking for script to select only worksheets from excel sheet.


    • Edited by Naveen Mbv Monday, February 16, 2015 10:20 AM
    Monday, February 16, 2015 10:14 AM

  • In the excel source editor , my excel file is showing sheets as below.( Both sheet names and named range)

    For ex, in the above pic i have 3 names with 'DimAlder$' (DimAlder$,DimAlder$Print_Area,DimAlder$Print_Titles).

    Here DimAlder$ is the sheet name and remaing are namedranges. So, i need to have these names only from For each loop.For this i have a variable '@SheetName' to store sheetName. But in this i am getting all the 3 names

    Hence my DFT is failing for named spaces as these are not sheetnames in excel file.


    Monday, February 16, 2015 10:30 AM
  • To overcome this you need to check the variable value inside the foreachloop container.  

    in my way .. 

    So you can take a variable @validsheet and a script task .. in script task you can check the value of @sheetname 

    if it contain "Print_area"  or "print_Titles" then set @validsheet =0 else @validsheet =1 

    After that in precedence constraint you can check the value @validsheet  to proceed.. 

     

    Thanks,

    Saurabh 


    http://www.linkedin.com/in/sbhadauria http://www.experts-exchange.com/M_6313078.html

    • Marked as answer by Naveen Mbv Monday, February 16, 2015 7:10 PM
    Monday, February 16, 2015 11:16 AM

  • In the excel source editor , my excel file is showing sheets as below.( Both sheet names and named range)

    For ex, in the above pic i have 3 names with 'DimAlder$' (DimAlder$,DimAlder$Print_Area,DimAlder$Print_Titles).

    Here DimAlder$ is the sheet name and remaing are namedranges. So, i need to have these names only from For each loop.For this i have a variable '@SheetName' to store sheetName. But in this i am getting all the 3 names

    Hence my DFT is failing for named spaces as these are not sheetnames in excel file.


    I got the story.

    Currently the script will return both Named ranges and sheets because both are treated as Tables. I'm not sure if we have any special property to identify named range or sheet. However I posted the question and got a compext script in below. See if you can use it.

    https://social.msdn.microsoft.com/Forums/vstudio/en-US/45b03364-456e-4949-bc2c-11cf45928922/get-excel-sheet-names-only-skip-named-ranges?forum=csharpgeneral

    Also, you can try what Saurabh suggested ie to check if User::SheeName is valid or not.


    -Vaibhav Chaudhari


    Monday, February 16, 2015 12:25 PM
  • Thanks Saurabh, it solves the issue.

    Instead of filtering on worksheets, i did join with the sheetname and filename in DB table, which has loaded in previous run.Thus i am executing the logic for only worksheets.

    Monday, February 16, 2015 7:13 PM