locked
Importing Excel Spreadsheets from different sources RRS feed

  • Question

  • I'm battling with importing sales data from different ERP systems into a SQL table.  I have multiple spreadsheets, which have the same sheet name, column names and formats.  I use a for each file loop container to loop through the different files.  I use variables passed to a script to pull the relevant source information from the file name. I then use the variables to create derived columns in the flow to indicate the datadate and the company division. 

     

    My problem is even though all the data and metadata looks the same, there are a couple of columns between the two spread sheets that alway complain about the metadata not matching or a coumn needing to be added to the external metadata column collection and the same coumn name needs to be removed from the external metadata column collection.

     

    These spreadsheets are coming from different countries but I have gone in and adjusted any fromats that are different from the excel spreadsheet used as the template.  The same two columns out of 12 complain and I'm not sure what is causing the error.  In the error is

     

    Warning: 0x800470C8 at Read excel File, Excel Source [274]: The external metadata column collection is out of synchronization with the data source columns. The column "COGS" needs to be added to the external metadata column collection.

    The "external metadata column "COGS " (2512)" needs to be removed from the external metadata column collection.

     

    Eventually, I'll have 3 spreadsheets to load and need to find out why they are not viewed as the same data and metadata when they apear that way on the surface.  I've searched numerous blogs out there (Even Jamie's) and can't find anything.

     

    Thanks,

     

    pkdenver

     

    Tuesday, November 20, 2007 5:57 PM

Answers

  • I created a workaround to process the separate files for the different spreadsheets.  There is definitely some quirky behavior regarding reading the metadata from the different spreadsheets.  It was always just one column that couldn't be read or was detected as out of sync. 

    I used an expression, which evaluated a user variable in the Precedence constraint to determine if the file should be processed by the following dataflow task.  I used a script task and user variables to store values from the filename to determine how the file should be processed.  I placed three branches from the script task with the 3 expressions leading to the appropriate dataflow.  Luckily, we don't have many different files to process.

     

     

    Monday, November 26, 2007 3:38 PM

All replies

  • It's likely that Excel is reporting different metadata, even though you have tried to make it the same. This is why most of us advise avoiding the use of Excel as a data source. The driver determines what the metadata is at runtime - so it isn't guaranteed to produce consistent results.

     

    I realize that doesn't really help answer your question. The only other suggestion I would have is that you look into exporting the Excel files into a different format, like a flat file. That way you control the metadata definitions.

     

    Friday, November 23, 2007 11:46 PM
  • I created a workaround to process the separate files for the different spreadsheets.  There is definitely some quirky behavior regarding reading the metadata from the different spreadsheets.  It was always just one column that couldn't be read or was detected as out of sync. 

    I used an expression, which evaluated a user variable in the Precedence constraint to determine if the file should be processed by the following dataflow task.  I used a script task and user variables to store values from the filename to determine how the file should be processed.  I placed three branches from the script task with the 3 expressions leading to the appropriate dataflow.  Luckily, we don't have many different files to process.

     

     

    Monday, November 26, 2007 3:38 PM