concatenating similar files from different folders
-
Thursday, July 26, 2012 8:52 AM
Hi,
I have some folders namely FolderA, Folder B... and each folder contains some files like file1, file2..etc. File1 can be present in both folder A and B and so on. I want to create an SSIS package which merges all the similar files. Like File1 from folder A and file1 from folder B has to be merged. How can i proceed with this? All suggestions are most welcome.
Also, which one will be efficient? Using the SSIS or using windows scripting?
Thanks in advance,
Nikesh
All Replies
-
Thursday, July 26, 2012 9:01 AMModeratorwhat do you mean by merging files? you mean appending rows ? or merging columns?
-
Thursday, July 26, 2012 9:21 AM
Hi Reza,
I need to append the similar files, not merging columns. File1 in folder A and file1 in folder B will be having same metadata. So we an append.
Thanks,
Nikesh
- Edited by Nikesh Kootan Thursday, July 26, 2012 9:22 AM typo
-
Thursday, July 26, 2012 9:30 AM
Hi Nitesh
Lets say File1
Name,id
anil,1
sunil,2
Say File 2
Name,id
ram,3
shyam,4
and your megrged file l
Name,id
anil,1
sunil,2
ram,3
shaym,4
like above means
use union all component in DataFlowTask
it will do.. but two source componets should refers recpective folder files..
- Edited by Kingxxx1 Thursday, July 26, 2012 9:31 AM
-
Thursday, July 26, 2012 9:50 AM
Hi Kingxxx1,
Thanks for your reply.
But the problems is that its has to be dynammic.
Below are the challenges I am facing.
1->How to read all the files from different directories? Suppose my parent directory is C:\data and it has different folders like folderA,folderB and so on. All the files are distributed among these folders.
2->When a particular file , say wc12345.dat is found in folderA, I need to check for the occurence of the same file wc12345.dat in all other fiolders and append to that file.
Hope you understood the problem.
Thanks,
Nikesh
-
Thursday, July 26, 2012 10:12 AMModerator
yes, as Kingxxx1 said, you can use UNION ALL Transformation,
but detailed help needs more information from you.
are those files have same structure?
I mean all File1s have same structure? how about File2s ? or all files have same structure
UNION all seems to be good if all files have same structure, but if they differs we need to think about other methods, could you explain more about your case/
-
Thursday, July 26, 2012 10:16 AM
Hi Nikesh,
In this case you have two options,
1. Go For an FileWatcherTask(Third part component).
2. Retrive the full name of the file in to variable .. In script get the file name of the file and store it temporarily and search the particular file in other loacation..
if yes get the location of the second file store it in a vaiable and pass the variable to the DFT source..
check the below link for an custom componet or you can use konesans file watcher also..
http://www.sqlis.com/post/file-watcher-task.aspx
But if are resticted using custom component means go for WMI event task and do some work arround with script..
- Marked As Answer by Nikesh Kootan Thursday, August 02, 2012 7:24 AM
-
Thursday, July 26, 2012 11:51 AM
Yes, the files with same names will be identical.So we can use the union all operator on those files.File1 and file2 will be structurally different.
Thanks,
Nikesh
-
Thursday, July 26, 2012 12:03 PMModerator
if you have different structures for each couple of files, then this means that: if you want to use data flow task, you will need one specific for each couple of files. because Data flow task doesn't support dynamic metadata.
So I think in your case writing .NET scripts to do that is better method.
for reading data from file you can use System.IO.StreamReader
and for writing data into file you can use System.IO.StreamWriter
or if you are not good in programming with .net, you can use some codes that can be find to append two files together, like this:
- Marked As Answer by Nikesh Kootan Thursday, August 02, 2012 7:24 AM
-
Thursday, July 26, 2012 12:16 PM
Hi Reza.. Take a look at this..
1->How to read all the files from different directories? Suppose my parent directory is C:\data and it has different folders like folderA,folderB and so on. All the files are distributed among these folders.
2->When a particular file , say wc12345.dat is found in folderA, I need to check for the occurence of the same file wc12345.dat in all other fiolders and append to that file.
what can be the better solution for this..
-
Friday, July 27, 2012 2:24 PM
Hi Nikesh,
Here is a rough sketch what you can do:
1 - Loop through all the files in your system using "For each Loop container" set it to iterate through the files in a folder (or use similar scheme to loop through multiple folders).
2 - Add a Data Flow task within "For each Loop Container" and set the source as your flat files and destination as Flat File Destination from the tool box.
3 - In the data task Use "Lookup" and "Conditional split" to ignore the duplicate rows.
Hope this helps
- Proposed As Answer by Eileen ZhaoMicrosoft Contingent Staff, Moderator Monday, July 30, 2012 8:10 AM
- Marked As Answer by Eileen ZhaoMicrosoft Contingent Staff, Moderator Wednesday, August 01, 2012 3:37 AM
-
Thursday, August 02, 2012 7:23 AM
Hi,
I could implement this and the steps are given below:
1->For each loop container for file, with an option to traverse subdirectories.
2-> In the container, add one script task to set the dynamic connection strings.
3->File system task to create dynamic directories based on the file names.
4->Data flow task source and destination as flat files, all the similar files will get appended to the destination file.
Here we had 3 kind of files with different metadata, they are identifiable with their names itself. So we can have different dataflow task for all these 3 types of files. One drawback is that whenever a new file is added, say a new file with different metadata, then we need to alter the package to accomodate this file.
Thanks for all your suggestions,
Nikesh

