SSIS Dataflow to Flat File SLOOOOWWWW!!
-
lundi 20 août 2012 21:43This is a very simple operation, I created a VIEW in a SQL 2005 DB, which has the columns I need to export to a flat file. I set up a data flow task, only one in the package, that copies the rows to the file. There are roughly a half million rows in the table.. It taks 5+ HOURS!! I'm not even sure where to start looking here! Any help would be appreciated!
Toutes les réponses
-
lundi 20 août 2012 22:57Modérateur
Need specifics.
Do you copy to a file across WAN (network) where a db is at a remote location?
Did you use OLEDB with "Fast load" option turned on?
The above and many more could hinder you load speed.
Arthur My Blog

-
mardi 21 août 2012 08:17As Arthur stated, there are many reasons for the slowness. It can be hardware/ network/ incorrect design. Check this which talks about an incorrect design when loading data to flat files.
http://btsbee.wordpress.com/
-
mardi 21 août 2012 12:40
This is a very simple operation, I created a VIEW in a SQL 2005 DB, which has the columns I need to export to a flat file. I set up a data flow task, only one in the package, that copies the rows to the file. There are roughly a half million rows in the table.. It taks 5+ HOURS!! I'm not even sure where to start looking here! Any help would be appreciated!
Views are extremely slow when huge amount of data is involved.
Run a test, Instead of using the View use SQL statement from within the view and run the job if there is a huge time difference then you should try using Stored Procedure instead.
(Using views withing views is even worse and personally I would never directly use a view from withing SSIS).
Edit: Look for Indexed Views if the views is your way to go.
Please mark the post as answered if it answers your question
- Modifié DotNetMonster mardi 21 août 2012 13:47
-
mardi 21 août 2012 14:45Arthur, Yes I am developing this against a remote server, Where would i find the "Fast Load" option? For a reference point, if i execute the query and send the results to a file directly from Management Studio, it takes about 3 minutes.. Also not blazing, but I can live with that..
-
mardi 21 août 2012 15:03Modérateur
Arthur, Yes I am developing this against a remote server, Where would i find the "Fast Load" option? For a reference point, if i execute the query and send the results to a file directly from Management Studio, it takes about 3 minutes.. Also not blazing, but I can live with that..
So, to find out if there is any latency use the PING command like
PING MyRemoteHost -n 1000
if you see reply timeouts and/or late replies then this explains the slowness.
The "Fast Load" is here:

Arthur My Blog

-
mardi 21 août 2012 15:14
Okay, I can't see the picture.. but I would think that Fast load applies to the destination if it is a SQL server table, I am writing to a flat file. There is no latency when pinging the server, as I said, i can retrieve the entire resultset to a file in mgt studio in under 3 minutes.. so How do I account for the 5 hours it needs to write that data to a file?Arthur, Yes I am developing this against a remote server, Where would i find the "Fast Load" option? For a reference point, if i execute the query and send the results to a file directly from Management Studio, it takes about 3 minutes.. Also not blazing, but I can live with that..
So, to find out if there is any latency use the PING command like
PING MyRemoteHost -n 1000
if you see reply timeouts and/or late replies then this explains the slowness.
The "Fast Load" is here:

Arthur My Blog

-
mardi 21 août 2012 15:15
As Arthur stated, there are many reasons for the slowness. It can be hardware/ network/ incorrect design. Check this which talks about an incorrect design when loading data to flat files.
I did find this post last night, but it is not using a for loop, it is a straight Data Flow task.. (at least i think it is..)
http://btsbee.wordpress.com/
-
mardi 21 août 2012 15:16
While I am new to SSIS, we have had no performance issues with Views in our DTS datapumps.. and running the query from the view directly to a file within management studio takes less than 3 minutes.. so something is happening in the SSIS engine.. As an experiment, I created a quick data pump in one of our old DTS packages that copies the view to a table in another database.. That takes about 3 minutes.. Is writing to the file REALLY that slow?This is a very simple operation, I created a VIEW in a SQL 2005 DB, which has the columns I need to export to a flat file. I set up a data flow task, only one in the package, that copies the rows to the file. There are roughly a half million rows in the table.. It taks 5+ HOURS!! I'm not even sure where to start looking here! Any help would be appreciated!
Views are extremely slow when huge amount of data is involved.
Run a test, Instead of using the View use SQL statement from within the view and run the job if there is a huge time difference then you should try using Stored Procedure instead.
(Using views withing views is even worse and personally I would never directly use a view from withing SSIS).
Edit: Look for Indexed Views if the views is your way to go.
Please mark the post as answered if it answers your question
-
mardi 21 août 2012 15:32Modérateur
OK, at least we made some progress with figuring out where the bottleneck is. It is writing to a file on the remote machine, not local.
I thus see no grounds to blame the SSIS engine (the local execution is 3 min, right?).
So one thing you can do to speed up the output to a file: use the FastParse Option in Flat File Source.
Other than that you need an IO pressure test in that (target) environment where the slowness is, I simply think you are facing a SAN misconfiguration or something, after all the SSIS engine does not use any kind of magic than the .Net standard System.IO write to file methods.
Arthur My Blog

-
mardi 21 août 2012 15:40
Well, it's pulling from a remote server, writing to a local file, I am simply running the package in debug mode on my workstation.. (the 3 minute thing was just pushing query results to a file in management studio). It is not an IO issue.. I went back to an old DTS package, and recreated the same dataflow that I am trying to accomplish in SSIS, and it took 3 minutes.. so yeah, I think i CAN blame the SSIS engine.. I don't see the FastParse option, but it is not a flat file source, it is a flat file destination..OK, at least we made some progress with figuring out where the bottleneck is. It is writing to a file on the remote machine, not local.
I thus see no grounds to blame the SSIS engine (the local execution is 3 min, right?).
So one thing you can do to speed up the output to a file: use the FastParse Option in Flat File Source.
Other than that you need an IO pressure test in that (target) environment where the slowness is, I simply think you are facing a SAN misconfiguration or something, after all the SSIS engine does not use any kind of magic than the .Net standard System.IO write to file methods.
Arthur My Blog

-
mardi 21 août 2012 16:05Modérateur
-
mardi 21 août 2012 16:07Modérateur
Regarding the time to pull data, let's recap:
You run your SSIS package locally and get the file filled in 3 min completely, right?
DTS does the same, right?
Not so (takes ~ 5hrs) when you pull the data from a remote machine to the server (not your machine), right?
Arthur My Blog

-
mardi 21 août 2012 16:11
Regarding the time to pull data, let's recap:
You run your SSIS package locally and get the file filled in 3 min completely, right?
DTS does the same, right?
Not so (takes ~ 5hrs) when you pull the data from a remote machine to the server (not your machine), right?
Arthur My Blog

No.. Sorry if this hasn't been clear.
I run the SSIS package locally, it takes 5 hours. (Pulling from remote server, writing to local file)
In SQL Management Studio, I run the same query, it returns all rows in less than 3 minutes, whether i direct the output to a GRID/Text/or File.
I Created a Datapump in a DTS package that I run locally, pulling from remote server, writing to local file, it takes 3 minutes.
-
mardi 21 août 2012 16:22ModérateurPlease share here how you configured the source (in every detail please)
Arthur My Blog

-
mardi 21 août 2012 16:34
Please share here how you configured the source (in every detail please)
do you just want the XML from the package? Otherwise i'm not sure what you are asking for.. I pretty much created the package, created the dataflow task, dropped the oledb connection from the toolbar, pointed to the shared data source, picked a view from the data source views.. didn't change any of the default properties.. then dropped a flat file destination, connected, mapped, then ran.. setup took a minute and nothing other than default settings were selected..
Arthur My Blog

-
mardi 21 août 2012 17:56Modérateur
XML is no use
oledb connection I need to see as a screen shot
advanced edit
destination - need to see as above
One interesting item, can you revert to not to using the SHARED DATA SOURCE? Try a new one please and see how would it work!
Arthur My Blog

-
jeudi 6 septembre 2012 20:16
Did you ever get an answer on this? I'm trying to do something similar. Using SSIS because it's easier to edit and move things around, but the DTS package that Export Wizard creates works fine and then I have troubles in SSIS. (Admittedly, I know little to nothing about SSIS... I'm reading up on that now, but came across this thread.)
tia.
-
dimanche 9 septembre 2012 02:15Modérateur
Futrell,
Do not be intimidated by reading this thread.
Just go ahead with the package creation or editing the one produced by the Data Export Import Wizard.
Arthur My Blog


