none
Source Data Flow Task very slow using same connection as DTS RRS feed

  • Question

  • My SSIS package transfers data from AS400 to SQL Server 2005 and takes 3.75 minutes to transfer 115,000 records.  There are no transformations, indexes, contraints, data conversions – just a straight dump. 


    A DTS package running in SQL Server 2000 using the same ODBC data source takes about 23 secs for the same 115,000 records.


    There is 145 bytes per record.  Nineteen fields, all but two are nvarchar, two are numeric.  No dates or timestamp.

    SQL Server 2005 (Standard Edition) is running as a named instance alongside SQL Server 2000 on the same physical server that has 4 processors and 3GB memory.  It makes no detectable difference when the SSIS package runs with SQL Server 2000 (and Agent) service is stopped.

     

    Using RowCount (as per Donald Farmer’s OVAL presentation ) as the destination task, it’s apparent that nearly all of the bottle neck is in the source task.  This is confirmed by SQL Server Profiler which shows that almost all of the processing time is attributed to the following BULK INSERT statement:

     

    BULK INSERT [dbo].[MyTableName]

    FROM 'Global\DTSQLIMPORT              00000000000016200000000000d8b0e8'

    WITH (DATAFILETYPE = 'DTS_Buffers', CODEPAGE = 'RAW', TABLOCK)

     

    When running from BIDS, the record-count ticks over at a (constant) slow rate until it reaches the total (115,000) then the package instantly completes successfully.  So it seems the bottleneck is in filling the buffer (Global\DTSQLIMPORT ).

     

    Perfmon counters show:

        - Processor: %processor time,  is reasonably constant at about 26%

        - System: Processor Queue Length, spikes from 0 to 1 (occassionally 2) approx each second.

        - Memory: Pages/sec, constantly zero

        - Memory: Available Mbytes, constantly about 340

        - PhysicalDisk: Avg.Disk Queue Length, regular spikes from 0 - 4

        - PhysicalDisk: %Idle Time, regular spikes from 85% - 100%          

        - Network Interface: BytesTotal / sec, regular spikes between 52000 - 74000

        - SQL Server BufferManager, Buffer Cache Hit Ratio, constant at about 95.5%

        - SSIS Pipeline: Rows read,  is zero – is this the key to my problem?

        - SSIS Pipeline: Rows written,  seems to be cumulative of all packages.  It increments by 115,000 with each run of the package. I can’t see the value of it.

        - SSIS Pipeline: Buffers spooled, constantly zero

     

    Are these results OK?  They seem alright to me. Are there other counters I should be recording?

     

    Package Details.

    Source Connection

    ConnectionManagerType= ADO.NET:System.Data.Odbc.OdbcConnection, System.Data, Version=2.0.0.0, Culture=neutral, PublicKeyToken = …

    ConnectionString: uid=MyUserName;Dsn=MyDSN

    DelayValidation=False

    Qualifier= System.Data.Odbc.OdbcConnection, System.Data, Version=2.0.0.0, Culture=neutral, PublicKeyToken=… (etc)

    RetainSameConnection=False (True causes an error)

     

    Destination Connection

    ConnectionManagerType=OLEDB

    ConnectionString= Data Source=MyServer\MyNamedInstance;Initial Catalog=MyDatabase;Provider=SQLNCLI.1;Integrated Security=SSPI;Auto Translate=False 

    RetainSameConnection=True (Changed from False)

     

    Data Flow Control

    DefaultBufferMaxRows=1000 (Changed from 10000, and other values on the experience of Andy, who contributed to this excellent blog by Jamie Thomson:
    DefaultBufferSize=10485760 (and various values)

    ForceExecutionValue = False

     

    Data Flow Task: Datareader Source:

    ValidateExternalMetaData=False

    CommandTimeout=30

    SqlCommand=SELECT * FROM MySourceTable (including DISTINCT slows it down by about 5%)

     

    Data Flow Task: SQL Server Destination:

    (I have tried OLEDB Destination with practically the same results)

    ValidateExternalMetaData=False

    AlwaysUseDefaultCodePage = False

    BulkInsertCheckConstraints = False (There are no constraints)

    BulkInsertFireTriggers = False (There are no triggeres)

    BulkInsertFirstRow=1 (Changed from the default -1, made no difference)

    BulkInsertKeepIndentity=False (There are no identity cols)

    BulkInsertKeepNulls=False

    BulkInsertLastRow=0 (Changed from -1 after reading this article )

    BulkInsertMaxErrors=-1 (No errors occur so I figure this is not relevant)

    BulkInsertOrder is blank (There are no clustered indexes on the source or destination tables.)

    BulkInsertTableName=[dbo].[MyTableName]

    BulkInsertTabLock=True

    DefaulCodePage=1252

    MaxInsertCommitSize=0 (Have tried various settings here)

    Timeout=30

     

    Initially I suspected that I might need more memory, but my current thinking is that, although performance would improve with more memory, there is something else wrong with my set up. 

    Any ideas would be greatly appreciated.

    Thursday, April 16, 2009 5:08 AM

Answers

  • Hi guys,

    this slowdown is very likely caused by the overhead ADO .NET bridge to ODBC adds to the system. Unforunatelly, SSIS does not have a native support for ODBC providers in data flows.

    Have you guys tried to use OLE DB providers For DB2? I wonder how would those rate in your environments.

    Thanks,
    Bob
    This posting is provided "AS IS" with no warranties, and confers no rights.
    Monday, May 4, 2009 4:46 PM
  • I recently started using the OLE DB provider to extract data from DB2(AIX) and have had no issues. Response times seems reasonable.
    Rafael Salas | Don’t forget to mark the post(s) that answered your question http://rafael-salas.blogspot.com/
    Monday, May 4, 2009 5:12 PM
    Moderator

All replies

  • I have just done some simple maths which points to the network as the problem.

    I'm transferring 115,000 records at 145 bytes per record = 16,675,000 bytes
    From Perfmon, Bytes/sec averages about 60,000 bytes per sec so the expected transfer time is about 278 ses (4.6 mins) which is approx what I'm getting.

    However, I still don't understand why the DTS package takes about 23 secs using the same network.

    Does SSIS interact with the network differently to DTS?

     

    Thursday, April 16, 2009 10:01 PM
  • Many thanks for doing all this work and posting such clear documentation (as well as on SQLServerCentral).
    I have had worse results importing a few million rows from 35 DB2 tables (iSeries/AS400).
    My manually-created SSIS package took 5 hours whereas the DTS package took only 31-35 minutes for the exact same work.

    In desperation, I have now installed DTS in Legacy mode on my new SQL2005 server and after about an hour work copied the DTS package from the old SQL2005 server, created identical ODBC System DSNs and ran it. Still takes 30 minutes, on the exact same machine, same network, etc, as the SSIS that takes 5 hours.

    My conclusion is that - unless something dramatic is revealed - there is no way I can tweak SSIS so much as to achieve a 1000% improvement on how it works now - i.e. to bring it down from over 300 minutes to 30.
    I can fiddle with 35 buffer sizes * 35 max records /min-max while adding up the bytes in the table designer to work out the min/max... but I guess it'd not improve 1000%.

    And yes, I researched as much as you - I use SELECT column names in the OLEDB Sources, tried concurrent task series (getting connection conflicts in the process as it POOLS by default...).

    I am still interested so if anyone hears anything, do post!


    Forever Rookie
    Thursday, April 30, 2009 7:26 AM
  • It's nice to get this feedback; to know that others have struck the same brick walls as me. 

    Since my last post I have decided to stick with SQL Server 2000 and the 'old' DTS packages so your post helps me feel better about that decision.

    It's a shame about the performance because  SSIS is so versatile and I would liked to have used it.  And like you, I'm still interested to hear if there is a solution.

    Cheers
    Alister
    Sunday, May 3, 2009 10:34 PM
  • Hi Alister,
    I have one question, how many times you run your package??

    Thanks-
    Let us TRY this
    Monday, May 4, 2009 6:25 AM
  • Hi guys,

    this slowdown is very likely caused by the overhead ADO .NET bridge to ODBC adds to the system. Unforunatelly, SSIS does not have a native support for ODBC providers in data flows.

    Have you guys tried to use OLE DB providers For DB2? I wonder how would those rate in your environments.

    Thanks,
    Bob
    This posting is provided "AS IS" with no warranties, and confers no rights.
    Monday, May 4, 2009 4:46 PM
  • I recently started using the OLE DB provider to extract data from DB2(AIX) and have had no issues. Response times seems reasonable.
    Rafael Salas | Don’t forget to mark the post(s) that answered your question http://rafael-salas.blogspot.com/
    Monday, May 4, 2009 5:12 PM
    Moderator
  • ETL vs ELTL

    In the course of tweaking different properties and recording various counters I would say I have run this package about 200 times.  I'm not sure how this can help you though.

    Cheers
    Alister

    Tuesday, May 5, 2009 10:42 AM
  • AlisterN,

    Please do a search for the "Microsoft OLE DB Provider for DB2" and "SQL Server 2008 Feature Pack". Ignore the fact that the description says it's for 2008, it will run fine with SQL Server 2005. Then test with this provider. When configuring the provider, go into the "advanced" tab, and set Rowset Cache Size to 30. This allows us to prefetch data from the DB2 source before SSIS asks for it, and helps speed things up, depending on the network, sql overhead, etc, etc.

    If you have any problems with it, open a case with our product support group. If they can't help you, I'm sure I will be contacted :)

    Charles
    Saturday, May 16, 2009 6:19 PM
  • [..] the "Microsoft OLE DB Provider for DB2" and "SQL Server 2008 Feature Pack". Ignore the fact that the description says it's for 2008, it will run fine with SQL Server 2005. Then test with this provider.
    Charles
    I would love to retest my slow SSIS package with the MS connectivity drivers (providers).

    Unfortunately, this release of DB2OLEDB also requires MSSQL higher than the Standard Edition we are licensed for (SQL2005) ...

    This is a big M$ mistake IMHO as the poor cousins who cannot afford the high-end licenses will continue RIGHTLY bagging the poor MS connectivity and SSIS - which cannot be used for anything serious as far as I can say as it runs more than 1000 times slower than DTS when importing data from DB2.
    I cannot confirm that the slowdown is due to poor IBM OLEDB provider drivers as I cannot run anything else.
    Forever Rookie
    Sunday, June 28, 2009 12:06 AM
  • Great solution. I use J.D.Edwards and Database of J.D.E have some rows that SQL Server read how Binary rows. When this heappen, I need use ADO.NET connection, because is the only one that read all data no corrupt. Anyway. I need solution that read the binary rows and be fast in the sometime. Anybody knows some solution for this??

     

    Thank you..

     

    Carlos - DBA - Brazil

    Thursday, October 14, 2010 7:07 PM