none
Efficient Data Processing RRS feed

  • Question

  • ok I have searched the MSDN forums for like 2 days and have not found anything relating, If i over looked just point me in the right direction thank you.

     

    The problem is I have to connect to an SQL Server and retrieve about 200,000+ records, and update another database on a different server.  I was able to complete the task using DataSets/DataRows/Datatables but it takes like 30 min that is too slow.  I also tried to retrieve the data and put it in to hashtables that to was slow.  Actually it was slower that the previous attempt.  Then I tried just raw querying to the server and it worked great, although I do not feel comfortable querying the server so many times.  If any one knows a way that I can efficiently process mass amounts of data without throwing tens of thousands of queries to an SQL server please let me know. Thank you.

    Tuesday, June 5, 2007 2:15 PM

All replies

  • If you're servers are "linked", you can run an individual query that manipuliates data from both machines.  This usually gives good performance, but may require some coding effort.

     

    -Ryan / Kardax

    Tuesday, June 5, 2007 3:23 PM
  • You could try using an SqlDataReader rather than a DataSet.

     

    If the problem can be solved using a SqlDataReader then they can be much faster than using a DataSet, particularly when there are higher row counts.

     

    That said, it is hard to tell if your problem can be effectively solved that way.  Can you describe in more detail what it is your batch job is doing conceptually?

     

    Are you migrating data from one DB to a second?

    Are you reading 200k records to then write out a summary in a "data warehouse" like fasion?

    Once the 200k records are read, do you then need to or try to sort/filter them on the client?

     

    There could be ways to more effectivle solve your problem using the features of SQL rather than in your client app.

     

    It is hard to tell without knowing more.

    Tuesday, June 5, 2007 6:24 PM
  • You could probably use BCP to accomplish this quickly.  It is made for quickly doing bulk operations for SQL Server.  This functionality is available as an api in OLEDB and ODBC as well as in the form of a standalone executable called "BCP.exe".

     

    http://msdn2.microsoft.com/en-us/library/aa174646(SQL.80).aspx

     

    Hope this helps,

     

    John (MSFT)

    Tuesday, June 5, 2007 11:30 PM
  • SQL Server has possibility to migrate data between different data sources using DMO/SMO (depending on a version of SQL Server you have). I believe it is the fastest way to transfer data and it also allows you to manipulate with that data (select only specific rows, columns etc)

    Wednesday, June 6, 2007 10:53 AM
    Moderator