none
dataset.fill performance RRS feed

  • Question

  • c#.net visual studio 2005 right now 2tier but moving to 3 tier

    i have a table that we have to retrieve all the information from.
    do this with odbc dataadapter.fill(dataset);

    the problem is that there are around 300,000 records in the table and it is taking upwards of 30 minutes to process(2 tier)-- this is just too long when compared to the method that currently exists in powerbuilder (taking about 5 minutes).

    what would be the best method to process this many records and keep performance up?
    Tuesday, June 19, 2007 7:10 PM

Answers

  • I mean, get rid of the DataSet entirely.  The DataTable object has a substantial amount of stuff that you probably don't need if all you're doing is reading data into memory and processing it. 

     

    I don't know what your use case is, but let's say your table has a million names in it, and you want to build a class that can load all the data and then let your program get to a name (given its ID, an int) as quickly as possible.  This does that:

     

    Code Snippet

    public class MyRow

    {

       public int ID;

       public string Name;

     

       public static Dictionary<int, MyRow> GetData(SqlDataReader reader)

       {

          Dictionary<int, MyRow> d = new Dictionary<int, MyRow>();

          while (reader.Read())

          {

             MyRow r = new MyRow();

             r.ID = reader.GetInt32(0);

             r.Name = reader.GetString(1);

             d.Add(r.ID, r);

          }

          return d;

       }

    }

     

    You lose a lot of the DataTable's functionality by doing this - no more creating DataViews that can sort and filter, or keeping track of the edit/commit status of your rows.  In fact, all that this class can do is let you do this:

     

    Code Snippet
       string name = d.Contains(id) ? d[id].Name : "";

     

    in close to O(1) time.  But it'll get the data out of the database about as fast as it's possible to do so.  On my laptop, the above gets a million rows out of a SQL Express database in about 3-4 seconds.

    Saturday, June 23, 2007 8:37 AM
  • thanks again,

    still really slow. this time it takes about 6 minutes though so big improvement but a ridiculous time for 300,000 records especially when compared to your time.
    i think that this might have something to do with the fact that i am using an odbc adapter.

    thanks again.
    Monday, June 25, 2007 2:41 PM

All replies

  • As a quick fix, you might try turning off EnforceConstraints in your DataSet before loading it.

    Depending on what you do with that data once you've loaded it, you may be much better off building a lighter-weight class than a DataTable and using a DataReader to populate it.
    Wednesday, June 20, 2007 1:54 AM
  • thanks for the reply,

    i tried turning off enforceconstraints.-- no difference.

    not really sure where to start on building a lighter dataset though.-- any resources that you know of for researching?
    Wednesday, June 20, 2007 1:25 PM
  • I mean, get rid of the DataSet entirely.  The DataTable object has a substantial amount of stuff that you probably don't need if all you're doing is reading data into memory and processing it. 

     

    I don't know what your use case is, but let's say your table has a million names in it, and you want to build a class that can load all the data and then let your program get to a name (given its ID, an int) as quickly as possible.  This does that:

     

    Code Snippet

    public class MyRow

    {

       public int ID;

       public string Name;

     

       public static Dictionary<int, MyRow> GetData(SqlDataReader reader)

       {

          Dictionary<int, MyRow> d = new Dictionary<int, MyRow>();

          while (reader.Read())

          {

             MyRow r = new MyRow();

             r.ID = reader.GetInt32(0);

             r.Name = reader.GetString(1);

             d.Add(r.ID, r);

          }

          return d;

       }

    }

     

    You lose a lot of the DataTable's functionality by doing this - no more creating DataViews that can sort and filter, or keeping track of the edit/commit status of your rows.  In fact, all that this class can do is let you do this:

     

    Code Snippet
       string name = d.Contains(id) ? d[id].Name : "";

     

    in close to O(1) time.  But it'll get the data out of the database about as fast as it's possible to do so.  On my laptop, the above gets a million rows out of a SQL Express database in about 3-4 seconds.

    Saturday, June 23, 2007 8:37 AM
  • thanks again,

    still really slow. this time it takes about 6 minutes though so big improvement but a ridiculous time for 300,000 records especially when compared to your time.
    i think that this might have something to do with the fact that i am using an odbc adapter.

    thanks again.
    Monday, June 25, 2007 2:41 PM
  • One reason that my test ran so fast is that my query was only returning an int and an nvarchar(10), so the total amount of data that had to get pumped through the DataReader was small.  Make sure you're not selecting any columns you don't need.
    Monday, June 25, 2007 9:47 PM
  • i try to never use the select * as it slows performance, so i am only selecting the columns i need. tested a different connection type. used the sybase asaClient and the return time was cut to 3-5 seconds in a 2 tier application.
    Tuesday, June 26, 2007 12:21 PM
  • As Jon Lovitz says in A League of Their Own:  Well then, that's better, isn't it.
    Tuesday, June 26, 2007 5:58 PM
  • Thursday, July 5, 2007 10:44 PM
  • Is it possible to create dynamic property for the class?
    I got a query with lots of columns. sometimes i do not know what the result will be.
    Having a general class will be helpful.

    Any suggestion?
    Thursday, August 2, 2007 1:20 AM
  • Dynamic property for which class?  The table?  If you're using a strongly-typed dataset, then no.  If you're using the DataTable class, you can (and will have to, I'm pretty sure) generate its collection of DataColumns dynamically.
    Tuesday, August 7, 2007 5:39 AM