DataReader vs DataSet for DropDownList RRS feed

  • Question

  • This is more of a best-practices question than an actual code one. I'm doing a side project just for fun, but I want to do things the right way. It's a Windows Forms application that pulls college football stats from a database. On my main form there is a drop-down list with all the teams pulled from the database. At first I was using a DataReader right in the code-behind for the form itself. But then I thought I should have a data access layer to do all my database transactions and pass DataSets back to the form. Now I've got myself a little confused about what's best. On the one hand, I figured having a data layer and separating my analysis code from my data access code was a good idea. On the other hand, passing the DataSet just for my drop-down list has slowed my form load time, and for populating a drop-down list (i.e., read-only, forward movement of one record at a time) a DataReader seems the obvious choice. Since a DataReader can't be passed between layers and thus has to remain in the form code, what philosophy is best?
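    One common compromise is to keep the DataReader entirely inside the data layer and hand the form a plain list of strings, so the form never touches ADO.NET types at all. A minimal sketch of what such a DAL method might look like (`GetTeamNames` and the `connectionString` field are illustrative names; the query and OleDb provider are taken from the question's code):

    ```csharp
    using System.Collections.Generic;
    using System.Data.OleDb;

    // Hypothetical DAL method: the OleDbDataReader never leaves this layer.
    public List<string> GetTeamNames()
    {
        var teams = new List<string>();
        using (var connection = new OleDbConnection(connectionString))
        using (var command = new OleDbCommand(
            "SELECT teamName FROM tblTeams ORDER BY teamName ASC", connection))
        {
            connection.Open();
            using (OleDbDataReader reader = command.ExecuteReader())
            {
                while (reader.Read())
                    teams.Add(reader.GetString(0));
            }
        } // connection closed here, before the list is handed to the form
        return teams;
    }
    ```

    The form then just loops over the returned list and adds each name to the combo box, so the layering is preserved without paying the DataSet overhead.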

    Here's my current code if that makes any difference:
    team.cs:

     

    private void frmMain_Load(object sender, EventArgs e)
    {
       txtMessages.Text = "opening message to come soon...";
       txtMessages.BackColor = System.Drawing.Color.LightSteelBlue;
       mnuMain.BackColor = System.Drawing.Color.LightSteelBlue;

       sql = "SELECT teamName FROM tblTeams order by teamName ASC";
       ds = dal.executeDataSet(sql);

       for (int i = 0; i < ds.Tables[0].Rows.Count; i++)
       {
          school = (string)ds.Tables[0].Rows[i].ItemArray[0];
          cmbSchool.Items.Add(school);
       }
    }

    datalayer:

     

    public DataSet executeDataSet(string sql)
    {
       ds = new DataSet();
       adapter = new OleDbDataAdapter();
       try
       {
          connection.Open();
          command = connection.CreateCommand();
          command.CommandText = sql;
          adapter.SelectCommand = command;
          adapter.Fill(ds);
       }
       catch (OleDbException)
       {
          // log / rethrow as appropriate
          throw;
       }
       finally
       {
          connection.Close();
       }
       return ds;
    }



    Thanks a bunch for any advice

    Tuesday, April 28, 2009 3:59 AM

Answers

  • When you call dataAdapter.Fill(dataSet), the framework uses a DataReader internally anyway; it all comes down to a call to System.Data.Common.DbDataAdapter.FillInternal:

    private int FillInternal(DataSet dataset, DataTable[] datatables, ...)
    {
        ...
        IDbConnection connection = GetConnection3(this, command, "Fill");
        ...       
        try {
            ...
            using (IDataReader reader = command.ExecuteReader(behavior) ) {
                ...
            }
        }
        ...
    }

    So when you use a DataSet, you use a DataReader plus a lot of other machinery that slows you down.

    That said, a DataReader holds the connection open and is designed to retrieve data quickly (moreover, it's forward-only: you cannot move back). A DataSet, by contrast, is designed to get the data, work with it without keeping an open connection to the database, and then update the database.

    So a DataReader looks like the way to go in your just-fill-the-combobox scenario.

    Here's an example (looking up the ordinals by name keeps you from being tightly coupled to the column ordering in your query):

                int idxTeamId = reader.GetOrdinal("TeamId");
                int idxTeamName = reader.GetOrdinal("TeamName");

                while (reader.Read()) {
                     long teamId = reader.GetInt64(idxTeamId);
                     string teamName = reader.GetString(idxTeamName); // GetString, not GetInt64, for a text column
                     // I guess something like cmbSchool.Items.Add(new Item(teamName, teamId)); here
                }


    • Marked as answer by liurong luo Monday, May 4, 2009 6:34 AM
    Tuesday, April 28, 2009 7:28 AM
  • Thanks Peter A for this wonderful article

    DataReader vs. DataSet Performance Characteristics

    DataReaders are faster than DataSets populated via DataAdapters, right? Well, yes --- and no.

    The DataSet provides useful functionality for (among others):

    • Master-Detail Relationships with integrity checking
    • Client-side computed columns, and aggregations that keep up to date when the underlying data changes
    • Client-side filtering and sorting that doesn’t require round-trips to the Database and stays up to date when the underlying data changes
    • A broad choice of data binding options, including full two-way data binding with Windows Forms, binding to the DataGrid, and other .NET controls
    • Integration with the DataAdapter for automatic database updates
    • Ability to directly read and write XML representations of relational data.
    At some point, however (if it hasn't already), the argument will arise that DataReaders are faster than DataAdapters/DataSets and that you should wean yourself off all those features and use the lean, mean DataReader instead. After all, the DataAdapter.Fill() method actually uses a DataReader internally, so in theory a DataSet can only hope to approach the performance of a DataReader alone.

    However, there are other issues involved here that may not immediately come into focus on a "first pass". In reality, on multi-user systems, a DataReader holds the connection (and any locks) open throughout the duration of your looping logic (e.g., the while (myReader.Read()) { ... } loop). Conversely, DataAdapters/DataSets release their connections and locks as soon as the Fill call completes. Using DataReaders can therefore lead to increased database contention, slower performance, and potentially poor scalability, depending on concurrent traffic under load. Of course, the only way to get a realistic handle on all this is to test one's code in a simulated production environment.
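    One way to limit how long a DataReader ties up a connection, especially when the reader has to be handed back to calling code, is CommandBehavior.CloseConnection, which closes the underlying connection automatically when the reader is closed. A sketch under assumed names (`ExecuteReader` here is an illustrative helper, not part of the thread's code):

    ```csharp
    using System.Data;
    using System.Data.SqlClient;

    // The caller receives an open reader; disposing it also closes the
    // connection, so the contention window lasts only as long as the
    // caller's read loop.
    public static SqlDataReader ExecuteReader(string connectionString, string sql)
    {
        var cn = new SqlConnection(connectionString);
        cn.Open();
        var cmd = new SqlCommand(sql, cn);
        return cmd.ExecuteReader(CommandBehavior.CloseConnection);
    }

    // Caller:
    // using (var reader = ExecuteReader(connStr, "SELECT teamName FROM tblTeams"))
    // {
    //     while (reader.Read()) { /* keep this loop tight; avoid slow work here */ }
    // } // reader and connection both closed here
    ```

    Keeping the read loop tight (no UI updates or slow processing inside it) is what actually minimizes the contention the article warns about.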

    Improving DataAdapter Performance

    You can, however, improve the performance of the DataAdapter so that it approaches the speed of a DataReader. This is done by temporarily turning off some of the DataSet's advanced default functionality during processing. The biggest performance hits during DataSet retrieval are the integrity constraints and the indexing that the DataSet uses to maintain its data internally.

    The code listing below fills a dataset that has two DataTables, Customers and Employees from the Northwind Database. This is done via a DataAdapter.

     

    DataSet ds= new DataSet();
    string strConn="Server=(local);dataBase=Northwind;user id=sa;password=;";
    SqlConnection cn = new SqlConnection(strConn);
    string strSQL="select * from Customers;select * from employees";
    cn.Open();
    SqlDataAdapter da = new SqlDataAdapter( strSQL,cn);
    ds.Tables.Add("Customers");
    ds.Tables.Add("Employees");
    ds.EnforceConstraints =false;
    ds.Tables["Customers"].BeginLoadData();
    da.Fill(ds.Tables["Customers"]);
    ds.Tables["Customers"].EndLoadData();
    ds.Tables["Employees"].BeginLoadData();
    da.Fill(ds.Tables["Employees"]);
    ds.Tables["Employees"].EndLoadData();
    dataGrid1.DataSource=ds.Tables["Customers"];
    dataGrid2.DataSource=ds.Tables["Employees"];
    cn.Close();

     

    Note that the first key line of code sets the EnforceConstraints property of the DataSet to false. This turns off constraint checking and can offer faster overall processing. After the data is retrieved, the setting can be turned back on inside a try/catch block that specifically catches a ConstraintException, so we can still check for constraint violations.
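    The re-enable step described above looks roughly like this (assuming ds is the DataSet filled in the listing):

    ```csharp
    using System;
    using System.Data;

    // Turn constraint checking back on after the fast load. If the loaded
    // data violates any constraint, setting EnforceConstraints = true
    // throws a ConstraintException rather than silently accepting bad rows.
    try
    {
        ds.EnforceConstraints = true;
    }
    catch (ConstraintException ex)
    {
        // The retrieved data violates the DataSet's constraints;
        // handle or log as appropriate.
        Console.WriteLine("Constraint violation: " + ex.Message);
    }
    ```

    Inside the catch, the rows that caused the failure can typically be located via each DataTable's HasErrors/GetErrors members.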

    By the way, establishing a primary key as part of setting up your DataSet schema can add a significant performance overhead.

    We can also temporarily turn off internal DataSet indexing and notifications. Before and after each fill, call the BeginLoadData and EndLoadData methods (shown above), respectively. These methods are members of the DataTable class, so you need to call them on the particular DataTable that you are filling. BeginLoadData turns off notifications, index maintenance, and constraints while loading data. This ain't "rocket science"; it's right there in the MS documentation for the DataTable class.

    The performance improvements that you will achieve with these techniques will vary depending upon the size of the data being retrieved and the number of constraints within your dataset. On the simplest multi-table dataset however, I saw an average performance improvement of some 31 percent and I suspect with complex master-detail relational datasets the performance improvement could be significantly higher.

    Another test I did (just for comparison purposes) was to put together a utility class with two methods: one to convert a DataReader to a DataTable, and a second to convert a DataReader (one with more than one resultset, of course) to a DataSet. The idea was to see whether bringing back a DataReader for speed, and then converting it client-side into the required DataTable or DataSet for binding, say, to a DataGrid, would be worth it. Sadly, the performance was actually slower than the regular DataAdapter.Fill() methods. However, the class still has a number of uses, so I reproduce the code below; it is also included with the test-harness download at the bottom of this article.

     

    using System;
    using System.Data;
    using System.Data.SqlClient;

    namespace PAB.Data.Utils
    {
        public class DataReaderConverters
        {
            /// <summary>
            /// [static] PAB.Data.Utils.DataReaderConverters.ConvertDataReaderToDataSet:
            /// Converts one or more resultsets returned in a SqlDataReader to a DataSet
            /// </summary>
            /// <param name="reader">SqlDataReader</param>
            /// <returns>System.Data.DataSet</returns>
            public static DataSet ConvertDataReaderToDataSet(SqlDataReader reader)
            {
                DataSet dataSet = new DataSet();
                do
                {
                    // Create new data table
                    DataTable schemaTable = reader.GetSchemaTable();
                    DataTable dataTable = new DataTable();

                    if (schemaTable != null)
                    {
                        for (int i = 0; i < schemaTable.Rows.Count; i++)
                        {
                            DataRow dataRow = schemaTable.Rows[i];
                            // Create a column name that is unique in the data table
                            string columnName = (string)dataRow["ColumnName"];
                            // Add the column definition to the data table
                            DataColumn column = new DataColumn(columnName, (Type)dataRow["DataType"]);
                            dataTable.Columns.Add(column);
                        }

                        dataSet.Tables.Add(dataTable);

                        while (reader.Read())
                        {
                            DataRow dataRow = dataTable.NewRow();

                            for (int i = 0; i < reader.FieldCount; i++)
                                dataRow[i] = reader.GetValue(i);

                            dataTable.Rows.Add(dataRow);
                        }
                    }
                    else
                    {
                        // No records returned
                        DataColumn column = new DataColumn("RowsAffected");
                        dataTable.Columns.Add(column);
                        dataSet.Tables.Add(dataTable);
                        DataRow dataRow = dataTable.NewRow();
                        dataRow[0] = reader.RecordsAffected;
                        dataTable.Rows.Add(dataRow);
                    }
                }
                while (reader.NextResult());
                return dataSet;
            }

            /// <summary>
            /// [static] PAB.Data.Utils.DataReaderConverters.ConvertDataReaderToDataTable
            /// converts SqlDataReader to a DataTable
            /// </summary>
            /// <param name="reader">SqlDataReader</param>
            /// <returns>System.Data.DataTable</returns>
            public static DataTable ConvertDataReaderToDataTable(SqlDataReader reader)
            {
                DataTable table = reader.GetSchemaTable();
                DataTable dt = new DataTable();
                DataColumn dc;
                DataRow row;
                System.Collections.ArrayList al = new System.Collections.ArrayList();

                for (int i = 0; i < table.Rows.Count; i++)
                {
                    dc = new DataColumn();
                    if (!dt.Columns.Contains(table.Rows[i]["ColumnName"].ToString()))
                    {
                        dc.ColumnName = table.Rows[i]["ColumnName"].ToString();
                        dc.Unique = Convert.ToBoolean(table.Rows[i]["IsUnique"]);
                        dc.AllowDBNull = Convert.ToBoolean(table.Rows[i]["AllowDBNull"]);
                        dc.ReadOnly = Convert.ToBoolean(table.Rows[i]["IsReadOnly"]);
                        al.Add(dc.ColumnName);
                        dt.Columns.Add(dc);
                    }
                }
                while (reader.Read())
                {
                    row = dt.NewRow();
                    for (int i = 0; i < al.Count; i++)
                    {
                        row[(string)al[i]] = reader[(string)al[i]];
                    }
                    dt.Rows.Add(row);
                }
                return dt;
            }
        } // end class
    } // end namespace

     

    Original article

     

    http://www.eggheadcafe.com/articles/20030205.asp





    Thanks, A.m.a.L | [Remember to click "mark as answered" when you get a correct reply to your question]
    • Marked as answer by liurong luo Monday, May 4, 2009 6:34 AM
    Tuesday, April 28, 2009 7:46 AM
