Get DataTable async huge performance issue

Question
-
User-157143658 posted
Hi,
I need to implement a performant way to return a DataTable asynchronously from my Web API 2 project.
My old synchronous method, shown below, runs against a 300,000-row table in around 800 ms:
public IHttpActionResult TestData()
{
    string sql = "select * from test_async";
    using (SqlConnection conn = new SqlConnection("<CONNECTION STRING>"))
    {
        conn.Open();
        DataTable table = new DataTable();
        using (SqlCommand cmd = new SqlCommand(sql, conn))
        using (SqlDataAdapter adapter = new SqlDataAdapter(cmd))
        {
            adapter.Fill(table);
        }
    }
    return Ok();
}
This method uses Adapter.Fill(), which has no async version.
I found an alternative way to get the data using the SqlDataReader class, which supports async methods, but it takes about 3000 ms to complete the same task (4 times slower!):
public async Task<IHttpActionResult> TestDataAsync()
{
    string sql = "select * from test_async";
    using (SqlConnection conn = new SqlConnection("<CONNECTION STRING>"))
    {
        await conn.OpenAsync();
        DataTable table = new DataTable();
        SqlCommand cmd = new SqlCommand(sql, conn);
        using (var reader = await cmd.ExecuteReaderAsync().ConfigureAwait(false))
        {
            DataTable schema = reader.GetSchemaTable();
            if (schema != null)
            {
                foreach (DataRow drow in schema.Rows)
                {
                    string columnName = System.Convert.ToString(drow["ColumnName"]);
                    DataColumn column = new DataColumn(columnName, (Type)(drow["DataType"]));
                    table.Columns.Add(column);
                }
            }
            while (await reader.ReadAsync().ConfigureAwait(false))
            {
                var dataRow = table.NewRow();
                for (int i = 0; i < table.Columns.Count; i++)
                {
                    dataRow[i] = reader[i];
                }
                table.Rows.Add(dataRow);
            }
        }
    }
    return Ok();
}
In this test I use a test_async table created with this T-SQL script:
create table test_async(
    id int primary key,
    name text
)

DECLARE @i int = 0
while @i < 300000
begin
    insert into test_async(id, name)
    select @i, 'Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco ' + cast(@i as VARCHAR)
    set @i += 1
end
Can anyone tell me how to improve the performance of the async version?
I do not want to consider hints like "use Task.Run" as a solution, because I don't want to lose the benefits of async.
Thanks!
Wednesday, February 6, 2019 2:36 PM
All replies
-
User753101303 posted
Hi,
You could perhaps try https://referencesource.microsoft.com/ to see what Fill does behind the scenes and build an async version.
AFAIK async is NOT about pure speed but rather scalability. The idea is to not wait on I/O so that the processor can do something else. I'm not an expert and have never measured it, but IMHO it is perfectly possible that async code is a bit slower under a light load (I agree 4x seems a lot) while increasing the server's ability to perform well under heavy load.
Also, in a web API it's common to use paging. Returning 300,000 rows seems like something that should really not happen often, if at all. Is this really needed for your scenario?
Wednesday, February 6, 2019 2:52 PM -
User-157143658 posted
Hi,
In my Web API project I never return 300,000 rows; I created this example with that amount of rows only to demonstrate the performance issue.
I can accept somewhat slower performance as the cost of scalability, but 4 times slower is really too much.
I'm sure there's a better way to do it, so I want to wait for an expert's answer.
Thanks
Wednesday, February 6, 2019 2:57 PM -
User1120430333 posted
Yes, using a DataTable is a performance issue, no doubt.
https://dzone.com/articles/reasons-move-datatables
https://www.codingblocks.net/programming/boxing-and-unboxing-7-deadly-sins/
http://lauteikkehn.blogspot.com/2012/03/datatable-vs-list.html
You should learn how to use a DataReader, which is the fastest way of reading data, using a forward-only cursor.
https://docs.microsoft.com/en-us/dotnet/framework/data/adonet/retrieving-data-using-a-datareader
In this example you can see the DataReader using column names from the database table.
https://www.akadia.com/services/dotnet_data_reader.html
You can use a custom object like a DTO that is populated in the reader loop; each DTO is added to a collection, and the collection of DTOs is then used.
https://en.wikipedia.org/wiki/Data_transfer_object
create new List<DTO> for the DTOs
execute the T-SQL
reader loop:
    create new DTO
    populate DTO from reader
    add DTO to the DTO collection
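The outline above might look something like this (a sketch only; the TestRow DTO and the connection-string parameter are assumptions, with column names taken from the test_async table in the question):

```csharp
using System.Collections.Generic;
using System.Data.SqlClient;
using System.Threading.Tasks;

// Hypothetical DTO matching the test_async table (id int, name text).
public class TestRow
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class TestRowRepository
{
    public static async Task<List<TestRow>> GetRowsAsync(string connectionString)
    {
        var rows = new List<TestRow>();
        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand("select id, name from test_async", conn))
        {
            await conn.OpenAsync().ConfigureAwait(false);
            using (var reader = await cmd.ExecuteReaderAsync().ConfigureAwait(false))
            {
                // Resolve column ordinals once, outside the loop.
                int idOrdinal = reader.GetOrdinal("id");
                int nameOrdinal = reader.GetOrdinal("name");
                while (await reader.ReadAsync().ConfigureAwait(false))
                {
                    rows.Add(new TestRow
                    {
                        Id = reader.GetInt32(idOrdinal),
                        Name = reader.GetString(nameOrdinal)
                    });
                }
            }
        }
        return rows;
    }
}
```

Using the typed GetInt32/GetString accessors with precomputed ordinals avoids the boxing and name lookups of the reader[i] indexer, which is part of what the DataTable-vs-List articles above are getting at.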
Wednesday, February 6, 2019 4:46 PM -
User-893317190 posted
Hi bathoo22,
Since you are dealing with so much data, it is recommended that you use pagination to reduce the size of the data.
Divide the data into several pages by page number, and request only one page at a time.
It is easy to page using Entity Framework:
https://stackoverflow.com/questions/33316277/pagination-in-mvc5
But if you want to use ADO.NET, you can do the paging in SQL:
https://stackoverflow.com/questions/109232/what-is-the-best-way-to-paginate-results-in-sql-server
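For the ADO.NET route, a minimal sketch of SQL Server paging with OFFSET/FETCH (available since SQL Server 2012; the page and pageSize parameters are illustrative, and DataTable.Load reads the rows synchronously after the async ExecuteReaderAsync call):

```csharp
using System.Data;
using System.Data.SqlClient;
using System.Threading.Tasks;

public static class PagedQuery
{
    // Returns one page of test_async; page is 1-based.
    public static async Task<DataTable> GetPageAsync(string connectionString, int page, int pageSize)
    {
        const string sql = @"
            select id, name
            from test_async
            order by id
            offset @Offset rows fetch next @PageSize rows only;";

        var table = new DataTable();
        using (var conn = new SqlConnection(connectionString))
        using (var cmd = new SqlCommand(sql, conn))
        {
            cmd.Parameters.AddWithValue("@Offset", (page - 1) * pageSize);
            cmd.Parameters.AddWithValue("@PageSize", pageSize);
            await conn.OpenAsync().ConfigureAwait(false);
            using (var reader = await cmd.ExecuteReaderAsync().ConfigureAwait(false))
            {
                // Load builds the columns from the reader's schema automatically.
                table.Load(reader);
            }
        }
        return table;
    }
}
```

With a small pageSize the synchronous tail of Load is short, so the expensive parts (opening the connection, executing the query) still happen asynchronously.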
Best regards,
Ackerly Xu
Thursday, February 7, 2019 7:28 AM -
User-474980206 posted
You may want to try .BeginLoadData(), .LoadDataRow(), and .EndLoadData().
Or you could switch to Dapper, which has much better performance than DataTables.
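A sketch of how BeginLoadData/LoadDataRow/EndLoadData might be combined with the async reader from the question (untested against the poster's table; the FillAsync name is my own, and the actual speedup will depend on the data):

```csharp
using System;
using System.Data;
using System.Data.SqlClient;
using System.Threading.Tasks;

public static class DataTableExtensions
{
    public static async Task<DataTable> FillAsync(SqlCommand cmd)
    {
        var table = new DataTable();
        using (var reader = await cmd.ExecuteReaderAsync().ConfigureAwait(false))
        {
            // Build the columns from the reader's metadata (simpler than GetSchemaTable).
            for (int i = 0; i < reader.FieldCount; i++)
                table.Columns.Add(reader.GetName(i), reader.GetFieldType(i));

            var values = new object[reader.FieldCount];
            table.BeginLoadData(); // suspends notifications, index maintenance, constraints
            try
            {
                while (await reader.ReadAsync().ConfigureAwait(false))
                {
                    reader.GetValues(values); // fetch the whole row in one call
                    table.LoadDataRow(values, LoadOption.OverwriteChanges);
                }
            }
            finally
            {
                table.EndLoadData(); // resumes and rebuilds indexes once
            }
        }
        return table;
    }
}
```

GetValues into a reused object[] also avoids the per-cell reader[i] indexer calls of the original loop, which is likely part of the measured 4x gap.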
Tuesday, February 12, 2019 12:01 AM