DataSets in large-scale multithreaded services and applications

  • Question

  • In the "Thread Safety" section of MSDN's documentation for the dataset object, it states that "This type is safe for multithreaded read operations. You must synchronize any write operations.". http://msdn2.microsoft.com/en-us/library/system.data.dataset.aspx


    We have also discovered that writes require an exclusive lock: no data can be read while any other data is being written.

    I am wondering if there are any thread-safe implementations of a DataSet-like object that will allow simultaneous read/write operations on multiple threads.

    We have a large-scale multithreaded service that uses a centralized DataSet to keep several hundred instances of the client application synced with up-to-the-second information. Single-threading the write operations on this DataSet has become unacceptably slow.
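    For reference, our current pattern boils down to something like the sketch below. It is heavily simplified, and the class, table, and method names are just illustrative, but it shows why everything funnels through one lock:

    ```
    using System.Data;

    // Simplified sketch of our current design (identifiers are illustrative).
    // Every writer takes the same lock, so writes are fully serialized, and
    // readers must take it too because writes need exclusive access.
    static class CentralStore
    {
        static readonly DataSet central = new DataSet("Central");
        static readonly object gate = new object();

        public static void ApplyUpdate(string table, object[] rowValues)
        {
            lock (gate)   // single global lock: this is the bottleneck
            {
                central.Tables[table].Rows.Add(rowValues);
            }
        }

        public static DataRow[] Read(string table, string filter)
        {
            lock (gate)   // reads also queue up behind writers
            {
                return central.Tables[table].Select(filter);
            }
        }
    }
    ```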

    Thursday, December 20, 2007 1:00 AM

Answers

  • Constraints, including DRI (declarative referential integrity) constraints, belong to the logical design, while indexes belong to the physical design; the two are used together but are really separate. What I am saying is that if an application has all these problems in its relational model, then it is a design problem. DataSets, DataReaders, and threads can only help a well-designed application, so the OP needs to post at the architecture forums, both here and at ASP.NET. The reason is that a good relational designer could determine the upper- and lower-bound cardinality of all the tables.

    Tuesday, December 25, 2007 4:18 PM

All replies

  • Try using a relational database backend as the session data store.

    Thursday, December 20, 2007 8:42 AM
  • I don't know why you are using thread locks; good practice uses low-level transaction locks for write operations with the DataSet, while read operations go through a separate DataReader whose results are usually cached. This makes your application scalable, but it requires good knowledge of the ASP.NET architecture. Hope this helps.
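    I read this suggestion as: let the database's own transaction locking synchronize the writes, and serve reads from results materialized off a DataReader and then cached. A minimal sketch under that assumption follows; the connection string is a placeholder, and in ASP.NET you would likely cache in the Cache object rather than a plain dictionary:

    ```
    using System.Collections.Generic;
    using System.Data;
    using System.Data.SqlClient;

    static class DbAccess
    {
        const string ConnStr = "...";  // placeholder connection string
        static readonly Dictionary<string, DataTable> cache =
            new Dictionary<string, DataTable>();
        static readonly object cacheGate = new object();

        // Writes: let the database's transaction locks do the synchronization.
        public static void Write(string sql)
        {
            using (SqlConnection conn = new SqlConnection(ConnStr))
            {
                conn.Open();
                using (SqlTransaction tx = conn.BeginTransaction())
                using (SqlCommand cmd = new SqlCommand(sql, conn, tx))
                {
                    cmd.ExecuteNonQuery();
                    tx.Commit();
                }
            }
        }

        // Reads: drain the DataReader into a DataTable and cache the result.
        public static DataTable Read(string sql)
        {
            lock (cacheGate)
            {
                DataTable hit;
                if (cache.TryGetValue(sql, out hit)) return hit;
            }
            DataTable result = new DataTable();
            using (SqlConnection conn = new SqlConnection(ConnStr))
            {
                conn.Open();
                using (SqlCommand cmd = new SqlCommand(sql, conn))
                using (IDataReader reader = cmd.ExecuteReader())
                {
                    result.Load(reader);  // DataTable.Load consumes the reader
                }
            }
            lock (cacheGate) { cache[sql] = result; }
            return result;
        }
    }
    ```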


    Monday, December 24, 2007 3:00 PM
  • When you encounter a limitation like this, it's generally a good idea to try to understand why the limitation exists, because that will guide you toward possible solutions.


    A simple example: uniqueness constraints require synchronized write operations because there needs to be a single point in time, between the beginning of the write operation and its conclusion, at which the DataTable can evaluate the proposed change to see whether it would violate a constraint.
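    To make that concrete, here is a rough illustration (the table and column are made up): with a unique constraint in place, the uniqueness check and the insert have to appear as one atomic step, which is exactly what synchronizing the write provides:

    ```
    using System.Data;

    class UniqueWriteExample
    {
        readonly DataTable table = new DataTable("Accounts");
        readonly object gate = new object();

        public UniqueWriteExample()
        {
            DataColumn id = table.Columns.Add("Id", typeof(int));
            table.Constraints.Add(new UniqueConstraint(id));
        }

        // Without the lock, two threads could both pass the uniqueness check
        // before either insert lands in the table's internal index.
        public void AddAccount(int accountId)
        {
            lock (gate)
            {
                table.Rows.Add(accountId);  // constraint evaluated inside this window
            }
        }
    }
    ```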


    Consider the other things that the DataTable is doing during a write operation - allocating memory from the heap, cascading updates, updating the internal hashtables that it uses for indexing - and it's clear that there are a lot of places where unsynchronized write operations could cause race conditions and create data consistency problems of a sort that the DataSet is designed to prevent.


    In fact, if you drill down into the DataTable in the debugger, you'll see that it's maintaining pages of DataRows.  If it's maintaining pages, then it's also doing page splits when there's not enough space in a page to hold a new row.  While it's doing a page split, the actual in-memory location of one or more DataRows will change.  I don't know for certain, but it makes a lot of sense that this operation would require an exclusive lock:  otherwise, the location of a DataRow could be changed by the writing thread while the reading thread was accessing it, and that would be a bad thing.


    So the question then becomes:  do you need all of this?  Your multithreaded application that's doing all of these write operations:  do these write operations inherently need to check constraints and allocate memory?  Maybe the answer's no.  Maybe your application is updating items, not inserting or deleting them, and it's updating columns that don't have constraints attached to them.


    If that (or something like it) is the case, the answer may be to build a different data structure for your application to interoperate with, one that is not going to have any logical problems with simultaneous multithreaded write operations.
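    As a sketch of what such a structure might look like (purely hypothetical, assuming an update-mostly workload keyed by an ID; ReaderWriterLockSlim is new in .NET 3.5, and the older ReaderWriterLock behaves similarly on 2.0):

    ```
    using System.Collections.Generic;
    using System.Threading;

    // Hypothetical replacement structure: many concurrent readers, brief
    // exclusive writes, and none of the DataSet's constraint or index machinery.
    class ConcurrentItemStore<TKey, TValue>
    {
        readonly Dictionary<TKey, TValue> items = new Dictionary<TKey, TValue>();
        readonly ReaderWriterLockSlim rw = new ReaderWriterLockSlim();

        public bool TryGet(TKey key, out TValue value)
        {
            rw.EnterReadLock();   // readers do not block one another
            try { return items.TryGetValue(key, out value); }
            finally { rw.ExitReadLock(); }
        }

        public void Update(TKey key, TValue value)
        {
            rw.EnterWriteLock();  // short exclusive window per update
            try { items[key] = value; }
            finally { rw.ExitWriteLock(); }
        }
    }
    ```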


    Another possibility is to create a DataSet on each thread for caching writes, and have a separate operation on its own thread that periodically moves the cached writes into the centralized DataSet.  This will allow the write operations to terminate quickly (so that the writing threads aren't blocking each other) and minimize the amount of time that the centralized DataSet is locked (as you'll only have one thread, not hundreds, that's trying to write to it).
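    Something along these lines (a sketch only; the names and the flush interval are illustrative, not a tested implementation):

    ```
    using System.Collections.Generic;
    using System.Data;
    using System.Threading;

    class BatchedWriter
    {
        readonly DataSet central;          // the shared, centralized DataSet
        readonly object centralGate = new object();
        readonly Queue<object[]> pending = new Queue<object[]>();
        readonly object pendingGate = new object();
        readonly string tableName;
        readonly Timer timer;              // held in a field so it is not collected

        public BatchedWriter(DataSet central, string tableName)
        {
            this.central = central;
            this.tableName = tableName;
            timer = new Timer(Flush, null, 250, 250);  // merge every 250 ms (tunable)
        }

        // Called by the many writer threads: cheap, returns immediately.
        public void QueueWrite(object[] rowValues)
        {
            lock (pendingGate) pending.Enqueue(rowValues);
        }

        // Runs on the timer thread: the only code that locks the central DataSet.
        void Flush(object state)
        {
            object[][] batch;
            lock (pendingGate)
            {
                if (pending.Count == 0) return;
                batch = pending.ToArray();
                pending.Clear();
            }
            lock (centralGate)
            {
                DataTable table = central.Tables[tableName];
                foreach (object[] values in batch)
                    table.Rows.Add(values);
            }
        }
    }
    ```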

    Monday, December 24, 2007 10:30 PM