Designing a data base application using Windows forms and System.Data.DataSet RRS feed

  • Question

  • Hello everyone,

    I need help! I've been working on an application for a while using .Net and ADO.NET 2.0. There are 2 people on the project and when we started (almost a year ago now). Both of us hadn't worked in C# prior to starting the project and I, personally, had never programmed in C#. 

    Originally, we had business objects that wrapped up DataRows. Because we didn't want any restrictions on when you could load objects and save them, we had one DataTable per DataRow. We ignored the relations in the DataSet and through painful implementation tasks, we implemented our own way to navigate relationships between data. We quickly (not quickly enough mind you) ran into issues with memory and have since (mostly) abandoned this design.

    About the application: 
    The application is meant for auctioning stock. It handles the input of the stock (inventory), the sale of stock, and the generation of cheques and invoices. We have 56 tables in our database. A large portion of the tables are related to cheques invoices and charges. The rest of the tables are allocated to the Stock (stuff to be sold) and groupings of stock.  For example, a Stock can be in multiple types of groups, but only one of each type. These relationships are mapped via foriegn keys.
    Now for my questions:

    Some questions:
     How many data sets should we have? I don't really want an exact number here (unless the answer is    1). We've toyed with the idea of having one data set per Form, but this seems to pose a problem when it comes to implementing business rules (not to mention queries). It seems like a real shame that the dataset type is baked into all of the table adapters. The one dataset per form idea seems like it would take care of a couple issues. It would allow us to use foriegn key relationships but not necessarily be forced to load data that the particular form we're on doesn't care about.

    If we use only one dataset, we either have to make foreign key relationships just 'Relations' in the dataset or we risk being forced to bring in a bunch of data that we don't really need in order to satisfy those foriegn key relationships in the dataset. It also seems desirable that at when using a dataset on a form we would want some foriegn key relations to be 'ON' and some to be 'OFF'. And we would want this configurable on a per-form basis. There doesn't seem to be an easy way to do that with a dataset.

    If i create two DataRows, d1 and d2. d1 has a foriegn key to d2. Both have RowState as 'Added'. If I save d2, its PK will get updated and d1 will no longer have a valid FK. How can I get around this?

    I know this is a lot. Hopefully it is at least a little clear. If anyone knows of any web resources or books that might help me, that would be great.

    Thanks a lot,


    Thursday, February 5, 2009 8:23 PM

All replies

  • Jordan, I certainly would not use one dataset per form...  I would turn around the application and build your application data model that is optimized for retrieval and search operation instead of building your application data model around a RDBMS that is normalized and optimized for storage.  My view, a database normalization was developed during a period when storage devices were expensive and slow application performance is acceptable.  Applications were built around a database.  But we do not have to follow this model anymore.  Take a look at Google's BigTable concept.  Modern transactional applications should use database only as a persistance layer and manage application data model around the application design...

    Now to answer you question, I would recommend reading Model Driven Architecture [http://www.omg.org/mda/] and www.codeplex.com/apparch before reengineer your application.  I have also published an application architecture using .NET 3.5 at www.gajakannan.com/netarch.aspx, but most of its elements could be easily built using .NET 2.0 features...  Hope this helps.

    { Gaja; }
    Pl mark if answer solves your problem | Visit http://gajakannan.com/netarch.aspx for .net ref Arch
    Friday, February 6, 2009 4:35 AM
  • Hello Gaja,

    Thanks for your post. 

    Based on what i've read, It looks like we were closer the first time when we were wrapping DataRows in our own Business objects. We were just running into problems (memory wise) because DataRows when mapped one-to-one with datatables were taking up a lot of memory.

    So, using the terms provided in the AppArch link, we had Custom objects that (more or less) corresponded to database rows (though this was not always the case. For example some objects were built out of two datarows in two separate tables). The relationships between objects were maintained by our C# code.

    For example:
    We have a Transactions (which could be an invoice or a cheque) and we have Charges (debits or credits) that appear on transactions. In the database this represented by a Transaction table and, possibly several Charge tables. The charge tables had a forign key to the transaction table. In code, a Transaction object had a list of charges. Does this approach make sense?

    Where we ran into problems:

    Because some of the relationships between data spanned many tables, we had trouble determining when to stop loading data. Example: We have stock in our system. Stock can be in several types of groups. SaleGroups, ReleaseGroups, Lots. The meaning is not important here really. Just know that these are types of groups that stock can be in. If we load a single stock, it's likely that we will want to know what groups it is in. So we load the groups. Uh oh, maybe we want to know all the stock in each of these groups too. Since it is the case that if stock A is in the same SaleGroup as stock B, they are not neccessarily in the same Release. Given that, We could run into a situation where we load far more than we actually want to (and indeed see a performance hit as a result). So, my question is, do I rely on heuristics to decide what to load and when, or do I give my upper layers control over what is loaded? 
    What is generally done in this case?

    We also ran into problems because we were being really dumb about how we loaded objects and relied on really generic ways to load data. For example, if we were loading a bunch of transactions (based on some criteria), we would first load the transaction records, loop through them and load the charges that referenced those transactions. I think we really should have loaded the charges by the same criteria as we used to load transactions (this would cut down on the amount of calls to the database).

    I don't know what kind of answers I want here. Maybe if someone could just comment on the approaches I am taking and offer suggestions on some of the problems I have outlined.


    Friday, February 6, 2009 3:12 PM
  • I would load objects I know would be required frequently and defer the rest.  This is where knowledge of the domain is helpful. 

    Less frequently used or more expensive resources can be lazy loaded by default.  You could also supply a way to override that behaviour so you can optimize for your requirements in future.

    The following link describes the lazy load design pattern;


    Pl mark as answer or helpful if you found this useful
    Monday, February 9, 2009 2:08 PM
  • I agree with GMoore on this and would recommend using a ORM like NHibernate or Entity Framework for your DAL...

    An example of Lazy load using entity framework for your case could be something like this...

    // Explicitly include the SalesGroups from the database in single call.
    this.MyDataGrid2.DataSource = entityFrameworkContext.Stock.Include("SalesGroups").First().SalesGroups;

    // Explicitly load the SalesGroups after retrieving the Stock.

    var Stock = entityFrameworkContext.Stock.First();


    this.MyDataGrid2.DataSource = customer.SalesGroups;

    { Gaja; }
    Pl mark if answer solves your problem | Visit http://gajakannan.com/netarch.aspx for .net ref Arch
    Tuesday, February 10, 2009 3:22 AM
  • Hi,

    Thanks for your responses. Lazy load is something that we've tried before with limited success. We found that in order to use this, we needed to have accessors of the form, IsBlahLoaded. We found that this cluttered up the code. We needed these accessors because it was crutial to avoid database calls at certain time and we wanted to be able to control when we where going to the database.

    I was wondering what you guys think of this approach. We have a layer which maps our database tables into a set of business objects implemented in c# code with classes. Each business object has a data object that it operates on. The data object is basically the primitive values that were retrieved from the database and there is not necessarily a one to one mapping between a data object and a row in the database. 

    In order to head off our loading problem, we thought it best to have multiple business objects which could operate on a data object of the same type. So working with the examples I was talking about earlier, we could have a StockData object for which TransactionStock and GroupStock cold operate on. These stock objects will implement a common interface (possibly). So if we loaded StockData and chose to create a TransactionStock out of it, we would also load the transactions that a stock is on as well as all of the charges related to that stock. We wouldn't need to load groups or any other of the objects that stock may be related to. From TransactionStock, the user would be able to navigate the transaction related relationships. Does this sound like an ok approach? Hopefully this is clear. I'd appreciate any comments. 

    Thanks a lot.

    Thursday, February 12, 2009 3:50 PM