locked
Activity costing database and bussiness logic design. RRS feed

  • Question

  • Hi,

    I was wondering peoples opinion on the best way to tackel my problem.

    I have a database table activity and table resource

    e.g.

    Activity ID    Activity Name    cost/hour

        1          an activity               10

        2          a n other activity     N/A

     

    Resource ID    Resource Name    cost/hour

        1                    an activity           10

     

     


    Cost of an activity can be calculated in two ways

    either

    User selects an activity and enters I need x times activity y with productivity of z activities per hour giving cost

    x/z * (cost) y

    10/5 = 2 hours -> 2 * 10 = £20

    or

    User selects an activity and says resource x will do activity y for z hours

    (cost) x * z

    10 * 10 = £100

    So I have my data and I ahve my bussiness logic

    Question 1 is it a good idea to mix activities that have a fixed cost and are calculated by productivity in the same table as activities that have no price and are perfromed by resoures with a cost per hour?

    Next question should I store the calculates values in SQL or calculate them on the fly evey time a quote is opened in the system?

    If I should store them in the DB how should I structure my tables?

    If not arguments why I should calculate on the fly.

    Cheers D

    Thursday, January 28, 2010 7:54 PM

Answers

  • Hi Dave,

    Tables should usually be a logical grouping of attributes that relate to the same entity. Fixed cost activities and variable cost activities will probably share a lot of attributes in your data model, so the logical thing to do is to have one table.

    Now, if the cost/hour is the only attribute that is unique to variable cost activities, I would store it in a nullable column in the table. If it's NULL, the activity is fixed cost; otherwise it's variable cost and the value in the column is the hourly price. On the other hand, if you have a whole lot of attributes that are relevant for varying cost activities only, you should consider using a second table for just these attributes, This "Variable cost activities" table when then be related (o or 1) to 1 to the main "Activities" table. And you could of course use a similar "Fixed cost activities" table if you have several attributes that are only relevant when the cost is fixed. This structure is usually refered to as a subtype/supertype structure.

    For the second question, my answer is that, as a rule of thumb, calculated values should not be stored in the database. The main problem is that you are essentially storing duplicate data, and you also have the risk that you at one point get conflicting data in your database (if due to some bug, the calculated column is not changed when it should be). Another problem is that you are wasting disk space by storing redundant data, and extra disk space means that more I/O (already the bottleneck of most database applications) is required to do the same amount of work.

    The main exceptions to this rule are:
    a) when the data can't be caluclated everytime - a "last modified date" is a great example of this; it is a calculated column but you have to store it since you can only calculate it when data is modified and you can't repeat the calculation later.
    b) when the calculation is too time consuming - suppose, for example, that my bank still has the complete transaction history of my checking account since I opened it, twenty or so years ago, then they could calculate the current balance from that history. But the processing speed of ATM transactions would probably suffer immmensely, so they probably store my current balance in their database even though it is a calculated column.

    If you really do need to store calculated data, then SQL Server offers calculated columns and indexed views as two possible ways to do that without having to code everything yourself. But as I said - the best option is usually not to store the calculation results.
    -- Hugo Kornelis, SQL Server MVP
    • Marked as answer by DaveC1982 Friday, January 29, 2010 10:11 AM
    Thursday, January 28, 2010 11:34 PM
  • Hi Dave,

    Neither a, nor b. ;)

    I would remove the calculations from the C# program and implement them in SQL Server instead, embedded in a view. In most cases, this kind of calculation is simple enough that SQL Server can handle without needing the power of C#. If you have some weird formula's that are very computationally intensive, then my advice would change to creating the calculation as a seperate C# component, including that in your database using SQLCLR, and then creating the view in SQL Server. In both cases, you get to keep one single source for the data, without risks of conflicting answers to the same questions.

    As to the history of changes, there are several methods for maintaining a change history that are easily found with your favorite search engine. My favorite method is to have a change log table for each table, with the same structure plus an additional timestamp column to identify when the change took place. However, this method works for keeping a log of changes to the base data used in the calculation. If there is also a possibility that the calculation logic itself changes, then you'll have to look to different methods - and in some cases, recording and storing the outcome of the old formula before implementing the new one is the best option.
    -- Hugo Kornelis, SQL Server MVP
    • Marked as answer by DaveC1982 Monday, February 8, 2010 2:14 PM
    Friday, January 29, 2010 4:16 PM

All replies

  • Hi Dave,

    Tables should usually be a logical grouping of attributes that relate to the same entity. Fixed cost activities and variable cost activities will probably share a lot of attributes in your data model, so the logical thing to do is to have one table.

    Now, if the cost/hour is the only attribute that is unique to variable cost activities, I would store it in a nullable column in the table. If it's NULL, the activity is fixed cost; otherwise it's variable cost and the value in the column is the hourly price. On the other hand, if you have a whole lot of attributes that are relevant for varying cost activities only, you should consider using a second table for just these attributes, This "Variable cost activities" table when then be related (o or 1) to 1 to the main "Activities" table. And you could of course use a similar "Fixed cost activities" table if you have several attributes that are only relevant when the cost is fixed. This structure is usually refered to as a subtype/supertype structure.

    For the second question, my answer is that, as a rule of thumb, calculated values should not be stored in the database. The main problem is that you are essentially storing duplicate data, and you also have the risk that you at one point get conflicting data in your database (if due to some bug, the calculated column is not changed when it should be). Another problem is that you are wasting disk space by storing redundant data, and extra disk space means that more I/O (already the bottleneck of most database applications) is required to do the same amount of work.

    The main exceptions to this rule are:
    a) when the data can't be caluclated everytime - a "last modified date" is a great example of this; it is a calculated column but you have to store it since you can only calculate it when data is modified and you can't repeat the calculation later.
    b) when the calculation is too time consuming - suppose, for example, that my bank still has the complete transaction history of my checking account since I opened it, twenty or so years ago, then they could calculate the current balance from that history. But the processing speed of ATM transactions would probably suffer immmensely, so they probably store my current balance in their database even though it is a calculated column.

    If you really do need to store calculated data, then SQL Server offers calculated columns and indexed views as two possible ways to do that without having to code everything yourself. But as I said - the best option is usually not to store the calculation results.
    -- Hugo Kornelis, SQL Server MVP
    • Marked as answer by DaveC1982 Friday, January 29, 2010 10:11 AM
    Thursday, January 28, 2010 11:34 PM
  • Hi,

    Very clear answer. I have a couple of aditional requirements. I need to store all changes to an activity costing. Now I can store these snapshots as references to the values.

    All my calclulations would then be perfromed by my c# bussiness logic.

    What is best practice in reporting on my data?

    Should I a) make my c# store denormalised data at timely intervals?
                 b) duplicate my bussiness logic into SQL so that I can run reports against the database?

    Cheers D
    Friday, January 29, 2010 10:11 AM
  • Hi Dave,

    Neither a, nor b. ;)

    I would remove the calculations from the C# program and implement them in SQL Server instead, embedded in a view. In most cases, this kind of calculation is simple enough that SQL Server can handle without needing the power of C#. If you have some weird formula's that are very computationally intensive, then my advice would change to creating the calculation as a seperate C# component, including that in your database using SQLCLR, and then creating the view in SQL Server. In both cases, you get to keep one single source for the data, without risks of conflicting answers to the same questions.

    As to the history of changes, there are several methods for maintaining a change history that are easily found with your favorite search engine. My favorite method is to have a change log table for each table, with the same structure plus an additional timestamp column to identify when the change took place. However, this method works for keeping a log of changes to the base data used in the calculation. If there is also a possibility that the calculation logic itself changes, then you'll have to look to different methods - and in some cases, recording and storing the outcome of the old formula before implementing the new one is the best option.
    -- Hugo Kornelis, SQL Server MVP
    • Marked as answer by DaveC1982 Monday, February 8, 2010 2:14 PM
    Friday, January 29, 2010 4:16 PM