Asked by:
Design of Fact Table
Question

Hello,
I am relatively new to BI and am wondering the pros and cons of a particular design of a fact table and associated dimensions. In the first case below the fact table would have one row with fields T, N, M for which the codes would come from a dimension (the Tis etc just demonstrate the idea  would normally be an integer key in place).
In the second case I have two dimensions  one for the Criteria (T,N, or M) and another table for the actual Criteria values. This approach would require numerous rows for each ID  again the fact table would just be populated with keys from the two dimensions rather than the actual values seen below.
Is either of these just plain wrong or are both valid but one is better than the other?
Thanks kindly for any consideration.
 Moved by ArthurZ Friday, March 21, 2014 3:02 PM Does not appear SSIS related
All replies

Hi,
Well I think this is more on OLTP kind of problem then OLAP one.
Few things one needs to considered.
1. Will there be any other/Additional value needs to be added in future? (Except T,M,N)
2. What will be value if ID 1234 is not related to N or M?
3. What is the relation between (T and Tis) or (N and N3)? Can it linked to One to many?
4. If it's Dimension model, you can combine small dimensions into one dimension.
Hope this will help to get your understanding clear..!!!
Nimesh

WHat I would consider is if the Criteria values are fixed and has no possibility to add /modify in future I would go for first option. Also it should be consistent across ID values too.
If criteria can vary betweenm ID values and also number can vary over time then I would go for second approach as that would be more scalable.
Please Mark This As Answer if it helps to solve the issue Visakh  http://visakhm.blogspot.com/ https://www.facebook.com/VmBlogs

Hello Nimesh,
Thank you for the thoughts. Please see my replied below.
1. Will there be any other/Additional value needs to be added in future? (Except T,M,N)
There will initially be a set of values including T,M,N (more may be added in the future depending on what new things international cancer research comes up with but for now these are pretty static)
2. What will be value if ID 1234 is not related to N or M?
There a whole bunch of other measurements that are taking in addition to N, M,and N. Typically, one sees a fact table as narrow but long. In this case since there are many different dimensions for staging a cancer the fact table could be 10 columns wide AND long.
3. What is the relation between (T and Tis) or (N and N3)? Can it linked to One to many?
Tis only one possible value for the measurement T. There could be T1, Tx, T2 etc etc. These are all just types of T measurements. Same for N and M. One can say that if a patient has T of Tis, N of N3 and M of M0 then their cancer stage is III.
4. If it's Dimension model, you can combine small dimensions into one dimension.
Do you have a specific example of this in mind or I can look at on a website?

Hello Visakh,
I noted to Nimesh that the set of measurements (N, T, M, and some others) is pretty static over time. It's only when international cancer research bodies determine a new measurement and institutions adopt it will it possibly be added to our measurements. That is not very frequent.
I also see one row in a fact table as ONE transaction. If I have the table like in the first example anyone could understand one row as the one staging of one cancer patient. With multiple rows it no longer follows the model of one row for one transaction. I think I will go with the first example unless I find a disadvantage at sometime in the future.
Thank you for your kind thoughts.