SQL Server Developer Center > SQL Server Forums > SQL Server Analysis Services > Modeling: "Time dependant dimension hierarchies"
Ask a questionAsk a question
 

AnswerModeling: "Time dependant dimension hierarchies"

  • Tuesday, November 03, 2009 2:12 PMEduard R Users MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     
    Hello all,

    Let's say I have an Orders fact table like:
    Time         ProductId    Value
    20090101  Prd1            10
    20090101  Prd2            5
    20090201  Prd1            3
    20090301  Prd1            6
    20090401  Prd2            4

    The ProductCategory table looks like:
    PrdId     Category    StartDate   EndDate
    Prd1       A               20090101  20090228
    Prd1       B               20090301  99991231
    Prd2       B               20090101  99991231


    What I want is a cube showing the sum of Value that could be drilldown by Category->PrdId and Year->Month hierarchies

    For example at "Year, Category" and "Month, Category" the data from the cube should be:

    Year        Category     Value
    2009        A                13 (10+3)
    2009        B                15 (5+6+4)

    Month      Category     Value
    January    A                10
    January    B                5 
    Febrary    A                3
    Febrary    B                0 
    March      A                0 
    March      B                10 (6+4)

    How can I do that with SSAS 2008? I don't know how to "connect" the time dimension with the product dimension. How can I tell to SSAS that the drill up from ProductIds to Categories is time-dependant and it depends on the StartDate and EndDate columns of the ProductCategory table?

    I hope the example is clear enough...

    Thanks in advance,
    Eduard.

Answers

  • Tuesday, November 03, 2009 2:52 PMJustin A Rush Users MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     Answer
    You will want a surrogate key on the productCategory table and you will want that key to be an extra column on your fact table.  I prefer to do this at the physical level and have my ETL process populate the surrogate keys on the fact tables, but you can do this at the DSV level if you want.

    Replace the ProductCategory table with a named query that looks something like:
    select cast(startdate as varchar(20)) + cast(enddate as varchar(20)) + prdid + category as ProductKey, Category from ProductCategory

    Replace your fact table with a named query that joins to the productcategory table to create the same key:
    select o.time,o.value, cast(startdate as varchar(20)) + cast(enddate as varchar(20)) + prdid + category as ProductKey
    from orders o inner join ProductCategory pc on o.productid = pc.prdid and o.time between pc.startdate and pc.enddate


    Obviously replace with the correct syntax...
    • Marked As Answer byEduard R Tuesday, November 03, 2009 3:06 PM
    • Marked As Answer byEduard R Tuesday, November 03, 2009 3:20 PM
    • Unmarked As Answer byEduard R Tuesday, November 03, 2009 3:20 PM
    •  
  • Tuesday, November 03, 2009 2:56 PMMarkWojo Users MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     Answer
    Eduard,
    SSAS really likes data models that follow the Kimball methodology.  In which case, you'll want to use a surrogate key to identify your product rows uniquely.  It would look like this:

    Productkey   PrdId     Category    StartDate   EndDate
    1                 Prd1       A               20090101  20090228
    2                 Prd1       B               20090301  99991231
    3                 Prd2       B               20090101  99991231

    Your fact table will look something like this:
    Time         Productkey    Value
    20090101  1                  10
    20090101  2                  5
    20090201  3                  3
    20090301  1                  6
    20090401  2                  4

    In this way, the multiple prd1 rows will link up to the correct fact row for the appropriate product category.
    Once this is done, you should model these relationships in the DSV in your project.  When you develop the cube, you will bring in PrdID and Category as attributes in the product dimension.  Additionally, you can create User hierarchies to define the drilldown path that you want Category -->Prdid, etc.  You do not need to, but it can be helpful to your users to make it simpler to choose attributes in a sensible drilldown.

    The above is assuming you have a time dimension table to link up to as well. 
    hope that helps,
    mark
    • Marked As Answer byEduard R Tuesday, November 03, 2009 3:06 PM
    •  

All Replies

  • Tuesday, November 03, 2009 2:52 PMJustin A Rush Users MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     Answer
    You will want a surrogate key on the productCategory table and you will want that key to be an extra column on your fact table.  I prefer to do this at the physical level and have my ETL process populate the surrogate keys on the fact tables, but you can do this at the DSV level if you want.

    Replace the ProductCategory table with a named query that looks something like:
    select cast(startdate as varchar(20)) + cast(enddate as varchar(20)) + prdid + category as ProductKey, Category from ProductCategory

    Replace your fact table with a named query that joins to the productcategory table to create the same key:
    select o.time,o.value, cast(startdate as varchar(20)) + cast(enddate as varchar(20)) + prdid + category as ProductKey
    from orders o inner join ProductCategory pc on o.productid = pc.prdid and o.time between pc.startdate and pc.enddate


    Obviously replace with the correct syntax...
    • Marked As Answer byEduard R Tuesday, November 03, 2009 3:06 PM
    • Marked As Answer byEduard R Tuesday, November 03, 2009 3:20 PM
    • Unmarked As Answer byEduard R Tuesday, November 03, 2009 3:20 PM
    •  
  • Tuesday, November 03, 2009 2:56 PMMarkWojo Users MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     Answer
    Eduard,
    SSAS really likes data models that follow the Kimball methodology.  In which case, you'll want to use a surrogate key to identify your product rows uniquely.  It would look like this:

    Productkey   PrdId     Category    StartDate   EndDate
    1                 Prd1       A               20090101  20090228
    2                 Prd1       B               20090301  99991231
    3                 Prd2       B               20090101  99991231

    Your fact table will look something like this:
    Time         Productkey    Value
    20090101  1                  10
    20090101  2                  5
    20090201  3                  3
    20090301  1                  6
    20090401  2                  4

    In this way, the multiple prd1 rows will link up to the correct fact row for the appropriate product category.
    Once this is done, you should model these relationships in the DSV in your project.  When you develop the cube, you will bring in PrdID and Category as attributes in the product dimension.  Additionally, you can create User hierarchies to define the drilldown path that you want Category -->Prdid, etc.  You do not need to, but it can be helpful to your users to make it simpler to choose attributes in a sensible drilldown.

    The above is assuming you have a time dimension table to link up to as well. 
    hope that helps,
    mark
    • Marked As Answer byEduard R Tuesday, November 03, 2009 3:06 PM
    •  
  • Tuesday, November 03, 2009 3:20 PMEduard R Users MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     

    Thanks for your responses!

    In case we added more levels in the product hierarchy (i mean something like product->category->supercategory->supersupercategory->..., all with time dependencies) what should be your aproach?
    Create a big dimension table with one subrogated key for each diferent "expanded" relationship that could exist in a given time?

    In case the product hierarchy was modeled as a time dependant parent-child realtionship, the approach would be to create first a flat table?  With this approach the number of levels should be set in advance... Is there any other aproach for avoid number of levels constraints?

    Thanks,

  • Tuesday, November 03, 2009 4:30 PMJustin A Rush Users MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     
    Hm, it depends on exactly what you are talking about... would each level of the hierarchy be dependent on a different start and end date?  If so, then you might be better off creating seperate dimensions for them.  But if they are all based on the same start and end date, then just go with the original approach of populating a single surrogate key in the fact table and then expose product, category, supercategory, supersupercategory as attributes of the dimension and set them up in a hierarchy.

    I am not really sure what your second question is asking... could you expand?
  • Tuesday, November 03, 2009 4:53 PMEduard R Users MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     
    Hm, it depends on exactly what you are talking about... would each level of the hierarchy be dependent on a different start and end date?  If so, then you might be better off creating seperate dimensions for them.  But if they are all based on the same start and end date, then just go with the original approach of populating a single surrogate key in the fact table and then expose product, category, supercategory, supersupercategory as attributes of the dimension and set them up in a hierarchy.
    Ok. To expose product, category and supercategory as diferent dimensions it means the fact table should contain the surrogated key for each
    dimension, isn't it? Is it posible using this approach to provide to the user the hierarchy product->category->supercategory when browsing the data? I'm thinking the user is using excel to browse an SSAS cube...

    I am not really sure what your second question is asking... could you expand?

    I would try to set an example. Let be a product hierarchy table like

    PrdKey   ParentPrdKey  StartDate   EndDate
    Prd1        PrdA            20090101  20090131
    Prd2        PrdA            20090101  99991231
    Prd1        PrdB            20090201  99991231
    PrdA        PrdI             20090101  99991231
    PrdI         Prdalfa         20090101  99991231

    Some metaexplanation here: Within this example I use 4 levels: the lowest one is Prd1,2,3,4 etc... The second one are PrdA,B,C,D.... The third one is PrdI,II,III,IV, and the last one is Prdalfa, beta, gamma,...

    As before we have a Sales fact table at Prd1,2,3,4,... level.

    The product hierarchy is Prdalfa->PrdI->PrdA->Prd1.

    Instead of having a fix number of levels (4 in this example) I would like to have as many levels as the table contains, without a priori constraints...

    Thanks

     

  • Tuesday, November 03, 2009 8:49 PMMarkWojo Users MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     

    Eduard,

    I would recommend adding a column to your product dimension for every attribute that you wish to analyze by.  To have 10 columns in a product table (Category, Brand, Product line, Super-category, sub-category, etc.) is not uncommon.  The advantage to this design is that all of the product related fields are held together.  Alternatively, you could have a mini-dimension for each attribute, but this could be confusing to navigate in the front end tools and there is really only a benefit if the mini-dimension is shared by other fact tables.

    Mark

  • Wednesday, November 04, 2009 2:08 PMEduard R Users MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     
    The only problem I see at first glance, is that if we add too many attributes in the dimension table instead of a slow changing dimension we would have a "fast" changing dimension... I guess it is not a good idea to have one natural key with several thousand diferents surrogates keys.

    For instance imagine I have 10000 products. Products belong to 100 subcategories (100 products each). Subcategories belong to 20 categories (5 each category) and each category belong to 3 super-categories (10 on A,6 on B,4 on C).
    Let's say one category change from belonging to super-category B to belonging to super-category C.
    This small change in one single record in the original OLTP system would result in creating about 500 new surrogated keys (1category=5 subcategories=500 products).

    It could result in a big and "always growing" dimension table.






  • Wednesday, November 04, 2009 9:03 PMMarkWojo Users MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     
    Eduard,
    You are correct that this could create large dimensional growth.  I think it is more of common practice to use Type 1 changes in a product table and type 2 in a customer table.  This would depend on your specific business of course.  In my business, we sell the same items we have been selling for the past 20 years with additions and deletions each year.  Product categories do change, but it is always a reclassification rather than the attributes of the product changing causing a change in its category. 
    In the example that you give above where 500 products change at once, this sounds more like the category or classification is being refined rather than the products actually changing. 

    You will know best if it is worth tracking changes on the product table.  On the customer side, everyone wants to know who the sales rep was or if the customer changed location or store size or something like that. 
    Does that make sense?
    Mark