locked
Issue with dimension and partition processing RRS feed

  • Question

  • Let me start from the Cube structure:

    1. DIMENSION:
      • D1 with attributes:

    D1_id (primary key)

    D2_id (foreign key)

    • D2 with attributes:

    D2_id (primary key)

    1. MEASURE GROUP connection made to :
      • D1, relationship  type :regular  (D1_id)
      • D2 relationship  type: referenced  (D2_id)

    Measure group include 12 partitions (monthly ones)

    CUBE PROCESSING METHOD:

    1) ProcessUpdate (dimension)

    2) ProcessFull (last partition)

    3) ProcessIndexe (all partitions except the last one)

    Dimension D1 has been updated (D2_id was modified) and after processing we can see:

    • Correct D2_id  if we use attribute from D1 dimension
    • Correct D2_id for last partition ale old one D2_id for historical partitions if we use attribute from D2 dimension

    Can anyone explain to me why historical partitions show wrong D2_id if D1_id hasn’t changed?

    Tuesday, November 13, 2012 7:58 PM

Answers

  • This is one of those issues with Reference type relationships that cause me to always avoid Reference type relationships. To improve query performance, you probably have the Materialize checkbox selected in the Define Relationship dialog box. What this does is use the source relationships AS THEY EXIST AT THAT TIME to resolve the reference to the D2_ID field and this value is stored in the leaf level of the measure group partition. This explains why you have the wrong value when querying historical partitions.

    There are three ways to correct.

    1. Uncheck the Materialize box in the Define Relationship box and incur the performance hit due to that. If the cardinality of D1 is low, this performance hit is not that great and this may be the best option in your particular application.
    2. When the relationship between D1 and D2 changes, always perform a ProcessData or a ProcessFull operation on all measure groups that are related to D1.
    3. Or, my usual approach, resolve the relationship between D1 and D2 in a source view and use a Regular relationship so that levels can be appropriately named in the by role in the user presentation layer. (Another reason that role playing dimensions are not always a good choice.)

    The option that you choose depends on your specific application and needs.

    HTH, Martin


    http://martinmason.wordpress.com


    • Edited by Martin Mason Wednesday, November 14, 2012 1:36 AM
    • Marked as answer by Elvis Long Monday, November 19, 2012 7:03 AM
    Wednesday, November 14, 2012 1:35 AM
  • Dimension D1 has been updated (D2_id was modified) and after processing we can see:

    • Correct D2_id  if we use attribute from D1 dimension
    • Correct D2_id for last partition ale old one D2_id for historical partitions if we use attribute from D2 dimension

    Can anyone explain to me why historical partitions show wrong D2_id if D1_id hasn’t changed?

    If you have a "materialised" reference dimension what happens is that the D2_id effectively gets stored in your partition at processing time. If this is how you have your relationship configured you would need to reprocess all your historical partitions if you change the D2_Id in your D1 dimension.

    The other option is to change the relationship to non-materialized. This will then resolve the D1_Id to D2_Id relationship at query time (resulting in correct data, but much slower query speed)


    http://darren.gosbell.com - please mark correct answers

    • Proposed as answer by Darren GosbellMVP Wednesday, November 14, 2012 7:27 PM
    • Marked as answer by Elvis Long Monday, November 19, 2012 7:03 AM
    Wednesday, November 14, 2012 1:37 AM

All replies

  • This is one of those issues with Reference type relationships that cause me to always avoid Reference type relationships. To improve query performance, you probably have the Materialize checkbox selected in the Define Relationship dialog box. What this does is use the source relationships AS THEY EXIST AT THAT TIME to resolve the reference to the D2_ID field and this value is stored in the leaf level of the measure group partition. This explains why you have the wrong value when querying historical partitions.

    There are three ways to correct.

    1. Uncheck the Materialize box in the Define Relationship box and incur the performance hit due to that. If the cardinality of D1 is low, this performance hit is not that great and this may be the best option in your particular application.
    2. When the relationship between D1 and D2 changes, always perform a ProcessData or a ProcessFull operation on all measure groups that are related to D1.
    3. Or, my usual approach, resolve the relationship between D1 and D2 in a source view and use a Regular relationship so that levels can be appropriately named in the by role in the user presentation layer. (Another reason that role playing dimensions are not always a good choice.)

    The option that you choose depends on your specific application and needs.

    HTH, Martin


    http://martinmason.wordpress.com


    • Edited by Martin Mason Wednesday, November 14, 2012 1:36 AM
    • Marked as answer by Elvis Long Monday, November 19, 2012 7:03 AM
    Wednesday, November 14, 2012 1:35 AM
  • Dimension D1 has been updated (D2_id was modified) and after processing we can see:

    • Correct D2_id  if we use attribute from D1 dimension
    • Correct D2_id for last partition ale old one D2_id for historical partitions if we use attribute from D2 dimension

    Can anyone explain to me why historical partitions show wrong D2_id if D1_id hasn’t changed?

    If you have a "materialised" reference dimension what happens is that the D2_id effectively gets stored in your partition at processing time. If this is how you have your relationship configured you would need to reprocess all your historical partitions if you change the D2_Id in your D1 dimension.

    The other option is to change the relationship to non-materialized. This will then resolve the D1_Id to D2_Id relationship at query time (resulting in correct data, but much slower query speed)


    http://darren.gosbell.com - please mark correct answers

    • Proposed as answer by Darren GosbellMVP Wednesday, November 14, 2012 7:27 PM
    • Marked as answer by Elvis Long Monday, November 19, 2012 7:03 AM
    Wednesday, November 14, 2012 1:37 AM
  • Your answers were really helpful. We've unchecked the Materialize box and it works.
    Thanks a lot!

    Wednesday, November 14, 2012 2:00 PM