none
Help with Calculating YTD Median RRS feed

  • Question

  • Hi All,

    I am having trouble defining calculation for deriving YTD Median for SSAS 2008.

    I've created a small test cube to try to isolate my problem. There are 10 fact records and with 3 for Jan 2008 and 2 for Feb 2008

    Simply query would look like this:

    WITH 
      MEMBER [Measures].[YTD Med] AS MEDIAN(
    			NonEmpty(
                    		[Fact].[Fact].children *       
                    		PeriodsToDate([Date].[YM].[Year],[Date].[YM].CurrentMember)
    				,[Measures].[Price]  )
    			,[Measures].[Price]
                     )
    SELECT 
     [Measures].[YTD Med] ON 0
     ,{[Date].[Month].&[2008]&[Feb]} ON 1
    FROM [Test];
    Confirmation of correctness:

    Time              : 13 ms
    Calc covers       : 0
    Cells calculated  : 6
    Sonar subcubes    : 2
    NON EMPTYs        : 1
    Autoexists        : 0
    EXISTINGs         : 0
    SE queries        : 1
    Flat cache insert : 1
    Cache hits        : 1
    Cache misses      : 0
    Cache inserts     : 0
    Cache lookups     : 1
    Memory Usage KB   : 0

    Looks like it accesses 6 cells, 5 for each fact value and 1 to actually calculate a median.

    Next I am trying to add something like DateTool with helper dimensions (TimeCalc = {"Current Period" (default), "YTD"} and MathCalc = {"Actual Value" (default), "Median"})
    and added the following into the script:

    (([MathCalc].[Median],[TimeCalc].[YTD]) = 
    
                MEDIAN(NonEmpty(   
                    [Fact].[Fact].children *       
                    [MathCalc].DefaultMember *
                    [TimeCalc].DefaultMember *
                    PeriodsToDate([Date].[YM].[Year],[Date].[YM].CurrentMember)   
                    )
               
            )
    
    );
    The query now becomes:

    SELECT 
     ([Measures].[Price], [MathCalc].[Median], [TimeCalc].[YTD]) ON 0
     ,{[Date].[Month].&[2008]&[Feb]} ON 1
    FROM [Test];
    And statistics seem to be similar to the one posted above:

    Time              : 8 ms
    Calc covers       : 0
    Cells calculated  : 6
    Sonar subcubes    : 2
    NON EMPTYs        : 1
    Autoexists        : 0
    EXISTINGs         : 0
    SE queries        : 1
    Flat cache insert : 1
    Cache hits        : 1
    Cache misses      : 0
    Cache inserts     : 0
    Cache lookups     : 1
    Memory Usage KB   : 0

    Next I break the above script into 2 parts to allow calculating Median and YTD separatly as well as combined:

    (([MathCalc].[Median],[TimeCalc].DefaultMember) = 
                MEDIAN(NonEmpty([Fact].[Fact].children * { [MathCalc].DefaultMember })) 
            );
    
    
    [TimeCalc].[YTD] = Aggregate(
                    [TimeCalc].DefaultMember *
                    PeriodsToDate([Date].[YM].[Year],[Date].[YM].CurrentMember)
                
            );
    This works correctly in SSAS 2005 but in 2008 the result that I get is incorrect, looks like its calculating Median for each Jan 2008 and Feb 2008 and then simply adding them up.

    If I rearrange the script as follows:
    [TimeCalc].[YTD] = Aggregate(
                    [TimeCalc].DefaultMember *
                    PeriodsToDate([Date].[YM].[Year],[Date].[YM].CurrentMember)
                
            );
    
    (([MathCalc].[Median]) = 
                MEDIAN(NonEmpty([Fact].[Fact].children * { [MathCalc].DefaultMember })) 
            );
    and run the same query I get the correct result but total accessed cells increases.

    Time              : 11 ms
    Calc covers       : 0
    Cells calculated  : 16
    Sonar subcubes    : 4
    NON EMPTYs        : 1
    Autoexists        : 0
    EXISTINGs         : 0
    SE queries        : 3
    Flat cache insert : 1
    Cache hits        : 0
    Cache misses      : 3
    Cache inserts     : 1
    Cache lookups     : 3
    Memory Usage KB   : 0

    What is the proper way to do this kind of calculation in SSAS 2008 so that it would be efficient and correct?

    If someone is willing to help me to figure this out I can send me test db as well as the cube.

    Thanks,
    Andrei Rjeousski

    Tuesday, June 16, 2009 4:34 PM

All replies

  • What if, in addition to the separate scope assignments for YTD and Median, you add another for the combination, like you originally had:

    ([TimeCalc].[YTD], [MathCalc].DefaultMember) = Aggregate(
                    [TimeCalc].DefaultMember *
                    PeriodsToDate([Date].[YM].[Year],[Date].[YM].CurrentMember)
               
            );

    ([MathCalc].[Median], [TimeCalc].DefaultMember) =
                MEDIAN(NonEmpty([Fact].[Fact].children * { [MathCalc].DefaultMember })
            );

    ([MathCalc].[Median], [TimeCalc].[YTD]) =

                MEDIAN(NonEmpty(  
                    [Fact].[Fact].children *      
                    [MathCalc].DefaultMember *
                    [TimeCalc].DefaultMember *
                    PeriodsToDate([Date].[YM].[Year],[Date].[YM].CurrentMember)  
                    )
              

    );
    - Deepak
    Tuesday, June 16, 2009 11:51 PM
    Moderator
  • Hi Deepak,

    Thank you for your reply. I am actually trying to understand what changed in the way calculations are done between 2005 and 2008 to understand why the order have changed.

    I've created a query that when ran against AdventureWorks 2005 returns a correct result, but when same query is ran against AdventureWorks 2008 it returns incorrect result. I've looked at the help pages for Calculations order and Aggregate function but cannot see anything showing this change.

    (I know i am using Scenario not for intended purpose but its closely similar to what I am trying to accomplish)

    Query:
    WITH 
    MEMBER [Scenario].[Scenario].[MedInternetSales] AS
    	MEDIAN(existing [Date].[Date].children * [Scenario].[Scenario].DefaultMember,[Measures].[Internet Sales Amount] )
    MEMBER [Scenario].[Scenario].[YTDMedInternetSales] AS 
    	Aggregate(
    		PeriodsToDate([Date].[Calendar].[Calendar Year],[Date].[Calendar].CurrentMember)
    		* [Scenario].[Scenario].[MedInternetSales] 
    	)
    SELECT [Scenario].[Scenario].[YTDMedInternetSales] ON 0,
    {[Date].[Calendar].[Month].&[2004]&[2]} ON 1
    FROM [Adventure Works]
    WHERE [Measures].[Internet Sales Amount]
    AdventureWorks 2005 result:

    	YTDMedInternetSales
    February 2004	$45,436.04

    AdventureWorks 2008 result:
    	YTDMedInternetSales
    February 2004	$91,468.25

    As you can see that in the first case the median is calculated correctly for daily sales over the first 2 months of 2004 and in the second case medians for 
    Jan($43,055.60) and Feb ($48,412.65) are simply added up.

    Any insights on the changes that caused this would be great.

    Thanks,
    Andrei


    Wednesday, June 17, 2009 4:11 PM
  • Hi Andrei,

    You've ordered the calculations such that there is an Aggregate() of a calculated member. BOL for both AS 2005 and AS 2008 state that:

    "The exception to this precedence is the Aggregate function. Calculated members with the Aggregate function have a lower solve order than any intersecting calculated measure."

    Thus it looks like Aggregate() gets applied prior to Median(), even though the Median() member is placed within Aggregate(). However, it's not clear to me what the result of applying this rule should be, so I'm not sure which one (AS 2005 or AS 2008) is correct. You can certainly re-structure the calculation to avoid this Aggregate() solve order swapping issue (see below); but if you're more interested in how Aggregate() should work in this scenario, maybe someone on the group can help.

     

    WITH 
    MEMBER [Scenario].[Scenario].[YTDMedInternetSales] AS 
    MEDIAN(exists([Date].[Date].children,
    PeriodsToDate([Date].[Calendar].[Calendar Year])),
    [Scenario].[Scenario].DefaultMember)
    SELECT [Scenario].[Scenario].[YTDMedInternetSales] ON 0,
    {[Date].[Calendar].[Month].&[2004]&[2]} ON 1
    FROM [Adventure Works]
    WHERE [Measures].[Internet Sales Amount]
    ------------------------------------------
     YTDMedInternetSales
    February 2004 $45,436.04

     

     

     


    - Deepak
    Thursday, June 18, 2009 5:28 AM
    Moderator
  • Hi Deepak,

    Thank you for your reply.
     
    I am indeed very interested in how Aggregate should work in this scenario as I believe the way it worked in AS 2005 allows for more flexibility as well as amount of the code required to implement multiple helper dimensions.

    The way I was using this is as follows:
    SCOPE ([TimeCalc].DefaultMember);
    
        ([MathCalc].[Median] = 
                    MEDIAN(NonEmpty([Fact].[Fact].members * [MathCalc].DefaultMember) )
                );
    
      (MathCalc].[Average] = AVG.....
    
      (MathCalc].[STD DEV] = Stdev ...
    
    End Scope;
    
    

    This would define calculations for the current time period and remains unchanged no matter how many TimeCalculations are required.

    I then had a section of the script dedicated just for the Time Calculatoins and everything was working correctly and quite optimal.
     ([TimeCalc].[YTD]) = Aggregate(
                    PeriodsToDate([Date].[YM].[Year],[Date].[YM].CurrentMember)
                    *
                    [TimeCalc].DefaultMember
            );
    
    
    ([TimeCalc].[QTD]) = Aggregate(
                    PeriodsToDate([Date].[YM].[Quarter],[Date].[YM].CurrentMember)
                    *
                    [TimeCalc].DefaultMember
            );
    
    ([TimeCalc].[MTD]) = Aggregate(
                    PeriodsToDate([Date].[YM].[Month],[Date].[YM].CurrentMember)
                    *
                    [TimeCalc].DefaultMember
            );
    
    ([TimeCalc].[Rolling 12 Month]) = Aggregate(
                    ...
            );
    
    ([TimeCalc].[Rolling 6 Month]) = Aggregate(
                    ...
            );
    If I define it as you've mentioned above, I would have to explicitly define how to calculate median for all time transformation calculations and not just for the current period.

    Similarly if using query like you've provided above, If we need to display both Current Month Median as well as YTD median, we have to explicitly use Median function twice:

    WITH 
    MEMBER [Scenario].[Scenario].[MedInternetSales] AS 
    	MEDIAN(existing [Date].[Date].children * 
    		[Scenario].[Scenario].DefaultMember)	
    MEMBER [Scenario].[Scenario].[YTDMedInternetSales] AS 
    	MEDIAN(exists([Date].[Date].children,
    		PeriodsToDate([Date].[Calendar].[Calendar Year])),
    		[Scenario].[Scenario].DefaultMember)	
    SELECT AddCalculatedMembers([Scenario]) ON 0,
    {[Date].[Calendar].[Month].&[2004]&[2]} ON 1
    FROM [Adventure Works]
    WHERE [Measures].[Internet Sales Amount]

    Any help or explanation on how Aggregate function should work and if it was broken in AS 2005 or is broken in AS 2008 would be great.

    Thanks,
    Andrei Rjeousski
    Thursday, June 18, 2009 1:21 PM
  • Would anyone else have any suggestions? Because of this "bug/feature" a simple script that is quite general and uses about 50 lines, becomes about 1000 lines with alot of duplication and nightmare to maintain?

    Any help or explanation would be great.

    Thanks,
    Andrei Rjeousski

    Thursday, July 2, 2009 1:53 PM
  • Andre;
       I assume the [TimeCalc] is a time utility dimension? I have my own time utility dimension that has the following median calculation:

       ([Invoice Creation Date Calculations].[Invoice Creation Date Calculations].[YTD Median] =
           MEDIAN(  EXISTING({[Invoice Creation Date Calculations].[Invoice Creation Date Calculations].Currentmember.level.members},
                    [Invoice Creation Date].[Calendar Hierarchy].currentmember)
                    )
           );

    Assuming yours is similar, could you not do something like:

    ([TimeCalc].[Median] =
                    MEDIAN(EXISTING({[TimeCalc].[TimeCalc].Currentmember.level.members},
            [TimeCalc].currentmember)
            )
        );
    Thursday, July 2, 2009 3:29 PM
  • Hi David,

    Thank you for your reply.

    I actually have 2 different utility dimensions [TimeCalc] and [MathCalc]

    They are used as following in my AS2005 cube:

    SCOPE ([TimeCalc].DefaultMember);

    SCOPE([Measures].[Price1]);
        ([MathCalc].[Median] =
                    MEDIAN(NonEmpty([Fact].[Fact].members * [MathCalc].DefaultMember) )
                );

      (MathCalc].[Average] = AVG.....

      (MathCalc].[STD DEV] = Stdev ...

    End Scope;
    End Scope;

    that is where I define all calculations on how to calculate the "Math" for each of the measures.

    After that I have quite a generic block of code that defines how to do aggregations for Time untility dimension.

    ([TimeCalc].[YTD]) = Aggregate(
                    PeriodsToDate([Date].[YM].[Year],[Date].[YM].CurrentMember)
                    *
                    [TimeCalc].DefaultMember
            );

    This covers all of calculations including, medians, averages, counts, sums.

    In your example, how do you calculate the YTD itself? The sample your provided seems to only show on how to compute Median.

    Thanks again! I really appriciate it.

    Thursday, July 2, 2009 3:40 PM
  • Andrei;
       Here is the MDX code I am using:

    --Implement Invoice Creation Date Calculations

    --Calendar Hierarchy

    Scope ([Invoice Creation Date].[Calendar Year].[Calendar Year].members ,[Invoice Creation Date].[Calendar Date].members );             

     

    --Prior Year Calculations

    ([Invoice Creation Date Calculations].[Invoice Creation Date Calculations].[Prior Year] =

        (PARALLELPERIOD ([Invoice Creation Date].[Calendar Hierarchy].[Calendar Year], 1,

         [Invoice Creation Date].[Calendar Hierarchy].currentmember ),

         [Invoice Creation Date Calculations].[Invoice Creation Date Calculations].[Current Period])

        );             

    --Prior Period Calculations

     ([Invoice Creation Date Calculations].[Invoice Creation Date Calculations].[Prior Period] =

        IIF (([Invoice Creation Date].[Calendar Hierarchy].currentmember .lag (1),

            [Invoice Creation Date Calculations].[Invoice Creation Date Calculations].[Current Period]) = 0, null ,

            ([Invoice Creation Date].[Calendar Hierarchy].currentmember .lag (1),

            [Invoice Creation Date Calculations].[Invoice Creation Date Calculations].[Current Period]))

      );             

     

    --YTD Calculations

      ([Invoice Creation Date Calculations].[Invoice Creation Date Calculations].[YTD] =

        Aggregate (

                    CrossJoin ({[Invoice Creation Date Calculations].[Invoice Creation Date Calculations].[Current Period]},

                    PeriodsToDate (

                    [Invoice Creation Date].[Calendar Hierarchy].[Calendar Year],

                    [Invoice Creation Date].[Calendar Hierarchy].currentmember ))

                    )

        );             

     

    --YTD Prior Year

      ([Invoice Creation Date Calculations].[Invoice Creation Date Calculations].[YTD Pr Yr] =

        Aggregate (

                    CrossJoin ({[Invoice Creation Date Calculations].[Invoice Creation Date Calculations].[Current Period]},

                    PeriodsToDate (

                    [Invoice Creation Date].[Calendar Hierarchy].[Calendar Year],

                    ParallelPeriod (

                    [Invoice Creation Date].[Calendar Hierarchy].[Calendar Year],1,

                    [Invoice Creation Date].[Calendar Hierarchy].currentmember ))

                    ))

        );             

    --YTD Median

       ([Invoice Creation Date Calculations].[Invoice Creation Date Calculations].[YTD Median] =

           MEDIANEXISTING ({[Invoice Creation Date Calculations].[Invoice Creation Date Calculations].Currentmember .level .members },

                    [Invoice Creation Date].[Calendar Hierarchy].currentmember )

                    )

           );         

    --Prior 12 Month

        ([Invoice Creation Date Calculations].[Invoice Creation Date Calculations].[Prior 12 Month] =

     

            Aggregate (

            {ParallelPeriod (

            [Invoice Creation Date].[Calendar Hierarchy].[Calendar Month],12,

            [Invoice Creation Date].[Calendar Hierarchy].currentmember )

            :[Invoice Creation Date].[Calendar Hierarchy].prevmember }

            * {[Invoice Creation Date Calculations].[Invoice Creation Date Calculations].defaultmember }

        )

     

        );        

     

    End Scope ;     



    This is all based on the excellent white paper by David Shroyer http://www.obs3.com/A%20Different%20Approach%20to%20Time%20Calculations%20in%20SSAS.pdf.

    Dave
    Thursday, July 2, 2009 3:48 PM
  • David,

    Thank you, I will try this implementation and see if I can adopt it to using multiple Utility Dimensions.

    I've actually based alot of my code on that PDF as well, and combined it with concept of DateTool to come up with something even easier to maintain.

    You are using AS2008 correct?

    Thanks,
    Andrei Rjeousski
    Thursday, July 2, 2009 4:13 PM
  • Hi Andrei

    We also have problem with Aggrerate() function on AS2008, but it work correct in AS2005.

    See my tread http://social.msdn.microsoft.com/Forums/en-US/sqlanalysisservices/thread/6d7871ee-89f7-41be-b40c-d7dc718ba30f

    Have you tested query NOT on x64 platform?

    If you found some workaround, please share your thoughts.

    Thanks.
    Oleg.
    OR
    Tuesday, July 14, 2009 5:04 PM
  • Oleg,

    I've installed it on many different servers that are both 64bit and 32bit and all are having the same issue.

    No workaround so far...

    Let me know if you find anything.

    Thanks,
    Andrei Rjeousski
    Monday, July 20, 2009 5:33 PM