locked
2 Stage Average RRS feed

  • Question

  • I'm trying to do a couple of things here with a measure that should be an average.

    1. Give global values when we go down below the grain that the average should be (see expected results C)

    2. Perform an average in a hierarchy structure, so I group together the fact rows one way and get an average and then if we go to a coarser grain filter then I want to average the previous average. (See expected results B and A)

    I'm clearly approaching this wrong and would be grateful for some experienced guidance on this.

    First, some example data and expected results :

    SchemeParticipant_Id, ComponentPeriod_Id, ParticipantMeasure_Id, ComponentType_Id, weighted_Rank

    1,1,1,1,0.5

    1,1,2,1,0.25

    1,2,3,1,0.5

    1,2,4,1,0.5

    1,3,5,1,0.5

    1,3,6,1,0.5

    1,4,7,2,0.6

    Expected Results A

    SchemeParticipant_Id, avg_AchievementRating_Pcent

    1, 0.758  (i.e (0.916 + 0.6) / 2)

    Expected Results B

    SchemeParticipant_Id, Component_Id, avg_AchievementRating_Pcent

    1,1,0.916

    1,2,0.6

    Expected Results C

    SchemeParticipant_Id, ComponentPeriod_Id, ParticipantMeasure_Id, avg_AchievementRating_Pcent

    1,1,1,0.916

    1,1,2,0.916

    1,2,3,0.916

    1,2,4,0.916

    1,3,5,0.916

    1,3,6,0.916

    1,4,7,0.6

    For the first problem I realise that because I am trying to aggregate at a higher level than the granularity of my fact table, the function I created doesn't work well if the user includes columns that are below the summarize granularity.

    avg_AchievementRating_Pcent = 
    AVERAGEX
    (   summarize
    	(   ParticipantMeasure
    	,	ParticipantMeasure[SchemeParticipant_Id]
    	,	ParticipantMeasure[ComponentParticipant_Id]
    	,	ParticipantMeasure[ComponentType_Id]
    	,	"sum_Achievment_weighted_Rank",sum(ParticipantMeasure[weighted_Rank])
    	)
    ,	[sum_Achievment_weighted_Rank]
    )

    So to fix this I cobbled together this function that I thought might work or at least get me further on the journey.. but I am getting the error on commit "The column 'ComponentType_Id' specified in the SUMMARIZE function was not found in the input table"

    avg_AchievementRating_Pcent = 
    AVERAGEX
    (   summarize
    	(   ALLEXCEPT
    		(	ParticipantMeasure
    		,	ParticipantMeasure[SchemeParticipant_Id]
    		,	ParticipantMeasure[ComponentParticipant_Id]
    		,	ParticipantMeasure[ComponentType_Id]
    		)
    	,	[SchemeParticipant_Id]
    	,	[ComponentParticipant_Id]
    	,	[ComponentType_Id]
    	,	"sum_Achievment_weighted_Rank", sum([weighted_Achievement_Rank_achieved])
    	)
    ,	[sum_Achievment_weighted_Rank]
    )

    I don't get it.. I thought ALLEXCEPT returns the full table and intellisense allows me to pick the field (unqualified as it is coming from a table returning function).. but it wont allow it for some reason

    For the second problem

    I've tried nesting two summarize function to perform the two stage average, but that doesn't work.

    SUMMARIZE
    (
    	summarize
    		(   ParticipantMeasure
    		,	ParticipantMeasure[SchemeParticipant_Id]
    		,	ParticipantMeasure[ComponentParticipant_Id]
    		,	ParticipantMeasure[ComponentType_Id]
    		,	"sum_Achievment_weighted_Rank", sum(ParticipantMeasure[weighted_Achievement_Rank_achieved])
    		)
    	,   [SchemeParticipant_Id]
    	,   [ComponentType_Id]
    	,   "avg_Achievement_weighted_Rank", average([sum_Achievment_weighted_Rank])
    )
    It would appear that the average is not allowed against a summarize added field? is there a workaround for this?

    Thanks for any help!

    Monday, November 23, 2015 1:57 PM

Answers

  • Creating an average of averages is not a good idea see: http://geekswithblogs.net/darrengosbell/archive/2014/07/28/the-perils-of-calculating-an-average-of-averages.aspx

    This requirement nearly always comes out of trying to replicate this mistake from an Excel report. It's simpler, faster and mathematically correct to simply re-calculate the average at the upper levels when using a cube.


    http://darren.gosbell.com - please mark correct answers

    • Proposed as answer by greggyb Tuesday, November 24, 2015 2:22 AM
    • Marked as answer by Charlie Liao Wednesday, December 2, 2015 8:37 AM
    Tuesday, November 24, 2015 12:52 AM
  • Adding on to what Darren has said (and it is very good advice), I'd mention that it seems like you are very clear on what you want your numerator and denominator to be based on your description. Sometimes it is easier to just do a SUM() of whatever fields, with whatever filters you need to apply and divide by a COUNT() or DISTINCTCOUNT() appropriate to create the denominator, like he describes in his article.

    Some general notes about DAX based on the measures you're writing.

    It's considered a best practice to use ADDCOLUMNS(SUMMARIZE()). See this article from the Italians. I have had no problems using an outer SUMMARIZE() around an ADDCOLUMNS(), though this use case is exceedingly rare. I cannot even speak to SUMMARIZE(SUMMARIZE()) as I do not use SUMMARIZE() for extension columns in queries.

    ALLEXCEPT() does not return the fields specified. Additionally, I've found ALLEXCEPT() to be among the quirkiest table expressions in DAX, as there are specific table use-cases you can fill with all other table functions, but not ALLEXCEPT(). I'd have to do a lot of digging to find examples of this. Ultimately, I find it safest to use ALLEXCEPT() only as a filter argument and not in other cases where a table is expected.

    These are just notes about usage; Darren's is the answer you should heed.

    GNet Group BI Consultant

    • Proposed as answer by Charlie Liao Tuesday, November 24, 2015 3:23 AM
    • Marked as answer by Charlie Liao Wednesday, December 2, 2015 8:37 AM
    Tuesday, November 24, 2015 2:22 AM

All replies

  • Creating an average of averages is not a good idea see: http://geekswithblogs.net/darrengosbell/archive/2014/07/28/the-perils-of-calculating-an-average-of-averages.aspx

    This requirement nearly always comes out of trying to replicate this mistake from an Excel report. It's simpler, faster and mathematically correct to simply re-calculate the average at the upper levels when using a cube.


    http://darren.gosbell.com - please mark correct answers

    • Proposed as answer by greggyb Tuesday, November 24, 2015 2:22 AM
    • Marked as answer by Charlie Liao Wednesday, December 2, 2015 8:37 AM
    Tuesday, November 24, 2015 12:52 AM
  • Adding on to what Darren has said (and it is very good advice), I'd mention that it seems like you are very clear on what you want your numerator and denominator to be based on your description. Sometimes it is easier to just do a SUM() of whatever fields, with whatever filters you need to apply and divide by a COUNT() or DISTINCTCOUNT() appropriate to create the denominator, like he describes in his article.

    Some general notes about DAX based on the measures you're writing.

    It's considered a best practice to use ADDCOLUMNS(SUMMARIZE()). See this article from the Italians. I have had no problems using an outer SUMMARIZE() around an ADDCOLUMNS(), though this use case is exceedingly rare. I cannot even speak to SUMMARIZE(SUMMARIZE()) as I do not use SUMMARIZE() for extension columns in queries.

    ALLEXCEPT() does not return the fields specified. Additionally, I've found ALLEXCEPT() to be among the quirkiest table expressions in DAX, as there are specific table use-cases you can fill with all other table functions, but not ALLEXCEPT(). I'd have to do a lot of digging to find examples of this. Ultimately, I find it safest to use ALLEXCEPT() only as a filter argument and not in other cases where a table is expected.

    These are just notes about usage; Darren's is the answer you should heed.

    GNet Group BI Consultant

    • Proposed as answer by Charlie Liao Tuesday, November 24, 2015 3:23 AM
    • Marked as answer by Charlie Liao Wednesday, December 2, 2015 8:37 AM
    Tuesday, November 24, 2015 2:22 AM