Answered by:
Aggregation functions and Semi additivity
Question

Hi
I have some troubles these days with the concept of semi additivity related to the aggregation fuctions. i found in many web sites that it means that the mesure can be aggregated with all dimensions expect the time ones (http://www.packtpub.com/article/measuresandmeasuregroupsmicrosoftanalysisservicespart1) when the official docs don't mention this.
Is it true that a semi additive function can't only be associated with a time dimension?
if it can be some thing else than a time dimension, is there a way to specify the dimensions that can not be used for the aggregation?
and what will be the result if we aggregat with a no supported dimension? NULL or a result that is logicaly wrong ?
Thanks
Wednesday, July 21, 2010 9:11 AM
Answers

Ah, because in my example there could be any number of other dimensions. Let's say there was also a Product dimension whose default member is [All]. So, the values in my example are the values summed across all products but for the given time period. If I was to place the Product dimension on columns say, I would get the Sum value, LastNonEmpty and LastChild values for each product.
So in essence, semiadditive measures are summed up all dimension except time.
Yes?
 Marked as answer by BadBoy06000 Friday, July 23, 2010 1:44 PM
Thursday, July 22, 2010 5:39 PM 
No, MIN and MAX are semiadditive. MIN and MAX will show the minimum (or maximum) value for a given period across all other dimensions. It cannot SUM and then take the MIN or MAX 'cos that doesn't make sense.
So, using the following base for AS:
 DROP TABLE fact_simple_test; CREATE TABLE fact_simple_test ( product_code varchar (10), product_category_code varchar (10), sale_date date , amount numeric (24,12) ); INSERT INTO fact_simple_test (product_code, product_category_code, sale_date, amount) VALUES ('P1' , 'Cat1' , '20100601' , 10); INSERT INTO fact_simple_test (product_code, product_category_code, sale_date, amount) VALUES ('P1' , 'Cat1' , '20100602' , 30); INSERT INTO fact_simple_test (product_code, product_category_code, sale_date, amount) VALUES ('P3' , 'Cat2' , '20100601' , 25); INSERT INTO fact_simple_test (product_code, product_category_code, sale_date, amount) VALUES ('P2' , 'Cat1' , '20100601' , 13); INSERT INTO fact_simple_test (product_code, product_category_code, sale_date, amount) VALUES ('P3' , 'Cat2' , '20100601' , 5);  DROP TABLE sale_calendar CREATE TABLE sale_calendar ( sale_date date , sale_month varchar (20), sale_year varchar (20), ); INSERT INTO sale_calendar (sale_date, sale_month, sale_year) VALUES ('20100601' , 'June 2010' , '2010' ); INSERT INTO sale_calendar (sale_date, sale_month, sale_year) VALUES ('20100602' , 'June 2010' , '2010' ); INSERT INTO sale_calendar (sale_date, sale_month, sale_year) VALUES ('20100603' , 'June 2010' , '2010' );
And using MIN as the aggregation function and configuring Sales Calendar as time, we can see that the results for June across all products will show 5 but for June for category 1 will show 10.
All All All All Cat1 Cat1 Cat1 Cat2 Cat2
All P1 P2 P3 All P1 P2 All P3
All All All 5 10 13 5 10 10 13 5 5
All All 20100601 5 10 13 5 10 10 13 5 5
All All 20100602 30 30 (null) (null) 30 30 (null) (null) (null)
All June 2010 All 5 10 13 5 10 10 13 5 5
All June 2010 20100601 5 10 13 5 10 10 13 5 5
All June 2010 20100602 30 30 (null) (null) 30 30 (null) (null) (null)
2010 All All 5 10 13 5 10 10 13 5 5
2010 All 20100601 5 10 13 5 10 10 13 5 5
2010 All 20100602 30 30 (null) (null) 30 30 (null) (null) (null)
2010 June 2010 All 5 10 13 5 10 10 13 5 5
2010 June 2010 20100601 5 10 13 5 10 10 13 5 5
2010 June 2010 20100602 30 30 (null) (null) 30 30 (null) (null) (null)Apologies for the formatting! Haven't got time to fix it...
Please mark answers as correct if you feel I have answered your question.
 Marked as answer by BadBoy06000 Friday, July 23, 2010 1:44 PM
Friday, July 23, 2010 1:28 PM
All replies

Yes, all semiadditive aggregation functions relate to time.
From BOL (2008) :
 Semiadditive

A semiadditive measure can be aggregated along some, but not all, dimensions that are included in the measure group that contains the measure. For example, a measure that represents the quantity available for inventory can be aggregated along a geography dimension to produce a total quantity available for all warehouses, but the measure cannot be aggregated along a time dimension because the measure represents a periodic snapshot of quantities available. Aggregating such a measure along a time dimension would produce incorrect results.
There is a slight extension to this related to the "ByAccount" aggregation function which alters the behaviour of the aggregation function based on an account. The way the behaviour is changed still relates to time, just that the "ByAccount" function derives its behaviour from a dimension defined as an Account dimension. For more information on this see the section in BOL (2008) entitled: Account (Analysis Services  Multidimensional Data)
The following are the SemiAdditive Aggregation Functions :Min
Retrieves the lowest value for all child members.
Max
Retrieves the highest value for all child members.
ByAccount
Calculates the aggregation according to the aggregation function assigned to the account type for a member in an account dimension. If no account type dimension exists in the measure group, treated as the None aggregation function.
For more information about account dimensions, see Account (Analysis Services  Multidimensional Data).
AverageOfChildren
Calculates the average of values for all nonempty child members.
FirstChild
Retrieves the value of the first child member.
LastChild
Retrieves the value of the last child member.
FirstNonEmpty
Retrieves the value of the first nonempty child member.
LastNonEmpty
Retrieves the value of the last nonempty child member.Essentially, measures defined with a semiadditive aggregation function are aggregated based on a dimension set as being of type "time", be it a dimension of "ship dates" or "ordered dates" or "report dates" etc. Clearly you can created calculations using base measures to do anything you want. For example, you could create a calculated member to aggregate over time but use the LastChild function on some other dimension to get the value for all time and some last member in another dimension.
Wednesday, July 21, 2010 1:31 PM 
thanks for your answer
actualy, i ask this question because when i read BOL, it seems like the time dimension was juste an exemple and not the UNIQUE case.
Can u explain me what is the specificity of the semi additvity behavior? when we take for exemple the quantity available mesure, i can get the minimum or maximum for all periods, by years or months, where is the difference with additive ones?
Thursday, July 22, 2010 10:12 AM 
Maybe the little example below demonstrate the behaivior of semiadditive aggregates; here I used distinct count.
Depending on the level of aggregation; here the first on product and the second without a level, you get different results in the sum of distinct count. The sum of the amount instead stays always the same.
Facttable Product Customer Amount Bike Peter 12 Bike Steve 14 Car Peter 223 Car Martin 112 Group by Product Distinct # Customer Sum Amount Bike 2 26 Car 2 335 Total 4 361 Group by none Distinct # Customer Sum Amount 3 361
Olaf Helper  * cogito ergo sum * errare humanum est * quote erat demonstrandum * Wenn ich denke, ist das ein Fehler und das beweise ich täglich http://olafhelper.overblog.deThursday, July 22, 2010 10:48 AM 
thanks olaf, your exemple is realy helpful to see the behavior of an additive and a non additive
But what i can't understand, is the semi additive, like i said before, is the time dimension the UNIQUE dimension that i can't use with semi additive functions and, if it's the case, why for exemple when we take the quantity available mesure(the exemple of philip), i can get the minimum or maximum for all periods, by years or months, isn't this an agregation over time?
Thursday, July 22, 2010 11:40 AM 
Okay, the differences are as follows...
Let's assume a time dimension with 3 days for Jan and 1 day for Feb only and 3 measures whose aggregation functions are set such that the first is additive (sum), the second is semiadditive (LastNonEmpty) and the third is also semiadditive (LastChild). So...
Sum LastNonEmpty LastChild
Year 2010 40 10 10
 Jan 30 10 Null
 01 Jan 20 20 20
 02 Jan 10 10 10
 03 Jan   
 Feb 10 10 10
 01 Feb 10 10 10We can see that for Jan, the Sum is 30 (20 + 10 + Null), LastNonEmpty is 10 (Jan 02 is the last nonempty) and LastChild is Null (Jan 03 is the last child of Jan and its value is Null).
At the year level, the Sum is 40 (Jan + Feb = 30 + 10), LastNonEmpty is 10 (Feb is the last nonempty child of 2010) and LastChild is 10 (Feb is the last child of 2010).Thursday, July 22, 2010 1:16 PM 
It's exactly the point that i want to reach
when we take the two additive functions (sum, count) the agregation result of a level is the sum of the level below (and it works for sum and count). for the semi additive ones the agregation must be redone from the details and, i don't now now if you agree, it is the same thing when we use a time dimension or any other type of dimension.
So why we relate always semi additivity to time dimension when we got exactly the same behavior for the time dimension and for any type of dimensions ?
Thursday, July 22, 2010 1:53 PM 
Ah, because in my example there could be any number of other dimensions. Let's say there was also a Product dimension whose default member is [All]. So, the values in my example are the values summed across all products but for the given time period. If I was to place the Product dimension on columns say, I would get the Sum value, LastNonEmpty and LastChild values for each product.
So in essence, semiadditive measures are summed up all dimension except time.
Yes?
 Marked as answer by BadBoy06000 Friday, July 23, 2010 1:44 PM
Thursday, July 22, 2010 5:39 PM 
YES !! THANK U VERY MUCH
i tested what u said and it s 100% true for average, last, first, first no empty, last no empty: the selected function is used only for the time dimension and for all others the sum function is used
Just a last thing, in my test the min and max are used even for a time dimension (behave as an additive function) just like said in many docs but unlike what said in BOL where they classify min and max with semi additive functions
Is it just an error in BOL, Can you test it please?
Thursday, July 22, 2010 8:29 PM 
No, MIN and MAX are semiadditive. MIN and MAX will show the minimum (or maximum) value for a given period across all other dimensions. It cannot SUM and then take the MIN or MAX 'cos that doesn't make sense.
So, using the following base for AS:
 DROP TABLE fact_simple_test; CREATE TABLE fact_simple_test ( product_code varchar (10), product_category_code varchar (10), sale_date date , amount numeric (24,12) ); INSERT INTO fact_simple_test (product_code, product_category_code, sale_date, amount) VALUES ('P1' , 'Cat1' , '20100601' , 10); INSERT INTO fact_simple_test (product_code, product_category_code, sale_date, amount) VALUES ('P1' , 'Cat1' , '20100602' , 30); INSERT INTO fact_simple_test (product_code, product_category_code, sale_date, amount) VALUES ('P3' , 'Cat2' , '20100601' , 25); INSERT INTO fact_simple_test (product_code, product_category_code, sale_date, amount) VALUES ('P2' , 'Cat1' , '20100601' , 13); INSERT INTO fact_simple_test (product_code, product_category_code, sale_date, amount) VALUES ('P3' , 'Cat2' , '20100601' , 5);  DROP TABLE sale_calendar CREATE TABLE sale_calendar ( sale_date date , sale_month varchar (20), sale_year varchar (20), ); INSERT INTO sale_calendar (sale_date, sale_month, sale_year) VALUES ('20100601' , 'June 2010' , '2010' ); INSERT INTO sale_calendar (sale_date, sale_month, sale_year) VALUES ('20100602' , 'June 2010' , '2010' ); INSERT INTO sale_calendar (sale_date, sale_month, sale_year) VALUES ('20100603' , 'June 2010' , '2010' );
And using MIN as the aggregation function and configuring Sales Calendar as time, we can see that the results for June across all products will show 5 but for June for category 1 will show 10.
All All All All Cat1 Cat1 Cat1 Cat2 Cat2
All P1 P2 P3 All P1 P2 All P3
All All All 5 10 13 5 10 10 13 5 5
All All 20100601 5 10 13 5 10 10 13 5 5
All All 20100602 30 30 (null) (null) 30 30 (null) (null) (null)
All June 2010 All 5 10 13 5 10 10 13 5 5
All June 2010 20100601 5 10 13 5 10 10 13 5 5
All June 2010 20100602 30 30 (null) (null) 30 30 (null) (null) (null)
2010 All All 5 10 13 5 10 10 13 5 5
2010 All 20100601 5 10 13 5 10 10 13 5 5
2010 All 20100602 30 30 (null) (null) 30 30 (null) (null) (null)
2010 June 2010 All 5 10 13 5 10 10 13 5 5
2010 June 2010 20100601 5 10 13 5 10 10 13 5 5
2010 June 2010 20100602 30 30 (null) (null) 30 30 (null) (null) (null)Apologies for the formatting! Haven't got time to fix it...
Please mark answers as correct if you feel I have answered your question.
 Marked as answer by BadBoy06000 Friday, July 23, 2010 1:44 PM
Friday, July 23, 2010 1:28 PM 
Thank you very much for your explanations, they were realy clear
A last(I hope) thing, if i summarise:
* all semi additive use the selected function for the aggregation with time dimensions and sum with all others.
* the Min and Max are particular in the sense that they are used both for the time dimensions and others dimension.
It's that?
Friday, July 23, 2010 1:53 PM 
Yup, you've got it!
The Min and Max is possibly made more clear by looking at the second to last line in the "unformatted" output in my previous post. You can see there that for a single day (i.e. 20100601) the Min value for all products was 5, thus it is the minumum value across all products on that day.
 Proposed as answer by MASNSN Tuesday, September 14, 2010 11:27 PM
Friday, July 23, 2010 3:24 PM 
At last, it took me a while to get a correct and a clear response
thank u for the time that you spent to explain me
Friday, July 23, 2010 4:26 PM