Grabbing First or Last Value for Custom Time Period

  • Question

  • First off, I'm very new to SSAS, but what I would like to accomplish seems fairly trivial (I hope).

    I have transactional data which occurs by the second, and I have created two time dimensions. One is the standard time dimension built with the time dimension wizard; the other was generated by my own script, with a record for each second of the day.

    I've linked my fact table to my dimensions and have my cube deployed properly. Within my fact data, one of the attributes is price. Using Min and Max, I have been able to obtain the highest and lowest price that occurred over a specific time period. What I would like to obtain is the first and last price that occurred over a time period. I've tried using "first value" and "last value", but what is being returned is not correct; it seems to be some sort of aggregation.

    So, just for example, if my prices over a one-minute period were:

    1233, 1233.85, 1300, 1250

    First should return 1233 and last should return 1250. This should work at all time periods.

    To add: for my custom time dimension I've tried setting its Type property to "Time", but the results remain the same. I appreciate any suggestions.

    Wednesday, May 23, 2012 5:40 AM

Answers

  • Hello,

    When setting the AggregationFunction property of the price measure to FirstChild or Max, is the price still being summed? If not, your problem may be solved.

    If price analysis is needed below the second, your challenge would be to provide the fact rows with a unique timestamp. If your date dimension is built upon a distinct table, then an additional challenge will be to get this granularity into it. It might be impossible in case you have a server time dimension.
    Depending on the ETL source, there could be many options. If you cannot achieve it in the ETL, you may use named queries or named calculations in the DSV, which use SQL syntax.

    One option would be to add a new column with an integer representing the order of appearance of each fact row within its second. You could then use it as an additional key column in the key column collection for your lowest time level attribute.
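
    A minimal sketch of such a named query for the DSV, reusing the table and column names from the first_value query posted elsewhere in this thread (Date, Time, Price, transactionId); these names are assumptions, not confirmed schema:

      -- Hypothetical DSV named query: numbers each fact row within its
      -- second so the number can serve as an extra key column on the
      -- lowest time attribute. Table and column names are assumed.
      SELECT Date, Time, Price, transactionId,
             ROW_NUMBER() OVER (PARTITION BY Date, Time
                                ORDER BY transactionId) AS SecondSequence
      FROM [mydb].[dbo].[mytable]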

    Philip,



    • Edited by VHteghem_Ph Friday, May 25, 2012 7:01 AM
    • Marked as answer by Davewolfs Wednesday, May 30, 2012 2:46 AM
    Friday, May 25, 2012 6:59 AM

All replies

  • You might want to have a look at

    OpeningPeriod(): http://msdn.microsoft.com/en-us/library/ms145992.aspx

    ClosingPeriod(): http://msdn.microsoft.com/en-us/library/ms145584.aspx


    Please vote as helpful or mark as answer, if it helps
    Cheers, Raunak | t: @raunakjhawar | My Blog

    Wednesday, May 23, 2012 5:46 AM
  • OpeningPeriod() requires that we specify which period to pull the value from. Is there any way to make this generic for all levels in the cube?
    Wednesday, May 23, 2012 7:45 PM
  • Hello,

    Price variation at the minute level may not be the object of analysis, or not of great interest as BI information, and managing it could prove costly in resources. If that is the case, the best option is to fix it in your ETL. You can also flatten the variation in the DSV with named queries or named calculations. In this scenario you should choose the time level at which price variations are considered relevant for your analysis.

    Philip,


    • Edited by VHteghem_Ph Wednesday, May 23, 2012 8:54 PM
    Wednesday, May 23, 2012 8:53 PM
  • I need minute and second level.  How can I obtain the first and last value?
    Wednesday, May 23, 2012 10:10 PM
  • Hello,

    Price variation at the minute level may not be the object of analysis, or not of great interest as BI information, and managing it could prove costly in resources. If that is the case, the best option is to fix it in your ETL. You can also flatten the variation in the DSV with named queries or named calculations. In this scenario you should choose the time level at which price variations are considered relevant for your analysis.

    Philip,



    Not the case.  I need this level of granularity.
    Wednesday, May 23, 2012 10:11 PM
  • I suggest you create two different dimensions:

    One for Date (Year - Quarter - Month - Week - Day)

    One for Time (Hour - Minute - Second)
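
    A rough sketch of a script that could populate such a per-second Time dimension; the table and column names are illustrative, not taken from the thread:

      -- Hypothetical script: one row per second of the day (86,400 rows).
      WITH Seconds AS (
          SELECT TOP (86400)
                 ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1 AS s
          FROM sys.all_objects a CROSS JOIN sys.all_objects b
      )
      SELECT s             AS TimeKey,
             s / 3600      AS [Hour],
             (s / 60) % 60 AS [Minute],
             s % 60        AS [Second]
      INTO dbo.DimTime
      FROM Seconds;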


    Please vote as helpful or mark as answer, if it helps
    Cheers, Raunak | t: @raunakjhawar | My Blog

    Thursday, May 24, 2012 5:41 AM
  • I've done this.  Adding an additional example to clarify what I need.

    SELECT
        NON EMPTY { [Measures].[High], [Measures].[Low],
                    [Measures].[Price Count], [Measures].[Price] } ON COLUMNS,
        NON EMPTY { [Dim Date].[Year -  Week -  Date].[Date].ALLMEMBERS
                    * HEAD([Dim Time].[Hour - Second].[Second].ALLMEMBERS) } ON ROWS
    FROM [Cube]
    

    Note that Price is being summed in some instances.  This occurs when there is more than one record.  How can I grab the "first" record from the underlying records here?

    Thursday, May 24, 2012 6:37 AM
  • Hello,

    If you have several facts per second with different prices, you will have to define a time dimension at the granularity of your facts. You can achieve this by defining the time key to include the fact key (as a column in the key column collection). This key time attribute should be placed at the lowest level of your time hierarchy (below the second level). In case you have a fact count at the second level in your fact table, you could use this count in your time key collection instead of the fact key. You can use any fact column that would ensure the uniqueness of the time dimension key at the fact level.
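
    A minimal sketch of what such a fact-grain time dimension could look like as a DSV named query, assuming a fact table dbo.FactPrices with TimeKey and TransactionId columns (illustrative names, not from the thread):

      -- Hypothetical named query: one time-dimension row per fact.
      -- The pair (TimeKey, TransactionId) forms the key column collection,
      -- making the lowest time attribute unique at fact granularity.
      SELECT f.TimeKey,
             f.TransactionId,
             t.[Hour],
             t.[Minute],
             t.[Second]
      FROM dbo.FactPrices AS f
      JOIN dbo.DimTime AS t
          ON t.TimeKey = f.TimeKey;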

    Philip,



    Thursday, May 24, 2012 7:20 AM
  • Hello,

    If you have several facts per second with different prices, you will have to define a time dimension at the granularity of your facts. You can achieve this by defining the time key to include the fact key (as a column in the key column collection). This key time attribute should be placed at the lowest level of your time hierarchy (below the second level). In case you have a fact count at the second level in your fact table, you could use this count in your time key collection instead of the fact key. You can use any fact column that would ensure the uniqueness of the time dimension key at the fact level.

    Philip,



    Thank you, Philip. This is very helpful. I do have several facts per second. Could you possibly provide an example of carrying this out? I am not 100% sure what you mean by a time dimension at the granularity of the facts.

    Is there a way that we can do something along the lines of first_value() here?

    Thursday, May 24, 2012 7:44 AM
  • What you need is just a Time dimension. The picture snapshot of the result is meaningless.

    The values in the second column are not correct. Why do they start with 1900-01-01, and why do all values have 00:00:00.000?


    Please vote as helpful or mark as answer, if it helps
    Cheers, Raunak | t: @raunakjhawar | My Blog

    Thursday, May 24, 2012 8:06 AM
  • Hello,

    First: I assume your time dimension is a fact dimension; if not, I doubt the source table of the time dimension would include the fact granularity. I also assume that price variation is the object of the analysis, that you have defined it as a measure instead of a slowly changing attribute, and that you have chosen an appropriate AggregationFunction property (Min, Max, FirstChild, LastChild?). One explanation of the aggregated values might be that it is still set to the default "Sum" value.


    You can define the key attribute of the time dimension to have a collection of key columns (source columns).

    In the following example, the postal code attribute of the geography dimension has a collection of KeyColumns, because the postal code of a town in Germany could be the same as that of a town in another country. The goal is to ensure the uniqueness of each attribute.

    In a similar way, you can define your time dimension key to be a collection; choose the source columns such that they ensure the uniqueness of the key.


    Philip,


    Thursday, May 24, 2012 8:30 AM
  • What you need is just a Time dimension. The picture snapshot of the result is meaningless.

    The values in the second column are not correct. Why do they start with 1900-01-01, and why do all values have 00:00:00.000?


    Please vote as helpful or mark as answer, if it helps
    Cheers, Raunak | t: @raunakjhawar | My Blog

    I have a time dimension. The second values are correct; they are all 00:00:00.000 because I am selecting HEAD in my query, and the first second of the day is 00:00:00.000.
    Thursday, May 24, 2012 2:12 PM
  • Hello,

    First: I assume your time dimension is a fact dimension; if not, I doubt the source table of the time dimension would include the fact granularity. I also assume that price variation is the object of the analysis, that you have defined it as a measure instead of a slowly changing attribute, and that you have chosen an appropriate AggregationFunction property (Min, Max, FirstChild, LastChild?). One explanation of the aggregated values might be that it is still set to the default "Sum" value.


    You can define the key attribute of the time dimension to have a collection of key columns (source columns).

    In the following example, the postal code attribute of the geography dimension has a collection of KeyColumns, because the postal code of a town in Germany could be the same as that of a town in another country. The goal is to ensure the uniqueness of each attribute.

    In a similar way, you can define your time dimension key to be a collection; choose the source columns such that they ensure the uniqueness of the key.


    Philip,


    Will FirstChild work with Time dimensions that were not created by the time dimension wizard?

    EDIT: FirstChild does work on records where we do not have multiple records within the same second.

    Also, there is no way to make the record unique other than adding a physical ID here. Databases like Oracle support first_value().

    If a DB uses Date as its lowest level of granularity, one could easily have multiple records here. Are we saying that in those conditions one would also not be able to obtain the actual first value of the underlying record?

    



    • Edited by Davewolfs Thursday, May 24, 2012 2:28 PM
    Thursday, May 24, 2012 2:15 PM
  • Hello,

    If you have several facts per second with different prices, you will have to define a time dimension at the granularity of your facts. You can achieve this by defining the time key to include the fact key (as a column in the key column collection). This key time attribute should be placed at the lowest level of your time hierarchy (below the second level). In case you have a fact count at the second level in your fact table, you could use this count in your time key collection instead of the fact key. You can use any fact column that would ensure the uniqueness of the time dimension key at the fact level.

    Philip,



    Can you explain how one places the key at the lowest level of the time hierarchy? Also, does the second have to be part of the same table? There could be instances where hundreds of transactions occur within the same second.
    Thursday, May 24, 2012 2:52 PM
  • If it helps: the following query works in SQL Server (FIRST_VALUE requires SQL Server 2012 or later).

      SELECT Date, Time, Price,
             FIRST_VALUE(Price) OVER (PARTITION BY Date, Time
                                      ORDER BY transactionId) AS first_price
      FROM [mydb].[dbo].[mytable]


    • Edited by Davewolfs Thursday, May 24, 2012 3:19 PM
    Thursday, May 24, 2012 3:18 PM
  • Hello,

    When setting the AggregationFunction property of the price measure to FirstChild or Max, is the price still being summed? If not, your problem may be solved.

    If price analysis is needed below the second, your challenge would be to provide the fact rows with a unique timestamp. If your date dimension is built upon a distinct table, then an additional challenge will be to get this granularity into it. It might be impossible in case you have a server time dimension.
    Depending on the ETL source, there could be many options. If you cannot achieve it in the ETL, you may use named queries or named calculations in the DSV, which use SQL syntax.

    One option would be to add a new column with an integer representing the order of appearance of each fact row within its second. You could then use it as an additional key column in the key column collection for your lowest time level attribute.

    Philip,



    • Edited by VHteghem_Ph Friday, May 25, 2012 7:01 AM
    • Marked as answer by Davewolfs Wednesday, May 30, 2012 2:46 AM
    Friday, May 25, 2012 6:59 AM