none
fact table design question RRS feed

  • Question

  •  

    I have created a factSales table with dimDate, dimCustomer, dimProduct, dimSalesPerson tables.  The dimensions are all joined with surrogate integer identity PK fields which serve as the composite key in the fact table.

     

    It is possible for the same customer to place multiple orders for the same product from the same sales person on the same date.  When this happens it seems to me that only the last order will be stored in the fact table.  I want to have a row for each order.  How would one design a fact table to accomplish this.  The only truly unique piece of data from the OLTP is the sales order number.

     

    factSales:

    PK DateKey, int

    PK CustomerKey, int

    PK ProductKey, int

    PK SalesPersonKey, int

    Amount

    UnitCost

    Weight

    ShippingCost

     

    dimDate

    PK DateKey, int, identity

    SalesDate

     

    dimCustomer

    PK CustomerKey, int, identity

    CustomerName

    Address

     

    dimProduct

    PK ProductKey, int, identity

    Product#

    ProductName

     

    DimSalesPerson

    PK SalesPersonKey, int, idnetity

    SalesPersonName

    Dept

    StartDate

    EndDate

     

    Wednesday, August 22, 2007 12:18 PM

Answers

  • Hi,

    You'll need to stick another key on the fact to make each row unique.  Even if it is something simple like a count

    e.g.

    FK DateKey, int

    FK CustomerKey, int

    FK ProductKey, int

    FK SalesPersonKey, int

    FK OrderNumberOfTheDay, int

    measures ...

     

    If possible a time stamp might be another way, I am assuming though that the different orders happen at different times of the day.  But basically another key in your ETL process will fix that problem.

     

    Hope that helps,

    Matt

    Wednesday, August 22, 2007 1:00 PM

All replies

  • Hi,

    You'll need to stick another key on the fact to make each row unique.  Even if it is something simple like a count

    e.g.

    FK DateKey, int

    FK CustomerKey, int

    FK ProductKey, int

    FK SalesPersonKey, int

    FK OrderNumberOfTheDay, int

    measures ...

     

    If possible a time stamp might be another way, I am assuming though that the different orders happen at different times of the day.  But basically another key in your ETL process will fix that problem.

     

    Hope that helps,

    Matt

    Wednesday, August 22, 2007 1:00 PM
  • Thanks Matt.  I was going to just add the sales order # as it is unique but I didn't want to violate and conventions that might cause problems down the line. 

     

    Thanks again!

     

    Wednesday, August 22, 2007 1:39 PM
  • Hi John

     

    Matt is right in that you should simply bring through the Sales Order Number then your DSV design will take care of the aggregation

     

    Alternatively, unless you need to analyse the data by the Sales Order Number, (which some might argue you should use your OLTP for) you could aggregate the facts in your fact table ETL load process.  This can increase performance both in processing the cube and queries if you have a lot of data

     

    If you want to use any drillthrough functionality then keep the Sales Order Number in.

     

    HTH

     

    Tim

    Wednesday, August 22, 2007 1:46 PM
  • Thanks Tim.

     

    I had also considered that possibility.  My actual OLTP/OLAP is more complex than I showed (I tried to keep it simple for this example).  I would have to join several dozen records in order to aggregate the data and I may have a sales order record which does not yet have a ship date which will  be the key date slicer.  I can load my fact table with a pointer to a 0 date key or just skip SO records without a ship date (my next task to figure out); it would be very complex to aggregate them.  Plus I was already storing the SO# in my fact table on the likely probability the users will want to drill back to the original table.

     

    Wednesday, August 22, 2007 2:36 PM
  • Hey Matt,

     

    Can you please tell me where are these fact tables created and stored in PerformancePoint Server 2007?

     

    Thanks

     

    Mils

     

    Monday, October 15, 2007 11:01 AM
  • Hi,

     

    It's an area I've not covered yet, you could try posting to:

     

    http://forums.microsoft.com/TechNet/default.aspx?ForumGroupID=517&SiteID=17

     

    One of the Adatis boys might be able to help you out.

     

    Sorry I can't be much more help

     

    Matt

    Monday, October 15, 2007 11:21 AM
  • Hi Mils

     

    (And thanks Matt!)

     

    When you submit data it is first submitted to the Service Broker queue and then written to a table MGMeasuregroupName (where the measure group name corresponds to the name of the model you are working with) in the database ApplicationName (which is the name of the database you specified when setting up the Application and usually corresponds to the name of the PPS Application you are working with).

     

    It is then processed into the cubes at regular intervals (set in the planning console)

     

    HTH

     

    Tim K

     

    http://blogs.adatis.co.uk

    http://www.adatis.co.uk

     

    Monday, October 15, 2007 11:31 AM