none
Profit and Loss fact table and possible mixed grain RRS feed

  • Question

  • So, let's say I have a fact table that is capturing revenue and cost events. These can be related to sales orders or (a level above) a merchant or (the highest level) country level. The merchant represents a store and there are several revenues and costs related to the store that is not related to any of the purchases tied to it, for example if "we" would be paying the store to sell our products then that's a merchant level cost, and all the purchases done through that store is a sales order level revenue. Country level rows could be revenues or costs that come from the bookkeeping that is tied to different countries we operate in.

    The fact table has an ID and Type column right now - where the Type column would state if it's a sales order, merchant or country level row revenue or cost, and the ID is the merchant ID, sales order ID or country code.

    There are dimensions like Payment method, Payment terms, Product etc. that are only applicable for the sales order rows.

    Is this a mixed grain fact table or is the grain the "profit and loss event" and having different dimensions for different rows is fine? Since the Merchant and country level revenues and costs are not aggregated numbers from the order data, I feel like this is the same grain. But it feels weird to mix together sales order data with store and country level data, even though it totally works.

    Reporting on country , merchant or any of the sales order related dimensions (product etc.) still work as intended. But I guess these could be three base fact table and then an aggregate fact table could provide the same results. But for what use? 

    Thursday, November 29, 2018 3:29 PM

Answers

  • Just why would you "try to adjust the model to not mix the grains" if "everything works good in the cube"? If it ain't broke, don't fix it.

    Expect me to help you solve your problems, not to solve your problems for you.

    • Proposed as answer by Pirlo Zhang Tuesday, December 4, 2018 1:59 AM
    • Marked as answer by Simn Tuesday, December 4, 2018 7:55 AM
    Friday, November 30, 2018 1:41 PM

All replies

  • having different dimensions for different rows

    What more proof that there's a mix of data at different granularities in the table do you need? That the data is additive doesn't make it uniformly grained.

    Note that a mix of granularities is a frequent feature of data models, and it is up to developers to decide how best to deal with it in each particular case. I certainly have never seen a 'Thou shalt never ever put rows of different granularity in the same table' commandment.


    Expect me to help you solve your problems, not to solve your problems for you.

    Thursday, November 29, 2018 4:18 PM
  • Thanks for answering!
    Sticking to the grain is basically the first and most important rule of Kimball modeling as far as I understand. There's a strong emphasis on that across all modeling discussions I've ever read/participated in.

    But, when Kimball group discusses mixed grain it is usually within the conception that the data turns bad, overlapping or semi-additive etc.. That's a no brainer, the model will simply be bad. I can't see how that applies to this fact measure though.

    I was also really sure at first that this is a mixed grain fact table we are dealing with just by the fact that there are some dimensions that are not applicable for all rows. But I came across some articles online saying that the grain is the business event and not the number of dimensions and in that case I could say my grain is "profitability event", be it for a store or for a sales order, doesn't matter..
    The fact that setting dimension keys to "Not applicable" is a well known modeling/ETL technique also shows that it doesnt' automatically means the fact is an incorrect grain (then it should be remodeled and put into another fact table instead of marking the dimension as Not Applicable).


    Friday, November 30, 2018 8:22 AM
  • I believe BI developers' primary job is to satisfy users' requirements, not Kimball's. While Kimball techniques are incredibly useful and popular, they are not an end in themselves - they are just a means towards an end. Of course, the better a rule is, the rarer it can justifiably be broken.

    Now online authors are free to define grain as they see fit. I think of grain as the product of all attributes, so if some data can be sliced by an attribute and some other can not, their grain is different to me. Still, if your users are OK with country-level figures having 'NA' for Payment terms and such, why, you may still be able to cram it into a single table. But the burden of determining whether the users are really OK is with you, the BI expert. I'll just throw in one example.

    Order and merchant-level figures can be aggregated to country's regions. Can a user ever sum these region-level aggregates, discarding the 'NA', and claim the sum be country's total?


    Expect me to help you solve your problems, not to solve your problems for you.

    Friday, November 30, 2018 9:46 AM
  • Okay, fair enough.

    I think I understand your question correctly and yes - when the users want to see the totals for a country, they want to see all the country, merchant and order sums for that country. When looking at merchant totals they want to see merchant and order values.. And when they are adding in payment method or other order related dimensions to the mix, it would only show the measures from the orders under whatever merchant or country they might at the same time filter on.. So everything works good in the cube.

    But if I would try to adjust the model to not mix the grains - would I create three base fact tables for the three levels of granularity and then have a fourth table that is only on country level but keeps all the aggregated data from merchant and order level and a fifth table that is on merchant level but keeps all the aggregated data from order fact?

    Because otherwise I don't see how I could provide the same data to the users, when they want to slice the data across all levels at the same time as maybe throwing in a dimension that is just merchant or order related.

    Friday, November 30, 2018 1:25 PM
  • Just why would you "try to adjust the model to not mix the grains" if "everything works good in the cube"? If it ain't broke, don't fix it.

    Expect me to help you solve your problems, not to solve your problems for you.

    • Proposed as answer by Pirlo Zhang Tuesday, December 4, 2018 1:59 AM
    • Marked as answer by Simn Tuesday, December 4, 2018 7:55 AM
    Friday, November 30, 2018 1:41 PM