none
Partitioned Fact Table Causing Double Results?? RRS feed

  • Question

  • So, we have a tabular model which we want to partition in order to speed up processing.

    What we're now seeing is somewhat odd. There are three key measures we're looking at: Passenger Revenue, Passenger Legs, and Average Fare. Between our two environments - Stage and Production - these numbers should be the same since they both draw on the same source data. Only difference is that Stage is our "pre-prod" environment. This instance of our tabular model does have the partitioning in place. We are partitioning the FactPassengerSale table.

    Essentially what we are seeing are double counts between the two environments. Below, is a screenshot of the data being returned from production:

    And this is from Stage:

    To make sure there was nothing on the data side wrong, done a query of all tables and row counts between the two are equal, so we SHOULD be seeing the same fare and revenue information. The only thing I can think of is that the partition on the fact table is causing this.

    I've worked with partitioning in a multidimensional environment before, and have never run into this. Again, the only difference between the two models is one is partitioned and the other isn't. I don't see how that would cause a problem, but I could be wrong.

    Any insight would be appreciated! 


    A. M. Robinson

    Wednesday, January 22, 2020 9:31 PM

Answers

  • So any single row is present in two partitions, the default one and one of the two others. That's exactly how you achieve double counting with partitions.

    Expect me to help you solve your problems, not to solve your problems for you.

    • Marked as answer by ansonee Friday, January 24, 2020 9:19 PM
    Thursday, January 23, 2020 11:44 AM

All replies

  • Hi Robinson,

    Thanks for posting here.

    Per your description, I guess the issue is from the fact that you have added the same condition query for the partitions. Please turn to partitions of that fact table, see the query statement under partitions, check if it exists the same where clause.

    Best Regards,

    Will


    MSDN Community Support
    Please remember to click "Mark as Answer" the responses that resolved your issue, and to click "Unmark as Answer" if not. This can be beneficial to other community members reading this thread. If you have any compliments or complaints to MSDN Support, feel free to contact MSDNFSF@microsoft.com.

    Thursday, January 23, 2020 2:18 AM
  • It's a tabular model, so there's no "WHERE" clause - it's M.

    There are three partitions: the default "Partition" which is everything. I created two new partitions. FactPassengerSaleSabre

    let
        Source = DbTables,
        dbo_FactPassengerSale = Source{[Schema="dbo",Item="FactPassengerSale"]}[Data],
        #"Removed Other Columns" = Table.SelectColumns(dbo_FactPassengerSale,{"PassengerSaleID", "PassengerLegID", "PointOfPurchaseID", "RevenueProductID", "SaleAgentID", "SaleCT_DateID", "SaleTimeCT", "SaleTimeUTC", "SaleDateID", "DimSourceKey", "SaleType", "NonProratedProductUnit", "ProductLegUnit", "ProratedProductUnit", "NonProratedSaleAmount", "ProratedSaleAmount"}),
        #"Filtered Rows" = Table.SelectRows(#"Removed Other Columns", each [DimSourceKey] = 2)
    in
        #"Filtered Rows"

    And FactPassengerSaleNavitaire:

    let
        Source = DbTables,
        dbo_FactPassengerSale = Source{[Schema="dbo",Item="FactPassengerSale"]}[Data],
        #"Removed Other Columns" = Table.SelectColumns(dbo_FactPassengerSale,{"PassengerSaleID", "PassengerLegID", "PointOfPurchaseID", "RevenueProductID", "SaleAgentID", "SaleCT_DateID", "SaleTimeCT", "SaleTimeUTC", "SaleDateID", "DimSourceKey", "SaleType", "NonProratedProductUnit", "ProductLegUnit", "ProratedProductUnit", "NonProratedSaleAmount", "ProratedSaleAmount"}),
        #"Filtered Rows" = Table.SelectRows(#"Removed Other Columns", each [DimSourceKey] = 1)
    in
        #"Filtered Rows"

    The new partitions are on the DimSourceKey column - either 1 or 2. One partition per sourcekey. When processing the model, we do a process full - which would include all three partitions


    A. M. Robinson

    Thursday, January 23, 2020 2:37 AM
  • So any single row is present in two partitions, the default one and one of the two others. That's exactly how you achieve double counting with partitions.

    Expect me to help you solve your problems, not to solve your problems for you.

    • Marked as answer by ansonee Friday, January 24, 2020 9:19 PM
    Thursday, January 23, 2020 11:44 AM
  • That's what I was afraid of!

    So, if  have two separate partitions that "cover" all the rows, would it ten be okay to delete the default partition htat is automatically created by Analysis Services? If you look at any tabular model, you'll see a partition called "Partition" on each table.

    I'm already working on the logic app to only process the one partition that is "current" (source key = 1) and not process the historical partition (source key = 2).


    A. M. Robinson

    Thursday, January 23, 2020 12:03 PM
  • Once you've created your custom partitions, it is in fact necessary to remove the default partition. It stores all the rows for the table and is sure to introduce double counting with any other partition.


    Expect me to help you solve your problems, not to solve your problems for you.

    Thursday, January 23, 2020 12:35 PM
  • That's what I figured, just wanted someone to affirm my "fears"!

    Thanks!!


    A. M. Robinson

    Friday, January 24, 2020 9:19 PM