Data model performance issue in SSDT

  • Question

  • Hi team,

    I need your help in data model design.

    I have the following data model created. The problem is that if I run a report filtered to just 1 Id, it takes 8-10 seconds to render the result, which is huge considering we have in-memory SSAS. The data in the report looks correct, though.

    I assume I am doing something wrong here, which is why the performance is slow.

    Here I am filtering on Dim A, which returns just 1 record. Fact A has 1 matching record, and Dim B has 10 records for that 1 record in Dim A. So when I query, I get just 10 records in the report. I have also added a metric from Fact A, which is simply a count of the foreign key.

    However, Dim A and Fact A each have close to 3 million records, and Dim B has close to 100 K.

    I think it has something to do with the direction of the joins. I need help.
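    The scenario above can be sketched in plain Python with illustrative, made-up data (the Id value, key names, and row counts here are hypothetical, not the real model):

    ```python
    # Toy model of the report: 1 Dim A row -> 1 Fact A row -> 10 Dim B rows.
    dim_a  = [{"Id": 42}]                               # filter hits 1 record
    fact_a = [{"DimAId": 42, "FK1": 7}]                 # 1 matching fact record
    dim_b  = [{"FK2": k} for k in range(10)]            # 10 related records
    bridge = [{"FK1": 7, "FK2": k} for k in range(10)]  # links the fact to Dim B

    # Propagate the filter Dim A -> Fact A -> bridge -> Dim B.
    selected = {r["Id"] for r in dim_a if r["Id"] == 42}
    facts    = [f for f in fact_a if f["DimAId"] in selected]
    fk1s     = {f["FK1"] for f in facts}
    linked   = {b["FK2"] for b in bridge if b["FK1"] in fk1s}
    report   = [d for d in dim_b if d["FK2"] in linked]  # the 10 report rows

    metric = len(facts)  # the "count of foreign key" measure
    print(len(report), metric)  # 10 1
    ```

    Only 10 rows come out; the cost is in how the engine has to scan and filter the 3-million-row tables to get there.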


    • Edited by akj2784 Wednesday, February 28, 2018 3:47 PM
    Wednesday, February 28, 2018 3:46 PM

All replies

  • Can you combine the 2 bridge tables between Fact A and Dim B, then use bi-directional filtering between the bridge and fact table?

    Thomas LeBlanc twitter ( @TheSmilingDBA )

    Wednesday, February 28, 2018 4:34 PM
  • That's actually a single table at the DB level, but I have split it in SSDT to avoid a many-to-many relationship.

    The bridge table has FK1 and FK2, and FK1:FK2 is 1:M (it's a relationship table).

    So I am splitting the bridge table into BT1 and BT2.

    BT1 has the distinct FK1 values, and BT2 has both FK1 and FK2.

    I join Fact A to BT1 on FK1, so it becomes M:1.

    BT1 is joined to BT2 on FK1, so it becomes 1:M,

    and BT2 is joined to Dim B on FK2, so it becomes M:1.
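    The split described above can be sketched as follows (sample FK values are made up; the point is only to show how BT1 and BT2 are derived and why the stated cardinalities hold):

    ```python
    # The single bridge table at the DB level: (FK1, FK2) pairs, FK1:FK2 is 1:M.
    bridge = [
        (1, "a"), (1, "b"), (1, "c"),
        (2, "a"), (2, "d"),
    ]

    # BT1: one row per distinct FK1, so Fact A -> BT1 on FK1 is M:1.
    bt1 = sorted({fk1 for fk1, _ in bridge})

    # BT2: the full (FK1, FK2) pairs, so BT1 -> BT2 on FK1 is 1:M
    # and BT2 -> Dim B on FK2 is M:1.
    bt2 = list(bridge)

    # FK1 is unique in BT1 -- it is the "1" side of both joins that touch it.
    assert len(bt1) == len(set(bt1))
    print(bt1, len(bt2))  # [1, 2] 5
    ```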


    Wednesday, February 28, 2018 4:55 PM
  • Consider merging DimA into FactA: they have similarly huge row counts and a 1:1 relationship. Also, we could use more details about the measure(s) you're using and about the bridge tables' statistics.

    As a general note, being in-memory only eliminates the storage bottleneck. There are still computations to be done on the data, which is where the CPU comes into play.


    Expect me to help you solve your problems, not to solve your problems for you.

    Wednesday, February 28, 2018 7:04 PM
  • How is that going to help? Assume I have 4-5 more dimensions tied to Fact A. It does not make sense to put every dimensional data point into the fact.

    Especially if multiple fact tables come into the picture, it makes sense to keep Dim A as a conformed dimension.

    So are you suggesting mapping all the dimensional attributes into the fact itself?

    Also assume I have many Merge/Append steps, etc., in the Dim A query. Will that impact performance too?

    I think it may impact data processing, but once the data is processed it should be treated as one table, and query retrieval performance should not be affected.

    Correct me if I am wrong.

    Regards,

    Akash

    Thursday, March 1, 2018 3:09 AM
  • Hi Akash,

    Thanks for your question.

    Assuming you are using an SSAS Tabular model, to optimize a data model with a many-to-many relationship, please refer to the blog below:
    https://www.sqlbi.com/articles/optimize-many-to-many-calculation-in-dax-with-summarize-and-cross-table-filtering/

    For more detailed information about many-to-many data models, please refer to the whitepaper below:
    https://www.sqlbi.com/whitepapers/many2many/


    Best Regards
    Willson Yuan
    MSDN Community Support
    Please remember to click "Mark as Answer" on the responses that resolved your issue, and to click "Unmark as Answer" if not. This can be beneficial to other community members reading this thread. If you have any compliments or complaints about MSDN Support, feel free to contact MSDNFSF@microsoft.com

    • Proposed as answer by alexander fun Friday, March 2, 2018 1:17 PM
    Thursday, March 1, 2018 6:36 AM
  • Well, I haven't suggested denormalizing all dimensions into FactA, have I? The possible trouble with the DimA-FactA relationship as it stands right now is that the joining columns could have about the same high cardinality, which is an obvious bottleneck candidate. You don't have to denormalize the entirety of DimA into FactA; only the high-cardinality columns would have to go.
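    One way to sketch that selection rule: flag only the DimA columns whose distinct count is close to the table's row count (the column names, sample rows, and the 0.9 threshold here are all hypothetical):

    ```python
    # Illustrative DimA rows; in the real model there would be ~3 million.
    dim_a = [
        {"Id": 1, "Code": "X1", "Region": "EU"},
        {"Id": 2, "Code": "X2", "Region": "EU"},
        {"Id": 3, "Code": "X3", "Region": "US"},
    ]

    def high_cardinality_columns(rows, threshold=0.9):
        """Columns whose distinct-count / row-count ratio meets the threshold."""
        n = len(rows)
        return [c for c in rows[0]
                if len({r[c] for r in rows}) / n >= threshold]

    # Id and Code are distinct per row (candidates to move into FactA);
    # Region has few distinct values and can stay in the dimension.
    print(high_cardinality_columns(dim_a))  # ['Id', 'Code']
    ```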

    In your picture, there were relationships leading outside, but not from DimA. If DimA has to relate to several other fact tables, you'd have to weigh the degree of conformity in the dimension against the performance degradation from high-cardinality relationships.

    The queries you use to populate your tables only affect processing times. That is a valid concern, but it doesn't affect query times, which was the only complaint in your original post.

    Oh, and I'm not saying this is the one bottleneck. Remember, we haven't yet seen the measures or the details about the bridge tables.





    Thursday, March 1, 2018 8:08 AM
  • Thanks, Alexei. I see your point. I will read the document you shared and post my observations.

    Thank you.

    Regards,

    Akash

    Thursday, March 1, 2018 8:47 AM
  • I have gone through the document and wanted to check on the following.

    If we have just one table combining Dim A and Fact A, all the columns (dimensional attributes and metrics) would be mapped to one folder in the UI. Measures we can still group in a different folder, but if a metric is created as a column then we can't; we can't separate the dimensional and factual columns.

    From the end users' point of view, they will have to scan through a lot of dimensional attributes to select any metric, as my dimension table has many of them.

    Monday, March 5, 2018 8:28 AM
  • That will be the case if you choose to move all dimension columns into the fact. As I've mentioned, one of the options is to move only the columns that have about the same count of distinct values as the fact table itself.

    You are correct that this is another balance you'll have to strike for your project: between performance and user interface experience.



    Monday, March 5, 2018 11:05 AM
  • Actually, technically it is the same table, but I am treating one alias as a dimension and the other as a fact, just to make the dimension a conformed dimension. So technically all the columns have the same count of distinct values as the fact table.

    Thank you for the quick response.

    Regards,

    Akash

    Monday, March 5, 2018 1:48 PM
  • "Same count of distinct values as the fact table", not "as the same column in the fact table" - in other words, columns nearing the 3m distinct values. Say the table in question is the customer table, so you'll have 3m customer IDs but a single-digit number of options for, e.g., gender column.

    Note how you again have to balance the user interface (you want a dedicated table to serve as the 'fact' and a copy dedicated to 'dimension' use) against performance. While I don't know the full scope of your tabular model, this specific solution - creating a full copy of the table so one can house measures and the other slicing/filtering fields - doesn't feel natural to me. Unlike MD, Tabular doesn't differentiate between measure groups and dimensions, and the general concept is that one table plays many roles.



    Tuesday, March 6, 2018 7:07 AM