Same SSAS cube has different drive sizes in two identical environments

  • Question

  • I have a PROD SSAS cube that I restored on a QA SSAS VM. Both the QA and PROD VMs are on the same Windows OS level and SSAS version. Both QA and PROD also have the same solution deployed to this cube, and they both point to the same SQL Server as the data source.

    My question is: when I FULL process the cube in PROD, the CUBENAME.db folder is ~35 GB, but the same cube in QA takes ~175 GB. I also compared the SSAS properties between QA and PROD and they are identical too.

    What am I missing here? I also tried creating a new cube with that solution in QA and FULL processed it, but it still takes up ~175 GB in QA.

    Thanks!

    Friday, April 5, 2019 5:45 PM

All replies

  • Hi  Anup,

    I am not sure which part causes this problem, compression or storage. I will try to test this in my environment and discuss it with other engineers, and I will inform you as soon as possible.

    You could also refer to Comparing tabular and multidimensional solutions for details.

    Best Regards,
    Zoe Zhi


    MSDN Community Support
    Please remember to click "Mark as Answer" on the responses that resolved your issue, and to click "Unmark as Answer" if not. This can be beneficial to other community members reading this thread. If you have any compliments or complaints to MSDN Support, feel free to contact MSDNFSF@microsoft.com.

    Monday, April 8, 2019 6:39 AM
  • Zoe,

    Anything on this yet? I checked the compression and storage settings; they are also the same between the two servers.

    Thanks!

    Tuesday, April 9, 2019 10:04 PM
  • Can you compare the contents of the two cubename.db folders? Are there additional files in QA that don't exist in Prod? Or are the files in QA just larger in size?

    http://darren.gosbell.com - please mark correct answers

    Wednesday, April 10, 2019 4:19 AM
    Moderator
  • Darren,

    I was able to find the difference between the two folders. The cubename.0.db folder in QA is significantly larger because it has agg.flex.data files (with sizes in the GB range) for each partition in the .prt folders, whereas the PROD folder does not have them.

    I did make some aggregation changes to the cube solution before this drive space issue, but the same solution was deployed to both the QA and PROD servers and both cubes were FULL processed. So why would it impact only the QA server?

    Thanks!

    Wednesday, April 10, 2019 3:21 PM
  • I did make some aggregation changes to the cube solution before this drive space issue, but the same solution was deployed to both the QA and PROD servers and both cubes were FULL processed. So why would it impact only the QA server?

    It's hard to say, but you could start by either: 

    1) scripting out a create script for both Prod and QA, saving them as files, and then using a diff tool like WinMerge to check what's different between the two cubes; maybe something went wrong during the deployment

    and/or

    2) using a script like the one I posted here (http://geekswithblogs.net/darrengosbell/archive/2008/12/02/ssas-are-my-aggregations-processed.aspx) to check what aggs you have on Prod and whether they are processed.
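    Under the hood, a check like the one in option 2 reads the DISCOVER_PARTITION_STAT schema rowset. As a rough sketch (the database, cube, measure group, and partition names below are placeholders, not names from this thread), the XMLA Discover request looks like:

    ```xml
    <!-- Sketch only: all four restriction values below are placeholders,
         and all four are required by DISCOVER_PARTITION_STAT. -->
    <Discover xmlns="urn:schemas-microsoft-com:xml-analysis">
      <RequestType>DISCOVER_PARTITION_STAT</RequestType>
      <Restrictions>
        <RestrictionList>
          <DATABASE_NAME>CUBENAME</DATABASE_NAME>
          <CUBE_NAME>CUBENAME</CUBE_NAME>
          <MEASURE_GROUP_NAME>MyMeasureGroup</MEASURE_GROUP_NAME>
          <PARTITION_NAME>MyPartition</PARTITION_NAME>
        </RestrictionList>
      </Restrictions>
      <Properties>
        <PropertyList/>
      </Properties>
    </Discover>
    ```

    The result lists each aggregation in the partition with its size; an AGGREGATION_SIZE of 0 means the aggregation exists in the design but currently holds no data.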


    http://darren.gosbell.com - please mark correct answers

    Wednesday, April 10, 2019 8:43 PM
    Moderator
  • The PROD partition sizes are a lot smaller for all measure groups. When I ran the query you provided above, most of the aggregations in PROD came back with a size of 0.

    Do you think those aggregations in PROD got dropped or modified because we did a Process Update on the dimensions? We FULL process the PROD cubes weekly, so shouldn't those aggregations be processed then? Disk size would not be a problem in PROD even after that.

    It's the QA cube, which we only FULL process daily, that has the disk space problem.

    What are the recommended steps for cubes after aggregation changes? My assumption was that a FULL process would bring all aggregations to a processed state.

    Thursday, April 11, 2019 5:11 PM
  • Do you think those aggregations in PROD got dropped or modified because we did a Process Update on the dimensions? We FULL process the PROD cubes weekly, so shouldn't those aggregations be processed then? Disk size would not be a problem in PROD even after that.

    When you do a ProcessUpdate, any flexible aggs will get cleared. Aggs are stored at the partition level, so if you do a full process on that partition it should process all of those aggs (until the next ProcessUpdate, when they will be cleared again). So your daily process should probably be:

    • ProcessUpdate all dims
    • ProcessFull the required fact table partitions
    • ProcessDefault on the database to process any aggs that got cleared by the ProcessUpdate

    What are the recommended steps for cubes after aggregation changes? My assumption was that a FULL process would bring all aggregations to a processed state.

    A ProcessDefault will scan for any unprocessed objects and then process them (without needing to re-read the data from the source). A ProcessFull, on the other hand, will re-read the source data and then rebuild everything.
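    The three daily steps above can be sketched as a single XMLA batch; commands in a Batch run sequentially unless wrapped in a Parallel element, which preserves the dims-before-partitions ordering. All object IDs below are placeholders for your own database, dimension, cube, measure group, and partition IDs:

    ```xml
    <!-- Sketch: every ID below is a placeholder, not a name from this thread. -->
    <Batch xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
      <!-- 1) ProcessUpdate the dimensions (repeat per dimension) -->
      <Process>
        <Object>
          <DatabaseID>YourDatabaseID</DatabaseID>
          <DimensionID>YourDimensionID</DimensionID>
        </Object>
        <Type>ProcessUpdate</Type>
      </Process>
      <!-- 2) ProcessFull the required fact partitions (repeat per partition) -->
      <Process>
        <Object>
          <DatabaseID>YourDatabaseID</DatabaseID>
          <CubeID>YourCubeID</CubeID>
          <MeasureGroupID>YourMeasureGroupID</MeasureGroupID>
          <PartitionID>YourPartitionID</PartitionID>
        </Object>
        <Type>ProcessFull</Type>
      </Process>
      <!-- 3) ProcessDefault on the database to rebuild any aggs/indexes
              cleared by the ProcessUpdate -->
      <Process>
        <Object>
          <DatabaseID>YourDatabaseID</DatabaseID>
        </Object>
        <Type>ProcessDefault</Type>
      </Process>
    </Batch>
    ```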


    http://darren.gosbell.com - please mark correct answers

    Thursday, April 11, 2019 11:50 PM
    Moderator
  • As a side note, there seem to be 140 GB of aggregations for 35 GB of base data - this looks a bit excessive.

    Expect me to help you solve your problems, not to solve your problems for you.

    Friday, April 12, 2019 7:20 AM
  • Ok, got it. Thanks.

    I am using the SSAS OlapQueryLog table to build my aggregations. The QueryLogSampling setting is 10, and I also filter the queries on durations longer than 10000 ms when building aggregations. I have partitions by year for most of the measure groups, so these aggregations become huge and eat up the space in QA, because I am only doing a FULL PROCESS there.

    Is there a way for me to trim the number of aggregations effectively? Should we only assign aggregations to the partitions that are used the most, like the last two years?

    Please advise.


    Monday, April 15, 2019 4:06 PM
  • Alexei,

    Yes, I agree, but I am just creating those aggregations from the OlapQueryLog table and don't think I am doing it effectively. Please see the question above that I also asked Darren.

    Let me know if you have any advice on that.

    Thanks!! 

    Monday, April 15, 2019 4:12 PM