none
SSAS size too large

    Question

  • Hi there.

    I've got this Cube that used to take around 120-130Gb of disk space, and with 80Gb of free diskspace, it was possible to process update the cube.

    Since there were some aggregation designs that could be of some interest, I saved them to the server, and reprocessed the partitions.

     

    Unfortunantely, I ran out of disk (the disk has 252Gb, with around 80Gb free at that time).

    I removed all aggregation designs, by setting every partition with the "no aggregation design" flag, and saved changes back to server again.

    I thought it was best to start with a "clean" cube, so I unprocessed all the dimensions and cubes, and did a full process for both.

     

    But now, for no apparent reason, every time I process this cube, it takes tremendous amounts of disk space. Just to get an idea, when this cube in unprocessed, free space goes up to 215Gb, nowadays,I'm only left with 15Gb, when I used to get 80Gb. Source data didn't grow that much from one day to another.

     

    Strange things happen during process: even with no aggregation designs, several messages appear during the process, informing that aggregations are being built, when no aggregations exist at all.

     

                    "Building the aggregations and indexes for the 'ACCOUNTS_2008_04' partition has started."

     

    When full process is finished, I end up with much less free disk space than before (even with no aggregations, that's the strangest part), thus I'm unable to do a process update the next day.

     

    My cube's storage mode is MOLAP based, with no proactive caching.

     

    Questions:

    Does anyone have ever experienced similar behavior, and knows a reason for it?

    How can I check disk space usage for AS partitions, or even dimensions and measure groups?

    If I change dimension attributes' properties (such as allMemberAggregationUsage, OptimizedState, AttributeHierarchyOptimizedState), do they greatly affect disk space used?

     

    Thanks in advance.

     

       Luís Goes

     

     

    Tuesday, September 09, 2008 4:11 PM

All replies

  • SSAS will use a fair bit of disk space when it processes as it will process using temporary copies of the objects before it commits the processing transaction so that the "current" data is still available for user queries. In the worst case it could create a complete second copy of the database on disk.

     

    If you can do your processing in smaller batches it might help ie. doing process update on one dimension at a time, then processing partitions.

     

     lmgoes wrote:

    Questions:

    Does anyone have ever experienced similar behavior, and knows a reason for it?

    If you are talking about the "building aggregations and Indexes..." messages, this would be because SSAS still needs to build indexes even if you have no aggregations. Checking the size of the partitions and dimensions might give you an idea where the space has gone.

     

     lmgoes wrote:

    How can I check disk space usage for AS partitions, or even dimensions and measure groups?

    If you have a look in the data folder for SSAS you will find a folder for each database. Inside each database there are folders for dimensions, cubes and measure groups. Adding up the size of the files in these folders (or just right click on the folder and look at the properties) will tell you the size of these objects.

     

     lmgoes wrote:

    If I change dimension attributes' properties (such as allMemberAggregationUsage, OptimizedState, AttributeHierarchyOptimizedState), do they greatly affect disk space used?

    If you have any instances where you can set the AttributeHierarchyOptimizedState to NotOptimized this will reduce the size of the dimension and partition Indexes.

    Wednesday, September 10, 2008 5:24 AM
  •  

    Hi there. I already process dimensions one at a time, although I use parallel batch in my normal daily tasks, instead of sequencial. I'm also aware of the fact that SSAS duplicates some cube parts that'll be subject to changes, before the actual commit (a semi-transaction log, so to speak).


     

    >If you have a look in the data folder for SSAS you will find a folder for each database. Inside each database there are

    >folders for dimensions, cubes and measure groups. Adding up the size of the files in these folders (or just right click on

    >the folder and look at the properties) will tell you the size of these objects.

     

    I use SSAS 2005. I'm not sure if you're refering to actual storage location. If you are, unfortunantely in my case, directory names are all but intuitive, since they look like guids to me, and I don't have a clue how to related them with dimensions or measure groups.

     

     

    >If you have any instances where you can set the AttributeHierarchyOptimizedState to NotOptimized this will reduce the

    >size of the dimension and partition Indexes.

     

    I'll check to see If some attributes are good candidates for setting their flag as NotOptimized.

     

    Final questions:

    In which scenarios should I set the allMemberAggregationUsage flag to 'none'?

    Should I set the Grouping behavior flag with "encourage grouping" value, for key attributes?

     

    Thanks in advance.

     

       Luís Goes

    Wednesday, September 10, 2008 9:27 AM
  •  lmgoes wrote:

    >If you have a look in the data folder for SSAS you will find a folder for each database. Inside each database there are

    >folders for dimensions, cubes and measure groups. Adding up the size of the files in these folders (or just right click on

    >the folder and look at the properties) will tell you the size of these objects.

     

    I use SSAS 2005. I'm not sure if you're refering to actual storage location. If you are, unfortunantely in my case, directory names are all but intuitive, since they look like guids to me, and I don't have a clue how to related them with dimensions or measure groups.

    Both SSAS 2005 & 2008 create folder structures that look like the following

     

    <database id>.<version>.db

       <cube id>.<version>.cub

          <measure group id>.<version>.det

            <partition id>.<version>.prt

       ...

       <dimension id>.<version>.dim

       <dimension id>.<version>.dim

       ...

     

    The only way I can guess that you have guids in there is that your objects have guids in the ID properties. I think that some of the temporary files that are created during processing use GUIDs, but I don't think you should see any of these when there is not processing operation going on.

     

     lmgoes wrote:

    In which scenarios should I set the allMemberAggregationUsage flag to 'none'?

    This property controls which attributes the aggregation designer considers in the dimension when it designs aggregations. If you don't any of the attributes considered in aggregation designs then you set this to none. But if you do not have any aggregation designs this will not affect your cube size.

     

     lmgoes wrote:

    Should I set the Grouping behavior flag with "encourage grouping" value, for key attributes?

    I'm pretty sure that the grouping behaviour is really a hint to client tools - it does not affect any of the structures on disk.

    Wednesday, September 10, 2008 11:50 AM
  • 1. If you have some partitions with <StorageLocation> property pointing outside of your Data folder, you might be able to see partition directories with names like <GUID>.DB_Version.Cube_Version.MeasureGroup_Version

    2. If you have Restore or Synch operation running you can also see GUIDlike directories right under Data folder.

    3. Temporary files should be created in 2 places:

    a. right under your Data directory;

    b. in TempDir specified in your msmdsrv.ini file;

     

    Friday, September 12, 2008 7:38 PM
  • 2. If you have Restore or Synch operation running you can also see GUIDlike directories right under Data folder.

    When using Sync - should this GUID folders kept after the Sync is over, or we can delete it?
    Jermy Hoffman
    Sunday, June 28, 2009 9:08 AM
  • When using Sync - should this GUID folders kept after the Sync is over, or we can delete it?
    Jermy Hoffman

    SSAS should clean up any temporary files when the sync completes, any files that remain are most likely required by SSAS. As a general rule you should not need to manually maintain any of the folders in the data directory.
    http://geekswithblogs.net/darrengosbell - please mark correct answers
    Monday, June 29, 2009 2:53 AM