SSAS 2008 -- Why is metadata so slow???

    Question

  • We notice that the first time after a restart, and also after the system has been in operation for some time, metadata access is EXTREMELY slow. For example, if you launch SSMS, connect to your SSAS database, and then try to navigate in Object Explorer, it can take several minutes each time you click a '+' -- for example, to expand the Dimensions folder to view all the dimensions. I've seen it take up to 10 minutes for Object Explorer to update; meanwhile the whole SSMS application is hung and you can't interact with it. I've run Profiler in these situations, and sometimes all I see is the Discover query get launched, but then nothing else -- 10 minutes later it returns.

    If I'm in an MDX query window and using the cube metadata pane on the left, I've noticed in Profiler that all of my 250+ measure group partitions are being scanned just so I can explore the members of a simple date dimension. After the first time it does this, subsequent attempts are much faster and no longer trigger this major partition scan. However, after some amount of time it will 'reset', so I'm assuming some kind of caching is going on. It impairs operations to some degree for my clients, since just to run a backup, add a role, or whatever, it can take several minutes of waiting for SSMS to become responsive again.

    Has anyone noticed this, or does anyone have a workaround?

    Tuesday, June 09, 2009 7:42 PM

Answers

  • Hi Lee,

    The custom storage location will not interfere with the ability to process the cube -- it's just that the data for the partition is now in a different location. It does affect Backup/Restore (you might need to re-specify the storage locations when you do a restore) and Synchronize -- essentially, any situation where the storage location needs to be re-defined.

    It's not necessarily actionable information, but it can be used for the same reason you use Explorer to show you the size of a folder... The issue isn't even so much about how useful it is (as I've mentioned before, I tend to agree with you that we shouldn't have added it in the first place), but rather that we don't have a good enough sense of how many customers would be impacted by cutting it. If we were to deprecate it and get rid of it, we would have to wait two releases to do so.

    Thanks,
    Akshai

    -- This posting is provided "AS IS" with no warranties, and confers no rights
    Thursday, September 24, 2009 7:03 PM
  • We're not supposed to break functionality that customers might be using. If we started returning EstimatedSize=0 and somebody depended on it (even if only for auditing), then we would have to restore the functionality -- therefore the guideline is to wait two releases before a deprecated feature can actually go away.

    To you and to me, the estimate might be non-actionable -- but for somebody who has decided to act based on this value, returning 0 would be a problem.

    Canvassing customers will only tell us what we already know (that they want this to be fast) -- not whether any customer out there has built a dependency on this functionality.

    I may be coming off as very rigid here -- but trust me, I'm doing what I can to make this better. We just have to work within a system that has been designed to protect our customers from arbitrary breaking changes.

    Thanks,
    Akshai

    -- This posting is provided "AS IS" with no warranties, and confers no rights
    Thursday, September 24, 2009 7:40 PM
  • "You are negatively affecting customers by keeping it there. I suspect this is impacting more people by having it there than it would if you removed it."
    You're probably correct, but we're better off improving the perf without removing the property than removing it and impacting some customers.

    "If management studio is using the AMO object model and it's this intenally that performs the calculation of the property,..."
    The problem lies with the granularity of the APIs supported by the engine and then used by AMO/SSMS. Asking for an object's metadata returns all its properties, including inefficient ones like EstimatedSize. I've already commented on this general approach in an earlier thread (Lee already proposed it) so I won't go into it any further.

    Thanks,
    Akshai

    -- This posting is provided "AS IS" with no warranties, and confers no rights
    Thursday, September 24, 2009 8:52 PM
  • I'm wondering whether this needs to be implemented as a scan operation at all. The only things that can change the size are process or drop operations. So if the processing engine kept track of how many bytes it just wrote, it could write this information out to a separate file -- effectively caching the estimated sizes. Alternatively, one of these operations could kick off a background thread to update the estimated sizes.

    I'm guessing that this is a vast oversimplification and that consideration would need to be given to proactive caching and to not overly impacting the time it takes to commit a processing operation. But it would seem that the main issue is having to potentially re-scan the disk to get this information if it gets flushed out of RAM.
    http://geekswithblogs.net/darrengosbell - please mark correct answers
    Thursday, September 24, 2009 9:11 PM
    Moderator
  • That's right Darren -- and that's probably the general direction of the approach we would take to improve the perf (although the separate file is avoidable). Process/Create/Alter/Drop can all have an impact, but before we commit a large object we can get its estimated size and save it in the metadata.

    The file metadata for the modified object is probably already in the file system cache at that time, so it should be a lot quicker than on a cold system and shouldn't significantly impact performance of processing.

    -- This posting is provided "AS IS" with no warranties, and confers no rights
    Thursday, September 24, 2009 9:24 PM
  • As an update to this discussion, we did implement an improvement to the way this EstimatedSize property works in the upcoming SQL Server 2008 R2 release -- the latest CTP includes this change. It would be great to hear from any of you who can try this out to see what the impact is on the performance in real-world situations.

    This change does require you to update all your partitions and dimensions in some way -- the server will persist the estimated size of those objects at that point and not require scanning their file sizes every time you discover metadata. ProcessUpdate on the dimensions and ProcessClearIndexes+ProcessIndexes on the partitions should do the trick. Without updating the dimension/partition objects things will still work, but without the performance gain... 

    Additional note: due to these types of changes, you may not be able to take a 2008 R2 database downlevel to a 2008 server.
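
    For reference, a minimal AMO sketch of that one-time pass from PowerShell (the server and database names are assumptions; try it on a test copy first):

        # Touch every dimension and partition so the R2 server can persist their estimated sizes,
        # as described above (ProcessUpdate on dimensions, ProcessClearIndexes + ProcessIndexes on partitions).
        [System.Reflection.Assembly]::LoadWithPartialName("Microsoft.AnalysisServices") | Out-Null
        $srv = New-Object Microsoft.AnalysisServices.Server
        $srv.Connect("localhost")
        $db = $srv.Databases.GetByName("MyDatabase")
        foreach ($dim in $db.Dimensions) {
            $dim.Process([Microsoft.AnalysisServices.ProcessType]::ProcessUpdate)
        }
        foreach ($cube in $db.Cubes) {
            foreach ($mg in $cube.MeasureGroups) {
                foreach ($p in $mg.Partitions) {
                    $p.Process([Microsoft.AnalysisServices.ProcessType]::ProcessClearIndexes)
                    $p.Process([Microsoft.AnalysisServices.ProcessType]::ProcessIndexes)
                }
            }
        }
        $srv.Disconnect()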

    As most of you have probably heard, AS 2008 R2 is primarily targeted at the self-service user with the new PowerPivot solution -- but we've tried to tackle a few of the pain points for our existing customers along the way. Some examples:
    - This discovery of metadata issue
    - Processing of a single (typically small) partition in the context of a very large number of metadata objects (e.g. 1000s of partitions) should now be significantly quicker. The algorithm that analyzes dependencies was optimized for this scenario.
    - Also, R2 now supports calculated members in subselects -- and the new Excel 2010 pivot tables will let you make use of that feature.

    Lee, thanks for the feedback in general -- it's very valuable to the team.

    Thanks,
    Akshai


    -- This posting is provided "AS IS" with no warranties, and confers no rights
    Sunday, February 07, 2010 10:04 PM
  • The performance of ProcessUpdate is meant to be better in R2 when processing a cube with a lot of partitions. Whenever you do a ProcessUpdate, SSAS has to check all the dependent objects (in your case all 3000 partitions), and this checking has been improved in R2.

    But do you really need that many partitions? Have you considered a hybrid approach, such as daily partitions for the current month and monthly partitions for data older than that?


    http://darren.gosbell.com - please mark correct answers
    Friday, July 15, 2011 1:18 AM
    Moderator

All replies

  • Hi Lee,

     

    Did you enable proactive caching for the cube? In SQL Server Management Studio, right-click the cube and open Properties to check the proactive caching settings, or to disable proactive caching. Since the cube holds a large amount of data, you could also consider creating partitions and enabling proactive caching only for the frequently changing measure groups.

     

    Proactive Caching (Partition Properties Dialog Box):

    http://msdn.microsoft.com/en-us/library/ms189716.aspx

    Managing Analysis Services Partitions:

    http://technet.microsoft.com/en-us/library/ms175604.aspx

    Proactive Caching (Partitions):

    http://technet.microsoft.com/en-us/library/ms174769.aspx

     

    A good article about analyzing real-time business intelligence data, which also applies to 2008:

    http://technet.microsoft.com/en-gb/library/cc966446.aspx

     

    Hope this helps.

    Raymond

     

    Thursday, June 11, 2009 4:19 AM
    Moderator
  • I was thinking that proactive caching was more likely to be a cause of this than a solution.

    I would recommend opening a second instance of SSMS and running

    SELECT * FROM $SYSTEM.DISCOVER_LOCKS

    to see whether there are any pending locks. It's possible that a long-running query is holding up the commit of a processing operation (such as one initiated by proactive caching), which has stalled your metadata request.
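
    If SSMS itself is unresponsive, the same DMVs can be queried from a script instead. A minimal ADOMD.NET sketch (the server name and the extra DISCOVER_SESSIONS check are illustrative assumptions):

        # Check for blocking locks and active sessions without going through SSMS.
        [System.Reflection.Assembly]::LoadWithPartialName("Microsoft.AnalysisServices.AdomdClient") | Out-Null
        $conn = New-Object Microsoft.AnalysisServices.AdomdClient.AdomdConnection("Data Source=localhost")
        $conn.Open()
        foreach ($dmv in '$SYSTEM.DISCOVER_LOCKS', '$SYSTEM.DISCOVER_SESSIONS') {
            $cmd = $conn.CreateCommand()
            $cmd.CommandText = "SELECT * FROM $dmv"
            $table = New-Object System.Data.DataTable
            (New-Object Microsoft.AnalysisServices.AdomdClient.AdomdDataAdapter($cmd)).Fill($table) | Out-Null
            Write-Host "--- $dmv ($($table.Rows.Count) rows) ---"
            $table | Format-Table -AutoSize
        }
        $conn.Close()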
    http://geekswithblogs.net/darrengosbell - please mark correct answers
    Thursday, June 11, 2009 5:39 AM
    Moderator
  • Are you on Windows 2003 or 2008? If Windows 2003, then try using the PreAllocate setting:
    http://sqlcat.com/technicalnotes/archive/2008/07/16/running-microsoft-sql-server-2008-analysis-services-on-windows-server-2008-vs-windows-server-2003-and-memory-preallocation-lessons-learned.aspx

    http://artisconsulting.com/Blog/GregGalloway
    Thursday, June 11, 2009 7:12 PM
    Moderator
  • We're on Windows 2003. We did try PreAllocate prior to posting, but it did not work. We have 32 GB of RAM and the box is dedicated to SSAS (we run ETL from a different box), so I set PreAllocate to '80', meaning 80%, but it would only use about 4 GB and the system was extremely slow as a result. In Task Manager or PerfMon, what should it look like after setting PreAllocate? Immediately upon restart, it should grab 80% of 32 GB (minus the OS allocation), right?

    I need to comb through that "lessons learned" article again, because I think we must have some other property set that impairs PreAllocate.

    The metadata slowness occurs when nothing else is going on -- no queries for data or metadata. I'll try the DISCOVER_LOCKS query or check it out in Activity Monitor. Thanks for the tips.

    Monday, June 15, 2009 9:47 PM
  • I assume you're on a 64-bit box and that it's the 64-bit edition of SSAS that's installed? (If not, that would explain it.)

    Hmm. That's puzzling. What do the following perfmon counters say after you give SSAS 10 minutes to restart but before anyone connects?

    MSAS 200X:Memory / Memory Usage KB
    MSAS 200X:Memory / Memory Limit Low KB
    MSAS 200X:Memory / Memory Limit High KB

    Memory \ Cache Bytes
    Memory \ Cache Bytes Peak
    Memory \ Available KBytes

    I would also recommend you make a change to the SSAS server properties via SSMS and make doubly sure that you see the change show up in msmdsrv.ini. (Double-check that the msmdsrv.ini file you think is the right one is in the directory you think it's in.) Then I would diff your msmdsrv.ini file against a known new/good install of SSAS and see if any crazy settings have been changed.

    Hope that helps.
    http://artisconsulting.com/Blog/GregGalloway
    Tuesday, June 16, 2009 2:36 PM
    Moderator
  • Oh... and I don't really understand the need for MinimumAllocatedMemory when you have the PreAllocate, but you might try messing with both if you're desperate.
    http://artisconsulting.com/Blog/GregGalloway
    Tuesday, June 16, 2009 2:42 PM
    Moderator
  • I'll try that and post back results. I started a new thread for the PreAllocate problem. We would love to use that feature, as we need to reduce our processing times. Thanks!
    Wednesday, June 17, 2009 6:31 PM
  • I've been playing with PowerShell and notice that it is MUCH faster at retrieving metadata than SSMS. I can script the entire cube database within a minute from PowerShell; with SSMS, it takes 15-30 minutes for the cube db to script. Something appears to be wrong with SSMS, and the cube's metadata may not be the root cause.
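
    To give an idea of the kind of comparison involved, a minimal AMO walk over the metadata from PowerShell looks roughly like this (a sketch, not my actual script -- the server name is an assumption):

        # Walk the SSAS metadata via AMO and time how long it takes.
        [System.Reflection.Assembly]::LoadWithPartialName("Microsoft.AnalysisServices") | Out-Null
        $sw  = [System.Diagnostics.Stopwatch]::StartNew()
        $srv = New-Object Microsoft.AnalysisServices.Server
        $srv.Connect("localhost")
        foreach ($db in $srv.Databases) {
            $partitions = 0
            foreach ($cube in $db.Cubes) {
                foreach ($mg in $cube.MeasureGroups) { $partitions += $mg.Partitions.Count }
            }
            Write-Host ("{0}: {1} dimensions, {2} cubes, {3} partitions" -f $db.Name, $db.Dimensions.Count, $db.Cubes.Count, $partitions)
        }
        $srv.Disconnect()
        Write-Host ("Elapsed: {0:n1} seconds" -f $sw.Elapsed.TotalSeconds)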
    Wednesday, June 24, 2009 9:25 PM
  • Thanks for posting back Lee, that's really interesting. I would have thought that SSMS was just using AMO under the covers, but maybe it is doing some extra calls as well. It should be possible to trace this with Profiler to see if there are any extra DISCOVER calls; alternatively, you may have uncovered an issue with the Object Browser in SSMS.

    I don't know if you have seen http://powerssas.codeplex.com, but this is a PowerShell provider that I developed which allows you to "mount" an SSAS server like a drive and then navigate around as you would on a file system. It is a fairly straight mapping over the AMO object library, and you will see that it looks very similar to the metadata pane in SSMS. I am looking at doing another release soon to add some new cmdlets that I have been working on.
    http://geekswithblogs.net/darrengosbell - please mark correct answers
    Wednesday, June 24, 2009 10:02 PM
    Moderator
  • Hi,

    If you can, please try connecting to your SSAS 2008 server using SSMS 2005. In my experience it should be faster! Why, I do not know -- I would love for SSMS 2008 to open the databases faster!

    Tiago.
    Thursday, June 25, 2009 8:46 AM
  • Great idea on the SSMS 2005!  Then it does seem like something is wrong with SSMS 2008. 

    When I've watched Profiler, I don't see much activity or excessive Discovers. I originally thought it was a metadata problem because the Discover query in Profiler seemed to take a very long time to end, but maybe this is because either SSMS is making a large Discover query (I don't know that much about Discover yet), or SSMS is taking a long time to accept the returned data (not sure if it works like this -- similar to SQL Server's wait type for an external client).

    One bad thing I do notice is when I browse my cube from SSMS. Sometimes just the act of launching the browser will cause a massive amount of partition scanning and high CPU usage by SSMS before the browser renders (and before SSMS becomes responsive again). I don't understand this, as no data has even been selected yet, so why would it be reading all the partitions in the cube? It will sometimes take several minutes to launch the browser. Maybe it is trying to do some proactive local caching or something.

    Thanks Darren, I'm going to check out your powershell provider.
    Thursday, June 25, 2009 8:43 PM
  • > Great idea on the SSMS 2005! Then it does seem like something is wrong with SSMS 2008.

    > When I do this I get an error from SSMS 2005 saying that it can only connect to 2005 instances.

    > When I've watched Profiler, I don't see much activity or excessive Discovers. I originally thought it was a metadata problem because the Discover query in Profiler seemed to take a very long time to end, but maybe this is because either SSMS is making a large Discover query, or SSMS is taking a long time to accept the returned data.

    Yes, I see similar queries from both versions; DISCOVER_XML_METADATA is really the only call I saw in the brief tests I ran. So there could be an issue with the way the Object Browser is populated, but I cannot see it on my laptop as I only have a few small databases on it. I think the times you see in Profiler are purely server times; the resultset gets sent over as XML, and the time for the client (SSMS) to process it is not included.

    The first time I connect to SSAS from SSMS after rebooting my laptop it is slow, but subsequent connections are faster. This indicates to me that this metadata gets cached on the server, and I would guess that under memory pressure this cached information can get flushed out. So this could be an indication of possible memory resource issues (on x64 machines you also have to watch the file system cache, as that can chew up RAM). However, you would probably see similar issues when using AMO, as it has to make similar calls to the server under the covers (and I would expect that SSMS is using AMO internally anyway) -- so it would be interesting to see if AMO via PowerShell is consistently faster. That would maybe point to an issue in SSMS.

    > One bad thing I do notice is when I browse my cube from SSMS. Sometimes just the act of launching the browser will cause a massive amount of partition scanning and high CPU usage by SSMS before the browser renders (and before SSMS becomes responsive again). I don't understand this, as no data has even been selected yet, so why would it be reading all the partitions in the cube? Maybe it is trying to do some proactive local caching or something.

    The browser in SSMS issues a query like "SELECT FILTER(measures.members, Measures.CurrentMember.member_caption = Measures.CurrentMember.member_caption) ON COLUMNS FROM <cube>" when you start browsing, presumably to populate the members collection. I'm not sure why it does this (it also appears to issue a Discover on MDSCHEMA_MEASURES), but this is most likely what causes the partition scan.

    There should not be any caching on the client; it may be an attempt to warm the server cache, but it does not
    http://geekswithblogs.net/darrengosbell - please mark correct answers
    Friday, June 26, 2009 5:27 AM
    Moderator
  • > When I do this I get an error from SSMS 2005 saying that it can only connect to 2005 instances.

    Yes, you are right -- SSMS 2005 can connect to AS2005 but not to AS2008.

    Actually, I use SSMS 2008 and BIDS 2008 to work with AS2005, since we have not yet migrated the production cubes to AS2008.
    In this scenario, what I notice is that creating the XMLA script for an AS2005 cube in SSMS 2008 takes much more time than generating the same XMLA script in SSMS 2005.

    The scenario that you described is also visible in SSMS 2005 connecting to AS2005. That is, if you restart AS2005 on a server with several databases (> 10), then -- depending on the size/complexity of the databases -- when you open SSMS 2005 it will take some time to refresh the metadata. I did not test whether SSMS 2008 is slower than SSMS 2005 in such situations.

    There is also another issue with SSMS 2008 that I reported to MS through Connect, but I have not received any response yet.
    See http://social.msdn.microsoft.com/Forums/en-US/sqlanalysisservices/thread/33030c6f-2681-47df-9c08-adb4e6ee07f7 and https://connect.microsoft.com/SQLServer/feedback/ViewFeedback.aspx?FeedbackID=456691&wa=wsignin1.0.

    Tiago.
    Friday, June 26, 2009 9:12 AM
  • Hi Darren,

    I apologize if this is the wrong forum to ask, but I installed powerSSAS successfully and can see the files and folders it created. When I go to add the snap-in I get:

    Add-PSSnapin : Windows PowerShell snap-in powerSSAS is not installed on the machine.
    At line:1 char:13
    + add-pssnapin  <<<< powerSSAS

    Is there anything you'd recommend I check? Maybe the snap-in is not going into the right folder?

    SQL Server 2008 SP1 CU2 x64
    Windows 2003 Server
    PowerShell 1.0
    Friday, June 26, 2009 7:50 PM
  • Hi Lee, it is probably better to continue this discussion here http://powerssas.codeplex.com/Thread/View.aspx?ThreadId=60900 than on this forum.
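
    In case it helps anyone hitting the same message: it usually means the snap-in assembly has not been registered with PowerShell yet. A typical fix is to register it with InstallUtil and then re-check the registered snap-ins (the paths below are assumptions and depend on where powerSSAS was installed):

        # Register the snap-in with the 64-bit .NET 2.0 runtime, then verify and load it (example paths).
        & "$env:windir\Microsoft.NET\Framework64\v2.0.50727\InstallUtil.exe" "C:\Program Files\powerSSAS\powerSSAS.dll"
        Get-PSSnapin -Registered
        Add-PSSnapin powerSSAS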
    http://geekswithblogs.net/darrengosbell - please mark correct answers
    Monday, June 29, 2009 12:24 AM
    Moderator
  • In SSMS 2008, there's a problem scripting SSAS objects to a new window. As you've noticed, it takes 10x too long. If you use SSMS to script to a file instead, there is no such delay; then you can open the .xmla file without any delay. I don't have time to search Connect to see if this has been reported yet, but it might be worthwhile for someone to.

    Hope that helps. To summarize, it's the fault of SSMS, not SSAS, and I think you'll see that it's ssms.exe whose CPU is busy for so long, not msmdsrv.exe. So all the discussion of memory settings and such was off topic, since it is the fault of SSMS. Agree?
    http://artisconsulting.com/Blog/GregGalloway
    Monday, June 29, 2009 3:35 PM
    Moderator
  • We're moving as much of our maintenance work as possible to PowerShell. The security role creation and management looks very promising so far. SSMS is not a viable product to turn over to the client for this and many other administrative operations.
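
    For anyone curious what that looks like, here is a minimal AMO sketch of creating a role from PowerShell (not our actual script -- the server, database, role, and member names are placeholder assumptions, and permissions would still need to be granted separately):

        # Create a database role and add a Windows group to it via AMO.
        [System.Reflection.Assembly]::LoadWithPartialName("Microsoft.AnalysisServices") | Out-Null
        $srv = New-Object Microsoft.AnalysisServices.Server
        $srv.Connect("localhost")
        $db   = $srv.Databases.GetByName("MyCubeDatabase")
        $role = $db.Roles.Add("Report Readers")                # creates the role locally
        $role.Members.Add((New-Object Microsoft.AnalysisServices.RoleMember("DOMAIN\ReportUsers"))) | Out-Null
        $role.Update()                                         # sends just this role to the server
        $srv.Disconnect()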

    Unfortunately I have to use SSMS to QA the PowerShell scripts. It really is agony. There's no load on the system and no other users, yet each mouse click on a role still requires minutes of wait (no response) time. Click on a dimension to set some attribute security, wait 3-4 minutes for SSMS to become responsive, do that three more times (four secured dimensions), then click OK to finish and wait 5-10 minutes for SSMS to finish writing the role. Voilà -- after just 15 minutes the role is finally created.

    It is surprising Microsoft hasn't corrected this, as SSMS is very close to useless for some operations. Something is clearly and dreadfully wrong with how it processes the cube metadata.
    Monday, July 06, 2009 5:39 PM
  • Hi Lee,

    Just another idea: open the database in BIDS ("File -> Open -> Analysis Services Database...") and see whether it is faster than SSMS. To finish, you must do a Save to update the cube directly on the server.

    Tiago.
    Monday, July 06, 2009 5:44 PM
  • If it's just for QA, you could also investigate importing the database back into a BIDS project (File -> New Project -> Import Analysis Services 10.0 Database) so that you could check the roles.

    I also played a few years ago with writing reports to list roles using one of the stored procedures from http://asstoredprocedures.codeplex.com . There is a screenshot of the roles report here: http://asstoredprocedures.codeplex.com/Wiki/View.aspx?title=Discover%20Reports&referringTitle=Home -- you might be able to adapt this technique to your needs.
    http://geekswithblogs.net/darrengosbell - please mark correct answers
    Monday, July 06, 2009 11:41 PM
    Moderator
  • More statistics.
    (1) Fresh restart
    (2) Wait 15 minutes just in case
    (3) Confirm there is absolutely no load or connections on the cube database other than your SSMS connection
    (4) Right-click on your cube database and select properties
    Just a short 30 minutes later, you get your database properties dialog with all of about 10 properties on it! Wow, that really makes management of your cube very efficient, doesn't it? 30 minutes per mouse click!

    I don't get a much better response in this case with PowerShell/AMO. I see the Discover queries, but they take minutes to return. There is something dreadfully wrong with metadata queries in SSAS. It's a small enough dataset that it should be cached all the time and very fast to retrieve. I'm not sure Microsoft ever considered that there would be people who need to manage the application after it is built. It's clear that metadata access is an Achilles' heel of SSAS, similar to how BIDS is the Achilles' heel of SSIS.


    Here's the exact discovery query:
    <RestrictionList xmlns="urn:schemas-microsoft-com:xml-analysis">
      <DatabaseID>Lynx_20090730a</DatabaseID>
      <ObjectExpansion>ExpandObject</ObjectExpansion>
    </RestrictionList>

    <PropertyList xmlns="urn:schemas-microsoft-com:xml-analysis">
      <Timeout>0</Timeout>
      <SspropInitAppName>Microsoft SQL Server Management Studio</SspropInitAppName>
      <LocaleIdentifier>1033</LocaleIdentifier>
      <ClientProcessID>4656</ClientProcessID>
    </PropertyList>

    Duration: 2054175

    That's over 34 minutes to retrieve 4 properties under no system load with no other connections to the DB!!!

    Sunday, August 02, 2009 9:57 PM
  • Other observations.

    (1) Simply opening an XMLA query window in SSMS after a fresh restart, or just after processing, will result in all cube partitions being massively scanned. This is amazing given that an XMLA window has no navigation pane, so it does not even need to look at metadata.

    (2) Everything is fast when the cube db is in an unprocessed state. So the slowness associated with SSMS accessing metadata appears to have nothing to do with the metadata itself. It is because SSMS or the cube's query engine stupidly insists on reading significant amounts of data in the cube first, before reporting back something as simple as a database ID, role properties, or whatever.

    Sunday, August 23, 2009 3:57 PM
  • Hi Lee, I can't reproduce this sort of behaviour. Opening an XMLA window on my SSAS 2008 SP1 instance results in just two Discover queries: one to get the ProviderVersion and another to get the Catalog, both of which are small, fast queries. I tried a few different approaches to try to force partition data to be read, but have been unable to find a means of replicating this behaviour.

    Do you have any proactive caching, or any of the automatic MOLAP settings configured for any of your objects?

    I think your best course of action may be to ring Microsoft Product Support. I do not think that this behaviour is normal.
    http://geekswithblogs.net/darrengosbell - please mark correct answers
    Monday, August 24, 2009 1:59 AM
    Moderator
  • Lee, have you managed to solve your problem?

    I've been facing the same problems, but with SSAS 2005... opening a cube causes the metadata to load extremely slowly... I don't know why, and cannot find any solution :(
    Friday, August 28, 2009 7:22 PM
  • Hmm, Lee, how many partitions do you have in your DB?

    I have just found an interesting article:
    http://sqlblogcasts.com/blogs/acalvett/archive/2009/08/21/the-cost-of-ssas-metadata-management.aspx

    In my DB I have over 700 partitions, around 20 dimensions, and 15 roles with around 100 users. I will try to merge the partitions and reduce the number of roles.

    Maybe this will help; I will report back!

    EDIT:
    After merging a lot of partitions (now I have around 350, before I had over 700) the metadata speed improved a bit. The Cube list in SSMS opens a bit faster.
    But I did also two things:

    • I had two cubes; they were QA cubes on which I tested some solutions. One had errors in its MDX cube script, the other had errors in its dimension keys. After removing these cubes I saw a performance boost. I don't know why...
    • I changed the configuration variables in msmdsrv.ini mentioned here: http://sqlblogcasts.com/blogs/acalvett/archive/2009/04/05/analysis-server-appears-to-hang.aspx#comments I think these settings gave me a very big performance boost; it's a shame they weren't documented in the SSAS 2005 performance guide :(

    Regards,
    Adam

    Saturday, August 29, 2009 5:40 PM
  • Hi Adam,

    We have 260+ partitions and about 100 roles (though eventually we'll have as many roles as partitions). We could definitely be smarter and collapse maybe 100 of the small partitions into one.

    It really is surprising, however, that MS has not paid attention to this. There simply is not that much "data size" to the metadata, so why does it take so long? We're talking maybe megabytes of data, and probably only kilobytes! It is clearly a highly inefficient Discover process. I don't know -- maybe it re-reads all the same information thousands of times over in a runaway loop, or for some reason reads the actual data stored (cube data, not metadata) before it returns it to the requestor. As I mentioned before, on a fresh restart, simply querying for a cube database's properties, of which there might be a dozen data points (e.g. database name, database ID, etc...), takes 15-30 minutes by my timing. That is terrible performance! PowerShell helps, but only because you can issue a whole set of commands at once and not have to wait in between. I still see delays with PowerShell; they are just less painful than trying to use SSMS and waiting 10 minutes between every step.

    Per your first link, another thing I've noticed when we do weekend full re-processing is that an unprocessed cube has MUCH faster metadata. When the cube database is empty (unprocessed), the metadata is almost instantaneous, as one would expect it to be all the time (like SQL Server RDBMS metadata is in SSMS). This makes no sense to me. Going back to my simple test above, querying database properties, why would that Discover query be any different for a processed cube versus an unprocessed cube, except that the state would be different?

    Per your second link, we changed those properties long ago and they helped immensely by giving small queries higher priority, so they probably gave you a similar boost. My understanding is that the boost comes at the price that larger queries may take longer. We are very careful about making sure MDX is fully optimized before allowing it to run in a report against our production system, so we haven't had side effects.

    Per your point about removing the QA cubes, I have seen how a bad cube can have adverse effects. We had a situation with multiple cube databases where the IT team rebooted the server during a cube process and one of the cubes was corrupted. After the restart, we could not use SSMS for anything SSAS-related on that box. I can't remember the exact "error opening file..." error, but even trying to shut down the service from SSMS would throw that error and not allow us to do anything. To work around it, I had to shut down SSAS from the Services console, delete the offending database and XML file, and then restart. So one bad cube database can completely cripple SSMS!

    My best advice for you at the moment is to use PowerShell and/or XMLA scripts wherever you can. PowerShell can be more cumbersome when you just want to see a property page (although I keep a "header" script handy that initializes all the basic objects), but it is far easier than trying to use SSMS for things like creating and managing roles. I use SQL-generated XMLA scripts to create new partitions, and for all processing, backups and restores, database renaming, etc.
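
    Running those generated XMLA files is also easy to script; a minimal sketch via AMO's Server.Execute (the server name and file path are placeholders):

        # Send a saved, SQL-generated XMLA batch to the server and echo any result messages.
        [System.Reflection.Assembly]::LoadWithPartialName("Microsoft.AnalysisServices") | Out-Null
        $srv = New-Object Microsoft.AnalysisServices.Server
        $srv.Connect("localhost")
        $xmla    = [System.IO.File]::ReadAllText("C:\Scripts\CreatePartitions.xmla")   # example file
        $results = $srv.Execute($xmla)
        foreach ($result in $results) {
            foreach ($message in $result.Messages) { Write-Host $message.Description }
        }
        $srv.Disconnect()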

    Wednesday, September 09, 2009 5:20 PM
  • The most common reason for this disconnect between metadata performance and perceived metadata size is the aggregations associated with each partition. Most people don't realize how aggregations can add up in terms of metadata size and cost. And the interesting thing is that an unprocessed partition has no aggregations built, and therefore doesn't have to load the aggregations...

    I would advise you to look at how many aggregations you have for your partitions and consider reducing them. Loading a partition requires the server to load all the aggregation metadata for that partition; with 250 partitions times, say, 100 aggregations per partition, there are 25,000 objects to load -- which is non-trivial when you consider that these are XML files on disk!
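
    A quick way to see how those numbers add up is to walk the aggregation designs with AMO -- a minimal PowerShell sketch (the server and database names are assumptions):

        # Report designed aggregations and partition counts per measure group.
        [System.Reflection.Assembly]::LoadWithPartialName("Microsoft.AnalysisServices") | Out-Null
        $srv = New-Object Microsoft.AnalysisServices.Server
        $srv.Connect("localhost")
        $db = $srv.Databases.GetByName("MyDatabase")
        foreach ($cube in $db.Cubes) {
            foreach ($mg in $cube.MeasureGroups) {
                $aggs = 0
                foreach ($design in $mg.AggregationDesigns) { $aggs += $design.Aggregations.Count }
                # Each processed partition that uses a design carries its own copy of that aggregation metadata on disk.
                Write-Host ("{0}.{1}: {2} designed aggregations, {3} partitions" -f $cube.Name, $mg.Name, $aggs, $mg.Partitions.Count)
            }
        }
        $srv.Disconnect()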

    HTH,
    Akshai

    -- This posting is provided "AS IS" with no warranties, and confers no rights
    Wednesday, September 09, 2009 8:36 PM
  • That is an interesting point. We are using minimal aggregations because they are almost useless given the extreme number of attributes we have, but it would still add up. I get the point: if you want one tiny piece of metadata, it might be reading ALL of the metadata for the entire cube db, maybe as some kind of proactive read-ahead caching or something. And it does seem to be cache-like: it will be faster after the initial long wait, but after the system is hit with a couple hundred queries, the next time you want metadata it will take a long time again, as if the metadata had been cycled out of cache.

    It still doesn't make sense from a usability perspective. If I just want to see the properties screen of my cube database, with fewer than a dozen properties on it, why would it need to read all the aggregation metadata?

    Wednesday, September 09, 2009 10:06 PM
  • Yes, I do agree -- the reason for the perf issue is actually a very unfortunate side-effect of the implementation of the EstimatedSize property on metadata objects. The implementation requires iterating over all the sub-directories and collecting sizes of the files in them.

    It's something we weren't aware of when we shipped AS 2005, and wanted to address in AS 2008 but couldn't get to in time -- we couldn't just eliminate (or greatly degrade) the property because there are customers who depend on it being pretty accurate. A UI app like Explorer can "return quickly", but a server app has to return a fairly precise value for this to be useful -- and we can't easily change it to be an on-demand property without breaking apps.

    Anyway, the end result is that fetching properties for a database is a lot more expensive than you would expect due to this. Once the files are in the OS cache, it's a lot quicker...
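
    You can get a rough feel for that cost directly from the file system; the sketch below simply enumerates the data folder much as the property's implementation effectively does (the path is an assumption for a default 2008 instance):

        # Time a full enumeration of the SSAS data folder -- roughly the work the EstimatedSize scan implies.
        $dataDir = "C:\Program Files\Microsoft SQL Server\MSAS10.MSSQLSERVER\OLAP\Data"   # adjust to your instance
        $elapsed = Measure-Command {
            $files = @(Get-ChildItem -Path $dataDir -Recurse -Force | Where-Object { -not $_.PSIsContainer })
            $bytes = ($files | Measure-Object -Property Length -Sum).Sum
            Write-Host ("{0:n0} files, {1:n1} MB" -f $files.Count, ($bytes / 1MB))
        }
        Write-Host ("Cold scan took {0:n1} seconds" -f $elapsed.TotalSeconds)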

    Hope this helps at least understand the issue even if it doesn't really address it for you. There is already a bug on this, and I'll try to raise the priority on it.

    Thanks,
    Akshai

    -- This posting is provided "AS IS" with no warranties, and confers no rights
    Thursday, September 10, 2009 2:37 AM
  • We've recently gone through some scalability testing and this is exactly the issue we faced. On large cubes, it is the number of files generated in the file system (potentially millions) that was the bottleneck during metadata read operations.

    Apparently the operating system's file caching mechanisms are much improved in Windows Server 2008, so that's our next stage of testing.
    Adam Tappis. (MCSD VB6 & .NET, MCDBA SQL 2000)
    Thursday, September 10, 2009 4:27 AM
  • Hmm, that is a very interesting point. I will test it over the weekend on our quality systems.

    I have quite a lot of aggregations for some cubes. Reducing the partition count helped, so maybe the aggregations are the bottleneck.

    I will return to you after the weekend.
    Friday, September 11, 2009 8:14 AM
  • It's a combination of both: the number of partitions and the number of aggregations both affect the number of files generated.
    Adam Tappis. (MCSD VB6 & .NET, MCDBA SQL 2000)
    Friday, September 11, 2009 8:59 AM
  • I used BIDS Helper to determine the aggregation counts. Some of the cubes had around 50 aggregations designed. I reduced the aggregation counts and merged more partitions.

    Now I have 24 cubes, each cube has no more than 10 partitions and 5 aggregations.

    I will process some of this during the weekend, and report back.
    Friday, September 11, 2009 11:03 AM
  • Akshai, that is very helpful and would explain the behavior. So there is the huge delay initially and then the metadata is fast, but after a while, if the system is busy servicing queries, it will be slow again -- probably because the OS file cache has cycled out the metadata files to make room for store files and such.

    Thanks for the explanation!
    Lee

    Friday, September 11, 2009 7:04 PM
  • If the partitions were stored in different storage locations, then the estimated size property would not give us the right number. In that case, the metadata would be fast, because it is not going to iterate over all the sub-directories and collect the sizes from the different LUNs.

    Just a thought !

    -Shah
    imran shah
    Saturday, September 12, 2009 12:56 AM
  • Set up a null default member on the date attribute. You should be fine.

    -Shah
    imran shah
    Saturday, September 12, 2009 1:01 AM
  • Yes, moving partitions to custom storage locations outside the data folder would help reduce the cost -- but result in a poor estimate of size. It's also more difficult to maintain for backup/restore, etc. As long as you accept those consequences, it's a good idea.

    Thanks,
    Akshai

    -- This posting is provided "AS IS" with no warranties, and confers no rights
    Saturday, September 12, 2009 1:49 AM
  • I created a new quality database with 24 cubes. I created bigger partitions, with a maximum partition size of 1 GB -- in total no more than 150 partitions. I removed all aggregations from the cubes.

    After processing the whole database, metadata speed is only a bit slower than before processing. But it works more than fine.

    Then I designed aggregations for every cube (around 10 aggregations per cube) and processed every cube with the 'Process Index' option. After processing, metadata was slower, but it was still acceptable to use SSMS to work with the database.

    I tried designing more aggregations (around 30-40 per cube, with aggregation data file sizes around 30% of the fact data). After processing, metadata speed became much slower.

    I tested what you wrote above: metadata speed depends on the number of cubes, the number of partitions, and the number of aggregations.
    Initially I had over 700 partitions across 24 cubes and a lot of aggregations, and using SSMS was a nightmare. After reducing the number of partitions and aggregations, it started to work just fine.

    Another thing that I noticed is that SSAS works quite fast when the partition size is around 1 GB. Reducing the partition count didn't hurt performance.

    Tuesday, September 15, 2009 10:33 AM
  • +

    Export the meta data of the SSAS to a SQL table using AMO. Define views on the top of the table. Query the views for the meta data. It is much better then right click to see the properties in SSMS.
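
    A minimal sketch of that idea, dumping partition-level metadata to a CSV that can then be bulk-loaded into a SQL table and exposed through views (names and paths are assumptions):

        # Dump partition metadata via AMO to a CSV, ready to load into SQL.
        [System.Reflection.Assembly]::LoadWithPartialName("Microsoft.AnalysisServices") | Out-Null
        $srv = New-Object Microsoft.AnalysisServices.Server
        $srv.Connect("localhost")
        $lines = @('Database,Cube,MeasureGroup,Partition,State,LastProcessed')
        foreach ($db in $srv.Databases) {
            foreach ($cube in $db.Cubes) {
                foreach ($mg in $cube.MeasureGroups) {
                    foreach ($p in $mg.Partitions) {
                        $lines += ('"{0}","{1}","{2}","{3}","{4}","{5}"' -f $db.Name, $cube.Name, $mg.Name, $p.Name, $p.State, $p.LastProcessed)
                    }
                }
            }
        }
        $lines | Out-File -FilePath "C:\Temp\ssas_partitions.csv" -Encoding ASCII
        $srv.Disconnect()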

    Adding my 2 cents.

    -Imran.
    imran shah
    Tuesday, September 15, 2009 1:02 PM
  • > Another thing that I noticed is that SSAS works quite fast when the partition size is around 1 GB. Reducing the partition count didn't hurt performance.

    You have to be careful about the assumptions you're making with that statement. It might be fine in your scenario, but not necessarily in all cases. The performance you get is very much dependent on the complexity of your cubes, the profile of your queries, and the partitioning strategy you have chosen -- not just on the size of each partition.

    Adam Tappis. (MCSD VB6 & .NET, MCDBA SQL 2000)
    Tuesday, September 15, 2009 4:11 PM
  • Hi Akshai,

    I know we don't speak for the whole SSAS community but we never use nor even look at the estimated size property and would GLADLY trade it off to be able to use SSMS more effectively.  In our case, this property has rendered SSMS into the range of extremely painful at best and useless at worst.  The only time I've analyzed the size of SSAS files is when we've run into that 4Gb limitation on *store files.  And then I am looking at the file sizes in windows explorer and not looking at the property in SSMS.

    I would recommend the following possibilities, in order of preference:
    (1) Include the property but leave it blank, with a button to calculate it if you want it (kind of like how SSMS for SQL Server lets you OPTIONALLY get index fragmentation -- it doesn't try to analyze fragmentation automatically and make you wait just because you wanted to see index properties). With this approach, if you do want to know the size of an object, you can get the size just for that object and don't have to wait for it to get the size of every object in the whole cube database.

    (2) Don't include the property at all. In our case, we'd never miss it.

    (3) Cache the size results somehow so they don't have to be regenerated every time. It sounds like the estimate is regenerated every time; it's just that sometimes all the files are in the file system cache and it goes much faster than re-reading from disk. If the file sizes and dates haven't changed, it should not require a recalculation. But this approach would be more complicated and might still use too many resources.

    Thanks,
    Lee
    Monday, September 21, 2009 1:39 PM
  • Hi Lee,

    Thanks for the comments -- we absolutely appreciate the issues this causes for you and other customers, and we will do what we can to improve things going forward.

    As I mentioned before, if we had realized how bad this was going to get with such large models (and an uncached file system), we probably would not have included the property by default. As it stands, though, what's done is done, and we have to try to make things better without implementing changes that break the other customers who may have started to rely on the property...

    Your suggestions are all valid -- each one has its own issues. The first one is a pretty large change across multiple components (server, AMO, SSMS, BIDS) and causes backward-compatibility problems when an older SSMS connects to a newer server. The second one breaks customers who are using it. The third is probably the direction we would go -- essentially, do something to make the property perform better instead of trying to eliminate it. As I said, we'll do what we can to improve this in the future and hopefully make this an insignificant issue... We're perfectly willing to compromise on the accuracy -- it is, after all, an estimate -- but the expectation is for it to be a decent ballpark number for the customers who do want to use it.

    An interesting workaround (although, as I mentioned before, it has other caveats) is to set the Storage Location of the partitions to a custom location. It's a bit of a hack: it relies on the fact that estimated size is implemented today by scanning the sub-directories of the database folder, so if partitions are not under the database folder they won't get counted and the scan will therefore be a lot quicker. (A rough AMO sketch of this workaround follows this post.)

    Thanks,
    Akshai

    -- This posting is provided "AS IS" with no warranties, and confers no rights
    Tuesday, September 22, 2009 6:59 PM
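
    A minimal sketch of that workaround with AMO, under these assumptions: the Microsoft.AnalysisServices.dll assembly is referenced; "localhost", "MySsasDb" and the folder "E:\SsasPartitions" are placeholders; and the target folder is writable by the Analysis Services service account (it may also need to be covered by the server's AllowedBrowsingFolders property). This is not a definitive implementation -- the backup/restore and synchronize caveats above still apply.

    // Sketch: move every partition's storage to a folder outside the database's data
    // folder so the EstimatedSize scan of the database directory no longer touches
    // the partition files. All names and paths are placeholders.
    using Microsoft.AnalysisServices;

    class MovePartitionStorage
    {
        static void Main()
        {
            Server server = new Server();
            server.Connect("Data Source=localhost");               // hypothetical server
            Database db = server.Databases.FindByName("MySsasDb"); // hypothetical database

            foreach (Cube cube in db.Cubes)
                foreach (MeasureGroup mg in cube.MeasureGroups)
                    foreach (Partition p in mg.Partitions)
                        p.StorageLocation = @"E:\SsasPartitions";  // hypothetical custom location

            // Send the altered metadata to the server; the partitions must then be
            // reprocessed before the new storage location takes effect.
            db.Update(UpdateOptions.ExpandFull);
            server.Disconnect();
        }
    }
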
  • Hi Akshai,

    I'll think about that workaround. It seems like it would break the ability to process the cube, but I might not be understanding it and need to think it over more.

    As far as customers using the estimated size property, what is a valid use for it?  It does not seem like actionable information.  Performance tuning perhaps?

    Thanks,
    Lee
    Thursday, September 24, 2009 6:53 PM
  • Hi Lee,

    The custom storage location will not interfere with the ability to process the cube -- it's just that the data for the partition is now in a different location. It does affect Backup/Restore (you might need to re-specify the storage locations when you restore) and Synchronize -- essentially, situations where the storage location needs to be re-defined.

    It's not necessarily actionable information, but it can be used for the same reason you use Explorer to show you the size of a folder... The issue isn't even so much about how useful it is (as I've mentioned before, I tend to agree with you that we shouldn't have added it in the first place), but rather that we don't have a good enough sense of how many customers would be impacted by cutting it. If we were to deprecate it and get rid of it, we would have to wait two releases to do so.

    Thanks,
    Akshai

    -- This posting is provided "AS IS" with no warranties, and confers no rights
    Thursday, September 24, 2009 7:03 PM
  • Why not deprecate it and default its value to 0 or empty? At the same time, add a method to the AMO object model, e.g. CalculateEstimatedSize(), which anybody who is using the EstimatedSize property can migrate to.

    As has been said, it's not actionable information. At best it will appear on some audit report somewhere. At worst, someone is using it for capacity planning or to manage hosting, which they shouldn't be doing in the first place.

    Can't you canvass your customers about this somewhere? Put it to a vote? What about the Partner Advisory Council?

    Adam Tappis. (MCSD VB6 & .NET, MCDBA SQL 2000)
    Thursday, September 24, 2009 7:22 PM
  • We're not supposed to break functionality that customers might be using. If we started returning EstimatedSize=0 and somebody depended on it (even if only for auditing), then we'd have to restore the functionality -- therefore the guideline is to wait two releases before a deprecated feature can actually go away.

    To you and to me, the estimate might be non-actionable -- but for somebody who has decided to act based on this value, returning 0 would be a problem.

    Canvassing customers will only tell us what we already know (that they want this to be fast) -- not whether any customer out there has built a dependency on this functionality.

    I may be coming off as very rigid here -- but trust me, I'm doing what I can to make this better. We just have to work within a system that has been designed to protect our customers from arbitrary breaking changes.

    Thanks,
    Akshai

    -- This posting is provided "AS IS" with no warranties, and confers no rights
    Thursday, September 24, 2009 7:40 PM
  • I appreciate that, but it just "seems" like a trivial thing that doesn't produce exact values anyway and is having a very big impact on usability. You are negatively affecting customers by keeping it there. I suspect this is impacting more people by having it there than it would if you removed it. So it's a bit of a no-win situation, and managing it is about affecting as few people as possible.

    In terms of improving the performance, I can't see why this property can't be calculated on demand internally. If Management Studio is using the AMO object model, and it's AMO internally that performs the calculation of the property, then it shouldn't do so as part of the constructor; it should do so when the property is first accessed. Management Studio should have no need to access it until a user brings up the property pages, and even then it could be hidden behind a button.

    But then this is all speculation -- I have no idea how this works internally or what impact a change would have, which you obviously have a much better idea about. Just some thoughts really.
    Adam Tappis. (MCSD VB6 & .NET, MCDBA SQL 2000)
    Thursday, September 24, 2009 7:55 PM
  • "You are negatively affecting customers by keeping it there. I suspect this is impacting more people by having it there than it would if you removed it."
    You're probably correct, but we're better off improving the perf without removing the property than removing it and impacting some customers.

    "If management studio is using the AMO object model and it's this intenally that performs the calculation of the property,..."
    The problem lies with the granularity of the APIs supported by the engine and then used by AMO/SSMS. Asking for an object's metadata returns all its properties, including inefficient ones like EstimatedSize. I've already commented on this general approach in an earlier thread (Lee already proposed it) so I won't go into it any further.

    Thanks,
    Akshai

    -- This posting is provided "AS IS" with no warranties, and confers no rights
    Thursday, September 24, 2009 8:52 PM
  • I'm wondering whether this needs to be implemented as a scan operation at all. The only things that can change the size are process or drop operations. So if the processing engine kept track of how many bytes it just wrote, it could write this information out to a separate file -- effectively caching the estimated sizes. Alternatively, one of these operations could kick off a background thread to update the estimated sizes.

    I'm guessing this is a vast oversimplification, and consideration would need to be given to proactive caching and to not overly impacting the time it takes to commit a processing operation. But it would seem that the main issue is having to potentially re-scan the disk to get this information once it gets flushed out of RAM.
    http://geekswithblogs.net/darrengosbell - please mark correct answers
    Thursday, September 24, 2009 9:11 PM
    Moderator
  • That's right Darren -- and that's probably the general direction of the approach we would take to improve the perf (although the separate file is avoidable). Process/Create/Alter/Drop can all have an impact, but before we commit a large object we can get its estimated size and save it in the metadata.

    The file metadata for the modified object is probably already in the file system cache at that time, so it should be a lot quicker than on a cold system and shouldn't significantly impact performance of processing.

    -- This posting is provided "AS IS" with no warranties, and confers no rights
    Thursday, September 24, 2009 9:24 PM
  • Hi Akshai,

    I think the major suggestion to take back to the development team at Microsoft is that MANAGING the SSAS application is just as important as creating the SSAS application. Analysis Services has now been around since the late '90s; that is plenty of time to develop effective management tools for a database system. But as of today the management tools are woefully lacking. There is no Activity Monitor except the buggy CodePlex download. Cancelling long-running queries or processing doesn't work half of the time (the system will not cancel them no matter how many times you issue a cancel), requiring a service restart on a production application during prime time. And a pretty much useless feature, estimating the size of objects, clogs up everything you try to do with the SSMS interface.

    I'm not directing this at you but at the Microsoft product team in general that is responsible for delivering this product.  You have been very helpful by the way.  Without your input, we'd still be wondering why it behaves the way it does.  Thanks!
    Tuesday, October 20, 2009 6:51 PM
  • As an update to this discussion, we did implement an improvement to the way this EstimatedSize property works in the upcoming SQL Server 2008 R2 release -- the latest CTP includes this change. It would be great to hear from any of you who can try this out to see what the impact is on the performance in real-world situations.

    This change does require you to update all your partitions and dimensions in some way -- the server will persist the estimated size of those objects at that point and will not require scanning their file sizes every time you discover metadata. ProcessUpdate on the dimensions and ProcessClearIndexes+ProcessIndexes on the partitions should do the trick (a rough AMO sketch follows this post). Without updating the dimension/partition objects, things will still work, but without the performance gain...

    Additional note: due to these types of changes, you may not be able to take a 2008 R2 database downlevel to a 2008 server.

    As most of you have probably heard, AS 2008 R2 is primarily targeted at the self-service user with the new PowerPivot solution -- but we've tried to tackle a few of the pain points for our existing customers along the way. Some examples:
    - This discovery of metadata issue
    - Processing of a single (typically small) partition in the context of a very large number of metadata objects (e.g. 1000s of partitions) should now be significantly quicker. The algorithm that analyzes dependencies was optimized for this scenario.
    - Also, R2 now supports calculated members in subselects -- and the new Excel 2010 pivot tables will let you make use of that feature.

    Lee, thanks for the feedback in general -- it's very valuable to the team.

    Thanks,
    Akshai


    -- This posting is provided "AS IS" with no warranties, and confers no rights
    Sunday, February 07, 2010 10:04 PM
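
    A minimal sketch of that "touch every dimension and partition once" step using AMO, assuming the 2008 R2 version of Microsoft.AnalysisServices.dll and placeholder server/database names; on a large database this is a maintenance-window operation, since it rebuilds indexes on every partition.

    // Sketch: ProcessUpdate every dimension and rebuild indexes on every partition so
    // the R2 server persists their estimated sizes. Names below are placeholders.
    using Microsoft.AnalysisServices;

    class PersistEstimatedSizes
    {
        static void Main()
        {
            Server server = new Server();
            server.Connect("Data Source=localhost");               // hypothetical server
            Database db = server.Databases.FindByName("MySsasDb"); // hypothetical database

            // ProcessUpdate each database dimension.
            foreach (Dimension dim in db.Dimensions)
                dim.Process(ProcessType.ProcessUpdate);

            // Clear and rebuild indexes on every partition.
            foreach (Cube cube in db.Cubes)
                foreach (MeasureGroup mg in cube.MeasureGroups)
                    foreach (Partition p in mg.Partitions)
                    {
                        p.Process(ProcessType.ProcessClearIndexes);
                        p.Process(ProcessType.ProcessIndexes);
                    }

            server.Disconnect();
        }
    }
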
  • That's great news Akshai.  I'll be taking a closer look at R2 shortly as I am interested in the PowerPivot Solution.
    Tuesday, February 23, 2010 6:27 PM
  • Hello, I am having this problem with SSAS 2008. Also, the processing time when I ProcessUpdate dimensions has increased like hell... Why does it behave this way? I have moved from a monthly partitioning design to a daily partitioning design, which greatly increased the number of partitions: it was around 100 partitions and now it's over 3000... Can someone help me? Is this solved in R2? Thank you
    Luis Simões
    Thursday, July 14, 2011 4:33 PM
  • The performance of ProcessUpdate is meant to be better in R2 when processing a cube with a lot of partitions. Whenever you do a ProcessUpdate, SSAS has to check all of the dependent objects (in your case, all 3000 partitions), and this checking has been improved in R2.

    But do you really need that many partitions? Have you considered a hybrid approach, such as daily partitions for the current month and monthly partitions for data older than that? (There is a rough sketch of this after this post.)


    http://darren.gosbell.com - please mark correct answers
    Friday, July 15, 2011 1:18 AM
    Moderator
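
    A rough sketch of that hybrid layout using AMO, assuming Microsoft.AnalysisServices.dll is referenced. The server, database, cube, measure group, data source ID, fact table and DateKey column are all hypothetical placeholders, and the WHERE clauses would need to match your own fact schema; this only illustrates the shape of the approach, not a production partition manager.

    // Sketch: daily partitions for the current month, one monthly partition per month
    // for the previous 12 months. All object names and the fact query are placeholders.
    using System;
    using Microsoft.AnalysisServices;

    class HybridPartitions
    {
        static void AddPartition(MeasureGroup mg, string name, string whereClause)
        {
            Partition p = mg.Partitions.Add(name);
            // Bind the partition to a slice of the fact table with a query binding.
            p.Source = new QueryBinding("MyDataSource",              // hypothetical data source ID
                "SELECT * FROM dbo.FactSales WHERE " + whereClause); // hypothetical fact table
            p.StorageMode = mg.StorageMode;
        }

        static void Main()
        {
            Server server = new Server();
            server.Connect("Data Source=localhost");                 // hypothetical server
            MeasureGroup mg = server.Databases.FindByName("MySsasDb") // hypothetical names
                .Cubes.FindByName("Sales")
                .MeasureGroups.FindByName("Fact Sales");

            DateTime monthStart = new DateTime(DateTime.Today.Year, DateTime.Today.Month, 1);

            // Daily partitions for the current month only.
            for (DateTime d = monthStart; d <= DateTime.Today; d = d.AddDays(1))
                AddPartition(mg, "Sales " + d.ToString("yyyyMMdd"),
                    "DateKey = " + d.ToString("yyyyMMdd"));

            // One partition per month for older data (here: the previous 12 months).
            for (int i = 1; i <= 12; i++)
            {
                DateTime m = monthStart.AddMonths(-i);
                AddPartition(mg, "Sales " + m.ToString("yyyyMM"),
                    "DateKey BETWEEN " + m.ToString("yyyyMMdd") +
                    " AND " + m.AddMonths(1).AddDays(-1).ToString("yyyyMMdd"));
            }

            // Create the new partitions on the server; they still need to be processed.
            mg.Update(UpdateOptions.ExpandFull);
            server.Disconnect();
        }
    }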