Proactive Cache Connections

  • Question

  • I have a running theory based on observation, and I am looking for some opinions on the subject. Namely, that when a cube is configured to use Proactive Caching and users establish a connection to the cube from a client app, be that Excel or SSRS, the client actually connects to the cache, not to the cube directly.

    Is that correct?

    Here's why I am asking. I had a fact table that was set up to use proactive caching. The configuration used the setting "Update the cache when the data changes". It also had the option to "Bring online immediately". With that setup, it also used a SQL Server tracking table for notifications.

    Here's the observation: we had some SSIS packages that performed destructive loads of some of the fact tables that were configured as described above. The first thing the SSIS package did was truncate the fact table. Then it would run the source query and ultimately load the fact table.

    While that process was running, users would complain that they lost visibility of the data in those particular measure groups while actively browsing the cube with Excel (or running an SSRS report).

    My theory is that the clients connected to the cache. When the ETL ran, it truncated the table, which triggered the cache to update...with no data in the fact table.

    Am I correct in this logic?
    Tuesday, April 2, 2013 6:30 PM

Answers

  • I have a running theory based on observation, and I am looking for some opinions on the subject. Namely, that when a cube is configured to use Proactive Caching and users establish a connection to the cube from a client app, be that Excel or SSRS, the client actually connects to the cache, not to the cube directly.

    Is that correct?

    No, the cache is at the query level. The users still connect directly to the cube, but if they run the same query twice, the second query will be answered from cache. Depending on your cube structure and security roles it may be possible to share the cache across multiple users or even across similar queries, but it is not all of the data that gets cached: the cache only gets populated as the data is queried, and cache entries will be aged out if the system is under memory pressure.

    Here's why I am asking. I had a fact table that was set up to use proactive caching. The configuration used the setting "Update the cache when the data changes". It also had the option to "Bring online immediately". With that setup, it also used a SQL Server tracking table for notifications.

    Here's the observation: we had some SSIS packages that performed destructive loads of some of the fact tables that were configured as described above. The first thing the SSIS package did was truncate the fact table. Then it would run the source query and ultimately load the fact table.

    While that process was running, users would complain that they lost visibility of the data in those particular measure groups while actively browsing the cube with Excel (or running an SSRS report).

    My theory is that the clients connected to the cache. When the ETL ran, it truncated the table, which triggered the cache to update...with no data in the fact table.

    Am I correct in this logic?

    No. the "Proactive Caching" is not really descriptive of what this feature does, it's really "Proactive Caching, Processing and changing Storage Modes". When you enable the "bring online immediately" what happens is that as soon as a proactive caching notification is received (ie. any insert/update/delete in any tracking table) the measure group is switched over to ROLAP mode and the queries will go directly against the SQL tables. So this is why your users see an empty measure group - because the cube is querying the fact table that you just cleared. What the "update the cache..." option does is to trigger a processing operation on the measure group.

    But if you are loading your facts with ETL packages then you probably should not be using proactive caching at all; it will be a lot simpler, and you can avoid downtime and excess processing, if you just add an Analysis Services Processing task at the end of your SSIS package.
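
    For illustration, that final step can be an Analysis Services Processing Task, or an Analysis Services Execute DDL Task running an XMLA command along these lines - the database, cube and measure group IDs below are placeholders for your own objects:

    <Batch xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
      <Process>
        <Object>
          <DatabaseID>MyOlapDatabase</DatabaseID>
          <CubeID>MyCube</CubeID>
          <MeasureGroupID>Fact Sales</MeasureGroupID>
        </Object>
        <Type>ProcessFull</Type>
      </Process>
    </Batch>

    That way the measure group is reprocessed exactly once per load, after the fact table is fully populated, instead of being flipped to ROLAP and reprocessed every time the tracking table changes.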

    Proactive caching is really only useful in a small set of scenarios and works best when you have realtime incremental inserts into your fact tables and you have a reliable means of identifying the new records.


    http://darren.gosbell.com - please mark correct answers

    • Marked as answer by JPtak Wednesday, April 3, 2013 12:29 PM
    Tuesday, April 2, 2013 11:28 PM

All replies

  • Thanks for the thorough reply, Darren. One question: I was under the impression that Proactive Caching was intended to keep the cube online while it is processing.

    My ETL design does include a cube processing task at the end of the data loads. But what I have seen is that while the cube is actively processing (which in this case takes about 20-30 minutes), it is unavailable to users for browsing or reporting.

    So, how do I keep the cube online & available while it is processing if I don't use Proactive Caching?

    Wednesday, April 3, 2013 12:33 PM
  • Proactive caching is just a way of setting up "event-based" processing, but it's really only useful in a very small set of circumstances, and I usually advise people not to use it unless they really need to.

    There are two possible causes for a cube being unavailable during processing.

    1. You are processing in multiple transactions and one of the transactions is doing a process full on one or more dimensions.

    When you do a process full on a dimension, any measure group that uses that dimension gets cleared when the dimension processing transaction commits. To fix this you would either need to change to doing a ProcessUpdate on your dimensions, as this keeps the internal dataIDs and so does not clear related measure groups, or change your processing to do everything in a single transaction (see the sketch after point 2).

    2. The other thing that can make the cube appear to be offline is that SSAS is doing so much in parallel that the processing operation is consuming all of the CPU, memory and disk IO resources.

    The first thing I would try here is to set the MaxParallel attribute on the Parallel element in your processing command to something around half or a quarter of the number of cores on your server. This means that your processing operation will take longer, but SSAS will leave some resource headroom for user queries. If you have the database engine and SSAS on the same server, I would lean towards a MaxParallel setting of around a quarter of the total number of cores. You should then monitor the CPU, disk and memory resources during processing, and you can experiment with raising or lowering that figure to get the best compromise between processing speed and the amount of resources left free for user queries.
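
    To sketch both of those points together, the processing batch could look something like the following - the object IDs are placeholders and MaxParallel="4" is just an example for a 16-core server:

    <Batch xmlns="http://schemas.microsoft.com/analysisservices/2003/engine" Transaction="true">
      <!-- Transaction="true": nothing becomes visible to users until the whole batch commits -->
      <Parallel MaxParallel="4">
        <!-- MaxParallel limits concurrent processing jobs so user queries keep some CPU/IO headroom -->
        <Process>
          <Object>
            <DatabaseID>MyOlapDatabase</DatabaseID>
            <DimensionID>Dim Customer</DimensionID>
          </Object>
          <Type>ProcessUpdate</Type>   <!-- keeps existing dataIDs, so related measure groups are not cleared -->
        </Process>
        <Process>
          <Object>
            <DatabaseID>MyOlapDatabase</DatabaseID>
            <CubeID>MyCube</CubeID>
            <MeasureGroupID>Fact Sales</MeasureGroupID>
          </Object>
          <Type>ProcessFull</Type>
        </Process>
      </Parallel>
    </Batch>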


    http://darren.gosbell.com - please mark correct answers

    Wednesday, April 3, 2013 8:21 PM