Thursday, March 29, 2012 1:28 PM
We are building a cube (SQL Server Analysis Services 2008 R2 SP1) that should be accessible through MS Excel to a wide variety of users. As a result, a fair number of dimensions need to be secured. Each secured dimension secures exactly one attribute via a stored proc: that attribute has an allowed set filter defined as [assembly].storedproc(<dim name>, <attr>, CustomData()), where CustomData() identifies the user in our application. The stored procedure runs very quickly (< 1 ms) and returns a fairly small number of members.
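For illustration, the allowed set expression on the secured attribute looks roughly like this (the assembly, procedure, dimension, and attribute names here are placeholders, not our actual ones):

```mdx
// Hypothetical names: a sketch of the allowed set filter described above.
// The server-side stored proc returns the set of members on the secured
// attribute that the user identified by CustomData() may see.
SecurityAssembly.GetAllowedSet("Customer", "Region", CustomData())
```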
There are a number of performance issues with the whole setup.
1. The stored proc is not called once, but many times for each dimension; it seems to be called once for each attribute in the dimension. Is this assumption correct? Is there a way to reduce the number of calls other than reducing the number of attributes in the dim? (Perhaps this is not bad by itself, but it seems to be closely related to the issues below.)
2. There appears to be a long delay between the calls for large dimensions. It seems that the cube takes a long time to build the security context for a given attribute after the stored proc returns.
The log of calls to the stored proc for large dims looks like this:
12:31:00.000 StoredProc called for dim L, attribute S
12:31:00.001 StoredProc done
12:31:00.100 StoredProc called for dim L, attribute S
12:31:00.101 StoredProc done
12:31:00.200 StoredProc called for dim L, attribute S
12:31:00.201 StoredProc done
12:31:00.400 StoredProc called for dim L, attribute S
12:31:00.401 StoredProc done
*** 12:31:10.001 StoredProc called for dim L, attribute S ***************** note delay here
12:31:10.002 StoredProc done
12:31:10.300 StoredProc called for dim L, attribute S
12:31:10.301 StoredProc done
In comparison, the log of calls for a small dim looks like this:
12:30:00.000 StoredProc called for dim A, attribute S
12:30:00.001 StoredProc done
12:30:00.002 StoredProc called for dim A, attribute S
12:30:00.003 StoredProc done
12:30:00.004 StoredProc called for dim A, attribute S
12:30:00.005 StoredProc done
12:30:00.006 StoredProc called for dim A, attribute S
12:30:00.007 StoredProc done
The largest dims have about 1 to 3 million rows, and a large portion (but not all) of the members of most attributes will be visible to a user. Consistently, for one particular large dim, the delay between the 4th and 5th calls takes about 10 s. The large delay happens even if the large dim contains only two enabled attributes (the key attribute and the attribute S on which the security is based) and no hierarchies; with 2 attributes in the dim, the large delay comes after the second call to the stored proc. The delay seems to be proportional to the number of members the user has access to, as opposed to the number of members returned from the stored proc. Is there a way to reduce the delay? Or perhaps postpone the computation of the security context until it is actually needed by a query? Currently it takes up to 40 s to get a connection for a single user when nobody else is using the system.
3. The calls to the stored proc seem to be serialized. That is, if there are concurrent users requesting connections, the calls to the stored proc for the second user start appearing only after all calls for all dims are done for the first user, and so on. Is there a way to avoid this? I believe that Cumulative Update 4 for SQL Server 2008 R2 addresses an issue very similar to this one; however, we are using SP1, which was built after CU4 and should contain the fix. The serialization is (still?) there.
4. The security context is rebuilt after partition processing. We'd like the relevant partitions to be processed once every 5 minutes; if it takes a single user 40 s to log in, that is almost unworkable. Is there a way to tell the cube that the allowed sets are not changing, so that partition processing does not have to clear the security contexts for all connected users?
- Edited by 42Dev Thursday, March 29, 2012 1:31 PM
All replies
Monday, April 2, 2012 7:17 AM (Moderator)
Given your original implementation, I don't see any way to reduce the number of calls to the stored procedure. However, there is another way to implement dynamic security, described in the following link, which uses the NonEmpty function. NonEmpty can filter out the members irrelevant to the current user via a factless relationship table, so the processing time should be acceptable.
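A rough sketch of that NonEmpty pattern, assuming a factless bridge measure group linking users to the secured attribute (all dimension, attribute, and measure names below are hypothetical):

```mdx
// Allowed set: members of the secured attribute that have at least one
// row in the user/member bridge table for the current user, whose login
// is resolved to a member of a hypothetical [User] dimension.
NonEmpty(
    [Customer].[Region].[Region].Members,
    ( StrToMember("[User].[Login].&[" + CustomData() + "]"),
      [Measures].[Bridge Row Count] )
)
```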
Please consider this method in your project.
Hope this helps,
Thursday, April 19, 2012 9:42 PM
Thanks for your reply. I have tried the approach, but for each user, it is still taking ~40 seconds to create a first connection after any processing on the cube was done. Actually, I have tried something even simpler. It is no good for production, but it demonstrates the point. The allowed set expression now looks like this:
IIf(InStr(CustomData(), "user1") > 0, [Dim].[Attr].&, [Dim].[Attr].&)
There are no calls to stored procs anymore.
- Edited by 42Dev Thursday, April 19, 2012 9:43 PM
Friday, April 20, 2012 1:14 AM (Moderator)
... it is still taking ~40 seconds to create a first connection after any processing on the cube was done.
Can you create a test copy of your database and then comment out everything except the first CALCULATE; statement in the MDX script? If your MDX script contains things like complex named sets, it can sometimes take a while for the script to be parsed and evaluated on the first connection after processing (the SSAS caches are flushed after processing commits).
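For reference, the stripped-down script on the test copy would contain nothing but the default statement:

```mdx
/* Test copy only: every other script command commented out */
CALCULATE;
```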
If this helps, then you either need to look at optimizing your MDX script or possibly at some sort of cache-warming strategy after processing (although dynamic security could interfere with the effectiveness of a cache-warming approach).
The delay you were seeing with your stored proc approach could be due to the overhead of marshalling the result set from .NET to the native SSAS process. This is one of the reasons the recommendation is to use stored procs only in scenarios where the number of calls can be kept to a minimum.
http://darren.gosbell.com - please mark correct answers
Friday, April 20, 2012 6:30 AM (Answerer)
I currently have similar problems at 2 of my customers.
The reason in my cases is that whenever you secure an attribute, that security is propagated to all other attributes.
Simple example: if you put security on your time dimension and can only see year 2012, then you will also only see the months and days belonging to year 2012.
For big dimensions with many attributes this can take quite a while, in my cases up to 2 minutes for each user.
This is also independent of whether you use stored procs or the NonEmpty approach, as this implicit AutoExists is applied in both cases.
To check whether this is your problem, simply use Profiler with the events "Calculate Non Empty Begin/End" and "Query Dimension".
You will see the following event pattern many times:
1) Calculate Non Empty Begin
2) Query Dimension -> contains a subcube with always exactly two "1"s and the rest "0"s, where the two "1"s are the attributes to which AutoExists is applied
3) Calculate Non Empty End -> this row contains a duration
A possible workaround is some kind of cache warming using the EffectiveUserName connection string property, as the security is cached after a user connects for the first time.
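As a sketch, the warming client would connect once per user with a connection string along these lines (server, database, and account names are placeholders; impersonation via EffectiveUserName requires the warming account to be an administrator on the instance) and run any trivial query:

```mdx
// Connection string for the warming connection (placeholder values):
//   Data Source=MYSERVER;Initial Catalog=MySecuredDB;EffectiveUserName=DOMAIN\jdoe
// Any cheap query executed on that connection forces the per-user
// security context to be built and cached, e.g.:
SELECT {} ON 0 FROM [MyCube]
```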
I have also opened a support case with Microsoft. I'll keep you updated as soon as I have more information!
- www.pmOne.com -
Thursday, April 26, 2012 9:22 PM
Thanks for the suggestions. I have removed all statements after CALCULATE; there was no effect on the connection time. The profiler has shown that the "Calculate Non Empty" tasks are indeed the culprit in our case; they take up to 10 s for some dimensions/attributes.
I'll look into cache warming options.
Another option that seems to show some promise in our situation is to create a new "wrapper" cube (database) that simply links to the dimensions and cubes of the "data" cube (database) holding the actual data. It appears possible to set dynamic security in a role of the wrapper cube. The good news is that the security context of users connected to the wrapper cube is not discarded when partitions are processed on the data cube. The bad news is that it takes even longer to create a connection to the wrapper cube than directly to the data cube. More bad news: the wrapper cube becomes invalid when dimensions are processed on the data cube, so we may need to build retry logic into our application to handle the window when the data cube has just processed a dimension but the wrapper cube has not yet been reprocessed. Perhaps the worst news is that it all feels like a hack: there is no logical reason why the wrapper cube should avoid refreshing the connection's security context under exactly the same conditions in which the data cube would refresh it. Any comments are very much appreciated.
Gerhard, thanks for the offer to keep me updated on the support case with MS! I would love to see what they have to say.
Friday, May 11, 2012 6:39 PM
The "wrapper" cube solution did not work out, unfortunately. There were two issues:
- The first connection to the wrapper cube took approximately 3 to 4 times as long as the first connection to the data cube, which by itself was already tens of seconds.
- When dimension data security in a role of the wrapper cube is set for a parent-child dimension, any attempt to connect under that role fails with an internal server error and no messages in the logs.
Tuesday, May 29, 2012 10:42 AM (Answerer)
Here is a statement I got from the MS escalation engineer:
"Unfortunately working with a big dimension (millions of members) can cause this slowness at initial connection." (in terms of security evaluation)
Also, we have not found any solution for this yet, except warming the cache, which is unfortunately not feasible in my particular case.
- www.pmOne.com -
- Marked as answer by 42Dev Tuesday, June 19, 2012 12:30 AM