Asked by:
Serialize Results taking Ages

Question
-
Hi guys,
i have a simple query that runs about 1 minute where 50 seconds are spent serializing the result
the query returns about 20.000 rows and 10 columns
the cube contains one many-to-many relationship, rest is normal stuff
the data i select is on the lowest granularity of my measure-group
(i also tried a relational report which was not faster - wether in SSRS nor in Excel)
i also removed the many-to-many relationship without any big performance gain
does it take that long because the dataset is so big?
or could there be any other issue?
greets,
gerhard
Wednesday, December 10, 2008 5:14 PMAnswerer
All replies
-
That's an awfully large dataset.
Wednesday, December 10, 2008 6:52 PM -
yes, but thats something i can't change, i need all that dataWednesday, December 10, 2008 7:41 PMAnswerer
-
Hi Gerhard,
we tested our CubePlayer a lot with much, much larger datasets/cellsets. Display times become significant, but not so much as in your case.
Here's a few example I pulled for you this evening, to test the speed. Database is AW 2008 Ent, everything running on my 3 year old 2GB laptop. Simple queries, through pivot (Designer), no optimizations in MDX Editor. I used leaves of hierarchies on various dimensions, to generate large enough results and to simulate your queries. I even used one MM dimension. The whole result is aquired and displayed, there's no paging or similar tricks.
Legend. R=Rows, C=Columns, D=Data region.
Example 1)
R = Products × Dates
C = Regions
D = Sales Amount, Total Product Cost
Dataset Size = 15863 × 10 = 158,630 cells
Execution Time = 7 sec
Display Time = 11 sec
Example 2)
R = Products × Dates × Order Lines
C = Regions
D = Sales Amount, Total Product Cost
Dataset Size = 60,153 × 10 = 601,530 cells
Execution Time = 38 sec
Display Time = 55 sec
Example 3)
R = Products × Dates × Order Lines × Destination Currency (MM)
C = Regions
D = Sales Amount, Total Product Cost
Dataset Size=N/A
Execution Time= N/A sec
Display Time= N/A sec
OutOfMemory message during execution of MDX. Well, it's only a laptop after all
, an older one.
Example 4)
R = Products × Dates × Destination Currency (MM)
C = Regions
D = Sales Amount, Total Product Cost
Dataset Size = 191,574 × 10 = 1,915,740 cells (yes, that's almost 2 million cells)
Execution Time = 120 sec
Display Time = 170 sec
But that's a clear bug in program for time measurement, because I waited approx half an hour in total.
Example 5)
R = Products × Destination Currency (MM)
C = Regions
D = Sales Amount, Total Product Cost
Dataset Size = 4,308 × 10 = 43,080 cells
Execution Time = 3 sec
Display Time = 3 sec
Example 6)
R = Dates × Destination Currency (MM)
C = Regions
D = Sales Amount, Total Product Cost
Dataset Size = 12,708 × 10 = 127,080 cells
Execution Time = 6 sec
Display Time = 10 sec
I made another test of Example 4, to see ratios of execution vs. display, hoping to hit a cache. After 2 minutes execution was over and I saw the scrollbars. So, the Execution time might be ok. But no grid yet. After 5 more minutes I heard my laptop's fan like never before. Obviously on such large datasets it is suffering to generate a grid and display it. Btw, we use ComponentOne's grid if I'm not mistaken, internally enhanced of course, to serve the purpose of multidimensional analysis.
Conclusion?
Your dataset should not generate such a significant display time vs. execution time. Maybe you have issues with low free memory on client machine or poor graphic card. See paging and other OS related stuff. Try optimization there. Also, try testing on another computer. I believe you should have from 2:3 up to 1:2 ratio on that data size. Certainly not 1:5.
I believe there's nothing you can do with SSAS, if I understood the problem correctly. It's only up to client computer's capability of generating results and displaying it in a grid. Or a grid itself.
In the end I finally got the results displayed from encore of Example 4. Since I was writing, I didn't capture the exact moment, but it took between 15 and 20 minutes to display the results. Yes, that 2 million cells result. The numbers say 100 vs. 164 this time. I believe the second one is false (display time), while the first matches my rough measurement (2 minutes while watching a Windows clock).
Btw, we measure 5 times in our application, so maybe the bug is in another Prepare Data Time (that comes after Display Time) which says 0. We'll fix that, of course
.
Hope this testing I made helps you in your problem. At least, to guide you where to look at or what to expect.
Best regards,
Tomislav Piasevoli
Business Intelligence Specialist
Thursday, December 11, 2008 1:22 AMAnswerer -
What are the profile numbers if you profile it with Mosha's MDXStudio ?
So you can rule out the grid and network times and get real cube times.
I also suffer performance problems, specially when I add calculated members with aggregates and stuff like that.
They start to add time to the whole query.
Thursday, December 11, 2008 12:32 PM -
With all due respect,
MDX Studio, being a true jewel for testing MDX query performance and doing optimization of MDX queries (as I witnessed at PASS preconf) is not the right application for testing display time. I mean, it's not ment for that. It's ment for testing execution time, besides other things. Moreover, as a free application, it most probably uses a simple (free) grid (it looks that way to me). Which means you can not expect fast display times in it.
I tested my Example 1 in it today, it took 3 seconds to get the results, but 120 seconds (!) to display it (I was looking at Windows clock since it's not measured there). That means Gerhard would wait even longer, maybe 3 minutes to display his 200,000 cells.
And, if I got it right, it's not a matter of MDX query or cube structure that you could enhance. Execution time in Gerhard's case was 10 seconds. It's fine, not something drastic. But displaying result took too long. So, it's a matter of how does an OLAP client (application X with its grid from company Y and additional tweaks) on a particular computer (OS & HW capabitilites with additional tweaks) handle that. Am I right Gerhard?
Regards,
Tomislav Piasevoli
Business Intelligence Specialist
Thursday, December 11, 2008 2:33 PMAnswerer -
Maybe I misundestood Gerhhard's objetive, and I was trying exactly the opposite. To test just olap performance and NOT network dataset transferring/grid performance.
My problem was always cube performance, and that's why I used profile counters.
My apologies if this is not the case.
Thursday, December 11, 2008 4:49 PM -
Hi,
thanks first for your fast responses
@Tomislav: special thanks to your extensive tests - i hope your laptop survived all the testing without any damage
ok, first of all, Serialize Results is an action on the server so it has little todo with the client/client application
if i am right, it is the process to convert the resultset returned by AS to an format that is then transmitted to the client
you can check that in profiler if you, additionaly to default-template, also select the Serialize-Results events
thats whats causing my problems
the time used for displaying/rendering are additional 1-2 minutes (in Excel)
i also tried the whole database on a different server (4 CPU, 12 gig RAM) instead of 2 CPU and 7 gig RAM
without any big difference
i will do further testing on monday and post my results
greets & thanks,
gerhard
Thursday, December 11, 2008 6:36 PMAnswerer -
Hi Gerhard,
What I suspect is happening, based on prior experience (and this is only a guess), is that AS starts serialising the resultset before it's finished all the query processing. So while it might look like your Serialise Results events are taking most of the time taken by your query, that might be because AS can start building up the cellset that needs to be returned fairly early in the overall query processing, but still needs to wait for that query processing to finish before it can finish populating the cellset. In which case, the whole Serialise Results problem is a red herring and you need to look elsewhere for your performance problem. I'm fairly sure I've seen similarly large cellsets returned much quicker than 2 minutes.
...but as I said, this is just a guess.
Chris
Thursday, December 11, 2008 10:03 PM -
Hi Chris,
thanks for the hint, never thought of something like this
so the problem would be the query, which is generated by excel and cannot be changed :/
anyway, i will write my own MDX to check wether its really the query that slows me down or not
greets,
gerhard
Thursday, December 11, 2008 10:20 PMAnswerer -
My apologies to all,
I completely missed the subject. Sorry.
Tomislav Piasevoli
Business Intelligence Specialist
Thursday, December 11, 2008 10:49 PMAnswerer -
Hi again,
after some further investigations i found out that querying the dimension properties has major impact on performance too
selecting all properties (what excel does by default) doubles the query-time
i did 3 tests: the original excel query, the excel query without properties and my custom query also without properties
with Properties (1:1 Excel)
Query Begin
Calculate Non Empty Begin
Query Subcube/Get Data From Cache/ ...
Calculate Non Empty End 94ms
Calculate Non Empty Begin
Query Dimension(s)
Calculate Non Empty End 16ms
Serialize Results Begin
Serialize Results End 17422ms
Query End 17531mswithout Properties
Query Begin
Calculate Non Empty Begin
Query Subcube/Get Data From Cache/ ...
Calculate Non Empty End 109ms
Calculate Non Empty Begin
Query Dimension(s)
Calculate Non Empty End 0ms
Serialize Results Begin
Serialize Results End 8250ms
Query End 8359 msCustom Query without Properties
Query Begin
Calculate Non Empty Begin
Query Subcube/Get Data From Cache/ ...
Calculate Non Empty End 141ms
Calculate Non Empty Begin
Query Dimension(s)
Calculate Non Empty End 0ms
Serialize Results Begin
Serialize Results End 7860ms
Query End 8016ms
notice in all queries that 99% of the time is spent "Serializing Results"
and that exactly what i would like to speed up
any ideas about this?
greets,
gerhardMonday, December 15, 2008 4:58 PMAnswerer -
Are your trace results outputting in the order that you listed them? As Chris says, the serializing often starts very early in the query and I usually see it wrapping the Query Subcube events too, so it is not just a measure of how long the serialization took.
If you use the Profile option in MDX Studio it will not display the results in the grid and will give you accurate timings for both a cold and warm cache. The other thing that the profiler option gives you is a hierarchical view of the trace events. This should show you which events are appearing "inside" the serialize events (which I suspect is a lot more activity than you would expect).
http://geekswithblogs.net/darrengosbell - please mark correct answersTuesday, December 16, 2008 4:22 AM -
Hi Darren,
yes, the trace results occured in the order i listed them
further, using MDX-Studio, execution on cold and warm cache is more or less the same
serialize results still take most of the time (>95%) where real query time of course speeds up from cold to warm cache execution
under the serialize results event are no more entries (no '+' to expand) in my query (but i see a '+' for other queries)
if i check it in Profiler i get the different serializiation events like Serialize Axes/Slicer/Cells
where Serialize Cells takes most of the time - about 90% of all Serialize Events
any further hints?Tuesday, December 16, 2008 8:58 AMAnswerer -
Gerhard,
Did you get any further hints on resolving the long run time during the "serializing the results" event from an Excel MDX query?
I am experiencing very similar results as you described in this thread due to a very large dataset being returned (as required by the customer).Any lessons learned would be greatly appreciated...
Thanks,
KenWednesday, October 21, 2009 2:56 PM -
Hi Ken,
no, unfortunatly i have no more hints
i could not solve this problems
the only advice i can give you is to keep the dataset as small as possible by removing all unnecessary coumns/rows
sorry,
gerhard
- www.pmOne.com -Wednesday, October 21, 2009 7:23 PMAnswerer -
If you are looking at the duration of the Serialization End event what you are actually seeing is a lot more than just the time it takes for SSAS to put data into an XMLA format. The data is actually sort of streamed into the resultset, so serialization start just after the query is parsed and includes the calculation of the axis, the fetching of the data from the storage engine, the calculation of any formulas and the assembly and formatting of the data.
The only easy change that affects serialization is to make sure that you are not requesting more properties (like forecolor, backcolor, font, etc) than you need. Otherwise you need to look at trying to tune the query as a whole.
http://geekswithblogs.net/darrengosbell - please mark correct answersThursday, October 22, 2009 8:16 PM