locked
Serialize Results taking Ages RRS feed

  • Question

  • Hi guys,

     

    i have a simple query that runs about 1 minute where 50 seconds are spent serializing the result

    the query returns about 20.000 rows and 10 columns

    the cube contains one many-to-many relationship, rest is normal stuff

    the data i select is on the lowest granularity of my measure-group

    (i also tried a relational report which was not faster - wether in SSRS nor in Excel)

    i also removed the many-to-many relationship without any big performance gain

     

    does it take that long because the dataset is so big?

    or could there be any other issue?

     

    greets,

    gerhard

    Wednesday, December 10, 2008 5:14 PM
    Answerer

All replies

  • That's an awfully large dataset.

    Wednesday, December 10, 2008 6:52 PM
  • yes, but thats something i can't change, i need all that data

     

    Wednesday, December 10, 2008 7:41 PM
    Answerer
  •  

    Hi Gerhard,

     

    we tested our CubePlayer a lot with much, much larger datasets/cellsets. Display times become significant, but not so much as in your case.

     

    Here's a few example I pulled for you this evening, to test the speed. Database is AW 2008 Ent, everything running on my 3 year old 2GB laptop. Simple queries, through pivot (Designer), no optimizations in MDX Editor. I used leaves of hierarchies on various dimensions, to generate large enough results and to simulate your queries. I even used one MM dimension. The whole result is aquired and displayed, there's no paging or similar tricks.

     

    Legend. R=Rows, C=Columns, D=Data region.

     

    Example 1)

    R = Products × Dates

    C = Regions

    D = Sales Amount, Total Product Cost

    Dataset Size = 15863 × 10 = 158,630 cells

    Execution Time = 7 sec

    Display Time = 11 sec

     

    Example 2)

    R = Products × Dates × Order Lines

    C = Regions

    D = Sales Amount, Total Product Cost

    Dataset Size = 60,153 × 10 = 601,530 cells

    Execution Time = 38 sec

    Display Time = 55 sec

     

    Example 3)

    R = Products × Dates × Order Lines × Destination Currency (MM)

    C = Regions

    D = Sales Amount, Total Product Cost

    Dataset Size=N/A

    Execution Time= N/A sec

    Display Time= N/A sec

    OutOfMemory message during execution of MDX. Well, it's only a laptop after all , an older one.

     

    Example 4)

    R = Products × Dates × Destination Currency (MM)

    C = Regions

    D = Sales Amount, Total Product Cost

    Dataset Size = 191,574 × 10 = 1,915,740 cells (yes, that's almost 2 million cells)

    Execution Time = 120 sec

    Display Time = 170 sec

    But that's a clear bug in program for time measurement, because I waited approx half an hour in total.

     

    Example 5)

    R = Products × Destination Currency (MM)

    C = Regions

    D = Sales Amount, Total Product Cost

    Dataset Size = 4,308 × 10 = 43,080 cells

    Execution Time = 3 sec

    Display Time = 3 sec

     

    Example 6)

    R = Dates × Destination Currency (MM)

    C = Regions

    D = Sales Amount, Total Product Cost

    Dataset Size = 12,708 × 10 = 127,080 cells

    Execution Time = 6 sec

    Display Time = 10 sec

     

    I made another test of Example 4, to see ratios of execution vs. display, hoping to hit a cache. After 2 minutes execution was over and I saw the scrollbars. So, the Execution time might be ok. But no grid yet. After 5 more minutes I heard my laptop's fan like never before. Obviously on such large datasets it is suffering to generate a grid and display it. Btw, we use ComponentOne's grid if I'm not mistaken, internally enhanced of course, to serve the purpose of multidimensional analysis.

     

    Conclusion?

     

    Your dataset should not generate such a significant display time vs. execution time. Maybe you have issues with low free memory on client machine or poor graphic card. See paging and other OS related stuff. Try optimization there. Also, try testing on another computer. I believe you should have from 2:3 up to 1:2 ratio on that data size. Certainly not 1:5.

     

    I believe there's nothing you can do with SSAS, if I understood the problem correctly. It's only up to client computer's capability of generating results and displaying it in a grid. Or a grid itself.

     

    In the end I finally got the results displayed from encore of Example 4. Since I was writing, I didn't capture the exact moment, but it took between 15 and 20 minutes to display the results. Yes, that 2 million cells result. The numbers say 100 vs. 164 this time. I believe the second one is false (display time), while the first matches my rough measurement (2 minutes while watching a Windows clock).

     

    Btw, we measure 5 times in our application, so maybe the bug is in another Prepare Data Time (that comes after Display Time) which says 0. We'll fix that, of course .

     

    Hope this testing I made helps you in your problem. At least, to guide you where to look at or what to expect.

     

    Best regards,

     

     

    Tomislav Piasevoli

    Business Intelligence Specialist

    http://www.softpro.hr

     

    Thursday, December 11, 2008 1:22 AM
    Answerer
  • What are the profile numbers if you profile it with Mosha's MDXStudio ?

    So you can rule out the grid and network times and get real cube times.

     

    I also suffer performance problems, specially when I add calculated members with aggregates and stuff like that.

    They start to add time to the whole query.

     

     

    Thursday, December 11, 2008 12:32 PM
  •  

    With all due respect,

     

    MDX Studio, being a true jewel for testing MDX query performance and doing optimization of MDX queries (as I witnessed at PASS preconf) is not the right application for testing display time. I mean, it's not ment for that. It's ment for testing execution time, besides other things. Moreover, as a free application, it most probably uses a simple (free) grid (it looks that way to me). Which means you can not expect fast display times in it.

     

    I tested my Example 1 in it today, it took 3 seconds to get the results, but 120 seconds (!) to display it (I was looking at Windows clock since it's not measured there). That means Gerhard would wait even longer, maybe 3 minutes to display his 200,000 cells.

     

    And, if I got it right, it's not a matter of MDX query or cube structure that you could enhance. Execution time in Gerhard's case was 10 seconds. It's fine, not something drastic. But displaying result took too long. So, it's a matter of how does an OLAP client (application X with its grid from company Y and additional tweaks) on a particular computer (OS & HW capabitilites with additional tweaks) handle that. Am I right Gerhard?

     

    Regards,

     

     

    Tomislav Piasevoli

    Business Intelligence Specialist

    http://www.softpro.hr

     

    Thursday, December 11, 2008 2:33 PM
    Answerer
  • Maybe I misundestood Gerhhard's objetive, and I was trying exactly the opposite. To test just olap performance and NOT network dataset transferring/grid performance.

     

    My problem was always cube performance, and that's why I used profile counters.

     

    My apologies if this is not the case.

     

     

     

     

     

    Thursday, December 11, 2008 4:49 PM
  • Hi,

    thanks first for your fast responses

    @Tomislav: special thanks to your extensive tests - i hope your laptop survived all the testing without any damage Smile

     

    ok, first of all, Serialize Results is an action on the server so it has little  todo with the client/client application

    if i am right, it is the process to convert the resultset returned by AS to an format that is then transmitted to the client

     

    you can check that in profiler if you, additionaly to default-template, also select the Serialize-Results events

    thats whats causing my problems

     

    the time used for displaying/rendering are additional 1-2 minutes (in Excel)

     

    i also tried the whole database on a different server (4 CPU, 12 gig RAM) instead of 2 CPU and 7 gig RAM

    without any big difference

     

    i will do further testing on monday and post my results

     

    greets & thanks,

    gerhard

     

     

    Thursday, December 11, 2008 6:36 PM
    Answerer
  • Hi Gerhard,

     

    What I suspect is happening, based on prior experience (and this is only a guess), is that AS starts serialising the resultset before it's finished all the query processing. So while it might look like your Serialise Results events are taking most of the time taken by your query, that might be because AS can start building up the cellset that needs to be returned fairly early in the overall query processing, but still needs to wait for that query processing to finish before it can finish populating the cellset. In which case, the whole Serialise Results problem is a red herring and you need to look elsewhere for your performance problem. I'm fairly sure I've seen similarly large cellsets returned much quicker than 2 minutes.

     

    ...but as I said, this is just a guess.

     

    Chris

     

    Thursday, December 11, 2008 10:03 PM
  • Hi Chris,

     

    thanks for the hint, never thought of something like this

     

    so the problem would be the query, which is generated by excel and cannot be changed :/

     

    anyway, i will write my own MDX to check wether its really the query that slows me down or not

     

    greets,

    gerhard

    Thursday, December 11, 2008 10:20 PM
    Answerer
  •  

    My apologies to all,

     

    I completely missed the subject. Sorry.

     

     

    Tomislav Piasevoli

    Business Intelligence Specialist

    http://www.softpro.hr

     

    Thursday, December 11, 2008 10:49 PM
    Answerer
  • Hi again,

    after some further investigations i found out that querying the dimension properties has major impact on performance too
    selecting all properties (what excel does by default) doubles the query-time

    i did 3 tests: the original excel query, the excel query without properties and my custom query also without properties

    with Properties (1:1 Excel)
    Query Begin
    Calculate Non Empty Begin
    Query Subcube/Get Data From Cache/ ...
    Calculate Non Empty End 94ms
    Calculate Non Empty Begin
    Query Dimension(s)
    Calculate Non Empty End 16ms
    Serialize Results Begin
    Serialize Results End 17422ms
    Query End 17531ms

    without Properties
    Query Begin
    Calculate Non Empty Begin
    Query Subcube/Get Data From Cache/ ...
    Calculate Non Empty End 109ms
    Calculate Non Empty Begin
    Query Dimension(s)
    Calculate Non Empty End 0ms
    Serialize Results Begin
    Serialize Results End 8250ms
    Query End 8359 ms

    Custom Query without Properties
    Query Begin
    Calculate Non Empty Begin
    Query Subcube/Get Data From Cache/ ...
    Calculate Non Empty End 141ms
    Calculate Non Empty Begin
    Query Dimension(s)
    Calculate Non Empty End 0ms
    Serialize Results Begin
    Serialize Results End 7860ms
    Query End 8016ms


    notice in all queries that 99% of the time is spent "Serializing Results"
    and that exactly what i would like to speed up

    any ideas about this?

    greets,
    gerhard

    Monday, December 15, 2008 4:58 PM
    Answerer
  • Are your trace results outputting in the order that you listed them? As Chris says, the serializing often starts very early in the query and I usually see it wrapping the Query Subcube events too, so it is not just a measure of how long the serialization took.

    If you use the Profile option in MDX Studio it will not display the results in the grid and will give you accurate timings for both a cold and warm cache. The other thing that the profiler option gives you is a hierarchical view of the trace events. This should show you which events are appearing "inside" the serialize events (which I suspect is a lot more activity than you would expect).


    http://geekswithblogs.net/darrengosbell - please mark correct answers
    Tuesday, December 16, 2008 4:22 AM
  • Hi Darren,

    yes, the trace results occured in the order i listed them
    further, using MDX-Studio, execution on cold and warm cache is more or less the same
    serialize results still take most of the time (>95%) where real query time of course speeds up from cold to warm cache execution

    under the serialize results event are no more entries (no '+' to expand) in my query (but i see a '+' for other queries)
    if i check it in Profiler i get the different serializiation events like Serialize Axes/Slicer/Cells
    where Serialize Cells takes most of the time - about 90% of all Serialize Events

    any further hints?
    Tuesday, December 16, 2008 8:58 AM
    Answerer
  • Gerhard,

    Did you get any further hints on resolving the long run time during the "serializing the results" event from an Excel MDX query?
    I am experiencing very similar results as you described in this thread due to a very large dataset being returned (as required by the customer).

    Any lessons learned would be greatly appreciated...

    Thanks,
    Ken

     

     

     

    Wednesday, October 21, 2009 2:56 PM
  • Hi Ken,

    no, unfortunatly i have no more hints
    i could not solve this problems
    the only advice i can give you is to keep the dataset as small as possible by removing all unnecessary coumns/rows

    sorry,
    gerhard
    - www.pmOne.com -
    Wednesday, October 21, 2009 7:23 PM
    Answerer
  • If you are looking at the duration of the Serialization End event what you are actually seeing is a lot more than just the time it takes for SSAS to put data into an XMLA format. The data is actually sort of streamed into the resultset, so serialization start just after the query is parsed and includes the calculation of the axis, the fetching of the data from the storage engine, the calculation of any formulas and the assembly and formatting of the data.

    The only easy change that affects serialization is to make sure that you are not requesting more properties (like forecolor, backcolor, font, etc) than you need. Otherwise you need to look at trying to tune the query as a whole.
    http://geekswithblogs.net/darrengosbell - please mark correct answers
    Thursday, October 22, 2009 8:16 PM