Super Mario, Tile Based Game and Silverlight Performance

  • Question

  • Greetings folks, I have a couple of questions about implementing tile-based games with Silverlight. To illustrate the problems, some implementation details are in order. Consider a tile-based game, Super Mario, built in Silverlight.

    Silverlight tile based game: Super Mario

    As shown above, the visible region is shown in color, whereas the region outside the view area is in grey-scale.

    The following image outlines the different components that make up the rendering system:

    Silverlight tile based game: Super Mario

    In the above image, the View Canvas (in red) defines the visible region of the game world. It clips away anything lying outside of it through its clipping properties. The Content Canvas (in blue) is a child UIElement of the View Canvas, so it is naturally also subject to the View Canvas's clipping.

    Square bricks in the game scene are made up of individual Image objects, some sharing the same BitmapSource so they look the same. These bricks are child UIElements of the Content Canvas. Their positions are fixed with respect to the Content Canvas, so they appear to scroll along when the Content Canvas scrolls towards the left edge of the screen (with respect to the View Canvas).

    The following image illustrates all the tiles (child Image objects of Content Canvas) at a given time.

    Silverlight tile based game: Super Mario

    In the illustration above, red tiles have scrolled outside of the visible region, thus becoming “retired tiles”. Green tiles are the ones currently visible on screen, and blue tiles are the ones that will soon become visible as the game progresses.

    At some point the Content Canvas will have scrolled as far left as it can go, with its right edge coinciding with the right edge of the View Canvas. At that point the Content Canvas can no longer scroll further left. It needs to remove retired tiles (red), introduce future tiles (blue), and reposition itself to the right again, as shown below:

    Silverlight tile based game: Super Mario

    The image above shows the Content Canvas repositioning itself (with respect to the View Canvas) to the right-most position, where its left edge coincides with the left edge of the View Canvas. The retired red tiles are recycled and used as “future tiles”, which are then reinserted as child UIElements of the Content Canvas (no recreation, no garbage collection). One thing to note is that when this jump happens, all visible tiles need to be repositioned as their parent Content Canvas jumps. This ensures they remain where they physically are on the user's screen:

    // Shift every visible tile by (x, y) so that it stays put on screen
    // while the Content Canvas jumps back to its starting position.
    public void OffsetAllTiles(double x, double y)
    {
        foreach (Image image in m_VisibleTiles)
        {
            Canvas.SetLeft(image, Canvas.GetLeft(image) + x);
            Canvas.SetTop(image, Canvas.GetTop(image) + y);
        }
    }

    So much for what goes on behind the implementation. Here come the questions:

    1. Offsetting the tiles when the jump happens takes non-zero time: the more visible tiles there are, the more tiles there are to iterate through and the slower it becomes. The frame in which the jump happens usually takes longer, so the transition around it is not smooth. Q: Instead of looping through all visible Image objects and setting their offsets, is there a way to do this through a “global transform” kind of thing (i.e. having all visible tiles refer to the same global transformation matrix)?
    2. Q: Should I be setting Canvas.LeftProperty and Canvas.TopProperty on each Image object, or should I use the Image.RenderTransform property? Which one is faster and lighter on the Silverlight runtime?
    3. If I create the Content Canvas as wide as the entire stage of the game, I can avoid setting Image offsets altogether. I will only have to recycle the retired tiles and reuse them as future tiles. Since the Content Canvas is as wide as the entire game stage, no jump will ever happen. Q: What kind of resources are required if I create a canvas that is 20000x320 as compared to 480x320? Does only its dimension differ, or does some huge memory segment need to be allocated for a 20000x320 canvas?

    Thanks a bunch to the kind souls who managed to read through this and are willing to help clear things up. Sorry for the lengthy post. :D

     

    Thanks and regards,
    Ben.

     

     

    Thursday, July 2, 2009 1:05 PM

All replies

  • Offsetting the tiles when the jump happens takes non-zero time: the more visible tiles there are, the more tiles there are to iterate through and the slower it becomes. The frame in which the jump happens usually takes longer, so the transition around it is not smooth. Q: Instead of looping through all visible Image objects and setting their offsets, is there a way to do this through a “global transform” kind of thing (i.e. having all visible tiles refer to the same global transformation matrix)?
    I don't think there is a global transform that you can use. I would suggest trying out the big canvas (20000x320) approach and setting the transform of the canvas accordingly. I am under the impression that the memory consumption of a canvas is not dictated by its size. This is not true for all controls: the Image control, for example, consumes memory in proportion to its size.
    Q: Should I be setting Canvas.LeftProperty and Canvas.TopProperty on each Image object, or should I use the Image.RenderTransform property? Which one is faster and lighter on the Silverlight runtime?
    Should be the same.
    If I create the Content Canvas as wide as the entire stage of the game, I can avoid setting Image offsets altogether. I will only have to recycle the retired tiles and reuse them as future tiles. Since the Content Canvas is as wide as the entire game stage, no jump will ever happen. Q: What kind of resources are required if I create a canvas that is 20000x320 as compared to 480x320? Does only its dimension differ, or does some huge memory segment need to be allocated for a 20000x320 canvas?

    I believe there is no memory or performance penalty.

    On the other hand, you may want to consider using WriteableBitmap for background drawing. The advantage is that, if you have a large number of background objects, say one tile per cell, it is far more efficient to render them into a WriteableBitmap (which is then used as the Image source for the background). Make sure you don't re-render them to the WriteableBitmap on a per-frame basis though, since WriteableBitmap is slow compared to native SL rendering of controls. But if you use it as a background (say, make it 3X the size of the screen and re-render it whenever it is about to scroll out of scope, which happens far less often than once per frame), it should be fast. I think you can even arrange for it to be redrawn slightly ahead of going out of scope, during "less busy" times, which may further reduce jitter.
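
    The bake-once idea in the paragraph above could be sketched roughly as follows. This is only an illustration under assumptions: `TilePlacement` and `BackgroundBaker` are hypothetical names, the tile Image elements are assumed to have their sources already loaded, and the method has to run on the UI thread.

```csharp
using System.Collections.Generic;
using System.Windows.Controls;
using System.Windows.Media;
using System.Windows.Media.Imaging;

// Hypothetical helper type: one tile image plus where it belongs in the background.
public class TilePlacement
{
    public Image Tile;
    public double X;
    public double Y;
}

public static class BackgroundBaker
{
    // Draws every tile once into a single bitmap. Intended to be called
    // occasionally (when the background is about to scroll out of range),
    // not once per frame.
    public static WriteableBitmap Bake(IEnumerable<TilePlacement> placements, int width, int height)
    {
        var bitmap = new WriteableBitmap(width, height);
        foreach (TilePlacement placement in placements)
        {
            // The transform positions the tile inside the bitmap.
            var offset = new TranslateTransform { X = placement.X, Y = placement.Y };
            bitmap.Render(placement.Tile, offset);
        }
        bitmap.Invalidate(); // request a redraw so the rendered tiles show up
        return bitmap;
    }
}
```

    With the 480x320 view from the original post, a background three screens wide would be something like `backgroundImage.Source = BackgroundBaker.Bake(placements, 3 * 480, 320);`, where `backgroundImage` is an assumed Image element sitting behind the sprites.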

    Saturday, July 4, 2009 4:10 AM
  • God, I hate these forums, they don't have quote buttons either... who has a forum without paging and quote buttons in this day and age?

     

    A few things based on what you have said so far:

    You mentioned that you had them all sharing the same BitmapSource, but unfortunately I don't think Silverlight shares anything behind the scenes; it will just keep copying the data, so you won't gain any speed there (but I could be wrong).

    The Canvas.Top/Left vs RenderTransform argument... this is an interesting one. I had the exact same problem when I started, and unfortunately you need to use both... On its own, Canvas.Top/Left will position you anywhere you want *On Screen*; however, as you will be scrolling, you will need to set the render transform to be the offset *Off Screen*. I think I kinda lost my mind at one point and broke it all for the sake of it...

    The question about having one RenderTransform or multiple... I'm currently using one for each object on screen, but they are shared. (I have a virtual camera class that contains the current translation, and it's always updated in there. Note that I mean it's always updated and not overwritten; that way the rest of them don't keep creating new TranslateTransforms on every movement.) But now that you mention it, I may be going about it completely the wrong way: rather than getting each smaller thing to translate, I should get the big thing to translate...

    I don't really know about the canvas size question. I only have a small canvas, but I have a quadtree-style structure to define what is on the screen. Rather than looping through all elements on each movement, though, I have a rect that acts as a movement buffer: rather than checking only what is *In View*, it checks what's in view and a little bit more, so when I start to scroll I don't need to update my viewable tiles instantly. Let's say my tiles are 32x32: I would give a 64x64 boost to the viewing bounds rect, so I would pull in some additional tiles that are off screen. This causes a little overhead but gives MASSIVE performance benefits, because you only need to update what is on screen after you have moved > 32 pixels. This is also combined with an overridden quadtree object that does some caching: rather than going through each list and turning items on and off, it remembers what was previously viewable and pushes those onto a list, then finds what else is viewable, then works out what to turn off and on. So it won't turn off anything that is already on and then turn it back on; it just does what it needs to.
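
    A rough sketch of the movement-buffer idea, leaving the quadtree out and using a plain rectangle of tile indices instead. `TileRange` and `BufferedView` are hypothetical names, and the margin is assumed to be 32 pixels on each side (the "64x64 boost" for 32x32 tiles):

```csharp
using System;

// Inclusive range of tile columns and rows.
public struct TileRange
{
    public int MinCol, MaxCol, MinRow, MaxRow;

    public bool Contains(int col, int row)
    {
        return col >= MinCol && col <= MaxCol && row >= MinRow && row <= MaxRow;
    }
}

public class BufferedView
{
    private readonly int m_TileSize, m_ViewWidth, m_ViewHeight, m_Margin;
    private double m_AnchorX, m_AnchorY; // camera position at the last refresh
    private bool m_HasAnchor;

    public TileRange Current; // tiles that should be switched on right now

    public BufferedView(int tileSize, int viewWidth, int viewHeight, int margin)
    {
        m_TileSize = tileSize;
        m_ViewWidth = viewWidth;
        m_ViewHeight = viewHeight;
        m_Margin = margin;
    }

    // Returns true only when the cached tile range had to be recomputed.
    public bool Update(double cameraX, double cameraY)
    {
        if (m_HasAnchor
            && Math.Abs(cameraX - m_AnchorX) <= m_Margin
            && Math.Abs(cameraY - m_AnchorY) <= m_Margin)
        {
            return false; // still inside the buffer: nothing to toggle
        }

        m_AnchorX = cameraX;
        m_AnchorY = cameraY;
        m_HasAnchor = true;

        // View rectangle grown by the margin on every side, in tile indices.
        Current.MinCol = ToTile(cameraX - m_Margin);
        Current.MaxCol = ToTile(cameraX + m_ViewWidth + m_Margin - 1);
        Current.MinRow = ToTile(cameraY - m_Margin);
        Current.MaxRow = ToTile(cameraY + m_ViewHeight + m_Margin - 1);
        return true;
    }

    private int ToTile(double pixel)
    {
        return (int)Math.Floor(pixel / m_TileSize);
    }
}
```

    To get the "only toggle what changed" behaviour, the caller could copy `Current` before calling `Update` and, when it returns true, switch on only the tiles the new range contains and the old one did not (and the reverse for switching off).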

    I'm moving house at the moment, but hopefully later I will be able to revisit your question with a bit more info!

    Saturday, July 4, 2009 4:26 AM
  • First of all, thanks for taking the time to answer my questions, guys. :D

    I actually omitted some history on this implementation. Instead of having individual tiles, I used to have one WriteableBitmap in the Content Canvas. This WriteableBitmap was as big as the Content Canvas and represented the static background data (i.e. those bricks on screen). In DirectX terms this would be the "primary WriteableBitmap", and there was another "secondary WriteableBitmap" which is not shown on screen.

    When the "jump" needs to happen, the unchanged portion of the "primary WriteableBitmap" is blitted to the "secondary WriteableBitmap" through the "WriteableBitmap.Render" method. Delta tiles are then rendered onto the "secondary WriteableBitmap", followed by a "flip" which displays the "secondary WriteableBitmap" in the Content Canvas. Delta tiles are only a fraction of the entire screen, so blitting them (again, WriteableBitmap.Render) does not take much time. I found that the bottleneck is actually the part where the "primary WriteableBitmap" is blitted onto the "secondary WriteableBitmap" (even in a Release build with GPU acceleration turned on and bitmap caching enabled).

    If you imagine the Super Mario screen scrolling smoothly, there will be an occasional "pause"; in my case, once every second. Though not exactly the same as the Super Mario game, my game fixes the player in the middle of the screen, and it is very common for the user to say "hey, please walk from (0, 0) to (80, 65)" with a click of the mouse button. Naturally, multiple "jumps" need to happen along the way, which results in jittery scrolling.

    I tried using a BackgroundWorker, but as you may already know, a worker thread is not allowed to access the tile "Image" objects that are required for composing the "secondary WriteableBitmap". I’d really appreciate it if anyone could illustrate how to do "WriteableBitmap.Render" in a non-UI thread.

    Then I switched to independent Image objects on screen. Currently, in the most extreme case of my game, I will only have around 500 tiles on screen. This improves the jitter significantly, but it can still be felt slightly. So if I am able to eliminate the offsetting of each Image object (by using a humongous Content Canvas), the problem is likely to be resolved completely. I will report back as soon as I figure it out (thinking of BitmapCache and the humongous Content Canvas sends a chill down my spine…).

    Also, I believe it is right to say that a single BitmapSource is duplicated for each Image object; that may explain why I saw bigger numbers with "EnableFrameRateCounter" turned on when there are more Image objects. (By the way, can you explain what the five numbers shown by "EnableFrameRateCounter" are? I know the first one is the frame rate.)

    I welcome more suggestions and possible ways to eliminate the jitter entirely.

     

    Thanks,
    Ben.

    (Marked both these as answers, both helped in some ways.)

    Saturday, July 4, 2009 5:51 AM
  • Haha, I didn't know there was a built-in frame counter; I had my own timer going to track it... I found that using WriteableBitmap was brilliant and logical and just like making a game on any other platform, until I tried running it and it was slower than a 486 running Crysis...

    I found the only way to get *reasonable* performance (30fps) is to do the trick you mentioned earlier with 2 layers, one being the viewing canvas and one being the moving canvas. It still judders, hitting about 30fps every time you move, but that's a lot better than the 10fps it was hitting before... I've turned GPU acceleration and caching on and noticed no benefit.

    In my current game my player is usually at the center of the screen unless they are near an edge. I've basically made my own simple translation system: everything has a game position, the virtual camera is bound to the player's position, and everything else is drawn with the offset of the camera. That way I don't have to worry about maintaining local and render transform things all the time (well, I do, but it's abstracted). I just make sure everything has a game position, and when the camera gets too close to an edge it becomes unbound from the player, letting them walk to the edges of the screen... It still doesn't work 100% the way I want it to, but it's better than nothing...
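
    One way such a camera could work, as a sketch only: `VirtualCamera` and its members are hypothetical names, and "unbinding" near an edge is modelled here by simply clamping the camera to the bounds of the world.

```csharp
using System;

public class VirtualCamera
{
    private readonly double m_WorldWidth, m_WorldHeight, m_ViewWidth, m_ViewHeight;

    // Top-left corner of the view, in game (world) coordinates.
    public double X { get; private set; }
    public double Y { get; private set; }

    public VirtualCamera(double worldWidth, double worldHeight, double viewWidth, double viewHeight)
    {
        m_WorldWidth = worldWidth;
        m_WorldHeight = worldHeight;
        m_ViewWidth = viewWidth;
        m_ViewHeight = viewHeight;
    }

    // Centers the view on the player, except near the world edges, where the
    // camera stops and the player walks towards the edge of the screen instead.
    public void Follow(double playerX, double playerY)
    {
        X = Clamp(playerX - m_ViewWidth / 2, 0, Math.Max(0, m_WorldWidth - m_ViewWidth));
        Y = Clamp(playerY - m_ViewHeight / 2, 0, Math.Max(0, m_WorldHeight - m_ViewHeight));
    }

    // Game position to screen position: subtract the camera offset.
    public double ToScreenX(double gameX) { return gameX - X; }
    public double ToScreenY(double gameY) { return gameY - Y; }

    private static double Clamp(double value, double min, double max)
    {
        return Math.Max(min, Math.Min(max, value));
    }
}
```

    Every object then only stores its game position; the screen position falls out of `ToScreenX`/`ToScreenY`, or equivalently out of a single transform set to `(-camera.X, -camera.Y)` on the canvas that hosts the objects.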

    My net will be off for a few days while I get it set up at the new place, but it would be great to see how you get on, as you sound like you are doing the same stuff as me. If you can post any reports on the FPS you are getting with movement and tiles, that would be great!

    Saturday, July 4, 2009 12:40 PM
  • I am really surprised that the once-in-a-while rendering of the primary to the secondary WriteableBitmap is causing a lag... A few ideas:

    (1) Make sure you don't turn on the Bitmap Cache on the WriteableBitmap. I don't think it'll help, and in fact it may force SL to do more work (keeping a separate cached copy of something that takes zero effort to build).

    (2) Perhaps you can use multiple smaller WriteableBitmaps, like 48x320. That way you avoid the problem of having to blit the entire primary over to the secondary.

    (3) I was also wondering about this in the previous reply... Can you look ahead and blit to the secondary when it is not as busy? Not sure how to do this though (something like an OnIdle)...

    (4) Yes, it sucks that WriteableBitmap can only be touched in the UI thread. I complained about this in the SL4 wishlist as well.

    Saturday, July 4, 2009 12:50 PM
  • Q: Should I be setting Canvas.LeftProperty, Canvas.TopProperty of each Image object, or should I use Image.RenderTransform property? Which one is faster and lighter to Silverlight runtime?
    I did some tests on this recently and saw about a 10% performance gain using TranslateTransform instead of Canvas.Left and Canvas.Top. This is probably because of the extra overhead of setting attached properties using SetValue.
    Sunday, July 5, 2009 9:59 AM
  • I actually did a (very simple) test on this too, and I see them performing about equally. But I haven't used any performance testing tools or anything, just looked at the FPS counter.

    http://laumania.net/post/Using-Grid-or-Canvas-as-sprite-container.aspx

    Monday, July 6, 2009 5:52 AM
  • For my tests, both used a Canvas, since the Grid has some overhead even if you're just using TranslateTransform and not using any of its features anyway. I was using SL3 with GPU acceleration and about 4000 moving elements. I was getting about 40 FPS with Canvas.Left and Canvas.Top, and 45 FPS with TranslateTransform. With fewer elements the difference was negligible. I'll blog about my sample once SL3 is released.
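
    For readers who have not used both, the two positioning approaches being compared look roughly like this. `SpriteMover` is a hypothetical helper, not the benchmark code from the tests above, and the sketch makes no claim about which variant is faster.

```csharp
using System.Windows;
using System.Windows.Controls;
using System.Windows.Media;

public static class SpriteMover
{
    // Variant 1: position through the Canvas.Left / Canvas.Top attached properties.
    public static void MoveWithCanvas(UIElement sprite, double x, double y)
    {
        Canvas.SetLeft(sprite, x);
        Canvas.SetTop(sprite, y);
    }

    // Variant 2: attach one TranslateTransform per sprite up front...
    public static TranslateTransform AttachTransform(UIElement sprite)
    {
        var transform = new TranslateTransform();
        sprite.RenderTransform = transform;
        return transform;
    }

    // ...and then move the sprite by updating that same transform every frame,
    // so no new transform objects are created while the game is running.
    public static void MoveWithTransform(TranslateTransform transform, double x, double y)
    {
        transform.X = x;
        transform.Y = y;
    }
}
```

    In the second variant the caller keeps the TranslateTransform returned by `AttachTransform` next to the sprite and reuses it for every move.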

    Monday, July 6, 2009 7:33 AM
  • OK great, I would like to read that post; as you can see, I have asked myself the same question about using one over the other. A big difference between Canvas and Grid is the ease of implementation, if you ask me. It's much easier to implement the Canvas approach. When using the Grid you need to set up all these transforms in the load event of something, and it's pretty much code you need to repeat, as you can't just easily inherit stuff like this in Silverlight. :)
    Monday, July 6, 2009 8:59 AM
  • I also have a solution to the inheritance problem that I'll be blogging about soon and is covered in the updated game chapter in my book, available soon electronically.

    Monday, July 6, 2009 9:06 AM
  • Man, you have a lot of good stuff coming :) Really looking forward to it. Actually I have just bought Michael Show's "Game development in Silverlight" book (not sure the name is right). Guess I will buy yours when it comes out.
    Monday, July 6, 2009 9:09 AM
  • Okay folks, I have some updates regarding the humongous Canvas. The outcome is good: I created a Canvas of 20480x10240 pixels at no additional cost (in terms of both memory consumption and performance). So Ksleung was right. I have 384 tiles visible, each 64x32 pixels in size. The Content Canvas hosting them moves smoothly (it runs at a constant 58fps) since I no longer have to offset these 384 tiles when the “jump” happens.
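
    A simplified sketch of this setup, assuming a single row of tiles that only ever scrolls to the right. `ScrollingStage` and its members are hypothetical names, and swapping the recycled tile's bitmap for its new position is left out.

```csharp
using System.Collections.Generic;
using System.Windows.Controls;
using System.Windows.Media;

public class ScrollingStage
{
    private readonly TranslateTransform m_Scroll = new TranslateTransform();
    private readonly Queue<Image> m_Tiles = new Queue<Image>(); // ordered left to right
    private readonly double m_TileWidth;
    private double m_NextLeft; // stage x coordinate where the next tile goes

    // The tile pool is assumed to be wide enough to cover the view plus one tile.
    public ScrollingStage(Canvas contentCanvas, IEnumerable<Image> tiles, double tileWidth)
    {
        m_TileWidth = tileWidth;
        contentCanvas.RenderTransform = m_Scroll;
        foreach (Image tile in tiles)
        {
            Canvas.SetLeft(tile, m_NextLeft);
            contentCanvas.Children.Add(tile);
            m_Tiles.Enqueue(tile);
            m_NextLeft += tileWidth;
        }
    }

    // cameraX: how far the view has scrolled into the stage, in pixels.
    public void ScrollTo(double cameraX)
    {
        // One property set moves the whole Content Canvas; no per-tile offsets.
        m_Scroll.X = -cameraX;

        // Recycle tiles that have fully left the view on the left by moving
        // them to the right end of the row; each tile keeps its stage position
        // until it is retired.
        while (m_Tiles.Count > 0 && Canvas.GetLeft(m_Tiles.Peek()) + m_TileWidth < cameraX)
        {
            Image retired = m_Tiles.Dequeue();
            Canvas.SetLeft(retired, m_NextLeft);
            m_NextLeft += m_TileWidth;
            m_Tiles.Enqueue(retired);
        }
    }
}
```

    Because the tiles live at absolute stage coordinates, scrolling touches a single TranslateTransform per frame, and Canvas.Left is written only for the few tiles being recycled.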

    Thanks for your performance benchmark, Bill. I still use the Get/SetValue methods for now, as 384 tiles are far fewer than the 4000 in your case, so I don’t expect that to cause too much of a problem (I also suspect that creating another 384 TranslateTransforms is likely to stress the .NET runtime, since it needs to track these objects’ lifespans).

    I am not sure what is going to happen if I have more tile bitmaps in future. I hope the Silverlight runtime puts all my bitmaps on one texture at the hardware level (imho texture switching in video memory is costly).

    While this is not the optimal solution I would personally like to see (I think WriteableBitmap is more “natural” for my purposes), it does solve the problem for now. The ultimate solution I’d like to see is the ability to access Image objects (in SL3 RTW, even if read-only) from a background thread; then the jitter issue could be completely resolved.

    Thanks again for your help, guys!

    Monday, July 6, 2009 10:18 AM
  • Good to hear that you made some progress.

    One comment on WriteableBitmap.

    I don't think the SL team will provide access to the WriteableBitmap from a background thread. The reason is that SL must have full access to all UI assets whenever it is running the UI thread, so it really cannot afford to let you muck with them in the meantime.

    I do hope that the SL team can provide better WriteableBitmap APIs. Currently the only pixel access is the [] operator, which I imagine has tons of checks (security and bounds) on a per-pixel basis. This is highly inefficient, especially as we all want to use WriteableBitmap in a very heavy way. Why not provide a wholesale array copy API? For example,

    WriteableBitmap.CopyFrom(int[] Src, int SrcWidth, int SrcHeight, Rect Region, int TranslateX, int TranslateY);

    WriteableBitmap.CopyTo(int[] Dest, int DestWidth, int DestHeight, Rect Region, int TranslateX, int TranslateY);

    This way, whatever checks are needed only have to be done once.

    It is still okay to require that these APIs be called from the UI thread. But at least this approach would allow us to write decent code that performs all the maths in a background thread and makes one async call to push the result to the WriteableBitmap.
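
    The "maths in the background, one push on the UI thread" pattern might look something like the sketch below. It does not use the proposed CopyFrom/CopyTo methods (those are a wish, not an existing API); instead it assumes the `Pixels` array exposed by the released Silverlight 3 WriteableBitmap. `BackgroundComposer` and `ComposePixels` are hypothetical names, and the checkerboard only stands in for real tile compositing.

```csharp
using System;
using System.ComponentModel;
using System.Windows.Media.Imaging;

public class BackgroundComposer
{
    private readonly WriteableBitmap m_Target;
    private readonly int m_Width, m_Height;

    public BackgroundComposer(WriteableBitmap target)
    {
        m_Target = target;
        m_Width = target.PixelWidth;   // read once, on the UI thread
        m_Height = target.PixelHeight;
    }

    // Call from the UI thread. The pixel maths runs on a worker thread;
    // only the final copy touches the WriteableBitmap.
    public void ComposeAsync()
    {
        var worker = new BackgroundWorker();
        worker.DoWork += (sender, e) =>
        {
            e.Result = ComposePixels(m_Width, m_Height, 32);
        };
        worker.RunWorkerCompleted += (sender, e) =>
        {
            // Raised back on the UI thread when started from it.
            int[] pixels = (int[])e.Result;
            Array.Copy(pixels, m_Target.Pixels, pixels.Length);
            m_Target.Invalidate();
        };
        worker.RunWorkerAsync();
    }

    // Pure function with no UI objects involved, so it is safe off the UI
    // thread. Fills a 32-bit ARGB buffer with a two-tone checkerboard.
    public static int[] ComposePixels(int width, int height, int tileSize)
    {
        var pixels = new int[width * height];
        for (int y = 0; y < height; y++)
        {
            for (int x = 0; x < width; x++)
            {
                bool dark = ((x / tileSize) + (y / tileSize)) % 2 == 0;
                pixels[y * width + x] = dark
                    ? unchecked((int)0xFF404040)
                    : unchecked((int)0xFFC0C0C0);
            }
        }
        return pixels;
    }
}
```

    The single Array.Copy still costs time on the UI thread, so this moves the compositing work off the UI thread rather than removing the final push.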

    I hope the SL team would consider this.  Perhaps they already did, but the frustrating thing is that nobody knows!

    Monday, July 6, 2009 1:30 PM
  • I completely agree with you about WriteableBitmap being a lot better suited to this sort of task; it's just that the performance bottleneck is a killer...

    When you said you were getting around 54 Fps constant with a massive canvas, how many tiles did you have (roughly) in it, and how many tiles were on screen at once (roughly)? For my game the resolution will be anything between 640x480 and 1280x1024 (or whatever the resolution is around that size). This means I would have a few hundred on screen at once, but they are also layered, so although there may be, let's say, 200 flat tiles, there may be 3x this many actually on the screen, paired with any other moving objects... I think I'm going to try and test this tonight. I got a bit carried away with writing some of the logic the other day... oh, and moving house, so I will see if I can bring back any numbers from my example...

    At the moment the juddering on translating is killing my game, and if I could get rid of that it would probably be fine and I can do a little dance! OH, also while we are on topic...

    So far we have been comparing Canvas.Top/Left vs TranslateTransform.X/Y; for a reason I can't remember off the top of my head, I have had to use both. I think the static tiles don't ever change their Canvas.Top/Left, but as I have lots of moving entities, they require the canvas position to move as well as the render transformations... well, I think so, off the top of my head anyway...

    In your game I'm guessing you will have the goomba equivalent which will be moving around a bit. How have you gone about doing its movement with a scrolling background? Same route as me, or some other gem?

    Tuesday, July 7, 2009 3:21 AM
  • Well, Ksleung, I am not sure that is going to solve my problem. I was merely doing a WriteableBitmap.Render(m_SourceImage, new TranslateTransform) when the "jump" needs to happen (roughly once per second), and it already causes jittery movement. I can of course lower my frame rate to make the "jump" transition smooth (if each frame takes longer than what the jump transition requires, the jump won’t be noticeable). But I doubt I’d want to go there. I do trust the SL team will improve this "blitting" over time though.

    Gofit, I have 384 tiles in the humongous Content Canvas at any one time. Older tiles that have scrolled out of view are recycled and used in place of newer tiles, which then get scrolled into view. The only reason I use the humongous canvas is the limitation I stated in the original post (to avoid 384 calls to Get/SetValue on 384 Image objects, which proved to be costly).

    The tiles are actually the "background". 384 isometric tiles make up the map that covers the entire displayable area. I have yet to put static buildings on the map, but I don’t expect that to cause much of a problem. The frame rate for a scrolling background of this many tiles is actually quite satisfying on my single-core laptop. :D

    Tuesday, July 7, 2009 10:45 AM
  • I revisited my tile system to see if there were any immediate issues I could look at updating... and it turns out that I was already translating the canvas, so I'm not positioning every single one of the tiles... One thing, though, is that I'm using 32x32 tiles, so I would end up having more, smaller elements compared to your fewer, larger elements... I won't be in my actual game, but as I'm just prototyping everything I thought I would start with them.

    Anyway, my framerate is awful. It's perfect if you are stationary (constantly > 60fps with menus and other things going on), but the scrolling kills it. It seems to be marginally slower if I just put all the tiles into the canvas and let SL deal with what to display on the given page. However, if I use a quadtree system to enable/disable children as and when needed, it is like 5fps faster.

    I'm not sure if the bottleneck comes from having lots of things to draw on screen or just from having lots of things in the canvas... I'm hoping to be able to get >30fps consistently with the scrolling and lots of stuff on screen at once, and hopefully some simple particle-style effects... although I'm not sure it will happen :( as I'm struggling to get >20 fps when scrolling with more than 1 tile layer at the moment... I've still got a few optimizations up my sleeve, like baking the static things together and a few other tweaks...

    I also tried testing my app on my other laptop, which is a dedicated gaming one... and it got pretty much the same performance as my crappy little laptop, which I found odd... although it does only have a single core (it's >3 years old), and I'm guessing SL still hammers the CPU more than anything else...

    Wednesday, July 8, 2009 7:49 AM