This is a basic move I've worked up in the last few days. Much like my previous comment on how to in-fill the depth image to avoid shadows, this is a technique to make the depth data more consistent and reliable.
Anyone working with the depth data knows that while broad shapes are consistent, smaller shapes (and in particular their edges) can be hard to work with reliably because of the noise in the Kinect's data. This noise isn't random - or at least, it isn't purely random - but rather comes from a combination of limited resolution, the offset between the IR projector and camera, and of course the nature of the surface of whatever you're looking at.
My basic solution is to use a running average frame.
Basically, it works like this:
1. Get a frame of depth data (i.e. the imageframe)
2. Process that frame into a grayscale image (I'm actually doing B&W because my larger project requires me to look at only a narrow slice of depth near the hands).
3. Pass that byte array to a DepthFrame class I wrote which, using get/set functions for all the variables, automatically does a few things:
(a) Puts the newest byte array on top of the stack and moves the others down, keeping a total of three
(b) Calculates a byte array that represents the average of those three, applying a threshold cutoff so that I end up with meaningful B/W instead of blurry gray at the edges
(c) Outputs the averageFrame back to the main program
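The steps above can be sketched roughly like this (a minimal Python sketch rather than the actual C#/Kinect code; the class name, the list-of-ints frames, and the threshold value of 128 are all my illustrative assumptions, not the original implementation):

```python
# Sketch of the running-average technique described above.
# Frames are assumed to be flat lists of pixel values already
# processed into B/W (0 or 255), as in step 2.

class DepthFrameAverager:
    """Keeps the last N binary depth frames and averages them."""

    def __init__(self, depth=3, threshold=128):
        self.depth = depth          # frames to keep (three, per the post)
        self.threshold = threshold  # cutoff that snaps gray back to B/W
        self.stack = []             # newest frame first

    def push(self, frame):
        # (a) newest frame on top, others move down, total capped at `depth`
        self.stack.insert(0, frame)
        del self.stack[self.depth:]

    @property
    def average_frame(self):
        # (b) per-pixel mean of the stacked frames, then threshold so the
        # edges come out as clean black/white instead of blurry gray
        n = len(self.stack)
        return [
            255 if sum(f[i] for f in self.stack) / n >= self.threshold else 0
            for i in range(len(self.stack[0]))
        ]

# (c) usage: feed each processed frame in, read the stabilized frame out
avg = DepthFrameAverager()
avg.push([255, 0, 255, 0])
avg.push([255, 255, 0, 0])
avg.push([255, 0, 0, 0])
print(avg.average_frame)  # → [255, 0, 0, 0]
```

A pixel survives only if most of the recent frames agree it is lit, which is exactly why flickery edge noise gets suppressed while stable shapes pass through unchanged.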
The result is that the image is much, much more stable while maintaining almost exactly the same framerate as if I were looking only at the most recent depth data.
I'll post video later this evening, hopefully.