How does Kinect Coordinate Mapping work?

  • Question

  • I am trying to understand how the SDK's coordinate mapper functions work, specifically the mapColorToCamera and mapDepthToCamera functions. I understand how to use them, but I am interested in how the SDK calculates these transformations. All I know is that it has something to do with the factory-calibrated camera parameters of the Kinect itself. 

    I am interested in this because I am doing research in which I track points on the color image and need to know their locations in the depth image, so I want to understand the process behind the mapping. This also raises a question: is there any way to directly find the depth associated with a particular color point, or a group of color points, instead of mapping the entire image (to reduce calculation costs)? 

    Wednesday, April 5, 2017 7:44 PM

All replies

  • Nope. Not entirely sure myself on the details. This is just my understanding of things so far.

    I'm sure you've noticed there's a lookup table you can get in order to map a color pixel to a depth pixel (DepthCoordinates). Each cell of the lookup table is an X,Y pair that points to a pixel in the depth frame. But the mapping changes depending on where objects are in the IR frame, so you can't cache it. The SDK is closed source, so there's also no way of knowing exactly how it produces the table.
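    Using such a per-frame lookup table to read the depth under a few tracked color points might look like the sketch below. Everything here is simulated with plain Python dictionaries (the table entries, depth values, and the `depth_at_color_point` helper are all made up for illustration, not SDK calls):

```python
# Sketch of reading depth for tracked color points via a per-frame lookup
# table (modelled on the DepthCoordinates idea; all data here is fake).

# lookup[(color_x, color_y)] -> (depth_x, depth_y); regenerated every frame,
# since the mapping shifts as objects move in the scene.
lookup = {(10, 20): (5, 8), (40, 30): (22, 14)}   # fake table entries
depth_frame = {(5, 8): 1830, (22, 14): 2410}      # fake depth values, in mm

def depth_at_color_point(color_xy):
    """Depth (mm) under a tracked color pixel, via this frame's lookup table."""
    depth_xy = lookup.get(color_xy)
    return None if depth_xy is None else depth_frame[depth_xy]

d = depth_at_color_point((10, 20))  # -> 1830
```

    Note that even though you only read a few points, the table itself still has to be generated for the whole frame each time, which is the cost the question was trying to avoid.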

    The parameters stored in the sensor are there so that each sensor produces the same result with the same algorithm.

    The SDK runs the mapping on the GPU, so, if anything, it should already be close to optimal.

    The only part of the SDK that could perhaps be optimized is that while half of the frame data is generated on the GPU, what you get from the service is a copy marshalled from the GPU to the CPU. And when you get it in your application, you sometimes send it right back to the GPU for use. Lots of copy costs all around.

    By the way, you need all this post-processing (like CoordinateMapper) on the frames mostly because the amount of data that the sensor, and even more so the service, passes around is huge. The data is passed around in compact forms just to keep the 30 FPS constraint, and you end up paying conversion costs in order to use it in your application.

    So unless you create your own GPU-based algorithm for mapping color to depth, there's really no way (that I've found, at least) to optimize the process.

    Friday, April 7, 2017 8:48 AM
  • I've worked on a custom coordinate-system pipeline, so I have a pretty good idea of how it works.

    You can do a direct coordinate translation from depth to color: you just take the vector (X, Y, Depth) and do the maths that gives you the X,Y on the color frame.
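    The maths alluded to here is essentially a pinhole-camera model: deproject the depth pixel to a 3D point using the depth camera's intrinsics, transform it into the color camera's frame with the extrinsics, then project it with the color camera's intrinsics. A minimal sketch follows; all the numbers are made-up placeholders (the real values come from the Kinect's factory calibration, and a real model also includes rotation and lens distortion):

```python
# Illustrative pinhole-camera mapping of a depth pixel to a color pixel.
# All intrinsics/extrinsics below are invented placeholders, NOT the real
# Kinect calibration, which the SDK reads off the sensor.

# Depth camera intrinsics (focal lengths, principal point), in pixels.
FX_D, FY_D, CX_D, CY_D = 365.0, 365.0, 256.0, 212.0
# Color camera intrinsics.
FX_C, FY_C, CX_C, CY_C = 1060.0, 1060.0, 960.0, 540.0
# Extrinsics, depth camera -> color camera: a pure translation here
# (the two cameras sit side by side); a real calibration also has a
# small rotation and distortion terms.
TX, TY, TZ = 0.052, 0.0, 0.0

def depth_to_color(x_d, y_d, depth_m):
    """Map a depth pixel (x_d, y_d) with depth in metres to a color pixel."""
    # 1. Deproject: depth pixel -> 3D point in the depth camera's frame.
    X = (x_d - CX_D) * depth_m / FX_D
    Y = (y_d - CY_D) * depth_m / FY_D
    Z = depth_m
    # 2. Transform into the color camera's frame (translation only here).
    Xc, Yc, Zc = X + TX, Y + TY, Z + TZ
    # 3. Project with the color camera's intrinsics.
    x_c = FX_C * Xc / Zc + CX_C
    y_c = FY_C * Yc / Zc + CY_C
    return x_c, y_c

x_c, y_c = depth_to_color(256.0, 212.0, 2.0)  # centre depth pixel, 2 m away
```

    Notice how the depth value appears in the denominator of the projection step: that is exactly why the reverse (color-to-depth) direction can't be computed directly.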

    Direct color-to-depth mapping is much more complicated, because you don't have the "depth" of a color pixel that the maths requires. So what I am doing (and what I think the SDK is doing) is a Depth-To-Color mapping on the whole frame, storing in the color frame the depth X,Y that originated each transformation. You can then use that resulting frame as a lookup table.
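    That whole-frame trick can be sketched as follows: project every depth pixel into color space and record, per color pixel, the depth coordinates that landed there. This is a toy example with tiny frames and a fake one-pixel-shift "projection" (a real implementation would use the calibrated camera model, and presumably run on the GPU):

```python
# Toy sketch of the Depth-To-Color lookup-table trick described above.
# The "projection" here is a fake one-pixel shift; the real SDK would use
# the calibrated camera model instead.

DEPTH_W, DEPTH_H = 4, 4
COLOR_W, COLOR_H = 6, 4
SHIFT = 1  # pretend projecting to color just shifts X by one pixel

def project_depth_to_color(x_d, y_d, depth):
    # Placeholder for the real pinhole projection (depth would matter there).
    return x_d + SHIFT, y_d

def build_color_to_depth_table(depth_frame):
    """For each color pixel, store the (x_d, y_d) depth pixel that maps to it."""
    table = [[None] * COLOR_W for _ in range(COLOR_H)]
    for y_d in range(DEPTH_H):
        for x_d in range(DEPTH_W):
            x_c, y_c = project_depth_to_color(x_d, y_d, depth_frame[y_d][x_d])
            if 0 <= x_c < COLOR_W and 0 <= y_c < COLOR_H:
                table[y_c][x_c] = (x_d, y_d)  # remember the originating depth pixel
    return table

depth_frame = [[1000] * DEPTH_W for _ in range(DEPTH_H)]  # flat 1000 mm scene
table = build_color_to_depth_table(depth_frame)
# Color pixel (2, 1) was produced by depth pixel (1, 1), so its depth is:
x_d, y_d = table[1][2]
depth_of_color_pixel = depth_frame[y_d][x_d]
```

    Cells left as None are color pixels no depth pixel projected onto (occlusions and the non-overlapping field of view in the real sensor), which is why mapped frames have holes.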

    Vicente Penades

    Saturday, April 8, 2017 8:37 AM