The relationship between real depth and Kinect disparity

  • Question

  • I know there is a relationship between the real depth Z and the raw disparity r, which can be written as Z = A / (B - r), where A and B are constants.

    For a depth value read from a depth image, the range should be 0 to 2047, because the Kinect has a precision of up to 11 bits. I am not sure whether the value I read from the depth image (between 0 and 2047) is the value of r or the value of Z.
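The inverse-disparity model in the question can be sketched as follows. Note that the constants A and B below are illustrative placeholders only; real values must come from calibrating your own sensor:

```python
def raw_disparity_to_depth_m(r, A=348.0, B=1091.5):
    """Convert a raw 11-bit Kinect disparity value r (0-2047) to depth in
    meters using the inverse-disparity model Z = A / (B - r).

    A and B are placeholder constants for illustration; the actual values
    depend on the individual sensor and must be obtained by calibration.
    """
    if r >= B:
        raise ValueError("disparity r is outside the model's valid range")
    return A / (B - r)
```

With this form of the model, depth increases as the raw disparity value increases, which is one way to sanity-check whether the numbers you read are r or Z.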

    There is also another situation.

    I captured a few depth images from the Kinect in .tif format. However, when I use the data cursor to read the depth information, it shows different kinds of numbers, which confuses me. Two different readings are shown below:

    The first type shows:
                X: 170  Y: 320
                RGB: 0.404 0.404 0.404
    The second type shows:
                X: 172  Y: 132
                RGB: 159, 0, 0

    I just wonder what the indices and the two types of RGB values mean. Does the RGB value represent the Kinect disparity r, or something else?

    Thank you very much to whoever reads this and helps!

    Tuesday, January 31, 2012 8:42 PM


  • The text you've copied into your post about the disparity refers to the very low-level USB protocol data that the Kinect sends to the computer. It is from the early OpenKinect/libfreenect days. With OpenNI and the Kinect SDK, the depth data is already converted into millimeters for you, so you do not need to worry about calibration or conversion.

    I'm not sure what you mean in the second part of your question. It seems like you're trying to save the depth data to a file. Keep in mind that the depth data is two bytes per pixel and also includes the player indexes. You need to do two things:

    1) Convert the data returned from the depth stream into separate depth and player-index values, then save them to two different files (or just save the depth). See the sample projects and documentation for how to do this.
    2) Save using either a file format that supports 16-bit values, or a lossless file format into which you pack the data as two 8-bit values. For example, with an RGBA PNG you could put the low depth byte into R and the high depth byte into G, leaving B and A as zero, or you could be more space-efficient and pack two depth pixels into RG and BA.
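The two steps above can be sketched in Python (the Kinect SDK itself is C#/C++, so this is only an illustration of the bit manipulation). In the SDK's depth-with-player-index format, the low 3 bits of each 16-bit pixel carry the player index and the remaining bits carry the depth in millimeters:

```python
def split_depth_pixel(raw16):
    """Split one 16-bit depth-stream pixel into (depth_mm, player_index).

    In the Kinect SDK's depth-with-player-index format, the low 3 bits
    hold the player index (0 = no player) and the upper bits hold the
    depth in millimeters.
    """
    player_index = raw16 & 0x7   # low 3 bits
    depth_mm = raw16 >> 3        # remaining 13 bits
    return depth_mm, player_index

def pack_depth_to_rg(depth_mm):
    """Pack a 16-bit depth value into (R, G) bytes for lossless storage
    in an RGBA image: low byte in R, high byte in G, B and A left zero."""
    return depth_mm & 0xFF, (depth_mm >> 8) & 0xFF
```

Reversing the packing when you read the file back is just `r | (g << 8)`, so no precision is lost.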

    If you pack the data into RGBA, it is space-efficient but the image won't display correctly. You'll see some odd color artifacts, and the image may appear doubled depending on how you pack it. If you're only interested in displaying the images, not processing the data, you could map the depth data (from 400 to 4000 mm) into 0-255 and set R, G, and B to that value to get a grayscale image, or use a more complicated color mapping for different depth ranges. That produces visually pleasing depth images suitable for display, but it destroys the accuracy of the data, so you cannot use that method for storing data you intend to process later.
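The display-only grayscale mapping described above can be sketched like this (the function name and clamping behavior are my own choices, not part of the SDK):

```python
def depth_to_gray(depth_mm, near=400, far=4000):
    """Map a depth in millimeters to a 0-255 grayscale value for display.

    Depths outside [near, far] are clamped. This is lossy (3600 mm of
    range collapsed into 256 levels), so use it only for visualization,
    never for data you intend to process later.
    """
    clamped = max(near, min(far, depth_mm))
    return round((clamped - near) * 255 / (far - near))
```

Writing the same value into R, G, and B then gives the grayscale depth image described in the paragraph above.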

    -- Joshua Blake, Microsoft Surface MVP, Technical Director, InfoStrat Advanced Technology Group
    • Marked as answer by zjflwh Wednesday, February 1, 2012 5:14 PM
    Wednesday, February 1, 2012 4:36 PM