What is the meaning of Kinect depth image intensity within the range 0 to 255?

  • Question

  • Can anyone here clear this up for me?

    I know that depth intensity means distance and that Kinect returns a uint16 image. But I recently read some papers that show a graph where depth intensity is inversely proportional to the distance from the camera to the object. My question is: if the depth value means distance, then what is the depth intensity in the range [0, 255]? I already emailed the authors, but they did not reply. Sorry for the question; I am very confused, which is why I am posting. If anyone has any idea, please share it with me.
    I have also included both papers here.

    1. paper link:

    2.paper link:

    3. image link

    • Edited by sufiian Saturday, July 8, 2017 4:56 AM
    Saturday, July 8, 2017 4:56 AM

All replies

  • First of all, distance is not depth intensity. Distance is the uint16 frame Kinect gives you. Depth intensity is the intensity of the grayscale color you get when you convert the distance to the 0-255 range, which is the same as the RGB value range. The whiter the color value, the more intense it is.

    All the papers you present are about recognizing steps on staircases, etc. The chart in the third link is about that conversion to color. It shows how they map the RGB value 255 (which is white when given to all three channels) to 0 distance from the camera, and the RGB value 0, which is black, to the max distance. In the chart they use 290cm as the distance, which is 2900mm, and they say it corresponds to about 45 rgb value. They also explain how they recognize the steps from changes in the RGB value.
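
    The conversion described above can be sketched in a few lines of Python. This is only a minimal sketch: it assumes a linear mapping, and `max_mm` is a hypothetical maximum distance. The papers may use a different curve or range.

```python
def depth_to_intensity(depth_mm, max_mm):
    """Map a distance in millimetres to a 0-255 grayscale value.

    Assumed convention: 0 mm -> 255 (white), max_mm -> 0 (black).
    The linear shape and max_mm are assumptions, not taken from the papers.
    """
    if max_mm <= 0:
        raise ValueError("max_mm must be positive")
    # Clamp so that distances outside the range stay inside 0..max_mm.
    clamped = min(max(depth_mm, 0), max_mm)
    return round(255 * (1 - clamped / max_mm))


# Intensity falls as distance grows, which is the inverse trend in the chart.
for d in (0, 1000, 2000, 4000):
    print(d, depth_to_intensity(d, max_mm=4000))
```

    With this kind of mapping the intensity only tells you the distance if you also know the maximum distance that was used for scaling.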

    Also take note that those papers might not be about Kinect v2. In fact I looked up Kinect in both papers and neither of them said "v2", so it's probably the old sensor, Kinect v1 (the Xbox 360 sensor). So be careful about the content and about which sensor is used in the papers you read. Your conclusions might not apply to the newer sensor.

    Saturday, July 8, 2017 6:23 AM
  • Thanks for your comment, sir. I am using Kinect v2, so can I not get this graph using v2?

    "they say it corresponds to about 45 rgb value"

    What do you mean by this line?

    The Kinect SDK v2 provides the Depth Basics WPF sample, and I can save the picture by pressing the screenshot button. Is this image mapped from depth into RGB, or do I need to modify it?

    Sir, using Kinect I get the distance of a point with this method: I save the image frame in uint16 format, take the coordinates of the pixel, and then read the pixel value to get the distance. Is that OK? I also measured the distance manually, but I got a 10-15 cm difference; for example, Kinect gives 131cm and manually I got 122cm.

    Saturday, July 8, 2017 7:47 AM
  • Ok...the depth frame basically provides a ushort array. That's not an image. But you can make it an image if you create a grayscale image out of that ushort array. To do so, you need to convert the distance to RGB values. When I say RGB I don't mean the color frame. I mean normal RGB values, the way most images are represented. RGB(A) is a format for color and transparency values. And to be more exact, most Kinect v2 sensors produce YUY2 color images, which are then converted to RGBA format for ease of use.

    So your question about Depth Basics is invalid. The sample just converts the frame it's given by the Kinect service from raw data to an image, hence the conversion to grayscale RGB values (0-255).

    That said, I'm not that familiar with the Depth Basics sample. If the image saved to your disk is 512x424, that's the resolution of the depth frame as it comes from the service, and if the image you save is Full HD (1920x1080), then it's mapped to the color frame. But that's beside the point.

    As for the 45 RGB Value issue... If you look at the graph in the third link, the far right side, you'll see the lowest point maps to the 290cm value on the X axis and that corresponds to the 45 intensity value on the Y axis. But intensity is the same as saying a grayscale RGB value where R,G and B have the same 0-255 value. Look up how to create a grayscale image out of a normal RGB image and you'll probably get it.
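
    To make the "grayscale RGB value" idea concrete, here is a small Python sketch. The function names are made up for illustration, and the Rec. 601 luma weights are one common way to turn color into gray, not necessarily what the papers or the SDK use.

```python
def gray_to_rgb(intensity):
    """A gray pixel repeats one 0-255 value in all three channels."""
    if not 0 <= intensity <= 255:
        raise ValueError("intensity must be in 0..255")
    return (intensity, intensity, intensity)


def rgb_to_gray(r, g, b):
    """Collapse a color pixel to one intensity using Rec. 601 luma weights."""
    return round(0.299 * r + 0.587 * g + 0.114 * b)


print(gray_to_rgb(45))          # the intensity read off the chart at 290cm
print(rgb_to_gray(45, 45, 45))  # a gray pixel converts back to the same value
```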

    The fact that you are using a Kinect v2 doesn't mean that the people who wrote the paper did. Kinect v1 and v2 are totally different. Aside from various specs, the way they use the IR to figure out depth is totally different. Kinect v1 uses an IR light pattern projection whereas Kinect v2 uses time-of-flight. So you have to be careful what you read and understand what sensor it refers to.

    Also Kinects are usually calibrated by MS. But that doesn't mean the calibration is optimal. There's always an error.

    Saturday, July 8, 2017 8:18 AM
  • Sir, I need two solutions at this time.

    1) I have the depth frame, but the manual distance and the Kinect distance differ a little, by almost 10 to 20 cm. What do I need to do to solve this? Generally I take the frame directly using MATLAB and save it as a uint16 PNG. Is there anything wrong with that?

    2) If I scale the distance values, for example 0 to 255 and max depth distance to zero, and then draw the graph between scaled value and distance, can I get this graph?

    Another question: why did they map it as 0 to 255 and max depth distance to zero? If I map 0 to 0 and max depth distance to 255, is there anything wrong?

    Here is the Depth Basics code that visualizes the image:

    private unsafe void ProcessDepthFrameData(IntPtr depthFrameData, uint depthFrameDataSize, ushort minDepth, ushort maxDepth)
    {
        // depth frame data is a 16 bit value
        ushort* frameData = (ushort*)depthFrameData;

        // convert depth to a visual representation
        for (int i = 0; i < (int)(depthFrameDataSize / this.depthFrameDescription.BytesPerPixel); ++i)
        {
            // Get the depth for this pixel
            ushort depth = frameData[i];

            // To convert to a byte, we're mapping the depth value to the byte range.
            // Values outside the reliable depth range are mapped to 0 (black).
            this.depthPixels[i] = (byte)(depth >= minDepth && depth <= maxDepth ? (depth / MapDepthToByte) : 0);
        }
    }
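
    For anyone following along outside C#, the same loop can be sketched in plain Python. The value of `MAP_DEPTH_TO_BYTE` is an assumption (the SDK sample is believed to define `MapDepthToByte` as `8000 / 256`), and the 500 and 4500 limits in the example are assumed reliable-range values, so check your own copy of the sample.

```python
# Assumed value: believed to match MapDepthToByte = 8000 / 256 in the sample
# (integer division, so 31). Verify against your copy of the SDK sample.
MAP_DEPTH_TO_BYTE = 8000 // 256


def depth_frame_to_bytes(frame_mm, min_depth, max_depth):
    """Convert 16-bit depths (mm) to one display byte per pixel.

    Depths outside [min_depth, max_depth] become 0 (black). The final
    `& 0xFF` mimics the C# (byte) cast, which keeps only the low 8 bits.
    """
    pixels = []
    for depth in frame_mm:
        if min_depth <= depth <= max_depth:
            pixels.append((depth // MAP_DEPTH_TO_BYTE) & 0xFF)
        else:
            pixels.append(0)
    return pixels


# Hypothetical depths in mm; 500 and 4500 are assumed range limits.
print(depth_frame_to_bytes([0, 500, 1310, 4500, 6000], 500, 4500))
# prints [0, 16, 42, 145, 0]
```

    Note that this mapping makes farther pixels brighter, so a screenshot from the sample runs in the opposite direction to a chart where intensity drops with distance.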

    • Edited by sufiian Saturday, July 8, 2017 9:03 AM
    Saturday, July 8, 2017 8:58 AM
  • 1) Depends... For example, is your sensor set up at 1m height with no tilt? If so, then it's a calibration problem and you should look at how to manually calibrate the Kinect. But be warned, it's a pain, because you'll have to write many other things yourself. If you don't want to, learn to live with it. Or do it the hacky way and subtract a value from the distance, per pixel.
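
    The hacky per-pixel subtraction could look roughly like this in Python. It is a sketch under two assumptions: the error is a constant offset everywhere in the frame, and a raw value of 0 means "no reading" and should be left alone. The function names are made up for illustration.

```python
def estimate_offset_mm(pairs):
    """Mean of (sensor - manual) over measured pairs, in millimetres."""
    if not pairs:
        raise ValueError("need at least one (sensor_mm, manual_mm) pair")
    return sum(sensor - manual for sensor, manual in pairs) / len(pairs)


def correct_depth(frame_mm, offset_mm):
    """Subtract the offset from every valid pixel; 0 (no reading) stays 0."""
    return [max(d - offset_mm, 0) if d > 0 else 0 for d in frame_mm]


# The single pair reported in this thread: Kinect 131cm, manual 122cm.
offset = estimate_offset_mm([(1310, 1220)])
print(offset)                                  # 90.0
print(correct_depth([0, 1310, 2000], offset))  # [0, 1220.0, 1910.0]
```

    One pair is not much to go on. Measuring at several distances would show whether the error really is constant or grows with range.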

    If the sensor is high up and tilted to look down, the manual distance will be different from what the sensor sees. The distance the sensor sees is measured along the Z axis coming off the camera origin. So if the sensor is high up and looking down, the straight line coming off the IR sensor is the reference axis. Also note that along the sides of the frame, the data are not reliable.
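
    To see how much the Z-axis distance can differ from a straight-line tape measurement, here is a rough Python sketch using a pinhole camera model. The intrinsics `FX`, `FY`, `CX`, `CY` are hypothetical round numbers for a 512x424 depth image, not calibrated values for any real sensor.

```python
import math

# Hypothetical pinhole intrinsics for a 512x424 depth image. Real values
# differ per device and should come from the sensor's own calibration.
FX = FY = 365.0          # focal lengths in pixels (assumed)
CX, CY = 256.0, 212.0    # principal point in pixels (assumed)


def pixel_to_point(u, v, depth_mm):
    """Back-project pixel (u, v) with depth Z into camera-space X, Y, Z (mm)."""
    x = (u - CX) * depth_mm / FX
    y = (v - CY) * depth_mm / FY
    return (x, y, float(depth_mm))


def straight_line_range(u, v, depth_mm):
    """Euclidean distance from the camera origin to the point."""
    x, y, z = pixel_to_point(u, v, depth_mm)
    return math.sqrt(x * x + y * y + z * z)


# At the image centre the two distances agree; off-centre the range is longer.
print(straight_line_range(256, 212, 1310))
print(straight_line_range(456, 212, 1310))
```

    If the tape measure goes from the sensor to a point that is not on the optical axis, it measures the longer straight-line range, while the depth frame reports only the Z component.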

    You should read up on Kinect v2 documentation in MSDN.

    Also, the fact that you're asking me whether saving your image as uint16 is wrong means you don't understand your problem. It depends on what you're trying to achieve. The base depth frame is by nature a uint16 1D array, which you can also interpret as a 2D array since you know it's 512x424. Whether you also have to convert it to a single-byte representation depends on what you need it for. Are you doing it because you need it or because Depth Basics does it?
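
    Reading one pixel out of that 1D array comes down to an index calculation. A minimal Python sketch, assuming the frame is stored row by row (row-major); `depth_at` is a made-up helper name:

```python
WIDTH, HEIGHT = 512, 424  # depth frame size mentioned in this thread


def depth_at(frame, x, y):
    """Return the depth at column x, row y of a row-major 1D frame."""
    if not (0 <= x < WIDTH and 0 <= y < HEIGHT):
        raise IndexError("pixel outside the depth frame")
    if len(frame) != WIDTH * HEIGHT:
        raise ValueError("frame must hold WIDTH * HEIGHT values")
    return frame[y * WIDTH + x]


# Synthetic frame for illustration only: every pixel stores its own index.
frame = list(range(WIDTH * HEIGHT))
print(depth_at(frame, 0, 0))   # 0
print(depth_at(frame, 10, 2))  # 1034, i.e. 2 * 512 + 10
```

    In MATLAB the same pixel would be `img(y + 1, x + 1)`, since indexing is 1-based and the row comes first. Swapping x and y is an easy way to read the distance of the wrong point.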

    2) Can you get what graph? The third link? You can get "a" graph where the Y axis is mapped to RGB values(0-255) and the X axis is mapped to distance values (whether it's cm, mm or m depends on what you change your values to). But not the exact same thing. Both papers you showed are trying to do stair recognition. Is that what you're doing?

    No there's nothing wrong in mapping 0 to 0 and max dist to 255. But there's a convention when visualizing depth where black(255) usually means far away and white(0) means close up. That convention is also how we visualize the IR frame. Now if there's a valid reason for your case to map max distance to 255, then do it, but know why you're doing it.

    Saturday, July 8, 2017 11:45 AM
  • Thanks a lot, sir. Much of my confusion is gradually clearing up.

    "2) Can you get what graph? The third link?" .. Yes, the third link.

    " the Y axis is mapped to RGB values(0-255) " It means the distance value converted to RGB (0-255), right?

    "Is that what you're doing?" .. Yes, I am doing the same stair recognition using Kinect.

    Can you suggest in which position Kinect gives an absolutely perfect result, almost the same as the manual result?

    My Kinect setup is: height = 42 cm, distance from object = 120 cm. I can't measure the angle, but the sensor faces the object straight on. I think there is no tilt; it looks straight forward.

    • Edited by sufiian Saturday, July 8, 2017 1:22 PM
    Saturday, July 8, 2017 1:11 PM
  • 2) Yes. Just don't expect the same results.

    This is one of those cases which go against the design decisions of the makers. The concept for Kinect is that it's placed between 0.6 m and 1.8 m from the floor, the higher the better, but with the floor inside the field of view. Now the less tilt the better. And no obstructions. Also any reflective, refracting or absorbent materials will affect the result.

    Another thing I remembered: the depth accuracy is not very good when you first fire up the sensor. The sensor needs to warm up a bit (for 20min or so) until it works in its optimal condition for more accurate results. Perhaps that's the cause of your high error.

    Not really sure why you have to save pngs out of it. You can record and playback the streams in Kinect Studio so you can test stuff. I'm guessing you want to create something that can work realtime.

    Saturday, July 8, 2017 3:53 PM
  • "Not really sure why you have to save pngs out of it" .. Sir, at this stage I am just working with frames I capture manually. Once this works, I will move on to real time. To store the frame values, I just save them as a uint16 image.

    "No there's nothing wrong in mapping 0 to 0 and max dist to 255. But there's a convention when visualizing depth where black(255) usually means far away and white(0) means close up." .. Sir, in this case what is the reason to map it like that, 0 for white and 255 for black?

    Finally, I am really thankful to you, sir. You are great. You made my day. A lot of confusion is now cleared up. Thanks a lot, sir.

    • Edited by sufiian Saturday, July 8, 2017 5:53 PM
    Saturday, July 8, 2017 5:41 PM
  • No need for sir..I'm not that old :P.

    Anyway good luck on your project.

    Saturday, July 8, 2017 5:59 PM
  • Sorry to interrupt you again, bro.

    You said:

    "No there's nothing wrong in mapping 0 to 0 and max dist to 255. But there's a convention when visualizing depth where black(255) usually means far away and white(0) means close up."

    My question is: in this case, what is the reason to map it like that, 0 for white and 255 for black?

    Sunday, July 9, 2017 6:17 PM
  • Yeah I made a mistake when writing that part. RGB code for black is (0,0,0) and for white it's (255,255,255).

    So it should be

    "No there's nothing wrong in mapping 0 to 0 and max dist to 255. But there's a convention when visualizing depth where black(0) usually means far away and white(255) means close up."

    So basically 0 distance should be 255 in color and max distance should be 0 in color.

    Sometimes you have the answer in your head but mess up the delivery :P. Sorry about that.
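
    A short Python sketch may help show that the two conventions carry the same information and differ only by a flip. It assumes linear scaling, a hypothetical `MAX_MM`, and that a raw depth of 0 means "no reading"; the function names are made up.

```python
def direct_byte(depth_mm, max_mm):
    """Near is dark: 0 mm -> 0, max_mm -> 255."""
    return round(255 * min(max(depth_mm, 0), max_mm) / max_mm)


def inverted_byte(depth_mm, max_mm):
    """Near is bright: 0 mm -> 255, max_mm -> 0."""
    return 255 - direct_byte(depth_mm, max_mm)


def display_byte(depth_mm, max_mm):
    """Inverted mapping, but a raw 0 (assumed 'no reading') is drawn black."""
    return 0 if depth_mm == 0 else inverted_byte(depth_mm, max_mm)


MAX_MM = 4500  # hypothetical maximum distance
for d in (500, 1500, 3000, 4500):
    print(d, direct_byte(d, MAX_MM), inverted_byte(d, MAX_MM))
```

    Zero distance becomes 255 only because the scale is turned around on purpose, so that close objects look bright. The pixel with no reading is handled separately here, otherwise it would show up as the brightest, "closest" point in the image.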

    Sunday, July 9, 2017 7:56 PM
  • Sorry bro, it is still not clear enough to me.

    How should zero distance be 255?

    Monday, July 10, 2017 4:08 AM