KinectSensor.MapDepthFrameToColorFrame example?

  • Question

  • Hi,

    It would be great if a simple sample program using the KinectSensor.MapDepthFrameToColorFrame function were provided. It seemed easy to align the depth frame to the color frame using this API, but it turns out that the depth is not aligned with the color when I use it.

    Here's the image I'm getting:

    [Image: misalignment between the depth overlay and the color image]

    As you can see, the white hand shape (depth) is not aligned with the color hand shape. 

    I'm using 640x480 resolution for both the color and the depth streams, and the image is rendered in XNA 4.0.

    I would appreciate any hints.

    Thank you

    Wednesday, February 8, 2012 1:26 AM

Answers

  • I'm not sure precisely how you're building your overlay, so I can't determine whether the logic is off, whether the overlay is being stretched, or whether there's a bug in the runtime.  What I can give you is this, which I just whipped up in WPF.

    The project is a default WPF project, with an Image added in MainWindow.xaml:

    <Grid>
        <Image Name="Image"/>
    </Grid>

    And here's the entire MainWindow.xaml.cs.  I "cheated" in a number of places by writing 640 and 480 instead of pulling the size out of the frame, for expediency.

    The demo takes each frame, copies the color data into a byte array, maps it using MapDepthFrameToColorFrame, and then, for each mapped depth pixel with a depth between 400 and 1000 mm, blends the target pixel 50/50 with white.  You'll see a grid emerge in the image where a given pixel has been mapped more than once.

    Hopefully this helps!

    using System;
    using System.Diagnostics;
    using System.Windows;
    using System.Windows.Media;
    using System.Windows.Media.Imaging;
    using Microsoft.Kinect;
    
    namespace WpfApplication1
    {
        /// <summary>
        /// Interaction logic for MainWindow.xaml
        /// </summary>
        public partial class MainWindow : Window
        {
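            // Buffers reused across frames; they are reallocated only if the incoming frame size changes.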
            private KinectSensor _sensor;
            private WriteableBitmap _bitmap;
            private byte[] _bitmapBits;
            private ColorImagePoint[] _mappedDepthLocations;
            private byte[] _colorPixels = new byte[0];
            private short[] _depthPixels = new short[0];
    
            private void SetSensor(KinectSensor newSensor)
            {
                if (_sensor != null)
                {
                    _sensor.Stop();
                }
    
                _sensor = newSensor;
    
                if (_sensor != null)
                {
                    Debug.Assert(_sensor.Status == KinectStatus.Connected, "This should only be called with Connected sensors.");
                    _sensor.ColorStream.Enable(ColorImageFormat.RgbResolution640x480Fps30);
                    _sensor.DepthStream.Enable(DepthImageFormat.Resolution640x480Fps30);
                    _sensor.AllFramesReady += _sensor_AllFramesReady;
                    _sensor.Start();
                }
            }
    
            void _sensor_AllFramesReady(object sender, AllFramesReadyEventArgs e)
            {
                bool gotColor = false;
                bool gotDepth = false;
    
                using (ColorImageFrame colorFrame = e.OpenColorImageFrame())
                {
                    if (colorFrame != null)
                    {
                        Debug.Assert(colorFrame.Width == 640 && colorFrame.Height == 480, "This app only uses 640x480.");
    
                        if (_colorPixels.Length != colorFrame.PixelDataLength)
                        {
                            _colorPixels = new byte[colorFrame.PixelDataLength];
                            _bitmap = new WriteableBitmap(640, 480, 96.0, 96.0, PixelFormats.Bgr32, null);
                            _bitmapBits = new byte[640 * 480 * 4];
                            this.Image.Source = _bitmap;
                        }
    
                        colorFrame.CopyPixelDataTo(_colorPixels);
                        gotColor = true;
                    }
                }
    
                using (DepthImageFrame depthFrame = e.OpenDepthImageFrame())
                {
                    if (depthFrame != null)
                    {
                        Debug.Assert(depthFrame.Width == 640 && depthFrame.Height == 480, "This app only uses 640x480.");
    
                        if (_depthPixels.Length != depthFrame.PixelDataLength)
                        {
                            _depthPixels = new short[depthFrame.PixelDataLength];
                            _mappedDepthLocations = new ColorImagePoint[depthFrame.PixelDataLength];
                        }
    
                        depthFrame.CopyPixelDataTo(_depthPixels);
                        gotDepth = true;
                    }
                }
    
                // Put the color image into _bitmapBits
                for (int i = 0; i < _colorPixels.Length; i += 4)
                {
                    _bitmapBits[i + 3] = 255;
                    _bitmapBits[i + 2] = _colorPixels[i + 2];
                    _bitmapBits[i + 1] = _colorPixels[i + 1];
                    _bitmapBits[i] = _colorPixels[i];
                }
    
                // Map every depth pixel to its corresponding coordinate in the color image.
                this._sensor.MapDepthFrameToColorFrame(DepthImageFormat.Resolution640x480Fps30, _depthPixels, ColorImageFormat.RgbResolution640x480Fps30, _mappedDepthLocations);
    
                for (int i = 0; i < _depthPixels.Length; i++)
                {
                    int depthVal = _depthPixels[i] >> DepthImageFrame.PlayerIndexBitmaskWidth;
    
                    // Overlay only depth values between 400 mm and 1000 mm (i.e., closer than 1 meter).
                    if ((depthVal < 1000) && (depthVal > 400))
                    {
                        ColorImagePoint point = _mappedDepthLocations[i];
    
                        if ((point.X >= 0 && point.X < 640) && (point.Y >= 0 && point.Y < 480))
                        {
                            int baseIndex = (point.Y * 640 + point.X) * 4;
                            _bitmapBits[baseIndex] = (byte)((_bitmapBits[baseIndex] + 255) >> 1);
                            _bitmapBits[baseIndex + 1] = (byte)((_bitmapBits[baseIndex + 1] + 255) >> 1);
                        _bitmapBits[baseIndex + 2] = (byte)((_bitmapBits[baseIndex + 2] + 255) >> 1);
                        }
                    }
                }
    
                _bitmap.WritePixels(new Int32Rect(0, 0, _bitmap.PixelWidth, _bitmap.PixelHeight), _bitmapBits, _bitmap.PixelWidth * sizeof(int), 0);
            }
    
            public MainWindow()
            {
                InitializeComponent();
    
                KinectSensor.KinectSensors.StatusChanged += (object sender, StatusChangedEventArgs e) =>
                {
                    if (e.Sensor == _sensor)
                    {
                        if (e.Status != KinectStatus.Connected)
                        {
                            SetSensor(null);
                        }
                    }
                    else if ((_sensor == null) && (e.Status == KinectStatus.Connected))
                    {
                        SetSensor(e.Sensor);
                    }
                };
    
                foreach (var sensor in KinectSensor.KinectSensors)
                {
                    if (sensor.Status == KinectStatus.Connected)
                    {
                        SetSensor(sensor);
                    }
                }            
            }
        }
    }


    -Adam Smith [MS]




    Wednesday, February 8, 2012 4:56 AM
  • The "alignment" - or more precisely the reported depth mask values - will depend on many factors.  These include the depth of the object relative to the sensor, the ambient light, the reflectivity/absorption of the material (including color), the relative surface normal (is it square to the sensor? a 45° surface? etc.), and so on.

    In this case (my hand, relatively close to the sensor), I believe that edges aren't perfectly aligned because my fingers curve "away" from the sensor and are somewhat reflective.  I had better results with objects with planar surfaces, but, no, it's not perfect. 

    To improve maps, there are a number of processing steps that you can take to clean up/smooth the depth data, to eliminate "holes", and to use image processing techniques in the color stream to further refine things.  We hope to be able to provide richer guidance in these more advanced topics in the future.
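
    As one trivial illustration of the kind of cleanup step mentioned above (a minimal sketch, not official guidance; the helper name is made up, and it assumes depth values with the player index bits already shifted out), a simple "hole" fill replaces zero-depth pixels with the nearest valid depth in the same row:

    static void FillDepthHolesInRow(short[] depthMm, int width, int row)
    {
        int start = row * width;
        short lastValid = 0;

        for (int x = 0; x < width; x++)
        {
            if (depthMm[start + x] != 0)
            {
                lastValid = depthMm[start + x];   // remember the last valid reading
            }
            else if (lastValid != 0)
            {
                depthMm[start + x] = lastValid;   // fill the hole from the neighbor to the left
            }
        }
    }

    Smarter variants average neighbors from both directions, or across frames, but even something this simple removes many of the zero-depth speckles at object edges.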

    With all of that said - if you do continue to explore and believe that you've found an algorithmic bug, we'd absolutely love to see what you find.  We certainly can't promise we're defect free.


    -Adam Smith [MS]

    Friday, February 10, 2012 4:10 AM

All replies

  • Hi Adam,

    Thank you very much for a quick reply and the sample code!!

    I will try your code soon.

    However, I still see a 3~5 pixel misalignment in the image you posted. Is there always going to be some misalignment, or does the misalignment happen only on near objects (in your case, between 400 and 1000 mm)?

    I remember that OpenNI had nearly perfect alignment between depth and color.

    Thanks

    Wednesday, February 8, 2012 7:14 PM
  • Hi, I have done the same work.

    Maybe you can refer to the discussion I posted before:

    http://social.msdn.microsoft.com/Forums/en-US/kinectsdknuiapi/thread/b98acb73-123b-4e5a-94fd-b01dc205ce95/#ba1d90da-cdaa-45ad-a4a6-9eab15c728b9

    And you can also refer to the paper

    http://vclab.gist.ac.kr/papers/03/2011/APSIPA_LSB.pdf

    It also uses the function to do the alignment, but it takes the "depth" into consideration.

    If you still have some questions, you can send E-mail to me.

    s7531234s@gmail.com          

    BTW, may I ask you a question?

    How can you add the depth map onto the RGB image like the image you posted?

    I have replaced the G-layer with depth information, as in the following image.

    But it does not look as good as yours.

    • Edited by 夏飄雪 Sunday, February 12, 2012 6:58 AM
    Sunday, February 12, 2012 6:28 AM
  • Hello,

    When I try to run that code, the main window doesn't actually show anything. Could the reason be as follows: is processing taking too long, so that frames are dropped even before the previous frames have been mapped and shown in the main window? I'm using the Kinect for Windows SDK 1.0, and the Kinect hardware is the earlier one released for Xbox. Also, is there a way to flush (skip) incoming frames until the frame that is currently being processed has been mapped and displayed?

    Thanks in advance.

    Sunday, February 12, 2012 6:57 PM
  • My code is above, but, basically, for each mapped pixel I blended the pixel 50% with white.  I was interested in marking pixels as "touched" rather than truly illustrating a range of depth values.  I like your example too, using the green channel.

    -Adam Smith [MSFT]

    Monday, February 13, 2012 3:18 AM
  • > Hi Adam,
    >
    > Thank you very much for a quick reply and the sample code!!
    >
    > I will try your code soon.
    >
    > However, I still see a 3~5 pixel misalignment in the image you posted. Is there always going to be some misalignment, or does the misalignment happen only on near objects (in your case, between 400 and 1000 mm)?
    >
    > I remember that OpenNI had nearly perfect alignment between depth and color.
    >
    > Thanks

    Hi!

    I have the same issues with calibration. Ideally I would like NO misalignment at all.

    You mention that OpenNI has nearly perfect alignment between depth and color.

    Is this something you've heard or read somewhere, or have you actually verified it yourself?

    Do you have any code for this?

    Thanks,

    Paul

    Wednesday, March 14, 2012 3:43 PM
  • Hi Adam,

    Thanks for your examples; I'm learning from them. I'm a beginner and need your support. I'm trying to use the Kinect sensor to detect a specific color at a specific distance from the Kinect. I have the code to segregate the pixels by distance, but I don't know how to compare the colors. Do you know how to do that? The idea is to detect, for example, a green object at 1000 mm from the Kinect and send out a signal to be used in another process. Could you tell me how to detect and compare the colors, please?

    Thanks a lot.

    Regards,

    Jorge Huerta.

    Monday, April 2, 2012 9:27 PM
  • Hi, are you Chinese? May I have your QQ? I want to ask you for some help with the code posted by Adam.

    I cannot understand how he makes the 50/50 white overlay.


    Friday, April 6, 2012 3:56 PM
  • Hi Adam

    I've run your project and cannot understand how you made the 50/50 overlay.

    I edited your original code:

    _bitmapBits[baseIndex] = (byte)((_bitmapBits[baseIndex] + 255) >> 1);
    _bitmapBits[baseIndex + 1] = (byte)((_bitmapBits[baseIndex + 1] + 255) >> 1);
    _bitmapBits[baseIndex + 2] = (byte)((_bitmapBits[baseIndex + 2] + 255) >> 1);

    into:

    _bitmapBits[baseIndex] = 255;
    _bitmapBits[baseIndex + 1] = 0;
    _bitmapBits[baseIndex + 2] = 0;

    and I get a solid BLUE overlay (not 50% transparent),

    or:

    _bitmapBits[baseIndex] = 255;

    which gives a BLUE overlay, but still 50% transparent.

    Another question, about the GRID: as you said, some pixels have been mapped more than once, so a grid appears. Could you tell me why those pixels get mapped repeatedly?

    Thank you for your help!

    BK

    Friday, April 6, 2012 4:08 PM
  • > The idea is to detect, for example, a green object at 1000 mm from the Kinect ... Could you tell me how to detect and compare the colors, please?

    In the sample I posted, note the final loop in the all-frames-ready handler.  This loop iterates over the depth pixels and, for each depth pixel, retrieves the color from the associated color pixel.  What you can do instead is to test whether the depth is in the range you want (say, 950 to 1050) and, if so, test the corresponding color pixel (using the same lookup I used above) and see if the RGB values are, roughly, ~0, ~something, ~0.  I write "~something" for the Green value because I don't know whether your "green" is very bright or dark, or something else.  Also, if it's not "pure" green as detected by our RGB camera, and as greatly affected by lighting conditions, you may need to look for a different color.

    Depending on your scenario, if you're in a position to provide a well-known "white" card to the camera as calibration under the same lighting conditions, you can do a very basic white balance.  That is, if a "white" card is detected as, say, 212, 170, 150 (a reddish-yellow tint and bright but not blindingly bright) then you can scale your target green color by 212/255, 170/255 and 150/255 respectively.  (note, I have not tried this, this is based on rudimentary theory on my part).
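
    To make that concrete, here is a minimal sketch (the method name and the RGB thresholds are mine, not calibrated values) that reuses the fields and the mapping from the sample above to answer "is there something green-ish in a given depth band?":

    // Call after MapDepthFrameToColorFrame has filled _mappedDepthLocations for the current frame.
    private bool IsGreenObjectInRange(int minDepthMm, int maxDepthMm)
    {
        for (int i = 0; i < _depthPixels.Length; i++)
        {
            int depthVal = _depthPixels[i] >> DepthImageFrame.PlayerIndexBitmaskWidth;
            if (depthVal < minDepthMm || depthVal > maxDepthMm)
            {
                continue;   // not in the depth band we care about
            }

            ColorImagePoint point = _mappedDepthLocations[i];
            if (point.X < 0 || point.X >= 640 || point.Y < 0 || point.Y >= 480)
            {
                continue;   // mapped outside the color frame
            }

            // Color data is Bgr32: blue, green, red, unused.
            int baseIndex = (point.Y * 640 + point.X) * 4;
            byte blue = _colorPixels[baseIndex];
            byte green = _colorPixels[baseIndex + 1];
            byte red = _colorPixels[baseIndex + 2];

            // "~0, ~something, ~0": green clearly dominates red and blue.
            if (green > 100 && red < 80 && blue < 80)
            {
                return true;
            }
        }

        return false;
    }

    A call such as IsGreenObjectInRange(950, 1050) would then drive whatever signal you send to the other process; if you do the white-card calibration described above, scale the color thresholds by the measured white before comparing.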


    -Adam Smith [MSFT]

    Saturday, April 7, 2012 6:29 AM
  • >I've run your project and cannot understand how you make it 50/50 overlay.

    Right - note what I've done in my per-channel logic:

    _bitmapBits[baseIndex + 2] = (byte)((_bitmapBits[baseIndex + 2] + 255) >> 1);

    This code sets the value of this channel (red, as it happens) to a new value.  That new value is "old value + 255" divided by 2 (shift right 1, i.e. >>1, is a divide by two).  So, for each channel of each pixel of interest, I'm averaging the existing channel value and 255 (i.e. 100%), so each time a given pixel is encountered in the mapping, it's blended 50% with white.  This is also why some pixels are more white than others (in the grid pattern) - the first time they were encountered, they were blended 50% with white.  The second time, it happened again, so now the color is effectively blended with 75% white.

    The grid pattern appears because the depth camera's field of view / pixel width (and height) is different from that of the color field of view.  As such, as I iterate in depth space, I'm (effectively) iterating at a slightly slower rate across the color image, such that periodically (literally periodically - you can see that at constant depth the grid is regularly repeating) two adjacent depth pixels map to the same color pixel.
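
    To make the arithmetic concrete (a small illustration, not part of the original sample):

    // Averaging a channel value with 255 moves it halfway toward white:
    static byte BlendTowardWhite(byte channel)
    {
        return (byte)((channel + 255) >> 1);   // (old + 255) / 2
    }

    // BlendTowardWhite(0)   == 127  -> roughly 50% white after the first mapping hit
    // BlendTowardWhite(127) == 191  -> roughly 75% white if the same pixel is hit again,
    //                                  which is why twice-mapped pixels form the brighter grid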


    -Adam Smith [MSFT]

    Saturday, April 7, 2012 6:39 AM
  • Hi Adam Smith,

    Thanks for your code!

    If I change your "depthVal < 1000" to a different value, e.g. "depthVal < 2000" or "depthVal < _sensor.DepthStream.TooFarDepth", the program doesn't work (the image just shows a blank white screen).

    Do you have any idea why it happens?

    Thanks again.


    • Edited by Craig Yu Monday, April 30, 2012 5:29 PM
    Monday, April 30, 2012 5:28 PM
  • Oh, I seem to get it. It is because too many pixels need to be handled: if the upper limit is set to 2000 instead of 1000,
    my computer is too slow to process the pixels, so the processed frame cannot be shown in real time!
    Monday, April 30, 2012 5:37 PM
  • Hi Adam,

    This post has been really helpful to me, and I tried your code and it looks good! I have a slightly different issue at the moment, though: I'm actually looking to map a colour pixel to a depth measurement, rather than the other way round, which is what the function allows. Is there a function that does this?

    Basically I have four sample points on the colour image and wish to find out their corresponding depths. 

    If not, then my current idea is to iterate through each converted depth pixel, test whether it maps to one of the desired colour points, and then store that value. Is there a better way, perhaps?

    Any help would be appreciated.

    Kind Regards,

    Ben


    Wednesday, May 23, 2012 12:56 PM
  • What you describe works, and since a full frame Depth->Color map takes 2ms-3ms on a mid-range machine, this *should* be fast enough.  This basically does provide the depth value for each color value, but only by "pushing" the data to color space.

    We're looking at ways to make this more straightforward.  One potential complication is that there are color pixels that will not have a depth associated with them - the view frustums of the Color camera and Depth camera are not identical.
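
    As a sketch of that "push" (the helper name is mine, and it assumes the fields from the sample earlier in this thread), you can build a per-color-pixel depth lookup once per frame and then read off the depth at your four sample points:

    // Call after MapDepthFrameToColorFrame; entries left at 0 have no depth (e.g. outside the depth frustum).
    private short[] BuildColorToDepthMap()
    {
        short[] depthForColorPixel = new short[640 * 480];

        for (int i = 0; i < _depthPixels.Length; i++)
        {
            int depthVal = _depthPixels[i] >> DepthImageFrame.PlayerIndexBitmaskWidth;
            ColorImagePoint point = _mappedDepthLocations[i];

            if (point.X >= 0 && point.X < 640 && point.Y >= 0 && point.Y < 480)
            {
                depthForColorPixel[point.Y * 640 + point.X] = (short)depthVal;
            }
        }

        return depthForColorPixel;
    }

    Index the returned array with y * 640 + x for each of your four colour sample points; a value of 0 means no depth pixel mapped there.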


    -Adam Smith [MSFT]

    Thursday, May 24, 2012 12:59 AM
  • How can I save this data (depth + color) into an .xml output,
    or into any 3D viewing software like Blender or MeshLab?
    • Edited by zaker125 Thursday, August 23, 2012 5:04 PM
    Thursday, August 23, 2012 5:03 PM
  • Hi,

    Can you please recommend something similar in C++? I want to use it in the DepthWithColor-D3D project.

    Thanks,

    Maniar

    Sunday, September 23, 2012 5:59 PM
  • I used your code, but I got an error and a warning.

    1. this.Image.Source = _bitmap;             'WpfApplication1.MainWindow' does not contain a definition for 'Image' and no extension method 'Image' accepting a first argument of type 'WpfApplication1.MainWindow' could be found (are you missing a using directive or an assembly reference?)

    2.

    this._sensor.MapDepthFrameToColorFrame(DepthImageFormat.Resolution640x480Fps30, _depthPixels, ColorImageFormat.RgbResolution640x480Fps30, _mappedDepthLocations);

    'This method is replaced by Microsoft.Kinect.CoordinateMapper.MapDepthFrameToColorFrame'

    Sunday, January 27, 2013 11:15 PM
  • I have solved the problem. Thank you!
    Monday, January 28, 2013 12:01 AM
  • Dear Adam

    Thanks for your code; I find it very interesting and useful. I was working with this code: you have blended the colors toward the same brightness (255). I want to know, is it possible to use a different brightness?

    Thanks in advance :)

    Monday, August 12, 2013 7:23 AM