How can I map a Depth Frame to Camera Space without having a Kinect on hand

  • Question

  • Hi,

    It is very cool that the Kinect 2 firmware has a precise per-camera calibration which allows easy conversion from depth space to camera space. No more need to calibrate each Kinect sensor individually to get precise real-world measurements!

    There is only one problem I am facing now: I want to be able to save depth images taken with a Kinect 2 sensor and replay those images "offline" (without having the Kinect connected to my computer). When replaying those images, I need to be able to convert the depth information to camera space.

    I see that CoordinateMapper has two functions that seem appropriate for the task: GetDepthCameraIntrinsics() and GetDepthFrameToCameraSpaceTable(). Using either of those functions, I should be able to save the camera parameters needed to perform the conversion offline.

    Now, here are my questions:

    1. How can I use the content of the CameraIntrinsics structure returned by GetDepthCameraIntrinsics() to convert a depth pixel to camera space? Most importantly, how do I use the distortion parameters to correct the lens radial distortion?

    2. How can I do the same thing using the LUT returned by GetDepthFrameToCameraSpaceTable()? I have read the documentation on this LUT and I understand what a LUT is, yet I don't understand what this LUT represents or how it should be used (an example of how to use this LUT would help greatly).

    Thank you.

    Wednesday, February 25, 2015 7:31 PM

Answers

  • Every [x,y] entry in the table is a tuple that, when multiplied by the depth value at [x,y], gives you the point in camera space:

    // Get the depth for this pixel
    ushort depth = frameData[y * depthFrameDescription.Width + x];
    // Get the value from the x/y table
    PointF lutValue = this.cameraSpaceTable[y * depthFrameDescription.Width + x];
    // Create the CameraSpacePoint for this pixel
    // (depth is in mm, camera space is in metres, so convert)
    CameraSpacePoint csp = new CameraSpacePoint();
    csp.X = lutValue.X * depth * 0.001f;
    csp.Y = lutValue.Y * depth * 0.001f;
    csp.Z = depth * 0.001f;


    Carmine Sirignano - MSFT

    Thursday, February 26, 2015 1:14 AM
  • I haven't implemented anything to answer 1, but others have. I can answer 2, since I implemented it before the depth intrinsics were made available, for the same purpose as yours. This shows how to use the LUT, but to make it a fully usable object for offline work you will need to add saving/loading of the LUT separately (a sketch of that is included after the class below). I use OpenCV for lots of stuff. The code works in VS2013.

    Below is a DepthCameraModel class that performs both:

    1. Forward mapping: from depth pixel coordinates to a 3D direction vector, optionally scaled to a 3D point using the depth value
    2. Inverse mapping: from a point in camera space to the corresponding depth pixel coordinate

    Both mappings use the LUT and perform sub-pixel interpolation, so they should compare quite favorably to using the intrinsics directly. The following code was slightly modified to be useful for posting, so hopefully I have not introduced any bugs :)

    Usage: call createDepthCameraModelFromKinect() to make a DepthCameraModelPtr from a live Kinect device. Then use the DepthCameraModel methods :)
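
    For instance, a minimal usage sketch (the function name example, the frame depthFrameMm and the sample pixel are just placeholders, and error handling is omitted):


    #include "DepthCameraModel.h"
    #include <Kinect.h>
    #include <opencv2/core/core.hpp>
    
    // Declared in the construction code shown below.
    DepthCameraModelPtr createDepthCameraModelFromKinect(IKinectSensor* dev);
    
    void example(IKinectSensor* sensor, const cv::Mat1w& depthFrameMm)
    {
      DepthCameraModelPtr model = createDepthCameraModelFromKinect(sensor);
      if (!model)
        return; // LUT could not be retrieved from the device
    
      // Forward mapping: pixel (x=256, y=212) plus its depth in mm -> 3D point in mm.
      float d = depthFrameMm(212, 256);
      cv::Vec3f point = model->pixelToVector(cv::Vec2f(256.0f, 212.0f), d);
    
      // Inverse mapping: camera-space point back to a depth pixel coordinate.
      cv::Vec2f backToPixel = model->vectorToPixel(point); // should be ~(256, 212)
    
      // Whole-frame conversion; the resulting point cloud is in metres.
      cv::Mat3f cloud;
      model->depthToCam(depthFrameMm, cloud);
    }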

    Constructing the DepthCameraModel from a Kinect device:


    #include "DepthCameraModel.h"
    #include <opencv2/core/core.hpp>
    #include <folly/ScopeGuard.h>
    #include <Kinect.h>
    
    template <class T>
    static inline void SafeRelease(T*& p)
    {
      if (nullptr != p) {
        (void)p->Release();
        p = nullptr;
      }
    }
    
    //
    // Returns a map from depth pixel index to camera-space points.
    //
    // To convert a depth pixel to a camera-space point:
    //
    //   DepthSpacePoint dp;
    //   uint16_t depthMm;
    //   CameraSpacePoint cp;
    //   ...
    //   // access map in a manner similar to:
    //   auto m = map((int)dp.Y, (int)dp.X);
    //   cp.X = depthMm * m[0];
    //   cp.Y = depthMm * m[1];
    //   cp.Z = depthMm;
    //
    bool getDepthToCamTable(IKinectSensor* dev, cv::Mat2f& map)
    {
      ICoordinateMapper* cm = nullptr;
      SCOPE_EXIT{SafeRelease(cm);};
      if (FAILED(dev->get_CoordinateMapper(&cm)))
        return false;
    
      uint32_t tableLen = 0;
      PointF* table = nullptr;
      cm->GetDepthFrameToCameraSpaceTable(&tableLen, &table);
      SCOPE_EXIT{CoTaskMemFree(table); };
    
      map.create(DepthCameraModel::height, DepthCameraModel::width);
      if (tableLen != map.rows * map.cols)
        throw std::exception("getDepthToCamTable: invalid table length");
    
      for (int y = 0; y < map.rows; ++y) {
        auto pt = &table[y * map.cols];
        auto pm = &map(y, 0);
        for (int x = 0; x < map.cols; ++x)
          pm[x] = cv::Vec2f(pt[x].X, pt[x].Y);
      }
      return true;
    }
    
    //
    // Returns a map from camera-space points to depth pixel index.
    //
    // To convert a camera-space point to a map index:
    //
    //   CameraSpacePoint point;
    //   ...
    //   float x = point.x / point.z * scaleX + offX;
    //   float y = point.y / point.z * scaleY + offY;
    //   // access map in a manner similar to:
    //   auto depthPixelIndex = map((int)y, (int)x);
    //
    bool getCamToDepthTable(IKinectSensor* dev,
                            const cv::Mat2f& d2c, cv::Mat2f& map,
                            float& scaleX, float& offX,
                            float& scaleY, float& offY)
    {
      float maxX = 0, maxY = 0;
      for (auto i : d2c) {
        maxX = std::max(maxX, fabs(i[0]));
        maxY = std::max(maxY, fabs(i[1]));
      }
    
      float rangeW = maxX * 2.0f * 1.05f; // width range +5%
      float rangeH = maxY * 2.0f * 1.05f; // height range +5%
      int w = int(DepthCameraModel::width * 1.05f);    // depth width +5% also
      int h = int(DepthCameraModel::height * 1.05f);   // depth height +5% also
    
      scaleX = (w / rangeW);
      scaleY = (h / rangeH);
      offX = w * 0.5f;
      offY = h * 0.5f;
    
      ICoordinateMapper* cm = nullptr;
      SCOPE_EXIT{SafeRelease(cm); };
      if (FAILED(dev->get_CoordinateMapper(&cm)))
        return false;
    
      map.create(h, w);
    
      std::vector<CameraSpacePoint> cam;
      cam.reserve(w*h);
      for (int y = 0; y < map.rows; ++y) {
        for (int x = 0; x < map.cols; ++x) {
          //
          // Invert the scale/offset mapping: pixel x,y -> camera-space x,y at z = 1.
          //
          CameraSpacePoint p;
          p.X = (x - offX) / scaleX;
          p.Y = (y - offY) / scaleY;
          p.Z = 1;
          cam.push_back(p);
        }
      }
    
      std::vector<DepthSpacePoint> depth;
      depth.resize(w*h);
      if (FAILED(cm->MapCameraPointsToDepthSpace((int)cam.size(), &cam[0], (int)depth.size(), &depth[0])))
        return false;
    
      for (int y = 0; y < map.rows; ++y) {
        auto pMap = &map(y, 0);
        auto pDepth = &depth[y*w];
        for (int x = 0; x < map.cols; ++x) {
          auto& m = pMap[x];
          auto& d = pDepth[x];
          m[0] = std::max(-0.5f, std::min(float(DepthCameraModel::width - 0.5f), d.X));
          m[1] = std::max(-0.5f, std::min(float(DepthCameraModel::height - 0.5f), d.Y));
        }
      }
    
      return true;
    }
    
    //
    // Create a depth camera model using the mapping tables from a Kinect device.
    //
    DepthCameraModelPtr createDepthCameraModelFromKinect(IKinectSensor* dev)
    {
      cv::Mat2f dc;
      if (getDepthToCamTable(dev, dc)) {
        cv::Mat2f cd;
        float sx, ox, sy, oy;
        if (getCamToDepthTable(dev, dc, cd, sx, ox, sy, oy)) {
          return DepthCameraModel::create(dc, cd, sx, ox, sy, oy);
        }
      }
      return{};
    }
    


    Implementation of DepthCameraModel.h, with the functions inlined for posting:


    #pragma once
    #include <opencv2/core/core.hpp>
    #include <memory>
    
    //
    // The camera model is used through a shared pointer, DepthCameraModelPtr.
    //
    class DepthCameraModel;
    using DepthCameraModelPtr = std::shared_ptr<const DepthCameraModel>;
    
    //
    // Kinect natively uses a camera-space that is left-handed and matches
    // the left-handed coordinate frame default of Direct3D as follows, with
    // +x to the right of the depth image, +y toward the top of the depth image,
    // and +z out of the depth image i.e. toward the viewer, and the origin
    // being somewhere near the central pixel:
    //
    //    +y
    //    ^
    //    |  +z
    //    | /
    //    |/
    //    o------> +x
    //
    // The pixel locations that are accepted as input are pixel index values
    // within the image array e.g. pix.x ranges [0, w-1]. The values are
    // floating-point, where the whole numbers are located at the centre of
    // the pixel being indexed e.g. x=0.5 is the boundary between the first
    // and second pixel column. This implies that the range of pix.x should
    // actually be [-0.5, w-0.5], but in practice there is a half-pixel clamp
    // around the image border.
    //
    // Boo for left-handed coordinate systems! Even the lead dev of DirectX
    // thinks it was a bit of a mistake:
    // http://www.alexstjohn.com/WP/2013/07/22/the-evolution-of-direct3d/
    //
    class DepthCameraModel
    {
    public:
      static const int width = 512;
      static const int height = 424;
    
      //
      // NOTE: this is expensive to create but cheap to share.
      //
      static DepthCameraModelPtr create(const cv::Mat2f& depthToCamTable,
                                        const cv::Mat2f& camToDepthTable, float scaleX,
                                        float offX, float scaleY, float offY)
      {
        return DepthCameraModelPtr(new DepthCameraModel(depthToCamTable, camToDepthTable, scaleX, offX, scaleY, offY));
      }
    
      //
      // Convert a pixel location to a direction vector in camera-space.
      // Note the returned vector is not normalised. In fact, it is scaled
      // such that the z-axis is always 1.0 i.e. unit in the depth direction.
      //
      cv::Vec3f pixelToVector(cv::Vec2f depthPixel) const
      {
        auto m = bilinear(mapDepthToCam, depthPixel[0], depthPixel[1]);
        return{m[0], m[1], 1};
      }
    
      //
      // Convert a pixel location and depth to a camera-space 3D point.
      // The 3D point will be in the units of the given depth value.
      //
      cv::Vec3f pixelToVector(cv::Vec2f depthPixel, float depth) const
      {
        auto m = bilinear(mapDepthToCam, depthPixel[0], depthPixel[1]);
        return{depth * m[0], depth * m[1], depth};
      }
    
      //
      // Projects the camera-space point onto the depth image pixel-space.
      //
      cv::Vec2f vectorToPixel(cv::Vec3f point) const
      {
        if (point[2] < 1e-3f)
          return{};
    
        float mx = point[0] / point[2] * scaleX + offX;
        float my = point[1] / point[2] * scaleY + offY;
        auto m = bilinear(mapCamToDepth, mx, my);
        return{m[0], m[1]};
      }
    
      //
      // Convert all depth values to camera-space 3D points. The returned
      // points are in m units, unlike the depth data which is mm units.
      //
      void depthToCam(const cv::Mat1w& depth, cv::Mat3f& dst) const
      {
        dst.create(depth.size());
        for (int y = 0; y < depth.rows; ++y) {
          auto pd = &depth(y, 0);
          auto pm = &mapDepthToCam(y, 0);
          auto po = &dst(y, 0);
          for (int x = 0; x < depth.cols; ++x) {
            auto d = pd[x] * 1e-3f; // convert mm to m
            auto m = pm[x];
            po[x] = {d * m[0], d * m[1], d};
          }
        }
      }
    
      //
      // This data is provided to allow saving of the camera model.
      // Since this map data is shared, these functions return a clone
      // of the images to avoid unintentional modification i.e. slow!
      //
      cv::Mat2f getDepthToCamTable() const
      {
        return mapDepthToCam.clone();
      }
    
      cv::Mat2f getCamToDepthTable() const
      {
        return mapCamToDepth.clone();
      }
    
      float getScaleX() const { return scaleX; }
    
      float getScaleY() const { return scaleY; }
    
      float getOffX() const { return offX; }
    
      float getOffY() const { return offY; }
    
    private:
      cv::Mat2f mapDepthToCam, mapCamToDepth;
      float scaleX = 0, offX = 0;
      float scaleY = 0, offY = 0;
    
      DepthCameraModel(const cv::Mat2f& depthToCamTable,
                       const cv::Mat2f& camToDepthTable, float scaleX,
                       float offX, float scaleY, float offY)
                       : mapDepthToCam(depthToCamTable), mapCamToDepth(camToDepthTable),
                       scaleX(scaleX), offX(offX), scaleY(scaleY), offY(offY)
      {
        if (mapDepthToCam.data == nullptr || mapCamToDepth.data == nullptr)
          throw std::exception("DepthCameraModel: invalid calibration map");
      }
    
      static float bilinear(float x, float y, float f00, float f10, float f01, float f11)
      {
        float ix = 1 - x;
        float iy = 1 - y;
        return (f00 * ix * iy) + (f10 * x * iy) + (f01 * ix * y) + (f11 * x * y);
      }
    
      static void bilinearClamp(float x, int min, int max, int& ix, float& fx)
      {
        if (x < min) {
          ix = min;
          fx = 0;
        } else {
          ix = (int)x;
          fx = x - ix;
          if (ix > max) {
            ix = max;
            fx = 1;
          }
        }
      }
    
      static cv::Vec2f bilinear(const cv::Mat2f& map, float x, float y)
      {
        int ix, iy;
        float fx, fy;
        bilinearClamp(x, 0, map.cols - 2, ix, fx);
        bilinearClamp(y, 0, map.rows - 2, iy, fy);
        auto f00 = map(iy, ix);
        auto f10 = map(iy, ix + 1);
        auto f01 = map(iy + 1, ix);
        auto f11 = map(iy + 1, ix + 1);
        return{bilinear(fx, fy, f00[0], f10[0], f01[0], f11[0]),
          bilinear(fx, fy, f00[1], f10[1], f01[1], f11[1])};
      }
    };
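
    Since the goal is offline replay, something like the following could be used to persist and restore the model with OpenCV's cv::FileStorage (just a sketch; the file layout and the names saveDepthCameraModel/loadDepthCameraModel are placeholders):


    #include "DepthCameraModel.h"
    #include <opencv2/core/core.hpp>
    #include <string>
    
    // Sketch of save/load helpers so the model can be rebuilt without a Kinect attached.
    bool saveDepthCameraModel(const std::string& path, const DepthCameraModelPtr& model)
    {
      if (!model)
        return false;
      cv::FileStorage fs(path, cv::FileStorage::WRITE);
      if (!fs.isOpened())
        return false;
      cv::Mat dc = model->getDepthToCamTable();
      cv::Mat cd = model->getCamToDepthTable();
      fs << "depthToCam" << dc << "camToDepth" << cd;
      fs << "scaleX" << model->getScaleX() << "offX" << model->getOffX();
      fs << "scaleY" << model->getScaleY() << "offY" << model->getOffY();
      return true;
    }
    
    DepthCameraModelPtr loadDepthCameraModel(const std::string& path)
    {
      cv::FileStorage fs(path, cv::FileStorage::READ);
      if (!fs.isOpened())
        return {};
      cv::Mat dc, cd;
      float sx = 0, ox = 0, sy = 0, oy = 0;
      fs["depthToCam"] >> dc;
      fs["camToDepth"] >> cd;
      fs["scaleX"] >> sx; fs["offX"] >> ox;
      fs["scaleY"] >> sy; fs["offY"] >> oy;
      return DepthCameraModel::create(dc, cd, sx, ox, sy, oy);
    }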
    


    But of course, the better option is to use the intrinsics without all the hoops to jump through and LUTs to save/load.
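
    For completeness, here is a rough sketch of that intrinsics route using the CameraIntrinsics fields (FocalLengthX/Y, PrincipalPointX/Y and the three RadialDistortion* coefficients). The sign convention on the row axis and the direction of the distortion model (assumed here to be the usual Brown model mapping undistorted to distorted normalised coordinates, inverted iteratively) are assumptions on my part, so check the output against the CoordinateMapper / LUT before relying on it:


    #include <Kinect.h>
    #include <cstdint>
    
    // Sketch only: unproject one depth pixel using CameraIntrinsics.
    // Assumptions (not verified against the SDK): the radial model maps
    // undistorted -> distorted normalised coordinates, so it is inverted
    // iteratively here; camera-space Y points up while pixel rows grow
    // downward, hence the sign flip on the row coordinate.
    CameraSpacePoint depthPixelToCameraSpace(const CameraIntrinsics& ci,
                                             float col, float row, uint16_t depthMm)
    {
      // Normalised (distorted) image coordinates.
      float xd = (col - ci.PrincipalPointX) / ci.FocalLengthX;
      float yd = (ci.PrincipalPointY - row) / ci.FocalLengthY;
    
      // Iteratively invert: distorted = undistorted * (1 + k1*r^2 + k2*r^4 + k3*r^6).
      float xu = xd, yu = yd;
      for (int i = 0; i < 5; ++i) {
        float r2 = xu * xu + yu * yu;
        float f = 1.0f + ci.RadialDistortionSecondOrder * r2
                       + ci.RadialDistortionFourthOrder * r2 * r2
                       + ci.RadialDistortionSixthOrder * r2 * r2 * r2;
        xu = xd / f;
        yu = yd / f;
      }
    
      CameraSpacePoint p;
      float z = depthMm * 0.001f;  // mm -> m; CameraSpacePoint is in metres
      p.X = xu * z;
      p.Y = yu * z;
      p.Z = z;
      return p;
    }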

    Hope that helps.


    • Proposed as answer by sault Thursday, February 26, 2015 1:37 AM
    • Marked as answer by William.Legare Friday, March 6, 2015 3:29 PM
    Thursday, February 26, 2015 1:34 AM

All replies

  • Below is code to use the Kinect depth-to-camera-space lookup table. Some notes:

    - The call to "GetDepthFrameToCameraSpaceTable" is expensive (a full millisecond on my own workstation).

    - I do not know whether the lookup table returned by "GetDepthFrameToCameraSpaceTable" is fixed, or if it changes over time (or under different conditions, such as hardware tilt). If the lookup table is fixed, you would want to cache it to avoid repeated calls to "GetDepthFrameToCameraSpaceTable".

    KinectSensor KINECT_SENSOR = (pointer to your Kinect sensor);
    int x = (your x-value in the 2D depth map, in pixels);
    int y = (your y-value in the 2D depth map, in pixels);
    ushort z = (your depth value in the 2D depth map at the given x,y coordinate, in mm);
    
    if ((z < KINECT_SENSOR.DepthFrameSource.DepthMinReliableDistance) ||
        (z > KINECT_SENSOR.DepthFrameSource.DepthMaxReliableDistance) ||
        float.IsInfinity(z)) { ... handle error ... }
    
    // get Kinect lookup table -- expensive operation
    PointF[] LUT = KINECT_SENSOR.CoordinateMapper.GetDepthFrameToCameraSpaceTable();
    
    // get multiplication tuple from LUT
    PointF LUT_tuple = LUT[(y * KINECT_SENSOR.DepthFrameSource.FrameDescription.Width) + x];
    
    // error-checking
    if (float.IsInfinity(LUT_tuple.X) || float.IsInfinity(LUT_tuple.Y)) { ... handle error ... }
    
    x_3D_in_mm = (int)(LUT_tuple.X * z);
    y_3D_in_mm = (int)(LUT_tuple.Y * z);
    z_3D_in_mm = z;



    • Proposed as answer by Earth Atlas Thursday, February 26, 2015 7:13 AM
    Thursday, February 26, 2015 7:11 AM
  • - I do not know whether the lookup table returned by "GetDepthFrameToCameraSpaceTable" is fixed, or if it changes over time (or under different conditions, such as hardware tilt). If the lookup table is fixed, you would want to cache it to avoid repeated calls to "GetDepthFrameToCameraSpaceTable".

    The table is constant for a Kinect device, and is based on the calibration intrinsics internal to the device.
    Thursday, February 26, 2015 7:16 AM