# Skeleton Tracking with more than 2 Kinects

### Question

• Hi everybody,

I wrote a program that merges two skeletons coming from two different Kinect sensors. I used transformation matrices to transform the coordinates of Kinect2 into those of Kinect1, and then interpolated the two skeletons.

While doing this, I noticed a fairly large error: it originates in the imprecision of the skeleton tracking system, and it is then propagated by the computation of the new Kinect2 coordinates and again by the interpolation.
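To make the setup concrete, here is a minimal Python sketch of the transform-and-interpolate step I described (the matrix values, joint positions, and helper names are made-up examples, not my actual program):

```python
# Sketch of the transform-and-merge step: express a Kinect2 joint in
# Kinect1's referential via a 4x4 homogeneous matrix, then interpolate.
# mat21 below is a made-up calibration result (a pure +1 m x-translation).

def apply(mat, p):
    """Apply a 4x4 homogeneous transform to a 3D point."""
    x, y, z = p
    v = (x, y, z, 1.0)
    out = [sum(mat[r][c] * v[c] for c in range(4)) for r in range(4)]
    return (out[0] / out[3], out[1] / out[3], out[2] / out[3])

def merge(p1, p2_in_k1, w=0.5):
    """Linear interpolation of the same joint seen by two sensors."""
    return tuple(w * a + (1 - w) * b for a, b in zip(p1, p2_in_k1))

mat21 = [[1, 0, 0, 1.0],   # K2 -> K1: shift +1 m along x (example value)
         [0, 1, 0, 0.0],
         [0, 0, 1, 0.0],
         [0, 0, 0, 1.0]]

joint_k1 = (0.5, 1.0, 2.0)    # joint as seen by Kinect1
joint_k2 = (-0.5, 1.0, 2.0)   # same joint as seen by Kinect2
merged = merge(joint_k1, apply(mat21, joint_k2))
# merged == (0.5, 1.0, 2.0): both sensors agree once in the same frame
```

Any disagreement between `joint_k1` and the transformed `joint_k2` is exactly the error that interpolation spreads into the merged skeleton.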

Now suppose I want to extend this program to 3 or more Kinects (for now, running on the same computer). A new issue arises: how should I combine the sensors so that the resulting skeleton has the smallest possible error? For example:

- 3 Kinects K1, K2, K3: I could transform K3 into K2's frame, interpolate, then transform the result into K1's frame and interpolate again; or I could transform both K3 and K2 into K1's frame and interpolate all of them.

- 4 Kinects K1, K2, K3, K4: same idea. I could transform K4 into K3's frame, interpolate, then repeat with the result and K2, then again with the result and K1; or I could transform all of them into K1's frame and interpolate.

Could you please help me understand how to reason about these situations? Is the best solution the one with the fewest computations?

Then I guess that with n sensors on m computers the same kind of reasoning will apply.

Thanks in advance, and sorry for my bad English.

Monday, September 30, 2013 9:41 AM

• I have an algorithm for handling the inferred joints properly, so I'm not considering that problem.

"One last note: Kinect joint coordinates are not reliable if too far away or hidden by other joints." So isn't it true that if two sensors are close together, it's statistically more likely that a joint tracked by one is also tracked by the other, in which case I have confident coordinates from both sensors? If so, the chaining could be better for this reason.

Not necessarily. Having a person too far away or too close means they are not tracked by any Kinect. The range is the same for every Kinect and is not improved by using multiple Kinects. The same goes for quality, unless you are building a point-cloud system for 3D reconstruction, in which case you can verify that the data is of very good quality. In addition, if one person is standing in front of another, retrieving the coordinates of the person behind is impossible or very difficult. I have a very simple project on CodePlex called Kinect Multipoint (http://kinectmultipoint.codeplex.com) in which I am handling multiple users (I will later have to use multiple Kinects to do so). I find that Microsoft's Kinect SDK tends to have some problems with multiple users: one person standing in front of another; the Kinect sometimes picking up the table or other objects on low-resource systems; and more than 2 players currently having to be looped. I have found a fix for the second multi-user problem above, which is to wave a hand in front of the Kinect to reset the depth stream, or to have the software tilt the Kinect up and back down to retrieve a clearer image.

Big note: do not point the Kinects' beams at each other, as this reduces the quality of the retrieved depth image because the infrared beams interfere with each other.

"Once you eliminate the impossible, whatever remains, no matter how improbable, must be the truth." - Sherlock Holmes. "Speak softly and carry a big stick." - Theodore Roosevelt. "Fear leads to anger, anger leads to hate, hate leads to suffering." - Yoda. Blog - http://www.computerprofessions.co.nr

• Edited by Wednesday, October 02, 2013 8:04 PM edit2
• Marked as answer by Thursday, October 03, 2013 9:45 AM
Wednesday, October 02, 2013 8:02 PM

### All replies

• To keep the error from growing across interpolations, and to minimize the computation load, you should transform each Kinect's referential directly into the target one (instead of chaining the computations):

• K1-> no change
• K2*mat21->K1
• K3*mat31->K1
• K4*mat41->K1

You may be able to pre-compute the matrices mat31 and mat41 (during software initialization, not during frame processing) because they are just products of known matrices (mat31 = mat32*mat21, mat41 = mat43*mat32*mat21).
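A rough Python sketch of that pre-computation (the translation-only matrices are made-up placeholders for real pairwise calibration results; note the post multiplies points as row vectors, so with the column-vector convention used below the product order reverses):

```python
# Pre-compute the chained K3->K1 and K4->K1 matrices once at startup,
# so per-frame work is a single transform per sensor.

def matmul(a, b):
    """4x4 matrix product (a applied after b, for column vectors)."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def translation(tx, ty, tz):
    """Hypothetical stand-in for a real calibration matrix."""
    return [[1.0, 0.0, 0.0, tx],
            [0.0, 1.0, 0.0, ty],
            [0.0, 0.0, 1.0, tz],
            [0.0, 0.0, 0.0, 1.0]]

mat21 = translation(1.0, 0.0, 0.0)    # K2 -> K1 (example values)
mat32 = translation(0.0, 0.0, 2.0)    # K3 -> K2
mat43 = translation(0.0, -1.0, 0.0)   # K4 -> K3

# Done once at initialization, never per frame:
mat31 = matmul(mat21, mat32)   # K3 -> K1
mat41 = matmul(mat31, mat43)   # K4 -> K1
```

With these cached, every frame needs only one matrix-vector product per sensor, and each skeleton accumulates the error of a single transform instead of a chain of transform-interpolate steps.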

One last note: Kinect joint coordinates are not reliable if too far away or hidden by other joints.

I think that in some cases, averaging positions from multiple Kinects may increase the error, because some joints are merely inferred (and wrong) by the Kinect SDK. You should use a kind of confidence-weighted-average algorithm in which the weight varies depending on which Kinect sees which joint with confidence.
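For example, a minimal Python sketch of such a confidence-weighted average (the state-to-weight mapping is an arbitrary assumption of mine, not something defined by the Kinect SDK; the SDK does report a per-joint tracking state of Tracked, Inferred or NotTracked):

```python
# Confidence-weighted joint average across sensors. Each observation is
# (position, tracking_state), with positions already in K1's referential.
# The weight values below are hypothetical tuning choices.

WEIGHTS = {"Tracked": 1.0, "Inferred": 0.2, "NotTracked": 0.0}

def weighted_joint(observations):
    """Weighted average of one joint's position over all sensors."""
    total = sum(WEIGHTS[state] for _, state in observations)
    if total == 0.0:
        return None  # no sensor saw this joint with any confidence
    return tuple(
        sum(WEIGHTS[state] * pos[i] for pos, state in observations) / total
        for i in range(3))

obs = [((0.50, 1.00, 2.00), "Tracked"),    # Kinect1: solid measurement
       ((0.60, 1.00, 2.00), "Inferred")]   # Kinect2: SDK guess, low weight
# weighted_joint(obs) -> x = (1.0*0.5 + 0.2*0.6)/1.2, about 0.517
```

A plain average would land at x = 0.55, pulled halfway toward the inferred (possibly wrong) position; the weighting keeps the result close to the tracked one.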

Hope this helps

Vincent Guigui Innovative Technologies Expert at OCTO Technology MVP Kinect

Monday, September 30, 2013 3:41 PM
• I have an algorithm for handling the inferred joints properly, so I'm not considering that problem.

"One last note: Kinect joint coordinates are not reliable if too far away or hidden by other joints." So isn't it true that if two sensors are close together, it's statistically more likely that a joint tracked by one is also tracked by the other, in which case I have confident coordinates from both sensors? If so, the chaining could be better for this reason.

Monday, September 30, 2013 4:17 PM
• Hi,

I am working on the same issue. If you would like to discuss it together, you can contact me at salous848@hotmil.com

Tuesday, September 16, 2014 10:24 AM