Follow an individual.

  • Question

  • Is it possible to have the Kinect respond to only one person's voice? For example, in a crowded room, could I follow the commands of only one particular individual?

     Thank you!

    -Martin


    Martin Paré
    Wednesday, January 11, 2012 7:52 PM

Answers

  • The Kinect does not have a built-in ability to recognize one person's voice or distinguish it from another's. You can do some things, like checking the sound source angle and verifying that it matches the direction of the person you want to listen to, but this is not a 100% solution.
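
    The angle check described above can be sketched roughly as follows. This is a minimal illustration in Python, not Kinect SDK code: the function names, the sign convention for the angles, the tolerance, and the confidence threshold are all assumptions made for the example. It takes the sound source angle and the target person's tracked position as plain numbers, however you obtain them.

```python
import math


def person_angle_deg(x_m, z_m):
    """Horizontal angle of a tracked person relative to the sensor's forward axis.

    x_m is the lateral offset in metres and z_m the distance in front of the
    sensor. The sign convention (positive x gives a positive angle) is an
    assumption; it must match whatever your audio angle uses.
    """
    return math.degrees(math.atan2(x_m, z_m))


def is_from_target(sound_angle_deg, confidence, target_x_m, target_z_m,
                   tolerance_deg=10.0, min_confidence=0.5):
    """Return True if the sound appears to come from the target's direction.

    tolerance_deg and min_confidence are hypothetical tuning values, not
    recommendations from the SDK.
    """
    if confidence < min_confidence:
        # The angle estimate is too unreliable to attribute to anyone.
        return False
    target_angle = person_angle_deg(target_x_m, target_z_m)
    return abs(sound_angle_deg - target_angle) <= tolerance_deg


if __name__ == "__main__":
    # Target stands 1 m to the side and 1 m in front of the sensor (45 degrees).
    print(is_from_target(40.0, 0.9, 1.0, 1.0))
    print(is_from_target(-20.0, 0.9, 1.0, 1.0))
```

    A check like this only filters by direction. Two people standing close together, or a second speaker behind the target, would still pass it, which is why it is not a complete solution.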

    Also, a crowded room with lots of people talking is going to be a very difficult acoustic environment for speech recognition. The system needs a signal-to-noise ratio of more than 20 dB to be able to recognize speech, and you are unlikely to get that in a crowded room, even with beamforming.
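
    To put the 20 dB figure in perspective, here is a small worked sketch in Python. It assumes you can measure the RMS amplitude of the speech and of the background noise separately (for example, from a recording made in the target room); the function names are hypothetical and this is not how the speech engine itself measures anything.

```python
import math


def rms(samples):
    """Root-mean-square amplitude of a sequence of audio samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def snr_db(speech_rms, noise_rms):
    """Signal-to-noise ratio in decibels from two RMS amplitudes."""
    if speech_rms <= 0 or noise_rms <= 0:
        raise ValueError("RMS amplitudes must be positive")
    # Amplitude ratios use 20*log10; power ratios would use 10*log10.
    return 20.0 * math.log10(speech_rms / noise_rms)


def meets_threshold(speech_rms, noise_rms, required_db=20.0):
    """True if the SNR is strictly above the required level (20 dB by default)."""
    return snr_db(speech_rms, noise_rms) > required_db


if __name__ == "__main__":
    # Speech three times louder than the noise is only about 9.5 dB.
    print(round(snr_db(3.0, 1.0), 1))
    print(meets_threshold(3.0, 1.0))
```

    In other words, 20 dB means the speech amplitude has to be about ten times the noise amplitude (a hundred times the power), which is hard to reach when several people nearby are talking at a similar volume.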

    At this point, we do not consider speech a good input modality for situations with a lot of background noise, although we are looking at ways to improve that in the future. If you do build a speech-enabled UI for that kind of environment:

    1) Be 100% sure to test it in the real environment it will be running in. You might be able to get it to work, but don't be surprised if you hit blocking issues.

    2) Make sure you provide backup ways of controlling the interface (e.g. gestures).

    Tuesday, April 3, 2012 5:16 PM

All replies

  • I have not been able to isolate one person's commands with even one other speaker in the room, whether the listening direction was set manually or automatically.
    • Proposed as answer by Emile Victor Friday, January 13, 2012 4:42 AM
    Thursday, January 12, 2012 5:51 PM
  • Do I understand correctly that you can listen to one person in a quiet room, but have a hard time when more than one person is talking?

    Thank you!


    Martin Paré
    Thursday, January 12, 2012 6:13 PM
  • Yes.
    Friday, January 13, 2012 1:15 AM