Kinect Sound Location and "Background noise cancelling"?

  • Question

  • Hi all, first-time poster here. Please let me know if there is anything I've missed or can do to better help you guys help me!

    I'm currently working with the Kinect SDK in C++, and I have the Kinect successfully detecting the location of my voice, as per the AEC walkthrough.  Perfect.

    However, in the project I'm working on, I want the Kinect to continually determine the location of a sound as it moves.  For example: I sing a note to the Kinect from the left, then walk around to the right side of the device whilst still holding the same note. The desired result is that the determined location moves with me as I walk from one side to the other.  The actual result, however, is that the Kinect determines the sound to be coming from the left, and that is all. I can move to the right side holding that exact same note, but my results suggest either that the sound hasn't moved or that no sound is detected.

    If I then cut off my voice for a moment and make another sound, or even if I shoot the pitch of my voice up one or two octaves, the results show that a new sound has registered, and the location updates as expected.

    My thinking is that the Kinect (or my application) is treating my voice as "Background Noise" because it is constant, and is therefore cancelling it out.  That would explain why it picks up a new sound when one is introduced.  Is there a way for me to achieve the desired result described above? Can the Kinect do this? Or is this a fundamental limitation that I cannot get around?

    Thanks in advance, everyone.

    Tuesday, August 16, 2011 9:32 AM

Answers

  • jemballs,

    The sound source localizer is optimized and tested for speech sound patterns. Holding a single note while moving around might sometimes get picked up correctly, but you'll notice that the confidence value returned by ISoundSourceLocalizer::GetPosition starts dropping toward zero, because the localizer stops finding speech-like patterns in the sound source. You might be able to get some localization by lowering your confidence threshold but, as mentioned, the sound localizer is optimized for speech signals.
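
    As a rough illustration of the confidence threshold idea above, here is a minimal sketch. It assumes your application stores the angle and confidence pairs it reads from GetPosition in a list; the Reading struct and the latestConfidentAngle helper are hypothetical names, not part of the SDK, and the confidence is assumed to run from 0 to 1.

```cpp
#include <vector>

// One stored localizer reading. Hypothetical type: the application is assumed
// to fill it from the values ISoundSourceLocalizer::GetPosition hands back.
struct Reading {
    double angle;       // source angle, in whatever units the SDK reports
    double confidence;  // assumed range: 0.0 (no speech-like pattern) to 1.0
};

// Hypothetical helper: walks the readings from newest to oldest and writes the
// angle of the first one whose confidence meets the threshold into angleOut.
// Returns false (leaving angleOut untouched) if no reading qualifies.
bool latestConfidentAngle(const std::vector<Reading>& readings,
                          double threshold,
                          double& angleOut) {
    for (auto it = readings.rbegin(); it != readings.rend(); ++it) {
        if (it->confidence >= threshold) {
            angleOut = it->angle;
            return true;
        }
    }
    return false;
}
```

    Lowering the threshold lets weaker readings through, so the reported position updates more often, at the cost of trusting estimates the localizer itself is unsure about.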

    You won't be able to get reliable localization results from the Kinect SDK for the scenario you describe, unless you write a sound localization algorithm tailored to your domain that works on the raw 4-channel microphone array input. You can get this input as shown in C:\Users\Public\Documents\Microsoft Research KinectSDK Samples\Audio\AudioCaptureRaw\CPP sample.

    Good luck, and sorry that the API can't help you more.
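
    One possible starting point for such a custom algorithm is time-delay estimation between two of the microphone channels. The sketch below is not the SDK's method: it uses plain cross-correlation and a far-field model, the function names are made up for illustration, and the sample rate and microphone spacing are parameters you would have to supply for your own setup. The raw capture is assumed to be already split into one float buffer per channel.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Returns the lag, in samples within [-maxLag, maxLag], that maximizes the
// cross-correlation sum over i of a[i] * b[i + lag]. A positive result means
// the sound reaches channel b later than channel a.
int estimateDelaySamples(const std::vector<float>& a,
                         const std::vector<float>& b,
                         int maxLag) {
    const int n = static_cast<int>(std::min(a.size(), b.size()));
    int bestLag = 0;
    double bestScore = -1e300;
    for (int lag = -maxLag; lag <= maxLag; ++lag) {
        double score = 0.0;
        for (int i = 0; i < n; ++i) {
            const int j = i + lag;
            if (j >= 0 && j < n) {
                score += static_cast<double>(a[i]) * b[j];
            }
        }
        if (score > bestScore) {
            bestScore = score;
            bestLag = lag;
        }
    }
    return bestLag;
}

// Converts a delay into a bearing using the far-field relation
// sin(theta) = c * tau / d, clamped so that asin stays defined.
// micSpacingMeters and sampleRateHz must come from your own setup.
double delayToAngleRadians(int delaySamples,
                           double sampleRateHz,
                           double micSpacingMeters,
                           double speedOfSound = 343.0) {
    const double tau = delaySamples / sampleRateHz;
    const double s = speedOfSound * tau / micSpacingMeters;
    return std::asin(std::max(-1.0, std::min(1.0, s)));
}
```

    Keep in mind that a held note is close to periodic, so its cross-correlation repeats once per period. If the period is shorter than the lag range you search, several lags score almost equally and the estimated delay becomes ambiguous.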
    Eddy


    I'm here to help
    Friday, August 19, 2011 12:13 AM

All replies

  • Hi Eddy,

    Thanks for clarifying that for me.  I'll have a look at that domain-specific sound localization, but if all else fails, I'll tweak the design of my application and see what I can do.

    Thanks again for your reply!

    Sunday, August 21, 2011 11:15 PM