AudioBeam - Confidence value for multiple directions

  • Question

  • In the AudioBasics-D2D sample code, the Kinect sets the beam angle automatically and then exposes the selected angle and its confidence value.

    // Get audio beam angle and confidence
    pAudioBeamSubFrame->get_BeamAngle(&fBeamAngle);
    pAudioBeamSubFrame->get_BeamAngleConfidence(&fBeamAngleConfidence);

    But I would like to get the BeamAngleConfidence for other directions besides the active beam angle so that I can choose a different direction manually if necessary. In order to choose the best direction (or sound source location), the Kinect must be calculating those values for other directions [-50...50] and choosing the most likely. Does the API reveal those estimates?

    Tuesday, January 20, 2015 7:21 PM

All replies

  • I really need the beam stream AND some estimate of confidence for different directions. If I have to, though, I can calculate the confidence with the raw stream data. If I cannot get the BeamAngleConfidence for other directions, can I get the raw (4-channel) microphone data AND the beam buffer at the same time? I don't care what language I use... I can switch to anything that will let me do this. Also, it's ok if I drop a frame every now and then...
    Wednesday, January 21, 2015 12:33 AM
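The fallback mentioned above, estimating a score for each direction from the raw 4-channel data, could look roughly like the following delay-and-sum sketch. It is not the Kinect's internal algorithm and it does not reproduce BeamAngleConfidence. The microphone positions passed in as `micX`, the 16 kHz sample rate, and the names `sampleAt`, `steeredPower`, and `scoreDirections` are all assumptions made for illustration. Each candidate angle is scored by the power of the aligned and summed channels, relative to the best candidate.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Assumed values for illustration; not taken from the Kinect documentation.
constexpr double kSampleRateHz = 16000.0;
constexpr double kSpeedOfSound = 343.0;  // metres per second
constexpr double kPi = 3.14159265358979323846;

// Linearly interpolated read of x at a fractional index; 0 outside the buffer.
double sampleAt(const std::vector<float>& x, double pos) {
    if (x.empty() || pos < 0.0 || pos > static_cast<double>(x.size() - 1)) return 0.0;
    const std::size_t i = static_cast<std::size_t>(pos);
    const double frac = pos - static_cast<double>(i);
    const double a = x[i];
    const double b = (i + 1 < x.size()) ? x[i + 1] : x[i];
    return a + frac * (b - a);
}

// Mean power of the delay-and-sum output when steering towards angleRad.
// micX holds hypothetical microphone x-positions in metres along the array.
double steeredPower(const std::vector<std::vector<float>>& channels,
                    const std::vector<double>& micX, double angleRad) {
    assert(!channels.empty() && channels.size() == micX.size());
    const std::size_t n = channels[0].size();
    assert(n > 0);
    double power = 0.0;
    for (std::size_t t = 0; t < n; ++t) {
        double sum = 0.0;
        for (std::size_t m = 0; m < channels.size(); ++m) {
            // Far-field model: a mic at +x hears a source at +angle earlier,
            // by micX * sin(angle) / c seconds (converted to samples here).
            const double lead = micX[m] * std::sin(angleRad) / kSpeedOfSound * kSampleRateHz;
            sum += sampleAt(channels[m], static_cast<double>(t) - lead);
        }
        power += sum * sum;
    }
    return power / static_cast<double>(n);
}

struct DirectionScore {
    double angleDeg;
    double score;  // steered power relative to the best candidate, in [0, 1]
};

// Scores every candidate angle (in degrees) on the same block of raw samples.
std::vector<DirectionScore> scoreDirections(const std::vector<std::vector<float>>& channels,
                                            const std::vector<double>& micX,
                                            const std::vector<double>& anglesDeg) {
    std::vector<DirectionScore> scores;
    double maxPower = 0.0;
    for (double deg : anglesDeg) {
        const double p = steeredPower(channels, micX, deg * kPi / 180.0);
        scores.push_back({deg, p});
        if (p > maxPower) maxPower = p;
    }
    for (auto& s : scores) s.score = (maxPower > 0.0) ? s.score / maxPower : 0.0;
    return scores;
}
```

Because every candidate is scored on the same block of samples, nothing has to be re-recorded per direction. The price is that the scores are only as good as the assumed geometry and the plain delay-and-sum model.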
  • By default, the beam automatically detects the strongest source and, using IAudioBodyCorrelation, determines which tracked user might be the source.

    You can switch the IAudioBeam to manual mode, control the beam angle yourself, and acquire the confidence values in that direction.

    https://msdn.microsoft.com/en-us/library/microsoft.kinect.kinect.iaudiobeam.aspx

    • put_AudioBeamMode - AudioBeamMode_Manual
    • put_BeamAngle

    Carmine Sirignano - MSFT

    Wednesday, January 21, 2015 7:42 PM
  • I can override the AudioBeam and steer it, yes, but to estimate the confidence value for 10 different directions, I would still need to set each direction, then sample, and then estimate the confidence, right? How large does each sample need to be for a quality confidence estimate? And the system needs to stop recording data at the same time... right?

    If necessary, I CAN stop recording and resample, but if the resampling takes a second per sweep of the 10 beams, it won't work.

    Friday, January 23, 2015 7:43 PM
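A rough feel for the sweep time asked about above comes from simple arithmetic. The sketch assumes each audio subframe covers 16 ms (check the value your audio source actually reports) and ignores any settling time after the beam angle changes. Both function names are made up for illustration.

```cpp
#include <cassert>

// Assumption: one audio subframe covers 16 ms of audio.
constexpr int kSubFrameMs = 16;

// Time to visit every direction once, dwelling the same number of subframes on each.
int sweepDurationMs(int directions, int subFramesPerDirection) {
    assert(directions > 0 && subFramesPerDirection >= 0);
    return directions * subFramesPerDirection * kSubFrameMs;
}

// Largest dwell (in whole subframes per direction) that fits in the time budget.
int maxSubFramesPerDirection(int directions, int budgetMs) {
    assert(directions > 0 && budgetMs >= 0);
    return budgetMs / (directions * kSubFrameMs);
}
```

Under these assumptions, 10 directions and a 1000 ms budget leave at most 6 subframes (96 ms of audio) per direction, for a 960 ms sweep. A strictly sequential sweep is therefore close to the one-second limit before any steering overhead is counted.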
  • I don't quite understand what you are asking. Beamforming focuses on the "strongest source" to detect where the sound is coming from. Setting it to automatic will provide the beam direction and how confident it was in that guess. The strength of the source and how noisy the area is will both be factors. From that you can deduce the falloff for the other regions, as demonstrated in the audio sample.

    If you want to set up your own manual beams and get the values at the "same time", create a different thread for each angle. "Same time" is qualified because Windows is not a real-time operating system. The speed of the CPU and how you set up the threads are factors you will need to work with. With audio you have to worry about latency.


    Carmine Sirignano - MSFT

    Monday, January 26, 2015 7:09 PM
  • Thank you for your reply. I DO want to set up multiple manual beams and sample them at the same time, but it is not clear to me that this is possible. Looking at the AudioBasics-D2D code, creating a new IAudioBeam and setting the beam angle using put_BeamAngle seems to only overwrite the previous beam instead of storing multiple beams in the list. The resulting pAudioBeamList->get_BeamCount is still only 1. Furthermore, the comments on the pAudioBeamFrameList->OpenAudioBeamFrame say that only one beam is currently supported. So do I need to create multiple frame readers, then? Or should I create multiple AudioSources or...? As you mentioned, latency is a real issue, but I also don't want to drop frames because I am missing events...
    Tuesday, January 27, 2015 6:10 PM
  • You would use multiple audio sources, one per thread, or use the WASAPI APIs for low-level access to the microphones.

    Carmine Sirignano - MSFT

    Wednesday, January 28, 2015 5:47 PM