Kinect benefits as a Default Microphone (Using it in a Silverlight application, I can get ColorStream, DepthStream and SkeletalStream but no Audio Stream)

  • Question

  • Hi,

    I just want to know whether there are any benefits to using the Kinect sensor's microphone array as the default microphone on a PC. I am building a Silverlight 5 application in which I get the ColorStream, DepthStream and SkeletalStream. I can also get the AudioStream, but the performance is not good enough: I am using Coding4Fun.KinectService from CodePlex, which delivers the Kinect sensor input over a TCP port, so when I handle the audio this way the latency keeps growing and makes the app useless.

    Since I want to add speech recognition features to my Silverlight app, I would have to use SAPI through COM from Silverlight, which is a well-known issue. What I want to ask is: if I use the Kinect sensor as the default microphone on my PC, would I get any benefit from it?


    Friday, April 6, 2012 3:13 PM


All replies

  • Yes, you will be able to do speech recognition easily in your application.

    MOHAMED A. SAKR | Software Development Lead Engineer | EgyptNetwork

    Friday, April 6, 2012 8:30 PM
  • Hi,

    I know I can do speech recognition with the well-known Automation code for the SAPI engine, but my question is what benefits the Kinect sensor will bring me as a default microphone if I am using neither the Microsoft.Speech engine nor any of the code provided by the Microsoft Kinect SDK.

    Could you enlighten me, please?

    Friday, April 6, 2012 8:41 PM
  • You should get better recognition results if you switch from the SAPI engine to the Microsoft Speech engine. The beamforming and custom acoustic models should give better results, as long as it's in a supported scenario.

    What are you using as the input to SAPI right now?  What's your speech scenario?

    If you're not using the Microsoft Speech engine, the Kinect SDK, or the acoustic models built for the Kinect microphone, the microphone will just show up to the operating system as an array microphone. It will provide 4-channel audio. All of the interesting stuff happens above that, in the software. (It's worth calling out that the combination of the array microphone + beamforming + acoustic models should give a significant improvement over anything but a head-mounted mic. :) The geometry of the array was built and optimized by Microsoft Research and provides excellent noise cancellation capabilities.)
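    To make "the interesting stuff happens in the software" a bit more concrete, here is a minimal delay-and-sum sketch in Python. It is not the Kinect SDK's beamformer, only an illustration of the general idea: shift each channel by the delay at which sound from the chosen direction reaches that microphone, then average. The function name `delay_and_sum` and the integer sample delays are assumptions made for this example; real delays depend on the array geometry and the steering angle, which are not given in this thread.

```python
def delay_and_sum(channels, delays):
    """Align each channel by its integer sample delay, then average them.

    channels: list of equal-purpose sample lists, one per microphone.
    delays:   non-negative arrival delay (in samples) for each channel,
              for the direction we want to "listen" to (hypothetical values).
    """
    if len(channels) != len(delays):
        raise ValueError("need exactly one delay per channel")
    if any(d < 0 for d in delays):
        raise ValueError("delays must be non-negative")

    # Only keep the region where every shifted channel still has samples.
    length = min(len(ch) - d for ch, d in zip(channels, delays))

    # Sound from the steered direction adds up coherently; sound from
    # other directions is misaligned and gets attenuated by the average.
    return [
        sum(ch[d + i] for ch, d in zip(channels, delays)) / len(channels)
        for i in range(length)
    ]
```

    With four channels that carry the same pulse one sample later on each microphone, steering with the matching delays recovers the pulse at full amplitude, while steering with all-zero delays smears it to a quarter of that.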

    Other ways you could use the mic (if you're using the Kinect SDK) are to identify the direction that sound is coming from, and the person who was speaking (by correlating that direction with the directions of the tracked skeletons).
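    The correlation step could look roughly like the sketch below. It is plain Python rather than Kinect SDK code, and the names `skeleton_angle_degrees`, `find_speaker` and the 10 degree tolerance are made up for illustration. It assumes the sound direction is available as a horizontal angle in degrees, that each tracked skeleton has an (x, z) position in meters relative to the sensor, and that both use the same sign convention (worth checking against the SDK documentation).

```python
import math


def skeleton_angle_degrees(x, z):
    """Horizontal angle of a skeleton as seen from the sensor.

    x is the sideways offset and z the distance in front of the sensor,
    both in meters. 0 degrees means straight ahead (assumed convention).
    """
    return math.degrees(math.atan2(x, z))


def find_speaker(sound_angle_deg, skeletons, tolerance_deg=10.0):
    """Return the id of the skeleton closest to the sound direction.

    skeletons maps a tracking id to an (x, z) position.
    Returns None if nobody is within tolerance_deg of the sound angle.
    """
    best_id, best_diff = None, tolerance_deg
    for skeleton_id, (x, z) in skeletons.items():
        diff = abs(skeleton_angle_degrees(x, z) - sound_angle_deg)
        if diff <= best_diff:
            best_id, best_diff = skeleton_id, diff
    return best_id


# Two hypothetical tracked people, left and right of the sensor.
tracked = {1: (-1.0, 2.0), 2: (0.5, 2.0)}
speaker = find_speaker(12.0, tracked)  # sound arriving about 12 degrees off-center
```

    Returning None when no skeleton is within the tolerance avoids attributing speech to someone when the sound actually comes from a TV or a person outside the tracked area.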

    Friday, April 6, 2012 10:35 PM
  • Hi Chris, thanks for your response.

    I cannot use Microsoft Speech (or I don't know how). I have found that the best things about the Kinect technology are in the software. My scenario is Silverlight running in the browser or out of browser. Microsoft has not released an SDK that works with Silverlight, but with the help of the Coding4Fun KinectService (which doesn't work with Silverlight either), I created some classes that let me use the ColorStream, DepthStream and SkeletalStream in Silverlight. However, the high delay of the AudioStream makes me think I should use another API, especially since Microsoft.Speech doesn't work with Silverlight either. Or is there a way to make it work? Could you guide me, please?

    Saturday, April 7, 2012 2:03 AM
  • There's a demo video of using Speech Synthesis in Silverlight at:

    I haven't tried to do the same thing with speech recognition, but I believe it should work.

    There's also a discussion of how to get access to the Kinect APIs from Silverlight here:

    Saturday, May 12, 2012 4:32 PM