Microsoft Speech Platform Dependency Options

  • Question

Kinect speech recognition depends on the Microsoft Speech Platform 10.2.  This namespace (Microsoft.Speech.*) has an analogous namespace in .NET (System.Speech).  This can be quite confusing, and it finally became clear to me from another post on the subject.

The problem with using the Microsoft Speech Platform is, first, that every user has to install the items listed in the Readme above.  In addition, my second link above lists some important restrictions of the Microsoft Speech Platform: for one, speech recognition cannot control multiple applications, and dictation capability is not present.

    1) To utilize .NET System.Speech instead of the Microsoft Speech Platform in the samples, do I simply change my using directives from Microsoft.Speech.* to System.Speech.*?

    2) The SDK also supplies a language pack acoustic model (see Readme), which I assume tailors the recognition engine for the Kinect microphone array.  Can I make use of this acoustic model with System.Speech, or is it baked into the SDK somehow?

    3) The fundamental question is: why doesn't Kinect adopt .NET's System.Speech, which is, for all intents and purposes, identical?  As a last resort, can System.Speech and Microsoft.Speech be used in the same app?
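    To clarify what I mean by the swap in (1), here is roughly the code change I have in mind; the color grammar and handler are just placeholders, not the samples' actual grammar:

    ```csharp
    // Before: using Microsoft.Speech.Recognition;
    // After:  the analogous types live under System.Speech.Recognition.
    using System;
    using System.Speech.Recognition;

    class SpeechSwapDemo
    {
        static void Main()
        {
            // Same class names as in Microsoft.Speech.Recognition,
            // just a different namespace.
            using (var sre = new SpeechRecognitionEngine())
            {
                // Placeholder command grammar.
                var choices = new Choices("red", "green", "blue");
                sre.LoadGrammar(new Grammar(new GrammarBuilder(choices)));

                sre.SetInputToDefaultAudioDevice();
                sre.SpeechRecognized += (s, e) => Console.WriteLine(e.Result.Text);
                sre.RecognizeAsync(RecognizeMode.Multiple);
                Console.ReadLine(); // keep recognizing until Enter is pressed
            }
        }
    }
    ```
    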

    Note: I have also started a thread concerning using Advanced Audio in the Speech namespace.


    Thursday, July 21, 2011 7:32 PM


All replies

  • aidesigner,

    The recommendation from the speech team is to use the Microsoft.Speech.* namespace.

    1) Yes, it is possible to use System.Speech with Kinect SDK Beta, but we will not provide support for that scenario as there is no Kinect-specific acoustic model built for that API set.

    2) No, the acoustic model is not compatible with System.Speech.

    3) The acoustic models built specifically for Kinect are for Microsoft.Speech.* namespace, and they are more accurate than the default acoustic models for System.Speech.
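    For reference, the pattern used by the SDK Beta's speech sample looks roughly like the following.  Treat the recognizer ID and audio format values as illustrative; verify them against `InstalledRecognizers()` and the Readme on your machine:

    ```csharp
    using System;
    using System.Linq;
    using Microsoft.Research.Kinect.Audio;   // Kinect SDK Beta audio capture
    using Microsoft.Speech.AudioFormat;
    using Microsoft.Speech.Recognition;

    class KinectSpeechDemo
    {
        static void Main()
        {
            // Select the Kinect acoustic model installed by the language pack
            // (ID as shipped with the Beta; check InstalledRecognizers() yourself).
            RecognizerInfo ri = SpeechRecognitionEngine.InstalledRecognizers()
                .FirstOrDefault(r => r.Id == "SR_MS_en-US_Kinect_10.0");

            using (var source = new KinectAudioSource())
            using (var sre = new SpeechRecognitionEngine(ri.Id))
            {
                source.FeatureMode = true;
                source.AutomaticGainControl = false;  // off is recommended for recognition
                source.SystemMode = SystemMode.OptibeamArrayOnly; // beamforming, no AEC

                // Placeholder command grammar.
                var choices = new Choices("red", "green", "blue");
                sre.LoadGrammar(new Grammar(new GrammarBuilder(choices)));

                // 16 kHz, 16-bit, mono PCM from the Kinect microphone array.
                sre.SetInputToAudioStream(source.Start(),
                    new SpeechAudioFormatInfo(EncodingFormat.Pcm, 16000, 16, 1, 32000, 2, null));
                sre.SpeechRecognized += (s, e) => Console.WriteLine(e.Result.Text);
                sre.RecognizeAsync(RecognizeMode.Multiple);
                Console.ReadLine();
            }
        }
    }
    ```
    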

    Hope this answers your questions,

    I'm here to help
    Thursday, July 21, 2011 8:22 PM
  • 1) The disadvantages of Microsoft.Speech are that I lose dictation ability and control of running applications, and that every user must perform three downloads (see initial link).  The only disadvantage of System.Speech (.NET) is that I lose the acoustic model.  Basically, is it correct that I still get AEC, beam positioning, etc. (part of Microsoft.Research.Kinect.*)?  I get the impression it is really important to use Microsoft.Speech; is this solely because of the acoustic model?

    1a) Exactly how important is the Kinect acoustic model to speech recognition accuracy?

    2) Do you think Kinect will ever just migrate to System.Speech which is already in .NET?

    For my app I may give the user the option to use a standard microphone (while highly recommending the Kinect).  For this and other reasons it would be nice not to be tied to Microsoft.Speech.  However, I am starting to lean toward Microsoft.Speech, as I want the full power of the Kinect.

    Thanks Again,

    Friday, July 22, 2011 3:55 PM
  • I have taken a slightly different approach to this problem.  I am now using Microsoft's low-level SAPI interface for speech recognition because it has functionality I require.  However, I still need some functionality of the Kinect SDK, specifically AEC and the acoustic model.

    My question is: can I obtain an audio stream from the Kinect SDK (Microsoft.Research.Kinect.*) with AEC active, and then pipe that stream to SAPI 5.4 for speech recognition?  If so, how can this be done with the Kinect SDK libraries?

    • Edited by aidesigner Thursday, September 29, 2011 3:44 PM
    Thursday, September 29, 2011 3:44 PM
  • Your best bet, if you want to use SAPI, is to use the C++ interface for the Kinect SDK. I gave some suggestions on how to do this in two other threads.

    However, we are unable to provide extensive support for SAPI integration at this time.
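    As a rough orientation only, the SAPI side of such a pipeline looks like the skeleton below.  How you wrap the Kinect's AEC-processed audio as `pKinectStream` is the part covered by those threads and is assumed here; error handling and `CoInitialize` are omitted:

    ```cpp
    // Sketch only: pKinectStream is assumed to be an ISpStreamFormat wrapping the
    // AEC-processed Kinect audio (obtaining it is SDK-specific and not shown).
    #include <atlbase.h>   // CComPtr
    #include <sapi.h>

    HRESULT RecognizeFromKinect(ISpStreamFormat* pKinectStream)
    {
        // Create an in-process recognizer so we control its input stream.
        CComPtr<ISpRecognizer> pRecognizer;
        HRESULT hr = pRecognizer.CoCreateInstance(CLSID_SpInprocRecognizer);

        // Feed the recognizer the Kinect stream instead of the default microphone.
        if (SUCCEEDED(hr)) hr = pRecognizer->SetInput(pKinectStream, TRUE);

        CComPtr<ISpRecoContext> pContext;
        if (SUCCEEDED(hr)) hr = pRecognizer->CreateRecoContext(&pContext);

        // Dictation grammar: the capability Microsoft.Speech lacks.
        CComPtr<ISpRecoGrammar> pGrammar;
        if (SUCCEEDED(hr)) hr = pContext->CreateGrammar(0, &pGrammar);
        if (SUCCEEDED(hr)) hr = pGrammar->LoadDictation(NULL, SPLO_STATIC);
        if (SUCCEEDED(hr)) hr = pGrammar->SetDictationState(SPRS_ACTIVE);

        // ...then wait on pContext notifications and pull results via GetEvents()...
        return hr;
    }
    ```
    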

    Good luck!

    I'm here to help
    Monday, October 3, 2011 9:20 PM