What is Microsoft Speech Platform?

  • Question

  • What is Microsoft Speech Platform? How does it relate to SAPI 5?

    I'm developing in native code, not .NET. Currently I use SAPI 5, or rather, the COM automation objects I got from importing the SAPI 5 type library. My old speech recognition code still works* with the Kinect, but its accuracy isn't very good, and it doesn't work over the sound from the game. I'd also like to be able to tell where a sound came from, or listen in a specified direction. Ideally, I'd like to be able to listen in two directions at once, corresponding to the two skeletons, and tell who said what.

    So, what's the relationship between what I'm doing now with SAPI, and the Kinect way of doing speech recognition? Can I still use SAPI? What is the Microsoft Speech Platform? I installed version 11, and some engines, but it didn't seem to affect anything in the speech control panel, or in speech applications. What's the difference between versions 10 and 11 of the Microsoft Speech Platform?

    And why doesn't Microsoft explain what things are anymore? I can't find anywhere that says what Microsoft Speech Platform actually is.

    * When I say "still works", I mean since Vista was introduced, when I (and everyone else in the world) had to change from always using a shared recogniser to detecting whether the code is running on Vista or later and, if so, creating an inproc recogniser instead of a shared one, to get around the deliberate bug introduced by Microsoft's arrogance.
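
    The workaround described in the footnote comes down to a version check that decides which recogniser object to create. A minimal sketch of that decision, written in Python purely for illustration (the original code is native COM): the two ProgIDs are the SAPI 5 automation names, while the helper name `choose_recognizer_progid` and the constant names are hypothetical.

```python
# SAPI 5 automation ProgIDs for the two recogniser objects.
SHARED_PROGID = "SAPI.SpSharedRecognizer"
INPROC_PROGID = "SAPI.SpInprocRecognizer"

# Windows Vista reports itself as NT version 6.0; XP and earlier are 5.x.
VISTA_MAJOR_VERSION = 6


def choose_recognizer_progid(os_major_version: int) -> str:
    """Hypothetical helper: return the ProgID of the recogniser to create.

    On Vista and later an in-process recogniser is used; on older
    systems the shared recogniser is kept.
    """
    if os_major_version >= VISTA_MAJOR_VERSION:
        return INPROC_PROGID
    return SHARED_PROGID
```

    On Windows, the major version could come from `sys.getwindowsversion().major`, and the returned ProgID would be passed to whatever COM creation call the host language provides. Note that an in-process recogniser, unlike the shared one, normally needs its audio input set explicitly before it will hear anything.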

    Wednesday, December 28, 2011 1:59 PM

Answers

  • Kinect itself does not introduce a new speech platform. However, we leverage the Microsoft Speech Platform (a link to download it is in the toolkit), which is a more recent technology than SAPI. There are significant differences between SAPI and Microsoft.Speech, most notably that the current technology is speaker independent. That is, it does not require training data for a specific user.

    In addition, we ship language packs for Kinect that include acoustic models trained on Kinect audio. Using these models will achieve significantly better results than using the base acoustic models, which were trained on standard microphones.

    Tuesday, May 29, 2012 9:42 PM

All replies

  • I second this; it would really be helpful to get clarification about this. I want to add speech recognition to a Windows application. Do I use SAPI 5.4 or the "Speech Platform"? What are the differences?

    Monday, February 13, 2012 12:12 AM
  • The Kinect does not introduce anything new to the speech recognition platform. Microsoft added the language pack for the Kinect and a stream method for setting the Kinect's audio stream as the audio source for the speech recognizer. If you want to know where the sound is coming from using the Kinect, then you have to implement that yourself. If you want to add speech to other applications, you need to do that yourself as well, either by selecting the default audio source or by feeding a stream from your application to the new stream method as the audio source on the speech recognizer. If SAPI mirrors the managed speech platform, then there's no difference between them.
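
    As a rough illustration of what locating a sound yourself might involve, the sketch below estimates a direction from the arrival-time difference between two microphone channels using a brute-force cross-correlation. It is not Kinect SDK code: it assumes you can already obtain two raw channels as lists of samples, and the function names, the sample rate and the microphone spacing are all hypothetical.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # approximate speed of sound in air


def estimate_delay(left, right, max_lag):
    """Return the lag, in samples, at which `right` best matches `left`.

    A positive result means the sound reached the right channel later.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, sample in enumerate(left):
            j = i + lag
            if 0 <= j < len(right):
                score += sample * right[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def delay_to_angle_degrees(lag_samples, sample_rate_hz, mic_spacing_m):
    """Convert a sample delay to an angle away from straight ahead."""
    ratio = SPEED_OF_SOUND_M_S * lag_samples / (sample_rate_hz * mic_spacing_m)
    ratio = max(-1.0, min(1.0, ratio))  # guard against noisy estimates
    return math.degrees(math.asin(ratio))
```

    For example, a delay of 3 samples at an assumed 16 kHz sample rate with microphones an assumed 0.2 m apart corresponds to a source somewhat under 20 degrees off centre. Real audio would need windowing and noise handling that this sketch leaves out.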
    Monday, February 13, 2012 4:55 AM