Speech Recognition with Kinect in C++

  • Question

  • Hello! I have the following problem:

    I can't figure out how to set up a speech recognizer in C++ using an audio stream from the Kinect.

     

    I know how to set up a basic speech recognizer using SAPI, and I see that ISpRecognizer has a SetInput method that seems to accept some sort of audio stream.

    The Kinect audio hardware is an IMMDevice that, I guess, offers some stream capabilities (I looked it up in the documentation, but it is overwhelming and I can't understand much of it).

    I can't figure out how to connect the two, that is, how to give the ISpRecognizer a stream input from the Kinect IMMDevice.

     

    Thank you

     

    PS: I know there is a C# sample. I tested it, even defining my own grammar, and it's awesome. But I have to use C++ because I'm developing an audio module for a large existing application that cannot be ported to C#.

    Monday, June 27, 2011 10:25 PM

All replies

  • I have the same problem: I also want to use speech recognition in C++ with the Kinect audio stream.

    Does anybody have an idea how this could be done?

    Thursday, June 30, 2011 7:49 AM
  • Sorry for the really long delay in responding. The speech platform does support C++ development, as outlined here (http://download.microsoft.com/download/1/7/7/177963E6-9F59-4497-AB55-1DCBB7139ACF/MicrosoftSpeechSDK.chm). From what I've been able to find, you'll need to use ISpRecognizer::SetInput() to set the input to the audio stream from the Kinect.

    The input should be an SpStream, which you can set to the Kinect's audio using ISpStream::SetBaseStream, which takes an IStream. You will also need to define a WAVEFORMATEX structure appropriate for Kinect audio.
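
    The two steps above can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not a supported implementation. It assumes the Kinect audio arrives as 16 kHz, 16-bit, mono PCM, and that you already have an IStream wrapping the Kinect capture (`kinectStream` is a hypothetical name; how to obtain it is not shown). `WaveFormat`, `makeKinectPcmFormat` and `attachKinectStream` are illustrative names as well. The portable struct only mirrors the WAVEFORMATEX fields so that the derived values can be checked anywhere; the SAPI part compiles on Windows only and expects COM to be initialized by the caller.

```cpp
#include <cstdint>
#ifdef _WIN32
#include <windows.h>
#include <sapi.h>
#endif

// Portable mirror of the WAVEFORMATEX fields (same names, same order).
// Illustrative only; on Windows the real WAVEFORMATEX is used below.
struct WaveFormat {
    std::uint16_t wFormatTag;
    std::uint16_t nChannels;
    std::uint32_t nSamplesPerSec;
    std::uint32_t nAvgBytesPerSec;
    std::uint16_t nBlockAlign;
    std::uint16_t wBitsPerSample;
    std::uint16_t cbSize;
};

// ASSUMPTION: Kinect audio is 16 kHz, 16-bit, mono PCM.
inline WaveFormat makeKinectPcmFormat() {
    WaveFormat f{};
    f.wFormatTag = 1;            // value of WAVE_FORMAT_PCM
    f.nChannels = 1;             // mono
    f.nSamplesPerSec = 16000;    // 16 kHz
    f.wBitsPerSample = 16;
    // Bytes per sample frame = channels * bits per sample / 8.
    f.nBlockAlign = static_cast<std::uint16_t>(f.nChannels * f.wBitsPerSample / 8);
    // Bytes per second = sample rate * bytes per frame.
    f.nAvgBytesPerSec = f.nSamplesPerSec * f.nBlockAlign;
    f.cbSize = 0;                // no extra format bytes for plain PCM
    return f;
}

#ifdef _WIN32
// Wraps kinectStream (hypothetical, supplied by the caller) in an SpStream
// and makes it the recognizer's input. COM must already be initialized.
inline HRESULT attachKinectStream(ISpRecognizer* recognizer, IStream* kinectStream) {
    const WaveFormat f = makeKinectPcmFormat();
    WAVEFORMATEX wfx = { f.wFormatTag, f.nChannels, f.nSamplesPerSec,
                         f.nAvgBytesPerSec, f.nBlockAlign, f.wBitsPerSample, f.cbSize };

    ISpStream* spStream = nullptr;
    HRESULT hr = CoCreateInstance(CLSID_SpStream, nullptr, CLSCTX_INPROC_SERVER,
                                  IID_ISpStream, reinterpret_cast<void**>(&spStream));
    if (FAILED(hr)) return hr;

    // Tell SAPI the raw stream contains audio described by wfx.
    hr = spStream->SetBaseStream(kinectStream, SPDFID_WaveFormatEx, &wfx);
    if (SUCCEEDED(hr)) {
        // FALSE: do not let the recognizer change the input format.
        hr = recognizer->SetInput(spStream, FALSE);
    }
    spStream->Release();  // drop our own reference to the SpStream
    return hr;
}
#endif
```

    With these values the block alignment comes out as 2 bytes and the average byte rate as 32000 bytes per second. If your Kinect audio source delivers a different format, the fields need to be adjusted to match it.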

    We won't be able to provide much support for this scenario, unfortunately, so I hope that the information provided above is enough to get you started.  You may find additional support at the Developer Center for Microsoft Speech Technologies (http://msdn.microsoft.com/en-us/speech/default.aspx).

    Hope this helps,
    Eddy
    Thursday, July 7, 2011 9:23 PM
  • Hey Eddy!

     

    When you say that you won't be providing much support for this scenario, do you mean support for a speech recognizer in C++? Also, how does one go about coding a WAVEFORMATEX structure appropriate for Kinect audio? Has the documentation been updated on this?

    Thursday, January 5, 2012 6:48 AM
  • You can write a COM wrapper in C# around the simple speech sample, which already works, and then call its functions directly from C++. Search for how to write a COM component in C#. Starting from the sample, save the result of each recognition event to a local variable, and then call a function from C++ (from a timer or whatever suits you) to fetch the results.
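
    The "save the result, then poll for it" hand-off could look roughly like this on the C++ side. It is only a sketch of the pattern: `LatestResult` is a hypothetical class, and the COM wrapper and the timer themselves are not shown. A mutex is used on the assumption that the recognition event and the polling timer may run on different threads.

```cpp
#include <mutex>
#include <string>
#include <utility>

// Hypothetical holder for the most recent recognition result.
// The event handler calls store(); the application's timer calls take().
class LatestResult {
public:
    // Called whenever a recognition event fires; overwrites any older result.
    void store(std::string text) {
        std::lock_guard<std::mutex> lock(mutex_);
        text_ = std::move(text);
        hasValue_ = true;
    }

    // Called from the polling side. Returns true and fills `out` if a new
    // result has arrived since the last call; returns false otherwise.
    bool take(std::string& out) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (!hasValue_) return false;
        out = std::move(text_);
        text_.clear();
        hasValue_ = false;
        return true;
    }

private:
    std::mutex mutex_;
    std::string text_;
    bool hasValue_ = false;
};
```

    Only the newest result is kept, so if several phrases are recognized between two polls, the earlier ones are dropped. A queue would be needed if every result matters.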
    Tuesday, January 17, 2012 7:57 PM
  • But can I do that and receive video via C++ simultaneously?
    Thursday, January 26, 2012 11:37 PM