Kinect Speech not getting RecognizerInfo

    Question

  • Hi guys,

    I've been racking my brain trying to get this to work; maybe someone here can help.

    I'm trying to add speech to an application that normally uses gestures. I'm using the KinectSensorChooser, and straight after the SkeletonStream is enabled I start the sensor, then I try to start the voice recognition.

    I'm using the turtle example, which works fine so everything is installed properly.

    I've traced through the code and found that the RecognizerInfo isn't being found.

    Note that the "test" MessageBox never shows:

    private static RecognizerInfo GetKinectRecognizer()
    {
        foreach (RecognizerInfo recognizer in SpeechRecognitionEngine.InstalledRecognizers())
        {
            MessageBox.Show("test");
            string value;
            recognizer.AdditionalInfo.TryGetValue("Kinect", out value);
            if ("True".Equals(value, StringComparison.OrdinalIgnoreCase) && "en-US".Equals(recognizer.Culture.Name, StringComparison.OrdinalIgnoreCase))
            {
                return recognizer;
            }
        }

        return null;
    }

    Any ideas why? I've been stuck on this for half the day and can't find a solution.

    I read that there's a bug in which the audio stream stops if the skeleton stream starts after it, so I changed the order in which they start, but it made no difference.

    Can anybody shed some light on this?

    Friday, March 29, 2013 6:27 PM

All replies

  • The SpeechRecognitionEngine doesn't have a direct dependency on the Kinect since it is a separate SDK. You should be able to call this function and the object should iterate through available recognizers. Does this happen the first time you start the application?

    http://msdn.microsoft.com/en-us/library/system.speech.recognition.speechrecognitionengine.installedrecognizers.aspx
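
    As a quick check, a minimal sketch along these lines lists what the runtime actually reports. The class and method names are made up for illustration, and it assumes Microsoft.Speech.Recognition is referenced (RecognizerInfo in System.Speech.Recognition exposes the same members). A count of zero would mean no recognizers are visible to the process at all, rather than the Kinect filter rejecting them.

```csharp
using System.Text;
using Microsoft.Speech.Recognition; // assumption: use System.Speech.Recognition instead if that is what the project references

// Hypothetical helper, not part of any SDK sample.
static class RecognizerDiagnostics
{
    // Builds a readable list of every recognizer the speech runtime reports,
    // without filtering on the "Kinect" key or on culture.
    public static string DescribeInstalledRecognizers()
    {
        var recognizers = SpeechRecognitionEngine.InstalledRecognizers();
        var report = new StringBuilder();
        report.AppendLine("Recognizers found: " + recognizers.Count);

        foreach (RecognizerInfo recognizer in recognizers)
        {
            string kinectFlag;
            if (!recognizer.AdditionalInfo.TryGetValue("Kinect", out kinectFlag))
            {
                kinectFlag = "(no Kinect key)";
            }

            // Culture.Name is the value the filter compares against, e.g. "en-US".
            report.AppendLine(recognizer.Id + " | " + recognizer.Culture.Name + " | Kinect=" + kinectFlag);
        }

        return report.ToString();
    }
}
```

    Calling RecognizerDiagnostics.DescribeInstalledRecognizers() once at startup and showing the result in a MessageBox or the debug output should be enough to see whether the list is empty or whether the culture check is what filters the recognizer out.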

    Given the samples work, ensure you haven't started the sensor first. The initialization with Kinect comes later.

    Since you stated you are using the KinectSensorChooser, you might want to look at the InitializeKinectServices function in the ShapesGame sample, since the other samples use the KinectSensor directly. If you are using the KinectSensorChooserUI element, is the IsListening property set to true?

    <toolkit:KinectSensorChooserUI x:Name="SensorChooserUI" IsListening="True"  HorizontalAlignment="Center" VerticalAlignment="Top" />

    Friday, March 29, 2013 10:56 PM
    Owner
  • Sorry, you'll have to excuse me; I'm new to Kinect programming as well as WPF in general.

    I'm not using the KinectSensorChooserUI, and I don't know what that is. I'm just using the KinectSensorChooser from the WPFViewers project.

    Looking at the doc you linked, I should have mentioned that I'm using Microsoft.Speech.Recognition, not System.Speech, if that makes a massive difference?

    At no point in the application does the MessageBox show, so the body of the foreach loop isn't running even once.

    My KinectSensorChooser component doesn't have an IsListening property?

    Maybe it would help if I post more code.

    The application starts by calling this class to initialize the Kinect:

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text;
    using System.Threading.Tasks;
    
    using Microsoft.Kinect;
    using Microsoft.Kinect.Toolkit;
    
    namespace WpfApplication1
    {
        class KinectDataReadIn
        {
            KinectSensorChooser sensorChooser = new KinectSensorChooser();
            GestureRecognition GestureRecognizer = new GestureRecognition();
            VoiceRecognition SpeechRecognition = new VoiceRecognition();
    
            //Declare Skeletons
            const int skeletonCount = 6;
            Skeleton[] allSkeletons = new Skeleton[skeletonCount];
    
            public void StartKinect()
            {
                sensorChooser.KinectChanged += new EventHandler<KinectChangedEventArgs>(sensorChooser_KinectChanged);
                sensorChooser.Start();
                
            }
    
            //Looks for Kinect until one is connected then starts it
            void sensorChooser_KinectChanged(object sender, KinectChangedEventArgs e)
            {
    
                KinectSensor oldSensor = (KinectSensor)e.OldSensor;
                StopKinect(oldSensor);
    
                KinectSensor newSensor = (KinectSensor)e.NewSensor;
    
                if (newSensor == null)
                {
                    return;
                }
    
                newSensor.ColorStream.Enable();
                newSensor.DepthStream.Enable();
                newSensor.SkeletonStream.Enable();
               // newSensor.AudioSource.Start();
                newSensor.AllFramesReady += new EventHandler<AllFramesReadyEventArgs>(newSensor_AllFramesReady);
    
                try
                {
                    newSensor.Start();
                }
                catch (System.IO.IOException)
                {
    
                    sensorChooser.TryResolveConflict();
                }
    
                isVoiceData(newSensor);
            }
    
            void newSensor_AllFramesReady(object sender, AllFramesReadyEventArgs e)
            {
                //Returns void if not, else sends to recog class
                isGestureData(e);
    
                //implement voice logic here.               
    
            }
    
            private void isVoiceData(KinectSensor sensor)
            {
                SpeechRecognition.CheckSpeech(sensor);
            }
    
            private void isGestureData(AllFramesReadyEventArgs e)
            {
                using (SkeletonFrame skeletonFrameData = e.OpenSkeletonFrame())
                {
                    if (skeletonFrameData == null)
                        return;
    
                    skeletonFrameData.CopySkeletonDataTo(allSkeletons);
    
                    Skeleton user = (from s in allSkeletons
                                     where s.TrackingState == SkeletonTrackingState.Tracked
                                     select s).FirstOrDefault();
                    if (user == null)
                        return;
                    else
                    {
                        
                        GestureRecognizer.findGesture(user);
                    }
                }
            }
    
            //Stops Kinect when program exits
            public void StopKinect(KinectSensor sensor)
            {
                if (sensor != null)
                {
                    if (sensor.IsRunning)
                    {
                        sensor.Stop();
                        if (sensor.AudioSource != null)
                        {
                            sensor.AudioSource.Stop();
                        }
                    }
                }
    
            }
        }
    }
    

    Then I'm trying to deal with the audio in the VoiceRecognition class:

    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text;
    using System.Threading.Tasks;
    
    using Microsoft.Speech.AudioFormat;
    using Microsoft.Speech.Recognition;
    using Microsoft.Kinect;
    using System.Windows;
    
    namespace WpfApplication1
    {
        class VoiceRecognition
        {
            private SpeechRecognitionEngine speechEngine;
    
            public void CheckSpeech(KinectSensor sensor)
            {
                RecognizerInfo ri = GetKinectRecognizer();
    
                if (null != ri)
                {
                    this.speechEngine = new SpeechRecognitionEngine(ri.Id);
    
    
                    var directions = new Choices();
                    directions.Add(new SemanticResultValue("forward", "FORWARD"));
                    directions.Add(new SemanticResultValue("forwards", "FORWARD"));
                    directions.Add(new SemanticResultValue("straight", "FORWARD"));
                    directions.Add(new SemanticResultValue("backward", "BACKWARD"));
                    directions.Add(new SemanticResultValue("backwards", "BACKWARD"));
                    directions.Add(new SemanticResultValue("back", "BACKWARD"));
                    directions.Add(new SemanticResultValue("turn left", "LEFT"));
                    directions.Add(new SemanticResultValue("turn right", "RIGHT"));
    
                    var gb = new GrammarBuilder { Culture = ri.Culture };
                    gb.Append(directions);
    
                    var g = new Grammar(gb);
    
    
                    speechEngine.SpeechRecognized += SpeechRecognized;
    
    
                    speechEngine.SetInputToAudioStream(
                        sensor.AudioSource.Start(), new SpeechAudioFormatInfo(EncodingFormat.Pcm, 16000, 16, 1, 32000, 2, null));
                    speechEngine.RecognizeAsync(RecognizeMode.Multiple);
                }
            }
    
            private static RecognizerInfo GetKinectRecognizer()
            {
                
                foreach (RecognizerInfo recognizer in SpeechRecognitionEngine.InstalledRecognizers())
                {
                    MessageBox.Show("test");
                    string value;
                    recognizer.AdditionalInfo.TryGetValue("Kinect", out value);
                    if ("True".Equals(value, StringComparison.OrdinalIgnoreCase) && "en-UK".Equals(recognizer.Culture.Name, StringComparison.OrdinalIgnoreCase))
                    {
    
                        return recognizer;
                    }
                }
    
                return null;
            }
    
            private void SpeechRecognized(object sender, SpeechRecognizedEventArgs e)
            {
                // Speech utterance confidence below which we treat speech as if it hadn't been heard
                const double ConfidenceThreshold = 0.3;
    
                if (e.Result.Confidence >= ConfidenceThreshold)
                {
                    switch (e.Result.Semantics.Value.ToString())
                    {
                        case "FORWARD":
                            MessageBox.Show("Forward");
                            break;
    
                        case "BACKWARD":
                            MessageBox.Show("Backward");
                            break;
    
                        case "LEFT":
                            MessageBox.Show("Left");
    
                            break;
    
                        case "RIGHT":
                            MessageBox.Show("Right");
                            break;
                    }
                }
            }
    
        }
    
    }


    Hopefully you can spot the problem now; I've wasted way too much time on this...

    Cheers

    <Edit> Just realized the 1.7 SDK was released... Should mention I'm still using 1.6 and I have to stick with it. </Edit>

    Wednesday, April 03, 2013 1:43 PM
  • <Bump>

    The deadline for this project is pretty soon, and I really need this feature of the application to work.

    If anyone can tell me why it just refuses to work, that would be most awesome.

    Cheers,

    Friday, April 05, 2013 12:00 PM