SpeechSynthesizer memory fragmentation (leak)

    Question

  • Hi,

    I'm trying to add some text-to-speech capabilities to an existing WPF application, but I'm running into problems with increased memory usage.

    For a simple program like this:

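    using System.Speech.Synthesis;
    using System.Threading;

    class Program
    {
        static SpeechSynthesizer m_speech;

        static void Main(string[] args)
        {
            // One synthesizer is kept alive for the whole run.
            m_speech = new SpeechSynthesizer();
            for (int i = 0; i < 2000000; i++)
            {
                // Queue one short prompt per second without waiting for it to finish.
                m_speech.SpeakAsync("hello");
                Thread.Sleep(1000);
            }
        }
    }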

    Running it in the CLR Profiler for 5 minutes shows very fragmented memory with WAV headers pinned, and the garbage collector unable to compact the heap (or completely collect internal objects). Even using multiple synths and disposing of them after a number of SpeakAsync() calls does not unpin the handles.

    What's the proper way to use the SpeechSynthesizer so as to avoid these memory problems?

    Thanks in advance.

    PS: screenshots of CLRProfiler Timeline and Objects by Address views here

    • Moved by Yves.Z Friday, February 04, 2011 8:43 AM Speech related (From:Windows Presentation Foundation (WPF))
    Monday, January 24, 2011 4:05 PM

All replies

  • SpeechSynthesizer implements IDisposable, so you should be calling Dispose on it. Try that and see if you still have the leaks.

    Edit: actually, I missed your comment about having tried Dispose, but it isn't in your code. Your sleep is only 1 second long and your loop runs 2000000 times; doesn't it take longer than 1 second to say "hello"? I would try a more reasonable test: a loop of 100 iterations with a 3 to 5 second sleep, then dispose and run the GC. Remember that this is an async call.
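
    A minimal sketch of the kind of bounded test described above. It assumes the synchronous Speak() call in place of SpeakAsync() plus a sleep, so each utterance finishes before the next starts; the class name BoundedSpeechTest is made up for illustration. GC.GetTotalMemory only reports the size of the managed heap, so it will not show fragmentation or pinned handles; a profiler is still needed for that.

```csharp
using System;
using System.Speech.Synthesis;

// Hypothetical test harness, not taken from the thread.
class BoundedSpeechTest
{
    static void Main()
    {
        long before = GC.GetTotalMemory(true);

        using (var synth = new SpeechSynthesizer())
        {
            for (int i = 0; i < 100; i++)
            {
                // Speak() blocks until the utterance has finished,
                // so no Thread.Sleep is needed between calls.
                synth.Speak("hello");
            }
        } // Dispose() is called here.

        // Force a full collection, let finalizers run, then collect again.
        GC.Collect();
        GC.WaitForPendingFinalizers();
        GC.Collect();

        long after = GC.GetTotalMemory(true);
        Console.WriteLine("Managed heap before: {0} bytes, after: {1} bytes", before, after);
    }
}
```

    Taking the profiler snapshot only after the final collection separates objects that are merely awaiting collection from those that stay pinned.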

     


    John Fenton
    Wordmasters Direct Mail and Data Processing Services
    Monday, January 24, 2011 4:51 PM
  • The timings are irrelevant, since in the worst case some of the SpeakAsync() calls will simply not be serviced. I could also have used Speak(), which is a synchronous call, and removed the Sleep. That doesn't fix the issue: dangling handles are left in memory, and they cause the fragmentation.

    Dispose() should indeed be called once the object is no longer needed. Unfortunately, I need that object for an extended period during the lifetime of the process, and calling Speak() 2 times or 1000 times should not make any difference to the memory used (fragmented) in the long run.

    Here's the program I also tried, which disposes of a synth after 5 Speaks in an attempt to "force" unpinning of those WAV headers (CLR Profiler yields similar Timeline and Objects by Address views to the ones above):

    using System;
    using System.Collections.Generic;
    using System.Speech.Synthesis;
    using System.Threading;
    using System.ComponentModel;

    namespace Text2SpeechLeak
    {
        class Program
        {
            class SynthHandler : IDisposable
            {
                SpeechSynthesizer m_synth;
                int m_timesSpoken;

                public event EventHandler<SpeakCompletedEventArgs> SpeakCompleted;

                public int TimesSpoken
                {
                    get { return m_timesSpoken; }
                    set { m_timesSpoken = value; }
                }

                public SynthHandler()
                {
                    m_synth = new SpeechSynthesizer();
                    m_synth.SpeakCompleted += new EventHandler<SpeakCompletedEventArgs>(m_synth_SpeakCompleted);
                    m_timesSpoken = 0;
                }

                void m_synth_SpeakCompleted(object sender, SpeakCompletedEventArgs e)
                {
                    m_timesSpoken++;
                    if (SpeakCompleted != null)
                    {
                        SpeakCompleted(this, e);
                    }
                }

                public void Dispose()
                {
                    m_synth.SpeakCompleted -= m_synth_SpeakCompleted;
                    m_synth.Dispose();
                    SpeakCompleted = null;
                }

                public void Speak(string text)
                {
                    m_synth.SpeakAsync(text);
                }

            }


            class Worker
            {
                Stack<SynthHandler> m_stack;
                const int NUM_SYNTHS = 3;
                const int MAX_SPEAKS = 5;
                const int NUM_ITERATIONS = 300;
                const int SLEEP_TIME = 3000;
                string SPEAK_TEXT = "hello";

                AutoResetEvent m_event;

                public Worker(AutoResetEvent ev)
                {
                    m_event = ev;
                }

                public void DoWork(object sender, DoWorkEventArgs e)
                {
                    m_stack = new Stack<SynthHandler>();
                    for (int i = 0; i < NUM_SYNTHS; i++)
                    {
                        var synth = new SynthHandler();
                        synth.SpeakCompleted += new EventHandler<SpeakCompletedEventArgs>(synth_SpeakCompleted);
                        m_stack.Push(synth);
                    }

                    for (int i = 0; i < NUM_ITERATIONS; i++)
                    {
                        if (m_stack.Count == 0)
                            continue;

                        var synth = m_stack.Pop();
                        synth.Speak(SPEAK_TEXT);
                        Thread.Sleep(SLEEP_TIME);
                    }

                    m_event.Set();
                }

                void synth_SpeakCompleted(object sender, SpeakCompletedEventArgs e)
                {
                    var synth = (SynthHandler)sender;

                    if (synth.TimesSpoken >= MAX_SPEAKS)
                    {
                        synth.SpeakCompleted -= synth_SpeakCompleted;
                        synth.Dispose();
                        synth = new SynthHandler();
                        synth.SpeakCompleted += new EventHandler<SpeakCompletedEventArgs>(synth_SpeakCompleted);
                    }

                    m_stack.Push(synth);
                }
            }
           
            static void Main(string[] args)
            {
                var m_event = new AutoResetEvent(false);
                BackgroundWorker bgw = new BackgroundWorker();
                var worker = new Worker(m_event);
                bgw.DoWork += new DoWorkEventHandler(worker.DoWork);

                bgw.RunWorkerAsync();

                m_event.WaitOne();

            }
        }
    }

    Monday, January 24, 2011 8:34 PM
  • I kind of put my foot in my mouth when I missed that you had tried Dispose. Hopefully one of the Microsoft guys, or someone with more memory leak experience than me, will come along and help clear this up. It's not my area of expertise.

    One thing about this does confuse me a bit. Are you waiting for the loop to complete and the garbage collector to run before you analyze the results in the CLR Profiler? If the loop is still running, I'm not sure I would be surprised by the fragmentation and the dangling handles. The question is whether they still exist after the loop has completed, the objects have been disposed, and the GC has done its work. If the GC is able to fully clean up once the loop has completed, then I don't think you have a leak.

    I would somewhat expect the fragmentation here, since you are using the speech synthesizer in a way that doesn't fit what I would consider normal usage. I'm not sure what your usage scenario is, but this is how I would approach the problem to reduce fragmentation. First, I would use only one SpeechSynthesizer and not dispose of it until I was done with it. Second, rather than sending it individual words, I would queue the words and send longer strings to it, waiting until it completed the previous string before sending the next one. You'll need to divide the text into strings small enough for the synth to handle. (From playing with your code, it has a pretty small limit.) I don't think the class was ever designed to be a full-bore text reader, but if you handle it like that, it might work.
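
    The single-synth, queued approach above could be sketched roughly as follows. QueuedSpeaker and its members are hypothetical names, not part of System.Speech; the sketch joins all pending words into one string each time the previous utterance completes. It does not cap the string length, because the limit mentioned above is not specified, and it is not known to remove the pinned WAV headers; it only reduces the number of SpeakAsync() calls.

```csharp
using System;
using System.Collections.Generic;
using System.Speech.Synthesis;

// Hypothetical helper: one long-lived synthesizer fed from a queue.
sealed class QueuedSpeaker : IDisposable
{
    private readonly SpeechSynthesizer _synth = new SpeechSynthesizer();
    private readonly Queue<string> _pending = new Queue<string>();
    private readonly object _lock = new object();
    private bool _speaking;

    public QueuedSpeaker()
    {
        _synth.SpeakCompleted += OnSpeakCompleted;
    }

    // Adds text to the queue; starts speaking only if the synth is idle.
    public void Enqueue(string text)
    {
        lock (_lock)
        {
            _pending.Enqueue(text);
            if (!_speaking)
            {
                SpeakPendingLocked();
            }
        }
    }

    // Called when the previous utterance finishes; sends whatever
    // has accumulated in the meantime as a single string.
    private void OnSpeakCompleted(object sender, SpeakCompletedEventArgs e)
    {
        lock (_lock)
        {
            _speaking = false;
            if (_pending.Count > 0)
            {
                SpeakPendingLocked();
            }
        }
    }

    // Must be called while holding _lock.
    private void SpeakPendingLocked()
    {
        string text = string.Join(" ", _pending.ToArray());
        _pending.Clear();
        _speaking = true;
        _synth.SpeakAsync(text);
    }

    public void Dispose()
    {
        _synth.SpeakCompleted -= OnSpeakCompleted;
        _synth.Dispose();
    }
}
```

    A caller would create one QueuedSpeaker for the lifetime of the application, call Enqueue("cat") whenever there is something to say, and call Dispose() on shutdown.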

    Like I said, hopefully someone with more experience in this area than me comes along and helps out. But that's my 2 cents.


    John Fenton
    Wordmasters Direct Mail and Data Processing Services
    Tuesday, January 25, 2011 3:44 AM
  • It's no problem at all. The way you would use the Synth is exactly how I would envision it should work too. Seems logical and to the point.

    The use case is something along the lines of: a user clicks on an item on the screen (image, button, custom control) and the synth will Speak() the associated string ("cat", "dog", "Apply"). This is why I'm only sending one word to the synth at a time and can't really concatenate words. There are multiple synths because, if the app is used via multitouch by several users concurrently (let's say 4), it would be... "cool"... if every user had their own dedicated synth.

    The GC does run: the Timeline view shows a reversed sawtooth wave, and at each drop in the wave the GC cleans up memory (or supposedly does).

    Thanks for the help nonetheless. Much appreciated.

    Tuesday, January 25, 2011 2:46 PM
  • The use case is something along the lines of: a user clicks on an item on the screen (image, button, custom control) and the synth will Speak() the associated string ("cat", "dog", "Apply"). This is why I'm only sending one word to the synth at a time and can't really concatenate words. There are multiple synths because, if the app is used via multitouch by several users concurrently (let's say 4), it would be... "cool"... if every user had their own dedicated synth.

    Cool, now your tests make sense! I hadn't thought of children (even fully grown ones) banging away at a keyboard or a touch screen. I agree that a synth for each user would be more fun for that.

    I would do what I can to keep the calls to the synths under control. Part of that is in the programming, but part of it is also in the game design. Good game design will usually have slower periods followed by flurries of activity and then a return to quieter play. The slower parts should allow the GC to clean up some of the mess you've created.

    Sounds like a fun project!  Best of luck with it.


    John Fenton
    Wordmasters Direct Mail and Data Processing Services
    Wednesday, January 26, 2011 5:10 AM
  • Hi animosity,

    Welcome to MSDN Forums!

    I suggest you post your question at the Gotspeech.NET forum, where the speech development experts are, because you are likely to get quicker and better responses to speech development issues there: http://gotspeech.net/forums/34/ShowForum.aspx

    Have a nice day!

    Best regards


    Yves Zhang [MSFT]

    Wednesday, January 26, 2011 9:41 AM
  • Hello Animosity (and others),

    I am experiencing a very similar (if not exactly the same) issue. I have logged it with Microsoft at the link below and provided them with sample code and dumps.

    http://connect.microsoft.com/VisualStudio/feedback/details/664196/system-speech-has-a-memory-leak

    It will help get this sorted if other people who are experiencing the problem let them know by posting in the issue and by clicking the up arrow to increase the count of users who can reproduce the bug, along with any extra information you might have.

    I also see a post here: http://gotspeech.net/forums/post/11208.aspx about it, which might be your post.  I will be posting there as soon as my account is approved.

    There seem to be quite a few memory leaks in System.Speech, but with posts scattered across different threads all around the Internet, Microsoft may not realise how many people are experiencing them. Probably the only way we'll get this fixed is to log the issue officially with Microsoft. I've done this; now I need other people to contribute to it so that it moves up their priority list. It looks like Microsoft need to spend some time on System.Speech as a whole.

    Anyway, the link is there, so if anybody can contribute, I think that should help, and thanks in advance.

    Thanks,

    Eoghan.

    Monday, May 09, 2011 2:56 PM