Latency for realtime audio in CLR.

  • Question

  • I'm pushing the envelope, trying to synthesize real-time audio from a CLR thread. It works... almost. I'm having some problems with very intermittent dropouts, and would appreciate any advice on how to move forward. Yes, I know this sounds quite mad, but everything in the docs and the results of experiments so far seem to indicate that it can be made to work.

    I'm working on an experimental audio project in WPF. I decided to take the "LowLatency" LatencyMode setting at face value and try generating realtime audio in CLR code. FWIW, I'm willing to consider fairly radical solutions, up to and including a custom build of Mono for hosting an audio service process. But at this point, I suspect that I have a more subtle problem.

    The current state of affairs:
    • Audio is generated via custom AudioExpression objects. Similar to a LINQ Expression, AudioExpressions are compiled into dynamic assemblies at runtime. Audio code is IL-emitted and pre-JITted. Except for a very small piece of unsafe code that copies generated audio into WASAPI audio buffers, managed arrays are used in preference to pointers and pins, because I'd like to be able to run in a partial-trust environment; I can convert if that buys me anything.
    • Audio code allocates infrequently, but an event management and scheduling system performs occasional small allocations -- primarily for a continuation coding style for control events that are layered over iterators. I don't think I can totally avoid memory allocations, because delegates and/or iterators are required at a very fundamental level, and these do allocate small objects.
    • Audio control variables are bindable to Dependency objects via a custom asynchronous binding mechanism (polling from the UI thread with DispatcherTimers).
    • WASAPI unmanaged-to-managed transitions in the audio service thread are handled with managed C++ "It Just Works" code. Transition code should be fine: two float* parameters cross the unmanaged/managed boundary (if any), and the transition can be executed at an entirely adequate rate to satisfy 10-ms callbacks.
    The system is currently running on a Core Audio (WASAPI) driver with frame-to-frame audio latency of about 15 ms (2x per video refresh?). (This seems to have changed in 3.5 SP1 from a documented frame-to-frame latency of 30 ms.)
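
    For a sense of the deadline involved, a callback period can be converted to a frame count. This is a minimal sketch: the 48 kHz sample rate used in the note below is an assumption (the post does not state the rate), and frames_per_period is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: number of audio frames that must be produced per callback
// for a given sample rate (Hz) and callback period (milliseconds).
// Integer division truncates any fractional frame.
std::uint64_t frames_per_period(std::uint32_t sample_rate_hz, std::uint32_t period_ms) {
    return static_cast<std::uint64_t>(sample_rate_hz) * period_ms / 1000u;
}
```

    Under that assumption, a 10 ms callback has to deliver 480 frames and a 15 ms period 720 frames; any stall longer than the period shows up as a dropout.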

    Without UI, the system runs quite well. Unfortunately, when running in the same process as a rich WPF UI, there are occasional dropouts; but, surprisingly, they don't seem to be GC related. Gen 2 collections do cause dropouts, but gen 2 collections are exceedingly rare. Gen 0 collections occur continuously (about 1 per second), without apparent ill effect; gen 1 collections occur occasionally (2 or 3 a minute), also without ill effect. I have a button on the UI that forces gen 1 collections. Clicking it continuously to force calls to GC.Collect(1) does NOT cause audio dropouts.

    The result is that I get very occasional audio dropouts -- maybe one every minute or two under heavy UI activity, and almost never when the UI is idle. Interestingly, default buttons -- which have a blue animated glow when focused -- aggravate the problem. The issue may be related to gen 0 collections, but I don't think so. Gen 0 collections occur pretty much continuously, and forced gen 0 collections never trigger a dropout. The dropouts are not gen 1 or gen 2 GC related, because they occur even if no gen 1 or gen 2 collection has occurred. So, I don't think the problem is garbage collection.

    Current process configuration is: Workstation GC, background garbage collection. I have tried setting LatencyMode to LowLatency permanently, and also setting LatencyMode to LowLatency only while filling the WASAPI audio buffer. The only difference seems to be that permanent LowLatency causes gen 1 collections to run more often; but GCs don't cause dropouts under either setting. The process is running with Realtime priority; the realtime audio thread (on which the audio buffers are filled) runs at high priority. I have not -- so far -- messed with OS thread settings, although I do plan to pin the audio thread to a specific processor via SetThreadAffinityMask at some point, because it makes it easier to measure CPU use.

    I have seen *some* evidence that locks don't have priority inversion. If someone could confirm this, that would be extremely useful. I do have a number of locks that could be converted to non-blocking synchronization mechanisms if that's the problem. I *think*, but can't confirm with certainty, that the problem is aggravated by lock calls on objects even when the lock is acquired without blocking. Removing a lock that gets called once per buffer fill seems to improve (but not completely eliminate) the problem, although there are still other lock calls.
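
    One common shape for the non-blocking mechanisms mentioned above is a single-producer/single-consumer ring buffer, where the UI side posts values and the audio side drains them without either side ever taking a lock. The sketch below is illustrative only: it is standard C++ rather than CLR code, SpscQueue and its members are hypothetical names, and it assumes exactly one producer thread and one consumer thread.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>

// Hypothetical single-producer/single-consumer queue (names are illustrative).
// Exactly one thread may call push() and exactly one thread may call pop().
template <typename T, std::size_t Capacity>
class SpscQueue {
    static_assert(Capacity >= 2 && (Capacity & (Capacity - 1)) == 0,
                  "Capacity must be a power of two");
public:
    // Producer only. Returns false instead of blocking when the queue is full.
    bool push(const T& value) {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        const std::size_t tail = tail_.load(std::memory_order_acquire);
        if (head - tail == Capacity) return false;  // full
        slots_[head & (Capacity - 1)] = value;
        head_.store(head + 1, std::memory_order_release);  // publish the slot
        return true;
    }

    // Consumer only. Returns false instead of blocking when the queue is empty.
    bool pop(T& out) {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        const std::size_t head = head_.load(std::memory_order_acquire);
        if (head == tail) return false;  // empty
        out = slots_[tail & (Capacity - 1)];
        tail_.store(tail + 1, std::memory_order_release);  // free the slot
        return true;
    }

private:
    std::array<T, Capacity> slots_{};
    std::atomic<std::size_t> head_{0};  // next write index; written by the producer only
    std::atomic<std::size_t> tail_{0};  // next read index; written by the consumer only
};
```

    Because push and pop fail fast rather than wait, the audio thread can never be held up by a lower-priority thread that owns a lock; the cost is that the producer has to decide what to do when the queue is full.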

    I'm also not completely clear on the circumstances in which the realtime audio thread can be suspended while a garbage collection is in progress. The documentation seems to suggest that gen 0 allocations can proceed even with a GC in progress on the background thread (which would be great). Presumably, the realtime thread has to be temporarily suspended in order to mark live objects on the stack. The fact that I survive gen 0 collections without harm seems to suggest that the temporary suspension is relatively harmless. What I'm not clear on, though, is whether certain access patterns would suspend the thread in a more dramatic fashion: e.g. does pinning a double[] suspend the thread? Does calling lock(object) suspend the thread even when the lock is successfully acquired? (I strongly suspect that it does.)

    I'm also not sure whether Workstation GC is really the best host mode. There's some reason to believe that Server GC might be better than Workstation GC. Also worth considering: being able to totally disable GC would be a solution. The entire test app runs in 32 MB, and I'd be perfectly content to leak a GB of memory in a dedicated audio process. 1 GB of leaking, at a rate of 1 or 2 KB per second, would keep an audio synthesis process alive for a very very long time.
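
    The leak budget can be put in numbers with a one-line calculation. This is only a sketch of the arithmetic: days_until_exhausted is a hypothetical helper, and the figures in the note below assume binary units (1 GB = 1024^3 bytes, 1 KB = 1024 bytes) and a perfectly steady leak rate.

```cpp
#include <cassert>

// Hypothetical helper: days until `budget_bytes` is used up at a steady leak
// of `bytes_per_second` (86400 seconds per day).
double days_until_exhausted(double budget_bytes, double bytes_per_second) {
    assert(bytes_per_second > 0.0);
    return budget_bytes / bytes_per_second / 86400.0;
}
```

    Under those assumptions, 1 GB lasts roughly 12 days at 1 KB per second and roughly 6 days at 2 KB per second.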

    Any theories or advice appreciated. Yes, I know this sounds quite mad, but everything in the documentation and the results so far seem to suggest that it could be made to work.

    Part of the problem is that there doesn't seem to be a good way to catch the dropout in a debugger. There does appear to be a thread that terminates at about the time of the error. But WASAPI calls from the audio service thread do not return HRESULT errors of any kind.
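
    When a debugger can't catch the event, one option is to instrument the buffer-fill callback itself and record whenever the gap between two consecutive callbacks exceeds the expected period. The sketch below is an assumption-laden illustration, not the poster's code: DropoutDetector and its members are hypothetical names, it is written in standard C++, and the period and tolerance values are chosen by the caller.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>

// Hypothetical helper: flags buffer-fill callbacks that arrive later than expected.
class DropoutDetector {
public:
    using Clock = std::chrono::steady_clock;

    // period: nominal time between callbacks.
    // tolerance: extra slack allowed before a gap counts as late.
    DropoutDetector(Clock::duration period, Clock::duration tolerance)
        : limit_(period + tolerance) {
        assert(period > Clock::duration::zero());
    }

    // Call once per buffer fill with the current time. Returns true if the gap
    // since the previous call exceeded period + tolerance.
    bool on_callback(Clock::time_point now) {
        const bool late = has_last_ && (now - last_) > limit_;
        if (late) {
            ++late_count_;
            if (now - last_ > worst_gap_) worst_gap_ = now - last_;
        }
        last_ = now;
        has_last_ = true;
        return late;
    }

    std::size_t late_count() const { return late_count_; }
    Clock::duration worst_gap() const { return worst_gap_; }

private:
    Clock::duration limit_;
    Clock::time_point last_{};
    Clock::duration worst_gap_{Clock::duration::zero()};
    std::size_t late_count_ = 0;
    bool has_last_ = false;
};
```

    Calling on_callback(DropoutDetector::Clock::now()) at the top of each buffer fill keeps the check allocation-free and lock-free, and the counters can be polled from another thread's log or UI after the fact (making them atomic if they are read concurrently).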

    Tuesday, November 4, 2008 5:15 PM


  • And the answer....

    I needed the Vista-ism for "no really, I really want high priority". That gets me really, really, really close. There are still occasional drops when tabbing between applications.

    Code is Managed C++; Vista-specific calls should be converted to dynamic calls:

        DWORD_PTR dwProcessAffinityMask;  
        DWORD_PTR dwSystemAffinityMask;  
        HANDLE hThread;  
        HANDLE hAvTask = 0;  
            // Pin .net thread to current OS thread.  
            hThread = GetCurrentThread();  
            // Native highest priority is higher than .net highest priority.  
            int priority = ::GetThreadPriority(hThread);  
            // Prevent Dwm from pre-empting audio thread.  
            // boost the whole process.  
            System::Diagnostics::Process::GetCurrentProcess()->PriorityClass = System::Diagnostics::ProcessPriorityClass::RealTime;  
            // STUB: should be dynamically loaded for downlevel platforms  
            // Ask MMCSS to boost the thread priority  
            DWORD taskIndex = 0;  
            hAvTask = AvSetMmThreadCharacteristics(TEXT("Pro Audio"), &taskIndex);  
            hr = m_pAudioClient->Start();  // Start playing.  
            ... service the audio client....  
             } catch .... {  
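        } finally {  
            // Release the MMCSS registration, then downgrade the process to normal priority.  
            if (hAvTask != 0)  
                AvRevertMmThreadCharacteristics(hAvTask);  // assumed: this call appears to be missing from the excerpt  
            System::Diagnostics::Process::GetCurrentProcess()->PriorityClass = System::Diagnostics::ProcessPriorityClass::Normal;  
            System::Runtime::GCSettings::LatencyMode = defaultLatencyMode;  
        }  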
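
    The "should be converted to dynamic calls" note could be handled by resolving the MMCSS entry points at run time, so that the binary still loads on pre-Vista systems where avrt.dll does not exist. The sketch below is one possible shape, in plain C++ rather than Managed C++: MmcssRegistration is a hypothetical wrapper name, while avrt.dll, AvSetMmThreadCharacteristicsW and AvRevertMmThreadCharacteristics are the real export names. On non-Windows builds the wrapper compiles to a no-op.

```cpp
#include <cassert>
#ifdef _WIN32
#include <windows.h>
#endif

// Hypothetical RAII wrapper: registers the calling thread with MMCSS if avrt.dll
// is available, and reverts the registration on destruction.
class MmcssRegistration {
public:
    // Leaves the object inactive when avrt.dll or its exports cannot be found.
    explicit MmcssRegistration(const wchar_t* taskName) {
#ifdef _WIN32
        using SetFn = HANDLE(WINAPI*)(LPCWSTR, LPDWORD);
        module_ = ::LoadLibraryW(L"avrt.dll");
        if (module_ != nullptr) {
            auto set = reinterpret_cast<SetFn>(
                ::GetProcAddress(module_, "AvSetMmThreadCharacteristicsW"));
            DWORD taskIndex = 0;
            if (set != nullptr) task_ = set(taskName, &taskIndex);
        }
#else
        (void)taskName;  // non-Windows builds: nothing to register
#endif
    }

    ~MmcssRegistration() {
#ifdef _WIN32
        using RevertFn = BOOL(WINAPI*)(HANDLE);
        if (task_ != nullptr) {
            auto revert = reinterpret_cast<RevertFn>(
                ::GetProcAddress(module_, "AvRevertMmThreadCharacteristics"));
            if (revert != nullptr) revert(task_);
        }
        if (module_ != nullptr) ::FreeLibrary(module_);
#endif
    }

    MmcssRegistration(const MmcssRegistration&) = delete;
    MmcssRegistration& operator=(const MmcssRegistration&) = delete;

    // True only if MMCSS accepted the registration for this thread.
    bool active() const { return task_ != nullptr; }

private:
#ifdef _WIN32
    HMODULE module_ = nullptr;
    HANDLE task_ = nullptr;
#else
    void* task_ = nullptr;
#endif
};
```

    Constructing MmcssRegistration reg(L"Pro Audio"); at the top of the audio thread and letting it go out of scope when the thread exits would stand in for the direct AvSetMmThreadCharacteristics call and its cleanup; the object must be created and destroyed on the audio thread itself, since MMCSS registration applies to the calling thread.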
    • Marked as answer by Robin E Davies Wednesday, November 5, 2008 2:38 PM
    Wednesday, November 5, 2008 2:38 PM