MediaCapture performance issues

    Question

  • Hi,

We're trying to capture frames from the video stream frequently in order to analyze them for known patterns with our own algorithms. In the past we did the capturing using OpenCV (to be more precise, the wrapper OpenCvSharp) and got the frames from there, but since that isn't really available for Windows Store apps we're stuck with what's currently in the API. So we went with MediaCapture and CapturePhotoToStreamAsync:

    IRandomAccessStream stream = new InMemoryRandomAccessStream();
    await mCaptureElement.Source.CapturePhotoToStreamAsync(ImageEncodingProperties.CreateJpeg(), stream);
    mStreamList.Add(stream);

and analyze the generated stream list periodically in a separate thread. While this looked quite usable in the beginning, it turned out that the performance is acceptable only on a few devices. When we measure the time it takes to generate one stream, we get the following results:
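The analysis side then decodes each stream back into pixels, roughly like this (a sketch; AnalyzeFrame stands in for our own pattern matching):

```csharp
using System;
using System.Threading.Tasks;
using Windows.Graphics.Imaging;
using Windows.Storage.Streams;

async Task ProcessStreamAsync(IRandomAccessStream stream)
{
    stream.Seek(0);

    // Decode the captured JPEG back into a raw BGRA8 pixel buffer.
    BitmapDecoder decoder = await BitmapDecoder.CreateAsync(stream);
    PixelDataProvider pixelProvider = await decoder.GetPixelDataAsync(
        BitmapPixelFormat.Bgra8,
        BitmapAlphaMode.Ignore,
        new BitmapTransform(),
        ExifOrientationMode.IgnoreExifOrientation,
        ColorManagementMode.DoNotColorManage);

    byte[] pixels = pixelProvider.DetachPixelData();
    AnalyzeFrame(pixels, (int)decoder.PixelWidth, (int)decoder.PixelHeight);
}
```

So every captured frame goes through a full JPEG encode (in CapturePhotoToStreamAsync) and decode (here) before we ever see a pixel.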

    • Surface Pro: 50-70ms
    • Surface 2: 500-600ms
    • Nondescript Atom Tablet with Win 8.1 Pro: 1300-1500ms

So even if we only grab a frame every 500 ms, it really works only on the Surface Pro; on all other devices it slows the camera preview down to single-digit FPS rates. Now the question is: are we missing something, or is this just the way it is?

We also tried to use the (very poorly documented, so it was just a guess) LowLagPhotoCapture in the meantime, but the results weren't any better.

Do we really have to create unmanaged components and the entire interop layer back to C# just to grab frames from a video stream? I think this is quite a common requirement for things like face recognition, augmented reality, barcode scanning and so on.

Any help is greatly appreciated; we really want to avoid the complicated path for something that seems to be very normal in the modern app world!

    Saturday, May 10, 2014 6:46 PM

Answers

• You'll get much better behavior with a Media Foundation Transform (MFT) in C++/CX. That will let you work with the unencoded frame rather than taking significant time to encode it to a JPEG and then decode it back to a pixel buffer.

    This needs to be done from unmanaged code. Take a look at the MediaExtensions sample for an example.

    The LowLagPhotoCapture class will capture the photo with minimal shutter lag on the capture side, but it won't help reduce post-processing time.

    --Rob


    Saturday, May 10, 2014 6:55 PM
    Owner

All replies

  • Rob,

thanks a lot for your reply and the clarification on LowLagPhotoCapture. Unfortunately I can't follow the link you provided; it takes me right back to this thread instead of the sample you proposed. However, I googled it and I think I found it: http://code.msdn.microsoft.com/windowsapps/Media-extensions-sample-7b466096#content

We'll have a look at it asap.

    Monday, May 12, 2014 6:53 AM
  • Sorry about that. I've fixed the link, but you found the right one.
    Monday, May 12, 2014 12:38 PM
    Owner
• Hey, I ran into the same problem: CapturePhotoToStreamAsync(...) is, at roughly 100 ms per frame, way too slow for real-time applications. And I noticed that it does not depend on the ImageEncodingProperties. If I use BMP, which is, apart from the header, just uncompressed RGBA data, it is not faster. Specifying uncompressed explicitly results in an empty stream. I don't know why, but I can imagine that the webcam driver does not support this.

I also tried to use an MFT, which works fast. But it doesn't make much sense to me: the image from the webcam comes over USB into RAM, from there it's copied to the GPU, where I just copy the data into another texture (I modified the grayscale effect to just copy). After that the memory is mapped back to RAM and copied into a buffer that is then ready for further processing on the CPU (e.g. by OpenCV). I don't understand why this whole complicated copying is way faster than anything the MediaCapture API offers. I just want to access raw uncompressed data, 24 or 32 bits per pixel. Nothing else.

After I found this thread I tried to use the LowLagPhotoCapture. I configured it to capture photos from VideoPreview and started the preview (the rest is pretty much as in the tutorial). By the way: PrepareLowLagPhotoCaptureAsync(ImageEncodingProperties.CreateUncompressed(MediaPixelFormat.Bgra8)) works in this scenario.

But the result is the same: it takes about 100 ms per frame. Or is it possible to start the next capture before the current one has finished and build up a capture queue? I haven't tried that so far.
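For reference, my setup looks roughly like this (a sketch; mMediaCapture is the already-initialized MediaCapture instance):

```csharp
using Windows.Media.Capture;
using Windows.Media.MediaProperties;

// Prepare an uncompressed BGRA8 low-lag capture, as described above.
var props = ImageEncodingProperties.CreateUncompressed(MediaPixelFormat.Bgra8);
LowLagPhotoCapture lowLag =
    await mMediaCapture.PrepareLowLagPhotoCaptureAsync(props);

// Each capture still takes ~100 ms on my hardware.
CapturedPhoto photo = await lowLag.CaptureAsync();
CapturedFrame frame = photo.Frame;   // raw pixel data, frame.Width x frame.Height
// ... read and process the pixel data from the frame stream ...

await lowLag.FinishAsync();          // release the capture session when done
```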

Thanks for any advice ;)



    • Edited by XnDerKai Wednesday, August 27, 2014 12:34 PM
    Wednesday, August 27, 2014 11:41 AM