Multiple H.264 decoder instances

  • Question

  • Our application can simultaneously decode and display up to 16 video streams in H.264 format. We currently use a software H.264 decoder built with Intel IPP.

    I'm looking for a faster alternative and wondering whether decoding could be offloaded to the GPU by using Media Foundation and/or DXVA.

    Would these technologies improve performance? Any recommendations?

    Friday, August 24, 2012 5:09 PM

All replies

  • DXVA offers much better performance than plain software decoding. It also moves the decoding load from the CPU to the GPU, which frees the CPU for other processing. However, GPU resources are limited as well: we have seen people run into trouble with some drivers and some GPUs when they try to create too many DXVA-enabled decoders. The driver reaches the limit of DXVA sessions it supports and then fails further requests. The ideal solution for your application would probably be a fixed number of DXVA decoders (say 4 or 8, or some other number based on testing on multiple graphics cards), with the rest being software decoders.

    The Media Foundation H.264 decoder supports DXVA, and if it is used in the Media Session with an EVR for rendering, DXVA will be configured automatically. Using DXVA directly can be quite tricky, so I would not recommend it unless you have a specific reason for doing so.
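
    The "fixed number of DXVA decoders, software for the rest" policy can be sketched independently of any Windows API. In the minimal sketch below, `DecoderKind`, `assignDecoders`, and the `tryCreateHardware` callback are hypothetical names used for illustration only; the callback stands in for whatever call creates a DXVA-enabled decoder and is assumed to return false once the driver refuses another session.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical labels for this sketch; these are not Media Foundation types.
enum class DecoderKind { Hardware, Software };

// Decide which kind of decoder each stream gets.
// hardwareBudget is the cap chosen from testing (for example 4 or 8).
// tryCreateHardware is an assumed stand-in for creating a DXVA-enabled
// decoder; it returns false when the driver rejects the request.
std::vector<DecoderKind> assignDecoders(
    std::size_t streamCount,
    std::size_t hardwareBudget,
    const std::function<bool()>& tryCreateHardware)
{
    std::vector<DecoderKind> kinds;
    kinds.reserve(streamCount);

    std::size_t hardwareInUse = 0;
    bool driverExhausted = false;

    for (std::size_t i = 0; i < streamCount; ++i) {
        bool gotHardware = false;
        if (!driverExhausted && hardwareInUse < hardwareBudget) {
            gotHardware = tryCreateHardware();
            if (gotHardware) {
                ++hardwareInUse;
            } else {
                // The driver hit its session limit before our budget did;
                // stop asking and use software decoders from here on.
                driverExhausted = true;
            }
        }
        kinds.push_back(gotHardware ? DecoderKind::Hardware
                                    : DecoderKind::Software);
    }
    return kinds;
}
```

    With 16 streams and a budget of 8, the first 8 streams get a hardware decoder as long as the driver accepts every request; if the driver refuses earlier, all later streams fall back to software.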

    Tuesday, August 28, 2012 10:47 PM
  • Thanks for the answer.

    I need the raw RGB or YUV frames after decoding to perform some video analysis. That means sending the stream to the GPU and reading the decoded frames back into main memory. Would I still see a performance gain with the Mem->GPU->Mem round trip? Can MF and/or DXVA do this?

    Wednesday, August 29, 2012 2:27 PM
  • Copying data back to main memory definitely slows things down, but it is still faster than software decoding. Make sure you use IMF2DBuffer to access D3D buffers, as it is a lot faster than using the standard IMFMediaBuffer interface.
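
    IMF2DBuffer::Lock2D reports a pointer to the first scanline and a pitch in bytes; the pitch can be larger than the visible row because of padding, and it can be negative for bottom-up images. A minimal sketch of copying such a surface into a tightly packed buffer for analysis follows. `copyPitchedRows` is a hypothetical helper, and the sketch takes the pointer and pitch as plain arguments so that it does not depend on Windows headers; in a real application they would come from Lock2D.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Copy `rows` scanlines of `rowBytes` bytes each from a pitched surface into
// a tightly packed buffer. `scanline0` and `pitch` play the role of the
// values reported by IMF2DBuffer::Lock2D (assumed to be obtained elsewhere).
std::vector<std::uint8_t> copyPitchedRows(const std::uint8_t* scanline0,
                                          std::ptrdiff_t pitch,
                                          std::size_t rowBytes,
                                          std::size_t rows)
{
    std::vector<std::uint8_t> packed(rowBytes * rows);
    if (rowBytes == 0 || rows == 0) {
        return packed;  // nothing to copy
    }
    for (std::size_t y = 0; y < rows; ++y) {
        // Step by the pitch, not by rowBytes: padding bytes are skipped,
        // and a negative pitch walks upward through memory.
        const std::uint8_t* src =
            scanline0 + static_cast<std::ptrdiff_t>(y) * pitch;
        std::memcpy(packed.data() + y * rowBytes, src, rowBytes);
    }
    return packed;
}
```

    Copying row by row with the reported pitch avoids assuming that the surface is contiguous, which is generally not safe for driver-allocated buffers.
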
    Wednesday, September 5, 2012 7:30 PM