Video processing with alpha channel

  • Question

  • Hi everyone,

    This is one of my first questions; I'm an experienced developer but new to the Windows SDK and DirectX, so please excuse me if the question mixes unrelated concepts while I try to figure out how Windows works with video.

    First of all, I will describe my scenario and what I'm trying to do.

    I want to develop a C++ application with the Windows SDK v7.1 and the DirectX SDK (June 2010) that can handle 2 input videos, each with an alpha channel, and blend them together into 1 output video using the alpha channel information. The videos are uncompressed .avi containers (no encoding was applied).

    As far as I know (thanks to MSDN forums and docs) this can be done in several ways:

    1. With a DirectShow filter: I found this solution a bit hard and too much for my requirements.

    2. Using DirectShow Editing Services: I've tried to write a very simple app but always get an error related to qedit.h, and as Microsoft explains, qedit.h is no longer included in newer SDKs, so I don't think it's a good solution because it is deprecated.

    Then I came across Media Foundation, which replaces DES, but honestly I'm a bit lost and haven't been able to find good, compiling examples to test.

    So could anyone clarify whether I could achieve what I need with Media Foundation? If yes, how should I begin? If not, is there another solution?

    Any help will be appreciated!

    Note: I'm using Visual C++ 2010 Express, if it matters.



    • Edited by Adrià Gil Monday, February 17, 2014 3:30 PM typing mistakes
    Monday, February 17, 2014 3:26 PM

All replies

  • Hello.

    If you want to play each video in real time, you can use the EVR with Media Foundation.

    If you want to encode the videos to a file or stream them, Media Foundation can do that too, but with more work.

    Monday, February 17, 2014 8:15 PM
    Thanks Miaou77 for the quick response.

    I've been reading about the EVR and have successfully worked with the EVRPresenter sample from the Microsoft SDK.

    Now I've got a custom EVR presenter that is used in an MFPlayer to display the video.

    Since I only want to display the video on screen (I don't want to encode an output stream or file for the moment), I will use Media Foundation as you suggest.

    I guess I will be able to implement a transform / modify the presenter source code in order to get it working with alpha information as I want.

    I will keep this thread up to date with my progression and / or issues during the development for anyone interested.

    P.S.: Any good starting point for working with / extracting video frame channel information?

    Thanks again,


    Tuesday, February 18, 2014 11:30 AM
  • Hello.

    That seems like a good start, but if you want better help, you need to explain what your specific alpha treatment is.

    The EVR manages alpha between videos, so is that enough? If so, you don't need to implement a transform; just use the EVR. You can also provide a custom EVR and play with alpha and shaders. There are different ways to handle alpha, depending on your needs.

    • Edited by Miaou77 Tuesday, February 18, 2014 1:00 PM
    Tuesday, February 18, 2014 12:59 PM
    You're right; here is a quick summary of what I intend to do.

    1. Have two .avi or .mov video files, each with an alpha channel (or at least the one on top). Then overlay the source stream with the substream, using the alpha channel information as a mask. For example, if my substream video is a circle moving in x,y directions over time and is on top, I expect to see the source stream masked inside the circle when displaying the video with the EVR.

    2. Find the correct video codec to achieve that result. I'm trying different combinations, but there are only 2 of them that work, according to the info provided by experts over the net:

    A) A 32-bit uncompressed .avi, RGB+Alpha, with no codec (set to 'None'), from the After Effects or FFmpeg encoder. I've not been able to play it back or get the alpha channel information in the EVR; maybe the alpha info is dropped during the video export, but it's theoretically possible.

    B) A QuickTime .mov with the Animation codec (qtrle), which works 100% with VLC or QuickTime Player, but it's impossible to play back in the EVR; I get the error "Playback Error (HRESULT = 0xC00D5212)" (using the MFPlayer sample).

    I've also tried a .mov with PNG encoding but got the same results.

    In case I find the correct video format / codec, I wonder how to work with the alpha channel information in order to mask one video with the other and present the output on the display.

    So, is it possible to play back QuickTime Animation movies (which definitely contain the alpha info) in the EVR? And how should the 2 videos be overlaid: with shaders or pixel operations?

    I'm used to working with OpenCV and I've done something similar with a PNG sequence, but not with a video file; and of course I would like to use Media Foundation for better app performance.
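    For reference, the masking described in point 1 is the classic per-pixel "over" composite. Here is a minimal CPU sketch (the `Pixel` struct and straight, non-premultiplied alpha are my assumptions; an HLSL shader or a custom presenter would do the same math per texel):

```cpp
#include <cstdint>

struct Pixel { uint8_t r, g, b, a; };  // assumed 32-bit RGBA layout

// Straight-alpha "over" composite: 'top' (the masked substream)
// over 'bottom' (the source stream).
Pixel over(Pixel top, Pixel bottom) {
    float at = top.a / 255.0f;
    float ab = bottom.a / 255.0f;
    float ao = at + ab * (1.0f - at);           // resulting coverage
    if (ao == 0.0f) return {0, 0, 0, 0};        // both fully transparent
    auto mix = [&](uint8_t ct, uint8_t cb) {
        float c = (ct * at + cb * ab * (1.0f - at)) / ao;
        return static_cast<uint8_t>(c + 0.5f);  // round to nearest
    };
    return { mix(top.r, bottom.r), mix(top.g, bottom.g),
             mix(top.b, bottom.b),
             static_cast<uint8_t>(ao * 255.0f + 0.5f) };
}
```

    Where the top video's alpha is 255 its pixels win; where it is 0 the underlying video shows through; intermediate values blend proportionally.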

    Thanks for your time,


    Tuesday, February 18, 2014 3:35 PM
  • Hello.

    Your project seems complicated but interesting.

    Can you provide two 32-bit uncompressed .avi files, RGB+Alpha, with no codec? I think that's a good start to see how to manage the project.

    • Edited by Miaou77 Tuesday, February 18, 2014 9:27 PM
    Tuesday, February 18, 2014 9:26 PM
  • Hey,

    Trying to do something similar, and I share your pain.

    I have a similar thread on here. My hunch is RGBA isn't implemented (yet) in Media Foundation, but maybe we just don't know the magic configuration to get it working as expected. As already requested, please could Microsoft provide sample code demonstrating how to decode RGBA via the Media Foundation API.

    The composite: I'd do it in an HLSL shader following the output of your two texture sources (when we get that far).

    I also noticed MFPlayer seems unable to decode two streams simultaneously; maybe it's user error. My workaround was to decode each in its own process, so each has its own address space and whatever fun global is screwing things up is resolved.

    A long-term workaround, if Media Foundation (and support thereof) does not improve, is to move to Motion JPEG 2000 in an OpenCL shader. That'll give compressed RGBA, but I'd rather use the video decode hardware than a shader for decoding, if possible.

    Works on Linux. Come on Microsoft, keep up :)

    @Microsoft: Please provide sample code demonstrating the decode of two streams simultaneously, one with alpha, using an MFPlayer derivative or some new sample application, with a DX9 EVR texture output for each, blending in an HLSL shader and outputting the result to the screen. You have a D3D sample application, so you can grab the output/display part from there.

    We can't use the DX11 EVR (yet) because it's only available on Win8 and we have to ship software to legacy systems too. A DX11 equivalent is worth doing as well, though, because going the DX9 route means messing about with DirectX shared surfaces to get an interface into the DX11 shader for composite/display.

    @Adrià - hi. It's not you; it's the lack of decent sample code and possible MF limitations. Maybe it's possible with what's there, maybe the API needs work; we just don't know. Good luck.

    > Your project seems to be complicated

    It's only complicated because of the way the APIs are structured. From a user perspective, it's a straightforward composite of one video over another.

    @MS - If you do provide sample code (pretty please, with a cherry on top, chocolate AND whipped cream), please perform the blend in an HLSL shader, not some Media Foundation internal API, because going forward we'll want to perform more advanced effects/blending and HLSL has the flexibility to allow us to do this.


    Tuesday, February 25, 2014 7:36 AM
    > The EVR manages alpha between videos, so is that enough? If so, you don't need to implement a transform; just use the EVR. You can also provide a custom EVR and play with alpha and shaders. There are different ways to handle alpha, depending on your needs.


    Please provide an example of a working alpha channel if you've achieved that. Just because the docs say something is possible doesn't mean it's implemented in reality. Can you do this? If so, please explain the configuration required.
    Tuesday, February 25, 2014 7:56 AM
  • Hi again,

    It's been a while since my last reply because I've been working hard to get the desired result.

    After a lot of reading docs and trying different DirectShow filters (remember, I'm a newbie) I got what I need... well, almost!

    Due to the project specifications (it needs to be done in DirectShow and use Direct2D for graphics) I decided not to use WMF; DirectShow is also better documented, with a lot more information available over the net. I ended up using GraphEdit to build a custom graph to use in my C++ application.

    I've finally got one video overlaid on another with an alpha channel, using the following graph pattern:

    File Source (Async.) > AVI Splitter > Mpeg4 Decoder DMO > Color Space Converter > VMR9 (pin: Input 0)

    File Source (Async.) > AVI Splitter > Color Space Converter > VMR9 (pin: Input 1)

    It's important to mention that the video connected to Input 1 (on top, for the overlay) needs to be an uncompressed .AVI exported with 32-bit RGB+Alpha set in FFmpeg or After Effects.

    Now I'm facing another problem: the file size! Of course uncompressed AVI weighs a lot, so I need to find a video codec that works with an alpha channel and a DS filter that doesn't drop the alpha channel information.

    I've found the Lagarith Lossless codec; the video is compressed preserving the alpha channel as an .avi file, but when I try to use it in my application / graph, the alpha channel has been dropped by the time the video arrives at the VMR9. I'm pretty sure it's because of the AVI Decompressor filter. The graph is the following:

    File Source (Async.) > AVI Splitter > AVI Decompressor > Color Space Converter > VMR9 (pin: Input1)

    I think I have hard work ahead now, because the only solution I've found is to write a custom AVI Decompressor filter!!!

    I'll keep you guys updated!

    Tuesday, February 25, 2014 12:19 PM
    so a DirectShow solution. great that you have what you need.

    maybe this will help Microsoft demonstrate how we can hardware-decode video with alpha via Media Foundation :)

    thanks for the information.
    Tuesday, February 25, 2014 6:01 PM
    > The EVR manages alpha between videos, so is that enough? If so, you don't need to implement a transform; just use the EVR. You can also provide a custom EVR and play with alpha and shaders. There are different ways to handle alpha, depending on your needs.


    Please provide an example of a working alpha channel if you've achieved that. Just because the docs say something is possible doesn't mean it's implemented in reality. Can you do this? If so, please explain the configuration required.


    I said this before having more information about the project.

    The EVR can mix video with alpha. The problem here seems to be that the decoder loses the alpha information after decoding. That's why an uncompressed format is needed to keep the alpha all the way into the EVR.

    The way forward is to find a software/hardware decoder that keeps alpha; I don't know if one exists. Or to write one.

    • Edited by Miaou77 Tuesday, February 25, 2014 8:32 PM
    Tuesday, February 25, 2014 8:30 PM
  • Another way perhaps.

    Instead of using alpha, can you use a reference color?

    If you can pick a reference color that the real image does not contain, you can use this approach.

    Encode the video file with a unique reference color in place of alpha. You will get the image as-is in the EVR. Then pass the video texture to a shader. The shader checks for the reference color: it sets alpha to zero for pixels matching the reference color, and otherwise returns the color unchanged.

    With this approach, you could use a compressed video format.
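    That shader logic can be prototyped on the CPU like this (a sketch; the `Pixel` struct, the reference color, and the tolerance band are my assumptions, the tolerance being there because a lossy codec returns only approximately the key color):

```cpp
#include <cstdint>
#include <cstdlib>

struct Pixel { uint8_t r, g, b, a; };  // assumed 32-bit RGBA layout

// Key out pixels close to the reference color: matched pixels become
// fully transparent, everything else is returned unchanged.  This is
// the same per-texel test an HLSL pixel shader would run on the GPU.
Pixel keyOut(Pixel p, Pixel ref, int tolerance) {
    const int dr = std::abs(p.r - ref.r);  // per-channel distance;
    const int dg = std::abs(p.g - ref.g);  // the tolerance band absorbs
    const int db = std::abs(p.b - ref.b);  // codec rounding noise
    if (dr <= tolerance && dg <= tolerance && db <= tolerance)
        return {0, 0, 0, 0};   // reference color: alpha -> 0
    return p;                  // otherwise: color as-is
}
```

    A tolerance of 0 only works on uncompressed material; with a lossy codec you need a band around the key color. The trade-off is that the result is binary, opaque or transparent, with no partial transparency.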

    Tuesday, February 25, 2014 9:05 PM
    yeah, of course you can colour-key it, and all manner of other tricks, but the point of posting here is to try to get it done right.

    > The EVR can mix video with alpha.

    don't mix inside the video decode API. mix in HLSL using the output, so advanced effects can be used.

    I want to do this right, not hack it. I'm asking MS to provide sample code because my hunch is it's not possible right now, and that's cool; we make progress by identifying limitations and addressing them.
    Tuesday, February 25, 2014 9:44 PM
  • I think both of you are right.

    Miaou77, I will give the reference color a try, but I have to postpone it for a few days (until I get some free time). Because of the project specifications (they want to use a video with an alpha channel), this hack is a good solution for me but not for this particular project, sorry. Again, thank you for the advice!

    Steve, it would be great to have MS sample code to look at, because I think it's something people would like to know. For the moment I think there are 3 possible solutions, as mentioned before:

    1. Use uncompressed video: I've tested it and it works.

    2. Find a software / hardware decoder that preserves the alpha channel (e.g. the AVI Decompressor drops alpha).

    3. Write your own decompressor that preserves the alpha channel: that's the task I'm facing right now.

    Wednesday, February 26, 2014 10:25 AM
  • Hey,

    Miaou77 is right: for a lot of cases, colour key is sufficient. You'll have to use a range around whatever specific colour you choose, because the compress/decompress process produces approximate output. It also lets you use hardware decompression, as he stated.

    > Use the uncompressed video: I've tested and works.

    To be 100% clear here, have you managed to get uncompressed ARGB (with preserved alpha) out of a Media Foundation EVR presenter? (Not DShow; that might be OK for you, but I have to stay with the latest interfaces.)

    Cheers. I think we might slowly be making some progress. Please could somebody from Microsoft give us an update on where you are with this functionality, so we don't have to second-guess.

    Thanks in advance.

    Wednesday, February 26, 2014 5:44 PM
    The downside with colour key is that you have no resolution in your alpha: a pixel is either opaque or transparent. No partial transparency. That's why we use alpha channels: to give us partial transparency resolution.
    Wednesday, February 26, 2014 5:46 PM