How to mix two streams in EVR

  • Question

  • I am trying to mix two streams into an EVR in a Media Foundation project (on Windows 7).

    I have tried many ways to mix a substream with the reference stream, but I still can't get the two to mix.

    Is there a sample that shows how to mix two streams in the EVR?

    I have already built a working topology that renders the reference stream alone, and I am now trying to add a second stream sink to the EVR.

    The topology looks like this :

    Video Capture  >>  Tee  >>  EVR output node 1 (for media sink 1)
                                        >>  EVR output node 2 (for media sink 2)

    However, the topology loader cannot resolve this topology, so I changed it to this:

    Video Capture  >>  Tee  >>  EVR output node 1, input index 1
                                        >>  EVR output node 1, input index 2

    The topology loader can resolve this topology, but no image is rendered.
    Can anybody help me out here?

    Thanks in advance.

    ----------------------
    By the way, I found a thread that seems to describe the same problem:
    http://social.msdn.microsoft.com/Forums/en-US/mediafoundationdevelopment/thread/a9a489e2-27ee-4f1c-b71d-9e57f48ae4da

    Mine is on Windows 7, though.
    Wednesday, October 14, 2009 12:28 AM

Answers

  • Vlad the EVR expert looked into this.  Apparently substream mixing is largely dependent on the video driver implementation, and many drivers do not do a good job of this.  My understanding now is that substreams are only confirmed to work for a very limited case -- overlaying subtitles on the video -- and should not be relied on to do anything else important.

    The potential solution I got from Vlad was to implement a custom mixer using DXVA-HD (http://msdn.microsoft.com/en-us/library/ee663586(VS.85).aspx), but only a subset of video cards support this.  If you do not need hardware acceleration, you could implement a multi-input MFT that takes two input video samples and draws their contents into the desired positions on the output sample. 
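
    The software route can be sketched roughly as follows. This is only an illustration of the compositing step such an MFT would perform, not an MFT itself: the Frame and PixelRect types and the BlitScaled and ComposeTwo functions are hypothetical names, and the sketch assumes both inputs are already uncompressed 32-bit frames in the same pixel format.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for an uncompressed 32-bit frame (for example RGB32).
struct Frame {
    int width = 0;
    int height = 0;
    std::vector<std::uint32_t> pixels;  // row-major, width * height entries

    Frame(int w, int h, std::uint32_t fill = 0)
        : width(w), height(h), pixels(static_cast<std::size_t>(w) * h, fill) {}
};

// Destination rectangle in output pixels; right and bottom are exclusive.
struct PixelRect { int left, top, right, bottom; };

// Draw src into dst, scaled to fill rc with nearest-neighbour sampling.
// The rectangle is clipped to the output, so out-of-range edges are harmless.
void BlitScaled(const Frame& src, Frame& dst, PixelRect rc)
{
    assert(src.pixels.size() == static_cast<std::size_t>(src.width) * src.height);
    const int rcW = rc.right - rc.left;
    const int rcH = rc.bottom - rc.top;
    if (rcW <= 0 || rcH <= 0 || src.width <= 0 || src.height <= 0) return;

    const int x0 = std::max(rc.left, 0);
    const int y0 = std::max(rc.top, 0);
    const int x1 = std::min(rc.right, dst.width);
    const int y1 = std::min(rc.bottom, dst.height);

    for (int y = y0; y < y1; ++y) {
        // Map the output row back to a source row.
        const int sy = static_cast<int>(static_cast<long long>(y - rc.top) * src.height / rcH);
        for (int x = x0; x < x1; ++x) {
            const int sx = static_cast<int>(static_cast<long long>(x - rc.left) * src.width / rcW);
            dst.pixels[static_cast<std::size_t>(y) * dst.width + x] =
                src.pixels[static_cast<std::size_t>(sy) * src.width + sx];
        }
    }
}

// Compose two inputs into one output: a in the top-left quadrant and b in the
// bottom-right quadrant, the same layout attempted with the mixer in this thread.
Frame ComposeTwo(const Frame& a, const Frame& b, int outW, int outH)
{
    Frame out(outW, outH, 0);
    BlitScaled(a, out, {0, 0, outW / 2, outH / 2});
    BlitScaled(b, out, {outW / 2, outH / 2, outW, outH});
    return out;
}
```

    A real multi-input MFT would do something like ComposeTwo once it holds a sample from each input. Locking the sample buffers, honouring the row stride, and handling YUV formats are left out here.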

    • Marked as answer by optimaluu Friday, October 23, 2009 1:01 AM
    Thursday, October 22, 2009 9:06 PM

All replies

  • I haven't gone down the road of mixers myself but I have a feeling that you shouldn't be using a Tee Node for this. Even if they can be used, you're adding unneeded complexity. You should be able to use a single node connection with multiple stream connections. http://msdn.microsoft.com/en-us/library/ms695284(VS.85).aspx is used for this purpose. All you need to do is reference the node object streams via index. The remark about existing connections is ambiguous though. It doesn't specify if the existing connection is broken if the new source stream index is the same as the current connection or not.
    Thursday, October 15, 2009 6:29 PM
  • I tried connecting the source node directly to the output node, but the outcome is still the same.
    Here is the code segment:

    --------------
        // Add a source node for this stream.
        if (SUCCEEDED(hr))
        {
            hr = AddSourceNode(pTopology, pSource, pPD, pSD, &pSourceNode);
        }

        // Create the media sink (EVR)
        {
            // Get the media type handler for the stream.
            IMFMediaTypeHandler *pHandler = NULL;
            hr = pSD->GetMediaTypeHandler(&pHandler);

            // Get the major media type.
            GUID guidMajorType = GUID_NULL;
            if (SUCCEEDED(hr))
            {
                hr = pHandler->GetMajorType(&guidMajorType);
            }

            if (SUCCEEDED(hr))
            {
                hr = MFCreateVideoRenderer(__uuidof(IMFMediaSink), (void**)&pSink);
            }

            //// Get the current media type of the source
            //hr = pHandler->GetCurrentMediaType(&pMT_src);

            if (pHandler){
                pHandler->Release();
                pHandler = NULL;
            }
        }

        // get stream sinks
        if (SUCCEEDED(hr) && pSink){
            // reference stream sink
            hr = pSink->GetStreamSinkByIndex(0, &pStreamSink1);

            // substream (only attempted if the reference stream sink was obtained)
            if (SUCCEEDED(hr))
            {
                hr = pSink->AddStreamSink(1, NULL, &pStreamSink2);
                //hr = pSink->AddStreamSink(1, pMT_src, &pStreamSink2);// This function call fails
            }
            if (FAILED(hr))
            {
                // Show the HRESULT in hex; needs <sstream>. TCHAR matches TEXT() and MessageBox.
                std::basic_ostringstream<TCHAR> oss;
                oss << std::hex << hr << std::dec << TEXT(" -- ") << __LINE__;
                MessageBox(NULL, oss.str().c_str(), szAppName, 0);
            }
        }

        // Create output nodes (stop at the first failure instead of overwriting hr)
        if (SUCCEEDED(hr))
        {
            hr = AddOutputNode_By_StreamSink(pTopology, pStreamSink1, &pOutputNode);
        }
        if (SUCCEEDED(hr))
        {
            hr = AddOutputNode_By_StreamSink(pTopology, pStreamSink2, &pOutputNode2);
        }

        // connection
        hr = pSourceNode->ConnectOutput(0, pOutputNode, 0);    // With this call alone, the image renders normally
        //hr = pSourceNode->ConnectOutput(1, pOutputNode2, 0);    // With this call, the topology loader cannot resolve the topology
        hr = pSourceNode->ConnectOutput(1, pOutputNode, 1);        // With this call, the topology resolves, but no image appears in my video window
    --------------

    This topology is really simple, I can't think of any other way to connect them.
    Thanks.
    Friday, October 16, 2009 7:46 AM
  • Did you try calling

    hr = pSourceNode->ConnectOutput(1, pOutputNode, 1);       

    with

    hr = AddOutputNode_By_StreamSink(pTopology, pStreamSink2, &pOutputNode2);

    commented out?

    Friday, October 16, 2009 8:41 AM
  • Yes, I tried this, but still the same....
    (The topology can be resolved, but the image can't be rendered.)
    Friday, October 16, 2009 9:11 AM
  • Did you configure a display window for the EVR?  The code for that would look something like this:

    IMFVideoDisplayControl* pVideoControl = NULL;
    HWND hwndVideo = GetYourDisplayWindow();

    hr = MFGetService(pSink, MR_VIDEO_RENDER_SERVICE, IID_PPV_ARGS(&pVideoControl));
    if(SUCCEEDED(hr))
    {
        hr = pVideoControl->SetVideoWindow(hwndVideo);
    }
    Monday, October 19, 2009 6:01 PM
  • Yes, I did this in the Invoke member function of my asynchronous callback interface.
    And this part of the code works fine when there is only one stream.

    My code is as follows:

    --------------------------------
    .....
    if (meType == MESessionTopologyStatus && hrStatus == S_OK){
        UINT32 status;

        HRESULT hr = pEvent->GetUINT32(MF_EVENT_TOPOLOGY_STATUS, &status);
        if (SUCCEEDED(hr) && (status == MF_TOPOSTATUS_READY))
        {
            // get IMFVideoDisplayControl, set video window
            hr = MFGetService(pSession,
                            MR_VIDEO_RENDER_SERVICE,
                            __uuidof(IMFVideoDisplayControl),
                            (void**)&pVideoDisplayControl);
            if (hr == S_OK){
                pVideoDisplayControl->SetVideoWindow(hVideo);
                MFVideoNormalizedRect src_rc;
                src_rc.left = src_rc.top = 0;
                src_rc.right = src_rc.bottom = 1;
                RECT des_rc;
                ::GetClientRect(hVideo, &des_rc);
                pVideoDisplayControl->SetVideoPosition(&src_rc, &des_rc);
            }

            // get IMFVideoMixerControl, set streams z-order...
            hr = MFGetService(pSession,
                            MR_VIDEO_MIXER_SERVICE,
                            __uuidof(IMFVideoMixerControl),
                            (void**)&pMixer);
            if (hr == S_OK){
                // set the z-order of the streams
                pMixer->SetStreamZOrder(0, 0);    // stream 0 (reference stream)
                pMixer->SetStreamZOrder(1, 1);    // stream 1

                // set the display position of each stream (normalized coordinates)
                MFVideoNormalizedRect rc;
                rc.top = rc.left = 0.0f;
                rc.right = rc.bottom  = 0.5f;
                pMixer->SetStreamOutputRect(0, &rc);

                rc.top = rc.left = 0.5f;
                rc.right = rc.bottom  = 1.0f;
                pMixer->SetStreamOutputRect(1, &rc);
            }
        }
    }
    --------------------------------
    • Edited by optimaluu Wednesday, October 21, 2009 6:43 AM
    Tuesday, October 20, 2009 1:27 AM
  • Two things to look into:

    -Have you checked the HRESULTs returned by your IMFVideoDisplayControl and IMFVideoMixerControl calls? In particular, SetVideoPosition.

    -Does your destination rectangle contain normalised boundaries?
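
    A quick way to check the second point is sketched below. In the code posted above, SetVideoPosition takes a normalized source rectangle plus a destination RECT in window pixels, while SetStreamOutputRect takes a normalized rectangle. NormalizedRect and ClientRect are hypothetical stand-ins for MFVideoNormalizedRect and RECT so that the snippet compiles without the Windows headers, and the two helper names are assumptions as well.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-ins mirroring the fields of MFVideoNormalizedRect and RECT.
struct NormalizedRect { float left, top, right, bottom; };
struct ClientRect { long left, top, right, bottom; };

// True when every edge lies in [0, 1] and the rectangle is neither empty nor inverted.
// Any NaN edge fails the comparisons and is therefore rejected too.
bool IsValidNormalizedRect(const NormalizedRect& r)
{
    const bool inRange = r.left >= 0.0f && r.top >= 0.0f &&
                         r.right <= 1.0f && r.bottom <= 1.0f;
    return inRange && r.left < r.right && r.top < r.bottom;
}

// Map a normalized rectangle onto a client area of the given size in pixels.
ClientRect ToPixels(const NormalizedRect& r, long width, long height)
{
    assert(IsValidNormalizedRect(r));
    return { std::lround(r.left * width),  std::lround(r.top * height),
             std::lround(r.right * width), std::lround(r.bottom * height) };
}
```

    Passing pixel values such as {0, 0, 640, 480} where a normalized rectangle is expected fails this check, which is the kind of mistake the question above is probing for.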
    Tuesday, October 20, 2009 11:11 AM
  • I found one thing from this: with the following code, the program never enters that block.

        // connection
        hr = pSourceNode->ConnectOutput(0, pOutputNode, 0);
        hr = pSourceNode->ConnectOutput(1, pOutputNode, 1);

    This is because when I get the MESessionTopologyStatus event, hrStatus is E_FAIL, not S_OK.

    But if I comment out the second line (and only use pSourceNode->ConnectOutput(0, pOutputNode, 0);),
    then hrStatus is S_OK and the program works fine.

    Tuesday, October 20, 2009 12:16 PM
  • A source stream will never have more than one output.  The topology loader does not know what to do with a source stream node with a second output connected.

    I looked into this a bit more.  There are a bunch of caveats to substreams, to the point where I think they might not be terribly useful.  First, you must set the reference stream (stream ID 0) type before setting any substream types.  This is a bit counter-intuitive, since an application typically relies on the topoloader to resolve the sink media type.  Second, the substreams support only a specific set of subtypes, often not the ones you would expect.  For example, with NV12 on the reference stream, the substreams only support AI44 or AYUV.  When I set up the EVR in this way, I was able to get video samples to be processed.

    See http://msdn.microsoft.com/en-us/library/aa965242(VS.85).aspx for more on substream type negotiation.
    Wednesday, October 21, 2009 1:14 AM
  • Great! You are totally right.
    After I set the media type of the reference stream and used a tee node between the source and sink nodes, the topology works.

    But I have an additional question.

    I connect the nodes this way:
                hr = pSourceNode->ConnectOutput(0, pTeeNode, 0);
                hr = pTeeNode->ConnectOutput(0, pOutputNode, 0);
                hr = pTeeNode->ConnectOutput(1, pOutputNode2, 0);

    The image of the second stream doesn't show up. (The topology can be resolved with two streams in it, though, which is good.)

    The rendered image can be seen from here :
    http://i.imagehost.org/view/0436/Render2Streams

    I think this might be a settings problem in the IMFVideoDisplayControl or IMFVideoMixerControl interface, but I have called all the functions that affect the video window. (As I mentioned in my previous post, I call the following functions when I get the MESessionTopologyStatus event.)

    IMFVideoDisplayControl::SetVideoWindow
    IMFVideoDisplayControl::SetVideoPosition
    IMFVideoMixerControl::SetStreamZOrder
    IMFVideoMixerControl::SetStreamOutputRect

    Thanks
    Wednesday, October 21, 2009 6:26 AM
  • Hey Matt, I'm interested in implementing a multi-input MFT for a simple image overlay. It would be like a watermark, but animated (e.g. a digital clock). Is a transform overkill for this?
    • Marked as answer by optimaluu Friday, October 23, 2009 7:00 AM
    • Unmarked as answer by optimaluu Friday, October 23, 2009 7:00 AM
    Thursday, October 22, 2009 9:35 PM
  • Even though the final answer is that I can't use the mixer to mix two video streams, it is good to know why.
    Hopefully the mixer will be more convenient to use in the future.

    And thanks a lot for your patience.

    Friday, October 23, 2009 1:19 AM
  • Nobby -- Out of sinks, sources, and transforms, transforms are the easiest to implement in my opinion.  Multi-input MFTs are a bit more complex as you potentially have to deal with streams of different frame rates; of course, you can force the frame rate to be the same on both input types and have the topoloader insert a frame rate converter.  Multi-input MFTs work well for image overlays.  I cannot say whether it is overkill because I do not know what your other options are, but it is probably the easiest 'good' implementation (for example you could just overlay the image directly in the source but that reduces the flexibility of the source).
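
    One simple way to cope with differing frame rates, sketched under assumptions: drive the output from one input and, for each of its frames, reuse the most recent sample from the other input. Sample and PairByTimestamp are hypothetical names, timestamps are taken to be in 100-nanosecond units as Media Foundation uses, and both lists are assumed to be sorted by time.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sample record: presentation time in 100-ns units plus an id.
struct Sample { std::int64_t time; int id; };

// For each primary frame, return the index of the latest overlay sample whose
// timestamp is not later than the frame's, or -1 if no overlay has arrived yet.
std::vector<int> PairByTimestamp(const std::vector<Sample>& primary,
                                 const std::vector<Sample>& overlay)
{
    auto byTime = [](const Sample& a, const Sample& b) { return a.time < b.time; };
    assert(std::is_sorted(primary.begin(), primary.end(), byTime));
    assert(std::is_sorted(overlay.begin(), overlay.end(), byTime));

    std::vector<int> result;
    result.reserve(primary.size());
    std::size_t next = 0;   // first overlay sample not yet consumed
    int current = -1;       // overlay sample currently on screen
    for (const Sample& frame : primary) {
        // Advance past every overlay sample that is due at or before this frame.
        while (next < overlay.size() && overlay[next].time <= frame.time) {
            current = static_cast<int>(next);
            ++next;
        }
        result.push_back(current);
    }
    return result;
}
```

    For an overlay like a clock that updates once per second, this means the same overlay image is reused for every video frame until the next one arrives.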
    Tuesday, October 27, 2009 6:12 PM