How to know if CLSID_CColorConvertDMO supports hardware acceleration

  • Question

  • I created an instance of CLSID_CColorConvertDMO using:

    CComPtr<IMediaObject> pMediaObject;
    pMediaObject.CoCreateInstance(CLSID_CColorConvertDMO);

    Now I want to check whether it will do the conversion in hardware (GPU). If it does not use the GPU, I do not want to use it. I read that MF_SA_D3D11_AWARE and MFT_ENUM_HARDWARE_URL_Attribute should indicate whether hardware acceleration is supported, but to check them I need access to IMFAttributes. So I tried this:

        IMFTransform* oIMFTransform = NULL;
        IMFAttributes* pAttributes = NULL;
        HRESULT hr = pMediaObject->QueryInterface(IID_IMFTransform, (void**)&oIMFTransform);
        if (SUCCEEDED(hr))
        {
            // IMFTransform::GetAttributes takes only the output pointer.
            hr = oIMFTransform->GetAttributes(&pAttributes);
            if (SUCCEEDED(hr))
            {
                // Non-zero when the MFT can work with a Direct3D 11 device.
                UINT32 bD3DAware = MFGetAttributeUINT32(pAttributes, MF_SA_D3D11_AWARE, FALSE);
                pAttributes->Release();
            }
            oIMFTransform->Release();
        }

    But the hr returned by hr = oIMFTransform->GetAttributes(&pAttributes); is always E_NOTIMPL. So how can I tell whether the color conversion on this PC will be done in hardware or not?

    Thanks!

    Monday, May 25, 2020 12:54 AM

All replies

  • This DMO is a software-only implementation.

    http://alax.info/blog/tag/directshow

    Monday, May 25, 2020 5:42 AM
  • The Video Processor MFT is GPU-aware with a software fallback. Yes, it can convert between pixel formats, texture to texture.

    http://alax.info/blog/tag/directshow

    Monday, May 25, 2020 9:24 AM
  • Thank you.

    It took me almost an entire day to get it working with the Video Processor MFT (this is the first time I have dealt with it).

    I now do the color conversion from RGB to I420 using the Video Processor MFT, as you suggested. I run it in a loop, but the GPU load is 0, so it is running on the CPU.

    I checked pAttributes->GetUINT32(MF_SA_D3D11_AWARE, &Val); and got Val = 1.

    Should I send MFT_MESSAGE_SET_D3D_MANAGER to make it run on the GPU, or am I heading in the wrong direction and have no control over that?

    Again, thank you, Roman.

    • Edited by ben125452 Monday, May 25, 2020 10:50 PM
    Monday, May 25, 2020 10:25 PM
  • Yes, it is essential to do MFT_MESSAGE_SET_D3D_MANAGER. Without it you don't bind the MFT to any D3D11 device and the transform falls back to CPU.

    http://alax.info/blog/tag/directshow

    Tuesday, May 26, 2020 5:51 AM
  • You know, it's funny. I checked in release mode and it does go to the GPU (without MFT_MESSAGE_SET_D3D_MANAGER). My first test was in debug mode, and in debug mode it did not go to the GPU.

    But I still want to test how it behaves with MFT_MESSAGE_SET_D3D_MANAGER. Do you have an example of that?

    I tried it like this (without understanding 100% of what I am doing).
    (As a reminder, pTransform converts an RGB32 image at resolution X to an I420 image at resolution Y.)
    ------------------------------------------------------------------------------------------------------------------
    HRESULT CreateD3DDeviceManager(IDirect3DDevice9* pDevice, UINT* pReset, IDirect3DDeviceManager9** ppManager)
    {
        UINT resetToken = 0;
        IDirect3DDeviceManager9* pD3DManager = NULL;
        HRESULT hr = DXVA2CreateDirect3DDeviceManager9(&resetToken, &pD3DManager);
        if (FAILED(hr))
            goto done;
       
        hr = pD3DManager->ResetDevice(pDevice, resetToken);
        if (FAILED(hr))
            goto done;
        *ppManager = pD3DManager;
        (*ppManager)->AddRef();
        *pReset = resetToken;
    done:
        SafeRelease(&pD3DManager);
        return hr;
    }
    ------------------------------------------------------------------------------------------------------------------

    IDirect3DDevice9Ex      *mpD3DDevice;
    D3DPRESENT_PARAMETERS   mD3dpp;

        mD3dpp.BackBufferWidth = moOutputVideoInfoHeader.bmiHeader.biWidth;
        mD3dpp.BackBufferHeight = moOutputVideoInfoHeader.bmiHeader.biHeight;
        mD3dpp.BackBufferFormat = D3DFMT_X8R8G8B8;//should it be this or i420?????
        mD3dpp.BackBufferCount = 1;
        mD3dpp.SwapEffect = D3DSWAPEFFECT_DISCARD;
        mD3dpp.hDeviceWindow = hwnd;
        mD3dpp.Windowed = 1;
        mD3dpp.Flags = D3DPRESENTFLAG_VIDEO | D3DPRESENTFLAG_LOCKABLE_BACKBUFFER;
        mD3dpp.FullScreen_RefreshRateInHz = D3DPRESENT_RATE_DEFAULT;
        mD3dpp.PresentationInterval = D3DPRESENT_INTERVAL_ONE;


        hr = mpD3D->CreateDeviceEx(
            D3DADAPTER_DEFAULT,
            D3DDEVTYPE_HAL,
            hwnd,
            D3DCREATE_FPU_PRESERVE |
            //D3DCREATE_MULTITHREADED |
            D3DCREATE_HARDWARE_VERTEXPROCESSING|
            D3DCREATE_NOWINDOWCHANGES
            ,
            &mD3dpp,
            NULL,
            &mpD3DDevice
        );
        UINT m_pD3DResetToken = 0;
        IDirect3DDeviceManager9* ppManager = NULL;
        hr = CreateD3DDeviceManager(mpD3DDevice, &m_pD3DResetToken, &ppManager); // function defined above

    hr = pTransform.CoCreateInstance(CLSID_VideoProcessorMFT);
    .....

    ..... some init code for pTransform to make RGB32 image with Resolution X to I420 image with resolution Y

    .....

    hr = pTransform->ProcessMessage (MFT_MESSAGE_SET_D3D_MANAGER, ULONG_PTR(ppManager));
    The issue is that this ProcessMessage call returns E_NOTIMPL (not implemented).

    Do you see what I am doing wrong?
    And again, thank you so much for your help.

    • Edited by ben125452 Tuesday, May 26, 2020 12:46 PM
    Tuesday, May 26, 2020 12:43 PM
  • You get E_NOTIMPL because you are trying to attach a Direct3D 9 device. The Video Processor MFT can use the GPU through a Direct3D 11 device, so you should create a D3D11 device and a matching DXGI device manager. D3D11CreateDevice and MFCreateDXGIDeviceManager are the keywords. Some sample code here.
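
    For readers without the linked sample, the setup described in this reply could be sketched roughly as follows. This is only a sketch under assumptions, not the referenced sample: BindD3D11ManagerToMft is a hypothetical helper name, WRL ComPtr is my choice of smart pointer, and the transform is assumed to be an already created Video Processor MFT. MFCreateDXGIDeviceManager needs Windows 8 or later.

```cpp
#ifdef _WIN32  // Direct3D 11 and Media Foundation are Windows-only
#include <windows.h>
#include <d3d11.h>
#include <d3d10.h>        // ID3D10Multithread
#include <mfapi.h>        // MFCreateDXGIDeviceManager
#include <mftransform.h>  // IMFTransform, MFT_MESSAGE_SET_D3D_MANAGER
#include <wrl/client.h>

#pragma comment(lib, "d3d11.lib")
#pragma comment(lib, "mfplat.lib")

using Microsoft::WRL::ComPtr;

// Hypothetical helper: creates a hardware D3D11 device, wraps it in a DXGI
// device manager and hands that manager to an existing MFT.
HRESULT BindD3D11ManagerToMft(IMFTransform* transform,
                              IMFDXGIDeviceManager** managerOut)
{
    if (!transform || !managerOut) return E_POINTER;
    *managerOut = nullptr;

    ComPtr<ID3D11Device> device;
    HRESULT hr = D3D11CreateDevice(
        nullptr, D3D_DRIVER_TYPE_HARDWARE, nullptr,
        D3D11_CREATE_DEVICE_VIDEO_SUPPORT,
        nullptr, 0,               // let the runtime pick the feature level
        D3D11_SDK_VERSION, &device, nullptr, nullptr);
    if (FAILED(hr)) return hr;

    // The MFT may use the device from its own threads, so turn on
    // multithread protection before sharing it.
    ComPtr<ID3D10Multithread> multithread;
    hr = device.As(&multithread);
    if (FAILED(hr)) return hr;
    multithread->SetMultithreadProtected(TRUE);

    UINT resetToken = 0;
    ComPtr<IMFDXGIDeviceManager> manager;
    hr = MFCreateDXGIDeviceManager(&resetToken, &manager);
    if (FAILED(hr)) return hr;

    hr = manager->ResetDevice(device.Get(), resetToken);
    if (FAILED(hr)) return hr;

    // The message parameter is the device manager, not the device itself.
    hr = transform->ProcessMessage(
        MFT_MESSAGE_SET_D3D_MANAGER,
        reinterpret_cast<ULONG_PTR>(manager.Get()));
    if (FAILED(hr)) return hr;

    *managerOut = manager.Detach();  // caller releases the manager
    return S_OK;
}
#endif
```

    The helper would be called once after creating the MFT and before streaming starts. Binding the manager alone is probably not enough to see GPU load; the samples fed to the MFT matter as well.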

    http://alax.info/blog/tag/directshow

    Tuesday, May 26, 2020 2:26 PM
  • Again, I want to thank you, Roman, for your help.

    I added what was needed, but it still does not run on the GPU. By the way, I was mistaken before: even in release mode it used only the CPU. Here is my code:

            IMFDXGIDeviceManager*           m_pDXGIManager;
            UINT resetToken;
            hr = MFCreateDXGIDeviceManager(&resetToken, &m_pDXGIManager);
         

            ID3D11Device * lDevice;
            D3D_FEATURE_LEVEL lFeatureLevel;
     
            D3D_FEATURE_LEVEL gFeatureLevels[] =
            {
                D3D_FEATURE_LEVEL_11_1,
                D3D_FEATURE_LEVEL_11_0,
                D3D_FEATURE_LEVEL_10_1,
                D3D_FEATURE_LEVEL_10_0,
                D3D_FEATURE_LEVEL_9_3,
                D3D_FEATURE_LEVEL_9_2,
                D3D_FEATURE_LEVEL_9_1
            };
            // hr is already declared above; redeclaring it here would not compile.
            UINT gNumFeatureLevels = ARRAYSIZE(gFeatureLevels);

                hr = D3D11CreateDevice(
                    nullptr,
                    D3D_DRIVER_TYPE_HARDWARE,
                    nullptr,
                    D3D11_CREATE_DEVICE_VIDEO_SUPPORT,//D3D11_CREATE_DEVICE_VIDEO_SUPPORT
                    gFeatureLevels,
                    gNumFeatureLevels,
                    D3D11_SDK_VERSION,
                    &lDevice,
                    &lFeatureLevel,
                    nullptr);
     
            hr = m_pDXGIManager->ResetDevice(lDevice, resetToken);
            // MFT_MESSAGE_SET_D3D_MANAGER expects the device manager, not the device.
            hr = pTransform->ProcessMessage(MFT_MESSAGE_SET_D3D_MANAGER, ULONG_PTR(m_pDXGIManager));


    I do not get an error hr at any step, including ProcessInput and ProcessOutput.
    I get the resized image converted to I420, but GPU load stays at 0 and CPU load is sky-high (I do this in a loop).
    Roman, do I need to add anything else besides the above to make it use the GPU?

    Thanks!



    • Edited by ben125452 Tuesday, May 26, 2020 8:51 PM
    Tuesday, May 26, 2020 8:49 PM
  • D3D device creation looks about right but you also need to enable multithreaded operation protection.

    It was in line 39 of sample code I referenced above, I don't see this in your posted snippet.

    Now, what "by GPU" means exactly: based on the conversation above, I don't think you implemented it right. "By GPU" means that you supply a D3D texture created on the device you passed in via the manager, and you receive back a texture from an MFT-managed pool holding the converted data in the newly requested format.

    I suspect that you don't provide proper input, hence the CPU fallback.


    http://alax.info/blog/tag/directshow

    Wednesday, May 27, 2020 5:43 AM
  • I just added this to the code:

            CComPtr<ID3D10Multithread> multi;
            hr = lDevice->QueryInterface(IID_PPV_ARGS(&multi));
            if (SUCCEEDED(hr) && multi)
                multi->SetMultithreadProtected(TRUE);
    (It is a little different from the code you gave, but I think it should do the same thing.)

    But it did not change anything.

    I will show you now how I set the input.
    I am feeding it with mpSource, which is an RGB32 byte array:

            hr = gfMFCreateMediaType(&pInputMediaType); ON_ERROR_MFT(30)
            hr = pInputMediaType->SetGUID(MF_MT_MAJOR_TYPE, MFMediaType_Video); ON_ERROR_MFT(40)
            hr = pInputMediaType->SetGUID(MF_MT_SUBTYPE, MFVideoFormat_RGB32); ON_ERROR_MFT(50)
            hr = pInputMediaType->SetUINT32(MF_MT_ALL_SAMPLES_INDEPENDENT, 1); ON_ERROR_MFT(60)
            hr = MFSetAttributeSize(pInputMediaType, MF_MT_FRAME_SIZE, oImageProcessorData.miSourceWidth, abs(oImageProcessorData.miSourceHeight)); ON_ERROR_MFT(70)
            //    pInputMediaType->SetUINT32(MF_MT_DEFAULT_STRIDE, -2880);
            hr = MFSetAttributeRatio(pInputMediaType, MF_MT_PIXEL_ASPECT_RATIO, 1, 1); ON_ERROR_MFT(80)
            hr = pInputMediaType->SetUINT32(MF_MT_INTERLACE_MODE, (UINT32)MFVideoInterlace_Progressive); ON_ERROR_MFT(90)
            hr = pInputMediaType->SetUINT32(MF_MT_FIXED_SIZE_SAMPLES, 1); ON_ERROR_MFT(100)
            hr = pInputMediaType->SetUINT32(MF_MT_SAMPLE_SIZE, iInImageSize); ON_ERROR_MFT(110)
            hr = mpTransform->SetInputType(0, pInputMediaType, 0); ON_ERROR_MFT(120)

            hr = MFCreateMemoryBuffer(iInImageSize, &mpBufferIn); ON_ERROR_MFT(275)
            hr = MFCreateSample(&mpIMFSampleIn); ON_ERROR_MFT(277)
            mpIMFSampleIn->SetSampleTime(0);


     bool MFTImageProcessor::Process2(byte* mpSource)//mpSource is rgb32 byte array
    {
        DWORD pcbMaxLength; DWORD pcbCurrentLength;
        HRESULT hr = S_OK; bool bOK = true;

        byte* pbBufferIn = 0; byte* pbBufferOut = 0;
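        hr = mpBufferIn->Lock(&pbBufferIn, &pcbMaxLength, &pcbCurrentLength); ON_ERROR_MFT(295)
        memcpy(pbBufferIn, mpSource, pcbMaxLength); // assumes mpSource holds at least pcbMaxLength (iInImageSize) bytes
        hr = mpBufferIn->Unlock(); ON_ERROR_MFT(297)
        hr = mpBufferIn->SetCurrentLength(pcbMaxLength); // a new memory buffer reports a current length of 0 until this is set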

        DWORD pdwStatus = 0;  
     
         hr = mpIMFSampleIn->RemoveAllBuffers();
         hr = mpIMFSampleIn->AddBuffer(mpBufferIn); ON_ERROR_MFT(320)
         hr = mpTransform->ProcessInput(0, mpIMFSampleIn, 0); ON_ERROR_MFT(330)
         hr = mpTransform->ProcessOutput(0, 1, &mpOutputSamples, &pdwStatus); ON_ERROR_MFT(340)
          

         hr = mpBufferOut->Lock(&pbBufferOut, &pcbMaxLength, &pcbCurrentLength); ON_ERROR_MFT(350)
         memcpy(mpByte, pbBufferOut, pcbCurrentLength);//mpByte is the output if the scaled image in i420
         hr = mpBufferOut->Unlock(); ON_ERROR_MFT(360)
     
      
    _END:
        return bOK;
        
    }

    Wednesday, May 27, 2020 1:53 PM
  • You are processing input and output as system memory buffers. There is not much sense in running the conversion in question on the GPU in this case. If you want to take advantage of the GPU, you are supposed to operate with D3D11 2D texture buffers instead.

    Otherwise, the MFT is already doing what you request it to do.
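
    As a rough illustration of what a texture-backed input could look like, here is a minimal sketch. It rests on several assumptions: CreateRgb32TextureSample is a hypothetical helper, the device is the same one registered with the DXGI device manager, the frame is top-down RGB32 without row padding, and the DXGI format and bind flags shown are guesses that a given driver's video processor may not accept.

```cpp
#ifdef _WIN32  // Direct3D 11 and Media Foundation are Windows-only
#include <windows.h>
#include <d3d11.h>
#include <mfapi.h>      // MFCreateDXGISurfaceBuffer, MFCreateSample
#include <mfobjects.h>  // IMFSample, IMFMediaBuffer, IMF2DBuffer
#include <wrl/client.h>

#pragma comment(lib, "mfplat.lib")

using Microsoft::WRL::ComPtr;

// Hypothetical helper: uploads one RGB32 frame into a D3D11 texture and
// wraps that texture in an IMFSample suitable for ProcessInput.
HRESULT CreateRgb32TextureSample(ID3D11Device* device, const BYTE* rgb32,
                                 UINT width, UINT height,
                                 IMFSample** sampleOut)
{
    if (!device || !rgb32 || !sampleOut) return E_POINTER;
    *sampleOut = nullptr;

    D3D11_TEXTURE2D_DESC desc = {};
    desc.Width = width;
    desc.Height = height;
    desc.MipLevels = 1;
    desc.ArraySize = 1;
    desc.Format = DXGI_FORMAT_B8G8R8X8_UNORM;  // assumed match for RGB32
    desc.SampleDesc.Count = 1;
    desc.Usage = D3D11_USAGE_DEFAULT;
    // Assumed bind flags; adjust if the driver rejects them.
    desc.BindFlags = D3D11_BIND_RENDER_TARGET | D3D11_BIND_SHADER_RESOURCE;

    D3D11_SUBRESOURCE_DATA init = {};
    init.pSysMem = rgb32;
    init.SysMemPitch = width * 4;  // tightly packed rows assumed

    // Creating the texture with initial data is itself a CPU-to-GPU copy.
    ComPtr<ID3D11Texture2D> texture;
    HRESULT hr = device->CreateTexture2D(&desc, &init, &texture);
    if (FAILED(hr)) return hr;

    ComPtr<IMFMediaBuffer> buffer;
    hr = MFCreateDXGISurfaceBuffer(__uuidof(ID3D11Texture2D), texture.Get(),
                                   0, FALSE, &buffer);
    if (FAILED(hr)) return hr;

    // A fresh surface buffer reports a current length of 0; set it.
    ComPtr<IMF2DBuffer> buffer2d;
    hr = buffer.As(&buffer2d);
    if (FAILED(hr)) return hr;
    DWORD length = 0;
    hr = buffer2d->GetContiguousLength(&length);
    if (FAILED(hr)) return hr;
    hr = buffer->SetCurrentLength(length);
    if (FAILED(hr)) return hr;

    ComPtr<IMFSample> sample;
    hr = MFCreateSample(&sample);
    if (FAILED(hr)) return hr;
    hr = sample->AddBuffer(buffer.Get());
    if (FAILED(hr)) return hr;

    *sampleOut = sample.Detach();  // caller releases the sample
    return S_OK;
}
#endif
```

    Note that when the source frame starts out in system memory, this upload is extra work per frame, so the texture path mainly pays off when the frames already live on the GPU.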


    http://alax.info/blog/tag/directshow



    Wednesday, May 27, 2020 2:13 PM
  • What I need from the MFT is to scale my image and convert it to I420.

    I can do that on the CPU, but I want to move this part to the GPU.

    In any case, I will need the result as a byte array, since my H.264 encoder requires a byte array in I420 format.

    So I have a non-scaled RGB byte array on the CPU, and I need a scaled I420 byte array on the CPU.
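
    To make the two byte arrays concrete, their sizes and the I420 plane offsets can be computed as in the sketch below. The helper names are hypothetical, and it assumes tightly packed buffers with no row padding, which may not hold for buffers returned by an MFT.

```cpp
#include <cstddef>
#include <cstdint>

// Byte layout of a tightly packed I420 frame: Y plane, then U, then V.
struct I420Layout {
    std::size_t ySize;       // bytes in the full-resolution Y plane
    std::size_t chromaSize;  // bytes in each half-width, half-height plane
    std::size_t uOffset;     // start of the U plane
    std::size_t vOffset;     // start of the V plane
    std::size_t totalSize;   // whole frame
};

// Hypothetical helper; odd dimensions round the chroma planes up.
inline I420Layout ComputeI420Layout(std::uint32_t width, std::uint32_t height)
{
    const std::size_t w = width;
    const std::size_t h = height;
    const std::size_t chromaWidth = (w + 1) / 2;
    const std::size_t chromaHeight = (h + 1) / 2;

    I420Layout layout;
    layout.ySize = w * h;
    layout.chromaSize = chromaWidth * chromaHeight;
    layout.uOffset = layout.ySize;
    layout.vOffset = layout.ySize + layout.chromaSize;
    layout.totalSize = layout.ySize + 2 * layout.chromaSize;
    return layout;
}

// Size of a tightly packed RGB32 frame (4 bytes per pixel).
inline std::size_t Rgb32FrameSize(std::uint32_t width, std::uint32_t height)
{
    return static_cast<std::size_t>(width) * height * 4;
}
```

    For example, a 1280x720 I420 frame takes 1,382,400 bytes, against 3,686,400 bytes for the same frame in RGB32.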

    I can do the scaling and color conversion on the CPU. Alternatively, I can use the CPU only to copy the byte array into an IMFMediaBuffer and copy the result back out afterwards, with the scaling and color conversion themselves done on the GPU. I think that might still be faster than scaling on the CPU, even with the copies to and from the IMFMediaBuffer, and at the very least it would free the CPU from the scaling work.

    But still, I gave the MFT what it wants, an IMFMediaBuffer. I also gave it the manager, so it should do the scaling on the GPU, no?

    What am I missing?




    • Edited by ben125452 Wednesday, May 27, 2020 2:28 PM
    Wednesday, May 27, 2020 2:26 PM
  • will still be maybe faster... since i think that copy the image byte to and from....

    Not necessarily because you are neglecting the expense of CPU-to-GPU and GPU-to-CPU transfers.
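
    To get a feel for that expense, the raw amount of data crossing the bus can be estimated with a little arithmetic. The sketch below only counts bytes; the helper names are hypothetical, the 1920x1080 to 1280x720 at 30 fps figures in the note are an assumed example rather than the poster's actual sizes, and real cost also depends on the driver, the bus and synchronization stalls.

```cpp
#include <cstdint>

// Bytes moved per frame: RGB32 upload (4 bytes per pixel) plus
// I420 download (1.5 bytes per pixel, even dimensions assumed).
inline std::uint64_t TransferBytesPerFrame(std::uint32_t srcWidth,
                                           std::uint32_t srcHeight,
                                           std::uint32_t dstWidth,
                                           std::uint32_t dstHeight)
{
    const std::uint64_t upload =
        static_cast<std::uint64_t>(srcWidth) * srcHeight * 4;
    const std::uint64_t download =
        static_cast<std::uint64_t>(dstWidth) * dstHeight * 3 / 2;
    return upload + download;
}

// Bytes moved per second at a given frame rate.
inline std::uint64_t TransferBytesPerSecond(std::uint32_t srcWidth,
                                            std::uint32_t srcHeight,
                                            std::uint32_t dstWidth,
                                            std::uint32_t dstHeight,
                                            std::uint32_t framesPerSecond)
{
    return TransferBytesPerFrame(srcWidth, srcHeight, dstWidth, dstHeight) *
           framesPerSecond;
}
```

    With the assumed example, each frame moves 8,294,400 bytes up and 1,382,400 bytes down, which is 290,304,000 bytes per second at 30 fps before any conversion work is done.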

    i gave MFT what it wants, IMFMediaBuffer

    No, you did not provide a suitable input texture.


    http://alax.info/blog/tag/directshow

    Wednesday, May 27, 2020 3:41 PM
  • OK, I am starting to understand.

    I have now given it the right input. It runs on the GPU, but yes, you were right: performance is not good, about 40% lower.

    I will work on this now.

    Thank you so much!

    Wednesday, May 27, 2020 3:58 PM