IMFSourceReader sample frame resolution

  • Question

  • Hello,

    I am working on an application that reads video frames from a video file, and in a few cases the sample resolution I get by querying the reader for MF_MT_FRAME_SIZE does not match the frames I actually read.  Before I start reading samples, I call a function that gathers information about the video (resolution, duration, etc.).  It first gets the media type of the file currently opened by the IMFSourceReader:

     

    // Get the media type from the stream.
    this->pReader->GetCurrentMediaType(
        (DWORD)MF_SOURCE_READER_FIRST_VIDEO_STREAM, &pType);

    I then get the frame resolution before reading any samples:

    MFGetAttributeSize(pType, MF_MT_FRAME_SIZE, &width, &height);

     

    The pixel resolution returned for one video I have is width=460 and height=348 but when the bitmap is displayed using that resolution there is scan-line overflow and after taking a closer look at the bitmap I found that the width was actually 464 pixels.  So if I then render the bitmap using the resolution 464x348 it renders just fine with no scanline overflow.  I get this behavior when I set the output to be RGB24 as well as RGB32.

    Has anyone else seen behavior like this?  Also, is there any way to query the resolution of an IMFSample interface directly or is the only way to get it how I described above querying the attribute MF_MT_FRAME_SIZE of the IMFSourceReader?

    I am fairly new to using Media Foundation so if I am doing something incorrectly that would be nice to know as well.

    Thank you,

    Andrew

     


    www.infinadyne.com
    Monday, October 18, 2010 11:47 PM

All replies

  • If you want to get the resolution of the source video stream, you should call this->pReader->GetNativeMediaType(). This gives you the media type for the original media(input stream/IMFMediaSource).

    If you have called pReader->SetCurrentMediaType() and provided a resolution other than the native resolution in the output media type, the output stream samples SHOULD have that resolution. If they don't, then the chain of transforms within the source reader was possibly forced, for some reason, to produce the samples that you have received.

    You'd have to provide more information about your source reader configuration. What is the native video media type and what output media type have you specified?

    Tuesday, October 19, 2010 5:49 AM
  • Nobby, thank you for the reply.

     

    Anyway, I have used GetNativeMediaType and it returns the same resolution as GetCurrentMediaType, the 460x348 is actually the native resolution (this is the resolution displayed by all the video editors I have run it through as well) of this particular video file.  I have also tried to use SetCurrentMediaType to manually set the output resolution to 460x348 but it still returns frames with the resolution of 464x348.  So as far as I can tell from the documentation I have done everything possible to both query the output resolution as well as manually set it and no matter what I always get back a different resolution.

     

    The file is an MP4 file. I get the current media type with MF_SOURCE_READER_FIRST_VIDEO_STREAM, and the subtype I select for this file is MFVideoFormat_RGB32.  As I mentioned before, once I start receiving samples the bitmaps are completely fine, except that the width is 4 pixels greater than I am told the samples should be, and 4 pixels greater than what appears to be the native resolution.  I also have another MP4 video file that returns frames that are 8 pixels wider than expected.  99% of my other test video files work just fine and as expected, so I hope I can get some help figuring out what is going on and why.

    Thank you.


    www.infinadyne.com
    Tuesday, October 19, 2010 4:50 PM
  • Is the MP4 video subtype RGB32 or are you setting RGB32 as the output sample subtype? I'm just trying to figure out what decoders and transforms the Source Reader is configuring for you and work it out from there.

    For example, if your MP4 file uses H.264 video data, there are only 4 video formats the decoder can output. One of them is YUY2. None of these output subtypes are RGB based. If you specify RGB32 as the output subtype, the Source Reader will try to configure transforms to convert from, say, YUY2 to RGB32. This is a bad example because I know from experience that the Source Reader won't insert a Colour Space Transform after the H.264 decoder. I don't know exactly why, but I have a feeling that it has to do with the decoder supporting more than one output subtype.

    What I'm getting at here is that your Source Reader is possibly configuring Colour Space Transforms or something similar and the issue might be there. It's also possible that RGB32 is a supported output subtype of the decoder required to decode your MP4 file and the issue could be there as well. To find out what video subtype your MP4 file has, get the Native Media Type and then get the MF_MT_SUBTYPE attribute, which is a GUID whose first 32 bits hold the FOURCC code for FOURCC-based formats. Match it against the table in http://msdn.microsoft.com/en-us/library/aa370819(v=VS.85).aspx.
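
    As a rough illustration of reading that FOURCC, the sketch below turns the first 32 bits of a subtype GUID (its Data1 field) into text. SubtypeData1ToString is a hypothetical helper, not a Media Foundation function, and it assumes you have already copied Data1 out of the GUID returned for MF_MT_SUBTYPE. The uncompressed RGB subtypes carry a small D3DFORMAT number there rather than four printable characters (22 for RGB32, as far as I know), so the helper falls back to hex in that case.

```cpp
#include <cassert>
#include <cctype>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper: render GUID::Data1 of a video subtype as text.
// FOURCC codes are stored little-endian, so the lowest byte is the first character.
std::string SubtypeData1ToString(std::uint32_t data1)
{
    std::string text;
    for (int i = 0; i < 4; ++i) {
        const unsigned char c =
            static_cast<unsigned char>((data1 >> (8 * i)) & 0xFF);
        if (!std::isprint(c)) {
            // Not a printable FOURCC (for example an RGB subtype): show hex instead.
            char buf[16];
            std::snprintf(buf, sizeof buf, "0x%08X",
                          static_cast<unsigned>(data1));
            return buf;
        }
        text.push_back(static_cast<char>(c));
    }
    return text;
}
```

    For instance, a Data1 of 0x34363248 reads as "H264" and 0x3231564E reads as "NV12".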

    As for your issue with different media files giving you different resolutions in the samples, have a look at the MF_MT_PIXEL_ASPECT_RATIO attribute for the native format of each file. Also have a look at MF_MT_MINIMUM_DISPLAY_APERTURE. It's possible that your media files might not have these attributes at all when loaded by the Source Reader.

    Tuesday, October 19, 2010 11:34 PM
  • The native subtype for the video is MFVideoFormat_H264, and I manually set the subtype to MFVideoFormat_RGB32 since that is the preferred format I wish to get the samples in.  Since it is H264, maybe the transform you mentioned is causing this behavior, but even if that is the case I wonder why querying the reader for the output resolution does not give me the true output resolution.

    Is there possibly a way to query a transform that may be used to see what the output resolution would be?  Right now I am just querying the reader's current media type for MF_MT_FRAME_SIZE, and for this particular video it says the width is 460, but the bitmap I receive in my samples has a width of 464.

    Another interesting fact is I have many other H264 videos that I am able to set the subtype to MFVideoFormat_RGB24 (which fails for the video I mentioned above and that is why I set it to MFVideoFormat_RGB32) and then the resolution I get back from MF_MT_FRAME_SIZE is always correct.

    Also, I don't think the aspect ratio has anything to do with the other, related problem I am having with a second video, where the MF_MT_FRAME_SIZE resolution does not match what I get in samples.  With that video the resolution from MF_MT_FRAME_SIZE matches the native resolution, but the sample bitmaps have a width that is 8 pixels greater than what MF_MT_FRAME_SIZE returns.  So for now I have the original video I mentioned, where the MF_MT_FRAME_SIZE (both queried and manually set) matches the native resolution yet the samples are 4 pixels wider (464 instead of 460), and another video where the samples are 8 pixels wider than the native resolution.

    Is there anything else I can query to check what the resolution of samples will be other than what I am currently doing which is calling MF_MT_FRAME_SIZE on the IMFSourceReader's GetCurrentMediaType attributes?  All that I need is to be able to reliably query the resolution of a video frame sample which I thought MF_MT_FRAME_SIZE would provide but as I have mentioned I have 2 videos that the resolution returned does not match the sample bitmap resolutions.


    www.infinadyne.com
    Wednesday, October 20, 2010 8:14 PM
  • It's interesting that you have been successful in creating a Source Reader which provides RGB samples at the output with an H.264 video source. When I've tried to do this in the past I got MF_E_INVALIDMEDIATYPE. I ended up having to choose one of the four supported output subtypes of the H.264 decoder as the output type for the Source Reader.

    The reason why I asked about the pixel aspect ratio and minimum video aperture is that it's possible that the decoder might try to preserve aspect ratios instead of skewing the image or padding a particular axis. It's possible that the YUV to RGB transform logic could be adding the pixels. In the case of YUY2, for example, the sampling is 4:2:2: each horizontal pair of pixels has two Y values but shares a single U and a single V value. In order to convert to RGB, the missing chroma values are interpolated by sampling neighbouring values. The degree of interpolation can be simple or quite complex depending on how accurate you want the values to be. The logic could include adding helper pixels to the end of each line to help calculate the missing chroma values there, and they're ending up in the output samples. This is a complete stab in the dark and I'm most likely wrong, but I can't really think of anything else.
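
    To make the YUY2 layout concrete, here is a minimal sketch of how one packed macropixel maps to two pixels. It assumes the standard YUY2 byte order (Y0 U Y1 V); the Yuv struct and both function names are made up for illustration and are not part of Media Foundation.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Illustrative type: one pixel's luma plus the chroma pair it uses.
struct Yuv {
    std::uint8_t y, u, v;
};

// One YUY2 macropixel is 4 bytes, Y0 U Y1 V, and describes two
// horizontally adjacent pixels that share the same U and V samples.
std::array<Yuv, 2> UnpackYuy2Macropixel(const std::uint8_t bytes[4])
{
    return {{ {bytes[0], bytes[1], bytes[3]},
              {bytes[2], bytes[1], bytes[3]} }};
}

// Bytes needed for one tightly packed YUY2 row: an odd width still
// occupies a whole macropixel at the end of the line.
constexpr std::size_t Yuy2RowBytes(std::size_t widthInPixels)
{
    return (widthInPixels + 1) / 2 * 4;
}
```

    With a width of 460 the tightly packed row is 920 bytes, so a longer stride on a YUY2 sample would point to padding rather than to the chroma layout.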

    You could try to prove it right or wrong by setting the output subtype to MFVideoFormat_YUY2 and checking the dimensions of the samples there. Failing that, you'd need a response from an MF guru who works specifically with these things. One thing I can tell you is that I can't find any documented way to get the chain of transforms the IMFSourceReader is using after you set the output media type. You can with Media Sessions, though, after you use the TopoLoader to resolve transforms.

    Wednesday, October 20, 2010 11:48 PM
  • One more attribute to check out besides what Nobby_ already mentioned: MF_MT_DEFAULT_STRIDE (warning: stride can be negative for bottom-up images). There is probably some padding in the frame: 460 is not a multiple of 8 (460/8=57.5) but 464 is (464/8=58).
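
    The rounding involved can be sketched as below. AlignUp is a hypothetical helper; the choice of 16 rests on the fact that H.264 codes pictures in 16x16 macroblocks, and on the assumption that the decoder hands out frames at that padded size.

```cpp
#include <cassert>
#include <cstdint>

// Round 'value' up to the next multiple of 'alignment' (alignment must be > 0).
constexpr std::uint32_t AlignUp(std::uint32_t value, std::uint32_t alignment)
{
    return (value + alignment - 1) / alignment * alignment;
}

// 460 is padded to 464 whether the alignment is 8 or 16 pixels.
static_assert(AlignUp(460, 8) == 464, "width padded to a multiple of 8");
static_assert(AlignUp(460, 16) == 464, "width padded to a multiple of 16");
// A height of 348 would become 352 under the same rule.
static_assert(AlignUp(348, 16) == 352, "height padded to a multiple of 16");
```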

    You can use MFTrace to see the media types of each component, so you get an idea what to look for in your code:
    http://blogs.msdn.com/b/mf/archive/2010/08/11/using-mftrace-to-trace-media-foundation.aspx
    http://blogs.msdn.com/b/mf/archive/2010/09/09/analyzing-media-foundation-traces.aspx

    That will also tell you which transforms are being used inside the Source Reader. The Source Reader does not really have a chain of transforms internally (at least not a full-blown topology). Instead, it just has a source, possibly a decoder, and possibly some color/interlacing conversion:
    http://msdn.microsoft.com/en-us/library/dd940436%28v=VS.85%29.aspx

    MFTrace generates quite a lot of traces, so analyzing them is not as simple as it should be. We have a blog post in preparation to try to improve that.

    Thursday, October 21, 2010 4:19 PM
  • I was actually already querying MF_MT_DEFAULT_STRIDE just to tell if the sample images were top-down or bottom-up and I just took a look at what the stride returned for the video with the native resolution of 460x348 and it is 1840.  Since I requested RGB32 samples a stride of 1840 appears correct since 1840/4=460.  So once again everything I know to check the resolution says it is 460 wide but the samples actually have 4 extra pixels.  These extra pixels may be padding as Matthieu mentioned and I definitely would like to find out why they are showing up and more importantly how to query if they are added.  From what I can see by looking at the bitmap it just appears the 460th pixel is repeated 4 more times for each scan-line which also surprised me since most padding I have seen usually just adds black pixels.
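
    If the extra pixels do turn out to be padding, one workaround is to copy only the visible area out of each frame, row by row, using the real distance between rows. The sketch below assumes a top-down buffer with a positive pitch and a visible area that starts at the top-left corner; CropToVisible and its parameters are hypothetical names. In real code the pitch should come from the buffer itself (IMF2DBuffer::Lock2D reports one) rather than from MF_MT_DEFAULT_STRIDE, which in this thread disagreed with the actual samples.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Copy the visible area out of a frame whose rows are 'srcPitchBytes' apart.
// Assumes a top-down image (positive pitch) with the visible area at offset (0,0).
std::vector<std::uint8_t> CropToVisible(const std::uint8_t* src,
                                        std::size_t srcPitchBytes,
                                        std::size_t visibleWidth,
                                        std::size_t visibleHeight,
                                        std::size_t bytesPerPixel)
{
    const std::size_t rowBytes = visibleWidth * bytesPerPixel;
    assert(rowBytes <= srcPitchBytes);  // visible row must fit inside a source row

    std::vector<std::uint8_t> out(rowBytes * visibleHeight);
    for (std::size_t row = 0; row < visibleHeight; ++row) {
        // Skip the padding at the end of each source row.
        std::memcpy(out.data() + row * rowBytes,
                    src + row * srcPitchBytes,
                    rowBytes);
    }
    return out;
}
```

    For the video discussed here, that would mean calling it with a pitch of 464 * 4 bytes, a visible width of 460, and 4 bytes per pixel for RGB32.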

    Anyway, I will see what I can find using MFTrace as well as setting the output format to MFVideoFormat_YUY2 and then seeing if the resolution is the expected 460x348 and if the samples are too.

    Thank you very much for the assistance so far, I am working on a commercial forensic video analysis application so this problem definitely cannot be ignored and I hope to resolve it soon.


    www.infinadyne.com
    Thursday, October 21, 2010 6:21 PM
  • I just tried setting the output format to MFVideoFormat_YUY2 and then queried the output resolution, which said the resolution is 460x358. I haven't had a chance to look at the actual samples yet to verify they are 460 wide, though.
    www.infinadyne.com
    Thursday, October 21, 2010 6:29 PM
  • I just finished trying out MFTrace and found the following line in the trace file, which verifies that the transform is indeed outputting frames at a resolution of 464x352 instead of the video's native resolution of 460x348:

    *********************************************************************************************************************

    1424,F8C 18:51:41.78448 CMFTransformDetours::SetOutputType @01417208 Succeeded MT: MF_MT_MAJOR_TYPE=MEDIATYPE_Video;MF_MT_SUBTYPE=MFVideoFormat_NV12;MF_MT_FRAME_SIZE=1992864825696 (464,352);MF_MT_FRAME_RATE=128849018881001 (30000,1001);MF_MT_GEOMETRIC_APERTURE=00 00 00 00 00 00 00 00 cc 01 00 00 5c 01 00 00 ;MF_MT_MINIMUM_DISPLAY_APERTURE=00 00 00 00 00 00 00 00 cc 01 00 00 5c 01 00 00 ;MF_MT_PIXEL_ASPECT_RATIO=4294967297 (1,1);MF_MT_INTERLACE_MODE=7;MF_MT_VIDEO_NOMINAL_RANGE=2;MF_MT_AVG_BITRATE=478311;MF_MT_DEFAULT_STRIDE=464;MF_MT_ALL_SAMPLES_INDEPENDENT=1;MF_MT_FIXED_SIZE_SAMPLES=1;MF_MT_SAMPLE_SIZE=244992

    *********************************************************************************************************************
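
    The large numbers in the trace are two 32-bit values packed into one 64-bit attribute, with the first value in the high half. The sketch below is a portable stand-in for what MFGetAttributeSize / Unpack2UINT32AsUINT64 do on Windows; Size2D, UnpackSize and Nv12SampleSize are illustrative names. It also checks that MF_MT_SAMPLE_SIZE agrees with a 464x352 NV12 frame, assuming the usual 12 bits per pixel and no extra row padding.

```cpp
#include <cassert>
#include <cstdint>

// Illustrative pair type for a packed size or ratio attribute.
struct Size2D {
    std::uint32_t width;
    std::uint32_t height;
};

// High 32 bits hold the first value (width), low 32 bits the second (height).
constexpr Size2D UnpackSize(std::uint64_t packed)
{
    return Size2D{ static_cast<std::uint32_t>(packed >> 32),
                   static_cast<std::uint32_t>(packed & 0xFFFFFFFFu) };
}

// NV12 stores a full-resolution Y plane followed by a half-resolution
// interleaved UV plane, i.e. 1.5 bytes per pixel.
constexpr std::uint64_t Nv12SampleSize(Size2D s)
{
    return static_cast<std::uint64_t>(s.width) * s.height * 3 / 2;
}
```

    Applied to the trace, 1992864825696 unpacks to (464, 352), 128849018881001 unpacks to (30000, 1001), and 464 * 352 * 1.5 gives the reported sample size of 244992 bytes, so every size in that media type describes the padded frame.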

    So, since I am manually setting the resolution to the native 460x348 through the reader but the samples are not coming out that way, is there a way to check what the frame size will be after the transform has converted from H264 to RGB32?  I don't mind that the frame size changes (although I would prefer the native frame size), but I definitely need some way to check what the frame size is after the transform is complete. Without the frame size I can neither display the frames correctly nor analyze them properly.

    Thank you for all the help so far and if there is any other information I can get using MFTrace that may be useful let me know and I will get it.


    www.infinadyne.com
    Friday, October 22, 2010 7:11 PM
  • Were you able to see the RGB media types in the traces? Otherwise, there is a helper function here to pretty-print media types from your code:
    http://msdn.microsoft.com/en-us/library/ee663602(v=VS.85).aspx

    In the media type you gave, the decoder is saying that although the frame is 464x352, only a 460x348 area is valid and should be displayed:

    MF_MT_FRAME_SIZE = 464x352

    MF_MT_MINIMUM_DISPLAY_APERTURE      00 00 00 00 00 00 00 00 cc 01 00 00 5c 01 00 00
        struct MFVideoArea
            struct MFOffset OffsetX
                WORD  fract    00 00
                short value    00 00
            struct MFOffset OffsetY
                WORD  fract    00 00
                short value    00 00
            SIZE Area
                LONG  cx       cc 01 00 00 = 0x1cc = 460
                LONG  cy       5c 01 00 00 = 0x15c = 348
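
    The same decoding can be done in code. The sketch below parses the 16-byte little-endian blob by hand so that it compiles anywhere; VideoArea, ReadLE and DecodeVideoArea are illustrative names that mirror the MFVideoArea layout shown above. On Windows, the blob could presumably be read straight into an MFVideoArea with IMFAttributes::GetBlob on MF_MT_MINIMUM_DISPLAY_APERTURE.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Illustrative mirror of MFVideoArea: two MFOffset values, then a SIZE.
struct VideoArea {
    std::uint16_t offsetXFract;
    std::int16_t  offsetX;
    std::uint16_t offsetYFract;
    std::int16_t  offsetY;
    std::int32_t  cx;
    std::int32_t  cy;
};

// Read an n-byte little-endian unsigned value (n <= 4).
static std::uint32_t ReadLE(const std::uint8_t* p, std::size_t n)
{
    assert(n <= 4);
    std::uint32_t v = 0;
    for (std::size_t i = 0; i < n; ++i) {
        v |= static_cast<std::uint32_t>(p[i]) << (8 * i);
    }
    return v;
}

// Decode the 16-byte aperture blob as it appears in the MFTrace output.
VideoArea DecodeVideoArea(const std::uint8_t blob[16])
{
    VideoArea a;
    a.offsetXFract = static_cast<std::uint16_t>(ReadLE(blob + 0, 2));
    a.offsetX      = static_cast<std::int16_t>(ReadLE(blob + 2, 2));
    a.offsetYFract = static_cast<std::uint16_t>(ReadLE(blob + 4, 2));
    a.offsetY      = static_cast<std::int16_t>(ReadLE(blob + 6, 2));
    a.cx           = static_cast<std::int32_t>(ReadLE(blob + 8, 4));
    a.cy           = static_cast<std::int32_t>(ReadLE(blob + 12, 4));
    return a;
}
```

    Feeding it the bytes from the trace yields an offset of (0, 0) and an area of 460x348, which is the region of each 464x352 frame that is meant to be shown.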

    What I am missing is how MF_MT_FRAME_SIZE ends up being 460x348 in the RGB media type. When you ask the Source Reader to output RGB32 via SetCurrentMediaType(), are you setting other attributes besides MF_MT_MAJOR_TYPE and MF_MT_SUBTYPE by any chance?

    Saturday, October 23, 2010 4:26 PM