H264 decoder problem: artifacts on decoded image

  • Question

  • Hello,

    I usually succeed at decoding AVI/H264 with Microsoft Media Foundation, but I have a problem with one movie, which exhibits artifacts when I decode it. VLC and Windows Media Player do not have this problem.

    See the attached image.

    I don't know what could be wrong in my use of the H264 decoder.

    - I tried decoding directly from the IMFSourceReader and through an MFT.

    - I tried under W7 and W10, with no difference.

    - I tried options such as MF_LOW_LATENCY under W10, but it does not change anything (as expected).

    - The movie has only I- and P-frames, no B-frames.

    - The only noticeable thing is that there are very few keyframes (8 out of 1500 images, because it is slow motion). Could it be an MSMF bug triggered by that large GOP size?

    Here is also a link to a minimal VS2019 project, with the movie (7MB), that generates one image file per decoded frame, so that it is easy to see the artifacts appear.

    https://chachatelier.fr/tmp/TestMFT.zip

    Do you have any idea what could be wrong?

    Wednesday, March 25, 2020 12:04 PM

All replies

  • Sorry, what is the artifact exactly?

    The Media Foundation decoder reports frame corruption. I thought the MF AVI demultiplexer could be the problem here, delivering data in a wrong (for example, truncated) form.


    http://alax.info/blog/tag/directshow

    Wednesday, March 25, 2020 10:58 PM
  • I gave the file a quick look and I think the problem, whatever it is, is this.
    The H.264 track format includes SPS and PPS NAL units, which is fine.
    However, the payload data has another PPS embedded in the frame data (marked with arrows in the dump below). This could confuse the decoder.

    Input Path: test.avi
    
    Video:
    MF_MT_FRAME_SIZE: 1280 x 888
    MF_MT_INTERLACE_MODE: MFVideoInterlace_MixedInterlaceOrProgressive
    MF_MT_FRAME_RATE: 10000000/400000 (25.000)
    
    Sample 0, CleanPoint:
    45 bytes: 00 00 00 01 06 05 25 0A CB 24 13 FA 30 33 01 A2 BD 9C 52 2B 3B 8D C1 58 56 49 44 20 30 78 30 30 31 38 20 5B 30 78 62 62 32 64 5D 00 80
    21 bytes: 00 00 00 01 67 64 00 20 AC 76 80 50 07 1F 97 01 01 E1 00 85 40
    9 bytes: 00 00 00 01 68 EE 06 8D CB <<-----------------------------
    59162 bytes: 00 00 00 01 65 B8 10 0B 39 C0 72 2F 78 2D 82 7E 54 F8 85 1C 4A 80 5A A1 53 03 C3 55 84 FC 28 39 8C 70 CD 29 9E DC 0C 0D 42 3B 2A AD FC 3B 0A 81 4C B4 52 C8 70 CB F6 85 EA A2 DF 49 BC A6 C8 0C...
    59934 bytes: 00 00 00 01 65 00 12 22 E0 40 2C E7 F7 6E F3 FF FC 25 29 7B 84 FB AB A2 B1 5A 39 23 6C 51 67 9D 98 2B EF 82 E7 92 95 6E 68 8F D6 41 4C 01 68 B9 5C E0 32 D3 42 23 4E 8D 13 E6 B4 41 8B 40 B7 EF...
    
    Sample 40 0000 (+ 40 0000):
    9 bytes: 00 00 00 01 68 5B 82 97 2C <<-----------------------------
    446 bytes: 00 00 00 01 41 D0 20 8D 9C E0 8B 9C 78 22 1D DF FF E9 DD 83 15 70 64 9E 5D 25 E7 BB EA 24 FF 45 A3 D1 8A C8 7D C8 89 E9 B0 2A AA F9 68 26 01 EC 5E D3 77 2B E6 A5 26 EF E2 F6 10 96 80 0C 20 65...
    569 bytes: 00 00 00 01 41 00 12 C3 40 82 36 73 80 42 D4 27 0B 04 D0 A9 55 A6 CD 9A F7 68 29 16 FE DB 75 38 71 4F 2E B1 95 A5 5D 53 4D FC 45 C4 D4 65 0A 76 6B 68 F3 13 C3 65 41 05 38 73 43 52 BB 6A 24 00...
    17 bytes: 00 00 00 01 41 00 08 98 D0 20 8D 9C E0 00 02 DA 80
    
    Sample 80 0000 (+ 40 0000):
    1252 bytes: 00 00 00 01 41 D0 41 0D 9C E0 6A A8 C6 E9 40 53 EE 23 FF F5 21 EB 76 74 59 60 65 2B FC 6A 19 08 3B FD 23 D1 36 90 24 A3 C9 31 75 19 A3 73 7A 95 88 B3 B5 46 CE 97 51 CF 6C 51 CD ED 27 BE A5 D5...
    1467 bytes: 00 00 00 01 41 00 12 C3 41 04 36 73 80 0C 23 EE C7 C4 2E AF 00 D1 FE F0 AE F0 CB 05 7D 14 DB 3B BC B9 D4 AB 77 46 76 D5 FA 66 73 D6 44 FB 45 B8 B9 B0 5A 39 17 49 B7 F9 1E 76 0A EF 59 89 FD 49...
    17 bytes: 00 00 00 01 41 00 08 98 D0 41 0D 9C E0 00 02 DA 80
    
    Sample 120 0000 (+ 40 0000):
    1232 bytes: 00 00 00 01 41 D0 61 8D 9C E0 69 CD 47 9B 0A 44 0A BB F9 FF 0E 77 4F D0 55 C4 D6 96 81 AF DB A6 FD 08 F3 BF F7 5C 87 04 D2 E5 72 14 63 46 45 B5 1B 8E A3 B5 E5 E9 F8 3C 64 04 69 09 D7 D8 97 4D...
    1697 bytes: 00 00 00 01 41 00 12 C3 41 86 36 73 80 44 C6 78 C2 BF 5F 32 84 55 B1 AC 3D 50 BE 17 04 89 F5 37 F9 F1 55 37 8B 90 3E F2 0A A3 0B 4D 0B 4D 93 63 95 BB EB DE 14 09 EA FF 1F C1 59 3E DF 6B E8 35...
    17 bytes: 00 00 00 01 41 00 08 98 D0 61 8D 9C E0 00 02 DA 80
    ...
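
    A listing like the one above can be reproduced offline. Below is a minimal sketch, assuming each sample buffer is in Annex B form (NAL units separated by 00 00 01 or 00 00 00 01 start codes), as the dump shows. The names split_annex_b and describe are made up for this illustration and are not Media Foundation APIs; the slice payloads in the example are truncated to the first bytes printed in the dump.

```python
# Names for the NAL unit types that appear in the dump.
NAL_TYPE_NAMES = {1: "non-IDR slice", 5: "IDR slice", 6: "SEI", 7: "SPS", 8: "PPS"}


def split_annex_b(data):
    """Split an Annex B buffer into NAL units, with start codes removed."""
    # Every start code ends with 00 00 01 (the 4-byte form adds a leading 00).
    starts = []
    i = data.find(b"\x00\x00\x01")
    while i != -1:
        starts.append(i + 3)
        i = data.find(b"\x00\x00\x01", i + 3)
    nals = []
    for n, begin in enumerate(starts):
        end = starts[n + 1] - 3 if n + 1 < len(starts) else len(data)
        # A NAL unit never ends with 0x00, so trailing zeros belong to the
        # next start code (or are padding) and can be dropped.
        nals.append(data[begin:end].rstrip(b"\x00"))
    return nals


def describe(nal):
    """Return (nal_ref_idc, nal_unit_type, readable name) from the NAL header."""
    ref_idc = (nal[0] >> 5) & 0x3
    nal_type = nal[0] & 0x1F
    return ref_idc, nal_type, NAL_TYPE_NAMES.get(nal_type, "other")


# Second sample of the dump: one PPS followed by three slices (payloads truncated).
sample = bytes.fromhex(
    "00 00 00 01 68 5B 82 97 2C"
    "00 00 00 01 41 D0 20 8D"
    "00 00 00 01 41 00 12 C3"
    "00 00 00 01 41 00 08 98 D0 20 8D 9C E0 00 02 DA 80"
)
for nal in split_annex_b(sample):
    print(len(nal), describe(nal))
```

    The search is deliberately naive; a production parser would also have to cope with buffers that end in the middle of a start code.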

    http://alax.info/blog/tag/directshow

    Wednesday, March 25, 2020 11:10 PM
  • Just look at the wrong pixels in the highlighted red area of the screen capture attached to the initial post. Those pixels are like dirty pixels remaining from previous images.

    Wednesday, March 25, 2020 11:11 PM
  • OK, let me go one step further and decode the PPS data (using this bot)

    00 00 00 01 68 EE 06 8D CB
    
    Detail:
    
    !! Found NAL at offset 4 (0x0004), size 5 (0x0005) 
    0.8: forbidden_zero_bit: 0 
    0.7: nal->nal_ref_idc: 3 
    0.5: nal->nal_unit_type: 8 
    1.8: pps->pic_parameter_set_id: 0 
    1.7: pps->seq_parameter_set_id: 0 
    1.6: pps->entropy_coding_mode_flag: 1 
    1.5: pps->pic_order_present_flag: 0 
    1.4: pps->num_slice_groups_minus1: 0 
    1.3: pps->num_ref_idx_l0_active_minus1: 0 
    1.2: pps->num_ref_idx_l1_active_minus1: 0 
    1.1: pps->weighted_pred_flag: 0 
    2.8: pps->weighted_bipred_idc: 0 
    2.6: pps->pic_init_qp_minus26: -6 
    3.7: pps->pic_init_qs_minus26: -6 
    4.8: pps->chroma_qp_index_offset: 0 
    4.7: pps->deblocking_filter_control_present_flag: 1 
    4.6: pps->constrained_intra_pred_flag: 0 
    4.5: pps->redundant_pic_cnt_present_flag: 0 
    4.4: pps->transform_8x8_mode_flag: 1 
    4.3: pps->pic_scaling_matrix_present_flag: 0 
    4.2: pps->second_chroma_qp_index_offset: 0 
    4.1: rbsp_stop_one_bit: 1

    and

    00 00 00 01 68 5B 82 97 2C
    
    Detail:
    
    !! Found NAL at offset 4 (0x0004), size 5 (0x0005) 
    0.8: forbidden_zero_bit: 0 
    0.7: nal->nal_ref_idc: 3 
    0.5: nal->nal_unit_type: 8 
    1.8: pps->pic_parameter_set_id: 1 
    1.5: pps->seq_parameter_set_id: 0 
    1.4: pps->entropy_coding_mode_flag: 1 
    1.3: pps->pic_order_present_flag: 0 
    1.2: pps->num_slice_groups_minus1: 0 
    1.1: pps->num_ref_idx_l0_active_minus1: 0 
    2.8: pps->num_ref_idx_l1_active_minus1: 0 
    2.7: pps->weighted_pred_flag: 0 
    2.6: pps->weighted_bipred_idc: 0 
    2.4: pps->pic_init_qp_minus26: -2 
    3.7: pps->pic_init_qs_minus26: -2 
    3.2: pps->chroma_qp_index_offset: 0 
    3.1: pps->deblocking_filter_control_present_flag: 1 
    4.8: pps->constrained_intra_pred_flag: 0 
    4.7: pps->redundant_pic_cnt_present_flag: 0 
    4.6: pps->transform_8x8_mode_flag: 1 
    4.5: pps->pic_scaling_matrix_present_flag: 0 
    4.4: pps->second_chroma_qp_index_offset: 0 
    4.3: rbsp_stop_one_bit: 1 
    4.2: rbsp_alignment_zero_bit: 0 
    4.1: rbsp_alignment_zero_bit: 0

    There seem to be two PPS in flight, with pic_parameter_set_id 0 and 1 respectively. This makes the file unusual enough to expose issues of this sort.

    You can go ahead and see which PPS is being referenced by VCL NAL units. There are probably more anomalies to find.
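
    To check which PPS each VCL NAL unit references, it is enough to read the first three Exp-Golomb fields of the slice header (first_mb_in_slice, slice_type, pic_parameter_set_id). The following is a minimal sketch; BitReader, unescape and slice_pps_id are names invented for this illustration, and the inputs are the truncated slice bytes from the dump earlier in the thread.

```python
def unescape(payload):
    """Drop emulation prevention bytes (the 03 in every 00 00 03 sequence)."""
    out = bytearray()
    zeros = 0
    for b in payload:
        if zeros >= 2 and b == 3:
            zeros = 0
            continue
        out.append(b)
        zeros = zeros + 1 if b == 0 else 0
    return bytes(out)


class BitReader:
    """MSB-first bit reader."""

    def __init__(self, data):
        self.data = data
        self.pos = 0  # current position, in bits

    def bit(self):
        byte = self.data[self.pos >> 3]
        value = (byte >> (7 - (self.pos & 7))) & 1
        self.pos += 1
        return value

    def ue(self):
        """Unsigned Exp-Golomb code, ue(v)."""
        zeros = 0
        while self.bit() == 0:
            zeros += 1
        suffix = 0
        for _ in range(zeros):
            suffix = (suffix << 1) | self.bit()
        return (1 << zeros) - 1 + suffix


def slice_pps_id(nal):
    """Return (first_mb_in_slice, slice_type, pic_parameter_set_id) of a slice NAL."""
    if nal[0] & 0x1F not in (1, 5):
        raise ValueError("not a coded slice NAL unit")
    r = BitReader(unescape(nal[1:]))
    return r.ue(), r.ue(), r.ue()


# First bytes of slices taken from the dump (start codes removed).
idr_slice = bytes.fromhex("65 B8 10 0B")
p_slice = bytes.fromhex("41 D0 20 8D")
p_last_slice = bytes.fromhex("41 00 08 98 D0 20 8D")
for nal in (idr_slice, p_slice, p_last_slice):
    print(slice_pps_id(nal))
```

    On these truncated bytes, the IDR slice comes out as referencing pic_parameter_set_id 0 and the two P slices as referencing pic_parameter_set_id 1, which can be cross-checked against the two PPS decoded above.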


    http://alax.info/blog/tag/directshow



    Wednesday, March 25, 2020 11:15 PM
  • But having multiple SPS/PPS is legal, and if the encoder has decided to put one SPS/PPS per I-frame, this is not a problem (that's even better for streaming).

    In this movie, there is indeed an additional PPS in the frame after each I-frame. That's certainly useless here, but legal, since the PPS id is distinct.

    The decoder should handle that.

    I have added the attribute MFT_SUPPORT_DYNAMIC_FORMAT_CHANGE, but it has no visible effect here.

    Moreover, what is strange is that the artefacts seem to get worse the further we get from the previous I-frame. When the next I-frame is reached, everything is clean again. If the extra PPS were the problem, it would certainly have a visible effect in the very first frames after each I-frame.

    [edit]

    I removed the additional PPS from the file using a hex editor, and the decoded output is the same (still with "dirty" pixels). This suggests that the extra PPS is not the problem.


    Thursday, March 26, 2020 7:01 AM
  • An extra PPS is not a problem and is legal in terms of stream decoding. However, the demultiplexer and decoder have to be prepared to receive parameter set updates with the frame data. I doubt that it is compliant with, for example, the MP4 spec to have a parameter set NAL outside of the track description (excluding repeated PS NALUs). Even if it is legal, implementations in the wild are known to have issues with this.

    If you just need ideas on how to tackle all this, I would suggest lifting that PPS 1 into the media type and decoding that way, or converting the data into H.264 ES and processing the stream. It might also happen that decoding behavior differs depending on whether you use the HW-accelerated decoder or the software version.

    We assume that the H.264 data per se is valid. However, the decoder does report corruption, so it is pretty clear that some pipeline component is not happy with this layout of data.


    http://alax.info/blog/tag/directshow


    Thursday, March 26, 2020 8:43 AM
  • How do you get the decoder corruption report?

    I used MFTrace but found nothing relevant.

    Thursday, March 26, 2020 9:07 AM
  • Thursday, March 26, 2020 9:34 AM
  • So at this point, to summarize:

    - MFSampleExtension_FrameCorruption is reported when decoding samples (from raw to YUV; if the source reader's output media type has not been changed, for instance to query timestamps only, no error is raised)

    - it seems to come from the additional PPS that comes after an I-frame/SPS/PPS and defines a QP parameter different from the I-frame's

    - it is valid, but the decoder does not handle it

    - unless I perform extra work to filter the input, I will get corrupted frames

    And at this point, it can be attributed to a decoder bug or incomplete standard support.

    Right?


    What do you mean by "lifting that PPS 1 into media type and decode that way"?


    Thursday, March 26, 2020 10:57 AM
  • Generally right, but I am not 100% sure whether it is (1) a decoder bug, (2) a limitation due to an incomplete implementation, or (3) a limitation by design.

    If the question is to clarify this a bit further, I think it's possible to do the following.

    Currently, the first corruption is on the fourth frame of the video - pretty early.

    If you take all H.264 NAL units up to this point and put them into a raw H.264 bitstream, then use the same MFT, set it up to take H.264 ES input, and feed it the bitstream, you would see either the same corruption or valid frame output.

    If corruption persists, I would say it's (1) or (2). Maybe the new PPS invalidates the previous one, and then something references the first one and hits the problem.

    If corruption is gone, I would say it is (3): the decoder is okay and is just not prepared to get an in-band PPS. The NAL is expected to be part of the media type and/or the caller is responsible for a dynamic media type change.


    http://alax.info/blog/tag/directshow

    Thursday, March 26, 2020 12:53 PM
  • >If you take all H.264 NAL units up to this point and put them into a raw H.264 bitstream, then use the same MFT, set it up to take H.264 ES input, and feed it the bitstream, you would see either the same corruption or valid frame output.

    I will try that. It will take me some time since I have never used raw H.264 input and I am not sure how to set up the source reader for H264_ES. But at least I have already written some code to extract and analyze the NALUs myself (and it works, even my Exp-Golomb decoding of the SPS/PPS).

    >The NAL is expected to be part of the media type and/or the caller is responsible for a dynamic media type change.

    I could understand that, but here only the quantization parameter is changing, so it is not like a media type change. It should only update some "minor" decoder parameters, and should be "easy".
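
    To make "only the quantization parameter is changing" concrete, here is a minimal sketch that decodes the leading PPS fields of the two PPS NAL units quoted earlier in the thread and lists the fields that differ. The names BitReader, parse_pps and diff are invented for this illustration; the sketch assumes no slice groups and no emulation prevention bytes (neither occurs in these two NAL units) and stops before the optional trailing fields.

```python
class BitReader:
    """MSB-first bit reader with Exp-Golomb helpers."""

    def __init__(self, data):
        self.data = data
        self.pos = 0  # current position, in bits

    def u(self, n):
        """Read n bits as an unsigned integer."""
        value = 0
        for _ in range(n):
            byte = self.data[self.pos >> 3]
            value = (value << 1) | ((byte >> (7 - (self.pos & 7))) & 1)
            self.pos += 1
        return value

    def ue(self):
        """Unsigned Exp-Golomb code, ue(v)."""
        zeros = 0
        while self.u(1) == 0:
            zeros += 1
        return (1 << zeros) - 1 + self.u(zeros)

    def se(self):
        """Signed Exp-Golomb code, se(v): 1, 2, 3, 4... map to +1, -1, +2, -2..."""
        k = self.ue()
        return (k + 1) // 2 if k & 1 else -(k // 2)


def parse_pps(nal):
    """Decode the PPS fields up to redundant_pic_cnt_present_flag."""
    if nal[0] & 0x1F != 8:
        raise ValueError("not a PPS NAL unit")
    r = BitReader(nal[1:])
    pps = {}
    pps["pic_parameter_set_id"] = r.ue()
    pps["seq_parameter_set_id"] = r.ue()
    pps["entropy_coding_mode_flag"] = r.u(1)
    pps["pic_order_present_flag"] = r.u(1)
    pps["num_slice_groups_minus1"] = r.ue()
    if pps["num_slice_groups_minus1"] > 0:
        raise NotImplementedError("slice groups are not handled in this sketch")
    pps["num_ref_idx_l0_active_minus1"] = r.ue()
    pps["num_ref_idx_l1_active_minus1"] = r.ue()
    pps["weighted_pred_flag"] = r.u(1)
    pps["weighted_bipred_idc"] = r.u(2)
    pps["pic_init_qp_minus26"] = r.se()
    pps["pic_init_qs_minus26"] = r.se()
    pps["chroma_qp_index_offset"] = r.se()
    pps["deblocking_filter_control_present_flag"] = r.u(1)
    pps["constrained_intra_pred_flag"] = r.u(1)
    pps["redundant_pic_cnt_present_flag"] = r.u(1)
    return pps


def diff(a, b):
    """Fields whose values differ, as {name: (value_in_a, value_in_b)}."""
    return {k: (a[k], b[k]) for k in a if a[k] != b[k]}


pps0 = parse_pps(bytes.fromhex("68 EE 06 8D CB"))
pps1 = parse_pps(bytes.fromhex("68 5B 82 97 2C"))
print(diff(pps0, pps1))
```

    On these two NAL units the only differing fields are pic_parameter_set_id (0 vs 1), pic_init_qp_minus26 (-6 vs -2) and pic_init_qs_minus26 (-6 vs -2), in agreement with the decoded listings above.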



    Thursday, March 26, 2020 1:50 PM
  • It looks normal from the standpoint of the bitstream and, yes, it is also perfectly legal to have multiple PPS. But internally the APIs do differentiate between NALUs, split them from one another, etc. If it so happens that (a) the API has a limitation of one PPS at a time or (b) parameter set NALs are supposed to be attached to the media type only, the layout of your stream would conflict with the API's expectations.

    In the case of H.264 ES (used pretty rarely out there, but functional) it is sufficient to have a clear media type with all NALUs transferred in the payload buffers. If this works with 2 PPS for you, you know that the decoder is generally fine and that it is the preparation step that is letting you down and is not able to process PPS NALUs correctly.

    I would just concatenate the NALUs from the media type and the buffers from the first 50 frames, put them into a single IMFMediaBuffer/IMFSample, and send that to the MFT via ProcessInput. From there I would check how many frames I get back and whether they still come with corruption reported. This way it is fairly easy to test things out.
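
    The concatenation step itself can be prepared offline. Below is a minimal sketch, assuming the parameter set NAL units taken from the media type and the sample buffers are already available as byte strings; build_elementary_stream and with_start_code are names invented for this illustration, and wrapping the result into an IMFMediaBuffer/IMFSample for ProcessInput is not shown.

```python
START_CODE = b"\x00\x00\x00\x01"


def with_start_code(nal):
    """Prefix a NAL unit with a 4-byte start code unless it already has one."""
    if nal.startswith(b"\x00\x00\x01") or nal.startswith(START_CODE):
        return nal
    return START_CODE + nal


def build_elementary_stream(parameter_sets, samples, max_samples=50):
    """Concatenate the parameter sets and the first max_samples sample buffers.

    Assumption: sample buffers are already in Annex B form (start codes
    included), as in the dump earlier in the thread.
    """
    out = bytearray()
    for nal in parameter_sets:
        out += with_start_code(nal)
    for sample in samples[:max_samples]:
        out += sample
    return bytes(out)


# SPS and PPS 0 as printed in the dump, without start codes.
sps = bytes.fromhex("67 64 00 20 AC 76 80 50 07 1F 97 01 01 E1 00 85 40")
pps0 = bytes.fromhex("68 EE 06 8D CB")
# One sample buffer: the in-band PPS 1 plus a slice truncated to a few bytes.
sample1 = bytes.fromhex("00 00 00 01 68 5B 82 97 2C 00 00 00 01 41 D0 20 8D")

es = build_elementary_stream([sps, pps0], [sample1])
print(len(es), es[:8].hex())
```

    Writing the result to a file also gives a raw stream that can be tried in other decoders for comparison.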


    http://alax.info/blog/tag/directshow


    Thursday, March 26, 2020 2:07 PM
  • I am having a really hard time setting up a data flow that sends H.264 ES to an IMFTransform.

    I have problems:

    - setting up a source reader from a file containing raw NALUs (I can't create it from the byte stream; I always get errors, even with "video/h264" properly set as the byte stream MIME type)

    - making an IMFTransform accept MFVideoFormat_H264_ES as MF_MT_SUBTYPE (GetInputStatus() fails)

    - getting rid of the IMFTransform "needs more input" status, which persists until the second I-frame is reached and seems to drop every previous frame (the first frame out of ProcessOutput() does not match the first timestamp pushed into ProcessInput())

    Do you have any sample code that I could modify?

    Thursday, March 26, 2020 5:44 PM
  • Thanks. I don't have W10 for now, so MFVideoFormat_H264_ES seems unavailable (after porting the code from WinRT to basic MSMF code).

    Anyway, the movie's producer told me he used the Xvid encoder. Is it possible to force the source reader's transform to use the Xvid codec?


    Tuesday, March 31, 2020 2:53 PM
    • In Windows 10, the extracted raw H.264 still exhibits decoder issues (I included the output file in the repo above)
    • I don't think Xvid is available through the MF interface in the first place
    • There is no direct way to instruct the MF Source Reader to use a specific decoder; instead you can read the original compressed format and then do the decoding separately (whether it's easy or not)

    http://alax.info/blog/tag/directshow

    • Marked as answer by C.h.a.c.h.a_ Tuesday, March 31, 2020 3:44 PM
    Tuesday, March 31, 2020 3:00 PM
  • Ok.

    If I have time I will try to compile the Xvid sample code and try to decode the movie with it.
    But at least I have plenty of answers.

    Thanks for your help.

    Tuesday, March 31, 2020 3:44 PM