Scroll forward and backwards through H264 video

  • Question

  • Hi,

    I'm trying to develop an application that will allow me to read .mts video files (H264 MPEG interlaced). I want to be able to display still images of individual frames and to scroll backwards and forwards through the video (frame by frame) using a slider control. I don't need audio and I don't need to be able to "play" the video - so I don't care about the original frame rate and don't need a clock.

    At present I've developed a DirectShow application that uses:

    - Haali Media Splitter

    - Microsoft MPEG2 Video Decoder

    - Microsoft Video Mixing Renderer 9

    It plays OK, can pause and step forward frame by frame, but I've not been able to get it to step backwards in a reliable fashion. It can seek backwards, but I've been unable to reliably get it to display the previous frame. Sometimes it refreshes the display, sometimes it doesn't. I assume this has something to do with how the renderer buffers frames and flushes them when a seek is performed. I realize that these video formats aren't designed to support efficient playback in reverse - but it should be possible, albeit inefficient (e.g. by caching previous frames and passing them back to the renderer to redisplay). I've tried creating my own transform filter in order to control the flow of frames but haven't had any luck achieving what I desire.

    I'm frustrated by the lack of control I have over exactly what frame is displayed. I'd ideally prefer to avoid the DirectShow framework altogether and to control everything myself. However, I'd need components to parse/split the H264 files, decode the MPEG, de-interlace the samples and paint the YUV images to a display window. I've not been able to find such individual components outside of the DirectShow framework.

    Does anyone have any suggestions?

    Cheers, Wayne.

    Thursday, January 26, 2012 4:08 AM

Answers

  • An MTS (or M2TS) file is an MPEG-2 Transport Stream (TS), which is defined in ISO/IEC 13818-1 (identical to ITU-T H.222.0, the ITU-T standards can be downloaded for free).

    Windows includes a TS splitter (the MPEG-2 Demux), which however does not work with files. It requires an upstream push source instead of an async reader and such a push source can be written in less than a day using CSource+CSourceStream in the BaseClasses. However, the filter is designed to work with live broadcasts (which is what TS was designed for in the first place) so I don't know whether it can seek properly.

    The Elecard MPEG Demultiplexer, included in the Elecard SDK, is supposed to be able to seek precisely and even use a pre-built index if available (but I have personally never used it).

    There are open source implementations of MPEG-2 TS parsers in ffmpeg and GStreamer (both LGPL'ed I believe).

    To implement a short-term buffering filter you cannot ask the video renderer's allocator to provide the buffers for you: its media samples are hardware-accelerated surfaces allocated in video memory, and it can only allocate so many, so it allocates only as few of them as necessary for presentation.

    You need to implement a full transform filter that buffers the samples itself by copying their payloads to an internal circular list, before copying the data from the input to the output. The insertion of such a filter will prevent the decoder from taking advantage of any hardware-acceleration offered by the video card.

    You can implement the circular list using CMemAllocator+CMediaSample for the internal allocation (not for the pins), the list using CGenericList and the filter using CTransformFilter+CSourceSeeking.

    Since even a short-term buffer would consume plenty of memory, a better idea would be to store the buffer on disk. You can do that directly or you can implement a special allocator using CBaseAllocator that uses memory-mapped files to back the samples with buffers on disk.
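To illustrate the disk-backed variant in a platform-neutral way, here is a sketch that spills frame payloads to a scratch file and indexes them by timestamp (the DiskFrameStore class is my invention; a DirectShow version would instead wrap a CBaseAllocator whose samples point into a memory-mapped view of the file):

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <fstream>
#include <map>
#include <string>
#include <vector>

// Append frame payloads to a scratch file; keep only a small in-memory
// index mapping a timestamp to (file offset, size).
class DiskFrameStore {
    std::fstream file_;
    std::map<long long, std::pair<std::streamoff, size_t>> index_;
    std::streamoff end_ = 0;
public:
    explicit DiskFrameStore(const std::string& path) {
        // in|out|trunc creates the file and allows reading back.
        file_.open(path, std::ios::in | std::ios::out |
                         std::ios::binary | std::ios::trunc);
    }
    void Put(long long pts, const std::vector<uint8_t>& data) {
        file_.seekp(end_);
        file_.write(reinterpret_cast<const char*>(data.data()),
                    static_cast<std::streamsize>(data.size()));
        index_[pts] = {end_, data.size()};
        end_ += static_cast<std::streamoff>(data.size());
    }
    bool Get(long long pts, std::vector<uint8_t>* out) {
        auto it = index_.find(pts);
        if (it == index_.end()) return false;
        out->resize(it->second.second);
        file_.seekg(it->second.first);
        file_.read(reinterpret_cast<char*>(out->data()),
                   static_cast<std::streamsize>(out->size()));
        return true;
    }
};
```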

    Writing your own renderer is not worth the huge effort. After all, all the renderer does during a seek is pass the request upstream and flush its buffer when asked to from upstream, so it really wouldn't make a difference. If you want to avoid the stop/run cycle, you can ask the splitter (or the renderer) to seek instead of relying on the FGM to pass the request along: whether it will work depends on the filters, but it usually does. The flushing and prerolling will still be there since they are necessary.


    <http://www.riseoftheants.com/mmx/faq.htm>
    • Marked as answer by WayneKelly Tuesday, January 31, 2012 10:29 PM
    Tuesday, January 31, 2012 10:02 PM

All replies

  • The problem here is not the buffering, since the first thing that happens when you issue a seek command is a graph-wide flush of all the buffers.

    The seek is actually performed by the splitter, which seeks to the nearest preceding keyframe and restarts streaming from there, marking all the following frames as preroll until the right one. The decoder is supposed to decode the preroll frames as quickly as possible and discard them, restarting the delivery of decoded frames at the right point.

    But there are 2 problems in this scenario:

    1. M2TS files don't include an index of the keyframes, so the splitter has to search for one. The M2TS format does not allow scanning the file backwards, and a thorough forward scan is very time-consuming, so the splitter guesses at the position of the keyframe by looking at the timestamps and bitrate and performs quick localized searches until it finds a candidate. However, the candidate may just be close enough and not exactly the best choice.

    2. the frames in an H.264 video stream that uses anything but I/P sequences (and H.264 usually does use other reference frames) are not in display order, so the concept of a "previous frame" is not well defined

    Plus a few minor issues, like the possibility of the timestamps in the M2TS being discontinuous, or the REFERENCE_TIME not corresponding to an exact timestamp because of rounding errors.

    Dedicated M2TS editors have splitters that build an index of all the frames, which takes time when opening the file, but allows fast frame-precise seeking afterwards, since then you know exactly where each frame is and in what order.
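The indexing approach can be sketched as follows, with a display-order table built on top of a one-pass scan (the IndexEntry/FrameIndex names and the record layout are my illustration, not any particular editor's format):

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical per-frame record as a one-pass scan would produce it:
// byte offset and pts in *decode* order, plus a keyframe flag.
struct IndexEntry { long long offset; long long pts; bool keyframe; };

struct FrameIndex {
    std::vector<IndexEntry> decodeOrder;  // as scanned from the file
    std::vector<size_t> displayOrder;     // decode indices sorted by pts

    explicit FrameIndex(std::vector<IndexEntry> scanned)
        : decodeOrder(std::move(scanned)) {
        for (size_t i = 0; i < decodeOrder.size(); ++i)
            displayOrder.push_back(i);
        std::sort(displayOrder.begin(), displayOrder.end(),
                  [&](size_t a, size_t b) {
                      return decodeOrder[a].pts < decodeOrder[b].pts;
                  });
    }

    // Decode-order index to start decoding from when the user asks for
    // display frame n: the nearest keyframe at or before that frame.
    size_t SeekStart(size_t n) const {
        size_t target = displayOrder[n];
        size_t key = 0;
        for (size_t i = 0; i <= target; ++i)
            if (decodeOrder[i].keyframe) key = i;
        return key;
    }
};
```

Once the index exists, "previous frame" is well defined (it is simply entry n-1 of the display-order table), which is exactly why these editors accept the up-front scanning cost.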


    <http://www.riseoftheants.com/mmx/faq.htm>
    Monday, January 30, 2012 9:46 PM
  • Thanks Alessandro for your very helpful reply.

    So, if the problem is with the splitter - do you have any suggestions as to how I might solve this problem?

    To me there appear to be two alternatives:

    1) Replace the Haali Media Splitter with another more powerful splitter. Unfortunately I've not been able to find any other software components that can parse MTS files and I've not been able to find a specification for the format that would allow me to attempt writing my own.

    2) Circumvent the splitter by introducing a custom downstream filter or renderer. My basic idea is to catch the frames as they are generated while the video is being played forward for the first time (at maximum speed) and to cache those frames as they go past, so that I can then manage seeking myself rather than asking the splitter. Does that sound like a reasonable approach? Obviously these past frames will take up a lot of memory (especially if they are uncompressed), but even if I could cache just the last 10 seconds of footage, that would be sufficient for my needs. Does anyone have any ideas as to the easiest way to implement such a strategy? I tried implementing a transform filter between the decoder and renderer, but when negotiating the allocator I was unable to get it to allocate a sufficiently large number of buffers for my purposes. It ignored my request and just allocated about 5 buffers. Would I be better off implementing a custom allocator/presenter or perhaps my own simple renderer? If I had my own simple renderer then I could bypass the entire DirectShow play/pause/step/seek infrastructure and control all of those things myself locally within the renderer.

    Cheers, Wayne.

    Monday, January 30, 2012 11:10 PM
  • In my opinion:

    1) This is the better alternative.

    2) I guess you could make a transform filter with its own buffering strategy, catch the seek calls, and - if the filter still holds the frame - feed it downstream without ever asking the splitter. Negotiated buffers are meant for running ahead, so I don't see how you could make use of them backwards. But really, holding *several* seconds in RAM, unless you need all the data to compute your current frame, sounds like a no-no :-)

    3) This is the best (easiest) alternative: convert your video to AVI for temporary use. I guess Adobe Premiere uses that strategy as well (judging by the many AVIs stored in its temp folders).
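For what it's worth, if ffmpeg is available, the temporary conversion in 3) can use an intra-only codec so that every frame is a keyframe and seeking becomes frame-exact (file names here are placeholders; a sketch, not a tuned command line):

```shell
# Transcode to intra-only MJPEG in an AVI container: no inter-frame
# dependencies, so any frame can be decoded without preroll.
# -an drops the audio; -q:v 3 is a high-quality MJPEG setting.
ffmpeg -i input.mts -an -c:v mjpeg -q:v 3 scratch.avi
```

The file gets much larger than the H.264 original, but random access is trivial afterwards.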

    Cheers

    Tuesday, January 31, 2012 10:16 AM
  • Thanks Alessandro!
    Wednesday, February 1, 2012 12:33 AM