I built a filter graph consisting of H264RTPSOURCE->DECODER->EVR in GraphEdit.
The RTSP filter is our own. When I use the INTEL H264 decoder it works fine. When I use the DTV-DVD Decoder I get scrambled video. After MUCH debugging I have discovered that the DTV decoder is asynchronously scribbling back into the IMediaSample that I passed to it in FillBuffer. I.e. while I'm assembling a NAL unit in the buffer from the stream, the DTV decoder is apparently overwriting that data on some other thread. Anyone have any idea what's going on and how to prevent it? I tried setting cBuffers = 2 in DecideBufferSize but it didn't help. Is there some sort of magic buffer lock that my CSourceStream should be calling? This one has me stumped...
I want to use the DTV Decoder rather than the Intel decoder because it supports hardware decoding and is much faster and more efficient.
I suspect that you have incorrectly identified the cause. You presumably fill the buffer correctly. If the decoder modifies the contents, it should basically not be a problem for you. Note that the stock DTV-DVD decoder is notorious for incorrect decoding (possibly related to your debugging) when you run it under a debugger. You need to check it with no debugger attached as well.
Roman, I was hoping you would respond - I know you are one of the 'wizards' of DirectShow. I also suspected you would doubt my diagnosis (I certainly did initially).
So, it definitely also breaks with no debugger. (That's why I'm debugging it). One confusing thing is that if I record the stream into a DVR file (we happen to use a proprietary format) and play the same stream back from the DVR file using the DTV decoder then it plays just fine.
To be more specific, I tried the following. (This is going to sound strange but it's absolutely true.)
RTPSource->TEE, then TEE->DTV->RENDER and also TEE->DVRWRITER
If I use INTEL in the render branch then everything is fine. If I use DTV in the render branch then BOTH the render and the stream in the file are corrupted. This is why I started to believe the DTV was corrupting the MediaSample. Note that the 'good' DVR file (created with the INTEL decoder in the graph) will also play back fine with the DTV filter. So what's the difference between the stream from the RTPSource and the DVR Source, he asked rhetorically... None that I can detect so far...
So then I added code to the RTPSource and computed a checksum on the media buffer before and after I filled it. The 'before' checksum is on the buffer when it arrives in FillBuffer for me to write into it. My guess (which turned out to be true) was that it would still contain the data I had written to it the previous time through FillBuffer (as long as cBuffers = 1). With the INTEL decoder the checksum at the start of FillBuffer always matches what it was at the end of the previous pass (i.e. the buffer has not been changed since I filled it) as one would expect. With the DTV decoder the checksum sometimes matches and sometimes does not (which leads me to believe there might be a timing component to this problem).
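For readers who want to repeat the checksum experiment, here is a rough sketch of the idea in plain C++ with no DirectShow dependency. The names Fnv1a and BufferChecksumTracker are hypothetical (they are not from the actual filter), FNV-1a is just one cheap checksum that would do, and the sketch assumes cBuffers = 1 so that the same buffer pointer comes back on every pass through FillBuffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// 32-bit FNV-1a; any cheap checksum would serve for this experiment.
inline std::uint32_t Fnv1a(const std::uint8_t* data, std::size_t size) {
    std::uint32_t hash = 2166136261u;   // FNV offset basis
    for (std::size_t i = 0; i < size; ++i) {
        hash ^= data[i];
        hash *= 16777619u;              // FNV prime
    }
    return hash;
}

// Hypothetical helper: remembers the checksum taken at the end of one fill
// and compares it with the buffer contents at the start of the next fill.
class BufferChecksumTracker {
public:
    // Call on entry to the fill routine, before writing anything.
    // Returns false only if the same buffer came back with different contents.
    bool UnchangedSinceLastFill(const std::uint8_t* buffer,
                                std::size_t capacity) const {
        assert(buffer != nullptr);
        if (buffer != lastBuffer_ || lastSize_ > capacity) {
            return true;  // first pass or a different buffer: nothing to compare
        }
        return Fnv1a(buffer, lastSize_) == lastChecksum_;
    }

    // Call on exit from the fill routine, after writing bytesWritten bytes.
    void RecordFill(const std::uint8_t* buffer, std::size_t bytesWritten) {
        assert(buffer != nullptr);
        lastBuffer_ = buffer;
        lastSize_ = bytesWritten;
        lastChecksum_ = Fnv1a(buffer, bytesWritten);
    }

private:
    const std::uint8_t* lastBuffer_ = nullptr;
    std::size_t lastSize_ = 0;
    std::uint32_t lastChecksum_ = 0;
};
```

In a real filter the pointer and sizes would presumably come from the media sample (IMediaSample::GetPointer and GetSize), with UnchangedSinceLastFill called at the top of FillBuffer and RecordFill at the bottom.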
Next step which I will try tomorrow is to save a copy of the buffer and do a byte by byte comparison to find out what exactly got changed. Not that that will tell me how to fix it... But it might offer a clue or possibly an avenue to a workaround.
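A byte-by-byte comparison like the one planned here might look something like the sketch below. It is only an illustration: ChangedRange and FindChangedRanges are made-up names, and it assumes a copy of the buffer was saved at the end of the previous fill. Reporting contiguous ranges rather than individual bytes makes it easier to spot whether the overwrite is a small header-sized patch or a large block.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One contiguous run of bytes that differ between the two buffers.
struct ChangedRange {
    std::size_t offset;
    std::size_t length;
};

// Compares a saved copy of the buffer against its current contents and
// returns every contiguous run of differing bytes, in ascending offset order.
std::vector<ChangedRange> FindChangedRanges(const std::uint8_t* saved,
                                            const std::uint8_t* current,
                                            std::size_t size) {
    assert((saved != nullptr && current != nullptr) || size == 0);
    std::vector<ChangedRange> ranges;
    std::size_t i = 0;
    while (i < size) {
        if (saved[i] == current[i]) {
            ++i;
            continue;
        }
        const std::size_t start = i;
        while (i < size && saved[i] != current[i]) {
            ++i;
        }
        ranges.push_back({start, i - start});
    }
    return ranges;
}
```

If the offsets turn out to be stable from frame to frame, they would also be natural addresses for a hardware write breakpoint.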
Any suggestions would be much appreciated. And if you still think I'm on the wrong track - let me know - I'll listen...
BTW What I REALLY want is just to find an off the shelf decoder with DXVA support so that we can scale our security application to show more cameras on the screen at one time. DTV seemed like a good candidate (free, preinstalled, no license issues, and with apparently reasonably good DXVA support).
P.S. You are right - in general it should not matter to me if the decoder modifies the buffer as long as it does it synchronously. But if it's changing the previous buffer asynchronously (which I suspect) at the same time I'm trying to write the next frame into the buffer then that's definitely a problem (and that's what I'm currently convinced is happening).
- Edited by acc3141 Tuesday, January 10, 2017 2:34 AM
The symptoms you describe make sense, esp. as you attempt to support your assumption that stock decoder decodes incorrectly because you see the data in the media sample buffer modified. However, this does not prove the decoder is guilty, even if you see different behavior compared to other decoders.
Except in the rare case where the media sample data is read-only, which is presumably not the scenario here, the decoder receives the media sample and is free to modify the data as it likes. Your symptoms don't match well with the DTV-DVD decoder's very stable operation in other pipelines, including plain playback of H.264 video.
Under normal conditions, your FillBuffer initializes the data, the filter base class sends it to the downstream connection (here, the decoder), and your source never gets the buffer back before the decoder completes its processing. At that late point, the modified data you find is not a reliable indicator of the decoder's failure to process correctly. I don't know what you do in your source filter, but you mention "asynchronously", which suggests that maybe you are overwriting the buffer yourself before it is released back from the decoder and is ready for refill. The decoder is, quite expectedly, multi-threaded inside: it is supposed both to split the work between threads in software decoding and to hide the complexity of multi-threaded DXVA processing from you behind the filter interface. If you somehow try to trick it and fill the next buffer without obtaining it via a GetBuffer call, you can easily end up overwriting data scheduled for decoding.
Altogether, even though your debugging attempts seem logical and reasonable, they don't explain why the decoder would fail with your source and work nicely with others.
BTW standalone decoder filter from Media Player Classic package might be DXVA enabled.
Nope, not trying anything tricky with the buffers. The source filter is very straightforward - it only writes to the MediaSample in the context of FillBuffer. I've been doing DirectShow for 10 years or so and have probably written 30 or more filters of various sorts. It's hard enough to make simple DirectShow code work, much less trying anything complicated... FYI I have some other decoders that also work fine, one I wrote using a different Intel SDK, another from AXIS. I only see this problem with DTV. You are right, I don't know for SURE that it's the culprit but the evidence points that way. As noted below I might be able to prove it if I get lucky.
My best guess on the DVR vs LIVE discrepancy is that the timing is different. The LIVE stream assembles the frame over a period of time from multiple packets. The DVR stream reads it all at once from a file. Perhaps whatever DTV (or some other culprit) is doing, it's done with it by the time the DVR filter writes into the buffer so its data does not get overwritten.
Try the MPC decoder (although not totally sure how the licensing/redistribution would work...)
Pin down what exactly is changing in the buffer. If it's consistent maybe try a BREAKPOINT ON WRITE to catch who is doing it. Possibly prove that it's DTV (or not).
Add the checksum code to my DVR reader filter. Hypothesis is that I will see the previous buffer get overwritten in the same way but that due to differences in timing as noted above it will always happen before I write the data.
Possibly add code to the LIVE source to assemble the frame in a separate buffer then copy that at the last second to the MediaSample. I.e. change the timing to resemble the DVR case and hope for the best. I may do this because it's easy and would be an interesting experiment but obviously would not have high confidence in it as a robust solution for production software running on different machines with different timing.
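The last idea in the list above (assemble the frame in a private buffer, then copy it into the MediaSample in one go) could be sketched roughly like this. FrameStager and its methods are hypothetical names, and the destination pointer stands in for whatever the media sample hands out. This is a sketch of the experiment only; it changes timing and is not a fix for whatever is actually touching the buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical staging buffer: packet payloads accumulate here, and the
// downstream sample buffer is touched only once, when the frame is complete.
class FrameStager {
public:
    // Appends one packet payload to the frame being assembled.
    void Append(const std::uint8_t* payload, std::size_t size) {
        assert(payload != nullptr || size == 0);
        staging_.insert(staging_.end(), payload, payload + size);
    }

    // Copies the assembled frame into dest and resets the stager.
    // Returns the number of bytes copied, or 0 if there is no frame or it
    // does not fit (the oversized frame is dropped in that case).
    std::size_t CommitTo(std::uint8_t* dest, std::size_t capacity) {
        assert(dest != nullptr);
        const std::size_t size = staging_.size();
        if (size == 0 || size > capacity) {
            staging_.clear();
            return 0;
        }
        std::memcpy(dest, staging_.data(), size);
        staging_.clear();
        return size;
    }

    std::size_t PendingBytes() const { return staging_.size(); }

private:
    std::vector<std::uint8_t> staging_;
};
```

The window during which the sample buffer is being written shrinks to a single memcpy, which is roughly what the DVR reader already does when it reads a whole frame from the file.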
Thanks for your suggestions. If you think of anything else, let me know.
P.S. I found a nice DXVAChecker tool to quickly determine which filters support DXVA.
One special note about your attempt to use the decoder with a tee filter. If the decoder modifies data, then it's an expected side effect that data in the file is damaged. This happens because the tee filter reuses the memory allocator and samples. It does so aggressively for a performance gain, and a side effect is that a modification of the data is propagated to the peer output legs... Like noise, it makes isolating the original cause more difficult, yet it does not point to the one guilty of damaging the data in the first place.
The real problem here is that the decoder is not known for damaging data, with or without DXVA decoding, in simple scenarios, and it works quite stably. This makes me think that even if the problem exists, it is not as simple as data damage causing incorrect decoding. A more specific misfit needs a much better defined debugging context to move on with further guesses.
By the way, you mentioned that you want to increase the number of video feeds you display side by side at a time. I wonder if it's 25, 50, 100? I remember we did a 10x10 layout when I was developing a video surveillance application. The problem with stock components like EVR with this approach is that it's more of a single feed renderer and it does not do well when duplicated so many times. There have been a number of reports that displaying many videos side by side eventually hits internal resource limitations without a descriptive indication of the problem (just E_FAIL at some point). If you want to display that many video feeds, you might end up needing something more sophisticated than EVR instances, and I am not sure that DXVA will assist in decoding that many streams simultaneously either (though it's possible, and perhaps it's resolution dependent).
You were right about DTV not working in the debugger. Thanks for that tip. Half the problems I was seeing were a result of this including the memory overwrite which was apparently a false alarm as it only happens in the debugger. (You can say 'I told you so' now...). I'm still checking for MediaSample overwrites but when running outside the debugger they are not happening. (I was also thinking as you were that a problem that obvious was unlikely in a component that is otherwise so stable and widely used)
So I have updated symptoms for running without the debugger.
When I run my TEE test
TEE->DTV->RENDER (EVR or VMR9 - same result)
The live stream gets corrupted after 15 to 20 seconds. Note that several I-frames have passed through during this time.
Contrary to my previous report, in this scenario the DVR file is fine. The DVR file can be played back without problems using DTV or any other codec.
I tried a few different profiles in the camera - no difference in behavior.
So I have conflicting implications here - the stream format must be basically fine because that's what goes into the DVR file which plays back OK. But the stream format must be bad in some way since the live render image gets corrupted after several seconds.
I am therefore bewildered and stuck again... Why does DVR record fine while LIVE fails on exactly the same stream? I thought perhaps a problem with the timestamps but I tried setting start/end time to 0 and it made no difference. Logically it seems like there must be something subtle wrong with either the MediaSamples themselves or with my CStream implementation. I also have a DXTrace Filter that I use to dump all the MediaSample settings of each frame into a file. Sync, Preroll, discontinuity, timestamps, etc. I put this upstream of the decoder and everything looks OK. Which along with the DVR file being OK suggests the MediaSamples are probably OK. This leaves my Stream implementation in the RTPFilter - which mostly consists of FillBuffer... which is VERY similar to the FillBuffer in my DVRSource (which works OK). I am running out of things to look at...
It is days like this I sometimes wish I had become a Forest Ranger instead of a programmer...
Yes I have seen the unexpected resource limitations with minimal error indications previously with multiple instances of VMR9.
I looked at MPC but the standalone codec pack does not include the decoder - that appears to be part of LAVVideo.ax which only comes with the full MPC install. I think it probably supports DXVA (there is a source file called MPCDXVAVideoDecFilter.cpp which is obviously suggestive). However it also causes GraphEdit to crash if I register it. Plus a quick look at the readme suggested it likely has insurmountable licensing issues. We already have enough problems paying the bills without giving the software away... So I don't think that approach will work out but it was certainly worth a look. I have not completely given up on it but am not optimistic at this point.
As for performance, we are trying to support multiple 1920x1080 cameras at 30fps. We would be very happy if we could support 16 in a display grid. In fact we would be reasonably happy with 8. At the moment with software decoding we can barely support 2. Thus our interest in hardware decoding.
It is maddening because DTV seems to be SO close to working for me...
You seem to be doing it right. Perhaps the decoder somehow receives input out of order, or some frames are skipped (for a reason that is not present when you record into a file). This way it might produce incorrect output. You should be able to see something by checking the media sample data. Then, if the recorded file is fine, including when played back through the same decoder, the problem might be between your source and the decoder, and not necessarily in the data damage where you originally looked for it.
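One way to test the "skipped or out of order" theory at the network level is to watch the RTP sequence numbers, which are 16-bit counters that increase by one per packet and wrap around. The sketch below is only an illustration under the assumption that the source filter can see the RTP header of each packet; RtpSequenceChecker and SeqStatus are hypothetical names, not part of any existing filter.

```cpp
#include <cassert>
#include <cstdint>

enum class SeqStatus { First, InOrder, Duplicate, Gap, OutOfOrder };

// Hypothetical checker for 16-bit RTP sequence numbers with wraparound.
class RtpSequenceChecker {
public:
    SeqStatus Check(std::uint16_t seq) {
        if (!started_) {
            started_ = true;
            last_ = seq;
            return SeqStatus::First;
        }
        // Modulo-65536 distance from the newest packet seen so far.
        const std::uint16_t delta = static_cast<std::uint16_t>(seq - last_);
        if (delta == 0) {
            return SeqStatus::Duplicate;
        }
        if (delta >= 0x8000) {
            return SeqStatus::OutOfOrder;  // older than the newest packet seen
        }
        last_ = seq;
        if (delta == 1) {
            return SeqStatus::InOrder;
        }
        assert(delta >= 2);                // at least one packet was skipped
        lost_ += static_cast<std::uint32_t>(delta - 1);
        return SeqStatus::Gap;
    }

    // Total number of packets skipped over so far.
    std::uint32_t LostPackets() const { return lost_; }

private:
    bool started_ = false;
    std::uint16_t last_ = 0;
    std::uint32_t lost_ = 0;
};
```

Logging every result other than InOrder next to the wall-clock time would show whether the corruption after 15 to 20 seconds lines up with a gap or a late packet.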
Right - I agree now that the damaged data was a strange side effect of running DTV in the debugger and is not the cause of my current problem.
I have tried different profiles and resolutions and cameras from different vendors and am consistently getting the same result. It typically plays OK for several seconds then the updates become irregular and the image becomes distorted as if delta frames were being dropped. Also I often see the image jump back to a frame from several seconds earlier (the camera overlays a timestamp) which is interesting and might tell me something if I knew a bit more about the actual H264 internals.... Guessing perhaps something is messing up the internal buffering logic in the decoder - but what and how? Especially with timestamps set to 0?
I'm doing most of the testing in Graphedt. There is in fact nothing in between the source and the decoder - no way for the order to change - although I sometimes add a TRACE filter which confirms that the samples are all passing through in the right order and that all the flags appear to be set properly.
I'm going to look around for a simple third party RTSP or ONVIF source to try in place of my filter. (Do you know of any?)
I will also continue to dump and compare the streams from the RTPSource and the DVRSource to look for any subtle differences. They are both PUSH filters with very similar implementations - but obviously something is different somewhere....
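When comparing the two dumps, it may help to reduce each buffer to its sequence of NAL unit types first, since a missing SPS/PPS or slice shows up immediately that way. The sketch below assumes the buffers are in H.264 Annex B byte stream format (NAL units separated by 00 00 01 or 00 00 00 01 start codes), which may or may not match what the filters actually emit; ListNalUnitTypes is a made-up name.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Scans an Annex B byte stream and returns the nal_unit_type (low 5 bits of
// the NAL header byte) of every NAL unit found, in stream order.
std::vector<std::uint8_t> ListNalUnitTypes(const std::uint8_t* data,
                                           std::size_t size) {
    assert(data != nullptr || size == 0);
    std::vector<std::uint8_t> types;
    for (std::size_t i = 0; i + 3 < size; ++i) {
        // Match the three-byte start code 00 00 01; a four-byte start code
        // ends with the same three bytes, so it is covered as well.
        if (data[i] == 0 && data[i + 1] == 0 && data[i + 2] == 1) {
            types.push_back(static_cast<std::uint8_t>(data[i + 3] & 0x1F));
            i += 2;  // skip the rest of the start code
        }
    }
    return types;
}
```

For reference, type 7 is a sequence parameter set, 8 is a picture parameter set, 5 is an IDR slice and 1 is a non-IDR slice. Running this over matching frames from the RTPSource and the DVRSource and comparing the resulting vectors would highlight any structural difference between the two streams.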
Perhaps I can paste a quick hack of the file reader code into the RTPSource in place of the network reads and see what happens...
Just found and started using your GraphStudioNext. Very nice tool! I grabbed some screen shots from your Analyzer on the input and output of the decoder at the point where the video goes bad. I can't actually see any problems but was hoping you might take a quick look. I can't post them here apparently so I emailed them to you at 'contact'. Perhaps a thought like - well since we have ruled out A and B the next thing to look at is C...