MediaStreamSource and h264, expected format of samples

    Question

  • I have an H.264 stream. I'm able to parse out the SPS and PPS headers and have the frame content, but I'm stuck because I don't know what format I need to pass each sample in. If I take an Annex B capture (using ffmpeg) I can play that buffer perfectly, but when I try to do it in real time from the stream I'm working with, I can't figure out what MediaElement is expecting. Could we get more details on the expected format of the samples?
    Tuesday, December 16, 2014 6:26 PM

Answers

  • OK, I finally got it! I used a hex editor to look at the output of ffmpeg's Annex B filter, since I knew that was working, and compared it to the raw packets to understand what format I needed to send. Basically, I needed to convert the incoming RTMP data into Annex B format and make sure I sent the SPS and PPS with each sample, along with the appropriate start codes (0x00000001). I also left the default buffering time (I'll have to experiment to find an appropriate value), and I set the timestamp on each sample.
    • Marked as answer by SFiorito Monday, December 22, 2014 6:57 PM
    Monday, December 22, 2014 6:57 PM
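
    The conversion described in the answer can be sketched roughly as follows. This is a minimal, illustrative Python sketch, not the poster's actual code: it assumes the RTMP/FLV payload carries H.264 as length-prefixed (AVCC-style) NAL units with a 4-byte big-endian length, and the function name and parameters are hypothetical.

    ```python
    def to_annex_b(avcc_payload: bytes, sps: bytes, pps: bytes,
                   is_keyframe: bool, length_size: int = 4) -> bytes:
        """Convert a length-prefixed (AVCC) H.264 payload, as typically
        carried in RTMP/FLV packets, into an Annex B byte stream.

        Each NAL unit gets the 0x00000001 start code, and the SPS/PPS
        parameter sets are repeated in front of every key frame."""
        START = b"\x00\x00\x00\x01"
        out = bytearray()
        if is_keyframe:
            # Repeat the parameter sets ahead of the key-frame slice.
            out += START + sps
            out += START + pps
        i = 0
        while i + length_size <= len(avcc_payload):
            # Read the big-endian NAL unit length, then copy the NAL
            # unit itself with a start code in front of it.
            nal_len = int.from_bytes(avcc_payload[i:i + length_size], "big")
            i += length_size
            out += START + avcc_payload[i:i + nal_len]
            i += nal_len
        return bytes(out)
    ```

    The resulting buffer (with a timestamp set) is what would be handed to the decoder for each sample.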

All replies

  • Hello,

    You're probably not setting the codec private data correctly. Here is a link that should help get you started:

    http://msdn.microsoft.com/en-us/subscriptions/index/hh180779(v=vs.95).aspx

    -James


    Windows SDK Technologies - Microsoft Developer Services - http://blogs.msdn.com/mediasdkstuff/

    Wednesday, December 17, 2014 12:09 AM
    Moderator
  • Hi James,

    Is that guidance still relevant for the non-Silverlight MediaStreamSource used in WinRT universal apps?

    Thanks,

    Silvio

    Wednesday, December 17, 2014 6:34 PM
  • Yes. The underlying codecs are the same for both Silverlight and WinRT apps.

    -James


    Windows SDK Technologies - Microsoft Developer Services - http://blogs.msdn.com/mediasdkstuff/

    Thursday, December 18, 2014 1:37 AM
    Moderator
  • Hi James, sorry, but those enums don't seem to be in the assemblies for WinRT, or maybe they're named differently? I couldn't find MediaStreamAttributeKeys in the WinRT assemblies.

    I am setting the VideoStreamDescriptor properly, since the H.264 buffer I extract using ffmpeg works. It's only when I try to extract the H.264 video packets from an RTMP stream that I'm not clear how to send the data to the decoder in the SampleRequested event handler.

    Thursday, December 18, 2014 2:24 AM
  • Hello,

    Maybe I'm still misunderstanding. Are you unpacking the individual frames from the packet and sending a single frame down with every SampleRequested call?

    -James


    Windows SDK Technologies - Microsoft Developer Services - http://blogs.msdn.com/mediasdkstuff/

    Friday, December 19, 2014 10:37 PM
    Moderator
  • Well, that's it: I'm not clear what I should be sending for each SampleRequested call. For instance, do I send the SPS and PPS in each call? I've seen some references saying it needs to be in Annex B format; is that correct? I'm not familiar with programming against the media APIs, but I had to create a custom MediaStreamSource out of necessity, and I'm just trying to figure out the final part: actually getting the H.264 to display in a MediaElement.

    Thanks again!

    Saturday, December 20, 2014 1:48 AM
  • I haven't implemented any video media stream sources, but for audio, you need to send exactly one sample every time the sample-requested event triggers. If you don't have a sample available, you should return a mock frame.

    There's no particular reason the video media stream source should work any differently. The only difference I see is that requests will come from both the video stream and the audio stream.

    You should be able to tell which stream a sample is required for by using the StreamDescriptor property of the MediaStreamSourceSampleRequestedEventArgs.Request object in the event handler.

    EDIT:

    These are the methods you use to set up the codec properties for media stream sources in WinRT:

    http://msdn.microsoft.com/en-us/library/windows.media.mediaproperties.videoencodingproperties.aspx

    H.264 is supported out of the box, so all you have to do is determine the sample length and read buffers of that length from an IInputStream using a DataReader. You shouldn't have to mess around with decompressing the stream.


    • Edited by mcosmin Saturday, December 20, 2014 5:32 PM
    Saturday, December 20, 2014 5:23 PM
  • I'm already setting the VideoEncodingProperties from the metadata I receive in the RTMP stream (using CreateH264, then setting the frame rate, width, and height). I'm getting the video packets via RTMP, so I first get the stream descriptor, then I start getting frames. The key frames include the full SPS and PPS, while the inter-frames don't. So I'm not sure whether I'm supposed to pass these through as-is, or whether I need to do anything to them. Do they need to be passed as one sample (key frame plus the following inter-frames), or individually as I receive them? The MediaElement never leaves the buffering state no matter what I've tried, so clearly something's not right.

    The problem here is that I know very little about working with video streams, but I have no other choice in terms of how to pull this video stream. I'll just keep hacking away at it; I know I'm close, I'm just not sure whether I'm sending the video packets correctly.

    Thanks!

    Saturday, December 20, 2014 6:55 PM
  • The media stream source sample class contains a field that notifies the media pipeline whether the sample contains a key frame. Since I'm in unexplored territory, the best thing I can suggest is a trial-and-error approach (truth be told, this is the best approach when dealing with media stream sources in general; you need to be some sort of programming and mathematics god to get it working in one shot). I believe the correct answer is key frame plus the following inter-frames, with the key-frame property set on the sample.


    • Edited by mcosmin Sunday, December 21, 2014 9:21 AM
    Sunday, December 21, 2014 9:21 AM
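  • To decide when to set that key-frame field, one way is to inspect the NAL unit type in the buffer itself. A minimal Python sketch, assuming an Annex B buffer and relying on the fact that the low 5 bits of the first byte after a start code give the nal_unit_type (5 = IDR slice, i.e. a key frame); the function name is illustrative:

    ```python
    def contains_idr(annex_b: bytes) -> bool:
        """Scan an Annex B buffer for an IDR slice (nal_unit_type == 5),
        i.e. a key frame, so the sample's key-frame flag can be set."""
        i = 0
        n = len(annex_b)
        while i < n:
            # Find the next start code; a 4-byte 00 00 00 01 start code
            # also matches on its trailing 3 bytes, so this covers both.
            j = annex_b.find(b"\x00\x00\x01", i)
            if j == -1 or j + 3 >= n:
                break
            header = annex_b[j + 3]   # first byte of the NAL unit
            if header & 0x1F == 5:    # nal_unit_type 5 = IDR (key frame)
                return True
            i = j + 3
        return False
    ```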
  • OK, I finally got it! I used a hex editor to look at the output of ffmpeg's Annex B filter, since I knew that was working, and compared it to the raw packets to understand what format I needed to send. So basically, I needed to convert the incoming data from RTMP into Annex B format, and ensure I sent the SPS and PPS for each sample, along with the appropriate start codes (0x00000001). I also left the default buffering time, but I'll have to play with that to figure out what's an appropriate value, and I set the timestamp for each sample.
    • Marked as answer by SFiorito Monday, December 22, 2014 6:57 PM
    Monday, December 22, 2014 6:57 PM
  • You should set the buffering time to TimeSpan.Zero (or zero in C++/CX); otherwise it may cause rendering issues with play/pause in release builds.

    Thursday, December 25, 2014 7:48 PM