Raw YUV from camera

  • Question

  • So, I'm working on a Skype-like app and I'd like to add support for WinRT as well. Yes, I'm not joking: in some countries customers actually ask for WinRT support, with full audio+video :)

    Is it certain that I need to do that insane mumbo-jumbo (the GrayscaleTransform way) to get YUV data?

    What do I need to do to display YUV data? Obviously I know how to convert to any RGB format; I'd just like some hints on what direction to look in for a solution.
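    For reference, the YUV-to-RGB conversion itself is just fixed-point arithmetic. A minimal sketch, assuming the common BT.601 "studio swing" integer approximation (the coefficients are standard, not from this thread):

```cpp
#include <cstdint>
#include <algorithm>

// Clamp an intermediate value to the 0..255 byte range.
static inline uint8_t clamp8(int v) {
    return (uint8_t)std::min(255, std::max(0, v));
}

// BT.601 "studio swing" YUV -> RGB, integer approximation.
// Y is nominally in [16..235], U/V in [16..240].
// Output is packed as 0xAARRGGBB with opaque alpha.
uint32_t yuv_to_argb(uint8_t y, uint8_t u, uint8_t v) {
    int c = 298 * (y - 16);
    int d = u - 128;
    int e = v - 128;
    uint8_t r = clamp8((c + 409 * e + 128) >> 8);
    uint8_t g = clamp8((c - 100 * d - 208 * e + 128) >> 8);
    uint8_t b = clamp8((c + 516 * d + 128) >> 8);
    return 0xFF000000u | (r << 16) | (g << 8) | b;
}
```

    Running this per pixel over a decoded frame gives a 32-bpp buffer that any of the display routes discussed below can consume.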

    I was always a fan of Windows development; I did Windows Mobile for many years and knew it pretty much inside out... yet WinRT sits second only to Symbian on my hate list. When I had to do Symbian, I was sure it would eventually fail. Steering professional developers away with something totally new is a step that came way too late, at a time when other platforms are rapidly gaining popularity and taking most of developers' attention. Seriously, it feels like Android or iOS is closer to Win32 than WinRT is. It's also surprising that I can't find any project (on sf.net or CodeProject, for example) that does YUV capture or display.

    Thursday, October 3, 2013 2:28 AM


All replies

  • Hello,

    Here is the answer to your question:

    Q. Is it certain that I need to do that insane mumbo-jumbo (the GrayscaleTransform way) to get YUV data?

    A. Yes, to gain access to the underlying frames you need to create a Media Foundation plug-in. This can either be a Media Foundation Transform (such as the greyscale transform) or a Media Foundation Sink.
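    The frame-level work the grayscale transform sample actually does is very small: it leaves the Y (luma) plane alone and neutralizes the chroma. A hedged sketch of just that inner loop for an NV12 frame, in plain C++ with all the MFT plumbing stripped away:

```cpp
#include <cstdint>
#include <cstddef>

// NV12 layout: width*height bytes of Y, followed by (width*height)/2
// bytes of interleaved UV. Setting every chroma byte to 128 (the
// neutral chroma value) turns the frame grayscale while leaving the
// luma untouched -- the essence of what the grayscale MFT does per sample.
void grayscale_nv12(uint8_t* frame, size_t width, size_t height) {
    uint8_t* chroma = frame + width * height;   // start of the UV plane
    size_t chromaBytes = (width * height) / 2;
    for (size_t i = 0; i < chromaBytes; ++i)
        chroma[i] = 128;
}
```

    Inside a real MFT this loop would run in ProcessOutput on the bytes of each locked media buffer; the point is that once the plug-in hands you the buffer, the YUV data is just bytes.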

    Hopefully this sample will get you started:

    Real-time communication sample (Windows 8.1)


    I hope this helps,


    You might also be interested in my upcoming free chalk talk: Light up your Apps with Sights and Sounds

    Go here for more information: http://aka.ms/Jdk7j0

    Windows SDK Technologies - Microsoft Developer Services - http://blogs.msdn.com/mediasdkstuff/

    Thursday, October 3, 2013 8:27 PM
  • Yes, I ended up using the grayscale transform. I spent more than a day fighting problems with using C++/CX and WRL to define runtime classes in the same DLL; eventually I gave up and moved the WRL-defined classes into their own DLL. The issue was that public ref classes defined using C++/CX automatically export DllGetActivationFactory, which conflicts with the factory that I need to export for the WRL-defined classes.

    Now I'm facing the second part of my journey. Initially I thought it wouldn't be a big deal, but I was completely wrong. So, how can I show incoming video?! :) I receive H.264-encoded frames over RTP, and I have all the code in place to decode them and do colorspace conversion if needed; all I want is to show video frames (RGB or YUV) in my Metro app. I was pointed at MediaElement, but it appears to be completely useless in my case and won't do the job. I've done about 10 years of Windows development and knew it probably inside out... you can probably understand my frustration that I cannot do the simplest things with WinRT, and that's why I think MS is on its way downhill. Quite possibly MS now makes more money on royalties from Android phones than on WinRT/Win8 :)

    Please help me: in what direction should I look to be able to show video frames?

    PS. One more complaint: what is this insanity with the debug output overflowing with exception messages? The debug output has become like a trashcan that can no longer be searched for clues... whoever decided to use exceptions internally should be awarded a medal of honor...

    Thursday, October 10, 2013 6:35 PM
  • Hopefully this sample will get you started:

    Real-time communication sample (Windows 8.1)


    That sample looks good when I build and run it, but it's absolutely useless in the real world. The incoming-video MediaElement needs some source. Here's how it's done in the sample:

    _localHostVideo[_latencyMode]->Source = ref new Uri("stsp://localhost");

    I'm not going to implement that weird stsp protocol just to push incoming video to a display surface over IP; that's plainly ridiculous! Even if I did write that stsp mess, I still have no clue whatsoever about the format of the data I'd need to push over localhost. By the way, what does stsp stand for? Is it some flavor of RTSP, or Microsoft's name for RTSP? Or is it just a made-up protocol invented for this simple sample?

    I have the impression that all of WinRT was made this way: there was a need for this kind of communication sample app, and the required components were implemented in the system just to make it work... Otherwise, I have no idea how it could possibly happen that the most common way of showing video in any communication app isn't supported by MediaElement.

    It looks like MediaElement got extra methods in 8.1 that might do the job (correct me if I'm wrong about that), but I'm not sure I'm OK with that. I have on my desk just about every model of Win Surface (ARM and x86) that hit the market shelves, and I need to make this work on all of these devices. So... what should I do, maybe I can phone Bill to get some help with this mess? :)

    Thursday, October 10, 2013 8:21 PM
  • If this kind of request about showing video frames is beyond the possibilities of WinRT, then can somebody recommend a way to display a bitmap or a bmp file? If WinRT is really that limited, I see nothing wrong with creating 30 bmp files per second on this platform and displaying them on the screen or in some panel (unless this request turns out to be even more impossible than the original one).
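    For what it's worth, producing a BMP in memory really is just a 54-byte header in front of the pixels. A minimal sketch for 32-bpp frames (the negative-height top-down convention is an assumption chosen for simplicity):

```cpp
#include <cstdint>
#include <vector>

// Append a little-endian value of the given byte width to a byte vector.
static void put_le(std::vector<uint8_t>& v, uint32_t x, int bytes) {
    for (int i = 0; i < bytes; ++i) v.push_back((uint8_t)(x >> (8 * i)));
}

// Build an in-memory 32-bpp BMP: 14-byte file header, 40-byte
// BITMAPINFOHEADER, then raw BGRA pixels (no row padding at 32 bpp).
// A negative biHeight marks the DIB as top-down (first row = top row).
std::vector<uint8_t> make_bmp(const uint32_t* pixels, int w, int h) {
    std::vector<uint8_t> out;
    uint32_t dataSize = (uint32_t)(w * h * 4);
    out.push_back('B'); out.push_back('M');
    put_le(out, 54 + dataSize, 4);        // total file size
    put_le(out, 0, 4);                    // reserved
    put_le(out, 54, 4);                   // offset to pixel data
    put_le(out, 40, 4);                   // BITMAPINFOHEADER size
    put_le(out, (uint32_t)w, 4);          // biWidth
    put_le(out, (uint32_t)(-h), 4);       // biHeight: negative = top-down
    put_le(out, 1, 2);                    // biPlanes
    put_le(out, 32, 2);                   // biBitCount
    put_le(out, 0, 4);                    // biCompression = BI_RGB
    put_le(out, dataSize, 4);             // biSizeImage
    put_le(out, 0, 4); put_le(out, 0, 4); // pixels-per-meter (unused)
    put_le(out, 0, 4); put_le(out, 0, 4); // palette fields (unused)
    const uint8_t* p = (const uint8_t*)pixels;
    out.insert(out.end(), p, p + dataSize);
    return out;
}
```

    That said, writing 30 files per second to disk just to display them would be a workaround of last resort; the DirectX route discussed below avoids it entirely.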

    Thursday, October 10, 2013 10:03 PM
  • Hello,

    If you can share with us the overall problem that you are trying to solve maybe we can recommend an architecture that can help you accomplish it in the best way possible.

    If you would like to work directly with someone on my team you can open a support incident and we can work with you one on one.

    To open a support incident go here:




    Windows SDK Technologies - Microsoft Developer Services - http://blogs.msdn.com/mediasdkstuff/

    Thursday, October 10, 2013 10:46 PM
  • Hello James,

    I'm not sure I need anybody from MSFT to help me with this. It's crazy that I cannot find a normal way to draw frames on screen in a WinRT app. In short, I would like to use as little OS-specific stuff as possible. Here's what I do in regular Win32 (pseudo-code):

        CClientDC dc(m_wnd);
        dc.StretchDIBits(renderRect, bmpRect, bmpBuffer, (BITMAPINFO *)&info->bmiHeader, DIB_RGB_COLORS, SRCCOPY);

    Imagine this simple case:

    int w = 352, h = 288;
    unsigned *bitmap = new unsigned[w * h];
    unsigned color = 0x333333ff;
    for (int i = 0; i < h; ++i)
        for (int j = 0; j < w; ++j)
            bitmap[i * w + j] = color;

    How do I stuff these bitmap bits into something so that I can see that gray rectangle on the screen in WinRT?! I see how I can draw rectangles using DirectX, but obviously that isn't what I need.
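    One wrinkle worth knowing before going the DirectX route: when you map a GPU surface or bitmap to update it, you typically get back a destination pointer plus a row pitch that can be larger than width*4, so a tightly packed buffer like the one above must be copied row by row rather than with one big memcpy. A hedged sketch of that copy, with the pitch handling being the point:

```cpp
#include <cstdint>
#include <cstring>

// Copy a tightly packed w*h 32-bpp bitmap into a destination whose rows
// are 'pitch' bytes apart (pitch >= w*4). Mapped GPU surfaces commonly
// report such a pitch, so a single memcpy of w*h*4 bytes would smear
// the image if pitch != w*4.
void copy_with_pitch(uint8_t* dst, size_t pitch,
                     const uint32_t* src, int w, int h) {
    for (int row = 0; row < h; ++row)
        std::memcpy(dst + row * pitch, src + row * w, (size_t)w * 4);
}
```

    The destination pointer and pitch here stand in for whatever the chosen interop surface hands back when locked for writing.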

    I'm trying some DirectX samples now... it's just amazing how crippled this WinRT thing is; I think the transition from text-mode MS-DOS to the Windows GUI was less painful than this WinRT.

    Hey, but it has animations that can be done in pure XAML without writing any code, right? Why would I want to mess with showing video on screen when I can turn my app into a cartoon instead... :)

    Monday, October 14, 2013 2:21 AM
  • Direct2D or Direct3D is a perfectly reasonable way to render a bitmap image to the screen. If you are looking to mix XAML UI elements with your rendering, you probably want to use XAML/Direct2D interop.


    In fact, if you are actually processing video data, you should look at using Direct3D 11 video directly if Media Foundation is not suitable to your needs.


    StretchDIBits is a legacy GDI function dating back over a decade. Windows graphics has in fact gotten pretty powerful since then...

    Monday, October 14, 2013 5:53 AM
  • I'm a bit at a loss with Direct2D vs Direct3D 11. Do they refer to the same thing?

    A bit late here, but I want to confirm that after browsing all the Windows 8 samples I took the Direct2D-XAML sample as a starting point and got it working. Thanks for the help!

    Wednesday, October 30, 2013 8:57 PM
  • Direct2D is basically a modern "GDI" with support for rasterizing 2D shapes like polygons, circles, etc. It renders onto a Direct3D device (internally or explicitly) to get hardware-accelerated shape drawing.

    Direct3D is the low-level graphics API which can be used for 2D or 3D rendering, or even non-rendering scenarios like DirectCompute.

    Thursday, October 31, 2013 6:32 AM