Why is YUV refresh rate lower?

  • Question

  • Hi,

    Since the image quality of YUV is actually a lot better (edges are less jagged compared to 640x480 RGB), and since YUV consumes fewer bytes per pixel than RGB, why is its refresh rate only 15 Hz?

    Thursday, July 5, 2012 6:56 AM

Answers

  • Hello,

    I'm sorry my response wasn't clear.  From your reply there's a misunderstanding here:

    With roughly 640*480*3 of such values being transmitted per frame, then on the CPU the demosaicing process merges the Bayer-grid intensity values to produce 24-bit RGB pixels.

    We don't send 640*480*3 per frame, only 640*480*1 bytes of data are sent per frame.  The Bayer grid is 640x480 with the alternating red/blue and green pixels.  The 640x480 grid contains data for all the color channels -- there isn't a different grid or frame for each color.  The CPU processing performs the demosaicing of the Bayer values and converts the 1 byte per pixel format to 3 bytes per pixel.
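A minimal sketch of this point, assuming an RGGB cell layout (the thread does not state the exact pattern, so that layout is an assumption):

```python
# Sketch of how one 640x480 Bayer grid carries all three channels,
# assuming an RGGB cell layout (the exact pattern is an assumption).
def bayer_channel(row, col):
    """Channel sampled at (row, col): R/G on even rows, G/B on odd rows."""
    if row % 2 == 0:
        return "R" if col % 2 == 0 else "G"
    return "G" if col % 2 == 0 else "B"

# 1 byte per cell over the wire vs. 3 bytes per pixel after demosaicing:
wire_bytes = 640 * 480 * 1
rgb_bytes = 640 * 480 * 3
print(bayer_channel(0, 0), bayer_channel(0, 1),
      bayer_channel(1, 0), bayer_channel(1, 1))  # -> R G G B
```

There is no separate frame per channel: each cell samples exactly one channel, and demosaicing reconstructs the missing two from neighbors.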

    -- Jon

    Tuesday, July 31, 2012 12:30 AM

All replies

  • I'm still looking for a reasonable explanation from the technical developers :)

    Assuming you're using YUV422 (from reading the docs), this still doesn't explain the reduced data refresh rate...

    Wednesday, July 25, 2012 9:36 AM
  • Hello and thanks for your interest in Kinect for Windows!

    First, here are some helpful links to articles for the terms used below:

    http://en.wikipedia.org/wiki/Bayer_filter - for a description of how the camera captures data.

    http://en.wikipedia.org/wiki/Demosaicing - for a description of the demosaicing problem space.

    The reason YUV has a lower frame rate is that we actually send less data "over the wire" for the RGB mode than we do for YUV mode.  We use the native sensor capture format, which is 8-bit Bayer.  The 8-bit Bayer data is received over USB and then demosaiced into full RGB using bilinear interpolation.
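The per-pixel reconstruction step can be sketched as follows (an illustration, not the SDK's actual code, and it assumes an RGGB layout with interior pixels only):

```python
# Sketch of bilinear Bayer demosaicing for interior pixels, assuming an
# RGGB pattern; the real pipeline and pattern may differ.
def demosaic_pixel(bayer, r, c):
    """Reconstruct (R, G, B) at interior cell (r, c) by averaging neighbors."""
    def avg(cells):
        return sum(bayer[i][j] for i, j in cells) // len(cells)

    here = bayer[r][c]
    if r % 2 == 0 and c % 2 == 0:            # red site
        g = avg([(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)])
        b = avg([(r - 1, c - 1), (r - 1, c + 1), (r + 1, c - 1), (r + 1, c + 1)])
        return (here, g, b)
    if r % 2 == 1 and c % 2 == 1:            # blue site
        g = avg([(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)])
        rr = avg([(r - 1, c - 1), (r - 1, c + 1), (r + 1, c - 1), (r + 1, c + 1)])
        return (rr, g, here)
    if r % 2 == 0:                           # green site on a red row
        rr = avg([(r, c - 1), (r, c + 1)])
        b = avg([(r - 1, c), (r + 1, c)])
    else:                                    # green site on a blue row
        rr = avg([(r - 1, c), (r + 1, c)])
        b = avg([(r, c - 1), (r, c + 1)])
    return (rr, here, b)

# A flat 4x4 test image: R cells = 100, G cells = 150, B cells = 200.
flat = [[100 if (i % 2 == 0 and j % 2 == 0) else
         200 if (i % 2 == 1 and j % 2 == 1) else 150
         for j in range(4)] for i in range(4)]
print(demosaic_pixel(flat, 1, 1))  # a blue site -> (100, 150, 200)
```

On a flat image the interpolation recovers the constant color exactly; the jagged-edge artifacts the original poster noticed arise where neighboring cells differ sharply.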

    The YUV format results in twice as much data being sent over the wire because it's essentially a 16-bit per pixel payload.  The reason YUV looks better is that the RGB data is interpolated in the camera hardware, then transformed to YUV, and then transformed back to RGB for display.  The YUV encoding/decoding process blurs some of the artifacts that occur from the demosaicing algorithm.

    For the image quality question, there are other demosaicing algorithms that may yield better or different artifacts, but bilinear interpolation fits well within our per-frame CPU budget.  In the case of a single Kinect sensor we only have ~33 milliseconds to deliver a depth frame, color frame, and skeleton tracking data to our users.  With multiple sensors attached, the per-frame time budget drops even lower.  The design decision was made to use bilinear interpolation for demosaicing, leaving more time for users of Kinect for Windows to do something interesting with the data provided.

    I hope this clears up both the bandwidth and image quality questions for you.

    Thanks again,

    -- Jon

    Thursday, July 26, 2012 7:19 AM
  • Hi. Actually I didn't ask anything about the image quality :), although of course I've noticed the demosaicing artifacts that occur in the RGB images (at 30 Hz).

    I'm still not sure whether I understand your explanations. Apparently, in the 30 Hz RGB mode, you transmit the unmodified Bayer grid, which has 8 bits (intensity) per cell. With roughly 640*480*3 of such values being transmitted per frame, then on the CPU the demosaicing process merges the Bayer-grid intensity values to produce 24-bit RGB pixels.

    For YUV you say that the Bayer data is apparently already pre-processed on the Kinect hardware first and converted to YUV (also on the hardware, from your description). That still does not explain why the data rate needs to be higher. The notions "8 bits per pixel" and "16 bits per pixel" are not very precise, as they do not say whether they refer to Bayer pixels or final-image pixels. If, for example, the RGB data were also produced on the Kinect in the 30 Hz RGB mode, it would mean sending 640*480 24-bit pixels, which is more than the 16 bits per pixel of YUV. The documentation clearly mentions YUV422, i.e. a data consumption of 4 bytes per 2 pixels, or 2 bytes (16 bits) per pixel on average. For me, the only logical explanation at the moment is that the Kinect cannot do this pre-processing and YUV encoding of the data faster than 15 Hz...
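The "4 bytes per 2 pixels" packing mentioned above can be sketched as follows, assuming the common YUY2 byte order (Y0 U Y1 V); the sensor's actual byte ordering is an assumption here:

```python
# Sketch of unpacking YUV422 ("4 bytes per 2 pixels"), assuming the
# common YUY2 byte order Y0 U Y1 V; the actual ordering may differ.
def unpack_yuy2(data):
    """Return one (Y, U, V) triple per pixel; each pixel pair shares U and V."""
    pixels = []
    for i in range(0, len(data), 4):
        y0, u, y1, v = data[i:i + 4]
        pixels.append((y0, u, v))   # first pixel of the pair
        pixels.append((y1, u, v))   # second pixel, same chroma
    return pixels

pair = bytes([16, 128, 235, 128])       # 4 bytes encode 2 pixels
print(unpack_yuy2(pair))                # -> [(16, 128, 128), (235, 128, 128)]
print(len(pair) / len(unpack_yuy2(pair)))  # -> 2.0 bytes per pixel on average
```

Because each pixel has its own luma byte but shares chroma with its neighbor, the average cost is exactly 16 bits per final-image pixel, i.e. twice the 8 bits per Bayer cell sent in RGB mode.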

    Monday, July 30, 2012 10:47 AM
  • Doesn't RGB have a smoothing algorithm too?  It seems like a very inconsistent pipeline.

    Sunday, February 24, 2013 9:01 AM