Auto Exposure Compensation

    General discussion

  • At this point I have given up on requesting manual control of the color camera. Now that the only Kinects available are the Xbox One versions, and the firmware is locked into whatever updates happen to the Xbox, it seems unlikely that this will ever be fixed.

    So the only option left is a post-processing effect based on the color camera settings. I'm posting the code I am currently using in my Kinect project, free for anyone to use. This method uses a Unity Compute Shader, but the same math and workflow should be applicable to other projects.

    I've rewritten the code so that it works with the Kinect View sample project. All you have to do is replace the ColorSourceManager.cs file with the following code, then drag the "AutoExposureCompensation.compute" shader into the Color Source Manager's "Auto Exp Shader" field in the Unity Editor.


    using UnityEngine;
    using System.Collections;
    using Windows.Kinect;

    public class ColorSourceManager : MonoBehaviour
    {
        public int ColorWidth { get; private set; }
        public int ColorHeight { get; private set; }

        private KinectSensor _Sensor;
        private ColorFrameReader _Reader;
        private Texture2D _Texture;
        private byte[] _RawData;   // raw YUY2 frame, 2 bytes per pixel
        private byte[] _ImgData;   // converted RGBA frame, 4 bytes per pixel
        private bool isNewFrame;

        public ComputeShader autoExpShader;
        private ComputeBuffer rawBuffer;
        private ComputeBuffer imgBuffer;
        private int shaderKernel;
        private float gamma;
        private float autoExposure;
        public float targetExposure = 0f;

        public Texture2D GetColorTexture()
        {
            return _Texture;
        }

        void Start()
        {
            _Sensor = KinectSensor.GetDefault();
            if (_Sensor != null)
            {
                _Reader = _Sensor.ColorFrameSource.OpenReader();
                var frameDesc = _Sensor.ColorFrameSource.CreateFrameDescription(ColorImageFormat.Rgba);
                ColorWidth = frameDesc.Width;
                ColorHeight = frameDesc.Height;
                _Texture = new Texture2D(frameDesc.Width, frameDesc.Height, TextureFormat.RGBA32, false);
                _RawData = new byte[frameDesc.BytesPerPixel * frameDesc.LengthInPixels / 2];
                _ImgData = new byte[frameDesc.BytesPerPixel * frameDesc.LengthInPixels];
                if (!_Sensor.IsOpen)
                {
                    _Sensor.Open();   // restored from the Kinect View sample; missing in the post as extracted
                }
                // One int per YUY2 pixel pair in, one int per RGBA pixel out
                rawBuffer = new ComputeBuffer((int)((frameDesc.Width / 2) * frameDesc.Height), sizeof(int));
                imgBuffer = new ComputeBuffer((int)(frameDesc.Width * frameDesc.Height), sizeof(int));
                shaderKernel = autoExpShader.FindKernel("CSMain");
                autoExpShader.SetBuffer(shaderKernel, "rawBuffer", rawBuffer);
                autoExpShader.SetBuffer(shaderKernel, "imgBuffer", imgBuffer);
            }
            isNewFrame = false;
        }

        void Update()
        {
            // Basic Exposure Controls
            if (Input.GetKeyDown(KeyCode.Equals))
                targetExposure += 0.1f;
            if (Input.GetKeyDown(KeyCode.Minus))
                targetExposure -= 0.1f;

            if (_Reader != null)
            {
                var frame = _Reader.AcquireLatestFrame();
                if (frame != null)
                {
                    // Assumed: this copy is missing in the post as extracted
                    frame.CopyRawFrameDataToArray(_RawData);
                    float shutterGain = ((float)frame.ColorCameraSettings.ExposureTime.TotalMilliseconds / 333333f) / 0.5f;
                    float gain = frame.ColorCameraSettings.Gain / 3;
                    autoExposure = shutterGain * gain;
                    gamma = frame.ColorCameraSettings.Gamma;
                    isNewFrame = true;
                    frame.Dispose();   // restored from the Kinect View sample
                    frame = null;
                }
            }

            if (isNewFrame)   // assumed: this condition is missing in the post as extracted
            {
                rawBuffer.SetData(_RawData);   // assumed upload step
                autoExpShader.SetFloat("Exp", 1 / (autoExposure - targetExposure));
                autoExpShader.SetFloat("Gamma", gamma);
                // One thread per YUY2 pixel pair, 512 threads per group
                autoExpShader.Dispatch(shaderKernel, (_RawData.Length / 4) / 512, 1, 1);
                // Assumed readback steps: copy the result into the texture
                imgBuffer.GetData(_ImgData);
                _Texture.LoadRawTextureData(_ImgData);
                _Texture.Apply();
            }
            isNewFrame = false;
        }

        void OnApplicationQuit()
        {
            if (_Reader != null)
            {
                _Reader.Dispose();   // restored from the Kinect View sample
                _Reader = null;
            }
            if (_Sensor != null)
            {
                if (_Sensor.IsOpen)
                {
                    _Sensor.Close();   // restored from the Kinect View sample
                }
                _Sensor = null;
            }
        }
    }


    #pragma kernel CSMain

    // Buffers
    StructuredBuffer<int> rawBuffer;     // one int per YUY2 pixel pair (Y1, U, Y2, V)
    RWStructuredBuffer<int> imgBuffer;   // one int per RGBA pixel

    // Color Variables (set from ColorSourceManager)
    float Exp;
    float Gamma;

    // Kernels
    // Assumed: the thread group size is missing in the post as extracted;
    // 512 matches the "/512" in the Dispatch call.
    [numthreads(512,1,1)]
    void CSMain (uint3 id : SV_DispatchThreadID)
    {
        // Working values are locals: HLSL globals are implicitly constant unless marked static
        float Y1, U, Y2, V, R, G, B;

        // Decode Pixel Data From Integer (unsigned, so the top byte shifts down cleanly)
        uint raw = (uint)rawBuffer[id.x];
        Y1 = (float)(raw & 0x000000ff);
        U  = (float)((raw & 0x0000ff00) >> 8);
        Y2 = (float)((raw & 0x00ff0000) >> 16);
        V  = (float)((raw & 0xff000000) >> 24);

        // Remove Gamma
        Y1 = pow(Y1, Gamma);
        Y2 = pow(Y2, Gamma);

        // Adjust Exposure
        Y1 *= Exp;
        Y2 *= Exp;

        // Optional: Reduce Overall Exposure To Prevent Clipping and faux "Increase" Dynamic Range
        Y1 *= 0.8;
        Y2 *= 0.8;

        // Apply Gamma
        Y1 = pow(Y1, 1/Gamma);
        Y2 = pow(Y2, 1/Gamma);

        // Process YUY2
        Y1 = Y1 - 16;
        U = U - 128;
        Y2 = Y2 - 16;
        V = V - 128;

        // First Pixel
        R = ( 298 * Y1 + 409 * V + 128 );
        G = ( 298 * Y1 - 100 * U - 208 * V + 128 );
        B = ( 298 * Y1 + 516 * U + 128 );

        // Optional: Raise Floor to faux "Increase" Dynamic Range
        R += 16;
        G += 16;
        B += 16;

        // Encode 8bit RGBA Data into 32bit Integer
        imgBuffer[id.x * 2] = 255;
        imgBuffer[id.x * 2] <<= 8;
        imgBuffer[id.x * 2] += clamp((int)B >> 8, 0, 255);
        imgBuffer[id.x * 2] <<= 8;
        imgBuffer[id.x * 2] += clamp((int)G >> 8, 0, 255);
        imgBuffer[id.x * 2] <<= 8;
        imgBuffer[id.x * 2] += clamp((int)R >> 8, 0, 255);

        // Second Pixel
        R = ( 298 * Y2 + 409 * V + 128 );
        G = ( 298 * Y2 - 100 * U - 208 * V + 128 );
        B = ( 298 * Y2 + 516 * U + 128 );

        // Optional: Raise Floor to faux "Increase" Dynamic Range
        R += 16;
        G += 16;
        B += 16;

        // Encode 8bit RGBA Data into 32bit Integer
        imgBuffer[id.x * 2 + 1] = 255;
        imgBuffer[id.x * 2 + 1] <<= 8;
        imgBuffer[id.x * 2 + 1] += clamp((int)B >> 8, 0, 255);
        imgBuffer[id.x * 2 + 1] <<= 8;
        imgBuffer[id.x * 2 + 1] += clamp((int)G >> 8, 0, 255);
        imgBuffer[id.x * 2 + 1] <<= 8;
        imgBuffer[id.x * 2 + 1] += clamp((int)R >> 8, 0, 255);
    }

    ***(updated 04/26/15 at 12:15pm) I realized it was better to adjust the luminance values directly rather than the RGB values, much much better results***
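
    For projects that cannot run compute shaders, the per-pixel math of the shader can be sketched on the CPU. This is only an illustration, not code from the original post: the class and method names are hypothetical, the input is assumed to be a YUY2 byte array (Y1, U, Y2, V per pixel pair), and the two "Optional" steps of the shader are left out.

```csharp
using System;

// Hypothetical CPU sketch of the shader's YUY2 -> RGBA exposure math.
public static class Yuy2ExposureSketch
{
    // Clamp an integer to the 0..255 byte range.
    private static byte ToByte(int v)
    {
        return (byte)Math.Max(0, Math.Min(255, v));
    }

    // Remove gamma, scale by the exposure term, reapply gamma.
    private static float AdjustLuma(float y, float exp, float gamma)
    {
        double linear = Math.Pow(y, gamma) * exp;
        return (float)Math.Pow(linear, 1.0 / gamma);
    }

    // Convert one luma sample plus the shared chroma pair to RGBA bytes.
    private static void WritePixel(float y, float u, float v, byte[] dst, int offset)
    {
        float c = y - 16f;
        float d = u - 128f;
        float e = v - 128f;
        dst[offset]     = ToByte((int)(298f * c + 409f * e + 128f) >> 8);             // R
        dst[offset + 1] = ToByte((int)(298f * c - 100f * d - 208f * e + 128f) >> 8);  // G
        dst[offset + 2] = ToByte((int)(298f * c + 516f * d + 128f) >> 8);             // B
        dst[offset + 3] = 255;                                                        // A
    }

    // Each 4-byte YUY2 group yields two RGBA pixels (8 bytes).
    public static byte[] Convert(byte[] yuy2, float exp, float gamma)
    {
        byte[] rgba = new byte[yuy2.Length * 2];
        for (int i = 0; i + 3 < yuy2.Length; i += 4)
        {
            float y1 = AdjustLuma(yuy2[i], exp, gamma);
            float u = yuy2[i + 1];
            float y2 = AdjustLuma(yuy2[i + 2], exp, gamma);
            float v = yuy2[i + 3];
            WritePixel(y1, u, v, rgba, i * 2);
            WritePixel(y2, u, v, rgba, i * 2 + 4);
        }
        return rgba;
    }
}
```

    The sketch assumes a positive exposure term; a zero or negative value is not handled here.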

    What this code will do:

    This enables users to set a target exposure value for the Kinect V2's color camera to try and match. This WILL NOT change the actual exposure of the image. It adjusts the image's exposure after it has been captured to try and make it consistent and reduce the flickering that constantly occurs with the Kinect. Since it is not adding any detail to the image, in instances where the actual exposure and target exposure vary greatly the resulting image will likely look worse.

    What this code WILL NOT do:

    This code will not add any detail to the image. It will not prevent over/under exposure. It will not completely remove flickering from the Kinect's auto exposure. It will not change the shutter speed and reduce motion blur. It will not reduce noise created from increased gain. It will not prevent the Kinect from dropping down to 15fps in low light.

    The best way to use the Color Camera is still in a bright, evenly lit environment. This will reduce the chances of over exposure and make the auto exposure flickering less noticeable.

    In my opinion the only way to make the Kinect's color camera actually usable is to enable manual exposure control. But hopefully this will give people developing creative apps (3D scanning, computer vision, cinematography, etc) something to work with.

    Please let me know if you have any questions, see any flaws with the code, or have any suggestions for improvement. I would love to make this code as good as possible.

    Thanks and good luck!

    • Edited by sam598 Sunday, April 26, 2015 7:16 PM updated compute shader code
    Sunday, April 26, 2015 6:13 PM

All replies

  • Some more information:

    Exposure Values

    This code assumes that the following settings would be the "ideal" exposure.

    Frame Rate: 30fps

    Shutter Speed: 1/60th of a second (or a 180 degree shutter)

    Gain: 0db

    All of these settings affect exposure, and these are fairly standard settings. The autoExposure value is calculated based on this, but you are free to change the code as you please.
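
    As a small worked example of the shutter figures above, the usual relation between shutter angle, frame rate and exposure time can be sketched as follows. The class and method names are hypothetical and this is not part of the posted project.

```csharp
using System;

// Hypothetical helper: exposure time implied by a shutter angle and frame rate.
public static class ShutterSketch
{
    // A 360 degree shutter exposes for the whole frame interval (1 / fps);
    // smaller angles expose for the matching fraction of it.
    public static double ExposureSeconds(double shutterAngleDegrees, double framesPerSecond)
    {
        return (shutterAngleDegrees / 360.0) / framesPerSecond;
    }
}
```

    With these inputs, a 180 degree shutter at 30fps gives 1/60th of a second, and a 360 degree shutter at 30fps gives 1/30th of a second.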


    In the compute shader the gamma value is removed before adjusting the exposure, and then reapplied after. This is done because it is a much more accurate way of adjusting exposure. It is entirely possible to apply the exposure values directly to the RGB values and reduce computation costs, however this would result in greater flicker and a less consistent exposure.
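
    To illustrate the difference, here is a minimal sketch comparing the remove-gamma, scale, reapply-gamma sequence with a plain multiply on the gamma-encoded value. It is an algebraic observation rather than something from the original post, the names are hypothetical, and it assumes positive values for the exposure term and gamma.

```csharp
using System;

// Hypothetical comparison of exposure scaling with and without gamma removal.
public static class GammaExposureSketch
{
    // Three-step version, as in the compute shader.
    public static double ViaLinearLight(double y, double gamma, double exp)
    {
        double linear = Math.Pow(y, gamma);   // remove gamma
        linear *= exp;                        // adjust exposure
        return Math.Pow(linear, 1.0 / gamma); // reapply gamma
    }

    // Algebraically the same result: (y^g * e)^(1/g) = y * e^(1/g).
    public static double ViaSingleMultiplier(double y, double gamma, double exp)
    {
        return y * Math.Pow(exp, 1.0 / gamma);
    }

    // Naive version: scale the gamma-encoded value directly.
    public static double Naive(double y, double exp)
    {
        return y * exp;
    }
}
```

    For y = 128, gamma = 2.2 and an exposure term of 2, the first two methods both give roughly 175.4, while the naive multiply gives 256. If the identity holds for your pipeline, the multiplier could be computed once per frame instead of calling pow per pixel.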

    Why a compute shader?

    Initially I had written this code to run on the CPU. However, removing and reapplying the gamma was too computationally expensive. It is still entirely possible to apply the exposure values directly to the RGB values and get real-time results on the CPU.

    This does mean that the application requires the ability to run compute shaders. However if the machine is running a Kinect then it should be capable of running this shader.

    Why is there still flickering?

    Despite all of the best intentions, post-process exposing an image is not the same as actually exposing it. Changing the exposure of a camera changes the quality (light, color, noise, etc) of the image and there is no way around that. While the exposure adjustment is handled with floating point precision it is still pre-processed 8bit data, and there is only so much that can be done.

    More than likely there are also filters, sharpening and adjustments happening on the Kinect/API side of things to make the image more "visually pleasing" that make it impossible to match the exposure.

    Also I would never rule out my code being slightly off. So there is that too.

    • Edited by sam598 Sunday, April 26, 2015 6:39 PM typo
    Sunday, April 26, 2015 6:34 PM
  • Thank you for sharing!

    Looks like this is definitely as good as it can be from what we can do as users, and I'm sure it'll be useful to many!


    Sunday, April 26, 2015 7:50 PM
  • sounds great, thanks for sharing. I will try your solution with my application. 
    Tuesday, May 05, 2015 9:13 AM
  • When I tried to use your compute shader I was given this error in Unity:
    Shader error in 'AutoExposureCompensation.compute': Compute shader compilation had a problem, your compute shader was not compiled correctly

    Do you know what might cause this error?

    • Edited by Sean Hart Tuesday, June 02, 2015 6:34 PM
    Tuesday, June 02, 2015 6:33 PM
  • Off the top of my head I do not. One thing I forgot to mention when setting up compute shaders in Unity is that you need to make sure DirectX11 is enabled. You can do this through Edit->Project Settings->Player and check the "Use Direct X11" box.

    Your graphics card also has to support DirectX 11, but I think that's one of the requirements for the Kinect anyway.

    Wednesday, June 03, 2015 10:49 PM
  • First of all thank you for sharing your code!

    I tried to adapt your code without using the shaders. I have some questions about your algorithm.

    1. I can't figure out the reason why you divide the exposureTime and gain by 3.

    2. I couldn't find a unit for the exposureTime given by ColorCameraSettings. I assume the exposureTime is measured in units of 100 nanoseconds. You mentioned a function (ColorCameraSettings.ExposureTime.TotalMilliseconds) that doesn't exist in the current API. I divided the exposureTime by 10000 to get milliseconds. Is that right?

    3. What is your targetExposure? I chose a value based on the average gain and average exposureTime of 50 frames before putting a dark object in the scene and thus triggering the autoExposure correction (avg(gain)*avg(exposureTime)).

    4. You use a correction term "Exp" which is 1/(autoExposure - targetExposure). In nearly constant lighting the difference between autoExposure and targetExposure should be really small (in my case < 0.003), which leads to a really large correction term.

    5. Do you normalize your RGB image before converting it to YUV? I assume that your Y1 is in the range [0..1] before you remove Gamma?

    I hope you can help me with some of the questions or maybe someone else can.

    If you want to know something about my actual approach, I described it in a different thread.

    Thanks a lot!!!
    Wednesday, September 16, 2015 2:42 PM
  • Thanks Sascha!

    1. I only divide gain by 3. Normally gain is represented in decibels (db) and it appears that is how the Kinect SDK works as well. 3 decibels are equivalent to one stop of light, so an image with 3db of gain is twice as bright as an image with 0db of gain. I divide the gain by 3 (and the exposure time by 333333) in order to find a common value of the exposure (since both exposure time and gain change the exposure). The goal is to find an exponential value which the luminance is multiplied by (the gamma is temporarily removed so that the linear brightness values can be multiplied, instead of the gamma corrected ones).

    2. ColorCameraSettings.ExposureTime should be in the API, and returns a TimeSpan value. The function "TotalMilliseconds" is a function of the TimeSpan class which should be part of the System namespace. For some reason it returns the values at an odd scale, which is why it's divided by 333333.

    3. I partially explain how the target value is determined in my second post. As a starting point I choose imaginary exposure values that would be "ideal" for film and television (0db of gain and a 180 degree shutter at 30fps which is equal to 1/60th of a second exposure time). This "ideal" exposure value is arbitrary, so as long as the value is constant it should be fine. In my code I let the user change this value. The current exposure values from the ColorCameraSettings are treated as an "offset" from this value.

    4. Exp is the difference between the autoexposure and the desired target exposure. You are right in that it doesn't work if the current exposure and target exposure are equal like they are in your method. In practice the "ideal" exposure I've set cannot be reached by the Kinect since the shutter is almost always 360 degrees (1/30th of a second). There is a bug as this number reaches 0 or below so there is room for improvement, but in general use (with my settings) it's not an exposure value you would use anyway.

    5. By default the color image from the Kinect is YUV. When you get an RGB image the Kinect is internally converting the image. To minimize the amount of processing this code starts with the YUV data and then converts it to RGB. I've also found that the YUV image has slightly higher dynamic range than the converted RGB image, so it's better to apply the correction directly. As for normalization, the YUV values are brought into the compute shader as 8 bit values and output as such. They are temporarily converted to floating point values in the GPU to help with precision, but I haven't found a need to normalize them. You should be able to if you want though.
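
    The arithmetic discussed in points 1 to 4 can be sketched as a standalone function. The first method mirrors the lines in ColorSourceManager.Update(). The guard in the second method is not in the posted code: it is one hypothetical way to sidestep the zero-or-negative case mentioned in point 4, and the threshold value and fallback are assumptions.

```csharp
using System;

// Hypothetical standalone version of the exposure term calculation.
public static class ExposureTermSketch
{
    // Assumed threshold below which the correction is skipped.
    private const float MinDifference = 0.001f;

    // Mirrors the posted code: exposure time scaled by 333333 and a
    // 0.5 reference, multiplied by gain divided by 3.
    public static float AutoExposure(float exposureTimeValue, float gain)
    {
        float shutterGain = (exposureTimeValue / 333333f) / 0.5f;
        float gainTerm = gain / 3f;
        return shutterGain * gainTerm;
    }

    // Returns false (and a neutral multiplier of 1) when the difference
    // is too close to zero or negative, instead of dividing by it.
    public static bool TryComputeExp(float autoExposure, float targetExposure, out float exp)
    {
        float difference = autoExposure - targetExposure;
        if (difference <= MinDifference)
        {
            exp = 1f;
            return false;
        }
        exp = 1f / difference;
        return true;
    }
}
```

    Falling back to a multiplier of 1 simply leaves the frame uncorrected; holding the last valid value would be another option.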

    Keep in mind this is not a "true" fix; it is only a compensation. At best this reduces auto exposure flickering. With differences of more than a stop the image looks pretty terrible.

    There are other depth cameras now that allow full control of the color camera and have comparable specs. It doesn't seem like there will be additional updates to the Kinect, so if exposure control is important and you can do without skeletal tracking, I would look into those.

    I hope that helps, thanks again!

    Sunday, September 20, 2015 7:39 PM
    Thank you Sam for your detailed reply. Unfortunately I can't use .NET in my Win32 program, which is why I can't use the "TotalMilliseconds" function of the TimeSpan struct. ColorCameraSettings.ExposureTime returns an int64 value, which I guess is the exposure time in units of 100 ns. I tried dividing the value by 10000 (to get ms) and using your algorithm (with targetExposureTime = 16 ms, which is the value I assume you use), but that leads to a negative correction term "Exp".
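
    As a quick check of the unit conversion, here is a minimal sketch that assumes, as in the post above, that the raw value counts 100 ns units. The thread does not confirm this, and the class and method names are hypothetical. A .NET TimeSpan tick is also 100 ns, so TimeSpan.FromTicks gives the same figure.

```csharp
using System;

// Hypothetical conversion of a raw exposure value to milliseconds,
// ASSUMING the raw value is a count of 100 ns units.
public static class ExposureTicksSketch
{
    // 10,000 units of 100 ns make one millisecond.
    public static double HundredNanosecondsToMilliseconds(long rawValue)
    {
        return rawValue / 10000.0;
    }
}
```

    Under that assumption, a raw value of 166667 corresponds to about 16.67 ms (1/60th of a second), and 333333 to about 33.33 ms (1/30th of a second).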

    I will try some other ideas and will let you know if I get good results.

    Thank you for your help!

    Wednesday, September 23, 2015 6:41 PM