sizing the bytewidth when creating a CD3D11_BUFFER_DESC

  • Question

  • Why are the results of these two lines equivalent?

    UINT cbSize = sizeof(ModelViewProjectionConstantBuffer); // result happens to be 192 in "CubeRenderer.cpp"
    UINT cbSize2 = (sizeof(ModelViewProjectionConstantBuffer) + 15) / 16 * 16; // result is also 192

    I've seen both ways used in the samples to set the ByteWidth of a buffer description. Arithmetically, it seems like they should produce different values, but they don't, so what am I missing?

    Also, why did the author of the shooting game sample put the "+ 15) / 16 * 16" in that line, whereas the author of the popular "CubeRenderer.cpp" did not?

    Thank you.


    Sunday, April 7, 2013 2:32 AM


  • D3D cbuffer resources must be sized as a multiple of 16 bytes (fundamentally a cbuffer is a collection of constants, each of which is composed of four 4-byte elements). The "uint someCBufferSize = (sizeof(SomeCBuffer) + 15) / 16 * 16;" logic takes advantage of integer division truncation to ensure that the resulting value is a multiple of 16 and is the smallest such value that can hold all the data, so no excess bytes are wasted.
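
    To make the arithmetic concrete, here is a minimal sketch of the rounding logic. The helper name RoundUpTo16 is hypothetical (it does not appear in either sample); the expression inside it is the one quoted above. The checks show why 192 comes out unchanged while sizes that are not multiples of 16 get bumped up.

```cpp
#include <cstddef>

// Hypothetical helper (not taken from the samples): rounds a byte count up to
// the next multiple of 16 by relying on truncating integer division.
constexpr std::size_t RoundUpTo16(std::size_t byteCount)
{
    return (byteCount + 15) / 16 * 16;
}

// 192 is already a multiple of 16: (192 + 15) / 16 truncates to 12, and 12 * 16 = 192.
static_assert(RoundUpTo16(192) == 192, "a multiple of 16 is left unchanged");

// 180 is not: (180 + 15) / 16 truncates to 12, so the result is rounded up to 192.
static_assert(RoundUpTo16(180) == 192, "180 rounds up to 192");

// One byte past a multiple of 16 rounds up to the next multiple.
static_assert(RoundUpTo16(193) == 208, "193 rounds up to 208");
```

    So the two lines in the question only agree because sizeof(ModelViewProjectionConstantBuffer) happens to be 192, which is exactly 12 * 16.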

    If your CPU-side struct representation of the GPU-side cbuffer happens to be a multiple of 16, then you could leave off this logic as it would be superfluous. But if the struct ever changed such that it was no longer a multiple of 16 then you would wind up with bad things happening (how bad would probably depend on the graphics driver but it's possible that the D3D runtime itself would return a failure HRESULT without leaving it up to the graphics driver; I've never tried so I don't know).
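
    One way to guard against the struct silently changing size is a compile-time check. The sketch below is only an illustration: the struct names are made up, and plain float arrays stand in for the matrix types so that it compiles without the DirectX headers. It assumes a struct of three 4x4 float matrices, which is consistent with the 192 bytes reported in the question.

```cpp
#include <cstddef>

// Hypothetical CPU-side struct: three 4x4 float matrices, 3 * 64 = 192 bytes.
struct ExampleMvpConstants
{
    float model[16];
    float view[16];
    float projection[16];
};

// Fails the build if someone later adds a member that breaks the 16 byte rule.
static_assert(sizeof(ExampleMvpConstants) % 16 == 0,
              "constant buffer structs must be a multiple of 16 bytes");

// Adding a lone float would give 196 bytes, so explicit padding is added here
// to bring the struct back up to a multiple of 16 (208 bytes).
struct ExampleMvpConstantsWithTime
{
    float model[16];
    float view[16];
    float projection[16];
    float time;
    float padding[3]; // unused, keeps the size a multiple of 16
};

static_assert(sizeof(ExampleMvpConstantsWithTime) == 208,
              "padded struct is 13 * 16 bytes");
```

    With a check like this in place, sizeof can be passed directly as the ByteWidth, and the rounding expression becomes a second line of defense rather than the only one.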

    Since cbuffer constants are also not supposed to straddle multiple 16 byte chunks*, I generally prefer to define my CPU-side structs using DirectX::XMFLOAT4 elements and DirectX::XMFLOAT4X4 elements. (*Matrices such as XMFLOAT4X4 are treated as being made up of multiple 4 element constants, none of which individually straddles a 16 byte boundary, thus preserving the proper 16 byte alignment which the GPU registers require.) Defining them this way (and using the register and packoffset keywords in HLSL) results in fewer errors and oddities since the constant values are given explicit locations on the GPU and the data (assuming you comment the struct appropriately) should always be placed properly on the CPU such that a call to ID3D11DeviceContext::UpdateSubresource will result in the correct data ending up in their proper places on the GPU-side.
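
    As a rough sketch of that layout style: Float4 and Float4x4 below are hypothetical stand-ins for DirectX::XMFLOAT4 and DirectX::XMFLOAT4X4 (used so the example compiles without DirectXMath), and the struct and member names are invented for illustration. The comments note which 16 byte register each member would correspond to if the HLSL side used matching packoffset values.

```cpp
#include <cstddef>

// Stand-ins for DirectX::XMFLOAT4 and DirectX::XMFLOAT4X4 (assumed to have
// the same size and member layout: 16 and 64 bytes of floats respectively).
struct Float4   { float x, y, z, w; };
struct Float4x4 { float m[4][4]; };

// Hypothetical CPU-side mirror of a cbuffer. Every member is a whole number
// of 16 byte chunks, so nothing can straddle a 16 byte boundary.
struct ExampleLightingConstants
{
    Float4x4 world;          // bytes  0..63, would map to c0..c3
    Float4   lightColor;     // bytes 64..79, would map to c4
    Float4   lightDirection; // bytes 80..95, would map to c5 (w unused)
};

// Each member starts on a 16 byte boundary and the total size is 6 * 16.
static_assert(offsetof(ExampleLightingConstants, world) == 0,
              "world starts at byte 0");
static_assert(offsetof(ExampleLightingConstants, lightColor) == 64,
              "lightColor starts at byte 64");
static_assert(offsetof(ExampleLightingConstants, lightDirection) == 80,
              "lightDirection starts at byte 80");
static_assert(sizeof(ExampleLightingConstants) == 96,
              "struct is a multiple of 16 bytes");
```

    The cost of this approach is a few wasted floats (the unused w above); the benefit is that the CPU-side offsets can be read straight off the struct.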

    For more detail, see the Packing Rules for Constant Variables page on MSDN: http://msdn.microsoft.com/en-us/library/bb509632(v=vs.85).aspx . It does an excellent job explaining GPU-side packing and contains a number of clarifying examples of how specific data are packed.

    As to why they failed to put that logic into "CubeRenderer.cpp", I suspect it was an oversight rather than a deliberate omission.

    XNA/DirectX MVP | Website | Blog | @mikebmcl

    • Edited by MikeBMcL Sunday, April 7, 2013 4:43 AM
    • Marked as answer by Shazen Sunday, April 7, 2013 3:29 PM
    Sunday, April 7, 2013 4:43 AM