DirectX Shader compiler regression : failed to optimize pixel shader

    Question

  • Hi team,
      I have a shader.
      It is compiled by Microsoft (R) D3D10 Shader Compiler 9.24.949.2307 without any issues.
    fxc.39.exe /FcLmPS_lp3am.psh LmPS.fx /Tps_2_0 /DSpec=3 /DSType=1 /DMix=1 /DLma=1 /DBump=1 /DPrlx=1
    compilation succeeded; see LmPS_lp3am.psh
      But when I try to compile it with the latest fxc version (Direct3D Shader Compiler 9.29.952.3111, June 2010), it cannot optimize the shader, and as a result it does not fit in the 64 arithmetic instruction slots:
    LmPS.fx(155,5): error X5608: Compiled shader code uses too many arithmetic instruction slots (66). Max. allowed by the target (ps_2_0) is 64.
    compilation failed; no code produced

    I tried setting a more aggressive optimization level ("/O3"), but it did not help.

    Besides that, I found that most of my shaders are compiled in a less optimized way: they have more arithmetic instructions and use more static float constants.
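    To compare compiler versions across many shaders, the slot counts can be read from the compiler output itself. The following is a minimal Python sketch, not part of the original post. It assumes the two text formats quoted in this thread: the "// approximately N instruction slots used" trailer of a listing and the X5608 error line. The function names `slots_used` and `parse_x5608` are hypothetical.

```python
import re

# Trailer that fxc writes at the end of an assembly listing (format as quoted in this thread).
SLOTS_RE = re.compile(r"approximately (\d+) instruction slots? used")

# X5608 error line (format as quoted in this thread).
X5608_RE = re.compile(
    r"error X5608: .*?instruction slots \((\d+)\)\. "
    r"Max\. allowed by the target \((\w+)\) is (\d+)"
)

def slots_used(listing):
    """Return the slot count reported in a listing, or None if no trailer is found."""
    match = SLOTS_RE.search(listing)
    return int(match.group(1)) if match else None

def parse_x5608(output):
    """Return used slots, target profile, and limit from an X5608 error, or None."""
    match = X5608_RE.search(output)
    if match is None:
        return None
    return {
        "used": int(match.group(1)),
        "target": match.group(2),
        "limit": int(match.group(3)),
    }
```

    Running `slots_used` over the listings produced by two fxc versions for the same shader gives a per-shader difference, and `parse_x5608` shows by how many slots a failing shader exceeds the target limit.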

    Now I have to use an old DirectX SDK version in my project, which is not a good approach.

    Thanks in advance.

      -Kirill

     

    Wednesday, November 09, 2011 6:38 PM

All replies

  • If it is possible, I'll attach the shader source to simplify reproduction of the issue.



    Thursday, December 08, 2011 4:34 PM
  • I tried the latest FXC shipped with VS2012. It has the same bug:

    C:\Work\fxc>fxc.VS2012.exe /O3 /FcLmPS_lp3am.psh LmPS.fx /Tps_2_0 /DSpec=3 /DSType=1 /DMix=1 /DLma=1 /DBump=1 /DPrlx=1
    Microsoft (R) Direct3D Shader Compiler 9.30.9200.16384
    Copyright (C) Microsoft Corporation 2002-2011. All rights reserved.

    C:\Work\fxc\LmPS.fx(155,16): error X5608: Compiled shader code uses too many arithmetic instruction slots (66). Max. allowed by the target (ps_2_0) is 64. Consider increasing optimization level to reduce instruction count.

    compilation failed; no code produced

    Monday, October 22, 2012 7:40 PM
  • Please send the repro

    Wednesday, October 24, 2012 6:07 AM
  • How do I send it?

     Thanks

    Wednesday, October 24, 2012 3:21 PM
  • Here is the source

    http://dl.dropbox.com/u/3538621/fxcBug.zip

    The November 2008 FXC builds it without issues (// Generated by Microsoft (R) HLSL Shader Compiler 9.24.950.2656).

    fxc.VS2012.exe fails.

    Tuesday, October 30, 2012 3:38 PM
  • Thanks.

    That shader is very near the maximum complexity of pixel shader model 2.0, so it only takes a slightly different behavior to go over the slot limit. No shader since March 2009 can make it fit into ps_2_0, but it works perfectly fine with ps_2_a or ps_4_0_level_9_3.

    You can either simplify it a little, move to ps_2_a, or stick with the several year old compiler.

    Wednesday, October 31, 2012 1:50 AM
  • Hi Chuck,

    > No shader since March 2009 can make it fit into ps_2_0

    This is a bug. It is a regression since March 2009. The shader can fit in 2_0; the problem is a bug in the new compiler.

    > You can either simplify it a little

    Can you advise how? I'd like to hear.

    > move to ps_2_a

    I can't. Unfortunately, I have 2.0 hardware restrictions.

    > or stick with the several year old compiler

    Which is what I'm doing.

    It would be nice to have the bug fixed.

    It is also important because this bug affects shader performance.

    Is it possible to file and track a bug against the compiler and subscribe to e-mail notifications for it?

    Thanks

      -Kirill



    Wednesday, October 31, 2012 12:04 PM
  • This is the most simplified shader that reproduces the problem:

    cmdline=fxc /Tps_2_0 ps.fx

    float4 color : register(c1);
    float4 ambient : register(c0);
    float4 L : register(c2);
    
    float4 main() : color {
    
        float3 n = float3(0,0,1); 
    
        float len = length(L.xyz);
        float3 l = L.xyz/len;
        float3 attC = color.xyz*saturate(1-len);
        float3 diff = saturate(dot(n,l))*attC;
    
        return float4(saturate(diff+ambient), 1);
    }
    

    The old compiler produces this:

    // Generated by Microsoft (R) HLSL Shader Compiler 9.24.950.2656
    //
    // Registers:
    //
    //   Name         Reg   Size
    //   ------------ ----- ----
    //   ambient      c0       1
    //   color        c1       1
    //   L            c2       1
    //
    
        ps_2_0
        def c3, 1, 0, 0, 0
        dp3 r0.w, c2, c2
        rsq r0.x, r0.w
        rcp r0.y, r0.x
        mul_sat r0.x, r0.x, c2.z
        add_sat r0.y, -r0.y, c3.x
        mul r0.yzw, r0.y, c1.wzyx
        mad_sat r0.xyz, r0.x, r0.wzyx, c0
        mov r0.w, c3.x
        mov oC0, r0
    
    // approximately 9 instruction slots used

    The new compiler produces this:

    // Generated by Microsoft (R) HLSL Shader Compiler 9.30.9200.16384
    //
    // Registers:
    //
    //   Name         Reg   Size
    //   ------------ ----- ----
    //   ambient      c0       1
    //   color        c1       1
    //   L            c2       1
    //
    
        ps_2_0
        def c3, 1, 0, 0, 0
        dp3 r0.w, c2, c2
        rsq r0.x, r0.w
        mul_sat r0.y, r0.x, c2.z
        rcp r0.x, r0.x
        add r0.x, -r0.x, c3.x
        mul r1.xyz, r0.x, c1
        mul r0.yzw, r0.y, r1.wzyx
        cmp r0.xyz, r0.x, r0.wzyx, c3.y
        add_sat r0.xyz, r0, c0
        mov r0.w, c3.x
        mov oC0, r0
    
    // approximately 11 instruction slots used

    That is 11 instruction slots instead of 9.
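    As a sanity check that the two listings compute the same result, here is a minimal Python sketch, not part of the original post, that emulates each listing one instruction at a time. The function names are hypothetical, registers are modelled as plain Python floats rather than GPU precision, and L is assumed to be non-zero (a zero vector would divide by zero here).

```python
import math

def saturate(x):
    """HLSL saturate / the _sat instruction modifier: clamp to [0, 1]."""
    return min(max(x, 0.0), 1.0)

def rsq_dot3(v):
    """dp3 followed by rsq: reciprocal square root of dot(v.xyz, v.xyz)."""
    return 1.0 / math.sqrt(v[0] * v[0] + v[1] * v[1] + v[2] * v[2])

def old_listing(ambient, color, L):
    """Emulates the 9-slot listing (c0 = ambient, c1 = color, c2 = L)."""
    rsq = rsq_dot3(L)                       # dp3, rsq
    length = 1.0 / rsq                      # rcp
    n_dot_l = saturate(rsq * L[2])          # mul_sat (n = (0, 0, 1))
    att = saturate(1.0 - length)            # add_sat
    scaled = [att * c for c in color[:3]]   # mul
    rgb = [saturate(n_dot_l * scaled[i] + ambient[i]) for i in range(3)]  # mad_sat
    return rgb + [1.0]                      # mov, mov

def new_listing(ambient, color, L):
    """Emulates the 11-slot listing (c0 = ambient, c1 = color, c2 = L)."""
    rsq = rsq_dot3(L)                                 # dp3, rsq
    n_dot_l = saturate(rsq * L[2])                    # mul_sat
    length = 1.0 / rsq                                # rcp
    one_minus_len = 1.0 - length                      # add (no _sat)
    scaled = [one_minus_len * c for c in color[:3]]   # mul r1.xyz
    diff = [n_dot_l * s for s in scaled]              # mul r0.yzw
    # cmp: keep diff where (1 - len) >= 0, otherwise select the constant 0
    diff = [d if one_minus_len >= 0.0 else 0.0 for d in diff]
    rgb = [saturate(diff[i] + ambient[i]) for i in range(3)]  # add_sat
    return rgb + [1.0]                                # mov, mov
```

    In this emulation the new listing replaces the single add_sat with add plus cmp, and splits the mad_sat into mul plus add_sat. Since len is never negative, 1 - len never exceeds 1, so selecting 0 when it is negative gives the same value as saturating it, and both functions return matching colors for the same inputs.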

    Can I file a bug? How do I do this?

    Thanks


    Saturday, December 22, 2012 10:17 AM