DirectX Shader compiler regression : failed to optimize pixel shader
-
Wednesday, November 09, 2011 6:38 PM
Hi team,
I have a shader. It is compiled by Microsoft (R) D3D10 Shader Compiler 9.24.949.2307 without any issues.
fxc.39.exe /FcLmPS_lp3am.psh LmPS.fx /Tps_2_0 /DSpec=3 /DSType=1 /DMix=1 /DLma=1 /DBump=1 /DPrlx=1
compilation succeeded; see LmPS_lp3am.psh
But when I try to compile it with the latest fxc version (Direct3D Shader Compiler 9.29.952.3111, June 2010), it cannot optimize the shader enough, and as a result the shader does not fit in the 64 arithmetic instruction slots:
LmPS.fx(155,5): error X5608: Compiled shader code uses too many arithmetic instruction slots (66). Max. allowed by the target (ps_2_0) is 64.
compilation failed; no code produced
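For anyone who wants a build script to track how far over budget a shader is, the error text above can be parsed. This is only a minimal Python sketch. It assumes the X5608 message keeps the wording shown in this thread, and the function name `slots_over_budget` is made up for illustration.

```python
import re

# Pattern for the X5608 diagnostic, based only on the wording seen in this thread.
X5608 = re.compile(
    r"error X5608: .*?instruction slots \((\d+)\)\. "
    r"Max\. allowed by the target \((\w+)\) is (\d+)\."
)

def slots_over_budget(fxc_output):
    """Return (target, used, limit, excess), or None if X5608 is absent."""
    m = X5608.search(fxc_output)
    if m is None:
        return None
    used, target, limit = int(m.group(1)), m.group(2), int(m.group(3))
    return target, used, limit, used - limit

message = (
    "LmPS.fx(155,5): error X5608: Compiled shader code uses too many "
    "arithmetic instruction slots (66). Max. allowed by the target "
    "(ps_2_0) is 64."
)
print(slots_over_budget(message))  # ('ps_2_0', 66, 64, 2)
```

Here the shader is only 2 slots over the ps_2_0 limit, which is why a small change in optimizer behavior is enough to break the build.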
I tried setting the more aggressive optimization option "/O3", but it did not help.
Besides that, I found that most of my shaders are compiled in a less optimized way: they have more arithmetic instructions and use more static float constants.
Now I have to use an old DirectX SDK version in my project, which is not a good approach.
Thanks in advance.
-Kirill
- Edited by Kirill.Prazdnikov Wednesday, November 09, 2011 7:03 PM
All Replies
-
Thursday, December 08, 2011 4:34 PM
If it is possible, I'll attach the source of the shader to simplify reproduction of the issue.
- Edited by Kirill.Prazdnikov Monday, October 22, 2012 8:01 PM
- Edited by Kirill.Prazdnikov Tuesday, October 23, 2012 12:47 PM
-
Monday, October 22, 2012 7:40 PM
I tried the latest FXC shipped with VS2012. It has the same bug :
C:\Work\fxc>fxc.VS2012.exe /O3 /FcLmPS_lp3am.psh LmPS.fx /Tps_2_0 /DSpec=3 /DSType=1 /DMix=1 /DLma=1 /DBump=1 /DPrlx=1
Microsoft (R) Direct3D Shader Compiler 9.30.9200.16384
Copyright (C) Microsoft Corporation 2002-2011. All rights reserved.
C:\Work\fxc\LmPS.fx(155,16): error X5608: Compiled shader code uses too many arithmetic instruction slots (66). Max. allowed by the target (ps_2_0) is 64. Consider increasing optimization level to reduce instruction count.
compilation failed; no code produced
-
Wednesday, October 24, 2012 6:07 AM
Please send the repro
-
Wednesday, October 24, 2012 3:21 PM
How do I send it?
Thanks
-
Tuesday, October 30, 2012 3:38 PM
Here is the source
http://dl.dropbox.com/u/3538621/fxcBug.zip
FXC November 2008 builds it without issues : // Generated by Microsoft (R) HLSL Shader Compiler 9.24.950.2656
fxc.VS2012.exe fails.
-
Wednesday, October 31, 2012 1:50 AM
Thanks.
That shader is very near the maximum complexity of pixel shader model 2.0, so it only takes a slightly different behavior to go over the slot limit. No shader since March 2009 can make it fit into ps_2_0, but it works perfectly fine with ps_2_a or ps_4_0_level_9_3.
You can either simplify it a little, move to ps_2_a, or stick with the several year old compiler.
-
Wednesday, October 31, 2012 12:04 PM
Hi Chuck,
> No shader since March 2009 can make it fit into ps_2_0
This is a bug. It is a regression since March 2009. The shader can fit in 2_0; the problem is that the new compiler has a bug.
> You can either simplify it a little
Can you advise how? I'd like to hear.
> move to ps_2_a
I can't. Unfortunately, I have 2.0 hardware restrictions.
> or stick with the several year old compiler
Which is what I'm doing.
It would be nice to have the bug fixed.
It is also important because this bug affects shader performance.
Is it possible to file and track a bug against the compiler and subscribe to e-mail notifications for it?
Thanks
-Kirill
- Edited by Kirill.Prazdnikov Wednesday, October 31, 2012 12:07 PM
- Edited by Kirill.Prazdnikov Wednesday, October 31, 2012 12:08 PM
-
Saturday, December 22, 2012 10:17 AM
This is the most simplified shader that reproduces the problem:
cmdline=fxc /Tps_2_0 ps.fx
float4 color : register(c1);
float4 ambient : register(c0);
float4 L : register(c2);

float4 main() : color
{
    float3 n = float3(0,0,1);
    float len = length(L.xyz);
    float3 l = L.xyz/len;
    float3 attC = color.xyz*saturate(1-len);
    float3 diff = saturate(dot(n,l))*attC;
    return float4(saturate(diff+ambient), 1);
}

The old compiler does that :
// Generated by Microsoft (R) HLSL Shader Compiler 9.24.950.2656
//
// Registers:
//
//   Name         Reg   Size
//   ------------ ----- ----
//   ambient      c0       1
//   color        c1       1
//   L            c2       1
//

    ps_2_0
    def c3, 1, 0, 0, 0
    dp3 r0.w, c2, c2
    rsq r0.x, r0.w
    rcp r0.y, r0.x
    mul_sat r0.x, r0.x, c2.z
    add_sat r0.y, -r0.y, c3.x
    mul r0.yzw, r0.y, c1.wzyx
    mad_sat r0.xyz, r0.x, r0.wzyx, c0
    mov r0.w, c3.x
    mov oC0, r0

// approximately 9 instruction slots used
The new compiler does that :
// Generated by Microsoft (R) HLSL Shader Compiler 9.30.9200.16384
//
// Registers:
//
//   Name         Reg   Size
//   ------------ ----- ----
//   ambient      c0       1
//   color        c1       1
//   L            c2       1
//

    ps_2_0
    def c3, 1, 0, 0, 0
    dp3 r0.w, c2, c2
    rsq r0.x, r0.w
    mul_sat r0.y, r0.x, c2.z
    rcp r0.x, r0.x
    add r0.x, -r0.x, c3.x
    mul r1.xyz, r0.x, c1
    mul r0.yzw, r0.y, r1.wzyx
    cmp r0.xyz, r0.x, r0.wzyx, c3.y
    add_sat r0.xyz, r0, c0
    mov r0.w, c3.x
    mov oC0, r0

// approximately 11 instruction slots used

Which is 11 instructions instead of 9.
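To see exactly which opcodes account for the two extra slots, the two listings can be compared mechanically. This is a rough Python sketch, not a real slot counter. It assumes every opcode other than the `ps_2_0` version token and `def` costs one slot, which matches the compiler's "approximately N" comments for these two listings but may not hold for macro instructions. The helper name `opcodes` is made up for illustration.

```python
from collections import Counter

OLD = """
ps_2_0
def c3, 1, 0, 0, 0
dp3 r0.w, c2, c2
rsq r0.x, r0.w
rcp r0.y, r0.x
mul_sat r0.x, r0.x, c2.z
add_sat r0.y, -r0.y, c3.x
mul r0.yzw, r0.y, c1.wzyx
mad_sat r0.xyz, r0.x, r0.wzyx, c0
mov r0.w, c3.x
mov oC0, r0
"""

NEW = """
ps_2_0
def c3, 1, 0, 0, 0
dp3 r0.w, c2, c2
rsq r0.x, r0.w
mul_sat r0.y, r0.x, c2.z
rcp r0.x, r0.x
add r0.x, -r0.x, c3.x
mul r1.xyz, r0.x, c1
mul r0.yzw, r0.y, r1.wzyx
cmp r0.xyz, r0.x, r0.wzyx, c3.y
add_sat r0.xyz, r0, c0
mov r0.w, c3.x
mov oC0, r0
"""

def opcodes(listing):
    """Opcodes assumed to occupy one instruction slot each, in listing order."""
    ops = []
    for line in listing.splitlines():
        line = line.split("//")[0].strip()  # drop comments and blank lines
        if not line:
            continue
        op = line.split()[0]
        # Assumption: the version token and constant definitions use no slot.
        if op.startswith("ps_") or op == "def":
            continue
        ops.append(op)
    return ops

old_ops, new_ops = opcodes(OLD), opcodes(NEW)
added = Counter(new_ops) - Counter(old_ops)
removed = Counter(old_ops) - Counter(new_ops)
print(len(old_ops), len(new_ops))  # 9 11
print(dict(added))                 # {'add': 1, 'mul': 1, 'cmp': 1}
print(dict(removed))               # {'mad_sat': 1}
```

So the new compiler no longer folds the multiply-add into a single `mad_sat`, and it replaces the saturated subtraction with a plain `add` followed by a `cmp`.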
Can I file a bug? How do I do this?
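As supporting detail for a bug report, it helps to show that the two listings compute the same color and differ only in cost. Below is a hand translation of both listings into Python, covering only the rgb channels (both listings write 1 to alpha). It is a sketch under assumptions: the swizzle reading is my own interpretation of the assembly, Python doubles stand in for GPU floats, and the function names are made up for illustration.

```python
import math
import random

def sat(x):
    """Clamp to [0, 1], like the _sat instruction modifier."""
    return min(max(x, 0.0), 1.0)

def old_listing(ambient, color, L):
    """Hand translation of the 9.24.950.2656 listing (rgb only)."""
    inv_len = 1.0 / math.sqrt(sum(v * v for v in L))   # dp3, rsq
    length = 1.0 / inv_len                              # rcp
    n_dot_l = sat(inv_len * L[2])                       # mul_sat
    att = sat(1.0 - length)                             # add_sat
    scaled = [att * c for c in color]                   # mul
    return [sat(n_dot_l * s + a)                        # mad_sat
            for s, a in zip(scaled, ambient)]

def new_listing(ambient, color, L):
    """Hand translation of the 9.30.9200.16384 listing (rgb only)."""
    inv_len = 1.0 / math.sqrt(sum(v * v for v in L))   # dp3, rsq
    n_dot_l = sat(inv_len * L[2])                       # mul_sat
    length = 1.0 / inv_len                              # rcp
    falloff = 1.0 - length                              # add (no _sat)
    scaled = [falloff * c for c in color]               # mul
    lit = [n_dot_l * s for s in scaled]                 # mul
    lit = [v if falloff >= 0.0 else 0.0 for v in lit]   # cmp against 0
    return [sat(v + a) for v, a in zip(lit, ambient)]   # add_sat

# Compare both translations on random constants (L is never exactly zero here).
rng = random.Random(0)
for _ in range(1000):
    ambient = [rng.uniform(0, 1) for _ in range(3)]
    color = [rng.uniform(0, 1) for _ in range(3)]
    L = [rng.uniform(-2, 2) for _ in range(3)]
    a = old_listing(ambient, color, L)
    b = new_listing(ambient, color, L)
    assert all(math.isclose(x, y, abs_tol=1e-12) for x, y in zip(a, b))
print("listings agree on 1000 random inputs")
```

Since `len` is never negative, `1-len` never exceeds 1, so the `add` plus `cmp` pair in the new listing gives the same result as the single `add_sat` in the old one.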
Thanks
- Edited by Kirill.Prazdnikov Saturday, December 22, 2012 10:19 AM


