XMVectorFloor - Is it broken?

XMVectorFloor isn't doing what I expect it to do in certain cases. I'm not sure if this is a misunderstanding on my part or whether it just isn't working right.
I am using VS2012 on Windows 8, with DirectXMath version 303 (from Windows Kits\8.0).
If I take any odd valued "whole" number and use XMVectorFloor, I get the next lower whole number. For example, the floor of a vector with "105.0f" in it returns 104.0f when I would expect it to return 105.0f.
I tried this in a loop and it does this for every odd number.
I did the same test using floorf and floorf will return 105.0 not 104.0.
My test loop was a simple integer loop: cast the integer to a float, replicate it into the vector, take the floor, then store it back in an XMFLOAT3. Every odd number in the sequence returns the next lower integer. Using floorf in the same loop (cast to a float, take the floor with floorf, then cast back to an int) behaves differently.
I thought at first this might have to do with not being able to represent the whole number precisely enough (I admit I don't know how the float actually stores the values) which is why I compared the behavior against floorf.
 Edited by Myiasis Thursday, January 31, 2013 6:19 PM
Answers

The algorithm used by XMVectorFloor is:
v = v - (0.5f - EPSILON); v = XMVectorRound(v);
105.f ends up being encoded such that this puts it just below 105, so it rounds to 104. Likely there needs to be another special-case test added here.
XMVectorCeiling also seems to have a similar problem with 105. This is implemented as:
v = v + 0.5f - EPSILON; v = XMVectorRound(v);
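The odd-number pattern is what you would expect if the rounding step is round-to-nearest with ties to even (an assumption here, not something confirmed above). Subtracting just under 0.5 from an odd whole number like 105 gives a result so close to 104.5 that single precision stores it as exactly 104.5, and that tie resolves to the even neighbour, 104. Below is a minimal scalar sketch of the floor path, not DirectXMath's actual code: `floor_via_round` and `compare_with_crt_floor` are hypothetical names, the bit pattern 0x3EFFFFFD for "0.5f - EPSILON" is an assumption, and `std::nearbyint` stands in for XMVectorRound.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>
#include <cstring>

// Hypothetical scalar stand-in for the vector floor path described above.
// This is a sketch for illustration, not DirectXMath code.
inline float floor_via_round(float v)
{
    // Assumed bit pattern for "0.5f - EPSILON" (three ulps below 0.5f).
    const std::uint32_t bits = 0x3EFFFFFDu;
    float halfMinusEpsilon;
    std::memcpy(&halfMinusEpsilon, &bits, sizeof(halfMinusEpsilon));

    // volatile forces the difference to be rounded to single precision
    // instead of being kept in a wider intermediate register.
    volatile float stored = v - halfMinusEpsilon;
    const float shifted = stored;

    // Rounds to nearest, ties to even, under the default rounding mode.
    return std::nearbyint(shifted);
}

// Compares the sketch with the CRT floor for a few whole numbers.
inline void compare_with_crt_floor()
{
    assert(std::floor(105.0f) == 105.0f);       // CRT floor is exact
    assert(floor_via_round(105.0f) == 104.0f);  // odd: 104.5 ties to even 104
    assert(floor_via_round(104.0f) == 104.0f);  // even: 103.5 ties to even 104
    assert(floor_via_round(3.0f) == 2.0f);      // odd: 2.5 ties to even 2
    assert(floor_via_round(1.0f) == 1.0f);      // small enough that EPSILON survives
}
```

In this sketch 1.0f comes out right because the epsilon is still representable at that magnitude, so the intermediate value stays above 0.5. From 3.0f upward the float spacing is coarser than the epsilon, the intermediate value collapses onto the exact tie, and every odd input drops by one.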
Note that DirectXMath is generally coded assuming you want 'fast' values instead of 'robust' ones. In some cases that means the answer is going to be an approximation. For 'robust' math, you should stick with using the scalar CRT routines.
Also, if you can make use of SSE 4.1 in your application, there is _mm_floor_ps, which gives the 'robust' answer of 105. See DirectXMath: SSE4.1 and SSE4.2
 Edited by Chuck Walbourn  MSFT Thursday, January 31, 2013 9:14 PM
 Marked as answer by Myiasis Thursday, January 31, 2013 9:39 PM
All replies

Thanks for looking at it, Chuck.
1 doesn't do it, but 3, 5, 7, 9 and 11 do; I think I stopped there. I made a loop because I was curious whether it was just 105.
This is an odd case where I do need an exact answer. I was actually only interested in a single component of the vector, but it was conveniently already loaded, so I figured I might as well use XMVectorFloor. So I'll do as you suggested and use floorf.
Out of curiosity, if you know the answer, what does floorf do differently? Do all SSE2 instructions work on 4 values at a time only? The docs for floorf say they use SSE2 to accomplish the task too.
