Answered by:
Inconsistan behavior of Table Variable used in subquery with NEWID()

Question
-
The following code should obviously only delete 5 rows. No more, no less.
Simple question: Why does it not?
DECLARE @t TABLE (nDex INT IDENTITY(1,1), val VARCHAR(9)) -- sample data INSERT @t SELECT 'a' UNION ALL SELECT 'b' UNION ALL SELECT 'c' UNION ALL SELECT 'd' UNION ALL SELECT 'e' UNION ALL SELECT 'f' UNION ALL SELECT 'g' UNION ALL SELECT 'h' UNION ALL SELECT 'i' UNION ALL SELECT 'j' DELETE t FROM @t AS t INNER JOIN (SELECT TOP 5 nDex FROM @t ORDER BY NEWID()) AS b ON b.ndex = t.nDex -- this should be 5 rows SELECT * FROM @t ORDER BY ndex
Monday, March 28, 2011 3:22 PM
Answers
-
Chiraq,
The answer is in the connect entry. Microsoft said that this behavior is due to the non-determinism of the function, but I don't buy that, since GETDATE() is not deterministic and the QO uses different approach.
Feel free to vote if you think that changing the behavior could benefit your business.
AMB
- Marked as answer by Jason L. Selburg Monday, March 28, 2011 6:03 PM
- Unmarked as answer by Jason L. Selburg Monday, March 28, 2011 6:03 PM
- Marked as answer by Jason L. Selburg Monday, March 28, 2011 6:03 PM
Monday, March 28, 2011 5:34 PM
All replies
-
This very recent thread should explain the behavior you're seeing and suggest a workaround (temp table instead of CTE/derived table).
For every expert, there is an equal and opposite expert. - Becker's Law
Naomi Nosonovsky, Sr. Programmer-Analyst
My blogMonday, March 28, 2011 3:30 PM -
I am not surprised by your findings..
Remove Order BY NEWID() and you should see expected behaviour (instead use order by ndex ASC/DESC along with your TOP )
Monday, March 28, 2011 3:34 PM -
You can materialize the derived table, as Naomi suggested, or you can use another method to randomize the result, as you can see in this thread.
Example:
-- SS 2008 DECLARE @t TABLE (nDex INT IDENTITY(1,1), val VARCHAR(9)) -- sample data INSERT @t SELECT 'a' UNION ALL SELECT 'b' UNION ALL SELECT 'c' UNION ALL SELECT 'd' UNION ALL SELECT 'e' UNION ALL SELECT 'f' UNION ALL SELECT 'g' UNION ALL SELECT 'h' UNION ALL SELECT 'i' UNION ALL SELECT 'j' DELETE t FROM @t AS t INNER JOIN ( SELECT TOP 5 nDex FROM @t -- Tip from Itzik Ben-Gan ORDER BY RAND(1. * DATEPART(microsecond, SYSDATETIME()) * nDex % 2147483648) ) AS b ON b.ndex = t.nDex -- this should be 5 rows SELECT * FROM @t ORDER BY ndex GO
AMB
- Proposed as answer by Naomi N Monday, March 28, 2011 4:14 PM
Monday, March 28, 2011 4:07 PM -
AMB,
I wanted to ask you a question, may be little sidetrack than original question but
why query optimizer treats NEWID() differently, why not RAND()...?
for e.g. in the query below
select rand(), rand(), NEWID(), NEWID() from sys.objects
Monday, March 28, 2011 5:12 PM -
All, thank you for your replies.
I know how to avoid the issue.
My question was more to WHY it was happening, if it was a known issue, and if there are plans to release a hotfix with this in it?
Monday, March 28, 2011 5:17 PM -
In the thread I referenced there is a link to the Connect Item where you can vote.
For every expert, there is an equal and opposite expert. - Becker's Law
Naomi Nosonovsky, Sr. Programmer-Analyst
My blogMonday, March 28, 2011 5:27 PM -
Chiraq,
The answer is in the connect entry. Microsoft said that this behavior is due to the non-determinism of the function, but I don't buy that, since GETDATE() is not deterministic and the QO uses different approach.
Feel free to vote if you think that changing the behavior could benefit your business.
AMB
- Marked as answer by Jason L. Selburg Monday, March 28, 2011 6:03 PM
- Unmarked as answer by Jason L. Selburg Monday, March 28, 2011 6:03 PM
- Marked as answer by Jason L. Selburg Monday, March 28, 2011 6:03 PM
Monday, March 28, 2011 5:34 PM -
Thank you Naomi and AMB.
Monday, March 28, 2011 5:51 PM