Why is this Update Query so extreme slow?
-
Monday, November 21, 2011 1:49 PM
Hi,
i am new to SQL Server (comming from postgres). I used the following query to update a "tracks" table with the ID of the nearest neighbor Building:
update tracks set facility_id = (SELECT TOP 1 facility_id FROM p_facility_locations ORDER BY location.STDistance(tracks.location))
On Postgres(in a Virtual Machine) this query performs very well (about 10K rows / second). On SqlServer 2008 r2 running on an physical server i get only 15 - 20 rows / second. Can please somebody give me a hint on what i'm doing wrong?
Greetings,
Matze
All Replies
-
Wednesday, November 23, 2011 6:51 AMModerator
Hi Matze,
I would like to involve someone familiar with to have a look at this issue and give an update as soon as possiple. Thanks for your understanding.
Best Regards,
Stephanie LvIf you have feedback for TechNet Subscriber Support, contact tnmff@microsoft.com.
Please remember to mark the replies as answers if they help and unmark them if they provide no help. -
Wednesday, November 23, 2011 10:14 PMAnswerer
Hello Matze,
Thank you for your post. There are a number of things to consider when troubleshooting slow performance. The first is to look at the execution plan and make sure that it is an ideal plan. One way to get that is to do the following:
Set statistics time on set statistics io on set statistics profile on go update tracks set facility_id = (SELECT TOP 1 facility_id FROM p_facility_locations ORDER BY location.STDistance(tracks.location)) go Set statistics time off set statistics io off set statistics profile off
Once you have verified the plan is optimal, the next step is to capture 3 things at the same time. We use a tool called PSSDiag to accomplish this, but essentially, it is collecting a Performance Monitor so we can see how the system as a whole is performing, blocker script to see if the query is waiting on any resources, and profiler script to make sure the exec plan is not changing.Some of the things to look for is the performance of the server as a whole such as disk performance. If the disks are taking > 10 milliseconds per transfer, this can impact the speed of the query or if memory is low, or CPUs are high. Blocker data can contain information about wait types such as if the query is waiting on a lock owned by another resource.
The public version of PSSDiag can be downloaded from http://diagmanager.codeplex.com/.
You can then use a tool called SQLNexus to take the result of the PSSDiag and help with the analysis. SQLNexus is available at http://sqlnexus.codeplex.com/
You can also open a support ticket with our main support for assistance with troubleshooting the performance if it continues to be an issue.
I hope this information helps.
Sincerely,
Rob Beene, MSFT
-
Friday, November 25, 2011 4:41 PMAnswerer
Hi Matze,
You're trying to do what's generally referred to as a "nearest-neighbour" query - if you search this forum you'll find several other posts that discuss this, together with possible solutions. The problem is in this part:
SELECT TOP 1 facility_id FROM p_facility_locations ORDER BY location.STDistance(tracks.location)
What you're doing here is to calculate the distance between the current track location and every facility location, then sort all the facilities by that descending distance, and then you're only selecting the top one record. So, there's a huge amount of wasted effort here as you scan through every row in the facility_locations table, and then you do that again and again for every row in the tracks table, but are discarding all but one record in each case....
There's a couple of alternative approaches. To summarise them:
- you can add an additional predicate to search for only those facilities that lie within a given distance (e.g. SELECT TOP 1 ... WHERE STDistance() < xxx) which will return a reduced set of facilities and make the sort operation much more efficient. The disadvantage is that if you set the search range too small then there's the risk that you won't find any records at all.
- you could use a dynamically-expanding range to find the nearest neighbor to the chosen track and then stop searching for more candidates (as described by Isaac Kunen, here: http://blogs.msdn.com/b/isaac/archive/2008/10/23/nearest-neighbors.aspx)
- you could upgrade to SQL Server 2012, which has a dedicated query plan for nearest neighbour queries as described in http://download.microsoft.com/download/D/2/0/D20E1C5F-72EA-4505-9F26-FEF9550EFD44/SQLServer_Denali_Spatial.docx
In all cases, you should also make sure that you've got a suitable spatial index on the facilities table, and that it's being used (for that, you might need to add an index hint to the query).
twitter: @alastaira blog: http://alastaira.wordpress.com/

