none
Find Closest point to a point

    Question

  • Hi Everyone

    I am super new to the spatial side of things with sql. I am using SQL Server 2008 Express.  I have a table with town points with coordinates and a geog column.  Then i have another table with 400 000 odd points.  I have to find and populate say Closest_Town column with the closest town point for each of these points.  At the moment i am using a cursor for this (yes i know this is prob completely wrong, but thats why i need help!):

    DECLARE @tmpId  int
    DECLARE point CURSOR FOR

    SELECT ID
    FROM tblPoints 

    OPEN point 

    FETCH  point  INTO @tmpId

    WHILE @@Fetch_Status = 0

    BEGIN

    declare @iTown varchar(255)

    select  top 1 @iTown =  b.TWN_NAME

    FROM tblPoints A,
    SP_TOWN_POINTS B
    where a.ID = @tmpId
    order by b.geog.STDistance(A.GEOG)

    update tblPoints
    set town = @iTown
    where ID = @tmpId

    FETCH point INTO @tmpId

     

      END

    CLOSE point 

    DEALLOCATE  point  
    RETURN 

     

    At the moment this is taking about 16 hours for 40 000 records which is just crazy.  Can anyone help me to write an easy quicker query?

    • Moved by Naomi N Monday, February 06, 2012 4:13 PM May be better answer here (From:Transact-SQL)
    Thursday, January 26, 2012 12:36 PM

All replies

  • There is a spatial forum - try posting there.

    Thursday, January 26, 2012 1:10 PM
  • You can simply run a query with CROSS APPLY, e.g.

     

    select T.*, Closest.* from tblPoints T CROSS APPLY (select top (1) *
    
    from SP_Town_Points b order by b.geog.STDDistance(T.Geog)) Closest
    

     


    For every expert, there is an equal and opposite expert. - Becker's Law


    My blog
    Thursday, January 26, 2012 3:01 PM
  • Hi Everyone

    I am super new to the spatial side of things with sql. I am using SQL Server 2008 Express.  I have a table with town points with coordinates and a geog column.  Then i have another table with 400 000 odd points.  I have to find and populate say Closest_Town column with the closest town point for each of these points.  At the moment i am using a cursor for this (yes i know this is prob completely wrong, but thats why i need help!):

    DECLARE @tmpId  int
    DECLARE point CURSOR FOR

    SELECT ID
    FROM tblPoints 

    OPEN point 

    FETCH  point  INTO @tmpId

    WHILE @@Fetch_Status = 0

    BEGIN

     declare @iTown varchar(255)

     select  top 1 @iTown =  b.TWN_NAME

    FROM tblPoints A,
    SP_TOWN_POINTS B
    where a.ID = @tmpId
    order by b.geog.STDistance(A.GEOG)

    update  tblPoints
    set town = @iTown
    where ID = @tmpId

    FETCH point INTO @tmpId

     

      END

    CLOSE point 

    DEALLOCATE  point  
    RETURN 

     

    At the moment this is taking about 16 hours for 40 000 records which is just crazy.  Can anyone help me to write an easy quicker query?

    • Merged by Stephanie Lv Thursday, February 09, 2012 11:16 AM
    Thursday, January 26, 2012 4:11 PM
  • Anyone that can help on this please?
    Wednesday, February 01, 2012 6:15 AM
  • i've posted under the spatial forum as well, but no replies yet :(
    Wednesday, February 01, 2012 6:16 AM
  • Can you give a reference and did you try my suggestion?
    For every expert, there is an equal and opposite expert. - Becker's Law


    My blog
    Wednesday, February 01, 2012 1:32 PM
  • Hi Naomi

    I've tried ur suggestion but it takes about 30min for 900 records, which also seems a bit way to slow? as I've got around 300 000 records that i need to populate...

    Should i maybe create my spatial index differently?

    this is the code i use for it:

    CREATE SPATIAL INDEX SP_INDEX_POINT_GEOG

    ON tblPoints(GEOG);

    Wednesday, February 01, 2012 2:26 PM
  • I pasted under the Spatial forum under the Topic: Closest point to a point

    Wednesday, February 01, 2012 2:29 PM
  • I am not sure if an efficient way exists for your problem. We need to find the closest town for 300000 towns.

    I assume it means we need to make almost 300K power 2 calculations. We can cut the number in 2 (since A and B will be the same as B and A). Still it's a lot.

    I think we first need to decide if a better algorithm exists here and then follow this idea.


    For every expert, there is an equal and opposite expert. - Becker's Law


    My blog
    Wednesday, February 01, 2012 2:59 PM
  •  Are your town points and  your another 400 000 odd points realy geo points,  kind of
     geography::STGeomFromText('POINT(-122.34900 47.65100)', 4326) ?

    In that case you may first try greatly narrow number of towns to search  using point.lat, point.lon and some unpresize but quick distance calculation. See http://en.wikipedia.org/wiki/Geographical_distance for example.

     


    Serg
    Wednesday, February 01, 2012 4:51 PM
  • Yes, that's what I was thinking also - we need to build a bounding box first and filter lots of towns. 

    See the ideas in this blog post

    SQL Server 2008 Proximity Search With The Geography Data Type 

    and the blog referenced at the top of this one.


    For every expert, there is an equal and opposite expert. - Becker's Law


    My blog
    Wednesday, February 01, 2012 4:55 PM
  • As Naomi has already commented, whatever way you structure your query (using a CROSS APPLY, a cursor, or a sub-SELECT, for example), you can't avoid the fact that you're going to end up performing the nearest-neighbor part of your query 400,000 times - once for each row of your base table.

    So, forget about the exact syntax of UPDATE query for now, and instead concentrate on just getting that part to execute as efficiently as possible. At the moment, you're doing this:

    declare @iTown varchar(255)
    select  top 1 @iTown =  b.TWN_NAME
    FROM tblPoints A,
    SP_TOWN_POINTS B
    where a.ID = @tmpId
    order by b.geog.STDistance(A.GEOG)

    This is not efficient, because what you're doing is calculating the distance between the current point and every town in the SP_TOWN_POINTS table, then sorting the whole of that list, and finally discarding all but the TOP 1 result. Since you're not applying any selectivity in this query, it can't make use of a spatial (or any other kind of) index - it'll perform the relatively expensive STDistance() method for every row in the Town table. And then, this query gets repeated over and over for every row in the tblPoints table...

    This type of query is generally referred to as a "nearest-neighbour" query (or, if you're in the US, a "nearest neighbor" query :), and it's been discussed several times on this forum already. Here's some starting points that might help you:

    - Try limiting the search for nearest neighbors within a bounded search area, or an expanding area as described by Isaac Kunen, here: http://blogs.msdn.com/b/isaac/archive/2008/10/23/nearest-neighbors.aspx

    - Try out the RC0 of SQL Server 2012, which includes a dedicated query plan for nearest neighbor queries as described here: http://download.microsoft.com/download/D/2/0/D20E1C5F-72EA-4505-9F26-FEF9550EFD44/SQLServer_Denali_Spatial.docx

    - If you're only using tables containing points, then you might not need to use geometry/geography at all. The spatial functionality in SQL Server 2008/R2/2012 is ideally suited for dealing with complex geometry operations, such as testing for intersection between abstract geometrical shapes - polygons/curves/linestrings etc. The spatial methods are accurate, but they're not fast. If all you want to do is find approximate distances between points, you will almost certainly find that storing latitude/longitude coordinates as decimal(18,9) columns, and calculating the distance between them using something like the Haversine approximation will be much fast than any of the other methods described above.

    Once you've got an efficient query to select the closest town to any point, then you can plug that into your UPDATE query.


    twitter: @alastaira blog: http://alastaira.wordpress.com/

    • Proposed as answer by Naomi N Wednesday, February 08, 2012 1:57 PM
    Wednesday, February 08, 2012 11:21 AM
    Answerer
    • Proposed as answer by Naomi N Thursday, February 23, 2012 1:10 AM
    Wednesday, February 22, 2012 8:24 PM
    Answerer