locked
Basic spatial question... Creating a location table RRS feed

  • Question

  • I am creating a Location table, I was using sql 2005, and storing lat / long, since its a new project, i decided to dump 2005 and install 2008 to take advantage of spatial functionality. There will be a need to do lat/long proximity searches in the near future (not based on zip)

    Im geocoding addresses, I believe that the lat/long i get (google geocode, yahoo geocode, etc) is in the format xxx.xxxxxx
    1. i believe the correc datatype is Geography

    tanoshimi: in another post you mentioned something about EPSG:4326.. spatial reference system.. do most of the geocoding services out there using the same?

    when inserting data, do i need to be doing:
     
    set @point = geography::STGeomFromText('POINT(55.85527499715679 -4.239349365234378)', 4326)
    asdasd

    2. when pasing data into a stored procedure, should i pass in 2 floats, lat and long,. and do the above., or shd i create the correct datatype in .NET and pass that in?
    Dim London As SqlGeography 
    London = SqlGeography.Point(51.5, -0.5, 4326) 

    Actually i need to double check on (2)., im using llblgenpro for my data layer, so i need to be sure that will support it :)
    Tuesday, March 24, 2009 10:13 PM

Answers

  • Hiya,

    1.) Yep - if you've got lats and longs then use geography.

    2.) It's pretty safe to say that any publicly-available geocoding service uses SRID 4326 unless they specifically state otherwise.

    3.) What you've written is correct (except for the "asdasd" bit - dunno what that does ;) BUT be very careful with the order in which you specify your coordinates:
     - When you use STGeomFromText() to create a point, the first parameter is a string beginning with the POINT keyword and then stating the longitude then latitude coordinates of the point.
    - When you use the Point() method (as in your second example), the parameters are stated in latitude then longitude order.
    It's not clear from your examples which coordinate is which... POINT(55.85527499715679 -4.239349365234378) describes a location near the Sechelles, whereas  Point(51.5, -0.5, 4326) is just west of the M25 in London.... which one do you want?!

    4.) To pass a Point to a stored procedure you can either create a SqlGeography instance on the client side and pass that, or send the individual latitude and longitude coordinates as floats and create the geography instance in the sproc - it doesn't really much matter. The first method is more extendable if you plan to support other geometry types (LineStrings and Polygons) in the future, but it also requires you to have Microsoft.SqlServer.Types.dll on the client-side, which the latter does not.....

    Beginning Spatial with SQL Server http://www.apress.com/book/view/1430218290
    • Marked as answer by zxed Tuesday, April 14, 2009 3:57 AM
    Wednesday, March 25, 2009 8:32 AM
    Answerer
  • Ooh... STDistance() in an ORDER BY is not generally good news...

    STDistance() is a computationally-expensive method. If you want to order a SELECT by their distance away from a given point, SQL Server must evaluate the result of the STDistance() method on every row in the resultset. If you haven't got any other filters, this results in a table scan, you can't use any indexes etc. etc. It doesn't make any difference whether you use the subselect in your example or not - the query optimiser would probably evaluate these two queries in the same way anyway.

    In practice, you rarely want to order all of the rows in a table by their distance from a given point, but rather just the top 3, or 10, or 100. This type of query is known as a k-nearest neighbour search and there are some examples as to how to optimise a query to find this in Isaac's blog: http://blogs.msdn.com/isaac/archive/2008/10/23/nearest-neighbors.aspx. I also describe various approaches to performing a nearest-neighbour search in Chapter 13 of "Beginning Spatial with SQL Server 2008".

    As for performance considerations... it's largely a case of 'suck it and see'. When your database is loaded with some more data you can experiment with various approaches to see what works best for you - recommendations regarding indexing options etc. are very heavily-dependent on the exact distribution and nature of your particular dataset so it's hard to give abstract, generalised advice.

    Beginning Spatial with SQL Server http://www.apress.com/book/view/1430218290
    • Marked as answer by zxed Tuesday, April 14, 2009 3:57 AM
    Wednesday, March 25, 2009 7:03 PM
    Answerer

All replies

  • Hiya,

    1.) Yep - if you've got lats and longs then use geography.

    2.) It's pretty safe to say that any publicly-available geocoding service uses SRID 4326 unless they specifically state otherwise.

    3.) What you've written is correct (except for the "asdasd" bit - dunno what that does ;) BUT be very careful with the order in which you specify your coordinates:
     - When you use STGeomFromText() to create a point, the first parameter is a string beginning with the POINT keyword and then stating the longitude then latitude coordinates of the point.
    - When you use the Point() method (as in your second example), the parameters are stated in latitude then longitude order.
    It's not clear from your examples which coordinate is which... POINT(55.85527499715679 -4.239349365234378) describes a location near the Sechelles, whereas  Point(51.5, -0.5, 4326) is just west of the M25 in London.... which one do you want?!

    4.) To pass a Point to a stored procedure you can either create a SqlGeography instance on the client side and pass that, or send the individual latitude and longitude coordinates as floats and create the geography instance in the sproc - it doesn't really much matter. The first method is more extendable if you plan to support other geometry types (LineStrings and Polygons) in the future, but it also requires you to have Microsoft.SqlServer.Types.dll on the client-side, which the latter does not.....

    Beginning Spatial with SQL Server http://www.apress.com/book/view/1430218290
    • Marked as answer by zxed Tuesday, April 14, 2009 3:57 AM
    Wednesday, March 25, 2009 8:32 AM
    Answerer
  • 1. asdasd was something i forgot to remove when trying to get the color coding to stop, had to switch to html and get out of the SPAN...
    2. i got lucky when i was geocoding my address, it didnt like the -110 i was passing in one of the values and it alerted me that the max was 90 or -90, think it was lat.. so it helped me figure out where to put the lat and long.
    3. the .net geography was a copy and paste, no relation to what i needed :)


    I decided to go with passing in my initial lat + long as nvarchar to my sproc and then setting the geography object..

    I was also able to do a proximity search for a given point.... i decided to go with STDistance:
    a. Yes, it may take a hit if the db gets too big, but i did setup my spatial index, and made sure that i am using it
    b. while i was looking at the nearest neighbour solution posted by Isaac., and looked up STBuffer, there was a post (cannot recall where) stating that STBuffer does what it should do, but it does take up resoruces... it would be great for "find 500 houses near a railway line"... but something more of a "find 500 houses at lat/long" can be handled more efficiently using just STDistance... currently my database has 3 points :) when it grows to a big enough size where a proximity search, 1-20 miles becomes a bottle neck i will make use of the STBuffer...

    Question on StDistance.... im using STdistance(..) where clause for example <= 1000... but also use it in my Order By STBuffer(..)... i cannot recall, but i wanted to make sure... this will "not" result in compares? for each row... i think im asking about big O :) .. if i recall corerctly sql automatically optimizes calculations in queries?

    does it make a difference if i do

    select something from sometable where STDistance <= 1000 order by StDistance <= 1000
    or
    select something from
    (select something, stdistance as thedistance from sometable)
    where thedistance <= 1000
    order by thedistance 



    Wednesday, March 25, 2009 6:31 PM
  • Ooh... STDistance() in an ORDER BY is not generally good news...

    STDistance() is a computationally-expensive method. If you want to order a SELECT by their distance away from a given point, SQL Server must evaluate the result of the STDistance() method on every row in the resultset. If you haven't got any other filters, this results in a table scan, you can't use any indexes etc. etc. It doesn't make any difference whether you use the subselect in your example or not - the query optimiser would probably evaluate these two queries in the same way anyway.

    In practice, you rarely want to order all of the rows in a table by their distance from a given point, but rather just the top 3, or 10, or 100. This type of query is known as a k-nearest neighbour search and there are some examples as to how to optimise a query to find this in Isaac's blog: http://blogs.msdn.com/isaac/archive/2008/10/23/nearest-neighbors.aspx. I also describe various approaches to performing a nearest-neighbour search in Chapter 13 of "Beginning Spatial with SQL Server 2008".

    As for performance considerations... it's largely a case of 'suck it and see'. When your database is loaded with some more data you can experiment with various approaches to see what works best for you - recommendations regarding indexing options etc. are very heavily-dependent on the exact distribution and nature of your particular dataset so it's hard to give abstract, generalised advice.

    Beginning Spatial with SQL Server http://www.apress.com/book/view/1430218290
    • Marked as answer by zxed Tuesday, April 14, 2009 3:57 AM
    Wednesday, March 25, 2009 7:03 PM
    Answerer
  • Tanoshimi.

    is there any preference or performance tricks passing around lats and longs?

    should we do it as FLOAT or DECIMAL ? I've tried to get my head around the fixed vs floating debating ... but i would prefer it if some smart person could answer it with a single sentence, with regards to sql2008.

    Lastly, if we do cals in our stored procs (for a single point) that are using the spatial methods (Eg. STIntersects) and basic math (eg. .Lat < 90 .. or what not) .. is it good practice to still keep the point in the DB as a GEOGRAPHY and use the .Lat / .Long methods, isntead of having a GEOGRAPHY field and a lat field and a long field.

    double lastly .. i've had to put my Geography instances in a sep table (1 - 1 relationship) cause my linq2sql spits da dummy. Can i move it back into the same table and in linq2sql, make the field type VARBINARY(MAX) .. manually create that in the designer? gawd i sometimes hate designer-mode stuff :)

    -Pure Krome-
    Thursday, March 26, 2009 6:53 AM
  • *politely bumping this, to see if Tanoshimi picks up my previous post*
    -Pure Krome-
    Tuesday, April 14, 2009 2:32 AM