locked
Select geography within bounds RRS feed

  • Question

  • I am currently porting a huge road map (of the world¬) into SQL Server 2008, using the new geography data types. At the moment, my intention is to have one huge table containing all of the shapes using using queries to extract the geography for any specified bounding box (defined with minLat, maxLat, minLong, maxLong).

    Obviously this table is going to be massive - at rough guess would be 1,000,000,000 rows. Provided I set up spatial indexes, can I expect to achieve good performance for this design, or am I simply asking for too much? Either way, what would be the best way to query the appropriate geometry? Is it a case of doing a STIntersects against the 'view' bounding rectangle, or is this going to cause further performance issues - i.e., is there a better way?

    Thanks.
    Sunday, July 27, 2008 7:11 PM

Answers

  • Hello,

    It's always a little hard to speculate on performance, but this seems a perfectly reasonable design to me.  The STIntersects predicate should go against the index.  You should understand that the east-west lines in a lat-long box won't follow parallels.  This effect will be minimized if the box is kept small, but you should be aware of it.


    You might want to take a look at the Filter method.  If you are just fetching objects for display, and don't care about a few false positives in your query, then Filter should give you better performance.

    Cheers,
    -Isaac
    Sunday, July 27, 2008 8:42 PM
  • Just adding to Isaac's sage wisdom...

     

    Be very careful about the indexing in terms the amount of detail per level. With a million rows, the index could grow to be slow large that it actually slows down the query more than helps if you have HIGH or even MEDIUM levels of detail at all levels.

     

    I start by creating the index with LOW detail on every level then run a series of test queries to get  base line. One level (start at the lowest, LEVEL_4) detail, crank the level of detail up until I stop getting performance gains from it. You could measure that as just the execution time or the total query cost (use SHOWPLAN to get that detail) divide by the execution time.

     

    Good luck, I hope to hear more about how you using this -- sound like a great application!

    Monday, July 28, 2008 12:23 AM
    Answerer

All replies

  • Hello,

    It's always a little hard to speculate on performance, but this seems a perfectly reasonable design to me.  The STIntersects predicate should go against the index.  You should understand that the east-west lines in a lat-long box won't follow parallels.  This effect will be minimized if the box is kept small, but you should be aware of it.


    You might want to take a look at the Filter method.  If you are just fetching objects for display, and don't care about a few false positives in your query, then Filter should give you better performance.

    Cheers,
    -Isaac
    Sunday, July 27, 2008 8:42 PM
  • Just adding to Isaac's sage wisdom...

     

    Be very careful about the indexing in terms the amount of detail per level. With a million rows, the index could grow to be slow large that it actually slows down the query more than helps if you have HIGH or even MEDIUM levels of detail at all levels.

     

    I start by creating the index with LOW detail on every level then run a series of test queries to get  base line. One level (start at the lowest, LEVEL_4) detail, crank the level of detail up until I stop getting performance gains from it. You could measure that as just the execution time or the total query cost (use SHOWPLAN to get that detail) divide by the execution time.

     

    Good luck, I hope to hear more about how you using this -- sound like a great application!

    Monday, July 28, 2008 12:23 AM
    Answerer
  • I'm assuming that Filter returns all geography types within the same spatial index as the bounding geography type? If so, is there any way to return the extents of the spatial index - i.e. the rectangle that was actually checked against? It's not crucial to what I'm doing, but it might well be useful.

    Thanks for your replies.

    EDIT: Also, is there any way to automatically optimize a spatial index (or, all of the indexes for that matter). This is going to be a huge database and manually optimizing it will be long and laborious. Bare in mind that my only concern is the efficiency of SELECT statements - this is going to basically be a static database, so I don't much care about the efficency of INSERTs, UPDATEs or DELETEs.

    Thank again!
    Thursday, August 7, 2008 7:21 PM
  • Yes, do a query like this. The bounding box is defined there.

     

    Now if you are asking what part of the shred table for tessellation got used, I don't believe that's exposed in 2008 bits.

     

    Code Snippet

    select * from SYS.spatial_index_tessellations

     

     

     

    As to your question re: optimization, it depends on what you mean by that. Defragmentation, statistics, etc.
    Thursday, August 7, 2008 11:02 PM
    Answerer
  • By optimization, I mean making SELECT queries as fast as possible.
    Friday, August 8, 2008 8:47 AM
  • If the query optimizer believes using the index will produce a least-cost plan, it will use. If not, it won't. That's when you have to hint it to get what you are looking for. And that is probably fine as along as you only a relative few concurrent users. Just remember that the index can easily be much larger than the data you are querying so it may actually hurt to even have it in some cases.

    Friday, August 8, 2008 3:12 PM
    Answerer