Selecting a good spatial index with large polygons

  • Question

  • I'm having some fun trying to pick a decent spatial index setup for a data set I am dealing with.

    The dataset is polygons, representing contours over the whole globe. There are 106,000 rows in the table, the polygons are stored in a geometry field.

    The issue I have is that many of the polygons cover a large portion of the globe. This seems to make it very hard to get a spatial index that will eliminate many rows in the primary filter. For example, look at the following query:


    SELECT "ID","CODE","geom".STAsBinary() as "geom" FROM "dbo"."ContA" WHERE "geom".Filter(geometry::STGeomFromText('POLYGON ((-142.03193662573682 59.53396984952896, -142.03193662573682 59.88928136451884, -141.32743833481925 59.88928136451884, -141.32743833481925 59.53396984952896, -142.03193662573682 59.53396984952896))', 4326)) = 1

    This is querying an area which intersects with only two of the polygons in the table. No matter what combination of spatial index settings I choose, that Filter() always returns around 60,000 rows.

    Replacing Filter() with STIntersects() returns just the two polygons I want, but of course takes much longer (Filter() takes 6 seconds, STIntersects() takes 12).

    Can anyone give me any hints on whether there is a spatial index setup that is likely to improve on 60,000 rows, or is my dataset just not a good match for SQL Server's spatial indexing?
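
    For reference, the indexes I've tried all follow this general shape; the grid densities and CELLS_PER_OBJECT are the knobs I've been varying (the index name here is just a placeholder):

    ```sql
    -- Sketch of the kind of index I've been experimenting with.
    CREATE SPATIAL INDEX SIdx_ContA_geom
    ON dbo.ContA (geom)
    USING GEOMETRY_GRID
    WITH (
        -- Full lon/lat extent of the data.
        BOUNDING_BOX = (XMIN = -180, YMIN = -90, XMAX = 180, YMAX = 90),
        GRIDS = (LEVEL_1 = MEDIUM, LEVEL_2 = MEDIUM, LEVEL_3 = MEDIUM, LEVEL_4 = MEDIUM),
        CELLS_PER_OBJECT = 16
    );
    ```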


    Thursday, May 27, 2010 10:54 AM


All replies

  • I just did a quick query to see exactly how many truly huge polygons I have. Looking at the area of the envelope of each polygon, only 3 of the 106,000 polygons occupy more than 50% of the globe and only 23 of them are bigger than one eighth of the globe.

    Which makes me even more puzzled as to why I can't find an index that returns a sensible number of rows from Filter().
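
    For what it's worth, the check was along these lines; 64800 is the area of the full (-180..180, -90..90) extent in degree space (360 * 180):

    ```sql
    -- Count polygons whose envelope covers more than half the
    -- globe's extent in degree space.
    SELECT COUNT(*) AS HugePolygons
    FROM dbo.ContA
    WHERE geom.STEnvelope().STArea() > 64800 / 2;
    ```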

    Sunday, May 30, 2010 10:15 PM
  • It was suggested that I split the polygons up, so I decided to give it a try using a 4x4 grid across the globe. I wrote my own query to do it. First I defined 16 bounding boxes; the first looked like this:

    declare  @box1 geometry = geometry::STGeomFromText('POLYGON ((
    -180 90,
    -90 90,
    -90 45,
    -180 45,
    -180 90))'
    , 4326)

    Then I used each bounding box to select and truncate the polygons that intersected that box:

    insert ContASplit
    select CODE, geom.STIntersection(@box1), CODE_DESC from ContA
    where geom.STIntersects(@box1) = 1

    I obviously did this for all 16 bounding boxes in the 4x4 grid. The end result is that I have a new table with ~107,000 rows (which confirms that I didn't actually have many huge polygons).

    I added a spatial index with 1024 cells per object and low,low,low,low for the cells per level.

    However, very oddly this new table with the split polygons still performs the same as the old one. Doing the .Filter listed above still returns ~60,000 rows. I really don't understand this at all; clearly I don't understand how the spatial index actually works.
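
    For completeness, the index on the split table was created roughly like this (the index name and BOUNDING_BOX values shown here are placeholders; as mentioned above, I scripted the actual values out from my earlier index):

    ```sql
    CREATE SPATIAL INDEX SIdx_ContASplit_geom
    ON dbo.ContASplit (geom)
    USING GEOMETRY_GRID
    WITH (
        -- Placeholder whole-globe extent.
        BOUNDING_BOX = (XMIN = -180, YMIN = -90, XMAX = 180, YMAX = 90),
        GRIDS = (LOW, LOW, LOW, LOW),
        CELLS_PER_OBJECT = 1024
    );
    ```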

    Monday, May 31, 2010 10:10 AM
  • Anyone?

    I've tried a lot more index configurations to get one that works, as documented here:


    But I still can't get the primary filter to return fewer than 60,000 rows.

    Thursday, June 3, 2010 2:36 PM
  • I finally got the solution via an answer to my StackOverflow question:


    In the end it was something very silly. The tool I had used to import my shapefile had transposed the latitude/longitude values used for the bounding box on the index.

    I had slavishly scripted out and repeatedly reused the same wrong bounding box in any subsequent indexes I created.

    Now that I have the correct bounding box, the primary filter is working as it should.
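
    Concretely, the transposition meant my scripted indexes had the latitude range in X and the longitude range in Y; the fix was to put them back the right way round (index name is a placeholder):

    ```sql
    -- Wrong (transposed): latitude range as X, longitude range as Y.
    -- BOUNDING_BOX = (XMIN = -90, YMIN = -180, XMAX = 90, YMAX = 180)

    -- Correct: X is longitude (-180..180), Y is latitude (-90..90).
    CREATE SPATIAL INDEX SIdx_ContA_geom
    ON dbo.ContA (geom)
    USING GEOMETRY_GRID
    WITH (
        BOUNDING_BOX = (XMIN = -180, YMIN = -90, XMAX = 180, YMAX = 90),
        GRIDS = (MEDIUM, MEDIUM, MEDIUM, MEDIUM),
        CELLS_PER_OBJECT = 16
    );
    ```

    With the transposed box, most of the data fell outside the grid extent the index covers, so those rows could never be eliminated by the primary filter, which explains the constant ~60,000 rows from Filter().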

    • Marked as answer by Andy Norman Tuesday, June 8, 2010 6:21 PM
    Tuesday, June 8, 2010 6:21 PM