locked
SQL Server Spatial filter optimization RRS feed

  • Question

  • Hi all,

    I have a spatial table containing 3400 parcels and correctly indexed, and when I run the following query on a far over dimensioned server, the performances are barely bad.
    This request takes between 1,5 and 2 seconds

    SELECT F.[GEOMETRY].STAsBinary() as GEOMETRY,F. ID_TRANSAC,F.ID_ENTITY,F.ID_STRUCTURE,F.ID

    FROM CARTO.PG F with (INDEX([SPATIAL_PG]) )

    WHERE

    [GEOMETRY].Filter(geometry::STGeomFromText('POLYGON ((242353.225061205 2093829.3626259, 442712.156613794 2093829.3626259, 442712.156613794 2212891.8626259, 242353.225061205 2212891.8626259, 242353.225061205 2093829.3626259))', 0))=1

    AND

    [GEOMETRY].MakeValid().STEnvelope().STIntersects(geometry::STGeomFromText('POLYGON ((242353.225061205 2093829.3626259, 442712.156613794 2093829.3626259, 442712.156613794 2212891.8626259, 242353.225061205 2212891.8626259, 242353.225061205 2093829.3626259))', 0))=1

    If I remove the second part of my WHERE, [GEOMETRY].MakeValid().STEnvelope().STIntersects, I get the same results, but a bit faster (between 0,4 and 0,6 seconds).
    If I remove the function [GEOMETRY].STAsBinary() in the select, the response comes even a bit faster (between 0,3 and 0,5 seconds)

    Is there a way to optimize the data / indexation / DB parameters, to make the functions [GEOMETRY].MakeValid().STEnvelope().STIntersects and [GEOMETRY].STAsBinary() perfectly tuned ?
    Isn't the second part of my where [GEOMETRY].MakeValid().STEnvelope().STIntersects clause supposed to improve my overall performance ?

    Thanks a lot,

    Geomakhs.
    Wednesday, November 4, 2009 8:01 PM

Answers

  • hi there,
    I have a spatial table containing 3400 parcels and correctly indexed

    What does "correctly" indexed mean? There is no single correct spatial index - it depends hugely on both the nature of the data contained in the column on which the index is created, together with the nature of the predicates that you are supplying to the query. In other words, when you have a query that contains a predicate of the form...
    WHERE a.STIntersects(b) = 1
    ...you need to consider the type of geometry of both a and b when deciding the parameters of the spatial index (since both a and b will be tesselated to the same grid and then their grid cells will be compared).
    In this case, a is the GEOMETRY column of your table (preseumeably containing polygons), and b is a fixed stated polygon. So what values have you chosen for the GRID_LEVEL, CELLS_PER_OBJECT, and BOUNDING_BOX based on this query? Have you tried running sp_help_spatial_geometry_index to see how the objects in your index and query sampel are being tesselated to the grid?

    Isn't the second part of my where [GEOMETRY].MakeValid().STEnvelope().STIntersects clause supposed to improve my overall performance ?
    I'm not quite sure why you would think that... let's have a look at what this query does:
    - MakeValid() is a computationally expensive function, STIntersects() is a computationally expensive function... so that's two operations that have to be done on every row of data.
    - You're testing whether the envelope of the geometry column intersects the same polygon as you're testing in the first predicate using the Filter() method.... so this is not really adding any accuracy to your results, but it's costing quite a lot more.

    As it is, both of your WHERE conditions will return only an approximate set of those rows that intersect the polygon:
    - The first condition is an approximation, because you're using the Filter() method based on the index alone rather than the STIntersects() method.
     -The second condition is also an approximation, because although you're now using the STIntersects() method, you're only testing whether the envelope of the geometry intersects the polygon, not the geometry itself.

    Isn't what you're actually trying to do as follows?:
    WHERE
    [GEOMETRY].STIntersects(geometry::STGeomFromText('POLYGON ((242353.225061205 2093829.3626259, 442712.156613794 2093829.3626259, 442712.156613794 2212891.8626259, 242353.225061205 2212891.8626259, 242353.225061205 2093829.3626259))', 0))=1


    Remember that spatial queries are performed using a primary (fast, approximate) filter and a secondary (slow, accurate) filter. The best way to improve performance of any spatial queries is to make sure that as much of the work as possible is done during the primary filter.
    That means that you want to make the primary filter discard as many rows as possible because you can be certain that they definitely don't belong in the result set (so you don't need to call the secondary filter), or alternatively you can be certain that they definitely do belong in the result and can be pre-selected without needing to call the secondary filter (i.e., internal filter).
    The ability of the primary filter to do this all comes down to the settings you use to create your spatial index.
    Beginning Spatial with SQL Server http://www.apress.com/book/view/1430218290
    Thursday, November 5, 2009 7:40 PM
    Answerer

All replies

  • hi there,
    I have a spatial table containing 3400 parcels and correctly indexed

    What does "correctly" indexed mean? There is no single correct spatial index - it depends hugely on both the nature of the data contained in the column on which the index is created, together with the nature of the predicates that you are supplying to the query. In other words, when you have a query that contains a predicate of the form...
    WHERE a.STIntersects(b) = 1
    ...you need to consider the type of geometry of both a and b when deciding the parameters of the spatial index (since both a and b will be tesselated to the same grid and then their grid cells will be compared).
    In this case, a is the GEOMETRY column of your table (preseumeably containing polygons), and b is a fixed stated polygon. So what values have you chosen for the GRID_LEVEL, CELLS_PER_OBJECT, and BOUNDING_BOX based on this query? Have you tried running sp_help_spatial_geometry_index to see how the objects in your index and query sampel are being tesselated to the grid?

    Isn't the second part of my where [GEOMETRY].MakeValid().STEnvelope().STIntersects clause supposed to improve my overall performance ?
    I'm not quite sure why you would think that... let's have a look at what this query does:
    - MakeValid() is a computationally expensive function, STIntersects() is a computationally expensive function... so that's two operations that have to be done on every row of data.
    - You're testing whether the envelope of the geometry column intersects the same polygon as you're testing in the first predicate using the Filter() method.... so this is not really adding any accuracy to your results, but it's costing quite a lot more.

    As it is, both of your WHERE conditions will return only an approximate set of those rows that intersect the polygon:
    - The first condition is an approximation, because you're using the Filter() method based on the index alone rather than the STIntersects() method.
     -The second condition is also an approximation, because although you're now using the STIntersects() method, you're only testing whether the envelope of the geometry intersects the polygon, not the geometry itself.

    Isn't what you're actually trying to do as follows?:
    WHERE
    [GEOMETRY].STIntersects(geometry::STGeomFromText('POLYGON ((242353.225061205 2093829.3626259, 442712.156613794 2093829.3626259, 442712.156613794 2212891.8626259, 242353.225061205 2212891.8626259, 242353.225061205 2093829.3626259))', 0))=1


    Remember that spatial queries are performed using a primary (fast, approximate) filter and a secondary (slow, accurate) filter. The best way to improve performance of any spatial queries is to make sure that as much of the work as possible is done during the primary filter.
    That means that you want to make the primary filter discard as many rows as possible because you can be certain that they definitely don't belong in the result set (so you don't need to call the secondary filter), or alternatively you can be certain that they definitely do belong in the result and can be pre-selected without needing to call the secondary filter (i.e., internal filter).
    The ability of the primary filter to do this all comes down to the settings you use to create your spatial index.
    Beginning Spatial with SQL Server http://www.apress.com/book/view/1430218290
    Thursday, November 5, 2009 7:40 PM
    Answerer
  • Also I recomend not to MakeValid() all the time as it can affect the performance of the spatial index. If your data is fairly static, make it valid and replace the non-valid data. Then run your spatial query against the valid geometries without calling MakeValid() in the query.
    Thursday, November 5, 2009 9:56 PM