locked
Why is my query slower in SQL Server than in Postgres (with PostGIS)? RRS feed

  • Question

  • I'm working on moving some spatial searching capabilities from Postgres with PostGIS to SQL Server and I'm seeing some pretty terrible performance, even with indexes.

    My data is around a million points in a large area (Colorado), and I want to find out which of those points are within given shapes. The shapes are drawn on the map by our users, so range from covering a single neighborhood to covering half the state. My query looks almost exactly like this (with names changed):

    DECLARE @Shape GEOMETRY = ...
    SELECT * FROM PointsTable WHERE Point.STWithin(@Shape) = 1

    If I select a fairly small shape, I get fairly decent times (sub-second), but if I select a big shape, my times sometimes pass 30 seconds.

    I've tried several different grid sizes on my indexes (all high, all medium, all low), different cells per object (16, 64, 256), and no matter what I do the times stay fairly constant. I'd like to try more combinations but I don't even know what direction to go. More cells per object? Less? Some strange combination of grid sizes? The best I was able to get was all medium with 16 cells per object.

    I've looked at my query plans and they're always using the index, it's just not helping at all. I've even tried without the index, and it's not much worse.

    Is there any advice anyone can give on this? I know that indexes "are different for every situation", but aren't there any sort of guidelines?

    • Edited by Brendan Long- Thursday, August 12, 2010 10:40 PM this forum hates my code
    Thursday, August 12, 2010 10:26 PM

All replies

  • Hi Brenda,

     

    This issue might be caused by that the index is configured inappropriately. Then the index will not be used enough then it will result in low performance.

     

    Therefor I recommend that you adjust the option of this spatial index. For more information about the options for constructing spatial index, you could refer to this link: http://msdn.microsoft.com/en-us/library/bb934196.aspx

     

    Meanwhile for further investigation, could you please provide us some sample data and the script you used to create this spatial index?

     

    We could get the script used to create spatial index with the following steps:

    1.       Connect to SQL Server via SQL Server Management Studio

    2.       Expand “<instance> | Databases | <database> | Tables | <table with spatial data> | Indexes”

    3.       Right-click the index and choose “Script Index as | CREATE To | New Query Editor Window”

    4.       Copy the script in the new windows and paste it here

     

    If anything is unclear, please let me know.

    Monday, August 16, 2010 12:39 PM
  • If your data is only going to be in Colorado, have you set the min/max bounds of the index to be a tight fit over that area?

    Beginning Spatial with SQL Server http://www.apress.com/book/view/1430218290
    Monday, August 16, 2010 12:53 PM
    Answerer
  • The best working index I could get was this one:


    CREATE SPATIAL INDEX MapTesting_Location_Medium_Medium_Medium_Medium_16_NDX
        ON MapTesting (Location)
     USING GEOMETRY_GRID
      WITH (
        BOUNDING_BOX = ( -- The extent of our data, data is clustered in cities, but this is about as small as the index can be without missing thousands of points
            XMIN = -12135832,
            YMIN = 4433884,
            XMAX = -11296439,
            YMAX = 5443645),
        GRIDS = (
            LEVEL_1 = MEDIUM,
            LEVEL_2 = MEDIUM,
            LEVEL_3 = MEDIUM,
            LEVEL_4 = MEDIUM),
         CELLS_PER_OBJECT = 256 -- This was set to 16 but it was much slower
      )

     

    I had some issues getting the index used, but this is different. For these tests I ran a test search (the one listed in my original post) with a WITH(INDEX(...)) clause for each of my indexes (testing various settings for grid size and cells per object), and one without any hint. I also ran sp_help_spatial_geometry_index using each index and the same search shape. The index listed above ran fastest and also  was listed as most efficient in sp_help_spatial_geometry_index.

    When running the search I get these statistics:

    (1 row(s) affected)
    Table 'MapTesting'. Scan count 0, logical reads 361142, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    Table 'Worktable'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
    Table 'extended_index_592590491_384009'. Scan count 1827, logical reads 8041, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

    (1 row(s) affected)

     SQL Server Execution Times:
       CPU time = 6735 ms,  elapsed time = 13499 ms.

    I'm looking at how to get you sample data.

    EDIT: By the way, in SQL Server, this query takes 14 seconds, but in Postgres (identical data) it takes around 500 ms.

    Monday, August 16, 2010 6:00 PM
  • I tried to create a set of random points to see if we got the same results, but with random points it was actually extremely fast. We're thinking the problem might be that our data is fairly sparse, with small areas of high density. We were wondering if the problem might just be that this is a situation where R-Trees outperform quad-tree-like grids, since any square on the grid that has any points is likely to have a lot of them, so our bottom level grids are either completely unused or packed with points.
    Monday, August 16, 2010 8:27 PM