locked
Simple Spatial Index (Theory) Questions RRS feed

  • Question

  • I'm trying to get my head around the SQL Server implementation of spatial indexes.  Can you help answer the following basic questions?
    1.  What algorithm does SQL Server use? R-Tree, quadtree, something secrect and magic?
    2.  When is the index updated?  Every time a row is added or on request?  So if I add a nadful of rows at random points, does the entire index get rebuilt or does the re-indexing find the correct node in the tree and just rebuilt that?
    Thursday, February 4, 2010 2:03 PM

Answers

  • It's a four-level grid system. Isaac Kunen wrote a couple of blog entries explaining the basic way it works, try this one for starters:
    http://blogs.msdn.com/isaac/archive/2008/03/01/basic-multi-level-grids.aspx

    While they are unique to the geometry and geography datatypes, and they work in slightly different ways, spatial indexes are otherwise managed the same way as any other sort of index. So, the index is simply updated as rows are added to the underlying table, you can force the index to be rebuilt by using ALTER INDEX (although doing so holds a lock on the underlying table), you can choose whether the spatial index should SORT_IN_TEMPDB etc.
    Beginning Spatial with SQL Server http://www.apress.com/book/view/1430218290
    • Marked as answer by SimonMunro Thursday, February 4, 2010 4:57 PM
    Thursday, February 4, 2010 2:52 PM
    Answerer
  • Hi Simon,

    I've heard the index called (by one of its designers) a modified adaptable
    quadtree. SQL Server indexes are updated each time a row is added/updated,
    the correct node(s) in the tree are just added/updated.

    Hope this helps,
    Bob Beauchemin
    SQLskills

    • Marked as answer by SimonMunro Thursday, February 4, 2010 4:58 PM
    Thursday, February 4, 2010 2:57 PM

All replies

  • It's a four-level grid system. Isaac Kunen wrote a couple of blog entries explaining the basic way it works, try this one for starters:
    http://blogs.msdn.com/isaac/archive/2008/03/01/basic-multi-level-grids.aspx

    While they are unique to the geometry and geography datatypes, and they work in slightly different ways, spatial indexes are otherwise managed the same way as any other sort of index. So, the index is simply updated as rows are added to the underlying table, you can force the index to be rebuilt by using ALTER INDEX (although doing so holds a lock on the underlying table), you can choose whether the spatial index should SORT_IN_TEMPDB etc.
    Beginning Spatial with SQL Server http://www.apress.com/book/view/1430218290
    • Marked as answer by SimonMunro Thursday, February 4, 2010 4:57 PM
    Thursday, February 4, 2010 2:52 PM
    Answerer
  • Hi Simon,

    I've heard the index called (by one of its designers) a modified adaptable
    quadtree. SQL Server indexes are updated each time a row is added/updated,
    the correct node(s) in the tree are just added/updated.

    Hope this helps,
    Bob Beauchemin
    SQLskills

    • Marked as answer by SimonMunro Thursday, February 4, 2010 4:58 PM
    Thursday, February 4, 2010 2:57 PM
  • I have always found the following web page useful and still use it as a reference when I get stuck on tuning spatial indexes:

    Microsoft TechNet - SQL Server TechNet - Spatial Indexing Overview

    http://technet.microsoft.com/en-us/library/bb964712.aspx

    According to the web page, "In SQL Server 2008, spatial indexes are built using B-trees, which means that the indexes must represent the 2-dimensional spatial data in the linear order of B-trees. Therefore, before reading data into a spatial index, SQL Server 2008 implements a hierarchical uniform decomposition of space. The index-creation process decomposes the space into a four-level grid hierarchy ."

    When are spatial indexes updated?  I think spatial indexes are updated when either a user or some piece of code initiates an update.  Spatial indexes, relative to other indexes, can be costly to build, and it seems like any automatic updating would be a real performance killer for the database in general.  In looking at the syntax of the
    CREATE SPATIAL INDEX statement, I don't see anything that indicates an option to have an index automatically update.  Interestingly, online index builds are not supported for spatial indexes with this version of SQL Server.

    Regarding the second part of your second question, I am interested in what others say as well.  From my experiences rebuilding spatial indexes, and I may be doing something wrong, it seems rebuilding an index takes almost as long as dropping and recreating one.  But then again, I usually just use SSMS for rebuilding spatial indexes.
    Thursday, February 4, 2010 3:14 PM
  • After I replied, I saw that tanoshimi and Bob had already responded.  It appears I failed to appreciate the differences between a spatial index getting updated and a spatial index getting rebuilt when I wrote my earlier comments.

    Bob, can you provide a link to BOL or some other source that describes how SQL Server spatial indexes are updated each time a row is added/updated and how only the correct node(s) in the tree are added/updated?  I always thought spatial indexes were handled a bit different than non-spatial indexes, and I am interested in reading more about this specific issue.
    Thursday, February 4, 2010 3:28 PM
  • Thanks for the replies.  It would seem that a 'four-level grid' is essentially a QuadTree - which is a recognised spatial indexing algorithm.  For another day I'll look into this further.  For example, I know that (at least in R-Tree) that the node structure of the index allows easier paging, which makes a difference with storage.  When inserting into a tree, the item can be added to the relevant leaf node by traversing the tree and only updating the traversed nodes that have been altered... whcih offers interesting variations when updating indexes.

    I'm still a spatial noob, so thanks for your help
    Thursday, February 4, 2010 5:06 PM
  • My understanding of quadtrees is that they recursively divide space into four quadrants - i.e. each division subdivides the previous cell into a 2x2 matrix. Therefore, every cell has exactly four child nodes, or they have none (i.e. a leaf cell).

    In SQL Server spatial indexes you can set the resolution of the matrix used at each of the four levels of the grid. A LOW resolution grid subdivides each cell into a 4x4 grid, A MEDIUM grid subdivides each cell into an 8x8 grid, and a HIGH grid subdivides each cell into 16x16. You can mix and match between resolutions at each level. Therefore it doesn't really match the "quad" bit of the quadkey description.

    There is also other differences in behaviour: for example, quadtree cells are only subdivided if it is necessary to reduce the specified item count in each cell (so, if you have a level 1 cell that only contains 1 point, it will not be subdivided). In contrast, the SQL Server tesselation function continues to subdivide all cells right down to Level 4 (so long as the CELLS_PER_OBJECT limit is not met). So rather, than having a Level 1 cell containing a point, you'd have a Level 4 cell containing a point.
    (Not very well explained, but hopefully that makes sense!)
    Beginning Spatial with SQL Server http://www.apress.com/book/view/1430218290
    Thursday, February 4, 2010 7:07 PM
    Answerer