locked
Effective Octree Spatial Index on UDT RRS feed

  • Question

  • I know this is not exactly the correct forum for this post, but I require input from people who's worked with spatial data and / or 3d data. 


    We currently have a UDT (User Defined Type)  as a column in SQL Server 2008. This type stores 3d geometries, including the Z (elevation). We'd also like to create an spatial index on this type, which will be a separate computed column (persisted).

    After doing some research, an Octree (http://en.wikipedia.org/wiki/Octree) Algorithm sounds the most feasible for 3d (Where SQL uses a quadtree for 2d)..
    Some documentation :(http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.29.987&rep=rep1&type=pdf)

    The idea would be to have a primary filter based on the index, before applying the more expensive predicates / functions such as contains, intersects, etc. for performance. Below is the pseudo-code for generating an Octree Key which will be stored as the index key for filtering the data:

    public string BuildIndex(xMin, yMin, zMin, xMax, yMax, zMax)
    {
    BoundingBox box = new BoundingBox(xMin, yMin, zMin, xMax, yMax, zMax);
    
    
    StringBuilder rc = new StringBuilder("");
    
    	for (int i = 1; i <= 15; i++)
    	{
    	Vector3D pt = overall.Center();
    
    		int box = 0;
    
    		if (Lower.X < pt.X)
    		{
    		if (Upper.X <= pt.X)
    			{
    				overall.Upper.X = pt.X;
    			}
    			else
    			{
    				break;
    			}
    		}
    		else 
    		{
    			if (Upper.X >= pt.X)
    			{
    				box |= 1;
    				overall.Lower.X = pt.X;
    			}
    			else
    			{
    				break;
    			}
    		}
    
    		if (Lower.Y < pt.Y)
    		{
    			if (Upper.Y <= pt.Y)
    			{
    				overall.Upper.Y = pt.Y;
    			}
    			else
    			{
    				break;
    			}
    		}
    		else
    		{
    			if (Upper.Y >= pt.Y)
    			{
    				box |= 2;
    				overall.Lower.Y = pt.Y;
    			}
    			else
    			{
    				break;
    			}
    		}
    
    		if (Lower.Z < pt.Z)
    		{
    			if (Upper.Z <= pt.Z)
    			{
    				overall.Upper.Z = pt.Z;
    			}
    			else
    			{
    				break;
    			}
    		}
    		else
    		{
    			if (Upper.Z >= pt.Z)
    			{
    				box |= 4;
    				overall.Lower.Z = pt.Z;
    			}
    			else
    			{
    				break;
    			}
    		}
    
    		rc.Append(box.ToString());
    	}
    
    return rc.ToString();
    
    	// Example: would return values lie "1", "13", "4545", etc
    }
    
    

    Then the query (Assuming our octree key (based on bounding box) = 131):

    SELECT 
    		Geometry3D.AsBinary()
    
    	FROM
    		[TABLE]
    
    	WHERE 
    		Geometry3D.Intersects(@bounds) = 1 AND
    		IndexColumn IN (SELECT DISTINCT IndexColumn FROM [TABLE]
    				WHERE IndexColumn IN ('1','13') OR
    						IndexColumn LIKE '131%')
    
    

    The problem with this function is that it still generates a lot of false positives, for example on a data set with 1 million records, the index column would filter it down to 5% (50 000), with the secondary filter (intersection, contains) down to 1% (500). I find the computation not optimal, since an octree should have a lot less false positives. Even if I look at the data, I found results returned by the octree values not remotely close geographically, nor logical to a octree structure (or kd-tree). This means there is something wrong with the actual formula.

    After reading numerous documentation / white papers, I'm still not 100% sure where I can improve on this.

    Any advice and / or feedback would be appreciated.

    Thursday, October 7, 2010 11:34 AM

All replies

  • What about having two 2D index colums?

    One with XY figure and the other with XZ, for example. The base and the facade. Both indexed geometry columns, of course. 

    Thursday, October 7, 2010 4:27 PM
  • I'd rather stick to a single computed column as shown above. If I can just tweak the sub-division of cells to a more optimal level and avoiding the LIKE clause which returns a lot of false positives, it would be a great improvement..

    Is there another approach for recursively sub-dividing / splitting cells that is more efficient based on a octree or kd-tree?

     

    Friday, October 8, 2010 8:15 AM
  • Bump.. Anybody?

    Perhaps a different algorithm for creating a RTree+ index (clustered / grouped by the bounds) of a geometry instead?

    Friday, October 15, 2010 11:43 AM