Asked by:
Effective Octree Spatial Index on UDT
Question

I know this is not exactly the correct forum for this post, but I require input from people who's worked with spatial data and / or 3d data.
We currently have a UDT (User Defined Type) as a column in SQL Server 2008. This type stores 3d geometries, including the Z (elevation). We'd also like to create an spatial index on this type, which will be a separate computed column (persisted).After doing some research, an Octree (http://en.wikipedia.org/wiki/Octree) Algorithm sounds the most feasible for 3d (Where SQL uses a quadtree for 2d)..
Some documentation :(http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.29.987&rep=rep1&type=pdf)The idea would be to have a primary filter based on the index, before applying the more expensive predicates / functions such as contains, intersects, etc. for performance. Below is the pseudocode for generating an Octree Key which will be stored as the index key for filtering the data:
public string BuildIndex(xMin, yMin, zMin, xMax, yMax, zMax) { BoundingBox box = new BoundingBox(xMin, yMin, zMin, xMax, yMax, zMax); StringBuilder rc = new StringBuilder(""); for (int i = 1; i <= 15; i++) { Vector3D pt = overall.Center(); int box = 0; if (Lower.X < pt.X) { if (Upper.X <= pt.X) { overall.Upper.X = pt.X; } else { break; } } else { if (Upper.X >= pt.X) { box = 1; overall.Lower.X = pt.X; } else { break; } } if (Lower.Y < pt.Y) { if (Upper.Y <= pt.Y) { overall.Upper.Y = pt.Y; } else { break; } } else { if (Upper.Y >= pt.Y) { box = 2; overall.Lower.Y = pt.Y; } else { break; } } if (Lower.Z < pt.Z) { if (Upper.Z <= pt.Z) { overall.Upper.Z = pt.Z; } else { break; } } else { if (Upper.Z >= pt.Z) { box = 4; overall.Lower.Z = pt.Z; } else { break; } } rc.Append(box.ToString()); } return rc.ToString(); // Example: would return values lie "1", "13", "4545", etc }
Then the query (Assuming our octree key (based on bounding box) = 131):
SELECT Geometry3D.AsBinary() FROM [TABLE] WHERE Geometry3D.Intersects(@bounds) = 1 AND IndexColumn IN (SELECT DISTINCT IndexColumn FROM [TABLE] WHERE IndexColumn IN ('1','13') OR IndexColumn LIKE '131%')
The problem with this function is that it still generates a lot of false positives, for example on a data set with 1 million records, the index column would filter it down to 5% (50 000), with the secondary filter (intersection, contains) down to 1% (500). I find the computation not optimal, since an octree should have a lot less false positives. Even if I look at the data, I found results returned by the octree values not remotely close geographically, nor logical to a octree structure (or kdtree). This means there is something wrong with the actual formula.
After reading numerous documentation / white papers, I'm still not 100% sure where I can improve on this.
Any advice and / or feedback would be appreciated.
Thursday, October 7, 2010 11:34 AM
All replies

What about having two 2D index colums?
One with XY figure and the other with XZ, for example. The base and the facade. Both indexed geometry columns, of course.
Thursday, October 7, 2010 4:27 PM 
I'd rather stick to a single computed column as shown above. If I can just tweak the subdivision of cells to a more optimal level and avoiding the LIKE clause which returns a lot of false positives, it would be a great improvement..
Is there another approach for recursively subdividing / splitting cells that is more efficient based on a octree or kdtree?
Friday, October 8, 2010 8:15 AM 
Bump.. Anybody?
Perhaps a different algorithm for creating a RTree+ index (clustered / grouped by the bounds) of a geometry instead?
Friday, October 15, 2010 11:43 AM