locked
Update non spatial field with Aggregate Based on Spatial Criteria? RRS feed

  • Question

  • I can't seam to figure out how to do a bulk Update on table with an aggrigate on a spatial condition

    TableA has Geom1 GEOMETRY (Points)  , ValueA Decimal

    I would like  to do something like to

    Update TableA
    Set ValeA = AVG(ValueA)

    (WHERE ValueA > 0  from all other points within a 2 meter radius)

    any suggestions?
     
    Monday, March 30, 2009 8:36 PM

All replies

  • Hi there,

    I don't think you want to update ValueA directly with the average of the ValueA column, but rather store the average in a new column. Otherwise, once you've UPDATEd ValueA in row one, it will be that average that is used in all future calculations rather than the original ValueA. WIth that in mind, here's some source data:

    CREATE
     TABLE
     #TableA(
    Geom1 geometry,
    ValueA decimal
    ,
    ValueB decimal
    )
    
    INSERT
     INTO
     #TableA (Geom1, ValueA) VALUES
    
    (geometry::Point(10,10,0), 1),
    (geometry::Point(7,11,0), 2),
    (geometry::Point(9,10,0), 3),
    (geometry::Point(6,8,0), 4),
    (geometry::Point(12,11,0), 5),
    (geometry::Point(8,6,0), 6),
    (geometry::Point(11,9,0), 7)
    

    and here's how to update ValueB with the average of all ValueA's within a 2 unit radius:

    Update
     #TableA
    Set
     ValueB = 
    ( SELECT
     AVG
    (ValueA) 
    FROM
     #TableA AS
     A
    WHERE
     ValueA > 0
    AND
     #TableA.Geom1.STDistance(A.Geom1) < 2)
    

    is that what you wanted?
    Beginning Spatial with SQL Server http://www.apress.com/book/view/1430218290
    • Edited by tanoshimiEditor Tuesday, March 31, 2009 8:44 AM Code layout turned horrible
    Tuesday, March 31, 2009 8:43 AM
    Answerer
  • tanoshimi

    Thanks for your help.

    For the life of me I was getting frustrated with this.

    In my initial attempts I was using a HAVING  and GROUP BY clause,  which didn't like having Geography data type, no pun intended, thinking it was necessary for the Aggregrate.

    I was using the query in a sproc and came up with a brute force method that was making me cringe.

    --My original table started as:


        CREATE TABLE #GP(
            [ID] [int] IDENTITY(1,1) NOT NULL,
            [LongLat_Loc] [geography] NOT NULL,
            [UTM_Loc] [geometry] NOT NULL,
            [Elev_m] [decimal](9, 3) NULL,
            [IsSet] [Bit] 
            CONSTRAINT [PK_GP] PRIMARY KEY CLUSTERED
            (
                [ID] ASC
            ))

    --I Filled it with some then add a spacial index then after wards created an intermitent table

       CREATE TABLE #Near ([PointID] [INT] , [DataID] [INT])
       
    -- note for bellow: the first WHERE clause Elev_m < 0 are points to be updated and have elevation preset to -1  , these are evenly spaced points in a grid/ mesh

    -- the following was the butchery I got to work,  it was driving me mad

        INSERT INTO #Near (PointID , DataID)
        SELECT T1.ID , T2.ID
        FROM
        (SELECT ID, LongLat_Loc
            FROM #GP 
            WHERE Elev_m < 0) AS T1,
            (SELECT ID, LongLat_Loc
            FROM #GP) AS T2
        WHERE T1.LongLat_Loc.STDistance(T2.LongLat_Loc)< @Dist + 1 ;


        UPDATE #GP SET Elev_m = d.Elev_m
        FROM
            #GP g INNER JOIN
            (    SELECT n.PointID AS PID, AVG(Elev_m) AS Elev_m
                FROM #Near n
                INNER JOIN #GP g2
                ON n.DataID = g2.ID GROUP BY n.PointID, g2.ID, Elev_m HAVING Elev_m > 0) AS d
            ON g.ID = d.PID  ;


    The intent is to see about using SQL spatial for a quick and easy grid modeling,
    so I will be adding something to run this in a while loop until the changes in values reach a min threshhold.

    I like your simple select query and will use it, but I am wondering about using it for the step to create the intermitent table #Near. 
    I know this replicates some of the functionality of the Spacial index, but may there be a benifit of not needing to recalculate the distance?

    I have the coords in UTM stored as a Geometry data type, Has any one found much efficency with Indexing or the Distance Method between Geometry and Geography types?

    Also, I was  picked up a SQL 2008 book last night and learned of the Addition of the "CROSS APPLY" Join type which may have some application.

    Some interesting stuff to investigate.

    Thanks Again

    Berwyn

    Tuesday, March 31, 2009 5:04 PM
  • I tried the Single Select Method and I got some interesting results

    Initially I had my sproc to run through and do the recalcs 2X as I havn't set up the checking of a min threshhold yet.

    with #Near Method: 1:09  minutes
    with Single select :  4:57


    bumping up to  3X recalc
     #Near method: 1:10
     Single select :  7:50

    and down to a single pass
     #Near method: 1:09
     Single Select:  2:30

    Not what I expected

    Berwyn
    Tuesday, March 31, 2009 6:16 PM
  • OK, I think I see what you're trying to do now (it's a bit hard working with abstract concepts like ValueA and Geom1!). So you have some source data that you've been supplied with and some of the data points do not have a valid elevation, so you're trying to populate those points with a mean average of the elevation of nearby points. Is that right?

    Geometry methods generally perform faster than geography methods because they operate on a simple planar surface rather than a more complex ellipsoidal one. (At least, this is true for methods like STDistance(), STLength(), STIntersects() etc. It doesn't make any difference to STNumPoints() etc.).
    Both types perform significantly faster (think factors of up to 300x) when used with an appropriate spatial index.... the skill is in deciding what an 'appropriate' spatial index is in any particular case.

    My original query was a pretty basic example just to illustrate the logic required - there's better ways of identifying those points that lie within a given radius of a point (consider creating a buffer of a given size about a point using STBuffer() or BufferWithTolerance() and then finding any points that intersect that buffer, for example). Spatial is still a new topic (at least, in the SQL Server world it is), so there's still a lot of learning by experimentation. If you have found a method that works for you, then stick with it, but I'd encourage you to look into the matter of spatial indexes if you want to improve your query performance.
    Beginning Spatial with SQL Server http://www.apress.com/book/view/1430218290
    Tuesday, March 31, 2009 6:42 PM
    Answerer
  • After I populated the table #GP I've been applying an index

    CREATE SPATIAL INDEX [ID_LongLoat] ON #GP
        (
            [LongLat_Loc]
        )USING  GEOGRAPHY_GRID
        WITH (
        GRIDS =(LEVEL_1 = MEDIUM,LEVEL_2 = MEDIUM,LEVEL_3 = MEDIUM,LEVEL_4 = MEDIUM),
        CELLS_PER_OBJECT = 16, PAD_INDEX  = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
       
    I found the sproc ran several seconds faster with the index applied after the points were  inserted

    I havn't played with tweeking it though by adjusting levels and what not.

    Berwyn

    Tuesday, March 31, 2009 6:48 PM