none
geometry.ToString() and STAsText() disastrous for performance?

    Question

  • Short version, adding a geometry.ToString() increases a 50 row query result from 1s to 3s...   (geometry holds simple points) Ex, simplest form:

    Select t1.id, t2.geometry from table1 t1
    inner join table2 t2 on t1.id1 = t2.id1
    where t1.flag = 1

    Above takes 1s, but for the following two queries, it take 3s, for 50 rows...

    Select t1.id, t2.geometry.ToString() from table1 t1
    inner join table2 t2 on t1.id1 = t2.id1
    where t1.flag = 1

    Select t1.id, t2.geometry.ToString() from table1 t1, table2 t2

    where t1.flag = 1 and t1.id1 = t2.id1

    Adding a second geometry.ToString() adds another 2s..   (select a, geometry.ToString(), geometry.ToString())

    If I just do a query to return all 210,000 rows:

    Select id, geometry from table2 -> 1s

    Select id, geometry.ToString() from table2 -> 4s

    Select id, geometry.ToString(), geometry.ToString() from table2 -> 7s

    Yet, when the original query returns 50 rows, it adds 2s for the 50 ToString(), but adds 3s for 210,000 toStrings... How is this possible?  Is there an issue with ToString() function?  But actually, STAsText() behaves exactly the same way...   On another table, a query with 142 results goes from 0s to 4s by adding a geometry.STAsText()...!
    • Edited by akk2x Friday, April 14, 2017 7:44 PM
    Friday, April 14, 2017 7:41 PM

All replies

  • Hi akk2x,

     

    Could you please check the query plan? Based on my test, when add the function ToString() and STAsText(), it will add a separate Compute Scalar step, this step will take some time.

     

    When operate the command 'Select id, geometry from table2':

     


    When operate the command 'Select id, geometry.ToString() from table2' and 'Select id, geometry.ToString(), geometry.ToString() from table2':

     


    When operate the command 'Select id, geometry.ToString(), geometry.STAsText() from table2':

     


    In my test, the time cost for Table Scan is more than Compute Scalar step, however, in your scenario, it takes more time for Compute Scalar.

     

    According to your description, the increasing time for 210,000 rows is a little more than the time for 50 rows, this points out that the actual time used for Compute Scalar is not much. I suggest you checking the query plan.

     

    Best Regards,

    Teige

     

     


    MSDN Community Support
    Please remember to click "Mark as Answer" the responses that resolved your issue, and to click "Unmark as Answer" if not. This can be beneficial to other community members reading this thread. If you have any compliments or complaints to MSDN Support, feel free to contact MSDNFSF@microsoft.com.

    Monday, April 17, 2017 7:41 AM
  • So this really seems like a MS SQL Server bug...  And I can't understand why the Query Plan does not reflect actual costs/execution of the query....

    In studio manager, I execute the following query

    select id1, id2, geo, geo, geo from table1.
    It returns 210,000 rows in 1s.
    Query Plan: Select Cost 0% <- Clustered Index Scan 100%

    select id1, id2, geo.ToString(),geo.ToString(),geo.ToString() from table1.
    It returns 210,000 rows in 13s
    Query Plan: Select Cost 0% <- Compute Scalar cost 0% <- Clustered Index Scan 100%

    So this makes no sense to me.  If I do a series or random concats, add 20 concats(concats(x,y),z) clauses in the select, the execution time quadruples (from 1s to 4s), and this time the compute scalar costs raises by... 1%...!

    Query Plan: Select Cost 0% <- Compute Scalar cost 1% <- Clustered Index Scan 99%

    for reference,
    select * from table1.
    It returns 210,000 rows in 3s (about 25 columns)
    Query Plan: Select Cost 0% <- Clustered Index Scan 100%

    It shouldn't be transfer size as "POINT (300946.2 5042494.519)" doesn't seem that much larger in size than "0xBC7D0000010CCDCCCCCC485E1241C74B37A14F3C5341" to transfer to the client, so that can't seem to explain why executing the query in studio manager would go for 1s vs 13s just because of four .ToString()...

    Wednesday, April 19, 2017 12:03 PM
  • Hi akk2x,

     

    I did some further researching, I found that the cost of Compute Scalar is not actually 1% as it showing. You can right click on the step and click Properties. In the right Dialog, expand Actual Time Statistics -> Actual Elapsed CPU Time, you will see the actual time used for this step.

     

    In my scenario, I used ten ToString(), then it finally used 1542ms, and the step for selecting from table used  23ms. These time are cost for operating ToString().

     


     

    Best Regards,

    Teige

     


    MSDN Community Support
    Please remember to click "Mark as Answer" the responses that resolved your issue, and to click "Unmark as Answer" if not. This can be beneficial to other community members reading this thread. If you have any compliments or complaints to MSDN Support, feel free to contact MSDNFSF@microsoft.com.

    Thursday, April 20, 2017 10:08 AM
  • Thanks for that!  Yeah it seems it's a query optimizer issue and you were right in the first place in pointing to Execution Plan for the cause...  I'm dumbfounded at how this dataset behaves.  The original query is monstrous... But simplifying as above, we get;

    Select t1.id, t2.geometry.ToString() from table1 t1
    inner join table2 t2 on t1.id1 = t2.id1
    where t1.flag = 1

    both tables have over 200,000 rows.  But the where flag = 1 returns 140 (actually returns ~3000, but joins matches 140), it looks like the the ToString() is currently applied to the entire 210k elements and then filtered by the flag = 1 ...  Execution Plan is basically:

    select <- merge join <- index seek
    then a 2nd 'branch' going to the merge:
                   merge join  <-  compute scalar <- index scan 91% cost

    So basically, the top branch filters 3000, and the bottom does the ToString() for the 200k+ rows, then the 2 get merged joined...  Which explains why it takes so long...  I really don't understand why it doesn't apply the filter from the where flag = 1 clause, then merge the two tables, and then executes the function for the result... But I guess that's a question for another thread?  :(     Sadly, I think I might have to generate my 1st custom execution plan to force it to execute the query 'logically'?


    Thursday, April 20, 2017 8:42 PM
  • You may post an issue on Connect. What is your exact SQL Server version?

    If you're correct, you can try using derived table and apply ToString() for that and see if it helps.

    BTW, why do you need to apply that function in the select statement at all?


    For every expert, there is an equal and opposite expert. - Becker's Law


    My blog


    My TechNet articles

    Thursday, April 20, 2017 9:41 PM
  • Basically, an application reads the geometries by adding the geometry.ToString() to an SQL query to then use the WKT.  So as it really adds the ToString() to the SQL query during its execution, it prevents doing anything server side to pre-generate the ToString value or anything...

    I've now tried with 2014 and 2016, both give slightly different Execution plans with similar results; one huge branch deals with an 'expensive' spatial filter (intersect) & other conditions, where the cost of the query is about 92%, and then the 'magic' flag = 1 that returns very few results is done and merged with the huge result.  But to not get bogged down with the details;

    Select t1.id, t2.geometry.ToString() from table1 t1
    inner join table2 t2 on t1.id1 = t2.id1
    where t1.flag = 1

    Even the above is super expensive...  Is there way just to tell MSSQL to start off with the t1.flag = 1 'filter', filter t2 to get the 140 results and then compute the ToString & whatever else?  Because that's what's kills it...  Doing all computations on 200k+ rows, and then filtering...

    And wow... SQLServer 2016 is so much better than 2014 for debugging with the live query statistics!  Much easier to see what is going on, I'd have saved so much time starting with that!

    Friday, April 21, 2017 2:17 PM
  • Is it possible to try

    ;with cte as (select * from table1 t1 where Flag=1)

    select ...

    or you're unable to change the query at all? I am just wondering if re-adjusting this way may somehow help (although SQL optimizer usually re-adjusts query itself the way it sees fit, so that change may not help at all).


    For every expert, there is an equal and opposite expert. - Becker's Law


    My blog


    My TechNet articles

    Friday, April 21, 2017 3:37 PM
  • We have some flexibility on the query, although I doubt the app would support multi part queries..  But even in studio manager,

    ;with cte as (select id1 from table1 where flag=1)
    Select t2.id, t2.geometry.ToString() from cte
    inner join table2 t2 on t1.id1 = t2.id1

    I'm getting the same behavior, performance and execution plan; full scan & .ToString() applied to all the table rows...  (3s total vs 0s without .ToString())

    Friday, April 21, 2017 7:31 PM
  • In this case I can only suggest to try temp table instead of cte. How many rows the flag=1 filters?

    For every expert, there is an equal and opposite expert. - Becker's Law


    My blog


    My TechNet articles

    Friday, April 21, 2017 7:48 PM
  • The flag is true for ~3400 rows, and the join between both tables matches around 150.  But those numbers will vary a lot through time/regions...  I think the temp table could work, but actually, I was doing some test on MSSQL 2016, and with exactly the same data, the query only *NOW* seems to interpret and optimize the query correctly!  It has stopped doing full table scans...  Just yesterday it was 'bugged'...  No change since then really.

    I was going to say/test that the CTE could also probably have worked, my above query probably wasn't correct to bypass the problem, with something like;

    ;with cte as (Select t1.*, t2.* from table1 t1 inner join table2 t2 on t1.id1 = t2.id1 where t1.flag = 1)
    select id, geometry.ToString() from cte

    If i understand correctly the ToString should have be done for the cte result set, so should bypass the issue, but now that it seems to work, I'm unable to confirm that it fixes the issue. And that is what I was about to test, but now with 2016, the Query Plan generated with the same query as before, runs optimally and the performance with the ToString() is back to 'normal'!   



    • Edited by akk2x Tuesday, April 25, 2017 6:26 PM
    Tuesday, April 25, 2017 3:51 PM