locked
Problems with SQL Concatenation + Missing Documentation On This RRS feed

  • Question

  • TL; DR:  If you want to tell someone who IS successfully using  SQL concatenation NOT to use SQL Concatenation - how can you

    1) Show them how their working example could fail  - *OTHER* than the (valid) argument, " a future upgrade to your SQL Server may break your SQL code"

    2) Show them an active link to MS documentation talking about why this doesn't work?

    Long Version

    Hello All,

    I don't know whether this should be a question or a discussion, but I'd appreciate to feedback from either MS or the community.

    Take a look at this issue described on Marc Durdin's blog:

    https://marc.durdin.net/2015/07/concatenating-strings-in-sql-server-or-undefined-behaviour-by-design/

    ... which describes how doing something like this....

    SELECT @MyVariable = @MyVariable + t.TableField + ','

    FROM MyTable t

    ... unreliable and can produce unpredictable results - and we *should* be using XML Path instead.

    Also here on StackOverflow see the second answer and discussion - apparently the unreliable method is much slower than XML Path.

    Problem 

    1) This sort of thing is used quite a bit - I've seen it a good set of tutorials on SQL

    2) Marc shows an example of how this kind of concatenation messes up ... when you use Where Clause + Order By. But he also showed a simple working example, WITHOUT where clause paired with order by.  If you follow along,  you'll likely get it to work, like I did. (But if you didn't please post!)

    But the thing is - even in that case it's supposed to be unreliable - if the execution plan changes.  I'm no genius on execution plans,  but apparently they can change between versions.

    I was trying to find out if there's another way to change the execution plan so that it will cause a simple concatenation with no WHERE Clause + ORDER BY to  *Fail*.   I'm trying to narrow down what would make concatenation work vs not work.

    3) There used to be some Microsoft documentation describing these problems,  and this is referenced in Marc's blog.

    Here is another place in StackOverflow where this documentation is referenced and partially quoted.  However, many of the links are dead - Marc mentioned there's been a change in MS documentation, and it seems many articles are not longer available.

    So my questions are

    1) Does anyone know where this documentation is now?

    2) Can anyone give some more information on what might cause a simple, working example of concatenation -like mine above or Marc's first working example - to fail?  For example,  can you show a way to change the execution plan to break a simple concatenation Select ?

    - Thanks

    OneLChela

    
    SELECT @Summary = @Summary + ec.Description + ', '
    FROM BadConcat ec
    
    SELECT @Summary = @Summary + ec.Description + ', '
    FROM BadConcat ec
    
    SELECT @Summary = @Summary + ec.Description + ', '
    FROM BadConcat ec
    
    SELECT @Summary = @Summary + ec.Description + ', '
    FROM BadConcat ec
    
    SELECT @Summary = @Summary + ec.Description + ', '
    FROM BadConcat ec
    
    SELECT @Summary = @Summary + ec.Description + ', '
    FROM BadConcat ec

    • Edited by OneLChela Thursday, November 7, 2019 4:35 PM change wording slightly
    • Changed type OneLChela Thursday, November 7, 2019 8:00 PM Looking for answers.
    Wednesday, November 6, 2019 11:38 PM

All replies

  • Hi OneLChela,

     

    In DDL,  it sets clustered index on "BadConcatID", which means the column is already sorted and ordered by index key.  For clustered key, you can check : https://docs.microsoft.com/en-us/sql/relational-databases/indexes/clustered-and-nonclustered-indexes-described?view=sql-server-ver15  Order by plus index key are duplicate and make no sense here.

     

    Best regards,

    Dedmon Dai


    MSDN Community Support
    Please remember to click "Mark as Answer" the responses that resolved your issue, and to click "Unmark as Answer" if not. This can be beneficial to other community members reading this thread. If you have any compliments or complaints to MSDN Support, feel free to contact MSDNFSF@microsoft.com

    Friday, November 8, 2019 6:28 AM
  • Dedmon, 

    Thanks for responding!  In this query from Marc's blogs

    SELECT @Summary = @Summary + ec.Description + ', ' FROM BadConcat ec WHERE ec.BadConcatID in (1,2,3) ORDER BY ec.SortIndex

    ..it is not ordered by the BadConcatID column but the SortIndex column. (Theorectically the SortIndex could have data, which, after sorting, would result in different order than the primary key.  It is true that in this case, , given the sample values in the SortIndex and BadConcatID columns, the Order By is redundant. 

    But that doesn't seem to be what's "breaking" this SQL.  If you take out the WHERE clause, then run "PRINT @Summary", it will still give you the correct results. 

    - OneLChela


    • Edited by OneLChela Monday, December 2, 2019 9:50 PM
    Monday, December 2, 2019 9:49 PM
  • There used to be a KB article - way back in the days of SQL 2000 - which said that the correct behaviour of this construct is undefined. Alas, that article is not around anymore.

    I don't have any examples in store which shows when this fails, but every once in a while we encounter people in the forums who run into it.

    Anyway, if you are on SQL 2017, use string_agg. On SQL 2005 to SQL 2016 use FOR XML PATH(''). Yes, it looks ugly but it is guaranteed to work.


    Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

    • Proposed as answer by Naomi N Monday, April 13, 2020 12:20 AM
    Monday, December 2, 2019 10:53 PM
  • Thanks for the reply Erland!

    There does seem to be an issue with not being able to use STRING_AGG with an "order by"

    ( I am on SQL server 2017)

    Wednesday, December 18, 2019 7:34 PM
  • There does seem to be an issue with not being able to use STRING_AGG with an "order by"

    You can use string_agg to get an ordered list. You need to use the use the WITHIN GROUP clause:

    SELECT object_name(object_id), string_agg(name, ' - ') WITHIN GROUP (ORDER BY name DESC)
    FROM   sys.columns
    GROUP  BY object_name(object_id)


    Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

    • Proposed as answer by Naomi N Monday, April 13, 2020 12:20 AM
    Friday, December 20, 2019 10:39 PM
  • Thanks Erland,  (I'm sorry for the late reply - just seeing this response today)

    Just to explain further -  in this example query:

    SELECT @Summary = @Summary + ec.Description + ', ' FROM BadConcat ec WHERE ec.BadConcatID in (1,2,3) ORDER BY ec.SortIndex

    We aren't trying to group by anything. 

    But I decided to go ahead and try something a little messy


    SELECT STRING_AGG(ec.Description,',')  WITHIN GROUP (ORDER BY SortIndex) desc_list
    FROM
    (select 'fake_out' as fake_col,  t.*
    from BadConcat t ) ec
    WHERE ec.BadConcatID in (1,2,3)
    group by fake_col

    This works. Thanks for taking the time to give your input!

    - OneLChela

    Thursday, January 2, 2020 8:43 PM
  • You don't have to use GROUP BY with string_agg, just you don't have to use GROUP BY with any other aggregate function (SUM, COUNT etc). I only used GROUP BY in my example to show the full powers.


    Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

    Thursday, January 2, 2020 10:38 PM
  • Erland,

    I haven't yet figured out how to get this query: 

    SELECT @Summary = @Summary + ec.Description + ', ' FROM BadConcat ec WHERE ec.BadConcatID in (1,2,3) ORDER BY ec.SortIndex

    Translated into using String_Agg without using the group by fake-out I used above.

    - Michele

    Friday, January 3, 2020 2:37 PM
  • SELECT string_agg(ec.Description) WITHIN GROUP (ORDER BY ec.SortIndex)
    FROM   BadConcat ec WHERE  ec.BadConcatID in (1,2,3)


    Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

    Friday, January 3, 2020 10:37 PM
  • Thanks Erland!

    (Now I'm wondering what exactly what was the last thing I tried that didn't work.  I thought I did something with WITHIN GROUP (sans GROUP BY) that didn't work...  maybe my sql was off,  or maybe I was just in the wrong database or...?)

      Oh well,  I knew I should have checked again after you said Group By wasn't necessary! My Bad.

    Thanks again.


    • Edited by OneLChela Sunday, January 5, 2020 7:40 PM
    Sunday, January 5, 2020 7:27 PM