none
Correcting someone else's code RRS feed

  • Question

  • I have been a light user of SQL in the past and now I have inherited some code which is quite long and little untidy. I am using SQL Management Server Studio 2008R2 - which is completely new to me. I've had a test server set-up and now have the debugger working, which I tried out on a small simple query. It did not work as expected and I'm struggling to find easy reference material for "beginners".

    So first question is; where to get a primer for MS-SQL debugging (and writing for that matter)?

    The specific issues with the debugger that I experienced were:

    1. setting break-points (OK); starting the process with step-into (OK); then the all break-points disappearing into one on line 1 under the debuggers yellow arrow. It seems to not allow any other breaks. Why?

    2. Trying to set some watches at random. Allowed me to select almost anything but could never evaluate anything. Why?

    The query itself is only 35 lines and it executes properly, I was simply using it as a self-training tool.

    The reason I need the tool is that I have 6500 line query that executes Ok with the one exception of doubling the output ie every row is duplicated. I suspect it is a join to far but pinning it down is currently beyond me.

    Sorry for such simple questions but all help/advice gratefully received.

    Monday, June 24, 2013 2:14 PM

Answers

  • The debugger is no help for troubleshooting a query, which is a single statement.  It's only useful for multi-statement procedural code.

     

    I have 6500 line query that executes Ok with the one exception of doubling the output ie every row is duplicated. I suspect it is a join to far but pinning it down is currently beyond me.

    To troubleshoot this you will need to fix the query.  Here's the basic process I would use:

    1) Start with a simplified query that comments out most of the joins and returns the correct number of rows.

    2) Add back the joins until you find the one that doubles the output rows.

    3) Fix the join.

    David


    David http://blogs.msdn.com/b/dbrowne/

    • Proposed as answer by Naomi NModerator Monday, June 24, 2013 2:30 PM
    • Marked as answer by SforB Wednesday, June 26, 2013 11:06 AM
    Monday, June 24, 2013 2:24 PM

All replies

  • The debugger is no help for troubleshooting a query, which is a single statement.  It's only useful for multi-statement procedural code.

     

    I have 6500 line query that executes Ok with the one exception of doubling the output ie every row is duplicated. I suspect it is a join to far but pinning it down is currently beyond me.

    To troubleshoot this you will need to fix the query.  Here's the basic process I would use:

    1) Start with a simplified query that comments out most of the joins and returns the correct number of rows.

    2) Add back the joins until you find the one that doubles the output rows.

    3) Fix the join.

    David


    David http://blogs.msdn.com/b/dbrowne/

    • Proposed as answer by Naomi NModerator Monday, June 24, 2013 2:30 PM
    • Marked as answer by SforB Wednesday, June 26, 2013 11:06 AM
    Monday, June 24, 2013 2:24 PM
  • Another suggestion is to try "divide and conquer" approach  (term learned from Peter Larsson). Get main part of the query into a temp table and join with the rest of the tables without breaking the idea of the query. You may get better performance with this approach (although not necessarily).

    For every expert, there is an equal and opposite expert. - Becker's Law


    My blog


    My TechNet articles

    Monday, June 24, 2013 2:32 PM
    Moderator
  • Thanks for the reply David.

    Disappointed that the debugger is of no use as these SQL scripts can appear pretty illogical read from top to bottom.

    Am I to assume from your answer that there are no other tools that can assist and it is purely a slog using trial and error?

    Is there for example anything that can re-arrange the code into a "sensible" sequence (maybe the execution plan part of the app)?

    Also, is it possible that the code I'm seeing was in fact machine generated, rather than crafted by hand? And if so would tracking down the tool used be of any help?

    Sleeves rolled up in anticipation of your answers!

    Regards,

    Rob

    Tuesday, June 25, 2013 9:13 AM
  • Do you really have a 6500 line query?  This isn't procedural code, it's really a massive statement with a SELECT at the beginning and an ORDER BY at the end?  I am trying to visualize such a query.  Does it have UNION statements connecting dozens of SELECTs, many of which are near copies of each other?  Sometimes people can get themselves into trouble like that if they don't know how to build IN clauses.  How many tables in each join?


    • Edited by John Smith 3 Tuesday, June 25, 2013 12:50 PM blah
    Tuesday, June 25, 2013 12:17 PM
  • Maybe I exaggerated a little 6275 rows with a some blank separator lines so say circa 6000 code lines. I have not analysed it yet but it certainly begins SELECT and ends with ORDER BY.

    I started with its baby brother (only 2800 lines). as far as I can tell it has: embedded SELECT statements 4 deep; 5 aliased tables; 5 non-aliased tables;  5 sub-tables (created by the code); 13 primaary joins; and 14 secondary joins.

    Why was this written in SQL? Who knows!

    To be fair a lot of the larger query is repetitive code covering 12 periods so you could say it only has around 500-600 lines of unique code.

    Tuesday, June 25, 2013 12:59 PM
  • Much of the usual advice is useless here.  SSMS offers "Display Estimated Execution Plan" which helps with performance problems.  SQL Profiler intercepts the actual statements going to the database, but you're in possession of the statements.  The biggest issue here is political, in that you have been handed flatly unmaintainable code.  I have seen people lose jobs because their bosses failed to understand that it was not actually possible to fix / maintain / enhance the existing code base, and that abandonment or reimplementation from scratch had to be considered.  I would do my best to be up front with them about that.

    Technically, there's no reason this shouldn't be written in SQL (which generally runs much faster than when you do this sort of thing on the application side), but there's also no reason why these statements should be over a few hundred lines.  I'm pretty confident about that without even seeing it.  Repetitive code covering the 12 periods is a tip-off, as is the sheer number of lines.

    If I had to tackle this, I would be breaking it down into the smallest pieces first, which means building up the joins one at a time, just as David's reply suggests.  Do it for one period at first.  Build up the embedded SELECTS (subqueries) from scratch.  I keep copies of each version of the statement in a plain text file while I'm working on it, with comments, so I can always go back to what worked.  Once you get to the top, you may see a good way to simplify the whole statement (or not).  Reimplementing it as a stored procedure may be more maintainable, possibly at the cost of some performance.  There are performance arguments against the use of cursors and/or temp tables, but if the result is fast enough and also human-readable, that's an argument, too.  I would give the bosses estimates and options at each step of the way, so they can start thinking about the maintenance cost issue before all the time and money is down the drain, rather than afterwards.




    • Edited by John Smith 3 Tuesday, June 25, 2013 1:45 PM blah
    Tuesday, June 25, 2013 1:36 PM
  • Many thanks one and all.
    Wednesday, June 26, 2013 11:06 AM