locked
Replacing nested views with stored procs RRS feed

  • Question

  • Hi,

    I am working on a project where nested views seem to make sense for data agregation to prevent code duplication. However I understand that nested views perform poorly. How can I replace these with stored procs?

    Here is what I want to achive simplified down. I know for the below example it looks like table 2 and 3 should be combined but in reality they contain a number of differnt columns.

    Table1
    VerId
    DeptId
    CreatorId

    Table2
    type1Id
    fkVerId
    units
    cost

    Table3
    type2Id
    fkVerId
    units
    cost

    view type1Price

    SELECT type2Id
          ,fkVerId
          ,(units * cost) Price
    FROM Table2



    view type2Price

    SELECT type2Id
          ,fkVerId
          ,(units * cost) Price
    FROM Table3



    view versionPrice

    
    SELECT VerId
          ,DeptId
          ,CreatorId
          ,SUM(v1.Price) + SUM(v2.Price) TotalPrice
    FROM Table1
    	LEFT JOIN type1Price v1
    		ON v1.fkVerId = VerId
    	LEFT JOIN type2Price v2
    		ON v2.fkVerId = VerId
    GROUP BY VerId
            ,DeptId
            ,CreatorId
    
    



    To me the above gives a nice simple way of getting the versionCost for either and individual version, a departments versions or a creators versions.

    It also allows me to get the type 1 and type 2 prices for a version from views type1Price and type2Price without duplicating the aggregation code in versionPrice.

    How and should this be replaced with stored procs?


    • Edited by DaveC1982 Friday, February 3, 2012 12:27 PM
    Friday, February 3, 2012 12:27 PM

Answers

  • The following stored procedure uses derived tables. You can also use CTE-s.

    CREATE PROC uspVersionPrice
    AS
    SELECT VerId
          ,DeptId
          ,CreatorId
          ,SUM(v1.Price) + SUM(v2.Price) TotalPrice
    FROM Table1
    	LEFT JOIN (SELECT type2Id
          ,fkVerId
          ,(units * cost) Price
    FROM Table2) v1
    		ON v1.fkVerId = VerId
    	LEFT JOIN (SELECT type2Id
          ,fkVerId
          ,(units * cost) Price
    FROM Table3) v2
    		ON v2.fkVerId = VerId
    GROUP BY VerId
            ,DeptId
            ,CreatorId
    GO
    


    Related articles:

    http://www.sqlusa.com/bestpractices2005/derivedtable/

    http://www.sqlusa.com/bestpractices2005/cte

    http://www.sqlusa.com/bestpractices2008/stored-procedure-parameters/ 


    Kalman Toth, SQL Server & Business Intelligence Training; SQL SERVER BEST PRACTICES
    • Marked as answer by DaveC1982 Friday, February 3, 2012 1:10 PM
    • Unmarked as answer by DaveC1982 Friday, February 3, 2012 1:10 PM
    • Marked as answer by DaveC1982 Friday, February 3, 2012 1:10 PM
    • Unmarked as answer by DaveC1982 Friday, February 3, 2012 3:15 PM
    • Marked as answer by DaveC1982 Friday, February 3, 2012 5:01 PM
    • Edited by Kalman Toth Friday, February 3, 2012 9:48 PM
    Friday, February 3, 2012 12:54 PM
  • Thanks,

    What if I then want to get

    SELECT type1Id
          ,fkVerId
          ,(units * cost) PriceFROM Table2

    I end up duplicating code or is this where CTE should be used?


    I am glad you asked that question because I was going to ask the same exact question.  What you need to know is that what you want to avoid for this computation is using the stored procedure that you just created!

    Why? Because you have "spectator data" that is only going to add overhead to your computations.  To some extent you have answered the other questions I was going to ask:

    How do you plan on using this stored procedure?
    How did you plan on using the view you had?

    To me it looks like you were trying to make a generic widget for solving several different problems using the same widget.  In this case this is not such a good idea.

    First your followup question implies that the abstraction that you really need is either a view or an inline function and not a procedure.  The reason that I say that is (1) you are using outer joins that can be optimized with join eliminations in case you don't want the information coming from the outer join, (2) you are at least potentially interested in part of the data from time to time and (3) you want to build a SELECT query around the data you fetch.

    First, there might be fault with my reason #1 because in this case because we are dealing with aggregates, but we also might be able to do something about that; however, reason #3 is probably the primary reason I would aim at abstracting this as a view: You want to use the data to construct some kind of select query; therefore, a view is probably the best abstraction.

    But what about this join elimination business?

    first look at this query:

    declare @testA table( aId integer, aStuff varchar(20) );
    insert into @testa
    select 1, 'Stuff #1' union all select 2, 'Stuff #2' 
    
    declare @testB table( aId integer, bValue integer );
    insert into @testb
    select 1, 14 union all select 1, 7 union all select 2, 1;
    
    declare @testC table( aId integer, cValue integer );
    insert into @testC
    select 1, 6
    
    ;with theView as
    ( select
        a.aId,
        b.bValue,
        c.cValue
      from @testA a
      left join @testb b
        on b.aId = a.aId
      left join @testC c
        on c.aId = a.aId
    )
    select * from theView
    /* -------- Output: --------
    aId         bValue      cValue
    ----------- ----------- -----------
    1           14          6
    1           7           6
    2           1           NULL
    
    (3 row(s) affected)
    */
    
    


    Hickup!  I have adventured into a place I didn't see coming; such happens from time to time.

    Kalman, would you please review your query and verify that those sums are going to be correct?




    • Edited by Kent Waldrop Friday, February 3, 2012 2:09 PM
    • Marked as answer by DaveC1982 Friday, February 3, 2012 5:01 PM
    Friday, February 3, 2012 1:50 PM
  • First, let's talk about "join elimination".  In this query I am using the two CTEs named viewB and viewC as if they were views; for the purposes of this demonstration, they will work similar to views.  Both views compute aggregates:

    declare @testA table( aId integer, aStuff varchar(20) );
    insert into @testa
    select 1, 'Stuff #1' union all select 2, 'Stuff #2' 
    
    declare @testB table( aId integer, bValue integer );
    insert into @testb
    select 1, 14 union all select 1, 7 union all select 2, 1;
    
    declare @testC table( aId integer, cValue integer );
    insert into @testC
    select 1, 6
    
    ;with viewB as
    ( select
        aId,
        sum(bValue) as bValue_Sum
      from @testB
      group by aId
    ), viewC as
    ( select
        aId,
        sum(cValue) as cValue_Sum
      from @testC
      group by aId
    )
    select
      a.aId,
    --  bValue_Sum,
      cValue_Sum
    from @testA a
    left join viewb b
      on b.aId = a.aId
    left join viewC c
      on c.aId = a.aId
    /* -------- Output: --------
    aId         cValue_Sum
    ----------- -----------
    1           6
    2           NULL
    
    (2 row(s) affected)
    */
    
    
    

    Run the query such that it includes the bValue_Sum column and then comment out the bValue_Sum column and note the difference between the execution plan.  When the bValue_Sum column is selected with this query, the execution plan includes processing of the viewB CTE; however, when the bValue_Sum column is not selected with the query, the execution plan eliminates the processinf of the viewB CTE; this is what I mean by a "join elimination".  When it comes to views with outer joins, join eliminations are your friends.  In this case I mocked this up using CTEs instead of views; however, views would function similarly.

    What to remember is this:

    The fastest way to fetch data that you do not need is to not fetch the data at all.

    That is why you want to try to provide for join eliminations if at all possible.  Now, if you abstract this as a stored procedure that possibility completely goes away.  Frequently you might have heard it said that stored procedures are the fastest way to execute something.  That is not true when you start talking about a strong possibility of an abstraction with a view with the possibility of  join eliminations.

     

    • Marked as answer by DaveC1982 Friday, February 3, 2012 4:59 PM
    Friday, February 3, 2012 3:37 PM

All replies

  • The following stored procedure uses derived tables. You can also use CTE-s.

    CREATE PROC uspVersionPrice
    AS
    SELECT VerId
          ,DeptId
          ,CreatorId
          ,SUM(v1.Price) + SUM(v2.Price) TotalPrice
    FROM Table1
    	LEFT JOIN (SELECT type2Id
          ,fkVerId
          ,(units * cost) Price
    FROM Table2) v1
    		ON v1.fkVerId = VerId
    	LEFT JOIN (SELECT type2Id
          ,fkVerId
          ,(units * cost) Price
    FROM Table3) v2
    		ON v2.fkVerId = VerId
    GROUP BY VerId
            ,DeptId
            ,CreatorId
    GO
    


    Related articles:

    http://www.sqlusa.com/bestpractices2005/derivedtable/

    http://www.sqlusa.com/bestpractices2005/cte

    http://www.sqlusa.com/bestpractices2008/stored-procedure-parameters/ 


    Kalman Toth, SQL Server & Business Intelligence Training; SQL SERVER BEST PRACTICES
    • Marked as answer by DaveC1982 Friday, February 3, 2012 1:10 PM
    • Unmarked as answer by DaveC1982 Friday, February 3, 2012 1:10 PM
    • Marked as answer by DaveC1982 Friday, February 3, 2012 1:10 PM
    • Unmarked as answer by DaveC1982 Friday, February 3, 2012 3:15 PM
    • Marked as answer by DaveC1982 Friday, February 3, 2012 5:01 PM
    • Edited by Kalman Toth Friday, February 3, 2012 9:48 PM
    Friday, February 3, 2012 12:54 PM
  • Thanks,

    What if I then want to get

    SELECT type1Id
          ,fkVerId
          ,(units * cost) PriceFROM Table2

    I end up duplicating code or is this where CTE should be used?

    Friday, February 3, 2012 1:12 PM
  • Thanks,

    What if I then want to get

    SELECT type1Id
          ,fkVerId
          ,(units * cost) PriceFROM Table2

    I end up duplicating code or is this where CTE should be used?


    I am glad you asked that question because I was going to ask the same exact question.  What you need to know is that what you want to avoid for this computation is using the stored procedure that you just created!

    Why? Because you have "spectator data" that is only going to add overhead to your computations.  To some extent you have answered the other questions I was going to ask:

    How do you plan on using this stored procedure?
    How did you plan on using the view you had?

    To me it looks like you were trying to make a generic widget for solving several different problems using the same widget.  In this case this is not such a good idea.

    First your followup question implies that the abstraction that you really need is either a view or an inline function and not a procedure.  The reason that I say that is (1) you are using outer joins that can be optimized with join eliminations in case you don't want the information coming from the outer join, (2) you are at least potentially interested in part of the data from time to time and (3) you want to build a SELECT query around the data you fetch.

    First, there might be fault with my reason #1 because in this case because we are dealing with aggregates, but we also might be able to do something about that; however, reason #3 is probably the primary reason I would aim at abstracting this as a view: You want to use the data to construct some kind of select query; therefore, a view is probably the best abstraction.

    But what about this join elimination business?

    first look at this query:

    declare @testA table( aId integer, aStuff varchar(20) );
    insert into @testa
    select 1, 'Stuff #1' union all select 2, 'Stuff #2' 
    
    declare @testB table( aId integer, bValue integer );
    insert into @testb
    select 1, 14 union all select 1, 7 union all select 2, 1;
    
    declare @testC table( aId integer, cValue integer );
    insert into @testC
    select 1, 6
    
    ;with theView as
    ( select
        a.aId,
        b.bValue,
        c.cValue
      from @testA a
      left join @testb b
        on b.aId = a.aId
      left join @testC c
        on c.aId = a.aId
    )
    select * from theView
    /* -------- Output: --------
    aId         bValue      cValue
    ----------- ----------- -----------
    1           14          6
    1           7           6
    2           1           NULL
    
    (3 row(s) affected)
    */
    
    


    Hickup!  I have adventured into a place I didn't see coming; such happens from time to time.

    Kalman, would you please review your query and verify that those sums are going to be correct?




    • Edited by Kent Waldrop Friday, February 3, 2012 2:09 PM
    • Marked as answer by DaveC1982 Friday, February 3, 2012 5:01 PM
    Friday, February 3, 2012 1:50 PM
  • The reality is that my actual system is much more complex than the above example

    I have a quote made up of versions made up of elements made up of cost types

    I want to be able to get totals for cost types, elements, versions and quotes so I went about this using nested views but I am finding that perfromance isn't very good.

    1. I create views of the cost types calculating the price based on units and cost per unit (or other sums dependent on the cost type).
    2. Use these views to create a view unioning the cost types and calculates the totals for the element.
    3. Then the I use the element view to create a view calculatlating the totals for a version suming the elements. 
    4. Finally the version view is used to create a quote view.

    In my head it made sense to use views to prevent code duplication but it just isn't working in practice. The query optimizer seems to be getting all rows from the lower views thus doing the sums on rows I don't need so for my quote view It calculates all rows for cost type then element then version then works out which rows it needs in the quote view.

    Hope this make some sense! Might need to rewrite this ramble!

     




    • Edited by DaveC1982 Friday, February 3, 2012 2:03 PM
    Friday, February 3, 2012 2:00 PM
  • Actually, this is exactly what I wanted to hear.  Views, including nested views are a tool.  Like any other tool this tool can be used or misused.  A frequent issue that comes up with views is that there is frequently a three-way tension between convenience, efficiency and size especially when you start to factor in the computation of aggregates.
    Friday, February 3, 2012 2:49 PM
  • There is a blog post by Jes Schultz Borland emphasizing this point

     

    T-SQL Wednesday #21 – T-SQL That Should Have Been Flushed Down the Toilet

     


    For every expert, there is an equal and opposite expert. - Becker's Law


    My blog

    • Edited by Naomi N Friday, February 3, 2012 3:14 PM
    Friday, February 3, 2012 3:13 PM
  • Hi,

    Thanks very much for the answers all of them very helpful in there own way! My new question is then it seems I am doing something right and something wrong how do i fix it!

    Cheers

    Friday, February 3, 2012 3:20 PM
  • First, let's talk about "join elimination".  In this query I am using the two CTEs named viewB and viewC as if they were views; for the purposes of this demonstration, they will work similar to views.  Both views compute aggregates:

    declare @testA table( aId integer, aStuff varchar(20) );
    insert into @testa
    select 1, 'Stuff #1' union all select 2, 'Stuff #2' 
    
    declare @testB table( aId integer, bValue integer );
    insert into @testb
    select 1, 14 union all select 1, 7 union all select 2, 1;
    
    declare @testC table( aId integer, cValue integer );
    insert into @testC
    select 1, 6
    
    ;with viewB as
    ( select
        aId,
        sum(bValue) as bValue_Sum
      from @testB
      group by aId
    ), viewC as
    ( select
        aId,
        sum(cValue) as cValue_Sum
      from @testC
      group by aId
    )
    select
      a.aId,
    --  bValue_Sum,
      cValue_Sum
    from @testA a
    left join viewb b
      on b.aId = a.aId
    left join viewC c
      on c.aId = a.aId
    /* -------- Output: --------
    aId         cValue_Sum
    ----------- -----------
    1           6
    2           NULL
    
    (2 row(s) affected)
    */
    
    
    

    Run the query such that it includes the bValue_Sum column and then comment out the bValue_Sum column and note the difference between the execution plan.  When the bValue_Sum column is selected with this query, the execution plan includes processing of the viewB CTE; however, when the bValue_Sum column is not selected with the query, the execution plan eliminates the processinf of the viewB CTE; this is what I mean by a "join elimination".  When it comes to views with outer joins, join eliminations are your friends.  In this case I mocked this up using CTEs instead of views; however, views would function similarly.

    What to remember is this:

    The fastest way to fetch data that you do not need is to not fetch the data at all.

    That is why you want to try to provide for join eliminations if at all possible.  Now, if you abstract this as a stored procedure that possibility completely goes away.  Frequently you might have heard it said that stored procedures are the fastest way to execute something.  That is not true when you start talking about a strong possibility of an abstraction with a view with the possibility of  join eliminations.

     

    • Marked as answer by DaveC1982 Friday, February 3, 2012 4:59 PM
    Friday, February 3, 2012 3:37 PM
  • Hi Thanks CTE was the answer changed a couple of key views to use CTE and the application is 10 times plus quicker and thats before I fully optimised them all.

    I still think my nested views were the right design as it makes the code really simple but the CTE was the missing factor.

    Will read up on CTEs and get my head around them properly but I can see that now the server is only getting the rows I need when I join on the CTE! :-) love it!

    Thank you so much.

    Cheers

    Dave


    • Edited by DaveC1982 Friday, February 3, 2012 5:28 PM
    Friday, February 3, 2012 5:00 PM
  • Ok one last question for now!

     

    Does CTE cause the whole view to be calculated or does it just make SQL use a better execution plan.

    e.g

     

    CREATE VIEW2
    AS
    WITH CTE
    AS(
         SELECT ID
                    ,SUM(VALUEA) TOTAL
         FROM VIEW1
         GROUP BY ID
    )
    
    SELECT CTE.ID
                ,CTE.TOTAL
                ,ANOTHERCOL
    FROM TABLE TB
          JOIN CTE
               ON CTE.ID = TB.fkId
    
    GO
    
    

     

    If I then select * from VIEW2 WHERE fkId = 10 does the CTE just consit of rows with ID 10 or the full result set.

    My thinking is that this has improved performance when I have 10k rows in my VIEW1 but what about when I have 1 million.

    Is there any good reading on the performance coost of CTE as this solution seems a bit too good to be true! I have good performance and nice code!

    Friday, February 3, 2012 7:46 PM
  • The first thing is:

    • Please show the predicate of View1; that is, please show all of the tables in the FROM clause of the view
    • Please show SELECT section definition of the "Id" column in View1
    • Please show the SELECT section definition for the "valueA" column in View1

    Also, understand what Kalman said in his post.  He talked about the fact that he was using a derived table rather than a CTE.  Derived tables and CTEs that are not recursive function similarly; however, CTEs also provide the ability to be re-used locally within the particular query similar to how view definitions can be re-used more globally.

    Friday, February 3, 2012 7:54 PM
  • Hi the above doesn't actually exist it is just an example to help me understand. I am just trying to work out how the CTE helps the execution plan as that is what it has done in my real application.

    I can't post the real app code as i don't own it plus I just want to understand using a simple example.

    Thanks again for all your help.

    Friday, February 3, 2012 8:08 PM
  • In that case I would say that (1) views, (2) derived tables and (3) CTEs that are not recursive, in general, optimize simiarly.  I will see if I can put together an example similar to what you are trying to do, but probably won't be able to get that tonight.
    Friday, February 3, 2012 9:30 PM
  •  

    No rush enjoy yor weekend and if you want to give more that's great but leave it till next week and have some fun. You have been lots of help.

    I have been thinking and in my case the use of CTE is clearly helping SQL server come up with a muh better execution plan and probably do a better job of join elimination.

    I have another plan which I am going to test. I think the problem may be thatt at each view level I join the tables I need. If I join all the tables I need in the first view then just do the diferent agregations in each subsequent view then SQL may be able to do better join elimination with the joins I don't use at each level.

    e.g.

    before adding the CTE the moment I had something like (again simplified down not the real world example)

     

    viewCostType
    
    SELECT fkElementId
    
                ,units * cost price
    
    FROM tableCostType
    
     
    
    viewElement
    
     
    
    SELECT elementID
    
               ,fkVersionId
    
               ,t1.anotherColumn
    
              ,sum(v1.price) price
    
    FROM tableElement
    
             JOIN viewCostType v1
    
                  ON v1.fkElementId = elementId
    
             JOIN someTable t2
    
                  ON t1.SomeKey = SomeFk
    
    GROUP BY elementID
    
                  ,fkVersionId
    
                  ,t2.anotherColumn
    
     
    
    viewVersion
    
     
    
    SELECT versionID
    
                ,fkQuoteId
    
               ,t2.anotherColumn2
    
              ,sum(v1.price) price
    
    FROM tableVersion
    
       JOIN ViewElement v1
    
            ON v1.fkVersionId= versionId
    
        JOIN someTable2 t2
    
            ON t2.SomeKey = SomeFk
    
    GROUP BY versionID
    
                    ,fkQuoteId
    
                   ,t2.anotherColumn2
    
     
    
     
    
    And so on.
    
     
    
    I am wondering if this was replaced with
    
     
    
    viewCostType
    
     
    
    SELECT fkElementId
    
              ,units * cost price
    
              ,te.elementID
    
               ,te.fkVersionId
    
               ,t1.anotherColumn
    
              ,tv.versionID
    
               ,tv.fkQuoteId
    
               ,t2.anotherColumn2
    
    FROM tableCostType
    
        JOIN  tableElement te
    
           ON te.ElementId = fkElement
    
         JOIN someTable t2
    
            ON t1.SomeKey = te.SomeFk
    
        JOIN tableVersion tv
    
          ON tv.versionId = v1.fkVersionId
    
        JOIN someTable2 t2
    
           ON t2.SomeKey = tv.SomeFk
    
     
    
    viewElement
    
    SELECT elementID
    
                ,fkVersionId
    
                ,v1.anotherColumn
    
               ,sum(v1.price) price
    
    FROM viewCostType v1
    
    GROUP BY elementID
    
                     ,fkVersionId
    
                     ,v1.anotherColumn
    
     
    
    viewVersion
    
     
    
    SELECT versionID
    
               ,fkQuoteId
    
               ,v1.anotherColumn2
    
               ,sum(v1.price) price
    
    FROM viewCostType v1
    
    GROUP BY versionID
    
                    ,fkQuoteId
    
                   ,v1.anotherColumn2
    
    

    This way any joins will be eliminated from the base view and agregations are done at each level based on grouping rather than the outcome of the layers of views. Now I need to look at this in practice as it may be i still need to join on the other layers of views as some of the calculations are much more complex than SUM() hoever perhaps these could be moved into functions to stop code duplication at each layer. Once it has been chnaged to the new structure I could then reassess if I need any CTE-s

    What do people think?

    The CTE-s give me time to rebuild my views. Unfortunatly there wasn't much time for design in this project so much was written on the fly under pressure not always thinking through the execution plans.

    P.S. lots of my JOINS are in reality LEFT JOINS I am ust writing JOIN for speed really just pseudo code.


    • Edited by DaveC1982 Saturday, February 4, 2012 10:00 AM
    Saturday, February 4, 2012 9:53 AM
  • Hello me again,

    Was getting to much spam to e-mail address I was registered with so I ditched it!

    So this has been a huge help and my application is performing well and I am happy with the design and maintainability of the code.

    I have one last question! Why do people say views and nested views are so bad? Is this one of those stock answers that doesn't apply or am I doing something really wrong?


    Is the real answer that badly designed nested views are bad!
    Thursday, February 9, 2012 8:59 AM
  • As I previously stated, views are a tool.  The reason for the stock answer is that views have been frequently over-used for situations where they really shouldn't be applied.  Nested, monster views have a tendency to have lots of "spectator data" that gets access and that otherwise serves no useful purpose to the actual query that gets executed.  This can become a performance nightmare.  Worse still is that when this does become a problem the frequent scenario is that the performance problem isn't realized until AFTER you have gone into production.

    Also, realize that join elimination happens only with outer joins.  If a view simply doesn't contain outer joins the usefulness of a "generic view" becomes small because now there are no join eliminations and the possibility of processing "spectator data" becomes high; this in turn makes the "generic view" a bad choice for problem abstraction.

    Monday, February 20, 2012 3:56 PM