none
Ranking Functions Not Windowing As Expected

    Question

  • Hi all,
    I'm new to ranking functions and windowing.  Please excuse the simplicity of this post - but I'm failing to grasp why I'm receiving the output I am from SQL 2008 SP2 Dev x64, SQL 2008 SP1 Std x64, and SQL 2008R2 Std x64.  (This tells me I'm interpreting the function wrong!)
    I'm running this set of statements:
    CREATE TABLE RankTest
    (
     OrderBy1 INT NOT NULL,
     Value1 CHAR(1) NOT NULL
    )
    
    INSERT INTO RankTest (OrderBy1, Value1) VALUES (1, 'A')
    INSERT INTO RankTest (OrderBy1, Value1) VALUES (2, 'A')
    INSERT INTO RankTest (OrderBy1, Value1) VALUES (3, 'B')
    INSERT INTO RankTest (OrderBy1, Value1) VALUES (4, 'B')
    INSERT INTO RankTest (OrderBy1, Value1) VALUES (5, 'A')
    INSERT INTO RankTest (OrderBy1, Value1) VALUES (6, 'A')
    
    SELECT *, 
     ROW_NUMBER() OVER (PARTITION BY Value1 ORDER BY OrderBy1) AS rn,
     RANK() OVER (PARTITION BY Value1 ORDER BY OrderBy1) AS r,
     DENSE_RANK() OVER (PARTITION BY Value1 ORDER BY OrderBy1) AS dr 
     FROM RankTest
     ORDER BY OrderBy1
     
    DROP TABLE RankTest
    
    
    From the SELECT, I'm expecting to see (from all of the ranking functions) is a nice series of "1, 2, 1, 2, 1, 2", like this:
    OrderBy1 Value1 rn r dr
    1 A 1 1 1
    2 A 2 2 2
    3 B 1 1 1
    4 B 2 2 2
    5 A 1 1 1
    6 A 2 2 2
    But instead, I'm getting this:
    OrderBy1 Value1 rn r dr
    1 A 1 1 1
    2 A 2 2 2
    3 B 1 1 1
    4 B 2 2 2
    5 A 3 3 3
    6 A 4 4 4

    Why is the partition not working like my brain says it should? Doesn't the ORDER BY in the OVER clause mean that the ranking functions should "start over" in their counting when they see the "A" value in row 5?  Or does the PARTITION BY operate "first" (to group the rows) and then the ORDER BY merely orders the contents of the partitions?

    How can I modify my query to return the results I'm looking for?

    Thanks!


    Todd McDermid's Blog Talk to me now on

    Tuesday, April 26, 2011 6:03 PM

Answers

  • The windowing functions are working correctly, one way to get what you want is:

    With cte As
    (SELECT *,
     ROW_NUMBER() OVER (ORDER BY OrderBy1) - ROW_NUMBER() OVER (PARTITION BY Value1 ORDER BY OrderBy1) AS Island
     --RANK() OVER (PARTITION BY Value1 ORDER BY OrderBy1) AS r,
     --DENSE_RANK() OVER (PARTITION BY Value1 ORDER BY OrderBy1) AS dr
     FROM RankTest)
    Select *,
     ROW_NUMBER() OVER (PARTITION BY Value1, Island ORDER BY OrderBy1) As rn,
     RANK() OVER (PARTITION BY Value1, Island ORDER BY OrderBy1) AS r,
     DENSE_RANK() OVER (PARTITION BY Value1, Island ORDER BY OrderBy1) AS dr
     From cte
     ORDER BY OrderBy1;
    

    Tom

    Tuesday, April 26, 2011 6:39 PM

All replies

  • The ranking functions are working as expected.  I am afraid that I am not awake on this one.  To me it looks like the best solution might be to use a cursor.  Additional help please?

    Here is a method that is more-or-less a triangular join method; however, I think this will be slower than a cursor-based version:

    declare @RankTest table
    ( OrderBy1 INT NOT NULL,
     Value1 CHAR(1) NOT NULL)
    
    INSERT INTO @RankTest 
    select 1, 'A' union all select 2, 'A' union all 
    select 3, 'B' union all select 4, 'B' union all
    select 5, 'A' union all select 6, 'A'
    
    select
     a.orderBy1,
     a.value1,
     count(*)
    from @rankTest a
    join @rankTest b
     on b.value1 = a.value1
     and b.orderBy1 <= a.orderBy1
     and not exists
      ( select 0 from @rankTest c
       where c.value1 <> a.value1
        and c.orderBy1 between b.orderBy1 and a.orderBy1
      )
    group by a.orderBy1, a.value1
    
    /* -------- Output: --------
    orderBy1  value1 
    ----------- ------ -----------
    1      A   1
    2      A   2
    5      A   1
    6      A   2
    3      B   1
    4      B   2
    
    (6 row(s) affected)
    */

     ( This is a case where the ORACLE "Lag" or "Last" analytic functions might be helpful. )



    Tuesday, April 26, 2011 6:16 PM
    Moderator
  • To get 1,2,1,2,1,2 like that the values for rows 5 & 6 would need to be 'C'. 

    This is going to get you pretty close to what you are looking for but if you really need it to reset back to 1 on row 5 you'll have to have the Value1 column be different than either of the previous sets.  ROW_NUMBER() OVER (PARTITION BY Value1 ORDER BY OrderBy1) AS rn

    I went through a similare problem last week and went as far as putting one ROW_NUMBER in a sub-select because you can't for example do DENSE_RANK(ROW_NUMBER() OVER (PARTITION BY Value1 ORDER BY OrderBy1)) AS rn.  It's not allowed.

     


    -- Aaron Nelson.
    Tuesday, April 26, 2011 6:36 PM
  • The windowing functions are working correctly, one way to get what you want is:

    With cte As
    (SELECT *,
     ROW_NUMBER() OVER (ORDER BY OrderBy1) - ROW_NUMBER() OVER (PARTITION BY Value1 ORDER BY OrderBy1) AS Island
     --RANK() OVER (PARTITION BY Value1 ORDER BY OrderBy1) AS r,
     --DENSE_RANK() OVER (PARTITION BY Value1 ORDER BY OrderBy1) AS dr
     FROM RankTest)
    Select *,
     ROW_NUMBER() OVER (PARTITION BY Value1, Island ORDER BY OrderBy1) As rn,
     RANK() OVER (PARTITION BY Value1, Island ORDER BY OrderBy1) AS r,
     DENSE_RANK() OVER (PARTITION BY Value1, Island ORDER BY OrderBy1) AS dr
     From cte
     ORDER BY OrderBy1;
    

    Tom

    Tuesday, April 26, 2011 6:39 PM
  • Tom - that technique looks like it's going to work for me.  Thanks everyone for confirming what I thought I was seeing!
    Todd McDermid's Blog Talk to me now on
    Tuesday, April 26, 2011 7:00 PM
  • Think of it this way... The ranking is taking place before the SQL statement's  ORDER BY clause. So the last line, has no bearing on any of these functions. It's only affecting the final display order.

    The ranking criteria is 100% in the OVER clause of the given function, with the "PARTITION BY" columns being sorted first, followed by the "ORDER BY" columns.

    So based on your example, in all 3 cases, the ranking order is...

    OrderBy1	Value1	r
    1		A	1
    2		A	2
    5		A	3
    6		A	4
    3		B	1
    4		B	2
    

    With [Value1] being sorted 1st followed by [OrderBy1].

     


    Jason Long
    Tuesday, April 26, 2011 7:06 PM
  • Todd,

    > Or does the PARTITION BY operate "first" (to group the rows) and then the ORDER BY merely orders the contents of the partitions?

    Bingo!!!

    > How can I modify my query to return the results I'm looking for?

    Tom already answered your question, so I am going to add little. You are looking to rank the rows not only by the [Value1], if not also by the island (consecutive rows) based on the values of [OrderBy1].

    Island OrderBy1 Value1
    1 1 A
    1 2 A
    1 3 B
    1 4 B
    2 5 A
    2 6 A

    If there is no gap in the [OrderBy1] sequence, then you can use:

    [OrderBy1] - ROW_NUMBER() OVER(PARTITION BY Value1 ORDER BY [OrderBy1]) AS Island

    otherwise you use:

    ROW_NUMBER() OVER(ORDER BY [OrderBy1]) - ROW_NUMBER() OVER(PARTITION BY [Value1] ORDER BY [OrderBy1]) AS Island

    Then you rank the result set by:

    ranking_function() OVER(PARTITION BY [Value1], Island ORDER BY [OrderBy1])

     

    I learned this trick from Itzik Ben-Gan, and you can read more about ranking functions in his last book about T-SQL Querying.

    Inside Microsoft® SQL Server® 2008: T-SQL Querying

     


    AMB

    Some guidelines for posting questions...

    Tuesday, April 26, 2011 7:08 PM
    Moderator
  • Alternatively, Plamen Ratchev has a good blog on this topic as well which I always re-check then faced islands problem

    Refactoring Ranges


    For every expert, there is an equal and opposite expert. - Becker's Law


    My blog
    Thursday, April 28, 2011 1:45 AM
    Moderator