locked
Group by Maximum Length RRS feed

  • Question

  • I have used a recursive CTE to create a view which shows me a 'path' for each of my entities.

    My data looks like:

    Company,EntityName,SponsorPath

    Company1, John Smith, Jack

    Company1,John Smith, Jack / Tom

    Company1, John Smith, Jack/Tom/Harry

     

    For some purposes, I love the fact that I have John Smith in 3 rows, and can pull a report based on SponsorPath = 'Jack' or SponsorPath = 'Jack/Tom' or SponsorPath = 'Jack/Tom/Harry'

    Other times, I only want to see the rows where the SponsorPath is the most extreme (ie, the most detailed), which would also mean the length in the longest. 

    I feel like I should be able to group the records by Max(Len(SponsorPath)) but I'm having trouble writing the code....

    Suggestions?  thank you!

     

    Sunday, August 15, 2010 6:06 PM

Answers

  • May be you need to order not by length of the path, but by the number of / in the path?

     

    with cte as (select *, 
    NumSlashes = len(SponsorPath) - len(replace(SponsorPath,'/','')), 
    row_number() over (partition by Company, EntityName
    
    order by len(SponsorPath) - len(replace(SponsorPath,'/','')) DESC) as PathRank 
    from Entities)
    
    select * from cte where PathRank = 1 -- Longest path

     


    Premature optimization is the root of all evil in programming. (c) by Donald Knuth

    Naomi Nosonovsky, Sr. Programmer-Analyst

    My blog
    • Marked as answer by mtpaper Sunday, August 15, 2010 7:38 PM
    • Edited by Naomi N Sunday, August 15, 2010 7:48 PM
    Sunday, August 15, 2010 6:23 PM

All replies

  • May be you need to order not by length of the path, but by the number of / in the path?

     

    with cte as (select *, 
    NumSlashes = len(SponsorPath) - len(replace(SponsorPath,'/','')), 
    row_number() over (partition by Company, EntityName
    
    order by len(SponsorPath) - len(replace(SponsorPath,'/','')) DESC) as PathRank 
    from Entities)
    
    select * from cte where PathRank = 1 -- Longest path

     


    Premature optimization is the root of all evil in programming. (c) by Donald Knuth

    Naomi Nosonovsky, Sr. Programmer-Analyst

    My blog
    • Marked as answer by mtpaper Sunday, August 15, 2010 7:38 PM
    • Edited by Naomi N Sunday, August 15, 2010 7:48 PM
    Sunday, August 15, 2010 6:23 PM
  • Excellent. I had to make some adjustments to get it to work:

    With CTE as (

    Select *, NumSlashes = LEN(SponsorPath) - LEN(Replace(Sponsorpath,'/','')), ROW_Number() over (Partition by Company,EntityName ORDER BY Len(SponsorPath) - LEN(Replace(Sponsorpath,'/','')) DESC) AS PathRank

    FROM Entities

    )

    Select * from CTE where PathRank = 1

    It does exactly what I need, but it is not very speedy.  But that's ok for the time being. And I was very proud of myself for being able to adjust your clever solution to get it to work -

    Thank you Naomi-

    Marion

    Sunday, August 15, 2010 7:37 PM
  • You could add another column to your recursive CTE, to store the level of the [SponsorPath] or number of pass from the root node. Then you could use this column and a ranking function to identify rows with greatest level.

    The final statement will be similar to what Naomi already posted, but no need for a trick to calculate the max length.

    AMB

    • Proposed as answer by Naomi N Sunday, August 15, 2010 7:44 PM
    Sunday, August 15, 2010 7:40 PM
  • Naomi,

    I am sure you ment "LEN(REPLACE(SponsorPath, '/', ''))", instead "len(SponsorPath, '/', '')".

     

    AMB

    Sunday, August 15, 2010 7:42 PM
  • @AMB Hunchback

    Can you please be more specific about what I would add to my recursive CTE? To store the level of the Sponsorpath? or Number of pass from the root node? And ranking function.

    Or point me towards an example of something similar, and I can probably mimic it.

    I've only been at this for a few months, so I appreciate the extra assistance -

    Marion

    PS - I think I may be figuring it out

     

     

    • Edited by mtpaper Sunday, August 15, 2010 9:02 PM spelling error
    Sunday, August 15, 2010 8:21 PM
  • Marion,

    What is the query you use to produce the starting point query? Alejandro is proposing to add Level column to that recursive CTE generation. If you have Level column, you can use it instead of the calculated column I used in the ranking function.


    Premature optimization is the root of all evil in programming. (c) by Donald Knuth

    Naomi Nosonovsky, Sr. Programmer-Analyst

    My blog
    Sunday, August 15, 2010 9:43 PM
  • Yes, I understood his suggestion, and I got it to work -

    thanks

    Monday, August 16, 2010 10:47 PM