none
Loop Optimization

    Question

  • I have a loop that I have been working on, it provides a way to assign an Advisor to each Student.  The @tblAdvisorStudents table has the Students, but when it first gets created, the txtAdvisorID value starts of as NNN, then as the loop goes, it adds a txtAdvisorID value from @tblAdvisors Table.  I was wondering if there is a better way of doing this, or if I can optimize it, because the more students you add, the longer the loop takes.  Here is the code:

     

    Code Snippet

    declare @tblAdvisingEvents table

    (

    txtAdvisingEventID nvarchar(50),

    txtTermCode nvarchar(20),

    dtmStartDate dateTime,

    dtmEndDate datetime

    )

     

    declare @tblAdvisors Table

    (

    txtAdvisorID nvarchar(50),

    txtLastName nvarchar(50),

    txtFirstName nvarchar(50),

    dtmAvailableStart datetime,

    dtmAvailableEnd datetime

    )

     

    declare @tblAdvisorStudents table

    (

    txtAdvisingEventID nvarchar(50),

    txtAdvisorID nvarchar(50),

    txtStudentID nvarchar(50)

    )

     

    Insert INTO @tblAdvisingEvents Values('af98bef6-564c-4084-bd49-36caffa1df60', '200920',

    '3/13/2008 12:00:00 AM', '6/1/2008 12:00:00 AM')

     

    Insert INTO @tblAdvisors Values('N00002838', NULL, NULL, '1/1/1900 12:00:00 AM', '1/1/2010 12:00:00 AM')

    Insert INTO @tblAdvisors Values('N00005176', NULL, NULL, '1/1/1900 12:00:00 AM', '1/1/2010 12:00:00 AM')

    Insert INTO @tblAdvisors Values('N00008289', NULL, NULL, '1/1/1900 12:00:00 AM', '1/1/2010 12:00:00 AM')

    Insert INTO @tblAdvisors Values('N00018609', NULL, NULL, '1/1/1900 12:00:00 AM', '1/1/2010 12:00:00 AM')

    Insert INTO @tblAdvisors Values('N00029813', NULL, NULL, '1/1/1900 12:00:00 AM', '1/1/2010 12:00:00 AM')

    Insert INTO @tblAdvisors Values('N00033516', NULL, NULL, '1/1/1900 12:00:00 AM', '1/1/2010 12:00:00 AM')

     

    Insert INTO @tblAdvisorStudents Values('AF98BEF6-564C-4084-BD49-36CAFFA1DF60', 'NNN', 'N00018678')

    Insert INTO @tblAdvisorStudents Values('AF98BEF6-564C-4084-BD49-36CAFFA1DF60', 'NNN', 'N00057943')

    Insert INTO @tblAdvisorStudents Values('AF98BEF6-564C-4084-BD49-36CAFFA1DF60', 'NNN', 'N00075686')

    Insert INTO @tblAdvisorStudents Values('AF98BEF6-564C-4084-BD49-36CAFFA1DF60', 'NNN', 'N00139051')

    Insert INTO @tblAdvisorStudents Values('AF98BEF6-564C-4084-BD49-36CAFFA1DF60', 'NNN', 'N00194550')

    Insert INTO @tblAdvisorStudents Values('AF98BEF6-564C-4084-BD49-36CAFFA1DF60', 'NNN', 'N00205979')

    Insert INTO @tblAdvisorStudents Values('AF98BEF6-564C-4084-BD49-36CAFFA1DF60', 'NNN', 'N00212019')

    Insert INTO @tblAdvisorStudents Values('AF98BEF6-564C-4084-BD49-36CAFFA1DF60', 'NNN', 'N00234586')

    Insert INTO @tblAdvisorStudents Values('AF98BEF6-564C-4084-BD49-36CAFFA1DF60', 'NNN', 'N00241887')

    Insert INTO @tblAdvisorStudents Values('AF98BEF6-564C-4084-BD49-36CAFFA1DF60', 'NNN', 'N00302663')

    Insert INTO @tblAdvisorStudents Values('AF98BEF6-564C-4084-BD49-36CAFFA1DF60', 'NNN', 'N00310538')

    Insert INTO @tblAdvisorStudents Values('AF98BEF6-564C-4084-BD49-36CAFFA1DF60', 'NNN', 'N00312465')

     

    while ( select count(txtStudentID) from @tblAdvisorStudents where txtAdvisorID = 'NNN' ) > 0

    begin

    update @tblAdvisorStudents

    set txtAdvisorID = (

    SELECT top 1 qryCurrent.txtAdvisorID

    FROM

    (

    select *

    from @tblAdvisors

    where GETDATE() between dtmAvailableStart and dtmAvailableEnd

    ) qryCurrent

    left join

    ( select count(txtStudentID) numStudents, txtAdvisorID

    from @tblAdvisorStudents

    where txtAdvisingEventID =

    ( select txtAdvisingEventID from @tblAdvisingEvents where GETDATE() between dtmStartDate and dtmEndDate )

    group by txtAdvisorID

    ) qryCounts on qryCurrent.txtAdvisorID = qryCounts.txtAdvisorID

    order by numStudents

    )

    where txtAdvisingEventID = ( select txtAdvisingEventID FROM @tblAdvisingEvents WHERE GETDATE() between dtmStartDate and dtmEndDate )

    and txtStudentID = (select TOP 1 txtStudentID from @tblAdvisorStudents where txtAdvisorID = 'NNN')

    end

    Select * from @tblAdvisorStudents

     

     

     

     

    Wednesday, April 02, 2008 3:44 PM

Answers

  • OK,

     

    I created the tables locally and created this temp table solution. Let me know if it works for you

     

    It works for me on a very small scale and should perform better with larger tables.

     

    Also, it evenly distributes students to advisors.

     

    Code Snippet

    CREATE TABLE #tblAdvisorsTemp

    (

    txtAdvisorID nvarchar(50),

    txtLastName nvarchar(50),

    txtFirstName nvarchar(50),

    dtmAvailableStart datetime,

    dtmAvailableEnd datetime

    )

     

    CREATE TABLE #tblStudentsTemp

    (

    txtAdvisingEventID nvarchar(50),

    txtAdvisorID nvarchar(50),

    txtStudentID nvarchar(50)

    )

     

    INSERT INTO #tblStudentsTemp

    SELECT s.* FROM tblAdvisorStudents s

    JOIN tblAdvisingEvents a

    ON s.txtAdvisingEventID = 'af98bef6-564c-4084-bd49-36caffa1df60'

     

    /*The following loop simply duplicates the advisors until there are slightly more advisors than students (or equal but not likely)*/

    WHILE (SELECT COUNT(txtAdvisorID) FROM #tblAdvisorsTemp) <= (SELECT COUNT(txtStudentID) FROM #tblStudentsTemp)

    BEGIN

    INSERT INTO #tblAdvisorsTemp

    SELECT txtAdvisorID, txtLastName, txtFirstName, dtmAvailableStart, dtmAvailableEnd FROM tblAdvisors

    WHERE GETDATE() BETWEEN dtmAvailableStart and dtmAvailableEnd

    END

     

    /*We then add an identity column to randomly assign advisors to students*/

    ALTER TABLE #tblAdvisorsTemp

    ADD ID int IDENTITY(1, 1)

     

    ALTER TABLE #tblStudentsTemp

    ADD ID int IDENTITY(1, 1)

     

    UPDATE s

    SET s.txtAdvisorID = a.txtAdvisorID

    FROM #tblStudentsTemp s

    JOIN #tblAdvisorsTemp a

    ON s.ID = a.ID

     

    SELECT * FROM #tblStudentsTemp

     

    DROP TABLE #tblAdvisorsTemp

    DROP TABLE #tblStudentsTemp

     

     

    Edit: Revise since original post. Please see revision

     

    Adam

    Thursday, April 03, 2008 4:08 PM

All replies

  • To prove existence, it is better to use EXISTS operator and not counting the number of rows that match the criteria. The first improvement will be changing:

     

    > while ( select count(txtStudentID) from @tblAdvisorStudents where txtAdvisorID = 'NNN' ) > 0

     

    while exists (select * from @tblAdvisorStudents where txtAdvisorID = 'NNN' )

    ...

     

     

    AMB

    Wednesday, April 02, 2008 4:11 PM
  • I made that change and ran it against my 984 students and it's still taking 36 seconds to run, same as before. Sad

    Wednesday, April 02, 2008 4:44 PM
  • Can you tell us a little bit more about the process of selecting the correct advisor for each student?

     

    It is hard to interpret the select / update statement you are using.

     

     

    AMB

    Wednesday, April 02, 2008 4:49 PM
  • Sure no problem, let me break it down for you.

     

    First Section ( The way I had it)

     

    Code Snippet

    while ( select count(txtStudentID) from @tblAdvisorStudents where txtAdvisorID = 'NNN' ) > 0

    begin

    update @tblAdvisorStudents

    set txtAdvisorID = (

    SELECT top 1 qryCurrent.txtAdvisorID

    FROM

     

     

     

    From this you can see that the loop will run as long as txtAdvisorID = 'NNN'.  So it will update the @tblAdvisorStudents Table where students reside and assign an AdvisorID where it has an 'NNN' on it.  It's selecting the top 1 from a derived table.

     

    Second Section

     

    Code Snippet

    (

    select *

    from @tblAdvisors

    where GETDATE() between dtmAvailableStart and dtmAvailableEnd

    ) qryCurrent

    left join

    ( select count(txtStudentID) numStudents, txtAdvisorID

    from @tblAdvisorStudents

    where txtAdvisingEventID =

    ( select txtAdvisingEventID from @tblAdvisingEvents where GETDATE() between dtmStartDate and dtmEndDate )

    group by txtAdvisorID

    ) qryCounts on qryCurrent.txtAdvisorID = qryCounts.txtAdvisorID

    order by numStudents

    )

     

     

     

    This next section shows the derived table where we get the txtAdvisorID to Update the table from the firs section above.  The first parts just gets all the values from @tblAdvisors Table. That first part is then joinned with a LEFT JOINT to the next part of the Derived Table which gets the count of the Students Assigned to a txtAdvisorID and it's order by numStudents.  So when we first start off, there aren't any students assigned an AdvisorID yet.  So it takes the TOP 1 from the Derived Table and assigns it on the Update Statement.  Now lets remember that the Derived Table is ORDER BY numStudents, so the first record that got assigned an AdvisorID goes to the bottom of the list.  So now the loop will select the next TOP 1 and brings it back to the bottom once again. 

     

     

     

    The Last Section, just a Where clause to get the desired result.

     

    Code Snippet

    where txtAdvisingEventID = ( select txtAdvisingEventID FROM @tblAdvisingEvents WHERE GETDATE() between dtmStartDate and dtmEndDate )

    and txtStudentID = (select TOP 1 txtStudentID from @tblAdvisorStudents where txtAdvisorID = 'NNN')

    end

     

     

     

     

    Wednesday, April 02, 2008 5:41 PM
  • If there is an alternative way of assigning Advisors to Students I would be more than happy to do it.  This is the only way I came up with, problem is, it takes too long as it gets bigger.  So it assigns the advisor with the least amount of students.

     

    Wednesday, April 02, 2008 6:04 PM
  • Try:

     

    Code Snippet

    declare @txtStudentID int

    declare @txtAdvisingEventID int

     

    declare c cursor local fast_forward

    for

    select

    s.txtStudentID, s.txtAdvisingEventID

    from

    @tblAdvisorStudents as s

    inner join

    (

    select txtAdvisingEventID

    FROM @tblAdvisingEvents

    WHERE GETDATE() between dtmStartDate and dtmEndDate

    ) as e

    on s.txtAdvisingEventID = e.txtAdvisingEventID

    where

    s.txtAdvisorID = 'NNN'

     

    open c

     

    while 1 = 1

    begin

    fetch next from c into @txtStudentID, @txtAdvisingEventID

     

    if @@error <> 0 or @@fetch_status <> 0 break

     

    update @tblAdvisorStudents

    set txtAdvisorID = (

    SELECT top 1 qryCurrent.txtAdvisorID

    FROM

    (

    select *

    from @tblAdvisors

    where GETDATE() between dtmAvailableStart and dtmAvailableEnd

    ) qryCurrent

    left join

    ( select count(txtStudentID) numStudents, txtAdvisorID

    from @tblAdvisorStudents

    where txtAdvisingEventID = @txtAdvisingEventID

    group by txtAdvisorID

    ) qryCounts on qryCurrent.txtAdvisorID = qryCounts.txtAdvisorID

    order by numStudents

    )

    where txtAdvisingEventID = @txtAdvisingEventID

    and txtStudentID = @txtStudentID

    end

     

    close c

    deallocate c

     

     

    You have to add the declaration of the table variables.

     

    AMB

    Wednesday, April 02, 2008 6:48 PM
  • It worked, but It ran just as long as mine does. Sad

     

    Wednesday, April 02, 2008 7:54 PM
  • Try using just the set of advisors that match the criteria "where getdate() between ...".

     

    Code Snippet

    declare @d datetime

    set @d = GETDATE()

     

    declare @t table (

    txtAdvisorID nvarchar(50) not null unique

    )

     

    insert into @t(txtAdvisorID)

    select txtAdvisorID

    FROM @tblAdvisors

    WHERE @d between dtmAvailableStart and dtmAvailableEnd

     

    declare @txtStudentID nvarchar(50)

    declare @txtAdvisingEventID nvarchar(50)

     

    declare c cursor local fast_forward

    for

    select

    s.txtStudentID, s.txtAdvisingEventID

    from

    @tblAdvisorStudents as s

    inner join

    (

    select txtAdvisingEventID

    FROM @tblAdvisingEvents

    WHERE @d between dtmStartDate and dtmEndDate

    ) as e

    on s.txtAdvisingEventID = e.txtAdvisingEventID

    where

    s.txtAdvisorID = 'NNN'

     

    open c

     

    while 1 = 1

    begin

    fetch next from c into @txtStudentID, @txtAdvisingEventID

     

    if @@error <> 0 or @@fetch_status <> 0 break

     

    update @tblAdvisorStudents

    set txtAdvisorID = (

    SELECT top 1 qryCurrent.txtAdvisorID

    FROM

    @t as qryCurrent

    left join

    (

    select count(txtStudentID) numStudents, txtAdvisorID

    from @tblAdvisorStudents

    where txtAdvisingEventID = @txtAdvisingEventID

    group by txtAdvisorID

    ) qryCounts

    on qryCurrent.txtAdvisorID = qryCounts.txtAdvisorID

    order by numStudents

    )

    where txtAdvisingEventID = @txtAdvisingEventID

    and txtStudentID = @txtStudentID

    end

     

    close c

    deallocate c

     

     

    AMB

    Wednesday, April 02, 2008 11:26 PM
  • This can be done without a loop or cursor...a single update would work fine. I'll work on it and post it later.

     

    Adam

    Wednesday, April 02, 2008 11:51 PM
  • I'm kinda curious how that would work.  The only way I imagined it was with a loop.

     

    Thursday, April 03, 2008 12:51 PM
  • OK,

     

    I created the tables locally and created this temp table solution. Let me know if it works for you

     

    It works for me on a very small scale and should perform better with larger tables.

     

    Also, it evenly distributes students to advisors.

     

    Code Snippet

    CREATE TABLE #tblAdvisorsTemp

    (

    txtAdvisorID nvarchar(50),

    txtLastName nvarchar(50),

    txtFirstName nvarchar(50),

    dtmAvailableStart datetime,

    dtmAvailableEnd datetime

    )

     

    CREATE TABLE #tblStudentsTemp

    (

    txtAdvisingEventID nvarchar(50),

    txtAdvisorID nvarchar(50),

    txtStudentID nvarchar(50)

    )

     

    INSERT INTO #tblStudentsTemp

    SELECT s.* FROM tblAdvisorStudents s

    JOIN tblAdvisingEvents a

    ON s.txtAdvisingEventID = 'af98bef6-564c-4084-bd49-36caffa1df60'

     

    /*The following loop simply duplicates the advisors until there are slightly more advisors than students (or equal but not likely)*/

    WHILE (SELECT COUNT(txtAdvisorID) FROM #tblAdvisorsTemp) <= (SELECT COUNT(txtStudentID) FROM #tblStudentsTemp)

    BEGIN

    INSERT INTO #tblAdvisorsTemp

    SELECT txtAdvisorID, txtLastName, txtFirstName, dtmAvailableStart, dtmAvailableEnd FROM tblAdvisors

    WHERE GETDATE() BETWEEN dtmAvailableStart and dtmAvailableEnd

    END

     

    /*We then add an identity column to randomly assign advisors to students*/

    ALTER TABLE #tblAdvisorsTemp

    ADD ID int IDENTITY(1, 1)

     

    ALTER TABLE #tblStudentsTemp

    ADD ID int IDENTITY(1, 1)

     

    UPDATE s

    SET s.txtAdvisorID = a.txtAdvisorID

    FROM #tblStudentsTemp s

    JOIN #tblAdvisorsTemp a

    ON s.ID = a.ID

     

    SELECT * FROM #tblStudentsTemp

     

    DROP TABLE #tblAdvisorsTemp

    DROP TABLE #tblStudentsTemp

     

     

    Edit: Revise since original post. Please see revision

     

    Adam

    Thursday, April 03, 2008 4:08 PM
  • Please try my lastest post. I revised it a dozen times...but I'll leave it alone until I hear back.

     

    Adam

    Thursday, April 03, 2008 7:17 PM
  • Thanks Adam.  It has definetly lead me in the right direction.  I need to tweak it some since new students are constantly being added so I'm trying to get it to add the new student with the advisor with the least amount of of students without touching the data already posted.

    Thursday, April 03, 2008 8:15 PM
  • Hi Adam,

     

    > INSERT INTO #tblStudentsTemp

    > SELECT s.* FROM tblAdvisorStudents s

    > JOIN tblAdvisingEvents a

    > ON s.txtAdvisingEventID = 'af98bef6-564c-4084-bd49-36caffa1df60'

     

    For just one txtAdvisingEventID, sure it works, but how do you do it for the rest of events?

     

    AMB

     

    Thursday, April 03, 2008 11:30 PM
  • Hi hunchback,

     

    It was for example only. I believe it would be:

     

    > INSERT INTO #tblStudentsTemp

    > SELECT s.* FROM tblAdvisorStudents s

    > JOIN tblAdvisingEvents a

    > ON s.txtAdvisingEventID = a.txtAdvisingEventID

     

    ...or some passed in variable.

     

    Adam

    Friday, April 04, 2008 12:29 AM
  • Adam,

     

    I would like to understand this approach better, if you allow me.

     

    1 - If you use:

     

    ON s.txtAdvisingEventID = a.txtAdvisingEventID

     

    how are you going to implement:

    WHILE (SELECT COUNT(txtAdvisorID) FROM #tblAdvisorsTemp) <= (SELECT COUNT(txtStudentID) FROM #tblStudentsTemp)

    BEGIN

    INSERT INTO #tblAdvisorsTemp

    SELECT txtAdvisorID, txtLastName, txtFirstName, dtmAvailableStart, dtmAvailableEnd FROM tblAdvisors

    WHERE GETDATE() BETWEEN dtmAvailableStart and dtmAvailableEnd

    END

     

    because you have to match the number of advisors with the number of students for every EventID.

     

    2 - If you will pass a variable, how are going to get the value for the variable if not traversing the set of the distinct values. To me, this sound like looping, doesn't it?

     

     

    AMB

    Friday, April 04, 2008 1:21 AM
  • Ok Alexandro,

     

    I get your point and it's a good catch...but it can still be resolved without a loop simply using the dates.

     

    The student temp table would need to accept a variable.

     

    The advisors query then would have to include a filter that somehow communicates with date associated with the event variable to determine the dates.

     

    So the student temp table would be something like:

    INSERT INTO #tblStudentsTemp

    SELECT s.* FROM tblAdvisorStudents s

    JOIN tblAdvisingEvents a

    ON s.txtAdvisingEventID = @EventID

     

    and the advisors would be something like:

     

    WHILE (SELECT COUNT(txtAdvisorID) FROM #tblAdvisorsTemp) <= (SELECT COUNT(txtStudentID) FROM #tblStudentsTemp)

    BEGIN

    INSERT INTO #tblAdvisorsTemp

    SELECT txtAdvisorID, txtLastName, txtFirstName, dtmAvailableStart, dtmAvailableEnd FROM tblAdvisors a

    WHERE GETDATE() BETWEEN dtmAvailableStart and dtmAvailableEnd

    AND ((SELECT dtmStartDate FROM tblAdvisingEvents WHERE txtAdvisingEventID = @EventID)

    BETWEEN a.dtmAvailableStart and a.dtmAvailableEnd)

    AND ((SELECT dtmEndDate FROM tblAdvisingEvents WHERE txtAdvisingEventID = @EventID)

    BETWEEN a.dtmAvailableStart and a.dtmAvailableEnd)

    END

     

    Again...that should do it but it's untested

     

    Edit: I tested it with the records supplied and it executed fine. Whether or not it will work with more records, we'll see. And I'm sure the dates comparisons are flawed, but do I really to do 'all' the work?

     

    Adam

    Friday, April 04, 2008 1:59 AM
  • Adam,

     

    ok, let us assume that you that code inside a stored procedure that accept an EventID parameter; still you need to traverse all the EventIDs and execute the stored procedure for each one of them. how are you going to traverse the set of eventids and execute the sp for each one of them if you do not want to use a loop?

     

     

    AMB

    Friday, April 04, 2008 12:47 PM
  • Alejandro,

     

    This query will be run once per trimester per 1 event. I think the OP neglected to mention that (look at date range of the event.). If it were not the case, and several events needed to be processed, you are right...you would need some sort of looping structure.

     

    Edit: But the original structure was poorly written. My suggestion offered an optimized approach even if an additional outer loop was required.

     

    Adam

    Friday, April 04, 2008 3:55 PM