SQL Server Developer Center > SQL Server Forums > Transact-SQL > Cursor Looking At Next Row Without Moving Cursor Forward
Ask a questionAsk a question
 

AnswerCursor Looking At Next Row Without Moving Cursor Forward

  • Tuesday, August 04, 2009 8:27 PMJustinK101 Users MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     
    I need a way to loop through a query result in a stored procedure. So I know that I can use cursors. But the key thing also is that I need a way to PEEK at the next row while not moving the cursor forward. So basically a way to do current.nex() which returns the next row without moving the cursor forward. I know in Oracle this is called lag and lead. Wondering if there is an equivalent in SQL? We are running SQL 2005. Thanks.

Answers

  • Saturday, August 08, 2009 2:07 AMPlamen RatchevMVP, ModeratorUsers MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     AnswerHas Code
    You cannot reference a column from a table or CTE that is not included in the FROM clause. In your case you reference the columns from the ISLANDS CTE definition which is not included in the last query. You can simply include the columns in the RANKED CTE and use them, something like this (note I also changed the mixed joins in the first CTE where you had a mix of SQL-89 and SQL-92 join syntax, it is good practice to use the SQL-92 join syntax only):

    WITH ISLANDS AS (
            SELECT V.LABEL AS LABEL,
                   L.LANDMARK_NAME AS LANDMARK_NAME,
                   H.ADDRESS + ', ' + H.CITY + ', ' + H.STATE + ' ' + H.ZIP AS LOCATION,
                   H.HISTORY_ID,
                   H.GPS_TIME,
                   H.MESSAGE_STATUS,
                   ROW_NUMBER() OVER (ORDER BY H.HISTORY_ID) - 
                   ROW_NUMBER() OVER (PARTITION BY H.MESSAGE_STATUS ORDER BY H.HISTORY_ID) AS GRPBY
              FROM [Production].[dbo].VEHICLE_HISTORY AS H 
              JOIN [Production].[dbo].VEHICLES AS V
                ON V.VEHICLE_ID = H.VEHICLE_ID
              LEFT OUTER JOIN [Production].[dbo].LANDMARKS AS L 
                ON H.LANDMARK_ID = L.LANDMARK_ID,
             WHERE H.VEHICLE_ID = @VEHICLE_ID
               AND GPS_TIME BETWEEN @START_DATE AND DATEADD(DAY, 1, @END_DATE)
               AND MESSAGE_STATUS IN ('Ignition ON', 'Ignition OFF')), 
    RANKED_ISLANDS AS (
                SELECT HISTORY_ID,
                       LABEL,
                       LANDMARK_NAME,
                       LOCATION,
                       GPS_TIME,
                       MESSAGE_STATUS,
                       ROW_NUMBER() OVER (PARTITION BY MESSAGE_STATUS, GRPBY 
                                          ORDER BY 
                            CASE
                                WHEN MESSAGE_STATUS = 'Ignition OFF'
                                THEN HISTORY_ID
                            END,
                            CASE WHEN MESSAGE_STATUS = 'Ignition ON'
                                 THEN HISTORY_ID
                            END DESC) AS ROW_NUMBER
            FROM ISLANDS
        ), RANKED AS (
            SELECT GPS_TIME,
                   LABEL,
                   LANDMARK_NAME,
                   LOCATION,
                   ROW_NUMBER() OVER (ORDER BY HISTORY_ID) AS ROW_KEY,
                   ROW_NUMBER() OVER (ORDER BY HISTORY_ID) / 2 AS GRPBY,
                   ROW_NUMBER() OVER (ORDER BY HISTORY_ID) % 2 AS COL
              FROM RANKED_ISLANDS
             WHERE ROW_NUMBER = 1
        ), GROUPS AS (
            SELECT MIN(ROW_KEY) AS ROW_KEY,
                   MAX(CASE WHEN COL = 1 THEN GPS_TIME END) AS DEPARTURE_TIME,
                   MAX(CASE WHEN COL = 0 THEN GPS_TIME END) AS ARRIVAL_TIME
              FROM RANKED
          GROUP BY GRPBY)
    
            SELECT RANKED.LABEL,
                   RANKED.LANDMARK_NAME,
                   RANKED.LOCATION,
                   GROUPS.ARRIVAL_TIME,
                   GROUPS.DEPARTURE_TIME,
                  (GROUPS.DEPARTURE_TIME - GROUPS.ARRIVAL_TIME) AS STOP_TIME,
                  (RANKED.GPS_TIME - GROUPS.DEPARTURE_TIME) AS TRIP_TIME
              FROM GROUPS
         LEFT JOIN RANKED
                ON GROUPS.ROW_KEY = RANKED.ROW_KEY - CASE WHEN GROUPS.ROW_KEY = 1 THEN 1 ELSE 2 END;

    Plamen Ratchev

All Replies

  • Tuesday, August 04, 2009 8:32 PMPesoMVPUsers MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     
    There is currently no such support.
    Please tell us why you need this awkward approach.
  • Tuesday, August 04, 2009 8:37 PMKent WaldropMVPUsers MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     
    Presently there are no SQL Server equivalent to the Oracle LAG and LEAD analytic functions along with a number of other Oracle analytic scenarios.
  • Tuesday, August 04, 2009 8:39 PMPlamen RatchevMVP, ModeratorUsers MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     Has Code
    You can use the INSERT...EXEC syntax to dump the stored procedure result set into a table. Then you can use the ranking functions to rank the rows by some expression and using a self join lookup the next row. A sketch here:

    INSERT INTO Table(<columns>)
    EXEC sp <params>;
    
    WITH CTE AS (
    SELECT col1, col2, ROW_NUMBER() OVER(ORDER BY (SELECT 1)) AS rk
    FROM Table)
    SELECT Current.col1, Current.col2, Next.col1, Next.col2
    FROM CTE AS Current
    LEFT JOIN CTE AS Next
      ON Current.rk = Next.rk - 1;


    Plamen Ratchev
  • Tuesday, August 04, 2009 8:41 PMPesoMVPUsers MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     
  • Tuesday, August 04, 2009 10:16 PMV4MC Users MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     

    You could use 2 set of variables and keep reassigning the old set with the new set before moving the cursor forward.
    What are you trying to accomplish?There might be better ways of doing with out using the cursor.

  • Tuesday, August 04, 2009 10:37 PMJustinK101 Users MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     
    Hello, maybe there is a different approach we can take then. Here is the algorithm we are trying to accomplish.

    Assume we have a result set with 8 rows with only one column for ease of explanation.

    The output table returned is comprised of rows built up for the rows of the original 8.

                               FINAL OUTPUT TABLE SHOULD LOOK LIKE

    1st Row:   NULL, A,      NULL,    B - A
    2nd Row:  B,      C,      C-B,     D-C
    3rd Row:   D,      E,      E-D,     F-E
    4th Row:   F,       G,      G-F,     H-G
    5th Row:   H,       NULL  NULL,   NULL




    So A,B,C,D,E,F,G,H are the rows.
  • Wednesday, August 05, 2009 1:29 AMV4MC Users MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     

    Would something like this work for you?

    Create table #tmp (id int, t varchar(1))

    insert into #tmp (id, t) values (1, 'A')
    insert into #tmp (id, t) values (2, 'B')
    insert into #tmp (id, t) values (3, 'C')
    insert into #tmp (id, t) values (4, 'D')
    insert into #tmp (id, t) values (5, 'E')
    insert into #tmp (id, t) values (6, 'F')
    insert into #tmp (id, t) values (7, 'G')
    insert into #tmp (id, t) values (8, 'H')

    SELECT
    v.t
    ,v1.t
    ,v1.t + '-' + v.t
    ,v2.t + '-' + v1.t
    FROM
    (
    SELECT
    row_number() OVER (ORDER BY t) a
    ,t
    FROM
    #tmp
    WHERE
    id % 2 = 0
    ) v
    FULL JOIN (
    SELECT
    row_number() OVER (ORDER BY t) a
    ,t
    FROM
    #tmp
    WHERE
    id % 2 = 1
    ) v1
    ON v.a = v1.a - 1
    LEFT JOIN (
    SELECT
    row_number() OVER (ORDER BY t) a
    ,t
    FROM
    #tmp
    WHERE
    id % 2 = 0
    ) v2
    ON v2.a = v1.a
    ORDER BY
    v.a

    • Edited byV4MC Wednesday, August 05, 2009 1:31 AM
    •  
  • Wednesday, August 05, 2009 4:00 AMPlamen RatchevMVP, ModeratorUsers MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     Proposed AnswerHas Code
    Here is one solution:

    CREATE TABLE Foo (
     row_key CHAR(1) PRIMARY KEY,
     row_value INT);
     
    INSERT INTO Foo VALUES('A', 10);
    INSERT INTO Foo VALUES('B', 15);
    INSERT INTO Foo VALUES('C', 20);
    INSERT INTO Foo VALUES('D', 26);
    INSERT INTO Foo VALUES('E', 30);
    INSERT INTO Foo VALUES('F', 37);
    INSERT INTO Foo VALUES('G', 40);
    INSERT INTO Foo VALUES('H', 48);
    
    WITH Ranked AS (
    SELECT row_value, 
           ROW_NUMBER() OVER(ORDER BY row_key) AS rk,
           ROW_NUMBER() OVER(ORDER BY row_key) / 2 AS grp,
           ROW_NUMBER() OVER(ORDER BY row_key) % 2 AS col
    FROM Foo),
    Groups AS (
    SELECT MIN(rk) AS rk,
           MAX(CASE WHEN col = 1 THEN row_value END) AS cur_value,
           MAX(CASE WHEN col = 0 THEN row_value END) AS prev_value
    FROM Ranked
    GROUP BY grp)
    SELECT G.prev_value, G.cur_value, R.row_value AS next_value,
           G.cur_value - G.prev_value AS cur_prev_value,
           R.row_value - G.cur_value AS next_cur_value
    FROM Groups AS G
    LEFT JOIN Ranked AS R
      ON G.rk = R.rk - CASE WHEN G.rk = 1 THEN 1 ELSE 2 END;
    
    /*
    
    prev_value  cur_value   next_value  cur_prev_value next_cur_value
    ----------- ----------- ----------- -------------- --------------
    NULL        10          15          NULL           5
    15          20          26          5              6
    26          30          37          4              7
    37          40          48          3              8
    48          NULL        NULL        NULL           NULL
    
    */

    Plamen Ratchev
    • Proposed As Answer byNaom Wednesday, August 05, 2009 4:38 AM
    •  
  • Wednesday, August 05, 2009 8:56 AMJustinK101 Users MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     

    Plamen,

    Thanks for the reply. Let me run this by the other developers and see if this indeed works in every case. By the way, your CRAZY smart, we tried to figure this out for hours on a white board how to write this in PURE T-SQL, but gave up. Good stuff. I will keep you posted.
  • Wednesday, August 05, 2009 12:03 PMV4MC Users MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     
    Justin,
    Have you tried my solution, you would get similar results.
  • Wednesday, August 05, 2009 12:17 PMShamas Saeed Users MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     
    there is no such support in sql server but you can do this by involving variables and looping the clause
    Shamas Saeed MCITP-DBD 2005 http://sqlservercoollinks.blogspot.com
  • Thursday, August 06, 2009 7:13 PMJustinK101 Users MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     
    Plamen,

    Thank you very much. I got your T-SQL implemented but there are a bit more details that I am struggling with. Basically what my algorithm is doing is selecting dates and the difference between them. So in your example 10, 15, 20(row_value) are acutally dates. But that doesn't really matter, your query still works. The problem comes from another column that I need to check for each iteration of each row.

    Let me define the table structure that I am selecting the data from:

    [code]
    CREATE TABLE Foo (
       history_id INT PRIMARY KEY,
       gps_time DATETIME,
       status VARCHAR(3)
    );
    [/code]

    So history_id is your row_key, gps_time is your row_value. The last column, status is what needs to be checked for every iteration. Status stores either, ignition 'on', or ignition 'off'.

    I am going to try to explain the problem, but terribly sorry if I am confusing. So what needs to happen is the following checks:

    Start at first row, check status, if it is 'on' and the next row is 'off' continue as normal and do the algorithm you wrote above.

      If the first row status = 'off' continue iterating through the rows until you find the first 'on'.
      If the first row status was 'on' but the next row was 'on' as well, skip the first row and start the algorithm at the 2nd row.

    So we always need to make sure the current row is NOT equal to the next rows status.

    Finally, if row 1 was 'on' and row 2 was 'off' everything good. But let's assume now we are at row 2 and we doing the check to make sure the next row is not equal to the current row. Row 2 is 'off' but oops, row 3 is also 'off'. In this case we must use row 2 as the off, and skip row 3, and then continue iterating look for our next 'on'.

    RULES
      If(A == ON AND B == ON)
         USE B

      IF(B == OFF AND C == OFF)
         USE B

    EXAMPLE
      A    B    C    D    E    F    G    H    I    J    K   L
      on  off   off  on   off  on  on  on   off  on on  off

    So the output table should use: A, B, D, E, H, I, K, L

    Hopefully that makes sense, let me know if you have any questions.




  • Thursday, August 06, 2009 8:49 PMPlamen RatchevMVP, ModeratorUsers MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     Has Code
    If I understand correctly, you simply need to find islands for status codes and then pick the first row in each island and use that for the further processing. Here is modified example:

    CREATE TABLE Foo (
       history_id INT PRIMARY KEY,
       gps_time DATETIME,
       status VARCHAR(3));
     
    INSERT INTO Foo VALUES(1,  '20090101', 'on');
    INSERT INTO Foo VALUES(5,  '20090102', 'on');
    INSERT INTO Foo VALUES(7,  '20090103', 'off');
    INSERT INTO Foo VALUES(8,  '20090104', 'on');
    INSERT INTO Foo VALUES(9,  '20090105', 'off');
    INSERT INTO Foo VALUES(11, '20090106', 'off');
    INSERT INTO Foo VALUES(13, '20090107', 'on');
    INSERT INTO Foo VALUES(15, '20090108', 'off');
    
    WITH Islands AS (
    SELECT history_id, gps_time, status,
           ROW_NUMBER() OVER(ORDER BY history_id) -
           ROW_NUMBER() OVER(PARTITION BY status ORDER BY history_id) AS grp
    FROM Foo),
    RankedIslands AS (
    SELECT history_id, gps_time, status,
           ROW_NUMBER() OVER(PARTITION BY status, grp ORDER BY history_id) AS rn
    FROM Islands),
    Ranked AS (
    SELECT gps_time, 
           ROW_NUMBER() OVER(ORDER BY history_id) AS rk,
           ROW_NUMBER() OVER(ORDER BY history_id) / 2 AS grp,
           ROW_NUMBER() OVER(ORDER BY history_id) % 2 AS col
    FROM RankedIslands
    WHERE rn = 1),
    Groups AS (
    SELECT MIN(rk) AS rk,
           MAX(CASE WHEN col = 1 THEN gps_time END) AS cur_value,
           MAX(CASE WHEN col = 0 THEN gps_time END) AS prev_value
    FROM Ranked
    GROUP BY grp)
    SELECT G.prev_value, G.cur_value, R.gps_time AS next_value,
           DATEDIFF(DAY, G.cur_value, G.prev_value) AS cur_prev_value,
           DATEDIFF(DAY, R.gps_time, G.cur_value) AS next_cur_value
    FROM Groups AS G
    LEFT JOIN Ranked AS R
      ON G.rk = R.rk - CASE WHEN G.rk = 1 THEN 1 ELSE 2 END;
    
    /*
    
    prev_value cur_value  next_value cur_prev_value next_cur_value
    ---------- ---------- ---------- -------------- --------------
    NULL       2009-01-01 2009-01-03 NULL           -2
    2009-01-03 2009-01-04 2009-01-05 -1             -1
    2009-01-05 2009-01-07 2009-01-08 -2             -1
    2009-01-08 NULL       NULL       NULL           NULL
    
    */

    Plamen Ratchev
  • Thursday, August 06, 2009 9:19 PMJustinK101 Users MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     
    Plamen,

    Thanks again for your help. I am trying to figure out your code but getting twisted and confused. :) Could you perhaps implement 'islands' into my current code. It is basically your original algorithm, but with different alias and also a different original select for the results. By the way, I surely owe you a beer or something for all your effort in helping me. :)

    So the original select basically says, give me the mesage_status, gps_time for a particular vehicle (vehicle_id) between start_date and end_date where the status is either 'ignition on' or 'ignition off'. That is the starting result set. The rest of the query is basically identical to your original algorithm.

    My Current Query
    WITH STOP_DETAILS AS (
            SELECT MESSAGE_STATUS,
                   GPS_TIME,
                   ROW_NUMBER() OVER (ORDER BY HISTORY_ID) AS ROW_KEY,
                   ROW_NUMBER() OVER (ORDER BY HISTORY_ID) / 2 AS GRPBY,
                   ROW_NUMBER() OVER (ORDER BY HISTORY_ID) % 2 AS COL
              FROM [Production].[dbo].VEHICLE_HISTORY
             WHERE VEHICLE_ID = @VEHICLE_ID
               AND GPS_TIME BETWEEN @START_DATE AND @END_DATE
               AND (MESSAGE_STATUS = 'Ignition ON'
                    OR MESSAGE_STATUS = 'Ignition OFF')),
           RESULTS AS (
            SELECT MIN(ROW_KEY) AS ROW_KEY,
                   MAX(CASE WHEN COL = 1 THEN GPS_TIME END) AS DEPARTURE_TIME,
                   MAX(CASE WHEN COL = 1 THEN MESSAGE_STATUS END) AS IGN_ON,
                   MAX(CASE WHEN COL = 0 THEN GPS_TIME END) AS ARRIVAL_TIME,
                   MAX(CASE WHEN COL = 0 THEN MESSAGE_STATUS END) AS IGN_OFF
              FROM STOP_DETAILS
             GROUP BY GRPBY)
            SELECT RESULTS.ARRIVAL_TIME,
                   RESULTS.IGN_OFF,
                   RESULTS.DEPARTURE_TIME,
                   RESULTS.IGN_ON,
                  (RESULTS.DEPARTURE_TIME - RESULTS.ARRIVAL_TIME) AS STOP_TIME,
                  (SD.GPS_TIME - RESULTS.DEPARTURE_TIME) AS TRIP_TIME
              FROM RESULTS
         LEFT JOIN STOP_DETAILS AS SD
                ON RESULTS.ROW_KEY = SD.ROW_KEY - CASE WHEN RESULTS.ROW_KEY = 1 THEN 1 ELSE 2
        END;
  • Thursday, August 06, 2009 9:35 PMPlamen RatchevMVP, ModeratorUsers MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     Has Code
    Island is a sequence of consecutive values. In your case that would be status. Here is how they are defined with the sample data I generated:

    INSERT INTO Foo VALUES(1,  '20090101', 'on');  -- island 1  <- pick this
    INSERT INTO Foo VALUES(5,  '20090102', 'on');  -- island 1  <- ignore
    INSERT INTO Foo VALUES(7,  '20090103', 'off'); -- island 2  <- pick this
    INSERT INTO Foo VALUES(8,  '20090104', 'on');  -- island 3  <- pick this
    INSERT INTO Foo VALUES(9,  '20090105', 'off'); -- island 4  <- pick this
    INSERT INTO Foo VALUES(11, '20090106', 'off'); -- island 4  <- ignore
    INSERT INTO Foo VALUES(13, '20090107', 'on');  -- island 5  <- pick this
    INSERT INTO Foo VALUES(15, '20090108', 'off'); -- island 6  <- pick this

    You simply need to add the two additional CTEs as in my example and move the predicates to filter on those parameters there.
    Plamen Ratchev
  • Thursday, August 06, 2009 9:40 PMJustinK101 Users MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     Has Code
    So in your example:



    INSERT INTO Foo VALUES(1, '20090101', 'on'); -- island 1 <- pick this
    INSERT INTO Foo VALUES(5, '20090102', 'on'); -- island 1 <- ignore
    INSERT INTO Foo VALUES(7, '20090103', 'off'); -- island 2 <- pick this
    INSERT INTO Foo VALUES(8, '20090104', 'on'); -- island 3 <- pick this
    INSERT INTO Foo VALUES(9, '20090105', 'off'); -- island 4 <- pick this
    INSERT INTO Foo VALUES(11, '20090106', 'off'); -- island 4 <- ignore
    INSERT INTO Foo VALUES(13, '20090107', 'on'); -- island 5 <- pick this
    INSERT INTO Foo VALUES(15, '20090108', 'off'); -- island 6 <- pick this


    That inst exactly right though should be:


    INSERT INTO Foo VALUES(1, '20090101', 'on'); -- island 1 <- ignore
    INSERT INTO Foo VALUES(5, '20090102', 'on'); -- island 1 <- pick this
    INSERT INTO Foo VALUES(7, '20090103', 'off'); -- island 2 <- pick this
    INSERT INTO Foo VALUES(8, '20090104', 'on'); -- island 3 <- pick this
    INSERT INTO Foo VALUES(9, '20090105', 'off'); -- island 4 <- pick this
    INSERT INTO Foo VALUES(11, '20090106', 'off'); -- island 4 <- ignore
    INSERT INTO Foo VALUES(13, '20090107', 'on'); -- island 5 <- pick this
    INSERT INTO Foo VALUES(15, '20090108', 'off'); -- island 6 <- pick this


    So its: two on's in a row, pick the second. two off's in a row, pick the first.
    • Edited byJustinK101 Thursday, August 06, 2009 9:42 PM
    •  
  • Thursday, August 06, 2009 9:49 PMPlamen RatchevMVP, ModeratorUsers MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     Has Code
    Easy to adjust the logic:

    CREATE TABLE Foo (
       history_id INT PRIMARY KEY,
       gps_time DATETIME,
       status VARCHAR(3));
     
    INSERT INTO Foo VALUES(1,  '20090101', 'on');  -- island 1  <- ignore
    INSERT INTO Foo VALUES(5,  '20090102', 'on');  -- island 1  <- pick this
    INSERT INTO Foo VALUES(7,  '20090103', 'off'); -- island 2  <- pick this
    INSERT INTO Foo VALUES(8,  '20090104', 'on');  -- island 3  <- pick this
    INSERT INTO Foo VALUES(9,  '20090105', 'off'); -- island 4  <- pick this
    INSERT INTO Foo VALUES(11, '20090106', 'off'); -- island 4  <- ignore
    INSERT INTO Foo VALUES(13, '20090107', 'on');  -- island 5  <- pick this
    INSERT INTO Foo VALUES(15, '20090108', 'off'); -- island 6  <- pick this
    
    WITH Islands AS (
    SELECT history_id, gps_time, status,
           ROW_NUMBER() OVER(ORDER BY history_id) -
           ROW_NUMBER() OVER(PARTITION BY status ORDER BY history_id) AS grp
    FROM Foo),
    RankedIslands AS (
    SELECT history_id, gps_time, status,
           ROW_NUMBER() OVER(PARTITION BY status, grp ORDER BY CASE WHEN status = 'off'
                                                                    THEN history_id
                                                               END, 
                                                               CASE WHEN status = 'on'
                                                                    THEN history_id
                                                               END DESC) AS rn
    FROM Islands),
    Ranked AS (
    SELECT gps_time, 
           ROW_NUMBER() OVER(ORDER BY history_id) AS rk,
           ROW_NUMBER() OVER(ORDER BY history_id) / 2 AS grp,
           ROW_NUMBER() OVER(ORDER BY history_id) % 2 AS col
    FROM RankedIslands
    WHERE rn = 1),
    Groups AS (
    SELECT MIN(rk) AS rk,
           MAX(CASE WHEN col = 1 THEN gps_time END) AS cur_value,
           MAX(CASE WHEN col = 0 THEN gps_time END) AS prev_value
    FROM Ranked
    GROUP BY grp)
    SELECT G.prev_value, G.cur_value, R.gps_time AS next_value,
           DATEDIFF(DAY, G.cur_value, G.prev_value) AS cur_prev_value,
           DATEDIFF(DAY, R.gps_time, G.cur_value) AS next_cur_value
    FROM Groups AS G
    LEFT JOIN Ranked AS R
      ON G.rk = R.rk - CASE WHEN G.rk = 1 THEN 1 ELSE 2 END;
    
    /*
    
    prev_value cur_value  next_value cur_prev_value next_cur_value
    ---------- ---------- ---------- -------------- --------------
    NULL       2009-01-02 2009-01-03 NULL           -1
    2009-01-03 2009-01-04 2009-01-05 -1             -1
    2009-01-05 2009-01-07 2009-01-08 -2             -1
    2009-01-08 NULL       NULL       NULL           NULL
    
    */

    Plamen Ratchev
  • Friday, August 07, 2009 4:23 AMSteve Kass Users MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     Has Code
    Hi Justin,

    You don't say what to choose if there are more than two in a row, but from an earlier example, it looks like you want the last 'off' from each string of consecutive 'off's, and the first 'on' from each string of consecutive 'on's. This should take care of it by returning each pair of rows where there first is 'on' and the second is 'off'. The FULL join sort of puts in a 0th row that's 'on' and an N+1st row that's 'off', and the IS NOT NULL clauses prevent those rows from being returned.

    with Ranked as (
      select
        row_number() over (order by history_id) as rk,
        history_id, gps_time, status
      from Foo
    ), Jogged as (
       select
         A.history_id as idA, A.gps_time as gpsA, coalesce(A.status,'on') as statusA,
         B.history_id as idB, B.gps_time as gpsB, coalesce(B.status,'off') as statusB
       from Ranked as A full outer join Ranked as B
       on A.rk + 1 = B.rk
    )
      select idA, gpsA, 'on'
      from Jogged where statusA = 'on' and statusB = 'off' and idA is not null
      union all
      select idB, gpsB, 'off'
      from Jogged where statusA = 'on' and statusB = 'off' and idB is not null
      order by 1;
    go
    
    Steve Kass
    Drew University
    http://www.stevekass.com
  • Friday, August 07, 2009 6:03 PMJustinK101 Users MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     
    Hey Steve,

    Wow your way is crazy, I will test both and see which one I can get working. :) Though I am sure they both work. In regards to multiple, the algorithm I need is:

    A   B    C    D    E
    on  on  on   on  off   =>  D,E THROW AWAY A, B, C

    A    B    C    D    E    =>  A,E  THROW AWAY B,C,D
    off  off  off   off  on
  • Friday, August 07, 2009 11:07 PMJustinK101 Users MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     
    Plamen, You are a genius. Your query worked perfectly. I am simply trying to select a few more static columns though from the original select, but I am getting ISLANDS.LABEL could not be bound. What am I doing wrong? Thanks a million.

    WITH ISLANDS AS (
            SELECT V.LABEL AS LABEL,
                   L.LANDMARK_NAME AS LANDMARK_NAME,
                   'LOCATION' = H.ADDRESS + ', ' + H.CITY + ', ' + H.STATE + ' ' + H.ZIP,
                   H.HISTORY_ID,
                   H.GPS_TIME,
                   H.MESSAGE_STATUS,
                   ROW_NUMBER() OVER (ORDER BY H.HISTORY_ID) - ROW_NUMBER() OVER (PARTITION BY H.MESSAGE_STATUS ORDER BY H.HISTORY_ID) AS GRPBY
              FROM [Production].[dbo].VEHICLE_HISTORY H LEFT OUTER JOIN [Production].[dbo].LANDMARKS L ON H.LANDMARK_ID = L.LANDMARK_ID,
                   [Production].[dbo].VEHICLES V
             WHERE H.VEHICLE_ID = @VEHICLE_ID
               AND V.VEHICLE_ID = H.VEHICLE_ID
               AND GPS_TIME BETWEEN @START_DATE AND DATEADD(DAY, 1, @END_DATE)
               AND (MESSAGE_STATUS = 'Ignition ON'
                    OR MESSAGE_STATUS = 'Ignition OFF')
        ), RANKED_ISLANDS AS (
                SELECT HISTORY_ID,
                       GPS_TIME,
                       MESSAGE_STATUS,
                       ROW_NUMBER() OVER (PARTITION BY MESSAGE_STATUS, GRPBY ORDER BY CASE
                            WHEN MESSAGE_STATUS = 'Ignition OFF'
                                THEN HISTORY_ID
                            END,
                            CASE WHEN MESSAGE_STATUS = 'Ignition ON'
                                THEN HISTORY_ID
                            END
                        DESC) AS ROW_NUMBER
            FROM ISLANDS
        ), RANKED AS (
            SELECT GPS_TIME,
                   ROW_NUMBER() OVER (ORDER BY HISTORY_ID) AS ROW_KEY,
                   ROW_NUMBER() OVER (ORDER BY HISTORY_ID) / 2 AS GRPBY,
                   ROW_NUMBER() OVER (ORDER BY HISTORY_ID) % 2 AS COL
              FROM RANKED_ISLANDS
             WHERE ROW_NUMBER = 1
        ), GROUPS AS (
            SELECT MIN(ROW_KEY) AS ROW_KEY,
                   MAX(CASE WHEN COL = 1 THEN GPS_TIME END) AS DEPARTURE_TIME,
                   MAX(CASE WHEN COL = 0 THEN GPS_TIME END) AS ARRIVAL_TIME
              FROM RANKED
          GROUP BY GRPBY)

            SELECT ISLANDS.LABEL,
                   ISLANDS.LANDMARK_NAME,
                   ISLANDS.LOCATION,
                   GROUPS.ARRIVAL_TIME,
                   GROUPS.DEPARTURE_TIME,
                  (GROUPS.DEPARTURE_TIME - GROUPS.ARRIVAL_TIME) AS STOP_TIME,
                  (RANKED.GPS_TIME - GROUPS.DEPARTURE_TIME) AS TRIP_TIME
              FROM GROUPS
         LEFT JOIN RANKED
                ON GROUPS.ROW_KEY = RANKED.ROW_KEY - CASE WHEN GROUPS.ROW_KEY = 1 THEN 1 ELSE 2
        END;
  • Saturday, August 08, 2009 2:07 AMPlamen RatchevMVP, ModeratorUsers MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     AnswerHas Code
    You cannot reference a column from a table or CTE that is not included in the FROM clause. In your case you reference the columns from the ISLANDS CTE definition which is not included in the last query. You can simply include the columns in the RANKED CTE and use them, something like this (note I also changed the mixed joins in the first CTE where you had a mix of SQL-89 and SQL-92 join syntax, it is good practice to use the SQL-92 join syntax only):

    WITH ISLANDS AS (
            SELECT V.LABEL AS LABEL,
                   L.LANDMARK_NAME AS LANDMARK_NAME,
                   H.ADDRESS + ', ' + H.CITY + ', ' + H.STATE + ' ' + H.ZIP AS LOCATION,
                   H.HISTORY_ID,
                   H.GPS_TIME,
                   H.MESSAGE_STATUS,
                   ROW_NUMBER() OVER (ORDER BY H.HISTORY_ID) - 
                   ROW_NUMBER() OVER (PARTITION BY H.MESSAGE_STATUS ORDER BY H.HISTORY_ID) AS GRPBY
              FROM [Production].[dbo].VEHICLE_HISTORY AS H 
              JOIN [Production].[dbo].VEHICLES AS V
                ON V.VEHICLE_ID = H.VEHICLE_ID
              LEFT OUTER JOIN [Production].[dbo].LANDMARKS AS L 
                ON H.LANDMARK_ID = L.LANDMARK_ID,
             WHERE H.VEHICLE_ID = @VEHICLE_ID
               AND GPS_TIME BETWEEN @START_DATE AND DATEADD(DAY, 1, @END_DATE)
               AND MESSAGE_STATUS IN ('Ignition ON', 'Ignition OFF')), 
    RANKED_ISLANDS AS (
                SELECT HISTORY_ID,
                       LABEL,
                       LANDMARK_NAME,
                       LOCATION,
                       GPS_TIME,
                       MESSAGE_STATUS,
                       ROW_NUMBER() OVER (PARTITION BY MESSAGE_STATUS, GRPBY 
                                          ORDER BY 
                            CASE
                                WHEN MESSAGE_STATUS = 'Ignition OFF'
                                THEN HISTORY_ID
                            END,
                            CASE WHEN MESSAGE_STATUS = 'Ignition ON'
                                 THEN HISTORY_ID
                            END DESC) AS ROW_NUMBER
            FROM ISLANDS
        ), RANKED AS (
            SELECT GPS_TIME,
                   LABEL,
                   LANDMARK_NAME,
                   LOCATION,
                   ROW_NUMBER() OVER (ORDER BY HISTORY_ID) AS ROW_KEY,
                   ROW_NUMBER() OVER (ORDER BY HISTORY_ID) / 2 AS GRPBY,
                   ROW_NUMBER() OVER (ORDER BY HISTORY_ID) % 2 AS COL
              FROM RANKED_ISLANDS
             WHERE ROW_NUMBER = 1
        ), GROUPS AS (
            SELECT MIN(ROW_KEY) AS ROW_KEY,
                   MAX(CASE WHEN COL = 1 THEN GPS_TIME END) AS DEPARTURE_TIME,
                   MAX(CASE WHEN COL = 0 THEN GPS_TIME END) AS ARRIVAL_TIME
              FROM RANKED
          GROUP BY GRPBY)
    
            SELECT RANKED.LABEL,
                   RANKED.LANDMARK_NAME,
                   RANKED.LOCATION,
                   GROUPS.ARRIVAL_TIME,
                   GROUPS.DEPARTURE_TIME,
                  (GROUPS.DEPARTURE_TIME - GROUPS.ARRIVAL_TIME) AS STOP_TIME,
                  (RANKED.GPS_TIME - GROUPS.DEPARTURE_TIME) AS TRIP_TIME
              FROM GROUPS
         LEFT JOIN RANKED
                ON GROUPS.ROW_KEY = RANKED.ROW_KEY - CASE WHEN GROUPS.ROW_KEY = 1 THEN 1 ELSE 2 END;

    Plamen Ratchev
  • Friday, November 06, 2009 8:22 PMJustinK101 Users MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     Has Code
    Plamen,

    We have one additional change to make to this query, and after coming to the conclusion that a cursor might be required, we wanted to make sure we weren't missing something.

    The change we need to include is the ability to run this report against a collection of vehicles, not just a single.  Below is the current iteration of our SP:

    -- Create our temp resultsets
    WITH ISLANDS AS (
        -- HISTORY is our overall data
        SELECT V.LABEL AS LABEL,
    	   L.LANDMARK_NAME AS LANDMARK_NAME,
    	   'LOCATION' = A.ADDRESS + ', ' + A.CITY + ', ' + A.STATE + ' ' + A.ZIP,
    	   A.CORRECTED_ODOMETER,
    	   A.HISTORY_ID,
    	   A.GPS_TIME,
    	   A.MESSAGE_STATUS,
               A.LATITUDE,
               A.LONGITUDE,
    	   -- Row number ordered by history id - row number partitioned over message status
    	   -- At first glance, this clever algorithm makes duplicate key on / key off messages
    	   -- have the same ROW_NUMBER value.  The partitioned row number B resets on every change of status,
    	   -- and when subtracted from the incrementing ordered row number A, halts the incrementing number
    	   -- i.e. 1,2,3,3,3,4,5,6,6,7
    	   ROW_NUMBER() OVER (ORDER BY A.GPS_TIME, A.HISTORY_ID) - ROW_NUMBER() OVER (PARTITION BY A.MESSAGE_STATUS ORDER BY A.GPS_TIME, A.HISTORY_ID) AS GROUP_NUMBER
          FROM VEHICLE_HISTORY A LEFT OUTER JOIN LANDMARKS L ON A.LANDMARK_ID = L.LANDMARK_ID,
    	   VEHICLES V
         WHERE A.VEHICLE_ID = @VEHICLE_ID
           AND V.VEHICLE_ID = A.VEHICLE_ID
           AND GPS_TIME BETWEEN @START_DATE AND DATEADD(DAY, 1, @END_DATE)
           AND A.MESSAGE_STATUS IN ('Ignition ON', 'Ignition OFF')
    ), RANKED_ISLANDS AS (
        -- RANKED_ISLANDS is our grouped information
        SELECT LABEL,
    	   LANDMARK_NAME,
    	   LOCATION,
    	   CORRECTED_ODOMETER,
    	   HISTORY_ID,
    	   GPS_TIME,
    	   MESSAGE_STATUS,
    	   LATITUDE,
    	   LONGITUDE,
    	   -- I think this collapses the duplicates into a single group, and produces
    	   -- a new sequential row number
    	   ROW_NUMBER() OVER (PARTITION BY MESSAGE_STATUS, GROUP_NUMBER ORDER BY 
                   CASE WHEN MESSAGE_STATUS = 'Ignition OFF' THEN HISTORY_ID END,
    	       CASE WHEN MESSAGE_STATUS = 'Ignition ON'	THEN HISTORY_ID	END
    	   DESC) AS ROW_NUMBER
          FROM ISLANDS
    ), RANKED AS (
        -- RANKED is a basic numeric list of what type of row it is
        SELECT GPS_TIME,
    	   CORRECTED_ODOMETER,
    	   ROW_NUMBER() OVER (ORDER BY HISTORY_ID) AS ROW_KEY, -- The basic key
    	   ROW_NUMBER() OVER (ORDER BY HISTORY_ID) / 2 AS GRPBY, -- I think this identifies pairs
    	   ROW_NUMBER() OVER (ORDER BY HISTORY_ID) % 2 AS COL -- Reflect our alternating messages
          FROM RANKED_ISLANDS
         WHERE ROW_NUMBER = 1
    ), RANKED_LOCATION AS (
        -- RANKED_LOCATION is an offset for the location information
        SELECT LABEL,
    	   LANDMARK_NAME,
    	   LOCATION,
    	   HISTORY_ID,
    	   GPS_TIME,
    	   LATITUDE,
    	   LONGITUDE,
    	   ROW_NUMBER() OVER (ORDER BY HISTORY_ID) AS ROW_KEY -- The address key
          FROM RANKED_ISLANDS
         WHERE ROW_NUMBER = 1
    ), GROUPS AS (
        -- GROUPS is a filtered list of departure and arrival times relevant to each other
        -- The alternating messages are set into GRPBY pairs
        SELECT MIN(ROW_KEY) AS ROW_KEY,
    	   MAX(CASE WHEN COL = 1 THEN GPS_TIME END) AS DEPARTURE_TIME,
    	   MAX(CASE WHEN COL = 0 THEN GPS_TIME END) AS ARRIVAL_TIME,
    	   MAX(CORRECTED_ODOMETER) AS CORRECTED_ODOMETER
          FROM RANKED
         GROUP BY GRPBY
    )
    
    -- Get our information, matching our groups row key against a previous RANKED row
    -- The case handles our first row situation, where there is no previous record (I think)
    SELECT RANKED_LOCATION.HISTORY_ID,
           RANKED_LOCATION.LABEL,
           RANKED_LOCATION.LANDMARK_NAME,
           RANKED_LOCATION.LOCATION,
           RANKED_LOCATION.LATITUDE,
           RANKED_LOCATION.LONGITUDE,
           CONVERT(VARCHAR, RANKED_LOCATION.GPS_TIME, 101) AS DATE,
           GROUPS.ARRIVAL_TIME,
           GROUPS.DEPARTURE_TIME,
           [dbo].RELATIVE_DATE(GROUPS.DEPARTURE_TIME, GROUPS.ARRIVAL_TIME) AS STOP_TIME,
           ([dbo].UNIX_TIMESTAMP(GROUPS.DEPARTURE_TIME) - [dbo].UNIX_TIMESTAMP(GROUPS.ARRIVAL_TIME)) AS STOP_TIME_SECONDS,
           [dbo].RELATIVE_DATE(RANKED.GPS_TIME, GROUPS.DEPARTURE_TIME) AS TRIP_TIME,
           ([dbo].UNIX_TIMESTAMP(RANKED.GPS_TIME) - [dbo].UNIX_TIMESTAMP(GROUPS.DEPARTURE_TIME)) AS TRIP_TIME_SECONDS,
           CAST((RANKED.CORRECTED_ODOMETER - GROUPS.CORRECTED_ODOMETER) * 0.000621371192 AS DECIMAL(12, 1)) AS DISTANCE,
           'Ignition' AS EVENT_TYPE
      FROM GROUPS LEFT JOIN RANKED
                  ON GROUPS.ROW_KEY = RANKED.ROW_KEY - CASE WHEN GROUPS.ROW_KEY = 1 THEN 1 ELSE 2 END
                  LEFT JOIN RANKED_LOCATION
                  ON GROUPS.ROW_KEY = RANKED_LOCATION.ROW_KEY
     WHERE ((dbo.UNIX_TIMESTAMP(GROUPS.DEPARTURE_TIME) - dbo.UNIX_TIMESTAMP(GROUPS.ARRIVAL_TIME)) / 60 > @STOP_LENGTH
        OR GROUPS.DEPARTURE_TIME IS NULL
        OR GROUPS.ARRIVAL_TIME IS NULL);
    

    Ultimately, we need to create islands on a per-vehicle basis, but I am at a loss as to how to get that to work.  At the end of the day, our solution was to create a cursor for each vehicle we wanted to run this report on, and loop through the cursor appending the rows it generates per vehicle to a temporary table and returning that.

    I would love to accomplish this sort of aggregation without the cursor, but I don't see how that is possible.  I figured I would head to the mountaintop and talk to the wizard again before I gave up.
  • Saturday, November 07, 2009 12:47 AMPlamen RatchevMVP, ModeratorUsers MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     
    Seems to me you can simply add the VEHICLE_ID column to all SELECT statements (as well as GROUP BY where needed) and it should handle this for all vehicles. There is nothing specific in the logic to limit this for a single vehicle. If you need more help you can provide the CREATE TABLE statements with INSERT statements for sample data and what the desired result is.
    Plamen Ratchev
  • Saturday, November 07, 2009 12:49 AMPlamen RatchevMVP, ModeratorUsers MedalsUsers MedalsUsers MedalsUsers MedalsUsers Medals
     
    You may need to add the VEHICLE_ID column to the PARTITION BY clause of the ranking functions, so all ranks are calculated per vehicle.
    Plamen Ratchev