Cursor Looking At Next Row Without Moving Cursor Forward
- I need a way to loop through a query result in a stored procedure. So I know that I can use cursors. But the key thing also is that I need a way to PEEK at the next row while not moving the cursor forward. So basically a way to do current.nex() which returns the next row without moving the cursor forward. I know in Oracle this is called lag and lead. Wondering if there is an equivalent in SQL? We are running SQL 2005. Thanks.
Answers
- You cannot reference a column from a table or CTE that is not included in the FROM clause. In your case you reference the columns from the ISLANDS CTE definition which is not included in the last query. You can simply include the columns in the RANKED CTE and use them, something like this (note I also changed the mixed joins in the first CTE where you had a mix of SQL-89 and SQL-92 join syntax, it is good practice to use the SQL-92 join syntax only):
WITH ISLANDS AS ( SELECT V.LABEL AS LABEL, L.LANDMARK_NAME AS LANDMARK_NAME, H.ADDRESS + ', ' + H.CITY + ', ' + H.STATE + ' ' + H.ZIP AS LOCATION, H.HISTORY_ID, H.GPS_TIME, H.MESSAGE_STATUS, ROW_NUMBER() OVER (ORDER BY H.HISTORY_ID) - ROW_NUMBER() OVER (PARTITION BY H.MESSAGE_STATUS ORDER BY H.HISTORY_ID) AS GRPBY FROM [Production].[dbo].VEHICLE_HISTORY AS H JOIN [Production].[dbo].VEHICLES AS V ON V.VEHICLE_ID = H.VEHICLE_ID LEFT OUTER JOIN [Production].[dbo].LANDMARKS AS L ON H.LANDMARK_ID = L.LANDMARK_ID, WHERE H.VEHICLE_ID = @VEHICLE_ID AND GPS_TIME BETWEEN @START_DATE AND DATEADD(DAY, 1, @END_DATE) AND MESSAGE_STATUS IN ('Ignition ON', 'Ignition OFF')), RANKED_ISLANDS AS ( SELECT HISTORY_ID, LABEL, LANDMARK_NAME, LOCATION, GPS_TIME, MESSAGE_STATUS, ROW_NUMBER() OVER (PARTITION BY MESSAGE_STATUS, GRPBY ORDER BY CASE WHEN MESSAGE_STATUS = 'Ignition OFF' THEN HISTORY_ID END, CASE WHEN MESSAGE_STATUS = 'Ignition ON' THEN HISTORY_ID END DESC) AS ROW_NUMBER FROM ISLANDS ), RANKED AS ( SELECT GPS_TIME, LABEL, LANDMARK_NAME, LOCATION, ROW_NUMBER() OVER (ORDER BY HISTORY_ID) AS ROW_KEY, ROW_NUMBER() OVER (ORDER BY HISTORY_ID) / 2 AS GRPBY, ROW_NUMBER() OVER (ORDER BY HISTORY_ID) % 2 AS COL FROM RANKED_ISLANDS WHERE ROW_NUMBER = 1 ), GROUPS AS ( SELECT MIN(ROW_KEY) AS ROW_KEY, MAX(CASE WHEN COL = 1 THEN GPS_TIME END) AS DEPARTURE_TIME, MAX(CASE WHEN COL = 0 THEN GPS_TIME END) AS ARRIVAL_TIME FROM RANKED GROUP BY GRPBY) SELECT RANKED.LABEL, RANKED.LANDMARK_NAME, RANKED.LOCATION, GROUPS.ARRIVAL_TIME, GROUPS.DEPARTURE_TIME, (GROUPS.DEPARTURE_TIME - GROUPS.ARRIVAL_TIME) AS STOP_TIME, (RANKED.GPS_TIME - GROUPS.DEPARTURE_TIME) AS TRIP_TIME FROM GROUPS LEFT JOIN RANKED ON GROUPS.ROW_KEY = RANKED.ROW_KEY - CASE WHEN GROUPS.ROW_KEY = 1 THEN 1 ELSE 2 END;
Plamen Ratchev- Marked As Answer byZongqing LiMSFT, ModeratorWednesday, August 12, 2009 6:11 AM
All Replies
- There is currently no such support.
Please tell us why you need this awkward approach. - Presently there are no SQL Server equivalent to the Oracle LAG and LEAD analytic functions along with a number of other Oracle analytic scenarios.
- You can use the INSERT...EXEC syntax to dump the stored procedure result set into a table. Then you can use the ranking functions to rank the rows by some expression and using a self join lookup the next row. A sketch here:
INSERT INTO Table(<columns>) EXEC sp <params>; WITH CTE AS ( SELECT col1, col2, ROW_NUMBER() OVER(ORDER BY (SELECT 1)) AS rk FROM Table) SELECT Current.col1, Current.col2, Next.col1, Next.col2 FROM CTE AS Current LEFT JOIN CTE AS Next ON Current.rk = Next.rk - 1;
Plamen Ratchev - Are you performing a "moving average"?
See http://weblogs.sqlteam.com/peterl/archive/2007/12/10/Cursor-is-really-faster-than-set-based-solution.aspx You could use 2 set of variables and keep reassigning the old set with the new set before moving the cursor forward.
What are you trying to accomplish?There might be better ways of doing with out using the cursor.- Hello, maybe there is a different approach we can take then. Here is the algorithm we are trying to accomplish.
Assume we have a result set with 8 rows with only one column for ease of explanation.
The output table returned is comprised of rows built up for the rows of the original 8.
FINAL OUTPUT TABLE SHOULD LOOK LIKE
1st Row: NULL, A, NULL, B - A
2nd Row: B, C, C-B, D-C
3rd Row: D, E, E-D, F-E
4th Row: F, G, G-F, H-G
5th Row: H, NULL NULL, NULL
So A,B,C,D,E,F,G,H are the rows.- Edited byJustinK101 Tuesday, August 04, 2009 11:30 PM
- Edited byJustinK101 Tuesday, August 04, 2009 11:30 PM
Would something like this work for you?
Create table #tmp (id int, t varchar(1))insert into #tmp (id, t) values (1, 'A')
insert into #tmp (id, t) values (2, 'B')
insert into #tmp (id, t) values (3, 'C')
insert into #tmp (id, t) values (4, 'D')
insert into #tmp (id, t) values (5, 'E')
insert into #tmp (id, t) values (6, 'F')
insert into #tmp (id, t) values (7, 'G')
insert into #tmp (id, t) values (8, 'H')SELECT
v.t
,v1.t
,v1.t + '-' + v.t
,v2.t + '-' + v1.t
FROM
(
SELECT
row_number() OVER (ORDER BY t) a
,t
FROM
#tmp
WHERE
id % 2 = 0
) v
FULL JOIN (
SELECT
row_number() OVER (ORDER BY t) a
,t
FROM
#tmp
WHERE
id % 2 = 1
) v1
ON v.a = v1.a - 1
LEFT JOIN (
SELECT
row_number() OVER (ORDER BY t) a
,t
FROM
#tmp
WHERE
id % 2 = 0
) v2
ON v2.a = v1.a
ORDER BY
v.a- Edited byV4MC Wednesday, August 05, 2009 1:31 AM
- Here is one solution:
CREATE TABLE Foo ( row_key CHAR(1) PRIMARY KEY, row_value INT); INSERT INTO Foo VALUES('A', 10); INSERT INTO Foo VALUES('B', 15); INSERT INTO Foo VALUES('C', 20); INSERT INTO Foo VALUES('D', 26); INSERT INTO Foo VALUES('E', 30); INSERT INTO Foo VALUES('F', 37); INSERT INTO Foo VALUES('G', 40); INSERT INTO Foo VALUES('H', 48); WITH Ranked AS ( SELECT row_value, ROW_NUMBER() OVER(ORDER BY row_key) AS rk, ROW_NUMBER() OVER(ORDER BY row_key) / 2 AS grp, ROW_NUMBER() OVER(ORDER BY row_key) % 2 AS col FROM Foo), Groups AS ( SELECT MIN(rk) AS rk, MAX(CASE WHEN col = 1 THEN row_value END) AS cur_value, MAX(CASE WHEN col = 0 THEN row_value END) AS prev_value FROM Ranked GROUP BY grp) SELECT G.prev_value, G.cur_value, R.row_value AS next_value, G.cur_value - G.prev_value AS cur_prev_value, R.row_value - G.cur_value AS next_cur_value FROM Groups AS G LEFT JOIN Ranked AS R ON G.rk = R.rk - CASE WHEN G.rk = 1 THEN 1 ELSE 2 END; /* prev_value cur_value next_value cur_prev_value next_cur_value ----------- ----------- ----------- -------------- -------------- NULL 10 15 NULL 5 15 20 26 5 6 26 30 37 4 7 37 40 48 3 8 48 NULL NULL NULL NULL */
Plamen Ratchev- Proposed As Answer byNaom Wednesday, August 05, 2009 4:38 AM
- Justin,
Have you tried my solution, you would get similar results. - there is no such support in sql server but you can do this by involving variables and looping the clause
Shamas Saeed MCITP-DBD 2005 http://sqlservercoollinks.blogspot.com - Plamen,
Thank you very much. I got your T-SQL implemented but there are a bit more details that I am struggling with. Basically what my algorithm is doing is selecting dates and the difference between them. So in your example 10, 15, 20(row_value) are acutally dates. But that doesn't really matter, your query still works. The problem comes from another column that I need to check for each iteration of each row.
Let me define the table structure that I am selecting the data from:
[code]
CREATE TABLE Foo (
history_id INT PRIMARY KEY,
gps_time DATETIME,
status VARCHAR(3)
);
[/code]
So history_id is your row_key, gps_time is your row_value. The last column, status is what needs to be checked for every iteration. Status stores either, ignition 'on', or ignition 'off'.
I am going to try to explain the problem, but terribly sorry if I am confusing. So what needs to happen is the following checks:
Start at first row, check status, if it is 'on' and the next row is 'off' continue as normal and do the algorithm you wrote above.
If the first row status = 'off' continue iterating through the rows until you find the first 'on'.
If the first row status was 'on' but the next row was 'on' as well, skip the first row and start the algorithm at the 2nd row.
So we always need to make sure the current row is NOT equal to the next rows status.
Finally, if row 1 was 'on' and row 2 was 'off' everything good. But let's assume now we are at row 2 and we doing the check to make sure the next row is not equal to the current row. Row 2 is 'off' but oops, row 3 is also 'off'. In this case we must use row 2 as the off, and skip row 3, and then continue iterating look for our next 'on'.
RULES
If(A == ON AND B == ON)
USE B
IF(B == OFF AND C == OFF)
USE B
EXAMPLE
A B C D E F G H I J K L
on off off on off on on on off on on off
So the output table should use: A, B, D, E, H, I, K, L
Hopefully that makes sense, let me know if you have any questions.
- Edited byJustinK101 Thursday, August 06, 2009 7:20 PM
- Edited byJustinK101 Thursday, August 06, 2009 7:21 PM
- If I understand correctly, you simply need to find islands for status codes and then pick the first row in each island and use that for the further processing. Here is modified example:
CREATE TABLE Foo ( history_id INT PRIMARY KEY, gps_time DATETIME, status VARCHAR(3)); INSERT INTO Foo VALUES(1, '20090101', 'on'); INSERT INTO Foo VALUES(5, '20090102', 'on'); INSERT INTO Foo VALUES(7, '20090103', 'off'); INSERT INTO Foo VALUES(8, '20090104', 'on'); INSERT INTO Foo VALUES(9, '20090105', 'off'); INSERT INTO Foo VALUES(11, '20090106', 'off'); INSERT INTO Foo VALUES(13, '20090107', 'on'); INSERT INTO Foo VALUES(15, '20090108', 'off'); WITH Islands AS ( SELECT history_id, gps_time, status, ROW_NUMBER() OVER(ORDER BY history_id) - ROW_NUMBER() OVER(PARTITION BY status ORDER BY history_id) AS grp FROM Foo), RankedIslands AS ( SELECT history_id, gps_time, status, ROW_NUMBER() OVER(PARTITION BY status, grp ORDER BY history_id) AS rn FROM Islands), Ranked AS ( SELECT gps_time, ROW_NUMBER() OVER(ORDER BY history_id) AS rk, ROW_NUMBER() OVER(ORDER BY history_id) / 2 AS grp, ROW_NUMBER() OVER(ORDER BY history_id) % 2 AS col FROM RankedIslands WHERE rn = 1), Groups AS ( SELECT MIN(rk) AS rk, MAX(CASE WHEN col = 1 THEN gps_time END) AS cur_value, MAX(CASE WHEN col = 0 THEN gps_time END) AS prev_value FROM Ranked GROUP BY grp) SELECT G.prev_value, G.cur_value, R.gps_time AS next_value, DATEDIFF(DAY, G.cur_value, G.prev_value) AS cur_prev_value, DATEDIFF(DAY, R.gps_time, G.cur_value) AS next_cur_value FROM Groups AS G LEFT JOIN Ranked AS R ON G.rk = R.rk - CASE WHEN G.rk = 1 THEN 1 ELSE 2 END; /* prev_value cur_value next_value cur_prev_value next_cur_value ---------- ---------- ---------- -------------- -------------- NULL 2009-01-01 2009-01-03 NULL -2 2009-01-03 2009-01-04 2009-01-05 -1 -1 2009-01-05 2009-01-07 2009-01-08 -2 -1 2009-01-08 NULL NULL NULL NULL */
Plamen Ratchev - Plamen,
Thanks again for your help. I am trying to figure out your code but getting twisted and confused. :) Could you perhaps implement 'islands' into my current code. It is basically your original algorithm, but with different alias and also a different original select for the results. By the way, I surely owe you a beer or something for all your effort in helping me. :)
So the original select basically says, give me the mesage_status, gps_time for a particular vehicle (vehicle_id) between start_date and end_date where the status is either 'ignition on' or 'ignition off'. That is the starting result set. The rest of the query is basically identical to your original algorithm.
My Current Query
WITH STOP_DETAILS AS (
SELECT MESSAGE_STATUS,
GPS_TIME,
ROW_NUMBER() OVER (ORDER BY HISTORY_ID) AS ROW_KEY,
ROW_NUMBER() OVER (ORDER BY HISTORY_ID) / 2 AS GRPBY,
ROW_NUMBER() OVER (ORDER BY HISTORY_ID) % 2 AS COL
FROM [Production].[dbo].VEHICLE_HISTORY
WHERE VEHICLE_ID = @VEHICLE_ID
AND GPS_TIME BETWEEN @START_DATE AND @END_DATE
AND (MESSAGE_STATUS = 'Ignition ON'
OR MESSAGE_STATUS = 'Ignition OFF')),
RESULTS AS (
SELECT MIN(ROW_KEY) AS ROW_KEY,
MAX(CASE WHEN COL = 1 THEN GPS_TIME END) AS DEPARTURE_TIME,
MAX(CASE WHEN COL = 1 THEN MESSAGE_STATUS END) AS IGN_ON,
MAX(CASE WHEN COL = 0 THEN GPS_TIME END) AS ARRIVAL_TIME,
MAX(CASE WHEN COL = 0 THEN MESSAGE_STATUS END) AS IGN_OFF
FROM STOP_DETAILS
GROUP BY GRPBY)
SELECT RESULTS.ARRIVAL_TIME,
RESULTS.IGN_OFF,
RESULTS.DEPARTURE_TIME,
RESULTS.IGN_ON,
(RESULTS.DEPARTURE_TIME - RESULTS.ARRIVAL_TIME) AS STOP_TIME,
(SD.GPS_TIME - RESULTS.DEPARTURE_TIME) AS TRIP_TIME
FROM RESULTS
LEFT JOIN STOP_DETAILS AS SD
ON RESULTS.ROW_KEY = SD.ROW_KEY - CASE WHEN RESULTS.ROW_KEY = 1 THEN 1 ELSE 2
END; - Island is a sequence of consecutive values. In your case that would be status. Here is how they are defined with the sample data I generated:
INSERT INTO Foo VALUES(1, '20090101', 'on'); -- island 1 <- pick this INSERT INTO Foo VALUES(5, '20090102', 'on'); -- island 1 <- ignore INSERT INTO Foo VALUES(7, '20090103', 'off'); -- island 2 <- pick this INSERT INTO Foo VALUES(8, '20090104', 'on'); -- island 3 <- pick this INSERT INTO Foo VALUES(9, '20090105', 'off'); -- island 4 <- pick this INSERT INTO Foo VALUES(11, '20090106', 'off'); -- island 4 <- ignore INSERT INTO Foo VALUES(13, '20090107', 'on'); -- island 5 <- pick this INSERT INTO Foo VALUES(15, '20090108', 'off'); -- island 6 <- pick this
You simply need to add the two additional CTEs as in my example and move the predicates to filter on those parameters there.
Plamen Ratchev - So in your example:
INSERT INTO Foo VALUES(1, '20090101', 'on'); -- island 1 <- pick this
INSERT INTO Foo VALUES(5, '20090102', 'on'); -- island 1 <- ignore
INSERT INTO Foo VALUES(7, '20090103', 'off'); -- island 2 <- pick this
INSERT INTO Foo VALUES(8, '20090104', 'on'); -- island 3 <- pick this
INSERT INTO Foo VALUES(9, '20090105', 'off'); -- island 4 <- pick this
INSERT INTO Foo VALUES(11, '20090106', 'off'); -- island 4 <- ignore
INSERT INTO Foo VALUES(13, '20090107', 'on'); -- island 5 <- pick this
INSERT INTO Foo VALUES(15, '20090108', 'off'); -- island 6 <- pick this
That inst exactly right though should be:
INSERT INTO Foo VALUES(1, '20090101', 'on'); -- island 1 <- ignore
INSERT INTO Foo VALUES(5, '20090102', 'on'); -- island 1 <- pick this
INSERT INTO Foo VALUES(7, '20090103', 'off'); -- island 2 <- pick this
INSERT INTO Foo VALUES(8, '20090104', 'on'); -- island 3 <- pick this
INSERT INTO Foo VALUES(9, '20090105', 'off'); -- island 4 <- pick this
INSERT INTO Foo VALUES(11, '20090106', 'off'); -- island 4 <- ignore
INSERT INTO Foo VALUES(13, '20090107', 'on'); -- island 5 <- pick this
INSERT INTO Foo VALUES(15, '20090108', 'off'); -- island 6 <- pick this
So its: two on's in a row, pick the second. two off's in a row, pick the first.- Edited byJustinK101 Thursday, August 06, 2009 9:42 PM
- Easy to adjust the logic:
CREATE TABLE Foo ( history_id INT PRIMARY KEY, gps_time DATETIME, status VARCHAR(3)); INSERT INTO Foo VALUES(1, '20090101', 'on'); -- island 1 <- ignore INSERT INTO Foo VALUES(5, '20090102', 'on'); -- island 1 <- pick this INSERT INTO Foo VALUES(7, '20090103', 'off'); -- island 2 <- pick this INSERT INTO Foo VALUES(8, '20090104', 'on'); -- island 3 <- pick this INSERT INTO Foo VALUES(9, '20090105', 'off'); -- island 4 <- pick this INSERT INTO Foo VALUES(11, '20090106', 'off'); -- island 4 <- ignore INSERT INTO Foo VALUES(13, '20090107', 'on'); -- island 5 <- pick this INSERT INTO Foo VALUES(15, '20090108', 'off'); -- island 6 <- pick this WITH Islands AS ( SELECT history_id, gps_time, status, ROW_NUMBER() OVER(ORDER BY history_id) - ROW_NUMBER() OVER(PARTITION BY status ORDER BY history_id) AS grp FROM Foo), RankedIslands AS ( SELECT history_id, gps_time, status, ROW_NUMBER() OVER(PARTITION BY status, grp ORDER BY CASE WHEN status = 'off' THEN history_id END, CASE WHEN status = 'on' THEN history_id END DESC) AS rn FROM Islands), Ranked AS ( SELECT gps_time, ROW_NUMBER() OVER(ORDER BY history_id) AS rk, ROW_NUMBER() OVER(ORDER BY history_id) / 2 AS grp, ROW_NUMBER() OVER(ORDER BY history_id) % 2 AS col FROM RankedIslands WHERE rn = 1), Groups AS ( SELECT MIN(rk) AS rk, MAX(CASE WHEN col = 1 THEN gps_time END) AS cur_value, MAX(CASE WHEN col = 0 THEN gps_time END) AS prev_value FROM Ranked GROUP BY grp) SELECT G.prev_value, G.cur_value, R.gps_time AS next_value, DATEDIFF(DAY, G.cur_value, G.prev_value) AS cur_prev_value, DATEDIFF(DAY, R.gps_time, G.cur_value) AS next_cur_value FROM Groups AS G LEFT JOIN Ranked AS R ON G.rk = R.rk - CASE WHEN G.rk = 1 THEN 1 ELSE 2 END; /* prev_value cur_value next_value cur_prev_value next_cur_value ---------- ---------- ---------- -------------- -------------- NULL 2009-01-02 2009-01-03 NULL -1 2009-01-03 2009-01-04 2009-01-05 -1 -1 2009-01-05 2009-01-07 2009-01-08 -2 -1 2009-01-08 NULL NULL NULL NULL */
Plamen Ratchev - Hi Justin,You don't say what to choose if there are more than two in a row, but from an earlier example, it looks like you want the last 'off' from each string of consecutive 'off's, and the first 'on' from each string of consecutive 'on's. This should take care of it by returning each pair of rows where there first is 'on' and the second is 'off'. The FULL join sort of puts in a 0th row that's 'on' and an N+1st row that's 'off', and the IS NOT NULL clauses prevent those rows from being returned.Steve Kass
with Ranked as ( select row_number() over (order by history_id) as rk, history_id, gps_time, status from Foo ), Jogged as ( select A.history_id as idA, A.gps_time as gpsA, coalesce(A.status,'on') as statusA, B.history_id as idB, B.gps_time as gpsB, coalesce(B.status,'off') as statusB from Ranked as A full outer join Ranked as B on A.rk + 1 = B.rk ) select idA, gpsA, 'on' from Jogged where statusA = 'on' and statusB = 'off' and idA is not null union all select idB, gpsB, 'off' from Jogged where statusA = 'on' and statusB = 'off' and idB is not null order by 1; go
Drew Universityhttp://www.stevekass.com - Hey Steve,
Wow your way is crazy, I will test both and see which one I can get working. :) Though I am sure they both work. In regards to multiple, the algorithm I need is:
A B C D E
on on on on off => D,E THROW AWAY A, B, C
A B C D E => A,E THROW AWAY B,C,D
off off off off on - Plamen, You are a genius. Your query worked perfectly. I am simply trying to select a few more static columns though from the original select, but I am getting ISLANDS.LABEL could not be bound. What am I doing wrong? Thanks a million.
WITH ISLANDS AS (
SELECT V.LABEL AS LABEL,
L.LANDMARK_NAME AS LANDMARK_NAME,
'LOCATION' = H.ADDRESS + ', ' + H.CITY + ', ' + H.STATE + ' ' + H.ZIP,
H.HISTORY_ID,
H.GPS_TIME,
H.MESSAGE_STATUS,
ROW_NUMBER() OVER (ORDER BY H.HISTORY_ID) - ROW_NUMBER() OVER (PARTITION BY H.MESSAGE_STATUS ORDER BY H.HISTORY_ID) AS GRPBY
FROM [Production].[dbo].VEHICLE_HISTORY H LEFT OUTER JOIN [Production].[dbo].LANDMARKS L ON H.LANDMARK_ID = L.LANDMARK_ID,
[Production].[dbo].VEHICLES V
WHERE H.VEHICLE_ID = @VEHICLE_ID
AND V.VEHICLE_ID = H.VEHICLE_ID
AND GPS_TIME BETWEEN @START_DATE AND DATEADD(DAY, 1, @END_DATE)
AND (MESSAGE_STATUS = 'Ignition ON'
OR MESSAGE_STATUS = 'Ignition OFF')
), RANKED_ISLANDS AS (
SELECT HISTORY_ID,
GPS_TIME,
MESSAGE_STATUS,
ROW_NUMBER() OVER (PARTITION BY MESSAGE_STATUS, GRPBY ORDER BY CASE
WHEN MESSAGE_STATUS = 'Ignition OFF'
THEN HISTORY_ID
END,
CASE WHEN MESSAGE_STATUS = 'Ignition ON'
THEN HISTORY_ID
END
DESC) AS ROW_NUMBER
FROM ISLANDS
), RANKED AS (
SELECT GPS_TIME,
ROW_NUMBER() OVER (ORDER BY HISTORY_ID) AS ROW_KEY,
ROW_NUMBER() OVER (ORDER BY HISTORY_ID) / 2 AS GRPBY,
ROW_NUMBER() OVER (ORDER BY HISTORY_ID) % 2 AS COL
FROM RANKED_ISLANDS
WHERE ROW_NUMBER = 1
), GROUPS AS (
SELECT MIN(ROW_KEY) AS ROW_KEY,
MAX(CASE WHEN COL = 1 THEN GPS_TIME END) AS DEPARTURE_TIME,
MAX(CASE WHEN COL = 0 THEN GPS_TIME END) AS ARRIVAL_TIME
FROM RANKED
GROUP BY GRPBY)
SELECT ISLANDS.LABEL,
ISLANDS.LANDMARK_NAME,
ISLANDS.LOCATION,
GROUPS.ARRIVAL_TIME,
GROUPS.DEPARTURE_TIME,
(GROUPS.DEPARTURE_TIME - GROUPS.ARRIVAL_TIME) AS STOP_TIME,
(RANKED.GPS_TIME - GROUPS.DEPARTURE_TIME) AS TRIP_TIME
FROM GROUPS
LEFT JOIN RANKED
ON GROUPS.ROW_KEY = RANKED.ROW_KEY - CASE WHEN GROUPS.ROW_KEY = 1 THEN 1 ELSE 2
END; - You cannot reference a column from a table or CTE that is not included in the FROM clause. In your case you reference the columns from the ISLANDS CTE definition which is not included in the last query. You can simply include the columns in the RANKED CTE and use them, something like this (note I also changed the mixed joins in the first CTE where you had a mix of SQL-89 and SQL-92 join syntax, it is good practice to use the SQL-92 join syntax only):
WITH ISLANDS AS ( SELECT V.LABEL AS LABEL, L.LANDMARK_NAME AS LANDMARK_NAME, H.ADDRESS + ', ' + H.CITY + ', ' + H.STATE + ' ' + H.ZIP AS LOCATION, H.HISTORY_ID, H.GPS_TIME, H.MESSAGE_STATUS, ROW_NUMBER() OVER (ORDER BY H.HISTORY_ID) - ROW_NUMBER() OVER (PARTITION BY H.MESSAGE_STATUS ORDER BY H.HISTORY_ID) AS GRPBY FROM [Production].[dbo].VEHICLE_HISTORY AS H JOIN [Production].[dbo].VEHICLES AS V ON V.VEHICLE_ID = H.VEHICLE_ID LEFT OUTER JOIN [Production].[dbo].LANDMARKS AS L ON H.LANDMARK_ID = L.LANDMARK_ID, WHERE H.VEHICLE_ID = @VEHICLE_ID AND GPS_TIME BETWEEN @START_DATE AND DATEADD(DAY, 1, @END_DATE) AND MESSAGE_STATUS IN ('Ignition ON', 'Ignition OFF')), RANKED_ISLANDS AS ( SELECT HISTORY_ID, LABEL, LANDMARK_NAME, LOCATION, GPS_TIME, MESSAGE_STATUS, ROW_NUMBER() OVER (PARTITION BY MESSAGE_STATUS, GRPBY ORDER BY CASE WHEN MESSAGE_STATUS = 'Ignition OFF' THEN HISTORY_ID END, CASE WHEN MESSAGE_STATUS = 'Ignition ON' THEN HISTORY_ID END DESC) AS ROW_NUMBER FROM ISLANDS ), RANKED AS ( SELECT GPS_TIME, LABEL, LANDMARK_NAME, LOCATION, ROW_NUMBER() OVER (ORDER BY HISTORY_ID) AS ROW_KEY, ROW_NUMBER() OVER (ORDER BY HISTORY_ID) / 2 AS GRPBY, ROW_NUMBER() OVER (ORDER BY HISTORY_ID) % 2 AS COL FROM RANKED_ISLANDS WHERE ROW_NUMBER = 1 ), GROUPS AS ( SELECT MIN(ROW_KEY) AS ROW_KEY, MAX(CASE WHEN COL = 1 THEN GPS_TIME END) AS DEPARTURE_TIME, MAX(CASE WHEN COL = 0 THEN GPS_TIME END) AS ARRIVAL_TIME FROM RANKED GROUP BY GRPBY) SELECT RANKED.LABEL, RANKED.LANDMARK_NAME, RANKED.LOCATION, GROUPS.ARRIVAL_TIME, GROUPS.DEPARTURE_TIME, (GROUPS.DEPARTURE_TIME - GROUPS.ARRIVAL_TIME) AS STOP_TIME, (RANKED.GPS_TIME - GROUPS.DEPARTURE_TIME) AS TRIP_TIME FROM GROUPS LEFT JOIN RANKED ON GROUPS.ROW_KEY = RANKED.ROW_KEY - CASE WHEN GROUPS.ROW_KEY = 1 THEN 1 ELSE 2 END;
Plamen Ratchev- Marked As Answer byZongqing LiMSFT, ModeratorWednesday, August 12, 2009 6:11 AM
- Plamen,
We have one additional change to make to this query, and after coming to the conclusion that a cursor might be required, we wanted to make sure we weren't missing something.
The change we need to include is the ability to run this report against a collection of vehicles, not just a single. Below is the current iteration of our SP:
-- Create our temp resultsets WITH ISLANDS AS ( -- HISTORY is our overall data SELECT V.LABEL AS LABEL, L.LANDMARK_NAME AS LANDMARK_NAME, 'LOCATION' = A.ADDRESS + ', ' + A.CITY + ', ' + A.STATE + ' ' + A.ZIP, A.CORRECTED_ODOMETER, A.HISTORY_ID, A.GPS_TIME, A.MESSAGE_STATUS, A.LATITUDE, A.LONGITUDE, -- Row number ordered by history id - row number partitioned over message status -- At first glance, this clever algorithm makes duplicate key on / key off messages -- have the same ROW_NUMBER value. The partitioned row number B resets on every change of status, -- and when subtracted from the incrementing ordered row number A, halts the incrementing number -- i.e. 1,2,3,3,3,4,5,6,6,7 ROW_NUMBER() OVER (ORDER BY A.GPS_TIME, A.HISTORY_ID) - ROW_NUMBER() OVER (PARTITION BY A.MESSAGE_STATUS ORDER BY A.GPS_TIME, A.HISTORY_ID) AS GROUP_NUMBER FROM VEHICLE_HISTORY A LEFT OUTER JOIN LANDMARKS L ON A.LANDMARK_ID = L.LANDMARK_ID, VEHICLES V WHERE A.VEHICLE_ID = @VEHICLE_ID AND V.VEHICLE_ID = A.VEHICLE_ID AND GPS_TIME BETWEEN @START_DATE AND DATEADD(DAY, 1, @END_DATE) AND A.MESSAGE_STATUS IN ('Ignition ON', 'Ignition OFF') ), RANKED_ISLANDS AS ( -- RANKED_ISLANDS is our grouped information SELECT LABEL, LANDMARK_NAME, LOCATION, CORRECTED_ODOMETER, HISTORY_ID, GPS_TIME, MESSAGE_STATUS, LATITUDE, LONGITUDE, -- I think this collapses the duplicates into a single group, and produces -- a new sequential row number ROW_NUMBER() OVER (PARTITION BY MESSAGE_STATUS, GROUP_NUMBER ORDER BY CASE WHEN MESSAGE_STATUS = 'Ignition OFF' THEN HISTORY_ID END, CASE WHEN MESSAGE_STATUS = 'Ignition ON' THEN HISTORY_ID END DESC) AS ROW_NUMBER FROM ISLANDS ), RANKED AS ( -- RANKED is a basic numeric list of what type of row it is SELECT GPS_TIME, CORRECTED_ODOMETER, ROW_NUMBER() OVER (ORDER BY HISTORY_ID) AS ROW_KEY, -- The basic key ROW_NUMBER() OVER (ORDER BY HISTORY_ID) / 2 AS GRPBY, -- I think this identifies pairs ROW_NUMBER() OVER (ORDER BY HISTORY_ID) % 2 AS COL -- Reflect our alternating messages FROM RANKED_ISLANDS WHERE ROW_NUMBER = 1 ), RANKED_LOCATION AS ( -- RANKED_LOCATION is an offset for the location information SELECT LABEL, LANDMARK_NAME, LOCATION, HISTORY_ID, GPS_TIME, LATITUDE, LONGITUDE, ROW_NUMBER() OVER (ORDER BY HISTORY_ID) AS ROW_KEY -- The address key FROM RANKED_ISLANDS WHERE ROW_NUMBER = 1 ), GROUPS AS ( -- GROUPS is a filtered list of departure and arrival times relevant to each other -- The alternating messages are set into GRPBY pairs SELECT MIN(ROW_KEY) AS ROW_KEY, MAX(CASE WHEN COL = 1 THEN GPS_TIME END) AS DEPARTURE_TIME, MAX(CASE WHEN COL = 0 THEN GPS_TIME END) AS ARRIVAL_TIME, MAX(CORRECTED_ODOMETER) AS CORRECTED_ODOMETER FROM RANKED GROUP BY GRPBY ) -- Get our information, matching our groups row key against a previous RANKED row -- The case handles our first row situation, where there is no previous record (I think) SELECT RANKED_LOCATION.HISTORY_ID, RANKED_LOCATION.LABEL, RANKED_LOCATION.LANDMARK_NAME, RANKED_LOCATION.LOCATION, RANKED_LOCATION.LATITUDE, RANKED_LOCATION.LONGITUDE, CONVERT(VARCHAR, RANKED_LOCATION.GPS_TIME, 101) AS DATE, GROUPS.ARRIVAL_TIME, GROUPS.DEPARTURE_TIME, [dbo].RELATIVE_DATE(GROUPS.DEPARTURE_TIME, GROUPS.ARRIVAL_TIME) AS STOP_TIME, ([dbo].UNIX_TIMESTAMP(GROUPS.DEPARTURE_TIME) - [dbo].UNIX_TIMESTAMP(GROUPS.ARRIVAL_TIME)) AS STOP_TIME_SECONDS, [dbo].RELATIVE_DATE(RANKED.GPS_TIME, GROUPS.DEPARTURE_TIME) AS TRIP_TIME, ([dbo].UNIX_TIMESTAMP(RANKED.GPS_TIME) - [dbo].UNIX_TIMESTAMP(GROUPS.DEPARTURE_TIME)) AS TRIP_TIME_SECONDS, CAST((RANKED.CORRECTED_ODOMETER - GROUPS.CORRECTED_ODOMETER) * 0.000621371192 AS DECIMAL(12, 1)) AS DISTANCE, 'Ignition' AS EVENT_TYPE FROM GROUPS LEFT JOIN RANKED ON GROUPS.ROW_KEY = RANKED.ROW_KEY - CASE WHEN GROUPS.ROW_KEY = 1 THEN 1 ELSE 2 END LEFT JOIN RANKED_LOCATION ON GROUPS.ROW_KEY = RANKED_LOCATION.ROW_KEY WHERE ((dbo.UNIX_TIMESTAMP(GROUPS.DEPARTURE_TIME) - dbo.UNIX_TIMESTAMP(GROUPS.ARRIVAL_TIME)) / 60 > @STOP_LENGTH OR GROUPS.DEPARTURE_TIME IS NULL OR GROUPS.ARRIVAL_TIME IS NULL);
Ultimately, we need to create islands on a per-vehicle basis, but I am at a loss as to how to get that to work. At the end of the day, our solution was to create a cursor for each vehicle we wanted to run this report on, and loop through the cursor appending the rows it generates per vehicle to a temporary table and returning that.
I would love to accomplish this sort of aggregation without the cursor, but I don't see how that is possible. I figured I would head to the mountaintop and talk to the wizard again before I gave up. - Seems to me you can simply add the VEHICLE_ID column to all SELECT statements (as well as GROUP BY where needed) and it should handle this for all vehicles. There is nothing specific in the logic to limit this for a single vehicle. If you need more help you can provide the CREATE TABLE statements with INSERT statements for sample data and what the desired result is.
Plamen Ratchev - You may need to add the VEHICLE_ID column to the PARTITION BY clause of the ranking functions, so all ranks are calculated per vehicle.
Plamen Ratchev


