Grouping by date ranges with another attribute, ranges may be non-contiguous
-
Wednesday, January 30, 2013 8:01 PM
After poring over a similar problem and finding it never provided a complete solution I finally have gotten to the heart of the problem I can't solve. I'm looking for the consecutive amount of days that a person can be prescribed a certain amount of drugs. Because the prescriptions begin and end, there can be multiple, non-contiguous intervals that a person is on X number of drugs. The following SQL script produces the result set of the query I'll post momentarily:
create table test
(pat_id int, cal_date date, grp_nbr int, drug_qty int,[ranking] int)
go
insert into test(pat_id,cal_date, grp_nbr,drug_qty,[ranking])
values
(1, '1/8/2007',7,2, 1),
(1, '1/9/2007',7,2, 1),
(1, '1/10/2007',7, 2,1),
(1, '1/11/2007',7, 2,1),
(1, '1/12/2007',7, 2,1),
(1, '1/13/2007',7, 2,1),
(1, '1/14/2007',7, 2,1),
(1, '1/15/2007',7, 2,1),
(1, '6/1/2007',7,2, 1),
(1, '6/2/2007',7,2, 1),
(1, '6/3/2007',7,2, 1)Notice here that there are two non-contiguous intervals where this person was on two drugs at once. In the days that are omitted,drug_qty was more than two. The last column in this example was my attempt at adding another field that I could group by to help solve the problem (didn't work).
Query to create tables:
CREATE TABLE [dbo].[rx](
[pat_id] [int] NOT NULL,
[fill_Date] [date] NOT NULL,
[script_End_Date] AS (dateadd(day,[dayssup],[filldate])),
[drug_Name] [varchar](50) NULL,
[days_Sup] [int] NOT NULL,
[quantity] [float] NOT NULL,
[drug_Class] [char](3) NOT NULL,
CHECK(fill_Date <=script_End_Date
PRIMARY KEY CLUSTERED
(
[clmid] ASC
)
CREATE TABLE [dbo].[Calendar](
[cal_date] [date] PRIMARY KEY,
[Year] AS YEAR(cal_date) PERSISTED,
[Month] AS MONTH(cal_date) PERSISTED,
[Day] AS DAY(cal_date) PERSISTED,
[julian_seq] AS 1+DATEDIFF(DD, CONVERT(DATE, CONVERT(varchar,YEAR(cal_date))+'0101'),cal_date),
id int identity);
the query I'm using to produce my result sets:;WITH x
AS (SELECT rx.pat_id,
c.cal_date,
Count(DISTINCT rx.drug_name) AS distinctDrugs
FROM rx,
calendar AS c
WHERE c.cal_date BETWEEN rx.fill_date AND rx.script_end_date
AND rx.ofinterest = 1
GROUP BY rx.pat_id,
c.cal_date
--the query example I used having count(1) =2, but to illustrate the non-contiguous intervals, in practice I need the below having statement
HAVING Count(*) > 1),
y
AS (SELECT x.pat_id,
x.cal_date
--c2.id is the row number in the calendar table.
,
c2.id - Row_number()
OVER(
partition BY x.pat_id
ORDER BY x.cal_date) AS grp_nbr,
distinctdrugs
FROM x,
calendar AS c2
WHERE c2.cal_date = x.cal_date)
SELECT *,
Rank()
OVER(
partition BY pat_id, grp_nbr
ORDER BY distinctdrugs) AS [ranking]
FROM y
WHERE y.pat_id = 1604012867
AND distinctdrugs = 2Besides the fact that I shouldn't have a column in the calendar table named 'id', is there anything egregiously wrong with this approach? I can get the query to show me the distinct intervals of distinctDrugs=x, but it will only work for that integer and not anything >1. By this I mean that I can find the separate intervals where a patient is on two drugs, but only when I use =2 in the having clause, not >1. I can't do something like
SELECT pat_id,
Min(cal_date),
Max(cal_date),
distinctdrugs
FROM y
GROUP BY pat_id,
grp_nbrbecause this will pick up that second group of non-contiguous dates. Does anyone know of an elegant solution to this problem?
All Replies
-
Thursday, January 31, 2013 5:49 AM
I'm not sure exactly what you are looking for, but perhaps
A simplified rx table and data
CREATE TABLE #rx( pat_id int NOT NULL, drug_Name varchar(50) NULL, fill_Date date NOT NULL, script_End_Date date); Insert #rx(pat_id, drug_Name, fill_Date, script_End_Date) Values (1, 'A', '20130105', '20130120'), (1, 'B', '20130109', '20130113'), (1, 'C', '20130110', '20130125'), (1, 'A', '20130205', '20130228'), (1, 'D', '20130215', '20130217'), (2, 'B', '20130110', '20130225'), (2, 'G', '20130207', '20130210');Then you can get every interval where each patient is taking a particular number of drugs by (assuming you have a calendar table with a column named dt that contains each date)
;With cte As (Select r.pat_id, c.dt, DateDiff(dd, '20000101', c.dt) - Row_Number() Over (Partition By r.pat_id, Count(*) Order By c.dt) As IslandNbr, Count(*) As NbrOfDrugs From #rx r Inner Join dbo.Calendar c On c.dt Between r.fill_Date And r.script_End_Date Group By r.pat_id, c.dt) Select pat_id, Min(dt) As StartDate, Max(dt) As EndDate, NbrOfDrugs From cte Group By pat_id, IslandNbr, NbrOfDrugs Order By pat_id, Min(dt);
And if you are looking for a particular patient and/or number of drugs, just add a where clause. So, for example to see which intervals patient 1 was taking 2 drugs, you would do
;With cte As (Select r.pat_id, c.dt, DateDiff(dd, '20000101', c.dt) - Row_Number() Over (Partition By r.pat_id, Count(*) Order By c.dt) As IslandNbr, Count(*) As NbrOfDrugs From #rx r Inner Join dbo.Calendar c On c.dt Between r.fill_Date And r.script_End_Date Group By r.pat_id, c.dt) Select pat_id, Min(dt) As StartDate, Max(dt) As EndDate, NbrOfDrugs From cte Where pat_id = 1 And NbrOfDrugs = 2 Group By pat_id, IslandNbr, NbrOfDrugs Order By pat_id, Min(dt);
Tom- Proposed As Answer by Uri DimantMVP, Editor Thursday, January 31, 2013 6:01 AM
- Marked As Answer by sqlservernoob Thursday, January 31, 2013 1:18 PM

