Grouping by date ranges with another attribute, ranges may be non-contiguous

Answered Grouping by date ranges with another attribute, ranges may be non-contiguous

  • Wednesday, January 30, 2013 8:01 PM
     
     

    After poring over a similar problem and finding it never provided a complete solution I finally have gotten to the heart of the problem I can't solve.  I'm looking for the consecutive amount of days that a person can be prescribed a certain amount of drugs.  Because the prescriptions begin and end, there can be multiple, non-contiguous intervals that a person is on X number of drugs.  The following SQL script produces the result set of the query I'll post momentarily:

    create table test
    (pat_id int, cal_date date, grp_nbr int, drug_qty int,[ranking] int)
    go
    insert into test(pat_id,cal_date, grp_nbr,drug_qty,[ranking])
    values
    (1, '1/8/2007',7,2, 1),
    (1, '1/9/2007',7,2, 1),
    (1, '1/10/2007',7, 2,1),
    (1, '1/11/2007',7, 2,1),
    (1, '1/12/2007',7, 2,1),
    (1, '1/13/2007',7, 2,1),
    (1, '1/14/2007',7, 2,1),
    (1, '1/15/2007',7, 2,1),
    (1, '6/1/2007',7,2, 1),
    (1, '6/2/2007',7,2, 1),
    (1, '6/3/2007',7,2, 1)

    Notice here that there are two non-contiguous intervals where this person was on two drugs at once.  In the days that are omitted,drug_qty was more than two.  The last column in this example was my attempt at adding another field that I could group by to help solve the problem (didn't work).

    Query to create tables:

     CREATE TABLE [dbo].[rx](
                [pat_id] [int] NOT NULL,
                [fill_Date] [date] NOT NULL,
                [script_End_Date]  AS (dateadd(day,[dayssup],[filldate])),
                [drug_Name] [varchar](50) NULL,
                [days_Sup] [int] NOT NULL,
                [quantity] [float] NOT NULL,
                [drug_Class] [char](3) NOT  NULL,
                CHECK(fill_Date <=script_End_Date
    PRIMARY KEY CLUSTERED 
    (
                [clmid] ASC
    )


    CREATE TABLE [dbo].[Calendar](
                 [cal_date] [date] PRIMARY KEY,
    [Year] AS YEAR(cal_date) PERSISTED,
    [Month] AS MONTH(cal_date) PERSISTED,
    [Day] AS DAY(cal_date) PERSISTED,
                 [julian_seq] AS 1+DATEDIFF(DD, CONVERT(DATE, CONVERT(varchar,YEAR(cal_date))+'0101'),cal_date),
         id int identity);



    the query I'm using to produce my result sets:

    ;WITH x 
         AS (SELECT rx.pat_id, 
                    c.cal_date, 
                    Count(DISTINCT rx.drug_name) AS distinctDrugs 
             FROM   rx, 
                    calendar AS c 
             WHERE  c.cal_date BETWEEN rx.fill_date AND rx.script_end_date 
                    AND rx.ofinterest = 1 
             GROUP  BY rx.pat_id, 
                       c.cal_date 
             --the query example I used having count(1) =2, but to illustrate the non-contiguous intervals, in practice I need the below having statement
             HAVING Count(*) > 1), 
         y 
         AS (SELECT x.pat_id, 
                    x.cal_date 
                    --c2.id is the row number in the calendar table. 
                    , 
                    c2.id - Row_number() 
                              OVER( 
                                partition BY x.pat_id 
                                ORDER BY x.cal_date) AS grp_nbr, 
                    distinctdrugs 
             FROM   x, 
                    calendar AS c2 
             WHERE  c2.cal_date = x.cal_date) 
    SELECT *, 
           Rank() 
             OVER( 
               partition BY pat_id, grp_nbr 
               ORDER BY distinctdrugs) AS [ranking] 
    FROM   y 
    WHERE  y.pat_id = 1604012867 
           AND distinctdrugs = 2 

    Besides the fact that I shouldn't have a column in the calendar table named 'id', is there anything egregiously wrong with this approach?  I can get the query to show me the distinct intervals of distinctDrugs=x, but it will only work for that integer and not anything >1.  By this I mean that I can find the separate intervals where a patient is on two drugs, but only when I use =2 in the having clause, not >1.  I can't do something like

    SELECT pat_id, 
           Min(cal_date), 
           Max(cal_date), 
           distinctdrugs 
    FROM   y 
    GROUP  BY pat_id, 
              grp_nbr 

    because this will pick up that second group of non-contiguous dates.  Does anyone know of an elegant solution to this problem?
     

     

All Replies

  • Thursday, January 31, 2013 5:49 AM
     
     Answered Has Code

    I'm not sure exactly what you are looking for, but perhaps

    A simplified rx table and data

    CREATE TABLE #rx(
                pat_id int NOT NULL,
                drug_Name varchar(50) NULL,
                fill_Date date NOT NULL,
                script_End_Date date);
    Insert #rx(pat_id, drug_Name, fill_Date, script_End_Date) Values
    (1, 'A', '20130105', '20130120'),
    (1, 'B', '20130109', '20130113'),
    (1, 'C', '20130110', '20130125'),
    (1, 'A', '20130205', '20130228'),
    (1, 'D', '20130215', '20130217'),
    (2, 'B', '20130110', '20130225'),
    (2, 'G', '20130207', '20130210');

    Then you can get every interval where each patient is taking a particular number of drugs by (assuming you have a calendar table with a column named dt that contains each date)

    ;With cte As
    (Select r.pat_id, c.dt, DateDiff(dd, '20000101', c.dt) - Row_Number() Over (Partition By r.pat_id, Count(*) Order By c.dt) As IslandNbr, Count(*) As NbrOfDrugs
    From #rx r
    Inner Join dbo.Calendar c On c.dt Between r.fill_Date And r.script_End_Date
    Group By r.pat_id, c.dt)
    Select pat_id, Min(dt) As StartDate, Max(dt) As EndDate, NbrOfDrugs
    From cte
    Group By pat_id, IslandNbr, NbrOfDrugs
    Order By pat_id, Min(dt);

    And if you are looking for a particular patient and/or number of drugs, just add a where clause.  So, for example to see which intervals patient 1 was taking 2 drugs, you would do

    ;With cte As
    (Select r.pat_id, c.dt, DateDiff(dd, '20000101', c.dt) - Row_Number() Over (Partition By r.pat_id, Count(*) Order By c.dt) As IslandNbr, Count(*) As NbrOfDrugs
    From #rx r
    Inner Join dbo.Calendar c On c.dt Between r.fill_Date And r.script_End_Date
    Group By r.pat_id, c.dt)
    Select pat_id, Min(dt) As StartDate, Max(dt) As EndDate, NbrOfDrugs
    From cte
    Where pat_id = 1 And NbrOfDrugs = 2
    Group By pat_id, IslandNbr, NbrOfDrugs
    Order By pat_id, Min(dt);

    Tom