none
Relational division with events occurring within a timeframe

    Question

  • I have two tables (table definitions and result set can be found http://sqlfiddle.com/#!3/754ed

    The first CTE in the following query returns information about the list of drugs that people are on.  The second CTE will eventually have parameters passed to it and so for the time being it's a row constructor.  What I would like to do is find the people who are taking both drugs at the same time.  The column scriptEndDate indicates when stopping date of a particular prescription.  The result set from the below query is at the link.  The result set I'm given has been tested and is correct, but I can't seem to get the correct answer when I add the restriction of taking both drugs at the same time.

    --PatientDrugList is a CTE because eventually parameters might be passed to it
    --to alter the selection population
    ;with PatientDrugList(patid, filldate, scriptEndDate,drugName,strength)
    as
    (
    select rx.patid,rx.fillDate,rx.scriptEndDate,rx.drugName,rx.strength
    from rx
    ),
    --the row constructor here will eventually be parameters for a stored procedure
    DrugList (drugName)
    as
    (
    select x.drugName
    from (values ('concerta'),('fentanyl'))
    as x(drugName)
    where x.drugName is not null
    )


    --the row number here is so that I can find the largest date range
    --(the largest datediff means the person was on a given drug for a larger
    --amount of time.  obviously not a optimal solution
         --celko inspired relational division!
         select distinct row_number() over(partition by pd.patid, drugname order by datediff(day,pd.fillDate,pd.scriptEndDate)desc)  as rn
    ,pd.patid
    ,pd.drugname
    ,pd.fillDate
    ,pd.scriptEndDate
    from PatientDrugList as pd
    where not exists
    (select * from DrugList 
    where not exists
    (select * from PatientDrugList as pd2
    where(pd.patid=pd2.patid)
    and (pd2.drugName = DrugList.drugName)))
    and exists 
    (select * 
    from DrugList
    where DrugList.drugName=pd.drugName
    )
    group by pd.patid, pd.drugName,pd.filldate,pd.scriptEndDate



    Thursday, December 13, 2012 9:58 PM

Answers

  • >> I have two tables (table definitions AND result set can be found http:--sqlfiddle.com-#!3-754ed <<

    Do you often download files from strangers? But I did anyway. It did not help. There were no tables. Tables have keys and constraints. Here is what you might have meant to post: 

    CREATE TABLE Patient_Drug_List
    (pat_id INTEGER NOT NULL, 
     ndc CHAR(10) NOT NULL 
      REFERENCES Drugs(ndc),
     PRIMARY KEY (pat_id, ndc)
     script_fill_date DATE NOT NULL, 
     script_end_date DATE NOT NULL, 
     CHECK (script_fill_date < script_end_date));

    What you post are not tables at all! No keys, not constraints, no respect for ISO-11179 naming rules. I also wonder why you use drug names and not the standard unique, three-segment number, called the National Drug Code (NDC), which serves as a universal product identifier for human drugs. It will have the strength in it as I recall. 

    >> What I would like to do is find the people who are taking both drugs at the same time. The column script_end_date indicates the final date of a particular prescription. <<

    Wrong approach. This is a temporal problem, not a division. Also, the nested NOT EXISTS() predicates are not my relation division; that is Chris Date's version. You need a calendar table; 50 or 100 years is usually big enough. 

    DECLARE @in_ndc_1 = 0093-5370-65 -- Fentany 
    DECLARE @in_ndc_2 = 54868-4489-0 -- Concerta 

    SELECT PD.pat_id, C.cal_date AS overlap_date
     FROM Patient_Drug_List AS PD, Calendar AS C
     WHERE ndc IN (VALUES (@in_ndc_1, @in_ndc_2)
      AND cal_date BETWEEN script_fill_date AND script_end_date
     GROUP BY PD.pat_id, C.cal_date
    HAVING COUNT(DISTINCT ndc) = 2; 


    --CELKO-- Books in Celko Series for Morgan-Kaufmann Publishing: Analytics and OLAP in SQL / Data and Databases: Concepts in Practice Data / Measurements and Standards in SQL SQL for Smarties / SQL Programming Style / SQL Puzzles and Answers / Thinking in Sets / Trees and Hierarchies in SQL

    Friday, December 14, 2012 2:40 AM

All replies

  • I figured out what I wanted to do.  I inserted the results into a temp table and did

    ;with tmp as (
       query
    )
    select *
      from tmp a
      join tmp b on a.patid = b.patid and a.drugname <> b.drugname
     where a.filldate < b.scriptenddate
       and b.filldate < a.scriptenddate;

    Thursday, December 13, 2012 10:31 PM
  • >> I have two tables (table definitions AND result set can be found http:--sqlfiddle.com-#!3-754ed <<

    Do you often download files from strangers? But I did anyway. It did not help. There were no tables. Tables have keys and constraints. Here is what you might have meant to post: 

    CREATE TABLE Patient_Drug_List
    (pat_id INTEGER NOT NULL, 
     ndc CHAR(10) NOT NULL 
      REFERENCES Drugs(ndc),
     PRIMARY KEY (pat_id, ndc)
     script_fill_date DATE NOT NULL, 
     script_end_date DATE NOT NULL, 
     CHECK (script_fill_date < script_end_date));

    What you post are not tables at all! No keys, not constraints, no respect for ISO-11179 naming rules. I also wonder why you use drug names and not the standard unique, three-segment number, called the National Drug Code (NDC), which serves as a universal product identifier for human drugs. It will have the strength in it as I recall. 

    >> What I would like to do is find the people who are taking both drugs at the same time. The column script_end_date indicates the final date of a particular prescription. <<

    Wrong approach. This is a temporal problem, not a division. Also, the nested NOT EXISTS() predicates are not my relation division; that is Chris Date's version. You need a calendar table; 50 or 100 years is usually big enough. 

    DECLARE @in_ndc_1 = 0093-5370-65 -- Fentany 
    DECLARE @in_ndc_2 = 54868-4489-0 -- Concerta 

    SELECT PD.pat_id, C.cal_date AS overlap_date
     FROM Patient_Drug_List AS PD, Calendar AS C
     WHERE ndc IN (VALUES (@in_ndc_1, @in_ndc_2)
      AND cal_date BETWEEN script_fill_date AND script_end_date
     GROUP BY PD.pat_id, C.cal_date
    HAVING COUNT(DISTINCT ndc) = 2; 


    --CELKO-- Books in Celko Series for Morgan-Kaufmann Publishing: Analytics and OLAP in SQL / Data and Databases: Concepts in Practice Data / Measurements and Standards in SQL SQL for Smarties / SQL Programming Style / SQL Puzzles and Answers / Thinking in Sets / Trees and Hierarchies in SQL

    Friday, December 14, 2012 2:40 AM
  • @Celko, thanks for taking the time to respond.  I didn't use the NDC because a given drug can have up to 20 different NDCs depending no whether or not it is given intravenously, orally, (rectally..shudders) or what have you.  By using the drug name I don't have to worry about finding a specific NDC from a large list of them.  I'm going to have to look up what a calendar table is, I've not used/heard of that.  What is the significance of the cal_date in this example?  How does one know what value to give it?
    Friday, December 14, 2012 12:55 PM
  • cal_date would be a column in the "Calendar" table.

    This portion will tell you if a day is within the scripts' Fill and End date range.

    >> AND cal_date BETWEEN script_fill_date AND script_end_date <<


    Please mark this reply as the answer or vote as helpful, as appropriate, to make it useful for other readers.
    Thanks!
    Aalam | Blog (http://aalamrangi.wordpress.com)

    Saturday, December 22, 2012 6:01 PM
  • @Celko, thanks for taking the time to respond.  I didn't use the NDC because a given drug can have up to 20 different NDCs depending no whether or not it is given intravenously, orally, (rectally..shudders) or what have you.  By using the drug name I don't have to worry about finding a specific NDC from a large list of them.  I'm going to have to look up what a calendar table is, I've not used/heard of that.  What is the significance of the cal_date in this example?  How does one know what value to give it?

    We used a look-up table with the name and the various NDCs, but preferred  the NDC. 

    Since SQL is a data language, we like tables over computations. Build a calendar table with one column for the calendar data and other columns to show whatever your business needs in the way of temporal information. Do not try to calculate holidays in SQL -- Easter alone requires too much math.

    CREATE TABLE Calendar
    (cal_date DATE NOT NULL PRIMARY KEY, 
     fiscal_year SMALLINT NOT NULL, 
     fiscal_month SMALLINT NOT NULL,
     week_in_year SMALLINT NOT NULL, -- SQL Server is not ISO standard 
     holiday_type SMALLINT NOT NULL 
     CHECK(holiday_type IN ( ..), -- 
     day_in_year 
     SMALLINT NOT NULL,  
     julian_business_day INTEGER NOT NULL,
     ...);

    --CELKO-- Books in Celko Series for Morgan-Kaufmann Publishing: Analytics and OLAP in SQL / Data and Databases: Concepts in Practice Data / Measurements and Standards in SQL SQL for Smarties / SQL Programming Style / SQL Puzzles and Answers / Thinking in Sets / Trees and Hierarchies in SQL

    Saturday, December 22, 2012 11:14 PM