none
finding gaps in date ranges RRS feed

  • Question

  • Hi we run 2017 std. I have a project table with a hard start and end time as shown below.  And I have a personnel team schedule table as shown beneath that.  Has anyone seen an elegant (perhaps without loops) solution to generating the 0-n missing date range coverages (the gaps) from the team schedule over the project schedule? 

    For example if my project runs from sun thru sat with start and end as 12/8/19 and 12/15/19 (that's 12 am of sunday the 15th) respectively, and my team schedule has nobody scheduled on wed and fri, I'd like to generate a phantom team schedule (team name is null) record for wed and a phantom team schedule record for fri.

    The catch is that starts and stops (neither project nor team) never align exactly with days.  So they can have for example a start of 6am and an end of 9am.  and like projects, team schedules can cross days.  To keep it simple i'm going to say that team schedules don't overlap, ie Team A and Team B don't both work on Monday.  I may eat those words later.

    Maybe a recursion would work?

    CREATE TABLE [dbo].[Project_Starts](
        [ID] [int] IDENTITY(1,1) NOT NULL,
    	[Project_Id] [int] NOT NULL,
    	[Start_Time] datetime2(7) NULL,
    	[End_Time] datetime2(7) NULL,
    	
    PRIMARY KEY CLUSTERED 
    (
    	[ID] ASC
    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]--,
    	
    ) ON [PRIMARY]
    
    GO
    
    
    CREATE TABLE [dbo].[Staff_Team_Schedule](
        [ID] [int] IDENTITY(1,1) NOT NULL,
    	Schedule_Id int,
    	[Start_Time] datetime2(7) NULL,
    	[End_Time] datetime2(7) NULL,
    	[Team_Name] varchar(100),
    	
    PRIMARY KEY CLUSTERED 
    (
    	[ID] ASC
    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]--,
    	
    ) ON [PRIMARY]
    
    GO

        


    • Edited by db042190 Tuesday, December 10, 2019 7:26 PM would recursion do the trick?
    Tuesday, December 10, 2019 7:16 PM

Answers

  • CREATE TABLE [dbo].[Project_Starts](
        [ID] [int] IDENTITY(1,1) NOT NULL PRIMARY KEY,
    	[Project_Id] [int] NOT NULL,
    	[Start_Time] datetime2(7) NULL,
    	[End_Time] datetime2(7) NULL)
    
    GO
    
    
    CREATE TABLE [dbo].[Staff_Team_Schedule](
        [ID] [int] IDENTITY(1,1) NOT NULL PRIMARY KEY,
    	Schedule_Id int,
    	[Start_Time] datetime2(7) NULL,
    	[End_Time] datetime2(7) NULL,
    	[Team_Name] varchar(100))
    
    GO
    
     
    
    insert project_starts
    select 1,'1/1/2019','1/1/2020'
    insert staff_team_schedule
    select 1,'2019-2-1 07:00:48.393','2019-4-1 07:00:48.393','Team A'
    union all 
    --I jury rigged this select to propose an answer based on Itzik's 
    -- approach.  Its explained below.
    select 2,'2019-5-1 07:00:48.393','2019-12-31 07:00:48.393','Team A'
    
    insert staff_team_schedule
    select 3,'2019-4-1 07:00:48.393','2019-4-2 07:00:48.393','Team A'
    ;with mycte as 
    (
    select 0 as Schedule_Id, Start_Time,End_Time,null as Team_Name from project_starts 
    union all
    select Schedule_Id , Start_Time,End_Time ,Team_Name  from  staff_team_schedule
    )
    
    ,mycte1 as (
     Select distinct dt,Schedule_Id  
     from mycte
     Cross apply (values( Start_Time ),( End_Time )) d(dt)
     )
    
     
     
    ,mycte2 AS
    ( 
    SELECT *
    ,row_number() OVER ( ORDER BY dt) as rn
     FROM mycte1
    ) 
     
     
     
    
     SELECT  
     null as Schedule_Id
      
     ,t1.dt as [from], t2.dt 	as [to]
     FROM mycte2 t1 left join mycte2 t2 on   t1.rn=t2.rn-1
    
     where   t1.Schedule_Id<>t2.Schedule_Id
     and t1.dt<>t2.dt
    order by [from]
     
     
     
     
     
    drop table project_starts,staff_team_schedule
    
    /*
    Schedule_Id	from	to
    NULL	2019-01-01 00:00:00.0000000	2019-02-01 07:00:48.3930000
    NULL	2019-04-02 07:00:48.3930000	2019-05-01 07:00:48.3930000
    NULL	2019-12-31 07:00:48.3930000	2020-01-01 00:00:00.0000000
    */

    • Marked as answer by db042190 Thursday, December 12, 2019 5:26 PM
    Thursday, December 12, 2019 5:06 PM
    Moderator

All replies

  • the contributors at https://www.sqlservercentral.com/forums/topic/finding-gaps-within-date-ranges seem to be giving Itzik Ben-Gan credit for this.  They talk about stacking (eliminating overlaps first) as a first step which I may need to circle back to but I'm going to try my hand at modifying the solution you see below to overcome my challenge of 1) starts and stops within 12 am boundaries , 2) gaps on either end of my project start.  From what I can tell, their object id is just a separate category for a set of dates and each category's set of dates is handled separately from every other category's set of dates.

    -- Finding gaps
    SELECT
    A.obj_id,
    DATEADD([day], 1, A.dateend) AS dtstart,
    DATEADD([day], -1, B.datestart) AS dtend
    FROM
    #T AS A
    CROSS APPLY
    (
    SELECT TOP (1)
    T.datestart
    FROM
    #T AS T
    WHERE
    T.obj_id = A.obj_id AND T.datestart > A.dateend
    ORDER BY
    T.datestart
    ) AS B
    WHERE
    DATEDIFF([day], A.dateend, B.datestart) > 1
    ORDER BY
    A.obj_id,
    dtstart;

     

    Tuesday, December 10, 2019 8:17 PM
  • Hi db042190, 

    Sorry for my poor understanding.

    Could you please share us  your table structure (CREATE TABLE …) and some sample data(INSERT INTO …) along with your expected result? So that we’ll get a right direction and make some test.

     

    When you post your issue, please refer POSTING TIPS - Code, Images, Hyperlinks, Details.

     

    Best Regards,

    Rachel



    MSDN Community Support
    Please remember to click "Mark as Answer" the responses that resolved your issue, and to click "Unmark as Answer" if not. This can be beneficial to other community members reading this thread. If you have any compliments or complaints to MSDN Support, feel free to contact MSDNFSF@microsoft.com.

    Wednesday, December 11, 2019 3:13 AM
  • thanks Rachel. The creates are in the beginning of the post.  The code block shows some inserts. And expected results are...

    since Project 1 has no team coverage between the new year and approx. 7am 2/1, and no coverage between 4/1 approx. 7am and 5/1 approx. 7am and finally no coverage between  12/31 approx. 7am and new years day, i'd want the null team schedule gap filler records shown in the 2nd block generated...

    insert project_starts select 1,'1/1/2019','1/1/2020' insert staff_team_schedule select 1,'2019-2-1 07:00:48.393','2019-4-1 07:00:48.393','Team A' union all

    --I jury rigged this select to propose an answer based on Itzik's

    -- approach. Its explained below. select 2,'2019-5-1 07:00:48.393','2019-12-31 07:00:48.393','Team A'


    select null,'1/1/2019','2019-2-1 07:00:48.393',null
    
    union all
    
    select null,'2019-4-1 07:00:48.393','2019-5-1 07:00:48.393',null
    
    union all 
    
    select null,'2019-12-31 07:00:48.393','1/1/2020',null




    • Edited by db042190 Thursday, December 12, 2019 2:17 PM document the jury rig i did to propose an answer based on Itzik's approach
    Wednesday, December 11, 2019 12:15 PM
  • CREATE TABLE [dbo].[Project_Starts](
        [ID] [int] IDENTITY(1,1) NOT NULL PRIMARY KEY,
    	[Project_Id] [int] NOT NULL,
    	[Start_Time] datetime2(7) NULL,
    	[End_Time] datetime2(7) NULL)
    
    GO
    
    
    CREATE TABLE [dbo].[Staff_Team_Schedule](
        [ID] [int] IDENTITY(1,1) NOT NULL PRIMARY KEY,
    	Schedule_Id int,
    	[Start_Time] datetime2(7) NULL,
    	[End_Time] datetime2(7) NULL,
    	[Team_Name] varchar(100))
    
    GO
    
    insert project_starts
    select 1,'1/1/2019','12/31/19'
    insert staff_team_schedule
    select 1,'2019-2-1 07:00:48.393','2019-4-1 07:00:48.393','Team A'
    union all 
    select 2,'2019-5-1 07:00:48.393','2019-12-31 07:00:48.393','Team A'
    ;with mycte as 
    (
    select Start_Time,End_Time,'Team A' as Team_Name from project_starts 
    union all
    select Start_Time,End_Time ,Team_Name  from  staff_team_schedule
    )
    
    ,mycte1 as (
     Select distinct dt,Team_Name 
     from mycte
     Cross apply (values(Cast(Start_Time as date)),(Cast(End_Time as date))) d(dt)
     )
     
    ,mycte2 as  (
    SELECT * ,ROW_NUMBER() OVER( ORDER BY dt)%2 AS rn
    	FROM mycte1
    )
    ,mycte3 as  (
    SELECT * ,sum(rn) Over(Order by dt) grp2
    	FROM mycte2
    )
     
    select Team_Name, min(dt) as [From]
    ,Case when max(dt) =min(dt) 
    then dateadd(day,1, max(dt) ) else max(dt) 
    end as [To]  
    from mycte3
    group by Team_Name, grp2
    Order by Team_Name, grp2
    
    
    --select null,'1/1/2019','2019-2-1 07:00:48.393',null
    
    --union all
    
    --select null,'2019-4-1 07:00:48.393','2019-5-1 07:00:48.393',null
    
    --union all 
    
    --select null,'2019-12-31 07:00:48.393','1/1/2020',null
    
    
    drop table project_starts,staff_team_schedule

    Wednesday, December 11, 2019 7:21 PM
    Moderator
  • thx jingyang. What is the significance of the hardcoded select of 'Team A' from project starts in mycte? 
    Wednesday, December 11, 2019 7:33 PM
  • I just use the hard code value to make a whole teamA set. In your real case, you may need to filter something to get each team startdate, enddate and schedule dates in a dataset.

    Wednesday, December 11, 2019 7:45 PM
    Moderator
  • Here is another try:

    CREATE TABLE [dbo].[Project_Starts](
        [ID] [int] IDENTITY(1,1) NOT NULL PRIMARY KEY,
    	[Project_Id] [int] NOT NULL,
    	[Start_Time] datetime2(7) NULL,
    	[End_Time] datetime2(7) NULL)
    
    GO
    
    
    CREATE TABLE [dbo].[Staff_Team_Schedule](
        [ID] [int] IDENTITY(1,1) NOT NULL PRIMARY KEY,
    	Schedule_Id int,
    	[Start_Time] datetime2(7) NULL,
    	[End_Time] datetime2(7) NULL,
    	[Team_Name] varchar(100))
    
    GO
    
    insert project_starts
    select 1,'1/1/2019','12/31/19'
    insert staff_team_schedule
    select 1,'2019-2-1 07:00:48.393','2019-4-1 07:00:48.393','Team A'
    union all 
    select 2,'2019-5-1 07:00:48.393','2019-12-31 07:00:48.393','Team A'
    ;with mycte as 
    (
    select Start_Time,End_Time,'Team A' as Team_Name from project_starts 
    union all
    select Start_Time,End_Time ,Team_Name  from  staff_team_schedule
    )
    
    ,mycte1 as (
     Select distinct dt,Team_Name 
     from mycte
     Cross apply (values(Cast(Start_Time as date)),(Cast(End_Time as date))) d(dt)
     )
    
     
     
    ,mycte2 AS
    ( 
    SELECT *
    ,row_number() OVER (PARTITION BY Team_Name ORDER BY dt) as rn
     FROM mycte1
    ) 
     
     SELECT t1.Team_Name
    	,t1.dt as [from]
    	,ISNULL(t2.dt,dateadd(day,1,t1.dt)) as [to]	 
        FROM mycte2 t1 left join mycte2 t2 
    	on t1.Team_Name=t2.Team_Name and t1.rn=t2.rn-1
    	Where t1.rn%2=1
     
    	 --
     
     
     /*
     Team_Name	from	to
    Team A	2019-01-01	2019-02-01
    Team A	2019-04-01	2019-05-01
    Team A	2019-12-31	2020-01-01
    
     */
    drop table project_starts,staff_team_schedule

    Wednesday, December 11, 2019 7:45 PM
    Moderator
  • --or this

    CREATE TABLE [dbo].[Project_Starts](
        [ID] [int] IDENTITY(1,1) NOT NULL PRIMARY KEY,
    	[Project_Id] [int] NOT NULL,
    	[Start_Time] datetime2(7) NULL,
    	[End_Time] datetime2(7) NULL)
    
    GO
    
    
    CREATE TABLE [dbo].[Staff_Team_Schedule](
        [ID] [int] IDENTITY(1,1) NOT NULL PRIMARY KEY,
    	Schedule_Id int,
    	[Start_Time] datetime2(7) NULL,
    	[End_Time] datetime2(7) NULL,
    	[Team_Name] varchar(100))
    
    GO
    
    insert project_starts
    select 1,'1/1/2019','12/31/19'
    insert staff_team_schedule
    select 1,'2019-2-1 07:00:48.393','2019-4-1 07:00:48.393','Team A'
    union all 
    select 2,'2019-5-1 07:00:48.393','2019-12-31 07:00:48.393','Team A'
     
    
    DECLARE @StartDate  DATE = (Select Cast([Start_Time] as date) from [Project_Starts] where [Project_Id]=1)
    DECLARE @EndDate DATE = (select Cast([End_Time] as date)   from [Project_Starts] where [Project_Id]=1)
     
    
    --****  create a Number table
    ;WITH Num1 (n) AS (
    SELECT 1 as n
    UNION ALL SELECT n+1 as n
    FROM Num1 Where n <101),
    Num2 (n) AS (SELECT 1 FROM Num1 AS X, Num1 AS Y), 
    Nums (n) AS (SELECT ROW_NUMBER() OVER(ORDER BY n) FROM Num2)
     
    ,DateRange as
    (
    select  dateadd(day,n-1,@startdate) dt from Nums  
    where dateadd(day,n-1,@startdate)<=@EndDate   
    )
     
    
    ,mycte1 as (
     Select distinct dt ,d.Team_Name
     from DateRange t
      cross join (select distinct Team_Name from Staff_Team_Schedule) d
     WHERE not exists ( select 1 from staff_team_schedule sts where t.dt >sts.Start_Time  and t.dt<Dateadd(day,-1,sts.End_Time))
    )
    
     ,mycte2 as  (
    SELECT
        *,  DATEADD(DAY, - ROW_NUMBER() OVER(PARTITION BY Team_Name ORDER BY dt), dt) AS grp 
    	FROM mycte1
    )
    select  Team_Name 
     , min(dt) as [From]
     ,Case when max(dt) =min(dt) 
    then dateadd(day,1, max(dt) ) else max(dt) 
    end as [To]  
     
    from mycte2
    group by Team_Name, grp
    Order by Team_Name, grp
     
     
     
    drop table project_starts,staff_team_schedule

    Wednesday, December 11, 2019 9:01 PM
    Moderator
  • thx Rachel and Jingyang. I'll look at your answers more closely but I suspect neither a hard coded Team nor hard coded project is appropriate.  And I don't know what the modulus 2 is about.  And one of the selects in one of the solutions shows dates coming back without times which cant be right.  I'll try out your solutions removing the hard coding, adding more data points and post back here.         
    Thursday, December 12, 2019 12:26 PM
  • I took a stab at modifying Itzik's solution to fit this problem.  Also notice I had the wrong end time originally on project starts. 

    I took a HUGE shortcut to save time by jury rigging the data and the code to assume the schedule id matches the project id. So the 2nd schedule id ended up being 1 for this solution to work.  I'll enter the appropriate comment  up where the inserts are.

    I'll take a closer look at Jingyang's and Rachel's solutions next.

    Here is the solution.  I'm not 100% sure it works and I know the code isn't as clean as it can be.

    It produces these results

    1 2019-01-01 00:00:00.0000000 2019-02-01 07:00:48.3930000
    1 2019-04-01 07:00:48.3930000 2019-05-01 07:00:48.3930000
    1 2019-12-31 07:00:48.3930000 2020-01-01 00:00:00.0000000

    ;with theschedule as
    (
    --bookend left-----
    select 0 id,ps.project_id schedule_id, dateadd(ss,-1,ps.start_time) start_time ,ps.start_time end_time, cast(null as varchar(100)) Team_Name--, 'left' whichside 
    from project_starts ps
    join (select schedule_id,min(start_time) start_time from staff_team_schedule group by schedule_id) ts
    on ps.project_id=ts.schedule_id
    where ts.start_time is null or (ts.start_time <> ps.start_time)
    -------------------
    union all
    --bookend right-----
    select 0 id,ps.project_id, ps.end_time start_time, dateadd(ss,1,ps.end_time) end_time, cast(null as varchar(100)) Team_Name--,'right' whichside 
    from project_starts ps
    join (select schedule_id,max(end_time) end_time from staff_team_schedule group by schedule_id) ts
    on ps.project_id=ts.schedule_id
    where ts.end_time is null or (ts.end_time <> ps.end_time)
    ---------------------
    union all 
    select * from staff_team_schedule
    --order by 3
    )
    SELECT
    --A.obj_id,
    --a.project_id,--<----deal with this later
    a.schedule_id project_id,
    --DATEADD([day], 1, A.dateend) AS dtstart,
    a.end_time start_time,
    --DATEADD([day], -1, B.datestart) AS dtend
    b.start_time end_time
    FROM
    --#T AS A
    theschedule as a
    CROSS APPLY
    (
    SELECT TOP (1)
    --T.datestart
    t.start_time
    FROM
    --#T AS T
    theschedule as t
    WHERE
    --T.obj_id = A.obj_id AND --<--------deal with this later
    t.schedule_id=a.schedule_id and
    --T.datestart > A.dateend
    t.start_time>a.end_time
    ORDER BY
    --T.datestart
    t.start_time
    ) AS B
    WHERE
    --DATEDIFF([day], A.dateend, B.datestart) > 1
    a.end_time < b.start_time
    ORDER BY
    --A.obj_id,--<------deal with this later
    a.schedule_id,
    --dtstart;
    start_time

     
    Thursday, December 12, 2019 2:11 PM
  • Hi Jingyang, in response to your first solution, there are no times showing so already its not a good solution for me.  I'm guessing if I made the problem more complicated with additional gaps on the same day it would break down.

    Also, there is no notion of a team if i'm filling a gap.  So Team A doesn't make sense.

    also, I added a contiguous team schedule that looks like what's in the code block below and the results seem to break down.

    Finally 1/1/20 - 1/2/20 doesn't solve the problem.

    Team A 2019-01-01 2019-02-01
    Team A 2019-04-01 2019-04-02
    Team A 2019-05-01 2019-12-31
    Team A 2020-01-01 2020-01-02

    insert staff_team_schedule
    select 3,'2019-4-1 07:00:48.393','2019-4-2 07:00:48.393','Team A'



    • Edited by db042190 Thursday, December 12, 2019 3:25 PM which code block
    Thursday, December 12, 2019 2:58 PM
  • I didn't consider time portion for all my solutions.

    If you need to take time into consideration, please get a detailed set of sample table and data in scripts along with expected result, we go from there.

    Thanks. 

    Thursday, December 12, 2019 3:02 PM
    Moderator
  • Hi Jingyang. In your second solution, I got the answers below with that additional contiguous record added. So while I don't know the exact differences in code between this query and the previous, it doesn't really solve the problem for the same reasons I cited on the first query.

    Team_Name from to
    Team A 2019-01-01 2019-02-01
    Team A 2019-04-01 2019-04-02
    Team A 2019-05-01 2019-12-31
    Team A 2020-01-01 2020-01-02

     
    Thursday, December 12, 2019 3:13 PM
  • Hi Jingyang. In response to your 3rd answer, I got the results below. The 1-1-20 to 1-2-20 issue seems to have been resolved but as you can see most of the issues cited in response to your 1st answer are still present.

    Team_Name From To
    Team A 2019-01-01 2019-02-01
    Team A 2019-04-01 2019-05-01
    Team A 2019-12-31 2020-01-01

    Thursday, December 12, 2019 3:19 PM
  • CREATE TABLE [dbo].[Project_Starts](
        [ID] [int] IDENTITY(1,1) NOT NULL PRIMARY KEY,
    	[Project_Id] [int] NOT NULL,
    	[Start_Time] datetime2(7) NULL,
    	[End_Time] datetime2(7) NULL)
    
    GO
    
    
    CREATE TABLE [dbo].[Staff_Team_Schedule](
        [ID] [int] IDENTITY(1,1) NOT NULL PRIMARY KEY,
    	Schedule_Id int,
    	[Start_Time] datetime2(7) NULL,
    	[End_Time] datetime2(7) NULL,
    	[Team_Name] varchar(100))
    
    GO
    
     
    
    insert project_starts
    select 1,'1/1/2019','1/1/2020'
    insert staff_team_schedule
    select 1,'2019-2-1 07:00:48.393','2019-4-1 07:00:48.393','Team A'
    union all 
    --I jury rigged this select to propose an answer based on Itzik's 
    -- approach.  Its explained below.
    select 2,'2019-5-1 07:00:48.393','2019-12-31 07:00:48.393','Team A'
    
    insert staff_team_schedule
    select 3,'2019-4-1 07:00:48.393','2019-4-2 07:00:48.393','Team A'
    ;with mycte as 
    (
    select 0 as Schedule_Id, Start_Time,End_Time,null as Team_Name from project_starts 
    union all
    select Schedule_Id , Start_Time,End_Time ,Team_Name  from  staff_team_schedule
    )
    
    ,mycte1 as (
     Select distinct dt,Schedule_Id  
     from mycte
     Cross apply (values( Start_Time ),( End_Time )) d(dt)
     )
    
     
     
    ,mycte2 AS
    ( 
    SELECT *
    ,row_number() OVER ( ORDER BY dt) as rn
     FROM mycte1
    ) 
     
     
     
    
     SELECT  
     null as Schedule_Id
      
     ,t1.dt as [from], t2.dt 	as [to]
     FROM mycte2 t1 left join mycte2 t2 on   t1.rn=t2.rn-1
    
     where   t1.Schedule_Id<>t2.Schedule_Id
     and t1.dt<>t2.dt
    order by [from]
     
     
     
     
     
    drop table project_starts,staff_team_schedule
    
    /*
    Schedule_Id	from	to
    NULL	2019-01-01 00:00:00.0000000	2019-02-01 07:00:48.3930000
    NULL	2019-04-02 07:00:48.3930000	2019-05-01 07:00:48.3930000
    NULL	2019-12-31 07:00:48.3930000	2020-01-01 00:00:00.0000000
    */

    • Marked as answer by db042190 Thursday, December 12, 2019 5:26 PM
    Thursday, December 12, 2019 5:06 PM
    Moderator