Date SQL Puzzle - Consolidate date ranges
-
Tuesday, September 22, 2009 2:51 PMHi,
I've been thinking about this issue quite some time now and I can't seem to get my head around it. I have multiple sources that provide me date ranges, and these ranges will or will not overlap, some will be consecutive to another, others will not. To make my datawarehouse loading process the most efficient, I would like to consolidate all these ranges into the most simple representation of date ranges possible from the source data. Let me explain using an example:
Say I have the following query results from several sources of date ranges (Date ranges are considered inclusive):
ID FromDate ToDate
1 20090801 20090803
2 20090802 20090809
3 20090805 20090806
4 20090812 20090813
5 20090811 20090812
6 20090802 20090802
What I would want is a query that will result in the following ranges:
ID FromDate ToDate
1 20090801 20090809
2 20090811 20090813
I don't really care how these ranges are calculated, although I do have some restictions:
- I cannot afford to use a temporary table, nor a cursor or stored procedure
- I would prefer not to use recursion as usually you would like to limit the recursion depth, and this would limit the complexity of date-range combinations the query could handle.
- I would prefer not to use heavy calculations as the date-range sets that would be fed to the query are quite large. (Up to approx. 50.000 ranges at a time)
I've thought up a solution, although it seems cumbersome:
- Select all items as a subselect, splitting them in 1 day chunks
- Select a distinct startdate, sort on startdate, while issueing a rownumber as a subselect
- Join the same result twice, but with T1.rownumber = T1.rownumber + 1 for a comparison between the row an it's following row
- Calculate the date-distance in days, all items with distance greater than 1 day are starting points for a result row
- Select the starting points from the previous results, and select all rows between this and the next starting point, calculate a min and max date on this set
As I said...cumbersome.
Is there any way to consolidate these date ranges on a fairly easy way? Thanks for your effort in advance!
All Replies
-
Tuesday, September 22, 2009 3:29 PM
SET NOCOUNT ON
DECLARE @T TABLE(ID INT,FromDate DATETIME, ToDate DATETIME)
INSERT INTO @T(ID,FromDate,ToDate)
SELECT 1,'20090801','20090803' UNION ALL
SELECT 2,'20090802','20090809' UNION ALL
SELECT 3,'20090805','20090806' UNION ALL
SELECT 4,'20090812','20090813' UNION ALL
SELECT 5,'20090811','20090812' UNION ALL
SELECT 6,'20090802','20090802'
SELECT ROW_NUMBER() OVER(ORDER BY s1.FromDate) AS ID,
s1.FromDate,
MIN(t1.ToDate) AS ToDate
FROM @T s1
INNER JOIN @T t1 ON s1.FromDate <= t1.ToDate
AND NOT EXISTS(SELECT * FROM @T t2
WHERE t1.ToDate >= t2.FromDate
AND t1.ToDate < t2.ToDate)
WHERE NOT EXISTS(SELECT * FROM @T s2
WHERE s1.FromDate > s2.FromDate
AND s1.FromDate <= s2.ToDate)
GROUP BY s1.FromDate
ORDER BY s1.FromDate- Marked As Answer by tss68nl Tuesday, September 22, 2009 8:46 PM

