Serious Performance Problems with INTERSECT using SQL 2005
-
Friday, August 14, 2009 6:48 AM
I have a set of queries that I need to do an intersect on (could be up to 6 queries). Any time that are more than 2 intersects the performance degrade significantly. Here is the example
SELECT M1.DataSourceName, M1.Campaign, M1.AdGroup, M1.Keyword, M1.MatchType, M1.KeywordID, M1.AdGroupID, M1.DataSourceID, M1.MaximumCPC, M1.KeywordType
FROM MonitorTable M1 WITH(NOLOCK)
WHERE (M1.ClientID = 51 AND M1.Keyword NOT LIKE '%unmatched%') AND M1.Date >= '5/15/2009'
GROUP BY M1.DataSourceName, M1.Campaign, M1.AdGroup, M1.Keyword, M1.MatchType, M1.KeywordID, M1.AdGroupID, M1.DataSourceID, M1.MaximumCPC, M1.KeywordType
HAVING SUM(M1.Conversions) = 0INTERSECT
SELECT M1.DataSourceName, M1.Campaign, M1.AdGroup, M1.Keyword, M1.MatchType, M1.KeywordID, M1.AdGroupID, M1.DataSourceID, M1.MaximumCPC, M1.KeywordType
FROM MonitorTable M1 WITH(NOLOCK)
WHERE (M1.ClientID = 51 AND M1.Keyword NOT LIKE '%unmatched%') AND M1.Date >= '6/14/2009'
GROUP BY M1.DataSourceName, M1.Campaign, M1.AdGroup, M1.Keyword, M1.MatchType, M1.KeywordID, M1.AdGroupID, M1.DataSourceID, M1.MaximumCPC, M1.KeywordType
HAVING SUM(M1.Cost) > 300INTERSECT
SELECT M1.DataSourceName, M1.Campaign, M1.AdGroup, M1.Keyword, M1.MatchType, M1.KeywordID, M1.AdGroupID, M1.DataSourceID, M1.MaximumCPC, M1.KeywordType
FROM MonitorTable M1 WITH(NOLOCK)
WHERE (M1.ClientID = 51 AND M1.Keyword NOT LIKE '%unmatched%') AND M1.Date >= '8/12/2009'
GROUP BY M1.DataSourceName, M1.Campaign, M1.AdGroup, M1.Keyword, M1.MatchType, M1.KeywordID, M1.AdGroupID, M1.DataSourceID, M1.MaximumCPC, M1.KeywordType
HAVING SUM(M1.Impressions) >= 3
Each query on its own runs in about 2 seconds. Any two of the three run together with an INTERSECT between then runs in about 2 seconds. Add the 3rd query with the INTERSECT and it runs for 6 minutes.
If I modify the query to be the following, where I put the results from the first INTERSECT into a temp table then intersect that result with the 3rd query the whole thing runs in about 2 seconds. The problem is that since I need to build this dynamically and also need to support anywhere from 2 to 6 INTERSECT's the TEMP table solution is much more difficult to code.
Any suggestions would be appreciated.
Jeff Cleath
Inetium, LLC- Edited by jcleath Friday, August 14, 2009 7:09 AM Formatting
- Changed Type Mangal PardeshiModerator Friday, August 14, 2009 8:12 AM
Answers
-
Friday, August 14, 2009 7:46 AM
Are you getting parallelism?
I think the query engine works on all queries simultaneously, and INTERSECTs on the fly, not when all queries are done and stored in a worktable.
Try my suggestion and pleas post back the results.
insert #temp1 SELECT M1.DataSourceName, M1.Campaign, M1.AdGroup, M1.Keyword, M1.MatchType, M1.KeywordID, M1.AdGroupID, M1.DataSourceID, M1.MaximumCPC, M1.KeywordType FROM MonitorTable M1 WITH(NOLOCK) WHERE (M1.ClientID = 51 AND M1.Keyword NOT LIKE '%unmatched%') AND M1.Date >= '5/15/2009' GROUP BY M1.DataSourceName, M1.Campaign, M1.AdGroup, M1.Keyword, M1.MatchType, M1.KeywordID, M1.AdGroupID, M1.DataSourceID, M1.MaximumCPC, M1.KeywordType HAVING SUM(M1.Conversions) = 0 insert #temp2 SELECT M1.DataSourceName, M1.Campaign, M1.AdGroup, M1.Keyword, M1.MatchType, M1.KeywordID, M1.AdGroupID, M1.DataSourceID, M1.MaximumCPC, M1.KeywordType FROM MonitorTable M1 WITH(NOLOCK) WHERE (M1.ClientID = 51 AND M1.Keyword NOT LIKE '%unmatched%') AND M1.Date >= '6/14/2009' GROUP BY M1.DataSourceName, M1.Campaign, M1.AdGroup, M1.Keyword, M1.MatchType, M1.KeywordID, M1.AdGroupID, M1.DataSourceID, M1.MaximumCPC, M1.KeywordType HAVING SUM(M1.Cost) > 300 insert #temp3 SELECT M1.DataSourceName, M1.Campaign, M1.AdGroup, M1.Keyword, M1.MatchType, M1.KeywordID, M1.AdGroupID, M1.DataSourceID, M1.MaximumCPC, M1.KeywordType FROM MonitorTable M1 WITH(NOLOCK) WHERE (M1.ClientID = 51 AND M1.Keyword NOT LIKE '%unmatched%') AND M1.Date >= '8/12/2009' GROUP BY M1.DataSourceName, M1.Campaign, M1.AdGroup, M1.Keyword, M1.MatchType, M1.KeywordID, M1.AdGroupID, M1.DataSourceID, M1.MaximumCPC, M1.KeywordType HAVING SUM(M1.Impressions) >= 3 select * from #temp1 intersect select * from #temp2 intersect select * from #temp3
- Marked As Answer by jcleath Friday, August 14, 2009 12:35 PM
-
Friday, August 14, 2009 7:52 AM
Also, have you tried this rewrite? The table is only read once...
SELECT DataSourceName, Campaign, AdGroup, Keyword, MatchType, KeywordID, AdGroupID, DataSourceID, MaximumCPC, KeywordType FROM MonitorTable WITH (NOLOCK) WHERE ClientID = 51 AND Keyword NOT LIKE '%unmatched%' AND [Date] >= '5/15/2009' -- Some further optimization GROUP BY DataSourceName, Campaign, AdGroup, Keyword, MatchType, KeywordID, AdGroupID, DataSourceID, MaximumCPC, KeywordType HAVING SUM(CASE WHEN [Date] >= '5/15/2009' THEN Conversions ELSE 0 END) = 0 AND SUM(CASE WHEN [Date] >= '6/14/2009' THEN Cost ELSE 0 END) > 300 AND SUM(CASE WHEN [Date] >= '8/12/2009' THEN Impressions ELSE 0 END) >= 3
- Edited by SwePesoMVP Friday, August 14, 2009 7:54 AM
- Marked As Answer by jcleath Friday, August 14, 2009 1:07 PM
All Replies
-
Friday, August 14, 2009 7:01 AMI would guess it's not INTERSECT that is the problem.
It's grouping by 10 columns, three times.
Try inserting all the queries in a separate temp table, and then at the end, INTERSECT them. -
Friday, August 14, 2009 7:05 AMI appreciate the response. But as I mentioned in the post, if I do either 2 of the queries together with the INTERSECT or two queries together with and INTERSECT to a temp table and then INTERSECT that temp table with the third query then it completed the work in 2 seconds. Also, each query in itself returns in about 1 second.
This tells me that the INTERSECT is the issue if it takes over 6 minutes.
Jeff Cleath Inetium, LLC jcleath@inetium.com -
Friday, August 14, 2009 7:46 AM
Are you getting parallelism?
I think the query engine works on all queries simultaneously, and INTERSECTs on the fly, not when all queries are done and stored in a worktable.
Try my suggestion and pleas post back the results.
insert #temp1 SELECT M1.DataSourceName, M1.Campaign, M1.AdGroup, M1.Keyword, M1.MatchType, M1.KeywordID, M1.AdGroupID, M1.DataSourceID, M1.MaximumCPC, M1.KeywordType FROM MonitorTable M1 WITH(NOLOCK) WHERE (M1.ClientID = 51 AND M1.Keyword NOT LIKE '%unmatched%') AND M1.Date >= '5/15/2009' GROUP BY M1.DataSourceName, M1.Campaign, M1.AdGroup, M1.Keyword, M1.MatchType, M1.KeywordID, M1.AdGroupID, M1.DataSourceID, M1.MaximumCPC, M1.KeywordType HAVING SUM(M1.Conversions) = 0 insert #temp2 SELECT M1.DataSourceName, M1.Campaign, M1.AdGroup, M1.Keyword, M1.MatchType, M1.KeywordID, M1.AdGroupID, M1.DataSourceID, M1.MaximumCPC, M1.KeywordType FROM MonitorTable M1 WITH(NOLOCK) WHERE (M1.ClientID = 51 AND M1.Keyword NOT LIKE '%unmatched%') AND M1.Date >= '6/14/2009' GROUP BY M1.DataSourceName, M1.Campaign, M1.AdGroup, M1.Keyword, M1.MatchType, M1.KeywordID, M1.AdGroupID, M1.DataSourceID, M1.MaximumCPC, M1.KeywordType HAVING SUM(M1.Cost) > 300 insert #temp3 SELECT M1.DataSourceName, M1.Campaign, M1.AdGroup, M1.Keyword, M1.MatchType, M1.KeywordID, M1.AdGroupID, M1.DataSourceID, M1.MaximumCPC, M1.KeywordType FROM MonitorTable M1 WITH(NOLOCK) WHERE (M1.ClientID = 51 AND M1.Keyword NOT LIKE '%unmatched%') AND M1.Date >= '8/12/2009' GROUP BY M1.DataSourceName, M1.Campaign, M1.AdGroup, M1.Keyword, M1.MatchType, M1.KeywordID, M1.AdGroupID, M1.DataSourceID, M1.MaximumCPC, M1.KeywordType HAVING SUM(M1.Impressions) >= 3 select * from #temp1 intersect select * from #temp2 intersect select * from #temp3
- Marked As Answer by jcleath Friday, August 14, 2009 12:35 PM
-
Friday, August 14, 2009 7:52 AM
Also, have you tried this rewrite? The table is only read once...
SELECT DataSourceName, Campaign, AdGroup, Keyword, MatchType, KeywordID, AdGroupID, DataSourceID, MaximumCPC, KeywordType FROM MonitorTable WITH (NOLOCK) WHERE ClientID = 51 AND Keyword NOT LIKE '%unmatched%' AND [Date] >= '5/15/2009' -- Some further optimization GROUP BY DataSourceName, Campaign, AdGroup, Keyword, MatchType, KeywordID, AdGroupID, DataSourceID, MaximumCPC, KeywordType HAVING SUM(CASE WHEN [Date] >= '5/15/2009' THEN Conversions ELSE 0 END) = 0 AND SUM(CASE WHEN [Date] >= '6/14/2009' THEN Cost ELSE 0 END) > 300 AND SUM(CASE WHEN [Date] >= '8/12/2009' THEN Impressions ELSE 0 END) >= 3
- Edited by SwePesoMVP Friday, August 14, 2009 7:54 AM
- Marked As Answer by jcleath Friday, August 14, 2009 1:07 PM
-
Friday, August 14, 2009 12:35 PM
Thanks Peso, that was a good suggestion. I tried your original exact format and received errors around
Msg 208, Level 16, State 0, Line 1
Invalid object name '#temp3'.
But I changed it to
SELECT * into #temp1 FROM
(SELECT M1.DataSourceName, M1.Campaign, M1.AdGroup, M1.Keyword, M1.MatchType, M1.KeywordID, M1.AdGroupID, M1.DataSourceID, M1.MaximumCPC, M1.KeywordType
FROM MonitorTable M1 WITH(NOLOCK)
WHERE (M1.ClientID = 51 AND M1.Keyword NOT LIKE '%unmatched%') AND M1.Date >= '5/15/2009'
GROUP BY M1.DataSourceName, M1.Campaign, M1.AdGroup, M1.Keyword, M1.MatchType, M1.KeywordID, M1.AdGroupID, M1.DataSourceID, M1.MaximumCPC, M1.KeywordType
HAVING SUM(M1.Conversions) = 0 ) ASELECT * into #temp2 FROM
(SELECT M1.DataSourceName, M1.Campaign, M1.AdGroup, M1.Keyword, M1.MatchType, M1.KeywordID, M1.AdGroupID, M1.DataSourceID, M1.MaximumCPC, M1.KeywordType
FROM MonitorTable M1 WITH(NOLOCK)
WHERE (M1.ClientID = 51 AND M1.Keyword NOT LIKE '%unmatched%') AND M1.Date >= '6/14/2009'
GROUP BY M1.DataSourceName, M1.Campaign, M1.AdGroup, M1.Keyword, M1.MatchType, M1.KeywordID, M1.AdGroupID, M1.DataSourceID, M1.MaximumCPC, M1.KeywordType
HAVING SUM(M1.Cost) > 300 ) BSELECT * into #temp3 FROM
(SELECT M1.DataSourceName, M1.Campaign, M1.AdGroup, M1.Keyword, M1.MatchType, M1.KeywordID, M1.AdGroupID, M1.DataSourceID, M1.MaximumCPC, M1.KeywordType
FROM MonitorTable M1 WITH(NOLOCK)
WHERE (M1.ClientID = 51 AND M1.Keyword NOT LIKE '%unmatched%') AND M1.Date >= '8/12/2009'
GROUP BY M1.DataSourceName, M1.Campaign, M1.AdGroup, M1.Keyword, M1.MatchType, M1.KeywordID, M1.AdGroupID, M1.DataSourceID, M1.MaximumCPC, M1.KeywordType
HAVING SUM(M1.Impressions) >= 3 ) Cselect * from #temp1
intersect
select * from #temp2
intersect
select * from #temp3
And it worked in 2 seconds.
Thanks
Jeff Cleath
Jeff Cleath Inetium, LLC jcleath@inetium.com -
Friday, August 14, 2009 12:39 PMPeso, this also appears to be a great suggestion, thank you!
This last idea is a pretty drastic change to how the queries were originally built (which I did not personally do so I need to more thoroughly validate the logic and results sets, but this also returned the same results in my initial testing in about 2 seconds. So I am hopefull this is a valid approach because it would be simpler to extend.
I will reply again one I have fully validated this approach.
Thanks again.
Jeff Cleath.
Jeff Cleath Inetium, LLC jcleath@inetium.com -
Friday, August 14, 2009 1:05 PM
You can simplify the insert from
SELECT * into #temp1 FROM (SELECT M1.DataSourceName, M1.Campaign, M1.AdGroup, M1.Keyword, M1.MatchType, M1.KeywordID, M1.AdGroupID, M1.DataSourceID, M1.MaximumCPC, M1.KeywordType FROM MonitorTable M1 WITH(NOLOCK) WHERE (M1.ClientID = 51 AND M1.Keyword NOT LIKE '%unmatched%') AND M1.Date >= '5/15/2009' GROUP BY M1.DataSourceName, M1.Campaign, M1.AdGroup, M1.Keyword, M1.MatchType, M1.KeywordID, M1.AdGroupID, M1.DataSourceID, M1.MaximumCPC, M1.KeywordType HAVING SUM(M1.Conversions) = 0 ) AS A
to this
SELECT M1.DataSourceName, M1.Campaign, M1.AdGroup, M1.Keyword, M1.MatchType, M1.KeywordID, M1.AdGroupID, M1.DataSourceID, M1.MaximumCPC, M1.KeywordType INTO #Temp1 FROM MonitorTable M1 WITH(NOLOCK) WHERE (M1.ClientID = 51 AND M1.Keyword NOT LIKE '%unmatched%') AND M1.Date >= '5/15/2009' GROUP BY M1.DataSourceName, M1.Campaign, M1.AdGroup, M1.Keyword, M1.MatchType, M1.KeywordID, M1.AdGroupID, M1.DataSourceID, M1.MaximumCPC, M1.KeywordType HAVING SUM(M1.Conversions) = 0
-
Friday, August 14, 2009 1:10 PM
Thank you for your feedback.
My second suggestion will scan the table only once, so there is much less change you get blocking issues using this query.
The optimization trick for teh WHERE ... AND [Date]> '...'
is that the date in the WHERE clause is the oldest date of all the original date filters. -
Thursday, August 20, 2009 10:13 AMHi!
Have you validated the results yet?

