Query turning slow when joining small table

  • Question

  • Hi all,

    I've been working with SQL Server for many years, but now I'm experiencing an issue that I can't solve... I have a query like this:

    declare @currentdate as date = (
    		select cast(getdate() as date)
    		);
    declare @referencedate as date = (
    		select dateadd(yy, datediff(yy, 0, @currentdate) - 2, 0)
    		);
    declare @Registration as TRegistration;
    
    with InsEndDate (
    	Field1
    	,Field2
    	,Rownr
    	)
    as (
    	select T.Field1
    		,cast(T.Field2 as date)
    		,ROW_NUMBER() over (
    			partition by T.Field1 order by T.Field3 desc
    			) as 'Rownr'
    	from (
    		select ins.Field1
    			,ins.Field2
    		from edw.Dim.Dim1 ins with (nolock)
    	where ins.Date is not null
    		) as T
    	)
    insert into @Registration (
    	Field1, Field2, Field3, Field4, Field5
    	)
    select ins.Field1
    	,pat.Field2
    	,pat.Field3
    	,cast(pat.FieldX as date) as Field4
    	,case 
    		when ins.Date is not null
    			then cast(T1.DateFirst as date)
    		else null
    		end
    	,insend.Field5
    from edw.dim.Dim1 ins with (nolock)
    join edw.dim.Dim2 pat with (nolock) on pat.PatSKey = ins.PatSKey
    left outer join (
    	select fgm.FieldX
    		,fgm.FieldY
    		,ROW_NUMBER() over (
    			partition by fgm.Field4 order by fgm.Appointment asc
    			) as rownr
    	from edw.ztj.FctTable fgm with (nolock)
    	where fgm.NumberOfX > 0
    	) as T1 on T1.rownr = 1 
    	and T1.FieldZ = ins.FieldZ
    left outer join InsEndDate insend on insend.Field1 = ins.Field1
    	and insend.Rownr = 1
    where (
    		ins.Adddate >= @referencedate
    		or (
    			(
    				ins.Enddate is null
    				or cast(ins.Enddate as date) > @referencedate
    				)
    			)
    		);
    
    select T.Field1
    	,'LeeftijdSkey' = [dbo].[fun_GET_Age](ipt.DateOfBirth, convert(date, convert(varchar(8), T.AppointmentDate), 112))
    	,'AreaSKey' = -1
    	, AreaCode
    into #step3
    from @Registration ipt
    join (
    	select field1,
    	       field2,
    		MAX(Field3) AS max	
    	from FactWith12MillionRecords
    	where fgm.AantalGeregistreerdeMinutenDirect > 0
    	group by fgm.Field1
    		,fgm.Field2
    	) as T on T.SurrogateKey = ipt.SurrogateKey
    left join dimdepartment dpt with (nolock) on dpt.Department = T.Department
    	and convert(date, convert(varchar(8), T.AppointmentDate), 112) between T.Start
    		and isnull(T.End, '20991231')
    
    UPDATE s3
    SET s3.AreaSkey = a.AreaSKey
    FROM #step3 AS s3
    INNER JOIN dimarea a ON s3.AreaCode = a.AreaCode

    As you can see in this query, I'm doing the update at the end, but it would be more efficient to join dimarea in the previous query. This whole query takes around 1 minute, but when I add that join instead of the separate update, it takes about 40 minutes.
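
    To be clear, the slow variant just pulls dimarea into the SELECT ... INTO instead of doing the separate UPDATE afterwards, roughly like this (the dimarea join column is a guess here, since the real one is anonymised):

    select T.Field1
    	,'LeeftijdSkey' = [dbo].[fun_GET_Age](ipt.DateOfBirth, convert(date, convert(varchar(8), T.AppointmentDate), 112))
    	,'AreaSKey' = isnull(a.AreaSKey, -1)
    	, AreaCode
    into #step3
    from @Registration ipt
    join (
    	/* same derived table on FactWith12MillionRecords as above */
    	) as T on T.SurrogateKey = ipt.SurrogateKey
    left join dimdepartment dpt with (nolock) on dpt.Department = T.Department
    	and convert(date, convert(varchar(8), T.AppointmentDate), 112) between T.Start
    		and isnull(T.End, '20991231')
    left join dimarea a with (nolock) on a.AreaCode = ipt.AreaCode -- assumed join column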

    Is there anyone who can tell me where to start looking for a solution?

    I've updated the statistics and they seem to be fine.

    dimarea has only 527 records in it. The 'main' query returns around 900,000 rows.

    PS: it is a SQL Server 2016 installation.

    • Edited by MiKeHendriks Saturday, January 18, 2020 9:15 AM Some more info
    Saturday, January 18, 2020 9:10 AM

Answers

  • Unfortunately, there was not much to work from in the Dropbox. The execution plan was only for these statements:

    declare @currentdate as date = (
      select cast(getdate() as date)
      )
    drop table #bhpnrlist
    drop table #bhpnrlist
    drop table #bhpnrlist
    drop table #bhpnrlist

    In the file Query.text, the query at the bottom resembles what you had in your original post, but it refers to a table that is not included in the download.

    However, I see that this query involves a table variable into which you insert earlier. This can very well be the problem. SQL Server will estimate that the table variable has a single row (since it is empty when the batch starts), when there in fact are many, and then it goes downhill from there.

    You can try adding this to the end of the query:

    OPTION (RECOMPILE)

    You can also test replacing the table variable with a temp table, which is not subject to the same problem.
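
    For illustration, a minimal sketch of that substitution (the column types are assumptions, since the definition of the TRegistration table type is not in the post):

    -- Temp table instead of the @Registration table variable (column types assumed)
    create table #Registration (
    	Field1 int
    	,Field2 int
    	,Field3 int
    	,Field4 date
    	,Field5 date
    	);

    -- Then run the same INSERT ... SELECT against #Registration, and let the
    -- SELECT ... INTO #step3 read from #Registration instead of @Registration.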


    Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

    Thursday, January 23, 2020 11:09 PM

All replies

  • It is always difficult to come up with tuning suggestions without seeing the query plan and without knowledge of the table and index definitions. In this case you have also partly anonymised the query, which makes it inconsistent. This means that the suggestions we can make may not be the right ones; they are more like shots in the dark.

    If you add dimarea to the query, but take out the call to fun_GET_Age - does that affect performance?
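
    For instance, something like this while testing, keeping everything else the same:

       ,'LeeftijdSkey' = 0 -- temporarily, instead of the call to [dbo].[fun_GET_Age](...)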

    What data type is AppointmentDate? If it's datetime, change

       convert(date, convert(varchar(8), T.AppointmentDate), 112)
    to

       convert(date, T.AppointmentDate)


    Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

    Saturday, January 18, 2020 10:41 AM
  • I will check this. I will also see if I can rebuild these tables and queries in an Azure DB with fake data.

    But would that help? Or would it not give clear information because of the different data?

    I will check your suggestions ASAP.

    Saturday, January 18, 2020 7:49 PM
  • Trying to reproduce a problem like this with fake data can be quite challenging. Statistics are likely to be different, and the optimizer may make completely different choices.

    The usual recommendation for a problem like this is to provide CREATE TABLE and CREATE INDEX statements for your tables, and also to post execution plans in XML format. Since the latter are too verbose to post inline, the best approach is to upload them to Dropbox or a similar place (where you can also remove them once the issue has been resolved).
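
    One way to get the actual plan in XML is to wrap the slow batch like this in SSMS; the plan then comes back as an extra XML result that you can save:

       set statistics xml on;
       -- run the slow statement(s) here, e.g. the SELECT ... INTO #step3 query
       set statistics xml off;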

    If you want to anonymise table and column names, you can use SentryOne's Plan Explorer (which is free) for the purpose. Just make sure that your CREATE TABLE and CREATE INDEX statements are anonymised in the same way.


    Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

    Saturday, January 18, 2020 10:15 PM
  • Hi MiKeHendriks,

    Could you please share your execution plan with us? In the execution plan, please check which operator has the highest cost. Then maybe you can create some indexes to improve the performance.

    There is a similar article; please refer to How to create and optimize SQL Server indexes for better performance.
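
    For example, the syntax for a supporting nonclustered index looks like this (purely illustrative; the real column names are anonymised in your post, so adjust them to your schema):

    CREATE NONCLUSTERED INDEX IX_Dim1_Adddate
    	ON edw.Dim.Dim1 (Adddate)
    	INCLUDE (Field1, Field2, Enddate);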

    Best Regards,

    Rachel 


    MSDN Community Support
    Please remember to click "Mark as Answer" the responses that resolved your issue, and to click "Unmark as Answer" if not. This can be beneficial to other community members reading this thread. If you have any compliments or complaints to MSDN Support, feel free to contact MSDNFSF@microsoft.com.

    Monday, January 20, 2020 2:52 AM
  • Hi Erland,

    It took me some time to get everything prepared, but I think everything you asked for is here:

    https://www.dropbox.com/sh/nevxw5qtezhzz3l/AACsNFoOiZIC4TD8pS2kB8m1a?dl=0

    I am very interested in solving this issue, but even more interested in learning how to solve it. Using an update instead of joining this table works around my performance issue, but I want to fix it the proper way and learn from it...

    Thursday, January 23, 2020 9:24 PM
  • Hi Erland,

    A late reply, but using a temp table solved the issue. Thank you!

    Tuesday, February 11, 2020 12:56 PM