locked
Cleared cached plans on SQL Server RRS feed

  • Question

  • I am using SQL Server 2016 on Windows Azure VMs (Windows Server 2016) for running ERP databases. 

    Analyzing high CPU problems in one of my Servers I am noticing that there are very few rows in sys.dm_exec_query_stats view- (54 rows), and in sys.dm_exec_cached_plans (1041 rows). The Server has 28 GB RAM and manages a total of 56 GB of databases. (for comparison another server with the same capacity have 35000 rows in each view).

    Checking the type of the included objects in cached plans I see (View-817, UsrTab-163, Rule-38, Proc-25, Adhoc-1, Prepared-6)  and a total of 160 MB used. For comparison in another better behaving Server I am getting (View-10, UsrTab-16, Rule-4, Proc-7, Adhoc-19135, Prepared-13160) and a total of 2500 MB. When I am viewing SQL Texts for the first server I see mostly system views (like sys.databases, dm_io_virtual_file_stats, etc) included in the cache plan. 

    Is something going wrong with this server? Maybe memory pressure for some reason?

    Both of the servers mentioned are using 28 GB RAM which I think is enough for the load. The problem I see is higher CPU from time to time. Maybe the Server needs a restart?

    Thanks,

    Dimitris

    Edit after 2 weeks:

    The situation getting worse:

    Unfortunately the problem still insists after two weeks. At first after restarting the servers (28GB), I had 3G and about 30000 cached plans in sys.dm_exec_query_stats. But after two weeks of running, half of my servers (8 SQL Servers) have actually no cache plans (<10) and high CPU, and the others have the 3G and ~30000 cache plans as I said before.

    The servers are identical, identically configured (same initialization scripts), and brand new Azure VMs from the same image (SQL Server 2016 with SP1, Windows Server 2016). I tried reseting optimize for ad hoc workloads in one server, (was activated already) as Susantha mentioned, but I see no difference.

    I don't know what to try next. I am almost convinced that under specific circumstances (~some days), SQL Server recalculates memory for cache plans and assigns almost zero for this operation. Maybe a bug or something?

    I worked for months with the same load in SQL Server 2014 without noticing these problem.




    • Edited by Dimitris V Thursday, December 22, 2016 5:07 AM Follow up
    Thursday, December 8, 2016 12:46 PM

Answers

  • By the way, an MVP colleague had a production situation akin to yours and in their case they had Query Stored turned on. When they turned off Query Store, the behaviour disappeared.

    This is not to say that Query Store is the culprit. The theory is because EF produced many single-use plans, Query Store got problems in flushing the cache and therefore it looked like memory pressure to SQL Server. But it seems to me that since this has been appeared in several versions of SQL Server, there is something underlying that is triggered.

    • Marked as answer by Dimitris V Saturday, January 7, 2017 9:16 AM
    Thursday, December 22, 2016 9:05 PM
  • Hi Dimitris,

    Glad to know that you didn't see it again, after disabling the query store. The bug that caused plan cache eviction when query store was turned on, is fixed now and available on Azure SQL DB. If you are using SQL Server, the next CU release for SQL Server 2016 will have a fix for this. After the fix is applied, you will no longer see plan evictions even when query store is turned on.

    Thanks,

    shreya

    • Marked as answer by Dimitris V Saturday, January 7, 2017 9:16 AM
    Wednesday, January 4, 2017 8:31 PM

All replies

  • If you restart the server, there will be even fewer plans in the cache. :-)

    Maybe something is wrong, or maybe the server is not in use and there is nothing to cache.

    I would suggest that you monitor the count for a while to see if goes up and down. Also monitor to see if there actually is any workload on the server.

    Thursday, December 8, 2016 1:43 PM
  • 28 G of RAM for total of 54 G database should be enough, I dont think memory pressure should be reason for less entries in plan cache. As already advised you have to monitor this

    Cheers,

    Shashank

    Please mark this reply as answer if it solved your issue or vote as helpful if it helped so that other forum members can benefit from it

    My TechNet Wiki Articles

    MVP

    Thursday, December 8, 2016 2:32 PM
  • I am monitoring the server more than 24 hours.

    The server is active and there is workload for sure.

    I cleaned the plan cache 20 hours ago using DBBC FREEPROCCACHE for both of the servers mentioned. At first there were few rows on both but after some hours of use the results are as described. My problem is not to clear the cache once, I can live with this, but I want the server start using it.

    One thing I didn't mentioned is that I changed the Max Server Memory to 24 GB. Before it was unlimited. Is it possible that this change created this behaviour?

    Thanks,

    Dimitris

    Thursday, December 8, 2016 3:22 PM

  • One thing I didn't mentioned is that I changed the Max Server Memory to 24 GB. Before it was unlimited. Is it possible that this change created this behaviour?

    Thanks,

    Dimitris

    Yep quite possible. When you change max server memory it flushes the plan cache and it has to start filling itself again. Plus it is quite possible that restricting memory has forced SQL Server to do internal calculations and limit the content of plan cache.

    Cheers,

    Shashank

    Please mark this reply as answer if it solved your issue or vote as helpful if it helped so that other forum members can benefit from it

    My TechNet Wiki Articles

    MVP

    • Marked as answer by Dimitris V Tuesday, December 13, 2016 9:41 AM
    • Unmarked as answer by Dimitris V Wednesday, December 21, 2016 9:24 PM
    Thursday, December 8, 2016 4:04 PM
  • OK, I should be a little clearer. What I'm actually interested in is whether it could be the case that the entire plan cache is flushed periodically. You could run you queries against the cache from an Agent job every five minutes or so to capture the development.

    Also check the SQL Server error log for message about the cache being flushed or other memory-related messages.

    Thursday, December 8, 2016 10:24 PM
  • optimize for ad hoc workloads setting affects the size of your plan cache. You may check that and compare with the other server that you mentioned. 

    Thursday, December 8, 2016 10:54 PM
  • Below code can be used to check the "optimize for ad hoc workloads" setting. 

    EXEC sp_configure 'show advanced option',1
    RECONFIGURE
    EXEC sp_configure 'optimize for ad hoc workloads'

    Thursday, December 8, 2016 10:57 PM
  • (Post included as a follow up edit in the initial post)

    Unfortunately the problem still persists after two weeks. At first after restarting the servers (28GB), I had 3G and about 30000 cached plans in sys.dm_exec_query_stats. But after two weeks of running, half of my servers (8 SQL Servers) have actually no cache plans (<10) and high CPU, and the others have the 3G and ~30000 cache plans as I said before.

    The servers are identical, identically configured (same initialization scripts), and brand new Azure VMs from the same image (SQL Server 2016 with SP1, Windows Server 2016). I tried reseting optimize for ad hoc workloads in one server, (was activated already) as Susantha mentioned, but I see no difference.

    I don't know what to try next. I am almost convinced that under specific situations (~some days), SQL Server recalculates memory for cache plans and assigns almost zero for this operation. Maybe a bug or something?

    I worked for months with the same load in SQL Server 2014 without noticing these problem.

    Thanks,

    Dimitris



    • Edited by Dimitris V Thursday, December 22, 2016 5:11 AM
    Wednesday, December 21, 2016 9:23 PM
  • So it seems that you are victim of a mysterious behaviour that many of my MVP colleagues have experienced. Many on our laptops, but some also in production. All of a sudden, the plan cache goes poof. And when it occurs, it seems to keep on recurring.

    The fact that you get high CPU is maybe not surprising - since the cache is flushed, the server needs to spend a lot on compilation.

    The one common theme I have been able to sense is that there is some shortage of memory, but nowhere near enough to explain this behaviour.

    Have you checked the SQL Server errorlog if there are any interesting error messages?

    It would be great if you could open a case with Microsoft, because this is a situation that many of us wants an answer to.

    Wednesday, December 21, 2016 10:23 PM
  • My problem Erland is that my SQL Servers are production servers in a SAAS environment with more than 1000 databases.

    We're constantly receiving customer complaints for poor performance. Do you think that it is a good idea to downgrade back to SQL 2014? Of course this a big project requiring huge human effort, carefully designed software, and enough downtime for our clients.

    But I don't know what else to do.

    My other option is restarting servers constantly (every night ?). But I suppose this will lead to poor performance also, but of course in less downtime for my clients.

    Thanks,

    Dimitris



    • Edited by Dimitris V Thursday, December 22, 2016 3:00 PM
    Thursday, December 22, 2016 5:00 AM
  • I understand that this is a great pain situation for you.

    But then there is all reason to open a case with Microsoft. If they admit that this is a bug, the case will not be of any cost for you. If they do not, it could cost you an arm and a leg, that's true.

    For a short-term solution, I would recommend that you restart the troublesome servers every night to see if this helps. (Some observations indicates that this does indeed help for a while.)

    And believe me: if you are able to open a case and Microsoft to identify that there is an issue, you will be a hero with more than one MVP.

    As for downgrading to SQL 2014, I would not recommend that. This behaviour has been observed down to at least SQL 2008.

    What could help though is to get more RAM for the VMs, although I don't know easily that is done.

    Thursday, December 22, 2016 8:27 AM
  • By the way, an MVP colleague had a production situation akin to yours and in their case they had Query Stored turned on. When they turned off Query Store, the behaviour disappeared.

    This is not to say that Query Store is the culprit. The theory is because EF produced many single-use plans, Query Store got problems in flushing the cache and therefore it looked like memory pressure to SQL Server. But it seems to me that since this has been appeared in several versions of SQL Server, there is something underlying that is triggered.

    • Marked as answer by Dimitris V Saturday, January 7, 2017 9:16 AM
    Thursday, December 22, 2016 9:05 PM
  • Sorry for my late response Erland, but I was spending my time opening a case with Microsoft.

    We focused on a server with this behavior, they admit there is zero memory assigned to cache plan, and trying to investigate what has happened. Product group will be informed of the case. We restarted the servers and monitoring their behavior.

    But, I think that your last notice about the Query Store could be the root of the problem. We have turned on Query Store for some period for specific databases, then turned it off, but the memory never assigned back to the cached plan. I checked that in Server's log file (query store on and off). I have to check other servers' logs with erratic behavior to find if there was a correlation of query store and zero cache plan memory problem.

    All of our servers now are clean restarted and we try to catch the problem again. But my instinct telling me that you are right, Query Store is the problem. One day now with no problems, but no heavy load these Christmas days for our servers. Next week will be in full load and I will confirm if the problem arising from Query Store use. 

    Thanks,

    Dimitris

    PS: For the time I am giving one vote to your suggestion, and I am ready to mark your response as answer next week, :).

    Saturday, December 24, 2016 9:50 AM
  • Sorry for my late response Erland, but I was spending my time opening a case with Microsoft.

    Seems like a correct order of priority. :-)

    We focused on a server with this behavior, they admit there is zero memory assigned to cache plan, and trying to investigate what has happened. Product group will be informed of the case. We restarted the servers and monitoring their behavior.

    Seems like you got a good reponse, at least initially.

    But, I think that your last notice about the Query Store could be the root of the problem. We have turned on Query Store for some period for specific databases, then turned it off, but the memory never assigned back to the cached plan. I checked that in Server's log file (query store on and off). I have to check other servers' logs with erratic behavior to find if there was a correlation of query store and zero cache plan memory problem.

    As I noted the behaviour has been observed on earlier versions of SQL Server as well, so I'm thinking that Query Store is just the component that causes SQL Server to go into this wobbling state.
    Of course, for you the most important thing is to get your production situation cleared out, so if Query Store is what takes you over the brink, you may be content with that.

    The problem for some of my MVP colleagues is that they intend to demos where they run queries against the plan cache, and they sure feel stupid when the queries come back empty when they are not supposed to!

    Saturday, December 24, 2016 12:00 PM
  • Hi Dimitris,

    Sorry to know about the unpleasant experience you are having with plan cache eviction. The next time you see this behavior, could you please try selecting from sys.dm_os_memory_cache_clock_hands and sys.dm_os_ring_buffers ? These should be able to tell you if the plan cache entries were removed due to some external memory pressure.

    Thanks,

    shreya


    Tuesday, December 27, 2016 10:54 PM
  • Hi Shreya,

    I will try this. But since I never activated Query Store again, I didn't had this behavior any more. I am tempting to try to activate it again to check it.

    Thanks,

    Dimitris.

    Tuesday, January 3, 2017 5:25 PM
  • Hi Dimitris,

    Glad to know that you didn't see it again, after disabling the query store. The bug that caused plan cache eviction when query store was turned on, is fixed now and available on Azure SQL DB. If you are using SQL Server, the next CU release for SQL Server 2016 will have a fix for this. After the fix is applied, you will no longer see plan evictions even when query store is turned on.

    Thanks,

    shreya

    • Marked as answer by Dimitris V Saturday, January 7, 2017 9:16 AM
    Wednesday, January 4, 2017 8:31 PM
  • Hi,

    I have exactly the same problem. I've tested the query store with the default MAX_STORAGE_SIZE_MB, which is 100MB, and then disabled it. 
    But since, like I can see from sys.dm_exec_cached_plans where objtype='Proc', the cache plan of my server is now cleared each time it gets to 100MB. Because it is a busy server, it means every minutes.
    I don't have a memory pressure. I have 28GB and before enabled query store, I didn't had any cache plan problem.
    I've already seen this problem on a second server. I could fixed it with a restart. 
    Is there another solution than a restart to fix the problem ? It is a production server and I don't want to have a downtime.

    Regards,



    Jack

    Thursday, October 12, 2017 1:06 PM
  • So you are saying even after disabling query store the plans are getting flushed every 5 mins or so ?

    Another workaround is to give SQL Server service account locked pages in memory privilege, does SQL Server have that ?


    Cheers,

    Shashank

    Please mark this reply as answer if it solved your issue or vote as helpful if it helped so that other forum members can benefit from it

    My TechNet Wiki Articles

    MVP

    Thursday, October 12, 2017 2:16 PM
  • First, I recommend that you start a new thread, rather than piggybacking on an old thread, even if the symptoms are seemingly the same.

    When you do this, please include the output from "SELECT @@version". Also can you run this query a few times with some minute in between:

    SELECT * FROM sys.dm_os_memory_cache_clock_hands

    Check the values in the column rounds_count. Do they stay at 0? Or do they increment all the time? In that case, is the internal or external hands that are moving?

    Lock Pages in Memory that Shanky suggests is definitely worth trying.


    Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

    • Proposed as answer by JKLMFR Monday, October 16, 2017 6:47 AM
    Thursday, October 12, 2017 2:28 PM
  • First, I recommend that you start a new thread, rather than piggybacking on an old thread, even if the symptoms are seemingly the same.

    When you do this, please include the output from "SELECT @@version". Also can you run this query a few times with some minute in between:

    SELECT * FROM sys.dm_os_memory_cache_clock_hands

    Check the values in the column rounds_count. Do they stay at 0? Or do they increment all the time? In that case, is the internal or external hands that are moving?

    Lock Pages in Memory that Shanky suggests is definitely worth trying.


    Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

    Erland the only reason I did not split there thread because Shreya Verma from MS is on this thread and you know its hard to find some core MS guys here. The problem is deep and genuine as we know so just wanted to make sure top MS guys are aware about this

    Do you still recommend to split it ?


    Cheers,

    Shashank

    Please mark this reply as answer if it solved your issue or vote as helpful if it helped so that other forum members can benefit from it

    My TechNet Wiki Articles

    MVP

    Friday, October 13, 2017 4:23 AM
  • Hello,

    @Shashank : Yes I am saying even after disabling query store the plans are getting flushed every 5 minutes. The problem start exactly at the time I've started and stopped the query store. I can be sure because I have redgate monitor and I can see the effect on cpu before and after the action.  

    @Erland : Sorry, I should probably start a new thread. My memory model is CONVENTIONAL (no LPIM).

    I have planned to patch the last CU and to activated the LPIM. If my problem is not resolved, I will post a new question.

    Regards.

    Jack

    Monday, October 16, 2017 6:47 AM