none
How to Query the Query Store for highest memory consuming queries on certain day and time in the past. RRS feed

  • Question

  • I'm working on gathering information about the queries that caused the CPU spike last week in one of our Prod environment. The counters are pointing towards memory bottleneck that has led to the the CPU spike. This memory pressure in turn led to plan cache being flushed. There was lots of report generating queries running on prod at that particular moment. However, those queries do run regularly but we need to find out which query at that particular time was consuming the most memory.

    I know that query store keep the data on disk which allows for investigating past issues. However, I'm not sure how to get the historical data as I don't see any option to select data and time Query Store. 

    Any help would be appreciated. 

    Environment: Microsoft SQL Server 2017 (RTM-CU12) (KB4464082) - 14.0.3045.24 (X64)   Oct 18 2018 23:11:05   Copyright (C) 2017 Microsoft Corporation  Standard Edition (64-bit) on Windows Server 2016 Datacenter 10.0 <X64> (Build 14393: ) (Hypervisor) 
    • Edited by Shaddy_1 Monday, September 16, 2019 2:56 PM
    Monday, September 16, 2019 2:55 PM

Answers

All replies

  • I suggest you start here:

    https://blogs.msdn.microsoft.com/docast/2017/07/30/sql-high-cpu-troubleshooting-checklist/

    However, the first thing to do is make sure your "max server memory" is set correctly for the server.  If you do not set this value, you may end up massively paging, which will cause high CPU and other issues.

    • Marked as answer by Shaddy_1 Monday, September 16, 2019 5:00 PM
    Monday, September 16, 2019 3:55 PM
    Moderator
  • Hi Tom,

    Thanks for sharing the link, it is very informational however it will not be useful in my situation as the CPU went to 100 within seconds and cache was flushed. We had to immediately Fail-over. As soon as we fail over the cpu went down immediately and the secondary CPU was stable. We use third party Database Monitoring Tool and it shows Memory consumption above 96% with disk read latency increased significantly, and the PLE went down to below 60. However, this tool does not shows each query by memory or CPU consumption. 

    This is the reason I'm looking at Query Store because it provides the information I'm looking for and moreover it survives the SQL Server restart as well due to information being kept in the disk. 

     

    Monday, September 16, 2019 4:55 PM
  • What you describe is not likely a single query, but a system problem with not setting the max server memory to a reasonable value.  This causes the server to become very busy and not respond.

    How much physical RAM does your server have?  What is the setting of "max server memory"?

    Monday, September 16, 2019 4:58 PM
    Moderator
  • I found out how to look for the information I was looking for, see the screenshot below: 

    

    • Marked as answer by Shaddy_1 Monday, September 16, 2019 5:00 PM
    Monday, September 16, 2019 5:00 PM
  • Tom,

    I agree it was system wide problem but the MaxMemory was correctly set.  

    Memory: 122GB
    SQL Server Max Memory: 102 GB

    Please see the screenshot for query store from that particular moment when the issue occurred showing highest memory consuming queries. I think the first three were the problematic queries as they were consuming the most memory. 

    What is your suggestions? 

    

    Also see this screenshot for the type of waits we had: Resource_Semaphore_query was the highest wait.

     

    I'm looking for the offending Query that was the root cause of this issue. 

    • Edited by Shaddy_1 Monday, September 16, 2019 5:52 PM
    Monday, September 16, 2019 5:37 PM
  • That may not be a problem at all.  A query using RAM is perfectly normal and expected behavior.

    Are there errors in the SQL Server log like "A significant part of SQL Server process memory has been paged out" or "FlushCache"?

    Please post the errors.

    Are you running with "Lock Pages In Memory"?

    The first thing I would suggest is you install the latest CU and see if the problem still occurs.  MS is horrible at publicly documenting changes and just because there is nothing in the log of changes, does not mean it is not fixed already.

    https://support.microsoft.com/en-us/help/4047329/sql-server-2017-build-versions

    Monday, September 16, 2019 5:44 PM
    Moderator
  • Hi Phillip,

    To answer your quesitons:

    1. I did not find anything in the SQL Server log like "A significant part of SQL Server process memory has been paged out" or "FlushCache"

    2. Lock pages in Memory is enabled for the SQL Server Account

    3. Installing the latest patch is no an option at the moment as it requires management approval and downtime.

    4. On the next day we were able to fail back the database to the Primary and we did not see that issue as reoccurring as the Memory/CPU is stable since we failed back to primary. 

    I still think that it was a particular query. Any other suggestion? 



    • Edited by Shaddy_1 Monday, September 16, 2019 8:20 PM
    Monday, September 16, 2019 7:11 PM
  • Thanks for sharing the link, it is very informational however it will not be useful in my situation as the CPU went to 100 within seconds and cache was flushed. We had to immediately Fail-over. As soon as we fail over the cpu went down immediately and the secondary CPU was stable.

    As I understand it, your theory is that because a query needed lots of memory, plans were flushed, and this cause a CPU due to the need to compile everything.

    Permit it to point that failing over to the other node has the same effect, since you start with empty caches, both the plan cache and the buffer cache.


    Erland Sommarskog, SQL Server MVP, esquel@sommarskog.se

    Monday, September 16, 2019 9:18 PM