What can cause PLE to drop suddenly?
-
Thursday, August 09, 2012 9:08 PM
I see I'm not the first to ask this, but I don't see an answer that helps.
I have a small and lightly used server, 4gb RAM, single core, virtual environment, Windows Server 2008, SQL 2005.
PLE gets up around 300, falls to zero within a second, then slowly comes back up, only to be hit again.
I'm not only not aware of anything scanning big tables to cause it to fall, but it wouldn't fall that quickly in any case, it's almost as if someone were periodically calling dbcc dropcleanbuffers. Could also be called by someone dropping large tables already copied into buffers, I guess, but while that does happen nightly I don't see that happening intraday, over and over.
Actually does not seem to have a major problem with performance, just the misbehavior of the counter making me nervous.
I do see some queries with high physical reads but it's weird, it simply reads a single row by PK, so I assume the high physical reads are caused by the buffers getting flushed, and not the other way around.
So I find it - mysterious!
Suggestions welcome!
Josh
All Replies
-
Friday, August 10, 2012 3:22 AM
This query will help you determine which indexes and tables are being cached in the buffer pool. By comparing that to your queries that have a "large" number of phyiscal reads you may be able to determine what's happening. Keep in mind that a query that retrieves just a few records can still be the problem if it is executed many times and returns different pages many times.
Another thing to consider is that your small VM can't handle the existing memory requirements of your application. Obviously 0 is bad but it is the sharp downward trend that you should be watching for rather than a specific value.
USE [mydb];
SELECT count(*)AS cached_pages_count
,name ,index_id
FROM sys.dm_os_buffer_descriptors AS bd
INNER JOIN
(
SELECT object_name(object_id) AS name
,index_id ,allocation_unit_id
FROM sys.allocation_units AS au
INNER JOIN sys.partitions AS p
ON au.container_id = p.hobt_id
AND (au.type = 1 OR au.type = 3)
UNION ALL
SELECT object_name(object_id) AS name
,index_id, allocation_unit_id
FROM sys.allocation_units AS au
INNER JOIN sys.partitions AS p
ON au.container_id = p.partition_id
AND au.type = 2
) AS obj
ON bd.allocation_unit_id = obj.allocation_unit_id
WHERE database_id = db_id()
GROUP BY name, index_id
ORDER BY cached_pages_count DESC; -
Friday, August 10, 2012 6:33 AMAnswererobject level inside the database. That query looks like this:
SELECT TOP 25
obj.[name],
i.[name],
i.[type_desc],
count(*)AS Buffered_Page_Count ,
count(*) * 8192 / (1024 * 1024) as Buffer_MB
-- ,obj.name ,obj.index_id, i.[name]
FROM sys.dm_os_buffer_descriptors AS bd
INNER JOIN
(
SELECT object_name(object_id) AS name
,index_id ,allocation_unit_id, object_id
FROM sys.allocation_units AS au
INNER JOIN sys.partitions AS p
ON au.container_id = p.hobt_id
AND (au.type = 1 OR au.type = 3)
UNION ALL
SELECT object_name(object_id) AS name
,index_id, allocation_unit_id, object_id
FROM sys.allocation_units AS au
INNER JOIN sys.partitions AS p
ON au.container_id = p.hobt_id
AND au.type = 2
) AS obj
ON bd.allocation_unit_id = obj.allocation_unit_id
LEFT JOIN sys.indexes i on i.object_id = obj.object_id AND i.index_id = obj.index_id
WHERE database_id = db_id()
GROUP BY obj.name, obj.index_id , i.[name],i.[type_desc]
ORDER BY Buffered_Page_Count DESC
----------------------------------------------------------------------
Identify the queries consuming (or that will consume) the most memory
select requested_memory_kb, grant_time, cost, plan_handle, sql_handle
from sys.dm_exec_query_memory_grantsBest Regards,Uri Dimant SQL Server MVP,http://sqlblog.com/blogs/uri_dimant/
-
Friday, August 10, 2012 7:06 AMModerator
-
Friday, August 10, 2012 7:21 AMAnswerer
Tibor
Do you mean PLE is not the right counter (of many others) to determine the lack of memory....If not , what counters you are looking at to see if the instance suffers of luck of memory?
Best Regards,Uri Dimant SQL Server MVP,http://sqlblog.com/blogs/uri_dimant/
-
Friday, August 10, 2012 7:40 AM
4GB is not alot of memory assuming that the os has it's usual slice what remains is probably going to be around 1.xGB for the buffer pool. You can check exact size of your pool with http://gallery.technet.microsoft.com/scriptcenter/Page-Life-Expectancy-a6a51d5b
Anyway my point being that reading 1.xGB in to the buffer pool is not going to take that long at all. (There are several scripts on this thread that will show you what is in the cache which you can compare.)
The other options are pressures outside sql server such as windows asking for the memory back, or your virtualisation host.
Does the page life drop happen at a set time? is a job running? I would use perfmon counter over a period of time to determine exactly when the page life drops and see if you can correlate it back. I would also look at the memory usage of this virtualised environment from the host.
Jon
-
Friday, August 10, 2012 8:00 AMModerator
Uri,
I'm not saying that the value is always incorrect. It can be incorrect if you happen to grab the value at a bad time, which corresponds to what Josh is seeing. Such readings should be ignored, but the values that are "normal" as still as valid as the PLE is (which has been debated, but that is just a Bingle search away).
-
Friday, August 10, 2012 10:32 AM
Is PLE Drop coinciding with Backup, Mainly Update Statistics, DBCC CheckDB or Re-index jobs overnight.
I have seen this happen during nights when Database Maintenance tasks are running.
Vamshi SQL DBA(MCITP) My Blogs: http://sqlserver-dba.co.uk http://sql-developers.co.uk/
-
Friday, August 10, 2012 3:21 PM
Stephen, thanks, that's a great query (to have in any case!), it should let me tell *what* is being cleared, which should give me a hint as to who or why!
Otherwise, it's happening about every ten minutes, I'm running profiler and can't catch the command, it doesn't seem related to any job or anything about the VM environment. But of course it seems like it *must* be!
No other counter I'm watching spikes at that time, the read queue grows at about that time, but again it seems more of an effect and not the cause, ... though I still can't figure what or why, it's not an app that ever intentionally reads a lot of data.
... but I won't have access again to the system until sometime next week, if ever, so there will be some delay in trying this out!
Thanks all,
Josh

