CLR “Out of Memory” problem occurs eventually/inevitably
-
Friday, July 23, 2010 3:14 PM
Windows 2003 server Standard edition, 64 bit
SQL Server 2005 Developer and Enterprise Edition, 32 bit
Servers are configured with 8GB or 32GB of memory
(Eight CPUs / cores / whatever, plenty of hard drive space, network latency irrelevant)
Summary:
After days of use, CLR memory usage appears to degrade or fragment to a point where “large” runs that had run succesfully now fail with “out of memory” issues. Resetting the server fixes this, but only for a time. Why, and how do we stop this from happening?
Details:
An overview of what we’re doing:
- Web site app calls SQL Server stored procedure “A” in one of several possible databases
- Procedure A calls CRL procedure “B” (Assembly created with CLR permission_set = “safe”). B is stored in a “library” database on the same SQL instance, i.e. there’s only the one copy
- CLR procedure B calls stored procedure “C” (back in the first database, as specified via parameter)
- Procedure C runs and returns three data sets to B. The second data set ranges from large to very large (over 60MB in our extreme cases)
- CLR procedure B performs some severe mathematical processing, returns a single data set to procedure A
- Procedure A slices, dices, and stores the data
- There could be multiple simultaneous calls to this routine, though in development and in to-date limited production release this is rare. (That is, contention may be an issue, but probably isn't yet.)
We have observed the following undesirable behavior:
- With “small” data sets returned to B from C, no problem
- At some point as the sets get larger, instead of results we get errors like so: “.NET Framework execution was aborted by escalation policy because of out of memory.”
- When we stop and restart SQL Server, the problem goes away and we are able to process our larger sets
- But only for a while. After time (days, not hours) the errors start to crop up again.
Research on the web dug up references to the use of sys.dm_os_memory_clerks. Use and review produced the following query:
-- The amount of memory used by the CLR?
SELECT
single_pages_kb + multi_pages_kb + virtual_memory_committed_kb TotalCLR_kb
,virtual_memory_reserved_kb Reserved_kb
from sys.dm_os_memory_clerks
where type = 'MEMORYCLERK_SQLCLR'
Observations over time show that, on our SQL 32 bit instances (remember, OS is 64 bit and machines have well over 4G of memory):
- First off, though specific numbers vary, they are generally consistant. The following values are reasonably accurate
- Two rows are always returned, with one always set to 0. We ignore that row.
- After restart, calls with “small” data sets (maybe as much as 1M of data generated by procedure C) produce results of Reserved_kb in the 100M range, and Total_CLR_kb around 60M. Everything works.
- After restart, calls with “large” data sets (60M and over) get those numbers to jump, with Reserved_kb to well over 200M and TotalCLR_kb to 60, 80, or even 100M. Everything works.
- However, with the passage of days, the “out of memory” errors start to appear again for the large sets, and the results of the memory clerks quey fall to about: Reserved_kb 120M-160M, and TotalCLR_kb to 60M-70M (not enough to hold the data for the large sets, let alone to handle subsequent processing)
I have used the following query to (?) determine the largest free block of memory available:
-- Memory blocks available, and largest single block
;WITH VAS_Summary AS
(
SELECT
Size = VAS_Dump.Size
,Reserved = SUM( CASE(CONVERT(INT, VAS_Dump.Base)^0) WHEN 0 THEN 0 ELSE 1 END )
,Free = SUM( CASE(CONVERT(INT, VAS_Dump.Base)^0) WHEN 0 THEN 1 ELSE 0 END )
FROM (
SELECT
CONVERT(VARBINARY, SUM(region_size_in_bytes)) [Size]
,region_allocation_base_address [Base]
FROM sys.dm_os_virtual_address_dump
WHERE region_allocation_base_address <> 0x0
GROUP BY region_allocation_base_address
UNION SELECT
CONVERT(VARBINARY, region_size_in_bytes) [Size]
,region_allocation_base_address [Base]
FROM sys.dm_os_virtual_address_dump
WHERE region_allocation_base_address = 0x0
) AS VAS_Dump
GROUP BY Size
)
SELECT
SUM(CONVERT(BIGINT, Size) * Free) / 1024 AS [Total avail mem, KB]
,CAST(MAX(Size) AS BIGINT) / 1024 AS [Max free size, KB]
FROM VAS_Summary
WHERE Free <> 0
On non-working systems this appears to be very low, at 8M or so; on working systems, it is large, at 40M and up.
So my questions are: why are we having these problems? There’s enough memory, and everything works initially, but over time things degrade and cease to function properly. Because we do start out with “enough memory", we’re not fiddling with the -g startup flag. Is “this part" of the memory being managed by SQL getting fragmented over time, blocking SQL from allocating large chunks and foiling our attempts at “large operations” with CLR code? Is there anything we can do to our current setup to prevent this from happening? (Periodically restarting SQL server = “turning it off and on again” = a bozo solution, please don’t suggest it.) Preliminary tests indicate that this will not occur in 64bit editions (the memory cleck queries say there’s 6G of memory for the CLR), so we may have to migrate… but this still leaves the question, what are the limitations we're hitting in 32bit SQL regarding availability and use of memory for the CLR?
Philip
All Replies
-
Wednesday, July 28, 2010 5:50 AM
Hello Philip,
The main problem will be you CLR implementation, maybe memory leakage, holding to much objects at one time. You should check & optimize your CLR code. And consider how far a CLR implementation is necessary or if it also could be implemented in T-SQL.
See e.g.:
Various memory errors are logged to SQL Server error log when using SQL CLR objects
Memory Usage in SQL CLR
Olaf Helper ----------- * cogito ergo sum * errare humanum est * quote erat demonstrandum * Wenn ich denke, ist das ein Fehler und das beweise ich täglich http://olafhelper.over-blog.de -
Saturday, July 31, 2010 12:39 AM
Philip,
This appears to be a memory leak issue. However, are you using context connection in your CLR stored procedure? If yes, can you use regular connection object and see if this makes any difference.
Jay [MSFT]
-
Thursday, January 19, 2012 11:41 PM
FYI, it ultimately proved to be a combination of overly-large data sets and memory fragmentation. We solved the problem by upgrading (sidegrading?) to 64-bit, in which CLR memory allocation works a bit differently. The problem has not recurred since then.
Philip- Marked As Answer by Bob BeaucheminMVP, Moderator Friday, January 20, 2012 5:02 PM

