Friday, July 23, 2010 3:14 PM
Windows 2003 server Standard edition, 64 bit
SQL Server 2005 Developer and Enterprise Edition, 32 bit
Servers are configured with 8GB or 32GB of memory
(Eight CPUs / cores / whatever, plenty of hard drive space, network latency irrelevant)
After days of use, CLR memory usage appears to degrade or fragment to the point where “large” runs that had previously run successfully now fail with “out of memory” errors. Restarting the server fixes this, but only for a time. Why, and how do we stop this from happening?
An overview of what we’re doing:
- Web site app calls SQL Server stored procedure “A” in one of several possible databases
- Procedure A calls CLR procedure “B” (assembly created with PERMISSION_SET = SAFE). B is stored in a “library” database on the same SQL instance, i.e. there’s only the one copy
- CLR procedure B calls stored procedure “C” (back in the first database, as specified via parameter)
- Procedure C runs and returns three data sets to B. The second data set ranges from large to very large (over 60MB in our extreme cases)
- CLR procedure B performs some severe mathematical processing, returns a single data set to procedure A
- Procedure A slices, dices, and stores the data
- There could be multiple simultaneous calls to this routine, though in development and in to-date limited production release this is rare. (That is, contention may be an issue, but probably isn't yet.)
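The T-SQL wiring for the steps above might look roughly like the sketch below. To be clear, every object, database, and assembly name here is hypothetical and purely illustrative of the A → B → C call pattern, not our actual code:

```sql
-- Library database holds the single shared copy of the CLR assembly.
-- SAFE permission set: no external resource access allowed.
USE LibraryDb;
GO
CREATE ASSEMBLY MathProcessing
FROM 'C:\deploy\MathProcessing.dll'
WITH PERMISSION_SET = SAFE;
GO
-- CLR procedure B: takes (via parameter) the database whose procedure C it should call
CREATE PROCEDURE dbo.ProcB
    @SourceDb SYSNAME
AS EXTERNAL NAME MathProcessing.[StoredProcedures].ProcB;
GO

-- In each application database, procedure A calls the one shared copy of B
USE AppDb1;
GO
CREATE PROCEDURE dbo.ProcA
AS
BEGIN
    -- B calls back into @SourceDb's procedure C, does the heavy math,
    -- and returns a single data set for A to slice, dice, and store
    EXEC LibraryDb.dbo.ProcB @SourceDb = N'AppDb1';
END;
```
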
We have observed the following undesirable behavior:
- With “small” data sets returned to B from C, no problem
- At some point as the sets get larger, instead of results we get errors like so: “.NET Framework execution was aborted by escalation policy because of out of memory.”
- When we stop and restart SQL Server, the problem goes away and we are able to process our larger sets
- But only for a while. After time (days, not hours) the errors start to crop up again.
Research on the web turned up references to sys.dm_os_memory_clerks. Use and review produced the following query:

    -- The amount of memory used by the CLR?
    SELECT virtual_memory_reserved_kb AS Reserved_kb,
           single_pages_kb + multi_pages_kb + virtual_memory_committed_kb AS TotalCLR_kb
    FROM sys.dm_os_memory_clerks
    WHERE type = 'MEMORYCLERK_SQLCLR'
Observations over time show that, on our SQL 32 bit instances (remember, OS is 64 bit and machines have well over 4G of memory):
- First off, though specific numbers vary, they are generally consistent. The following values are reasonably accurate
- Two rows are always returned, with one always set to 0. We ignore that row.
- After restart, calls with “small” data sets (maybe as much as 1M of data generated by procedure C) produce results of Reserved_kb in the 100M range, and Total_CLR_kb around 60M. Everything works.
- After restart, calls with “large” data sets (60M and over) get those numbers to jump, with Reserved_kb to well over 200M and TotalCLR_kb to 60, 80, or even 100M. Everything works.
- However, with the passage of days, the “out of memory” errors start to appear again for the large sets, and the results of the memory clerks query fall to about Reserved_kb 120M-160M and TotalCLR_kb 60M-70M (not enough to hold the data for the large sets, let alone to handle subsequent processing)
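Rather than eyeballing these values day by day, one option is to snapshot the clerk numbers into a table on a schedule and watch the trend. A sketch, using the same DMV as above; the table and column names are made up:

```sql
-- Hypothetical history table for trending CLR memory-clerk values over days
CREATE TABLE dbo.ClrMemoryHistory
(
    sampled_at   DATETIME NOT NULL DEFAULT GETDATE(),
    reserved_kb  BIGINT   NOT NULL,
    total_clr_kb BIGINT   NOT NULL
);
GO
-- Run from a SQL Agent job every few hours
INSERT INTO dbo.ClrMemoryHistory (reserved_kb, total_clr_kb)
SELECT virtual_memory_reserved_kb,
       single_pages_kb + multi_pages_kb + virtual_memory_committed_kb
FROM sys.dm_os_memory_clerks
WHERE type = 'MEMORYCLERK_SQLCLR'
  AND single_pages_kb + multi_pages_kb + virtual_memory_committed_kb > 0; -- skip the all-zero row
```
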
I have used the following query to (I believe) determine the largest free block of virtual address space available:

    -- Memory blocks available, and largest single block
    ;WITH VAS_Summary AS
    (
        SELECT
             Size = VAS_Dump.Size
            ,Reserved = SUM( CASE (CONVERT(INT, VAS_Dump.Base) ^ 0) WHEN 0 THEN 0 ELSE 1 END )
            ,Free = SUM( CASE (CONVERT(INT, VAS_Dump.Base) ^ 0) WHEN 0 THEN 1 ELSE 0 END )
        FROM
        (
            SELECT CONVERT(VARBINARY, SUM(region_size_in_bytes)) AS [Size]
                  ,region_allocation_base_address AS [Base]
            FROM sys.dm_os_virtual_address_dump
            WHERE region_allocation_base_address <> 0x0
            GROUP BY region_allocation_base_address
            UNION
            SELECT CONVERT(VARBINARY, region_size_in_bytes) AS [Size]
                  ,region_allocation_base_address AS [Base]
            FROM sys.dm_os_virtual_address_dump
            WHERE region_allocation_base_address = 0x0
        ) AS VAS_Dump
        GROUP BY Size
    )
    SELECT SUM(CONVERT(BIGINT, Size) * Free) / 1024 AS [Total avail mem, KB]
          ,CAST(MAX(Size) AS BIGINT) / 1024 AS [Max free size, KB]
    FROM VAS_Summary
    WHERE Free <> 0
On non-working systems this appears to be very low, at 8M or so; on working systems, it is large, at 40M and up.
So my questions are: why are we having these problems? There’s enough memory, and everything works initially, but over time things degrade and cease to function properly. Because we do start out with “enough memory”, we’re not fiddling with the -g startup flag.

Is “this part” of the memory being managed by SQL Server getting fragmented over time, blocking SQL Server from allocating large chunks and foiling our attempts at “large operations” with CLR code? Is there anything we can do to our current setup to prevent this from happening? (Periodically restarting SQL Server = “turning it off and on again” = a bozo solution, please don’t suggest it.)

Preliminary tests indicate that this does not occur in 64-bit editions (the memory clerk queries say there’s 6G of memory for the CLR), so we may have to migrate… but this still leaves the question: what are the limitations we’re hitting in 32-bit SQL Server regarding availability and use of memory for the CLR?
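For what it’s worth, the bitness and address-space limits in play can be read directly from the instance. A small sketch (sys.dm_os_sys_info exists in SQL Server 2005; on a 32-bit instance, virtual_memory_in_bytes should report only a few GB of address space regardless of how much RAM the box has):

```sql
SELECT @@VERSION;  -- shows edition and 32/64-bit

-- Physical RAM vs. the virtual address space the process can actually see
SELECT cpu_count,
       physical_memory_in_bytes / 1048576 AS physical_memory_mb,
       virtual_memory_in_bytes  / 1048576 AS virtual_address_space_mb
FROM sys.dm_os_sys_info;
```
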
Wednesday, July 28, 2010 5:50 AM
The main problem is likely your CLR implementation: perhaps a memory leak, or holding too many objects in memory at one time. You should check and optimize your CLR code, and consider whether a CLR implementation is really necessary, or whether it could also be implemented in T-SQL.
Olaf Helper ----------- * cogito ergo sum * errare humanum est * quote erat demonstrandum * Wenn ich denke, ist das ein Fehler und das beweise ich täglich http://olafhelper.over-blog.de
Saturday, July 31, 2010 12:39 AM
This appears to be a memory leak issue. However, are you using the context connection in your CLR stored procedure? If so, can you use a regular connection object instead and see if this makes any difference?
Thursday, January 19, 2012 11:41 PM
FYI, it ultimately proved to be a combination of overly large data sets and memory fragmentation. We solved the problem by upgrading (sidegrading?) to 64-bit, in which CLR memory allocation works a bit differently. The problem has not recurred since then.
- Marked As Answer by Bob Beauchemin (MVP, Moderator) Friday, January 20, 2012 5:02 PM