More worker processes (w3wp) than application pools

Question
-
User-269706872 posted
Hello
We have configured IIS 6 with 5 application pools and 5 web sites.
Only one website is enabled and serves traffic.
The web site is load balanced in a web farm.
Today four servers crashed (stopped responding). I could not even log in through RDP, although I could use pslist to see which processes were running. I noticed that there were 30-40 w3wp processes running on the servers that crashed, while the ones that didn't crash had 1 w3wp (as expected, since only one application pool and one web site are in use). The application pool is configured to create a maximum of 1 worker process. Each worker process used about 100 MB of memory. The log files say that the system ran out of resources (read: memory).
After reboot, all seems normal.
What could have happened? Any ideas are welcome! Why did it create a lot of w3wp processes without removing the old ones?
We are running a .NET 2.0 application on Windows Server 2003 Web Edition.
Monday, August 11, 2008 1:52 PM
All replies
-
User1489974807 posted
No web garden was set up for the pool either, right? (I am sensing that from your post, but want to be sure.)
Monday, August 11, 2008 2:24 PM -
User-269706872 posted
Correct. We don't use a web garden.
Monday, August 11, 2008 2:26 PM -
User1489974807 posted
What type of application are you running? Is it .NET or ASP? If .NET, what framework? Is the app spawning threads of some form? Right now, if you RDP into one of those boxes and run cscript iisapp.vbs does it give you the correct number of pools?
Monday, August 11, 2008 3:21 PM -
User-2064283741 posted
Yeah, I was going to say use iisapp to see which application pools are attached to the worker processes. It may be that one app pool is generating all these worker processes, and then you can direct your efforts in that area.
I would also turn on the event log options for app pool recycling to see what is stopping and restarting. It may be that an old worker process is somehow not closing while a new one is started on a recycle, for example when a memory limit is reached.
Turn on eventlogs
"Additionally you can log when the app pools recycle. There are many reasons for a recycle and by default these are not recorded in the event viewer (very little is recorded by default for us admins). It is useful to have this for all events that occur.
Complete list:
AppPoolRecycleTime
AppPoolRecycleRequests
AppPoolRecycleSchedule
AppPoolRecycleMemory
AppPoolRecycleIsapiUnhealthy
AppPoolRecycleOnDemand
AppPoolRecycleConfigChange
AppPoolRecyclePrivateMemory
To turn them all on (for DefaultAppPool in this example):
cscript adsutil.vbs Set w3svc/AppPools/DefaultAppPool/LogEventOnRecycle 255
"
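The value 255 in the adsutil.vbs command above is just one bit per recycle reason, all added together. A minimal Python sketch of that arithmetic follows. The bit values (1 for AppPoolRecycleTime up to 128 for AppPoolRecyclePrivateMemory) are what I understand the LogEventOnRecycle metabase property to use, so verify them against the metabase reference; the helper names are made up for illustration.

```python
# Assumed bit values for the IIS 6 LogEventOnRecycle metabase property.
# Verify against the metabase property reference before relying on them.
RECYCLE_FLAGS = {
    "AppPoolRecycleTime": 0x01,
    "AppPoolRecycleRequests": 0x02,
    "AppPoolRecycleSchedule": 0x04,
    "AppPoolRecycleMemory": 0x08,
    "AppPoolRecycleIsapiUnhealthy": 0x10,
    "AppPoolRecycleOnDemand": 0x20,
    "AppPoolRecycleConfigChange": 0x40,
    "AppPoolRecyclePrivateMemory": 0x80,
}


def log_event_mask(names):
    """Combine the named recycle reasons into one LogEventOnRecycle value."""
    mask = 0
    for name in names:
        mask |= RECYCLE_FLAGS[name]
    return mask


def decode_mask(mask):
    """List the recycle reasons that are switched on in a given value."""
    return [name for name, bit in RECYCLE_FLAGS.items() if mask & bit]


# Every reason enabled, which is the value passed to adsutil.vbs above.
print(log_event_mask(RECYCLE_FLAGS))
# Only the memory-related reasons.
print(log_event_mask(["AppPoolRecycleMemory", "AppPoolRecyclePrivateMemory"]))
```

A smaller value can be passed to adsutil.vbs in the same way if only some of the reasons should be logged.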
Monday, August 11, 2008 4:26 PM -
User-269706872 posted
I am using ASP.NET 2.0.
We are not using any Thread classes in the web application, so no, we are not creating any threads other than the ones created by ASP.NET.
The output from that script gives me two worker processes: the default application pool, and the application pool that is dedicated to my web site.
This is what PSList (http://technet.microsoft.com/en-us/sysinternals/bb896682.aspx) returned on one server during the crash.
Name    Pid  Pri  Thd   Hnd    Priv     CPU Time  Elapsed Time
w3wp   3852    8   92  2523  393808  3:22:59.515  38:55:59.740
w3wp    336    8  262  1528  150456  0:00:18.031  19:33:18.033
w3wp   4564    8  264  1550  148940  0:00:17.859  19:25:00.786
w3wp   3432    8  263  1684  131564  0:00:18.375  19:16:48.539
w3wp   6752    8  263  1551  126336  0:00:15.750  19:08:27.058
w3wp   6368    8  266  1584  129640  0:00:18.843  18:58:24.077
w3wp   9308    8  269  1555  139292  0:00:18.359  18:48:21.706
w3wp   9724    8  263  1541  119228  0:00:16.421  18:38:34.851
w3wp  11788    8  261  1557  138836  0:00:18.687  18:28:18.589
w3wp  12660    8  264  1606  132584  0:00:15.875  18:16:35.636
w3wp  11720    8  259  1513  122212  0:00:13.500  18:04:23.545
w3wp  15288    8  264  1541  117816  0:00:14.031  17:48:07.332
w3wp  15552    8  268  1588   86480  0:00:15.453  17:24:03.685
w3wp  18212    8  264  1591  143272  0:00:14.812  16:54:33.149
w3wp  17164    8  261  1449  106416  0:00:11.312  16:12:00.572
w3wp  20352    8   81  1345  105340  0:00:05.421  15:16:58.249
w3wp  19856    8  259  1471  122132  0:00:15.812  14:57:38.913
w3wp  19100    8  259  1412   98540  0:00:12.046  13:46:06.722
w3wp  20704    8  258  1424   78972  0:00:13.500  12:10:35.899
w3wp  22668    8  259  1518  117452  0:00:14.578  11:04:33.956
w3wp  24048    8  260  1505  117876  0:00:13.578  10:10:56.336
w3wp  26540    8  259  1479  159228  0:00:13.250   9:32:55.756
w3wp  27060    8  259  1468  117140  0:00:16.437   8:56:52.326
w3wp  27396    8  258  1427  119840  0:00:13.828   8:24:50.573
w3wp  29892    8  261  1505  140564  0:00:16.406   7:54:15.897
w3wp  28804    8  261  1577  139116  0:00:17.656   7:27:42.876
w3wp  31808    8  260  1528  118300  0:00:13.640   7:07:38.306
w3wp  29500    8  260  1535  118828  0:00:13.484   6:41:36.863
w3wp  34536    8  265  1537  110140  0:00:17.031   6:14:34.623
w3wp  34964    8  258  1510  113788  0:00:20.140   5:50:30.445
w3wp  35116    8  261  1519  141496  0:00:13.921   5:21:56.471
w3wp  37564    8  259  1557  136548  0:00:14.093   4:52:55.404
w3wp  37960    8  260  1512  118824  0:00:13.656   4:26:52.008
w3wp  39872    8  260  1494  135688  0:00:14.203   4:01:41.627
w3wp  36708    8  263  1549  126352  0:00:14.093   3:36:31.668
w3wp  42552    8  259  1504  154900  0:00:16.796   3:13:23.552
w3wp  43048    8  260  1525  104928  0:00:14.890   2:47:51.265
w3wp  44936    8  263  1633  108700  0:00:15.625   2:28:46.585
w3wp  45764    8  262  1621  116700  0:00:15.000   1:55:01.082
w3wp  46924    8  263  1620  143488  0:00:17.859   1:32:21.247
w3wp  48744    8    7  1449   46404  0:00:02.968   1:14:41.160
w3wp  41564    8    4   242    6716  0:00:00.140   1:12:40.755
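For anyone who wants to total the memory in a listing like this rather than eyeball it, here is a rough Python sketch. It assumes each row has exactly the eight whitespace-separated fields shown above and that the Priv column is in KB; the function names and the three sample rows (copied from the listing) are only for illustration.

```python
def parse_pslist_row(line):
    """Split one pslist row into named fields (assumes eight columns)."""
    name, pid, pri, thd, hnd, priv, cpu, elapsed = line.split()
    return {
        "name": name,
        "pid": int(pid),
        "priority": int(pri),
        "threads": int(thd),
        "handles": int(hnd),
        "priv_kb": int(priv),  # assumed to be kilobytes
        "cpu_time": cpu,
        "elapsed": elapsed,
    }


def total_private_mb(rows):
    """Sum the Priv column of all rows and convert KB to MB."""
    return sum(row["priv_kb"] for row in rows) / 1024


# Three rows copied from the listing above; paste the full output to total it.
sample = """\
w3wp 3852 8 92 2523 393808 3:22:59.515 38:55:59.740
w3wp 336 8 262 1528 150456 0:00:18.031 19:33:18.033
w3wp 41564 8 4 242 6716 0:00:00.140 1:12:40.755
"""

rows = [parse_pslist_row(line) for line in sample.splitlines() if line.strip()]
print(len(rows), "processes,", round(total_private_mb(rows), 1), "MB private")
```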
Monday, August 11, 2008 4:29 PM -
User-269706872 posted
I will try to turn on the event log options. Thanks!
Do you have any ideas for further research if it shows that one worker process isn't closing correctly? What could I do about it?
Monday, August 11, 2008 4:44 PM -
User-2064283741 posted
You will need iisapp to get the real app pool name for each worker process, but looking at the process list it seems that they all belong to one application pool. There are many worker processes with 250-270 threads, so I expect they are all the same application.
The others are low and 4 others maybe they are the other 4 app pools making 5 in total or the
Maybe you can look at your site now (I presume it is not a problem at the moment), see which worker process normally runs with 250-270 threads, and look it up in iisapp to get the real app pool name.
Also, looking at the elapsed time of the worker processes with 250-270 threads, they seem to be spawned about every 30 minutes. To me that implies regular spawning, so that value must be defined somewhere.
Maybe it is the idle timeout for each app pool (default 20 minutes, I think, so it may only be related). Maybe...
Again the app pool recycling event counters I mentioned above may well help identify this.
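The spacing between process starts can be worked out from the Elapsed Time column instead of estimated by eye. Below is a small Python sketch, assuming elapsed times in the h:mm:ss.mmm form pslist printed above; the function names are made up. The sample uses the four oldest high-thread rows from the listing, and pasting in the whole column shows how the gaps change over the day.

```python
def elapsed_seconds(text):
    """Convert an 'h:mm:ss.mmm' elapsed time into seconds."""
    hours, minutes, seconds = text.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def spawn_gaps_minutes(elapsed_times):
    """Minutes between consecutive process starts, oldest process first."""
    ages = sorted((elapsed_seconds(t) for t in elapsed_times), reverse=True)
    return [round((older - newer) / 60, 1) for older, newer in zip(ages, ages[1:])]


# Elapsed times of four consecutive rows copied from the pslist output.
sample = ["19:33:18.033", "19:25:00.786", "19:16:48.539", "19:08:27.058"]
print(spawn_gaps_minutes(sample))
```

If the gaps line up with a configured interval (ping, idle timeout, a scheduled recycle), that would point at the setting responsible.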
Monday, August 11, 2008 4:51 PM -
User-2064283741 posted
Another thought: you are not using isolation mode or orphaning worker processes, are you? Orphaning would give the behaviour of not closing down a worker process while a new one is created. That is what happens when it is enabled for debugging.
http://www.microsoft.com/technet/prodtechnol/WindowsServer2003/Library/IIS/286efc79-e1d8-4078-9ba4-97d569b031ff.mspx?mfr=true
Monday, August 11, 2008 8:15 PM -
User989702501 posted
Looks like orphaned worker processes to me... but I would love to see the output of iisapp.vbs :)
Tuesday, August 12, 2008 12:39 AM -
User-269706872 posted
No, we are not using isolation mode.
I don't know how to enable orphaning, so no, we don't use that either.
The result from the command:
cscript adsutil.vbs get W3SVC/AppPools/OrphanWorkerProcess
is "OrphanWorkerProcess : (BOOLEAN) False", so I guess that proves it.
In reply to qbernard. The output from
cscript c:\windows\system32\iisapp.vbs
is
W3WP.exe PID: 2692 AppPoolId: custom_app_pool
W3WP.exe PID: 4004 AppPoolId: DefaultAppPool
One problem is how I can run this when I cannot log in to the server with RDP. Is it possible to execute this command remotely?
The custom_app_pool has the default application pool settings, except that we recycle the worker process at a set time instead of the default of every 1740 minutes. The settings are configured as follows:
- Recycle worker process in minutes
- Recycle
X recycle worker processes at the following times: 03:15
- maximum virtual memory
- maximum used memory
X shutdown worker processes after being idle for: 20 minutes
X limit the kernel request queue: 1000
- enable CPU monitoring
Maximum number of worker processes: 1
X ping worker process every: 30 seconds
X enable rapid-fail protection
failures: 5
time period: 5
Worker process must startup within: 90 seconds
Worker process must shutdown within: 90 seconds
Identity: configurable: domain\custom_app_pool_user
The custom_app_pool_user is a member of the local IIS_WPG group.
Tuesday, August 12, 2008 2:07 AM -
User989702501 posted
I too wish I could run iis*.vbs remotely... there's some tweak I saw long ago, I'll have to dig it out :)
Next, per your latest output only 2 app pools are running, so I can't tell where those ghost w3wp.exe processes belong. Do you have a lot of event log errors related to w3wp? I'm sure they will list the process id, the app pool name and more detail. Are you able to find them?
Tuesday, August 12, 2008 2:20 AM -
User-269706872 posted
Yes, there are a lot of errors in the logs.
It all starts with a lot of
A process serving application pool 'custom_app_pool' failed to respond to a ping. The process id was '32600'.
Then I have some
A process serving application pool 'custom_app_pool' exceeded time limits during shut down. The process id was '37448'.
Application pool 'custom_app_pool' is being automatically disabled due to a series of failures in the process(es) serving that application pool.
Application pool 'DefaultAppPool' is being automatically disabled due to a series of failures in the process(es) serving that application pool.
I also had some deadlocks
ISAPI 'C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727\aspnet_isapi.dll'
reported itself as unhealthy for the following reason: 'Deadlock detected'.
But we have set the processModel in machine.config to autoConfig, and therefore we follow the guidelines in: http://support.microsoft.com/?id=821268
I found this http://www.devnewsgroups.net/group/microsoft.public.dotnet.framework/topic62051.aspx which seems to be the same problem that we have.
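To match these events against a process listing, the pool name and process id can be pulled out of the message text. Here is a rough Python sketch that handles only the two message wordings quoted above; the regular expression and function name are my own, and other event messages are simply skipped.

```python
import re
from collections import Counter

# Matches only the two worker-process messages quoted in this post.
WP_EVENT = re.compile(
    r"A process serving application pool '(?P<pool>[^']+)' "
    r"(?P<what>failed to respond to a ping|exceeded time limits during shut down)\. "
    r"The process id was '(?P<pid>\d+)'\."
)


def parse_wp_events(lines):
    """Return (pool, pid, what) for every line that matches WP_EVENT."""
    events = []
    for line in lines:
        match = WP_EVENT.search(line)
        if match:
            events.append((match["pool"], int(match["pid"]), match["what"]))
    return events


messages = [
    "A process serving application pool 'custom_app_pool' failed to respond to a ping. The process id was '32600'.",
    "A process serving application pool 'custom_app_pool' exceeded time limits during shut down. The process id was '37448'.",
    "Application pool 'custom_app_pool' is being automatically disabled due to a series of failures in the process(es) serving that application pool.",
]

events = parse_wp_events(messages)
for pool, pid, what in events:
    print(pool, pid, what)
# Count of matching events per application pool.
print(Counter(pool for pool, _pid, _what in events))
```

The process ids collected this way can then be compared with the Pid column from pslist to see which pool the leftover w3wp processes came from.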
Tuesday, August 12, 2008 3:06 AM -
User989702501 posted
OK, so the ghost w3wp.exe processes are definitely from your two app pools. It could be that IIS keeps firing off new w3wp.exe processes while still trying to kill the bad ones, even with orphaning of worker processes disabled, and it will be very hard to reproduce.
The next thing I would suggest is to get DebugDiag and debug the app if the crash happens again. This should be app related.
Tuesday, August 12, 2008 3:44 AM -
User-2064283741 posted
Patrikc,
Those app pool settings look like the defaults (apart from the recycle by time).
I still think you are looking at one problem app pool rather than the other. How many threads do the (now correctly working) worker processes run by default? Cross-reference this with iisapp.
Hmm, running iisapp remotely is not possible as it stands, although it could be made so with a few tweaks to the script, probably adding something like:
    Set providerObj = GetObject("winmgmts://MachineName/root/MicrosoftIISv2")
    Set appPool = providerObj.Get("IIsApplicationPool='w3svc/AppPools/DefaultAppPool'")
I wonder if I could do that... it would be handy... if only I had more time.
Tuesday, August 12, 2008 6:39 AM