IIS stops responding, service restart won't help, reboot needed.

  • Question

  • User1495177700 posted

     Hi everyone,

    I am currently trying to solve an issue with one of my servers. The server is running Windows 2003 Server R2 with SP2 and my issue concerns IIS. I am using this server as a file server and for our intranet. I run 4 sites on IIS: Sharepoint, Sharepoint administration, Testtrack and the server web administration.

    For the past 3 weeks, IIS has stopped responding about once a week. When it happens, nothing shows up in the logs, _nothing_. I tried restarting the websites, IIS, and my SQL databases; nothing works, so I work around the problem by rebooting the server.

    What changed:

    I installed Testtrack, a web-based bug-tracking application.
    I upgraded the Sharepoint database from the SQL desktop engine to the SQL std version 2000 (8.00.2039 SP4) and I turned on full-text indexing (which doesn't work).

    Needless to say, rebooting the server in the middle of the day while everyone is using the shared files is a major PITA.

    If you have any idea what is causing the problem, or any suggestion on what I could try, I'll be very happy to read your reply. If you need more information, just ask.

    Thanks in advance,

    Etienne

    Monday, August 13, 2007 3:37 PM

All replies

  • User2076344368 posted

    Are you getting anything in the http.sys error logs? I assume you're using application pools? More info on the http.sys logs below:

    http://www.microsoft.com/technet/prodtechnol/WindowsServer2003/Library/IIS/23b752a4-9daa-4695-b735-98941f6e65ec.mspx?mfr=true 

    The logs may give you more information about the hang/crash, so start there.

     
    Andy
     

    Tuesday, August 14, 2007 5:07 AM
  • User1073881637 posted

    What errors do you get? Anything in the event log? As the other poster said, check for anything in the http.sys error logs. If so, there could be a memory leak or just not enough memory to run your applications. The next time this happens, try creating a simple HTML page and see if it loads. Post any errors you are seeing; that will help us suggest more options.
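
    The static-page check can also be scripted so it is ready the next time the sites hang. This is only a rough Python sketch: the function name probe and the example URL in the comment are hypothetical, and it assumes Python 3 is available on some machine that can reach the server.

```python
import urllib.error
import urllib.request


def probe(url, timeout=5.0):
    """Request url once and return a short description of the outcome."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return f"HTTP {resp.status}"
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status (404, 500, 503, ...).
        return f"HTTP {exc.code}"
    except (urllib.error.URLError, OSError) as exc:
        # Nothing answered: connection refused, reset, or timed out.
        return f"no response: {exc}"


# Hypothetical usage; replace the URL with a plain HTML page on the affected site:
# print(probe("http://intranet-server/test.html"))
```

    An "HTTP 200" for the static page while the applications still fail would point at the applications rather than IIS itself; "no response" would suggest the connection never gets past http.sys.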

    Tuesday, August 14, 2007 7:13 AM
  • User1495177700 posted

    Thank you for your answers.

    I didn't know about http.sys logs. I'm going to look that up and see if anything shows there. Also, I don't know what application pools are; sorry for being a newbie, I guess I must start somewhere. (I am better at running Apache websites...)

    As for the event logs, really, nothing shows in them. I'm kind of an event log freak, so when something shows up, I look up the problem and solve it. So right now, everything in my log is informational, clean as a fresh install. I've just been to \systemroot\System32\LogFiles; is there something I should look for there?

    My best guess would be to wait until it stops responding again and try to run some tests.

    I'm not an experienced IIS user so I thank you in advance for your patience,

     
    Etienne
     

    Tuesday, August 14, 2007 10:31 AM
  • User1073881637 posted

    Sounds good.  Here is a useful KB article on registry entries for http.sys.  http://support.microsoft.com/default.aspx/kb/820129  The standard disclaimer about backing up your registry applies.  Try the new settings on a test box to make sure you understand them before implementing in production.  Good luck.

    Tuesday, August 14, 2007 12:17 PM
  • User1495177700 posted

     I might have found what the problem is. Well, maybe there are two problems:

    This morning my websites were down again... I needed to find out when they stopped responding.

    My httperr.log shows that something went wrong around 00:23 last night:

    2007-08-14 19:14:48 192.168.1.184 3440 192.168.1.15 80 - - - - - Timer_ConnectionIdle -
    2007-08-14 20:25:53 192.168.1.109 4422 192.168.1.15 80 - - - - - Timer_ConnectionIdle -
    2007-08-14 20:45:48 192.168.1.164 1592 192.168.1.15 80 - - - - - Timer_ConnectionIdle -
    2007-08-15 00:23:43 - - - - - - - - - 2_Connections_Refused -
    2007-08-15 00:34:33 - - - - - - - - - 1_Connections_Refused -
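
    To pin down when the refusals began without reading the log by eye, the reason field can be scanned with a short script. This is a minimal Python sketch, assuming the reason is the second-to-last whitespace-separated field, as it is in the excerpt above; the function name first_refusal and the SAMPLE constant are made up for illustration.

```python
from datetime import datetime

# Lines copied from the httperr.log excerpt above.
SAMPLE = """\
2007-08-14 19:14:48 192.168.1.184 3440 192.168.1.15 80 - - - - - Timer_ConnectionIdle -
2007-08-14 20:25:53 192.168.1.109 4422 192.168.1.15 80 - - - - - Timer_ConnectionIdle -
2007-08-14 20:45:48 192.168.1.164 1592 192.168.1.15 80 - - - - - Timer_ConnectionIdle -
2007-08-15 00:23:43 - - - - - - - - - 2_Connections_Refused -
2007-08-15 00:34:33 - - - - - - - - - 1_Connections_Refused -
"""


def first_refusal(log_text):
    """Return the timestamp of the first Connections_Refused entry, or None."""
    for line in log_text.splitlines():
        fields = line.split()
        # Skip blank lines, header lines starting with '#', and short lines.
        if len(fields) < 3 or line.startswith("#"):
            continue
        reason = fields[-2]  # assumed position of the reason field
        if reason.endswith("Connections_Refused"):
            stamp = fields[0] + " " + fields[1]
            return datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
    return None


print(first_refusal(SAMPLE))  # 2007-08-15 00:23:43
```

    The timestamp it returns is the one to compare against the event log entries.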

    So I went to my event log to see if anything else happened at the same time...

    From 00:05 to 00:25, there were 5 occurrences of event ID 20, Volsnap: The shadow copies of volume I: were aborted because of a failed free space computation.

    Hmm... my website isn't on I:, but I fixed the problem using kb833167.

    What else... Event ID 4000 from smtpsvc at 00:29, close to the time the server started to refuse connections: Message delivery to the remote domain 'srv.domain' failed for the following reason: Unable to deliver the message because the destination address was misconfigured as a mail loop.

    I fixed that by making it point to my Exchange server.

    The last event is event ID 5 from Active Server Pages: Error: The Template Persistent Cache initialization failed for Application Pool 'DefaultAppPool' because of the following error: Could not create a Disk Cache Sub-directory for the Application Pool. The data may have additional error codes..

    I followed kb842493 to fix that error. 

    After all of that, I restarted the World Wide Web Publishing Service, but the server is still refusing connections. I will reboot the server and see if it happens again after my changes.

    If any of this gives you a clue about what the problem might be, please let me know.

    Thanks!

    Etienne
     

     

    Wednesday, August 15, 2007 11:05 AM
  • User1495177700 posted

    So... new Monday, new reboot. I think I might have figured out the problem.

    Can it be that my SQL Server has a memory leak? I can see the private bytes going up and up... At some point I have event 2019: "The server was unable to allocate from the system nonpaged pool because the pool was empty" showing up in my logs...

    This is on the Sharepoint DB only. My other instance of SQL is working just fine.

    Does anyone know how to fix this?

     
    Etienne

     

    Monday, August 20, 2007 2:46 PM
  • User273591096 posted

    I think you are on the right track with the nonpaged pool memory.  See this post from David Wang for help finding what is using up your kernel memory. http://blogs.msdn.com/david.wang/archive/2005/09/21/HOWTO-Diagnose-IIS6-failing-to-accept-connections-due-to-Connections-Refused.aspx

    Tuesday, September 4, 2007 10:14 AM
  • User1495177700 posted

     Hey Deshazer,

     Thanks for the link, it was a good read. I think that might be right.

     I moved the Testtrack server to another machine with a 180-day trial of Server 2003. It seems that was the software that made IIS go down. I rebooted the server for the last time on August 24th. It's been running fine for 10 days now.

    I'll let you know if I still have problems and I'll try poolmon.exe.

    Thank you all for your help,

    Etienne

     

    Tuesday, September 4, 2007 10:32 AM
  • User1073881637 posted

    I highly doubt SQL Server has a memory leak.  It sounds like you're on the right 'TestTrack'.  Sorry for the pun.   :)

    Tuesday, September 4, 2007 8:38 PM