Server hanging randomly

JessB

I have a server running Plesk 9.5.2 on Debian and every so often it will become completely unresponsive on all protocols except ping. It can happen anywhere from hours to weeks after the server boots and I haven't been able to track down the cause.

I devised a way to capture the output of top just before the server stopped completely, available to view here http://pastebin.com/1Fd0v7nH.
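For anyone wanting to do something similar, here is a minimal sketch of one way to keep a rolling record of top output so the last snapshot before a hang survives on disk. It is not the script used for the paste above; the function names, the log path and the 60-second interval are all made up for illustration, and it assumes a procps-style top that supports batch mode (`top -b -n 1`).

```python
import os
import subprocess
import time
from datetime import datetime


def append_snapshot(log_path, command=("top", "-b", "-n", "1")):
    """Run `command` once and append its output to log_path with a timestamp.

    The default command is an assumption: procps top in batch mode,
    one iteration. Any command that prints to stdout works.
    """
    result = subprocess.run(command, capture_output=True, text=True, timeout=30)
    with open(log_path, "a") as log:
        log.write(f"=== {datetime.now().isoformat()} ===\n")
        log.write(result.stdout)
        # Push the data to disk right away so the last snapshot is not
        # lost in a buffer if the machine stops responding.
        log.flush()
        os.fsync(log.fileno())
    return result.stdout


def watch(log_path, interval_seconds=60):
    """Take a snapshot every interval_seconds until interrupted."""
    while True:
        append_snapshot(log_path)
        time.sleep(interval_seconds)
```

Calling `watch("/var/log/top-snapshots.log")` from a root shell or an init script would start the loop; the path is only an example. After a hang and reboot, the tail of that file holds the last snapshot that made it to disk.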

The load averages are through the roof, but CPU usage is low. There are 125 instances of apache2 and 104 instances of relaylock, compared with 14 and 0 currently.

Any help or suggestions would be appreciated.
 
The top output shows that there is almost no free memory and that your swap is full. This leads to out-of-memory errors, and the swapping causes very high load because of heavy disk I/O.
You should check your logs to find out why there are so many apache and relaylock processes.
You could also reduce the MaxClients setting in your Apache configuration so that the server does not start swapping when many clients connect. To serve more connections with fewer Apache processes, you can also reduce KeepAliveTimeout.
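As a rough way to pick a MaxClients value, you can divide the RAM you are willing to give Apache by the typical memory footprint of one apache2 process. The sketch below only illustrates that arithmetic; the function name and the numbers in the example are hypothetical and are not measurements from this server.

```python
def estimate_max_clients(total_ram_mb, reserved_mb, per_process_mb):
    """Estimate how many Apache processes fit in RAM without swapping.

    total_ram_mb   -- physical RAM in the machine
    reserved_mb    -- RAM to leave for everything else (mail, database, OS)
    per_process_mb -- typical resident size of one apache2 process
    """
    if per_process_mb <= 0:
        raise ValueError("per_process_mb must be positive")
    available_mb = total_ram_mb - reserved_mb
    # Round down: one process too many is what pushes the box into swap.
    return max(0, int(available_mb // per_process_mb))


# Hypothetical example: 1024 MB of RAM, 256 MB kept back for other
# services, and about 20 MB per apache2 process.
print(estimate_max_clients(1024, 256, 20))  # prints 38
```

The per-process figure can be read from the RES column in top for the apache2 processes. Treat the result as a starting point and watch swap usage after changing the setting.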
 
Thank you for the suggestions Bevan,

I have reduced MaxClients to 75 (originally 150) and KeepAliveTimeout to 3 (originally 15).

The Apache error logs don't have much to say; the only interesting line around the time the server hung was this:

[Sat Nov 06 09:21:47 2010] [error] server reached MaxClients setting, consider raising the MaxClients setting

Although it suggests increasing MaxClients, I think you're on the right track in suggesting a reduction, because the new KeepAliveTimeout should hopefully cut down the number of processes running unnecessarily.

The mail logs show a large amount of spam coming in (every few seconds), which seems to be the cause of all of the relaylock processes. A typical entry looks like this:

Nov 6 09:57:26 server01 /var/qmail/bin/relaylock[14328]: /var/qmail/bin/relaylock: mail from 109.60.193.7:4254 (ip7.net193.n37.ru)
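To see whether a handful of addresses are responsible for most of these entries, the log lines can be tallied by source IP. This is a minimal sketch that assumes every relaylock entry has the same shape as the line above; the function name and the regular expression are my own and not part of Plesk or qmail.

```python
import re
from collections import Counter

# Matches the "relaylock: mail from <ip>:<port>" part of a mail log line.
RELAYLOCK_RE = re.compile(r"relaylock: mail from (\d{1,3}(?:\.\d{1,3}){3}):\d+")


def count_senders(lines):
    """Return a Counter mapping source IP to number of relaylock entries."""
    counts = Counter()
    for line in lines:
        match = RELAYLOCK_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


sample = [
    "Nov 6 09:57:26 server01 /var/qmail/bin/relaylock[14328]: "
    "/var/qmail/bin/relaylock: mail from 109.60.193.7:4254 (ip7.net193.n37.ru)",
]
for ip, hits in count_senders(sample).most_common(10):
    print(ip, hits)
```

Passing an open mail log file instead of `sample` works too, since the function only iterates over lines.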

Under normal conditions the processes aren't alive long enough to appear in top or ps, so my guess is that once Apache has used up all of the RAM, any other processes the server tries to run are delayed.
 