Server hammered daily

Posted by Ionutz, 06-24-2010, 02:58 AM
Hello, I have a dedicated server used for shared hosting, and every day it gets hammered: the load goes to 100 or higher and I get a "free swap zero" message. I don't know what tools I should install to monitor account activity, but I bet that one of the hosting accounts is doing this on purpose. This server was built by combining accounts from two different sources: one was a dual-core Pentium 4 dedicated server, and the other was a VPS. The VPS showed the same behavior before I started migrating the accounts to the current dedicated server, so I believe that one account hosted on the VPS is the cause, because I never had this kind of issue on the Pentium 4 dedicated server. The VPS used to hold no more than 100 accounts. My problem is that I cannot trace the bastard who's doing this, so I need your help. If needed I can post logs or other information, just mention it... Thanks, Darius

Posted by M Bacon, 06-24-2010, 03:06 AM
Are you using a control panel? Can you reboot the server to get the load down and run top?

Posted by Ionutz, 06-24-2010, 03:10 AM
I'm running top in an SSH window, but I don't see any trace of abuse. I mean, I only see a load of at least 100, and Nagios is yelling because of down services... I'm using cPanel, but when it happens it is so fast that I can only see what's wrong in the SSH window if I get lucky... I can't get into WHM because of the high load... Thanks

Posted by M Bacon, 06-24-2010, 03:16 AM
What happens if you stop the services? Run:
    service httpd stop
    service exim stop
    service mysql stop
Can you log in to WHM as root after that?

Posted by Ionutz, 06-24-2010, 03:22 AM
If I'm able to run a command... but most of the time I cannot do anything...

Posted by M Bacon, 06-24-2010, 03:23 AM
Do you have a VPS panel, like HyperVM, SolusVM, or something similar?

Posted by Ionutz, 06-24-2010, 03:25 AM
It is a dedicated server. Accounts from a VPS were moved to this server, and I see the same behavior I used to have on the VPS. So I'm presuming that one account that was hosted on the VPS is doing this...

Posted by M Bacon, 06-24-2010, 03:27 AM
Oh I see. Does your dedicated server provider provide KVM access?

Posted by Ionutz, 06-24-2010, 03:28 AM
Yes, but when it happens I get this message: free swap 0

Posted by M Bacon, 06-24-2010, 03:31 AM
Any way to increase it? http://www.linux.com/news/software/a...nux-swap-space Maybe you need to contact your server provider, unfortunately. Maybe they didn't partition it right.

Posted by Ionutz, 06-24-2010, 03:33 AM
I have 2 GB of swap and 4 GB of dual-channel RAM. Not all of the RAM is used, but when I'm hammered the load is over 100, so I believe that not even 100 GB of swap would cope with this behavior. I need to know how I can limit resource usage per account, or something like that...

Posted by M Bacon, 06-24-2010, 03:37 AM
LiteSpeed is a good Apache replacement: http://litespeedtech.com Do you have a firewall, APF or CSF? http://configserver.com or http://www.rfxn.com/projects/advanced-policy-firewall/ These help protect you against DDoS attacks. MediaLayer offers DDoS Deflate. You could also scan your server with rkhunter or ClamAV, just in case you have a virus.

Posted by ibee, 06-24-2010, 03:38 AM
In WHM, in the left pane, just check the current CPU usage; you might get more information on what is responsible for the server load.

Posted by Ionutz, 06-24-2010, 03:39 AM
Do you think it is a DDoS? When it happens, I can ping the server (normal latency) but have no access...

Posted by M Bacon, 06-24-2010, 03:40 AM
If he can access WHM, that's the best method in my opinion.

Posted by Ionutz, 06-24-2010, 03:43 AM
After a reset, the server gets better... so I don't think it is a DDoS or a flood...

Posted by M Bacon, 06-24-2010, 03:43 AM
It could be, unless your firewall is going haywire. Can anybody else access your sites, or is it down just for you? http://downorisitjustme.com/

Posted by M Bacon, 06-24-2010, 03:45 AM
Does the CPU usage area say anything in WHM?

Posted by Ionutz, 06-24-2010, 03:47 AM
None. It is completely inaccessible... I don't understand the point about the firewall. Do you think it is a DDoS, even though the server is OK after a reboot? I'm asking because a DDoS is continuous, even after a reboot, so this looks like an internal issue: an account is loading the server, or Apache or MySQL are not optimized... Cheers

Posted by M Bacon, 06-24-2010, 03:49 AM
Do you have a firewall at all? Having one helps even if DDoS is not the issue.

Posted by Ionutz, 06-24-2010, 03:53 AM
I have CSF and LFD, configured to ban IPs that have more than 100 simultaneous connections... for 3 minutes.

Posted by M Bacon, 06-24-2010, 03:59 AM
Ok. It is really hard to diagnose the issue on this forum unfortunately.

Posted by Ionutz, 06-24-2010, 04:02 AM
Maybe some logs will help... At first I thought it was because I have many CMSs installed on the server, and these open multiple simultaneous connections per IP. For a while it worked with a limit of 100 connections per IP, but after 1 week it is happening again. These accounts worked fine on a dual Pentium 4 server until now, so the current server should be better because it has more RAM and a better CPU. Still... I get hammered...

Posted by M Bacon, 06-24-2010, 04:09 AM
You could try that or just get a management plan at http://platinumservermanagement.com or http://webbycart.com They do support cPanel. I am guessing that your server is unmanaged.

Posted by tim2718281, 06-24-2010, 04:14 AM
One guess; there could be a cron job with a bug in it, so it gets stuck if there is more than one instance of the cron job running. If cron is set to fire off the job every minute or every five minutes or whatever, cron will keep starting new copies. Eventually the system will run out of RAM and swap space, affecting everything. Anyway, if I had this problem, I'd set up a script to every minute log the time, system load, and what processes were running. After the event occurred, I'd examine the log, to see if I could detect any pattern in the processes.
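A minimal sketch of the per-minute logging script described above, assuming a Linux system with /proc and the procps version of ps. The log location, the script path in the crontab comment, and the function name snapshot are made up for illustration and are not from the thread.

```shell
#!/bin/sh
# Sketch of a per-minute load logger. LOGFILE is a hypothetical location;
# point it somewhere that survives a reboot and has enough free space.
LOGFILE="${LOGFILE:-/tmp/load-snapshot.log}"

snapshot() {
    # Timestamp plus the 1-, 5- and 15-minute load averages.
    echo "=== $(date '+%Y-%m-%d %H:%M:%S') load: $(cut -d' ' -f1-3 /proc/loadavg)"
    # Header line plus the 15 processes using the most CPU, with their owners.
    ps -eo user,pid,pcpu,pmem,args --sort=-pcpu | head -n 16
}

snapshot >> "$LOGFILE"

# Example crontab entry (the script path is hypothetical):
# * * * * * /root/load-snapshot.sh
```

Once the load is already at 100, cron itself may not get to run, so the useful entries are likely to be the ones from the few minutes before the server became unresponsive.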

Posted by Ionutz, 06-24-2010, 04:19 AM
It could be a cron issue, but it was happening on the other machine before, so I guess that error cannot have been copied when the accounts were migrated. Also... it happens at different times of the day...

Posted by alons, 06-24-2010, 09:38 AM
Crons of users will also be copied if you migrated. So I think you should stop the cron daemon and check. And yes, a cron job which is executing buggy code can certainly be the reason.

Posted by Ionutz, 06-24-2010, 09:51 AM
Hello, thanks for the reply. This situation does not occur every day, or in the same hour interval. Sometimes almost a week goes by without any problems, sometimes I get hammered every day for 4-5 days, and I repeat, not in the same hour interval. Sometimes it occurs in the morning, sometimes in the afternoon, sometimes at night... I guess someone is doing this on purpose. Some hosting account (it is a shared server) is overloading the server with scripts or something else... Or it could be a bug, because yesterday I saw in the top output that a new account had many open sessions running the php command. That account is new and hosts a CMS (Joomla). The problems arose before that account was hosted on this server, so I cannot put all the blame on it. I guess it is a hosting account that is doing this, or a bug in Apache... Cheers

Posted by Ionutz, 06-24-2010, 10:14 AM
Hello, I got this when scanning the server. Are these real trojans or false warnings?
Main >> Security Center >> Scan for Trojan Horses
Scan for Trojan Horses
Appears Clean
    /dev/core
    /dev/stderr
Scanning for Trojan Horses
    Possible Trojan - /etc/cron.daily/logrotate
    Possible Trojan - /usr/bin/cpan
    Possible Trojan - /usr/bin/instmodsh
    Possible Trojan - /usr/bin/prove
    Possible Trojan - /usr/lib64/python2.4/site-packages/libxml2mod.la
    Possible Trojan - /usr/lib64/python2.4/site-packages/libxml2mod.so
    Possible Trojan - /usr/bin/xml2-config
    Possible Trojan - /usr/bin/xmlcatalog
    Possible Trojan - /usr/bin/xmllint
    Possible Trojan - /usr/bin/xml2-config
    Possible Trojan - /usr/sbin/pureauth
11 POSSIBLE Trojans Detected

Posted by TheChemist, 06-24-2010, 10:25 AM
Sounds like a DDoS. How long has it been happening? Do you think it's someone who will lighten up a bit? It may not be a client of yours; it could also be someone that your client pissed off. But I've never seen someone have such a hard time finding the root of all evil. I know it would be tedious, but you could send out an e-mail to every client, as I am sure they are as pissed as you are, and tell them that you are going to be suspending one account at a time, and then unsuspending it, until you find out which account is bringing this negativity to the server.

Posted by Ionutz, 06-24-2010, 10:27 AM
Do you think it is a DDoS? Then why is everything OK after a reboot?

Posted by tim2718281, 06-24-2010, 02:52 PM
If the user had cron jobs set up, they would presumably set them up again after migration, if the migration process did not do that. But I was trying to show the problem may not be malicious. But regardless of the cause, a minute-by-minute log of system load and running processes could help diagnose it. (If you're lucky, you'll see the same command being executed with more and more processes.)

Posted by Drifter13, 06-24-2010, 05:30 PM
Try:
    netstat -tan | grep ':80 ' | awk '{print $6}' | sort | uniq -c
This will give you a summary of the connection states on port 80 and an idea of whether it's a DDoS attack or not.

Posted by Ionutz, 06-26-2010, 06:09 AM
Hello, can someone help me with decreasing TIME_WAIT, and also with forcing visitors to cache website pages for 2-3 hours? Thanks. PS: I know there are some tricks to find out which user is loading the server. I now have suPHP installed but I don't know how to use it...

Posted by Ionutz, 06-29-2010, 01:52 PM
Hello, does this look like a DDoS?
    # netstat -tan | grep ':80 ' | awk '{print $6}' | sort | uniq -c
          1 CLOSE_WAIT
        113 ESTABLISHED
        149 FIN_WAIT1
         23 FIN_WAIT2
         38 LAST_ACK
          1 LISTEN
         17 SYN_RECV
        118 TIME_WAIT
Right at that time, the server started to respond very slowly, and the load rose from 1 to 10 in 3 seconds... I restarted Apache and the load decreased... Thanks

Posted by OLM | DavidG, 06-29-2010, 05:26 PM
Hi Ionutz, I would advise you to install the sysstat package, if it is not installed already. Useful data about resource usage on the server will be recorded at regular intervals, and can be viewed using the "sar" command, along with its various arguments (see "man sar" for details). Viewing the data in this format may expose trends related to the problem which you are experiencing. I hope this helps! David

Posted by Ionutz, 07-15-2010, 05:14 AM
Thanks David, I will try it. But look what I found today:
    # netstat -ntu | awk '{print $4}' | cut -d: -f1 | sort | uniq -c | sort -n
          1
          1 XXX.115.108.XXX
          1 Local
          1 (w/o
          3 XXX.115.108.XXX
          3 XXX.115.108.XXX
         12 XXX.19.14.XXX
         13 XXX.115.108.XXX
         18 XXX.115.108.XXX
        262 XXX.115.108.XXX
       2350 127.0.0.1
The load rose very fast. I had to restart Apache to bring it down... Why do I have 2350 connections on localhost? Cheers
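A note on the command in this post: in `netstat -ntu` output the fourth column is the local address and the fifth is the foreign (remote) address, so counting `$4` groups connections by the server's own addresses. A sketch that counts by remote address instead, assuming `address:port` fields as in the output posted in this thread; the function name count_remote_ips is made up for illustration.

```shell
# Count connections per remote address, busiest last.
# Reads `netstat -ntu` output on stdin; header lines are skipped because
# they do not start with tcp or udp.
count_remote_ips() {
    awk '$1 ~ /^(tcp|udp)/ { sub(/:[^:]*$/, "", $5); print $5 }' \
        | sort | uniq -c | sort -n
}

# Usage, on the live server:
#   netstat -ntu | count_remote_ips
```

A single remote address with hundreds of connections points to one abusive client, while a large count for 127.0.0.1 points to something on the server connecting to itself.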

Posted by OLM | DavidG, 07-15-2010, 06:56 AM
You're welcome. Can you please post the output of:
    netstat -nap | grep 127.0.0.1

Posted by Ionutz, 07-15-2010, 06:57 AM
tcp 0 0 127.0.0.1:2095 127.0.0.1:38130 TIME_WAIT - tcp 0 0 127.0.0.1:2095 127.0.0.1:37362 TIME_WAIT - tcp 0 0 127.0.0.1:2095 127.0.0.1:37106 TIME_WAIT - tcp 0 0 127.0.0.1:2095 127.0.0.1:37874 TIME_WAIT - tcp 0 0 127.0.0.1:2095 127.0.0.1:37618 TIME_WAIT - tcp 0 0 127.0.0.1:2095 127.0.0.1:36850 TIME_WAIT - tcp 0 0 127.0.0.1:2095 127.0.0.1:38131 TIME_WAIT - tcp 0 0 127.0.0.1:2095 127.0.0.1:38387 TIME_WAIT - tcp 0 0 127.0.0.1:2095 127.0.0.1:37107 TIME_WAIT - tcp 0 0 127.0.0.1:2095 127.0.0.1:37363 TIME_WAIT - tcp 0 0 127.0.0.1:2095 127.0.0.1:37619 TIME_WAIT - tcp 0 0 127.0.0.1:2095 127.0.0.1:37875 TIME_WAIT - tcp 0 0 127.0.0.1:2095 127.0.0.1:36851 TIME_WAIT - tcp 0 0 127.0.0.1:2095 127.0.0.1:37876 TIME_WAIT - tcp 0 0 127.0.0.1:2095 127.0.0.1:37620 TIME_WAIT - tcp 0 0 127.0.0.1:2095 127.0.0.1:37364 TIME_WAIT - tcp 0 0 127.0.0.1:2095 127.0.0.1:37108 TIME_WAIT - tcp 0 0 127.0.0.1:2095 127.0.0.1:38388 TIME_WAIT - tcp 0 0 127.0.0.1:2095 127.0.0.1:38132 TIME_WAIT - tcp 0 0 127.0.0.1:2095 127.0.0.1:36852 TIME_WAIT - tcp 0 0 127.0.0.1:2095 127.0.0.1:36596 TIME_WAIT - tcp 0 0 127.0.0.1:2095 127.0.0.1:37621 TIME_WAIT - tcp 0 0 127.0.0.1:2095 127.0.0.1:37877 TIME_WAIT - tcp 0 0 127.0.0.1:2095 127.0.0.1:37109 TIME_WAIT - tcp 0 0 127.0.0.1:2095 127.0.0.1:38133 TIME_WAIT - tcp 0 0 127.0.0.1:2095 127.0.0.1:38389 TIME_WAIT - tcp 0 0 127.0.0.1:2095 127.0.0.1:36853 TIME_WAIT - tcp 0 0 127.0.0.1:2095 127.0.0.1:37366 TIME_WAIT - tcp 0 0 127.0.0.1:2095 127.0.0.1:37110 TIME_WAIT - tcp 0 0 127.0.0.1:2095 127.0.0.1:37878 TIME_WAIT - tcp 0 0 127.0.0.1:2095 127.0.0.1:37622 TIME_WAIT - tcp 0 0 127.0.0.1:2095 127.0.0.1:38390 TIME_WAIT - tcp 0 0 127.0.0.1:2095 127.0.0.1:38134 TIME_WAIT - tcp 0 0 127.0.0.1:2095 127.0.0.1:36854 TIME_WAIT - tcp 0 0 127.0.0.1:80 127.0.0.1:40703 TIME_WAIT - tcp 0 0 127.0.0.1:2095 127.0.0.1:37111 TIME_WAIT - tcp 0 0 127.0.0.1:2095 127.0.0.1:37367 TIME_WAIT - tcp 0 0 127.0.0.1:2095 127.0.0.1:37623 TIME_WAIT - tcp 0 0 127.0.0.1:2095 127.0.0.1:37879 TIME_WAIT - tcp 0 0 
127.0.0.1:2095 127.0.0.1:38135 TIME_WAIT - tcp 0 0 127.0.0.1:2095 127.0.0.1:38391 TIME_WAIT - tcp 0 0 127.0.0.1:2095 127.0.0.1:36599 TIME_WAIT - tcp 0 0 127.0.0.1:2095 127.0.0.1:36855 TIME_WAIT - udp 0 0 127.0.0.1:53 0.0.0.0:* 9457/named udp 0 0 127.0.0.1:123 0.0.0.0:* 2860/ntpd

Posted by OLM | DavidG, 07-15-2010, 07:11 AM
Is this all of the output? Your previous stats indicated over 2000 connections to localhost. I realize that is too much to post here, but the provided output does not show which process may be initiating the connections. Based on the output, it looks like there were a bunch of connections to the cPanel webmail port (2095). I assume you are running cPanel. Perhaps this is related to cPanel webmail proxying, which is an option under "Tweak Settings" inside WHM. If so, some remote host may be scanning the webmail subdomain and generating all of these local proxy connections to the webmail port. I would recommend that you review the http access logs on the server for the time periods when you experienced the load spikes.
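A rough sketch of the access log review suggested above, assuming the logs are in Apache's combined log format. The function names top_agents and top_clients, the example domain, and the log path in the usage comment are assumptions for illustration; check where your own installation keeps its per-domain logs.

```shell
# Requests per user agent, busiest last. Reads a combined-format access log
# on stdin; the user agent is the sixth field when splitting on double quotes.
top_agents() {
    awk -F'"' '{ print $6 }' | sort | uniq -c | sort -n | tail -n "${1:-10}"
}

# Requests per client address, busiest last.
top_clients() {
    awk '{ print $1 }' | sort | uniq -c | sort -n | tail -n "${1:-10}"
}

# Usage (the path is an assumption, not taken from the thread):
#   top_agents 5 < /usr/local/apache/domlogs/example.com
#   top_clients 5 < /usr/local/apache/domlogs/example.com
```

Running both against each domain's log for the period of a load spike should show whether one crawler or one client accounts for most of the requests.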

Posted by Ionutz, 07-15-2010, 04:57 PM
I think I found the issue. Some mrx shopping cart was being hammered daily by Googlebot: at least 5-6 GB of traffic daily from Googlebot. I suspended the site and am now monitoring the server, but the load dropped at exactly the moment I suspended the account...

Posted by Ionutz, 08-04-2010, 09:43 AM
Please take a look at the screenshots. What could this be? (Attached thumbnails)

Posted by larwilliams, 08-04-2010, 10:16 AM
It appears to be either a bunch of crons running PHP scripts, or PHP as a CGI. Also, you may look to get more RAM. Your screenshot shows that only 3MB was free. This will often cause a server to go into swap, which will drive the load up considerably as well.

Posted by Ionutz, 08-04-2010, 10:19 AM
Hello larwilliams, this is a VPS with 512 MB of RAM. The client is running 6-7 online stores on it, each with at least 1000 visitors per day. Could this be the problem? Low resources? Cheers

Posted by larwilliams, 08-04-2010, 10:20 AM
That is a possibility. I would try getting 1GB of RAM allocated, and do some work optimizing Apache and MySQL.

Posted by Ionutz, 08-04-2010, 10:34 AM
Hello larwilliams, thanks for your reply. Regarding optimization, I know this is not easy stuff, but can you point me to some tutorials or manuals? Cheers

Posted by Ionutz, 08-04-2010, 10:43 AM
Hello, regarding optimization: I know it isn't easy stuff. Can you point me to some tutorials or manuals?

Posted by larwilliams, 08-04-2010, 10:55 AM
For MySQL, you can try out the MySQL tuning primer shell script:
    http://www.day32.com/MySQL/tuning-primer.sh
Use wget to download it to a folder on your VPS, chmod +x it, then run it using ./tuning-primer.sh. That will tell you some of the things you can change in /etc/my.cnf.
As for Apache, if you are using the prefork MPM (most likely), you can edit /usr/local/apache/conf/extra/httpd-mpm.conf and change the prefork section to the following. You will need to experiment a bit with the settings to get it just right.
    StartServers 2
    MinSpareServers 2
    MaxSpareServers 5
    ServerLimit 20
    MaxClients 20
    MaxRequestsPerChild 10000
I am giving this advice with the assumption you are using cPanel with CentOS.
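One way to get a starting point for the MaxClients experiment is to measure how much memory an Apache process uses on the server. This is a common rule of thumb rather than anything stated in the thread: divide the memory you can spare for Apache by the average process size. The sketch below assumes the Apache processes are named httpd (as on CentOS), and the function name avg_rss_mb is made up for illustration.

```shell
# Average resident set size, in MB, of processes named $1.
# Expects `ps -eo rss,comm` style input on stdin (RSS in KB, then the
# command name); the header line is ignored because it never matches.
avg_rss_mb() {
    awk -v name="$1" '$2 == name { sum += $1; n++ }
        END { if (n) printf "%.1f\n", sum / n / 1024; else print "0.0" }'
}

# Usage, on the live server:
#   ps -eo rss,comm | avg_rss_mb httpd
```

RSS counts memory shared between processes once per process, so the figure overstates real usage and the resulting limit errs on the safe side. It is only a first guess to refine while watching the server under load.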

Posted by Drifter13, 08-04-2010, 04:13 PM
You need more RAM and also more processor power for that many visitors. Consider optimizing PHP too by installing an opcode cache such as XCache or eAccelerator (memcache is also worth a look, though it is an object cache rather than an opcode cache).

Posted by larwilliams, 08-04-2010, 04:56 PM
I'd use eAccelerator. It's older, but more stable and can compress the cached info (saving you RAM). xCache is good too, never used memcache.

Posted by Steven, 08-04-2010, 06:25 PM
I read through this thread. Here are my thoughts. If you are using a default cPanel installation, then you have suPHP installed by default, which means that for each PHP request it's going to launch a PHP process. That is expensive in terms of server resources, but can be more secure. Also, if you are using suPHP, an opcode cache is not going to be of much use to you. My suggestion is to compile Apache 2.2 with the worker MPM and use PHP in FastCGI mode (you can run it in suEXEC mode). I also suggest using XCache 1.3.0 as your opcode cache. Since you are running a dedicated server, a kernel with the bloat stripped out of it and a few specific settings set in it will help out. Keep in mind that, since you have a small amount of RAM in the server, you cannot run huge buffers. Many buffers are PER CONNECTION. Last edited by Steven; 08-04-2010 at 06:31 PM.

Posted by larwilliams, 08-04-2010, 07:31 PM
I would advise against FastCGI. It requires a very good sysadmin to configure properly with suEXEC and XCache. I do recommend using the worker MPM though. Just make sure that everything you use is thread-safe or it can cause problems.

Posted by Steven, 08-04-2010, 10:08 PM
I disagree with your advice. Setting XCache up with FastCGI is not any different than with mod_php. He is using cPanel. cPanel takes the trouble out of making it suEXEC safe with the rebuild_phpconf script.

Posted by larwilliams, 08-04-2010, 10:22 PM
cPanel doesn't fix half of the gotchas with using FastCGI and PHP together. I tried it myself and found several problems with it, including the famous problem where a process exits after 500 requests and just returns 500 Internal Server Errors to the client. I know that this is a PHP behavior that can be disabled by setting PHP_FCGI_MAX_REQUESTS to 0, but that can be an issue if the PHP application leaks resources. Alternatively, PHP_FCGI_MAX_REQUESTS can be set to a much higher value than the default to reduce the frequency of this problem. FcgidMaxRequestsPerProcess can be set to a value less than or equal to PHP_FCGI_MAX_REQUESTS to resolve the problem. I'm not saying that FastCGI is bad, it's just not the best solution for the OP's use.

Posted by Steven, 08-04-2010, 10:26 PM
None of what you said has anything to do with FastCGI and suEXEC with XCache, which is the claim your original post made. The default configuration works just fine for a lot of people.


