Mod Security Rules Incorrectly Blocking Googlebot

Posted by Another Blogger, 01-06-2013, 10:25 AM
Hello everyone. I'm using CSF Firewall on my blog's server, and the problem is that Mod_Security rules regularly block Googlebot by mistake. It happens almost every week. In the past I removed one Mod Security rule that was blocking Googlebot, and now another rule, 950005, is blocking it. Whenever that happens, I manually remove the Googlebot IP address from the deny list using WHM. Is there any way to stop this permanently? How can I set Mod Security to not block Googlebot? My hosting staff suggested that I add the following rule at the end of the Mod Security configuration: But I'm not sure whether I should add it. Is there any security risk in adding this rule, or is it OK? Is there any other recommended way to allow Googlebot? Any suggestions will be highly appreciated. Last edited by Another Blogger; 01-06-2013 at 10:28 AM.

Posted by Infinitnet, 01-06-2013, 10:32 AM
You could follow your host's suggestion, or just put all the crawler IPs in /etc/csf/csf.allow to whitelist them. Another option would be to check your audit log, find out which rule blocks Google, and then adjust it.

Posted by Another Blogger, 01-06-2013, 10:37 AM
Thanks for your reply. I can't put IPs in the allow list because Googlebot's IP changes regularly. As I said, it's rule 950005 that keeps blocking Googlebot. Can you please tell me how to adjust the rule as you suggested? Also, if I follow my host's suggestion, will it cause any security risk?

Posted by Infinitnet, 01-06-2013, 10:44 AM
The rule ID won't help much if we don't know which rule set you're using. I'm not talking about single IPs, but about the IP ranges Googlebot uses. The main ranges are 66.249.80.0/20 and 66.249.64.0/19. Following your host's suggestion won't cause any security risk, as it will only whitelist IPs with googlebot.com in their reverse DNS.

Posted by Another Blogger, 01-06-2013, 10:49 AM
Thanks. The email alert reads as follows: As you can see, the Googlebot IP was 66.249.73.150, which doesn't fall in the ranges you mentioned.

Posted by Infinitnet, 01-06-2013, 10:55 AM
Of course it does. 66.249.64.0/19 is 66.249.64.0 - 66.249.95.255.
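
The range arithmetic above can be checked with Python's standard ipaddress module. This is only a minimal sketch: the two networks are the ones quoted in this thread (they may no longer be current), and the helper name in_quoted_ranges is made up for illustration.

```python
import ipaddress

# Ranges quoted earlier in this thread; Google may have changed them since.
QUOTED_RANGES = [
    ipaddress.ip_network("66.249.64.0/19"),
    ipaddress.ip_network("66.249.80.0/20"),
]


def in_quoted_ranges(ip: str) -> bool:
    """Return True if ip falls inside any of the ranges quoted in the thread."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in QUOTED_RANGES)


# First and last address of the /19.
net = QUOTED_RANGES[0]
print(net.network_address, "-", net.broadcast_address)  # 66.249.64.0 - 66.249.95.255

# The address from the email alert.
print(in_quoted_ranges("66.249.73.150"))  # True
```

A /19 fixes the first 19 bits, so the third octet can run from 64 to 95, which is why 66.249.73.150 is covered.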

Posted by Another Blogger, 01-06-2013, 11:14 AM
OK, thanks. Please tell me how to adjust the Mod Security rule, and also how to add the IP address range to the csf.allow list. I mean the exact line that I should insert into the file. Thanks.

Posted by Infinitnet, 01-06-2013, 11:29 AM

Posted by Another Blogger, 01-06-2013, 11:34 AM
Thanks a lot. One last question: which one would you recommend implementing, my host's suggestion or this IP address whitelist method? Which one is the safer and better solution?

Posted by Infinitnet, 01-06-2013, 11:38 AM
Both are safe; it's your choice which one you use. Your host's suggestion will whitelist Googlebot in mod_security, so it won't block a single request from Googlebot. Mine only prevents Googlebot from getting banned by CSF, so a "malicious" request from it will still be blocked by mod_security and not get executed.

Posted by Another Blogger, 01-06-2013, 11:41 AM
Thanks. I really appreciate your help. Just now someone suggested that I use the following code rather than the code suggested by my host: Is it better and more secure than the host's suggestion?

Posted by Infinitnet, 01-06-2013, 11:43 AM
That's an insecure solution, as it only checks the user agent, and that's easy to fake. An attacker could simply guess that you have whitelisted Google by user agent, change their user agent to "Googlebot", and then get around every mod_security rule.
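
To illustrate the difference between trusting the user agent and checking reverse DNS, here is a rough Python sketch of a forward-confirmed reverse DNS check. It is not the rule from the host or anything mod_security runs. The function names and the single ".googlebot.com" suffix are assumptions for illustration, and it only handles IPv4.

```python
import socket

# Suffix taken from this thread; treat the exact list as an assumption.
ALLOWED_SUFFIXES = (".googlebot.com",)


def reverse_lookup(ip):
    """Return the PTR hostname for ip, or None if the lookup fails."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:
        return None


def forward_lookup(hostname):
    """Return the IPv4 addresses hostname resolves to (empty list on failure)."""
    try:
        return socket.gethostbyname_ex(hostname)[2]
    except OSError:
        return []


def is_verified_googlebot(ip, reverse=reverse_lookup, forward=forward_lookup):
    """Forward-confirmed reverse DNS: the PTR name must end in an allowed
    suffix AND resolve back to the same IP. The user agent is never consulted."""
    host = reverse(ip)
    if not host:
        return False
    host = host.rstrip(".").lower()
    if not host.endswith(ALLOWED_SUFFIXES):
        return False
    return ip in forward(host)


# Usage (performs real DNS lookups, so the result depends on your resolver):
# is_verified_googlebot("66.249.73.150")
```

The forward lookup matters because whoever controls an IP's reverse DNS zone can set its PTR record to any name, including one ending in googlebot.com. Only Google can make that name resolve back to the IP.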

Posted by Another Blogger, 01-06-2013, 11:47 AM
Thanks. You really helped me a lot. I highly appreciate it. Thanks again.

Posted by Igal Zeifman, 01-07-2013, 09:38 AM
FYI, this is the complete range of Googlebot IPs. (Source: Botopedia.org) For obvious SEO-related reasons you should allow all of them, not just some.

Posted by Infinitnet, 01-07-2013, 10:14 AM
Are there any IP lists available? On Botopedia I can't find an option to view the IPs, e.g. http://www.botopedia.org/user-agent-...bots/googlebot

Posted by Igal Zeifman, 01-07-2013, 10:20 AM
You can use Botopedia to verify the origin of IPs (i.e. to identify fake Googlebot visits) but no, generally speaking, you won't be able to extract the full list of all IPs like I did here. You can use Botopedia to verify every single IP in the list I provided... I just copy/pasted in bulk because I wanted to save you the trouble of doing that.

Posted by Another Blogger, 01-10-2013, 05:46 PM
Today the httpd service went down and couldn't restart. When I checked the startup log, I found the following error: Line 159 contained the following: So I had to remove the code, and the httpd service then recovered successfully. Can you please tell me why this happened and how to correct the code? Thanks.

Posted by Another Blogger, 01-12-2013, 07:52 PM
I fixed the problem by adding .googlebot.com to csf's ignore list.


