
Issue Googlebots bypassing Nginx

tkalfaoglu

Silver Pleskian
Server operating system version
AlmaLinux
Plesk version and microupdate number
Obsidian
Hi there. I noticed high CPU usage on one of our servers and went to investigate.

We are getting hit hard by Googlebot. The access_ssl_log contains thousands, if not millions, of entries like:

66.249.66.204 - - [14/Nov/2022:13:18:57 +0300] "GET /index.php?rp=%2Fknowledgebase%2Ftag%2Fbar%C4%B1nd%C4%B1rmalanguage%3Dturkishlanguage%3Destonianlanguage%3Dgermanlanguage%3Destonianlanguage%3Ddutchlanguage%3Dromanianlanguage%3Destonianlanguage%3Dczechlanguage%3Dukranianlanguage%3Dhungarianlanguage%3Darabiclanguage%3Dczechlanguage%3Dukranianlanguage%3Dswedishlanguage%3Dportuguese-ptlanguage%3Dromanianlanguage%3Dczechlanguage%3Dportuguese-brlanguage%3Dczechlanguage%3Drussianlanguage=ukranian HTTP/1.0" 200 7070 "-" "Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.5304.110 Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

I checked the IP and it is indeed Google. I then looked into why nginx is not being used for these harmless queries (the proxy_access_log file is barely being written to). Although the nginx/Apache settings for this domain are very permissive, nginx does not appear to be used:

[ ] HTTP no-cache headers are received in request
[X] HTTP authorization headers are received in request
[ ] GET nocache parameter is received in request

Any ideas what to do and how to channel these "attacks" to nginx instead?
Thanks! -t
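
For readers who want to repeat the "is this IP really Google" check from the post, here is a minimal sketch of the reverse-then-forward DNS check that Google describes for verifying Googlebot. The function name `is_verified_googlebot` and the injectable `reverse_lookup` / `forward_lookup` parameters are hypothetical, added here only so the logic can be tried without network access. The forward lookup shown handles IPv4 only.

```python
import socket

# Domain suffixes that Google documents for its crawlers' reverse DNS names.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com", ".googleusercontent.com")


def is_verified_googlebot(ip,
                          reverse_lookup=socket.gethostbyaddr,
                          forward_lookup=socket.gethostbyname_ex):
    """Return True if `ip` reverse-resolves to a Google hostname that
    forward-resolves back to the same IP (IPv4 only in this sketch)."""
    try:
        # gethostbyaddr returns (hostname, aliases, addresses)
        hostname = reverse_lookup(ip)[0]
    except OSError:
        return False

    if not hostname.lower().rstrip(".").endswith(GOOGLE_SUFFIXES):
        return False

    try:
        # gethostbyname_ex returns (hostname, aliases, addresses)
        addresses = forward_lookup(hostname)[2]
    except OSError:
        return False

    # The forward lookup must include the original address.
    return ip in addresses
```

Calling `is_verified_googlebot("66.249.66.204")` with the default resolvers performs live DNS queries, so the result depends on the network at the time of the call.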
 
But you are right: I had to stop this. The crawl was looping (the language parameter kept repeating), so I added Googlebot to the string of blocked bots, and it now gets a 403 instead. I already had that nginx command in place for a dozen other bots anyway.
Regards, -turgut
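
The looping pattern described above (the language parameter repeating inside one URL) can be spotted in a log before resorting to a block. The following is a rough sketch, not the poster's method: the helper names `token_repeats` and `looks_like_crawl_loop` and the threshold of 3 are assumptions chosen for illustration.

```python
from urllib.parse import unquote


def token_repeats(request_target, token="language="):
    """Count how often `token` appears in the URL-decoded request target."""
    return unquote(request_target).count(token)


def looks_like_crawl_loop(request_target, token="language=", threshold=3):
    """Flag a request whose decoded URL repeats `token` at least
    `threshold` times (the threshold is an arbitrary assumption)."""
    return token_repeats(request_target, token) >= threshold


# Shortened example in the same shape as the log entry in the first post.
sample = ("/index.php?rp=%2Fknowledgebase%2Ftag%2Ffoo"
          "language%3Dturkishlanguage%3Destonianlanguage=ukranian")
print(token_repeats(sample), looks_like_crawl_loop(sample))
```

Requests flagged this way point at the application generating self-referencing links, which is usually worth fixing at the source in addition to any bot block.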
 
I checked the IP and it is indeed Google. I then looked into why nginx is not being used for these harmless queries (the proxy_access_log file is barely being written to). Although the nginx/Apache settings for this domain are very permissive, nginx does not appear to be used:
Plesk configures nginx so that requests passed on to Apache are not logged by nginx (see "access_log off;" in the location section).
 