Blocking spiders should not have any impact on performance...
Yes, for small or new sites this shouldn't really be something you have to think about. As a big site, however, you can get hit by thousands of spider requests every second of every day, and that can really degrade performance for your actual visitors. That's about the only real time you'd want to block some spiders from visiting your site: when they're hogging resources and not providing anything in return. Also, some bots posing as search engine crawlers are actually malware, and the known malware bots are blocked by most good WP plugins like Wordfence if you're a WP user. Otherwise you can just use robots.txt, .htaccess, or the Meta Robots tag.
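For reference, this is what the robots.txt approach looks like in its simplest form. A minimal sketch, assuming a well-behaved crawler; "SomeBot" is a placeholder token, not a real crawler name:

```
# robots.txt - placed in the site's web root
# "SomeBot" is a placeholder user-agent token; use the real bot's name
User-agent: SomeBot
Disallow: /

# Everyone else may crawl the whole site
User-agent: *
Disallow:
```

Bear in mind robots.txt is purely advisory: polite bots obey it, rogue ones ignore it, which is where .htaccess comes in later.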
In addition, you'll want web crawlers for search engines to visit in order to get content indexed...
Yes, that's true. But as much as we may wish for all the world's search engines to take notice of our websites, when they've actually managed to crash your system a few times you may be pardoned for having second thoughts. Equally, when they're hitting your servers with such a heavy load, your visitors may easily get the impression that viewing your pages is akin to plodding through treacle, which hurts sales and your company's reputation, not to mention the fact that it's anything but a great user experience.
So what to do about it? Well, how about simply blocking them? After all, as always in business, it’s a question of what kind of a trade-off you’re actually stuck with here.
Not all spiders are created equal, and only your specific online business model should govern your decision either to bear with them accessing your pages regularly or to tell them to get lost. After all, bandwidth doesn't come cheap, and losing sales due to poor server performance isn't particularly funny either.
Are you targeting the Russian market at all? If not, all that traffic created by Yandex's search engine crawlers is something you may very well do without.
How about China? Japan? Korea? Chinese search engines such as Baidu, Sogou, and Youdao will merrily spider your sites to oblivion if you let them. In Japan it's Goo, and in South Korea it's Naver that can mutate into performance torpedoes once they've started to fancy your website.
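If none of those markets bring you business, a robots.txt along these lines turns the polite ones away. These are the commonly documented user-agent tokens (Baiduspider for Baidu, YoudaoBot for Youdao, Yeti for Naver), but do verify them against each engine's current documentation, and look up Goo's token yourself before relying on this:

```
# robots.txt - declining regional search engines you don't target
User-agent: Yandex
Disallow: /

User-agent: Baiduspider
Disallow: /

User-agent: Sogou web spider
Disallow: /

User-agent: YoudaoBot
Disallow: /

User-agent: Yeti
Disallow: /
```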
Nor is that all, because the search engines aren't the only culprits in this field.
The only situation when you want to keep certain spiders out, such as Moz, Majestic, Ahrefs, etc., is when you are operating PBNs (private blog networks), as then you hide your backlinks from competitors and, with that, your ranking strategy...
Not the only situation, as explained above, but definitely one of the reasons. That's a smart move.
You have to ask yourself: are you happy with your competition sussing out your entire linking strategy (both incoming and outgoing)? A number of services around will help them do exactly that. Fortunately, at least one major contender, namely Majestic-SEO, is perfectly open about things and lets you block their crawlers gracefully. (No such luck with most other setups…)
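As a sketch, blocking these link-index crawlers looks like this; MJ12bot is Majestic's published crawler token, AhrefsBot is Ahrefs', and rogerbot and dotbot are Moz's, though as noted, how faithfully each one honors the file varies:

```
# robots.txt - keeping link-index crawlers away from your link profile
User-agent: MJ12bot
Disallow: /

User-agent: AhrefsBot
Disallow: /

User-agent: rogerbot
Disallow: /

User-agent: dotbot
Disallow: /
```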
The other thing to consider is this: if those other search engine spiders are visiting your site and you are getting sales, customers, or clicks from that, then do you really want to block them from crawling and indexing your site? If a spider is simply crawling your site and inflating your logs while providing nothing in return, not even clicks, then that is when you should probably block it. That is what you need to know.
Some Tips To Think About When Blocking Spiders
- If you’re considering blocking search engine spiders, make sure you’re doing it for the right reasons and not just because you've heard you can.
- Don't try to trick the spiders with methods such as user-agent detection and redirection. Be up front by using the robots.txt file or the Meta Robots tag (see the sketch after this list).
- Don't assume that just because you're using the recommended methods to block content, you're safe. Understand how blocking content will make your site appear to the bots.
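For completeness, here is what the Meta Robots route looks like. Unlike robots.txt, the tag lets a bot fetch the page but asks it not to index it or follow its links; it goes in the <head> of each page concerned. "SomeBot" in the second tag is a placeholder for a specific crawler's token:

```
<!-- ask all crawlers not to index this page or follow its links -->
<meta name="robots" content="noindex, nofollow">

<!-- or address one specific crawler ("SomeBot" is a placeholder) -->
<meta name="SomeBot" content="noindex">
```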
Reasons to block bots:
- Fewer bots on your site means more bandwidth/performance/speed for your real visitors.
- Helps keep you safe from malware bots that scan for vulnerabilities (these ignore robots.txt, so see the .htaccess sketch after this list).
- Smaller log files, since bot hits no longer flood your access logs.
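Since malware bots ignore robots.txt entirely, they have to be refused at the server level. A minimal .htaccess sketch for Apache, assuming mod_rewrite is enabled; "BadBot" and "EvilScraper" are placeholder names, so match whatever user agents actually show up in your logs:

```
# .htaccess - return 403 Forbidden to unwanted user agents
# BadBot and EvilScraper are placeholders; use names from your access logs
RewriteEngine On
RewriteCond %{HTTP_USER_AGENT} (BadBot|EvilScraper) [NC]
RewriteRule .* - [F,L]
```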
Reasons not to block bots:
- Search engine bots can increase your traffic by indexing your website on more search engines.