To remove an entire site, or specific sections and pages, from search results in Google, Yandex, and other search engines, you need to block that content from indexing. Once search engines process the change, the content stops appearing in search results. Let’s review the robots.txt directives you can use to prohibit indexing.
robots.txt is a plain-text file in the root directory of your site that tells search engine crawlers which parts of the site they may crawl and index.
Here are some settings you can apply with robots.txt:
You can set the crawl rate in Yandex.Webmaster under Indexing — Crawl rate. More details are available in Yandex help.
Googlebot adjusts its crawl rate automatically, based on how the server responds. If the server slows down or returns errors, crawling may slow down or pause.
Please note:
Examples:
# blocking indexing of vip.html for Googlebot only:
User-agent: Googlebot
Disallow: /vip.html
# blocking indexing of the /private folder for all crawlers:
User-agent: *
Disallow: /private/
# allowing Yandex crawlers to access only pages starting with /shared:
User-agent: Yandex
Disallow: /
Allow: /shared
The User-agent directive specifies which crawler the rules apply to. You can name specific bots or set rules for all crawlers.
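As a small illustration of both cases, the sketch below gives one named crawler its own group and keeps a catch-all group for everyone else. The paths /drafts/ and /tmp/ are made up for the example.

```
# group for one named crawler (hypothetical path)
User-agent: Googlebot
Disallow: /drafts/

# group for every other crawler (hypothetical path)
User-agent: *
Disallow: /tmp/
```

A crawler follows the most specific group that matches its name, so Googlebot here obeys only its own group and does not combine it with the rules under *.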
To block your entire website from all search engines, add this to robots.txt:
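A minimal sketch of that rule set, assuming the file contains no other groups that would override it:

```
# applies to every crawler
User-agent: *
# a lone slash matches every URL path on the site
Disallow: /
```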
To block only one search engine (e.g., Yandex):
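A minimal sketch, assuming the same Yandex token used in the examples earlier and no other groups in the file:

```
# applies to Yandex crawlers only
User-agent: Yandex
Disallow: /
```

Crawlers that match no group in the file are not restricted, so other search engines can still crawl the site.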
To block all except one search engine (e.g., Google):
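One way to write this, sketched as a catch-all group that blocks everything plus a separate group for Googlebot:

```
# block every crawler by default
User-agent: *
Disallow: /

# Googlebot has its own group, so the block above does not apply to it
User-agent: Googlebot
Allow: /
```

An empty Disallow: line in place of Allow: / has the same effect.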