Configuration: Block Search Engine Crawling and Indexing
Detailed Setup Guide
(Optional) Configure specific pages to restrict crawling using robots.txt
(Optional) Configure restricted crawling across all pages using robots.txt
(Optional) Restrict search engine indexing across the entire shop
1. (Optional) Configure specific pages to restrict crawling using robots.txt
This is recommended to prevent search engines from crawling hidden categories, voucher product URLs, and any other page that should not be indexed.
To restrict only certain pages from being crawled by search engine bots, configure your robots.txt entry for your shop by going to Configuration > Settings > General Settings.
Locate the robots.txt section and expand it. It is pre-loaded with some default restrictions set up by nopCommerce.
To disallow crawling of specific pages, find the search engine friendly page name for each product or category.
For product pages, open the product editor page in a separate tab and expand the SEO section.
Copy the value used in the 'Search engine friendly page name'.
For category pages, open the category editor page in a separate tab and expand the SEO section.
Copy the value used in the 'Search engine friendly page name'.
Back in the robots.txt editor, add the search engine friendly page name to the bottom of the Localizable disallow paths list, with a / in front of it. For example, a page name of hidden-category is entered as /hidden-category. Repeat for every product and category page that you do not want crawled.
For best results, complete this step before publishing the product if possible.
Save General Settings and clear store cache.
IMPORTANT: A store-specific (‘multi-store’) setting can override the ‘all stores’ setting. Also, if a robots.custom.txt file is present on the server, it overrides these setting values.
NOTE: Keep in mind that if a robots.txt entry is made for a product after it has been published, the page may already have been crawled and indexed, and can therefore still appear in search engine results. To keep previously crawled pages out of search engine results, consider setting up no indexing across the entire shop, detailed in section 3.
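The effect of the disallow entries from the steps above can be checked with a small script. This is a minimal sketch using Python's standard urllib.robotparser module. The page names (hidden-category, gift-voucher-25, public-product) are hypothetical placeholders, and the generated text only approximates what the shop serves: the real robots.txt also contains the default nopCommerce restrictions.

```python
from urllib.robotparser import RobotFileParser


def build_robots_txt(page_names):
    """Turn search engine friendly page names into Disallow lines."""
    lines = ["User-agent: *"]
    for name in page_names:
        # Each entry needs a leading "/", as in the robots.txt editor step.
        lines.append("Disallow: /" + name.lstrip("/"))
    return "\n".join(lines) + "\n"


def is_crawlable(robots_txt, path, user_agent="*"):
    """Return True if the given path may be crawled under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, path)


# Hypothetical page names, used only for illustration.
robots_txt = build_robots_txt(["hidden-category", "gift-voucher-25"])
print(robots_txt)
print(is_crawlable(robots_txt, "/hidden-category"))  # disallowed entry
print(is_crawlable(robots_txt, "/public-product"))   # no matching entry
```

Disallow rules match by path prefix, so an entry such as /hidden-category also covers any path that begins with the same characters. Search engines may interpret edge cases differently from this parser, so treat the result as a sanity check only.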
2. (Optional) Configure restricted crawling across all pages using robots.txt
To restrict all pages from being crawled by search engine bots, configure the robots.txt entry for your shop by going to Configuration > Settings > General Settings.
Locate the robots.txt section and expand it. It is pre-loaded with some default restrictions set up by nopCommerce.
To disable crawling of the entire site, add a single / as a new entry at the end of the Localizable disallow paths list.
NOTE: Keep in mind that if the robots.txt entry is made after pages have been published, they may already have been crawled and indexed, and can therefore still appear in search engine results. To keep previously crawled pages out of search engine results, consider setting up no indexing across the entire shop, detailed in section 3 below.
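A single / entry blocks every path because disallow rules match by prefix and every URL path starts with /. The sketch below illustrates this with Python's standard urllib.robotparser module. The robots.txt text is a simplified assumption of the resulting file, and the sample paths are hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Simplified, assumed robots.txt after adding "/" as a disallow path.
SITE_WIDE_BLOCK = """\
User-agent: *
Disallow: /
"""


def blocked_paths(robots_txt, paths, user_agent="*"):
    """Return the subset of paths that the rules forbid crawling."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [p for p in paths if not parser.can_fetch(user_agent, p)]


# Hypothetical shop paths, used only for illustration.
sample_paths = ["/", "/hidden-category", "/cart", "/some-product"]
print(blocked_paths(SITE_WIDE_BLOCK, sample_paths))
```

With the site-wide rule in place, every sample path is returned as blocked.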
3. (Optional) Restrict search engine indexing across the entire shop
To restrict the entire shop from being indexed by search engines, go to Configuration > Settings > All Settings.
Search for the setting ‘seosettings.customheadtags’ and edit it.
Add the following value and save.
<meta name="robots" content="noindex">
If this setting already contains a value (it is also used to define Adobe Analytics credentials), keep the existing content and add the value above on a new line.
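After saving, it can be useful to confirm that the tag actually appears in the rendered page head. The following is a minimal sketch using Python's standard html.parser module. The SAMPLE_HEAD string is a hypothetical stand-in for the page source of a shop page, which you would normally copy from the browser's "view source" output.

```python
from html.parser import HTMLParser


class RobotsMetaFinder(HTMLParser):
    """Collect the directives of every <meta name="robots"> tag."""

    def __init__(self):
        super().__init__()
        self.directives = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = {name: (value or "") for name, value in attrs}
        if attr_map.get("name", "").lower() == "robots":
            content = attr_map.get("content", "")
            # The content attribute may hold several comma-separated values.
            self.directives.extend(
                part.strip().lower() for part in content.split(",") if part.strip()
            )


def has_noindex(html):
    """Return True if the HTML contains a robots meta tag with noindex."""
    finder = RobotsMetaFinder()
    finder.feed(html)
    return "noindex" in finder.directives


# Hypothetical page source, used only for illustration.
SAMPLE_HEAD = '<head><title>Shop</title>\n<meta name="robots" content="noindex">\n</head>'
print(has_noindex(SAMPLE_HEAD))
```

If the check fails on a real page, clearing the store cache and confirming that no store-specific override of the setting exists are reasonable first things to look at.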