Robots.txt plays an important role in how search engines crawl a website, and that in turn affects how the site shows up in search results. Websites powered by Blogger always have a default robots.txt file. Blogger also lets you customize this file through a setting called Custom robots.txt. In this tutorial, I am going to discuss the importance of a custom robots.txt file and how you can add one to your Blogger (BlogSpot) blog.
What is a Robots.txt file?
Robots.txt is a simple text file with a few lines of code that tells web crawlers which parts of a website they may crawl. Well-behaved search crawlers read the robots.txt file before crawling any pages of a website. That means you can keep crawlers away from pages of your blog that are not important or that you don't want to appear in search results.
Okay, now you know what a robots.txt file is. Before you learn how to add a custom robots.txt file in BlogSpot, you should know more about it. Let's see what is inside a robots.txt file.
The default BlogSpot robots.txt file has only a few lines. If you don't understand the code, no problem: I will explain these terms step by step.
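To make this concrete, here is a minimal Python sketch of how a crawler might consult robots.txt before fetching a page. It uses the standard library's urllib.robotparser. The /drafts/ path, the ExampleBot name, and the example.blogspot.com URLs are made up for illustration and are not real Blogger settings.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only; /drafts/ is not a real Blogger path.
ROBOTS_TXT = """\
User-agent: *
Disallow: /drafts/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def may_crawl(url: str, agent: str = "ExampleBot") -> bool:
    """Return True if the rules above let `agent` crawl `url`."""
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # A page under the disallowed folder, then an ordinary post.
    print(may_crawl("https://example.blogspot.com/drafts/idea.html"))
    print(may_crawl("https://example.blogspot.com/2024/05/hello.html"))
```

The first URL falls under the disallowed prefix, so a polite crawler would skip it; the second is not matched by any rule and stays crawlable.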
What's in the robots.txt file?
A robots.txt file contains just a few lines of code such as User-agent, Disallow, Allow, and Sitemap. Let's go through these terms:
User-agent: Mediapartners-Google
This line targets the Google AdSense crawler, which helps serve better ads on your blog. Please leave it as it is even if you are not running ads through AdSense.
User-agent: *
This user-agent line with an asterisk (*) applies to all robots.
Disallow: /search
This rule stops crawlers from crawling any URL whose path starts with /search right after the domain name.
In Blogger, label pages and search result pages live under /search, so crawlers will not crawl them while this rule is in place. If you remove the rule from your robots.txt file, crawlers can crawl and index your full website.
Allow: /
This rule lets crawlers access the homepage and everything else on the blog that is not blocked by a Disallow rule.
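Here is a small Python sketch of how these three rules work together, again using urllib.robotparser from the standard library. Note that the AdSense crawler's token is written as one word, Mediapartners-Google. The ExampleBot name and the example.blogspot.com URLs are placeholders I picked for the demo, and real search engines may resolve conflicting rules a little differently from Python's parser.

```python
from urllib.robotparser import RobotFileParser

# The rule groups discussed above, written out as a robots.txt body.
DEFAULT_RULES = """\
User-agent: Mediapartners-Google
Disallow:

User-agent: *
Disallow: /search
Allow: /
"""

BLOG = "https://example.blogspot.com"  # placeholder blog address


def build_parser(text: str) -> RobotFileParser:
    """Parse a robots.txt body held in a string."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


rules = build_parser(DEFAULT_RULES)

# What a generic crawler (hypothetical name) may fetch.
label_ok = rules.can_fetch("ExampleBot", BLOG + "/search/label/seo")
home_ok = rules.can_fetch("ExampleBot", BLOG + "/")
post_ok = rules.can_fetch("ExampleBot", BLOG + "/2024/05/hello.html")

# The AdSense crawler has its own group with an empty Disallow.
ads_label_ok = rules.can_fetch("Mediapartners-Google", BLOG + "/search/label/seo")

if __name__ == "__main__":
    print("label page, generic crawler:", label_ok)
    print("homepage, generic crawler:", home_ok)
    print("post, generic crawler:", post_ok)
    print("label page, AdSense crawler:", ads_label_ok)
```

An empty Disallow line means "nothing is disallowed", which is why the AdSense crawler can still read label pages while every other crawler is kept out of /search.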
The Sitemap line refers to the sitemap of a website. Adding a sitemap to the robots.txt file is an easy way to help crawlers discover your content. Whenever a crawler reads the robots.txt file of a site, it finds the sitemap link of that site. The sitemap.xml file, in turn, lists the published posts and pages of that blog.
The default BlogSpot sitemap link only tells the web crawlers about the 25 most recent posts. If you want to increase that number, replace the default sitemap with the one below.
If your blog has 500 posts or fewer, a single sitemap line is enough:
Sitemap: http://example.blogspot.com/atom.xml?redirect=false&start-index=1&max-results=500
But if you have more than 500 posts in your blog, you need to use two sitemaps like below. The second one starts at post 501 so that no post is listed twice:
Sitemap: http://example.blogspot.com/atom.xml?redirect=false&start-index=1&max-results=500
Sitemap: http://example.blogspot.com/atom.xml?redirect=false&start-index=501&max-results=500
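If your blog keeps growing, you can generate these lines instead of writing them by hand. The sketch below is only an illustration: it assumes start-index is 1-based and that each sitemap line covers a window of 500 posts, so the windows start at 1, 501, 1001 and so on. The function name sitemap_lines and its parameters are my own, not part of Blogger.

```python
from typing import List


def sitemap_lines(blog_url: str, total_posts: int, page_size: int = 500) -> List[str]:
    """Build one Sitemap line per window of `page_size` posts.

    Assumes a 1-based start-index, so windows begin at 1, 501, 1001, ...
    """
    base = blog_url.rstrip("/")
    lines = []
    # Always emit at least one line, even for an empty blog.
    for start in range(1, max(total_posts, 1) + 1, page_size):
        lines.append(
            f"Sitemap: {base}/atom.xml?redirect=false"
            f"&start-index={start}&max-results={page_size}"
        )
    return lines


if __name__ == "__main__":
    # A blog with 1200 posts needs three windows.
    for line in sitemap_lines("http://example.blogspot.com", 1200):
        print(line)
```

For 1200 posts this prints three Sitemap lines, with start-index values of 1, 501 and 1001.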
How to add custom robots.txt in Blogger?
This is the main part of this tutorial: how to add a custom robots.txt file to your Blogger blog. Please follow the steps below.
- Go to your Blogger blog.
- Navigate to Settings ›› Crawlers and indexing ›› Enable custom robots.txt ›› Yes
- Now paste your robots.txt file code in Settings ›› Crawlers and indexing ›› Custom robots.txt
- Click on the Save button
- Wow! You are done.
Here is a sample custom robots.txt you can paste (replace example.com with your blog's address):
User-agent: Mediapartners-Google
Disallow:

User-agent: *
Disallow: /search
Allow: /

Sitemap: https://example.com/atom.xml?redirect=false&start-index=1&max-results=500
How to check the robots.txt file?
You have now successfully added a custom robots.txt file to your BlogSpot blog. Now it's time to check your blog's robots.txt file. Follow the steps below:
- Open your browser
- Type your blog URL in the address bar
- Add /robots.txt after your blog domain name
- Hit enter and it will open the file
- That's it.
For example, for a blog at example.blogspot.com you would open example.blogspot.com/robots.txt.
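The same check can be scripted. This is a rough Python sketch using only the standard library: robots_url works out the address of the file from any blog URL, and fetch_robots downloads it. Both function names are my own, example.blogspot.com is a placeholder, and fetch_robots needs a working internet connection, so it is not called in the demo at the bottom.

```python
from urllib.parse import urlsplit, urlunsplit
from urllib.request import urlopen


def robots_url(blog_url: str) -> str:
    """Return the robots.txt address for the site that `blog_url` belongs to."""
    # Accept bare domains such as "example.blogspot.com" by assuming https.
    if "://" not in blog_url:
        blog_url = "https://" + blog_url
    parts = urlsplit(blog_url)
    # Keep the scheme and domain, replace the path with /robots.txt.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


def fetch_robots(blog_url: str, timeout: float = 10.0) -> str:
    """Download the robots.txt text (requires network access)."""
    with urlopen(robots_url(blog_url), timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Only builds the address; call fetch_robots(...) yourself to download it.
    print(robots_url("example.blogspot.com/2024/05/hello.html"))
```

Whatever post or page URL you pass in, the file is always looked up at the root of the domain, which is exactly what the manual steps do.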
How to Disallow a particular post?
If you want to keep a particular post away from crawlers, add a Disallow rule for it under the User-agent: * line of your custom robots.txt.
Please be careful while doing this; you may accidentally block your whole site from being indexed.
Disallow: /yyyy/mm/your-post-url.html
Here yyyy is the year you published the post, mm is the month you published it, and your-post-url.html should be replaced with the URL of the post that you want to block from web crawlers.
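As a quick illustration, this Python sketch builds such a rule from a year, a month and a post file name, and then checks with urllib.robotparser that only that one post is blocked. The helper name post_disallow_rule, the ExampleBot crawler and the example URLs are all hypothetical.

```python
from urllib.robotparser import RobotFileParser


def post_disallow_rule(year: int, month: int, post_file: str) -> str:
    """Build a Disallow line of the form /yyyy/mm/your-post-url.html."""
    return f"Disallow: /{year:04d}/{month:02d}/{post_file}"


# Hypothetical post published in May 2024.
rule = post_disallow_rule(2024, 5, "your-post-url.html")

# Put the new rule first, above the usual Blogger rules.
robots_txt = "\n".join(["User-agent: *", rule, "Disallow: /search", "Allow: /"])

parser = RobotFileParser()
parser.parse(robots_txt.splitlines())

BLOG = "https://example.blogspot.com"  # placeholder blog address
blocked_post_ok = parser.can_fetch("ExampleBot", BLOG + "/2024/05/your-post-url.html")
other_post_ok = parser.can_fetch("ExampleBot", BLOG + "/2024/05/another-post.html")

if __name__ == "__main__":
    print(rule)
    print("blocked post crawlable:", blocked_post_ok)
    print("other post crawlable:", other_post_ok)
```

The month is zero-padded on purpose, because the path in the rule has to match the post URL character for character.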
How to Disallow a particular page?
You can follow the same method to block a particular page from being indexed by search engines. The rule is only a little different, because Blogger serves static pages under /p/ instead of a year and month:
Disallow: /p/your-page-url.html
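Because one wrong line can hide far more than you intended, it can help to test a draft robots.txt before pasting it into Blogger. The sketch below is one possible safety check, not an official tool: blocked_urls is a helper name I made up, ExampleBot is a hypothetical crawler, and /p/about.html assumes the usual Blogger pattern of serving static pages under /p/.

```python
from typing import Iterable, List
from urllib.robotparser import RobotFileParser


def blocked_urls(robots_txt: str, urls: Iterable[str], agent: str = "ExampleBot") -> List[str]:
    """Return the URLs from `urls` that `agent` may NOT crawl under `robots_txt`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return [url for url in urls if not parser.can_fetch(agent, url)]


BLOG = "https://example.blogspot.com"  # placeholder blog address

# Pages that should always stay open to crawlers.
MUST_STAY_OPEN = [
    BLOG + "/",
    BLOG + "/2024/05/hello.html",
    BLOG + "/p/contact.html",
]

# Draft that blocks a single page (hypothetical page name).
GOOD_DRAFT = "User-agent: *\nDisallow: /p/about.html\nDisallow: /search\nAllow: /\n"

# Draft with a mistake: a bare slash blocks the entire site.
BAD_DRAFT = "User-agent: *\nDisallow: /\n"

if __name__ == "__main__":
    print("good draft blocks:", blocked_urls(GOOD_DRAFT, MUST_STAY_OPEN))
    print("bad draft blocks:", blocked_urls(BAD_DRAFT, MUST_STAY_OPEN))
```

An empty list for your draft means none of your important pages got caught by a rule. The bad draft, by contrast, reports every URL in the list.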
Finally, I highly recommend that you do this very carefully and do not include unnecessary or incorrect code, because it may hide your site from search engines. I tried my best to explain robots.txt and how to add a custom robots.txt in Blogger. Don't hesitate to ask me if you don't understand something or have any questions about this tutorial, and I will try my best to answer.
Thanks for reading this tutorial. If you like my article, please share this blog on social media platforms and stay with this blog to grow your online career. Happy blogging journey!