Robots.txt is a configuration text file containing a couple of commands that control how your web content is shared with web crawlers.
The robots exclusion protocol was created to instruct robots or web crawlers such as Googlebot and Bingbot on how to access and index pages from a particular site.
Generally, a personal website owner (e.g. on WordPress) needs to place this file in the main, or top-level, directory.
Example:
http://www.blogger.com/robots.txt
Read more about: Best Example ROBOTS.TXT Generated by google.com
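If you want to see what a live robots.txt looks like, here is a minimal sketch that fetches one using only the Python standard library (the URL is the blogger.com example above):

from urllib.request import urlopen

# Fetch and print a site's live robots.txt file.
# The URL is the blogger.com example mentioned above.
with urlopen("http://www.blogger.com/robots.txt") as resp:
    print(resp.read().decode("utf-8"))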
For Blogger users, this file is placed in the root directory by default. To edit it, you need to enable it manually with a change in Settings.
Steps to enable the robots.txt file in Blogger
Step 1. Log in to your account and go to Settings.
Step 2. Click on Edit and configure your commands.
Step 3. Click on Save changes.
Optimized ROBOTS.TXT file
A wrong setting may affect your whole website, or the file may even be ignored by search engines. So it is necessary to know and understand each command and its importance before use.

User-agent: *
Allow: /
Allow: /*?m=*
Allow: /atom.xml?redirect=false&start-index=1&max-results=100000
Allow: /sitemap.xml?page=1
Allow: /sitemap.xml?page=2
Disallow: /search
Disallow: /*?*
Disallow: /*=*
Sitemap: http://www.yourdomain.com/sitemap.xml
Let's understand each command
Allow and index everything
User-agent: *
Allow: /
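A quick way to sanity-check this pair of lines is Python's built-in urllib.robotparser (note: the stdlib parser follows the original exclusion protocol and ignores Google's wildcard extensions, but plain rules like these work fine):

from urllib.robotparser import RobotFileParser

# Parse just the two lines above and confirm they allow everything.
rp = RobotFileParser()
rp.parse(["User-agent: *", "Allow: /"])
print(rp.can_fetch("*", "http://www.yourdomain.com/any-page.html"))  # True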
Allow indexing of the XML sitemap and mobile URLs with special characters
Allow: /*?m=*
Allow: /atom.xml?redirect=false&start-index=1&max-results=100000
Allow: /sitemap.xml?page=1
Allow: /sitemap.xml?page=2
Block dynamic URLs which contain the special characters "?" and "="
Disallow: /search
Disallow: /*?*
Disallow: /*=*
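The stdlib parser cannot evaluate the * wildcards above, so here is a rough sketch of Googlebot-style pattern matching to show how these Disallow rules and the earlier Allow: /*?m=* rule interact (the test paths are made-up examples):

import re

def pattern_to_regex(pattern):
    # Googlebot-style matching: '*' matches any run of characters,
    # '$' anchors the end, everything else is a literal prefix.
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")
        elif ch == "$":
            parts.append("$")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts))

disallow = [pattern_to_regex(p) for p in ("/search", "/*?*", "/*=*")]
allow_mobile = pattern_to_regex("/*?m=*")  # the mobile Allow rule above

for path in ("/search/label/SEO", "/2016/01/post.html?m=1", "/p/about.html"):
    blocked = any(r.match(path) for r in disallow)
    # Google picks the longest (most specific) matching rule, so the
    # explicit Allow for ?m= URLs wins over the generic /*?* Disallow.
    if blocked and allow_mobile.match(path):
        blocked = False
    print(path, "BLOCKED" if blocked else "ALLOWED")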
Sitemap information
Sitemap: http://www.yourdomain.com/sitemap.xml
Caution: in the line above, make sure you update it with YOUR website URL; otherwise it is possible that your entire website will be ignored by Google.
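As a quick safeguard against the caution above, you can confirm that the sitemap URL you put in the file actually resolves (yourdomain.com is the placeholder from the file; substitute your own domain):

from urllib.request import urlopen

# The Sitemap URL from robots.txt should return HTTP 200.
with urlopen("http://www.yourdomain.com/sitemap.xml") as resp:
    print(resp.status)  # expect 200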
How it's SEO optimized
This configuration is designed specifically for Blogger blogs. In blogs there are several URLs that cannot be changed.
For example, the dynamic comment URL structure contains special characters and expires quickly, which causes 404 errors in Webmaster Tools.
Generally a 404 error does not affect overall site ranking, but it does make a negative impression on users and, in turn, on search engine optimization.
Read more about: 5 Custom configuration for SEO in Blogger
Mobile-optimized or multi-window blogs
Due to the increase in smartphone users, Google changed its policy about optimization, and you may have noticed that Google now prioritizes search results according to design.
If a user is searching from a mobile phone, the search engine will give first priority to mobile-optimized (multi-window) web pages.
Blocking all dynamic URLs would therefore affect search engine bots too, and you could lose mobile users. That is why the Allow: /*?m=* command gives search engines explicit access to mobile URLs even though the other dynamic URLs are blocked!
This will not just keep your blog clean; it will also decrease 404 errors and optimize it.
Read more about: Blogger SEO | Does it really exist?
Validation and verification
As mentioned above, any mistake can cause your website to be ignored by search engines, so it is important to verify the file manually.
Log in to your Google Webmaster Tools account and go to Crawl > robots.txt Tester.
Enter your sitemap and other URLs, then click on Test. Make sure the result is ALLOWED.
Click on Submit to update your settings. To see the changes, reload the page and verify them.
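If you want a scripted check alongside the manual tester, a sketch using Python's urllib.robotparser can read your live file and test URLs. Keep in mind that the stdlib parser does not implement Google's wildcard extensions, so the /*?* style rules may be judged differently than in Google's tester:

from urllib.robotparser import RobotFileParser

# Read the live robots.txt and test a few URLs, as in the manual steps above.
rp = RobotFileParser()
rp.set_url("http://www.yourdomain.com/robots.txt")  # your own domain here
rp.read()

for url in ("http://www.yourdomain.com/sitemap.xml",
            "http://www.yourdomain.com/search/label/SEO"):
    print(url, "ALLOWED" if rp.can_fetch("Googlebot", url) else "BLOCKED")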