What Is a robots.txt File and How Can You Best Use It for Your Benefit?
Duplicate content is a serious concern for any SEO effort because it can hurt a site's ranking. In some cases, though, publishing such content is unavoidable, for example when you want to offer visitors a printable version of a page. You might be wondering whether there is a way to keep search engines from indexing a page, and the answer is the robots.txt file.
For those who are new to SEO: robots.txt is a plain-text file (not HTML code) whose rules ask search engine crawlers to stay away from the pages or directories you list. It is especially useful when the site has several copies of the same page, or content you would rather keep out of search results. Images, JavaScript and stylesheets can be excluded from crawling in the same way, which also saves bandwidth.
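To make the idea concrete, here is a minimal sketch of how such rules behave, using Python's standard `urllib.robotparser` module to read a small robots.txt and ask whether a URL may be crawled. The rules and paths (`/print/`, `/private/`) and the helper `build_parser` are made up for illustration and are not taken from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: these paths are illustrative assumptions only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /print/
Disallow: /private/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text and return a parser that can answer can_fetch()."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# The printable copy is under a disallowed directory, the original page is not.
print(parser.can_fetch("*", "http://www.mysite.com/print/article.html"))  # False
print(parser.can_fetch("*", "http://www.mysite.com/article.html"))        # True
```

A well-behaved crawler performs roughly this check before requesting a page. Nothing forces a crawler to do so, which is why the rules are only a request.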
Keep in mind that robots.txt is only a request to the search engines, not an access control, so relying on it alone to protect sensitive data is a mistake. If a crawler cannot find the robots.txt file, or chooses to ignore it, the data can still end up indexed. It is better to use a combination of methods to protect sensitive data.
Search engines do not search the whole site for robots.txt. They look in only one place, the main (root) directory, for example www.mysite.com/robots.txt. The file must sit there for the crawlers to find it and skip the files you do not want indexed. If it is placed anywhere else, the crawlers will not see it and may index every page.
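As a rough illustration of the "one location only" rule, the sketch below works out where a crawler would look for robots.txt given any page URL on a site. The function name `robots_url` and the example URLs are assumptions for this example.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_url(page_url: str) -> str:
    """Return the root-level robots.txt URL for the site that hosts page_url."""
    parts = urlsplit(page_url)
    # Keep only the scheme and host; the path is always /robots.txt at the root.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


print(robots_url("http://www.mysite.com/blog/2020/post.html?print=1"))
# http://www.mysite.com/robots.txt
```

However deep the page sits in the site, the answer is the same root-level file, so a robots.txt saved in a subdirectory such as /blog/ would never be consulted.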