Over the past two days, I have been writing about duplicate content. As discussed earlier, WordPress blogs are notorious for duplicate content. Duplicate content can confuse search engines, and can get you penalized by Google.
Today’s Lesson
While researching on Google’s Webmaster Tools site, I found that Google suggests using a robots.txt file as one way to avoid duplicate content.
The robots.txt file gives crawlers, bots and spiders “instructions” as to what to crawl on your site.
With a robots.txt file, you can keep sections of your blog from being crawled, thus avoiding duplicate content.
In researching this issue, I found differing opinions. Some say a definite “Yes”, you need a robots.txt file. Others claim it’s not necessary.
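As a rough sketch of what such instructions could look like, the Python snippet below parses a made-up robots.txt that blocks category and tag archive pages, then uses the standard library's urllib.robotparser to check which URLs a crawler would be allowed to fetch. The paths /category/ and /tag/, the sample post URL, and the helper name is_allowed are assumptions for illustration only; they are not rules recommended by Google or by this post.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: keep crawlers out of archive pages that repeat post content.
SAMPLE_RULES = """\
User-agent: *
Disallow: /category/
Disallow: /tag/
"""


def is_allowed(rules: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in (
        "http://yoursitename.com/2008/05/my-post/",
        "http://yoursitename.com/category/news/",
        "http://yoursitename.com/tag/seo/",
    ):
        print(url, is_allowed(SAMPLE_RULES, url))
```

Whether blocking these particular sections makes sense depends on how your own blog is set up, so treat the rules above as a starting point for testing rather than a recipe.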
Today’s Lesson
Having reviewed your site for duplicate content, do you deem it necessary to add a robots.txt file to your blog?
To learn more about robots.txt files, here’s a link that gives very valuable information.
To know what others are doing, Daniel, at Daily Blog Tips, wrote a great post, where he researched how others are dealing with this issue. He includes sites such as Problogger, John Chow, and TechCrunch. The results are quite interesting.
Adding a robots.txt file to your blog is a decision only you can make.
To see how your site looks to the robots, you can type http://yoursitename.com/robots.txt into your browser’s address bar.
When you press Enter, the contents of the file will appear. It may look like this:
User-agent: *
Disallow:
The asterisk (*) means the rule applies to all crawlers, spiders and bots (user agents). A “Disallow:” line with nothing after it means that nothing is off limits, so they are allowed to crawl everything on your site.
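If you would like to confirm how a crawler reads those two lines, here is a minimal Python sketch using the standard library's urllib.robotparser. It compares the file shown above with a file that blocks everything ("Disallow: /"). The helper name crawl_allowed and the sample URL are assumptions made up for this example.

```python
from urllib.robotparser import RobotFileParser

# The file shown above: an empty Disallow blocks nothing.
ALLOW_ALL = "User-agent: *\nDisallow:\n"

# For contrast: a single slash blocks the whole site.
BLOCK_ALL = "User-agent: *\nDisallow: /\n"


def crawl_allowed(rules: str, url: str, user_agent: str = "*") -> bool:
    """Return True if the robots.txt text in rules lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(rules.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "http://yoursitename.com/any-page/"
    print("empty Disallow:", crawl_allowed(ALLOW_ALL, page))
    print("Disallow: /   :", crawl_allowed(BLOCK_ALL, page))
```

The one-character difference between the two files is a good reminder of why it pays to double-check a robots.txt file after editing it.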
What have you decided?
Do you feel comfortable setting up a robots.txt file?
Do you think you need one?
What I did was install a plugin for this purpose. It is called the KB Robots.txt plugin, and it was written for WordPress blogs by Adam R. Brown. It can be downloaded here. Many thanks, Adam.
Hello Barbara,
I have had this set up on my site for a few weeks now, and I am finding that since I did this, Technorati never updates my site. My Technorati listing says I have not written any articles for 44 days!!
Do you think I am blocking Technorati’s robot from crawling?
These things are so tricky.
Hi Asako,
I looked at your robots.txt file, and it doesn’t appear you are blocking any user agents.
Does your blog automatically ping Technorati when you publish a post? WordPress uses pingomatic, which includes Technorati.
You can also manually ping Technorati, but that can get time consuming.
I know you were “working behind the scenes” a few weeks ago. You might try and backtrack and see if you can figure out what you changed. It’s probably something very simple.
You are correct... make one small mistake, and you could be in deep trouble. Trust me, I crashed both of my blogs, and it was all because I did something I wasn’t familiar with. But... that’s how we learn.