How To Prevent Duplicate Content With Effective Use Of The Robots.txt And Robots Meta Tag
Duplicate content is one of the problems that we regularly come across as part of the search engine optimization services we offer. If the search engines determine that your site contains duplicate or near-duplicate content, this may result in penalties and even exclusion from their indexes. Fortunately, it's a problem that is easily rectified.
Your primary weapon of choice against duplicate content can be found within "The Robots Exclusion Protocol", which has now been adopted by all the major search engines.
There are two ways to control how the search engine spiders index your site:
1. The Robots Exclusion File or "robots.txt" and
2. The Robots <meta> Tag
The Robots Exclusion File (Robots.txt)
This is a simple text file that can be created in Notepad or any other plain-text editor. Once created, you must upload the file to the root directory of your website, e.g. www.yourwebsite.com/robots.txt. Before a search engine spider crawls your website, it looks for this file, which tells it which parts of your site it may and may not index.
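Because spiders only look for the file in the root directory, the robots.txt address can be worked out from any page address on the site. The following is a minimal sketch of that rule in Python; the function name robots_txt_url and the sample page address are made up for illustration.

```python
from urllib.parse import urlsplit, urlunsplit


def robots_txt_url(page_url: str) -> str:
    """Return the root-level robots.txt address for the site hosting page_url."""
    parts = urlsplit(page_url)
    # Keep only the scheme and host; the path is always /robots.txt at the root.
    return urlunsplit((parts.scheme, parts.netloc, "/robots.txt", "", ""))


# Hypothetical page address, used only to demonstrate the function.
print(robots_txt_url("http://www.yourwebsite.com/faq/page.html?id=3"))
# prints http://www.yourwebsite.com/robots.txt
```

A robots.txt file uploaded anywhere else, for example inside /faq/, would not be found this way.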
The robots.txt file is best suited to static HTML sites, or to excluding certain files in dynamic sites. If the majority of your site is dynamically created, then consider using the Robots Tag.
Creating your robots.txt file
Example 1 Scenario
Suppose you want the robots.txt file to apply to all search engine spiders and to make the entire site available for indexing. The file would look like this:
User-agent: *
Disallow:
Explanation
The use of the asterisk with the "User-agent" means this robots.txt file applies to all search engine spiders. By leaving the "Disallow" blank all parts of the site are suitable for indexing.
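If you want to confirm how a spider is likely to read these two lines, Python's standard urllib.robotparser module can interpret them. The sketch below is only a check under assumptions: the spider name "AnySpider" and the sample paths are invented, and real spiders may differ in detail. It also shows the contrast with "Disallow: /", where a single slash blocks the whole site instead of opening it.

```python
from urllib.robotparser import RobotFileParser

SITE = "http://www.yourwebsite.com"  # placeholder address from the article

# Example 1: a blank Disallow line leaves everything open.
allow_all = RobotFileParser()
allow_all.parse(["User-agent: *", "Disallow:"])

# For contrast: a single slash after Disallow closes the entire site.
block_all = RobotFileParser()
block_all.parse(["User-agent: *", "Disallow: /"])

for path in ["/", "/faq/", "/cgi-bin/search.cgi"]:
    print(path,
          "blank Disallow:", allow_all.can_fetch("AnySpider", SITE + path),
          "| Disallow /:", block_all.can_fetch("AnySpider", SITE + path))
```

Every path comes back as allowed under the blank "Disallow" and as blocked under "Disallow: /", so it is worth double-checking which of the two you have uploaded.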
Example 2 Scenario
Suppose you want the robots.txt file to apply to all search engine spiders and to stop them from indexing the faq, cgi-bin and images directories, as well as a specific page called faqs.html in the root directory. The file would look like this:
User-agent: *
Disallow: /faq/
Disallow: /cgi-bin/
Disallow: /images/
Disallow: /faqs.html
Explanation
The use of the asterisk with the "User-agent" means this robots.txt file applies to all search engine spiders. Preventing access to the directories is achieved by naming them, and the specific page is referenced directly. The named files & directories will now not be indexed by any search engine spiders.
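For sites with many excluded paths it can be easier to generate the file than to type it. The sketch below builds the Example 2 file from a list and then checks a few addresses with Python's standard urllib.robotparser. The helper name build_robots_txt, the spider name "AnySpider" and the test paths are assumptions made for illustration.

```python
from urllib.robotparser import RobotFileParser


def build_robots_txt(user_agent, disallowed_paths):
    """Build one robots.txt record: a User-agent line plus its Disallow lines."""
    lines = [f"User-agent: {user_agent}"]
    if disallowed_paths:
        lines += [f"Disallow: {path}" for path in disallowed_paths]
    else:
        lines.append("Disallow:")  # blank value: nothing is excluded
    return "\n".join(lines) + "\n"


robots_txt = build_robots_txt("*", ["/faq/", "/cgi-bin/", "/images/", "/faqs.html"])
print(robots_txt)

# Read the generated rules back and test some sample addresses against them.
checker = RobotFileParser()
checker.parse(robots_txt.splitlines())

site = "http://www.yourwebsite.com"
for path in ["/faq/index.html", "/images/logo.gif", "/faqs.html", "/index.html"]:
    verdict = "allowed" if checker.can_fetch("AnySpider", site + path) else "blocked"
    print(f"{path}: {verdict}")
```

The first three paths are reported as blocked and /index.html as allowed, which matches the intent of the example.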
Example 3 Scenario
Suppose you want the robots.txt file to apply only to the Google spider, googlebot, and to stop it from indexing the faq, cgi-bin and images directories, as well as a specific HTML page called faqs.html in the root directory. The file would look like this:
User-agent: googlebot
Disallow: /faq/
Disallow: /cgi-bin/
Disallow: /images/
Disallow: /faqs.html
Explanation
By naming a particular search spider in the "User-agent" line, you apply the rules that follow to that spider only. Preventing access to the directories is achieved by simply naming them, and the specific page is referenced directly. The named files and directories will not be indexed by Google; other spiders are not affected by this record.
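The effect of naming a single spider can be checked in the same way with Python's standard urllib.robotparser. In this sketch the second spider name, "SomeOtherBot", is invented to stand for any spider other than Google's.

```python
from urllib.robotparser import RobotFileParser

rules = """\
User-agent: googlebot
Disallow: /faq/
Disallow: /cgi-bin/
Disallow: /images/
Disallow: /faqs.html
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

url = "http://www.yourwebsite.com/faq/index.html"
print("Googlebot:", parser.can_fetch("Googlebot", url))        # False
print("SomeOtherBot:", parser.can_fetch("SomeOtherBot", url))  # True
```

Only the named spider is kept out. To exclude every spider, use the asterisk as in Example 2.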
That's all there is to it!
As mentioned earlier, the robots.txt file can be difficult to implement on dynamic sites, and in that case it's probably necessary to use a combination of the robots.txt file and the robots tag.
The Robots Tag
This alternative way of telling the search engines what to do with site content appears in the <head> section of a web page. A simple example would be as follows:
<meta name="robots" content="noindex, nofollow">
In this example we are telling all search engines not to index the page or to follow any of the links contained within the page.
In this second example I don't want Google to cache the page, because the site contains time-sensitive information. This can be achieved simply by adding the "noarchive" directive, for example:
<meta name="googlebot" content="noarchive">
What could be simpler!
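On a dynamic site the tags are usually written by a template, so it is worth checking what a finished page actually contains. The sketch below uses Python's standard html.parser to collect the directives from robots-style meta tags. The class name RobotsMetaParser and the sample page are assumptions for illustration, and the tags are assumed to follow the usual form, a meta element with name and content attributes.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collect directives from robots-style <meta> tags, keyed by spider name."""

    def __init__(self):
        super().__init__()
        self.directives = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        name = (attr.get("name") or "").lower()
        # "robots" addresses all spiders; "googlebot" addresses Google only.
        if name in ("robots", "googlebot"):
            content = attr.get("content") or ""
            values = {v.strip().lower() for v in content.split(",") if v.strip()}
            self.directives.setdefault(name, set()).update(values)


# Hypothetical page carrying both kinds of tag discussed in this article.
page = """<html><head>
<meta name="robots" content="noindex, nofollow">
<meta name="googlebot" content="noarchive">
</head><body></body></html>"""

meta = RobotsMetaParser()
meta.feed(page)

print("all spiders:", sorted(meta.directives.get("robots", set())))
print("googlebot only:", sorted(meta.directives.get("googlebot", set())))
```

Running a check like this over the pages a template produces is a quick way to catch a missing or misspelled directive before the spiders do.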
Although there are other ways of preventing duplicate content from appearing in the search engines, this is the simplest to implement, and all websites should use a robots.txt file, a robots tag, or a combination of the two.
Should you require further information about our search engine marketing or optimization services, please visit us at http://www.e-prominence.co.uk - The search marketing company
Copyright © 2006 Empower! Web Design | All Rights Reserved.
