Google’s recent updates to its search algorithm will have a significant impact on search results, especially where duplicate content is concerned.
The most important changes to the algorithm in the latest update are:
Link evaluation – The link evaluation techniques Google has used for some time are changing, although no details are available yet.
Fresher images – Search results for images will show fresher content with the option to filter by “Past Week.”
Higher position for local results – Local search results will appear higher in search engine result pages (SERPs).
Panda 3.3 – This Panda update is a data refresh and reportedly will not have any impact on ranking.
Panda and Duplicate Content
Google’s Panda updates changed the way the search engine deals with duplicate content. Poor content now hurts more than the pages it appears on; it can also drag down an entire website.
Panda penalizes sites with too much duplicate content, whereas the search engine previously just omitted such pages or listed them as supplemental results.
Here are 10 great ways to help you deal with duplicate content and avoid penalties due to the Google Panda update.
1. Use the canonical tag – Once you track down duplicate content with Google Webmaster Tools or similar applications, you can add a rel=canonical link tag to the head section of each duplicate page, pointing to the version you want indexed. The tag handles duplication both within a single domain and across domains.
2. 301 Redirect – Another way to handle duplication is to use a 301 redirect, which tells browsers and search engines that the page has moved permanently to another address. This may be a better option than serving 404 pages, which can disappoint human visitors and could have some impact on your site’s overall link juice. A 301 redirect is especially good for duplicate content with incoming links, since anyone following those links lands on the page you want ranked.
3. Consolidating pages – Google Panda recognizes pages with similar but not identical content as a form of duplication. Therefore, if you have articles split into many pages with thin content, you may want to consider consolidating these into fewer pages if not just a single page.
4. Templated content – Instead of using templated content, find ways to create new content without breaking the bank. Many freelance contractor websites allow you to ask for bids on projects from writers all over the world. In many cases, you can save substantial money by outsourcing to skilled writers in the developing world who are proficient in English. Another possibility is to hire a professional to create fresh, original content.
5. Robots.txt – You can use the robots.txt file to keep crawlers away from the duplicates without having to remove them from your site. The robots.txt file in the root directory tells search bots which files they may crawl when they reach your domain. Under a “User-agent” line, simply add lines like “Disallow: /duplicate.htm” or “Disallow: /duplicate-folder” to prevent bots from crawling specific pages or folders respectively. Keep in mind that robots.txt blocks crawling rather than indexing, so a blocked URL can still appear in search results if other pages link to it.
6. Meta Robots tag – Web bots also check the tags in a page’s head section before crawling through the page’s content. The meta robots tag goes between the <head> and </head> tags, prior to the main body of the page, with the following format: <meta name="robots" content="NOINDEX, NOFOLLOW">. The tag tells the web crawler not to index the content of the page or to follow links included on the page. If you do want the search bot to follow the links but not to index the page content, use the content value “NOINDEX, FOLLOW” instead.
7. Google URL removal – An option that addresses the problem of duplicate content on Google alone is to use the “Remove URL” option at Google Webmaster Tools. You can find this tool by going to “Site configuration” at Google Webmaster Tools and then clicking through to “Crawler access.” On the latter page, there are three tabs with the last one giving the option “Remove URL.” Click on this tab and enter the URL to remove the duplicate page.
8. Internal links – Be careful to point your internal links at the canonical page or the 301 redirect target rather than at the duplicate content! If you make changes such as adding a 301 redirect or a rel=canonical link, take the time to check your old links and ensure that they point to the correct page.
9. Setting Google crawl parameters – At Google Webmaster Tools, you can stop the search bot from crawling duplicate pages that differ only in their URL parameters. Click through to “Site configuration” and then to “URL parameters.” Google will display the URL parameters for your site along with the current settings for those parameters. Select “No: Doesn’t affect page content (ex: tracks usage)” for those parameters that you do not want the search bot to crawl.
10. Improve content quality – Generally, you can reduce the chance that Google detects near-duplication by creating original, high-quality content with significant depth. Thin pages can easily be categorized as near-duplicates, so increase the amount of text, but make sure that the style and quality are great. Adding rich media will help set your pages apart.
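Tips 5 and 6 are easy to get subtly wrong, so it can help to test them before relying on them. The following is a minimal Python sketch, not a definitive tool: it uses the standard library's urllib.robotparser to check whether sample URLs are blocked by the two Disallow lines quoted in tip 5, and html.parser to read the directives from a page's meta robots tag. The example.com URLs, the "User-agent: *" line, and the helper names is_crawl_allowed and meta_robots_directives are assumptions made for illustration, and Python's parser may not interpret every rule exactly as a given search engine does.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt built from the two Disallow lines in tip 5.
ROBOTS_TXT = """\
User-agent: *
Disallow: /duplicate.htm
Disallow: /duplicate-folder
"""


def is_crawl_allowed(robots_txt, url, user_agent="*"):
    """Return True if the robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


class MetaRobotsParser(HTMLParser):
    """Collects the directives of any <meta name="robots"> tag it sees."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = {name: (value or "") for name, value in attrs}
        if attributes.get("name", "").lower() == "robots":
            for directive in attributes.get("content", "").split(","):
                if directive.strip():
                    self.directives.add(directive.strip().lower())


def meta_robots_directives(html):
    """Return the lowercased meta robots directives found in html."""
    parser = MetaRobotsParser()
    parser.feed(html)
    return parser.directives


if __name__ == "__main__":
    for path in ("/duplicate.htm", "/duplicate-folder/page.htm", "/index.htm"):
        url = "https://example.com" + path
        print(path, "crawlable:", is_crawl_allowed(ROBOTS_TXT, url))

    page = '<html><head><meta name="robots" content="NOINDEX, FOLLOW"></head></html>'
    print(sorted(meta_robots_directives(page)))
```

Running a check like this against a copy of your real robots.txt and a few saved pages gives a quick sanity test that the rules block what you intended and nothing more.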
As you can see, with the recent Google algorithm update, website owners really can’t afford to ignore duplicate content anymore. A complete website audit would be beneficial in determining the best method to fix the problem, especially during a website redesign.
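One way to start such an audit is to flag pairs of pages whose text overlaps heavily. The sketch below is an illustration only and makes no claim about how Google Panda measures duplication: it compares pages by the Jaccard similarity of their three-word "shingles," a common near-duplicate heuristic. The function names, the sample pages, and the 0.5 threshold are all assumptions chosen for the example.

```python
import re
from itertools import combinations


def shingles(text, size=3):
    """Return the set of overlapping word n-grams ("shingles") in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard(a, b):
    """Return the Jaccard similarity of two sets (0.0 if both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def find_near_duplicates(pages, threshold=0.5):
    """Return (url_a, url_b, score) for page pairs at or above threshold.

    pages maps a URL to the plain text of that page.
    """
    fingerprints = {url: shingles(text) for url, text in pages.items()}
    results = []
    for url_a, url_b in combinations(sorted(fingerprints), 2):
        score = jaccard(fingerprints[url_a], fingerprints[url_b])
        if score >= threshold:
            results.append((url_a, url_b, round(score, 2)))
    return results


if __name__ == "__main__":
    # Hypothetical page texts; a real audit would extract these from a crawl.
    sample_pages = {
        "/widgets": "Red widgets ship free on all orders today",
        "/widgets?ref=home": "Red widgets ship free on all orders today",
        "/contact": "Contact our support team by phone or email",
    }
    for url_a, url_b, score in find_near_duplicates(sample_pages):
        print(url_a, url_b, score)
```

Pairs that score high are candidates for the fixes above: a canonical tag, a 301 redirect, consolidation, or a rewrite.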