Duplicate Content Blankets 25 to 30% of the Web – Can Copying Content Get You Banned by Google?
One of the main ingredients in building and maintaining a website that ranks well in search engines is avoiding duplicate content issues.
In December of 2013, Matt Cutts of Google’s Webspam team posted a video speaking about duplicate content and the repercussions that can follow within the Google search engine results pages (SERPs).
In his video, Matt Cutts says that roughly 25% to 30% of the content on the web is duplicate… can you believe that? Out of all the web content on the Internet, at least one-quarter of it is duplicate or repetitive material.
Can Duplicate Content Get Your Site Banned by GOOGLE?
Unfortunately, there is a ranking war out there, and many people resort to plagiarism in an attempt to climb the rankings faster.
One of the problems is that people steal content from high-ranking websites, which means they are also stealing that website’s ideas and profiting from its success.
Google’s explanation of duplicate content is: “Duplicate content generally refers to substantive blocks of content within or across domains that either completely match other content or are appreciably similar.”
“In some cases, content is deliberately duplicated across domains in an attempt to manipulate search engine rankings or win more traffic. Deceptive practices like this can result in a poor user experience, when a visitor sees substantially the same content repeated within a set of search results.”
“Google tries hard to index and show pages with distinct information. In the rare cases in which Google perceives that duplicate content may be shown with intent to manipulate our rankings and deceive our users, we’ll also make appropriate adjustments in the indexing and ranking of the sites involved. As a result, the ranking of the site may suffer, or the site might be removed entirely from the Google index, in which case it will no longer appear in search results.”
You can find this information at Google Webmaster Central.
Here is Matt speaking about duplicate content:
Google only wants what’s best for its searchers, which is to return results with the most relevant pieces of content. So, don’t be insulted if your content is not chosen… you are not being penalized.
My philosophy is that, whether or not sites are being penalized for duplicate content, you must remain vigilant because someone may very well attempt to poach your content.
Duplicate Content or Plagiarism from Article Directory Sites?
Duplicate content from plagiarism can be a real problem for website owners, and it can ruin a website’s reputation.
If you have submitted content to article directory submission sites, then you have generally agreed to let others republish it under the directory’s terms (this is not the same as placing it in the public domain). These websites are ‘public content repositories’ where webmasters can search for and select content to place on their own sites. This content is usually in the form of an article written by the webmaster of another site.
That webmaster has the option to include a link to his or her own site, and that link MUST BE left intact when the article is copied by others.
Setting aside article directory submission sites (unless the person copying your content did not follow the rules), if you discover that someone has stolen your content and placed it on their site without your permission, you should first contact that person and demand that it be removed at once.
If this doesn’t work, then you will need to contact the hosting service for their site.
You can look up the hosting service in the Whois Source database. Contacting the hosting service may also alert the search engines and make them aware of the problem. This may even get the content thief removed from the search engine index altogether.
Fortunately, there are tools that can help you search for replicated content. One of the most commonly used is Copyscape. This is a free service that searches for similar or identical content, then reports it to you.
Copyscape also offers a free plagiarism warning banner that you can display on your website to deter others from stealing your work. Copyscape’s paid premium service offers more extensive searches for copies of your web pages and can also track acts of plagiarism.
Another great plagiarism detector is CopyGator. CopyGator is a free service designed to monitor your RSS feed and find where your content has been republished in cyberspace. It notifies you when a new post of yours is copied to another feed, and it provides a page where you can see where and when your content was duplicated.
CopyGator also provides a badge that you can place on your blog; it finds the feeds for your site and watches your content for duplication. When the badge turns RED, it means that your content has been duplicated.
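To get a rough feel for what “appreciably similar” content can mean in practice, here is a minimal sketch that compares two pieces of text by the overlapping three-word phrases ("shingles") they share. This is only an illustration of one common text-similarity technique (Jaccard similarity over word shingles); it is not how Copyscape, CopyGator, or Google actually work, and the function names and sample sentences are made up for the example.

```python
import re


def shingles(text, size=3):
    """Return the set of overlapping word n-grams ("shingles") in text."""
    # Lowercase and keep only word-like tokens so punctuation is ignored.
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Very short text: treat the whole thing as a single shingle.
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def jaccard_similarity(text_a, text_b, size=3):
    """Score two texts from 0.0 (no shared shingles) to 1.0 (identical sets)."""
    a, b = shingles(text_a, size), shingles(text_b, size)
    if not a and not b:
        return 0.0  # Assumption: two empty texts are not counted as duplicates.
    return len(a & b) / len(a | b)


# Hypothetical sample texts for illustration only.
original = "Google tries hard to index and show pages with distinct information."
copied = "Google tries hard to index and show pages with distinct information!"
reworded = "Google tries hard to index and display pages with unique information."
unrelated = "Our bakery opens at seven and sells fresh bread every morning."

for label, text in [("copied", copied), ("reworded", reworded), ("unrelated", unrelated)]:
    print(label, round(jaccard_similarity(original, text), 2))
```

A straight copy scores at the top of the scale, a lightly reworded copy lands somewhere in between, and unrelated text scores at the bottom. Where you draw the line for "duplicate" is a judgment call that this sketch does not settle.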
Closing Comments about Duplicate Content and Plagiarism!
If you are duplicating content on your own site, remember Matt’s warning… Google reserves the right to penalize any site that does nothing but duplicate content and does so in an abusive, deceptive, or malicious way.
As a website owner, you need to understand that you have the right to protect your website content. Using someone else’s content without authorization is wrong and, in most cases, a copyright violation, so you should stand up for yourself, because this is the only way to put an end to it.
There is another issue, known as URL canonicalization, that you need to be aware of. A URL canonicalization issue is the presence of multiple URLs that all return the same web page, which can also lead to duplicate content issues.
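As a rough illustration of the canonicalization problem, the sketch below collapses several URL variants into one preferred form so you can spot addresses that point at the same page. The rules it applies (force https, drop "www.", strip default documents such as index.html, remove trailing slashes and fragments) are assumptions chosen for the example, not Google's rules; a real site should pick its own preferred form and then enforce it with redirects or a rel="canonical" link. The example.com URLs and the `canonical_url` name are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed list of file names a server might serve for a bare directory URL.
DEFAULT_DOCUMENTS = ("index.html", "index.htm", "index.php")


def canonical_url(url):
    """Collapse common URL variants into one assumed preferred form."""
    parts = urlsplit(url)
    # hostname is already lowercased by urlsplit; note that any port is dropped.
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]  # Assumption: the non-www host is the preferred one.
    path = parts.path or "/"
    for doc in DEFAULT_DOCUMENTS:
        if path.endswith("/" + doc):
            path = path[: -len(doc)]  # "/blog/index.php" becomes "/blog/"
            break
    if len(path) > 1 and path.endswith("/"):
        path = path.rstrip("/") or "/"
    # Keep the query string, drop the fragment, and force https.
    return urlunsplit(("https", host, path, parts.query, ""))


# Hypothetical variants that would all return the same home page.
variants = [
    "http://example.com",
    "http://www.example.com/",
    "https://example.com/index.html",
    "https://WWW.Example.com/index.php#top",
]
print({canonical_url(u) for u in variants})
```

If the printed set contains a single URL, all of the variants map to the same page under these assumed rules, which is exactly the situation that can create duplicate content when each variant is crawled separately.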