Duplicate content

28 March 2024

Reading time: 7 minutes

Senior SEO-specialist

Google rewards unique, high-quality content. The opposite of unique content is duplicate content. In this article, we explain what duplicate content is and how it affects a website’s SEO.

What is duplicate content?

Duplicate content is content that appears in identical or nearly identical form at more than one URL. Internal duplicate content usually arises within a website for technical reasons.

An example is a website that has both its http and https versions live. In that case the entire website is duplicated, which can cause serious problems in Google. The same happens when both the www and non-www versions of the website are live.

The impact of duplicate content on SEO

Does one of your pages contain content that is also published on another page (inside or outside your website)? If so, this can send a negative ranking signal to Google, which monitors this closely. The reason? Google wants only unique, high-quality content in its search results.

The search engine was originally created to give visitors the best and quickest answers to their questions. That becomes impossible when Google’s index contains a lot of duplicate content. Google can also penalize duplicate content. When you blindly copy someone else’s work onto your website, Google penalizes it more harshly than duplication within a single website.

How do you avoid duplicate content?

One solution for pages that contain duplicate content is the canonical tag. It tells Google which version of a page is the preferred one to index and rank. That way, you help Google make the right choice, so the right page is the one that ranks.
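To make this concrete, here is a minimal sketch in Python that reads the canonical tag out of a page's HTML, so you can check which URL a duplicate page points to. The sample page, its URL, and the names CanonicalFinder and find_canonicals are assumptions made up for this illustration; only the rel="canonical" link element itself is the standard mechanism.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collects the href of every <link rel="canonical"> tag in a page."""

    def __init__(self):
        super().__init__()
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        # rel may hold several space-separated values, so split before comparing
        rel_values = (attributes.get("rel") or "").lower().split()
        if "canonical" in rel_values and attributes.get("href"):
            self.canonicals.append(attributes["href"])


def find_canonicals(html):
    """Return all canonical URLs declared in the given HTML string."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals


# Hypothetical print-friendly page that points to the regular version
sample_html = """
<html>
  <head>
    <title>About us (print version)</title>
    <link rel="canonical" href="https://www.example.com/about/">
  </head>
  <body><p>Same text as the regular page.</p></body>
</html>
"""

print(find_canonicals(sample_html))
```

A page should normally declare exactly one canonical URL. If the list comes back empty or contains several different URLs, that page is worth a closer look.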

Common cases of duplicate content

Duplicate content refers to substantially similar or identical content that appears at more than one web address (URL). This can occur both within a single website (internal duplicate content) and across multiple websites (external duplicate content). Here are some common forms of duplicate content:

WWW vs. non-WWW versions: When the same page is accessible from www.example.com and example.com without redirecting one of the versions to the other.
HTTP vs. HTTPS: Similar to the above, when the same content is available on both secured (HTTPS) and unsecured (HTTP) versions of a website.
Trailing slash: Pages reachable with and without a slash at the end of the URL (e.g., example.com/about and example.com/about/).
URL parameters: When URL parameters such as session IDs or tracking codes result in content that is reachable from multiple URLs.
Print-friendly versions: Content available in both regular and print-friendly versions can be seen as duplicate if proper measures are not taken.
Product pages: E-commerce sites often have multiple URLs leading to the same product, such as through different color or size selections.
Content syndication: Content originally published on one website and then republished on other sites can lead to external duplicate content.
Language- and region-specific URLs: Websites that offer multiple language or region versions of the same page may inadvertently create duplicate content if not configured correctly with hreflang tags or other methods.
Archive and categorization pages: Blogs or news websites can display identical articles on both individual post pages and archive or category pages.
Mobile and desktop versions: Before responsive design became commonplace, websites often had separate mobile versions (e.g., m.example.com) that contained the same content as their desktop equivalent.

Duplicate content can be problematic for SEO because it can confuse search engines about which version of the content should be indexed and how link equity should be distributed. Proper use of 301 redirects, canonical tags, and hreflang tags is among the techniques that address these issues.
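Several of the technical cases above (http vs. https, www vs. non-www, trailing slash, tracking parameters) come down to multiple URL variants of one page. The following Python sketch groups crawled URLs by the single version they should redirect or canonicalize to. It rests on loud assumptions: the preferred form is https, www, with a trailing slash; the site only uses the bare domain and its www host; and the parameter names in TRACKING_PARAMS are hypothetical examples of parameters that never change the content. Adjust all of these to your own site.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Assumption: parameters that never change the page content on this hypothetical site
TRACKING_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "sessionid"}


def preferred_url(url):
    """Map a URL variant to one preferred form (https, www, trailing slash)."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if not host.startswith("www."):
        host = "www." + host
    path = parts.path or "/"
    if not path.endswith("/"):
        path += "/"
    # Keep only the parameters that are not in the tracking list
    query = urlencode(
        [(key, value) for key, value in parse_qsl(parts.query)
         if key.lower() not in TRACKING_PARAMS]
    )
    return urlunsplit(("https", host, path, query, ""))


def group_variants(urls):
    """Group URLs under the preferred URL they should point to."""
    groups = defaultdict(list)
    for url in urls:
        groups[preferred_url(url)].append(url)
    return dict(groups)


# Made-up crawl export for illustration
crawled = [
    "http://example.com/about",
    "https://www.example.com/about/",
    "https://example.com/about/?utm_source=newsletter",
    "https://www.example.com/contact/",
]

for target, variants in group_variants(crawled).items():
    if len(variants) > 1:
        print(target, "<-", variants)
```

Every group with more than one variant is a candidate for a 301 redirect or a canonical tag pointing to the preferred URL.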

Determine priorities

When fixing duplicate content, I always start by setting priorities. I do this in a spreadsheet with the following columns (each column is explained below):

URL	SEO traffic	Words	Duplicate (%)	Duplicate words	Relevance

URL: The URL of the page in question.
SEO traffic: The monthly SEO traffic (I usually sort this from high to low to get an immediate picture of the pages that matter most for SEO). To be more complete, you may want to supplement this with the number of conversions.
Words: The total number of words on the page (a quick export from Screaming Frog gives you this).
Duplicate (%): The percentage of the page that is duplicate content (duplicate words divided by total words). This excludes the footer and main menu, which are of course always duplicated.
Duplicate words: The number of words of duplicate content on the page.
Relevance: How relevant the page is to us as a company (high/mid/low). This column makes it easy to see whether a page actually matters to us. Some pages receive a lot of traffic but do not generate conversions, and this column prevents those pages from being prioritized right away.

Fixing duplicate content is often a large project, which makes setting the right priorities vital. When you start with the highest priorities, you immediately make the biggest impact on organic findability.
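The spreadsheet above can also be sketched in a few lines of Python, which helps once the list of pages gets long. Note the assumptions: the sort order (relevance first, then traffic, then duplicate percentage) is one possible reading of the approach described here, not a fixed rule, and the names PageRow and prioritize as well as all figures in the sample rows are made up for illustration.

```python
from dataclasses import dataclass

# Assumption: lower number means higher priority
RELEVANCE_ORDER = {"high": 0, "mid": 1, "low": 2}


@dataclass
class PageRow:
    url: str
    seo_traffic: int      # monthly organic visits
    words: int            # total words, excluding footer and main menu
    duplicate_words: int  # words that also appear elsewhere
    relevance: str        # "high", "mid" or "low"

    @property
    def duplicate_pct(self):
        """Duplicate (%) column: duplicate words as a share of all words."""
        if self.words == 0:
            return 0.0
        return 100 * self.duplicate_words / self.words


def prioritize(rows):
    """Sort by relevance, then traffic (high to low), then duplicate % (high to low)."""
    return sorted(
        rows,
        key=lambda row: (
            RELEVANCE_ORDER[row.relevance],
            -row.seo_traffic,
            -row.duplicate_pct,
        ),
    )


# Made-up sample rows for illustration
pages = [
    PageRow("/blog/free-template/", 4200, 900, 90, "low"),
    PageRow("/services/seo/", 1500, 1200, 300, "high"),
    PageRow("/services/sea/", 800, 1000, 450, "high"),
]

for row in prioritize(pages):
    print(f"{row.url}\t{row.seo_traffic}\t{row.duplicate_pct:.0f}%\t{row.relevance}")
```

In this sketch, the high-traffic page with low relevance ends up at the bottom, which is exactly what the Relevance column is meant to achieve.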

Duplicate content at the most important pages

On the most important SEO pages of a website (often about 5 pages), I aim for 0% duplicate content. Consider optimizing the following:

Make calls to action unique to the page (yes, even the image used in the call to action).
Avoid stock photos (I wouldn’t use them in an SEO project anyway).
Make the reviews or portfolio items unique. Do not reuse items on these pages that also appear on the portfolio page.
Rewrite the USPs on these important pages so that they are unique as well. Also consider the icons used with them.
Use banner or background images that do not appear on any other page.

Be thorough here, because this is where you gain the last few percent of optimization for these important landing pages. It won’t get you from the third page to position #1, but it can ultimately make the difference between position #3 and position #2.

My advice

In many cases, duplicate content is created by an internal technical error. Google is quite accommodating here and in many cases simply shows one of the versions. The version that Google considers the best is the one shown highest for the keyword in question.

However, in some cases Google cannot tell exactly which page should be shown higher. It then displays the page with the most authority (read: the highest PageRank). Often this is not the page the owner wants to show for that keyword. Still, a website that contains a lot of duplicate content can rank high.

As SEO consultants, we always take duplicate content into account. All the content we write is unique, and we pay very close attention to duplicate content within our own site.

We therefore recommend taking enough time to create unique, high-quality content. It pays off!

Frequently Asked Questions

What is duplicate content?

Duplicate content is content that appears at more than one URL. It can occur, for example, when a website has both its http and https versions live at the same time. In that case the whole website is duplicated, which can cause problems in Google. Google checks strictly for all possible duplicate content.

What is the impact of duplicate content on SEO?

Above all, Google wants unique, high-quality content in its search results. Duplicate content sends a negative ranking signal, and Google can penalize it. After all, a search engine exists to answer visitors’ questions as quickly as possible.