Duplicate - Not set as canonical by the user

August 7, 2024

Niels Stuck CEO & Founder

ABOUT THE AUTHOR

SEO expert with over 10 years of experience. I help companies become visible online.

What is a duplicate?

A duplicate, also known as "duplicate content", occurs when identical or almost identical content appears under different URLs on the Internet. This can happen both deliberately and accidentally.

Search engines such as Google recognize these duplicates and often have difficulty selecting the relevant, original page and prioritizing it in the search results. This can not only lead to a poorer ranking but also significantly impair the effectiveness of SEO measures.

Types and examples of duplicate content

Duplicate content can occur in various forms. It can occur either within a single domain (internal duplicate content) or across multiple domains (external duplicate content). Examples:

A print version and an HTML version of the same article.
Pages with and without trailing slash, such as example.com/page and example.com/page/.
HTTPS and HTTP versions of the same website.

Identifying and eliminating such duplicates is essential for an effective SEO strategy.
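
The URL variants listed above can be grouped with a small normalization step. The following is a minimal sketch, not a definitive implementation: the function names normalize_url and group_duplicates are hypothetical, and the policy (prefer HTTPS, lowercase the host, drop the trailing slash) is an assumption that a real site would adapt to its own canonical rules.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url: str) -> str:
    """Collapse common duplicate variants of a URL into one comparison key.

    Assumed policy (not from the article): prefer https, lowercase the host,
    and drop a trailing slash on every path except the root.
    """
    parts = urlsplit(url)
    host = parts.netloc.lower()
    path = parts.path or "/"
    if len(path) > 1 and path.endswith("/"):
        path = path.rstrip("/") or "/"
    # The scheme is forced to https so http/https variants compare as equal.
    return urlunsplit(("https", host, path, parts.query, ""))


def group_duplicates(urls):
    """Return only those groups that contain more than one URL variant."""
    groups = {}
    for url in urls:
        groups.setdefault(normalize_url(url), []).append(url)
    return {key: found for key, found in groups.items() if len(found) > 1}
```

Calling group_duplicates with http://example.com/page and https://example.com/page/ puts both under the same key, which marks them as candidates for a single canonical URL.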

Negative effects of duplicate content

Duplicate content can have several undesirable consequences. One of the biggest problems is that search engines distribute the ranking potential, also called "link juice", across several pages instead of assigning it to one central page. As a result, none of the duplicate pages gets the full SEO benefit.

In addition, the number of crawled pages that have to be processed by search engine bots can be unnecessarily increased. This can affect the indexing and findability of important web pages. It is therefore crucial to identify duplicate content and take appropriate measures to clearly distinguish between original and duplicate content.

Detection and challenges of duplicate URLs

Identifying duplicate URLs is crucial to optimizing the SEO performance of a website. Duplicate content is often caused by factors such as differing URL parameters, print versions of pages, or pages that are reachable under both HTTP and HTTPS. Without a canonical annotation, Google treats these pages as equivalent duplicates, which means that no clear priority is established between them.

Tools and techniques for identification

An effective way of identifying duplicate content is to use the Google Search Console. Under the menu item "Index" and then "Pages", URLs can be checked for successful and incorrect indexing. The index coverage report helps to identify affected URLs and their errors. Other helpful tools are specialized SEO analysis tools that can generate automated reports on duplicate content.
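
To illustrate what an automated duplicate report might do internally, here is a rough sketch that groups pages by a fingerprint of their text. It is an assumption about how such tools could work, not a description of any specific product. The names fingerprint and duplicate_report are hypothetical, and the approach only catches text that is identical after whitespace and case are normalized; near-duplicates would need a similarity measure instead.

```python
import hashlib
import re


def fingerprint(text: str) -> str:
    """Hash the text after collapsing whitespace and case differences."""
    normalized = re.sub(r"\s+", " ", text).strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def duplicate_report(pages: dict) -> list:
    """Map {url: page text} to a list of URL groups that share identical text."""
    by_hash = {}
    for url, text in pages.items():
        by_hash.setdefault(fingerprint(text), []).append(url)
    return [sorted(urls) for urls in by_hash.values() if len(urls) > 1]
```

The input dictionary is expected to contain already extracted page text; fetching and HTML stripping are left out on purpose.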

Challenges in eliminating duplicates

Removing duplicate URLs requires precise and consistent measures. One of the challenges is choosing the right canonical URL. Recommended methods include setting a rel=canonical tag in the HTML code or sending a corresponding HTTP header in the page response. For larger websites, it also makes sense to list the canonical pages in a sitemap. Another proven method is to implement 301 redirects, especially when a duplicated page is no longer actively used or is to be deactivated.

However, it is important to note that the robots.txt file is not used for canonicalization and that internal links within the website should consistently point to the canonical URL. Furthermore, when implementing hreflang tags, a canonical page must always be specified, and Google generally prefers HTTPS pages over HTTP pages.

Methods for defining a canonical URL

Determining a canonical URL is a decisive step in avoiding duplicate content and giving search engines a clear prioritization. There are several methods for specifying a canonical URL, each with its own areas of application and advantages.

Rel=canonical tag

The rel=canonical tag is probably the most common method of marking a canonical URL. This tag is embedded directly in the <head> section of a page's HTML code and tells search engines which version of a page should be regarded as the original or main version. This method is best suited to HTML pages and is relatively easy to implement.
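
As a quick way to verify that a page actually carries the tag in its <head>, the following sketch extracts it with Python's standard HTML parser. The class CanonicalFinder and the function find_canonicals are hypothetical names chosen for this example; the sketch deliberately ignores link elements outside of <head>.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Collect the href of every <link rel="canonical"> found inside <head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.canonicals = []

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "link" and self.in_head:
            attributes = dict(attrs)
            rel_values = (attributes.get("rel") or "").lower().split()
            if "canonical" in rel_values and attributes.get("href"):
                self.canonicals.append(attributes["href"])

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def find_canonicals(html: str) -> list:
    """Return all canonical URLs declared in the head of the given HTML."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonicals
```

A result with exactly one entry is the expected case. An empty list means no canonical tag was found, and more than one entry points to conflicting declarations.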

HTTP header rel=canonical

An alternative method is to send a rel="canonical" header in the HTTP response of the page. This method is particularly useful because it does not increase the size of the page itself. It can also be used to canonicalize content that is not HTML, such as PDF files.
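
The header in question is the standard HTTP Link header. The sketch below shows how its value could be built and read back; the function names and the example URL are hypothetical, and the parser is simplified (it does not handle commas inside quoted parameter values).

```python
import re


def canonical_link_header(canonical_url: str) -> tuple:
    """Build the (name, value) pair for a rel="canonical" HTTP response header."""
    return ("Link", f'<{canonical_url}>; rel="canonical"')


def canonical_from_link_header(value: str):
    """Return the canonical target from a Link header value, or None."""
    # Each link is "<target>" followed by its parameters up to the next comma.
    for match in re.finditer(r'<([^>]*)>([^,]*)', value):
        target, params = match.groups()
        if re.search(r'\brel\s*=\s*"?[^";]*\bcanonical\b', params, re.IGNORECASE):
            return target
    return None
```

A server would attach the returned pair to the response for the duplicate resource, for example a PDF that mirrors an HTML page.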

Sitemap and 301 redirection

For larger websites, it is advisable to list the canonical pages directly in a sitemap. This makes it easier for search engines to index the site correctly. Another common practice is the use of 301 redirects. This method is particularly useful if a duplicated page is no longer in use or is to be permanently removed.
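
Both ideas can be sketched in a few lines. The example below generates a sitemap that lists each canonical URL once and returns a 301 response for a retired duplicate path. The function names and the redirect mapping are hypothetical, and a real site would normally configure redirects in the web server or CMS rather than in application code.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(canonical_urls) -> str:
    """Return sitemap XML listing each canonical URL once, in input order."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for url in dict.fromkeys(canonical_urls):  # de-duplicates, keeps order
        ET.SubElement(ET.SubElement(urlset, "url"), "loc").text = url
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


def redirect_response(path: str, redirects: dict):
    """Return (status, headers) for a retired duplicate path, or None."""
    target = redirects.get(path)
    if target is None:
        return None
    return 301, {"Location": target}
```

Only canonical URLs belong in the sitemap; the duplicates themselves are handled by the redirect mapping.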

It is crucial to ensure that different canonical URLs are never specified for the same page and that internal links within the website consistently point to the canonical URL. It should also be noted that a canonical page is always specified when hreflang tags are used and that HTTPS pages are generally preferred over HTTP pages.

Common crawling errors and their causes

Crawling errors occur when search engine bots have difficulty indexing a page or specific content. These errors can arise for various reasons and must be monitored and fixed regularly so that the search engine optimization and indexing of the site are not impaired.

Server errors and URL blocking

Server errors (5xx) are caused by problems such as server overload, incorrect server configuration or server failures. In these cases, the bot is unable to access the page. Continuous monitoring and immediate resolution of these problems are essential. Another common error occurs when URLs are blocked by the robots.txt file. In this case, a rule in the robots.txt file prevents access to the URL. The solution is to remove the blocking rule in question and thereby allow access.
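
Whether a given rule blocks a URL can be checked locally with Python's standard urllib.robotparser module. This is a minimal sketch: the helper name is_blocked and the sample rules are made up for illustration, and the standard library parser does not reproduce every detail of how Google evaluates robots.txt, so the result is an approximation.

```python
from urllib.robotparser import RobotFileParser


def is_blocked(robots_txt: str, url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the given robots.txt text disallows user_agent for url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, url)
```

Running the check before and after editing the file shows whether removing the rule actually frees the URL.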

Errors such as "Submitted URL marked 'noindex'" occur when a noindex meta tag is present. To rectify this error, the noindex meta tag can be removed if the URL should be indexed. Soft 404 errors are also common. These occur when a page does not exist but still returns a 200 status code. An HTTP response code of 404 or 410 should be returned instead.
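
The checks described in this section can be combined into a simple diagnostic. The sketch below is only a heuristic under stated assumptions: the function diagnose, the class RobotsMetaFinder and the phrases in ERROR_PHRASES are hypothetical, and detecting a soft 404 by page wording is a rough approximation, not the method Google uses.

```python
from html.parser import HTMLParser

# Hypothetical phrases; a real site would use its own error-page wording.
ERROR_PHRASES = ("page not found", "no longer available")


class RobotsMetaFinder(HTMLParser):
    """Detect a <meta name="robots"> tag whose content includes noindex."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "meta" and (attributes.get("name") or "").lower() == "robots":
            if "noindex" in (attributes.get("content") or "").lower():
                self.noindex = True


def diagnose(status: int, html: str) -> list:
    """Return a list of likely indexing issues for one fetched page."""
    issues = []
    lowered = html.lower()
    if 500 <= status <= 599:
        issues.append("server error (5xx)")
    if status == 200 and any(phrase in lowered for phrase in ERROR_PHRASES):
        issues.append("possible soft 404")
    finder = RobotsMetaFinder()
    finder.feed(html)
    if finder.noindex:
        issues.append("noindex meta tag")
    return issues
```

A page that returns status 200 with an error message in its body is flagged as a possible soft 404, while the same body with status 404 is not, since that response is already correct.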

Pages not found and indexing problems

A "404 Not Found" error means that the URL does not exist. To fix this, incoming links should be checked and redirected where appropriate. A "403 Forbidden" error indicates that the search engine bot lacks the necessary login credentials. Here, access for the crawler should be enabled without restriction.

Statuses such as "Crawled - currently not indexed" are caused by content that is less relevant to users, unrated pages or duplicate content. To solve these problems, duplicate content should be checked, internal links optimized and more helpful content created. Similar measures apply to pages with the status "Discovered - currently not indexed", where the URL has been found but not yet crawled.

Other specific errors

A common status is "Alternate page with proper canonical tag", where the canonical tag refers to the main version of the content. The error "Duplicate - Not set as canonical by the user" (shown in the English Search Console as "Duplicate without user-selected canonical") occurs if no canonical tags are set for the main version. Appropriate tags should be set here. There are also cases in which Google selects a different page as canonical, as with "Duplicate, Google chose different canonical than user". Redirects should be checked here and adjusted manually if necessary.

Measures to avoid duplicates in WordPress

Avoiding duplicates in WordPress is an important part of SEO management and requires a targeted approach. One of the most effective ways to identify and eliminate duplicate content is to use the Google Search Console or specialized SEO tools that detect duplicate content and make appropriate recommendations.

Identification and canonicalization

The first step is to identify duplicates and determine the main page. Once the main page has been defined, canonical tags can be set. In WordPress, this is done either with an SEO plugin or manually.

With an SEO plugin: Edit the page or post in question in the WordPress administration, go to the SEO plugin area, scroll to the canonical URL field, enter the URL of the main page and save.

Without a plugin: Switch to text mode in the WordPress editor, insert <link rel="canonical" href="URL_YOUR_MAIN_PAGE" /> in the <head> area and update the page.

Structure and internal linking

A clear and unambiguous page structure also helps to avoid duplicate content. Care should be taken to ensure that each page has a unique and clear purpose. Internal linking should be optimized so that it consistently points to the canonical URL. It is also advisable to carry out regular SEO audits to ensure that all canonical tags are set correctly and that no new duplicates have arisen.
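
One part of such an audit, checking that internal links point to the canonical URL, can be sketched as follows. The names LinkCollector and non_canonical_links are hypothetical, and canonical_for is an assumed mapping from known duplicate URLs to their canonical counterparts that would come from an earlier crawl or audit.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlsplit


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> element."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def non_canonical_links(html: str, page_url: str, canonical_for: dict) -> list:
    """List internal links that point at a known duplicate.

    canonical_for is a hypothetical mapping {duplicate URL: canonical URL}.
    Each finding is a (linked URL, canonical URL it should point to) pair.
    """
    collector = LinkCollector()
    collector.feed(html)
    site = urlsplit(page_url).netloc
    findings = []
    for href in collector.hrefs:
        absolute = urljoin(page_url, href)  # resolve relative links
        if urlsplit(absolute).netloc == site and absolute in canonical_for:
            findings.append((absolute, canonical_for[absolute]))
    return findings
```

Each finding names a link that should be rewritten to its canonical target; links to other domains are skipped.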

One preventative measure is to avoid duplicates from the outset by ensuring that no superfluous pages or posts are created. If necessary, external help can also be sought to ensure that the measures presented are implemented correctly and that all potential problems are identified and rectified.

About the author

Niels Stuck

Niels Stuck has 10 years of SEO experience and is the founder of the SEO agency "WOLF OF SEO". He gained practical experience by building 20+ affiliate sites alongside his marketing studies. Finally, he wrote his bachelor thesis about the influence of SEO on Google rankings, traffic and sales development in the form of a case study. Today, he specializes in e-commerce SEO and helps more than 80 companies build a sustainable organic revenue channel through SEO. Niels advises startups, established brands and corporations in search engine optimization of their online stores and primarily focuses on data-based content strategies and link building. He shares his knowledge about SEO and online marketing in this blog, as a speaker at conferences, in podcasts and as a guest author for OMT, Forbes, Starting Up and many more platforms.
