Orphan Pages: Why They Matter for SEO (& How to Find & Fix Them on Your Website)
If you have orphan pages on your website, you may be missing out on valuable opportunities to drive traffic to those pages, or you might be wasting your crawl budget on unimportant content. In this article, you’ll learn how to avoid SEO issues relating to orphan pages, and how to find (and fix) any orphan pages that exist on your site.
Orphan pages are URLs that do not receive internal links from any other page on your website.
Without internal links pointing to a page on your website, it’s much harder for that page to be discovered and indexed by search engine crawlers (more on that below!). Website visitors will likewise not be able to find your orphan pages unless they know the direct URL.
If the orphan page in question houses content that you want people to find in their search results (or as they browse through your wider site), you’ll likely want to fix its orphan page status by adding internal links.
The good news is that orphan pages are easily found and resolved — in this article, we’ll explain why orphan pages matter from an SEO perspective and show you how to find orphan pages on your site so you can start getting them discovered by search engines and humans alike.
This post is part of Deepcrawl’s series on Website Health. In this series, we are diving into the 7 categories of the SEO Funnel to help digital marketing teams learn more about the many factors of search engine optimization that contribute to a high-performing, well-ranking website. Our resources will help you learn more about Site Architecture as a foundational aspect of the SEO funnel.
What is an orphan page?
Orphan pages (also called orphaned pages) are simply pages on a website with no internal links leading to them. These pages are isolated from the larger ecosystem of your site because, without internal links in place, visitors cannot discover them from any other site section. From an SEO perspective, it is also much harder for search engines to discover orphan pages.
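To make the definition concrete, here is a minimal Python sketch that flags orphan pages in a small, made-up link graph. The URLs, the `internal_links` mapping, and the `find_orphan_pages` helper are all hypothetical; a real site would get this data from a crawl.

```python
from typing import Dict, Iterable, Set


def find_orphan_pages(
    all_urls: Iterable[str], links: Dict[str, Set[str]], home: str
) -> Set[str]:
    """Return URLs that receive no internal link from any other page."""
    linked_to: Set[str] = set()
    for source, targets in links.items():
        # A page linking to itself does not count as an inbound link.
        linked_to.update(t for t in targets if t != source)
    # Assumption: the home page is the entry point, so it is never an orphan.
    return {url for url in all_urls if url not in linked_to and url != home}


# Hypothetical site: every known URL, plus who links to whom.
site_urls = ["/", "/blog/", "/blog/post-1", "/landing/ppc-offer"]
internal_links = {
    "/": {"/blog/"},
    "/blog/": {"/", "/blog/post-1"},
    "/blog/post-1": {"/blog/"},
}

orphans = find_orphan_pages(site_urls, internal_links, home="/")
print(sorted(orphans))  # ['/landing/ppc-offer']
```

The key point is that the list of known URLs has to come from somewhere other than the link graph itself (a sitemap, analytics, or server logs), because a link-following crawl cannot see orphan pages by definition.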
Are orphan pages bad for SEO?
Often, orphan pages will not appear in users’ search engine results pages (SERPs). This is because the search engine crawlers that are responsible for finding and indexing pages (and serving them up to users in the SERPs) navigate websites in large part through the internal linking structures that are in place on your site. (Note: search engines also use the information provided in your sitemap to discover new pages).
In short, a page without any internal links pointing to it is less likely to be crawled — and therefore, less likely to be indexed and found by potential visitors.
Here’s how Google explains its crawling and URL discovery process:
The first stage is finding out what pages exist on the web. There isn’t a central registry of all web pages, so Google must constantly look for new and updated pages and add them to its list of known pages. This process is called “URL discovery”. Some pages are known because Google has already visited them. Other pages are discovered when Google follows a link from a known page to a new page: for example, a hub page, such as a category page, links to a new blog post. Still other pages are discovered when you submit a list of pages (a sitemap) for Google to crawl.
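The link-following part of that process can be sketched as a breadth-first traversal. This is a simplified illustration, not Google’s actual crawler: the `discover_urls` function and the example link graph are assumptions made for this sketch.

```python
from collections import deque
from typing import Dict, List, Set


def discover_urls(seeds: List[str], links: Dict[str, List[str]]) -> Set[str]:
    """Follow links breadth-first from already-known pages (the seeds)."""
    known: Set[str] = set(seeds)
    queue = deque(seeds)
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in known:
                known.add(target)
                queue.append(target)
    return known


# Hypothetical link graph: a hub (category) page links to a new blog post.
links = {
    "/": ["/category/"],
    "/category/": ["/blog/new-post"],
    "/orphan-page": ["/"],  # links out, but nothing links in
}

from_links_only = discover_urls(["/"], links)
with_sitemap = discover_urls(["/", "/orphan-page"], links)  # sitemap adds a seed

print("/orphan-page" in from_links_only)  # False
print("/orphan-page" in with_sitemap)     # True
```

In this toy model the orphan page is only discovered when it is supplied as a seed, which mirrors the sitemap route described in the quote.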
It’s also worth noting that pages with fewer internal links pointing to them, in general, are often assumed by search engines to be less important and ranked accordingly in the SERPs—this is one reason internal linking matters a lot when trying to get your content ranked highly.
So, if you have an orphan page with content that you’d like people to find, you’ll definitely want to ensure it can be found by users and search engines alike!
On the other hand, if you have orphan pages that you don’t want Google to find, but that are discovered by Google’s crawlers regardless (for example, through external backlinks or a sitemap), they could be wasting valuable crawl budget. If you are wasting your crawl budget on low-value pages with no SEO potential, this could hold your more essential pages back by limiting how often your more important content is crawled.
If you have orphan pages that you intentionally want to leave out of Google’s index, make sure to use the appropriate ‘nofollow’, ‘disallow’, and ‘noindex’ directives.
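As one way to check a ‘disallow’ rule, the sketch below uses Python’s standard-library `urllib.robotparser` to test whether a crawler may fetch a URL. The robots.txt content, the `/internal/` path, and the `is_crawl_allowed` helper are hypothetical examples.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the /internal/ path is an assumption.
ROBOTS_TXT = """\
User-agent: *
Disallow: /internal/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def is_crawl_allowed(url: str, user_agent: str = "Googlebot") -> bool:
    """Return True if the parsed robots.txt rules allow this agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


print(is_crawl_allowed("https://example.com/internal/report"))  # False
print(is_crawl_allowed("https://example.com/blog/"))            # True
```

Keep in mind that a page blocked with ‘disallow’ is not crawled, so a ‘noindex’ tag placed on that same page would not be seen by the crawler.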
How to find orphan pages on your website
Before you determine whether you’ll want to fix an orphan page with internal linking — or whether you should remove or ‘noindex’ it — it’s helpful to get a full picture by finding all of the orphaned pages on your website.
So, how can you find orphan pages on your website?
There are several methods available — below, we’ve outlined how you can easily find orphan pages on your site using Deepcrawl. You can also use supplementary tools such as Google Analytics to see whether or not any traffic is currently landing on your orphan pages.
How to quickly find orphan pages with Deepcrawl
In Deepcrawl’s Analytics Hub, you can easily view all orphan pages on your site by navigating to the “Overview” report within the “Source Gap” menu section.
Here, you can view all orphan pages by various criteria, including:
- Orphaned Pages in Analytics
- Orphaned Pages in Search Console
- Orphaned Pages with Backlinks (that is, orphan pages that have no internal links to their URLs within your own website, but might be linked to from an external source)
- Orphaned Sitemaps Pages
- Orphaned Log Summary Pages
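The general idea behind this kind of source comparison can be sketched in a few lines of Python: take the URLs listed in a sitemap and subtract the URLs a link-following crawl reached. This is an illustration of the concept only, not how Deepcrawl implements its reports; the sitemap content and function names are assumptions.

```python
import xml.etree.ElementTree as ET
from typing import Set

# Namespace used by the standard sitemap protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(sitemap_xml: str) -> Set[str]:
    """Extract every <loc> URL from a sitemap document."""
    root = ET.fromstring(sitemap_xml)
    return {
        loc.text.strip()
        for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)
        if loc.text
    }


def orphaned_sitemap_urls(sitemap_xml: str, crawled: Set[str]) -> Set[str]:
    """URLs listed in the sitemap that the link-following crawl never reached."""
    return sitemap_urls(sitemap_xml) - crawled


# Hypothetical sitemap and crawl results.
SITEMAP = """\
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/old-guide</loc></url>
</urlset>
"""
crawled_urls = {"https://example.com/"}

print(orphaned_sitemap_urls(SITEMAP, crawled_urls))
# {'https://example.com/old-guide'}
```

The same set-difference approach would apply to URLs exported from analytics, Search Console, backlink data, or log files.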
What should you do with orphan pages?
No indexable pages on your site should be orphaned. (If it’s a page you don’t want to appear in search results, it should be noindexed.)
So, when you find an orphan page on your site, you need to determine which of three situations applies:
- Has this page been unintentionally orphaned? (For example, it was missed in a site migration or it’s a new page you forgot to internally link to.)
- Is this page something you don’t want to show in search engine results pages? (For example, a landing page for a specific audience that doesn’t need to be shown in search results.)
- Is this page very similar to, or a duplicate of, another page on your site?
Identifying which of these categories the orphan page falls into will help you to take the best course of action. Assess each orphan page and determine whether it is a URL that should be un-orphaned (by adding internal links to it), noindexed, deleted (if no longer needed), or merged with other content on your site.
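The decision process can be summarized as a small Python sketch. The `OrphanPage` fields and the `recommend_action` function are hypothetical names chosen for illustration, and in practice the answers come from a human review of each page.

```python
from dataclasses import dataclass


@dataclass
class OrphanPage:
    url: str
    is_duplicate: bool    # very similar to, or a copy of, another page
    still_valuable: bool  # still useful to your audience
    should_rank: bool     # you want it to appear in search results
    has_backlinks: bool   # other websites link to it


def recommend_action(page: OrphanPage) -> str:
    """Map the review answers onto one of the actions discussed in this article."""
    if page.is_duplicate:
        return "merge into the similar page and 301 redirect"
    if not page.still_valuable:
        # Keep the value of external links by redirecting instead of deleting.
        return "301 redirect" if page.has_backlinks else "delete"
    if page.should_rank:
        return "add internal links"
    return "noindex"


ppc_landing = OrphanPage(
    "/landing/ppc-offer",
    is_duplicate=False,
    still_valuable=True,
    should_rank=False,
    has_backlinks=False,
)
print(recommend_action(ppc_landing))  # noindex
```

The order of the checks is a design choice for this sketch: duplicates are handled first because merging makes the other questions moot for that URL.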
Unintentionally orphaned pages that you want to appear in the SERPs
Often, orphan pages appear on your site unintentionally. It’s easy for someone to create a new page and simply forget to add internal links to it. But if you have an orphan page on your site that is unintentionally left out from your site’s internal linking structure, the first thing you need to do is determine whether or not it has value for your site users.
If an orphan page is still valuable to your audience, you should add internal links from appropriate pages to ensure it is integrated into your site’s larger structure. This will encourage search engines to crawl, index, and rank it, and will make it possible for users to find this page in the SERPs for their queries.
If the page is no longer valuable to you, it may be best to delete the page altogether. If it has backlinks from other websites, however, redirecting that URL to another relevant page may be the best option.
Pages that you don’t want to display in the SERPs
If an orphaned page deliberately has no internal links and is something you don’t want to appear in search engine results pages, such as a landing page for PPC ads, the appropriate action is to add a noindex meta tag to this page.
You can add a noindex tag to the <head> section of any page that you don’t want to appear in search results with this line of code:
<meta name="robots" content="noindex"/>
However, many sites these days use SEO plugins that allow you to simply check a box and it will add the noindex tag for you.
Noindexing keeps the orphan page out of Google’s index. Googlebot still has to crawl the page to see the tag, but pages that stay noindexed are typically crawled less often over time, so they use up less of your crawl budget. By noindexing orphan pages that are not priority content pieces for your organic search strategy, you are helping Googlebot focus its efforts on crawling and indexing your more important pages instead.
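If you want to confirm that a page actually carries the noindex tag, a small check like the following Python sketch can help. It uses the standard-library `html.parser`; the `RobotsMetaParser` class and `has_noindex` function are hypothetical helper names, and the sketch only looks at the robots meta tag (not HTTP headers).

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Records whether a robots meta tag containing 'noindex' was seen."""

    def __init__(self) -> None:
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").lower() == "robots":
            # The content attribute may hold several comma-separated directives.
            directives = {d.strip().lower() for d in attr.get("content", "").split(",")}
            if "noindex" in directives:
                self.noindex = True


def has_noindex(html: str) -> bool:
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.noindex


page = '<html><head><meta name="robots" content="noindex"/></head><body></body></html>'
print(has_noindex(page))  # True
```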
Similar pages and duplicate content
If an orphan page is very similar to, or a duplicate version of, another page, the best course of action is to merge those pages together.
You can do this by taking any unique and valuable information from the orphan page and consolidating it into the other similar page. Then simply add a 301 redirect from the orphan page to the newly consolidated content.
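How the redirect itself is configured depends on your server or CMS. As a platform-neutral illustration, the Python sketch below models a redirect map and the response a request would receive; the paths and the `respond` function are hypothetical.

```python
from typing import Dict, Tuple

# Hypothetical map: merged orphan URL -> page that now holds the consolidated content.
REDIRECTS: Dict[str, str] = {
    "/old-orphan-guide": "/complete-guide",
}


def respond(path: str, redirects: Dict[str, str]) -> Tuple[int, str]:
    """Return (status code, location) for a requested path."""
    target = redirects.get(path)
    if target is not None:
        return 301, target  # permanent redirect to the consolidated page
    return 200, path        # no redirect: serve the page as usual


print(respond("/old-orphan-guide", REDIRECTS))  # (301, '/complete-guide')
print(respond("/complete-guide", REDIRECTS))    # (200, '/complete-guide')
```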
How to avoid orphan pages in the first place
It can be time intensive to find and fix all of the orphan pages on your site. The ideal scenario, of course, is to prevent them from occurring in the first place. Here are some tips for avoiding the creation of orphan pages in the future.
Take care when migrating your website
Quite often, orphan pages are the result of a website migration. They may be rogue pages that should have been 301 redirected during the migration but were missed during the process. This is why it’s a great idea to have a site migration plan and checklist in place so that you can avoid common migration issues.
Use automation where possible to avoid missing internal links
Internal linking is incredibly important for SEO and users alike, and avoiding orphan pages is only one reason why. Of course, when adding new pages, it can be easy to miss internal linking opportunities or forget to add internal links from other areas on your site.
However, most content management systems (CMS) like WordPress have automated settings that help ensure your team is linking new pages to other areas of your site. For example, with WordPress, it is easy to make sure every new blog post you add automatically appears, at the very least, on your primary blog page as well as within your archives and category pages.
Remove old pages properly
It’s a good idea to regularly review your site content for pages that are no longer necessary. If you have an old page that is no longer relevant (such as a discontinued product), you should remove both the page and any internal links pointing to it across the site. You should then return the appropriate HTTP status code: a 410 (Gone) if the page has been removed for good, or a 301 redirect if you are sending visitors from that outdated page to another relevant page.
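A minimal Python sketch of this cleanup, assuming you already have a map of internal links from a crawl, might look like the following. The function names and example URLs are hypothetical.

```python
from typing import Dict, Optional, Set


def links_to_removed_pages(
    links: Dict[str, Set[str]], removed: Set[str]
) -> Dict[str, Set[str]]:
    """Find live pages that still link to URLs you have removed."""
    stale: Dict[str, Set[str]] = {}
    for source, targets in links.items():
        if source in removed:
            continue  # the removed page's own links disappear with it
        hits = targets & removed
        if hits:
            stale[source] = hits
    return stale


def status_for_removed_page(replacement: Optional[str]) -> int:
    """301 if there is a relevant page to redirect to, otherwise 410 (Gone)."""
    return 301 if replacement else 410


# Hypothetical crawl data: a discontinued product is still linked from a category page.
internal_links = {
    "/shop/": {"/shop/widget-v1", "/shop/widget-v2"},
    "/shop/widget-v1": {"/shop/"},
}
removed_urls = {"/shop/widget-v1"}

print(links_to_removed_pages(internal_links, removed_urls))
# {'/shop/': {'/shop/widget-v1'}}
print(status_for_removed_page("/shop/widget-v2"))  # 301
print(status_for_removed_page(None))               # 410
```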
Audit your site regularly
Of course, it isn’t always possible to avoid every single orphan page. There will likely be a few that slip through the cracks over time, especially on large enterprise sites that add new content regularly.
Running regular website audits and using an SEO monitoring platform will help you quickly detect and rectify any accidental orphan pages that crop up in the future.
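One simple way to use regular audits is to compare the orphan pages found in consecutive runs. The Python sketch below assumes each audit produces a set of orphan URLs; the `compare_audits` function and the example data are hypothetical.

```python
from typing import Set, Tuple


def compare_audits(previous: Set[str], current: Set[str]) -> Tuple[Set[str], Set[str]]:
    """Return (newly orphaned URLs, URLs that are no longer orphaned)."""
    new_orphans = current - previous
    resolved = previous - current
    return new_orphans, resolved


# Hypothetical results from two audits.
last_month = {"/landing/old-campaign", "/blog/forgotten-post"}
this_month = {"/landing/old-campaign", "/products/new-item"}

new_orphans, resolved = compare_audits(last_month, this_month)
print(new_orphans)  # {'/products/new-item'}
print(resolved)     # {'/blog/forgotten-post'}
```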
Final thoughts on orphan pages
In general, orphaned pages do not perform well on the SERPs. When you have good internal linking structures in place across your domain, this communicates the relevancy of those pages to Google and helps search engines discover, index, and rank new content on your site. Whether or not you want the content on orphan pages to rank in the search engines, having numerous orphan pages isolated away from the rest of your website can waste your crawl budget and reduce the SEO prospects of your specifically optimized pages. It’s worth finding the orphan pages on your site so you can identify whether you should improve their discoverability through internal linking, merge that content with other pages, remove them entirely, or noindex those pages if you don’t want them to appear on search engines.
Orphan pages and internal linking are related to your overall site architecture. Want to learn more about how your site architecture impacts SEO and how to ensure your site’s structure is helping your content rank well in search engines? Check out our Ultimate Guide to Site Architecture, or peruse our SEO Office Hours notes on the subject.