Facets / Faceted Navigation

Faceted navigation is often found on eCommerce sites and contains filtered content which allows users to easily find what they are looking for. While facets can help to improve user experience, it is important to understand how they can also impact SEO. We cover how Google Search handles faceted navigation on websites in our SEO Office Hours recaps below (along with best practice guidance and insights from Google).

For more, see our guide to SEO best practices for faceted search.

If URLs that are blocked by robots.txt are getting indexed by Google, it may point to insufficient content on the site’s accessible pages

February 21, 2022 Source

Why might an eCommerce site’s faceted or filtered URLs that are blocked by robots.txt (and have a canonical in place) still get indexed by Google? Would adding a noindex tag help? John replied that the noindex tag would not help in this situation, as the robots.txt block means it would not be seen by Google.

He pointed out that URLs might get indexed without content in this situation (as Google cannot crawl them with the block in robots.txt), but they would be unlikely to show up for users in the SERPs, so should not cause issues. He went on to mention that, if you do see these blocked URLs being returned for practical queries, then it can be a sign that the rest of your website is hard for Google to understand. It could mean that the visible content on your website is not sufficient for Google to understand that the normal (and accessible) pages are relevant for those queries. So he would first recommend looking into whether or not searchers are actually finding those URLs that are blocked by robots.txt. If not, then it should be fine. Otherwise, you may need to look at other parts of the website to understand why Google might be struggling to understand it.
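The reason a noindex tag has no effect here can be sketched with Python's standard `urllib.robotparser`: if a URL is disallowed, the crawler never fetches the page, so it never reads any tag on it. The robots.txt rules, the `/shop/filter/` path and the `can_see_noindex` helper are illustrative assumptions, and Python's parser only approximates how Google matches rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an example shop; the path is an assumption.
ROBOTS_TXT = """\
User-agent: *
Disallow: /shop/filter/
"""


def can_see_noindex(url, robots_txt=ROBOTS_TXT, agent="Googlebot"):
    """Return True if the crawler may fetch the URL and so could read a noindex tag on it."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for url in (
        "https://example.com/shop/shoes",
        "https://example.com/shop/filter/colour-red",
    ):
        print(url, "-> noindex readable:", can_see_noindex(url))
```

A URL for which this returns False can still be indexed without content, but a noindex tag placed on it would never be seen.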

Google Only Needs to Crawl Facet Pages That Include Otherwise Unlinked Products

April 16, 2019 Source

For eCommerce sites, if Google can access and crawl all of your products through the main category page then it won’t need to crawl any of the facets. However, facets should be made crawlable if they contain products that aren’t linked to from anywhere else on the site.
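As a rough way to check this on your own site, the sketch below compares the product links found on category pages with those found on each facet page. It assumes you already have those link sets from a crawl export; the function name and the example URLs are hypothetical, and it ignores products linked from other parts of the site.

```python
from typing import Dict, Set


def facets_needing_crawl(
    category_links: Set[str], facet_links: Dict[str, Set[str]]
) -> Dict[str, Set[str]]:
    """Map each facet URL to the products it links that no category page links."""
    orphans = {}
    for facet_url, products in facet_links.items():
        missing = products - category_links  # products reachable only via this facet
        if missing:
            orphans[facet_url] = missing
    return orphans


if __name__ == "__main__":
    category_links = {"/p/1", "/p/2"}
    facet_links = {
        "/shoes?colour=red": {"/p/1", "/p/3"},
        "/shoes?size=9": {"/p/2"},
    }
    print(facets_needing_crawl(category_links, facet_links))
```

Facets that appear in the result are the ones worth keeping crawlable; an empty result suggests the category pages already expose every product.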

Ensure All Product Pages Can be Crawled With Considered Use of Noindex

April 13, 2018 Source

eCommerce sites with facets should be careful about which pages they noindex, because noindexing the wrong ones (for example, all category pages) can make it difficult for Googlebot to reach individual product pages. Webmasters might instead noindex specific facets, or noindex everything after a certain number of pages in a paginated set.
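The pagination idea could be sketched as a small template helper that only emits a noindex tag beyond a chosen page. The function name and the cut-off of 5 are assumptions for illustration, not values recommended by Google.

```python
def robots_meta_for_page(page_number, max_indexed_page=5):
    """Return a noindex meta tag for pages past the cut-off, else an empty string."""
    if page_number > max_indexed_page:
        return '<meta name="robots" content="noindex">'
    return ""


if __name__ == "__main__":
    for page in (1, 5, 6):
        print(page, repr(robots_meta_for_page(page)))
```

Whatever cut-off you pick, it is worth confirming that every product is still linked from at least one page that remains indexable.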

Googlebot Can Recognise Faceted Navigation & Slow Down Crawling

April 3, 2018 Source

Googlebot understands URL structures well and can recognise faceted navigation. It slows down crawling once it works out where the primary content is and where it has strayed from it. Parameter handling settings in Google Search Console (GSC) help with this.

Canonicalization For Filter Results Pages Isn’t Recommended

January 9, 2018 Source

Canonicalization shouldn’t be used for filter pages, for two reasons: Google can ignore canonical tags, and filter pages aren’t always duplicates of one another because they show different sets of results.

Canonicalise Faceted Pages to Non-filtered Version

May 30, 2017 Source

Google recommends allowing faceted pages to be crawled and canonicalising them to the non-filtered version of the page, rather than blocking them with robots.txt.
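One way to derive the non-filtered URL is to strip the filter parameters from the query string, as in the sketch below. The parameter names in `FILTER_PARAMS` and both helper functions are hypothetical; a real site would use its own list of facet parameters.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical facet parameters; replace with the ones your site uses.
FILTER_PARAMS = {"colour", "size", "sort"}


def canonical_url(url):
    """Return the URL with filter parameters removed and other parameters kept."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in FILTER_PARAMS
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))


def canonical_link_tag(url):
    """Build the canonical link element pointing at the non-filtered URL."""
    return '<link rel="canonical" href="{}">'.format(escape(canonical_url(url), quote=True))


if __name__ == "__main__":
    print(canonical_link_tag("https://example.com/shoes?colour=red&size=9"))
```

Parameters outside the list, such as a page number, are left in place, so only the facet selection is folded into the base page.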

Indexable Product Variations Should Reflect Search Behaviour

May 5, 2017 Source

Variations of pages which people are searching for should be made indexable, otherwise the variations should be folded together.

Prevent Excessive Crawling on Filters, Sort Orders and Pagination with Nofollow

September 23, 2016 Source

Add nofollow to filtered, sorted and paginated results pages to prevent excessive crawling.

Use Noindex or Canonical on Faceted URLs Instead of Disallow

September 23, 2016 Source

John recommends against using a robots.txt disallow to stop facet URLs from being crawled, as they may still be indexed. Instead, allow them to be crawled and use a noindex or canonical tag, unless crawling them is causing a server performance issue.
