So for this post I don’t have an impressive case study to reference, or a boat-load of mined data that I’ve processed to reveal something extraordinary.
Put simply, this blog post materialised at 4pm on a Tuesday after I’d consumed at least four too many bourbons.
I had a theory, I put the question to John Mueller on Twitter, it got some traction, he responded, and here we are!
What makes this blog post, well, blog-worthy is that I don’t think it’s been discussed before. Or, more likely, I’ve just never come across it myself. Why? I’m unsure. Maybe it’s an edge case, but I think it’s certainly worth highlighting and, personally, it’s something I’ll be remembering going forward.
Come on then, what is it?
Believe it or not, Search Console does NOT report link data on URLs tagged as ‘noindex’. Now I’m not sure if it was just me, but that was quite the surprise when I first read it in a Tweet.
Let me just add some context as to how my theory came to be:
One of the websites I take care of has a huge (I really do mean huge) number of internal search URLs accessible to search engines. Approx. 100k of these search URLs were indexed, yet those 100k were just a minuscule subset of the total number of search URLs that existed. There were over 16m of them (and counting): dynamic, poorly optimised search URLs which, in virtually all cases, make for terrible landing pages.
Why so many URLs?
There is an effectively infinite number of possible search queries, and each search results page contains faceted navigation. Each facet is a different URL, and with over 1500 different facet groups available and even more facet values… Hopefully the sheer scale of the number of URLs in existence is now becoming apparent.
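To put a rough number on that, here’s a back-of-the-envelope sketch. It assumes (and this is purely my simplification, not how the site actually works) that each of the 1500 facets is a simple on/off filter and that any combination of them produces its own URL. The function name `count_facet_urls` and the depth limit are made up for illustration.

```python
from math import comb


def count_facet_urls(num_facets, max_applied):
    """Count distinct URLs for ONE search query, assuming every facet is an
    independent on/off filter and at most `max_applied` are active at once.

    This is a simplification: real facets have multiple values and some
    combinations will never be linked to, so treat it as an order-of-magnitude
    estimate only.
    """
    # Sum the ways of choosing 0, 1, 2, ... max_applied facets out of num_facets.
    return sum(comb(num_facets, k) for k in range(max_applied + 1))


if __name__ == "__main__":
    # 1500 is the facet count mentioned in the post; the depths are hypothetical.
    for depth in (1, 2, 3):
        print(depth, count_facet_urls(1500, depth))
```

Under those assumptions, a single search query with up to two facets applied already yields 1,125,751 possible URLs, so 16m+ across every query isn’t hard to reach.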
Virtually all of these URLs receive no traffic
Of the 100,000 search URLs that Google decided to index, only a handful receive any organic traffic. Fewer than 50 of them.
The colossal task of removing these URLs from search engines.
Canonical tags? Completed it mate. No, but seriously – we added canonical tags to around 1000 search URLs. Google ignored practically all of them. You see, the problem is that these search URLs often aren’t duplicates of another page (besides possibly another search page). So when we canonicalise a search URL to a non-search destination, the target is only going to be similar at best – not a duplicate.
Tip! If you’d like to know more about how Google can often choose to ignore canonical tags and other ranking signals, I highly recommend Rachel Costello’s awesome presentation that covers this and more.
Check out Rachel’s slides on SlideShare.
There were only 2 options left open to us:
- 301 redirect the search URLs – We can’t do that as searching for a query could redirect the user to an unexpected destination
- Noindex? – Noindex it is! All search URLs were immediately updated with the noindex meta tag
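If you want to spot-check that the tag has actually landed on a page, something like the sketch below would do it. It only uses Python’s standard library, and it only looks at the robots meta tag in the HTML, not the `X-Robots-Tag` HTTP header, so it’s a partial check at best. The class and function names are my own, not from any SEO tool.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects directives from <meta name="robots"> / <meta name="googlebot">."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        # Attribute values can be None (e.g. <meta name>), so default to "".
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").strip().lower() in ("robots", "googlebot"):
            for directive in attr.get("content", "").split(","):
                self.directives.add(directive.strip().lower())


def has_noindex(html):
    """Return True if the HTML carries a noindex (or 'none') robots meta tag."""
    parser = RobotsMetaParser()
    parser.feed(html)
    # 'none' is shorthand for 'noindex, nofollow'.
    return bool({"noindex", "none"} & parser.directives)


if __name__ == "__main__":
    sample = '<html><head><meta name="robots" content="noindex, follow"></head></html>'
    print(has_noindex(sample))
```

Feed it the HTML of a handful of search URLs and you’ll quickly see whether the template change went live everywhere you expected.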
Now, it’s here where things began to heat up in Search Console…
Specifically in two reports: ‘Internal Links’ and ‘Links to Your Site’. It appeared that data for noindex URLs was not being included.
Now, I know for a fact (I’d bet my last £1) that these search URLs are linked internally within the website and that a bunch of them have backlinks pointing to them too. So why zero data?
I reached out to John Mueller on Twitter just to be sure (please excuse the grammar):
We list nofollow links. Since the links are between canonicals, it'll be unlikely that a noindex URL is canonical. Nothing really changed there. 🙂
— John ☆.o(≧▽≦)o.☆ (@JohnMu) August 22, 2018
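As you can see from John’s response, nofollow links? No problem, they’re listed. However, reading between the lines (and although this isn’t new), link data for noindexed URLs isn’t reported within Search Console: the reports cover links between canonical URLs, and a noindex URL is unlikely to be picked as a canonical.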
So what if Search Console doesn’t show data for noindex URLs?
Ok, so big deal right? Well, not quite. It depends…
It depends on how many URLs are marked as noindex on your website. For the website in question, there are lots, and for us that’s a big chunk of internal and external link data we’re blind to. In a case such as this, I can only lean on backlink data from SEO tools such as SEMrush or Ahrefs.
As great as these tools are, they don’t always capture 100% of backlink data. Search Console can often report backlinks that other tools just aren’t aware of.
To get a better picture of internal links, I’ll have to complete a full crawl of the website with a tool such as Screaming Frog or DeepCrawl.
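Once you have a crawl, rebuilding the missing ‘Internal Links’ numbers for your noindexed URLs is mostly a counting exercise. Here’s a minimal sketch, assuming you’ve exported the crawl’s links as (source URL, target URL) pairs. The export format, the function name and the example URLs are all hypothetical, so adapt them to whatever your crawler actually produces.

```python
from collections import Counter


def count_inlinks(edges, noindex_urls):
    """Count internal links pointing at each noindexed URL.

    edges: iterable of (source_url, target_url) pairs from a site crawl.
    noindex_urls: set of URLs known to carry a noindex tag.
    Self-links are ignored; URLs with no inlinks are reported as 0.
    """
    counts = Counter(
        target
        for source, target in edges
        if target in noindex_urls and source != target
    )
    return {url: counts.get(url, 0) for url in noindex_urls}


if __name__ == "__main__":
    # Made-up example data, not from the site discussed in this post.
    edges = [
        ("https://example.com/", "https://example.com/search?q=shoes"),
        ("https://example.com/blog", "https://example.com/search?q=shoes"),
        ("https://example.com/", "https://example.com/about"),
    ]
    noindex = {"https://example.com/search?q=shoes", "https://example.com/search?q=hats"}
    print(count_inlinks(edges, noindex))
```

It won’t tell you what Google has seen, only what your crawler found, but it fills the gap that Search Console leaves.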
So no, it’s not the end of the world to not have any data in Search Console for noindexed URLs, but the more noindex URLs your site has, the more data you’re potentially missing out on. Even if your website has none, it’s definitely something worth knowing and remembering for the future.
Thanks for reading.