Google’s John Mueller said in a webmaster hangout on YouTube yesterday, at the 16:56 mark in the video, that tagging your internal links with tracking parameters, such as UTM parameters, can confuse Google. This is a common practice, but it can cause indexing confusion if you use too many parameters. The issue is that Google needs to crawl all the various URLs, which can be many. It then needs to figure out whether the rel canonical signal is more important than the signals you are sending Google through your internal links. Mueller explains it below.
Glenn Gabe has summarized a lot of this in two tweets:
More from @johnmu about utm params: Our systems try to understand the different urls… so send Google clear signals. Rel canonical is a strong signal, but so are internal links. You could also be causing more crawling by using those parameters: https://t.co/vCz9gjpFTQ pic.twitter.com/y3XydkGB09
— Glenn Gabe (@glenngabe) February 19, 2019
Here is the transcript:
Question: A question regarding parameter URLs with UTM links. So UTM links are essentially tagged URLs that you might use for analytics tracking or just general tracking. And the question is will these links dilute the link value if they’re heavily linked internally. We’re getting pages indexed with parameters where the canonical is pointing to the preferred version. How will it affect in the long run, if we’re linked within the website with 80% parameters and 20% clean URLs?
Answer: I guess that’s always a bit of a tricky situation because you’re giving us mixed signals, essentially. On the one hand, you’re saying these are the links I want to have indexed, because that’s how you link internally within your website. On the other hand, those pages, when we open them, have a rel canonical pointing to a different URL. So you’re saying index this one, and from that one you’re saying, well, actually index a different one.
So what our systems end up doing there is they try to weigh the different types of URLs that we find for this content. We can probably recognize that these URLs lead to the same content. So we can kind of put them in the same group, and then it’s a matter of picking which one to actually use for indexing. And on the one hand we have the internal links pointing to the UTM versions. On the other hand we have the rel canonical pointing to kind of the cleaner version. The cleaner version is probably also a shorter URL and a nicer looking URL, and that kind of plays in line with us as well. But it’s still not guaranteed, from our point of view, that we would always use the shorter URL. So rel canonical is obviously a strong sign, and internal linking is also kind of a stronger signal, in that that’s something that’s under your control. So if you explicitly link to those URLs, we think maybe you want them indexed like that.
So in practice what would probably happen here is we would index a mix of URLs. Some of them we would index with the shorter version, because maybe we find other signals pointing at the shorter version as well. Some of them we would probably index with the UTM version, and we would try to rank them normally as the UTM version.
In practice, in search you wouldn’t see any difference in ranking; you would just see that these URLs might be shown in the search results. So they would rank exactly the same with UTM or without UTM, and they would just be listed individually in the search results. And from a practical point of view, that just means that in Search Console you might see a mix of these URLs: in the performance report you might see kind of this mix, in the indexing report you might see a mix, and in some of the other reports, maybe around AMP or structured data if you use anything like that, you might also see this mix. You might also see, in some cases, a situation where it swaps between the URLs. So it might be that we index it with UTM parameters at one point, and then a couple of weeks later we switch to the cleaner version and say, well, probably this cleaner version is better. And then at some point later on our algorithms look at it again and say, well, actually more signals point to the UTM version, we’ll switch back. That could theoretically happen as well.
So what I would recommend doing there is, if you have a preference with regards to your URLs, make sure that you’re being as clear as possible within your website about what version you want to have indexed. With UTM parameters you’re also creating the situation that we’d have to crawl both of those versions. So it’s a little bit more overhead. If it’s just one extra version, that’s probably not such a big deal. If you have multiple UTM parameters that you’re using across the website, then we would try to crawl all of these different variations, which would in turn mean that maybe we crawl, I don’t know, a couple times as many URLs as your website actually has to be able to keep up with indexing. So that’s probably something you’d want to avoid.
So my recommendation would be really to try to clean that up as much as possible, so that we can stick to the clean URLs, to the URLs that you want to have indexed, instead of ending up in this state where maybe we’ll pick them up like this, maybe we’ll pick them up like this, and in your reporting it could be like this, it could be like this, and you have to watch out for that all the time. So keep it as simple as possible.
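If you want to act on that cleanup advice, one place to start is auditing your own internal links. The following is only a minimal sketch in Python, not anything Mueller described and not how Google groups URLs. It assumes your tracking parameters all start with `utm_`, and the function names `strip_utm` and `group_variants` and the example.com URLs are made up for illustration.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_utm(url: str) -> str:
    """Return the URL with every utm_* query parameter removed.

    Assumption: tracking parameters are the ones whose name starts with
    "utm_". Other query parameters and the fragment are kept.
    """
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.lower().startswith("utm_")
    ]
    # urlencode re-encodes the remaining parameters, so escaping may be
    # normalized (for example, a space is written as "+").
    return urlunsplit(parts._replace(query=urlencode(kept)))


def group_variants(links):
    """Map each clean URL to the set of distinct linked URLs that reduce to it."""
    groups = defaultdict(set)
    for link in links:
        groups[strip_utm(link)].add(link)
    return dict(groups)


if __name__ == "__main__":
    # Hypothetical internal links collected from a site crawl.
    internal_links = [
        "https://example.com/shoes",
        "https://example.com/shoes?utm_source=nav",
        "https://example.com/shoes?utm_source=footer&utm_medium=internal",
    ]
    for clean_url, variants in group_variants(internal_links).items():
        print(clean_url, "has", len(variants), "linked variants")
```

Any clean URL with more than one variant is a page where your internal links and your rel canonical may be pointing in different directions, and where Google may have extra URLs to crawl. Linking to the clean version internally removes that mixed signal.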
Here is the video embed:
Forum discussion at Twitter.