Add link rel="canonical" response header? #309
Comments
Seems logical, will investigate!
This has been implemented with commit 522e8f2, which has just been rolled out to production. Thanks for reporting this!
Nice. Just a question about the implementation:
The purpose of a canonical is to explicitly and unambiguously indicate a preferred URL, so I think it's better to include the final URL rather than the initial URL in the When an upstream server changes the redirection to another image, search engines may index the wrong URL if we set the initial URL as canonical (due to caching), whereas the current approach would still point to the final URL fetched at that time. Note that the blog post you linked is a bit misleading, as Google doesn't honor this response header for images, see for example:
OK, that makes sense. I just looked at how Photon does it, but this is probably better. This way any URL that redirects to HTTPS also gets the HTTPS URL as canonical. It seems the only information about Google is from 5 years ago, so who knows. Thanks!
@kleisauke But I wanted to come back on the redirect issue: Say you get an image from Instagram like this: This will redirect to a URL on cdninstagram.com with a time-based signature that expires after a couple of days. So if you still have the image in the cache, the canonical link will point to a non-existent location! In this case the request goes like this: So I think in this case the 3rd URL should be the canonical one. So I would suggest when determining the canonical URL to only follow permanent redirects ( Thanks!
I think the root cause of this is caching. Without that, the canonical link will always indicate the correct preferred URL. While determining the canonical link on permanent redirects might solve the issue for Instagram, it is more or less a workaround; if Instagram switches to permanent redirects for these URLs, then it won't work anymore. You're always free to use our source code to host your own solution without any caching. Images.weserv.nl is for caching and manipulating images, not for bypassing time-based expiration URLs or circumventing hotlinking protection.
I disagree that this is about caching. Instagram gives a temporary redirect because it's a temporary URL. If Instagram switches to permanent redirects, then that would mean they consider that final URL to be permanent. But this isn't about Instagram but about all canonical URLs; I was just using this as an example. Don't you agree that a canonical URL is always a permanent one? (That being said, in the case that the final URL has a canonical Not sure why you mention self-hosting since this is only about search engines.
According to RFC 6596 section 3 ("The Canonical Link Relation"), the target (canonical) URL may be the source of a temporary redirect: The same section emphasizes that the target (canonical) URL should not be the source of a permanent redirect. So I think you're right. I just fixed this with commit 34b08b1, which has just been rolled out to production. Thanks for reporting this!
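For illustration, the rule agreed on above (follow permanent redirects, stop at the first URL that answers with anything else) could be sketched as follows. This is not the code from commit 34b08b1; the function name, the `(url, status)` chain representation, and the example URLs are all hypothetical.

```python
# Status codes treated as permanent redirects (301 Moved Permanently,
# 308 Permanent Redirect). Everything else, including 302/303/307 and
# the final 200, ends the search.
PERMANENT_REDIRECTS = {301, 308}


def pick_canonical_url(chain):
    """Return the canonical URL for a recorded redirect chain.

    `chain` is a list of (url, status_code) tuples in request order
    (a hypothetical representation chosen for this sketch).
    """
    for url, status in chain:
        # The first URL that is NOT the source of a permanent redirect
        # is the one a search engine should be pointed at.
        if status not in PERMANENT_REDIRECTS:
            return url
    raise ValueError("chain is empty or ends in a permanent redirect")


# Hypothetical chain: HTTP -> HTTPS (permanent), then a temporary
# redirect to a signed CDN URL that expires after a while.
example_chain = [
    ("http://example.com/p/abc/media", 301),
    ("https://example.com/p/abc/media", 302),
    ("https://cdn.example.com/abc.jpg?sig=123", 200),
]
canonical = pick_canonical_url(example_chain)
```

With this chain the HTTPS URL on example.com is chosen: the HTTP to HTTPS upgrade is followed because it is permanent, while the expiring CDN URL behind the temporary redirect is never used as canonical.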
But how can this be disabled? Announcing the original location is very unwanted. In our case it also causes unwanted caching behavior and could lead to unwanted indexing by search engines. I can't find where to disable it in the documentation.
There are no plans to disable the rel="canonical" HTTP header or to make it configurable on the public API. Most search engine crawlers will respect If you need to "mask" the original image location, then our public service is probably the wrong solution. You're always free to host your own solution. Note that the
We do not use the public service, but our own Docker container. We want to hide the source location of images for obvious reasons, and only noticed by accident that a canonical header was being added. For example, if you use the blur option to hide images but also announce the source location, that defeats the point of blurring them.
To determine whether the rel="canonical" response header should be set to proxied images.
Ah, I thought you were referring to our public service. However, I don't think this is a security issue, since the original image source would also be available in the Anyways, I made this configurable with the
I came across the following article about WordPress Photon:
https://michaelkummer.com/tech/jetpack-photon-seo/
Apparently they use a Link response header with rel="canonical" so that search engines will index the original image rather than the proxied image. Might be a good idea for you to have this as well?
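As a rough sketch of what such a header looks like on the wire: the Link header wraps the target URL in angle brackets and adds the relation as a parameter. The helper name and the example URL below are made up for illustration and are not taken from Photon or from this project.

```python
def canonical_link_header(original_url):
    """Build a (name, value) pair for a rel="canonical" Link header.

    `original_url` is assumed to be an already-encoded absolute URL
    of the source image.
    """
    return ("Link", f'<{original_url}>; rel="canonical"')


# Hypothetical source image behind the proxy.
name, value = canonical_link_header("https://example.com/image.jpg")
header_line = f"{name}: {value}"
```

The resulting `header_line` is what a proxy would send alongside the transformed image so that crawlers can attribute it to the original URL.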