SEO: Online Shop wrong category linking to products #12532

Closed
fczaja opened this issue Jun 22, 2016 · 12 comments

@fczaja

fczaja commented Jun 22, 2016

Impacted versions

  • 9.0

Goal

All navigation links must always point to the original URL of a product. This concerns the category links in the online shop.

Reason

If there are multiple URLs for a product, we have duplicate content.
Search engines may then index the same product page under several URLs, and they rate duplicate content very poorly.

Current behavior Odoo 9

The category links in the Odoo online shop point to the wrong product URLs. As a result, new pages are generated, which then appear in the index of search engines.

Steps to reproduce:

  1. Go to Odoo 9 site
  2. Click in the top navigation on "Shop"
  3. Activate the left Navigation > Customize > Product Categories
  4. Click in the left menu to > Computer and then on the product
  5. Compare the product URL with the original product URL.

Correct: shop/product/imac-11
Wrong: shop/product/imac-11?category=12

Expected behavior

Product links in the online shop's navigation (categories) must always point to the original URL.
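
A minimal sketch of the expected behavior, assuming the original URL is simply the product path without its query string. The helper name `original_product_url` is hypothetical and is not part of Odoo; it only relies on Python's standard `urllib.parse`.

```python
from urllib.parse import urlsplit, urlunsplit


def original_product_url(url):
    """Return the URL without query string and fragment (hypothetical helper)."""
    parts = urlsplit(url)
    # Keep scheme, host and path; drop the query string and the fragment.
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


print(original_product_url("/shop/product/imac-11?category=12"))
```

With such a helper, every category link to a product would resolve to the same `shop/product/imac-11` address.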

This information is available

#12527

@pedrobaeza
Collaborator

You can use website_canonical_url from OCA: https://github.com/OCA/website/tree/9.0/website_canonical_url for this.

@JKE-be is there any improvement about this for v12? I consider this one important. We can make the PR to master if you want starting from the OCA module.

@Yenthe666
Collaborator

@JKE-be any update on this one? :)

@JKE-be
Contributor

JKE-be commented Aug 12, 2019

We plan some updates for v13, but I just took a look at the linked module, and we will do less than that.

Imo, the canonical url of /blog?page=2 is not /blog ... (only for page1)
Not all of the query string should be ignored

but yes /product/1?xxx=eee ==> should have /product/1 as canonical url
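
To illustrate the distinction, here is a rough sketch in which each route lists the query parameters that change its main content, and everything else is dropped. The function name `canonical_path` and the `keep` argument are made up for this example; nothing here is an existing Odoo API.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit


def canonical_path(url, keep=()):
    """Drop every query parameter except those listed in `keep` (hypothetical)."""
    parts = urlsplit(url)
    kept = [(key, value) for key, value in parse_qsl(parts.query) if key in keep]
    # Sort the kept parameters so the same set always yields the same URL.
    query = urlencode(sorted(kept))
    return parts.path + ("?" + query if query else "")


print(canonical_path("/product/1?xxx=eee"))
print(canonical_path("/blog?page=2&utm=x", keep=("page",)))
```

The first call yields `/product/1`, while the second keeps `page` because the route declared it as significant.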

@pedrobaeza
Collaborator

cc @yajo

@yajo
Contributor

yajo commented Aug 13, 2019

Imo, the canonical url of /blog?page=2 is not /blog ... (only for page1)

At the time of writing the module, I found no Odoo controller paginating with ?page=X. All of them used /base-controller/page/X. They used the query string to add search parameters, tag filters, etc., all of which are to be removed from the canonical URL.

The case you say is no exception, as the blog controller is paginated with /blog/page/2 and /blog/our-blog-1/page/2, so AFAICS the module is still valid.

@JKE-be
Contributor

JKE-be commented Aug 13, 2019

Hi @yajo

Not wrong...
It seems Odoo is well done ;) and uses the QueryURL class helper to avoid this bug.
Most of the reported issues are probably due to customizations or wrongly implemented themes that use ?page in the URL.

But if we take event as an example,
https://www.odoo.com/event will be the canonical URL of all the following URLs, which all show distinct content:
https://www.odoo.com/event?type=2
https://www.odoo.com/event?type=3
https://www.odoo.com/event?type=...
https://www.odoo.com/event?country=20&type=4
...

While they should probably be indexed imo, no?

@yajo
Contributor

yajo commented Aug 16, 2019

/event is a good exception indeed.

In this case, the categories and countries are not canonical IMHO. They just repeat information that can be found in the same controller without that filter, and would lead to duplicate content errors.

The only parameter that actually makes you see events that are impossible to see without it is ?date=old.

Possible solutions:

  1. Make that filter part of the controller. So, when you click in past events, you actually get to /event/old?type=1&country=33. Just that little fix would make website_canonical_url work out of the box.

  2. Right now, /event?date=all doesn't list all events, but just future ones. Another solution would be to make it actually list all. We could add a ?date=next for that purpose, and let ?date=all list all events. You can ORDER BY date > now() DESC, date ASC those results, so next events appear at the beginning as usual.

  3. Add a "canonical_url" parameter to all pages, where you can customize it in case there's a special case like this one. We'd still require properly separated ?date=all and ?date=next parameters, but the default one would be next (as now) and you wouldn't need to do the "order by" hack. Then, in the template just add <t t-set="canonical_url" t-value="/event?date=all"/> and you're done.
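
The ordering from option 2 can be illustrated in plain Python. This is only a sketch with made-up event data, not Odoo code: upcoming events sort first, and each group is ordered by ascending date, mirroring `ORDER BY date > now() DESC, date ASC`.

```python
from datetime import datetime


def order_events(events, now):
    """Upcoming events first, then past ones; each group by ascending date."""
    # `date > now` is True for upcoming events; negate it so they sort first.
    return sorted(events, key=lambda e: (not (e["date"] > now), e["date"]))


now = datetime(2019, 8, 16)
events = [
    {"name": "past-a", "date": datetime(2019, 1, 10)},
    {"name": "next-b", "date": datetime(2019, 12, 1)},
    {"name": "next-a", "date": datetime(2019, 9, 5)},
    {"name": "past-b", "date": datetime(2019, 6, 20)},
]
print([e["name"] for e in order_events(events, now)])
```

Here the two upcoming events come out ahead of the two past ones, each pair sorted by date.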

@JKE-be
Contributor

JKE-be commented Aug 16, 2019

not really sure

imo,

if we take /shop (with pagination [/shop/page/1] and sorting by name)...
the canonical URL of /shop?order=name desc is not /shop (except if the number of products is lower than the number of products per page, which is a rare exception)

From what I understand here: https://tools.ietf.org/html/rfc6596#section-3
It will be neither a duplicate nor a subset... (or only a partial subset).

But yes, maybe we could use a heuristic as in your module, and then allow each controller to override it. But this needs to be triple-checked: avoiding the loss of referenced pages vs. penalties for duplicate content.

@yajo
Contributor

yajo commented Aug 16, 2019

Hmm I think that you're not understanding what the canonical URL does (or maybe it's me 😆). AFAIK, the canonical URL indicates what should be the "main" URL indexed by search engines in case an engine lands on the present page.

So yes, there's the possibility that if you search for "rocky" in a movies shop, some items found in page 2 are not actually found in page 2 of the same controller without the search term, but the canonical instruction would be telling the search engine:

  1. Do not index /shop/page/2?search=rocky.
  2. Go to /shop/page/2 instead
  3. Index that page.
  4. Continue from there.

Of course, there are chances that the search engine doesn't find any "rocky" titles in that page, but there's one thing it will find: pagination links.

The search engine then will crawl all those links and index all movies, including the rocky ones.

Then somebody goes to the search engine in question and searches for "rocky movie". That person might get a result in /shop/page/554, but will never get a result in /shop/page/2?search=rocky because that one wasn't even indexed. And that's what we want.

After all, the best is that the person lands directly from the search engine to the product page itself, but if he ever lands in the products list, then it should be directly in the list itself. The user can then click on any product or use the filters by himself.

Keep in mind that search engines give your site a time & pages budget, depending on its update rate and the interest people show in it (so it's not under your control). If you spend all that budget indexing the same products list over and over under different filters, the bot might run out of budget and skip the product page itself, which would actually be the most interesting one.
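
To make the budget argument concrete, here is a toy sketch (not how any real search engine works) that groups crawled URLs by canonical URL, assuming the canonical is the path without the query string. The URL list and the function name `group_by_canonical` are invented for the example.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def group_by_canonical(urls):
    """Map each canonical path to the crawled URLs that collapse into it."""
    groups = defaultdict(list)
    for url in urls:
        # Assumption: the canonical URL is the path, with the query string dropped.
        groups[urlsplit(url).path].append(url)
    return dict(groups)


crawled = [
    "/shop/page/2",
    "/shop/page/2?search=rocky",
    "/shop/page/2?order=name+desc",
    "/shop/product/rocky-1",
]
for canonical, urls in group_by_canonical(crawled).items():
    print(canonical, len(urls))
```

Four fetches produce only two distinct index entries; three of them were spent on the same product list.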

That said, it might be interesting to index product categories; since they are part of the URL itself in Odoo, it would work out of the box. Yes, there could be duplicated content, but in this case it's valuable to have it, so the canonical URL is different.

@JKE-be
Contributor

JKE-be commented Aug 16, 2019

I think that you're not understanding what the canonical URL

Not really. I think I'm just not sure about everything we can read ;) and I'm looking for the best explanation together with other people, instead of applying my opinion as if it were the only correct one ;)

Yes, for the search, I'm sure, I think like you ... We should use canonical.

But for example /shop?order=best_seller and /shop?order=rating: I'm not sure that we want these 3 pages to have /shop as canonical, like you suggest ;)
I would prefer to index the 3 versions, since the content will not be duplicated in most cases, and the three results are interesting for distinct purposes...

Which you don't really agree with.
Yes, I read you, and I understand that all products from these 2 pages will also be in the first one ;)

So I'm still looking for other sources to help me make the best choice ;)

Nothing against you, or against me ;) I just want to avoid making one more bad decision...

@JKE-be
Contributor

JKE-be commented Aug 16, 2019

If you spend all that budget indexing the same products list over and over under different filters, the bot might run out of budget and skip the product page itself

don't worry, it doesn't start from scratch every time, so after a few days you will have all your content ;) even the 2 (potentially useless) pages

seb-odoo added a commit to odoo-dev/odoo that referenced this issue Sep 3, 2019
The canonical tag is important for SEO, indeed it prevents search engines from
indexing duplicate content.

Reasoning
=========

The choice has been made to create the canonical tag automatically depending on
the request path, ignoring the query string, and manually prefixing the
appropriate domain and language code.

Indeed creating it manually for each resource would create a lot of code and
potential mistakes.

It is more dangerous to do it the generic way, but after investigation it
appears that it is an acceptable trade-off since the vast majority of our routes
are well built and already ready for this:

- using query string only for minor features that do not change the main content
- having the models, the ids, the pager and other important features in the path

Override
========

It is still possible to override the default behavior by passing
`canonical_params` manually to the view or to the different methods.

This is done for `/event` because the only way to display Past Events is to add
`date=old`.

Languages
=========

Fix an issue where it was possible for a bot to be on the URL without language
code but to use a language that is not the default language.

Adapt hreflang, because it:
- must only be present on canonical pages
- must always lead to canonical pages
- should not be set if there is no alternate language

Misc
====

task-1958075
closes odoo#12532

Inspired by OCA module `website_canonical_url` courtesy of Jairo Llopis.

Co-authored-by: Jairo Llopis <jairo.llopis@tecnativa.com>
Co-authored-by: Sébastien Theys <seb@odoo.com>
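
A rough illustration of the approach described in this commit message, not the actual Odoo implementation: the canonical URL is built from the domain, the language code and the request path, and the query string is ignored unless parameters are passed explicitly. `build_canonical` and its arguments are hypothetical names; only `canonical_params` is taken from the commit message, and its real signature may differ. The rule that the default language has no URL prefix is also an assumption.

```python
from urllib.parse import urlencode


def build_canonical(domain, path, lang=None, default_lang="en", canonical_params=None):
    """Compose a canonical URL from domain, language code and path (sketch)."""
    # Assumption: pages in the default language carry no language prefix.
    prefix = "" if lang in (None, default_lang) else "/" + lang
    url = domain.rstrip("/") + prefix + path
    # The query string is ignored unless the caller opts in explicitly.
    if canonical_params:
        url += "?" + urlencode(canonical_params)
    return url


print(build_canonical("https://www.example.com", "/event"))
print(build_canonical("https://www.example.com", "/event", lang="fr",
                      canonical_params={"date": "old"}))
```

The second call mirrors the `/event` override: `date=old` is kept because it is the only way to reach past events.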
robodoo pushed a commit that referenced this issue Sep 3, 2019
closes #35852

Signed-off-by: Jérémy Kersten (jke) <jke@openerp.com>


Co-authored-by: Jairo Llopis <jairo.llopis@tecnativa.com>
Co-authored-by: Sébastien Theys <seb@odoo.com>
@mart-e
Contributor

mart-e commented Nov 26, 2019

Closing as fixed at #35852
