Sitemap sort order priorities updated #5724

Merged
humitos merged 8 commits into readthedocs:master from sitemap-sort-order
Jun 17, 2019

Conversation

saadmk11
Member

closes #5447

@saadmk11 saadmk11 requested a review from a team May 23, 2019 13:08
Member

@humitos humitos left a comment

Looks good, but I think the ordering should change only in the sitemap_xml view, so I'm requesting that change.

- The ``LATEST`` version shall always beat other versions in comparison.
- ``STABLE`` should be listed second. If we cannot figure out the version
+ The ``STABLE`` version shall always beat other versions in comparison.
+ ``LATEST`` should be listed second. If we cannot figure out the version
Member

I don't think we want to make this change here (in this function) since it will affect more places than just the sitemap_xml view.

@saadmk11
Member Author

@humitos I have pushed some changes that make the ordering easy to change, but with them the resulting order would be:
latest with priority=0.9, changefreq=daily
stable with priority=1, changefreq=weekly
others with priority=[0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1......], changefreq=monthly

Is that okay, or should stable be the first one in the list?
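
To make the positional assignment above concrete, here is a minimal sketch; it is not the project's actual code. The `priorities` and `changefreqs` lists are taken from the diff in this PR. The `yield from` line in `priorities_generator`, the `assign` helper, and the latest-first input order are assumptions added for illustration.

```python
import itertools


def priorities_generator():
    # List as proposed at this point in the PR; the trailing repeat(0.1)
    # is an assumption based on the docstring ("keep returning 0.1").
    priorities = [0.9, 1, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
    yield from itertools.chain(priorities, itertools.repeat(0.1))


def changefreqs_generator():
    changefreqs = ['daily', 'weekly']
    yield from itertools.chain(changefreqs, itertools.repeat('monthly'))


def assign(slugs):
    # Hypothetical helper: pair each version slug with the next value
    # from both generators, purely by position in the list.
    return [
        {'slug': slug, 'priority': priority, 'changefreq': changefreq}
        for slug, priority, changefreq in zip(
            slugs, priorities_generator(), changefreqs_generator()
        )
    ]


# Assumed input order: latest first, stable second, then other versions.
entries = assign(['latest', 'stable', '2.0', '1.0'])
```

Because both generators are consumed in lockstep, the values a version receives depend only on its position in the sorted list.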

@saadmk11 saadmk11 requested a review from humitos May 25, 2019 11:40
@humitos
Member

humitos commented May 28, 2019

@saadmk11 I'm not really sure if that matters. Does the sitemap specification say anything about what the correct order is here?

@saadmk11
Member Author

@humitos The position of a URL in the Sitemap doesn't really matter.
reference (sitemaps.org)

@humitos
Member

humitos commented May 28, 2019

@saadmk11 OK, so I think it's better to sort it by priority.
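
As an illustration of what "sort it by priority" could look like, here is a small sketch under stated assumptions. The entry dictionaries and their `slug` key are hypothetical; the thread does not show how the view stores its entries.

```python
# Hypothetical sitemap entries, in the order produced earlier in this thread.
entries = [
    {'slug': 'latest', 'priority': 0.9, 'changefreq': 'daily'},
    {'slug': 'stable', 'priority': 1, 'changefreq': 'weekly'},
    {'slug': '2.0', 'priority': 0.8, 'changefreq': 'monthly'},
]

# Highest priority first; sorted() is stable, so ties keep their input order.
by_priority = sorted(entries, key=lambda entry: entry['priority'], reverse=True)
order = [entry['slug'] for entry in by_priority]
```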

@saadmk11
Member Author

@humitos Updated the PR. Please have a look.

Member

@humitos humitos left a comment

This looks great! Sorry, but two more small changes and we are done :)

@@ -351,7 +351,7 @@ def priorities_generator():
It generates values from 1 to 0.1 by decreasing in 0.1 on each
iteration. After 0.1 is reached, it will keep returning 0.1.
"""
- priorities = [1, 0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
+ priorities = [0.9, 1, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
Member

We can leave this list sorted as it was, to avoid confusion.

Member Author

If we change this back, the following assignment

latest with priority=0.9, changefreq=daily
stable with priority=1, changefreq=weekly

won't work, and we'll have to take another approach. I'll try one, but this version seems to be less complex :)

Member

Why not change the order in changefreqs_generator so that it returns weekly first?

Member Author

Changing the order in changefreqs_generator won't set the priority of stable to 1 and latest to 0.9.
priorities = [0.9, 1, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
This is the easiest way to set the priority correctly and also get the changefreq right.
For example:

[
  {
    'loc': 'http://public.readthedocs.io/en/stable/',
    'priority': 1,
    'changefreq': 'weekly',
    'languages': [
       ..........
    ]
  },
  {
    'loc': 'http://public.readthedocs.io/en/latest/',
    'priority': 0.9,
    'changefreq': 'daily',
    'languages': [
       ...........
    ]
  }
]

Otherwise we have to look for another solution.
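
The difference between the two options can be checked with a short sketch. This is an illustration only: the `pair` helper is hypothetical, and the latest-first order of `versions` is an assumption about what the version-aware sort returns.

```python
# Assumed order coming out of the version-aware sort: latest, then stable.
versions = ['latest', 'stable']


def pair(priorities, changefreqs):
    # Hypothetical helper: assign values to versions purely by position.
    return {
        version: (priority, changefreq)
        for version, priority, changefreq in zip(versions, priorities, changefreqs)
    }


# Option 1: keep priorities sorted and only reorder the changefreqs.
reordered_changefreqs = pair([1, 0.9], ['weekly', 'daily'])

# Option 2: reorder the priorities and leave the changefreqs untouched.
reordered_priorities = pair([0.9, 1], ['daily', 'weekly'])
```

Under these assumptions, option 1 still gives latest the top priority, while option 2 yields stable at 1/weekly and latest at 0.9/daily.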

Member Author

Or we have to define a function like this one inside the sitemap view, which will sort the versions while keeping stable in the first position:
https://github.com/rtfd/readthedocs.org/blob/900c57532a3199ffdb8975b732eade1be87a6e5c/readthedocs/projects/version_handling.py#L42-L65

Even though we call it (comparable_version) from here, we would have to replace that call with the function we create:
https://github.com/rtfd/readthedocs.org/blob/900c57532a3199ffdb8975b732eade1be87a6e5c/readthedocs/core/views/serve.py#L385

Let me know if there is a better approach than this one or the previous one.

Member

I was saying this because returning [0.9, 1...] hides some logic in the woods. What we want in the end is to assign 1 to stable and 0.9 to latest.

To keep things simple, why not just swap these two versions in the result returned by sort_version_aware? Like,

sorted_versions = sort_version_aware(...)
# TODO: add comment explaining why we do this here
sorted_versions[0], sorted_versions[1] = sorted_versions[1], sorted_versions[0]

It's not good either, but at least it does not hide logic and it shows the problem explicitly.

I'm not sold on any of the "solutions" that I'm proposing. Go with the one that you feel is better, more legible and maintainable. Whatever you choose, please add a comment in the right place explaining what's happening and how the sorting works.

Member Author

Actually, I thought about swapping, but it felt hacky, and I wasn't able to find anything simpler. That's why I was asking whether you know of a better way.
I think we shouldn’t go for an overly complex way to do this small thing. Swapping is simple enough. I think I'll go with your idea, with a good comment :)

@@ -379,6 +379,14 @@ def changefreqs_generator():
changefreqs = ['daily', 'weekly']
yield from itertools.chain(changefreqs, itertools.repeat('monthly'))

def sort_by_priority(version_list):
Member

nit: this argument could be called just versions.

@saadmk11 saadmk11 requested a review from humitos June 11, 2019 18:34
@saadmk11
Member Author

@humitos Updated the PR :)

@saadmk11 saadmk11 closed this Jun 11, 2019
@saadmk11 saadmk11 reopened this Jun 11, 2019
Member

@humitos humitos left a comment

Good. It has the amount of hackiness we can support 😂

Thanks for taking care of this.

@saadmk11
Member Author

@humitos I have added a check so that we don't get an index out of range error.
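
The merged implementation is not shown in this thread, so the following is only a sketch of a guarded swap, assuming the input list arrives with latest first and stable second. The function name `sort_by_priority` comes from the diff above; the body, the `versions` argument name (per the review nit), and the length check are assumptions.

```python
def sort_by_priority(versions):
    """Return a copy of ``versions`` with the first two items swapped.

    Assumes the input is already sorted with latest first and stable
    second, so the swap puts stable in the top-priority position.
    """
    versions = list(versions)  # copy so the caller's list is not mutated
    # Guard against lists with fewer than two items, which would
    # otherwise raise IndexError on the swap below.
    if len(versions) >= 2:
        versions[0], versions[1] = versions[1], versions[0]
    return versions


swapped = sort_by_priority(['latest', 'stable', '2.0'])
single = sort_by_priority(['latest'])
empty = sort_by_priority([])
```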

@saadmk11
Member Author

@humitos It's starting to get conflicts. Can it be merged?

@humitos humitos merged commit ac6bccf into readthedocs:master Jun 17, 2019
@saadmk11 saadmk11 deleted the sitemap-sort-order branch June 17, 2019 09:20
Successfully merging this pull request may close these issues.

Sitemap sort order may prioritise unstable/development documentation
2 participants