Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(v2): fix bad theme pluralization rules for some labels #4304

Merged
merged 14 commits into from
Mar 3, 2021

Conversation

slorber
Copy link
Collaborator

@slorber slorber commented Feb 26, 2021

Motivation

As discussed previously in #4295
We need to handle pluralization of some labels.
Some languages like Russian have complex rules.

Most Docusaurus sites will be in English, or will use languages with simpler pluralization rules, so it wouldn't be fair to add unnecessary JS code that wouldn't be used by 95% of the Docusaurus sites


We have only 3 Docusaurus labels for which we need to handle pluralization.

We don't want to introduce too much complexity just for that, so we really are looking for a good enough solution here.

https://v2.docusaurus.io/blog/tags/recap :

image

https://v2.docusaurus.io/search?q=test

image

The rest of the theme is quite "static", and users are unlikely to have too much dynamic values either, unless they implement their own plugins or fetch remote backend data that is dynamic. In those cases the user should decide if he ants

This solution is the smallest one I could think of to solve the problem and get the theme labels translated properly by default.

The tradeoff is that older browsers without Intl.PluralRules support will fallback to English rules, so a Russian user on IE11 or older Edge will only get an approximate result. In my opinion it is not a big deal.


Test

https://deploy-preview-4304--docusaurus-2.netlify.app/classic/blog/tags/recap
https://deploy-preview-4304--docusaurus-2.netlify.app/classic/fr/blog/tags/recap
https://deploy-preview-4304--docusaurus-2.netlify.app/classic/ru/blog/tags/recap

Will have to test better some older browsers to see the potential impact

@slorber slorber added the pr: bug fix This PR fixes a bug in a past release. label Feb 26, 2021
@slorber slorber requested a review from lex111 as a code owner February 26, 2021 18:50
@facebook-github-bot facebook-github-bot added the CLA Signed Signed Facebook CLA label Feb 26, 2021
@netlify
Copy link

netlify bot commented Feb 26, 2021

[V1] Deploy preview success

Built with commit d313d8c

https://deploy-preview-4304--docusaurus-1.netlify.app

@github-actions
Copy link

github-actions bot commented Feb 26, 2021

Size Change: 0 B

Total Size: 532 kB

ℹ️ View Unchanged
Filename Size Change
website/build/assets/css/styles.********.css 87.1 kB 0 B
website/build/assets/js/main.********.js 359 kB 0 B
website/build/blog/2017/12/14/introducing-docusaurus/index.html 60.3 kB 0 B
website/build/docs/introduction/index.html 235 B 0 B
website/build/index.html 25.4 kB 0 B

compressed-size-action

@netlify
Copy link

netlify bot commented Feb 26, 2021

Deploy preview for docusaurus-2 ready!

Built with commit d313d8c

https://deploy-preview-4304--docusaurus-2.netlify.app


// Not ideal but good enough!
// See doc of usePluralFormSector for reason!
const BlogPostPlurals = (count) => ({
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@slorber actually a interesting and at the same time elegant native-based solution, thanks! Is it possible to "combine" all these plurals forms (options) into one message via (for example) some separator or extending of placeholder with certain keyword? Do you think it is worth it, or will it increase code complexity and its final size?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @lex111

I just upgraded the code with something that you may prefer, let me know if that looks fine to you

You can provide the Russian message with syntax similar to your Vue I18n example doc, like:

  "theme.blog.post.plurals": "Few posts|Many posts|One post|{count} posts",

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@slorber wow! This is very cool, thanks, I checked locally, and it works fine! 👍 Only I am confused, why Intl for Ru locale defined 4 plurals forms, but not three? And If so, can I skip the last part ({count} записи|{count} записей|{count} запись instead of {count} записи|{count} записей|{count} запись|{count} записей)?

By the way, how do we handle zero values? It looks like at this time when passing zero values, interpolation does not work, do we need to define fallback for this case?

image

  count = 0;

  return selectMessage(
    count,
    translate(
      {
        id: 'theme.blog.post.plurals',
        description: 'Pluralized label for one blog post',
        message: 'One post|{count} posts',
      },
      {count},
    ),
  );

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm curious why you decided to go with a custom syntax instead of ICU? Bc ICU handles cases like 0 natively e.g {count, plural, =0 {no post} one {one post} other {# posts}}

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @lex111

Won't be able to answer your questions regarding the Russian language plural forms but it seems there are 4 forms: one few many other.

https://www.unicode.org/cldr/cldr-aux/charts/34/supplemental/language_plural_rules.html

image

(will probably have to set a constant order, not sure if the order of that array is specified?)

By the way, how do we handle zero values?

It looks like Russian handles 0 values as the "many" plural category, do you see a problem using the many plural for zeros? Like showing a label such as "no post" but in Russian?

(we don't have the case for blog posts as count >= 1, but we may need this for search results)

It looks like at this time when passing zero values, interpolation does not work

I will take a look 🤔

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@longlho thanks for joining this review :)

Would appreciate your opinion on the choices I made.

Just to explain better the context:

  • most Docusaurus sites are not translated
  • most Docusaurus sites use a language with simple plural forms like English
  • most Docusaurus sites don't have very complex custom pages, and if they do, the data is quite static and does not have any dynamic plural to handle
  • we only have 2 labels in the classic theme to get translated
  • we want a very lightweight solution

hmm curious why you decided to go with a custom syntax instead of ICU?

Because as far as I understand it, using ICU means supporting a larger set of i18n features like genders, number formatting etc, which most Docusaurus sites don't need, and require an ICU parser, reading CLDR data by using a parser too, and the associated Intl polyfills.

The current implementation gets the job done with less than 100 lines of code, and I think it's good enough for our use-case as we basically just want to pluralize just 2 labels:

  • "{n} blog posts"
  • "{n} search results"

Using react-intl would add a decent amount of JS to the hello-world Docusaurus site that wouldn't benefit from it much. I considered using intl-messageformat but thought it was overkill for our need and a simpler solution could work.

Note: using the syntax based on | is not something we'll expose as a public API. It really is used to just translate 2 theme labels, not n user-provided labels. We won't tell the user that this even exist.

Instead we'll ask the user to add react-intl to Docusauurs himself if he needs complex plurals in his homepage. But Home pages are generally fully static so it's quite unlikely this will be often needed.

Also, if too many users ask for pluralization for their custom pages, we can still add this later (we can move from a subset of ICU to full ICU, using react-intl FormattedMessage as an implementation detail, but I doubt many users will ask for this, and prefer to see if there is any traction at all before increasing the base size of a Docusaurus site (we could also add a config so that full ICU support is opt-in).

For me, the conditions for which a user might really benefit from full ICU is:

  • They have custom pages
  • They display dynamic labels on those pages (ie fetching data from a backend or a CMS? not the typical Docusaurus site)
  • They handle multiple languages

Let me know if that makes sense to you.

If you tell me there's a way to support full ICU in less than 1kb, I'm open to any suggestion, but I think the min gz is more around 10kb (not including the polyfills, which will become less of a problem with polyfill services and browser support increasing)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok that makes sense. In which case I don't think you should concat | but rather separate them out like gettext style basically (so have 1 blog posts & {n} blog posts as 2 separate strings).

Copy link
Collaborator Author

@slorber slorber Mar 2, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@longlho this is what I initially had, but this is annoying to maintain and less convenient:

const BlogPostPlurals = (count) => ({
  one: translate(
    {
      id: 'theme.blog.post.plurals.one',
      description: 'Pluralized label for one blog post',
      message: 'One post',
    },
    {count},
  ),
  two: translate(
    {
      id: 'theme.blog.post.plurals.two',
      description: 'Pluralized label for two blog posts',
      message: '{{count}} posts',
    },
    {count},
  ),
  few: translate(
    {
      id: 'theme.blog.post.plurals.few',
      description: 'Pluralized label for few blog posts',
      message: '{{count}} posts',
    },
    {count},
  ),
  many: translate(
    {
      id: 'theme.blog.post.plurals.many',
      description: 'Pluralized label for many blog posts',
      message: '{{count}} posts',
    },
    {count},
  ),
  other: translate(
    {
      id: 'theme.blog.post.plurals.other',
      description: 'Pluralized label for other blog posts',
      message: '{{count}} posts',
    },
    {count},
  ),
});

// Very simple pluralization: probably good enough for now
function useBlogPostPlural(count: number): string {
  const pluralForm = usePluralForm(count);
  return BlogPostPlurals(count)[pluralForm];
}

The reason is because I'd like to have all the theme translations to be statically analysable (ie no key like key.${pluralForm}.

The goal is to see if we can later add a babel plugin that removes the translation runtime and the messages.json bundle. IE the final bundle would not contain EN in code + FR in messages.json, but just FR in code, and we won't have to put all the messages in a single chunk or figure out a way to do i18n messages code splitting. This is not a high-priority optimization but it would be nice if we have the right constraints now to see if it's worth to work on this later.

This is not ideal but for 2 or 3 labels and a private API that do not seem to me like a big deal :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I mean you'd still declare only one & other, translates to multiple cases like ru and runtime knows to swap that in. Key can still be static but effectively you'd deal w/ an array of strings instead of a single string. That's what gettext basically does.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will keep current system for now and see if we need to change it later. as long as it's not public that should not be too hard to change if this becomes a problem

@github-actions
Copy link

github-actions bot commented Feb 27, 2021

⚡️ Lighthouse report for the changes in this PR:

Category Score
🟠 Performance 67
🟢 Accessibility 96
🟢 Best practices 100
🟢 SEO 100
🟢 PWA 95

Lighthouse ran on https://deploy-preview-4304--docusaurus-2.netlify.app/classic/

@lex111 lex111 added this to the v2.0.0-alpha.71 milestone Mar 1, 2021
@slorber slorber merged commit 364d4db into master Mar 3, 2021
@slorber slorber changed the title fix(v2): Attempt to fix bad theme pluralization rules for some labels fix(v2): fix bad theme pluralization rules for some labels Mar 3, 2021
slorber pushed a commit that referenced this pull request Mar 5, 2021
@slorber slorber deleted the slorber/i18n-plurals branch August 17, 2021 17:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed Signed Facebook CLA pr: bug fix This PR fixes a bug in a past release.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants