Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better localisation support #686

Closed
costas80 opened this issue Mar 30, 2023 · 6 comments
Closed

Better localisation support #686

costas80 opened this issue Mar 30, 2023 · 6 comments

Comments

@costas80
Copy link
Contributor

The approach currently used by this library to localise error messages is by setting Locale.setDefault(aLocale). For example, as documented here, one is expected to do as follows:

Locale.setDefault(Locale.GERMAN);
JsonSchemaFactory jsonFactoryInstance = JsonSchemaFactory.getInstance(SpecVersion.VersionFlag.V202012);
JsonSchemaFactory schemaFactory = JsonSchemaFactory.builder(jsonFactoryInstance).build();

The problem with this approach is that this sets the default Locale for the entire JVM, which in any multi-threaded scenario like a web application is not what you would want. Moreover, checking the code of I18nSupport I see that the resource bundle is loaded in a static initialiser and stored in a static final variable:

public class I18nSupport {

    private static final ResourceBundle bundle;

    static {
        ResourceBundle tmpBundle = null;
        try {
            tmpBundle = ResourceBundle.getBundle(BASE_NAME);
        } catch (MissingResourceException mre) {
           ...
        }
        bundle = tmpBundle;
    }

}

In following this approach, there can be no locale switch after the class is initially loaded.

A better approach would be to introduce a "context" of sorts for a given validation run that allows you to set it with the desired Locale. Using this you could then do a Locale-aware lookup of the resource bundle to use and return the localised message.

Note that this is not a blocking issue for use in a multilingual context given that you can do something like this:

Locale aLocale = ... // Determine the locale to use.
// Load the jsv-messages resource bundle (or another one with the same keys).
ResourceBundle translationBundle = ResourceBundle.getBundle("jsv-messages", aLocale);
JsonSchema schema = ... // Load schema somehow.
JsonNode content = ... // Load content to validate somehow.

List<String> localisedMessages = schema.validate(content).stream().map((message) -> {
   String text = message.getMessage();
   if (this.translationBundle.containsKey(message.getType())) {
      var localisedTemplate = this.translationBundle.getString(message.getType());
      var arguments = message.getArguments();
      var params = new String[(arguments == null ? 0 : arguments.length) + 1];
      params[0] = message.getPath();
      if (arguments != null) {
         System.arraycopy(arguments, 0, params, 1, params.length - 1);
      }
      text = MessageFormat.format(localisedTemplate, (Object[]) params);
   }
   return text;
}

Although this workaround exists it would be a better design to foresee this kind of multilingual support in the library itself. Note also that if this workaround is used any custom messages would be forcibly ignored (although anyway custom messages don't support localisation as they are currently defined).

@costas80
Copy link
Contributor Author

Diving into the code a bit more it seems that the main issue is how to pass certain preferences specific to a given validation run (the Locale in this case) and then access them from the BaseJsonValidator class where messages are produced. I see that the JsonSchema class includes a ValidatorState instance set in a ThreadLocal, which seems like a good place to record and access a Locale. Given this, one would need to adapt:

  • Class I18nSupport to no longer use a static initialiser and bundle (in truth the class could be dropped altogether).
  • The ValidatorTypeCode to no longer record a fixed MessageFormat instance.

The question now becomes how to pass the expected Locale to the validation from the library's public API. One could imagine overloaded methods on JsonSchema that accept a new ValidationConfig class that would allow defining a Locale. I thought of this intermediate ValidationConfig class being inspired from the SchemaValidatorsConfig class, and assuming that there could be additional preferences apart from the Locale supported in the future.

@stevehu
Copy link
Contributor

stevehu commented Apr 5, 2023

Assuming that all validators share the same locale, we can put it in the ValidationConfig. I agree that we should use something other than static local across the entire JVM. Do we have another use case that needs to switch the locale between validators?

@costas80
Copy link
Contributor Author

costas80 commented Apr 5, 2023

Do we have another use case that needs to switch the locale between validators?

My use case is a web app, localised in several languages, that shares a validation report produced (mostly) by this library. The workaround code I shared in my initial comment is what we currently do to allow the validation to produce locale-specific messages. In fact we even define our own resource bundle to tailor all messages to what we want plus add missing translations (messages are translated in all EU languages).

Also, to be clear, I am not referring to changing the locale between validators (e.g. an EnumValidator or a FormatValidator) but for different validation runs (one user wants the report in English, and another one in French).

Regarding the ValidationConfig approach I proposed, I think that validation-specific configuration should actually be implemented in increments. The existing SchemaValidatorsConfig is currently used to hold all configuration and this arguably already contains properties that could also vary per validation run. I would rather use the current SchemaValidatorsConfig now to support a locale (I would also add the possibility of a custom resource bundle), and as a separate issue foresee a ValidationConfig that would hold configuration pertinent to a given validation run (to allow caching of schemas while applying different configurations). I see this as a backwards compatible update, given that any properties defined in SchemaValidatorsConfig would act as defaults but for a given validation run if ValidationConfig is defined it would take precedence.

So, to summarise, I would propose the following for the current issue:

  1. Extend SchemaValidatorsConfig to support a Locale and a custom ResourceBundle. This is a much simpler task, following the current config approach, whereby the existing implementation would serve as the default if these config properties are not set.
  2. Create a separate issue to support a ValidationConfig and check which existing properties from SchemaValidatorsConfig would also make sense at the level of a specific validation run.

What do you think? If you agree with the above I can make a PR for point 1 (the current issue) and create a separate issue for point 2.

@stevehu
Copy link
Contributor

stevehu commented Apr 5, 2023

I understand your use case now. It makes sense to have different locales for different validation sessions. I think your approach is very prudent. I am looking forward to your PR. Thanks.

@costas80
Copy link
Contributor Author

costas80 commented May 5, 2023

Hi @stevehu . Just prepared a PR for this issue for you to review and merge (took me some time as I had to jump into other tasks).

stevehu pushed a commit that referenced this issue May 6, 2023
Co-authored-by: simatosc <costas.simatos@sigmacubed.eu>
@costas80
Copy link
Contributor Author

costas80 commented May 8, 2023

Created issue #744 for the discussed per-validation configuration. Inspecting more closely the caching code it seems that using a different configuration per validation was already considered, however the current implementation is prone to race conditions if validations are executed in parallel. I adapted the issue to reflect this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants