Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Non-blocking getSupportedLocales + Use getResourceAsStream to read resource bundle #878

Merged
merged 32 commits into from
Dec 7, 2020

Conversation

jessiejuachon
Copy link
Contributor

@jessiejuachon jessiejuachon commented Nov 17, 2020

For the reviewers:

Issues described in #882:

  1. The java client library tries to walk the file system when getting the list of supported locales. This fails when the file system is in a nested jar. Such is the case in Spring boot applications where the offline bundles directory may exist inside dependency jars which under BOOT-INV/libs. This is related to [ENHANCEMENT and BUG] Non-blocking getSupportedLocales when fetching messages #877

  2. The java client library tries to access the file system when reading the bundle (code is in LocalMessagesOpt.getComponentMessages). This fails when the file system is in a nested jar.

All files changed in this PR are for # 1 above, except for change sin LocalMessagesOpt.java which is for # 2.

Comment on lines 144 to 156
dto.setLocale(LocaleUtility.fmtToMappedLocale(dto.getLocale()).toLanguageTag());

ProductService ps = new ProductService(dto);

//Refresh the cache of supported locales as needed in a separate thread (non-blocking).
ps.refreshSupportedLocalesTask();

//Match against list of supported locales that is already in the cache
Set<Locale> supportedLocales = LocaleUtility.langTagtoLocaleSet(ps.getCachedSupportedLocales());
Locale matchedLocale = LocaleUtility.pickupLocaleFromList(supportedLocales, Locale.forLanguageTag(dto.getLocale()));
if (matchedLocale != null) { // Requested locale matches a supported locale (eg. requested locale "fr_CA matches supported locale "fr")
dto.setLocale(matchedLocale.toLanguageTag());
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the flow should as below:
Get supported locales from product service. Returning null if cache is empty. The product service will refresh cache.
If supported locales isn't empty, then pick up locale from them.

Current implementation is a little messy, hard to read and maintain.

Copy link
Contributor Author

@jessiejuachon jessiejuachon Dec 4, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks pretty much the same to me. Based on the flow you described

  1. ComponentService will get supported locales from product service
    This PR does it when ComponentService.doLocaleMatching calls ps.getCachedSupportedLocales().
  2. The product service returns an empty set if cache of locales is not found.
    ProductService.getCachedSupportedLocales() in # 1 returns an empty set if not found in cache.
  3. The Product Service will refresh the cache if needed
    Sure, ProductService.getCachedSupportedLocales now refreshes the cache in a separate thread instead of doing it separately here.
  4. If supported locales set isn't null/empty, then pick up the locale from them.
    This PR does it in ComponentService.doLocaleMatching after step 2: dto.setLocale(matchedLocale.toLanguageTag());

As 99% of the flow matches (only #3 needs some update, I am not sure what you mean by "little messy, hard to read and maintain".

The only added logic to the flow is ComponentService.proceed(), which is only called when message bundle is not yet in cache or is expired. This is for performance enhancement. ComponentService.proceed calls ps.getCachedSupportedLocales(dataSource) to check if the requested locale (dto.getLocale()) is supported in that specific data source. This is so that if it is not supported in that data source, it won't even try to fetch the message bundle. This is important especially if that invalid fetch is to a remote VIP service which will affect performance.

Comment on lines 76 to 80
private Set<String> getSupportedLocales() {
return getSupportedLocales(true);
}

public Set<String> getCachedSupportedLocales() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The interfaces should be consistent.
Do you plan to create such interface for component module as getCachedMessages or create interface getCacheComponents for component list?
If cache is empty, just populate the cache. if not, then return cache data and refresh cache if it's expired.

Copy link
Contributor Author

@jessiejuachon jessiejuachon Dec 4, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please review the other issue linked to this: #877:

We do not need getCachedXXX for other data. We only need the presence/absence of cached supported locales to be non-blocking.

getCachedSupportedLcoales is an exception because we use it to pre-determine if getComponentMessages must even proceed or not. The locale list is not the requested information. It is used for performance enhancement only: so that if locale is not supported, it won't even try to fetch the bundle, and it will directly serve fallback locale bundle. However, if this performance-related call to getSupportedLocales was not even previously cached, it defeats its purpose. Not having this list in the cache should not make the call to getComponentMessages wait. Why wait for a separate call to getSupportedLocales when it can send this very first call to getComponentMessages directly? Refreshing the locales list can be done simultaneously in a separate thread to benefit succeeding getComponentMessages calls, but it is of no benefit to the first call at this point.

More importantly, if getSupportedLocales previously failed (or keeps on failing like in Atlas because FileSystems.newFilesSystems walk does not work in nested jars for Java 11 or less), it should definitely not block getComponentMessages. Otherwise, localized messages cannot be served at all. This is an unacceptable bug. Hence, we use getCachedSupportedLocales to just search from cache and not have to wait to fetch supported locales from data source every time, which is expected to fail in this scenario.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems you created so much code just to work around an extreme situation, getSupportedLocales keeps fail. In this case, we should remove test supported locales because stability takes precendence over performance. Let users take charge of it. Get the supported locale before getting translation or bear the delay of unsupported locale. In fact this logic has been there before for a long time.

if (cacheItem.isExpired())
refreshLocalesCacheItemTask(cacheItem, dataSource);
if (cacheItem.isExpired()) {
synchronized (cacheItem) { // Allow only 1 thread to refresh the cacheItem at a time.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems a new update process can be started after previous one has started successfully. right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, and that is why inside the synchronized block, another if (cacheItem.isExpired()) is done so that it will not proceed with the update if previous thread has already refreshed the cacheItem.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Anyway, I have removed this part out of this PR to narrow down the scope. I will include it in another PR.

return cacheItem.getCachedData().keySet();
} else {
cacheItem = createLocalesCacheItem(dataSource);
// Allow only 1 thread to create the cacheItem.
synchronized (TranslationCacheManager.getInstance().getCache(VIPCfg.CACHE_L3)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this too coarse-grained, there is only one level3 cache globally. Does this mean that only one cache item can be created at the same time?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. Any suggestions on what object to use as a lock that will be specific to the item that does not exist yet?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have removed this part out of this PR to narrow down the scope. I will include it in another PR.

}

public Set<String> getSupportedLanguageTags() {
private Set<String> getSupportedLocales(boolean refreshCache) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's better to create 2 method, one is getSupportedLocales and another is refreshSupportedLocales to reduct ambiguity.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have renamed this to combineSupportedLocales. The overloading should be clear. Below are the method signatures

  1. getSupportedLocales() - gets all supported locales (from all data sources)
  2. getSupportedLocales(dataSource) - gets supported locales from the specified data source
  3. getCachedSupportedLocales() - gets all supported locales (from all data sources) from cache
  4. getCachedSupportedLocales(dataSource) - gets supported locales from the specified data source from cache
  5. private combineSupportedLocales (refreshCache) - combines the list of supported locales from all dataSources. Used in # 1 and # 3.

I have also updated javadoc for all methods.

* Do not block refreshCacheItem if set of supported locales is not in cache (i.e. supportedLocales.isEmpty()).
* This happens either when cache is not initialized, OR previous attempts to fetch the set had failed.
*/
return (supportedLocales.isEmpty() || supportedLocales.contains(dto.getLocale()) || VIPCfg.getInstance().isPseudo());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why proceed if pseudo is true?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has been there since before this PR. I just moved the if-statement judgements from ComponentService.getMessage to here, so no change here.

return getSupportedLocales(true);
}

public Set<String> getCachedSupportedLocales() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only reason we need this interface is that supported locales isn't cached. Suggest cache it when initialization. Then all the logic will be same as other features.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also agree cache locale list when initialization no matter initializeCache is true or false, since it may reduce the chance of locale list fetching be triggered by message fetching which can improve performance at message fetching.
Further more, we may remove fetch locale list task code at message fetching.

This can be done in a new PR.

Copy link
Contributor Author

@jessiejuachon jessiejuachon Dec 4, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. We should not force to cache this at app start up. This is why we have initializeCache property in the config. You may look at this as "small" data to always preload at start up time, but this is poor coding practice. We always lazy-load when possible. For instance whenever we use the Singleton design pattern, we do no even initialize the instance until it is needed (lazy initialization). Same should apply here.

Besides, why are you against caching the list in a separate thread only when requested instead of at app start up time? It calls the exact same code (ProductService.processSupportedLocales). We just added a logic to call createLocalesCacheItemTask in a separate thread if it needs to be non-blocking. Nothing else was changed in the existing logic.

  1. In some cases, getSupportedLocales from offline resource bundle will always fail (FileSystems.newFileSystem walk fails in nested jar applications like in Atlas for Java 11 or less), so always calling getSupportedLocales that fails will affect performance as it tries to read from the disk (slow).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please check ProductService. The code is now more unified.

huihuiw01
huihuiw01 previously approved these changes Dec 4, 2020
for (String languageTag : languageTags) {
locales.add(Locale.forLanguageTag(languageTag));

public void refreshSupportedLocalesTask() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually when locale list is in cache, you won't populate it, so here 'refreshSupportedLocalesTask' may not appropriate, how about 'getSupportedLocalesTask'?

Copy link
Contributor Author

@jessiejuachon jessiejuachon Dec 4, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have removed this task entirely. See code changes.

Comment on lines 76 to 80
private Set<String> getSupportedLocales() {
return getSupportedLocales(true);
}

public Set<String> getCachedSupportedLocales() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are 3 'getSupportedLocales' methods and 2 'getCachedSupportedLocales' methods, and they call each other, and seems 2 'getCachedSupportedLocales' are not the relation of overload, it's a little difficult to understand the relationship of these methods, it's better optimize the methods' name.

Copy link
Contributor Author

@jessiejuachon jessiejuachon Dec 4, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have renamed 1 method only. The overloading should be clear. Look how the method signatures are self-explanatory.. descriptions seem to be repetitive of what is already obvious.

  1. getSupportedLocales() - gets all supported locales (from all data sources)
  2. getSupportedLocales(dataSource) - gets supported locales from the specified data source
  3. getCachedSupportedLocales() - gets all supported locales (from all data sources) from cache
  4. getCachedSupportedLocales(dataSource) - gets supported locales from the specified data source from cache
  5. private combineSupportedLocales (refreshCache) - combines the list of supported locales from all dataSources. Used in # 1 and # 3.

I have also updated javadoc for all methods.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

according OO thinking, the dataSource should be responsible of its methods instead of product service.

path = Paths.get(uri);
messages = JSONBundleUtil.getMessages(path);
}
InputStream is = getInputStream();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

close the stream?

Iterator<DataSourceEnum> msgSourceQueueIter = VIPCfg.getInstance().getMsgOriginsQueue().iterator();
Set<String> supportedLangTags = new HashSet<>();
while(msgSourceQueueIter.hasNext()) {
supportedLangTags.addAll(getSupportedLanguageTags(msgSourceQueueIter.next()));
supportedLangTags.addAll(getSupportedLanguageTags(withCacheRefresh, msgSourceQueueIter.next()));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As huihui said, the additional footprint isn't much because they are just a very few strings. But will improve performance greatly. Don't need to combine two set anymore.

Comment on lines 76 to 80
private Set<String> getSupportedLocales() {
return getSupportedLocales(true);
}

public Set<String> getCachedSupportedLocales() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems you created so much code just to work around an extreme situation, getSupportedLocales keeps fail. In this case, we should remove test supported locales because stability takes precendence over performance. Let users take charge of it. Get the supported locale before getting translation or bear the delay of unsupported locale. In fact this logic has been there before for a long time.

Comment on lines 76 to 80
private Set<String> getSupportedLocales() {
return getSupportedLocales(true);
}

public Set<String> getCachedSupportedLocales() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

according OO thinking, the dataSource should be responsible of its methods instead of product service.

while (true) {
String filePath = FormatUtils.format(OFFLINE_RESOURCE_PATH, dto.getComponent(), locale);
Path path = Paths.get(VIPCfg.getInstance().getOfflineResourcesBaseUrl(), filePath);
InputStream is = Thread.currentThread().getContextClassLoader().getResourceAsStream(path.toString());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why change FileSystem to getResourceAsStream?

Copy link
Contributor Author

@jessiejuachon jessiejuachon Dec 5, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is one of the primary root causes of the bug. Aside from getSupportedLocales is failing, and even if it succeeds, getComponentMessages is also failing! This is because FileSystems.newFileSystem fails in nested jars (not supported in Java 11 or below). However, getResourceAsStream does not fail! So getComponentMessages actually succeeds in this PR even for nested jars like in Atlas.

@jessiejuachon
Copy link
Contributor Author

jessiejuachon commented Dec 5, 2020

@Xiaochao8

"As huihui said, the additional footprint isn't much because they are just a very few strings. But will improve performance greatly. Don't need to combine two set anymore."

As I said, pre-loading anything is not good practice. Actually, pre-loading the list of supported locales will affect performance of application start up. How does it "improve performance greatly"? What's wrong with getting this only when needed in a separate thread? The added code is merely 5-7 lines of code (createLocalesCacheItemTask).

@jessiejuachon
Copy link
Contributor Author

jessiejuachon commented Dec 5, 2020

@Xiaochao8

It seems you created so much code just to work around an extreme situation, getSupportedLocales keeps fail. In this case, we should remove test supported locales because stability takes precendence over performance. Let users take charge of it. Get the supported locale before getting translation or bear the delay of unsupported locale. In fact this logic has been there before for a long time.

This "so much code" is NOT just a "workaround" for "an extreme situation". This is the proper behavior for any situation: The first step (getSupportedLocales) that is only used for performance purposes should not block the second step (getComponentMessages). This is regardless of whether FileSystems walk fails or not. ANY getSupportedLocales failure (offline or online mode) should not block getComponentMessages. For instance, getSupportedLocales from remote may fail too (e.g. temporary network issue), but getComponentMessages should still succeed if network issue is gone.

This is what this PR is about. It proves that even if getSupportedLocales step fails, messages can still be fetched.

The issue of FileSystem walk not working happens whenever the application has shared libs and are using Spring boot (or any platform that packages with nested jars instead of 1 fat jar). So this will be quite common. You are right, we should fix this issue for stability, but our hands are tied for now because FileSystem.newFilesSystem in nested jars is not supported in Java 11. We can explore other options later.

What do you mean by let users take charge of it? What can they do if getSupportedLocales failed?

Get the supported locale before getting translation or bear the delay of unsupported locale. In fact this logic has been there before for a long time.

You are mixing 2 issues:

  1. There is a delay if supported locales is not yet fetched.
    The only added code for this problem, is createLocalesCacheItemTask -- createLocalesCacheItem in a separate thread.
    As I said, I do not see why fetching supported locales at app start up is better. It will affect app start up time performance. Fetching in a separate thread is merely 5-7 lines of code (createLocalesCacheItemTask) added.
  2. getComponentMessages fails if first step (getSupportedLoclaes) failed.
    This buggy logic has been there for a long time, but hidden for a while because we are not testing getSupportedLocales failed scenario. It was just made obvious by the Atlas bug. This is primarily what we are fixing in this PR (aside from using getResourceAsStream in getComponentMessages so that it will not fail even in nested jars)

@jessiejuachon
Copy link
Contributor Author

jessiejuachon commented Dec 5, 2020

@Xiaochao8

according OO thinking, the dataSource should be responsible of its methods instead of product service.

ProductService.getCachedSupportedLocales() is certainly not a function of the dataSource. The dataSource is not even processed here.

Getting supported locales from cache is a function of CacheService. CacheService is utilized when it is called down the line in ProductService.processSupportedLocales:

CacheService cs = new CacheService(new MessagesDTO(dto)); MessageCacheItem cacheItem = cs.getCacheOfLocales(dataSource);

Therefore, I do not think we are violating any OO concept.

Copy link
Contributor

@Xiaochao8 Xiaochao8 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even though there are still some places to be improved, there aren't blocker issues. Merge it now.

@Xiaochao8 Xiaochao8 merged commit 8d7af99 into vmware:g11n-java-client Dec 7, 2020
@jessiejuachon
Copy link
Contributor Author

jessiejuachon commented Dec 7, 2020

Even though there are still some places to be improved, there aren't blocker issues. Merge it now.

I agree! Thanks for merging it. I have created the following to track the getSupportedLocales issue (FileSystems walk fails) in nested-jar applications: #912

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/java-client cla-not-required kind/bug Something isn't working kind/enhancement Enhancement to existing features priority/high
Projects
None yet
4 participants