Labels do not correctly render languages that require text shaping #2521
Comments
Using ES6 (and the polyfill available here: http://norbertlindenberg.com/2012/05/ecmascript-supplementary-characters/), it should be possible to fix LabelCollection to detect this and draw each surrogate pair as a single glyph. |
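A minimal sketch of the ES6 behavior referred to above, using only standard JavaScript (nothing Cesium-specific). It shows that splitting by UTF-16 code unit breaks a surrogate pair apart, while the string iterator keeps it together. Note that this only addresses surrogate pairs, not combining sequences or shaping.

```javascript
// "😀" (U+1F600) lies outside the BMP and is stored as the surrogate pair \uD83D\uDE00.
const text = "A😀B";

// split("") and .length work on UTF-16 code units, so the emoji counts twice.
const codeUnits = text.split("");

// The ES6 string iterator (used by Array.from, spread, and for...of) walks code points.
const codePoints = Array.from(text);

console.log(codeUnits.length);  // 4
console.log(codePoints.length); // 3
console.log(codePoints[1] === "😀"); // true
```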
I'm not sure the case in the example is a surrogate pair issue. According to the link in the forum post:
But it seems that each character in the example string can be represented as 3 hexadecimal digits. Maybe the issue is similar to this: BTW, for example:
returns while
returns
returns if we take the example string of this issue:
we get |
Congratulations on closing the issue! I found these Cesium forum links in the comments above: https://groups.google.com/d/msg/cesium-dev/6EA78tUxGRY/xMr9cfJGS1IJ If this issue affects any of these threads, please post a comment like the following:
I am a bot who helps you make Cesium awesome! Contributions to my configuration are welcome. 🌍 🌎 🌏 |
We are still seeing disconnected Arabic lettering. We've set enableRightToLeftDetection properly based on the browser's language. I really don't know much about RTL language characters, so some of the discussion here is over my head. Should Arabic labels appear connected? Did TwitterCldr fix this issue? |
@scottnc27603 Could you please paste a short code example to reproduce what you're seeing? Thanks! |
I think the original issue still exists as @siloboula shows. Here's a better example that shows it. The text in the billboard is correct. The one in the label is not.
|
I would have expected #7280 (which is in master only) to have fixed this, but that doesn't appear to be the case. Master does fix अनुच्छेद, for example. My hunch is that we need to do something special for RTL, such as iterating in the other direction. |
It's not a unicode issue, and I don't think it's an RTL issue. These are the right characters in the right place, but they're not the right shape. The shaping step in a text rendering system takes care of figuring out the right glyph for the character based on where it is. This is not possible if each character is rendered separately. This is a good article on this. Some quotes:
If the Label is rendering each character to a separate canvas in order to re-use them, that's also an incorrect assumption (the same character can have up to 4 different contextual forms in Arabic: isolated, initial, medial, and final). We'd have to at the very least render each word together, I think. |
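To illustrate the point about one character having several shapes: Unicode encodes the contextual forms of Arabic letters as separate compatibility code points, and NFKC normalization folds them all back to the same base letter. The sketch below uses the letter BEH (U+0628) and only standard JavaScript. It does not perform shaping; it only shows that a cache keyed by character cannot know which glyph is needed.

```javascript
// ARABIC LETTER BEH and its presentation forms from the
// Arabic Presentation Forms-B block.
const BEH = "\u0628";
const forms = {
  isolated: "\uFE8F",
  final: "\uFE90",
  initial: "\uFE91",
  medial: "\uFE92",
};

for (const [name, glyph] of Object.entries(forms)) {
  // Each form is a distinct glyph, but normalizes to the same character.
  console.log(name, glyph.normalize("NFKC") === BEH);
}
```

A shaping engine picks among these glyphs based on the neighboring letters, which is why the text has to be handed to the renderer as a word or a whole string.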
Spoke offline with @OmarShehata; the best solution is probably to add an option to render labels as whole strings, either at the Entity API level (which would be trivial) or the LabelPrimitive level (which may be more involved). This would definitely cause memory issues for large collections of labels, so care must be taken either way. Cesium renders per-character to begin with because per-word rendering does not scale in terms of texture usage, so per-character rendering needs to remain the default behavior. If we could efficiently auto-detect, that would be great, but I doubt that's possible. @siloboula we are certainly interested in fixing this, but there is currently no ETA on when that will happen. We welcome pull requests if you would like to propose a solution. See our Contributing Guide if you want to give it a try. |
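One possible shape for the auto-detection mentioned above, as a rough sketch only. `needsShaping` and `renderUnits` are hypothetical names, not Cesium APIs, and the script list is a non-exhaustive assumption. It relies on Unicode property escapes, which need an ES2018-capable engine. The check is a linear scan of each string; whether that cost is acceptable for large label collections is not established here.

```javascript
// Hypothetical helpers; none of these names exist in Cesium.
// Non-exhaustive list of scripts whose glyphs depend on neighboring characters.
const SHAPED_SCRIPTS =
  /[\p{Script=Arabic}\p{Script=Syriac}\p{Script=Devanagari}\p{Script=Bengali}]/u;

function needsShaping(text) {
  return SHAPED_SCRIPTS.test(text);
}

// Returns the pieces a renderer would draw and cache individually:
// the whole string when shaping is required, one code point at a time otherwise.
function renderUnits(text) {
  return needsShaping(text) ? [text] : Array.from(text);
}

console.log(renderUnits("Cesium").length);   // 6
console.log(renderUnits("مرحبا").length);    // 1
console.log(renderUnits("अनुच्छेद").length); // 1
```

Falling back to whole-string rendering only for strings that need it would keep the per-character glyph cache as the default, in line with the texture-usage concern raised above.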
As discussed on the forum, JavaScript treats surrogate pairs as two characters, when in reality they are one.
To reproduce:
The easiest short-term workaround for this is to use a Billboard instead (writeTextToCanvas works fine; this problem is specific to LabelCollection).
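A sketch of that workaround, assuming `Cesium` is the CesiumJS namespace and `viewer` is an existing `Cesium.Viewer`. `addTextBillboard` is a hypothetical helper name, and the font is an arbitrary choice. Both are passed in as parameters so the snippet does not depend on how Cesium is loaded.

```javascript
// Hypothetical helper, not part of Cesium.
// Draws the whole string to one canvas and shows it as a billboard.
function addTextBillboard(Cesium, viewer, text, longitude, latitude) {
  // writeTextToCanvas renders the full string in one pass, so the browser
  // handles surrogate pairs and shaping for the text as a unit.
  const canvas = Cesium.writeTextToCanvas(text, { font: "24px sans-serif" });

  return viewer.entities.add({
    position: Cesium.Cartesian3.fromDegrees(longitude, latitude),
    billboard: { image: canvas },
  });
}
```

Usage would look like `addTextBillboard(Cesium, viewer, "مرحبا", 35.0, 31.0)`. Each distinct string gets its own canvas, so this carries the memory cost discussed in the thread for large collections.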