[DESIGN] Update bidi design document to show proposed design #871
The design I actually think we should adopt is the "hybrid approaches" one. This is a necessary first step on the highway to UAX31 compliance and I think is responsibly contained/managed. It is a hybrid approach, in that it permits testable strict implementations to be created (particularly for message serialization). This PR consists of moving text around. I added one "pro" to one option also.
I have some reservations about taking this direction, but I don't oppose it provided that we express clearly the recommended isolation/marking style.
I also continue to advocate for bidi controls to be available around `name`, so that it can be isolated from prefix sigils like `:`, `$`, `@`.
But all of that can be addressed separately; this request for changes is about the inline comment below.
exploration/bidi-usability.md
The second part of the hybrid approach would be to recommend ("SHOULD") the "strict isolation"
design for serializers.
This syntax is a subset of the super-loose syntax and can be applied selectively to messages that
have RTL sequences or which have problematic display.
Presuming that this is referring to the "Strict isolation all the time" alternative, its proposed syntax is not a subset of the "Super-loose isolation" syntax, as the former includes at least this rule, which proposes bidi control characters in a non-whitespace position:
identifier = [(namespace ns-separator)] name
ns-separator = [bidi] ":"
Good catch. I'll add appropriate notes about this and the larger issue of names.
This probably works for most real-world cases.
It might be good to explicitly note that, with the proposed solution, RTL names that don't start with a strongly RTL-directional character are not well supported in positions that use a leading sigil.
This means that when using a name like `_مصر` for a variable (where the `_` is the first character of the name), this can only be rendered incorrectly as either `$_مصر` or `$_مصر`.
but whitespace is not at least optional.
This could be defined as:
```abnf
ns-separator = [bidi] ":" [bidi]
```
Why allow for the `bidi` after `:` if we're not allowing it to show up elsewhere after a starting sigil?
Your example `$_مصر` can be made to render correctly using LRI/PDI around the identifier (which the syntax allows and the strict form encourages). The `[bidi]` production allows starting or ending strongly directional marks to prevent spillover in the middle of a namespaced identifier, e.g. `$_م1صر:_م2صر`
I'll admit that it's generally better practice to put the strongly directional character at the end (before the `:` separator), but I didn't want to make the syntax ultra-fussy: whatever stew of strongly directional characters and a colon are not part of either the `namespace` or the `name`.
The separator is different from sigils because the sigils are all at token-start, whereas the namespace separator is embedded into the word-token.
code points for the first example: \u2066$_\u0645\u0635\u0631\u200e\u2069
try it
code points for the second example: \u2066$_\u06451\u0635\u0631:\u200e_\u06452\u0635\u0631\u200e\u2069
try it
Your example `$_مصر` can be made to render correctly using LRI/PDI around the identifier (which the syntax allows and the strict form encourages). [...] code points for the first example: `\u2066$_\u0645\u0635\u0631\u200e\u2069`
That's rendering for me with the `_` next to the `$`, which is wrong? The `_` here has neutral directionality, and it's the first character of a name which as a whole is RTL. So we ought to have some way to render it as the right-most character, but that's not possible without some bidi control after the `$`.
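For readers who want to check the directionality claims in this exchange, here is a minimal sketch that looks up the Unicode `Bidi_Class` of the characters under discussion with Python's standard `unicodedata` module. The `SAMPLES` table and the `bidi_classes` helper are illustrative names, not part of any MF2 implementation.

```python
import unicodedata

# Characters discussed in the thread; the labels are only for display.
SAMPLES = {
    "underscore": "_",
    "dollar sigil": "$",
    "colon": ":",
    "at sign": "@",
    "Arabic meem": "\u0645",
    "LRM": "\u200e",
    "LRI": "\u2066",
    "RLI": "\u2067",
    "PDI": "\u2069",
}


def bidi_classes(samples):
    """Map each label to the Bidi_Class abbreviation of its character."""
    return {label: unicodedata.bidirectional(ch) for label, ch in samples.items()}


if __name__ == "__main__":
    for label, cls in bidi_classes(SAMPLES).items():
        print(f"{label:14} {cls}")
```

The underscore reports `ON` (other neutral) and the Arabic letter reports `AL`, which is consistent with the point that a leading `_` has no direction of its own. Strictly speaking, `$` (`ET`) and `:` (`CS`) are weak types, but the bidi algorithm treats them as neutrals when they are not adjacent to digits.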
Fair enough, although a display like `$_مصر` is in itself super-weird (and its namespaced friend `$_م1صر:_م2صر` is even weirder) from the point of view that it's an RTL isolated run inside an LTR isolated run (or, in the latter case, two RTL runs inside an LTR run). This makes the name tokens display correctly RTL while still forcing the overall token (including the sigil) to display LTR. That is, this sequence:
\u2066$\u2067_\u06451\u0635\u0631\u2069:\u2067_\u06452\u0635\u0631\u2069\u2069
... with 6 of the 18 code points being bidi isolates.
There's a similar problem with options, where we're trying to force them to be LTR-ordered (`option = value` / `م1صر=م2صر`) when RTL really really wants that to be displayed as `م1صر=م2صر`.
So you're right: my proposal does compromise some aspects of RTL display as part of our insistence that messages are intended to work in an LTR editing environment with LTR syntax. Either way, logical order is all that matters and users should be careful about using mixed direction for non-literal values. We give them the tools to make their bidi literals and patterns work RTL-normally as well as the tools to let non-RTL speakers read placeholders LTR (like most developers debugging messages). It is not perfect. Do we need that last bit of cruft to allow the sigil and identifier to be separate? (Not asking rhetorically, btw. What do people think? Where can we get developer/translator feedback?)
Note: It took a minute of fiddling to get the namespaced example to not get entangled with the markdown in this comment--all in service of getting the underscore on the right side.
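The "6 of the 18 code points" figure in this comment can be checked mechanically. The sketch below is only an illustration; `ISOLATES`, `SEQUENCE`, and `isolate_stats` are hypothetical names, and the string is the escaped sequence quoted in the comment.

```python
# The four bidi isolate controls: LRI, RLI, FSI, PDI.
ISOLATES = {"\u2066", "\u2067", "\u2068", "\u2069"}

# The namespaced identifier with each name token isolated RTL inside an LTR isolate.
# Note that "\u06451" is U+0645 followed by the digit "1".
SEQUENCE = "\u2066$\u2067_\u06451\u0635\u0631\u2069:\u2067_\u06452\u0635\u0631\u2069\u2069"


def isolate_stats(text):
    """Return (total code points, number of bidi isolate controls)."""
    total = len(text)
    isolates = sum(1 for ch in text if ch in ISOLATES)
    return total, isolates


if __name__ == "__main__":
    total, isolates = isolate_stats(SEQUENCE)
    print(f"{isolates} of {total} code points are bidi isolates")
```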
I think the proposed solution solves the 99% case, and trying to solve the remaining 1% will be tricky. Moreover, the people who really have to deal with messages are the translators, and they will need tooling to effectively do their jobs at scale — and that tooling has far more freedom to deal with bidi issues than we have in plain text. So I don't think we need to go any further down this path.
Tbf, _ has neutral direction and it is relatively common to use it as a prefix for identifiers.
Yes, I know. So do all of the sigils (with some of them being enclosing punctuation).
I guess my argument would be that the sigil is, from a user perspective, "part of the name", e.g. `$foo`, not `$` with `foo` being separate. In an RTL display context, the sigil, being a neutral, will naturally reverse sides to be at the start (e.g. `$_مصر`). For us to maintain an LTR bias while making only the name token isolated is a lot more work--for tools and users.
The counterargument would be that `identifier` and `name` are their own things (and many are generated from the user's environment). The various sigils and such in MF2 are thus "not part of the name" and should be treated separately. Allowing users to insert these characters is not the same as obligating them to use them and I could see us making the affordance.
In the screenshot below, the display is set to RTL. The placeholders have an LRI/PDI inside the `{`/`}`.
- The first line, lacking any other bidi controls, is illegible.
- The second line uses RLI/PDI around each token. This presents sigils on the right, which an RTL speaker would understand, but the tokens are in LTR order. I don't specifically object to this presentation.
- The third line uses LRI/PDI around each token and approximates what I'm suggesting above.
- The fourth line uses the markup you're suggesting, which is RLI/PDI around each `name` and LRI/PDI around the sigil+identifier.
Here's the data used to produce the screenshot. Note that I do use a U+200E next to the `:` separator.
Example: {\u2066$_\u06351\u0636\u0637 :_\u06352\u0636\u0637:_\u06353\u0636\u0637 _\u06354\u0636\u0637=_\u06355\u0636\u0637\u2069}<br>
Example RLI: {\u2066\u2067$_\u06351\u0636\u0637\u2069 \u2067:_\u06352\u0636\u0637:_\u06353\u0636\u0637\u2069 \u2067_\u06354\u0636\u0637\u2069=\u2067_\u06355\u0636\u0637\u2069\u2069}<br>
Example LRI: {\u2066\u2066$_\u06351\u0636\u0637\u2069 \u2066:_\u06352\u0636\u0637\u200e:_\u06353\u0636\u0637\u2069 \u2066_\u06354\u0636\u0637\u2069=\u2066_\u06355\u0636\u0637\u2069\u2069}<br>
Example EAO: {\u2066\u2066$\u2067_\u06351\u0636\u0637\u2069\u2069 \u2066:\u2067_\u06352\u0636\u0637\u2069\u200e:\u2067_\u06353\u0636\u0637\u2069\u2069 \u2067_\u06354\u0636\u0637\u2069=\u2067_\u06355\u0636\u0637\u2069\u2069}<br>
The various sigils and such in MF2 are thus "not part of the name" and should be treated separately.
That would be my position. In particular for variables, AFAIK all current implementations would require parameters like `{ foo: 42 }` in order to resolve a `$foo`, and not e.g. `{ $foo: 42 }`.
In case it matters, here's a slightly shorter way to get the same results as in the last example (the LR isolation around the sigil+identifier isn't required as the placeholder already isolates its contents, and the LRM next to the `:` has no effect):
Example EAO: {\u2066$\u2067_\u06351\u0636\u0637\u2069 :\u2067_\u06352\u0636\u0637\u2069:\u2067_\u06353\u0636\u0637\u2069 \u2067_\u06354\u0636\u0637\u2069=\u2067_\u06355\u0636\u0637\u2069\u2069}
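As a rough illustration of how a serializer might produce this shorter form, the sketch below wraps each name in RLI/PDI, leaves the sigils and separators outside the isolates, and wraps the placeholder contents in a single LRI/PDI. The `isolate_rtl` and `placeholder` helpers are hypothetical and are not taken from any MF2 implementation; they assume every name is known to be RTL.

```python
LRI, RLI, PDI = "\u2066", "\u2067", "\u2069"


def isolate_rtl(name):
    """Wrap a single name token in an RTL isolate."""
    return f"{RLI}{name}{PDI}"


def placeholder(operand, namespace, function, option, value):
    """Build `{$operand :namespace:function option=value}` with each name
    isolated RTL and the whole placeholder body isolated LTR."""
    body = (
        f"${isolate_rtl(operand)} "
        f":{isolate_rtl(namespace)}:{isolate_rtl(function)} "
        f"{isolate_rtl(option)}={isolate_rtl(value)}"
    )
    return "{" + LRI + body + PDI + "}"


# The five names used in the thread's examples: "_" + U+0635 + digit + U+0636 + U+0637.
NAMES = [f"_\u0635{i}\u0636\u0637" for i in range(1, 6)]

if __name__ == "__main__":
    # Print with escapes so the invisible controls are visible.
    print(placeholder(*NAMES).encode("unicode_escape").decode("ascii"))
```

Keeping the sigils outside the RTL isolates is what lets the `_` render as the right-most character of each name while the placeholder as a whole still reads left to right.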
In case it matters, here's a slightly shorter way to get the same results as in the last example (The LR isolation around the sigil+identifier isn't required as the placeholder already isolates its contents, and the LRM next to the : has no effect):
Your formulation is correct. Note that the LRM is necessary if we don't permit isolates inside the `identifier` production (which is not what you are proposing), because the `:` is a neutral and wants to extend whatever run is to either side of it.
I think something got damaged somewhere in the several copy/paste operations.
To make things readable I replaced:
\u0635 A
\u0636 B
\u0637 C
\u2066 [LRI]
\u2069 [PDI]
\u200E [LRM]
\u2067 [RLI]
And the resulting strings above are:
Example: {[LRI]$_A1BC :_A2BC:_A3BC _A4BC=_A5BC[PDI]}<br>
Example RLI: {[LRI][RLI]$_A1BC[PDI] [RLI]:_A2BC:_A3BC[PDI] [RLI]_A4BC[PDI]=[RLI]_A5BC[PDI][PDI]}<br>
Example LRI: {[LRI][LRI]$_A1BC[PDI] [LRI]:_A2BC[LRM]:_A3BC[PDI] [LRI]_A4BC[PDI]=[LRI]_A5BC[PDI][PDI]}<br>
Example EAO: {[LRI][LRI]$[RLI]_A1BC[PDI][PDI] [LRI]:[RLI]_A2BC[PDI][LRM]:[RLI]_A3BC[PDI][PDI] [RLI]_A4BC[PDI]=[RLI]_A5BC[PDI][PDI]}<br>
Example EAO: {[LRI]$[RLI]_A1BC[PDI] :[RLI]_A2BC[PDI]:[RLI]_A3BC[PDI] [RLI]_A4BC[PDI]=[RLI]_A5BC[PDI][PDI]}
Taking out the bidi control characters we are left with
{$_A1BC :_A2BC:_A3BC _A4BC=_A5BC}
That is not valid MF2 syntax.
So I don't know what this is trying to fix.
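The substitution described above is easy to reproduce. This is a minimal sketch, with `LABELS`, `CONTROLS`, `label`, and `strip_controls` as illustrative names, that applies the same replacement table and then removes the bidi controls, using the first example string from the thread as input.

```python
# Replacement table from the comment: Arabic letters become A/B/C and
# bidi controls become bracketed labels.
LABELS = {
    "\u0635": "A",
    "\u0636": "B",
    "\u0637": "C",
    "\u2066": "[LRI]",
    "\u2069": "[PDI]",
    "\u200e": "[LRM]",
    "\u2067": "[RLI]",
}

# The bidi control characters that appear in the examples.
CONTROLS = {"\u2066", "\u2067", "\u2069", "\u200e"}


def label(text):
    """Make invisible controls and RTL letters readable in an LTR context."""
    return "".join(LABELS.get(ch, ch) for ch in text)


def strip_controls(text):
    """Drop the bidi controls, leaving only the visible syntax."""
    return "".join(ch for ch in text if ch not in CONTROLS)


EXAMPLE = "{\u2066$_\u06351\u0636\u0637 :_\u06352\u0636\u0637:_\u06353\u0636\u0637 _\u06354\u0636\u0637=_\u06355\u0636\u0637\u2069}"

if __name__ == "__main__":
    print(label(EXAMPLE))
    print(label(strip_controls(EXAMPLE)))
```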
But yesterday I played a bit and put together a web page that one can use to interactively play with this.
https://mihai-nita.net/tmp/mf2bidi.html
And I argue that:
- isolates are enough (FSI, LRI, RLI, PDI)
- there is no need for any control character between `$` and the name proper, even when there is an `_` there.
That is not valid MF2 syntax.
Um... how is it not valid?
`$_A1BC` is a valid operand.
`:_A2BC` is a valid namespace. `:_A3BC` is a valid function name.
`_A4BC` is a valid option name.
`_A5BC` is a valid unquoted literal.
It's even a valid message, in that a simple message might consist solely of a placeholder.
What am I missing?
Notes from 2024-09-02 call: need to add text about ALM. Look up ALM end-effects.
- clarifies that keyword selection follows exact match - clarifies the purpose of rule-based selection - makes non-CLDR-based implementation permitted * Update spec/registry.md Co-authored-by: Eemeli Aro <eemeli@mozilla.com> * Update spec/registry.md Co-authored-by: Eemeli Aro <eemeli@mozilla.com> * Update spec/registry.md Co-authored-by: Eemeli Aro <eemeli@mozilla.com> --------- Co-authored-by: Eemeli Aro <eemeli@mozilla.com> * [DESIGN] Maintaining the Standard, Optional and Unicode Namespace Function Sets (#634) * Design doc to capture registry maintenance * Update maintaining-registry.md * Update exploration/maintaining-registry.md Co-authored-by: Tim Chevalier <tjc@igalia.com> * Update exploration/maintaining-registry.md Co-authored-by: Tim Chevalier <tjc@igalia.com> * Add user stories, small updates to RGI * Update exploration/maintaining-registry.md * Adding additional detail * Remove machine readable registry; update prose * Update maintaining-registry.md * Further development work * Update to change format and naming Per the 2024-08-19 call, we decided to switch towards a specification-per-function model, with statuses. This commit includes the initial set of changes to try and implement this. * Address some comments. 
--------- Co-authored-by: Tim Chevalier <tjc@igalia.com> * Create notes-2024-09-09.md * Fix a typo in an example (#880) The upcoming work to implement resolved value might make this patch unnecessary or obsolete, but fixing the typo (missing `{`/`}` around the variable in the pattern) just in case * Remove forward-compatibility promise and all reserved & private syntax (#883) * Remove forwards compatibility from stability guarantee * Drop reserved statements and expressions * Drop private-use annotations * Update tests * Clarify that deprecation is not removal * Match on variables instead of expressions (#877) * Match on variables instead of expressions * Apply suggestions from code review Co-authored-by: Addison Phillips <addison@unicode.org> * Apply suggestions from code review * Add missing test changes noticed during implementation * Empty commit to re-trigger CLA check --------- Co-authored-by: Addison Phillips <addison@unicode.org> * Create notes-2024-09-10.md * Add bidi support and address UAX31/UTS55 requirements (#884) * Add bidi support and address UAX31/UTS55 requirements Adds the bidi strong marks ALM, RLM, and LRM plus the bidi isolate controls LRI, RLI, FSI, and PDI to the syntax. Formally defines optional vs. non-optional whitespace. Non-optional whitespace must include at least one whitespace character. Optional whitespace may contain only bidi marks (which are invisible) * Update syntax.md including text from previous PR * Repair the guidance on strongly directional marks Include ALM and better specify how to use the marks. * Fix formatting of the "important" * Add bidi characters to description of whitespace. * Permit bidi in a few more places Add optional whitespace at the start of `variant` Add optional whitespace around `quoted-pattern` These changes result in allowing bidi around keys and quoted patterns as intended. * Update syntax.md ABNF * Update formatting.md - Add a note about the difference between formatting and message syntax. 
- Clarify the sentence about message directionality. * Address comment about name/identifier * Address comments related to bidi in `name` * Fix variable's location * Address comment about the list of LRI/PDI targets * One character typo :-P * Update spec/syntax.md Co-authored-by: Eemeli Aro <eemeli@mozilla.com> * Address comments about rule R3a-1 * Update spec/syntax.md Co-authored-by: Eemeli Aro <eemeli@mozilla.com> * Address comment about U+061C * Change [o]wsp => `o` or `s` * Match syntax spec to abnf * Remove * * Update syntax.md * Update spec/syntax.md Co-authored-by: Eemeli Aro <eemeli@mozilla.com> * Update spec/message.abnf Co-authored-by: Eemeli Aro <eemeli@mozilla.com> * Update spec/message.abnf Co-authored-by: Eemeli Aro <eemeli@mozilla.com> * Update syntax.md * Update spec/message.abnf Co-authored-by: Eemeli Aro <eemeli@mozilla.com> * Update spec/syntax.md Co-authored-by: Eemeli Aro <eemeli@mozilla.com> * Update spec/syntax.md Co-authored-by: Eemeli Aro <eemeli@mozilla.com> --------- Co-authored-by: Eemeli Aro <eemeli@mozilla.com> * Specify `bad-option` for bad digit size option values (#882) * Specify `bad-option` for bad digit size option values Fixes #739 * adopt 'non-negative integer' * Create notes-2024-09-16.md * Address name and literal equality (#885) * Address name and literal equality This change defines equality as discussed in the 2024-09-09 teleconference in the following ways: - It defines _name_ equality as being under NFC - It defines _literal_ equality as explicitly **not** under NFC - It moves _name_ before _identifier_ in that section of text to avoid a forward definition. Note that this deviates from discussion in 2024-09-09's call in that we didn't discuss literals at length. It also doesn't discuss non-name/non-literal values, which I'll point out are limited to ASCII sequences such as keywords. * Typo fix * Add a note about not requiring implementations to actually normalize * Implement changes discussed in 2024-09-16 call. 
- Make _key_ require NFC for uniqueness/comparison - Add a note about NFC - Make _literal_ **_not_** define equality - Make text in _name_ identical to that in _key_ for consistency * Update formatting.md to include keys in NFC * Address comments * Update spec/syntax.md Co-authored-by: Eemeli Aro <eemeli@mozilla.com> * Update spec/syntax.md Co-authored-by: Eemeli Aro <eemeli@mozilla.com> --------- Co-authored-by: Eemeli Aro <eemeli@mozilla.com> * Update list of normative changes during the LDML45 period (#890) * Fix typos in data-model-errors tests (#892) Fix #886 * Update note on exact numeric match for v46 (#891) Addresses #887 Non-normative changes to the notes specifically part of LDML46 * Fix attribute value to be literal (#894) Fixes #893 * Create notes-2024-09-30.md * Add Resolved Values and Function Handler sections to formatting (#728) * Add Resolved Values section to formatting * Apply suggestions from code review * Apply suggestions from code review * Apply suggestions from code review Co-authored-by: Tim Chevalier <tjc@igalia.com> * Linkify "resolved value" * Add some examples & explicitly allow wrapping input values * No throw, only emit Co-authored-by: Tim Chevalier <tjc@igalia.com> * Add section on Function Handlers, defining the term * Apply suggestions from code review * Rephrase initial resolved value definition * Update spec/formatting.md Co-authored-by: Eemeli Aro <eemeli@mozilla.com> * Update resolved value definition again Co-authored-by: Addison Phillips <addison@unicode.org> --------- Co-authored-by: Tim Chevalier <tjc@igalia.com> Co-authored-by: Addison Phillips <addison@unicode.org> * Define function composition for :number and :integer values (#823) * Define function composition for :number and :integer values * Apply suggestions from code review Co-authored-by: Addison Phillips <addison@unicode.org> * Add operand option priority example * Add apostrophes' Co-authored-by: Tim Chevalier <tjc@igalia.com> * Update spec/registry.md 
Co-authored-by: Eemeli Aro <eemeli@mozilla.com> * Update spec/registry.md Co-authored-by: Eemeli Aro <eemeli@mozilla.com> --------- Co-authored-by: Addison Phillips <addison@unicode.org> Co-authored-by: Tim Chevalier <tjc@igalia.com> * Create notes-2024-10-07.md * Apply NFC normalization during :string key comparison (#905) * Apply NFC normalization during :string key comparison * Add link to UAX#15 Co-authored-by: Addison Phillips <addison@unicode.org> --------- Co-authored-by: Addison Phillips <addison@unicode.org> * Add tests for changes due to bidi/whitespace (#902) * Add tests for changes due to bidi/whitespace * Correct output * Make erroneous test a syntax error * Define function composition for date/time values (#814) * Define function composition for date/time values * Apply suggestions from code review Co-authored-by: Stanisław Małolepszy <sta@malolepszy.org> * Drop the "only" * Update spec/registry.md * Update spec/registry.md Co-authored-by: Eemeli Aro <eemeli@mozilla.com> * Update spec/registry.md Co-authored-by: Eemeli Aro <eemeli@mozilla.com> * Update spec/registry.md Co-authored-by: Eemeli Aro <eemeli@mozilla.com> * Make :date and :time composition implementation-defined --------- Co-authored-by: Stanisław Małolepszy <sta@malolepszy.org> Co-authored-by: Addison Phillips <addison@unicode.org> * DESIGN: Add alternative designs to the design doc on function composition (#806) * DESIGN: Add a sequel to the design doc on function composition This document sketches out some alternatives for the machinery provided to enable function composition. The goal is to provide an exhaustive list of alternatives. 
* Remove 'part 2' document and move contents to the end of part 1 * Revise introduction to reflect the changed goal * Edited for conciseness * Further edits for conciseness * Give a name to InputType and use it * Refer to motivating examples * Update function-composition-part-1.md status Per 2024-10-14 telecon * Create notes-2024-10-14.md * Add test for :integer and :number composition (#907) * Fix `:integer` option `useGrouping` values (#912) I noticed that `:integer` does not include the "never" value for the option `useGrouping`. This is a bug. * Drop syntax note on additional bidi changes (#910) Drop syntax note on addition bidi changes * Add tests for changes due to #885 (name/literal equality) (#904) * Add tests for changes due to #885 (name/literal equality) * Update test/tests/functions/string.json Co-authored-by: Eemeli Aro <eemeli@gmail.com> * Update test/tests/syntax.json Co-authored-by: Eemeli Aro <eemeli@gmail.com> * Update test/tests/functions/string.json Co-authored-by: Eemeli Aro <eemeli@gmail.com> * Added tests for reordering and special case mapping * Add another selection test --------- Co-authored-by: Eemeli Aro <eemeli@gmail.com> * Add u: options namespace (#846) * Move spec/registry.md -> spec/registry/default.md * Add Unicode Registry definition * Refer to BCP47, add note about only requiring normal tags * Call it a namespace * Apply suggestions from code review Co-authored-by: Addison Phillips <addison@unicode.org> * Fix test file reference Co-authored-by: Tim Chevalier <tjc@igalia.com> * Apply suggestions from code review * Update spec/u-namespace.md Co-authored-by: Eemeli Aro <eemeli@mozilla.com> * Apply suggestions from code review Co-authored-by: Addison Phillips <addison@unicode.org> * Apply suggestions from code review Co-authored-by: Addison Phillips <addison@unicode.org> * Add mention of functions to namespace description --------- Co-authored-by: Addison Phillips <addison@unicode.org> Co-authored-by: Tim Chevalier <tjc@igalia.com> * 
Define function composition for :string values (#798) * Define function composition for :string values * Update spec/registry.md as suggested by @stasm in #814 * Drop the "only" * Update text following code review comments --------- Co-authored-by: Addison Phillips <addison@unicode.org> * Drop data model request for feedback on "name" (#909) * Allow surrogates in content, issue #895 (#906) * Allow surrogates in content, issue #895 * Grammar and typos, linkify terms, make into a note, and fix 2119 keywords Thanks Addison! Co-authored-by: Addison Phillips <addisonI18N@gmail.com> * Not using "localizable elements" Co-authored-by: Addison Phillips <addisonI18N@gmail.com> * Keep syntax.md in sync with message.abnf * Added note about surrogates to quoted literals * Moved the note about surrogates from Security Considerations to The Message * Update spec/syntax.md * Update spec/syntax.md * Italicize in a couple of places * Implemented more (all?) feedback from review --------- Co-authored-by: Addison Phillips <addisonI18N@gmail.com> --------- Co-authored-by: Eemeli Aro <eemeli@mozilla.com> Co-authored-by: Elango Cheran <elango@unicode.org> Co-authored-by: Tim Chevalier <tjc@igalia.com> Co-authored-by: Mark Davis <mark@unicode.org> Co-authored-by: Danny Gleckler <daniel.gleckler@d2l.com> Co-authored-by: Steven R. Loomis <srl295@gmail.com> Co-authored-by: Stanisław Małolepszy <sta@malolepszy.org> Co-authored-by: Eemeli Aro <eemeli@gmail.com> Co-authored-by: Mihai Nita <nmihai_2000@yahoo.com> * Add serialization proposal * Revert "Add serialization proposal" This reverts commit 17af553. * Revert "Update from main (#914)" This reverts commit da9377b. * Add serialization proposal --------- Co-authored-by: Eemeli Aro <eemeli@mozilla.com> Co-authored-by: Elango Cheran <elango@unicode.org> Co-authored-by: Tim Chevalier <tjc@igalia.com> Co-authored-by: Mark Davis <mark@unicode.org> Co-authored-by: Danny Gleckler <daniel.gleckler@d2l.com> Co-authored-by: Steven R. 
Loomis <srl295@gmail.com> Co-authored-by: Stanisław Małolepszy <sta@malolepszy.org> Co-authored-by: Eemeli Aro <eemeli@gmail.com> Co-authored-by: Mihai Nita <nmihai_2000@yahoo.com>