-
Notifications
You must be signed in to change notification settings - Fork 680
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[css-fonts] Drop bracket matching step from @font-face
src: line parsing
#6340
Comments
Right, because after segmentation on commas the procedure is
and the grammar includes parentheses. Also Fonts 3 didn't define parsing this way so I suspect implementations just extended what they already had for Fonts 3. @litherum you added that section could you explain the reasoning? |
Also this resolution:
|
I suspect the split on parantheses came about because at one point we were considering syntax like @font-face {
font-family: heading-font;
src: url(fancy-font.woff2) format("woff2;tables=CPAL,FVAR"),
url(fallback-font.woff2) format("woff2"),
url(fallback-fallback-font.woff) format("woff"),
url(how-old-is-your-browser-font.ttf) format("ttf");
} Note the commas inside the overloaded format specifier. But we ended up not doing that, so segmenting first on balanced parentheses no longer has any value. |
Agenda+ in case there are any good arguments in favor of the balanced-parens initial step. Otherwise, I'm fine dropping that part to improve internal spec consistency and also better align with actual implementations |
If the first step is to split on commas, what happens if there's a URL that includes a comma? |
comma is in the set
I guess that was never an issue for CSS Fonts 3, which didn't have the split-on-balanced-parens step? |
I don't think @drott's original comment here:
can be quite accurate, given that an example like:
successfully loads the base64-encoded font resource in all of Safari, Chrome and Firefox. If they were splitting the |
I'll note that I added an algo to Syntax for this exact case (for other reasons) that behaves correctly, so the spec can just invoke that: https://drafts.csswg.org/css-syntax/#parse-comma-separated-list-of-component-values |
(CSS specifications can't specify text-level parsing details anyway; by the time they have a value that they know is part of a particular property, it's already been tokenized and had basic parsing done on it, so all you've got is tokens, not characters. The above algo works at the correct level.) |
Thanks @jfkthame and @tabatkins for observations and clarifications. Of course, Jonathan, you're right that the data url scheme you're giving as an example does work. And it's also true that the CSS parsing operates on a token stream, not on a character stream. What I looked at during my analysis for WebKit and Chrome were these bits of code for where a font face src descriptor is parsed: In particular, the behavior of the parser here consuming and progressing to each next value comma-token by comma-token. But indeed while still inner component values such as With that clarified, I am still convinced that the prose for separating by bracket matching should be dropped as that's definitely not what's happening or needed. Because @tabatkins provides this very useful reference, as an additional improvement to the current text, we may want to refer to "5.3.10. Parse a comma-separated list of component values" as the algorithm for parsing this value, assuming that it's fully applicable: In that respect, are url(...) and local(...) defined as function tokens? |
Yeah, per the Syntax spec, they definitely get parsed as function tokens. (Depending on its contents, A little more precisely, by the time this algo can get called, they're already processed into functions, a very slightly higher-level object that contains all the stuff between their parens, so a pass over the top-level component values won't see any commas inside of there unless you recurse manually. |
The idea was just that I didn’t want “foo, bar(baz, quux)” to be broken into 3 pieces: “foo” “bar(baz” and “quux)”. If there’s logic elsewhere that already satisfies this desire, I’m happy to use it instead. WebKit hasn’t implemented parsing support for this new src: descriptor grammar yet. |
Thanks for commenting, @litherum. To summarize, I suggest to improve this section by dropping
and instead say:
I also suggest to address "ISSUE 4" and the definition of "thrown out" by saying:
|
Thanks for the suggested wording, @drott I just added it. |
This issue is still on the agenda for today's CSS call, but unless there are problems with the suggested wording then it looks like we can just close the issue as resolved? It would be good to have some known-invalid-component |
Yes, I support resolving and closing with this change. Unfortunately I have a conflict and can't attend today's call, but at least @lilles said they'd be there from our side. I plan to add tests in parallel to implementing the |
@jfkthame okay to close this? |
Yes, sounds OK to me. |
+1 to committed change, but @svgeesus INCONSISTENT TABS AND SPACE INDENTATION??? You could also just link the algorithm directly with |
Linking to the algorithm directly might well have been better, yes. I'm aware that Tabs vs. Spaces is one of the GREAT HOLY WARS OF OUR TIME and I am really not interested in signing up as a combatant. |
Adapt font-face src: parsing behavior to more leniently accept a src: line that has parse errors in an individual component. Part of Interop 2023 in the font feature detection and palettes category [1]. Also return serialized values in lowercase to match CSSOM spec. After CSS WG resolution [1] and updated spec text [2] individually failing src: line components for incorrect format() or tech(), or other parse failures per component should not lead to dropping the whole src: line, but only reject the individual component. Implement checks for component sanity and on failure, skip to next comma or end. [1] https://wpt.fyi/results/css/css-fonts/parsing?q=label%3Ainterop-2023-fonts [1] w3c/csswg-drafts#6340 [2] https://drafts.csswg.org/css-fonts-4/#font-face-src-parsing Fixed: 1398933, 857226 Change-Id: Ib94f745fb8b453e21fd981973c29dccb4b722439
Adapt font-face src: parsing behavior to more leniently accept a src: line that has parse errors in an individual component. Part of Interop 2023 in the font feature detection and palettes category [1]. Also return serialized values in lowercase to match CSSOM spec. After CSS WG resolution [1] and updated spec text [2] individually failing src: line components for incorrect format() or tech(), or other parse failures per component should not lead to dropping the whole src: line, but only reject the individual component. Implement checks for component sanity and on failure, skip to next comma or end. [1] https://wpt.fyi/results/css/css-fonts/parsing?q=label%3Ainterop-2023-fonts [1] w3c/csswg-drafts#6340 [2] https://drafts.csswg.org/css-fonts-4/#font-face-src-parsing Fixed: 1398933, 857226 Change-Id: Ib94f745fb8b453e21fd981973c29dccb4b722439
Adapt font-face src: parsing behavior to more leniently accept a src: line that has parse errors in an individual component. Part of Interop 2023 in the font feature detection and palettes category [1]. Also return serialized values in lowercase to match CSSOM spec. After CSS WG resolution [1] and updated spec text [2] individually failing src: line components for incorrect format() or tech(), or other parse failures per component should not lead to dropping the whole src: line, but only reject the individual component. Implement checks for component sanity and on failure, skip to next comma or end. Test updates: * fast/css/font-face-src-parsing.html updated to account for more cases being allowed. For example, trailing commas in a src: line are not a full parse failure. * fast/css/font-face-unquoted-local.html removed. The test is too restrictive after the behavior change, and relatively trivial and already covered in WPT tests, for example font-format-src-parsing.html. * fast/css/fontfaceset-multiple-families.html updated to account for a @font-face with format("svg") being rejected early and thus not becoming part of the FontFaceSet of the page. * external/wpt/css/css-fonts/format-specifiers-variations.html disabled until review in web-platform-tests/wpt#40666 is complete. Test seems to be expecting an incorrect error type, NetworkError vs. SyntaxError. [1] https://wpt.fyi/results/css/css-fonts/parsing?q=label%3Ainterop-2023-fonts [1] w3c/csswg-drafts#6340 [2] https://drafts.csswg.org/css-fonts-4/#font-face-src-parsing Fixed: 1398933, 857226 Change-Id: Ib94f745fb8b453e21fd981973c29dccb4b722439 Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/4624052 Reviewed-by: Rune Lillesveen <futhark@chromium.org> Commit-Queue: Dominik Röttsches <drott@chromium.org> Cr-Commit-Position: refs/heads/main@{#1160567}
For context, the original resolution from which the spec text for parsing the
src:
line of the@font-face
descriptor was tracked in issue #633. This issue is related to #6328.The current spec text says:
Why do we need the bracket matching? In my quick analysis, at least WebKit and Blink do not do that but the top level step for parsing the
src:
descriptors value is to split this into a comma separated list of values, then parse these individually. I am not so familiar with reading Rust code, but it roughly looks like Gecko parses this as a comma separated list at the top level as well? (@jfkthame, could you comment?).I suggest to drop this text and explain that the first segmentation step is separating by commas, then parsing each individual value separately, unless I am missing something on why this might be needed. To me it seems this would not conflict with any of the second level parsing of individual source values. (It does not look to me as commas would have any other meaning in this descriptor except separating sources, do they?)
This would also address the TODO issue in the text and explain that after comma separation, values that are not parsed correctly are dropped from the list and not considered as downloadable resources.
The text was updated successfully, but these errors were encountered: