Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: remove valid color checking from PWA requirements #9623

Closed
patrickhulce opened this issue Aug 28, 2019 · 23 comments · Fixed by #14447
Closed

Proposal: remove valid color checking from PWA requirements #9623

patrickhulce opened this issue Aug 28, 2019 · 23 comments · Fixed by #14447

Comments

@patrickhulce
Copy link
Collaborator

Summary

So after #3891 (comment) I went to update our cssstyle dependency to fix the rgba(0,0,0,0) bug, but then it turns out the new version that fixes that introduces several other bugs. The valid color checking has been a whack-a-mole problem for years and it feels not very useful so I was curious what it looks like on real sites.

According to July HTTPArchive data...

Only .08% of sites ever trigger our invalid color logic and of that set ~42% are actually valid colors that we incorrectly flag as invalid (see trix).

Proposal A: Change invalid color check to a warning on PWA themed-omnibox and splash-screen. It feels like this is a warning-level thing anyway and we're almost wrong as often as we're right.

Proposal B: Remove invalid color check entirely. 20KB for something useful to .05% of sites feels like a bit much, false warnings aren't great either.

@connorjclark
Copy link
Collaborator

is removing an audit a "breaking change"?

@patrickhulce
Copy link
Collaborator Author

is removing an audit a "breaking change"?

To clarify, it's not removing an existing audit. It's just changing the behavior of our two existing audits (themed-omnibox and splash-screen) to not fail just because a color was parsed as invalid.

@clshortfuse
Copy link

For reference, it was from the Chromium source code for manifest parsing:

https://chromium.googlesource.com/chromium/src.git/+/62.0.3178.1/content/renderer/manifest/manifest_parser_unittest.cc#1569

where I found that "background_color": "rgba(0,0,0,0)" is valid. Replicating their tests could be a good idea. I'm also not sure if "background_color": "cornflowerblue" would pass either, and that's the sample shown at:

https://developers.google.com/web/tools/lighthouse/audits/manifest-contains-background_color

@patrickhulce
Copy link
Collaborator Author

FYI cornflowerblue should work just fine @clshortfuse, it's a CSS keyword (keywords covered by the test cases at https://chromium.googlesource.com/chromium/src.git/+/62.0.3178.1/content/renderer/manifest/manifest_parser_unittest.cc#1530)

@paulirish
Copy link
Member

paulirish commented Aug 29, 2019

I'm +1 on removing valid color checking. let's do it.

@brendankenny
Copy link
Member

Only .08% of sites ever trigger our invalid color logic

isn't that generally true of most of our PWA audits, though? How many sites have a manifest at all?

My main issue with dropping it is that invalid CSS colors are treated as unset, so they really won't have a theme color set if the color is invalid.

It does look like we were missing a bunch of other functionality in cssstyle specifically for checking colors by trying to just valueType(), so it may help to switch to using that. I did notice there were some CSS 4 color forms missing, though.

What other new bugs came with up updating to 2.0.0?

@brendankenny
Copy link
Member

(less important, but I also don't think "whack-a-mole problem" is really applicable. AFAICR we added color parsing from devtools-frontend way back at the dawn of time, then switched to this library when we pulled out the last bit of WebInspector :)

@clshortfuse
Copy link

@patrickhulce I mean if cornflowerblue would pass on Lighthouse. I didn't go too far in my testing, but transparent fails on Lighthouse while it does pass fine on Chrome.

I guess there's a whitelist of some sorts on Lighthouse and it's missing transparent, maybe?

@patrickhulce
Copy link
Collaborator Author

I guess there's a whitelist of some sorts on Lighthouse and it's missing transparent, maybe?

Correct, the version of the library we use does not contain transparent as a keyword. This is the "whack-a-mole" problem I was referring to.

How many sites have a manifest at all?

~6% of sites have an installable manifest. So the upper bound on invalid color applicability to PWA developers is ~.8% (.0008 / .06 * (1 - 0.42)).

What other new bugs came with up updating to 2.0.0?

They don't implement hex color validation correctly so for example our invalid test #1234567 passes as a valid color. They also didn't fix rgba correctly so now it incorrectly flags rgba(128, 64, 128, 0.4) as an invalid color even though rgba(0, 0, 0, 0) passes (they look for 4 integers).

I also don't think "whack-a-mole problem" is really applicable. AFAICR we added color parsing from devtools-frontend way back at the dawn of time, then switched to this library when we pulled out the last bit of WebInspector

I'm probably conflating with our validation of other PWA values that have popped up in iOS, but between the two we've had at least 4 updates to these sorts of validators and still have lots of missing invalid values.

My main issue with dropping it is that invalid CSS colors are treated as unset, so they really won't have a theme color set if the color is invalid.

This is fine but my bigger issue is that when we knowingly have incomplete coverage of a problem, whatever the reason may be, falsely failing for correct code is worse than warning it might have been a mistake especially when our accuracy is nearly as bad as a coin flip.

@brendankenny
Copy link
Member

brendankenny commented Aug 30, 2019

So I ran the spreadsheet of values through an actual setting of a css color to see which ones were valid, then compared against what the library can do.

If we

  • update to cssstyle 2.0.0
  • call trim().toLowerCase() on the input

we end up with 8 out of 4910 sites in that HTTPArchive data incorrectly flagged as invalid (8 out of 1265 unique color strings in that sheet). A breakdown:

  • "rgba(239,239,239,0.9"
    "rgb(150, 191, 161"

    apparently you're allowed to do this and the parser will treat the css function as done (see <EOF-token>)

  • "rgba(46, 49, 146)"
    "rgba(255,238,20)"
    "rgba(116,143,138)"
    TIL CSS Colors 4 defines rgba() as just an alias for rgb() (both support an optional alpha)

  • "rgb( 0, 174, 255)"
    "rgb( 0, 144, 54)"
    "rgb( 91, 186, 213)"
    cssstyle's regex incorrectly doesn't allow a space before the first entry

These could all be fixed pretty easily upstream (as well as the need for toLowerCase since "Unless otherwise specified, all matches are ASCII case-insensitive.")


Color strings incorrectly flagged as valid is a slightly bigger set, but we're also less worried about bugging users with these.

We end up with 29 out of 4910 sites in that HTTPArchive data we consider valid when they're actually invalid (16 out of 1265 unique color strings in that sheet).

  • 20 sites (13 unique) use a seven-digit hex value ("#fffffff", "#1717171", "#0000000", "#ff5349f", "#F649941", "#1e87f01", "#659e367", "#f0f0f0f", "#ebe2daf", "#E21836F", "#C51af5b", "#f8f8f8E", "#0f46600"), which isn't valid. This could also be fixed easily upstream.

  • 9 sites (3 unique) use a non-breaking space in their string ("#ffffff  ", "#8dc63f  ", "#ffffff ") which isn't a valid CSS whitespace. This is an artifact of using trim() (which eats nbsp) vs a more correct .replace(/^[ \t\n\r\f]*/, '').replace(/[ \t\n\r\f]*$/, '');, which we could easily use instead, it just made the top-level bullet points look worse :). This is also different in the meta theme-color attribute, since \t etc in an attribute make the color not recognized by Chrome at least.

    It's not clear to me if cssstyle wants to deal with starting and ending whitespace (we are reaching into their parsers, so maybe this is dealt with earlier in their pipeline), so this one may be up to us.

@clshortfuse
Copy link

clshortfuse commented Aug 31, 2019

I recently submitted a PR to fix the regex used in clean-css: https://github.com/jakubpawlowicz/clean-css/pull/1083/files

I reworked it a bit since then, but here's a Regex that would follow the CSS spec:

^\s*(\/\*([^*]|\*(?!\/))*\*\/)*\s*(currentcolor|transparent|aliceblue|antiquewhite|aqua|aquamarine|azure|beige|bisque|black|blanchedalmond|blue|blueviolet|brown|burlywood|cadetblue|chartreuse|chocolate|coral|cornflowerblue|cornsilk|crimson|cyan|darkblue|darkcyan|darkgoldenrod|darkgray|darkgreen|darkgrey|darkkhaki|darkmagenta|darkolivegreen|darkorange|darkorchid|darkred|darksalmon|darkseagreen|darkslateblue|darkslategray|darkslategrey|darkturquoise|darkviolet|deeppink|deepskyblue|dimgray|dimgrey|dodgerblue|firebrick|floralwhite|forestgreen|fuchsia|gainsboro|ghostwhite|gold|goldenrod|gray|green|greenyellow|grey|honeydew|hotpink|indianred|indigo|ivory|khaki|lavender|lavenderblush|lawngreen|lemonchiffon|lightblue|lightcoral|lightcyan|lightgoldenrodyellow|lightgray|lightgreen|lightgrey|lightpink|lightsalmon|lightseagreen|lightskyblue|lightslategray|lightslategrey|lightsteelblue|lightyellow|lime|limegreen|linen|magenta|maroon|mediumaquamarine|mediumblue|mediumorchid|mediumpurple|mediumseagreen|mediumslateblue|mediumspringgreen|mediumturquoise|mediumvioletred|midnightblue|mintcream|mistyrose|moccasin|navajowhite|navy|oldlace|olive|olivedrab|orange|orangered|orchid|palegoldenrod|palegreen|paleturquoise|palevioletred|papayawhip|peachpuff|peru|pink|plum|powderblue|purple|rebeccapurple|red|rosybrown|royalblue|saddlebrown|salmon|sandybrown|seagreen|seashell|sienna|silver|skyblue|slateblue|slategray|slategrey|snow|springgreen|steelblue|tan|teal|thistle|tomato|turquoise|violet|wheat|white|whitesmoke|yellow|yellowgreen|#[A-F0-9]{3,4}|#[A-F0-9]{6}|#[A-F0-9]{8}|rgba?\((\s*(\/\*([^*]|\*(?!\/))*\*\/)*\s*[+-]?((\d+)|(\d*\.\d+))(e[+-]?\d+)?\s*(\/\*([^*]|\*(?!\/))*\*\/)*\s*,){2}\s*(\/\*([^*]|\*(?!\/))*\*\/)*\s*[+-]?((\d+)|(\d*\.\d+))(e[+-]?\d+)?\s*(\/\*([^*]|\*(?!\/))*\*\/)*\s*(,\s*(\/\*([^*]|\*(?!\/))*\*\/)*\s*[+-]?((\d+)|(\d*\.\d+))(e[+-]?\d+)?%?\s*(\/\*([^*]|\*(?!\/))*\*\/)*\s*)?(\)|$)|rgba?\((\s*(\/\*([^*]|\*(?!\/))*\*\/)*\s*[+-]?((\d+)|(\d*\.\d+))(e[+-]?\d+)?%\s*(\/\*([^*]|\*(?!\/))*\*\/)*\s*,){2}\s*(\/\*([^*]|\*(?!\/))*\*\/)*\s*[+-]?((\d+)|(\d*\.\d+))(e[+-]?\d+)?%\s*(\/\*([^*]|\*(?!\/))*\*\/)*\s*(,\s*(\/\*([^*]|\*(?!\/))*\*\/)*\s*[+-]?((\d+)|(\d*\.\d+))(e[+-]?\d+)?%?\s*(\/\*([^*]|\*(?!\/))*\*\/)*\s*)?(\)|$)|rgba?\((\s*(\/\*([^*]|\*(?!\/))*\*\/)*\s*[+-]?((\d+)|(\d*\.\d+))(e[+-]?\d+)?\s*(\/\*([^*]|\*(?!\/))*\*\/)*\s*\s+){2}\s*(\/\*([^*]|\*(?!\/))*\*\/)*\s*[+-]?((\d+)|(\d*\.\d+))(e[+-]?\d+)?\s*(\/\*([^*]|\*(?!\/))*\*\/)*\s*(\/\s*(\/\*([^*]|\*(?!\/))*\*\/)*\s*[+-]?((\d+)|(\d*\.\d+))(e[+-]?\d+)?%?\s*(\/\*([^*]|\*(?!\/))*\*\/)*\s*)?(\)|$)|rgba?\((\s*(\/\*([^*]|\*(?!\/))*\*\/)*\s*[+-]?((\d+)|(\d*\.\d+))(e[+-]?\d+)?%\s*(\/\*([^*]|\*(?!\/))*\*\/)*\s*\s+){2}\s*(\/\*([^*]|\*(?!\/))*\*\/)*\s*[+-]?((\d+)|(\d*\.\d+))(e[+-]?\d+)?%\s*(\/\*([^*]|\*(?!\/))*\*\/)*\s*(\/\s*(\/\*([^*]|\*(?!\/))*\*\/)*\s*[+-]?((\d+)|(\d*\.\d+))(e[+-]?\d+)?%?\s*(\/\*([^*]|\*(?!\/))*\*\/)*\s*)?(\)|$)|hsla?\(\s*(\/\*([^*]|\*(?!\/))*\*\/)*\s*[+-]?((\d+)|(\d*\.\d+))(e[+-]?\d+)?(deg|grad|rad|turn)?\s*(\/\*([^*]|\*(?!\/))*\*\/)*\s*(,\s*(\/\*([^*]|\*(?!\/))*\*\/)*\s*[+-]?((\d+)|(\d*\.\d+))(e[+-]?\d+)?%\s*(\/\*([^*]|\*(?!\/))*\*\/)*\s*){2}(,\s*(\/\*([^*]|\*(?!\/))*\*\/)*\s*[+-]?((\d+)|(\d*\.\d+))(e[+-]?\d+)?%?\s*(\/\*([^*]|\*(?!\/))*\*\/)*\s*)?(\)|$)|(hsla?|hwb)\(\s*(\/\*([^*]|\*(?!\/))*\*\/)*\s*[+-]?((\d+)|(\d*\.\d+))(e[+-]?\d+)?(deg|grad|rad|turn)?\s*(\/\*([^*]|\*(?!\/))*\*\/)*\s*(\s+\s*(\/\*([^*]|\*(?!\/))*\*\/)*\s*[+-]?((\d+)|(\d*\.\d+))(e[+-]?\d+)?%\s*(\/\*([^*]|\*(?!\/))*\*\/)*\s*){2}(\/\s*(\/\*([^*]|\*(?!\/))*\*\/)*\s*[+-]?((\d+)|(\d*\.\d+))(e[+-]?\d+)?%?\s*(\/\*([^*]|\*(?!\/))*\*\/)*\s*)?(\)|$)|device-cmyk\((\s*(\/\*([^*]|\*(?!\/))*\*\/)*\s*[+-]?((\d+)|(\d*\.\d+))(e[+-]?\d+)?%?\s*(\/\*([^*]|\*(?!\/))*\*\/)*\s*\s+){3}\s*(\/\*([^*]|\*(?!\/))*\*\/)*\s*[+-]?((\d+)|(\d*\.\d+))(e[+-]?\d+)?%?\s*(\/\*([^*]|\*(?!\/))*\*\/)*\s*(\/\s*(\/\*([^*]|\*(?!\/))*\*\/)*\s*[+-]?((\d+)|(\d*\.\d+))(e[+-]?\d+)?%?\s*(\/\*([^*]|\*(?!\/))*\*\/)*\s*)?(,([^()]*|\([^)]*\))*)?(\)|$)|lab\((\s*(\/\*([^*]|\*(?!\/))*\*\/)*\s*[+-]?((\d+)|(\d*\.\d+))(e[+-]?\d+)?\s*(\/\*([^*]|\*(?!\/))*\*\/)*\s*\s+){2}\s*(\/\*([^*]|\*(?!\/))*\*\/)*\s*[+-]?((\d+)|(\d*\.\d+))(e[+-]?\d+)?\s*(\/\*([^*]|\*(?!\/))*\*\/)*\s*(\/\s*(\/\*([^*]|\*(?!\/))*\*\/)*\s*[+-]?((\d+)|(\d*\.\d+))(e[+-]?\d+)?%?\s*(\/\*([^*]|\*(?!\/))*\*\/)*\s*)?(\)|$)|lch\((\s*(\/\*([^*]|\*(?!\/))*\*\/)*\s*[+-]?((\d+)|(\d*\.\d+))(e[+-]?\d+)?\s*(\/\*([^*]|\*(?!\/))*\*\/)*\s*\s+){2}\s*(\/\*([^*]|\*(?!\/))*\*\/)*\s*[+-]?((\d+)|(\d*\.\d+))(e[+-]?\d+)?(deg|grad|rad|turn)?\s*(\/\*([^*]|\*(?!\/))*\*\/)*\s*(\/\s*(\/\*([^*]|\*(?!\/))*\*\/)*\s*[+-]?((\d+)|(\d*\.\d+))(e[+-]?\d+)?%?\s*(\/\*([^*]|\*(?!\/))*\*\/)*\s*)?(\)|$)|gray\(\s*(\/\*([^*]|\*(?!\/))*\*\/)*\s*[+-]?((\d+)|(\d*\.\d+))(e[+-]?\d+)?\s*(\/\*([^*]|\*(?!\/))*\*\/)*\s*\s+(\/\s*(\/\*([^*]|\*(?!\/))*\*\/)*\s*[+-]?((\d+)|(\d*\.\d+))(e[+-]?\d+)?%?\s*(\/\*([^*]|\*(?!\/))*\*\/)*\s*)?(\)|$)|color\(([^()]*|\([^)]*\))*(\)|$))\s*(\/\*([^*]|\*(?!\/))*\*\/)*\s*$

It can look really confusing, but if follow the code it's pretty logical. You can see the gist for it here: https://gist.github.com/clshortfuse/acffca0a5f1412edcf9e132f33febe1c

As for the point of the EOF parsing, ECMAScript doesn't support \z or \Z (different languages use different terms), so, it's just using $.

There are some things to note:

  • There is no negative integer check.
  • <alpha-value> isn't bound to [0-1] or [0% - 100%]
  • rgba is now alias for rgb so it can take 3 values (same with hsla/hsl)
  • rgb can take 4 values (same with hsl)
  • Whitespace to spec would be/[\n \t]/, but I'm using /\s/ to support both carriage return (\r) and form feed (\f).

I'm not sure about too sure about the performance though. It might faster to trim it first, and convert to lowercase and checking against the named colors first.


Edit: Forgot about comments, so I had to add it in. I added in device-cmyk(), hwb(), lch(), lab(), gray(), and color(). Tried hand optimizing a bit as well. color() seems is probably too complicated to implement with Regex, and device-cmyk() needs recursion for its fallback value.

@brendankenny
Copy link
Member

@clshortfuse if we still go this route the changes would be made upstream in cssstyle, where the different CSS color functions are split into separate regexes and a named color check, yes.

@patrickhulce
Copy link
Collaborator Author

To be clear, I was never attempting to claim it's impossible to fix all the issues in cssstyle :) I'm arguing that the large gap is evidence of the fact that we'd essentially have to fill this gap ourselves and maintain this library for very little LH user value.

To quote a great mind,

I personally don't think omnibox colors are that important to a PWA experience

^^ this. On top of the tiny percentage of users this affects makes this just not seem worth it.

@clshortfuse
Copy link

clshortfuse commented Sep 3, 2019

I wrote the crazy regex, and I kinda agree. It's going to be a cat and mouse game with standards. Validating the content of the color properties is really more of a courtesy than necessity.

Also, the Manifest spec itself is still in a working draft state. And the draft even lists that color processing steps are incorrect with a reference to this GitHub issue: w3c/manifest#760 So realistically speaking, the process we implement now may change again later. There's also no direction as to how to handle EOF or non-terminated functions ).

On another note, ensuring a string is parse-able (regex), parsing, and validating are all different things. Having to validate or parse the color is a bit more work than just ensuring it can be parsed. I also have no idea what to do with env(). In theory, any user-agent can support something like env(system-tint-color). For example, Android now has a "tint" color on the system theme that perhaps could be added as a user-agent environment variable in the future. But we have no way to be able to parse that. The objective, should be to ensure it can be parsed, not parse it, if done at all.

And if we're referencing an external library that's purpose is to validate (and not just parse), cssstyle has no support for device-cmyk(). Nor does it support color() which has complexities considering there are multiple color spaces. So, even today, it would be incomplete. And taking a quick glance in master that cssstyle is missing the ability to parse 4 parameter rgb(), doesn't support commaless functions, and is too strict on its angle parsing (the unit is not required). So it's not just a clean upgrade either. And we'd be waiting for them to not only build a validator, but also the parser.

Edit: My terminology was mixed up with parsing/validating. Ensuring "parse-able" would regex check on the string. Reading the values into Objects would be "parsing". Converting those Objects into other formats would be related to validation.

@connorjclark
Copy link
Collaborator

We could close the gap between implementation and our color validation by not writing our own color validation (or rather, relying on cssstlye to be correct).

when the color is invalid, blink pushes an error to the parser: https://cs.chromium.org/chromium/src/third_party/blink/renderer/modules/manifest/manifest_parser.cc?l=164&rcl=65b3e11ce92ca41af626cb68ec84b795c90453ad

These end up as console errors: https://cs.chromium.org/chromium/src/third_party/blink/renderer/modules/manifest/manifest_manager.cc?l=194&rcl=ab4ecdb6a2dacf0a35344c2baa9509a9659eec86

@patrickhulce
Copy link
Collaborator Author

I think that's a great approach @connorjclark, do we know if we need to listen on a specific target for those? I'm not seeing those messages in our existing errors-in-console audit results for sites that currently fail on the invalid color currently in HTTPArchive :/

@brendankenny
Copy link
Member

brendankenny commented Sep 3, 2019

that doesn't check meta theme-color, though, which is what #3891 wants to prioritize :)

@patrickhulce
Copy link
Collaborator Author

patrickhulce commented Sep 3, 2019

that doesn't check meta theme-color, though, which is what #3891 wants to prioritize :)

But conceptually we could extend it to the exact same thing if that became the priority of the PWA guidelines, no? As in, Chrome would then have a vested interest in validating the meta tag not just the manifest value.

@patrickhulce
Copy link
Collaborator Author

The only two places we use cssstyle today are for validating a color. My position is still that it's not worth bringing in and keeping up with this dependency for the little value most sites see from it.

We decided to pick up discussion next time though :)

@brendankenny
Copy link
Member

My position is still that it's not worth bringing in and keeping up with this dependency for the little value most sites see from it.

I'm more agnostic today about how we validate colors, but I do think some of the PWA audits suffer from a recency bias in these type of conversations.

In a world where we're happy to pull in jsonld and hundreds of KiB of schemas, Lighthouse being able to correctly validate a CSS color (through whatever method) is a pretty low bar :)

@patrickhulce
Copy link
Collaborator Author

While I cannot deny the applicability argument for your jsonld point (which I almost conclude with you on), the fact that "is this color valid" is such a low bar and easier to check without Lighthouse is also basically how I can rationalize the existence of the schema validation. The structured data validator was prescribed as the only remaining way to know if your structured data was nonsense. Still belongs as a plugin? Yeah probably, but the high complexity justifies its existence as a thing.

@patrickhulce
Copy link
Collaborator Author

We've decided to move forward with dropping the color checking here.

@connorjclark
Copy link
Collaborator

connorjclark commented Oct 12, 2022

Assuming we're still good with dropping valid color checking (in themed-omnibox and in manifest-parser)(#3891), let's do this for 10.0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants