Restrict allowed JavaScript MIME types #870

Open
evilpie opened this issue Feb 7, 2019 · 5 comments
Labels
security/privacy There are security or privacy implications topic: orb

Comments

@evilpie
Contributor

evilpie commented Feb 7, 2019

I am cautiously optimistic that we can change the handling of allowed JavaScript MIME types from a blocklist to an allowlist.

This list would include all the JavaScript MIME types, plus text/html, application/json, text/plain and empty (no Content-Type).
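A minimal sketch of what such an allowlist check could look like, in Python for illustration only. The names `essence` and `is_allowed_script_mime` are hypothetical, the JavaScript MIME type set is the one listed in the WHATWG MIME Sniffing standard, and the Content-Type parsing is deliberately simplified compared to what a browser actually does.

```python
from typing import Optional

# JavaScript MIME type essences as listed in the WHATWG MIME Sniffing standard.
JAVASCRIPT_MIME_TYPES = frozenset({
    "application/ecmascript", "application/javascript",
    "application/x-ecmascript", "application/x-javascript",
    "text/ecmascript", "text/javascript",
    "text/javascript1.0", "text/javascript1.1", "text/javascript1.2",
    "text/javascript1.3", "text/javascript1.4", "text/javascript1.5",
    "text/jscript", "text/livescript",
    "text/x-ecmascript", "text/x-javascript",
})

# Extra types the proposal would keep allowing for compatibility.
EXTRA_ALLOWED = frozenset({"text/html", "application/json", "text/plain"})


def essence(content_type: Optional[str]) -> str:
    """Return the lowercased type/subtype without parameters.

    Simplified assumption: a real implementation would follow the full
    MIME type parsing algorithm. A missing header maps to "".
    """
    if content_type is None:
        return ""
    return content_type.split(";", 1)[0].strip().lower()


def is_allowed_script_mime(content_type: Optional[str]) -> bool:
    """Allowlist check: empty, JavaScript, or one of the extra types."""
    e = essence(content_type)
    return e == "" or e in JAVASCRIPT_MIME_TYPES or e in EXTRA_ALLOWED
```

Under this sketch, `is_allowed_script_mime("text/javascript; charset=utf-8")` and `is_allowed_script_mime(None)` pass, while `is_allowed_script_mime("application/octet-stream")` does not.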

| MIME | Loads | % |
|---|---:|---:|
| javaScript | 9723904447 | 95.45% |
| text_html | 240640161 | 2.36% |
| empty | 79707178 | 0.78% |
| app_json | 77716915 | 0.76% |
| text_plain | 44977157 | 0.44% |
| unknown | 8032881 | 0.08% |
| image | 6772345 | 0.07% |
| app_octet_stream | 4899410 | 0.05% |
| app_xml | 787319 | 0.01% |
| text_json | 440959 | 0.00% |
| text_xml | 37279 | 0.00% |
| audio | 7459 | 0.00% |
| video | 61 | 0.00% |
| text_csv | 0 | 0.00% |
| Total | 10187923571 | |
Source: https://mzl.la/2SxxvNw

Note that we already block image/, which accounts for almost the same percentage as unknown. The unknown bucket covers all MIME types that are not explicitly enumerated.
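As a rough check on the numbers above, the share of script loads that the allowlist would newly block can be computed directly from the table. This is only a sketch: the bucket names are the telemetry labels from the table, the allowed set follows the proposal as stated (so `text_json` is treated as not allowed), and treating audio, video and text/csv as already blocked alongside image/ is an assumption based on the existing Fetch MIME type check, not something the table itself says.

```python
# Script load counts per telemetry bucket, copied from the table above.
LOADS = {
    "javaScript": 9723904447,
    "text_html": 240640161,
    "empty": 79707178,
    "app_json": 77716915,
    "text_plain": 44977157,
    "unknown": 8032881,
    "image": 6772345,
    "app_octet_stream": 4899410,
    "app_xml": 787319,
    "text_json": 440959,
    "text_xml": 37279,
    "audio": 7459,
    "video": 61,
    "text_csv": 0,
}

# Buckets the proposed allowlist would keep working.
ALLOWED = {"javaScript", "text_html", "empty", "app_json", "text_plain"}

# Assumption: buckets that are blocked today, before any allowlist change.
ALREADY_BLOCKED = {"image", "audio", "video", "text_csv"}

total = sum(LOADS.values())
newly_blocked = sum(
    count for bucket, count in LOADS.items()
    if bucket not in ALLOWED and bucket not in ALREADY_BLOCKED
)
newly_blocked_pct = 100 * newly_blocked / total

print(f"total loads:   {total}")
print(f"newly blocked: {newly_blocked} ({newly_blocked_pct:.2f}%)")
```

With these assumptions the newly blocked loads come to roughly 0.14% of all script loads, most of it from the unknown and application/octet-stream buckets.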

@annevk @mikewest

@dveditz
Member

dveditz commented Feb 7, 2019

I wonder if there's anything we can do to get all those text/html ones down.

@mikewest
Member

mikewest commented Feb 8, 2019

Chrome's numbers look a bit different:

Cross-origin scripts

| MIME | % of page views |
|---|---:|
| text/html | ~10% |
| text/plain | ~4% |
| application/octet-stream | ~1% |
| application/xml | ~1% |
| Other | ~25% |

Same-origin scripts

| MIME | % of page views |
|---|---:|
| text/html | ~2% |
| text/plain | ~0.3% |
| application/octet-stream | ~0.05% |
| application/xml | ~0.01% |
| Other | ~3% |

We might just be measuring different things. It looks like Mozilla's metrics use the number of scripts loaded as the denominator, while Chrome is measuring the number of pages on which any script had the given MIME type?

@evilpie
Contributor Author

evilpie commented Feb 8, 2019

> Other ~25%

That number is incredibly high. Sadly you don't seem to count application/json? Would this also include no Content-Type?

Would you assume that breaking cross-origin scripts would usually be less of a problem, assuming that a lot of those are tracking scripts?

> We might just be measuring different things. It looks like Mozilla's metrics use the number of scripts loaded as the denominator, while Chrome is measuring the number of pages on which any script had the given MIME type?

Yes, correct: this counts every script load. The number also includes ServiceWorker, Worker, etc., but those are so small compared to normal <script> loads that they are probably insignificant. We could add more counters later.

I am still surprised that the difference seems so high, but I don't have a good intuition on how those two measurements compare.

@mikewest
Member

> That number is incredibly high. Sadly you don't seem to count application/json? Would this also include no Content-Type?

Yes. "Other" is everything else, including application/json and the empty string.

> Would you assume that breaking cross-origin scripts would usually be less of a problem, assuming that a lot of those are tracking scripts?

Yes, that's exactly my intuition. Hence the separate metrics. :)

> I am still surprised that the difference seems so high, but I don't have a good intuition on how those two measurements compare.

I can imagine that Chromium's page-views-based number would look much higher than Mozilla's script-load-based number if there are a small number of very widely used scripts with incorrect MIME types. Facebook was in this category, as is VK, and a zillion ad scripts.
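To make the denominator effect concrete, here is a toy model in Python. Every figure in it is made up for illustration (1000 page views, 20 scripts per page, one widely embedded script served as text/html on a quarter of the pages); it is not derived from either browser's telemetry, and the helper names are hypothetical.

```python
from typing import List

JS = "text/javascript"


def script_load_share(pages: List[List[str]]) -> float:
    """Fraction of all script loads that had a non-JavaScript MIME type."""
    loads = [mime for page in pages for mime in page]
    return sum(mime != JS for mime in loads) / len(loads)


def page_view_share(pages: List[List[str]]) -> float:
    """Fraction of page views with at least one non-JavaScript script load."""
    return sum(any(mime != JS for mime in page) for page in pages) / len(pages)


# Hypothetical traffic: each page loads 20 scripts. On every fourth page,
# one of them is a popular third-party script served as text/html.
pages = []
for i in range(1000):
    page = [JS] * 19
    page.append("text/html" if i % 4 == 0 else JS)
    pages.append(page)

print(f"script-load based: {script_load_share(pages):.2%}")
print(f"page-view based:   {page_view_share(pages):.2%}")
```

In this toy model a single mislabeled script accounts for 1.25% of script loads but shows up on 25% of page views, which is the kind of gap the two sets of numbers could reflect.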

I think it's worth experimenting in this direction, and explicitly allowing text/html and application/json probably takes care of a large chunk of the potential breakage, but I think it'll be necessary to do some more research before I'd be able to convince Blink folks to ship this kind of change.

@annevk annevk added the security/privacy There are security or privacy implications label May 28, 2019
@annevk
Member

annevk commented May 28, 2019

It seems that even if we figure this out, #721 (comment) (CORB++) will still be needed because text/html and JSON are so prominent. Depending on the exact shape of this change, though, it might make for a simpler check there.
