Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

data: URL tests #6890

Merged
merged 20 commits into from
Jan 31, 2018
Merged
Show file tree
Hide file tree
Changes from 12 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
11 changes: 11 additions & 0 deletions fetch/data-urls/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,11 @@
== data: URLs ==

`resources/data-urls.json` contains `data:` URL tests. The tests are encoded as a JSON array. Each value in the array is an array of two or three values. The first value describes the input, the second value describes the expected MIME type, null if the input is expected to fail somehow, or the empty string if the expected value is `text/plain;charset=US-ASCII`. The third value, if present, describes the expected body as an array of integers representing bytes.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great documentation, this and the below!


These tests are used for `data:` URLs in this directory (see `processing.any.js`).

== Forgiving-base64 decode ==

`resources/base64.json` contains [forgiving-base64 decode](https://infra.spec.whatwg.org/#forgiving-base64-decode) tests. The tests are encoded as a JSON array. Each value in the array is an array of two values. The first value describes the input, the second value describes the output as an array of integers representing bytes or null if the input cannot be decoded.

These tests are used for `data:` URLs in this directory (see `base64.any.js`) and `window.atob()` in `../../html/webappapis/atob/base64.html`.
16 changes: 16 additions & 0 deletions fetch/data-urls/base64.any.js
Original file line number Diff line number Diff line change
@@ -0,0 +1,16 @@
promise_test(() => fetch("resources/base64.json").then(res => res.json()).then(runBase64Tests), "Setup.");
function runBase64Tests(tests) {
for(let i = 0; i < tests.length; i++) {
const input = tests[i][0],
output = tests[i][1],
dataURL = "data:;base64," + input;
promise_test(t => {
if(output === null) {
return promise_rejects(t, new TypeError(), fetch(dataURL));
}
return fetch(dataURL).then(res => res.arrayBuffer()).then(body => {
assert_array_equals(new Uint8Array(body), output);
});
}, "data: URL base64 handling: " + format_value(input));
}
}
23 changes: 23 additions & 0 deletions fetch/data-urls/processing.any.js
Original file line number Diff line number Diff line change
@@ -0,0 +1,23 @@
promise_test(() => fetch("resources/data-urls.json").then(res => res.json()).then(runDataURLTests), "Setup.");
function runDataURLTests(tests) {
for(let i = 0; i < tests.length; i++) {
const input = tests[i][0];
let expectedMimeType = tests[i][1];
const expectedBody = expectedMimeType !== null ? tests[i][2] : null;
promise_test(t => {
if(expectedMimeType === "") {
expectedMimeType = "text/plain;charset=US-ASCII";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Normalizing this in the data files would be better and clearer. Performing computation in the tests is best avoided if possible.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems somewhat useful to me to have a special value representing the fallback, since it's used so much, to make "text/plain;charset=UTF-8" stand out more if nothing else.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe, but copypasta is a risk too.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now I'm confused, I'm arguing in favor of keeping this as-is, isn't copypasta the least likely with the empty string as the value?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My reply was for @domenic. I also made my comment six minutes before yours according to the GitHub timestamp on it, so not sure why it appears to be later.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would using null work?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

null means failure, per line 6.

Copy link
Member

@domenic domenic Jan 9, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The larger principle here is that any substitution requires every client of these data files to read the documentation and duplicate the substitution logic themselves, instead of just being able to write an equality test.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, you have to read the documentation either way. I don't understand why the focus is on consumption so much as writing tests should be easy too. Maybe once this is implemented in more places and we have a good grasp on coverage we can simplify the consumption though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The focus is on being able to read the data file and tell what the test is. Code (and tests) are read more often than they are written.

If you're just going to generate tests from logic, instead of using a data file with the correct data, you might as well just use jsdom/data-url and check that the results match, instead of using a data file at all. That's obviously more extreme, but it's the same slippery slope.

}
if(expectedMimeType === null) {
return promise_rejects(t, new TypeError(), fetch(input));
} else {
return fetch(input).then(res => {
return res.arrayBuffer().then(body => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Optional style nit: simplify to .then(res => res.arrayBuffer())

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then res would no longer be in scope as is required here.

assert_array_equals(new Uint8Array(body), expectedBody);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aside: recalling some changes required in https://chromium-review.googlesource.com/c/chromium/src/+/786016 this made me a bit nervous, but looks like this won't run afoul of #7560.

assert_equals(res.headers.get("content-type"), expectedMimeType); // We could assert this earlier, but this fails often
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good comment, but since this is the case, I wonder if perhaps we should separate this into another test, especially for cases where both asserts will fail. Wrapping this in a separate test() will do the trick, although it's not my favorite thing in the world. WDYT?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My thinking was that all the additional fetch() and arrayBuffer() calls are better avoided.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The "wrap in a test() trick" would be to keep everything unchanged, but here do:
test(() => assert_equals(res.headers.get("content-type"), expectedMimeType), format_value(input) + ', content-type).

As I said, not my favorite thing, but it does the trick in a situation like this.

});
});
}
}, format_value(input));
}
}
79 changes: 79 additions & 0 deletions fetch/data-urls/resources/base64.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,79 @@
[
["", []],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since these aren't new to the spec change and copied from another file I've only reviewed lightly.

["abcd", [105, 183, 29]],
[" abcd", [105, 183, 29]],
["abcd ", [105, 183, 29]],
[" abcd===", null],
["abcd=== ", null],
["abcd ===", null],
["a", null],
["ab", [105]],
["abc", [105, 183]],
["abcde", null],
["𐀀", null],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Confirmed to be String.fromCharCode(0xd800, 0xdc00).

["=", null],
["==", null],
["===", null],
["====", null],
["=====", null],
["a=", null],
["a==", null],
["a===", null],
["a====", null],
["a=====", null],
["ab=", null],
["ab==", [105]],
["ab===", null],
["ab====", null],
["ab=====", null],
["abc=", [105, 183]],
["abc==", null],
["abc===", null],
["abc====", null],
["abc=====", null],
["abcd=", null],
["abcd==", null],
["abcd===", null],
["abcd====", null],
["abcd=====", null],
["abcde=", null],
["abcde==", null],
["abcde===", null],
["abcde====", null],
["abcde=====", null],
["=a", null],
["=a=", null],
["a=b", null],
["a=b=", null],
["ab=c", null],
["ab=c=", null],
["abc=d", null],
["abc=d=", null],
["ab\tcd", [105, 183, 29]],
["ab\ncd", [105, 183, 29]],
["ab\fcd", [105, 183, 29]],
["ab\rcd", [105, 183, 29]],
["ab cd", [105, 183, 29]],
["ab\u00a0cd", null],
["ab\t\n\f\r cd", [105, 183, 29]],
[" \t\n\f\r ab\t\n\f\r cd\t\n\f\r ", [105, 183, 29]],
["ab\t\n\f\r =\t\n\f\r =\t\n\f\r ", [105]],
["A", null],
["/A", [252]],
["//A", [255, 240]],
["///A", [255, 255, 192]],
["////A", null],
["/", null],
["A/", [3]],
["AA/", [0, 15]],
["AAAA/", null],
["AAA/", [0, 0, 63]],
["\u0000nonsense", null],
["abcd\u0000nonsense", null],
["YQ", [97]],
["YR", [97]],
["~~", null],
["..", null],
["--", null],
["__", null]
]
175 changes: 175 additions & 0 deletions fetch/data-urls/resources/data-urls.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,175 @@
[
["data://test/,X",
"",
[88]],
["data://test:test/,X",
null],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does this test case fail? Is it the URL parser that returns an error?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This URL loads an X document in Firefox and Chromium. (It doesn’t in Edge, but Edge appears to reject any data: URL with a MIME type parsing error.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah.

["data:,X",
"",
[88]],
["data:,X#X",
"",
[88]],
["data:,%FF",
"",
[255]],
["data:text/plain,X",
"text/plain",
[88]],
["data:text/plain ,X",
"text/plain",
[88]],
["data:text/plain%20,X",
"text/plain%20",
[88]],
["data:text/plain\f,X",
"text/plain%0c",
[88]],
["data:text/plain%0C,X",
"text/plain%0c",
[88]],
["data:text/plain;,X",
"text/plain",
[88]],
["data:;x=x;charset=x,X",
"text/plain;x=x;charset=x",
[88]],
["data:;x=x,X",
"text/plain;x=x",
[88]],
["data:text/plain;charset=windows-1252,%C2%B1",
"text/plain;charset=windows-1252",
[194, 177]],
["data:text/plain;Charset=UTF-8,%C2%B1",
"text/plain;charset=UTF-8",
[194, 177]],
["data:image/gif,%C2%B1",
"image/gif",
[194, 177]],
["data:IMAGE/gif,%C2%B1",
"image/gif",
[194, 177]],
["data:IMAGE/gif;hi=x,%C2%B1",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All browsers fail this test, but for different reasons. Chrome and Firefox say "image/gif", while Edge and Safari say "IMAGE/gif;hi=x". So all good.

"image/gif;hi=x",
[194, 177]],
["data:IMAGE/gif;CHARSET=x,%C2%B1",
"image/gif;charset=x",
[194, 177]],
["data: ,%FF",
"",
[255]],
["data:%20,%FF",
"",
[255]],
["data:\f,%FF",
"",
[255]],
["data:%1F,%FF",
"",
[255]],
["data:\u0000,%FF",
"",
[255]],
["data:%00,%FF",
"",
[255]],
["data:text/html ,X",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given whatwg/fetch#579 (comment) I think this would be a good place to have a negative test for the odd thing that Safari now has, for example with "data:text / html,X", and asserting that it's treated as an invalid MIME type. Should I just push that?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done!

"text/html",
[88]],
["data:†,X",
"",
[88]],
["data:†/†,X",
"%e2%80%a0/%e2%80%a0",
[88]],
["data:X,X",
"",
[88]],
["data:image/png,X X",
"image/png",
[88, 32, 88]],
["data:application/xml,X X",
"application/xml",
[88, 32, 88]],
["data:unknown/unknown,X X",
"unknown/unknown",
[88, 32, 88]],
["data:text/plain;a=\",\",X",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everyone fails this too, but again for different reasons.

"text/plain;a=\"\\\"\"",
[34, 44, 88]],
["data:text/plain;a=%2C,X",
"text/plain;a=%2C",
[88]],
["data:;base64;base64,WA",
"text/plain",
[88]],
["data:x/x;base64;base64,WA",
"x/x",
[88]],
["data:x/x;base64;charset=x;base64,WA",
"x/x;charset=x",
[88]],
["data:x/x;base64;base64x,WA",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In Chrome, Edge and Firefox this fails with "assert_array_equals: lengths differ, expected 2 got 1" and in Safari it fails on the following assert, which means the body did match. I take this to mean that Chrome, Edge and Firefox treat this as base64 while Safari does not.

I don't know what to expect web compat wise, but would there be anything wrong with aligning with the majority on this point?

"x/x",
[87, 65]],
["data:;base64,W%20A",
"",
[88]],
["data:;base64,W%0CA",
"",
[88]],
["data:x;base64x,WA",
"text/plain",
[87, 65]],
["data:x;base64;x,WA",
"text/plain",
[88]],
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per my tests this should be [87, 65] like the above (doesn't end in ;base64).

["data:x;base64=x,WA",
"text/plain;base64=x",
[87, 65]],
["data:; base64,WA",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trailing "; base64" is the most interesting remaining question for me. Here it would be useful to also add a "data:text/plain; base64,WA" case to isolate entirely from differences in MIME parsing. As it is this fails everywhere, and I think splitting out that difference would be useful. (When triaging failures, it'll usually make sense to look at tests that fail everywhere last.)

"",
[87, 65]],
["data:;base64;,WA",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one is also "assert_array_equals: lengths differ, expected 2 got 1" in all except Safari, which then fails because its ";base64;" instead of "text/plain". Same bucket as the above comment.

"text/plain",
[87, 65]],
["data:;base64 ,WA",
"",
[88]],
["data:;base 64,WA",
"text/plain",
[87, 65]],
["data:;BASe64,WA",
"",
[88]],
["data:;%62ase64,WA",
"text/plain",
[87, 65]],
["data:%3Bbase64,WA",
"",
[87, 65]],
["data:;charset=x,X",
"text/plain;charset=x",
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This pass condition should change to preserve the space before the lowercase x.

[88]],
["data:; charset=x,X",
"text/plain;charset=x",
[88]],
["data:;charset =x,X",
"text/plain",
[88]],
["data:;charset= x,X",
"text/plain;charset=\" x\"",
[88]],
["data:;charset=,X",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An empty string is not a valid token and hence this will result in text/plain;charset=US-ASCII, I guess.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'll result in text/plain. We only use the fallback if MIME type parsing returns missing or failure which it won't for text/plain;charset=.

"text/plain",
[88]],
["data:;charset,X",
"text/plain",
[88]],
["data:;charset=\"x\",X",
"text/plain;charset=\"x\"",
[88]],
["data:;CHARSET=\"X\",X",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this would be the place to test the questions in whatwg/fetch#579 (comment)? Or are such tests already here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, a little above this one. Line 163, 166, and various tests around those too.

"text/plain;charset=\"X\"",
[88]]
]
Loading