Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

javascript URLs that do not return strings is apparently wrong #1896

Open
domenic opened this issue Oct 13, 2016 · 29 comments
Open

javascript URLs that do not return strings is apparently wrong #1896

domenic opened this issue Oct 13, 2016 · 29 comments
Assignees
Labels
compat Standard is not web compatible or proprietary feature needs standardizing interop Implementations are not interoperable with each other topic: javascript: URLs topic: navigation

Comments

@domenic
Copy link
Member

domenic commented Oct 13, 2016

See @bzbarsky's comment at #1107 (comment).

@bzbarsky could you explain in more detail how the current spec is wrong? Is the idea that the response status should not be 204, and that it should have a Content-Type header, but it should still have the empty body?

@bzbarsky
Copy link
Contributor

"wrong" is a slightly loaded term. What I will claim is that the current spec does not match non-Gecko UAs, which it claimed to do, and is not web-compatible.

The latter problem is described in in #1895

For the former, a quick check suggests that at least Chrome and Safari seem to convert non-string values to the string "" and then proceed to load that as the response body. They also treat exceptions in the javascript: as "". Here's the testcase I used:

<!DOCTYPE html>
<input type="button" value="Load javascript:undefined"
       onclick="doLoad('undefined')">
<input type="button" value="Load javascript:1"
       onclick="doLoad('1')">
<input type="button" value="Load javascript:false"
       onclick="doLoad('false')">
<input type="button" value="Load javascript:null"
       onclick="doLoad('null')">
<input type="button" value="Load javascript:new String(&quot;foo&quot;)"
       onclick="doLoad('new String(&quot;foo&quot;)')">
<input type="button" value="Load javascript:new Object()"
       onclick="doLoad('new Object()')">
<input type="button" value="Load javascript:&quot;bar&quot;"
       onclick="doLoad('&quot;bar&quot;')">
<input type="button" value="Load javascript:noSuchFunc()"
       onclick="doLoad('noSuchFunc()')">
<hr>
<script>
  function doLoad(arg) {
    document.querySelector("iframe").src = "javascript:"+arg;
  }
  var blob = new Blob(["Original document"], { type: "text/plain" });
  var url = URL.createObjectURL(blob);
  var ifr = document.createElement("iframe");
  ifr.src = url;
  ifr.name = "iframe";
  document.body.appendChild(ifr);
</script>

This is one possible thing that could be specified, and would not suffer from the problem described in #1895. It would not match longstanding IE/Gecko/Netscape behavior for undefined, of course.

I haven't tested what Chrome/Safari do in other cases of interest, like:

  1. The origin check in https://html.spec.whatwg.org/multipage/browsers.html#javascript-protocol step 1 fails (iirc, that case is treated as 204 by them, but I didn't really test very carefully).
  2. Script is disabled.
  3. CSP blocks the execution (I actually have no clue where this is specified at all; certainly not in HTML or https://w3c.github.io/webappsec-csp/#script-src-inline; filed Interaction of CSP and javascript: URLs in iframe src is not defined anywhere w3c/webappsec-csp#127 on it).

or whatever other cases currently get commoned up into the "return value is not string" codepath.

@bzbarsky
Copy link
Contributor

Oh, and the point is that the "script is disabled" and "CSP blocks the execution" cases can still have it be observable whether the load event fires, so the interaction there would need to be specified not matter what, even if we take the Blink/WebKit approach for the cases in the testcase I pasted above.

@domenic
Copy link
Member Author

domenic commented Oct 13, 2016

For the former, a quick check suggests that at least Chrome and Safari seem to convert non-string values to the string "" and then proceed to load that as the response body. They also treat exceptions in the javascript: as "". Here's the testcase I used:

What I don't understand is why this is different from the current spec, which says

Process result: If Type(result) is not String, then set response to a response whose status is 204.

Run process a navigate response with resource, response, navigationType, the source browsing context, and browsingContext, and then abort these steps.

I'm probably just missing something, but that's the largest source of my confusion.

@domenic
Copy link
Member Author

domenic commented Oct 13, 2016

Oh, duh, it's because 204s abort in the next algorithm. I get it, sorry.

@bzbarsky
Copy link
Contributor

Oh, duh, it's because 204s abort in the next algorithm.

Right, exactly.

@domenic domenic self-assigned this Oct 15, 2016
domenic added a commit to web-platform-tests/wpt that referenced this issue Oct 15, 2016
domenic added a commit to web-platform-tests/wpt that referenced this issue Oct 15, 2016
@domenic
Copy link
Member Author

domenic commented Oct 15, 2016

OK, here is the problem. Your test case, with loading things into iframe srcs, shows that they do process the response bodies as empty strings. But the following test case shows that they don't, when clicking on <a> elements: https://jsbin.com/zokamoyoxe/edit?html,output

I don't really know what to make of this. Is this distinction make elsewhere in navigation? How do Blink and WebKit actually implement this!?

@bzbarsky
Copy link
Contributor

I don't really know what to make of this.

Well. See my (somewhat lazy, not backed by testing) comments about "not-initial javascript: loads" in #1895 (comment). But of course my testcase in this bug also has non-initial loads.

How do Blink and WebKit actually implement this!?

Last I looked, really crazily. ;) Starting with "not in navigation, unlike the spec". That's mostly what you can make of this.

Is this distinction make elsewhere in navigation?

Not particularly, no.

Note that it's worth testing what location.href sets do as well; those may have yet another behavior. Or maybe we're lucky and they match anchor clicks.

One piece of good news is that for web compat purposes we don't need to duplicate WebKit/Blink here in general.... The problem is that we're not sure what a saner web-compatible behavior is, apart from the current Firefox or Edge ones (which are different, note; I have no idea what Edge really does!).

@domenic
Copy link
Member Author

domenic commented Oct 15, 2016

At this point I'm leaning toward the current Firefox behavior, as it sounds rather easy to spec. But I feel like I should probably do more research and testing and noodling over the spec to see what variations would be less invasive...

@domenic
Copy link
Member Author

domenic commented Oct 15, 2016

Does using javascript: URLs in the following ways cause navigation? A per-browser breakdown.

javascript: URL usage Edge Chrome Firefox Safari
location.href assignment (non-string) ?
location.href assignment (string) X X X ?
Clicking an <a> (non-string) ?
Clicking an <a> (string) X X X ?
iframe.src assignment (non-string) X ?
iframe.src assignment (string) X X X ?

Leaving Safari as ?s since I don't have my Safari tech preview computer for the weekend, but my guess is it's going to be the same as Chrome.

I might have gotten turned around, but this implies that Edge and Firefox are compatible with the current spec. But @bzbarsky, you say that that's not web-compatible? Despite Edge and Firefox shipping it to stable?

@bzbarsky
Copy link
Contributor

The main web compat requirement I'm aware of for the non-string case is this:

<iframe src="javascript:non-string-non-undefined"></iframe>

needs to fire a load event. Certainly when non-string-non-undefined is a boolean false, because the jQuery fileupload library relies on that, but possibly in other cases too; I didn't carefully analyze all the sites that had this problem. Note that this says nothing about whether a navigation happens; I haven't tested that carefully, not least because it's not trivial to detect given that UAs disagree with the spec in various other ways around javascript: URLs in general (e.g. what the URL of the document resulting from the navigation is). We could add tests for this, of course.

In terms of what's shipped, Firefox until Firefox 49 fired the load event in that situation for non-undefined return values, because it did a navigation (to the ToString() of the return value). Firefox 49 aligned with the spec, which does NOT fire a load event in that situation, and this broke a bunch of websites; we did not get any bug reports until shipping to stable because all the sites involved require an account and we didn't have any pre-release users with accounts on those sites who noticed the issue, apparently. :( We're likely to do a point release of Firefox 49 soon, backing out my change to align with the current spec. All the non-release branches have already been updated with the backout.

As for Edge, I have verified that it does fire a load event in the case of:

<iframe src="javascript:false"></iframe>

but I have not investigated in detail whether it performs a navigation or not, in that situation and others; testing Edge on my end is a bit of a pain.

@bzbarsky
Copy link
Contributor

So to be clear, the known-web-compatible "Firefox behavior" is not found in current release Firefox 49, nor in Firefox nightlies from a few days ago (before the backout). You can see it in a nightly from 2016-10-15 (today), though.

@bzbarsky
Copy link
Contributor

And note that "non-string" in the Firefox behavior contains two buckets that have different behaviors: there's undefined and everything else.

@annevk
Copy link
Member

annevk commented Jul 27, 2017

When exactly does <iframe src="javascript:false"></iframe> need to fire a load event? Does <iframe>.src = "javascript:false" at some arbitrary point in time also need to cause a load event? (That is, could it just be the load event from creating and appending the iframe element, assuming we ever start doing that consistently for about:blank.)

@annevk annevk changed the title javascript: URLs that do not return strings is apparently wrong javascript URLs that do not return strings is apparently wrong Jul 27, 2017
@bzbarsky
Copy link
Contributor

When exactly does <iframe src="javascript:false"></iframe> need to fire a load event?

Unclear. The failure symptoms on the web were that if it never fired one, sites would "hang" because they were waiting for that event to do other stuff.

https://bugzilla.mozilla.org/show_bug.cgi?id=1307420#c9 has a quote of the relevant jQuery plugin code. You may be able to find other information in https://bugzilla.mozilla.org/show_bug.cgi?id=1306472 and its various duplicates...

It's possible that treating this load as a 204 but still firing a load event like we would for no src at all or src="about:blank" might be good enough. I just can't tell you.

Does <iframe>.src = "javascript:false" at some arbitrary point in time also need to cause a load event?

Unknown. Worth checking whether it does in browsers.

@annevk
Copy link
Member

annevk commented Jul 31, 2017

<!DOCTYPE html>
<script>w("before iframe")</script>
<iframe onload=w('load') src=javascript:false></iframe>
<script>w("after iframe")</script>

With w() logging ends up as before iframe, after iframe, load in Edge and Firefox and before iframe, load, after iframe in Chrome and Safari TP.

<!DOCTYPE html>
<iframe onload=w('load')></iframe>
<script>
setTimeout(() => {
  w("timeout")
  document.querySelector("iframe").src = "javascript:false"
  w("after src")
}, 200)
</script>

Ends up as load, timeout, after src, load in all browsers, except Edge doesn't log the final load.

@bzbarsky
Copy link
Contributor

As far as Chrome and Safari go, they do weird sync stuff with javascript: even in the string case. Most simply:

<!DOCTYPE html>
<script>w("before iframe")</script>
<iframe onload=w('load') src=javascript:"hello"></iframe>
<script>w("after iframe")</script>

logs: before iframe, load, load (yes, twice), after iframe in Chrome and Safari.

Also note that this testcase:

<!DOCTYPE html>
<iframe onload=w('load') src="javascript:'Does this stick around?'"></iframe>
<script>
setTimeout(() => {
  w("timeout")
  document.querySelector("iframe").src = "javascript:false"
  w("after src")
}, 2000)
</script>

shows that the load of false is not treated as a 204 in Chrome and Safari at all: it makes the previous document go away instead. Same thing if I use http://software.hixie.ch/utilities/ for the src. Similar for:

<!DOCTYPE html>
<iframe onload=w('load') src="javascript:'Does this stick around?'"></iframe>
<script>
setTimeout(() => {
  w("timeout")
  document.querySelector("iframe").src = "javascript:undefined"
  w("after src")
}, 2000)
</script>

though that one is treated as a 204 in Firefox and Edge.

@annevk
Copy link
Member

annevk commented Jul 31, 2017

I'd be okay with just loading an empty 200. I guess effectively treating them as navigating to about:blank?

@bzbarsky
Copy link
Contributor

I guess we have no indication that "sync stuff happens" based on those testcases, as opposed to "iframe with javascript: src blocks the parser until its src runs". But this testcase:

<!DOCTYPE html>
<script>
  w("before iframe");
  var i = document.createElement("iframe");
  i.setAttribute("onload", "w('load')");
  i.src = "javascript:'hello'";
  document.documentElement.appendChild(i);
  w("after iframe");
</script>

shows that it really is "sync stuff happens", complete with the two firings of onload.

I'd be okay with just loading an empty 200

For which cases?

I guess effectively treating them as navigating to about:blank?

Whatever cases these are, why not just treat them as if the javascript: returned "" for simplicity, instead of having to shoehorn about:blank in there?

Note that I did point out some of these aspects of the behavior in #1896 (comment) along with my list of things that still need testing...

@bzbarsky
Copy link
Contributor

Per https://bugzilla.mozilla.org/show_bug.cgi?id=1382035 there is now a web-compat requirement that <a href="javascript: foo()"> not perform navigation when foo returns a Promise. Per the table above, this is something Chrome already does for <a> but not <iframe>. It's not something Firefox does, but due to the web compat issues we are running into we will be adding a special-case for Promise return values specifically in our javascript: handling.

It would be good to get this thing sorted out, given that it's causing real-life compat pain. @domenic can Chrome align closer to the spec for its javascript: handling, or failing that create good documentation for what it's actually doing?

@domenic
Copy link
Member Author

domenic commented Aug 1, 2018

:(

I can ask around about aligning Chrome closer to the spec, although it sounds like the spec is not web-compatible and we'd need to work on a better spec first. I haven't had time for that myself; again I can ask around, but perhaps Mozilla could help?

As for Chrome creating documentation on what we're doing, again I can ask around. I suppose Chrome engineers are better positioned to read their source code and transcribe it into prose than other folks would be...

@bzbarsky
Copy link
Contributor

bzbarsky commented Aug 1, 2018

The part of the spec that is not web-compatible is the load event behavior, not the handling of javascript: per se, for what it's worth.

@bzbarsky
Copy link
Contributor

bzbarsky commented Aug 1, 2018

But in general, I can obviously spec what Firefox does, say. The question is whether Chrome would be willing to implement it. If not, what are the constraints on what Chrome is willing to implement?

@domenic
Copy link
Member Author

domenic commented Aug 1, 2018

Hmm I guess I don't understand the load event connection, but maybe I need to page more of this area/thread into my memory.

I know @zetafunction was looking in to modernizing Chrome's javascript: handling, specifically wanting to make it async, in #3730. That does indeed point to some constraints, or at least wishes. Let me try to get him specifically to chime in.

@bzbarsky
Copy link
Contributor

bzbarsky commented Aug 1, 2018

The load event connection is described in #1895

@foolip foolip added compat Standard is not web compatible or proprietary feature needs standardizing interop Implementations are not interoperable with each other labels Aug 1, 2018
@zetafunction
Copy link

I think we're willing to implement what Firefox specs. Unfortunately, there's a lot of things that are broken about javascript: URLs in Chrome. From my perspective, the two things we should fix first are:

  1. run the Javascript URL processing steps asynchronously
  2. the double load event (I have no idea why this is happening without digging in more; is there a bug already filed for this?)

@bzbarsky Am I correct in understanding that Firefox matches the spec queues a task to run the javascript: URL steps in https://html.spec.whatwg.org/multipage/browsing-the-web.html#javascript-protocol? I'd like to make sure that any changes we make here bring us closer to the spec and Firefox.

@bzbarsky
Copy link
Contributor

bzbarsky commented Aug 2, 2018

So generally speaking, what Firefox implements aims to match the current spec, with to my knowledge two exceptions. One is what the final URL of a javascript: navigation is, and we are planning to align to the spec on that. The other is that https://html.spec.whatwg.org/multipage/browsing-the-web.html#javascript-protocol step 12 is instead:

  1. If IsPromise(result) is true, then set result to undefined.
  2. If Type(result) is not Undefined, set result to ToString(result). If that throws an exception, report the exception and set result to undefined.
  3. If Type(result) is Undefined, then set response to a response whose status is 204. Otherwise, ... (what the spec has in its step 12, since at this point we know result is a string).

There might be some complications around the exception-reporting part in terms of sync vs async.

@foolip
Copy link
Member

foolip commented Aug 16, 2018

@zetafunction do we have an open issue for "run the Javascript URL processing steps asynchronously"? It looks to at a glance like nothing in https://wpt.fyi/results/html/browsers/browsing-the-web/navigating-across-documents is targeting this behavior, does anyone know if there are other tests for it? @bzbarsky, perhaps in Gecko?

@domenic
Copy link
Member Author

domenic commented Aug 16, 2018

We have #3730 but last I asked @zetafunction he said that @annevk convinced him that was not the right way to go. If so maybe closing that one and opening a new one more targeted at javascript: URLs would be helpful.

@foolip
Copy link
Member

foolip commented Aug 16, 2018

I was thinking of an issue on crbug.com, but sounds like any change is still blocked on some spec changes?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compat Standard is not web compatible or proprietary feature needs standardizing interop Implementations are not interoperable with each other topic: javascript: URLs topic: navigation
Development

No branches or pull requests

6 participants
@foolip @domenic @bzbarsky @annevk @zetafunction and others