Frozen base URL definition is broken #1060

bzbarsky · 2016-04-15T16:41:33Z

https://html.spec.whatwg.org/multipage/semantics.html#set-the-frozen-base-url step 2 says "Parse the value of element's href content attribute relative to document". This is wrong. It should be relative to the document's fallback base URL. The href getter defined immediately afterward gets this right...

The text was updated successfully, but these errors were encountered:

bzbarsky · 2016-04-15T17:57:00Z

This got broken by 30bc2557 looks like. Before that, it used to say:

<p>To <dfn>set the frozen base URL</dfn>, <span data-x="resolve a URL">resolve</span> the value of
the element's <code data-x="attr-base-href">href</code> content attribute relative to the
<code>Document</code>'s <span>fallback base URL</span>; if this is successful, set the
<span>frozen base URL</span> to the <span>resulting absolute URL</span>, otherwise, set the
<span>frozen base URL</span> to the <span>fallback base URL</span>.</p>

annevk · 2016-04-16T05:44:40Z

This whole setup seems broken. As far as I can tell the href IDL attribute reflects the href content attribute, but that is not what the specification says. E.g., per the specification setting it to "https://test:test/" should end up returning the fallback base URL, but browsers will just return the literal value. Now that might not end up being used as base URL of course, but that's something else again.

annevk · 2016-04-16T05:53:15Z

We can't use reflect however since we have a custom base URL. So, the getter should simply return the element's internal base URL, serialized, if it's non-null, and the href attribute value otherwise, and the setter should set the href attribute. We should also take action upon every href attribute value set, not only when it's the first base element in a document. And make that set the internal base URL as appropriate, taking into account CSP.

@mikewest is it fine for CSP to run "Is base allowed for Document?" for each base element?

mikewest · 2016-04-16T06:16:24Z

@annevk: That would potentially end up creating multiple violation reports. Not the end of the world, but it would be better to process things once.

annevk · 2016-04-16T06:54:07Z

Okay, so we should do that whenever we determine whether the document base URL needs to change.

bzbarsky · 2016-04-17T03:41:24Z

I didn't test what browsers do about the href getter, and that's not what this issue is primarily about. This issue is about the fact that the current definition of the document base URL is circular, since it depends on the <base> tag and the frozen base url of that is currently defined (incorrectly) to depend on the document base URL.

In terms of what the href getter does, Gecko will try to parse the attr value relative the the document URL (not the fallback base URL; that's a bug) and if that fails return the attr value. That's clearly not what the spec says right now (e.g. it should return the empty string on parse failure as the spec is currently written), and I don't think it would be a problem to change the behavior if other UAs follow the spec. If UAs agree with each other but not the spec, and do something moderately sane in the parse failure case, it's probably better to do what UAs do, of course.

annevk · 2016-04-18T06:13:45Z

Hmm indeed,

<base href=/test/>
<script>
  var b = document.createElement("base")
  w(b.href)
</script>

logs the document URL, not the document base URL, in both Firefox and Chrome.

annevk · 2016-04-18T06:38:33Z

For the failure scenario,

<script>
  var b = document.createElement("base")
  b.href = "//test:test"
  w(b.href)
</script>

Firefox returns the given value, but Chrome and Safari return the empty string. Don't have Edge.

annevk · 2016-04-18T06:45:51Z

<iframe onload="var b=this.contentDocument.createElement('base');b.href='test';w(b.href)"></iframe>

returns "test" in Firefox and the empty string in Chrome and Safari.

annevk · 2016-04-18T07:29:30Z

I also tested OP's assertion that the first base element in the document would use the fallback base URL (which would make its href getter weirdly inconsistent I think if that did not use that concept). This does not seem true for about:blank:

<iframe onload="var d=this.contentDocument,b=d.createElement('base');b.href='t';d.head.appendChild(b);w(d.baseURI);w(b.href)"></iframe>

It does seem true for srcdoc, but only in Firefox, and it indeed weirdly inconsistent:

<iframe srcdoc="" onload="var d=this.contentDocument,b=d.createElement('base');b.href='t/';d.head.appendChild(b);w(d.baseURI);w(b.href)"></iframe>

logs http://.../t/ (a full base URL) and t/. Chrome and Safari support this scenario just as bad as about:blank.

(Use http://software.hixie.ch/utilities/js/live-dom-viewer/ to run these examples for yourself.)

So I guess this raises these questions:

Should the fallback base URL be used since it appears to be not used by Chrome and Safari. (I think it should be used.)
Should about:blank and srcdoc behave the same here. (I think so.)
Should the href getter use the fallback base URL? (I think so.)
Should we use the empty string in the failure case? (I think so, though I could go either way on this one given that returning the input value is more consistent with other URL getters. @zcorpan?)

Fixes #1060.

annevk · 2016-04-18T07:42:56Z

I created a PR that addresses this per my assumptions of how we think this should work. @foolip @bzbarsky, @hober, would appreciate your input.

zcorpan · 2016-04-18T09:19:45Z

Re 4: I noticed the failure case issue in #561 but missed to fix it. Per #561 (comment) there is little interop. I think I prefer returning the input so it's more consistent.

bzbarsky · 2016-04-18T13:15:23Z

@annevk Before I start looking at the PR, can you explain to me what you're trying to change? Are you changing what the href getter returns, or are you changing what the actual base URL of the document gets set to (what this issue was about), or both?

Have you put your testcases somewhere publicly loadable? I'm happy to test IE and Edge for you, but don't really want to try to recreate your testcases in that environment; even loading a URL is pain enough there, much less typing things.

annevk · 2016-04-18T13:40:16Z

Both.

bzbarsky · 2016-04-18T20:52:27Z

http://software.hixie.ch/utilities/js/live-dom-viewer/saved/4079

Safari: Logs "http://software.hixie.ch/utilities/js/live-dom-viewer/saved/4079" (document URI, or fallback base URI; it's the same in this case, so hard to tell).
Chrome: Same as Safari.
Firefox: Same as Safari.
IE11: Logs empty string.
Edge: Same as Safari.

http://software.hixie.ch/utilities/js/live-dom-viewer/saved/4080

Safari: Logs empty string.
Chrome: Same as Safari.
Firefox: Logs "//test:test".
IE11: Logs "http://test:test".
Edge: Same as IE11.

http://software.hixie.ch/utilities/js/live-dom-viewer/saved/4081

Safari: Logs empty string.
Chrome: Same as Safari.
Firefox: Logs "test".
IE11: Logs "http://software.hixie.ch/utilities/js/live-dom-viewer/test".
Edge: Logs "http://software.hixie.ch/utilities/js/live-dom-viewer/test".

http://software.hixie.ch/utilities/js/live-dom-viewer/saved/4082

Safari: Logs null and empty string. But note that document.baseURI is known broken in WebKit (and Chrome? can't recall whether they fixed it) and doesn't match the actual URI used for relative URI resolution in various cases.
Chrome: Logs "about:blank" and empty string.
Firefox: Logs "about:blank" and "t".
IE11: Logs undefined and "http://software.hixie.ch/utilities/js/live-dom-viewer/t".
Edge: Logs "http://software.hixie.ch/utilities/js/live-dom-viewer/t" twice.

http://software.hixie.ch/utilities/js/live-dom-viewer/saved/4083

Safari: Logs null and empty string.
Chrome: Logs "http://software.hixie.ch/utilities/js/live-dom-viewer/?saved=4083" and empty string.
Firefox: Logs "http://software.hixie.ch/utilities/js/live-dom-viewer/t/" and "t/".
IE11: Logs undefined and "http://software.hixie.ch/utilities/js/live-dom-viewer/t".
Edge: Logs "http://software.hixie.ch/utilities/js/live-dom-viewer/t" twice.

What behavior are you aiming for with the PR on these testcases?

Oh, and as far as the Firefox behavior here, some notes:

The "fallback base URI" implementation in Gecko doesn't work correctly for "about:blank" yet, since that case is only pretty recently defined: it just returns the document URI in that case. There are some code comments to that effect. That explains why you see the inconsistency between srcdoc and about:blank there. The about:blank setup in Firefox is kinda weird: we store an explicit base URI override at the moment, and it gets clobbered if you add in a element, which is what you see on http://software.hixie.ch/utilities/js/live-dom-viewer/?saved=4082.
As I mentioned on IRC, in Gecko if the href value can't be parsed as a URI .href will return the original value; that explains what's going on with http://software.hixie.ch/utilities/js/live-dom-viewer/saved/4080
The .href getter in Firefox resolves the attribute value relative to the document URI, not the document's fallback base URI. That explains what's going on with http://software.hixie.ch/utilities/js/live-dom-viewer/?saved=4081 and http://software.hixie.ch/utilities/js/live-dom-viewer/saved/4083. I consider this behavior a Gecko bug that we should just fix.

annevk · 2016-04-19T08:23:59Z

Apart from the failure scenario (for which the PR matches Safari/Chrome, but should probably be updated to match Firefox per @zcorpan's comment), the PR matches Edge (I'm assuming you have some copypasta and Edge actually logs a trailing slash as well for the final test).

Currently the PR also takes into account a "frozen base URL" if the element has one, so that when CSP applies the href getter returns the fallback base URL and not the base URL that was blocked, but I should probably simply remove that since it's not worth the complexity (just use document.baseURI if you want to know the base URL).

bzbarsky · 2016-04-19T13:19:49Z

I'm assuming you have some copypasta and Edge actually logs a trailing slash as well for the final test

Er, right you are. Sorry about that.

I assume the "failure scenario" is http://software.hixie.ch/utilities/js/live-dom-viewer/?saved=4080? If so, I'm ok with either the WebKit or Gecko behavior there, I guess; there's no way to win.

I agree that having .href just return the thing you would set the base URL to if it were not for CSP is a lot more useful than returning the fallback base URL in that situation.

The changes in the PR look good with those two caveats.

Always parse <base> elements against the document's fallback base URL. When parsing fails, returns their href attribute value. (Note that it can only fail if there is an href attribute value.) This matches the semantics of "reflect", but does not directly use that mechanism since it does not support arbitrary base URLs and such. Fixes #1060. (That issue also has links to tests and results for contemporary browsers.)

annevk · 2016-04-20T06:01:00Z

The PR that landed uses the Gecko handling for failure and does not let CSP have an effect on the href getter.

annevk · 2016-04-20T06:01:11Z

Thank you for reporting this!

bzbarsky · 2016-04-20T14:48:08Z

I included some web platform tests in the patch I just put up in https://bugzilla.mozilla.org/show_bug.cgi?id=1266077

bzbarsky assigned annevk Apr 15, 2016

annevk added a commit that referenced this issue Apr 18, 2016

Correct URL parsing of base elements

732a605

Fixes #1060.

annevk added the compat Standard is not web compatible or proprietary feature needs standardizing label Apr 18, 2016

annevk mentioned this issue Apr 18, 2016

Correct URL parsing of base elements #1064

Merged

annevk closed this as completed in #1064 Apr 20, 2016

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Frozen base URL definition is broken #1060

Frozen base URL definition is broken #1060

bzbarsky commented Apr 15, 2016

bzbarsky commented Apr 15, 2016

annevk commented Apr 16, 2016

annevk commented Apr 16, 2016

mikewest commented Apr 16, 2016

annevk commented Apr 16, 2016

bzbarsky commented Apr 17, 2016

annevk commented Apr 18, 2016

annevk commented Apr 18, 2016

annevk commented Apr 18, 2016

annevk commented Apr 18, 2016

annevk commented Apr 18, 2016

zcorpan commented Apr 18, 2016

bzbarsky commented Apr 18, 2016

annevk commented Apr 18, 2016

bzbarsky commented Apr 18, 2016 •

edited

Loading

annevk commented Apr 19, 2016

bzbarsky commented Apr 19, 2016

annevk commented Apr 20, 2016

annevk commented Apr 20, 2016

bzbarsky commented Apr 20, 2016

Frozen base URL definition is broken #1060

Frozen base URL definition is broken #1060

Comments

bzbarsky commented Apr 15, 2016

bzbarsky commented Apr 15, 2016

annevk commented Apr 16, 2016

annevk commented Apr 16, 2016

mikewest commented Apr 16, 2016

annevk commented Apr 16, 2016

bzbarsky commented Apr 17, 2016

annevk commented Apr 18, 2016

annevk commented Apr 18, 2016

annevk commented Apr 18, 2016

annevk commented Apr 18, 2016

annevk commented Apr 18, 2016

zcorpan commented Apr 18, 2016

bzbarsky commented Apr 18, 2016

annevk commented Apr 18, 2016

bzbarsky commented Apr 18, 2016 • edited Loading

annevk commented Apr 19, 2016

bzbarsky commented Apr 19, 2016

annevk commented Apr 20, 2016

annevk commented Apr 20, 2016

bzbarsky commented Apr 20, 2016

bzbarsky commented Apr 18, 2016 •

edited

Loading