Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Frozen base URL definition is broken #1060

Closed
bzbarsky opened this issue Apr 15, 2016 · 20 comments
Closed

Frozen base URL definition is broken #1060

bzbarsky opened this issue Apr 15, 2016 · 20 comments
Assignees
Labels
compat Standard is not web compatible or proprietary feature needs standardizing

Comments

@bzbarsky
Copy link
Contributor

https://html.spec.whatwg.org/multipage/semantics.html#set-the-frozen-base-url step 2 says "Parse the value of element's href content attribute relative to document". This is wrong. It should be relative to the document's fallback base URL. The href getter defined immediately afterward gets this right...

@bzbarsky
Copy link
Contributor Author

This got broken by 30bc2557 looks like. Before that, it used to say:

<p>To <dfn>set the frozen base URL</dfn>, <span data-x="resolve a URL">resolve</span> the value of
the element's <code data-x="attr-base-href">href</code> content attribute relative to the
<code>Document</code>'s <span>fallback base URL</span>; if this is successful, set the
<span>frozen base URL</span> to the <span>resulting absolute URL</span>, otherwise, set the
<span>frozen base URL</span> to the <span>fallback base URL</span>.</p>

@annevk
Copy link
Member

annevk commented Apr 16, 2016

This whole setup seems broken. As far as I can tell the href IDL attribute reflects the href content attribute, but that is not what the specification says. E.g., per the specification setting it to "https://test:test/" should end up returning the fallback base URL, but browsers will just return the literal value. Now that might not end up being used as base URL of course, but that's something else again.

@annevk
Copy link
Member

annevk commented Apr 16, 2016

We can't use reflect however since we have a custom base URL. So, the getter should simply return the element's internal base URL, serialized, if it's non-null, and the href attribute value otherwise, and the setter should set the href attribute. We should also take action upon every href attribute value set, not only when it's the first base element in a document. And make that set the internal base URL as appropriate, taking into account CSP.

@mikewest is it fine for CSP to run "Is base allowed for Document?" for each base element?

@mikewest
Copy link
Member

@annevk: That would potentially end up creating multiple violation reports. Not the end of the world, but it would be better to process things once.

@annevk
Copy link
Member

annevk commented Apr 16, 2016

Okay, so we should do that whenever we determine whether the document base URL needs to change.

@bzbarsky
Copy link
Contributor Author

I didn't test what browsers do about the href getter, and that's not what this issue is primarily about. This issue is about the fact that the current definition of the document base URL is circular, since it depends on the <base> tag and the frozen base url of that is currently defined (incorrectly) to depend on the document base URL.

In terms of what the href getter does, Gecko will try to parse the attr value relative the the document URL (not the fallback base URL; that's a bug) and if that fails return the attr value. That's clearly not what the spec says right now (e.g. it should return the empty string on parse failure as the spec is currently written), and I don't think it would be a problem to change the behavior if other UAs follow the spec. If UAs agree with each other but not the spec, and do something moderately sane in the parse failure case, it's probably better to do what UAs do, of course.

@annevk
Copy link
Member

annevk commented Apr 18, 2016

Hmm indeed,

<base href=/test/>
<script>
  var b = document.createElement("base")
  w(b.href)
</script>

logs the document URL, not the document base URL, in both Firefox and Chrome.

@annevk
Copy link
Member

annevk commented Apr 18, 2016

For the failure scenario,

<script>
  var b = document.createElement("base")
  b.href = "//test:test"
  w(b.href)
</script>

Firefox returns the given value, but Chrome and Safari return the empty string. Don't have Edge.

@annevk
Copy link
Member

annevk commented Apr 18, 2016

<iframe onload="var b=this.contentDocument.createElement('base');b.href='test';w(b.href)"></iframe>

returns "test" in Firefox and the empty string in Chrome and Safari.

@annevk
Copy link
Member

annevk commented Apr 18, 2016

I also tested OP's assertion that the first base element in the document would use the fallback base URL (which would make its href getter weirdly inconsistent I think if that did not use that concept). This does not seem true for about:blank:

<iframe onload="var d=this.contentDocument,b=d.createElement('base');b.href='t';d.head.appendChild(b);w(d.baseURI);w(b.href)"></iframe>

It does seem true for srcdoc, but only in Firefox, and it indeed weirdly inconsistent:

<iframe srcdoc="" onload="var d=this.contentDocument,b=d.createElement('base');b.href='t/';d.head.appendChild(b);w(d.baseURI);w(b.href)"></iframe>

logs http://.../t/ (a full base URL) and t/. Chrome and Safari support this scenario just as bad as about:blank.

(Use http://software.hixie.ch/utilities/js/live-dom-viewer/ to run these examples for yourself.)

So I guess this raises these questions:

  1. Should the fallback base URL be used since it appears to be not used by Chrome and Safari. (I think it should be used.)
  2. Should about:blank and srcdoc behave the same here. (I think so.)
  3. Should the href getter use the fallback base URL? (I think so.)
  4. Should we use the empty string in the failure case? (I think so, though I could go either way on this one given that returning the input value is more consistent with other URL getters. @zcorpan?)

annevk added a commit that referenced this issue Apr 18, 2016
@annevk annevk added the compat Standard is not web compatible or proprietary feature needs standardizing label Apr 18, 2016
@annevk
Copy link
Member

annevk commented Apr 18, 2016

I created a PR that addresses this per my assumptions of how we think this should work. @foolip @bzbarsky, @hober, would appreciate your input.

@zcorpan
Copy link
Member

zcorpan commented Apr 18, 2016

Re 4: I noticed the failure case issue in #561 but missed to fix it. Per #561 (comment) there is little interop. I think I prefer returning the input so it's more consistent.

@bzbarsky
Copy link
Contributor Author

@annevk Before I start looking at the PR, can you explain to me what you're trying to change? Are you changing what the href getter returns, or are you changing what the actual base URL of the document gets set to (what this issue was about), or both?

Have you put your testcases somewhere publicly loadable? I'm happy to test IE and Edge for you, but don't really want to try to recreate your testcases in that environment; even loading a URL is pain enough there, much less typing things.

@bzbarsky
Copy link
Contributor Author

bzbarsky commented Apr 18, 2016

http://software.hixie.ch/utilities/js/live-dom-viewer/saved/4079

http://software.hixie.ch/utilities/js/live-dom-viewer/saved/4080

  • Safari: Logs empty string.
  • Chrome: Same as Safari.
  • Firefox: Logs "//test:test".
  • IE11: Logs "http://test:test".
  • Edge: Same as IE11.

http://software.hixie.ch/utilities/js/live-dom-viewer/saved/4081

http://software.hixie.ch/utilities/js/live-dom-viewer/saved/4082

http://software.hixie.ch/utilities/js/live-dom-viewer/saved/4083

Safari: Logs null and empty string.
Chrome: Logs "http://software.hixie.ch/utilities/js/live-dom-viewer/?saved=4083" and empty string.
Firefox: Logs "http://software.hixie.ch/utilities/js/live-dom-viewer/t/" and "t/".
IE11: Logs undefined and "http://software.hixie.ch/utilities/js/live-dom-viewer/t".
Edge: Logs "http://software.hixie.ch/utilities/js/live-dom-viewer/t" twice.

What behavior are you aiming for with the PR on these testcases?

Oh, and as far as the Firefox behavior here, some notes:

  1. The "fallback base URI" implementation in Gecko doesn't work correctly for "about:blank" yet, since that case is only pretty recently defined: it just returns the document URI in that case. There are some code comments to that effect. That explains why you see the inconsistency between srcdoc and about:blank there. The about:blank setup in Firefox is kinda weird: we store an explicit base URI override at the moment, and it gets clobbered if you add in a element, which is what you see on http://software.hixie.ch/utilities/js/live-dom-viewer/?saved=4082.
  2. As I mentioned on IRC, in Gecko if the href value can't be parsed as a URI .href will return the original value; that explains what's going on with http://software.hixie.ch/utilities/js/live-dom-viewer/saved/4080
  3. The .href getter in Firefox resolves the attribute value relative to the document URI, not the document's fallback base URI. That explains what's going on with http://software.hixie.ch/utilities/js/live-dom-viewer/?saved=4081 and http://software.hixie.ch/utilities/js/live-dom-viewer/saved/4083. I consider this behavior a Gecko bug that we should just fix.

@annevk
Copy link
Member

annevk commented Apr 19, 2016

Apart from the failure scenario (for which the PR matches Safari/Chrome, but should probably be updated to match Firefox per @zcorpan's comment), the PR matches Edge (I'm assuming you have some copypasta and Edge actually logs a trailing slash as well for the final test).

Currently the PR also takes into account a "frozen base URL" if the element has one, so that when CSP applies the href getter returns the fallback base URL and not the base URL that was blocked, but I should probably simply remove that since it's not worth the complexity (just use document.baseURI if you want to know the base URL).

@bzbarsky
Copy link
Contributor Author

I'm assuming you have some copypasta and Edge actually logs a trailing slash as well for the final test

Er, right you are. Sorry about that.

I assume the "failure scenario" is http://software.hixie.ch/utilities/js/live-dom-viewer/?saved=4080? If so, I'm ok with either the WebKit or Gecko behavior there, I guess; there's no way to win.

I agree that having .href just return the thing you would set the base URL to if it were not for CSP is a lot more useful than returning the fallback base URL in that situation.

The changes in the PR look good with those two caveats.

annevk added a commit that referenced this issue Apr 20, 2016
Always parse <base> elements against the document's fallback base URL. When parsing fails, returns their href attribute value. (Note that it can only fail if there is an href attribute value.) This matches the semantics of "reflect", but does not directly use that mechanism since it does not support arbitrary base URLs and such.

Fixes #1060. (That issue also has links to tests and results for contemporary browsers.)
@annevk
Copy link
Member

annevk commented Apr 20, 2016

The PR that landed uses the Gecko handling for failure and does not let CSP have an effect on the href getter.

@annevk
Copy link
Member

annevk commented Apr 20, 2016

Thank you for reporting this!

@bzbarsky
Copy link
Contributor Author

I included some web platform tests in the patch I just put up in https://bugzilla.mozilla.org/show_bug.cgi?id=1266077

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compat Standard is not web compatible or proprietary feature needs standardizing
Development

No branches or pull requests

4 participants