Revamp about:blank and about:srcdoc iframe/popup base URL inheritance #421
Comments
Looking closer, I can't find any connection at all between HTML's various base URL concepts and DOM's base URL. The string "concept-document-base-url" appears nowhere in the generated HTML source, and the only link with exactly the text "base URL" is in script settings for browsing contexts, but in fact it refers to HTML's document base URL concept like all the rest. @annevk, is this all entirely broken? How many concepts are there supposed to be, and how do they fit together? |
The DOM definition doesn't seem quite right; I'd think we'd always want HTML's "document base URL" (except for resolving the |
My idea was that the DOM defines a base URL concept for documents and HTML takes care of setting it as appropriate. None of this is quite done. |
On the Blink bug, @bzbarsky said "My (and Firefox's) definition is the one that's in the spec, and the one that happens to make sense: .baseURI returns the thing that's used as the base URI." Although I can't make sense of the specs, I agree that it would make a lot of sense if what If somebody has a detailed model in mind for this, it would be interesting to see in which ways Blink differs. |
@foolip So on the HTML spec end, I'm not sure what this issue is even about. The "should inherit" behavior is already covered by the "fallback base URL" as defined at https://html.spec.whatwg.org/#fallback-base-url and the corresponding use in https://html.spec.whatwg.org/#document-base-url Maybe the question is whether this is the right way to spec it, since it doesn't actually match any implementations?
Sure. Each document has a "baseURL" field, which may be null. Each document has a concept of "base URL", which is computed as follows: if the document is a srcdoc document and has a parent document (long story about why it might not have the latter, involving references to documents that outlive their browsing context), return the "base URL" of the parent document. Otherwise if the "baseURL" field is not null, return that. Otherwise return the "document's address" in HTML spec terms. The "baseURL" field is set by several things; this may not be quite right because I'm trying to exclude codepaths that are only accessible to extensions, sorry:
I fully expect that no one else does anything like item 6 in that list. It was a quick and easy hack to make about:blank generally more or less web-compatible a long time ago... |
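The computation @bzbarsky describes above (srcdoc defers to the parent, then the nullable "baseURL" field, then the document's address) can be sketched roughly as follows. This is only an illustrative model: `DocModel`, `computeBaseURL`, and the example URLs are hypothetical names made up for the sketch, not Gecko identifiers, and the ways the field gets set are left out.

```typescript
// Hypothetical model of the lookup order described in the comment above.
// None of these names come from Gecko; they are assumptions for illustration.
interface DocModel {
  address: string;              // the "document's address" in HTML spec terms
  baseURLField: string | null;  // the nullable "baseURL" field
  isSrcdoc: boolean;
  parentDoc: DocModel | null;   // may be null even for a srcdoc document
}

function computeBaseURL(doc: DocModel): string {
  // 1. A srcdoc document with a parent uses the parent's (computed) base URL.
  if (doc.isSrcdoc && doc.parentDoc !== null) {
    return computeBaseURL(doc.parentDoc);
  }
  // 2. Otherwise a non-null "baseURL" field wins.
  if (doc.baseURLField !== null) {
    return doc.baseURLField;
  }
  // 3. Otherwise fall back to the document's address.
  return doc.address;
}

const embedderDoc: DocModel = {
  address: "https://example.com/a/page.html",
  baseURLField: "https://example.com/b/",
  isSrcdoc: false,
  parentDoc: null,
};
const srcdocChild: DocModel = {
  address: "about:srcdoc",
  baseURLField: null,
  isSrcdoc: true,
  parentDoc: embedderDoc,
};
// A srcdoc document that has outlived its browsing context and lost its parent.
const orphanSrcdoc: DocModel = { ...srcdocChild, parentDoc: null };
```

In this model the srcdoc child always follows its parent while the parent exists, and only the orphaned case falls through to the document's own field or address.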
When filing this issue, I was looking for some kind of interaction with DOM's "base URL", and I expected the "base URL" field to be updated somewhere in https://html.spec.whatwg.org/#creating-a-new-browsing-context, which does not happen. You're right that HTML's computed "document base URL" and "fallback base URL" explain why things are resolved correctly, but the DOM+HTML specs taken together don't explain why So, we need to figure out at least:
There's a lot of small differences here, not sure how to proceed. |
That's fair. My main design requirements here are (1) web compat, whatever that means and (2) having |
Yeah, I agree with those goals. It sounds like tests for a number of different documents are needed.
For each of those, check
|
This definition is more correct right now, and using a getter rather than a mutable field makes it easier to figure out what is returned. CC <whatwg/html#421>.
This definition is more correct right now, and using a getter rather than a mutable field makes it easier to figure out what is returned. See also: whatwg/html#421.
@bzbarsky you mostly describe base URL as a static field, but for srcdoc it's a computation? Or do we compute it once for srcdoc too and then update it as things change? |
@foolip tests for XMLHttpRequest: web-platform-tests/wpt#5863. |
My comments above predated Gecko doing sane things for base URIs in srcdoc. The current setup in Gecko is as follows:
This has the somewhat unintuitive behavior that if you start with an initial about:blank, insert a |
Why can't we set it when creating a srcdoc document too? |
(Aside: the idea is that for all |
We can. The question is whether the srcdoc should dynamically track the base URI of the parent or not (i.e. snapshot it at creation time) and how |
Okay. I tend to think we should just snapshot it at creation time (and then only change it for |
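The two options being weighed here (tracking the parent's base URL dynamically versus snapshotting it at creation time) can be contrasted with a small model. This is a sketch only; `EmbedderDoc`, `LiveChild`, and `SnapshotChild` are hypothetical names, and it says nothing about how any engine actually implements either behavior.

```typescript
// Hypothetical model contrasting the two behaviors discussed above.
interface EmbedderDoc {
  baseURL: string;
}

// Option A: the child reads the parent's base URL every time it is asked.
class LiveChild {
  private readonly embedder: EmbedderDoc;
  constructor(embedder: EmbedderDoc) {
    this.embedder = embedder;
  }
  get baseURL(): string {
    return this.embedder.baseURL;
  }
}

// Option B: the child copies the parent's base URL once, at creation time.
class SnapshotChild {
  readonly baseURL: string;
  constructor(embedder: EmbedderDoc) {
    this.baseURL = embedder.baseURL;
  }
}

const embedder: EmbedderDoc = { baseURL: "https://example.com/one/" };
const liveChild = new LiveChild(embedder);
const snapshotChild = new SnapshotChild(embedder);

// The parent's base URL changes later, for example via a <base> element.
embedder.baseURL = "https://example.com/two/";
```

After the change, `liveChild.baseURL` follows the parent while `snapshotChild.baseURL` keeps the value captured at creation.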
Drafted WPT just for Chromium: Takes snapshot of parent's base URL around the time of (The opposite direction from #5474, where Chromium reflects parent's referrer policy updates while Firefox takes snapshot) |
Some parts of the Chrome team (@csreis @wjmaclean) have started investigating this area in https://bugs.chromium.org/p/chromium/issues/detail?id=1356658 . We'd love to get interop on base URL inheritance in general, and I've volunteered to help with the spec discussions. As general background when talking about inheritance discussions, there are potentially two parties involved: creator (= embedder for iframes), and navigation initiator. These two are the same for the initial about:blank, but are not the same in general, including for non-initial about:blank or for about:srcdoc. I believe the team's proposal is:
Related issues are #2883 and #3989. (Plus #8105, which is a proposal for a change to limit how much of the base URL is inherited; but IMO we should only explore that after first getting interop.) Implications of this proposal:
|
This generally sounds good to me. Thank you for working on it! cc @cdumez |
One addition to the plan: we believe we'll need to store the base URL in the session history entry for such documents, just like we do for the origin currently. This helps in cases like: There are a few other possible ways to get the desired behavior here, but we think using the session history entry is nicest because it's symmetrical with what we're already doing with origin. |
This monster completely rewrites everything to do with navigation and traversal.

It introduces the "navigable" and "traversable navigable" concepts, which take on many of the roles that browsing contexts previously did, but better. A navigable can present a sequence of browsing contexts, which to the user seem to all be the same, but due to browsing context group switches, have different WindowProxys and are allocated in different agent clusters. A traversable navigable manages the session history for itself and all its descendant navigables, providing a synchronization point and source of truth.

The general flow of navigation and traversal is now geared toward creating a session history entry, populated with the appropriate document, before finally applying the history "step". The step concept for session history, managed by the traversable, replaces the previous idea of joint session history, which was a sort of deduplicated union of individual session histories for each browsing context within a top-level browsing context.

Notable things we won't tackle this round, but are much easier to tackle in the future:
- Iframe restoration on (non-bfcache) history traversal is not yet specified.
- Overlapping navigations and traversals (see #6927) are not perfect yet, although this makes them better.
- Browsing context names (see #313) are not perfect yet, although this makes them better.
- Base URL inheritance and storage in session history (see #421, #2883, and #3989) is not yet specified.
- Sandbox flag storage in session history (see #6809) is not yet specified.
- Task queuing when creating agents/realms/windows/documents (see #8443) remains sketchy.
- Window object reuse is not yet rationalized (see #3267).

Closes #854 by clarifying the javascript: URL origin and origin-checking setup. Closes #1073 by properly resetting active-ness of documents when they are removed.
Closes #1130 by removing the source browsing context concept, using a sourceDocument argument instead, and taking source snapshot params at the appropriate early time. Closes #1191 by properly sharing document state across documents, as well as overlapping same-document navigations plus cross-document traversals. Closes #1336 by properly handling child browsing contexts. Closes #1382 by only unloading after we are sure we have a new document (i.e., not a 204 or download). Closes #1454 by rewriting session history closer to what implementations do, with the nested history concept in particular taking care of the issues discussed there. Closes #1524 by introducing the POST data concept and storing it in the document state. Closes #2436 by rewriting the spec for history.go() to be clear about the results. Tests: web-platform-tests/wpt#36366. Closes #2566 by introducing an explicit "history object" definition. Tests: web-platform-tests/wpt#36367. Closes #2649 through clear creation of srcdoc documents, including during history traversal. Closes #3215 by preserving POST data and reusing it on reloads. Closes #3447 by specifying a precise mechanism (the ongoing navigation) for canceling navigations, and the points at which that mechanism is consulted. It also stops queuing a task for hyperlink navigations. Closes #3497 by posting appropriate tasks for cross-event-loop navigations. Closes #3615 by rewriting traverse a history by a delta, which eventually calls into apply the history step, to navigate all relevant navigables. Closes #3625 by storing information in the document state (not just the URL), so that future traversals can reconstruct the request appropriately. Closes #3730 by doing proper task queuing for navigation, including one for javascript: URLs but not including one for normal same-frame navigations. Tests: web-platform-tests/wpt#36358. Closes #3734 by rewriting the definition of script-closable to use well-defined concepts. 
Closes #3812 by removing all uses of "active document" as a predicate instead of a property. Closes #4054 by introducing the session history traversal queue and renaming the previous "history traversal task source" to "navigation and traversal task source". Closes #4121 by doing the "allowed to navigate" check at the top of apply the history step. Closes #4428 by keeping a strong reference from documents (including bfcached documents) to their containing browsing context. Closes #4782 by introducing the top-level traversable and navigable concepts. Closes #4838 by doing sandbox checking in a much more precise manner, in particular snapshotting the relevant flags early in any traversals. Closes #4852 by using document state (in particular history policy container, request referrer, and request referrer policy) in reloads. Closes #5103 by properly restoring scroll positions for everything that is traversed, as part of properly traversing more than one navigable. Closes #5350 by properly restoring window names across browsing context group switches, and going back to the same browsing context as was previously there when traversing back across a BCG switch boundary. (Implementations could create new browsing contexts, as long as they restore the WindowProxy scripting relationships and other browsing context features; the result is observably equivalent.) Closes #5597 by rewriting "allowed to download" to just take booleans, derived from the appropriate snapshotted or computed sandboxing flags. Closes #5767, modulo bugs and oversights we made, by rewriting everything :). Closes #5877 by re-specifying "fully active" in terms of navigables, instead of browsing contexts. Closes #6446 by properly firing beforeunload to all descendant navigables, although whether or not they actually prompt still allows implementation leeway. Closes #6483 by introducing the distinction between current session history entry and active session history entry. 
Closes #6514 by settling on using a single origin for these checks. Closes #6628 by storing window.name values in the document state, so even in strange splitting situations like described there, they remain. Closes #6652 by no longer changing history.state when reactivating a document from bfcache ("restore the history object state" is called only when documentsEntryChanged is true). Tests: web-platform-tests/wpt#36368. Closes #6773 by having careful handling of synchronous navigations during traversals. Test updates: web-platform-tests/wpt#36364. Closes #6798 by treating javascript: URL navigations as replacements. Works towards #6809 by storing srcdoc resources in the document state. Closes #6813 by storing referrer in the document state. Tests for the repopulation case: web-platform-tests/wpt#36352. (No tests yet for the reload case.) Closes #6947 by rolling its contents into this change: PDF documents are put in the same category as other inaccessible, no-DOM documents. Closes #7107 by clearing history state on redirects and when origin changes by other means, such as CSP. Closes #7441 by making window.blur() a no-op because that was simpler than updating it to operate on navigables. Closes #7722 by incorporating its contents into the rewritten version. Closes #8295 by refactoring the iframe/frame load event specs to avoid the bug. Helps with #8395 by at least ensuring the javascript: case does not fire beforeunload. Tests: web-platform-tests/wpt#36488. (The other cases remain open for investigation and testing.) Closes #8449 by exporting "create a fresh top-level traversable" which is designed for the use case in question. Co-authored-by: Domenic Denicola <d@domenic.me> Co-authored-by: Dominic Farolino <domfarolino@gmail.com>
This gives an "about base URL" member to Document, document state, and navigation params.

The intention is to capture a Document's creator's base URL when creating a new browsing context, and preserve it to (1) the newly created Document itself, and (2) the newly-created document state. Notably, preserving it in document state means that the same base URL is used when we recreate the Document while traversing the session history.

For the navigation case, we capture the initiator's base URL in the navigate algorithm as initiatorBaseURLSnapshot (alongside initiatorOriginSnapshot). This eventually threads through, via the document state and navigation params, to the point where we initialize a new Document object.

Finally, we remove the concept of a browsing context's creator base URL algorithm, and update the fallback base URL algorithm accordingly to refer to the relevant Document's new "about base URL" member.

This is all rather different from how the previous specification works. Previously, behavior differed between about:srcdoc and about:blank; base URL changes were supposed to be inherited in a live, not snapshotted, fashion; sometimes the navigation initiator was used and sometimes the browsing context creator/embedder; and the spec "crashed" for disconnected srcdoc iframes.

Closes #421. Closes #2883. Closes #3989.
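As a rough illustration of the flow this commit message describes, the sketch below snapshots a base URL into a document state, creates a document from that state, and recreates it later from the same state. It is a simplified model under stated assumptions: `DocumentStateSketch`, `DocSketch`, `snapshotState`, `createFromState`, and `fallbackBaseURL` are hypothetical names, exact string comparison stands in for the spec's URL matching, and `<base>` element handling is omitted entirely.

```typescript
// Simplified, hypothetical model; not the specification's algorithms.
interface DocumentStateSketch {
  aboutBaseURL: string | null; // captured once, from the creator or initiator
}

interface DocSketch {
  url: string;
  aboutBaseURL: string | null;
}

// Capture the creator's (or initiator's) base URL at a single point in time.
function snapshotState(sourceBaseURL: string): DocumentStateSketch {
  return { aboutBaseURL: sourceBaseURL };
}

// Creating, or later recreating, a document copies the stored snapshot.
function createFromState(url: string, docState: DocumentStateSketch): DocSketch {
  return { url, aboutBaseURL: docState.aboutBaseURL };
}

// Assumed shape of the fallback: about:blank and about:srcdoc documents use
// the stored "about base URL"; everything else uses the document's own URL.
function fallbackBaseURL(doc: DocSketch): string {
  const isAboutDoc = doc.url === "about:blank" || doc.url === "about:srcdoc";
  if (isAboutDoc && doc.aboutBaseURL !== null) {
    return doc.aboutBaseURL;
  }
  return doc.url;
}

let creatorBaseURL = "https://example.com/dir/";
const entryState = snapshotState(creatorBaseURL);
const firstDoc = createFromState("about:blank", entryState);

// The creator's base URL changes after the snapshot was taken.
creatorBaseURL = "https://example.com/other/";

// Traversing back recreates the document from the same document state.
const recreatedDoc = createFromState("about:blank", entryState);
```

Because the value lives in the document state rather than being looked up from the creator, `firstDoc` and `recreatedDoc` resolve against the same base URL even though the creator has since changed.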
Update as of 2023-06-01 by @domenic: this issue has expanded to cover general base URL inheritance for about:blank iframes/popups, and about:srcdoc iframes. See #421 (comment) for the current proposal for how to update the spec and achieve interop.

Original 2015 post contents by @foolip:
This is about a Blink bug filed by @bzbarsky
For an iframe with no src, Edge uses the parent document's URL, while Firefox uses the parent document's base URL:
http://software.hixie.ch/utilities/js/live-dom-viewer/saved/3791
It looks like the base URL is frozen in Edge, but can be affected by a later base URL change of the parent document in Firefox:
http://software.hixie.ch/utilities/js/live-dom-viewer/saved/3792
In both cases, Blink uses "about:blank" as the base URL, but will actually look at the parent document's base URL while trying to resolve URLs in the iframe document.
Everyone seems to agree about the simpler case with a src attribute, that the iframe's URL is also its base URL:
http://software.hixie.ch/utilities/js/live-dom-viewer/saved/3793
This does not appear to be handled in https://html.spec.whatwg.org/#creating-a-new-browsing-context
@bzbarsky, can you describe what the model for this in Gecko is? Is it specific steps for iframe insertion, or does any of it also apply to frame elements, object elements, createDocument() or createHTMLDocument()?