-
Notifications
You must be signed in to change notification settings - Fork 2.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fallback base URL computation for a document with about:srcdoc but not an iframe srcdoc document #3989
Comments
It really feels like we should not have documents whose URL is about:srcdoc that are not srcdoc documents... It causes problems of all sorts, not just this. |
yeah, tweaking |
Are there other ways to cause documents to have about:srcdoc URLs, besides document.open()? If that's the only one, then probably we should fix it. And now's the time to do so, as we're making other changes/alignments in that area. |
I have no idea about other ways to set about:srcdoc to an existing document. "URL and history update steps" is used by |
|
A |
I wonder if we should port the
check into the URL and history update steps generally. I guess that would make document.open() throw in this case; perhaps we want it to no-op instead. This would also impact blob: and data: URL-derived documents. (I believe those can be same origin... haven't checked.)
I wasn't able to find this in the spec. |
On further reflection this seems unlikely to work. There is probably lot of code that does things like const win = window.open("about:blank");
win.document.open();
win.document.write("<a href='/foo'>");
win.document.close(); and expects the relative link to work. With my "I wonder..." proposal the relative link would instead of fail to resolve relative to So, we are back to either a guard against |
Hmm. The spec used to define something here (I can't find old versions well enough on my current network to look up what), but now https://html.spec.whatwg.org/multipage/browsing-the-web.html#navigating-across-documents step 13 javascript: case substep 13 just creates a Response with no obvious URL. And then https://html.spec.whatwg.org/multipage/browsing-the-web.html#process-a-navigate-response does:
which is definitely not what used to be going on there; here used to be an "override URL" and whatnot. It's possible that that URL was the URL of the Document that the evaluation happens in, not of the one where the URL came from. But the current spec doesn't seem to do either one, afaict... |
Any solution related to this? |
@bzbarsky it was removed as part of #3946. I recall pointing that out as a potential problem, but I cannot find me doing so. To take a step back here, we have various things that affect state of documents and it would be really good to have a consistent story. All the state that is expected to be copied onto And then we have certain special URLs that we generally do not want to leak (e.g., via referrer) that we perhaps should avoid copying over as a somewhat general policy. So that if you have a At a high-level this would make sense to me, but I don't think I have the bandwidth right now to drive all the necessary changes and tests. |
Er... why would that commit have changed the behavior of javascript: loads??? If it did, that's a bug in the commit and we should fix it, no?
Possibly, yes. It's a little hard to say what semantics people expect out of |
@domenic @TimothyGu do you remember why you removed the "override URL" concept in #3946 despite it also affecting |
@TimothyGu may recall better, but my best attempt at recollection was that we did a bit of testing and found it was Gecko-only. Don't hold me to that though. Edit: the linked PR says
|
That's talking about the |
(And to be clear, this was a use of override URL that had nothing to do with "overridden reload" and which also got removed!) |
Er, no, the current spec behavior is unclear to me. https://html.spec.whatwg.org/multipage/browsing-the-web.html#initialise-the-document-object says to use the URL of the request if there is a request, otherwise to use the URL of the response. But in this case https://html.spec.whatwg.org/multipage/browsing-the-web.html#read-html doesn't pass a request or a response to those steps, though we might assume it's at least passing the response handed to <https://html.spec.whatwg.org/multipage/browsing-the-web.html#process-a-navigate- response>. That response has no URL, though, in the javascript: case, and the only request around is the one with the |
I think I understand the concern here. I reviewed my notes and chat logs but I can't find anything that specifically talks about
A PR to restore the concept of "override URL" solely for the purpose of |
Thanks @TimothyGu. I guess what would be most correct would be to test browsers and figure out what they do, and write extensive tests, but failing that, restoring us to a non-broken spec would be a good idea, yeah. /cc @natechapin in case he has also looked at this aspect of javascript: URLs. |
I agree. Thank you for the link to the old spec text: that pretty clearly says that the result of a javascript: evaluation gets the URL of the document it was evaluated against, not of the document that started the navigation (as I said above). I think that matches the Chrome behavior, in my limited testing. |
This concept was deleted in 6440cca along with "overridden overload", but it was discovered in whatwg#3989 that correct handling for javascript: URLs still depend on this concept. The spec around this area has changed quite a bit since then due to intermediate cleanups such as e13762f, so this is not a simple revert. Also, fix ungrammatical "append URL URL list" that was accidentally introduced in e13762f. Addresses some concern raised in whatwg#3989.
This concept was deleted in 6440cca along with "overridden overload", but it was discovered in whatwg#3989 that correct handling for javascript: URLs still depend on this concept. The spec around this area has changed quite a bit since then due to intermediate cleanups such as e13762f, so this is not a simple revert. Also, fix ungrammatical "append URL URL list" that was accidentally introduced in e13762f. Addresses some concern raised in whatwg#3989.
A series of changes starting with 6440cca which removed the override URL concept as well as intermediate cleanups such as e13762f eroded the correctness of javascript: document URLs. Now it seems to be the case that fixing this typo restores the correct handling for such URLs. Addresses some concern raised in whatwg#3989.
A series of changes starting with 6440cca which removed the override URL concept as well as intermediate cleanups such as e13762f eroded the correctness of javascript: document URLs. Now it seems to be the case that fixing this typo restores the correct handling for such URLs. Addresses some concerns raised in #3989.
This monster completely rewrites everything to do with navigation and traversal. It introduces the "navigable" and "traversable navigable" concepts, which take on many of the roles that browsing contexts previously did, but better. A navigable can present a sequence of browsing contexts, which to the user seem to all be the same, but due to browsing context group switches, have different WindowProxys and are allocated in different agent clusters. A traversable navigable manages the session history for itself and all its descendant navigables, providing a synchronization point and source of truth. The general flow of navigation and traversal is now geared toward creating a session history entry, populated with the appropriate document, before finally applying the history "step". The step concept for session history, managed by the traversable, replaces the previous idea of joint session history, which was a sort of deduplicated union of individual session histories for each browsing context within a top-level browsing context. Notable things we won't tackle this round, but are much easier to tackle in the future: - Iframe restoration on (non-bfcache) history traversal is not yet specified. - Overlapping navigations and traversals (see #6927) are not perfect yet, although this makes them better. - Browsing context names (see #313) are not perfect yet, although this makes them better. - Base URL inheritance and storage in session history (see #421, #2883, and #3989) is not yet specified. - Sandbox flag storage in session history (see #6809) is not yet specified. - Task queuing when creating agents/realms/windows/documents (see #8443) remains sketchy. - Window object reuse is not yet rationalized (see #3267). Closes #854 by clarifying the javascript: URL origin and origin-checking setup. Closes #1073 by properly resetting active-ness of documents when they are removed. Closes #1130 by removing the source browsing context concept, using a sourceDocument argument instead, and taking source snapshot params at the appropriate early time. Closes #1191 by properly sharing document state across documents, as well as overlapping same-document navigations plus cross-document traversals. Closes #1336 by properly handling child browsing contexts. Closes #1382 by only unloading after we are sure we have a new document (i.e., not a 204 or download). Closes #1454 by rewriting session history closer to what implementations do, with the nested history concept in particular taking care of the issues discussed there. Closes #1524 by introducing the POST data concept and storing it in the document state. Closes #2436 by rewriting the spec for history.go() to be clear about the results. Tests: web-platform-tests/wpt#36366. Closes #2566 by introducing an explicit "history object" definition. Tests: web-platform-tests/wpt#36367. Closes #2649 through clear creation of srcdoc documents, including during history traversal. Closes #3215 by preserving POST data and reusing it on reloads. Closes #3447 by specifying a precise mechanism (the ongoing navigation) for canceling navigations, and the points at which that mechanism is consulted. It also stops queuing a task for hyperlink navigations. Closes #3497 by posting appropriate tasks for cross-event-loop navigations. Closes #3615 by rewriting traverse a history by a delta, which eventually calls into apply the history step, to navigate all relevant navigables. Closes #3625 by storing information in the document state (not just the URL), so that future traversals can reconstruct the request appropriately. Closes #3730 by doing proper task queuing for navigation, including one for javascript: URLs but not including one for normal same-frame navigations. Tests: web-platform-tests/wpt#36358. Closes #3734 by rewriting the definition of script-closable to use well-defined concepts. Closes #3812 by removing all uses of "active document" as a predicate instead of a property. Closes #4054 by introducing the session history traversal queue and renaming the previous "history traversal task source" to "navigation and traversal task source". Closes #4121 by doing the "allowed to navigate" check at the top of apply the history step. Closes #4428 by keeping a strong reference from documents (including bfcached documents) to their containing browsing context. Closes #4782 by introducing the top-level traversable and navigable concepts. Closes #4838 by doing sandbox checking in a much more precise manner, in particular snapshotting the relevant flags early in any traversals. Closes #4852 by using document state (in particular history policy container, request referrer, and request referrer policy) in reloads. Closes #5103 by properly restoring scroll positions for everything that is traversed, as part of properly traversing more than one navigable. Closes #5350 by properly restoring window names across browsing context group switches, and going back to the same browsing context as was previously there when traversing back across a BCG switch boundary. (Implementations could create new browsing contexts, as long as they restore the WindowProxy scripting relationships and other browsing context features; the result is observably equivalent.) Closes #5597 by rewriting "allowed to download" to just take booleans, derived from the appropriate snapshotted or computed sandboxing flags. Closes #5767, modulo bugs and oversights we made, by rewriting everything :). Closes #5877 by re-specifying "fully active" in terms of navigables, instead of browsing contexts. Closes #6446 by properly firing beforeunload to all descendant navigables, although whether or not they actually prompt still allows implementation leeway. Closes #6483 by introducing the distinction between current session history entry and active session history entry. Closes #6514 by settling on using a single origin for these checks. Closes #6628 by storing window.name values in the document state, so even in strange splitting situations like described there, they remain. Closes #6652 by no longer changing history.state when reactivating a document from bfcache ("restore the history object state" is called only when documentsEntryChanged is true). Tests: web-platform-tests/wpt#36368. Closes #6773 by having careful handling of synchronous navigations during traversals. Test updates: web-platform-tests/wpt#36364. Closes #6798 by treating javascript: URL navigations as replacements. Works towards #6809 by storing srcdoc resources in the document state. Closes #6813 by storing referrer in the document state. Tests for the repopulation case: web-platform-tests/wpt#36352. (No tests yet for the reload case.) Closes #6947 by rolling its contents into this change: PDF documents are put in the same category as other inaccessible, no-DOM documents. Closes #7107 by clearing history state on redirects and when origin changes by other means, such as CSP. Closes #7441 by making window.blur() a no-op because that was simpler than updating it to operate on navigables. Closes #7722 by incorporating its contents into the rewritten version. Closes #8295 by refactoring the iframe/frame load event specs to avoid the bug. Helps with #8395 by at least ensuring the javascript: case does not fire beforeunload. Tests: web-platform-tests/wpt#36488. (The other cases remain open for investigation and testing.) Closes #8449 by exporting "create a fresh top-level traversable" which is designed for the use case in question. Co-authored-by: Domenic Denicola <d@domenic.me> Co-authored-by: Dominic Farolino <domfarolino@gmail.com>
This gives an "about base URL" member to Document, document state, and navigation params. The intention is to capture a Document's creator's base URL when creating a new browsing context, and preserve it to (1) the newly created Document itself, and (2) the newly-created document state. Notably, preserving it in document state means that the same base URL is used when we recreate the Document while traversing the session history. For the navigation case, we capture the initiator's base URL in the navigate algorithm as initiatorBaseURLSnapshot (alongside initiatorOriginSnapshot). This eventually threads through, via the document state and navigation params, to the point where we initialize a new Document object. Finally, we remove the concept of a browsing context's creator base URL algorithm, and update the fallback base URL algorithm accordingly to refer to the relevant Document's new "about base URL" member. This is all rather different from how the previous specification works. Previously, behavior differed between about:srcdoc and about:blank; base URL changes were supposed to be inherited in a live, not snapshotted, fashion; sometimes the navigation initiator was used and sometimes the browsing context creator/embedder; and the spec "crashed" for disconnected srcdoc iframes. Closes #421. Closes #2883. Closes #3989.
Demo: http://jsfiddle.net/int32_t/oztck57e/5/
Bug report from a web author: https://crbug.com/875642
Specification link 1: https://html.spec.whatwg.org/multipage/dynamic-markup-insertion.html#document-open-steps Step 11.3
Specification link 2: https://html.spec.whatwg.org/multipage/urls-and-fetching.html#fallback-base-url
The demo page has three nested documents,
doc1
doc2
According to the specification,
doc2
's URL will beabout:srcdoc
bydoc2.open()
. See specification link 1. After that, fallback base URL computation fordoc2
returnsabout:srcdoc
because the document is neither an iframe srcdoc document norabout:blank
. See specification link 2.Chrome currently follows the specification. So relative URLs in
doc2
can't be resolved.Firefox throws an exception on
doc2.open()
. Sodoc2
's URL is not changed.Safari's behavior is not understandable. But anyway the base URL of
doc2
is the URL of the top document.Proposal:
I'd like to change the specification so that
doc2
has the top document URL as the fallback base URL in this case. For example, changingto
in https://html.spec.whatwg.org/multipage/urls-and-fetching.html#fallback-base-url
The text was updated successfully, but these errors were encountered: