Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(url.ml): set_fragment need not any urlencode #1497

Merged
merged 3 commits into from
Nov 11, 2023

Conversation

erikmd
Copy link
Contributor

@erikmd erikmd commented Jul 27, 2023

Dear js_of_ocaml maintainers,

This patch comes from an issue spotted in learn-ocaml (which heavily relies on js_of_ocaml), namely:

Note: I didn't recompile js_of_ocaml itself yet to test it, but I'm confident of the fix; to summarize what I understood:

  1. learn-ocaml handles local urls of the form http://localhost:8080/exercises/tp1/#tab=report;
  2. it contains some (js_of_)ocaml code such as Url.Current.set_fragment "tab=report" to this aim;
  3. unfortunately the browser URL was always http://localhost:8080/exercises/tp1/#tab%3Dreport, which is not that legible!;
  4. replacing the line of point 2. above with Dom_html.window##.location##.hash := Js.bytestring "tab=report" solves the issue.

Hence this small patch; WDYT?

@hhugo
Copy link
Member

hhugo commented Jul 28, 2023

We should probably check that get_fragment (set_fragment ()) round trip

@erikmd
Copy link
Contributor Author

erikmd commented Jul 29, 2023

Hi @hhugo, that's a good point, thanks!
on the one hand, I'm pretty sure this equality now holds;
on the other hand, I'm not aware enough of the js_of_ocaml codebase to be able to add a unit test for this myself!

@hhugo
Copy link
Member

hhugo commented Aug 2, 2023

@erikmd, the round-tripping seems to work more often with your PR. In particular, it works with -._~!$&'()*+,;=:@.
As far as I understand, location.hash just returns what's displayed on the url bar of my web browser. Setting location.hash will percent encode some characters. @dbuenzli, it seems you've dived into this mess recently. Any opinion on what should be done here ?

@dbuenzli
Copy link
Contributor

dbuenzli commented Aug 2, 2023

Not really it's difficult for me to say what is right here because:

  1. This API doesn't say exactly what it tries to expose and in particular if arguments are supposed to be percent encoded or not.
  2. I do not know how browsers render locations. Some stuff gets percent encoded some doesn't (e.g. the encoding of accented characters) I don't know if there's a specification for that.
  3. This module seems to be doing its own URL encoding rather than use encodeURI or encodeURIComponent.

Also Brr doesn't use Location objects which seems to be a different URI like structure (I conjecture kept for compat reasons). The API exposed by brr to interact with the browser location is to capture the location as an URL object and set the location from an URL object (and Brr does not try to mutate URL objects).

In any case if you are interested in what happened in Brr the API is here and the recent discussion is there.

@erikmd
Copy link
Contributor Author

erikmd commented Aug 2, 2023

@hhugo the round-tripping seems to work more often with your PR

Good! but out of curiosity, when do you think this does not "work"?

As far as I understand, location.hash just returns what's displayed on the url bar of my web browser.

yes, almost: it returns what follows the #. You can also try running location.hash; in the browser console.

Setting location.hash will percent encode some characters

this is maybe another issue. See my initial comment. To sum up, more than 8 years ago, there was a bug in firefox about the location.getter. Hence the somewhat involved code before this PR that did not simply call the location.hash property.

This bug is now solved, so I'd suggest we also simplify the getter code along the way.

I will push another commit, @hhugo could you run the same tests? (maybe with Firefox + another Chromium-based browser)

@erikmd
Copy link
Contributor Author

erikmd commented Aug 2, 2023

Hi @dbuenzli (disclaimer: I'm not expert in js_of_ocaml! I just give my thoughts based on standard specs)

  • This API doesn't say exactly what it tries to expose and in particular if arguments are supposed to be percent encoded or not.

IMO the API should faithfully map JavaScript properties to OCaml ones.

And according to https://developer.mozilla.org/en-US/docs/Web/API/Location/hash:
"The fragment is not URL decoded."

  • I do not know how browsers render locations.

According to examples from https://stackoverflow.com/questions/1703552/encoding-of-window-location-hash:

  1. running location.hash = "a=b"; location.hash in the console round-trips, so = should not be escaped (it was before this PR)
  2. the browser renders UTF-8 location.hash as is but internally they are escaped, e.g. if we run location.hash="ü" in the console, we see as is, but if we copy the entire URL (or do location.hash in the console), we see that internally, the hash only uses ASCII chars. Anyway, this does not hinder URL semantics. Indeed if one creates a dummy HTML page with <div id="ü> at the bottom, both file:///tmp/foo.html#%C3%BC and file:///tmp/foo.html#ü scrool to the anchor.
  • Some stuff gets percent encoded some doesn't (e.g. the encoding of accented characters) I don't know if there's a specification for that.

RFC 3986 says:
the data should first be encoded as octets according to the UTF-8 character encoding; then only those octets that do not correspond to characters in the unreserved set should be percent-encoded.

Indeed, there was a spurious urlencoding before this PR, which should not be there.

@hhugo
Copy link
Member

hhugo commented Aug 3, 2023

Good! but out of curiosity, when do you think this does not "work"?

  • set "#" -> get ""
  • set "#ABC" -> get "ABC"
  • set " " -> get "%20"
  • set "\001" -> get "%01"

@hhugo could you run the same tests?

You can test your self by building the toplevel

dune build toplevel/examples/lwt_toplevel/ && chromium-browser _build/default/toplevel/examples/lwt_toplevel/index.html

@dbuenzli
Copy link
Contributor

dbuenzli commented Aug 3, 2023

IMO the API should faithfully map JavaScript properties to OCaml ones.

I don't really care about what it does. I'm just saying that the API docs should mention what is being done.

RFC 3986 says:

That's unrelated to the problem I mentioned (and so is checking stuff in the browser console).

Indeed, there was a spurious urlencoding before this PR, which should not be there.

That was not my point. The point is that this module has its own functions to perform url encoding.

yurug pushed a commit to ocaml-sf/learn-ocaml that referenced this pull request Aug 18, 2023
@hhugo hhugo merged commit 4fab80d into ocsigen:master Nov 11, 2023
14 of 15 checks passed
hhugo pushed a commit to hhugo/opam-repository that referenced this pull request Dec 4, 2023
CHANGES:

## Features/Changes
* Compiler: global dead code elimination (Micah Cantor, ocsigen/js_of_ocaml#1503)
* Compiler: change control-flow compilation strategy (ocsigen/js_of_ocaml#1496)
* Compiler: loop no longer absorb the whole continuation
* Compiler: Dead code elimination of unused references (ocsigen/js_of_ocaml#2076)
* Compiler: reduce memory consumption (ocsigen/js_of_ocaml#1516)
* Compiler: support for import and export construct in the js parser/printer
* Lib: add download attribute to anchor element
* Misc: switch CI to OCaml 5.1
* Misc: preliminary support for OCaml 5.2
* Misc: support for OCaml 5.1.1

## Bug fixes
* Runtime: fix Dom_html.onIE (ocsigen/js_of_ocaml#1493)
* Runtime: add conversion functions + strict equality for compatibility with Wasm_of_ocaml (ocsigen/js_of_ocaml#1492)
* Runtime: Dynlink should be able to find symbols in jsoo_runtime ocsigen/js_of_ocaml#1517
* Runtime: fix Unix.lstat, Unix.LargeFile.lstat (ocsigen/js_of_ocaml#1519)
* Compiler: fix global flow analysis (ocsigen/js_of_ocaml#1494)
* Compiler: fix js parser/printer wrt async functions (ocsigen/js_of_ocaml#1515)
* Compiler: fix free variables pass wrt parameters' default value (ocsigen/js_of_ocaml#1521)
* Compiler: fix free variables for classes
* Compiler: fix internal invariant (continuation)
* Compiler: fix variable renaming for let, const and classes
* Lib: Url.Current.set_fragment need not any urlencode (ocsigen/js_of_ocaml#1497)
mseri pushed a commit to ocaml/opam-repository that referenced this pull request Dec 6, 2023
CHANGES:

## Features/Changes
* Compiler: global dead code elimination (Micah Cantor, ocsigen/js_of_ocaml#1503)
* Compiler: change control-flow compilation strategy (ocsigen/js_of_ocaml#1496)
* Compiler: loop no longer absorb the whole continuation
* Compiler: Dead code elimination of unused references (ocsigen/js_of_ocaml#2076)
* Compiler: reduce memory consumption (ocsigen/js_of_ocaml#1516)
* Compiler: support for import and export construct in the js parser/printer
* Lib: add download attribute to anchor element
* Misc: switch CI to OCaml 5.1
* Misc: preliminary support for OCaml 5.2
* Misc: support for OCaml 5.1.1

## Bug fixes
* Runtime: fix Dom_html.onIE (ocsigen/js_of_ocaml#1493)
* Runtime: add conversion functions + strict equality for compatibility with Wasm_of_ocaml (ocsigen/js_of_ocaml#1492)
* Runtime: Dynlink should be able to find symbols in jsoo_runtime ocsigen/js_of_ocaml#1517
* Runtime: fix Unix.lstat, Unix.LargeFile.lstat (ocsigen/js_of_ocaml#1519)
* Compiler: fix global flow analysis (ocsigen/js_of_ocaml#1494)
* Compiler: fix js parser/printer wrt async functions (ocsigen/js_of_ocaml#1515)
* Compiler: fix free variables pass wrt parameters' default value (ocsigen/js_of_ocaml#1521)
* Compiler: fix free variables for classes
* Compiler: fix internal invariant (continuation)
* Compiler: fix variable renaming for let, const and classes
* Lib: Url.Current.set_fragment need not any urlencode (ocsigen/js_of_ocaml#1497)
@erikmd erikmd deleted the fix_set_fragment branch February 29, 2024 23:47
nberth pushed a commit to nberth/opam-repository that referenced this pull request Jun 18, 2024
CHANGES:

## Features/Changes
* Compiler: global dead code elimination (Micah Cantor, ocsigen/js_of_ocaml#1503)
* Compiler: change control-flow compilation strategy (ocsigen/js_of_ocaml#1496)
* Compiler: loop no longer absorb the whole continuation
* Compiler: Dead code elimination of unused references (ocsigen/js_of_ocaml#2076)
* Compiler: reduce memory consumption (ocsigen/js_of_ocaml#1516)
* Compiler: support for import and export construct in the js parser/printer
* Lib: add download attribute to anchor element
* Misc: switch CI to OCaml 5.1
* Misc: preliminary support for OCaml 5.2
* Misc: support for OCaml 5.1.1

## Bug fixes
* Runtime: fix Dom_html.onIE (ocsigen/js_of_ocaml#1493)
* Runtime: add conversion functions + strict equality for compatibility with Wasm_of_ocaml (ocsigen/js_of_ocaml#1492)
* Runtime: Dynlink should be able to find symbols in jsoo_runtime ocsigen/js_of_ocaml#1517
* Runtime: fix Unix.lstat, Unix.LargeFile.lstat (ocsigen/js_of_ocaml#1519)
* Compiler: fix global flow analysis (ocsigen/js_of_ocaml#1494)
* Compiler: fix js parser/printer wrt async functions (ocsigen/js_of_ocaml#1515)
* Compiler: fix free variables pass wrt parameters' default value (ocsigen/js_of_ocaml#1521)
* Compiler: fix free variables for classes
* Compiler: fix internal invariant (continuation)
* Compiler: fix variable renaming for let, const and classes
* Lib: Url.Current.set_fragment need not any urlencode (ocsigen/js_of_ocaml#1497)
vouillon added a commit to ocaml-wasm/wasm_of_ocaml that referenced this pull request Aug 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants