Re-write `PDFHistory` from scratch #8775

Snuffleupagus · 2017-08-13T15:58:14Z

Please refer to the individual commit messages.

Supersedes PRs #6272 and #8753; hopefully third time's the charm :-)

Edit: Based on this PR, we should also be able to finally address issue #5927 properly; see WIP patches available here.

Edit2: Rebased to fix merge conflicts, no code changes were made.

timvandermeij · 2017-08-27T17:17:45Z

/botio-linux preview

timvandermeij · 2017-08-27T17:22:51Z

/botio unittest

timvandermeij

Phew, that was quite a bit of code to review :)

In general these patches look really good! As is to be expected with such a large pull request, I have some questions and suggestions for improvement, but most of them are minor. After these comments have been adressed, I'll do more testing (initial testing did not yield any unexpected behavior yet) and then we can ship it. With approximately one month to go before a new Firefox release, this should give us enough time in case we need to tweak things.

timvandermeij · 2017-08-27T19:15:57Z

web/ui_utils.js

+ * @property {Object} target - The event target, can for example be:
+ *   `window`, `document`, a DOM element, or an {EventBus} instance.
+ * @property {string} name - The name of the event.
+ * @property {number} delay - The delay, in milliseconds.


Would the following perhaps be a bit more descriptive to indicate what the delay is for?

The delay, in milliseconds, after which a timeout occurs.

timvandermeij · 2017-08-27T19:27:44Z

test/unit/ui_utils_spec.js

+      }, done.fail);
+    });
+
+


Nit: remove duplicate empty line

Sure; also we might want to change the ESLint rule no-multiple-empty-lines to only allow one empty to avoid these issues.

timvandermeij · 2017-08-27T19:32:59Z

web/pdf_history.js

+    if (typeof first !== typeof second) {
+      return false;
+    }
+    if (first instanceof Array) {


Is it no problem if second is an array?

I figured that the typeof checks just above should handle that, but considering that typeof [] === 'object' you're obviously correct. There should be a check for second too, really good call!

timvandermeij · 2017-08-27T19:35:46Z

web/pdf_history.js

+    if (first instanceof Array) {
+      return false;
+    }
+    if (typeof first === 'object' && first !== null) {


Can we swap these conditions? It turns out that typeof null === 'object' evaluates to true, which is not what I would expect. By using first !== null && typeof first === 'object' you avoid this confusion a bit by stopping early if first === null. Moreover, I find it a bit more readable.

Don't we also need to check these conditions for second? If e.g., second is not an object, using Object.keys below fails I think.

Can we swap these conditions?

Yes, there's no particular reason for the ordering, so if you find that more readable that can be done.

Don't we also need to check these conditions for second?

Oh, we should probably check at least second !== null here too. That should be sufficient, since the initial typeof first !== typeof second check ought to handle the rest.

timvandermeij · 2017-08-27T20:10:17Z

web/pdf_history.js

+   * Push an internal destination to the browser history.
+   * @param {PushParameters}
+   */
+  push({ namedDest, explicitDest, pageNumber, }) {


This signature does not match the interface. The interface most likely needs to get the options parameter.

timvandermeij · 2017-08-27T21:29:00Z

web/pdf_history.js

+    _boundEvents.popState = this._popState.bind(this);
+    _boundEvents.pageHide = (evt) => {
+      // Attempt to push the `this._position` into the browser history when
+      // navigating away from the document, if the history is currently empty.


Why only if the history is empty? If there are history entries, I change the position and navigate away from the document, I would expect the last position to be retained when I navigate back.

By the time the pagehide event fires, it's too late to make (most) modifications of the history. That is also the whole reason for the existence of commit 281070b, please refer to the big comment starting at 281070b#diff-3b4ebbe223f93c9123a754b4845844e7R380.

At this point in time, since the viewer is already being unloaded, it's impossible to push a new entry to the browser history. What would happen instead here, is that the current browser history entry would be overwritten (which is definitely not what you want here).
The only case where that doesn't matter is if the history is empty, and for all other cases we'll instead rely on the temporary position instead here.

I'll try to amend this comment a bit, to make this clearer.

timvandermeij · 2017-08-27T21:37:43Z

web/pdf_history.js

+    if (UPDATE_VIEWAREA_TIMEOUT > 0) {
+      // When closing the browser, a 'pagehide' event will be dispatched which
+      // *should* allow us to push the current position to the browser history.
+      // In practice, it seems that the event is arriving to late in order for


Nit: too late

timvandermeij · 2017-08-27T21:39:31Z

web/pdf_history.js

+
+      if (this._destination.temporary) {
+        // Always replace a previous *temporary* position, before moving back.
+        this._pushOrReplaceState(this._position);


Should we force-replace here, since it always needs to happen?

Doing so would break things very badly, since it would overwrite the current history entry.

Since we get where when the user goes back in the history, the current browser history entry (window.history.state) now points to the new destination we should be navigating to.
By force-replacing in this case, you'd get stuck in the current position in the document, with all sorts of breakage as a result (history non-functional and so on).

Note also how we're not force-replacing in the case below either, for the same reasons.

This is quite complicated to wrap your head around at first (don't ask me how I know :-), so if you'd like I can try and explain it in more detail!

timvandermeij · 2017-08-27T21:43:54Z

web/pdf_history.js


-    // Ensure that we don't miss a 'presentationmodechanged' event, by
-    // registering the listener immediately.
+    // Ensure that we don't miss either a 'presentationmodechanged' or


Nit: or -> or a

timvandermeij · 2017-08-27T21:47:57Z

web/pdf_history.js

+      // To prevent `this._tryPushCurrentPosition()` from effectively being
+      // reduced to a no-op in this case, we will assume that the position
+      // *did* in fact change if the 'updateviewarea' event was dispatched
+      // more than `POSITION_UPDATED_THRESHOLD` times.


I'm not entirely sure what this means or what this commit aims to achieve. Why is it bad if _tryPushCurrentPosition is reduced to a no-op if the hash is invalid, and where does the heuristic come from? Could you elaborate a bit on what problem this commit solves and how it works (approximately)?

Consider loading the following document http://ftp.acc.umu.se/mirror/CTAN/info/lshort/english/lshort.pdf#pagemode=thumbs
In this case, the hash is defined, but it will obviously not cause a destination to be navigated to (which would set the page of its browser history entry). Furthermore, parseCurrentHash can obviously not extract a page either; #8775 (comment) might be somewhat relevant here too.

Given 1f41a59#diff-3b4ebbe223f93c9123a754b4845844e7R288, tryPushCurrentPosition won't ever update the browser history with the current position (unless push was called at some point). This could thus mean that despite commit 281070b, the browser history might not accurately capture the current position when the viewer is navigated away from or the browser closed.

Hence this heuristic, that attempts to "unlock" the adding of entries to the browser history when the position has updated "enough" times that it should be safe to assume that the position has in fact changed.
You could argue that this is a bit magical, especially since the value of the constant was basically pulled out of a hat, but it's my attempt to avoid permanently disabling certain parts of the implementation in e.g. the edge-case outlined above.
Note that in most cases, this heuristic won't even be necessary, but I figured that it shouldn't hurt to try and avoid this situation.

Does this help, if not ask again!

Snuffleupagus · 2017-08-28T10:38:37Z

Phew, that was quite a bit of code to review :)

Thank you for having the persistence, despite the size[1], to wade through all of it!!!

As is to be expected with such a large pull request, I have some questions and suggestions for improvement, but most of them are minor.

There's a number of good questions asked and I've tried my best to answer them[2], so please check if you're satisfied with the responses or if you think further clarifications are necessary!

In general, did you find that the existing comments helped, or were they to verbose?

I'll do more testing (initial testing did not yield any unexpected behavior yet)

That's very reassuring to hear, since the whole point of this re-write was to make the code more robust (and of course more maintainable too).

Again, thanks for taking the time to do such a thorough review :-)

[1] I really tried to break this up into somewhat smaller commits, but unfortunately that didn't work as well as I would have liked :-(

[2] It's sometimes quite difficult to know what parts will be confusing or difficult to understand for someone else, considering that you've got a lot of context that the reviewer might not have.

timvandermeij · 2017-08-28T20:45:49Z

Thank you for the clear responses! In general the comments really helped to understand the code, in particular the edge cases you're addressing, so I'm really glad that it's documented so well! You're right that having the context is key here, so a review is a good way to find out which parts may need further clarification.

The responses look good so far. For some of them I'll need some more time to compare them with the code and to really understand, which I'll do this week including the testing, so we can hopefully merge it at the end of the week if all goes well. Feel free to amend the commits in the meantime if you want; I'll check each comment above in the amended version after that.

The current implementation of `PDFHistory` contains a number of smaller bugs, which are *very* difficult to address without breaking other parts of its code. Possibly the main issue with the current implementation, is that I wrote it quite some time ago, and at the time my understanding of the various edge-cases the code has to deal with was quite limited. Currently `PDFHistory` may, despite most of those cases being fixed, in certain edge-cases lock-up the browser history, essentially preventing the user from navigating back/forward. Hence rather than trying to iterate on `PDFHistory` to make it better, the only viable approach is unfortunately rip it out in its entirety and re-write it from scratch.

…an event or a timeout, whichever occurs first

…tinations, in `pdf_history.js`

This patch completely re-implements `PDFHistory` to get rid of various bugs currently present, and to hopefully make maintenance slightly easier. Most of the interface is similar to the existing one, but it should be somewhat simplified. The new implementation should be more robust against failure, compared to the old one. Previously, it was too easy to end up in a state which basically caused the browser history to lock-up, preventing the user from navigating back/forward. (In the new implementation, the browser history should not be updated rather than breaking if things go wrong.) Given that the code has to deal with various edge-cases, it's still not as simple as I would have liked, but it should now be somewhat easier to deal with. The main source of complication in the code is actually that we allow the user to change the hash of a already loaded document (we'll no longer try to navigate back-and-forth in this case, since the next commit contains a workaround). In the new code, there's also *a lot* more comments (perhaps too many?) to attempt to explain the logic. This is something that the old implementation was serverly lacking, which is a one of the reasons why it was so difficult to maintain. One particular thing to note is that the new code uses the `pagehide` event rather than `beforeunload`, since the latter seems to be a bad idea based on https://bugzilla.mozilla.org/show_bug.cgi?id=1336763.

…viewer is idle This patch attempts to address an issue in the old `PDFHistory` implementation, where the current position wouldn't be correctly saved when the browser was closed. In theory this *should* already be working, however as the discussion in https://bugzilla.mozilla.org/show_bug.cgi?id=1153393 showed, it seems that both `pagehide` and `beforeunload` arrive to late to successfully update the history during closing. Hence a timeout is used to *temporarily* add the current position to the browser history when the viewer is idle. Note that we need to take care not to update the browser history too often, since that would render the viewer more or unusable. Furthermore, if the timeout is *too* long it may end up effectively disable this whole functionality. The `UPDATE_VIEWAREA_TIMEOUT` constant is thus a heuristic value, which we may need to tweak taking the above into account.

…ing a no-op if the document hash contains an invalid/non-existent destination By using the (heuristic) `POSITION_UPDATED_THRESHOLD` constant, we can ensure that the current document position will be added to the browser history when a sufficiently "large" number of `updateviewarea` events have been dispatched.

timvandermeij · 2017-09-03T17:51:51Z

/botio-linux preview

pdfjsbot · 2017-09-03T17:51:52Z

From: Bot.io (Linux m4)

Received

Command cmd_preview from @timvandermeij received. Current queue size: 0

Live output at: http://54.67.70.0:8877/6b6437d30606f31/output.txt

pdfjsbot · 2017-09-03T17:54:13Z

From: Bot.io (Linux m4)

Success

Full output at http://54.67.70.0:8877/6b6437d30606f31/output.txt

Total script time: 2.35 mins

Published

timvandermeij · 2017-09-03T17:54:40Z

/botio unittest

pdfjsbot · 2017-09-03T17:54:41Z

From: Bot.io (Linux m4)

Received

Command cmd_unittest from @timvandermeij received. Current queue size: 0

Live output at: http://54.67.70.0:8877/c3e152ce1da4ff8/output.txt

pdfjsbot · 2017-09-03T17:54:41Z

From: Bot.io (Windows)

Received

Command cmd_unittest from @timvandermeij received. Current queue size: 0

Live output at: http://54.215.176.217:8877/1038d17381090b9/output.txt

pdfjsbot · 2017-09-03T17:57:11Z

From: Bot.io (Linux m4)

Success

Full output at http://54.67.70.0:8877/c3e152ce1da4ff8/output.txt

Total script time: 2.51 mins

Unit Tests: Passed

pdfjsbot · 2017-09-03T18:07:19Z

From: Bot.io (Windows)

Success

Full output at http://54.215.176.217:8877/1038d17381090b9/output.txt

Total script time: 12.63 mins

Unit Tests: Passed

timvandermeij · 2017-09-03T18:39:39Z

Nice work! Looks really good and it's a massive improvement over the old implementation.

Snuffleupagus · 2017-09-04T07:59:51Z

Nice work! Looks really good and it's a massive improvement over the old implementation.

Thanks so much for helping with getting this reviewed (with lots of good points/questions) and landed! :-D

Re-write `PDFHistory` from scratch

Snuffleupagus added the viewer label Aug 13, 2017

mozilla deleted a comment from pdfjsbot Aug 27, 2017

timvandermeij requested changes Aug 27, 2017

View reviewed changes

Snuffleupagus added 6 commits August 30, 2017 19:45

Add a waitOnEventOrTimeout helper function that allows waiting for …

0c49855

…an event or a timeout, whichever occurs first

Add a isDestsEqual helper function, to allow comparing explicit des…

388851e

…tinations, in `pdf_history.js`

mozilla deleted a comment from pdfjsbot Sep 3, 2017

timvandermeij approved these changes Sep 3, 2017

View reviewed changes

timvandermeij merged commit 1c9af00 into mozilla:master Sep 3, 2017

Snuffleupagus deleted the rewrite-PDFHistory-2 branch September 4, 2017 07:59

Snuffleupagus mentioned this pull request Sep 7, 2017

Address a couple of edge-cases in the new PDFHistory implementation (PR 8775 follow-up) #8885

Merged

Snuffleupagus mentioned this pull request Sep 15, 2017

History not functional in presentation mode #7082

Closed

This was referenced Oct 5, 2017

Fix a regression that (effectively) makes PDFHistory.forward a no-op #8994

Merged

Fix a PDFHistory regression with document hashes of the nameddest=... form #9013

Merged

Snuffleupagus mentioned this pull request Oct 29, 2017

Zoom level is not maintained after Back navigation #9080

Closed

movsb pushed a commit to movsb/pdf.js that referenced this pull request Jul 14, 2018

Merge pull request mozilla#8775 from Snuffleupagus/rewrite-PDFHistory-2

5929b05

Re-write `PDFHistory` from scratch

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Re-write `PDFHistory` from scratch #8775

Re-write `PDFHistory` from scratch #8775

Snuffleupagus commented Aug 13, 2017 •

edited

Loading

timvandermeij commented Aug 27, 2017

timvandermeij commented Aug 27, 2017

timvandermeij left a comment •

edited

Loading

timvandermeij Aug 27, 2017

timvandermeij Aug 27, 2017

Snuffleupagus Aug 28, 2017

timvandermeij Aug 27, 2017 •

edited

Loading

Snuffleupagus Aug 28, 2017

timvandermeij Aug 27, 2017

Snuffleupagus Aug 28, 2017 •

edited

Loading

timvandermeij Aug 27, 2017

timvandermeij Aug 27, 2017 •

edited

Loading

Snuffleupagus Aug 28, 2017 •

edited

Loading

timvandermeij Aug 27, 2017

timvandermeij Aug 27, 2017

Snuffleupagus Aug 28, 2017 •

edited

Loading

timvandermeij Aug 27, 2017

timvandermeij Aug 27, 2017

Snuffleupagus Aug 28, 2017 •

edited

Loading

Snuffleupagus commented Aug 28, 2017

timvandermeij commented Aug 28, 2017

timvandermeij commented Sep 3, 2017

pdfjsbot commented Sep 3, 2017

pdfjsbot commented Sep 3, 2017

timvandermeij commented Sep 3, 2017

pdfjsbot commented Sep 3, 2017

pdfjsbot commented Sep 3, 2017

pdfjsbot commented Sep 3, 2017

pdfjsbot commented Sep 3, 2017

timvandermeij commented Sep 3, 2017

Snuffleupagus commented Sep 4, 2017

Re-write PDFHistory from scratch #8775

Re-write PDFHistory from scratch #8775

Conversation

Snuffleupagus commented Aug 13, 2017 • edited Loading

timvandermeij commented Aug 27, 2017

timvandermeij commented Aug 27, 2017

timvandermeij left a comment • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

timvandermeij Aug 27, 2017 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Snuffleupagus Aug 28, 2017 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

timvandermeij Aug 27, 2017 • edited Loading

Choose a reason for hiding this comment

Snuffleupagus Aug 28, 2017 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Snuffleupagus Aug 28, 2017 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Snuffleupagus Aug 28, 2017 • edited Loading

Choose a reason for hiding this comment

Snuffleupagus commented Aug 28, 2017

timvandermeij commented Aug 28, 2017

timvandermeij commented Sep 3, 2017

pdfjsbot commented Sep 3, 2017

From: Bot.io (Linux m4)

Received

pdfjsbot commented Sep 3, 2017

From: Bot.io (Linux m4)

Success

Published

timvandermeij commented Sep 3, 2017

pdfjsbot commented Sep 3, 2017

From: Bot.io (Linux m4)

Received

pdfjsbot commented Sep 3, 2017

From: Bot.io (Windows)

Received

pdfjsbot commented Sep 3, 2017

From: Bot.io (Linux m4)

Success

pdfjsbot commented Sep 3, 2017

From: Bot.io (Windows)

Success

timvandermeij commented Sep 3, 2017

Snuffleupagus commented Sep 4, 2017

Re-write `PDFHistory` from scratch #8775

Re-write `PDFHistory` from scratch #8775

Snuffleupagus commented Aug 13, 2017 •

edited

Loading

timvandermeij left a comment •

edited

Loading

timvandermeij Aug 27, 2017 •

edited

Loading

Snuffleupagus Aug 28, 2017 •

edited

Loading

timvandermeij Aug 27, 2017 •

edited

Loading

Snuffleupagus Aug 28, 2017 •

edited

Loading

Snuffleupagus Aug 28, 2017 •

edited

Loading

Snuffleupagus Aug 28, 2017 •

edited

Loading