After sync aborts, resume in a more graceful manner #5414

Open
mvglasow opened this issue Dec 22, 2016 · 4 comments

Comments

@mvglasow

mvglasow commented Dec 22, 2016

When sync aborts, e.g. due to a flaky network connection or an overloaded server, discovery (of changed items) seems to start all over again.

When syncing large amounts of data in a setup that is experiencing this issue, this behavior adds a huge overhead (each abort/resume causes a full discovery even though very little, if anything at all, may have changed since the last discovery).

Suggestion:

  • When sync aborts during discovery and is resumed, do not discard previous discovery results but resume where we left off.
  • When sync aborts during the actual transfer and is resumed, continue with transfer based on old sync results.
  • Optionally, do another discovery cycle after an aborted and resumed sync.
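A minimal sketch of the first suggestion (resuming discovery where it left off), in Python for brevity. It is not taken from the client's code: `DiscoveryCheckpoint`, `SyncAborted`, `run_discovery` and the `list_dir` callable are hypothetical names used only for illustration. The idea is that a directory is removed from the pending list only after its listing succeeded, so an abort leaves earlier results intact.

```python
from dataclasses import dataclass, field


class SyncAborted(Exception):
    """Raised by the (hypothetical) lister when the connection drops."""


@dataclass
class DiscoveryCheckpoint:
    # Directories still to be scanned, and the results collected so far.
    pending: list = field(default_factory=list)
    results: dict = field(default_factory=dict)


def run_discovery(checkpoint, list_dir):
    """Scan pending directories, keeping progress in `checkpoint` on abort.

    `list_dir(path)` is an assumed callable returning (subdirs, files);
    it may raise SyncAborted at any time.
    """
    while checkpoint.pending:
        path = checkpoint.pending[-1]
        subdirs, files = list_dir(path)  # may raise SyncAborted
        # Commit progress only once the listing for this directory succeeded.
        checkpoint.pending.pop()
        checkpoint.results[path] = files
        checkpoint.pending.extend(subdirs)
    return checkpoint.results
```

After an abort, calling `run_discovery` again with the same checkpoint object rescans only the directory that failed and those not yet visited. A checkpoint that should survive a client restart would additionally need to be written to disk, which this sketch leaves out.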


@dragotin
Contributor

dragotin commented Jan 3, 2017

I do not think that this will work. If a discovery run breaks and later continues, how would the system know how long this "later" was, and whether the file tree has changed in the meantime? That is a problem that cannot be solved.

One way to approach this kind of problem is to speed up the discovery phase, as discussed in #4117 for example.

And the best option probably would be to constantly keep the file tree in memory and only apply changes through the local file watcher or the remote ETag changes and process the differences from there. But that is probably far out. @ogoffart @ckamm
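A rough sketch of the remote half of that idea, in Python for brevity. It assumes that a directory's ETag changes whenever anything beneath it changes, so an unchanged ETag lets a whole subtree be skipped. The `refresh` function and the `remote` mapping (standing in for server requests) are hypothetical; the local file watcher side is not shown.

```python
def refresh(cache, remote, path, changed=None):
    """Update the in-memory `cache` and return the paths whose ETag differed.

    `cache` maps path -> last known ETag.
    `remote` maps path -> (etag, [child paths]) and stands in for the server.
    """
    if changed is None:
        changed = []
    etag, children = remote[path]
    if cache.get(path) == etag:
        return changed  # unchanged subtree: skip it entirely
    cache[path] = etag
    changed.append(path)
    for child in children:
        refresh(cache, remote, child, changed)
    return changed
```

The first call on an empty cache visits everything; later calls only descend into directories whose ETag differs from the cached one.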

@mvglasow
Author

mvglasow commented Jan 6, 2017

As I understand it, the issue of files changing after discovery would also arise during a discovery run that just takes a long time to complete:

  • Discovery starts
  • Fairly soon after, file foo is scanned and found unchanged.
  • While discovery is still in progress, someone changes foo on the client or on the server.

Would the change to foo be picked up? If not, resuming an aborted sync does not really introduce any new issues.

Speeding up discovery through parallelization, as you mention, might mitigate this issue but probably wouldn't eliminate it.

As for how long “later” would be: that is always going to be an arbitrary decision. I’d resume rather than start over if the interruption was shorter than the sync run.
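To make that rule of thumb concrete, here is a small Python sketch. It assumes "duration of the sync run" means the time the aborted run had been going before it stopped; the function name and its timestamp parameters (seconds) are made up for illustration.

```python
def should_resume(run_started, aborted_at, resumed_at):
    """Resume if the interruption was shorter than the aborted run itself."""
    run_duration = aborted_at - run_started
    interruption = resumed_at - aborted_at
    return interruption < run_duration
```

A run that worked for ten minutes and was interrupted for one minute would resume; a run that aborted after one minute and sat idle for an hour would start over.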

If you want to rework the whole sync logic, the cleanest way is probably to keep a journal of changes on the server and on each client, and remember the last journal entry that has been synced.

  • Periodically check both journals and trigger sync if there are new changes.
  • When the client comes up, trigger a full scan. Keep a local database of mirrored files and their attributes (timestamp and hash value at a minimum), and when one of these attributes is found to have changed, update the database and create a journal entry.
  • Similar logic as on the client might be needed for external storage on the server.
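A minimal Python sketch of the journal idea, under stated assumptions: the `Journal` class, the `scan` function and the in-memory `database` dict are hypothetical stand-ins, and file contents are passed in directly instead of being read from disk. Deletions are left out to keep it short.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Journal:
    entries: list = field(default_factory=list)

    def append(self, path, change):
        self.entries.append((path, change))
        return len(self.entries)  # sequence number of the new entry

    def since(self, last_synced):
        """Entries recorded after sequence number `last_synced`."""
        return self.entries[last_synced:]


def scan(files, database, journal):
    """Compare current files with the database and journal any differences.

    `files` maps path -> (mtime, content bytes).
    `database` maps path -> (mtime, sha256 hex digest) from the last scan.
    """
    for path, (mtime, content) in files.items():
        attrs = (mtime, hashlib.sha256(content).hexdigest())
        if database.get(path) != attrs:
            change = "modified" if path in database else "created"
            database[path] = attrs
            journal.append(path, change)
```

A sync run would then process `journal.since(last_synced)` and store the new sequence number once it finishes, so an aborted run can pick up from the last entry it had recorded as synced.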

@mvglasow
Author

mvglasow commented Jan 8, 2017

Update: apparently the “change files during sync” scenario has issues of that very same kind, which are potentially even more severe (modifications made during sync get reset), see #5437. That needs to be tackled with priority, but maybe this one can be considered in the solution.

@guruz
Contributor

guruz commented Apr 5, 2017

Would be improved by #2976

@fmoc fmoc added the Stale label Mar 23, 2022
5 participants