Do Discovery and Propagation in parallel. #7589
Comments
Is it possible to dynamically change not only the current value but also the target value (i.e. the 100% reference value)? If this is possible, then you have a fully dynamic progress bar, fed by two dynamic streams: the number of items processed so far, and the target number, which changes as the gathering runs in parallel. If this is clearly communicated, it would not be confusing.
Yes, that's possible, but it means that the progress might actually go backward.
Maybe worth a try, if users can see that the total number of discovered files is also increasing.
This looks sensible to me. The progress indication may be a reason for keeping discovery and propagation separate, even if propagation would now start before discovery finishes. That way, progress indication is possible as soon as discovery finishes, exactly as it is right now.
Right now, the sync algorithm first does a discovery step that finds all changes across the whole sync folder, and only then propagates files.
As a result, it can take a very long time before the first file even starts to download or upload, and restarting the sync has to redo the discovery step from scratch.
To solve this, we could start the propagation (downloading/uploading of files) for a folder as soon as that folder has been discovered.
How:
There might be different approaches, but I guess the easiest would be to merge the discovery's ProcessDirectoryJob and the PropagateDirectory into a single operation. They could also be kept separate and still run in parallel, but then it might be harder to synchronize them.
Note that in any case, all the removes would still be done at the end.
Currently, only the removal of directories is done at the end; under the new approach, we would also have to perform the file removals at the end.
Problems:
Some features will stop working. Notably:
Non-problems
Something that will not be a problem is the detection of moves/rename. We do not need to have the whole sync tree in memory for that. The new discovery algorithm already does move detection quite well without that.
Alternative
We may choose not to do that at all and cache the result of the discovery in a new table: #2976
This would solve the problem that restarting the first sync before discovery is finished restarts from zero.
However, this will not solve the memory-usage problem, which the first approach does solve, and it would still (I think) be slower than the parallelization.