
oC rolls back local changes made while discovery/sync was in progress #5437

Closed · mvglasow opened this issue Jan 8, 2017 · 4 comments


mvglasow commented Jan 8, 2017

I have a ~23 GB folder on the server and synced it with the client. At some point, I started reorganizing things inside (mostly moving files to a new location). It took me some time, and sync started while I was still at it.

Expected behaviour

I would ultimately expect all local changes to be propagated to the server. I can live with local changes made during a sync run not getting propagated in the current run but in the next one. In any case, if it is found during sync that some items have changed since discovery, handle that in a graceful manner (e.g. re-run partial discovery on those items, or skip them for this pass and schedule another sync run right after).
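To make the suggestion concrete, here is a minimal sketch of the skip-and-reschedule idea in Python. The `DiscoveredItem` structure and the `upload()` call are assumptions for illustration, not the client's actual internals:

```python
import os
from dataclasses import dataclass

@dataclass
class DiscoveredItem:
    """Snapshot of a file as recorded during discovery (hypothetical structure)."""
    path: str
    size: int
    mtime: float

def upload(item: DiscoveredItem) -> None:
    """Hypothetical stand-in for the actual transfer to the server."""

def propagate(items: list[DiscoveredItem]) -> list[DiscoveredItem]:
    """Propagate changes, skipping anything that changed after discovery.

    Skipped items are returned so the caller can schedule another sync
    run right after, instead of acting on stale discovery data.
    """
    needs_resync = []
    for item in items:
        try:
            st = os.stat(item.path)
        except FileNotFoundError:
            # Moved or deleted since discovery: re-discover, don't restore it.
            needs_resync.append(item)
            continue
        if st.st_size != item.size or st.st_mtime != item.mtime:
            # Changed since discovery: skip it for this pass.
            needs_resync.append(item)
            continue
        upload(item)
    return needs_resync
```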

Actual behaviour

As I was moving files between folders while sync was in progress, I noticed files magically reappearing in their old location just after I had moved them away, giving me two copies of each file: one in the old and one in the new location. I have not spotted any files getting lost in transit (deleted from the new location but not restored in the old one); I cannot rule that out, though, and it would need to be tested as well.

Steps to reproduce

  1. Sync a large folder.
  2. Make some changes (preferably large ones so discovery and sync will take a while).
  3. Wait for discovery to begin.
  4. Move some of the local files to a different folder (still within the local ownCloud mirror folder).

Note: as for step 3, you might actually need to try three different scenarios (a rough driver for the moves in step 4 follows after this list):

  • during discovery
  • during sync
  • during sync of the folder you’re moving the files out of
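For step 4, a script like the following can pace the moves so they overlap with whichever phase is under test. The folder paths and the one-second delay are assumptions to adjust:

```python
import shutil
import time
from pathlib import Path

SYNC_ROOT = Path.home() / "ownCloud"   # assumed local mirror folder
SRC = SYNC_ROOT / "old-location"       # hypothetical source folder
DST = SYNC_ROOT / "new-location"       # hypothetical target folder

DST.mkdir(exist_ok=True)
for f in sorted(SRC.iterdir()):
    if f.is_file():
        shutil.move(str(f), str(DST / f.name))
        time.sleep(1)  # pace the moves so they overlap with discovery/sync
```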

Server configuration

Operating system: Raspbian Jessie

Web server: Apache

Database: MySQL

PHP version: PHP 5.6.29-0+deb8u1

ownCloud version: 9.1.3

Storage backend (external storage): CIFS (Samba on same server)

Client configuration

Client version: 2.1.1+dfsg-1ubuntu1.1

Operating system: Linux (Ubuntu MATE 16.04)

OS language: English

Installation path of client: N/A (package default)

Logs

Several hours of heavy usage have passed since the error occurred, thus the entries in question may be buried deep in the logs. Let me know if you need logs for this one.

@mvglasow (Author) commented

I investigated the issue a bit further: I made a few more attempts, this time carefully pausing sync before making changes and resuming it after completing them. Still, a lot of changes got rolled back, even when no sync run was in progress while the changes were being made.

I also had a couple of sync attempts that aborted when oC reported an Internal Server Error.

I compared the current state of the folder (accessing the Samba share directly, without going through oC) against an earlier backup. The move operations reverted during sync were either fully reverted or resulted in duplicates, but no files were lost. Some of the duplicates shared timestamps with the original files; in other cases the file in the new location carried the timestamp of the sync operation rather than that of the original file.

I did, however, find six files which were corrupted during the operation and are now zero bytes in size. Presumably synchronization failed just as these files were being copied.
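To spot this kind of damage, a quick scan for empty files under the sync root is enough; the path is an assumption:

```python
from pathlib import Path

SYNC_ROOT = Path.home() / "ownCloud"  # assumed local mirror folder

# List files that ended up zero bytes in size, as described above.
for f in SYNC_ROOT.rglob("*"):
    if f.is_file() and f.stat().st_size == 0:
        print(f)
```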

Another thing I noticed is that sync would transfer certain files again on each pass, even though these files had not been touched.

Bottom line: the algorithm used for change detection is far from reliable, to the extent that I currently don't want to entrust my data to it. The fact that sync runs occasionally abort is, in my opinion, a secondary issue, as we can never fully prevent that from happening (think power outages or dropped network connections). But sync needs to become more resilient and deal gracefully with this situation.

In particular, when sync fails while a file is just being transferred, it must be ensured that the partial transfer is not treated as a legitimate change (resulting in the corrupt file being propagated to all clients). Ideally, the server would roll the change back upon detecting such a condition, and the client would replay the update on the next sync run.
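One way to implement this safeguard is to checksum the file before the transfer and have the server verify that checksum before committing the upload. In this sketch, `upload_to_staging()` and `verify_and_commit()` are hypothetical server-side operations, not existing oC APIs:

```python
import hashlib

def sha1_of(path: str, chunk_size: int = 1 << 20) -> str:
    """Checksum the local file before the transfer starts."""
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def upload_to_staging(path: str) -> str:
    """Hypothetical: stream the file into a temporary area on the server."""
    return "staging-id"

def verify_and_commit(staging_id: str, expected_sha1: str) -> bool:
    """Hypothetical: server compares its own checksum before committing."""
    return False

def safe_upload(path: str) -> None:
    expected = sha1_of(path)
    staging_id = upload_to_staging(path)
    if not verify_and_commit(staging_id, expected):
        # Truncated or corrupted transfer: the staged copy is discarded and
        # the next sync run replays the update; the partial file is never
        # treated as a legitimate change.
        raise RuntimeError(f"partial transfer detected for {path}; retrying next run")
```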

Mandatory test cases for sync should include:

  1. Change a large file, wait for it to start syncing, then interrupt the network link (e.g. pull the cable). Then restore the link, repeat sync, and ensure the server has the latest version of the file and that it is intact.
  2. Change a large file, wait for it to start syncing, then kill the client process the hard way (kill -9). Restart the client, repeat sync, and ensure the server has the latest version of the file and that it is intact.

In the case of 2., further tests may be appropriate to rule out that corruption of the client's internal data structures affects proper change detection.
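A pytest-style sketch of test case 1 might look as follows; everything below the "stand-ins" marker is a stub for whatever the real harness provides, so only the shape of the check is meaningful:

```python
import hashlib
import os

def checksum(path) -> str:
    h = hashlib.sha1()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

# --- stand-ins for the real test harness (all hypothetical) ---
def start_sync(): ...
def interrupt_network(): ...
def restore_network(): ...
def wait_for_sync_to_finish(): ...
def server_checksum(name: str) -> str: ...
# ---------------------------------------------------------------

def test_interrupted_transfer_leaves_server_intact(tmp_path):
    big = tmp_path / "big.bin"
    big.write_bytes(os.urandom(50 * 1024 * 1024))  # a "large" file
    expected = checksum(big)

    start_sync()
    interrupt_network()         # pull the cable mid-transfer
    restore_network()
    wait_for_sync_to_finish()   # repeat sync once the link is back

    # The server must end up with the latest, intact version of the file.
    assert server_checksum("big.bin") == expected
```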

This probably merits the labels [bug] [server-involved] and a severity of 2 or higher.

@SamuAlfageme (Contributor) commented

@mvglasow thanks for the detailed scenario! I'll try to reproduce and extend these since, indeed, file corruption would be critical.

> Bottom line: the algorithm used for change detection is far from reliable

As @guruz said in #2621 (comment):

> the sync algorithm is written with "everything can fail/break any time" in mind

So let's try to reliably track down what's causing the problem here. If you can get some logs from your scenarios, that would be extremely helpful.

@ogoffart (Contributor) commented

> Client version: 2.1.1+dfsg-1ubuntu1.1

That's pretty old, and I'm sure there have been improvements since then.

@guruz (Contributor) commented Jan 13, 2017

Indeed, please use a current version and re-open if it happens again.

@guruz closed this as completed Jan 13, 2017