Never ever abort sync runs / check that the connection is indeed broken before aborting on a timeout #5859

michaelstingl · 2017-06-26T11:59:35Z

Some server side errors cause the client to abort the current sync run. If the ownCloud admin or user is not able to fix the error, EVERY sync run will abort at the same position, and following files that don't have an error will NEVER sync because of this.

@SamuAlfageme We discussed this a while ago. Please link existing issues here, or open new ones, that could cause this behaviour. Please also elaborate on the server improvements.

guruz · 2017-06-26T12:16:22Z

Please report only if this happens with the latest client version, as this behavior has changed back and forth a lot :-)
(CC @ckamm @ogoffart )

SamuAlfageme · 2017-06-26T12:31:59Z

WIP

Sync aborted - on server/network error

- Error from @cdamken's setup with external storages replying "connection closed" and "connection terminated", and the blacklist. Related to "Check if the file exists before trying to download it" core#28598
- The server (acting as a "client" of the external storage services) should return 403/404 on requests to unauthorized/unavailable storages. See "Consider unreadable remote folders (e.g. negative ACLs)" #4622
- If a file download fails with 503, don't abort the whole sync #5088

Sync aborted - on client/protocol error

csync errors - [macOS] csync errors do not appear in new error view and block the sync process #5888

Related Fixes

Possibility to deselect a folder (from the selective sync list) with errors that cause sync aborts - Selective sync: Skip excluded folders when reading db #5772

ckamm · 2017-06-28T09:31:15Z

Incomplete list of potential sync run aborts during the propagation stage, found by looking at FatalError in the code:

- can't write to database
- network error > NoError and <= UnknownProxyError
- network error 503
- timeout on download
- critical disk space error on download

ogoffart · 2017-07-21T06:47:33Z

I believe the current behavior is the correct one: We only abort the sync run if we experience a fatal error (network down, server in maintenance, ...)

@michaelstingl Do you have any concrete example of error that causes the sync to abort while it should not?

cdamken · 2017-08-01T15:05:56Z

> I believe the current behavior is the correct one: We only abort the sync run if we experience a fatal error (network down, the server in maintenance, ...)

When a user has shares, external storages, or federated shares, or there is some inconsistency in oc_filecache because the database has a problem (which in some cases is fixed by running occ files:scan for the user, the user that shares, etc.), the sync client just stops at the error and does not sync the new files and folders (even if they are in a different path).

IMO, the expected behavior is that the sync client marks those files, tries to upload the new files (because those files are probably needed by someone else), and at the end sends a summary of the main errors.

Like:
- files x, y, z in folder w could not be downloaded because the server reported: connection timeout. -> Contact your administrator.
- files a, b, c could not be uploaded because chunks weren't received correctly; we will try to upload them later. -> If the problem persists, please contact your administrator.
- files m, n, o were successfully uploaded. -> Synced after the errors occurred.
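
The summary idea above could be sketched roughly as follows. This is an illustration in Python (the desktop client itself is C++), and every name in it (`SyncReport`, `run_sync`, `transfer`) is hypothetical and not taken from the client code. It only shows the shape of "record the per-file error, keep going, report at the end".

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class SyncReport:
    """Collects per-file outcomes instead of aborting on the first error (hypothetical)."""
    succeeded: list = field(default_factory=list)
    # Maps (direction, reason) -> list of affected paths.
    failed: dict = field(default_factory=lambda: defaultdict(list))

    def record_success(self, path):
        self.succeeded.append(path)

    def record_failure(self, path, direction, reason):
        self.failed[(direction, reason)].append(path)

    def summary(self):
        """Return one human-readable line per error group, then the successes."""
        lines = []
        for (direction, reason), paths in self.failed.items():
            lines.append(
                f"files {', '.join(paths)} could not be {direction}ed "
                f"because the server reported: {reason}"
            )
        if self.succeeded:
            lines.append(f"files {', '.join(self.succeeded)} were successfully synced")
        return lines


def run_sync(items, transfer):
    """Try every item; a failing item is recorded and the loop continues.

    `items` is an iterable of (path, direction) pairs and `transfer` is a
    callable that raises on failure. Both are assumptions of this sketch.
    """
    report = SyncReport()
    for path, direction in items:
        try:
            transfer(path, direction)
            report.record_success(path)
        except Exception as exc:  # broad on purpose: per-file errors are non-fatal here
            report.record_failure(path, direction, str(exc))
    return report
```

Whether a given error should really be treated as per-file rather than fatal is a separate question, and this sketch does not attempt to answer it.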

ogoffart · 2017-08-01T15:42:02Z

> files x, y, z in folder w could not be downloaded because the server reported: connection timeout. -> Contact your administrator

The problem is that "Connection timeout" is also the error we get when the network is down.
Maybe when we get a "connection timeout", we should run the connection validator once more and only report a FatalError if the connection is indeed broken. Otherwise, mark it as a NormalError. That's some work and it may not play well with the parallelism, but that's an option.
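
A minimal sketch of that decision, written in Python purely for illustration. `connection_is_alive` is a hypothetical stand-in for re-running the connection validator, and the enum only mirrors the FatalError/NormalError names used in this thread; none of this is the client's actual API.

```python
from enum import Enum


class ErrorKind(Enum):
    NORMAL = "NormalError"  # only this item failed; keep syncing the rest
    FATAL = "FatalError"    # the connection is gone; abort the sync run


def classify_timeout(connection_is_alive):
    """Decide how to treat a per-file connection timeout.

    `connection_is_alive` is a hypothetical zero-argument callable that
    re-checks the server connection and returns True or False.
    """
    try:
        alive = connection_is_alive()
    except OSError:
        # The validation request itself failed: treat the network as down.
        alive = False
    return ErrorKind.NORMAL if alive else ErrorKind.FATAL
```

With several jobs timing out in parallel, one would probably want them to share a single validation result rather than each re-validating on its own. That part is left out here.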

cdamken · 2017-08-01T16:18:13Z

The client gets the error that is delivered by the ~~client~~ server, but as you said, it is not necessarily disconnected; the error may apply only to some files or folders that have problems finishing the download/upload.

michaelstingl · 2017-08-04T18:57:18Z

@SamuAlfageme Should we add #4622 (comment) to this meta-issue?

The current user experience is bad: in this case the client shows a 403 error and aborts the current sync run. Then it starts a new run, which shows 403 and aborts. And so on.

michaelstingl · 2017-09-25T06:31:56Z

@SamuAlfageme ~~Another one~~ More: #6049 #6050 – could you verify?

michaelstingl · 2017-09-25T06:37:00Z

@guruz my initial focus wasn't about client side problems like flaky wifi, more about error states on the server side that abort the sync run every time at the same position, so that later files will never be synced (like in @cdamken 's example)

At ownCloud Conference 2017 a user even proposed to randomize the order to get a bigger chance that later files get synced, but of course I would prefer to fix the root cause of the problem.

SamuAlfageme · 2017-09-25T06:55:46Z

@michaelstingl

> @SamuAlfageme ~~Another one~~ More: #6049 #6050 – could you verify?

Those are not sync aborts but crashes. Will look into them to see what introduced 'em.

michaelstingl · 2017-09-25T07:20:11Z

> Those are not sync aborts but crashes.

The pain is the same: Later files will never get synced. The oC desktop sync client needs better strategies to deal with oC server or infrastructure fuckups.

guruz · 2017-09-25T09:26:18Z

> At ownCloud Conference 2017 a user even proposed to randomize the order to get a bigger chance that later files get synced

This is something for @mrow4a ;-) SCNR

> The oC desktop sync client needs better strategies to deal with oC server or infrastructure fuckups.

But if the local storage or the network connection go away then there is only so much you can do.. :-)

In general we all agree with the goal of making the discovery less costly and therefore getting to the actual sync earlier. 2.4 should already be better at this.

michaelstingl · 2017-09-25T10:56:52Z

> But if the local storage or the network connection go away then there is only so much you can do.. :-)

@guruz In the cases above it's not about local storage and network connection. It's about mounted storages that the oC server can't access temporarily or permanently. My expectation is that the oC client continues syncing with the next available file/folder. And it seems we have enough cases where this doesn't happen.

ogoffart · 2017-09-26T07:02:25Z

The solution is what I wrote in #5859 (comment)

When a propagator job ends in a connection timeout, we assume it is because the network is down and we abort the sync. But before aborting the sync, we should run the connection validator to check whether the network is down or whether it is a problem with that file.

mrow4a · 2017-09-26T08:07:27Z

@ogoffart While it makes sense for a single file, when a whole folder becomes unavailable it is a signal that something is wrong with the server. It could be a flaky network drive (the user should manually remove it in the server UI, and the client should halt for that period so that files are not moved across folders and the state is not materialized on a faulty server) or a federated share (which might be even worse, because multiple users make changes while the parent node of the folder is down or faulty). What do you think?

michaelstingl · 2018-04-09T07:06:15Z

Looks like another one: #6435

SamuAlfageme · 2018-06-01T06:58:21Z

HTTP Transmission errors: Crash fixes and valgrind error fix #6563 (comment) - individual bogus (missing essential csync attributes like ETag) files/folders can abort/block the whole sync progress.

michaelstingl · 2018-10-18T20:09:07Z

Another candidate: #6826

Only do it when it is actually a maintenance mode Issues #5088, #5859, https://github.com/owncloud/enterprise/issues/2637

HanaGemela · 2019-06-03T12:14:14Z

@ogoffart @michaelstingl what exactly is ready to test here? Only the abort on error 503?

ogoffart · 2019-06-04T07:49:52Z

Right, when the server returns an error 503 for a file (but the server is not in maintenance mode), it should still sync all other files.
I don't know exactly how to get error 503 on the server; this is usually because of a problem in the storage backend.
However, when the server is in maintenance mode, the error is also 503, but in this case the sync needs to stop (because otherwise all the files would be 503).

Maybe we can just close this issue.
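
The 503 rule described in this comment can be sketched as below. This is Python for illustration only, not the actual propagator code. `fetch` and `server_in_maintenance` are hypothetical callables, and how the client really detects maintenance mode is not shown in this thread, so it is reduced to a boolean here.

```python
def sync_all(paths, fetch, server_in_maintenance):
    """Sync each path; a 503 stops the run only if the server is in maintenance.

    `fetch(path)` is assumed to return an HTTP status code, and
    `server_in_maintenance()` is assumed to return True or False.
    For brevity, any status other than 503 counts as success.
    Returns a tuple (synced, skipped, aborted).
    """
    synced, skipped = [], []
    for path in paths:
        status = fetch(path)
        if status == 503:
            if server_in_maintenance():
                # Every further request would also be 503: stop the run.
                return synced, skipped, True
            # Likely a storage backend problem for this file only: move on.
            skipped.append(path)
        else:
            synced.append(path)
    return synced, skipped, False
```

The maintenance check is only made after an actual 503, so a healthy run pays nothing extra for it.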

HanaGemela · 2019-07-30T09:13:21Z

Closing this aggregating issue. All individual issues will be/were tested separately. 503 error will be tested in #5088

michaelstingl · 2019-07-30T10:42:02Z

> Closing this aggregating issue. All individual issues will be/were tested separately.

👍 😍

michaelstingl added Discussion research Server Involved sev2-high labels Jun 26, 2017

michaelstingl assigned SamuAlfageme Jun 26, 2017

michaelstingl added blue-ticket p2-high labels Jun 26, 2017

guruz added this to the 2.4.0 milestone Jun 26, 2017

SamuAlfageme mentioned this issue Jul 11, 2017

[macOS] csync errors do not appear in new error view and block the sync process #5888

Closed

michaelstingl added technical debt and removed blue-ticket labels Jul 17, 2017

michaelstingl modified the milestones: 2.4.0, 2.5.0 Jul 28, 2017

guruz changed the title ~~Never ever abort sync runs~~ Never ever abort sync runs / be resilient to flaky network connection or wifi Aug 3, 2017

ogoffart changed the title ~~Never ever abort sync runs / be resilient to flaky network connection or wifi~~ Never ever abort sync runs / check that the connection is indeed broken before aborting on a timeout Sep 26, 2017

SamuAlfageme mentioned this issue Dec 5, 2017

Server replied: Locked and client didn't notified the user #5500

Closed

michaelstingl mentioned this issue Jan 18, 2018

Excludes: Further optimize also patterns with slash and add trailing slash to some of the items in existing sync-exclude.lst #5017

Closed

guruz modified the milestones: 2.5.0, 2.6.0 Mar 27, 2018

felixboehm added p3-medium Normal priority and removed sev2-high p3-medium Normal priority labels May 3, 2018

michaelstingl mentioned this issue May 14, 2018

Connection closed when unicode is present in filename #6516

Closed

SamuAlfageme mentioned this issue May 31, 2018

Crash fixes and valgrind error fix #6563

Merged

SamuAlfageme mentioned this issue Jun 4, 2018

Segfault when clicking on unconfigured SFTP server #6562

Closed

michaelstingl mentioned this issue Oct 18, 2018

PROPFIND timeout: show folder with error, and don't abort sync run #6826

Closed

ogoffart added a commit that referenced this issue Nov 29, 2018

Propagator: Don't abort sync on error 503

1003948

Only do it when it is actually a maintenance mode Issues #5088, #5859, https://github.com/owncloud/enterprise/issues/2637

ogoffart mentioned this issue Nov 29, 2018

Propagator: Don't abort sync on error 503 #6906

Merged

ogoffart added a commit that referenced this issue Nov 29, 2018

Propagator: Don't abort sync on error 503

39331e9

Only do it when it is actually a maintenance mode Issues #5088, #5859, https://github.com/owncloud/enterprise/issues/2637

ogoffart added a commit that referenced this issue Dec 17, 2018

Propagator: Don't abort sync on error 503

8fac2bf

Only do it when it is actually a maintenance mode Issues #5088, #5859, https://github.com/owncloud/enterprise/issues/2637

ogoffart added the ReadyToTest QA, please validate the fix/enhancement label Dec 17, 2018

michaelstingl mentioned this issue May 22, 2019

One broken folder stops the entire sync #7199

Closed

michaelstingl added p3-medium Normal priority and removed p2-high labels May 29, 2019

HanaGemela closed this as completed Jul 30, 2019
