
If a file download fails with 503, don't abort the whole sync #5088

Closed
guruz opened this issue Jul 28, 2016 · 28 comments
Labels: Discussion, ReadyToTest (QA, please validate the fix/enhancement), type:bug

Comments

@guruz
Contributor

guruz commented Jul 28, 2016

The failing file's download fails with 503 Service Unavailable, and the two other concurrently syncing files get Operation Cancelled.

We could actually continue to sync the other 2 files.

HOWEVER, what would this mean if all the other files (more than the 3 currently syncing) also lead to a 503? We probably wouldn't want to end up in an endless 503 sync.
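
A minimal sketch of that idea (hypothetical names, not the client's actual code): treat each per-file 503 as a soft error and keep propagating the other files, but abort the run once a per-sync error budget is exhausted, so an all-503 server can't produce an endless sync:

#include <cstdio>

// Hypothetical per-sync-run error budget: individual 503s are soft errors
// until too many of them accumulate, then the whole run aborts.
class ServiceUnavailableBudget {
public:
    explicit ServiceUnavailableBudget(int maxErrors) : m_max(maxErrors) {}

    // Call once for every download that failed with 503.
    // Returns true when the sync run should now be aborted.
    bool recordError() { return ++m_count >= m_max; }

private:
    int m_max;
    int m_count = 0;
};

int main()
{
    ServiceUnavailableBudget budget(3); // tolerates up to two 503s per run
    for (int file = 0; file < 10; ++file) {
        const bool got503 = true; // pretend every download fails with 503
        if (got503 && budget.recordError()) {
            std::printf("aborting sync: too many 503s\n");
            return 1;
        }
        // otherwise: mark this file as errored and continue with the next
    }
    return 0;
}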

@guruz guruz added this to the 2.3.0 milestone Jul 28, 2016
@dragotin
Contributor

Where does the 503 come from?

The defined behavior for 503 is: if the 503 is returned for a PROPFIND on the root, it means the server is in maintenance, and the whole sync goes offline.

If the 503 is returned for an operation on an individual file or (more important) folder, it means that the folder is not currently available, and it is IGNORED. That behavior is important for external storages that can go away because of their own problems. The server returns 503 for them in that case, to prevent the client from removing all of their contents in the directory.
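
Roughly, that rule could be expressed like this (a sketch with made-up names, not the actual client code):

#include <string>

enum class Http503Action {
    GoOffline,    // 503 on a PROPFIND of the sync root: server maintenance
    IgnoreFolder, // 503 on an individual file/folder: storage is away
};

// Hypothetical classifier for the rule described above: only a 503 on the
// root PROPFIND means maintenance; any other 503 means "leave that subtree
// alone" -- and in particular, never delete its local contents.
Http503Action classify503(const std::string &method, const std::string &path)
{
    if (method == "PROPFIND" && (path == "/" || path.empty()))
        return Http503Action::GoOffline;
    return Http503Action::IgnoreFolder;
}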

@guruz
Contributor Author

guruz commented Jul 28, 2016

@dragotin See bug subject, file download :-)

@guruz guruz removed this from the 2.3.0 milestone Aug 24, 2016
@guruz
Contributor Author

guruz commented Sep 26, 2016

Related #5187

@moscicki
Contributor

Yes, we also see this and it affects our production services. One GET 503 will block the entire synchronization for all unrelated files as well.

@guruz
Contributor Author

guruz commented Nov 16, 2018

@moscicki (semi-related) How long do your files usually stay in the 503 state?

@guruz guruz added the type:bug label Nov 16, 2018
@guruz guruz added this to the 2.6.0 milestone Nov 16, 2018
@ckamm
Contributor

ckamm commented Nov 21, 2018

Mmh, looks like the client needs more careful backoff then. The other request around 503s was "please give the server some breathing room when you see it" (for the overload or maintenance case). I guess we could keep going after the first one and only abort when a certain number of errors have accumulated.

@moscicki If I understand you correctly, you're using 503s at the resource level and they can be persistent? Is this related to 503 being overloaded with the "503 storage unavailable" meaning?

@guruz
Contributor Author

guruz commented Nov 21, 2018

> I guess we could keep going after the first one and only abort when a certain number of errors have accumulated.

But what's the number?
Indeed, maybe it would be better for @moscicki to mark the whole directory as "503 storage unavailable", as this won't abort the sync?

@moscicki
Contributor

moscicki commented Nov 21, 2018 via email

@guruz
Contributor Author

guruz commented Nov 22, 2018

@butonic @DeepDiver1975 @PVince81 @ogoffart Didn't we somewhere say we could support sending an (optional) Retry-After header on a 503 GET, in case we think the file will come back up soon and the whole sync should not be aborted?

@PVince81
Contributor

AFAIK the server returns 503 either when the whole server is unavailable ("Service not available") or when the underlying external storage is temporarily unavailable ("Storage not available").

We have some logic that detects "unknown errors" in the storage and maps them to "Storage not available", so it is possible that in some specific, yet undiscovered scenarios, legitimate storage-specific errors get mapped there even though they should actually be mapped to other errors.

As for "Retry-After", we could add that in the StorageNotAvailableException on Sabre level.

@guruz
Contributor Author

guruz commented Nov 23, 2018

@moscicki Would this be OK for you? CERNBox could send Retry-After, then we treat the 503 differently?

@moscicki
Contributor

So if I understand correctly: if a file GET returns 503 + Retry-After, the sync would not be aborted, and the GET would be retried after the specified amount of time? Without a Retry-After header, the sync would be aborted as it is now.

Is this interpretation correct?
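
On the client side, the flow described here could look roughly like this (a sketch assuming a Qt network stack like the client's; the outcome enum and the rescheduling are made up):

#include <QNetworkReply>
#include <QNetworkRequest>

// Hypothetical outcomes for a finished GET under the proposed scheme.
enum class GetOutcome { Ok, RetryLater, AbortSync };

// Sketch: 503 + Retry-After means "reschedule this file, keep syncing the
// rest"; a bare 503 keeps today's behavior and aborts the sync.
GetOutcome handleFinishedGet(QNetworkReply *reply, int *retryDelaySecs)
{
    const int status =
        reply->attribute(QNetworkRequest::HttpStatusCodeAttribute).toInt();
    if (status != 503)
        return GetOutcome::Ok; // other statuses are handled elsewhere

    if (reply->hasRawHeader("Retry-After")) {
        // Retry-After may be delay-seconds or an HTTP-date; this sketch
        // only handles the delay-seconds form.
        bool ok = false;
        const int secs = reply->rawHeader("Retry-After").toInt(&ok);
        *retryDelaySecs = ok ? secs : 60; // fall back to a default delay
        return GetOutcome::RetryLater;
    }
    return GetOutcome::AbortSync;
}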

@moscicki
Contributor

Ping.

Also, is there anything in what we're discussing here that contradicts issue #3932?

ogoffart added a commit that referenced this issue Nov 29, 2018
ogoffart added a commit that referenced this issue Nov 29, 2018
ogoffart added a commit that referenced this issue Dec 17, 2018
@ogoffart ogoffart added the ReadyToTest QA, please validate the fix/enhancement label Dec 17, 2018
@guruz
Contributor Author

guruz commented Dec 17, 2018

It looks like the Retry-After suggestion is indeed #3932 (CC @butonic).

@ogoffart has implemented something else though in https://github.com/owncloud/client/pull/6906/files#diff-7e6f8d2579afb992a2e4ed25a154935bR78 ... so as long as the 503 doesn't look like the maintenance-mode 503, the sync will go on.
@ogoffart when will the next sync be triggered?

@labkode
Contributor

labkode commented Jul 19, 2019

@guruz what is the status of this?

@michaelstingl
Contributor

michaelstingl commented Jul 22, 2019

> @guruz what is the status of this?

@labkode It should be improved in the upcoming 2.6.0. Please verify with the 2.6.0-alpha1 and provide feedback:
https://github.com/owncloud/client/releases/tag/v2.6.0-alpha1

@ckamm ckamm assigned ogoffart and unassigned ckamm Jul 23, 2019
@HanaGemela
Contributor

HanaGemela commented Jul 29, 2019

Why did I get the message twice? Logs shared with @ckamm and @guruz.

[screenshot]

Also, the Not synced protocol shows two entries, one without a file name.

[screenshot]

@HanaGemela
Contributor

Not really sure how to reproduce, but sometimes the message looks like this. Logs shared.

[screenshot: 2019-07-30 at 10 56 49]

[screenshot: 2019-07-30 at 10 56 18]

@HanaGemela
Contributor

I've tried with the server in maintenance mode. While in maintenance mode, I unchecked one folder in the selective sync list and disabled VFS for another account. After turning maintenance mode off, the client crashed: 48a065f1-c5e8-4541-b427-9355b200dd62
Not sure whether it is related to this issue. I found the maintenance-mode scenario in the top-level issue. In case it is not related, I'll raise a new issue.

@ckamm
Contributor

ckamm commented Jul 30, 2019

@HanaGemela Thanks for the logs. EDIT: Made a PR against the error duplication.

The crash is unrelated and worth its own issue if you can reproduce it. EDIT: The crash on sentry looks like an unspecific error in the Qt network stack unfortunately: https://sentry.io/organizations/owncloud/issues/1131537840/events/48a065f1c5e84541b4279355b200dd62/?project=79001&statsPeriod=14d&utc=true

ckamm added a commit that referenced this issue Jul 30, 2019
Previously fatal error texts were duplicated: Once they entered the
SyncResult via the SyncFileItem and once via syncError().

syncError is intended for folder-wide sync issues that are not pinned
to particular files. Thus that duplicated path is removed.

For #5088
ckamm added a commit that referenced this issue Jul 31, 2019
Previously fatal error texts were duplicated: Once they entered the
SyncResult via the SyncFileItem and once via syncError().

syncError is intended for folder-wide sync issues that are not pinned
to particular files. Thus that duplicated path is removed.

For #5088
@HanaGemela
Contributor

@ckamm Great, now I see the error message only once.

Still some issues here:

  1. Sometimes the error is e:http:response_header, as already described above
  2. When syncing a folder that is unavailable, the rest of the files are not synced
  3. When the folder becomes available again, the client doesn't recover; the folder is still shown as unavailable. I believe this is a server issue though

@HanaGemela HanaGemela removed the ReadyToTest QA, please validate the fix/enhancement label Aug 6, 2019
@ckamm
Contributor

ckamm commented Aug 7, 2019

@HanaGemela

  1. "Undefined variable: http_response_header" is an error message the server sends. I don't know why it's intermittent.
  2. A 503 during discovery should mean "ignore that folder" and the sync should continue with other files. A 503 during propagation is normally a normal error where other files will be synced after. The exception is if the 503 reply contains ">Sabre\DAV\Exception\ServiceUnavailable<" because then we assume maintenance mode has been switched on and abort the sync.
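
A sketch of that fatality rule (simplified; the real client's error handling is more involved than a plain substring check):

#include <QByteArray>

// Sketch: a 503 is only fatal ("server in maintenance, abort the sync")
// when the SabreDAV error body names the ServiceUnavailable exception.
// Any other 503 stays a per-item error and the remaining files keep
// syncing.
bool is503Fatal(int httpStatus, const QByteArray &replyBody)
{
    if (httpStatus != 503)
        return false;
    return replyBody.contains(">Sabre\\DAV\\Exception\\ServiceUnavailable<");
}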

@HanaGemela
Contributor

@ckamm then it's broken: the sync should continue, but it doesn't.

@ckamm
Contributor

ckamm commented Aug 7, 2019

@HanaGemela How do you set up the 503 reply for your tests? I need it to reproduce your specific case.

@HanaGemela
Contributor

@ckamm Start two servers:

docker run -e OWNCLOUD_SHARING_FEDERATION_ALLOW_HTTP_FALLBACK=true -p 8080:8080 owncloud/server
docker run -e OWNCLOUD_SHARING_FEDERATION_ALLOW_HTTP_FALLBACK=true -p 8081:8080 owncloud/server

Then:

  1. User1 shares a folder with User2 from another server (federated share)
  2. Turn off the first server 
  3. Check the account of User2

@ckamm
Contributor

ckamm commented Aug 12, 2019

Thanks, that helped. An unavailable federated share gets reported as:

<?xml version="1.0" encoding="utf-8"?>
<d:error xmlns:d="DAV:" xmlns:s="http://sabredav.org/ns">
  <s:exception>Sabre\DAV\Exception\ServiceUnavailable</s:exception>
  <s:message>Storage is temporarily not available</s:message>
</d:error>

but the error handling code assumes this ServiceUnavailable means maintenance mode and considers it a fatal error.

ckamm added a commit that referenced this issue Aug 12, 2019
This is an unreliable workaround. The real fix will need to be deferred
to another release.

For #5088
@ckamm
Contributor

ckamm commented Aug 12, 2019

@HanaGemela I've added a PR for an unreliable workaround in 2.6.0; a reliable fix will need to be deferred to 2.6.1. (It currently relies on the s:message string, which is a user-facing, translatable string; a correct solution would trigger a follow-up request to status.php to check for maintenance mode.)
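
That follow-up check could be sketched like this (hypothetical helper, blocking for brevity; status.php on an ownCloud server reports a "maintenance" flag in its JSON):

#include <QEventLoop>
#include <QJsonDocument>
#include <QJsonObject>
#include <QNetworkAccessManager>
#include <QNetworkReply>
#include <QNetworkRequest>
#include <QObject>
#include <QUrl>

// Sketch: after an ambiguous 503, ask status.php whether the server is
// really in maintenance mode before declaring the whole sync fatal.
bool serverInMaintenanceMode(QNetworkAccessManager &nam, const QUrl &serverUrl)
{
    QUrl statusUrl = serverUrl;
    statusUrl.setPath(serverUrl.path() + "/status.php");

    QNetworkReply *reply = nam.get(QNetworkRequest(statusUrl));
    QEventLoop loop; // real client code would handle this asynchronously
    QObject::connect(reply, &QNetworkReply::finished, &loop, &QEventLoop::quit);
    loop.exec();

    const QJsonObject status = QJsonDocument::fromJson(reply->readAll()).object();
    reply->deleteLater();
    // status.php returns {"maintenance": true} while maintenance mode is on.
    return status.value("maintenance").toBool(false);
}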

@ckamm ckamm added ReadyToTest QA, please validate the fix/enhancement and removed PR available labels Aug 22, 2019
ckamm added a commit that referenced this issue Aug 22, 2019
This is an unreliable workaround. The real fix will need to be deferred
to another release.

For #5088
@HanaGemela
Contributor

Works as expected on 2.6.0 RC1 (build 12411), macOS 10.14.6.
