
OAuth2 token expiring during a very long discovery makes sync impossible #6814

Closed · opened Oct 15, 2018 by ogoffart · 11 comments
Labels: Discussion, ReadyToTest (QA, please validate the fix/enhancement)

@ogoffart (Contributor)

As reported in #6677 and https://github.com/owncloud/enterprise/issues/2880:

The discovery of a huge tree can take a long time, and if it takes more than one hour, the OAuth2 token expires and the sync aborts.
The next sync then starts discovery from scratch, which again takes more than one hour, so the client never syncs.

Workaround: use selective sync to exclude part of the tree, wait until the remainder syncs successfully, then enable more directories little by little.

@ogoffart (Contributor, Author)

Possible solutions:

  1. Wait for 2.6 and the new_discovery_algorithm branch to be merged. It has a much faster discovery because it works in parallel; hopefully that brings it under 1h. In the long term (although this is not yet implemented), we could start propagation in parallel with discovery, so that parts already propagated would not need to be re-discovered on the next sync.

  2. Have a discovery cache (Caching remote discovery results #2976 (comment)). But this would also be a 2.6 feature, and it would be pointless if we propagate in parallel with discovery.

  3. Maybe the most obvious: instead of aborting the sync when the token expires, just renew the token and continue. However, this is not as simple as it sounds.
    I guess this could be doable if we:

    • In HttpCredentials::slotAuthentication, when using OAuth2, request the new token. We need to introduce a new state while we are requesting a token.
    • Intercept that particular error in AbstractNetworkJob::slotFinished, and redo the request by enqueuing it somewhere. (Where?)
    • Queue any other requests that are made while we are requesting the token.
    • Once we have the new token, redo all the queued requests. If the renewal fails, cancel them all.

All of that is quite some work, and maybe a bit difficult to fix in 2.5.1.
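The queue-and-replay scheme sketched in point 3 can be illustrated with a small state machine. This is a hypothetical stand-in in plain C++ (no Qt): `TokenGate`, its states, and the callback-based jobs are invented names for illustration, not the client's actual classes.

```cpp
#include <cassert>
#include <functional>
#include <queue>
#include <string>
#include <utility>

// Illustrative sketch of the credentials state machine described above.
class TokenGate {
public:
    enum class State { Valid, Renewing };

    explicit TokenGate(std::string token) : m_token(std::move(token)) {}

    State state() const { return m_state; }
    const std::string &token() const { return m_token; }

    // A job reports that its request failed with "token expired".
    // This switches us into the Renewing state; the job is queued so
    // it can be re-run with the fresh token later.
    void onTokenExpired(std::function<void(const std::string &)> retryJob) {
        m_state = State::Renewing;
        m_queue.push(std::move(retryJob));
    }

    // Jobs submitted while a renewal is in flight are held back too.
    void submit(std::function<void(const std::string &)> job) {
        if (m_state == State::Renewing)
            m_queue.push(std::move(job));
        else
            job(m_token);
    }

    // Renewal succeeded: replay every queued job with the new token.
    void onTokenRenewed(std::string newToken) {
        m_token = std::move(newToken);
        m_state = State::Valid;
        while (!m_queue.empty()) {
            auto job = std::move(m_queue.front());
            m_queue.pop();
            job(m_token);
        }
    }

    // Renewal failed: drop (cancel) everything that was waiting.
    void onRenewalFailed() {
        m_state = State::Valid;
        while (!m_queue.empty())
            m_queue.pop();
    }

private:
    State m_state = State::Valid;
    std::string m_token;
    std::queue<std::function<void(const std::string &)>> m_queue;
};
```

In the real client the queued "jobs" would be re-created network requests, and the renewal would be driven by the credentials object rather than by the jobs themselves.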

@ckamm (Contributor) commented Oct 15, 2018

I think 3 is the only approach that's not just a workaround: there could be odd situations where discovery takes a while longer, or other network activity that gets broken by expiring tokens. Maybe some server admin makes the tokens expire much more frequently.

But as you say, it's also error prone. There could even be multiple requests in flight that all come back with a token-expired error. Restarting requests isn't too hard (it's done for redirects already), but all the queuing and holding would be quite complex.

@guruz (Contributor) commented Oct 15, 2018

@ogoffart @ckamm Why not blindly renew the token every 10 minutes, and use the current latest one for each new request? ("viel renew hilft viel" — roughly, "lots of renewing helps a lot")
(Do this independently of the authentication state machine, i.e. in parallel with everything else.)
(I don't know how this works server-side; will the old and new token both be valid at the same time, or will this somehow clash? CC @DeepDiver1975)
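The "blindly renew" idea boils down to a timer whose interval stays safely below the token lifetime. Here is a minimal sketch of that timing decision, assuming the server reports the lifetime as `expires_in` seconds; the half-lifetime rule and the 30-second floor are arbitrary illustrative choices, not anything from the actual client.

```cpp
#include <algorithm>
#include <cassert>

// Given the token lifetime reported by the server (expires_in, in
// seconds), decide how long to wait before proactively renewing.
// Renewing at half the lifetime keeps the current token valid while
// the renewal request is in flight, assuming old and new tokens can
// overlap. The floor avoids hammering the server on tiny lifetimes.
int renewDelaySeconds(int expiresIn)
{
    const int minimumDelay = 30;
    return std::max(minimumDelay, expiresIn / 2);
}
```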

guruz added this to the 2.5.1 milestone Oct 15, 2018
@ckamm (Contributor) commented Oct 16, 2018

@guruz That does indeed sound easiest, if both tokens were valid at the same time for a bit. It would break if people set the expiry to less than the client's renewal interval, but that sounds acceptable.

@DeepDiver1975 (Member)

Refresh will kill the old token.
Afaik you get back the expiration with the token. Can this help to know when to renew the token?

@guruz (Contributor) commented Oct 16, 2018

@DeepDiver1975 Meh.
It can help, but we'd still need to pause (and queue and later restart) any requests that we're about to send (and pause the sync).
How easy would it be to change the OAuth2 app to keep some tokens alive? (If @ogoffart and @ckamm agree that the client-side approach in #6814 (comment) is OK.)

@DeepDiver1975 (Member)

Are you getting back an explicit error which tells you that the token has expired? You should be...

Following the flow in the spec, refreshing happens after expiration: https://tools.ietf.org/html/rfc6749#section-1.5

Killing the old access token after refresh is a requirement per the spec, if I'm not mistaken.
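For reference, the refresh described in RFC 6749 (section 6) is a plain form-encoded POST to the token endpoint. A sketch of building that request body; real client code would also percent-encode the token value and authenticate with the client credentials the server requires.

```cpp
#include <cassert>
#include <string>

// Build the application/x-www-form-urlencoded body for an
// RFC 6749 section 6 refresh request. The caller POSTs this to the
// server's token endpoint and receives a fresh access token (and,
// typically, a new refresh token and expires_in) in the response.
std::string refreshRequestBody(const std::string &refreshToken)
{
    return "grant_type=refresh_token&refresh_token=" + refreshToken;
}
```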

@ogoffart (Contributor, Author) commented Oct 16, 2018

The server implementation is fine. The problem is the state machine we have on the client side: it needs to support several calls at the same time, and be able to resume arbitrary calls.

Currently, the sync engine aborts and lets the connection validator renew the token; then the sync restarts, from scratch.

As I said in the first comment, we would need to queue the pending calls and redo the calls that failed.
This is definitely doable, but it is not a trivial change.

I'll try implementing it, and then let's see whether we merge it into 2.5 or not.

@guruz (Contributor) commented Oct 16, 2018

we would need to queue the pending calls

There are many pending calls, not only one. Also, outside the sync engine you might have things (e.g. selective-sync folder requests, activities, etc.) that you need to fail gracefully or retry. Maybe a whole new state for each network job? PendingAuthTokenChange?

@ogoffart (Contributor, Author)

My idea was to add a state in the HTTP credentials.
It is possible, in AbstractNetworkJob::slotFinished, to detect the jobs that failed for this reason, and to restart or enqueue them, depending on the state.
The problem is that I don't know how to avoid starting new jobs, as they kind of create the QNetworkReply directly. Maybe by just aborting them so we reach AbstractNetworkJob::slotFinished?

I actually hope that the jobs can be restarted. The redirection code seems to restart them, so I assume it is possible.

@ckamm (Contributor) commented Oct 17, 2018

Most request creation already funnels through Account::sendRawRequest and AbstractNetworkJob::sendRequest, but it does indeed expect that a QNetworkReply will be available immediately. I imagine most callers don't really care about the reply being available immediately?

Some do use it, though, and lots of places call AbstractNetworkJob::reply (though usually only once finished). I imagine making it possibly return 0 would need careful investigation of all of these.

Restarting is indeed already done for redirects: it works if the body isn't sequential (I don't think we ever use that) and if there's no implicit body. Implicit bodies happen for POST requests with URL query parameters: Qt will automatically move the URL args into the request body, and I didn't find a way to access that body or the URL args afterwards. We could work around that by doing it ourselves.
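The workaround mentioned above (handling the URL-args-to-body move ourselves) could look roughly like this: split the query off the URL before sending, and keep it as an explicit body, so the whole request can be rebuilt and resent after a token renewal. This is a plain-string sketch under that assumption, not the actual client code.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Split a URL into (base URL without query, query string). The query
// string can then be sent as an explicit POST body that we keep a copy
// of, instead of letting the HTTP layer move the URL args into an
// implicit body that cannot be recovered for a restart.
std::pair<std::string, std::string> splitQueryIntoBody(const std::string &url)
{
    const auto pos = url.find('?');
    if (pos == std::string::npos)
        return {url, std::string()};
    return {url.substr(0, pos), url.substr(pos + 1)};
}
```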

ogoffart added a commit that referenced this issue Oct 17, 2018
An OAuth2 access token is typically only valid for 1 hour.
Before this patch, when the token timed out during the sync, the
sync was aborted, and the ConnectionValidator then requested a new
token so the sync could be started over.
If the discovery takes longer than the OAuth2 token's validity, this
means the sync can never proceed, as it would always be restarted from
scratch.

With this patch, we try to transparently renew the OAuth2 token and
restart the jobs that failed because the access token was invalid.

Note that some changes were required in the GETFile job: because it
handled the error itself, it was failing the jobs before the token
renewal could take over.

Issue #6814
ogoffart added a commit that referenced this issue Oct 17, 2018 (same commit message as above)
ogoffart added a commit that referenced this issue Oct 18, 2018 (same commit message as above)
ogoffart added a commit that referenced this issue Oct 18, 2018 (same commit message as above)
ogoffart added a commit that referenced this issue Oct 19, 2018 (same commit message as above)
ogoffart added the ReadyToTest (QA, please validate the fix/enhancement) label Oct 19, 2018

4 participants