Refresh with connector only once #2547

nabokihms · 2022-06-01T17:22:58Z

Preflight Checklist

I agree to follow the Code of Conduct that this project adheres to.
I have searched the issue tracker for an issue that matches the one I want to file, without success.

Problem Description

There is a problem with concurrent token refresh requests in Dex. If there is no lock on the application side, Dex receives many refresh requests simultaneously. Then Dex tries to go to an external provider for each request and refresh user info.

It works ok for providers like LDAP or Atlassian Crowd. Dex sends them credentials. They send user info back. Users can face minor inconsistency in returned data, but overall it works (yet with the number of unnecessary requests).

The most problematic providers have refresh flow with refresh tokens rotation, e.g., OIDC and Gitlab. Dex stores refresh tokens in the offline session's connector data field for future refreshes. It is impossible to send these providers the same refresh token twice because this token will be immediately invalidated.

Dex should provide consistency for refresh requests to avoid errors. Each refresh request should use data from the previous refresh request.

Proposed Solution

Dex has expiry.refreshToken.reuseInteval option. With this option enabled, Dex does not issue new refresh tokens for each concurrent request. Instead, Dex replies with the previously issued token that helps avoid concurrent request failures.

However, Dex still makes requests to an external provider, which was mentioned in #2113

My proposal - refresh a token with a connector only once during the reuse interval. Most Dex backends, e.g., etcd, Kubernetes provide consistency out of the box. We only have to use it right.

I made a POC for Kubernetes storage. The full patch can be found here.

Key points of the patch:

Call of the refresh token with the connector method was moved to the updater function, which means Dex will call refresh between get and update operations.

refreshTokenUpdater := func(old storage.RefreshToken) (storage.RefreshToken, error) {
        ...
	// Dex will call the connector's Refresh method only once if request is not in reuse interval.
	newIdent, rerr := s.refreshWithConnector(ctx, rCtx, ident)
	if rerr != nil {
		return old, rerr
	}
	...
}

Add locks to refresh tokens on updating. I chose a simple annotation lock implementation.
- Before updating, each concurrent request tries to set an annotation on the refresh token object in Kubernetes
- The one that succeeded goes forward and requests an external provider
- Others who received 409 (conflict) status code start waiting for annotation to be removed
- After successful refreshing, the first request removes an annotation and saves the update refresh token data to the Kubernetes object
- Others see that annotation is gone, so they just read the previously updated refresh token from the Kubernetes object and return the data

func (cli *client) UpdateRefreshToken(id string, updater func(old storage.RefreshToken) (storage.RefreshToken, error)) error {
	lock := newRefreshTokenLock(cli)

	if err := lock.Lock(id); err != nil {
		return err
	}
	defer lock.Unlock(id)
	...
}

func (l *refreshTokenLock) Lock(id string) error {
	for i := 0; i <= 60; i++ {
		ok, err := l.setLockAnnotation(id)
		if err != nil {
			return err
		}
		if !ok {
			return nil
		}
		time.Sleep(lockCheckPeriod)
	}
	return fmt.Errorf("timeout waiting for refresh token %s lock", id)
}

The complete code for set_annotation can be found here in case if anyone is interested.

It guarantees that the refresh is called only once, even if there is more than just one instance of Dex.

Alternatives Considered

One of the alternatives I consider is to abstain from OfflineSession objects. This object is implied to be a link for all refresh tokens to have a convenient way to invalidate them. However, it seems that this is the only place in the Dex database schema where we need to use transactions to update both offline sessions and refresh tokens simultaneously because offline sessions contain connectorData, which is used for every token refresh.

Move connectorData back to refresh tokens.
Remove offline sessions.

Refresh flow will become much easier with these changes.

Additional Information

A related issue with an unknown OIDC provider #2360 (the invalid_grant error for refresh token grant often means that refresh token is invalid).

Gitlab connect behavior using refresh tokens - #2352 (comment)

The text was updated successfully, but these errors were encountered:

nabokihms added area/connector area/storage labels Jun 1, 2022

nabokihms mentioned this issue Jun 2, 2022

Tokens don't refresh in personal configs #2360

Open

3 tasks

nabokihms mentioned this issue Aug 8, 2022

[Gitlab] Refresh token is invalid or has already been claimed by another client #2613

Open

3 tasks

nabokihms mentioned this issue Oct 2, 2022

fix: refresh token only once for all concurrent requests #2692

Merged

sagikazarmark closed this as completed in #2692 Oct 3, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Refresh with connector only once #2547

Refresh with connector only once #2547

nabokihms commented Jun 1, 2022

Refresh with connector only once #2547

Refresh with connector only once #2547

Comments

nabokihms commented Jun 1, 2022

Preflight Checklist

Problem Description

Proposed Solution

Alternatives Considered

Additional Information