Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refresh with connector only once #2547

Closed
2 tasks done
nabokihms opened this issue Jun 1, 2022 · 0 comments · Fixed by #2692
Closed
2 tasks done

Refresh with connector only once #2547

nabokihms opened this issue Jun 1, 2022 · 0 comments · Fixed by #2692

Comments

@nabokihms
Copy link
Member

Preflight Checklist

  • I agree to follow the Code of Conduct that this project adheres to.
  • I have searched the issue tracker for an issue that matches the one I want to file, without success.

Problem Description

There is a problem with concurrent token refresh requests in Dex. If there is no lock on the application side, Dex receives many refresh requests simultaneously. Then Dex tries to go to an external provider for each request and refresh user info.

It works ok for providers like LDAP or Atlassian Crowd. Dex sends them credentials. They send user info back. Users can face minor inconsistency in returned data, but overall it works (yet with the number of unnecessary requests).

The most problematic providers have refresh flow with refresh tokens rotation, e.g., OIDC and Gitlab. Dex stores refresh tokens in the offline session's connector data field for future refreshes. It is impossible to send these providers the same refresh token twice because this token will be immediately invalidated.

Dex should provide consistency for refresh requests to avoid errors. Each refresh request should use data from the previous refresh request.

Proposed Solution

Dex has expiry.refreshToken.reuseInteval option. With this option enabled, Dex does not issue new refresh tokens for each concurrent request. Instead, Dex replies with the previously issued token that helps avoid concurrent request failures.

However, Dex still makes requests to an external provider, which was mentioned in #2113


My proposal - refresh a token with a connector only once during the reuse interval. Most Dex backends, e.g., etcd, Kubernetes provide consistency out of the box. We only have to use it right.

I made a POC for Kubernetes storage. The full patch can be found here.

Key points of the patch:

  1. Call of the refresh token with the connector method was moved to the updater function, which means Dex will call refresh between get and update operations.
refreshTokenUpdater := func(old storage.RefreshToken) (storage.RefreshToken, error) {
        ...
	// Dex will call the connector's Refresh method only once if request is not in reuse interval.
	newIdent, rerr := s.refreshWithConnector(ctx, rCtx, ident)
	if rerr != nil {
		return old, rerr
	}
	...
}
  1. Add locks to refresh tokens on updating. I chose a simple annotation lock implementation.

    • Before updating, each concurrent request tries to set an annotation on the refresh token object in Kubernetes
    • The one that succeeded goes forward and requests an external provider
    • Others who received 409 (conflict) status code start waiting for annotation to be removed
    • After successful refreshing, the first request removes an annotation and saves the update refresh token data to the Kubernetes object
    • Others see that annotation is gone, so they just read the previously updated refresh token from the Kubernetes object and return the data
func (cli *client) UpdateRefreshToken(id string, updater func(old storage.RefreshToken) (storage.RefreshToken, error)) error {
	lock := newRefreshTokenLock(cli)

	if err := lock.Lock(id); err != nil {
		return err
	}
	defer lock.Unlock(id)
	...
}
func (l *refreshTokenLock) Lock(id string) error {
	for i := 0; i <= 60; i++ {
		ok, err := l.setLockAnnotation(id)
		if err != nil {
			return err
		}
		if !ok {
			return nil
		}
		time.Sleep(lockCheckPeriod)
	}
	return fmt.Errorf("timeout waiting for refresh token %s lock", id)
}

The complete code for set_annotation can be found here in case if anyone is interested.

It guarantees that the refresh is called only once, even if there is more than just one instance of Dex.

Alternatives Considered

One of the alternatives I consider is to abstain from OfflineSession objects. This object is implied to be a link for all refresh tokens to have a convenient way to invalidate them. However, it seems that this is the only place in the Dex database schema where we need to use transactions to update both offline sessions and refresh tokens simultaneously because offline sessions contain connectorData, which is used for every token refresh.

  1. Move connectorData back to refresh tokens.
  2. Remove offline sessions.

Refresh flow will become much easier with these changes.

Additional Information

A related issue with an unknown OIDC provider #2360 (the invalid_grant error for refresh token grant often means that refresh token is invalid).

Gitlab connect behavior using refresh tokens - #2352 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant