Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Using paginate with the retry plugin #33

Closed
steve-norman-rft opened this issue Apr 29, 2020 · 13 comments
Closed

Using paginate with the retry plugin #33

steve-norman-rft opened this issue Apr 29, 2020 · 13 comments
Assignees
Labels
Type: Support Any questions, information, or general needs around the SDK or GitHub APIs

Comments

@steve-norman-rft
Copy link

Can the retry plugin be used for calls that are being used with the paginate plugin?

I'm trying to workaround an issue where GitHub App installation tokens occasionally don't work if used too quickly after they've been created, and GitHub have suggested retrying requests after a short time (error returned by the GitHub API is 401 Bad Authentication). Trying to make use of the retry plugin along with the paginate plugin and it looks like the paginate plugin ignores any settings from the retry plugin when either specified as part of the call to paginate, or when setup on the octokit object.

Tested this by using a bad API url as the 401 error is sporadic so hard to test.

Log output when trying to use the paginate plugin:

  sdlc:github:info Wed Apr 29 2020 11:19:33 GMT+0100 (British Summer Time), Before Hook +0ms
  sdlc:github Wed Apr 29 2020 11:19:33 GMT+0100 (British Summer Time), Error Hook +0ms
  sdlc:github URL: https://api.github.com/orgs/octokit/credential-authorizations2?per_page=100, Status: 404 +1ms
  sdlc:github RateLimit: 5000, RateLimitRemaining: 4982 +0ms

Log output when trying to use a normal request:

  sdlc:github:info Wed Apr 29 2020 11:19:56 GMT+0100 (British Summer Time), Before Hook +0ms
  sdlc:github Wed Apr 29 2020 11:20:23 GMT+0100 (British Summer Time), Error Hook +0ms
  sdlc:github URL: https://api.github.com/orgs/octokit/credential-authorizations2?per_page=100, Status: 404 +0ms
  sdlc:github RateLimit: 5000, RateLimitRemaining: 4976 +0ms

This is with a retries of 5 and a retryAfter of 5 to make it obvious that retries are happening. The paginate call errors out immediately, the request call errors out over 25 seconds after the before hook is called. So it looks to me like paginate isn't honouring the retry plugin whereas request is?

I've tried to get this running in runkit, https://runkit.com/steve-norman-rft/5ea944308e20c1001ad95df5, but for some reason the paginate plugin is causing an error when it is included:

image

The error is generated from line 14 of the notebook. If I just use the retry plugin the first test completes as expected with the retries, the second fails as paginate doesn't exist.

@gr2m gr2m added the Type: Support Any questions, information, or general needs around the SDK or GitHub APIs label May 5, 2020
@gr2m gr2m self-assigned this May 6, 2020
@gr2m
Copy link
Contributor

gr2m commented May 6, 2020

The retry plugin ignores responses with a 401 response by default:

https://github.com/octokit/plugin-retry.js/blob/ab2523129ebe464e2016073b5f7e7624ce7bc634/src/index.ts#L17

I'm trying to workaround an issue where GitHub App installation tokens occasionally don't work if used too quickly after they've been created

Very interesting, I've never run into this myself or heard about this behavior before. If that is in fact a common problem, then I think we should build handling for it into the @octokit/auth-app authentication strategy. The way authentication strategies are hooked into the event lifecycle would allow us to build the retry-functionality in there.

Could you maybe share more of your code? Or could whoever you talked to from GitHub comment here with some more insights? They can also contact me in the internal GitHub Slack if that works better

I've tried to get this running in runkit, but for some reason the paginate plugin is causing an error when it is included:

@octokit/plugin-paginate-rest exports { paginateRest }, not { paginate }

@gr2m gr2m added the Status: Needs info Full requirements are not yet known, so implementation should not be started label May 6, 2020
@gr2m gr2m removed their assignment May 6, 2020
@steve-norman-rft
Copy link
Author

steve-norman-rft commented May 7, 2020

Thanks for looking into this. I've fixed the runkit example to show that I don't think paginate is using the retry plugin. Here is the output:

image

The paginate call fails after 1 call not after the 3 retries that the request call did if you look at the timings of the log entries. I wonder if the perhaps I'm getting the wrong parameters to paginate to enable the retry plugin for that call?

Here is some code with which I was able to reproduce the authentication issue that sent me down the retry route this morning, took a couple of minutes till it failed though.

(async () => {
  try {
    const { Octokit } = require('@octokit/core')
    const { paginateRest } = require('@octokit/plugin-paginate-rest')
    const { enterpriseCloud } = require('@octokit/plugin-enterprise-cloud')
    const { retry } = require('@octokit/plugin-retry')

    const { createAppAuth } = require('@octokit/auth-app')

    const auth = createAppAuth({
      id: process.env.GITHUBAPPID,
      privateKey: process.env.GITHUBPRIVATEKEY.replace(/\\n/g, '\n') // Replace there as newlines get messed up by keychain, but just a standard pem file
    })

    const installationAuthentication = await auth({ type: 'installation', installationId: 12345, refresh: true })

    const PluginOctokit = Octokit.plugin(paginateRest, enterpriseCloud, retry)
    const octokit = new PluginOctokit({
      auth: installationAuthentication.token
    })

    const credentials = await octokit.paginate('GET /orgs/:org/credential-authorizations', {
      org: 'org-with-saml-enabled',
      per_page: 10
    })

    console.log(credentials.length)
  } catch (error) {
    console.log('Error: ', error)
    process.exit(1)
  }
})()

I then just ran the script with:

while node test-auth.js; do :; done

Which gave the output of (trimmed a bit and sanitised):

5
5
5
5
5
5
5
5
5
5
5
5
5
5
5
5
5
5
5
5
5
5
5
Error:  RequestError [HttpError]: Bad credentials
    at /Users/steve/Dev/nodejs/github/node_modules/@octokit/core/node_modules/@octokit/request/dist-node/index.js:66:23
    at processTicksAndRejections (internal/process/task_queues.js:93:5)
    at async Job.doExecute (/Users/steve/Dev/nodejs/github/node_modules/bottleneck/light.js:405:18) {
  name: 'HttpError',
  status: 401,
  headers: {
    'access-control-allow-origin': '*',
    'access-control-expose-headers': 'ETag, Link, Location, Retry-After, X-GitHub-OTP, X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset, X-OAuth-Scopes, X-Accepted-OAuth-Scopes, X-Poll-Interval, X-GitHub-Media-Type, Deprecation, Sunset',
    connection: 'close',
    'content-length': '83',
    'content-security-policy': "default-src 'none'",
    'content-type': 'application/json; charset=utf-8',
    date: 'Thu, 07 May 2020 08:18:33 GMT',
    'referrer-policy': 'origin-when-cross-origin, strict-origin-when-cross-origin',
    server: 'GitHub.com',
    status: '401 Unauthorized',
    'strict-transport-security': 'max-age=31536000; includeSubdomains; preload',
    vary: 'Accept-Encoding, Accept, X-Requested-With',
    'x-content-type-options': 'nosniff',
    'x-frame-options': 'deny',
    'x-github-media-type': 'github.v3; format=json',
    'x-github-request-id': 'FFA9:2D3E7:28187E7:2EEC0BC:5EB3C458',
    'x-ratelimit-limit': '60',
    'x-ratelimit-remaining': '59',
    'x-ratelimit-reset': '1588843112',
    'x-xss-protection': '1; mode=block'
  },
  request: {
    method: 'GET',
    url: 'https://api.github.com/orgs/org-with-saml-enabled/credential-authorizations?per_page=10',
    headers: {
      accept: 'application/vnd.github.v3+json',
      'user-agent': 'octokit-core.js/2.5.0 Node.js/12.13.1 (macOS Catalina; x64)',
      authorization: 'token [REDACTED]'
    },
    request: { hook: [Function: bound bound register] }
  },
  documentation_url: 'https://developer.github.com/v3'
}

What I am doing is "extreme" in that installation tokens aren't being cached, but that it can fail is what I need to recover from. I did also remove the token redaction while testing last week so that I could ensure that the authorization token was being set with the token being generated and that was the case for all of the times I saw failures. I only really spotted this when I was doing work over all of the organisations that had the app installed, although only 5, and would occasionally get one failing, this testing wasn't as extreme as the while loop above, just occasional runs to test some code. I thought it might be an odd caching situation but I did get it at one point when I ran the script the first time so there should be no caching going on at that point.

@droidpl is aware of this and there is also a support ticket open with some details (Javier asked me to raise it as they had better access to the logs). The support ticket, 665607, has the following information in it:

I looked at the logs, and it's likely that you are running into a race condition between generating the installation token and trying to use it. There's about 200ms between calls. (Image on the ticket, has details that shouldn't be here I think)
Sometimes it works, sometimes it doesn't. You might want to alter the logic somewhat, either 4-5 retries with an exponential backoff, or at least a delay between the calls.

And after I queried that (as the documentation does not say that installations token might not work when generated) they replied:

However, there is no way to ensure all authentication database replicas are updated instantaneously, so it's not a race condition we're going to be able to solve. Any time an object is created via the API, we recommend that you retry with exponential backoff when the object appears not to exist at first.

While doing some more investigation I did come across https://gh.neting.ccmunity/t5/GitHub-API-Development-and/Random-401-errors-after-using-freshly-generated-installation/td-p/22905 which gave a similar answer.

I did note that both answers so to use exponential backoffs of the retry which isn't something I think the retry plugin does currently.

Perhaps we should split this discussion though into two issues if that is easier as the bad credentials is a different issue and probably does belong on the auth-app really.

@gr2m gr2m removed the Status: Needs info Full requirements are not yet known, so implementation should not be started label May 7, 2020
@gr2m
Copy link
Contributor

gr2m commented May 7, 2020

Thanks Steve for the extensive reply, much appreciated!

I'll investigate the retry/pagination plugins combination first. After that I will investigate the problem with the access tokens

@gr2m gr2m self-assigned this May 7, 2020
@gr2m
Copy link
Contributor

gr2m commented May 7, 2020

See my runkit here:
https://runkit.com/gr2m/octokit-plugin-paginate-and-retry

  • I changed the error code from 401 to 500, because 401s are exempt from retries by default, see https://github.com/octokit/plugin-retry.js/#readme

    Retries requests for server 4xx/5xx responses except 400, 401, 403 and 404.

  • I added a log message after logging the errors: request failed after ${error.request.request.retryCount} retries to show that the request was indeed retried, with and without using the .paginate method

@steve-norman-rft
Copy link
Author

Thanks, but I think it still shows the problem?

The .request call is retrying with error 500's, but the paginate isn't as far as the logging is concerned?

Also, in my test with the 401 errors the .request was retrying 3 times even though it was receiving an error 401 so by all accounts the retry plugin shouldn't be trying to retry? I've added the extra logging into my runkit as well in case that helps (it shows the same as yours by the look of it).

@gr2m
Copy link
Contributor

gr2m commented May 7, 2020

The .request call is retrying with error 500's, but the paginate isn't as far as the logging is concerned?

it is, the logs is on page 2

image

Also, in my test with the 401 errors the .request was retrying 3 times even though it was receiving an error 401 so by all accounts the retry plugin shouldn't be trying to retry?

Hmmm I'll investigate, thank you for the clarification, I'm sorry I misunderstood you the first time

@steve-norman-rft
Copy link
Author

wasn't seeing page 2 on your runkit:

image

But changing mine to a 500 does show a second page. So that partly covers that. Paginate appears to be honouring the doNotRetry but from my runkit the .request call appears to be ignoring that even for 401s.

@gr2m
Copy link
Contributor

gr2m commented May 7, 2020

Yes, I think you are right. Smells like a bug in the retry plugin to me, I'll investigate further and will keep you posted

When I remove the request.* options, it still tries 3 times, but only when setting the error code to 500

@steve-norman-rft
Copy link
Author

OK. I think being able to retry 401s / 404s might be something that is needed as well if the comment from support is correct:

Any time an object is created via the API, we recommend that you retry with exponential backoff when the object appears not to exist at first.

The way that comment reads to me they are saying that not only the installation token issues I am seeing throw 401s, but it could also mean that creating an issue via the api and checking it straight away might return a 404?

@gr2m
Copy link
Contributor

gr2m commented May 7, 2020

I opened a separate internal issue with GitHub about it. I asked them to document that behavior in the docs, to make it "official". Once it is, I will built handling of that 401 directly into @octokit/auth-app, so you won't need the retry plugin. You won't need to worry about it at all. I'll keep you posted on that front, too :)

@steve-norman-rft
Copy link
Author

I opened a separate internal issue with GitHub about it. I asked them to document that behavior in the docs, to make it "official". Once it is, I will built handling of that 401 directly into @octokit/auth-app, so you won't need the retry plugin. You won't need to worry about it at all. I'll keep you posted on that front, too :)

If you can let @droidpl know the details of that he can track it as well for us.

@gr2m
Copy link
Contributor

gr2m commented May 7, 2020

I'm pretty sure that the installation access token delay is what's causing "Bad Credentials" errors for my WIP app, too.

image

I could never find out what was causing them, this is likely it.

@gr2m
Copy link
Contributor

gr2m commented Jul 26, 2020

I think this issue can be closed via octokit/auth-app.js#65?

@gr2m gr2m closed this as completed Jul 26, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Type: Support Any questions, information, or general needs around the SDK or GitHub APIs
Projects
None yet
Development

No branches or pull requests

3 participants
@gr2m @steve-norman-rft and others